Thank you, Dan and Felipe.
Thanks for sharing the video!!
If you happen to have an example showing that the new behavior is more
"correct", I would love to see it. I thought I understood this, and now I
feel I don't. Weird feeling.
Since this mailing list is tremendously helpful, I decided to also try the
one for DBD::Pg to see if they know about the change :) I am good now; this
is just for fun.
I will keep you posted.
THANK YOU
Shirley
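
A minimal sketch of why passing the decoded string is the safer choice, written as a standalone Perl script so that it needs no database. It assumes (the thread does not confirm this) that the newer driver treats every bound string as characters and encodes it to UTF-8 before sending it. Under that assumption, an undecoded 3-byte string is encoded a second time, which matches the â\u0080\u009C seen on EL8. The variable names are illustrative only.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode encode);

# The three UTF-8 bytes of U+201C (left double quotation mark), written as
# escapes so the example does not depend on the encoding of this file.
my $bytes = "\xE2\x80\x9C";
my $chars = decode('UTF-8', $bytes);    # one character, U+201C

# Encoding the decoded string gives back the original three bytes.
my $good = encode('UTF-8', $chars);     # hex: e2809c

# Encoding the undecoded string treats each byte as its own character
# (U+00E2, U+0080, U+009C) and produces six bytes: double encoding.
my $double = encode('UTF-8', $bytes);   # hex: c3a2c280c29c

printf "length before decode: %d, after decode: %d\n",
    length($bytes), length($chars);
printf "decoded, then encoded:     %s\n", unpack('H*', $good);
printf "not decoded, then encoded: %s\n", unpack('H*', $double);
```

The six bytes c3 a2 c2 80 c2 9c are the UTF-8 encodings of â (U+00E2) and the two control characters U+0080 and U+009C, which is exactly what psql displayed.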

On Wed, Dec 18, 2024 at 8:53 AM Felipe Gasper <fel...@felipegasper.com> wrote:
>
> Do we know, in fact, why this changed?
>
> The new behaviour may be “more correct”, but it’ll still subtly break a bunch 
> of stuff that worked fine before.
>
> Encoding bugs in Perl are notoriously hard to track down. DBD::Pg is popular; 
> it would be good to know exactly why this happened so that others could 
> proactively adjust their code accordingly.
>
> Also, I recommend my Unicode/UTF-8 talk on this topic, particularly the “use 
> utf8” section starting at about 9m30s and again around 20m20s: 
> https://www.youtube.com/watch?v=yH5IyYyvWHU
>
> -FG
>
>
> > On Dec 18, 2024, at 12:12 AM, Dan Book <gri...@gmail.com> wrote:
> >
> > Indeed, how strings work has not changed, but DBD::Pg's interpretation of 
> > your strings probably did; the new behavior is more "correct" and now that 
> > you are sending it decoded Unicode characters you may avoid other 
> > mysterious issues. (Note that DBI itself does not handle strings; it just 
> > provides the interface. DBD::Pg defines how strings are sent to and from 
> > the database.)
> >
> > -Dan
> >
> > On Tue, Dec 17, 2024 at 11:09 PM Shaomei Liu <sliu.newjer...@gmail.com> 
> > wrote:
> > Dear Dan, Mark, Felipe, Alexander,
> > Thank you all for your valuable feedback!
> > As I told Dan yesterday, this is my first time asking a mailing list for 
> > support. I was very surprised and happy to get answers so quickly!
> > I added "use utf8;" as suggested by Dan, and it worked for the test program 
> > shown in my email, but not for my project.
> > Then I tried decode, as suggested by Dan, and it worked for both the test 
> > program and the project. So the issue is solved for me!!!
> > The Perl version is 5.26.3 on EL8 and 5.16.3 on EL7.
> > The DBI version is 1.641 on EL8 and 1.627 on EL7.
> >
> > Here is the test program with decode. I also printed the length, because I 
> > thought it was a Perl thing, but the lengths are the same on EL8 and EL7, so 
> > I am not sure whether a Perl change or a DBI change caused the issue.
> > xxx.com> cat testutf_decode.pl
> > #!/usr/bin/perl
> > use strict;
> > use warnings;
> > use DBI;
> > use Encode 'decode';
> > print "DBI version: $DBI::VERSION\n";
> >
> > my $db = "debugutf";
> > my $host = "db";
> > my $user = "postgres";
> > my $pass = "";
> > my $dbh = DBI->connect("DBI:Pg:dbname=$db;host=$host",$user,$pass);
> > my $sql = 'INSERT INTO table1 (title) VALUES (?)';
> > my $query = $dbh->prepare($sql);
> > my $bytes = '“';
> > my $chars = decode('UTF-8', $bytes);
> > print "$bytes contains ".length($bytes)." characters\n";
> > print "after decode $bytes contains ".length($chars)." characters\n";
> > # With this line, the database shows “ on EL7 but â\u0080\u009C on EL8:
> > #my @values = ($bytes);
> > # With this line, the database shows “ on both EL8 and EL7, so decode
> > # fixed the issue:
> > my @values = ($chars);
> > $query->execute(@values);
> >
> > xxx.com> ./testutf_decode.pl  #running on EL8
> > DBI version: 1.641
> > “ contains 3 characters
> > after decode “ contains 1 characters
> >
> > xxx.com> ./testutf_decode.pl #running on EL7
> > DBI version: 1.627
> > “ contains 3 characters
> > after decode “ contains 1 characters
> >
> > Thank you!!
> > Shirley
> >
> > On Tue, Dec 17, 2024 at 3:30 PM Alexander Foken via dbi-users 
> > <dbi-users@perl.org> wrote:
> > Hi,
> > DBD::ODBC has several tests related to Unicode handling 
> > (40UnicodeRoundTrip.t, 41Unicode.t, 45_unicode_varchar.t), they should also 
> > work with other DBDs. They should tell you if your problem is between Perl 
> > and Postgres or if it is simply in the encoding of your terminal.
> > Alexander
> > On 17.12.2024 13:31, Felipe Gasper via dbi-users wrote:
> >> Respectfully to Dan & others, I don’t advocate adding “use utf8” to 
> >> existing code without a clear understanding of where your program’s decode 
> >> & encode points are.
> >>
> >> Check to see what DBD::Pg actually writes to the database. If it suddenly 
> >> started encoding, that’s a breaking change that either was documented or 
> >> should be reported upstream.
> >>
> >>> On Dec 16, 2024, at 17:13, Shaomei Liu <sliu.newjer...@gmail.com> wrote:
> >>>
> >>>  Hello,
> >>> I am very happy to have found this mailing list, as it is my last resort!!
> >>> I have a project that uses DBI to write to a Postgres DB.
> >>> After upgrading from RHEL7 to RHEL8, UTF-8 characters are no longer 
> >>> displayed properly in the DB. The DB has the correct UTF-8 encoding set.
> >>> For example, the left double quotation mark “ is displayed as 
> >>> â\u0080\u009C.
> >>> You can use this link to check the hex UTF-8 bytes:
> >>> https://www.cogsci.ed.ac.uk/~richard/utf-8.cgi?input=%E2%80%9C&mode=char
> >>>
> >>> Below is the file testutf.pl, which writes the left double quotation 
> >>> mark “ to the database, followed by the query results from psql on both 
> >>> EL8 and EL7.
> >>>
> >>> ==========file testutf.pl==========
> >>> #!/usr/bin/perl
> >>> use strict;
> >>> use warnings;
> >>> use DBI;
> >>> print "DBI version: $DBI::VERSION\n";
> >>>
> >>> my $db = "debugutf";
> >>> my $host = "db";
> >>> my $user = "postgres";
> >>> my $pass = "";
> >>> my $dbh = DBI->connect("DBI:Pg:dbname=$db;host=$host",$user,$pass);
> >>> my $sql = 'INSERT INTO table1 (title) VALUES (?)';
> >>> my $query = $dbh->prepare($sql);
> >>> my @values = ('“');
> >>> $query->execute(@values);
> >>> ===================================
> >>>
> >>> ==============on RHEL8
> >>> #execute testutf.pl which wrote “ to database on RHEL8
> >>> text.tac1.dev.bia-boeing.com> ./testutf.pl
> >>> DBI version: 1.641
> >>>
> >>> #from psql
> >>> debugutf=# select * from table1;
> >>>      title
> >>> ---------------
> >>>  â\u0080\u009C  =========>unexpected
> >>> (1 row)
> >>>
> >>>
> >>> ==============on RHEL7
> >>> #execute testutf.pl which wrote “ to database on RHEL7
> >>> text.tac1.dev.bia-boeing.com> ./testutf.pl
> >>> DBI version: 1.627
> >>>
> >>> #from psql
> >>> debugutf=# select * from table1;
> >>>      title
> >>> ---------------
> >>>  “       ============>expected
> >>> (1 row)
> >>>
> >>> Any feedback is appreciated.
> >>> thank you
> >>> Shirley
> > --
> > Alexander Foken
> > mailto:alexan...@foken.de
>
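
Felipe's point about knowing where a program's decode and encode points are can be sketched in plain Perl, without a database. The helper names text_from_octets and octets_from_text are made up for this illustration; decode, encode, FB_CROAK and LEAVE_SRC come from the core Encode module. This is one possible arrangement under those assumptions, not how any particular project has to be structured.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: decode octets arriving from outside the program.
# FB_CROAK makes malformed UTF-8 die here rather than travel further;
# LEAVE_SRC keeps decode() from modifying its argument.
sub text_from_octets {
    my ($octets) = @_;
    return decode('UTF-8', $octets, Encode::FB_CROAK | Encode::LEAVE_SRC);
}

# Hypothetical helper: encode characters when they leave the program.
sub octets_from_text {
    my ($text) = @_;
    return encode('UTF-8', $text, Encode::FB_CROAK | Encode::LEAVE_SRC);
}

my $octets = "\xE2\x80\x9C";               # U+201C as read from a file or socket
my $title  = text_from_octets($octets);    # decode once, at the input boundary
# ... work with $title as text, e.g. bind it as a statement parameter ...
my $out    = octets_from_text($title);     # encode once, at the output boundary

printf "characters: %d, octets out: %d\n", length($title), length($out);

# An invalid byte sequence is rejected at the boundary instead of being stored.
my $ok = eval { text_from_octets("\xFF\xFE"); 1 };
print "invalid input rejected\n" unless $ok;
```

This may also explain why "use utf8;" helped the test program but not the project: the pragma only affects string literals in the source file, so data read from files, sockets, or other inputs still has to be decoded explicitly.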
