Thank you, Dan. Felipe, thanks for sharing the video!!

If you happen to have an example showing that the new behavior is more "correct", I would love to see it. I thought I understood this, and now I feel I don't; a weird feeling.

Since this mailing list has been tremendously helpful, I decided to try the one for DBD::Pg as well, to see if they know about the change :) I am good now; this is just for fun. I will keep you posted.

THANK YOU
Shirley

On Wed, Dec 18, 2024 at 8:53 AM Felipe Gasper <fel...@felipegasper.com> wrote:
>
> Do we know, in fact, why this changed?
>
> The new behaviour may be “more correct”, but it’ll still subtly break a bunch
> of stuff that worked fine before.
>
> Encoding bugs in Perl are notoriously hard to track down. DBD::Pg is popular;
> it would be good to know exactly why this happened so that others could
> proactively adjust their code accordingly.
>
> Also, I recommend my Unicode/UTF-8 talk on this topic, particularly the “use
> utf8” section starting at about 9m30s and again around 20m20s:
> https://www.youtube.com/watch?v=yH5IyYyvWHU
>
> -FG
>
>
> > On Dec 18, 2024, at 12:12 AM, Dan Book <gri...@gmail.com> wrote:
> >
> > Indeed, how strings work has not changed, but DBD::Pg's interpretation of
> > your strings probably did; the new behavior is more "correct" and now that
> > you are sending it decoded Unicode characters you may avoid other
> > mysterious issues. (Note that DBI itself does not handle strings, it just
> > provides the interface, DBD::Pg defines how strings are sent to and from
> > the database)
> >
> > -Dan
> >
> > On Tue, Dec 17, 2024 at 11:09 PM Shaomei Liu <sliu.newjer...@gmail.com>
> > wrote:
> > Dear Dan, Mark, Felipe, Alexander,
> > Thank you all for your valuable feedback!
> > as I replied Dan yesterday, this is my first time to ask for support from a
> > mailing list. I was very surprised and happy to get answers so quickly!
> > I added "use utf8;" as suggested by Dan and it worked for my test program
> > shown in the email, but not for project.
> > then I tried decode as suggested by Dan and it worked for both test program
> > and project. so issue solved for me!!!
> > perl version is 5.26.3 and 5.16.3 on EL8 and EL7 respectively.
> > DBI version is 1.641 and 1.627 on EL8 and EL7 respectively.
> >
> > here is the test program with decode. I also printed length. I thought it
> > is a perl thing. but the length is the same on EL8 and EL7. so not sure it
> > is perl or DBI change causing the issue.
> > xxx.com> cat testutf_decode.pl
> > #!/usr/bin/perl
> > use strict;
> > use warnings;
> > use DBI;
> > use Encode 'decode';
> > print "DBI version: $DBI::VERSION\n";
> >
> > my $db = "debugutf";
> > my $host = "db";
> > my $user = "postgres";
> > my $pass = "";
> > my $dbh = DBI->connect("DBI:Pg:dbname=$db;host=$host",$user,$pass);
> > my $sql = 'INSERT INTO table1 (title) VALUES (?)';
> > my $query = $dbh->prepare($sql);
> > my $bytes = '“';
> > my $chars = decode('UTF-8', $bytes);
> > print "$bytes contains ".length($bytes)." characters\n";
> > print "after decode $bytes contains ".length($chars)." characters\n";
> > #my @values = ($bytes); #=======>with this line, Database shows “ on EL7 but â\u0080\u009C on EL8
> > my @values = ($chars); #======>Database shows “ on both EL8 and EL7, so decode fixed the issue
> > $query->execute(@values);
> >
> > xxx.com> ./testutf_decode.pl #running on EL8
> > DBI version: 1.641
> > “ contains 3 characters
> > after decode “ contains 1 characters
> >
> > xxx.com> ./testutf_decode.pl #running on EL7
> > DBI version: 1.627
> > “ contains 3 characters
> > after decode “ contains 1 characters
> >
> > Thank you!!
> > Shirley
> >
> > On Tue, Dec 17, 2024 at 3:30 PM Alexander Foken via dbi-users
> > <dbi-users@perl.org> wrote:
> > Hi,
> > DBD::ODBC has several tests related to Unicode handling
> > (40UnicodeRoundTrip.t, 41Unicode.t, 45_unicode_varchar.t), they should also
> > work with other DBDs. They should tell you if your problem is between Perl
> > and Postgres or if it is simply in the encoding of your terminal.
> > Alexander
> > On 17.12.2024 13:31, Felipe Gasper via dbi-users wrote:
> >> Respectfully to Dan & others, I don’t advocate adding “use utf8” to
> >> existing code without a clear understanding of where your program’s decode
> >> & encode points are.
> >>
> >> Check to see what DBD::Pg actually writes to the database. If it suddenly
> >> started encoding, that’s a breaking change that either was documented or
> >> should be reported upstream.
> >>
> >>> On Dec 16, 2024, at 17:13, Shaomei Liu <sliu.newjer...@gmail.com> wrote:
> >>>
> >>> Hello,
> >>> very happy to find this mailing list as it is my last resort!!
> >>> I have a project which uses DBI to write to postgres DB.
> >>> after upgrading from RHEL7 to RHEL8, the utf-8 character is not displayed
> >>> properly in the DB. DB has correct utf-8 encoding set.
> >>> for example, left double quotation mark “ is displayed as
> >>> â\u0080\u009C.
> >>> You can use this link to check hex utf-8 bytes
> >>> https://www.cogsci.ed.ac.uk/~richard/utf-8.cgi?input=%E2%80%9C&mode=char
> >>>
> >>> below is the file testutf.pl which writes left double quotation mark “
> >>> to the database. it also shows the query results from psql for both EL8
> >>> and EL7.
> >>>
> >>> ==========file testutf.pl==========
> >>> #!/usr/bin/perl
> >>> use strict;
> >>> use warnings;
> >>> use DBI;
> >>> print "DBI version: $DBI::VERSION\n";
> >>>
> >>> my $db = "debugutf";
> >>> my $host = "db";
> >>> my $user = "postgres";
> >>> my $pass = "";
> >>> my $dbh = DBI->connect("DBI:Pg:dbname=$db;host=$host",$user,$pass);
> >>> my $sql = 'INSERT INTO table1 (title) VALUES (?)';
> >>> my $query = $dbh->prepare($sql);
> >>> my @values = ('“');
> >>> $query->execute(@values);
> >>> ===================================
> >>>
> >>> ==============on RHEL8
> >>> #execute testutf.pl which wrote “ to database on RHEL8
> >>> text.tac1.dev.bia-boeing.com> ./testutf.pl
> >>> DBI version: 1.641
> >>>
> >>> #from psql
> >>> debugutf=# select * from table1;
> >>>      title
> >>> ---------------
> >>>  â\u0080\u009C =========>unexpected
> >>> (1 row)
> >>>
> >>>
> >>> ==============on RHEL7
> >>> #execute testutf.pl which wrote “ to database on RHEL7
> >>> text.tac1.dev.bia-boeing.com> ./testutf.pl
> >>> DBI version: 1.627
> >>>
> >>> #from psql
> >>> debugutf=# select * from table1;
> >>>      title
> >>> ---------------
> >>>  “ ============>expected
> >>> (1 row)
> >>>
> >>> Any feedback is appreciated.
> >>> thank you
> >>> Shirley
> > --
> > Alexander Foken
> > mailto:alexan...@foken.de
>
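
The byte-versus-character distinction discussed in this thread can be sketched without a database. This is a minimal illustration using only the core Encode module; hex_dump is a helper made up for this example. The last step encodes the still-undecoded bytes a second time, which produces the characters â, U+0080, U+009C that psql showed on EL8. That DBD::Pg does this internally on EL8 is an assumption that fits the symptoms; the thread does not confirm the cause.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode encode);

# The three UTF-8 bytes of U+201C (left double quotation mark), written as
# escapes so the result does not depend on the source file's encoding or on
# whether "use utf8" is in effect.
my $bytes = "\xE2\x80\x9C";

# Decoding turns the three bytes into one character.
my $chars = decode('UTF-8', $bytes);

# Hypothetical helper: a string's code points as space-separated hex.
sub hex_dump {
    my ($string) = @_;
    return join ' ', map { sprintf '%02X', ord($_) } split //, $string;
}

# Encoding the decoded string gives back the original three bytes.
my $encoded_once = encode('UTF-8', $chars);

# Encoding the undecoded bytes treats each byte as a separate character
# (U+00E2, U+0080, U+009C) and encodes each of them again.
my $encoded_twice = encode('UTF-8', $bytes);

printf "%-14s length %d: %s\n", 'bytes',         length($bytes),         hex_dump($bytes);
printf "%-14s length %d: %s\n", 'chars',         length($chars),         hex_dump($chars);
printf "%-14s length %d: %s\n", 'encoded once',  length($encoded_once),  hex_dump($encoded_once);
printf "%-14s length %d: %s\n", 'encoded twice', length($encoded_twice), hex_dump($encoded_twice);
```

The lengths 3 and 1 match the output of testutf_decode.pl on both EL8 and EL7, which is why length alone could not tell the two systems apart. The difference only shows up when a layer that expects characters receives undecoded bytes: the "encoded twice" line is six bytes, C3 A2 C2 80 C2 9C, which is the UTF-8 form of â followed by U+0080 and U+009C.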