Re: Add Unicode Support to the DBI

2011-11-09 Thread H.Merijn Brand
On Wed, 09 Nov 2011 19:41:33 +, "Martin J. Evans" wrote: > You're going to have a lot of problems with this test code and DBD::Unify > as we previously discovered that DBD::Unify does not decode the data > coming back from the database itself but it can be decoded by any Perl > script using

Re: Add Unicode Support to the DBI

2011-11-09 Thread H.Merijn Brand
On Wed, 09 Nov 2011 19:41:33 +, "Martin J. Evans" wrote: tl;dr; > On 09/11/2011 15:49, H.Merijn Brand wrote: > > On Tue, 08 Nov 2011 21:12:13 +, "Martin J. Evans" > > wrote: > > > >> I've just checked in unicode_test.pl to DBI's subversion trunk in /ex dir. > >> > >> It won't run right

Re: Add Unicode Support to the DBI

2011-11-09 Thread Martin J. Evans
On 09/11/2011 15:49, H.Merijn Brand wrote: On Tue, 08 Nov 2011 21:12:13 +, "Martin J. Evans" wrote: I've just checked in unicode_test.pl to DBI's subversion trunk in /ex dir. It won't run right now without changing the do_connect sub as you have to specify how to connect to the DB. Also,

Re: Add Unicode Support to the DBI

2011-11-09 Thread H.Merijn Brand
On Wed, 9 Nov 2011 16:23:53 +, Tim Bunce wrote: > On Wed, Nov 09, 2011 at 04:50:29PM +0100, H.Merijn Brand wrote: > > On Tue, 08 Nov 2011 21:12:13 +, "Martin J. Evans" > > wrote: > > > > > I've just checked in unicode_test.pl to DBI's subversion trunk in /ex dir. > > > > So now attache

Re: Add Unicode Support to the DBI

2011-11-09 Thread Tim Bunce
On Wed, Nov 09, 2011 at 04:50:29PM +0100, H.Merijn Brand wrote: > On Tue, 08 Nov 2011 21:12:13 +, "Martin J. Evans" > wrote: > > > I've just checked in unicode_test.pl to DBI's subversion trunk in /ex dir. > > So now attached Any chance you could rework your changes into the (recently updat

Re: Add Unicode Support to the DBI

2011-11-09 Thread H.Merijn Brand
On Tue, 08 Nov 2011 21:12:13 +, "Martin J. Evans" wrote: > I've just checked in unicode_test.pl to DBI's subversion trunk in /ex dir. So now attached -- H.Merijn Brand http://tux.nl Perl Monger http://amsterdam.pm.org/ using 5.00307 through 5.14 and porting perl5.15.x on HP-UX 10.20

Re: Add Unicode Support to the DBI

2011-11-09 Thread H.Merijn Brand
On Tue, 08 Nov 2011 21:12:13 +, "Martin J. Evans" wrote: > I've just checked in unicode_test.pl to DBI's subversion trunk in /ex dir. > > It won't run right now without changing the do_connect sub as you have > to specify how to connect to the DB. > Also, there is a DBD specific section at

Re: Add Unicode Support to the DBI

2011-11-08 Thread Martin J. Evans
On 08/11/2011 17:53, David E. Wheeler wrote: On Nov 8, 2011, at 5:16 AM, Tim Bunce wrote: 1. Focus initially on categorising the capabilities of the databases. Specifically separating those that understand character encodings at one or more of column, table, schema, database level.

Re: Add Unicode Support to the DBI

2011-11-08 Thread Martin J. Evans
On 08/11/2011 13:16, Tim Bunce wrote: On Mon, Nov 07, 2011 at 01:37:38PM +, Martin J. Evans wrote: I didn't think I was going to make LPW but it seems I will now - although it has cost me big time leaving it until the last minute. All your beers at LPW are on me! http://www.martin-evans.

Re: Add Unicode Support to the DBI

2011-11-08 Thread David E. Wheeler
On Nov 8, 2011, at 5:16 AM, Tim Bunce wrote: > 1. Focus initially on categorising the capabilities of the databases. > Specifically separating those that understand character encodings > at one or more of column, table, schema, database level. > Answer the questions: > what "Unicod

Re: Add Unicode Support to the DBI

2011-11-08 Thread Tim Bunce
On Tue, Nov 08, 2011 at 02:45:39PM +, Martin J. Evans wrote: > On 08/11/11 13:16, Tim Bunce wrote: > >On Mon, Nov 07, 2011 at 01:37:38PM +, Martin J. Evans wrote: > > >2. Try to make a data-driven common test script. > > There is already one attached to the bottom of the post and referred

Re: Add Unicode Support to the DBI

2011-11-08 Thread Martin J. Evans
On 08/11/11 13:16, Tim Bunce wrote: On Mon, Nov 07, 2011 at 01:37:38PM +, Martin J. Evans wrote: I didn't think I was going to make LPW but it seems I will now - although it has cost me big time leaving it until the last minute. All your beers at LPW are on me! http://www.martin-evans.

Re: Add Unicode Support to the DBI

2011-11-08 Thread H.Merijn Brand
On Tue, 8 Nov 2011 13:16:17 +, Tim Bunce wrote: > On Mon, Nov 07, 2011 at 01:37:38PM +, Martin J. Evans wrote: > > > > > > I didn't think I was going to make LPW but it seems I will now - > > > although it has cost me big time leaving it until the last minute. > > All your beers at LPW

Re: Add Unicode Support to the DBI

2011-11-08 Thread Tim Bunce
On Mon, Nov 07, 2011 at 01:37:38PM +, Martin J. Evans wrote: > > > >I didn't think I was going to make LPW but it seems I will now - although it > >has cost me big time leaving it until the last minute. All your beers at LPW are on me! > http://www.martin-evans.me.uk/node/121 Great work Mar

Re: Add Unicode Support to the DBI

2011-11-07 Thread Martin J. Evans
On 04/11/11 08:39, Martin J. Evans wrote: On 03/11/11 23:25, David E. Wheeler wrote: On Oct 7, 2011, at 5:06 PM, David E. Wheeler wrote: Perhaps we could carve out some time at LPW to sit together and try to progress this. That would be awesome you guys! So gents, do you plan to do this a

Re: Add Unicode Support to the DBI

2011-11-06 Thread Martin J. Evans
On 05/10/2011 00:06, Jonathan Leffler wrote: On Tue, Oct 4, 2011 at 15:24, Martin J. Evans wrote: On 04/10/2011 22:38, Tim Bunce wrote: I've not had time to devote to this thread. Sorry. I'd be grateful if someone could post a summary of it if/when it approaches some kind of consensus. I d

Re: Add Unicode Support to the DBI

2011-11-04 Thread David E. Wheeler
On Nov 4, 2011, at 10:33 AM, Martin J. Evans wrote: >> Did you ever get any data from DBD::SQLite folks? > > Yes. I found a bug in the process and it was fixed but I have a working > SQLite example. Oh, great. > I'm only really missing DB2 but I have contacts for that on #dbix-class who > I'v

Re: Add Unicode Support to the DBI

2011-11-04 Thread Martin J. Evans
On 04/11/11 16:39, David E. Wheeler wrote: On Nov 4, 2011, at 1:39 AM, Martin J. Evans wrote: Sorry David, I've been snowed under. I will try very hard to publish the research I found this weekend. Awesome, thanks. Did you ever get any data from DBD::SQLite folks? Yes. I found a bug in th

Re: Add Unicode Support to the DBI

2011-11-04 Thread David E. Wheeler
On Nov 4, 2011, at 1:39 AM, Martin J. Evans wrote: > Sorry David, I've been snowed under. I will try very hard to publish the > research I found this weekend. Awesome, thanks. Did you ever get any data from DBD::SQLite folks? > I didn't think I was going to make LPW but it seems I will now - a

Re: Add Unicode Support to the DBI

2011-11-04 Thread Martin J. Evans
On 03/11/11 23:25, David E. Wheeler wrote: On Oct 7, 2011, at 5:06 PM, David E. Wheeler wrote: Perhaps we could carve out some time at LPW to sit together and try to progress this. That would be awesome you guys! So gents, do you plan to do this a bit? Martin, do you have the data you wante

Re: Add Unicode Support to the DBI

2011-11-03 Thread David E. Wheeler
On Oct 7, 2011, at 5:06 PM, David E. Wheeler wrote: >> Perhaps we could carve out some time at LPW to sit together and try to >> progress this. > > That would be awesome you guys! So gents, do you plan to do this a bit? Martin, do you have the data you wanted to collect on this? Thanks, David

Re: Add Unicode Support to the DBI

2011-10-13 Thread David E. Wheeler
On Oct 13, 2011, at 6:03 AM, Greg Sabino Mullane wrote: >> I think what I haven't said is that we should just use the same >> names that Perl I/O uses. Er, well, for the :raw and :utf8 >> varieties I was, anyway. Perhaps we should adopt it wholesale, >> so you'd use ":encoding(UTF-8)" instead o

Re: Add Unicode Support to the DBI

2011-10-13 Thread Greg Sabino Mullane
David E. Wheeler wrote: > I think what I haven't said is that we should just use the same > names that Perl I/O uses. Er, well, for the :raw and :utf8 > varieties I was, anyway. Perhaps we should adopt it wholesale, > so you'd use ":encoding(

Re: Add Unicode Support to the DBI

2011-10-07 Thread David E. Wheeler
On Oct 7, 2011, at 1:47 AM, Tim Bunce wrote: > Perhaps we could carve out some time at LPW to sit together and try to > progress this. That would be awesome you guys! D

Re: Add Unicode Support to the DBI

2011-10-07 Thread Tim Bunce
On Tue, Oct 04, 2011 at 11:24:51PM +0100, Martin J. Evans wrote: > On 04/10/2011 22:38, Tim Bunce wrote: > >I've not had time to devote to this thread. Sorry. > > > >I'd be grateful if someone could post a summary of it if/when it > >approaches some kind of consensus. > I don't think there is a "k

Re: Add Unicode Support to the DBI

2011-10-06 Thread David E. Wheeler
On Oct 6, 2011, at 8:56 AM, Greg Sabino Mullane wrote: >> I still prefer an encoding attribute that you can set as follows: > >> * undef: Default; same as your A. >> * ':utf8': Same as your B: >> * ':raw': Same as your C >> * $encoding: Encode/decode to/from $encoding > > I like that. Although t

Re: Add Unicode Support to the DBI

2011-10-06 Thread Greg Sabino Mullane
> Uh, say what? Just as I need to > > binmode STDOUT, ':utf8'; > Before sending stuff to STDOUT (that is, turn off the flag), I would > expect DBDs to do the same before sending data to the database. > Unless, of course, it "just works". I ca

Re: Add Unicode Support to the DBI

2011-10-04 Thread H.Merijn Brand
On Tue, 04 Oct 2011 23:24:51 +0100, "Martin J. Evans" wrote: > Some might disagree but DB2 is a main > one I no longer have access to (please contact me if you use DBD::DB2 > and are prepared to spare half an hour or so to modify examples I have > which verify unicode support). Of course, if y

Re: Add Unicode Support to the DBI

2011-10-04 Thread Jonathan Leffler
On Tue, Oct 4, 2011 at 15:24, Martin J. Evans wrote: > On 04/10/2011 22:38, Tim Bunce wrote: > >> I've not had time to devote to this thread. Sorry. >> >> I'd be grateful if someone could post a summary of it if/when it >> approaches some kind of consensus. >> >> I don't think there is a "kind of

Re: Add Unicode Support to the DBI

2011-10-04 Thread Martin J. Evans
On 04/10/2011 22:38, Tim Bunce wrote: I've not had time to devote to this thread. Sorry. I'd be grateful if someone could post a summary of it if/when it approaches some kind of consensus. Thanks. Tim. I don't think there is a "kind of consensus" right now (although some useful discussion whi

Re: Add Unicode Support to the DBI

2011-10-04 Thread Tim Bunce
I've not had time to devote to this thread. Sorry. I'd be grateful if someone could post a summary of it if/when it approaches some kind of consensus. Thanks. Tim.

Re: Add Unicode Support to the DBI

2011-10-03 Thread David E . Wheeler
On Oct 2, 2011, at 8:49 PM, Greg Sabino Mullane wrote: > DEW> I assume you also mean to say that data sent *to* the database > DEW> has the flag turned off, yes? > > No: that is undefined. I don't see it as the DBD's job to massage data > going into the database. Or at least, I cannot imagine a

Re: Add Unicode Support to the DBI

2011-10-02 Thread Greg Sabino Mullane
From: "David E. Wheeler" GSM>> * $h->{unicode_flag} GSM>> If this is set on, data returned from the database is assumed to be UTF-8, and GSM>> the utf8 flag will be set. DEW> I assume you also mean to say that data sent *to* the database DE

Re: Add Unicode Support to the DBI

2011-09-22 Thread David E. Wheeler
On Sep 22, 2011, at 11:57 AM, Martin J. Evans wrote: > ok except what the oracle client libraries accept does not match with Encode > accepted strings so someone would have to come up with some sort of mapping > between the two. Yes. That's one of the consequences of providing a single interfac

Re: Add Unicode Support to the DBI

2011-09-22 Thread Martin J. Evans
On 22/09/2011 19:28, David E. Wheeler wrote: On Sep 22, 2011, at 11:14 AM, Martin J. Evans wrote: Right. There needs to be a way to tell the DBI what encoding the server sends and expects to be sent. If it's not UTF-8, then the utf8_flag option is kind of useless. I think this was my point a

Re: Add Unicode Support to the DBI

2011-09-22 Thread David E. Wheeler
On Sep 22, 2011, at 11:14 AM, Martin J. Evans wrote: >> Right. There needs to be a way to tell the DBI what encoding the server >> sends and expects to be sent. If it's not UTF-8, then the utf8_flag option >> is kind of useless. > I think this was my point above, i.e., why utf8? databases accept

Re: Add Unicode Support to the DBI

2011-09-22 Thread Martin J. Evans
On 22/09/2011 17:36, David E. Wheeler wrote: On Sep 22, 2011, at 2:26 AM, Martin J. Evans wrote: There is more than one way to encode unicode - not everyone uses UTF-8; although some encodings don't support all of unicode. Yeah, maybe should be utf8_flag instead. see below. unicode is not e

Re: Add Unicode Support to the DBI

2011-09-22 Thread David E . Wheeler
On Sep 22, 2011, at 2:26 AM, Martin J. Evans wrote: > There is more than one way to encode unicode - not everyone uses UTF-8; > although some encodings don't support all of unicode. Yeah, maybe should be utf8_flag instead. > unicode is not encoded as UTF-8 in ODBC using the wide APIs. > > Usin

Re: Add Unicode Support to the DBI

2011-09-22 Thread Martin J. Evans
David, I forgot to answer your post first and ended up putting most of my comments in a reply to Greg's posting - sorry, it was a long night last night. Some further comments below: On 21/09/11 19:44, David E. Wheeler wrote: On Sep 10, 2011, at 3:08 AM, Martin J. Evans wrote: I'm not sure an

Re: Add Unicode Support to the DBI

2011-09-22 Thread Martin J. Evans
On 21/09/11 21:52, Greg Sabino Mullane wrote: ... And maybe that's the default. But I should be able to tell it to be pedantic when the data is known to be bad (see, for example data from an SQL_ASCII-encoded PostgreSQL database). ... DBD:

Re: Add Unicode Support to the DBI

2011-09-21 Thread David E. Wheeler
On Sep 21, 2011, at 1:52 PM, Greg Sabino Mullane wrote: > Since nobody has actually defined a specific interface yet, let me throw out a > straw man. It may look familiar :) > > === > * $h->{unicode_flag} > > If this is set on, data returned from the database is assumed to be UTF-8, > and > th

Re: Add Unicode Support to the DBI

2011-09-21 Thread Greg Sabino Mullane
... > And maybe that's the default. But I should be able to tell it to be pedantic > when the > data is known to be bad (see, for example data from an SQL_ASCII-encoded > PostgreSQL database). ... > DBD::Pg's approach is currently broken. Gre

Re: Add Unicode Support to the DBI

2011-09-21 Thread David E. Wheeler
On Sep 10, 2011, at 3:08 AM, Martin J. Evans wrote: > I'm not sure any change is required to DBI to support unicode. As far as I'm > aware unicode already works with DBI if the DBDs do the right thing. Right, but the problem is that, IME, none of them do "the right thing." As I said, I've submi

Re: Add Unicode Support to the DBI

2011-09-21 Thread David E. Wheeler
On Sep 10, 2011, at 7:44 AM, Lyle wrote: >> Right now 5.8 is the required minimum for DBI: should we consider bumping >> this? > > I know a lot of servers in the wild are still running RHEL5 and its > variants, which are stuck on 5.8 in the standard package management. The new > RHEL6 only ha

Re: Add Unicode Support to the DBI

2011-09-21 Thread David E. Wheeler
DBI peeps, Sorry for the delayed response, I've been busy, looking to reply to this thread now. On Sep 9, 2011, at 8:06 PM, Greg Sabino Mullane wrote: > One thing I see bandied about a lot is that Perl 5.14 is highly preferred. > However, it's not clear exactly what the gains are and how bad 5

Re: Add Unicode Support to the DBI

2011-09-10 Thread Lyle
On 10/09/2011 04:06, Greg Sabino Mullane wrote: Right now 5.8 is the required minimum for DBI: should we consider bumping this? I know a lot of servers in the wild are still running RHEL5 and its variants, which are stuck on 5.8 in the standard package management. The new RHEL6 only has 5.10

Re: Add Unicode Support to the DBI

2011-09-10 Thread Martin J. Evans
On 10/09/2011 03:52, David E. Wheeler wrote: DBIers, tl;dr: I think it's time to add proper Unicode support to the DBI. What do you think it should look like? I'm not sure any change is required to DBI to support unicode. As far as I'm aware unicode already works with DBI if the DBDs do the ri

Re: Add Unicode Support to the DBI

2011-09-10 Thread H.Merijn Brand
On Sat, 10 Sep 2011 03:06:49 -, "Greg Sabino Mullane" wrote: > One thing I see bandied about a lot is that Perl 5.14 is highly preferred. > However, it's not clear exactly what the gains are and how bad 5.12 is > compared to 5.14, how bad 5.10 is, how bad 5.8 is, etc. Right now 5.8 is > th

Re: Add Unicode Support to the DBI

2011-09-09 Thread Darren Duncan
Another wrinkle to this is the fact that identifiers in the database, such as column names and such, are also character data, and have an encoding. So for any DBMSs that support Unicode identifiers (as I believe a complete one should, even if they have to be quoted in SQL) or identifiers with t

Re: Add Unicode Support to the DBI

2011-09-09 Thread Greg Sabino Mullane
One thing I see bandied about a lot is that Perl 5.14 is highly preferred. However, it's not clear exactly what the gains are and how bad 5.12 is compared to 5.14, how bad 5.10 is, how bad 5.8 is, etc. Right now 5.8 is the required minimum fo

Add Unicode Support to the DBI

2011-09-09 Thread David E. Wheeler
DBIers, tl;dr: I think it's time to add proper Unicode support to the DBI. What do you think it should look like? Background I've brought this up a time or two in the past, but a number of things have happened lately to make me think that it was again time: First, on the DBD::Pg list, we've b