On Mon, 2004-03-22 at 03:04, Tim Bunce wrote:
> On Sun, Mar 21, 2004 at 04:50:34PM -0800, Dean Arnold wrote:
> > >
> > > > If a list of charset behaviors for each DBD is needed,
> > > > I'd be happy to put one together, assuming the DBD authors
> > > > send me the details for each driver.
> > >
> > > That would be great.
> >
> > OK. Shall we start w/ DBD::Oracle ? ;^)
>
> You could, but that's very much a moving target at the moment.
>
> > And driver authors, feel free to forward to me (and/or this
> > list). I'll try to put together a little webpage with the info.
>
> I think it would help if you formulated a set of questions for driver
> authors (or anyone else) to answer. Especially as finding the right
> questions can be harder than finding the answers.
>
> Here are a few to get you started:

For Sybase ASE (and DBD::Sybase):

> - Does the database:
>   - have any concept of national character sets?

ASE has a concept of locales, with a mapping from the locale to a
character set.

>   - at what levels: database, table, field?

Server.

>   - url for list of character set names?
>   - does it support unicode?

Yes.

> - Does the database client API:
>   - provide access to character set information, and how?

Yes, in the connection properties.

>   - at what levels: database, table, field?

Server (i.e. connection).

>   - does it have a concept of a client character set?

Yes.

>   - how is the client charset determined (locale, env var etc)

Locale/env var (LC_ALL/LANG), but it can be overridden via connection
properties.

>   - does it perform charset recoding?

Yes, if possible.

> - Does the DBD driver:
>   - (repeat last set of questions)

DBD::Sybase will honor the current locale, as that is the default
behavior of Sybase OpenClient, and you can override the client charset
in the DBI DSN as needed.

> > Presumably,
> > just another bit of $sth metadata, e.g., $sth->{CHAR_SET}, to provide
> > the info. If the driver doesn't know, then it fills in with undef, and
> > the app is on its own. Otherwise, the app has enough info to make
> > the necessary conversion:
>
> You're presuming that all databases that support charsets will use
> the same set of names as Encode uses. I hope that is the case but
> it might not be. (Add that to your list of things to discover :)

ASE uses "iso_1", "cp850", "sjis", "eucjis", "eucgb", "euccns", "big5",
"utf8", "roman8", "roman9", "cp437", "gb18030", "eucksc" and a few
others that I've probably missed.

The charset names depend on the platform (i.e. Win32 has a different
set of charset names than, say, linux or VMS).

FWIW... :-)

Michael
-- 
Michael Peppler                              Data Migrations, Inc.
[EMAIL PROTECTED]                       http://www.peppler.org/
Sybase T-SQL/OpenClient/OpenServer/C/Perl developer available for short or
long term contract positions - http://www.peppler.org/resume.html
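
To illustrate the naming mismatch discussed above, here is a rough sketch of how an application might translate an ASE charset name into a name Encode understands before decoding column data. Everything in it is an assumption made for illustration: the mapping table is hand-built and covers only a few of the names listed above, the helpers `encode_name_for` and `decode_column` are hypothetical, and `$sth->{CHAR_SET}` is only the attribute proposed in this thread, not an existing DBI feature, so the charset name is passed in as a plain string.

```perl
use strict;
use warnings;
use Encode qw(decode find_encoding);

# ASSUMPTION: hand-built mapping from a few ASE charset names to Encode
# names. It is incomplete and should be checked against your own server.
my %ASE_TO_ENCODE = (
    iso_1  => 'iso-8859-1',
    cp850  => 'cp850',
    cp437  => 'cp437',
    utf8   => 'UTF-8',
    sjis   => 'shiftjis',
    eucjis => 'euc-jp',
    eucksc => 'euc-kr',
);

# Hypothetical helper: return the canonical Encode name for an ASE
# charset name, or undef if the name is unknown or unsupported.
sub encode_name_for {
    my ($ase_name) = @_;
    return undef unless defined $ase_name;
    my $name = $ASE_TO_ENCODE{ lc $ase_name };
    return undef unless defined $name;
    my $enc = find_encoding($name);
    return $enc ? $enc->name : undef;
}

# Hypothetical helper: decode raw bytes when the charset is known;
# otherwise hand the bytes back unchanged ("the app is on its own").
sub decode_column {
    my ($bytes, $ase_name) = @_;
    my $name = encode_name_for($ase_name);
    return defined $name ? decode($name, $bytes) : $bytes;
}

# "caf\xE9" is "cafe" with an acute accent, encoded as ISO 8859-1 bytes.
my $text = decode_column("caf\xE9", 'iso_1');
printf "%d characters\n", length $text;
```

Names missing from the table (for example "roman9") fall through to the undef case, which matches the proposal that the driver report undef when it does not know the charset.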