>>>>>>>>>>>> Monty Taylor wrote (2008-09-30 15:14:30):
> So rather than build our own system of dealing with all of this - I'd
> love to see us be able to use some of what's already there. Better than
> the C version of this, the C++ one seems to understand you might want to
> use more than just one global locale. Now, I'm not sure how charsets
> enter in to this setup... but the ability is there to deal with
> collations, numbers, currency and dates. Any thoughts?

To be honest, I think a database should deal with these things as
little as possible. Currency is just a number, and the fact that the
number is a currency should be dealt with by the application. The same
goes for dates, which should be ISO 8601 compliant in the db; how to
display the date in some locale should be dealt with by the
application, and so on. (I even think that timezones should be ripped
out of all databases and dealt with by applications. Timezone handling
in databases causes nothing but complexity and confusion. A good
database design would be independent of locales and timezones, and
thus not SQL compliant...)
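
To illustrate the split I have in mind, here is a minimal sketch in
Python. The stored values, the function names and the separator
parameters are all made up for the example; it only shows the db
holding an ISO 8601 string and a plain number, with every
locale-specific decision made by the application.

```python
from datetime import datetime
from decimal import Decimal

# What the database would hold: locale-independent values only.
# (Both values are hypothetical sample data.)
STORED_TIMESTAMP = "2008-09-30T15:14:30+00:00"  # ISO 8601, UTC
STORED_AMOUNT = Decimal("1234.50")              # a plain number


def display_timestamp(iso_text, pattern):
    """Application side: parse the ISO 8601 text and format it."""
    return datetime.fromisoformat(iso_text).strftime(pattern)


def display_amount(amount, symbol, decimal_sep, group_sep):
    """Application side: attach a currency symbol and separators."""
    plain = f"{amount:,.2f}"  # "1,234.50" style, independent of locale
    table = str.maketrans({",": group_sep, ".": decimal_sep})
    return f"{symbol} {plain.translate(table)}"


if __name__ == "__main__":
    # Norwegian-style presentation
    print(display_timestamp(STORED_TIMESTAMP, "%d.%m.%Y %H:%M"))
    print(display_amount(STORED_AMOUNT, "kr", ",", " "))
    # US-style presentation of the very same stored values
    print(display_timestamp(STORED_TIMESTAMP, "%m/%d/%Y %H:%M"))
    print(display_amount(STORED_AMOUNT, "$", ".", ","))
```

The stored values never change; only the pattern and the separators
chosen by the application do.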

When it comes to character set encoding, I think there are three
important aspects to consider.

1) The on-disk encoding. Chinese and Japanese text will require less
space in UTF-16 than in UTF-8 (two bytes instead of three for most
characters). For western languages it's the other way round. But this
aspect is only relevant when the amount of text stored is large
relative to the available disk space.

2) The database native (or in-memory) encoding. I think space is not
as critical here. If one e.g. chooses ICU to handle collation, it
would be natural to choose UTF-16.

3) The interface encoding. This will depend on the interface chosen.
E.g. if the interface is JDBC, the strings are by definition in
UTF-16.

I think 2) should be chosen with respect to speed and the ability to
get clean code. For the moment UTF-16 seems to me to be a good
choice. 3) is, as mentioned, defined by the interface used, and if 2)
is UTF-16, then 1) should by default also be UTF-16, but with the
option to configure or plug in other encodings.
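
The space trade-off in 1) is easy to check. A small sketch in Python,
where the sample strings and the function name are just my own picks
for illustration:

```python
# Hypothetical sample strings, chosen only to illustrate the point.
SAMPLES = {
    "english": "database",
    "chinese": "数据库",
    "japanese": "日本語",
}


def encoded_sizes(text):
    """Return the number of bytes the text needs in each encoding."""
    return {
        "utf-8": len(text.encode("utf-8")),
        # "utf-16-le" avoids the 2-byte byte order mark that the plain
        # "utf-16" codec would prepend.
        "utf-16": len(text.encode("utf-16-le")),
    }


if __name__ == "__main__":
    for name, text in SAMPLES.items():
        print(name, encoded_sizes(text))
```

With these samples the ASCII word doubles in size in UTF-16, while
the CJK words shrink by a third. Characters outside the Basic
Multilingual Plane take four bytes in both encodings, so for those
there is no difference.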

-- 
Bernt Marius Johnsen, Staff Engineer
Database Technology Group, Sun Microsystems, Trondheim, Norway


_______________________________________________
Mailing list: https://launchpad.net/~drizzle-discuss
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~drizzle-discuss
More help   : https://help.launchpad.net/ListHelp
