Re: [jira] Updated: (DERBY-1862) Simple hash improves performance

Andreas Korneliussen Tue, 19 Sep 2006 04:16:37 -0700

Daniel John Debrunner wrote:
> Andreas Korneliussen wrote:
> 
>> Øystein Grøvlen wrote:
>>
>>> Andreas Korneliussen (JIRA) wrote:
>>>
>>>
>>>> String.toUpperCase(..) with english locale, should return a string
>>>> with the same number of characters, and it should therefore be valid
>>>> to do a check of number of characters before doing any conversions.
>>> Is it correct to always use English locale in this case?  Ref the
>>> reference guide on SQL identifiers:
>>>
>>>      An ordinary identifier must begin with a letter and contain
>>>      only letters, underscore characters (_), and digits. The
>>>      permitted letters and digits include all Unicode letters and
>>>      digits, but Derby does not attempt to ensure that the
>>>      characters in identifiers are valid in the database's
>>>      locale.
>>>
>>> Should not it be possible to match column names in any locale?
>>>
> 
> No, see below.
> 
>> Your question is a valid question to ask about this method, however my
>> intention was to make the method keep its current behavior. The patch
>> simply preserves the current behaviour (which is to use english locale).
>> So any sets of strings s1 and s2 should make the method return the same
>> values as before the patch. If this is not the case, the patch is not as
>> intended.
>>
>> When looking deeper into the String class, my understanding is that the
>> only Locale which has different semantics than other Locales when it
>> comes to toUpperCase(Locale..), is Turkish, so maybe Derby does not work
>> correctly in Turkish locale.
> 
> I think the changes were made to use a single locale (English) for the
> SQL language so that Derby would work in Turkish. Having the name
> matching in SQL be dependent on the locale of the client or engine would
> mean that the potential exists for a SQL statement from a single
> application to have different meanings in different locales. That is not
> the expected behaviour when working against a programming language.
> 
> When the SQL parser upper cased items in the engine's locale an
> application using 'insert' would fail in Turkish, as it does not upper
> case to "INSERT".
> 
>> I also wondered why Derby has its own SQLIgnoreCase method, instead of
>> simply using String.equalsIgnoreCase(). The Derby implementation is very
>> inefficient compared to the String.equalsIgnoreCase() method, since you
>> risk creating two new string objects before doing the comparison.
> 
> I think because String.equalsIgnoreCase() is dependent on the current
> locale.
>


String.toUpperCase() is locale dependent; however, I am not sure that
String.equalsIgnoreCase() is locale dependent (it does not seem so from
reading the code and the javadoc).
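
To make the difference concrete, here is a minimal sketch. The class and
method names are made up for illustration and are not Derby code, and it
assumes a JDK where Locale.forLanguageTag is available (Java 7 or later).
It upper-cases "insert" in the English and Turkish locales, then calls
equalsIgnoreCase() while the default locale is temporarily set to Turkish.

```java
import java.util.Locale;

// Hypothetical demo class, not part of Derby.
final class LocaleCaseDemo {

    static final Locale TURKISH = Locale.forLanguageTag("tr-TR");

    // Upper-case with a fixed English locale.
    static String upperEnglish(String s) {
        return s.toUpperCase(Locale.ENGLISH);
    }

    // Upper-case with the Turkish locale: 'i' maps to U+0130
    // (capital I with dot above), not to the ASCII 'I'.
    static String upperTurkish(String s) {
        return s.toUpperCase(TURKISH);
    }

    // Call equalsIgnoreCase() while the JVM default locale is Turkish,
    // restoring the previous default afterwards.
    static boolean equalsIgnoreCaseUnderTurkishDefault(String a, String b) {
        Locale saved = Locale.getDefault();
        Locale.setDefault(TURKISH);
        try {
            return a.equalsIgnoreCase(b);
        } finally {
            Locale.setDefault(saved);
        }
    }

    public static void main(String[] args) {
        System.out.println(upperEnglish("insert").equals("INSERT"));
        System.out.println(upperTurkish("insert").equals("INSERT"));
        System.out.println(equalsIgnoreCaseUnderTurkishDefault("insert", "INSERT"));
    }
}
```

The first and third comparisons should hold and the second should not,
which is consistent with equalsIgnoreCase() comparing character by
character without consulting the locale.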

I did find an issue with the German double s: ß.

"ß".toUpperCase() returns "SS".

However "ß".equalsIgnoreCase("SS") returns false.

So basically, "ß".toUpperCase().equalsIgnoreCase("ß") returns false.

The Derby method SQLUtil.SQLIgnoreCase("ß", "SS") returns true. However,
the patch I attached will make it return false, so the patch is not as
intended.
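
The mismatch can be reproduced with a small sketch. It assumes the
existing comparison upper-cases both arguments in the English locale and
then compares them, as described earlier in this thread. The two methods
below are hypothetical stand-ins, not the actual Derby source or the
attached patch. The character ß is written as \u00df to avoid source
encoding problems.

```java
import java.util.Locale;

// Hypothetical stand-ins for the comparison before and after the patch.
final class SharpSDemo {

    // Assumed current behaviour: upper-case both sides (English locale),
    // then compare the results.
    static boolean upperCaseEquals(String a, String b) {
        return a.toUpperCase(Locale.ENGLISH)
                .equals(b.toUpperCase(Locale.ENGLISH));
    }

    // Assumed patched behaviour: reject early when the lengths differ.
    // This is only safe if upper-casing never changes the length.
    static boolean lengthCheckedEquals(String a, String b) {
        if (a.length() != b.length()) {
            return false;
        }
        return upperCaseEquals(a, b);
    }

    public static void main(String[] args) {
        String sharpS = "\u00df";
        System.out.println(sharpS.toUpperCase(Locale.ENGLISH)); // one char becomes two
        System.out.println(upperCaseEquals(sharpS, "SS"));
        System.out.println(lengthCheckedEquals(sharpS, "SS"));
        System.out.println(sharpS.equalsIgnoreCase("SS"));
    }
}
```

Because "\u00df" upper-cases to the two-character string "SS", the
length shortcut changes the result for this pair, which is exactly the
behaviour difference noted above.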

If my column name is "classnames", should it be accessible by using the
string "claßnames"?

Regards
Andreas
