Hi Kathey,
Here is my understanding of how the disabled national string types worked:
1) A national string type used the collation ordering appropriate to the
locale of the database. That collation ordering, in turn, was specified
by the JDK and could not be overridden.
2) The collation ordering determined the meaning of <, =, and > for
national strings. For a given locale, the rules can be quite tricky. If
you're not familiar with a locale, you are likely to be surprised by the
visibly different strings which nevertheless turn out to be = to one
another.
3) The locale-sensitive meaning of <, =, and > affected the operation of
all orderings of national strings, including sorts, indexes, unions,
group-by's, like's, between's, and in's.
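
To make points 2) and 3) concrete, here is a minimal sketch in plain Java using java.text.Collator, the JDK class that supplies the locale's ordering. The class name CollationDemo and the helpers collationEquals and sortWith are hypothetical names made up for this illustration; they are not Derby code. The choice of Locale.US and of the PRIMARY strength is also an assumption, picked only because it shows the effect clearly; it is not a statement about which strength the national string types used.

```java
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

public class CollationDemo {

    // Hypothetical helper: true when the collator for this locale and
    // strength considers the two strings equal (compare() returns 0).
    static boolean collationEquals(String a, String b, Locale locale, int strength) {
        Collator collator = Collator.getInstance(locale);
        collator.setStrength(strength);
        return collator.compare(a, b) == 0;
    }

    // Hypothetical helper: returns a sorted copy, ordered by the collator
    // rather than by raw char values.
    static String[] sortWith(Collator collator, String... values) {
        String[] copy = values.clone();
        Arrays.sort(copy, collator);
        return copy;
    }

    public static void main(String[] args) {
        // Visibly different strings that are "=" under the collation.
        System.out.println("abc".equals("ABC"));
        System.out.println(collationEquals("abc", "ABC", Locale.US, Collator.PRIMARY));

        // The same ordering drives sorting: binary order puts "Banana"
        // before "apple", the collator does not.
        String[] binary = {"apple", "Banana", "cherry"};
        Arrays.sort(binary);
        System.out.println(Arrays.toString(binary));
        System.out.println(Arrays.toString(
            sortWith(Collator.getInstance(Locale.US), "apple", "Banana", "cherry")));
    }
}
```

Anything built on comparisons (an index, a GROUP BY, a BETWEEN) would inherit whichever of the two orderings is in effect, which is why the choice reaches so far.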
At one point I was keen on re-enabling the national string types. Now I
am leaning toward implementing the ANSI collation language. I think this
is more powerful. In particular, it lets you support more than one
language-sensitive ordering in the same database.
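
As a rough illustration of why more than one ordering can be useful, the sketch below compares the same pair of strings under two different JDK collators. It is plain Java, not ANSI collation syntax and not Derby code; the class name TwoCollationsDemo, the helper compareIn, and the choice of Swedish and US English are assumptions made for the example.

```java
import java.text.Collator;
import java.util.Locale;

public class TwoCollationsDemo {

    // Hypothetical helper: compares a and b using the JDK collator for the
    // locale named by a BCP 47 language tag such as "sv" or "en-US".
    static int compareIn(String languageTag, String a, String b) {
        Collator collator = Collator.getInstance(Locale.forLanguageTag(languageTag));
        return collator.compare(a, b);
    }

    public static void main(String[] args) {
        String aUmlaut = "\u00e4"; // the letter a with diaeresis

        // Swedish treats this letter as a separate letter sorted after "z";
        // English treats it as a variant of "a", so it sorts before "z".
        System.out.println("sv:    " + Integer.signum(compareIn("sv", aUmlaut, "z")));
        System.out.println("en-US: " + Integer.signum(compareIn("en-US", aUmlaut, "z")));
    }
}
```

With a single database-wide locale only one of these answers is available; per-column collations would let both coexist.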
You and your customer face a hard problem trying to migrate national
strings from Cloudscape 5.1.60 into Derby 10.1.3 or 10.2. I'm at a loss
how to do this in a way that preserves Cloudscape's performance.
Regards,
-Rick
Kathey Marsden wrote:
Bernt M. Johnsen wrote:
"aa" as one letter was removed from the Norwegian language in 1938 ("å"
had been optional since 1917). It is only used in names today and it is
true what Anders says about the phonebook (also about the foreign names
where "aa" is treated like two letters). I don't think it would be wise
to not let "a.*" match "Aasen" (which in modern writing would be Åsen).
Thank you so much Knut Anders and Bernt for the clarification on
"aa". I guess now I need a new example and need to understand how
locale-specific LIKE processing is functionally different from
regular LIKE behavior and when it is required.
The user I have been working with is actually migrating from
Cloudscape 5.1.60 National Character types and the goal was to get a
workaround to achieve the same behavior in Derby. The example came
from the doc:
http://publibfi.boulder.ibm.com/epubs/html/cloud51/doc/html/coredocs/sqlj105.htm#1178996
Clearly the Derby code still has the code path for the National Type
special processing.
In org.apache.derby.iapi.types.SQLChar we have a separate code path
for National Character types that passes the Collator.
How is this functionally different than LIKE processing for regular
character types? Can anyone think of another example where this
special processing might be needed?
Thanks
Kathey
Below is a SQLChar code snippet for reference.
public BooleanDataValue like(DataValueDescriptor pattern)
    throws StandardException
{
    Boolean likeResult;
    if (! isNationalString())
    {
        // note that we call getLength() because the length
        // of the char array may be different than the
        // length we should be using (i.e. getLength()).
        // see getCharArray() for more info
        char[] evalCharArray = getCharArray();
        char[] patternCharArray = ((SQLChar)pattern).getCharArray();
        likeResult = Like.like(evalCharArray,
                               getLength(),
                               patternCharArray,
                               pattern.getLength());
    }
    else
    {
        SQLChar patternSQLChar = (SQLChar) pattern;
        likeResult = Like.like(getIntArray(),
                               getIntLength(),
                               patternSQLChar.getIntArray(),
                               patternSQLChar.getIntLength(),
                               getLocaleFinder().getCollator());
    }