[
https://issues.apache.org/jira/browse/DERBY-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12478538
]
Daniel John Debrunner commented on DERBY-2336:
----------------------------------------------
Some of my thoughts in this area are biased by the existing implementation of
(not-exposed) national character types, which was originally done quickly and
in my opinion rather poorly. Typically in software engineering a rewrite is
an improvement on the original, so there is an opportunity here to make this
rewrite better rather than repeat the mistakes of the first implementation.
Some of the mistakes of the first are:
- using the context to obtain the locale (even commented as a HACK)
- pushing the locale ordering code into the base class (SQLChar) so that
non-locale based ordering has to deal with additional overhead
- basing the ordering on a locale and not a Collator
I think there is also plenty of opportunity to make incremental progress here.
Steps I could see happening are:
1) Create the new character data value classes that perform ordering based
upon a Collator. Have the collator be a field in the class and hard-code it
to some language for now (say Norwegian :-)
2) Make those classes the ones used when locale based ordering is required.
3) Write some tests that ensure the order does change when the collation
property is set on database creation
(can be converted into real tests later by using a Norwegian database)
4) Delete the old national character types and remove their overhead from
the UCS_BASIC ordering classes (SQLChar etc.)
5) Figure out how to set the real Collator object for a DataValueDescriptor
during runtime and recovery.
Of course step 5) doesn't have to be done last; it's independent of steps 1-4.
6) Write more tests for other languages.
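As a rough illustration of steps 1) and 2), and not a description of Derby's
actual classes, the sketch below keeps plain ordering in a base class and puts
Collator-based ordering in a subclass, so the base class carries no collation
overhead. The names BasicCharValue, CollatedCharValue and CollatedOrderingDemo
are hypothetical stand-ins; only java.text.Collator and java.util.Locale are
real APIs. The Norwegian collator is hard-coded as step 1) suggests, and
whether Norwegian rules are actually applied depends on the locale data
shipped with the JVM.

```java
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

// Hypothetical stand-in for a basic character value (SQLChar-like):
// ordering is String.compareTo, i.e. UTF-16 code unit order, no Collator.
class BasicCharValue implements Comparable<BasicCharValue> {
    protected final String value;

    BasicCharValue(String value) {
        this.value = value;
    }

    @Override
    public int compareTo(BasicCharValue other) {
        return value.compareTo(other.value);
    }

    @Override
    public String toString() {
        return value;
    }
}

// Hypothetical collation-aware value: ordering is delegated to a Collator
// held in a field, not derived from a Locale looked up through a context.
class CollatedCharValue extends BasicCharValue {
    // Assumption for the sketch: hard-coded to Norwegian for now (step 1).
    // Step 5 would replace this with the database's real Collator.
    private static final Collator HARD_CODED =
            Collator.getInstance(Locale.forLanguageTag("no-NO"));

    private final Collator collator = HARD_CODED;

    CollatedCharValue(String value) {
        super(value);
    }

    @Override
    public int compareTo(BasicCharValue other) {
        return collator.compare(value, other.value);
    }
}

class CollatedOrderingDemo {
    public static void main(String[] args) {
        BasicCharValue[] values = {
            new CollatedCharValue("B"),
            new CollatedCharValue("a"),
            new CollatedCharValue("c"),
        };
        // Sorting uses CollatedCharValue.compareTo, hence the Collator.
        Arrays.sort(values);
        System.out.println(Arrays.toString(values));
    }
}
```

With this split, deleting the old national character types (step 4) would
leave BasicCharValue untouched by any collation logic.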
I agree that some of the locale issues with data types are being confused here.
Mamta found a valid bug where conversion of a string to a date-time value is
not being handled correctly. That is a separate issue, but it's being confused
because the discussion so far has not really described the actual problems;
it has focused on getting the locale, based upon the old national character
types code.
The real problems here are:
1) How to get the correct Collator object for character comparisons
2) How to get the correct object to parse date-time values from Strings
1) is just locale based for this issue, but there is the chance to have a
framework that works with more than locale-based ordering, such as
case-insensitive ordering. Focusing on the locale increases the likelihood
that the solution will not be expandable to other types. E.g. if we duplicate
code for locale in the store, do we need to duplicate code again for
case-insensitive searches, and then again for another Collator style?
2) is easier because it doesn't need to worry about recovery and is more
closely related to the solution used for the Calendar object.
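To make the point about 1) more concrete, here is a minimal sketch, under my
own assumptions, of why basing the comparison on a Collator rather than a
locale avoids duplicated code: locale-based and case-insensitive ordering
become two differently configured Collator objects passed to the same
comparison routine. CollatorChoices and its methods are hypothetical names,
not existing Derby code; Collator.setStrength and the strength constants are
the standard java.text API.

```java
import java.text.Collator;
import java.util.Locale;

// Hypothetical helper: each ordering style is just a configured Collator.
final class CollatorChoices {
    private CollatorChoices() {
    }

    // Locale-based ordering that still distinguishes upper and lower case.
    static Collator localeBased(Locale locale) {
        Collator collator = Collator.getInstance(locale);
        collator.setStrength(Collator.TERTIARY);
        return collator;
    }

    // Case-insensitive ordering: SECONDARY strength ignores case
    // differences but still distinguishes accented letters.
    static Collator caseInsensitive(Locale locale) {
        Collator collator = Collator.getInstance(locale);
        collator.setStrength(Collator.SECONDARY);
        return collator;
    }

    // The single comparison routine shared by every ordering style;
    // it never needs to know which style it was handed.
    static int compare(Collator collator, String left, String right) {
        return collator.compare(left, right);
    }
}
```

A caller would pick the Collator once, for example
CollatorChoices.caseInsensitive(Locale.ENGLISH), and hand it to the same
compare method used for locale-based ordering, so adding another Collator
style would not require another copy of the comparison code.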
> Enable collation based ordering for CHAR data type.
> ---------------------------------------------------
>
> Key: DERBY-2336
> URL: https://issues.apache.org/jira/browse/DERBY-2336
> Project: Derby
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 10.3.0.0
> Reporter: Mamta A. Satoor
> Attachments: DERBY_LocalFinder_CodeCleanup_diff_V01.txt,
> DERBY_LocalFinder_CodeCleanup_stat_V01.txt
>
>
> I am breaking down the Parent task DERBY-1478 (Add built in language based
> ordering and like processing to Derby) into multiple sub tasks. One of them
> is to concentrate on enabling the collation based ordering on (hopefully the
> simplest of all the character data types) CHAR data type. This task in itself
> might need subtasks if it is later found that it can be subdivided into
> multiple smaller steps.