[ http://issues.apache.org/jira/browse/DERBY-1478?page=comments#action_12459997 ]

Mamta A. Satoor commented on DERBY-1478:
----------------------------------------

I would like to propose a way of supporting locale-sensitive data in Derby. 
Currently, up to and including the Derby 10.2 release, sorting of CHAR and 
VARCHAR data is based on Unicode codepoints. Anyone who needs locale-specific 
collation can write a couple of functions as suggested by 
http://wiki.apache.org/db-derby/LanguageBasedOrdering, but that solution is 
neither complete nor efficient (since functional indexes can't be defined in 
Derby). 
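
To illustrate the difference between the two orderings, here is a minimal Java 
sketch. It is not Derby code: the class and method names are made up for this 
example, and String.compareTo (which compares UTF-16 code units) stands in for 
Derby's codepoint-based comparison.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical demo class, not part of Derby.
class OrderingDemo {

    // Codepoint-style ordering: String.compareTo compares UTF-16 code units,
    // so all uppercase ASCII letters sort before all lowercase ones.
    static List<String> codepointOrder(List<String> values) {
        List<String> sorted = new ArrayList<>(values);
        Collections.sort(sorted);
        return sorted;
    }

    // Locale-based ordering: java.text.Collator applies the collation
    // rules of the given locale.
    static List<String> localeOrder(List<String> values, Locale locale) {
        List<String> sorted = new ArrayList<>(values);
        sorted.sort(Collator.getInstance(locale));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("banana", "Zebra", "apple");
        System.out.println(codepointOrder(names));              // [Zebra, apple, banana]
        System.out.println(localeOrder(names, Locale.ENGLISH)); // [apple, banana, Zebra]
    }
}
```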

My proposal for Derby 10.3 is that a user would be able to specify an optional 
JDBC URL attribute, called territoryBasedCollation, at database creation time, 
and that the attribute can be set to true or false. If the attribute is not 
specified or is set to false, collation will continue to be codepoint 
based. But if the user specifies true for territoryBasedCollation, 
collation will be based on the language and region specified by the existing 
Derby attribute called territory (territory=ll_CC): 
http://db.apache.org/derby/docs/10.2/ref/rrefattrib56769.html
If the territory attribute is not specified at database creation time, Derby 
10.2 uses java.util.Locale.getDefault() to determine the territory for the 
newly created database.
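
As a rough sketch of what a create-database URL could look like under this 
proposal: territoryBasedCollation is only the attribute name proposed here, not 
an attribute that exists in Derby 10.2, and the helper class, method, and 
database name below are hypothetical. The code only builds the URL string and 
does not open a connection.

```java
// Hypothetical helper, not part of Derby. It only assembles the URL text.
class CollationUrlSketch {

    // territory may be null, in which case the attribute is omitted and
    // Derby would fall back to java.util.Locale.getDefault().
    static String createUrl(String dbName, String territory,
                            boolean territoryBasedCollation) {
        StringBuilder url = new StringBuilder("jdbc:derby:");
        url.append(dbName).append(";create=true");
        if (territory != null) {
            url.append(";territory=").append(territory);
        }
        // Proposed attribute from this comment; the name is an assumption
        // about the final design.
        url.append(";territoryBasedCollation=").append(territoryBasedCollation);
        return url.toString();
    }

    public static void main(String[] args) {
        // Prints: jdbc:derby:testDB;create=true;territory=no_NO;territoryBasedCollation=true
        System.out.println(createUrl("testDB", "no_NO", true));
    }
}
```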

I am not planning to implement collation support for existing databases, 
i.e. collation cannot be enabled when a database is upgraded or on 
a pre-existing database. Those databases will continue to use codepoint-based 
collation. I am proposing to implement collation support only for new 
databases.

Locale-based ordering will affect operations that depend on the 
order of data. These include:
1) Comparisons using comparison operators (<, >, =, IN, BETWEEN)
2) Statements that involve sorting (ORDER BY, GROUP BY, DISTINCT, MAX, and MIN)
3) Statements that use the LIKE keyword
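
To show why comparison predicates are affected and not just ORDER BY, here is 
a small sketch of a BETWEEN-style check evaluated both ways. Again, this is 
plain Java with made-up names, not Derby's implementation, and it uses 
java.text.Collator as a stand-in for whatever Derby ends up using.

```java
import java.text.Collator;
import java.util.Locale;

// Hypothetical demo class, not part of Derby.
class CollatedPredicates {

    // value BETWEEN low AND high, using locale-based comparison.
    static boolean between(Collator collator, String value,
                           String low, String high) {
        return collator.compare(value, low) >= 0
            && collator.compare(value, high) <= 0;
    }

    // The same predicate using codepoint-style comparison
    // (String.compareTo compares UTF-16 code units).
    static boolean codepointBetween(String value, String low, String high) {
        return value.compareTo(low) >= 0 && value.compareTo(high) <= 0;
    }

    public static void main(String[] args) {
        Collator english = Collator.getInstance(Locale.ENGLISH);
        // 'banana' BETWEEN 'apple' AND 'Zebra'
        System.out.println(between(english, "banana", "apple", "Zebra")); // true
        System.out.println(codepointBetween("banana", "apple", "Zebra")); // false
    }
}
```

With codepoint comparison, 'Z' sorts before every lowercase letter, so the 
range 'apple' to 'Zebra' is empty and the same row qualifies under one 
collation but not the other.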

Derby already has a lot of code for locale-based ordering for the disabled 
NATIONAL CHAR and NATIONAL VARCHAR datatypes. I hope to reuse as much of that 
code as possible for this project. My goal is also 
to implement this in such a way that databases with codepoint-based collation 
are not penalized by the code for locale-based collation.

Other than finding a way to store the territoryBasedCollation attribute 
from the URL somewhere, I don't anticipate any on-disk changes as part of 
this project.

Please share any comments you have. In the meantime, I will start looking 
at how to accept the new JDBC URL attribute in the create-database URL and how 
to store that attribute.

> Add built in language based ordering and like processing to Derby
> -----------------------------------------------------------------
>
>                 Key: DERBY-1478
>                 URL: http://issues.apache.org/jira/browse/DERBY-1478
>             Project: Derby
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 10.1.2.1
>            Reporter: Kathey Marsden
>         Assigned To: Mamta A. Satoor
>
> It would be good for Derby to have built in Language based ordering based on 
> locale specific Collator.
> Language based ordering is an important feature for international deployment. 
> DERBY-533 offers one implementation option for this but according to the 
> discussion in that issue National Character Types carry a fair amount of 
> baggage with them especially in the form of concerns about conversion to 
> and from datetime and number types. Rick mentioned SQL language for 
> collations as an option for language based ordering. There may be other 
> options too, but I thought it worthwhile to add an issue for the high level 
> functional concern, so the best choice can be made for implementation without 
> assuming that National Character Types is the only solution.
> For possible 10.1 workaround and examples see:
> http://wiki.apache.org/db-derby/LanguageBasedOrdering

