[ https://issues.apache.org/jira/browse/DERBY-2699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12525494 ]

Daniel John Debrunner commented on DERBY-2699:
----------------------------------------------

I think the approach of getting collation elements as needed would have a large 
effect on all string comparisons.

I created a scale 4 order entry database both with and without territory based 
collation. Looking just at the load, collation only affects 'index.sql', which 
creates an index that includes the customer's last name. With UCS_BASIC 
collation the index was created in about 2.5 seconds; with TERRITORY_BASED 
collation the time was over 11 seconds.

I don't think that the collation overhead should be that high. I would expect 
maybe a 10-20% overhead, not an index build that takes around 450% of the 
UCS_BASIC time.
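
For anyone who wants a rough feel for the raw cost of collated versus binary 
comparison outside Derby, a standalone sketch along these lines may help. It is 
not the order entry load used above: the class CollationSortTiming and its 
methods are hypothetical names, the data is random upper-case strings rather 
than real last names, and a single untuned timing like this is only indicative.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;
import java.util.Random;

// Hypothetical stand-alone timing sketch; not part of Derby.
class CollationSortTiming {

    // Builds n pseudo-random upper-case strings of length 5..12.
    static List<String> randomNames(int n, long seed) {
        Random random = new Random(seed);
        List<String> names = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            int length = 5 + random.nextInt(8);
            StringBuilder sb = new StringBuilder(length);
            for (int j = 0; j < length; j++) {
                sb.append((char) ('A' + random.nextInt(26)));
            }
            names.add(sb.toString());
        }
        return names;
    }

    // Sorts a copy of the list with the given ordering and returns elapsed nanoseconds.
    static long timeSortNanos(List<String> names, Comparator<? super String> order) {
        List<String> copy = new ArrayList<>(names);
        long start = System.nanoTime();
        copy.sort(order);
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        List<String> names = randomNames(200_000, 42L);
        // Binary (code point) ordering, roughly what UCS_BASIC does.
        long binary = timeSortNanos(names, Comparator.<String>naturalOrder());
        // Locale-sensitive ordering through java.text.Collator.
        long collated = timeSortNanos(names, Collator.getInstance(Locale.US));
        System.out.println("binary   ms: " + binary / 1_000_000);
        System.out.println("collated ms: " + collated / 1_000_000);
    }
}
```

The ratio between the two timings will vary with JVM, locale and data, so it 
should not be read as a prediction of Derby's index build times.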

> performance of like in territory based collation databases may be improved by 
> changing way collation elements are calculated.
> -----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-2699
>                 URL: https://issues.apache.org/jira/browse/DERBY-2699
>             Project: Derby
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 10.3.1.4
>            Reporter: Mike Matrigali
>
> WorkHorseForCollatorDatatypes.java has a method 
> getCollationElementsForString() which is currently called when processing 
> LIKE clauses in databases that have been created with territory based 
> collation. This is not an issue in pre-10.3 databases or in post-10.3 
> default databases.
> getCollationElementsForString() gets the collation elements for the entire 
> value of the String held by the datatype using the class.
> Take the case where the pattern is 'A%' and the value of the datatype is 
> 'BXXXXXXXXXXXXXXXXXXXXXXX'. It would have been better to get collation 
> elements one character of the String value at a time, to avoid getting 
> collation elements for the entire string when we don't really need them.
> One could imagine this having a huge performance impact when running LIKE 
> against a long CLOB where the LIKE pattern has a leading fixed-length 
> pattern to match.
> Comments on this from Dan and Dag can be found in DERBY-2416.
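
The idea in the description, pulling collation elements one at a time instead 
of for the whole string, can be sketched with java.text.CollationElementIterator 
directly. This is not Derby's WorkHorseForCollatorDatatypes code: the class 
LazyPrefixMatch and the method startsWithCollated are hypothetical names, and 
the sketch compares raw collation elements without handling ignorable elements 
or collator strength, which a real LIKE implementation would need to consider.

```java
import java.text.CollationElementIterator;
import java.text.Collator;
import java.text.RuleBasedCollator;
import java.util.Locale;

// Hypothetical illustration; not part of Derby.
class LazyPrefixMatch {

    // Returns true if the leading collation elements of value equal those of
    // prefix. Elements of value are requested only while the prefix still has
    // elements left, so a mismatch on the first element stops the scan early.
    static boolean startsWithCollated(RuleBasedCollator collator,
                                      String value, String prefix) {
        CollationElementIterator valueElements =
                collator.getCollationElementIterator(value);
        CollationElementIterator prefixElements =
                collator.getCollationElementIterator(prefix);
        int prefixElement;
        while ((prefixElement = prefixElements.next())
                != CollationElementIterator.NULLORDER) {
            // If value runs out first, next() yields NULLORDER, which cannot
            // equal prefixElement here, so this also returns false.
            if (valueElements.next() != prefixElement) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Assumption: the default collator for this locale is rule based,
        // which is the case for the collators shipped with the JDK.
        RuleBasedCollator collator =
                (RuleBasedCollator) Collator.getInstance(Locale.US);
        // Pattern 'A%' reduces to the fixed prefix "A".
        System.out.println(
                startsWithCollated(collator, "BXXXXXXXXXXXXXXXXXXXXXXX", "A"));
        System.out.println(startsWithCollated(collator, "AXXXX", "A"));
    }
}
```

For the 'A%' against 'BXXXXXXXXXXXXXXXXXXXXXXX' case, the loop asks for a single 
element of the value before returning, however long the value is.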

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
