[
https://issues.apache.org/jira/browse/DERBY-2699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12525494
]
Daniel John Debrunner commented on DERBY-2699:
----------------------------------------------
I think the approach of getting collation elements as needed would have a large
effect on all string comparisons.
I created a scale 4 order entry database both with and without territory based
collation. Looking just at the load, collation will only affect 'index.sql',
which creates an index including the customer's last name. With UCS_BASIC
collation the index was created in about 2.5 seconds; with TERRITORY_BASED
collation the time was over 11 seconds.
I don't think that the collation overhead should be that high. I would expect
maybe a 10-20% overhead, not around 450%.
> performance of like in territory based collation databases may be improved by
> changing way collation elements are calculated.
> -----------------------------------------------------------------------------------------------------------------------------
>
> Key: DERBY-2699
> URL: https://issues.apache.org/jira/browse/DERBY-2699
> Project: Derby
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 10.3.1.4
> Reporter: Mike Matrigali
>
> WorkHorseForCollatorDatatypes.java has a method
> getCollationElementsForString() which currently gets called when processing
> like clauses in databases that have been created with territory based
> collation. This is not an issue in pre-10.3 databases or post 10.3 default
> databases.
> getCollationElementsForString gets the collation elements for the entire
> value of the String held by the datatype using the class.
> If you take the case of pattern 'A%' and the value of datatype is
> 'BXXXXXXXXXXXXXXXXXXXXXXX', then it would have been better to get collation
> elements one character of the String value at a time, to avoid the process of
> getting collation elements for the entire string when we don't really need it.
> One could imagine this might have a huge performance impact on running like
> against a long clob where the like pattern has a leading fixed-length pattern
> to match.
> Comments on this from Dan and Dag can be found in DERBY-2416.
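
To make the suggestion in the description concrete, here is a minimal sketch of
fetching collation elements one at a time and stopping at the first mismatch.
It is not Derby's code: the class LazyCollationPrefix and the method startsWith
are hypothetical names, and it assumes Collator.getInstance() returns a
RuleBasedCollator (true for the standard JDK locales, but not guaranteed). It
also compares raw collation elements without any special handling of ignorable
elements, so treat it as an illustration of the idea rather than a correct
LIKE implementation.

```java
import java.text.CollationElementIterator;
import java.text.Collator;
import java.text.RuleBasedCollator;
import java.util.Locale;

public class LazyCollationPrefix {

    /**
     * Hypothetical helper: returns true if the collation elements of
     * 'prefix' match the leading collation elements of 'value'.
     * Elements are pulled from both iterators one at a time, so the
     * loop exits at the first mismatch instead of first building the
     * full element array for 'value'.
     */
    static boolean startsWith(RuleBasedCollator collator,
                              String value, String prefix) {
        CollationElementIterator v = collator.getCollationElementIterator(value);
        CollationElementIterator p = collator.getCollationElementIterator(prefix);

        int pe;
        while ((pe = p.next()) != CollationElementIterator.NULLORDER) {
            // If 'value' is exhausted, v.next() returns NULLORDER, which
            // cannot equal pe here, so this also handles a short value.
            if (v.next() != pe) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Assumption: the default US collator is rule based.
        RuleBasedCollator c = (RuleBasedCollator) Collator.getInstance(Locale.US);

        // Pattern 'A%' against the value from the issue description:
        // the comparison stops after the first collation element.
        System.out.println(startsWith(c, "BXXXXXXXXXXXXXXXXXXXXXXX", "A"));
        System.out.println(startsWith(c, "ABC", "A"));
    }
}
```

Whether this helps in practice depends on the cost of creating the iterators
relative to computing the full element array, so it would need to be measured
against a workload like the index creation above.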