Mamta Satoor wrote:

The method above uses the passed RuleBasedCollator to find the collation element for '_'. For our specific example, in Norwegian, '_' translates into only one collation element (vs 2 elements for '\uFA2D'). When looking for '_', we eliminate only 1 collation element from the array created for '\uFA2D' because '_' got translated into 1 collation element.

That in itself looks like a bug. '_' means match any single character; at no point should the code be translating '_' into a collation element. The use of '_' as an 'any character' wildcard has no relationship to the collation value of the underscore character.
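To illustrate the point, here is a minimal sketch of a matcher in which '_' consumes exactly one character of the value, regardless of how many collation elements that character produces. It is not the Derby implementation: the class and method names (LikeUnderscoreSketch, norwegian, elementsOf, like) are made up for this example, '%' is not handled, and literals are compared one character at a time, which ignores contractions.

```java
import java.text.CollationElementIterator;
import java.text.Collator;
import java.text.RuleBasedCollator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class LikeUnderscoreSketch {

    // Assumption: the JDK returns a RuleBasedCollator for this locale,
    // which is the usual case for the built-in collators.
    static RuleBasedCollator norwegian() {
        return (RuleBasedCollator) Collator.getInstance(Locale.forLanguageTag("no-NO"));
    }

    // Collects the collation elements the collator produces for a string.
    static int[] elementsOf(RuleBasedCollator collator, String s) {
        CollationElementIterator it = collator.getCollationElementIterator(s);
        List<Integer> out = new ArrayList<>();
        for (int e = it.next(); e != CollationElementIterator.NULLORDER; e = it.next()) {
            out.add(e);
        }
        return out.stream().mapToInt(Integer::intValue).toArray();
    }

    // Hypothetical, simplified LIKE: supports literals and '_' only.
    static boolean like(RuleBasedCollator collator, String value, String pattern) {
        int[] v = value.codePoints().toArray();
        int[] p = pattern.codePoints().toArray();
        if (v.length != p.length) {
            return false; // without '%', both sides need the same character count
        }
        for (int i = 0; i < p.length; i++) {
            if (p[i] == '_') {
                continue; // wildcard: skip one character, never look up elements for '_'
            }
            String vc = new String(Character.toChars(v[i]));
            String pc = new String(Character.toChars(p[i]));
            if (!Arrays.equals(elementsOf(collator, vc), elementsOf(collator, pc))) {
                return false; // literal characters must collate identically
            }
        }
        return true;
    }

    public static void main(String[] args) {
        RuleBasedCollator c = norwegian();
        // How many elements each character yields does not affect the match below.
        System.out.println(elementsOf(c, "\uFA2D").length + " element(s) for the value character");
        System.out.println(elementsOf(c, "_").length + " element(s) for underscore");
        System.out.println(like(c, "\uFA2D", "_")); // prints "true"
    }
}
```

The wildcard branch never asks the collator about '_' at all, so the number of collation elements for underscore cannot influence how much of the value is skipped.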

Dan.
