Re: RFR: 8291660: Grapheme support in BreakIterator

Stuart Marks Wed, 24 Aug 2022 20:57:04 -0700

On Tue, 23 Aug 2022 22:44:13 GMT, Naoto Sato <[email protected]> wrote:


> This is to enhance the character break analysis in `java.text.BreakIterator` 
> to conform to the extended grapheme cluster boundaries defined in 
> https://www.unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries. A 
> corresponding CSR has also been drafted, as there will be behavioral changes 
> with this modification.

src/java.base/share/classes/sun/util/locale/provider/BreakIteratorProviderImpl.java line 258:

> 256:                     .filter(i -> boundaries.get(i) > offset)
> 257:                     .findFirst()
> 258:                     .orElse(boundaries.size() - 1);

Is it worth trying to use `Collections.binarySearch()` here? I think the 
boundaries list is in ascending sorted order, so you might be able to drop in a 
`binarySearch()` call directly. (Need to be a bit careful with negative return 
values: when the key isn't found, it returns `-(insertion point) - 1`.)
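
Roughly what I have in mind, as an untested sketch rather than a drop-in patch. The class and method names here (`BoundarySearch`, `firstIndexAfter`) are made up for illustration, and I'm assuming `boundaries` is a `List<Integer>` in strictly ascending order with no duplicates, and that the existing stream ranges over all indexes of the list:

```java
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of the PR.
final class BoundarySearch {
    // Returns the index of the first boundary strictly greater than offset,
    // or the last index if there is none (mirroring the orElse(...) fallback).
    // Assumes boundaries is sorted in strictly ascending order.
    static int firstIndexAfter(List<Integer> boundaries, int offset) {
        int idx = Collections.binarySearch(boundaries, offset);
        // Found: the next element is the first one greater than offset.
        // Not found: idx == -(insertionPoint) - 1, and the insertion point
        // is already the index of the first element greater than offset.
        int next = (idx >= 0) ? idx + 1 : -(idx + 1);
        return Math.min(next, boundaries.size() - 1);
    }
}
```

This would be O(log n) instead of a linear scan, though whether it matters depends on how long the boundaries list typically gets. If the list can contain duplicates, the found case would need more care, since `binarySearch()` doesn't say which of the equal elements it returns.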

-------------

PR: https://git.openjdk.org/jdk/pull/9991
