On Mon, 18 Sep 2023 18:33:20 GMT, Naoto Sato <[email protected]> wrote:
>> This PR is to incorporate the latest Unicode 15.1, which was released
>> yesterday. Besides the usual character data update, an upgraded
>> implementation of RegEx which reflects the Indic Conjunct Break specified in
>> the latest [Unicode Annex #29 ("Unicode Text
>> Segmentation")](https://unicode.org/reports/tr29/) is included. A
>> corresponding CSR has also been drafted.
>
> Naoto Sato has updated the pull request with a new target base due to a merge
> or a rebase. The incremental webrev excludes the unrelated changes brought in
> by the merge/rebase. The pull request contains 10 additional commits since
> the last revision:
>
> - Fix GensrcRegex.gmk
> - Merge branch 'master' into JDK-8296246-Unicode15.1
> - Update
> make/jdk/src/classes/build/tools/generateextraproperties/GenerateExtraProperties.java
>
> Co-authored-by: Andrey Turbanov <[email protected]>
> - Update
> make/jdk/src/classes/build/tools/generateextraproperties/GenerateExtraProperties.java
>
> Co-authored-by: Andrey Turbanov <[email protected]>
> - TR29 final version
> - .md file update
> - Final 8/28
> - Draft 8/11
> - GenerateExtraProperties tool
> - initial commit
LGTM. A couple of minor suggestions.
make/jdk/src/classes/build/tools/generateextraproperties/GenerateExtraProperties.java
line 41:
> 39: /**
> 40: * Parses extra properties files of UCD, and replaces the placeholders in
> 41: * the given template source file with the generated conditions, then emit
s/emit/emits?
make/jdk/src/classes/build/tools/generateextraproperties/GenerateExtraProperties.java
line 51:
> 49: * <blockquote>
> 50: * %%%Type(=Value)%%%
> 51: * </blockquote>
Nice example to explain how it works!
I wonder if it might be helpful to clarify that the () indicate optional. It
could be slightly confused with being a part of the template.
make/modules/java.base/gensrc/GensrcRegex.gmk line 35:
> 33: INDICCONJUNCTBREAKTEMP :=
> $(MODULE_SRC)/share/classes/jdk/internal/util/regex/IndicConjunctBreak.java.template
> 34: INDICCONJUNCTBREAKPROPS :=
> $(MODULE_SRC)/share/data/unicodedata/DerivedCoreProperties.txt
> 35: INDICCONJUNCTBREAKPARAMS := InCB=Linker InCB=Extend InCB=Consonant
DerivedCoreProperties.txt is very large. Would it be helpful to reference the
section, that is "Derived Property: Indic_Conjunct_Break", in this file or the
template?
-------------
Marked as reviewed by joehw (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/15728#pullrequestreview-1633706039
PR Review Comment: https://git.openjdk.org/jdk/pull/15728#discussion_r1330390702
PR Review Comment: https://git.openjdk.org/jdk/pull/15728#discussion_r1330408135
PR Review Comment: https://git.openjdk.org/jdk/pull/15728#discussion_r1330410613