Thanks Dawid! What you said makes sense (reducing build time and catching inconsistencies).
I have some follow-up questions. When is the ICU library usually upgraded? (I previously assumed it happened with major releases, but the last upgrade seems to have landed in a minor release, 10.1 to 10.2.) And are there any drawbacks to upgrading the ICU library whenever a new version is released? It would add some work, but maybe that could be automated?

Regards,
Anh Dung Bui

On Thu, Oct 23, 2025 at 3:40 PM Dawid Weiss <[email protected]> wrote:

> If you're on the main branch, the code to regenerate ICU is in
> lucene.regenerate.icu.gradle:
>
> https://github.com/apache/lucene/blob/main/build-tools/build-infra/src/main/groovy/lucene.regenerate.icu.gradle#L4
>
> you should bump the version of icu4j and this makes the build use the
> aligned icu-c version too -
>
> https://github.com/apache/lucene/blob/main/gradle/libs.versions.toml#L24
>
> Once you do that, run:
>
> ./gradlew -p lucene/analysis/icu regenerate
>
> and it should regenerate, clean-up and create checksums for all affected
> files.
>
> > - Why aren't we generating them on the fly based on the available ICU
> > version at runtime? Would that enable users to upgrade ICU versions on
> > their own without breaking Lucene?
>
> The reasons for not generating them on the fly are multiple - mainly we're
> trying to save on
> build times (some of the generated resources are very costly or require
> external tools and infrastructure)
> but also ensure consistency and catch any changes if they happen in the
> middleware toolchain somewhere (in theory,
> if you run the regenerate command above without touching ICU versions, you
> should get identical checksums
> of all the resulting files, regardless of the platform used, etc.)
>
> Dawid
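
To make the consistency point from the quoted reply concrete (regenerating without touching ICU versions should give identical checksums), here is a minimal sketch of how one could check that by hand. It is not Lucene's actual checksum mechanism: the function names `snapshot` and `changed_files` are hypothetical, and the choice of SHA-256 is an assumption made only for illustration.

```python
import hashlib
from pathlib import Path


def snapshot(root):
    """Map each file path (relative to root) to its SHA-256 hex digest.

    Hypothetical helper for illustration; not part of the Lucene build.
    """
    root = Path(root)
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(before, after):
    """Return sorted paths that were added, removed, or whose digest differs."""
    return sorted(
        path
        for path in before.keys() | after.keys()
        if before.get(path) != after.get(path)
    )
```

One possible use: call `snapshot` on the directory holding the generated resources, run `./gradlew -p lucene/analysis/icu regenerate`, call `snapshot` again, and pass both results to `changed_files`. If the ICU version was not changed, the expectation described above is an empty list. Which directory to hash is left to the reader, since it would need to exclude unrelated build outputs.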
