I don't think there are any guidelines on this, other than tracking the Unicode changes that ICU implements. If something changes in an incompatible way and we can't implement a compatible workaround, we'd hold the ICU upgrade back until the next major Lucene version.
I'm CCing Robert; he's much more knowledgeable about this than I am.

As for automation - this is already pretty simple, and we have Dependabot running on GitHub, so it shouldn't be a problem. It is the consequences of upgrading that are more difficult to assess.

Dawid

On Thu, Oct 30, 2025 at 4:50 AM Anh Dũng Bùi <[email protected]> wrote:

> Thanks Dawid!
>
> What you said makes sense (reducing build time and catching inconsistencies).
>
> I have some follow-up questions: when is the ICU library usually upgraded
> (I previously thought it was at a major release, but the last upgrade seems
> to have been in the middle (10.1 to 10.2)), and are there any drawbacks to
> upgrading the ICU library whenever it's released? Maybe it adds some
> workload, but maybe it can be automated?
>
> Regards,
> Anh Dung Bui
>
> On Thu, Oct 23, 2025 at 3:40 PM Dawid Weiss <[email protected]> wrote:
>
> > If you're on the main branch, the code to regenerate ICU is in
> > lucene.regenerate.icu.gradle:
> >
> > https://github.com/apache/lucene/blob/main/build-tools/build-infra/src/main/groovy/lucene.regenerate.icu.gradle#L4
> >
> > You should bump the version of icu4j, and this makes the build use the
> > aligned icu-c version too -
> >
> > https://github.com/apache/lucene/blob/main/gradle/libs.versions.toml#L24
> >
> > Once you do that, run:
> >
> > ./gradlew -p lucene/analysis/icu regenerate
> >
> > and it should regenerate, clean up, and create checksums for all affected
> > files.
> >
> > > - Why aren't we generating them on the fly based on the available ICU
> > > version at runtime? Would that enable users to upgrade ICU versions on
> > > their own without breaking Lucene?
> >
> > There are multiple reasons for not generating them on the fly. Mainly,
> > we're trying to save on build times (some of the generated resources are
> > very costly or require external tools and infrastructure), but also to
> > ensure consistency and catch any changes if they happen in the middleware
> > toolchain somewhere (in theory, if you run the regenerate command above
> > without touching ICU versions, you should get identical checksums for all
> > the resulting files, regardless of the platform used, etc.).
> >
> > Dawid
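
The reproducibility claim in the quoted message (rerunning the regenerate command without touching ICU versions should give identical checksums) could be checked by hand with something like the sketch below. This is only an illustration and not part of the Lucene build: the helper names `snapshot_checksums` and `changed_files` are hypothetical, SHA-256 is an assumed choice of digest, and the build's own checksum mechanism may work differently.

```python
import hashlib
from pathlib import Path


def snapshot_checksums(root):
    """Map each file under root (as a POSIX-style relative path) to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(before, after):
    """Return the sorted paths that were added, removed, or whose digest differs."""
    return sorted(
        path
        for path in before.keys() | after.keys()
        if before.get(path) != after.get(path)
    )
```

One way to use it: take a snapshot of lucene/analysis/icu, run the regenerate command from the quoted message, take a second snapshot, and pass both to `changed_files`. If the claim holds, an empty list is what one would expect when the ICU version was not changed.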
