In my experience maintaining custom codecs, yes, we have to modify them whenever we upgrade Lucene. That isn't a problem for us because nobody consumes our software to build their own on top of it; these are services that are not intended to be extended further. And for something like Solr, which I expect users do extend, wouldn't Lucene always be upgraded together with Solr rather than independently?
I guess what I wonder is how frequently the problem you're describing actually occurs. Is this a theoretical issue, or do you know folks distributing custom Lucene codecs whose consumers would update Lucene independently?

On Thu, Oct 2, 2025, 2:38 PM Chris Hostetter wrote:

>
> As someone who has never really messed around with custom codecs in Lucene,
> I'm a little confused/curious what the expected workflow/lifecycle is for
> custom codecs that might be maintained by a third-party.
>
> This question specifically stems from this project...
>
> https://github.com/rapidsai/cuvs-lucene/
>
> ...and some issues that popped up here...
>
> https://issues.apache.org/jira/browse/SOLR-17892
>
> ...but the confusion is applicable to anyone who might want to write a
> custom codec and share it with the world.
>
>
> Let's say it's June of 2025, and I decide to write a simple custom codec,
> and host it on github.
>
> My custom codec is inspired by the FilterCodec javadocs...
>
> import org.apache.lucene.codecs.*;
> import org.apache.lucene.codecs.lucene101.Lucene101Codec;
> public final class HossCodec extends FilterCodec {
>   public HossCodec() {
>     super("HossCodec", new Lucene101Codec());
>   }
>   @Override
>   public LiveDocsFormat liveDocsFormat() {
>     System.out.println("You are using my custom codec, cool");
>     return super.liveDocsFormat();
>   }
> }
>
> A few things to point out: I've read the *entire* javadocs for
> FilterCodec, so I'm smart enough to know that I can't use
> Codec.forName("Lucene101") in my constructor *AND* I've either read the
> code in Codec.getDefault(), or figured out by trial and error, that I
> can't use that method in my constructor either.
>
> Thus the import of org.apache.lucene.codecs.lucene101.Lucene101Codec, and
> the explicit call to 'new Lucene101Codec()' (which is the latest greatest
> codec available in the latest greatest lucene release available)
>
> I write my unit tests, I build my jar file, I release my
> hoss-custom-codec-1.0.jar on maven, people start using it to build
> their indexes, and everybody is happy.
>
> Skip ahead to October: one of the people using my custom codec wants to
> upgrade to Lucene 10.3, but they can't because the class
> "org.apache.lucene.codecs.lucene101.Lucene101Codec" no longer exists -- a
> new (otherwise identical) class named
> "org.apache.lucene.backward_codecs.lucene101.Lucene101Codec" *does* exist,
> but the compiled bytecode of my class doesn't know to use that.
>
> So -- IIRC -- my users have to wait for me to upgrade my custom codec
> (which I can't really do until *after* Lucene 10.3 comes out) to be able
> to upgrade their lucene dependency ... even though all the code intended
> to ensure "back compat" for my codec is still in Lucene.
>
> Is that all correct?
>
> Is there anything a custom codec can do to ensure that they can
> safely "extend" the current default lucene codec, and have their custom
> codec continue to work for an entire major version of lucene w/o
> needing to check every release and possibly re-compile?
>
>
> -Hoss
> http://www.lucidworks.com/
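The relocation described in the quoted message can be observed from the consumer side with plain reflection. The sketch below is only a diagnostic, not a fix: a codec compiled against `new Lucene101Codec()` still links to the old package name. The class name `CodecClassProbe` and the helper `firstLoadable` are hypothetical; the two fully qualified class names are copied from the message above, and the code uses only the JDK, so it runs with or without Lucene on the classpath.

```java
import java.util.Optional;

public class CodecClassProbe {
    // Class names as given in the quoted message (before and after the move).
    static final String CURRENT_PKG =
        "org.apache.lucene.codecs.lucene101.Lucene101Codec";
    static final String BACKWARD_PKG =
        "org.apache.lucene.backward_codecs.lucene101.Lucene101Codec";

    /** Returns the first class name that can be loaded, or empty if none can. */
    static Optional<String> firstLoadable(String... classNames) {
        ClassLoader loader = CodecClassProbe.class.getClassLoader();
        for (String name : classNames) {
            try {
                // initialize=false: check presence only, do not run static initializers.
                Class.forName(name, false, loader);
                return Optional.of(name);
            } catch (ClassNotFoundException | LinkageError e) {
                // Not available on this classpath; try the next candidate.
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Optional<String> found = firstLoadable(CURRENT_PKG, BACKWARD_PKG);
        System.out.println(
            found.map(n -> "Lucene101Codec is available as " + n)
                 .orElse("Lucene101Codec is not on the classpath"));
    }
}
```

Running this against the Lucene version a deployment actually ships shows which of the two names a precompiled custom codec would need to resolve, which may help answer whether the breakage is theoretical for a given set of users.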
