My experience maintaining custom codecs is that we have to modify them
whenever we upgrade Lucene, yes. This isn't a problem because we don't have
dependents consuming our software and building their own from it; these are
services that are not intended to be further extended. Or, considering
something like Solr, which I expect users are extending: wouldn't Lucene
always be upgraded along with Solr, and not independently?

I guess what I wonder is how frequently the problem you're describing
actually occurs. Is this a theoretical issue, or do you know of folks
distributing custom Lucene codecs whose consumers would update Lucene
independently?
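
For what it's worth, if this does turn out to bite someone, one workaround
I could imagine is resolving the delegate class by name at runtime instead
of linking to it at compile time. This is only a sketch and I have not
tried it against Lucene. FirstAvailableClass and newInstanceOfFirst are
names I made up; in real use the candidates would be the two
Lucene101Codec class names from your mail, and it assumes the
backward-codecs jar is on the classpath and that the relocated class still
behaves the way a FilterCodec delegate needs it to. The demo in main uses
JDK classes so that it runs without Lucene:

```java
import java.util.List;

// Hypothetical helper, not part of Lucene: instantiate the first class
// from a list of candidate names that is actually on the classpath.
public final class FirstAvailableClass {
  private FirstAvailableClass() {}

  /** Returns a new instance (via the no-arg constructor) of the first candidate that loads. */
  public static Object newInstanceOfFirst(List<String> candidates) {
    for (String name : candidates) {
      try {
        return Class.forName(name).getDeclaredConstructor().newInstance();
      } catch (ClassNotFoundException e) {
        // Not present under this name; fall through to the next candidate.
      } catch (ReflectiveOperationException e) {
        throw new IllegalStateException("Found " + name + " but could not instantiate it", e);
      }
    }
    throw new IllegalStateException("None of the candidate classes are available: " + candidates);
  }

  public static void main(String[] args) {
    // Stand-ins for the old and new package names of the same class.
    Object o = newInstanceOfFirst(List.of("no.such.pkg.Missing", "java.util.ArrayList"));
    System.out.println(o.getClass().getName());
  }
}
```

A codec constructor could then cast the result to Codec and pass it to
super(...). The cost is that a missing class shows up as a runtime failure
rather than a link error, so I would not call it a fix for the underlying
question.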


On Thu, Oct 2, 2025, 2:38 PM Chris Hostetter <[email protected]>
wrote:

>
> As someone who has never really messed around with custom codecs in Lucene,
> I'm a little confused/curious what the expected workflow/lifecycle is for
> custom codecs that might be maintained by a third-party.
>
> This question specifically stems from this project...
>
>     https://github.com/rapidsai/cuvs-lucene/
>
> ...and some issues that popped up here...
>
>     https://issues.apache.org/jira/browse/SOLR-17892
>
> ...but the confusion is applicable to anyone who might want to write a
> custom codec and share it with the world.
>
>
> Let's say it's June of 2025, and I decide to write a simple custom codec,
> and host it on github.
>
> My custom codec is inspired by the FilterCodec javadocs...
>
>     import org.apache.lucene.codecs.*;
>     import org.apache.lucene.codecs.lucene101.Lucene101Codec;
>     public final class HossCodec extends FilterCodec {
>       public HossCodec() {
>         // Delegate everything to the (then) current default codec.
>         super("HossCodec", new Lucene101Codec());
>       }
>       @Override
>       public LiveDocsFormat liveDocsFormat() {
>         System.out.println("You are using my custom codec, cool");
>         return super.liveDocsFormat();
>       }
>     }
>
> A few things to point out: I've read the *entire* javadocs for
> FilterCodec, so I'm smart enough to know that I can't use
> Codec.forName("Lucene101") in my constructor *AND* I've either read the
> code in Codec.getDefault(), or figured out by trial and error, that I
> can't use that method in my constructor either.
>
> Thus the import of org.apache.lucene.codecs.lucene101.Lucene101Codec, and
> the explicit call to 'new Lucene101Codec()' (which is the latest greatest
> codec available in the latest greatest lucene release available)
>
> I write my unit tests, I build my jar file, I release my
> hoss-custom-codec-1.0.jar on maven, people start using it to build
> their indexes, and everybody is happy.
>
> Skip ahead to October: one of the people using my custom codec wants to
> upgrade to Lucene 10.3, but they can't because the class
> "org.apache.lucene.codecs.lucene101.Lucene101Codec" no longer exists -- a
> new (otherwise identical) class named
> "org.apache.lucene.backward_codecs.lucene101.Lucene101Codec" *does* exist,
> but the compiled bytecode of my class doesn't know to use that.
>
> So -- IIRC -- my users have to wait for me to upgrade my custom codec
> (which I can't really do until *after* Lucene 10.3 comes out) to be able
> to upgrade their lucene dependency ... even though all the code intended
> to ensure "back compat" for my codec is still in Lucene.
>
> Is that all correct?
>
> Is there anything a custom codec can do to ensure that it can
> safely "extend" the current default lucene codec, and continue to
> work for an entire major version of lucene w/o its maintainer
> needing to check every release and possibly re-compile?
>
>
>
>
>
> -Hoss
> http://www.lucidworks.com/
>
>
