jfboeuf commented on code in PR #11900:
URL: https://github.com/apache/lucene/pull/11900#discussion_r2689030528
##########
lucene/codecs/src/java/org/apache/lucene/codecs/bloom/FuzzySet.java:
##########
@@ -46,7 +46,9 @@ public class FuzzySet implements Accountable {
public static final int VERSION_SPI = 1; // HashFunction used to be loaded through a SPI
public static final int VERSION_START = VERSION_SPI;
- public static final int VERSION_CURRENT = 2;
+ public static final int VERSION_MURMUR2 = 2;
+ private static final int VERSION_MULTI_HASH = 3;
+ public static final int VERSION_CURRENT = VERSION_MULTI_HASH;
Review Comment:
The `deserialize()` method is expected to fail: the new format uses a hash
count per document >1, while it was previously assumed to be exactly one. In
addition, the hash function has changed (from murmur2 to a combination of the
msb and lsb of a murmur3 64-bit hash), another reason for incompatibility.
However, it should have failed earlier: `CodecUtil.checkIndexHeader()`
should have thrown an `IndexFormatTooOldException`. Have you already made any
modifications to bypass this exception?
If you need backward compatibility, an option would be to write your own
format calling `CodecUtil.checkIndexHeader()` with the accepted version range
2-3 and returning the actual persisted version. From the returned value, you
can delegate deserialization and implementation to the previous or new
implementation. You'll have to rename packages/classes from the original
implementations (both from Lucene 8 and 9) to prevent naming conflicts and
ensure your version of the format is loaded by `PostingsFormat.forName()`.
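
As a rough illustration of the delegation idea, here is a minimal, Lucene-free sketch. Everything in it is hypothetical: `readAndCheckVersion()` only stands in for `CodecUtil.checkIndexHeader()` (the real method also validates the magic number, codec name, segment ID and suffix, and throws `IndexFormatTooOldException` / `IndexFormatTooNewException`), and `FilterReader`, `open()` and `headerWithVersion()` are made-up names for the example, not part of the Lucene API.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: none of these names exist in Lucene.
final class VersionDispatchSketch {
    static final int VERSION_MURMUR2 = 2;    // old format: one murmur2 hash
    static final int VERSION_MULTI_HASH = 3; // new format: several hashes from murmur3

    // Stand-in for the old and new deserializers/implementations.
    interface FilterReader {
        String describe();
    }

    // Stand-in for CodecUtil.checkIndexHeader(): reads the persisted version,
    // rejects anything outside [minVersion, maxVersion], and returns it.
    static int readAndCheckVersion(ByteBuffer in, int minVersion, int maxVersion) {
        int version = in.getInt();
        if (version < minVersion) {
            throw new IllegalStateException("format too old: " + version);
        }
        if (version > maxVersion) {
            throw new IllegalStateException("format too new: " + version);
        }
        return version;
    }

    // Accept the version range 2-3 and delegate based on the persisted version.
    static FilterReader open(ByteBuffer in) {
        int version = readAndCheckVersion(in, VERSION_MURMUR2, VERSION_MULTI_HASH);
        switch (version) {
            case VERSION_MURMUR2:
                return () -> "legacy reader (murmur2, single hash)";
            case VERSION_MULTI_HASH:
                return () -> "current reader (murmur3, multiple hashes)";
            default:
                throw new AssertionError("unreachable after range check");
        }
    }

    // Test helper: builds a fake header that contains only the version number.
    static ByteBuffer headerWithVersion(int version) {
        ByteBuffer buf = ByteBuffer.allocate(Integer.BYTES);
        buf.putInt(version);
        buf.flip();
        return buf;
    }
}
```

In a real format, the two branches would construct the renamed copies of the Lucene 8 and Lucene 9 implementations instead of returning lambdas.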
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]