Well, we're using the default Lucene similarity. But as far as I know, we've always disabled norms as well. So I'm surprised to see norms mentioned at all in the context of our own index, which is why I wondered whether -1 might have been an older placeholder for "no value" that later became 0 or something.
About the only thing I'm sure about at the moment is that whatever is going on is weird.

TX

On Thu, 11 Jun 2020 at 15:38, Adrien Grand <jpou...@gmail.com> wrote:
>
> Hi Trejkaz,
>
> Negative norm values are legal. The problem here is that Lucene expects
> that documents that have no terms must either not have a norm value
> (typically because the document doesn't have a value for the field), or a
> norm value equal to 0 (typically because the token stream over the field
> value produced no tokens).
>
> Are you using a custom similarity or one of the Lucene ones? One would only
> get -1 as a norm with the Lucene similarities if it had a number of tokens
> that is very close to Integer.MAX_VALUE.
>
> On Thu, Jun 11, 2020 at 4:22 AM Trejkaz <trej...@trypticon.org> wrote:
>
> > Hi all.
> >
> > We use CheckIndex as a post-migration sanity check and are seeing this
> > quirk, and I'm wondering whether negative norms is even legit or
> > whether it should have been treated as if it were zero...
> >
> > TX
> >
> >
> > 0.00% total deletions; 378 documents; 0 deleteions
> > Segments file=segments_1 numSegments=1 version=8.5.1
> > id=52isly98kogao7j0cnautwknj
> >   1 of 1: name=_0 maxDoc=378
> >     version=8.5.1
> >     id=52isly98kogao7j0cnautwkni
> >     codec=Lucene84
> >     compound=false
> >     numFiles=18
> >     size (MB)=0.663
> >     diagnostics = {java.vendor=Oracle Corporation, os=Mac OS X,
> > java.version=1.8.0_191, java.vm.version=25.191-b12,
> > lucene.version=8.5.1, os.arch=x86_64,
> > java.runtime.version=1.8.0_191-b12, source=addIndexes(CodecReader...),
> > os.version=10.15.5, timestamp=1591841756208}
> >     no deletions
> >     test: open reader.........OK [took 0.004 sec]
> >     test: check integrity.....OK [took 0.002 sec]
> >     test: check live docs.....OK [took 0.000 sec]
> >     test: field infos.........OK [36 fields] [took 0.000 sec]
> >     test: field norms.........OK [26 fields] [took 0.001 sec]
> >     test: terms, freq, prox...ERROR: java.lang.RuntimeException:
> > Document 0 doesn't have terms according to postings but has a norm
> > value that is not zero: -1
> >
> > java.lang.RuntimeException: Document 0 doesn't have terms according to
> > postings but has a norm value that is not zero: -1
> >     at org.apache.lucene.index.CheckIndex.checkFields(CheckIndex.java:1678)
> >     at org.apache.lucene.index.CheckIndex.testPostings(CheckIndex.java:1871)
> >     at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:724)
> >     at org.apache.lucene.index.CheckIndex.doCheck(CheckIndex.java:2973)
> >
> >     test: stored fields.......OK [15935 total field count; avg 42.2
> > fields per doc] [took 0.003 sec]
> >     test: term vectors........OK [1173 total term vector count; avg
> > 3.1 term/freq vector fields per doc] [took 0.170 sec]
> >     test: docvalues...........OK [16 docvalues fields; 11 BINARY; 2
> > NUMERIC; 0 SORTED; 2 SORTED_NUMERIC; 1 SORTED_SET] [took 0.003 sec]
> >     test: points..............OK [4 fields, 1509 points] [took 0.000 sec]
> > FAILED
> >     WARNING: exorciseIndex() would remove
> > reference to this segment; full exception:
> > java.lang.RuntimeException: Term Index test failed
> >     at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:750)
> >     at org.apache.lucene.index.CheckIndex.doCheck(CheckIndex.java:2973)
> >
> > WARNING: 1 broken segments (containing 378 documents) detected
> > Took 0.355 sec total.
> > WARNING: would write new segments file, and 378 documents would be
> > lost, if -exorcise were specified
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> > For additional commands, e-mail: java-user-h...@lucene.apache.org
> >
> >
>
> --
> Adrien
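For anyone reading this thread later, the rule Adrien describes can be written down as a small predicate. This is only a sketch of the rule as stated in his reply, not CheckIndex's actual code: the class name NormRuleSketch and the method isConsistent are hypothetical, and documents that do have terms are simply passed through because the thread does not discuss them.

```java
// Hypothetical illustration of the rule quoted in this thread.
// It does not use or reproduce any Lucene API.
public class NormRuleSketch {

    /**
     * @param hasTerms whether postings report at least one term for the
     *                 document in this field
     * @param hasNorm  whether a norm value is stored for the document
     * @param norm     the stored norm value (ignored when hasNorm is false)
     * @return false only for the case reported in the thread: no terms,
     *         but a stored norm that is not zero
     */
    public static boolean isConsistent(boolean hasTerms, boolean hasNorm, long norm) {
        if (hasTerms) {
            // Not covered by the rule discussed here; negative norms are
            // legal for documents that have terms.
            return true;
        }
        // No terms: either there is no norm at all, or the norm is 0.
        return !hasNorm || norm == 0;
    }

    public static void main(String[] args) {
        // Document has no value for the field: no terms, no norm.
        System.out.println(isConsistent(false, false, 0L));  // true
        // Token stream produced no tokens: no terms, norm of 0.
        System.out.println(isConsistent(false, true, 0L));   // true
        // The case from the CheckIndex output: no terms, norm of -1.
        System.out.println(isConsistent(false, true, -1L));  // false
    }
}
```

The third call corresponds to "Document 0 doesn't have terms according to postings but has a norm value that is not zero: -1". The complaint is about the combination of no terms with a non-zero norm, not about the norm being negative.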