[
https://issues.apache.org/jira/browse/LUCENE-6529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hoss Man updated LUCENE-6529:
-----------------------------
Attachment: LUCENE-6529.patch
bq. Maybe BasePostingsFormatTestCase does not adequately exercise methods like
size()/ord()/seek(ord). It should be failing!
FWIW, as far as i understand BasePostingsFormatTestCase and
RandomPostingsTester based on skimming them this morning, they may never
reproduce this bug, since (AFAICT) they only ever operate on single-segment indexes?
As mentioned: this patch only ever fails for me when testing the
SlowCompositeReaderWrapper -- asserts on the individual segment LeafReaders
seem to pass all the time (even though one segment is forced to have every term
that's in the index as a whole). Likewise if you {{iw.forceMerge(1);}} then
the SlowCompositeReaderWrapper asserts start to pass as well.
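To make the single-segment vs. composite distinction concrete, here is a toy sketch in plain Java. It is not Lucene code: {{ToyOrdMerge}} and its methods are made-up names, and plain String ordering stands in for term ordering. It shows how a composite view has to fold per-segment ords into one global ord space. With a single segment the mapping is the identity, which would be consistent with the asserts passing after {{iw.forceMerge(1);}}, but it is only an illustration and does not point at where the bug lives.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.TreeSet;

// Toy model only: hypothetical names, not Lucene's implementation.
final class ToyOrdMerge {

    // Union of all per-segment terms, sorted and de-duplicated.
    // The size of this list plays the role of the composite value count.
    static List<String> globalTerms(List<List<String>> segments) {
        TreeSet<String> all = new TreeSet<>();
        for (List<String> segment : segments) {
            all.addAll(segment);
        }
        return new ArrayList<>(all);
    }

    // For one segment (terms sorted, distinct), map segment ord -> global ord.
    static int[] segmentToGlobal(List<String> segmentTerms, List<String> global) {
        int[] map = new int[segmentTerms.size()];
        for (int ord = 0; ord < map.length; ord++) {
            map[ord] = Collections.binarySearch(global, segmentTerms.get(ord));
        }
        return map;
    }

    public static void main(String[] args) {
        List<String> seg0 = Arrays.asList("a", "c");
        List<String> seg1 = Arrays.asList("b", "c", "d");
        List<String> global = globalTerms(Arrays.asList(seg0, seg1));
        System.out.println("global terms: " + global);
        System.out.println("seg0 -> global: " + Arrays.toString(segmentToGlobal(seg0, global)));
        System.out.println("seg1 -> global: " + Arrays.toString(segmentToGlobal(seg1, global)));
    }
}
```

The per-segment ords only have to be consistent within their own segment; the composite view additionally depends on every segment's term dictionary agreeing on term order and ord lookups.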
----
I've updated the patch to include the test from SOLR-7631, as well as beefing
up TestUninvertingReader.testSortedSetIntegerManyValues to include all (4)
permutations of multi/single-valued + (no)-precisionStep (didn't turn up
anything unexpected, only the trie fields are problematic), as well as
running {{TestUtil.checkReader}} on the SlowCompositeReader before using it.
This last change started triggering failures much earlier...
{noformat}
[junit4] 2> NOTE: reproduce with: ant test -Dtestcase=TestUninvertingReader -Dtests.method=testSortedSetIntegerManyValues -Dtests.seed=3A8A592786F36F30 -Dtests.slow=true -Dtests.locale=in_ID -Dtests.timezone=Zulu -Dtests.asserts=true -Dtests.file.encoding=UTF-8
[junit4] ERROR 0.56s | TestUninvertingReader.testSortedSetIntegerManyValues <<<
[junit4] > Throwable #1: java.lang.RuntimeException: dv for field: trie_multi reports wrong maxOrd=33 but this is not the case: 30
[junit4] > at __randomizedtesting.SeedInfo.seed([3A8A592786F36F30:DB56E81A1372E276]:0)
[junit4] > at org.apache.lucene.index.CheckIndex.checkSortedSetDocValues(CheckIndex.java:1917)
[junit4] > at org.apache.lucene.index.CheckIndex.checkDocValues(CheckIndex.java:1987)
[junit4] > at org.apache.lucene.index.CheckIndex.testDocValues(CheckIndex.java:1790)
[junit4] > at org.apache.lucene.util.TestUtil.checkReader(TestUtil.java:318)
[junit4] > at org.apache.lucene.util.TestUtil.checkReader(TestUtil.java:297)
[junit4] > at org.apache.lucene.uninverting.TestUninvertingReader.testSortedSetIntegerManyValues(TestUninvertingReader.java:284)
{noformat}
...so for good measure, i sprinkled in {{TestUtil.checkReader}} in some of the
other oal.uninverting.* tests i could find using SlowCompositeReader -- but
based on my limited beasting, this hasn't triggered any other failures.
(note: patch still has nocommits related to limiting some of the random
variables)
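For reference, the failure in the trace comes from a consistency check on the reported value count. Here is a toy version in plain Java, assuming the check compares the reported value count minus one against the highest ord actually seen while walking the docs. That is my reading of the error message; {{ToyMaxOrdCheck}}, {{ToySortedSet}} and {{check}} are hypothetical names, not the CheckIndex code.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy model only: hypothetical names, not the CheckIndex implementation.
final class ToyMaxOrdCheck {

    // Stand-in for a multi-valued doc-values view:
    // the ords attached to each doc, plus the value count the view claims.
    static final class ToySortedSet {
        final List<long[]> ordsPerDoc;
        final long valueCount;

        ToySortedSet(List<long[]> ordsPerDoc, long valueCount) {
            this.ordsPerDoc = ordsPerDoc;
            this.valueCount = valueCount;
        }
    }

    // Throws if the claimed max ord (valueCount - 1) differs from the
    // highest ord observed on any doc.
    static void check(String field, ToySortedSet dv) {
        long reportedMaxOrd = dv.valueCount - 1;
        long seenMaxOrd = -1;
        for (long[] ords : dv.ordsPerDoc) {
            for (long ord : ords) {
                seenMaxOrd = Math.max(seenMaxOrd, ord);
            }
        }
        if (reportedMaxOrd != seenMaxOrd) {
            throw new RuntimeException("dv for field: " + field
                + " reports wrong maxOrd=" + reportedMaxOrd
                + " but this is not the case: " + seenMaxOrd);
        }
    }

    public static void main(String[] args) {
        // Consistent view: 3 values claimed, ords 0..2 seen.
        check("ok_field", new ToySortedSet(
            Arrays.asList(new long[] {0, 2}, new long[] {1}), 3));

        // Inconsistent view: 34 values claimed, but the highest ord seen is 30.
        try {
            check("trie_multi", new ToySortedSet(
                Collections.singletonList(new long[] {5, 30}), 34));
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model, a view that claims more values than any doc actually references fails in the same shape as the trace: the claimed count and the ords handed out per doc disagree.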
----
bq. If i disable the ord-sharing optimization in DocTermOrds, all 3 seeds pass.
So I think there is a bug in e.g. FixedGap/BlockTerms dictionary or something
like that.
My inclination would be that we should remove this optimization for 5.2.1,
commit these tests, and open a new issue to re-add the optimization if/when it
can be done in such a way that these tests pass reliably.
what do folks think?
> NumericFields + SlowCompositeReaderWrapper + UninvertedReader +
> -Dtests.codec=random can results in incorrect SortedSetDocValues
> ---------------------------------------------------------------------------------------------------------------------------------
>
> Key: LUCENE-6529
> URL: https://issues.apache.org/jira/browse/LUCENE-6529
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: Hoss Man
> Attachments: LUCENE-6529.patch, LUCENE-6529.patch
>
>
> Digging into SOLR-7631 and SOLR-7605 I became fairly confident that the only
> explanation of the behavior i was seeing was some sort of bug in either the
> randomized codec/postings-format or the UninvertedReader, that was only
> evident when two were combined and used on a multivalued Numeric Field using
> precision steps. But since i couldn't find any -Dtests.codec or
> -Dtests.postings.format options that would cause the bug 100% regardless of
> seed, I switched tactics and focused on reproducing the problem using
> UninvertedReader directly and checking the SortedSetDocValues.getValueCount().
> I now have a test that fails frequently (and consistently for any seed i
> find), but only with -Dtests.codec=random -- override it with
> -Dtests.codec=default and everything works fine (based on the exhaustive
> testing I did in the linked issues, i suspect every named codec works fine -
> but i didn't re-do that testing here)
> The failures only seem to happen when checking the
> SortedSetDocValues.getValueCount() of a SlowCompositeReaderWrapper around the
> UninvertedReader -- which suggests the root bug may actually be in
> SlowCompositeReaderWrapper? (but still has some dependency on the random
> codec)