UninvertingReader#wrap seems to skip uninverting if the field to uninvert already has doc values of the expected type?
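
For context, the wrap-time decision being referred to boils down to roughly the following check. This is a paraphrase, not the literal UninvertingReader source, and the helper class and its name are made up:

    import java.util.function.Function;
    import org.apache.lucene.index.DocValuesType;
    import org.apache.lucene.index.FieldInfo;
    import org.apache.lucene.index.IndexOptions;
    import org.apache.solr.uninverting.UninvertingReader;

    // Sketch of the wrap-time decision: a field is only scheduled for
    // uninversion when its FieldInfo reports that it has NO doc values yet.
    // A "franken-segment" whose FieldInfo already claims e.g. SORTED
    // (because the newest docs were indexed with docValues) is therefore
    // passed through as-is, and the older docs stay without values.
    final class WrapCheck {
      static boolean shouldUninvert(FieldInfo fi,
                                    Function<String, UninvertingReader.Type> mapping) {
        return fi.getDocValuesType() == DocValuesType.NONE // no DVs claimed yet
            && fi.getIndexOptions() != IndexOptions.NONE   // field is indexed, so it can be uninverted
            && mapping.apply(fi.name) != null;             // new schema asks for DVs here
      }
    }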
On Tue, Dec 18, 2018 at 8:24 PM Andrzej Białecki <andrzej.biale...@lucidworks.com> wrote:
>
> The unexpected part is that I would have expected the code to handle
> franken-segments as well, because at some point we finally resorted to
> always forcing the wrapping, even for segments that don't need it (i.e.
> they claim the field contains DVs), but the test is still failing.
>
> On 18 Dec 2018, at 19:05, Adrien Grand <jpou...@gmail.com> wrote:
>
> I had a quick look and couldn't find anything to prevent what you
> called "franken-segments" in the Lucene test?
>
> On Tue, Dec 18, 2018 at 5:59 PM Erick Erickson <erickerick...@gmail.com> wrote:
>
> A couple of additions:
>
> AddDVMPLuceneTest2 does not use Solr constructs at all, so it is the test
> we think is most interesting at this point; it won't lead anyone down
> the path of "what's all this Solr stuff and is it right?" kinds of
> questions (believe me, we've spent some time on that path!). Please
> feel free to look at all the rest of it, of course, but the place we're
> stuck is why this test fails.
>
> AddDvStress is intended as an integration-level test; it requires some
> special setup (in particular, providing a particular configset). We put
> it together to reliably make the problem visible. We thought the new
> code was the issue at first and needed something to narrow down the
> possibilities...
>
> The reason we're obsessing about this is that it calls into question
> how segments are merged when "things change". We don't understand why
> this is happening at the Lucene level, so we don't know how to ensure that
> things like the schema API in Solr aren't affected.
>
> Andrzej isn't the only one running out of ideas ;).
>
> On Tue, Dec 18, 2018 at 4:46 AM Andrzej Białecki <a...@getopt.org> wrote:
>
> Hi,
>
> I'm working on a use case where an existing Solr setup needs to migrate to a
> schema that uses docValues for faceting, instead of uninversion. This case
> fits into the broader subject of SOLR-12259 (Robustly upgrade indexes).
> However, in this case there are two major requirements for the migration
> process:
>
> * data cannot be reindexed from scratch - I need to work with the already
> indexed documents (which do contain the values needed for faceting, but
> these values are simply indexed, not stored as doc values)
>
> * indexing can't be stopped while the schema is being changed (the
> conversion process needs to work on-the-fly while the collection is online,
> both for searching and for updates). Collection reloads / reopening are ok,
> but it's not ok to take the collection offline for several minutes (or hours).
>
> Together with Erick Erickson we implemented a solution that uses a
> MergePolicy (actually a MergePolicyFactory in Solr) to enforce re-writing
> of segments that no longer match the schema, i.e. don't contain docValues
> in a field where the new schema requires them. This merge policy determines
> which segments need conversion and then forces the "merging" (actually
> re-writing) of these segments by first wrapping them in UninvertingReader
> to supply docValues where they are required by the new schema but are
> actually missing from the segment's data.
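
As an aside, the wrapping step described above can be sketched roughly as follows, assuming the wrapForMerge hook on MergePolicy.OneMerge (available in recent Lucene versions) and the LeafReader + Function overload of UninvertingReader.wrap. This is a hypothetical subclass with made-up names, not the actual ADVMP code on the jira/solr-12259 branch:

    import java.io.IOException;
    import java.util.List;
    import java.util.Map;
    import org.apache.lucene.index.CodecReader;
    import org.apache.lucene.index.LeafReader;
    import org.apache.lucene.index.MergePolicy;
    import org.apache.lucene.index.SegmentCommitInfo;
    import org.apache.lucene.index.SlowCodecReaderWrapper;
    import org.apache.solr.uninverting.UninvertingReader;

    // A OneMerge that uninverts each incoming segment before it is merged,
    // so the merged segment is written with real doc values. The mapping
    // names the fields to synthesize doc values for, e.g. "facets" -> SORTED.
    class AddDVOneMerge extends MergePolicy.OneMerge {
      private final Map<String, UninvertingReader.Type> mapping;

      AddDVOneMerge(List<SegmentCommitInfo> segments,
                    Map<String, UninvertingReader.Type> mapping) {
        super(segments);
        this.mapping = mapping;
      }

      @Override
      public CodecReader wrapForMerge(CodecReader reader) throws IOException {
        // UninvertingReader leaves a field alone if its FieldInfo already
        // claims the requested doc values type (see the question above).
        LeafReader wrapped = UninvertingReader.wrap(reader, mapping::get);
        // The merger needs a CodecReader, so re-wrap the uninverted view.
        return SlowCodecReaderWrapper.wrap(wrapped);
      }
    }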
> This "AddDocValuesMergePolicy" (ADVMP for short) is supposed to deal with
> the following types of segments:
>
> * old segments created before the schema change - these don't contain any
> docValues in the target fields, so they are wrapped in UninvertingReader
> for merging (and for searching) according to the new schema.
>
> * new segments created after the schema change - if the FieldInfos for these
> fields claim that the segment already contains docValues where it should,
> then the segment is passed as-is to merging; otherwise it's wrapped again.
> An optimisation was also added here to "mark" the already converted segments
> with a marker in the SegmentInfo diagnostics map, so that we can avoid
> re-checking and re-converting already converted data.
>
> So, long story short: this process works very well when there's no
> concurrent indexing activity - all old segments are properly wrapped and
> re-written, and merging with new segments works as intended. However, with
> concurrent indexing it works for only a short while. At some point the
> conversion process seems to lose a large percentage of the docValues, even
> though it appears that at all points the source segments are properly
> wrapped - the ADVMP merge policy adds a lot of debugging information to
> track the source and type of segments across many levels of merging and
> whether they were wrapped or not.
>
> My working theory is that somehow this schema change produces
> "franken-segments" (while they still haven't been flushed) where only some
> of the most recent docs have the docValues and earlier ones don't. As I
> understand it, this should not happen in Solr, because a schema change
> results in a core reload. The tracking information from ADVMP seems to
> indicate that all generations of segments, both those flushed and those
> merged earlier, have been properly wrapped.
>
> My alternate theory is that there's some bug in the doc values merging
> process when UninvertingReader is involved, because this problem also
> occurs when we modify ADVMP to always force the wrapping of all segments
> in UninvertingReader. The percentage of lost doc values is sometimes quite
> large, up to 50%; perhaps there's a bug somewhere where the code accounts
> for the presence of doc values in FieldCacheImpl?
>
> Together with Erick we implemented a bunch of tests that illustrate this
> issue - both the tests and the code can be found on the branch
> "jira/solr-12259":
>
> * code.tests.AddDVMPLuceneTest2 - this is a pure Lucene test that shows how
> doc values are lost after several rounds of merging while concurrent
> indexing is going on. This failure is 100% reproducible.
>
> * code.tests.AddDvStress - this is a Solr test that repeatedly creates a
> collection without doc values, starts indexing, changes the config to use
> ADVMP, makes the schema change to turn doc values on, and verifies the
> number of facets on the target field. This test also fails after a while
> with the same symptoms as the Lucene one, so I think that solving the
> Lucene test failure should solve this failure too.
>
> Any suggestions or insights are very much appreciated - I'm running out of
> ideas to try...
>
> —
> Andrzej Białecki

--
Adrien
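
For reference, the "marker in the SegmentInfo diagnostics map" optimisation Andrzej describes could look roughly like the following sketch. The marker key, class, and helper names are hypothetical, not the actual ADVMP code:

    import java.util.HashMap;
    import java.util.Map;
    import org.apache.lucene.index.SegmentCommitInfo;

    // Hypothetical helper around the SegmentInfo diagnostics map.
    final class ConversionMarker {
      static final String KEY = "AddDocValuesMergePolicy.converted"; // made-up key

      // Skip segments that an earlier forced rewrite already converted.
      static boolean alreadyConverted(SegmentCommitInfo info) {
        return "true".equals(info.info.getDiagnostics().get(KEY));
      }

      // Record the marker on a freshly converted segment so later merge
      // rounds don't re-check or re-uninvert it. getDiagnostics() may return
      // an unmodifiable map, so update a copy and set it back.
      static void markConverted(SegmentCommitInfo info) {
        Map<String, String> diag = new HashMap<>(info.info.getDiagnostics());
        diag.put(KEY, "true");
        info.info.setDiagnostics(diag);
      }
    }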