[ https://issues.apache.org/jira/browse/SOLR-5652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13882958#comment-13882958 ]
Steve Rowe commented on SOLR-5652:
----------------------------------

bq. It looks to me like there are two problems here: 1) the same doc is showing up on different pages when deep paging; and 2) missing docvalue docs are sorted incorrectly.

I think I understand problem #2: non-multi-valued numeric and string fields are created (by TrieField's and StrField's createFields() methods) as NumericDocValuesField-s and SortedDocValuesField-s, respectively, and these require each doc to have a value, which apparently defaults to zero for NumericDocValuesField-s and the empty string for SortedDocValuesField-s.

Here are the declarations for the field types that have this problem in DistribCursorPagingTest (from schema-sorts.xml):

{code:xml}
<fieldtype name="str_dv_last" class="solr.StrField" stored="true" indexed="false" docValues="true" sortMissingLast="true"/>
<fieldtype name="str_dv_first" class="solr.StrField" stored="true" indexed="false" docValues="true" sortMissingFirst="true"/>
<fieldtype name="int_dv_last" class="solr.TrieIntField" stored="true" indexed="false" docValues="true" sortMissingLast="true"/>
<fieldtype name="int_dv_first" class="solr.TrieIntField" stored="true" indexed="false" docValues="true" sortMissingFirst="true"/>
<fieldtype name="long_dv_last" class="solr.TrieLongField" stored="true" indexed="false" docValues="true" sortMissingLast="true"/>
<fieldtype name="long_dv_first" class="solr.TrieLongField" stored="true" indexed="false" docValues="true" sortMissingFirst="true"/>
<fieldtype name="float_dv_last" class="solr.TrieFloatField" stored="true" indexed="false" docValues="true" sortMissingLast="true"/>
<fieldtype name="float_dv_first" class="solr.TrieFloatField" stored="true" indexed="false" docValues="true" sortMissingFirst="true"/>
<fieldtype name="double_dv_last" class="solr.TrieDoubleField" stored="true" indexed="false" docValues="true" sortMissingLast="true"/>
<fieldtype name="double_dv_first" class="solr.TrieDoubleField" stored="true" indexed="false" docValues="true" sortMissingFirst="true"/>
{code}

I think that the above declarations should be disallowed by Solr, because they contain docValues="true" + sortMissing<Last|First>="true"; the user is asking for a particular sorting behavior for missing values, when there never will be missing values.

Also, the Solr Ref Guide [says|https://cwiki.apache.org/confluence/display/solr/DocValues] about docvalue fields "If this type is used, the field must be either required or have a default value, meaning every document must have a value for this field." However, neither the above field types nor the fields using them are required or have a default specified. Maybe this should be enforced by schema parsing?

> Heisenbug in DistribCursorPagingTest: "walk already seen ..."
> -------------------------------------------------------------
>
>                 Key: SOLR-5652
>                 URL: https://issues.apache.org/jira/browse/SOLR-5652
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Hoss Man
>            Assignee: Hoss Man
>         Attachments: 129.log, 372.log, jenkins.thetaphi.de_Lucene-Solr-4.x-MacOSX_1200.log.txt, jenkins.thetaphi.de_Lucene-Solr-4.x-MacOSX_1217.log.txt
>
>
> Several times now, Uwe's jenkins has encountered a "walk already seen ..." assertion failure from DistribCursorPagingTest that I've been unable to fathom, let alone reproduce (although sarowe was able to trigger a similar, non-reproducible seed, failure on his machine)
> Using this as a tracking issue to try and make sense of it.
> Summary of things noticed so far:
> * So far only seen on http://jenkins.thetaphi.de & sarowe's mac
> * So far seen on MacOSX and Linux
> * So far seen on branch 4x and trunk
> * So far seen on Java6, Java7, and Java8
> * fails occurred in first block of randomized testing:
> ** we've indexed a small number of randomized docs
> ** we're explicitly looping over every field and sorting in both directions
> * fails were sorting on one of the "\*_dv_last" or "\*_dv_first" fields (docValues=true, either sortMissingLast=true OR sortMissingFirst=true)
> ** for desc sorts, sort on same field asc has worked fine just before this (fields are in arbitrary order, but "asc" always tried before "desc")
> ** sorting on some other random fields has sometimes been tried before this and worked
> (specifics of each failure seen in the wild recorded in comments)

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
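For illustration only, the schema-parsing check suggested in the comment above could be sketched roughly as follows. This is not Solr code: FieldDecl and DocValuesSchemaCheck are hypothetical names standing in for whatever the schema parser produces, and the two rules simply restate the comment (single-valued docValues fields always carry a value, so sortMissingFirst/Last is meaningless on them, and the Ref Guide asks for required or a default).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a parsed <fieldtype>/<field> declaration; not a Solr class.
final class FieldDecl {
    final String name;
    final boolean docValues;
    final boolean multiValued;
    final boolean sortMissingFirst;
    final boolean sortMissingLast;
    final boolean required;
    final String defaultValue; // null when the schema declares no default

    FieldDecl(String name, boolean docValues, boolean multiValued,
              boolean sortMissingFirst, boolean sortMissingLast,
              boolean required, String defaultValue) {
        this.name = name;
        this.docValues = docValues;
        this.multiValued = multiValued;
        this.sortMissingFirst = sortMissingFirst;
        this.sortMissingLast = sortMissingLast;
        this.required = required;
        this.defaultValue = defaultValue;
    }
}

// Hypothetical validator; returns one message per rule the declaration violates.
final class DocValuesSchemaCheck {
    private DocValuesSchemaCheck() {}

    static List<String> validate(FieldDecl f) {
        List<String> problems = new ArrayList<String>();
        // The comment only discusses non-multi-valued docValues fields.
        if (!f.docValues || f.multiValued) {
            return problems;
        }
        // Rule 1: a missing-value sort order is requested, but values are never missing.
        if (f.sortMissingFirst || f.sortMissingLast) {
            problems.add(f.name + ": docValues=\"true\" combined with sortMissingFirst/Last=\"true\"");
        }
        // Rule 2: the Ref Guide wording asks for required or a default value.
        if (!f.required && f.defaultValue == null) {
            problems.add(f.name + ": docValues=\"true\" but neither required nor given a default");
        }
        return problems;
    }
}
```

Under these assumptions, a declaration like str_dv_last from schema-sorts.xml would be reported twice (once per rule), while a required docValues field without sortMissing settings would pass. Whether such declarations should fail schema loading or only log a warning is left open here.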