[ 
https://issues.apache.org/jira/browse/SOLR-5652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13882958#comment-13882958
 ] 

Steve Rowe commented on SOLR-5652:
----------------------------------

bq. It looks to me like there are two problems here: 1) the same doc is showing 
up on different pages when deep paging; and 2) missing docvalue docs are sorted 
incorrectly.

I think I understand problem #2: non-multi-valued numeric and string fields are 
created (by TrieField's and StrField's createFields() methods) as 
NumericDocValuesField-s and SortedDocValuesField-s, respectively, and these 
require each doc to have a value, which apparently defaults to zero for 
NumericDocValuesField-s and the empty string for SortedDocValuesField-s.

Here are the declarations for the field types that have this problem in 
DistribCursorPagingTest (from schema-sorts.xml):

{code:xml}
<fieldtype name="str_dv_last" class="solr.StrField" stored="true" 
indexed="false" docValues="true" sortMissingLast="true"/>
<fieldtype name="str_dv_first" class="solr.StrField" stored="true" 
indexed="false" docValues="true" sortMissingFirst="true"/>

<fieldtype name="int_dv_last" class="solr.TrieIntField" stored="true" 
indexed="false" docValues="true" sortMissingLast="true"/>
<fieldtype name="int_dv_first" class="solr.TrieIntField" stored="true" 
indexed="false" docValues="true" sortMissingFirst="true"/>

<fieldtype name="long_dv_last" class="solr.TrieLongField" stored="true" 
indexed="false" docValues="true" sortMissingLast="true"/>
<fieldtype name="long_dv_first" class="solr.TrieLongField" stored="true" 
indexed="false" docValues="true" sortMissingFirst="true"/>

<fieldtype name="float_dv_last" class="solr.TrieFloatField" stored="true" 
indexed="false" docValues="true" sortMissingLast="true"/>
<fieldtype name="float_dv_first" class="solr.TrieFloatField" stored="true" 
indexed="false" docValues="true" sortMissingFirst="true"/>

<fieldtype name="double_dv_last" class="solr.TrieDoubleField" stored="true" 
indexed="false" docValues="true" sortMissingLast="true"/>
<fieldtype name="double_dv_first" class="solr.TrieDoubleField" stored="true" 
indexed="false" docValues="true" sortMissingFirst="true"/>
{code}

I think that the above declarations should be disallowed by Solr, because they 
contain docValues="true" + sortMissing<Last|First>="true"; the user is asking 
for a particular sorting behavior for missing values, when there never will be 
missing values.

Also, the Solr Ref Guide 
[says|https://cwiki.apache.org/confluence/display/solr/DocValues] about 
docvalue fields "If this type is used, the field must be either required or 
have a default value, meaning every document must have a value for this field." 
 However, neither the above field types nor the fields using them are required 
or have a default specified.  Maybe this should be enforced by schema parsing?
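
For comparison, declarations that would satisfy the Ref Guide wording might 
look something like the sketch below.  The type and field names are 
hypothetical (they are not in schema-sorts.xml), and I'm assuming that the 
standard required and default field attributes are what the Ref Guide has in 
mind:

{code:xml}
<!-- Hypothetical docvalues type without any sortMissing* option -->
<fieldtype name="example_int_dv" class="solr.TrieIntField" stored="true" 
indexed="false" docValues="true"/>

<!-- Option 1: every document must supply a value -->
<field name="example_required_int" type="example_int_dv" required="true"/>

<!-- Option 2: documents without a value get an explicit default -->
<field name="example_defaulted_int" type="example_int_dv" default="0"/>
{code}

Either way every document ends up with an explicit value, so the question of 
how to sort missing values never comes up.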

> Heisenbug in DistribCursorPagingTest: "walk already seen ..."
> -------------------------------------------------------------
>
>                 Key: SOLR-5652
>                 URL: https://issues.apache.org/jira/browse/SOLR-5652
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Hoss Man
>            Assignee: Hoss Man
>         Attachments: 129.log, 372.log, 
> jenkins.thetaphi.de_Lucene-Solr-4.x-MacOSX_1200.log.txt, 
> jenkins.thetaphi.de_Lucene-Solr-4.x-MacOSX_1217.log.txt
>
>
> Several times now, Uwe's jenkins has encountered a "walk already seen ..." 
> assertion failure from DistribCursorPagingTest that I've been unable to 
> fathom, let alone reproduce (although sarowe was able to trigger a similar, 
> non-reproducible seed, failure on his machine)
> Using this as a tracking issue to try and make sense of it.
> Summary of things noticed so far:
> * So far only seen on http://jenkins.thetaphi.de & sarowe's mac
> * So far seen on MacOSX and Linux
> * So far seen on branch 4x and trunk
> * So far seen on Java6, Java7, and Java8
> * fails occurred in first block of randomized testing: 
> ** we've indexed a small number of randomized docs
> ** we're explicitly looping over every field and sorting in both directions
> * fails were sorting on one of the "\*_dv_last" or "\*_dv_first" fields 
> (docValues=true, either sortMissingLast=true OR sortMissingFirst=true) 
> ** for desc sorts, sort on same field asc has worked fine just before this 
> (fields are in arbitrary order, but "asc" always tried before "desc")
> ** sorting on some other random fields has sometimes been tried before this 
> and worked
> (specifics of each failure seen in the wild recorded in comments)


