[ 
https://issues.apache.org/jira/browse/SOLR-9166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated SOLR-9166:
---------------------------------
    Attachment: SOLR-9166.patch

The latest patch implements what we've discussed. The code changes are 
absolutely minimal, but they are made in ExportWriter since 
SortingResponseWriter has been retired. Tests have been added.

With this patch, no default values are returned for fields that are not in 
the docs. I'm arguing that returning defaults is incorrect behavior and that 
any code depending on it needs to be rewritten. We can discuss that of course....
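
To make the over-counting argument from the issue description concrete, here 
is a minimal Python sketch. The tuples are hand-written, hypothetical dicts 
standing in for /export output (not real Solr responses), and facet_counts is 
an illustrative helper, not a Solr API.

```python
from collections import Counter


def facet_counts(tuples, field):
    """Count values of `field`; tuples lacking the field go in a separate tally."""
    counts = Counter()
    missing = 0
    for t in tuples:
        if t.get(field) is None:
            missing += 1
        else:
            counts[t[field]] += 1
    return counts, missing


# Hypothetical data: doc "2" has no intField in the index.
# If zero is returned for the missing field, doc "2" looks like a real zero.
zero_for_missing = [
    {"id": "1", "intField": 0},
    {"id": "2", "intField": 0},
    {"id": "3", "intField": 5},
]
# If the field is simply omitted, the client can tell the two cases apart.
field_omitted = [
    {"id": "1", "intField": 0},
    {"id": "2"},
    {"id": "3", "intField": 5},
]
```

With the first list the zero bucket counts two docs and nothing is reported as 
missing; with the second, the zero bucket counts one doc and one doc is 
reported as missing.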

The test case I added ran afoul of LUCENE-7548. When that's committed the test 
case should be updated. See the comments in StreamingTest.checkSort.

The /export handler seems to sort missing fields first/last as it should. It's 
just that using the /select handler to get the proper ordering seemed like a 
better idea than hard-coding the results, as the current patch does. This 
test case should continue to run fine even after LUCENE-7548 is committed; 
it'll just be inelegant.

Still to do: Run the entire test suite to see what, if anything, breaks. Will 
do that tonight.

> Export handler returns zero for numeric fields that are not in the original 
> doc
> -------------------------------------------------------------------------------
>
>                 Key: SOLR-9166
>                 URL: https://issues.apache.org/jira/browse/SOLR-9166
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Erick Erickson
>            Assignee: Rohit
>         Attachments: SOLR-9166.patch, SOLR-9166.patch, SOLR-9166.patch
>
>
> From the dev list discussion:
> My original post.
> Zero is different from not
> existing. And let's claim that I want to process a stream and, say,
> facet on an integer field over the result set. There's no way on the
> client side to distinguish between a document that has a zero in the
> field and one that didn't have the field in the first place so I'll
> over-count the zero bucket.
> From Dennis Gove:
> Is this true for non-numeric fields as well? I agree that this seems like a 
> very bad thing.
> I can't imagine that a fix would cause a problem with Streaming Expressions, 
> ParallelSQL, or other given that the /select handler is not returning 0 for 
> these missing fields (the /select handler is the default handler for the 
> Streaming API so if nulls were a problem I imagine we'd have already seen 
> it). 
> That said, within Streaming Expressions there is a select(...) function which 
> supports a replace(...) operation which allows you to replace one value (or 
> null) with some other value. If a 0 were necessary one could use a 
> select(...) to replace null with 0 using an expression like this 
>    select(<stream>, replace(fieldA, null, withValue=0)). 
> The end result of that would be that the field fieldA would never have a null 
> value and for all tuples where a null value existed it would be replaced with 
> 0.
> Details on the select function can be found at 
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61330338#StreamingExpressions-select.
> And to answer Dennis' question, null gets returned for string DocValues fields.
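
For clients that still want a 0 where the field is absent, the effect of the 
select(<stream>, replace(fieldA, null, withValue=0)) expression quoted above 
can be approximated on the client side. The following is only a rough Python 
sketch of that semantics over plain dicts; replace_null is a hypothetical 
helper, not part of Solr or any Solr client library.

```python
def replace_null(tuples, field, with_value):
    """Yield tuples, substituting `with_value` where `field` is missing or None."""
    for t in tuples:
        if t.get(field) is None:
            # Copy rather than mutate, so the caller's tuples stay untouched.
            yield {**t, field: with_value}
        else:
            yield t


# Hypothetical tuples: "1" lacks fieldA, "3" carries an explicit null.
sample = [
    {"id": "1"},
    {"id": "2", "fieldA": 7},
    {"id": "3", "fieldA": None},
]
filled = list(replace_null(sample, "fieldA", 0))
```

After this step every tuple in filled has a non-null fieldA, which matches the 
end result described for the select(...) approach.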



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
