[
https://issues.apache.org/jira/browse/SOLR-12572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553539#comment-16553539
]
Varun Thacker commented on SOLR-12572:
--------------------------------------
Thanks Amrit for the patch! Few comments
1. There is a correctness issue with this patch around empty values . Here is a
test in TestExportWriter which demonstrates the problem
{code:java}
@Test
public void testEmptyValues() throws Exception {
//Index 2 document with one document that doesn't have field2_i_p
//Sort and return field2_i_p
//Test SOLR-12572 for potential NPEs
assertU(delQ("*:*"));
assertU(commit());
assertU(adoc("id","1", "field2_i_p","1"));
assertU(adoc("id","2"));
assertU(commit());
String resp = h.query(req("q", "*:*", "qt", "/export", "fl", "id,field2_i_p",
"sort", "field2_i_p asc"));
}{code}
The output for this before the patch is
{code:java}
{
"responseHeader":{"status":0},
"response":{
"numFound":2,
"docs":[{
"id":"2"}
,{
"id":"1",
"field2_i_p":1}]}}{code}
and after the patch the output looks like
{code:java}
{
"responseHeader":{"status":0},
"response":{
"numFound":2,
"docs":[{
"id":"2",
"field2_i_p":0}
,{
"id":"1",
"field2_i_p":1}]}}
{code}
See how doc id:2 get's {{"field2_i_p":0 . This is because in IntFieldWriter we
assume that a document has a value. And 0 because IntValue set's currentValue
to 0 if no value exists. I believe this will be true for the remaining fields
so we should expand the test case to make sure we aren't introducing a
regression}}
{code:java}
if (sortValue.getField().equals(this.field)) {
val = (int) sortValue.getCurrentValue();
ew.put(this.field, val);
return true;
}{code}
2. Can we implement this method in SortDoc
{code:java}
public SortValue getSortValue(String field) {
for (SortValue value : sortValues) {
if (value.getField().equals("field")) {
return value;
}
}
return null;
}{code}
If we do this then LongFieldWriter and all the other FieldWriter's logic around
this could be simplified
{code:java}
for (SortValue sortValue : sortDoc.getSortValues()) {
if (sortValue.getField().equals(this.field)) {
val = (long) sortValue.getCurrentValue();
ew.put(this.field, val);
return true;
}
}{code}
3. We'd need to implement DoubleValue with the new interface otherwise there is
a compilation issue
> Reuse fieldvalues computed while sorting at writing in ExportWriter
> -------------------------------------------------------------------
>
> Key: SOLR-12572
> URL: https://issues.apache.org/jira/browse/SOLR-12572
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Components: streaming expressions
> Reporter: Amrit Sarkar
> Assignee: Varun Thacker
> Priority: Minor
> Attachments: SOLR-12572.patch, SOLR-12572.patch
>
>
> While exporting result through "/export" handler,
> {code:java}
> http://localhost:8983/solr/core_name/export?q=my-query&sort=severity+desc,timestamp+desc&fl=severity,timestamp,msg
> {code}
> Doc-values are sought for all the {{sort}} fields defined (in this example
> 'severity, 'timestamp'). When we stream out docs we again make doc-value
> seeks against the {{fl}} fields ('severity','timestamp','msg') .
> In most common use-cases we have {{fl = sort}} fields, or atleast the sort
> fields are subset of {{fl}} fields, so if we can *pre-collect* the values
> while sorting it, we can reduce the doc-value seeks potentially bringing
> *speed improvement*.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]