[jira] [Commented] (SOLR-12572) Reuse fieldvalues computed while sorting at writing in ExportWriter

Varun Thacker (JIRA) Mon, 23 Jul 2018 16:22:09 -0700


    [ 
https://issues.apache.org/jira/browse/SOLR-12572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553539#comment-16553539
 ]


Varun Thacker commented on SOLR-12572:
--------------------------------------

Thanks Amrit for the patch! Few comments

1. There is a correctness issue with this patch around empty values . Here is a 
test in TestExportWriter which demonstrates the problem
{code:java}
@Test
public void testEmptyValues() throws Exception {
  //Index 2 document with one document that doesn't have field2_i_p
  //Sort and return field2_i_p
  //Test SOLR-12572 for potential NPEs
  assertU(delQ("*:*"));
  assertU(commit());


  assertU(adoc("id","1", "field2_i_p","1"));
  assertU(adoc("id","2"));
  assertU(commit());

  String resp = h.query(req("q", "*:*", "qt", "/export", "fl", "id,field2_i_p", 
"sort", "field2_i_p asc"));
}{code}
 

The output for this before the patch is
{code:java}
{
"responseHeader":{"status":0},
"response":{
"numFound":2,
"docs":[{
"id":"2"}
,{
"id":"1",
"field2_i_p":1}]}}{code}
and after the patch the output looks like
{code:java}
{
"responseHeader":{"status":0},
"response":{
"numFound":2,
"docs":[{
"id":"2",
"field2_i_p":0}
,{
"id":"1",
"field2_i_p":1}]}}
{code}
See how doc id:2 get's {{"field2_i_p":0 . This is because in IntFieldWriter we 
assume that a document has a value. And 0 because IntValue set's currentValue 
to 0 if no value exists. I believe this will be true for the remaining fields 
so we should expand the test case to make sure we aren't introducing a 
regression}}
{code:java}
if (sortValue.getField().equals(this.field)) {
  val = (int) sortValue.getCurrentValue();
  ew.put(this.field, val);
  return true;
}{code}
 

2. Can we implement this method in SortDoc
{code:java}
public SortValue getSortValue(String field) {
  for (SortValue value : sortValues) {
    if (value.getField().equals("field")) {
      return value;
    }
  }
  return null;
}{code}
If we do this then LongFieldWriter and all the other FieldWriter's logic around 
this could be simplified 
{code:java}
for (SortValue sortValue : sortDoc.getSortValues()) {
  if (sortValue.getField().equals(this.field)) {
    val = (long) sortValue.getCurrentValue();
    ew.put(this.field, val);
    return true;
  }
}{code}
 

3. We'd need to implement DoubleValue with the new interface otherwise there is 
a compilation issue

 

 

> Reuse fieldvalues computed while sorting at writing in ExportWriter
> -------------------------------------------------------------------
>
>                 Key: SOLR-12572
>                 URL: https://issues.apache.org/jira/browse/SOLR-12572
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: streaming expressions
>            Reporter: Amrit Sarkar
>            Assignee: Varun Thacker
>            Priority: Minor
>         Attachments: SOLR-12572.patch, SOLR-12572.patch
>
>
> While exporting result through "/export" handler,
> {code:java}
> http://localhost:8983/solr/core_name/export?q=my-query&sort=severity+desc,timestamp+desc&fl=severity,timestamp,msg
> {code}
> Doc-values are sought for all the {{sort}} fields defined (in this example 
> 'severity, 'timestamp'). When we stream out docs we again make doc-value 
> seeks against the {{fl}} fields ('severity','timestamp','msg') . 
> In most common use-cases we have {{fl = sort}} fields, or atleast the sort 
> fields are subset of {{fl}} fields, so if we can *pre-collect* the values 
> while sorting it, we can reduce the doc-value seeks potentially bringing 
> *speed improvement*.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[jira] [Commented] (SOLR-12572) Reuse fieldvalues computed while sorting at writing in ExportWriter

Reply via email to