[jira] [Commented] (SOLR-12616) Track down performance slowdowns with ExportWriter

2018-08-10 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16575995#comment-16575995
 ] 

ASF subversion and git services commented on SOLR-12616:


Commit e9f3a3ce1d482bd90ba8aca6e8cb7fe6c86756eb in lucene-solr's branch 
refs/heads/jira/http2 from [~varunthacker]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=e9f3a3c ]

SOLR-12616: Optimize Export writer upto 4 sort fields to get better 
performance. This was removed in SOLR-11598 but brought back in the same version


> Track down performance slowdowns with ExportWriter
> --
>
> Key: SOLR-12616
> URL: https://issues.apache.org/jira/browse/SOLR-12616
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: streaming expressions
>Reporter: Varun Thacker
>Priority: Major
> Fix For: master (8.0), 7.5
>
> Attachments: DefaultCode-1.png, DefaultCode-2.png, SOLR-12616.patch, 
> SOLR-12616.patch, SingleSortValue-1.png, SingleSortValue-2.png
>
>
> Just to be clear for users glancing through this Jira: the performance 
> slowdown is currently on an unreleased version of Solr, so no released 
> versions are affected by this.
> While doing some benchmarking for SOLR-12572, I compared the export writer's 
> performance against Solr 7.4, and some slowdowns seem to have been 
> introduced. Most likely this is because of SOLR-11598.
> On a 1 shard, 1 replica collection with 25M docs, we issue the following 
> query:
> {code:java}
> /export?q=*:*&sort=id desc&fl=id{code}
> Solr 7.4 took 8:10, 8:20 and 8:22 in the 3 runs that I did.
> Master took 10:46.
> Amrit has done some more benchmarking, so he can fill in some more numbers 
> here. 
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12616) Track down performance slowdowns with ExportWriter

2018-08-08 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16573824#comment-16573824
 ] 

ASF subversion and git services commented on SOLR-12616:


Commit 13b9e28f9dbb0d117d8758c37d8df7d4c17a9edc in lucene-solr's branch 
refs/heads/branch_7x from [~varunthacker]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=13b9e28f ]

SOLR-12616: Optimize Export writer upto 4 sort fields to get better 
performance. This was removed in SOLR-11598 but brought back in the same version

(cherry picked from commit e9f3a3c)





[jira] [Commented] (SOLR-12616) Track down performance slowdowns with ExportWriter

2018-08-08 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16573820#comment-16573820
 ] 

ASF subversion and git services commented on SOLR-12616:


Commit e9f3a3ce1d482bd90ba8aca6e8cb7fe6c86756eb in lucene-solr's branch 
refs/heads/master from [~varunthacker]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=e9f3a3c ]

SOLR-12616: Optimize Export writer upto 4 sort fields to get better 
performance. This was removed in SOLR-11598 but brought back in the same version


> Track down performance slowdowns with ExportWriter
> --
>
> Key: SOLR-12616
> URL: https://issues.apache.org/jira/browse/SOLR-12616
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: streaming expressions
>Reporter: Varun Thacker
>Priority: Major
> Attachments: DefaultCode-1.png, DefaultCode-2.png, SOLR-12616.patch, 
> SOLR-12616.patch, SingleSortValue-1.png, SingleSortValue-2.png
>
>



[jira] [Commented] (SOLR-12616) Track down performance slowdowns with ExportWriter

2018-08-08 Thread Varun Thacker (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16573611#comment-16573611
 ] 

Varun Thacker commented on SOLR-12616:
--

This patch adds back the SingleValueSortDoc, DoubleValueSortDoc, 
TripleValueSortDoc and QuadValueSortDoc classes. In my tests, the speed is 
back to the original speed.

 




[jira] [Commented] (SOLR-12616) Track down performance slowdowns with ExportWriter

2018-08-07 Thread Amrit Sarkar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16571751#comment-16571751
 ] 

Amrit Sarkar commented on SOLR-12616:
-

Thanks, Varun, for the detailed analysis and feedback on the issue. I agree that 
SOLR-11598 resulted in the slowness. Benchmarks done at my end also show an 
approximately 10-18% slowdown in the overall processing of export results.

I dug deeper and found the function that is slow, but I do not yet know the 
reason. Let me share the analysis first:

For query Q1 (single sort field), the run with {{SingleValueSortDoc}} 
reintroduced takes 4 mins, while the vanilla master branch code takes 4:45 
mins. I attached a sampler to both runs and am attaching screenshots for the 
respective export query executions.

If you look at the screenshots {{SingleSortValue-2}} and {{DefaultCode-2}}, the 
only significant difference (around 33 secs) between the processing times of 
the two executions is in {{setCurrentValue(docId)}}, which we haven't touched:
SingleSortValue: setCurrentValue(docId): *148 secs*
DefaultCode: setCurrentValue(docId): *181 secs*

I have analyzed the code closely enough to conclude that we are not making 
extra or unnecessary calls to {{setCurrentValue}}. We know the exact line 
that is causing the slowness: *ExportWriter:235*
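
As a rough cross-check of the figures above, the small sketch below computes how much of the overall gap between the two runs is accounted for by {{setCurrentValue}}. The class and method names are hypothetical and exist only for this illustration; the inputs are the timings quoted in this comment, converted to seconds.

```java
/** Hypothetical helper for this illustration only; not part of Solr. */
final class ProfileGap {

  /**
   * Returns the percentage of the total wall-clock gap between two runs
   * that is explained by the gap in a single profiled method.
   */
  static double shareOfGapPercent(int fastTotalSecs, int slowTotalSecs,
                                  int fastMethodSecs, int slowMethodSecs) {
    int totalGap = slowTotalSecs - fastTotalSecs;
    int methodGap = slowMethodSecs - fastMethodSecs;
    return 100.0 * methodGap / totalGap;
  }

  public static void main(String[] args) {
    // 4:00 vs 4:45 overall, 148 s vs 181 s inside setCurrentValue.
    double share = shareOfGapPercent(240, 285, 148, 181);
    System.out.println("setCurrentValue share of the gap (percent): " + share);
  }
}
```

With these inputs, 33 of the 45 seconds of difference sit in that one method, which is consistent with it being the only significant difference in the sampler output.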

> Track down performance slowdowns with ExportWriter
> --
>
> Key: SOLR-12616
> URL: https://issues.apache.org/jira/browse/SOLR-12616
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: streaming expressions
>Reporter: Varun Thacker
>Priority: Major
> Attachments: DefaultCode-1.png, DefaultCode-2.png, SOLR-12616.patch, 
> SingleSortValue-1.png, SingleSortValue-2.png
>
>



[jira] [Commented] (SOLR-12616) Track down performance slowdowns with ExportWriter

2018-08-06 Thread Varun Thacker (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16571042#comment-16571042
 ] 

Varun Thacker commented on SOLR-12616:
--

I can't seem to track down the difference between SortDoc and 
SingleValueSortDoc that makes SingleValueSortDoc so much faster.

I tried another round of experiments where I assumed SortDoc will only have one 
sort field and modified the following functions to mimic SingleValueSortDoc. 
The only remaining difference is that sortValues is still an array of length 
one rather than a single variable. The speed difference still exists:
{code:java}
  public void setValues(SortDoc sortDoc) {
    this.docId = sortDoc.docId;
    this.ord = sortDoc.ord;
    this.docBase = sortDoc.docBase;
    // Only one sort field is assumed, so copy just the first value.
    sortValues[0].setCurrentValue(sortDoc.sortValues[0]);
  }

  public boolean lessThan(Object o) {
    if (docId == -1) {
      return true;
    }
    SortDoc sd = (SortDoc) o;
    int comp = sortValues[0].compareTo(sd.sortValues[0]);
    if (comp == -1) {
      return true;
    } else if (comp == 1) {
      return false;
    } else {
      // Equal sort values: break the tie on the global doc id.
      return docId + docBase > sd.docId + sd.docBase;
    }
  }

{code}
To bring back the old performance, one approach we could take is to keep 
the specialized classes for up to 4 sort fields by doing this in the export 
writer:
{code:java}
if (sortValues.length == 1) {
  return new SingleValueSortDoc(sortValues[0]);
} else if (sortValues.length == 2) {
  return new DoubleValueSortDoc(sortValues[0], sortValues[1]);
// ... likewise for 3 and 4 sort fields ...
} else {
  return new SortDoc(sortValues);
}

{code}
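
To make the idea of the specialized classes concrete, here is a minimal, self-contained sketch of the same dispatch pattern. It is not the Solr code: {{LongSortValue}}, {{ArraySortDoc}}, {{SingleSortDoc}} and {{SortDocFactory}} are hypothetical stand-ins invented for this example, and only the one-field specialization is shown. The general class loops over an array of sort values, while the specialized class keeps its single value in a plain field.

```java
/** Hypothetical factory mirroring the length-based dispatch; not Solr's ExportWriter. */
final class SortDocFactory {
  static ArraySortDoc create(LongSortValue[] sortValues) {
    if (sortValues.length == 1) {
      return new SingleSortDoc(sortValues[0]);
    }
    return new ArraySortDoc(sortValues);
  }

  public static void main(String[] args) {
    ArraySortDoc a = create(new LongSortValue[] {new LongSortValue(1)});
    ArraySortDoc b = create(new LongSortValue[] {new LongSortValue(2)});
    a.docId = 0;
    b.docId = 1;
    System.out.println(a.getClass().getSimpleName() + " lessThan: " + a.lessThan(b));
  }
}

/** Hypothetical stand-in for one sort field's value; not Solr's real SortValue. */
final class LongSortValue {
  long current;

  LongSortValue(long current) { this.current = current; }

  int compareTo(LongSortValue other) { return Long.compare(current, other.current); }
}

/** General case: any number of sort fields, kept in an array. */
class ArraySortDoc {
  int docId = -1;
  int docBase;
  final LongSortValue[] sortValues;

  ArraySortDoc(LongSortValue... sortValues) { this.sortValues = sortValues; }

  boolean lessThan(ArraySortDoc sd) {
    if (docId == -1) return true;
    for (int i = 0; i < sortValues.length; i++) {
      int comp = sortValues[i].compareTo(sd.sortValues[i]);
      if (comp < 0) return true;
      if (comp > 0) return false;
    }
    // All sort values equal: break the tie on docId + docBase.
    return docId + docBase > sd.docId + sd.docBase;
  }
}

/** Specialized case: exactly one sort field, held in a field instead of an array slot. */
final class SingleSortDoc extends ArraySortDoc {
  final LongSortValue value1;

  SingleSortDoc(LongSortValue value1) {
    super(value1);
    this.value1 = value1;
  }

  @Override
  boolean lessThan(ArraySortDoc sd) {
    if (docId == -1) return true;
    // Assumes every doc in one export came from the same factory call shape.
    int comp = value1.compareTo(((SingleSortDoc) sd).value1);
    if (comp < 0) return true;
    if (comp > 0) return false;
    return docId + docBase > sd.docId + sd.docBase;
  }
}
```

The sketch only shows the shape of the change: the caller picks the class once, based on the number of sort fields, and the per-document comparison then avoids the loop and array indexing. It makes no claim about why that is faster in practice, which is still the open question here.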
 

> Track down performance slowdowns with ExportWriter
> --
>
> Key: SOLR-12616
> URL: https://issues.apache.org/jira/browse/SOLR-12616
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: streaming expressions
>Reporter: Varun Thacker
>Priority: Major
> Attachments: SOLR-12616.patch
>
>



[jira] [Commented] (SOLR-12616) Track down performance slowdowns with ExportWriter

2018-08-03 Thread Varun Thacker (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16569031#comment-16569031
 ] 

Varun Thacker commented on SOLR-12616:
--

Patch which tests {{SingleValueSortDoc}} vs. {{SortDoc}}.

Indexed 25M docs onto a 1 shard x 1 replica collection.

Query: {{/export?q=*:*&fl=id&sort=id desc}}

With {{-Dtest.export.writer.optimized=true}}: 7m13, 7m23

Without {{-Dtest.export.writer.optimized=true}}: 10m27, 10m31

I haven't yet looked into what the difference between SortDoc and 
SingleValueSortDoc is that causes such a speed difference.
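
For reference, the relative slowdown implied by these runs can be worked out with a few lines of Java. This is only an illustrative sketch: the {{ExportTimings}} class and its methods are hypothetical names made up for this example, and the inputs are the four timings listed above.

```java
/** Hypothetical helper for this illustration only; not part of the patch. */
final class ExportTimings {

  /** Parses a timing written as "7m13" into seconds. */
  static int toSeconds(String minutesAndSeconds) {
    String[] parts = minutesAndSeconds.split("m");
    return Integer.parseInt(parts[0]) * 60 + Integer.parseInt(parts[1]);
  }

  /** Mean of several runs, in seconds. */
  static double mean(String... runs) {
    double total = 0;
    for (String run : runs) {
      total += toSeconds(run);
    }
    return total / runs.length;
  }

  /** How much longer the candidate takes than the baseline, in percent. */
  static double slowdownPercent(double baselineSecs, double candidateSecs) {
    return (candidateSecs - baselineSecs) / baselineSecs * 100.0;
  }

  public static void main(String[] args) {
    double optimized = mean("7m13", "7m23");
    double generic = mean("10m27", "10m31");
    System.out.println("optimized mean (s): " + optimized);
    System.out.println("generic mean (s): " + generic);
    System.out.println("slowdown (percent): " + slowdownPercent(optimized, generic));
  }
}
```

Averaged over the two runs each, the non-optimized path takes roughly 44% longer than the optimized one on this data set.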

> Track down performance slowdowns with ExportWriter
> --
>
> Key: SOLR-12616
> URL: https://issues.apache.org/jira/browse/SOLR-12616
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Varun Thacker
>Priority: Major
> Attachments: SOLR-12616.patch
>
>