[GitHub] [lucene-solr] atris commented on issue #754: LUCENE-8875: Introduce Optimized Collector For Large Number Of Hits

2019-07-10 Thread GitBox
atris commented on issue #754: LUCENE-8875: Introduce Optimized Collector For Large Number Of Hits URL: https://github.com/apache/lucene-solr/pull/754#issuecomment-510028794 @jpountz JFYI I ran ant beast in 3 batches of 10 times each and it ran clean: WARNING: All illegal access

[GitHub] [lucene-solr] atris commented on issue #754: LUCENE-8875: Introduce Optimized Collector For Large Number Of Hits

2019-07-10 Thread GitBox
atris commented on issue #754: LUCENE-8875: Introduce Optimized Collector For Large Number Of Hits URL: https://github.com/apache/lucene-solr/pull/754#issuecomment-510007530 > @atris Can you give some result comparison or other evidence? Please see the blog post listed above.

[GitHub] [lucene-solr] atris commented on issue #754: LUCENE-8875: Introduce Optimized Collector For Large Number Of Hits

2019-07-10 Thread GitBox
atris commented on issue #754: LUCENE-8875: Introduce Optimized Collector For Large Number Of Hits URL: https://github.com/apache/lucene-solr/pull/754#issuecomment-509928095 Not just the cost of prepopulating the PQ, but also building it in the first place. If the total number of hits < number of
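A minimal sketch (not the PR's code, and using my own stand-in `Hit` record instead of Lucene's ScoreDoc) of the cost being discussed: when far fewer documents match than the requested numHits, eagerly allocating and sentinel-filling a numHits-sized heap is pure overhead compared to appending the matching hits to a plain list.

```java
// Illustrative only: contrasts an eagerly sentinel-filled, numHits-sized heap
// with a plain list that grows only as real hits arrive.
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

final class PrepopulationCostSketch {
  record Hit(int doc, float score) {}   // hypothetical stand-in for ScoreDoc

  public static void main(String[] args) {
    int numHits = 100_000;  // requested top-N
    int matching = 10_000;  // documents that actually match

    // Eager variant: O(numHits) allocation and heap ordering up front, even
    // though the queue will never be more than 10% full with real hits.
    PriorityQueue<Hit> pq =
        new PriorityQueue<>(numHits, (a, b) -> Float.compare(a.score, b.score));
    for (int i = 0; i < numHits; i++) {
      pq.add(new Hit(Integer.MAX_VALUE, Float.NEGATIVE_INFINITY)); // sentinels
    }

    // Lazy variant: O(matching) appends, no heap at all while hits fit in N.
    List<Hit> hits = new ArrayList<>();
    for (int doc = 0; doc < matching; doc++) {
      hits.add(new Hit(doc, 1.0f / (doc + 1)));
    }
    System.out.println("pq size=" + pq.size() + ", list size=" + hits.size());
  }
}
```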

[GitHub] [lucene-solr] atris commented on issue #754: LUCENE-8875: Introduce Optimized Collector For Large Number Of Hits

2019-07-09 Thread GitBox
atris commented on issue #754: LUCENE-8875: Introduce Optimized Collector For Large Number Of Hits URL: https://github.com/apache/lucene-solr/pull/754#issuecomment-509655033 I just discovered ant beast, which seems like the right tool for ensuring that such random tests get reproduced and

[GitHub] [lucene-solr] atris commented on issue #754: LUCENE-8875: Introduce Optimized Collector For Large Number Of Hits

2019-07-09 Thread GitBox
atris commented on issue #754: LUCENE-8875: Introduce Optimized Collector For Large Number Of Hits URL: https://github.com/apache/lucene-solr/pull/754#issuecomment-509599328 @jpountz Thanks for looking through! I refactored the tests to remove the dependency on random() to populate

[GitHub] [lucene-solr] atris commented on issue #754: LUCENE-8875: Introduce Optimized Collector For Large Number Of Hits

2019-07-09 Thread GitBox
atris commented on issue #754: LUCENE-8875: Introduce Optimized Collector For Large Number Of Hits URL: https://github.com/apache/lucene-solr/pull/754#issuecomment-509588693 @jpountz @mikemccand Thanks for your comments, I have fixed them. I am surprised that the test is failing: it

[GitHub] [lucene-solr] atris commented on issue #754: LUCENE-8875: Introduce Optimized Collector For Large Number Of Hits

2019-07-09 Thread GitBox
atris commented on issue #754: LUCENE-8875: Introduce Optimized Collector For Large Number Of Hits URL: https://github.com/apache/lucene-solr/pull/754#issuecomment-509534291 Added a dedicated test for checking order of hits when TopDocs are returned from just the hits list and no PQ is
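The PR's actual test is not shown here; the following is only a generic sketch of the kind of check such a test performs with the stock Lucene API: index fewer documents than the requested N, search, and assert the returned TopDocs come back in non-increasing score order (class name and test data are mine).

```java
// Generic hit-ordering check, assuming lucene-core and lucene-analyzers-common
// on the classpath. Run with -ea to enable the assertion.
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

public class HitOrderSketch {
  public static void main(String[] args) throws Exception {
    Directory dir = new ByteBuffersDirectory();
    try (IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()))) {
      for (int i = 0; i < 100; i++) { // far fewer docs than the requested N below
        Document doc = new Document();
        doc.add(new TextField("body", "lucene hit ordering", Field.Store.NO));
        writer.addDocument(doc);
      }
    }
    try (DirectoryReader reader = DirectoryReader.open(dir)) {
      IndexSearcher searcher = new IndexSearcher(reader);
      TopDocs topDocs = searcher.search(new TermQuery(new Term("body", "lucene")), 10_000);
      for (int i = 1; i < topDocs.scoreDocs.length; i++) {
        // Hits must come back in non-increasing score order regardless of
        // whether a priority queue was involved in collecting them.
        assert topDocs.scoreDocs[i - 1].score >= topDocs.scoreDocs[i].score;
      }
    }
  }
}
```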

[GitHub] [lucene-solr] atris commented on issue #754: LUCENE-8875: Introduce Optimized Collector For Large Number Of Hits

2019-07-09 Thread GitBox
atris commented on issue #754: LUCENE-8875: Introduce Optimized Collector For Large Number Of Hits URL: https://github.com/apache/lucene-solr/pull/754#issuecomment-509525838 @jpountz Thanks, I have fixed the comments. The reason I did not use the BooleanQuery for the PQ build and not

[GitHub] [lucene-solr] atris commented on issue #754: LUCENE-8875: Introduce Optimized Collector For Large Number Of Hits

2019-07-09 Thread GitBox
atris commented on issue #754: LUCENE-8875: Introduce Optimized Collector For Large Number Of Hits URL: https://github.com/apache/lucene-solr/pull/754#issuecomment-509518620 FYI, ant precommit passes -- ran it with the latest iteration

[GitHub] [lucene-solr] atris commented on issue #754: LUCENE-8875: Introduce Optimized Collector For Large Number Of Hits

2019-07-08 Thread GitBox
atris commented on issue #754: LUCENE-8875: Introduce Optimized Collector For Large Number Of Hits URL: https://github.com/apache/lucene-solr/pull/754#issuecomment-509207157 @jpountz I have updated the PR per your comments. ant precommit passes. Apologies, this iteration also got

[GitHub] [lucene-solr] atris commented on issue #754: LUCENE-8875: Introduce Optimized Collector For Large Number Of Hits

2019-07-08 Thread GitBox
atris commented on issue #754: LUCENE-8875: Introduce Optimized Collector For Large Number Of Hits URL: https://github.com/apache/lucene-solr/pull/754#issuecomment-509169211 @jpountz Thanks for the comments, updated the PR.

[GitHub] [lucene-solr] atris commented on issue #754: LUCENE-8875: Introduce Optimized Collector For Large Number Of Hits

2019-07-08 Thread GitBox
atris commented on issue #754: LUCENE-8875: Introduce Optimized Collector For Large Number Of Hits URL: https://github.com/apache/lucene-solr/pull/754#issuecomment-509109583 @jpountz I have pushed a new iteration which does as discussed, i.e. builds a hits list and populates hits as long
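The snippet is truncated, so the following is only a rough sketch, under my own names, of the strategy as described in these comments: append hits to a plain list while fewer than numHits have been seen, and build (and seed) a priority queue only if the list overflows. It is not the PR's implementation; it uses java.util.PriorityQueue rather than Lucene's internal hit queue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;
import org.apache.lucene.search.ScoreDoc;

final class ListThenQueueSketch {
  private final int numHits;
  private final List<ScoreDoc> hits = new ArrayList<>();
  private PriorityQueue<ScoreDoc> pq; // created lazily, min-heap on score

  ListThenQueueSketch(int numHits) {
    this.numHits = numHits;
  }

  void collect(int doc, float score) {
    if (pq == null) {
      if (hits.size() < numHits) {
        hits.add(new ScoreDoc(doc, score)); // cheap path: no queue yet
        return;
      }
      // First overflow: build the queue once and seed it from the list.
      pq = new PriorityQueue<>(numHits, (a, b) -> Float.compare(a.score, b.score));
      pq.addAll(hits);
      hits.clear();
    }
    if (score > pq.peek().score) { // keep only the numHits best from here on
      pq.poll();
      pq.add(new ScoreDoc(doc, score));
    }
  }
}
```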

[GitHub] [lucene-solr] atris commented on issue #754: LUCENE-8875: Introduce Optimized Collector For Large Number Of Hits

2019-07-06 Thread GitBox
atris commented on issue #754: LUCENE-8875: Introduce Optimized Collector For Large Number Of Hits URL: https://github.com/apache/lucene-solr/pull/754#issuecomment-508914746 @jpountz @tokee Thanks for your comments. I am planning to maintain an ArrayList of ScoreDocs and collect

[GitHub] [lucene-solr] atris commented on issue #754: LUCENE-8875: Introduce Optimized Collector For Large Number Of Hits

2019-07-05 Thread GitBox
atris commented on issue #754: LUCENE-8875: Introduce Optimized Collector For Large Number Of Hits URL: https://github.com/apache/lucene-solr/pull/754#issuecomment-508804369 > Actually I don't think we need a growable priority queue. For such large number of hits it'd be probably more
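The quoted comment is cut off here, so I will not guess its conclusion; purely as an illustration of how top-N can be produced without any growable priority queue, one alternative is to collect every hit into a flat list and sort (or partially select) once at the end. A generic sketch with hypothetical names, not the PR's code:

```java
import java.util.Comparator;
import java.util.List;
import org.apache.lucene.search.ScoreDoc;

final class SortAtEndSketch {
  // Sort the collected hits by descending score and keep the first numHits.
  static ScoreDoc[] topN(List<ScoreDoc> collected, int numHits) {
    collected.sort(Comparator.comparingDouble((ScoreDoc sd) -> sd.score).reversed());
    int n = Math.min(numHits, collected.size());
    return collected.subList(0, n).toArray(new ScoreDoc[0]);
  }
}
```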

[GitHub] [lucene-solr] atris commented on issue #754: LUCENE-8875: Introduce Optimized Collector For Large Number Of Hits

2019-07-05 Thread GitBox
atris commented on issue #754: LUCENE-8875: Introduce Optimized Collector For Large Number Of Hits URL: https://github.com/apache/lucene-solr/pull/754#issuecomment-508784416 > Not prepopulating the hit queue is only one part of the problem, we would also need to not allocate `numHits`

[GitHub] [lucene-solr] atris commented on issue #754: LUCENE-8875: Introduce Optimized Collector For Large Number Of Hits

2019-07-04 Thread GitBox
atris commented on issue #754: LUCENE-8875: Introduce Optimized Collector For Large Number Of Hits URL: https://github.com/apache/lucene-solr/pull/754#issuecomment-508377686 Any thoughts on this?

[GitHub] [lucene-solr] atris commented on issue #754: LUCENE-8875: Introduce Optimized Collector For Large Number Of Hits

2019-07-02 Thread GitBox
atris commented on issue #754: LUCENE-8875: Introduce Optimized Collector For Large Number Of Hits URL: https://github.com/apache/lucene-solr/pull/754#issuecomment-507544094 I ran some tests with N requested as 100K but only around 10K matching docs. This gives a sizeable