[ 
https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16524890#comment-16524890
 ] 

Ishan Chattopadhyaya edited comment on LUCENE-7745 at 6/27/18 11:07 AM:
------------------------------------------------------------------------

Here [0] are some very early experiments that I ran with Kishore Angani, a 
colleague at Unbxd.

1. Generic problem: Given a result set (of document hits) and a scoring 
function, return a sorted list of documents along with the computed scores 
(which may leverage one or more indexed fields).
2. Specific problem: Given (up to 11M) points and associated docids, compute 
the distance of each point from a given query point. Return the list of 
documents sorted by these distances.
3. The GPU implementation is based on the Thrust library (a C++ library, 
Apache 2.0 licensed), called through a JNI wrapper. Timings include copying 
the data (scores and sorted docids) back from the GPU to the host system and 
accessing it from Java (via DirectByteBuffer).
4. CPU implementation was based on SpatialExample [1], which is perhaps not the 
fastest (points fields are better, I think).
5. Hardware: CPU is i7 5820k 4.3GHz (OC), 32GB RAM @ 2133MHz. GPU is Nvidia GTX 
1080, 11GB GDDR5 memory.
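
To make items 1 and 2 concrete, here is a minimal CPU-side sketch in Java of the "score every hit, then sort docids by score" pattern. It is not taken from the benchmark repository [0]: the class and method names are hypothetical, and planar Euclidean distance is an assumption, since the distance function is not stated above. On a GPU the same two steps would roughly correspond to a per-element transform followed by a key-value sort (Thrust offers `thrust::transform` and `thrust::sort_by_key`), though whether the benchmark uses exactly those calls is not stated here.

```java
import java.util.Arrays;

// Hypothetical CPU-side sketch of "score, then sort docids by score".
// Not taken from the benchmark repository; all names here are illustrative.
final class DistanceSortSketch {

    /** Docids ordered by ascending distance, with the matching distances. */
    static final class Ranked {
        final int[] docIds;
        final double[] distances;

        Ranked(int[] docIds, double[] distances) {
            this.docIds = docIds;
            this.distances = distances;
        }
    }

    // Assumption: planar Euclidean distance; the original post does not say
    // which distance function was benchmarked.
    static double distance(double x, double y, double qx, double qy) {
        return Math.hypot(x - qx, y - qy);
    }

    static Ranked rank(int[] docIds, double[] xs, double[] ys, double qx, double qy) {
        int n = docIds.length;

        // Step 1: score every hit (one independent computation per document).
        double[] dist = new double[n];
        for (int i = 0; i < n; i++) {
            dist[i] = distance(xs[i], ys[i], qx, qy);
        }

        // Step 2: sort positions by score. Arrays.sort on objects is stable,
        // so documents at equal distance keep their input order.
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) {
            order[i] = i;
        }
        Arrays.sort(order, (a, b) -> Double.compare(dist[a], dist[b]));

        // Step 3: gather docids and scores in sorted order.
        int[] sortedDocIds = new int[n];
        double[] sortedDist = new double[n];
        for (int i = 0; i < n; i++) {
            sortedDocIds[i] = docIds[order[i]];
            sortedDist[i] = dist[order[i]];
        }
        return new Ranked(sortedDocIds, sortedDist);
    }

    public static void main(String[] args) {
        int[] docIds = {7, 3, 9};
        double[] xs = {3.0, 0.0, 1.0};
        double[] ys = {4.0, 1.0, 0.0};
        Ranked r = rank(docIds, xs, ys, 0.0, 0.0);
        System.out.println(Arrays.toString(r.docIds));
    }
}
```

Step 1 is independent per document, which is why this kind of workload is a plausible fit for a GPU; the sort in step 2 is the part where a parallel key-value sort would replace `Arrays.sort`.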

Results seem promising: the GPU is able to score 11M documents in ~50ms! In 
the chart below, blue is GPU and red is CPU (Lucene).

!gpu-benchmarks.png|width=450!


[0] - https://github.com/chatman/gpu-benchmarks
[1] - https://github.com/apache/lucene-solr/blob/master/lucene/spatial-extras/src/test/org/apache/lucene/spatial/SpatialExample.java


> Explore GPU acceleration
> ------------------------
>
>                 Key: LUCENE-7745
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7745
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Ishan Chattopadhyaya
>            Assignee: Ishan Chattopadhyaya
>            Priority: Major
>              Labels: gsoc2017, mentor
>         Attachments: gpu-benchmarks.png
>
>
> Parts of Lucene could potentially be sped up if computations were offloaded 
> from the CPU to the GPU(s). With commodity GPUs having as much as 12GB of 
> high-bandwidth RAM, we might be able to leverage GPUs to speed up parts of 
> Lucene (indexing, search).
> The first area that comes to mind is spatial filtering, which is 
> traditionally known to be a good candidate for GPU-based speedup (esp. when 
> complex polygons are involved). In the past, Mike McCandless has mentioned 
> that "both initial indexing and merging are CPU/IO intensive, but they are 
> very amenable to soaking up the hardware's concurrency."
> I'm opening this issue as an exploratory task, suitable for a GSoC project. I 
> volunteer to mentor any GSoC student willing to work on it this summer.


