[jira] [Commented] (LUCENE-7745) Explore GPU acceleration for spatial search
[ https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16527334#comment-16527334 ]

Ishan Chattopadhyaya commented on LUCENE-7745:
----------------------------------------------

Ah, I think I wasn't clear on my intentions behind those numbers.

bq. if it brings any performance - I doubt that, because the call overhead between Java and CUDA is way too high - in contrast to Postgres where all in plain C/C++

I wanted to start with those experiments just to prove to myself that there are no significant overheads or bottlenecks (as we've feared in the past) and that there can be clear benefits to be realized. I wanted to try bulk scoring, and chose the distance calculation and sorting as an example because (1) it leverages two fields, and (2) it was fairly isolated and easy to try out.

In practical use cases of spatial search, the spatial filtering doesn't require score calculation and sorting on the entire dataset (just on those documents that are in the vicinity of the user point, filtered down by the geohash or BKD tree node); so in some sense I was trying out an absolute worst case of Lucene spatial search.

Now that I'm convinced that this overall approach works and the overheads are low, I can move on to looking at Lucene internals, maybe starting with scoring in general (BooleanScorer, for example). Other parts of Lucene/Solr that might benefit are streaming expressions (since they seem computation heavy), LTR re-ranking, etc.

Actually incorporating all these benefits into Lucene would require considerable effort, and we can open subsequent JIRAs once we've had a chance to explore them separately. Till then, I'm inclined to keep this issue as a kitchen sink for all-things-GPU, if that makes sense?
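As a rough illustration of the bulk distance scoring and sorting described in the comment above, here is a minimal CPU-side sketch in Java. It is not the benchmark code from this issue: the class and method names (`BulkHaversineSketch`, `bulkDistances`, `sortByDistance`) and the Earth radius constant are assumptions made for the example. The per-document loop is the part that a GPU kernel could in principle run in parallel, since each distance is independent of the others.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical sketch: bulk haversine distance plus sort, on the CPU.
public class BulkHaversineSketch {
    // Mean Earth radius in meters (an assumed constant for this sketch).
    public static final double EARTH_RADIUS_METERS = 6_371_008.8;

    /** Great-circle distance in meters between two points given in degrees. */
    public static double haversineMeters(double lat1, double lon1, double lat2, double lon2) {
        double phi1 = Math.toRadians(lat1);
        double phi2 = Math.toRadians(lat2);
        double sinDPhi = Math.sin((phi2 - phi1) / 2);
        double sinDLambda = Math.sin(Math.toRadians(lon2 - lon1) / 2);
        double h = sinDPhi * sinDPhi
                + Math.cos(phi1) * Math.cos(phi2) * sinDLambda * sinDLambda;
        // Clamp to 1.0 to guard against rounding slightly above the asin domain.
        return 2 * EARTH_RADIUS_METERS * Math.asin(Math.min(1.0, Math.sqrt(h)));
    }

    /** Distance from (lat, lon) to every document; each iteration is independent. */
    public static double[] bulkDistances(double[] lats, double[] lons, double lat, double lon) {
        double[] out = new double[lats.length];
        for (int doc = 0; doc < lats.length; doc++) {
            out[doc] = haversineMeters(lat, lon, lats[doc], lons[doc]);
        }
        return out;
    }

    /** Returns document ids ordered by ascending distance. */
    public static int[] sortByDistance(double[] distances) {
        Integer[] ids = new Integer[distances.length];
        for (int i = 0; i < ids.length; i++) {
            ids[i] = i;
        }
        Arrays.sort(ids, Comparator.comparingDouble((Integer i) -> distances[i]));
        int[] result = new int[ids.length];
        for (int i = 0; i < ids.length; i++) {
            result[i] = ids[i];
        }
        return result;
    }

    public static void main(String[] args) {
        // Two fields per document (latitude and longitude), as in the comment above.
        double[] lats = {0.0, 0.0, 0.0};
        double[] lons = {10.0, 1.0, 5.0};
        double[] distances = bulkDistances(lats, lons, 0.0, 0.0);
        System.out.println(Arrays.toString(sortByDistance(distances))); // prints [1, 2, 0]
    }
}
```

Scoring every document like this corresponds to the worst case mentioned above; with a geohash or BKD tree filter in front, only the surviving documents would be passed to `bulkDistances`.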
> Explore GPU acceleration for spatial search
> -------------------------------------------
>
> Key: LUCENE-7745
> URL: https://issues.apache.org/jira/browse/LUCENE-7745
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/spatial-extras
> Reporter: Ishan Chattopadhyaya
> Assignee: Ishan Chattopadhyaya
> Priority: Major
> Labels: gsoc2017, mentor
> Attachments: gpu-benchmarks.png
>
>
> There are parts of Lucene that can potentially be speeded up if computations
> were to be offloaded from CPU to the GPU(s). With commodity GPUs having as
> high as 12GB of high bandwidth RAM, we might be able to leverage GPUs to
> speed parts of Lucene (indexing, search).
> First that comes to mind is spatial filtering, which is traditionally known
> to be a good candidate for GPU based speedup (esp. when complex polygons are
> involved). In the past, Mike McCandless has mentioned that "both initial
> indexing and merging are CPU/IO intensive, but they are very amenable to
> soaking up the hardware's concurrency."
> I'm opening this issue as an exploratory task, suitable for a GSoC project. I
> volunteer to mentor any GSoC student willing to work on this this summer.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-7745) Explore GPU acceleration for spatial search
[ https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16526510#comment-16526510 ]

David Smiley commented on LUCENE-7745:
--------------------------------------

np. Oh, this caught me by surprise too! I thought this was about BooleanScorer or postings or something, and then lo and behold it's spatial -- and then I thought this is so non-obvious from the issue title. So I thought I'd do a little JIRA gardening.
[jira] [Commented] (LUCENE-7745) Explore GPU acceleration for spatial search
[ https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16526506#comment-16526506 ]

Adrien Grand commented on LUCENE-7745:
--------------------------------------

Not sure why I confused names, I meant Ishan indeed. Sorry for that. I'll let Ishan decide how he wants to manage this issue, I'm personally fine either way, I'm mostly following. :) It just caught me by surprise since I was under the impression that we were still exploring which areas might benefit from GPU acceleration.
[jira] [Commented] (LUCENE-7745) Explore GPU acceleration for spatial search
[ https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16526486#comment-16526486 ]

David Smiley commented on LUCENE-7745:
--------------------------------------

Mark who? You must mean Ishan? I think that if GPUs are used to accelerate different things, then they would get separate issues and not be lumped under one issue. Does that sound reasonable? Granted, the problem as posted started off as a bit of an umbrella ticket, and perhaps the particular proposal Ishan is presenting in his most recent comment ought to go in a new issue specific to spatial. Accelerating Haversine calculations sounds way different to me than BooleanScorer stuff; no?
[jira] [Commented] (LUCENE-7745) Explore GPU acceleration for spatial search
[ https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16526467#comment-16526467 ]

Adrien Grand commented on LUCENE-7745:
--------------------------------------

David, I'm not sure this was meant to be specific to lucene/spatial; Mark only mentioned it as a way to conduct an initial benchmark? The main thing that we identified as being a potential candidate for integration with CUDA is actually BooleanScorer (BS1, the one that does scoring in bulk), based on previous comments?
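To make "scoring in bulk" more concrete, the following is a simplified Java sketch of window-based scoring for a disjunction: each clause adds its scores into a small array of buckets covering one window of document ids, and matches are emitted once the whole window has been filled. This is an illustration of the general idea only, not Lucene's BooleanScorer code; `WindowedBulkScoringSketch`, `Clause`, `Hit`, and the tiny `WINDOW_SIZE` are all hypothetical names and values chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of window-based bulk scoring; not Lucene's implementation.
public class WindowedBulkScoringSketch {
    // Deliberately tiny window so the example crosses several windows.
    public static final int WINDOW_SIZE = 8;

    /** A clause: matching doc ids in ascending order, with one score per match. */
    public static final class Clause {
        final int[] docs;
        final float[] scores;
        int pos = 0; // read position, advanced as windows are processed

        public Clause(int[] docs, float[] scores) {
            this.docs = docs;
            this.scores = scores;
        }
    }

    /** A matching document and its summed score. */
    public static final class Hit {
        public final int doc;
        public final float score;

        public Hit(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    /** Scores the disjunction of all clauses over doc ids in [0, maxDoc). Consumes the clauses. */
    public static List<Hit> score(List<Clause> clauses, int maxDoc) {
        List<Hit> hits = new ArrayList<>();
        float[] bucketScores = new float[WINDOW_SIZE];
        boolean[] matched = new boolean[WINDOW_SIZE];
        for (int base = 0; base < maxDoc; base += WINDOW_SIZE) {
            int end = Math.min(base + WINDOW_SIZE, maxDoc);
            // Phase 1: every clause adds its scores for docs in [base, end).
            for (Clause c : clauses) {
                while (c.pos < c.docs.length && c.docs[c.pos] < end) {
                    int slot = c.docs[c.pos] - base;
                    bucketScores[slot] += c.scores[c.pos];
                    matched[slot] = true;
                    c.pos++;
                }
            }
            // Phase 2: emit the matches of this window in doc id order, then reset.
            for (int slot = 0; slot < end - base; slot++) {
                if (matched[slot]) {
                    hits.add(new Hit(base + slot, bucketScores[slot]));
                    bucketScores[slot] = 0f;
                    matched[slot] = false;
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Clause> clauses = new ArrayList<>();
        clauses.add(new Clause(new int[] {1, 9, 20}, new float[] {1f, 2f, 3f}));
        clauses.add(new Clause(new int[] {9, 10}, new float[] {0.5f, 4f}));
        for (Hit hit : score(clauses, 21)) {
            System.out.println(hit.doc + " -> " + hit.score);
        }
    }
}
```

The accumulation in phase 1 works on flat arrays with no per-document control flow beyond the bounds check, which is one reason this style of scoring is a plausible candidate for offloading, though whether that pays off in practice is exactly what the issue sets out to explore.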