[jira] [Commented] (SOLR-16524) Index time hash partitioning

David Smiley (Jira) Sun, 06 Nov 2022 19:37:08 -0800


    [ 
https://issues.apache.org/jira/browse/SOLR-16524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17629579#comment-17629579
 ]


David Smiley commented on SOLR-16524:
-------------------------------------

Why a separate file?  Couldn't this be done simply with an URP to a designated 
indexed field?  We could also employ index sorting on this field.

> Index time hash partitioning
> ----------------------------
>
>                 Key: SOLR-16524
>                 URL: https://issues.apache.org/jira/browse/SOLR-16524
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Joel Bernstein
>            Priority: Major
>
> Both Streaming Expressions and Spark-Solr currently rely on query time hash 
> partitioning using the HashQParserPlugin. The query time hash partitioning, 
> although extremely flexible, is very slow when it builds its initial filters. 
> This ticket will add an indexing time hash partitioner that Streaming 
> Expressions and Spark-solr will both be able to use.
> When this ticket is complete I'll also update the ParallelStream and 
> Spark-Solr to be able to use the index time partitioning rather than the 
> HashQParserPlugin.
> This is a stepping stone towards much more performant parallel distributed 
> joins.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[jira] [Commented] (SOLR-16524) Index time hash partitioning

Reply via email to