[
https://issues.apache.org/jira/browse/SOLR-12408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mikhail Khludnev updated SOLR-12408:
------------------------------------
Description:
{{parallel()}} uses hash filter partitioning, which doesn't work in some edge
cases with high-cardinality facets, since they kill the coordinator in the
merge phase.
I propose introducing {{parallelShards()}}, which will accept a collection
and spawn per-shard substreams (I'm not sure whether to use {{distrib=false}}
or {{shards=foo}}). So far it's not clear whether {{workerCollection}} is
useful for it at all.
was:
{{parallel()}} uses hash filter partitioning, which doesn't work in some edge
cases with high-cardinality facets, since they kill the coordinator.
I propose introducing {{parallelShards()}}, which will accept a collection
and spawn per-shard substreams (I'm not sure whether to use {{distrib=false}}
or {{shards=foo}}). So far it's not clear whether {{workerCollection}} is
beneficial for it at all.
> Introduce parallelShards() in Streaming Expressions
> ---------------------------------------------------
>
> Key: SOLR-12408
> URL: https://issues.apache.org/jira/browse/SOLR-12408
> Project: Solr
> Issue Type: New Feature
> Security Level: Public(Default Security Level. Issues are Public)
> Components: streaming expressions
> Reporter: Mikhail Khludnev
> Priority: Major
>
> {{parallel()}} uses hash filter partitioning, which doesn't work in some edge
> cases with high-cardinality facets, since they kill the coordinator in the
> merge phase.
> I propose introducing {{parallelShards()}}, which will accept a collection
> and spawn per-shard substreams (I'm not sure whether to use {{distrib=false}}
> or {{shards=foo}}). So far it's not clear whether {{workerCollection}} is
> useful for it at all.
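
To make the contrast in the description concrete, here is a toy model of the two partitioning schemes. It is only a sketch under stated assumptions: the function names {{hash_partition}} and {{shard_partition}}, the sample data, and the use of CRC32 as the hash are all hypothetical, and nothing here reflects how Solr implements {{parallel()}} or how {{parallelShards()}} would actually be built.

```python
import zlib


def hash_partition(shards, workers):
    """parallel()-style toy model: every worker scans every shard and keeps
    only the keys whose hash maps to that worker (assumption: CRC32 stands in
    for whatever hash the real filter uses)."""
    return [
        [
            key
            for docs in shards.values()
            for key in docs
            if zlib.crc32(key.encode("utf-8")) % workers == worker
        ]
        for worker in range(workers)
    ]


def shard_partition(shards):
    """Proposed parallelShards()-style toy model: one substream per shard,
    each reading only its own shard, so no hash filter is needed."""
    return {name: list(docs) for name, docs in shards.items()}


# Hypothetical two-shard collection; the strings stand in for partition keys.
shards = {
    "shard1": ["a", "b", "c"],
    "shard2": ["d", "e"],
}

by_hash = hash_partition(shards, workers=2)
by_shard = shard_partition(shards)
```

In this model the hash scheme issues workers x shards scans, while the per-shard scheme issues one scan per shard and ties the number of substreams to the number of shards instead of a {{workers}} setting.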