[ 
https://issues.apache.org/jira/browse/SOLR-12408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16492154#comment-16492154
 ] 

Mikhail Khludnev commented on SOLR-12408:
-----------------------------------------

bq. The idea was to have Streaming Expressions perform the merge of the 
streaming facets. Does that scenario resolve what you are describing?

Well, yes, you could say so.

However, we'll need to go further. In my experience, it is not feasible to 
merge 30 sorted 100M streams through a heap while keeping those input streams 
synchronised. The approach I've tried so far is to tweak {{update()}} so that it 
sends in-place updates to a temporary collection, which accumulates the 
per-shard responses.
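
To illustrate the kind of heap-based merge referred to above, here is a minimal sketch in Python. It is not Solr code: the function name {{merge_sorted_streams}} and the (facet value, count) tuple layout are assumptions made for the example only. Each pop from the heap has to be followed by a read from the same input stream, so the merge can only advance as fast as the slowest input.

```python
import heapq
from typing import Iterable, Iterator, List, Tuple

Tuple2 = Tuple[str, int]  # hypothetical layout: (facet_value, count)


def merge_sorted_streams(streams: List[Iterable[Tuple2]]) -> Iterator[Tuple2]:
    """Merge per-shard streams sorted by facet value, summing counts of equal values."""
    iterators = [iter(s) for s in streams]

    # Seed the heap with the first tuple of every non-empty stream.
    heap = []
    for idx, it in enumerate(iterators):
        first = next(it, None)
        if first is not None:
            heap.append((first[0], idx, first[1]))
    heapq.heapify(heap)

    current_value, current_count = None, 0
    while heap:
        value, idx, count = heapq.heappop(heap)
        # Refill from the stream we just consumed; this read blocks the whole merge.
        nxt = next(iterators[idx], None)
        if nxt is not None:
            heapq.heappush(heap, (nxt[0], idx, nxt[1]))

        if value == current_value:
            current_count += count
        else:
            if current_value is not None:
                yield current_value, current_count
            current_value, current_count = value, count

    if current_value is not None:
        yield current_value, current_count


if __name__ == "__main__":
    shard_a = [("a", 1), ("c", 2)]
    shard_b = [("a", 3), ("b", 1)]
    shard_c = [("c", 5)]
    print(list(merge_sorted_streams([shard_a, shard_b, shard_c])))
    # prints [('a', 4), ('b', 1), ('c', 7)]
```

The heap itself only holds one entry per input stream, so memory is not the issue in this sketch; the cost is that a single coordinator consumes every tuple of every stream in lockstep.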

> Introduce parallelShards() in Streaming Expressions
> ---------------------------------------------------
>
>                 Key: SOLR-12408
>                 URL: https://issues.apache.org/jira/browse/SOLR-12408
>             Project: Solr
>          Issue Type: New Feature
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: streaming expressions
>            Reporter: Mikhail Khludnev
>            Priority: Major
>
> {{parallel()}} uses hash filter partitioning, which doesn't work in some edge 
> cases with high-cardinality facets, since they kill the coordinator in the 
> merge phase. 
> I propose to introduce {{parallelShards()}}, which accepts a collection 
> and spawns per-shard substreams (I'm not sure whether to use {{distrib=false}} 
> or {{shards=foo}}). So far, it's not clear whether {{workerCollection}} is 
> useful for it at all.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
