[jira] [Commented] (SOLR-15715) Dedicated query aggregator nodes in the solr cluster.

Ishan Chattopadhyaya (Jira) Wed, 24 Nov 2021 11:42:05 -0800


    [ 
https://issues.apache.org/jira/browse/SOLR-15715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17448793#comment-17448793
 ]


Ishan Chattopadhyaya commented on SOLR-15715:
---------------------------------------------

I just wrapped up an initial round of testing on two setups:
 * regular setup (6 data nodes)
 * POC setup (1 dedicated overseer + 1 coordinator + 6 data nodes).

h2. Setup details

Regular setup:
 * 6 nodes
 * 2GB heap space on every node
 * 6 collections, 6 shards each, 1 replica per shard
 * Documents: 30M per collection (ecommerce events dataset)
 * Queries: 20,000 per collection, all of them faceting queries (filtered by 
time ranges)
 * Query rate: 2 threads per collection, 6 collections at the same time.
 * Query target node: first data node (port 50000)

POC setup:
 * 8 nodes: 1 dedicated overseer, 1 coordinator node, 6 data nodes
 * 2GB heap space on every node
 * 6 collections, 6 shards each, 1 replica per shard
 * Documents: 30M per collection (ecommerce events dataset)
 * Queries: 20,000 per collection, all of them faceting queries (filtered by 
time ranges)
 * Query rate: 2 threads per collection, 6 collections at the same time.
 * Query target node: coordinator node (port 50001)
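
For illustration, the query workload described above could be sketched roughly as follows. This is only a sketch, not the actual test harness: the host name, collection name, and the field names {{event_time}} and {{event_type}} are hypothetical placeholders, while the ports, collection count, thread count, and query count come from the setup lists. The request parameters ({{q}}, {{fq}}, {{facet}}, {{facet.field}}, {{rows}}) are standard Solr select parameters.

```python
from urllib.parse import urlencode

# Values taken from the setup description.
COLLECTIONS = 6
THREADS_PER_COLLECTION = 2
QUERIES_PER_COLLECTION = 20_000
REGULAR_TARGET_PORT = 50000  # regular setup: first data node
POC_TARGET_PORT = 50001      # POC setup: coordinator node


def facet_query_url(port, collection, start, end,
                    host="localhost",          # hypothetical host
                    time_field="event_time",   # hypothetical field name
                    facet_field="event_type"): # hypothetical field name
    """Build a Solr /select URL for a facet query filtered by a time range."""
    params = {
        "q": "*:*",
        "fq": f"{time_field}:[{start} TO {end}]",  # time-range filter
        "facet": "true",
        "facet.field": facet_field,
        "rows": 0,  # only facet counts are needed, not documents
    }
    return f"http://{host}:{port}/solr/{collection}/select?{urlencode(params)}"


def workload_summary():
    """Totals implied by the setup: concurrent client threads and queries."""
    return {
        "concurrent_threads": COLLECTIONS * THREADS_PER_COLLECTION,
        "total_queries": COLLECTIONS * QUERIES_PER_COLLECTION,
    }


if __name__ == "__main__":
    print(facet_query_url(POC_TARGET_PORT, "collection1",
                          "2021-01-01T00:00:00Z", "2021-01-02T00:00:00Z"))
    print(workload_summary())
```

The only difference between the two runs in this sketch is the target port: the same queries go to the first data node in the regular setup and to the coordinator node in the POC setup.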

h2. Performance results

Here are the results:

Regular setup results:

^!0001.jpg!^

POC results:
!0001.jpg!

h2. Conclusion
 * Because query aggregation runs on a separate coordinator node, memory usage on the data nodes is very low.
 * The isolated coordinator node feature for query aggregation is working as designed.
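
To illustrate what query aggregation means for the faceting queries used in this test: the node that receives a distributed query fans it out to the shards and combines the partial results. The following is a minimal conceptual sketch only, not Solr's actual implementation; the function name and the facet values are made up for the example.

```python
from collections import Counter


def merge_facet_counts(shard_responses):
    """Sum per-shard facet counts into one combined result.

    shard_responses is a list of dicts mapping facet value -> count,
    one dict per shard (hypothetical shape, for illustration only).
    """
    merged = Counter()
    for counts in shard_responses:
        merged.update(counts)  # adds counts for matching keys
    return dict(merged)


if __name__ == "__main__":
    # Made-up partial results from two shards.
    shard_a = {"view": 3, "cart": 1}
    shard_b = {"view": 2, "purchase": 4}
    print(merge_facet_counts([shard_a, shard_b]))
```

In this sketch the merged structure lives on whichever node does the aggregation, which is the work the POC moves off the data nodes.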

> Dedicated query aggregator nodes in the solr cluster. 
> ------------------------------------------------------
>
>                 Key: SOLR-15715
>                 URL: https://issues.apache.org/jira/browse/SOLR-15715
>             Project: Solr
>          Issue Type: New Feature
>          Components: SearchComponents - other
>    Affects Versions: 8.10.1
>            Reporter: Hitesh Khamesra
>            Priority: Major
>         Attachments: 0001-1.jpg, 0001.jpg, coordinator-poc.pdf, 
> regular-node.pdf
>
>
> We have a large collection with 1000s of shards in the Solr cluster. We have 
> observed that a distributed Solr query consumes many resources (threads, 
> memory, etc.) on the Solr data node (the node that holds the indexes). Thus we 
> need dedicated query nodes to execute distributed queries on large Solr 
> collections. That would reduce the memory/CPU pressure on the Solr data nodes.
> Elasticsearch has similar functionality 
> [here|https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-node.html#coordinating-node]
>  
> [~noble.paul] [~ichattopadhyaya]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

