Hi,

Thanks for asking that question. The separation of compute and storage is relevant for the nodes that have the "data" role, i.e. nodes that host indexes.
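
To make that concrete, here is a small illustrative sketch (not Solr code) that picks out the nodes whose index storage would be affected. It assumes role strings in the `name:mode` form described in the node roles guide, passed via `solr.node.roles`; the node names and the helper functions are made up for the example, and the way Solr fills in roles that are omitted from the string is deliberately not modeled.

```python
# Illustrative only: this is not Solr code. Node names and helpers are hypothetical.

def parse_node_roles(value):
    """Parse a role string such as 'data:off,coordinator:on' into a dict.

    Simplification: roles missing from the string are simply absent from
    the result; Solr's own handling of omitted roles is not modeled here.
    """
    roles = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, mode = item.partition(":")
        roles[name.strip()] = mode.strip()
    return roles


def hosts_indexes(roles):
    """Nodes with data:on host indexes, so index storage concerns only them."""
    return roles.get("data") == "on"


# Hypothetical three-node cluster with explicit role strings.
cluster = {
    "node1": "data:on,overseer:allowed",
    "node2": "data:on,overseer:preferred",
    "node3": "data:off,coordinator:on",
}

data_nodes = sorted(
    name for name, value in cluster.items()
    if hosts_indexes(parse_node_roles(value))
)
print(data_nodes)
```

In this made-up cluster only node1 and node2 host indexes; node3 only coordinates queries and keeps no index on its disk either way.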

SIP-20 offers a way for these indexes to live on shared storage (S3, GCS, etc.) instead of being persisted long term on each individual node. That makes the nodes themselves stateless: they can lose all disk content when they restart and everything will still work. Given that the coordinator and overseer roles do not require local state (persistent storage on the node's local disk), SIP-20 makes all the nodes stateless, the same way it does when no node roles are used (state is then only maintained in ZooKeeper and the shared storage backend).

If a specific assignment of node roles works for a given cluster or use case, adopting SIP-20 in that cluster would change how indexes are stored and how each update is handled: without SIP-20 an update is distributed to multiple replicas, while with SIP-20 it is processed by a single replica and the shared storage. The roles themselves would likely stay unchanged: some nodes will be preferred for hosting the Overseer or for coordinating queries, and the same subset of nodes will be handling indexes (although in a different way).

Hope that helps,
Ilan

On Tue, Jan 16, 2024 at 8:57 AM rajani m <rajinima...@gmail.com> wrote:
>
> Hi All,
>
> Saw a post on the dev-mailing list about SIP-20 Separation of Compute
> and Storage
> <https://cwiki.apache.org/confluence/display/SOLR/SIP-20%3A+Separation+of+Compute+and+Storage+in+SolrCloud>.
> Trying to understand what extra features it adds when compared to
> configuring a solrcloud cluster by leveraging node roles
> <https://solr.apache.org/guide/solr/latest/deployment-guide/node-roles.html>
> ?
>
> Thanks,
> Rajani