[ 
https://issues.apache.org/jira/browse/SOLR-7256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14365047#comment-14365047
 ] 

Hari Sekhon commented on SOLR-7256:
-----------------------------------

In solrconfig.xml I would like to be able to provide multiple comma-separated 
dataDir paths, as you would in, say, Hadoop, and have Solr use the space on all 
of those disks equally (assuming that every directory specified is on a 
separate disk; this is how Hadoop does it).
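
To make the request concrete, here is a rough Python sketch of the behaviour being asked for. It is not Solr code and not an existing Solr feature: the function names `parse_data_dirs` and `pick_data_dir` are hypothetical, and choosing the directory with the most free space is just one possible policy (round-robin would be another).

```python
import shutil
from pathlib import Path
from typing import Callable, List


def parse_data_dirs(value: str) -> List[Path]:
    """Split a comma-separated dataDir value into paths, skipping blank entries."""
    return [Path(part.strip()) for part in value.split(",") if part.strip()]


def pick_data_dir(
    dirs: List[Path],
    free_bytes: Callable[[Path], int] = lambda d: shutil.disk_usage(d).free,
) -> Path:
    """Return the directory whose disk currently reports the most free space.

    free_bytes is injectable so the policy can be exercised without real disks.
    """
    if not dirs:
        raise ValueError("no data directories configured")
    return max(dirs, key=free_bytes)
```

With a value such as "/data1/solr,/data2/solr" (example paths only), each new index file would be placed in whichever directory `pick_data_dir` returns at that moment, so the disks fill at roughly the same rate.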

This way we would only deploy / manage 1 replica instance per node using the 
normal tooling and it would simply follow the pre-configured solrconfig.xml to 
utilize all the different disks and space.

The one problem I can see with this is that in Hadoop the configs are stored in 
local directories, e.g. /etc/hadoop/conf, but in SolrCloud they are stored in 
ZooKeeper, effectively forcing the same configuration down on all nodes, which 
may or may not have the same disks available (and quite likely a disk will 
eventually fail, requiring the config to exclude it).

The workaround would be to use a variable such as ${solr.data.dir:} in the 
shared config and have some kind of local file, e.g. /etc/solr/solr-env.sh, 
that sets the variable, so it can be configured per node if needed.
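
The variable workaround relies on `${name:default}` substitution: the shared config carries only the placeholder, and each node supplies its own value. The Python sketch below imitates that substitution to show the idea; it is an illustration, not Solr's implementation, and the function name `substitute` and the example paths are assumptions.

```python
import re

# Matches ${name} or ${name:default}; the default may be empty, as in ${solr.data.dir:}.
_PROPERTY = re.compile(r"\$\{([^:}]+)(?::([^}]*))?\}")


def substitute(text: str, props: dict) -> str:
    """Replace ${name:default} placeholders using per-node property values."""

    def replace(match: "re.Match") -> str:
        name, default = match.group(1), match.group(2)
        if name in props:
            return props[name]      # value supplied locally on this node
        if default is not None:
            return default          # fall back to the default after the colon
        raise KeyError(name)        # no value and no default given

    return _PROPERTY.sub(replace, text)
```

A node whose local environment file sets solr.data.dir to "/data1/solr,/data3/solr" would resolve `<dataDir>${solr.data.dir:}</dataDir>` to those two paths, while a node that sets nothing would fall back to the empty default.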

> Multiple data dirs
> ------------------
>
>                 Key: SOLR-7256
>                 URL: https://issues.apache.org/jira/browse/SOLR-7256
>             Project: Solr
>          Issue Type: New Feature
>    Affects Versions: 4.10.3
>         Environment: HDP 2.2 / HDP Search
>            Reporter: Hari Sekhon
>
> Request to support multiple dataDirs as indexing a large collection fills up 
> only one of many disks in modern servers (think colocating on Hadoop servers 
> with many disks).
> While HDFS is another alternative, it results in poor performance and index 
> corruption under high online indexing loads (SOLR-7255).
> While it should be possible to run multiple cores with different dataDirs, 
> that could be very difficult to manage and would not scale well for the 
> people operating it, so I think Solr should support multiple dataDirs natively.
> Regards,
> Hari Sekhon
> http://www.linkedin.com/in/harisekhon



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
