Re: Setting solr.data.dir for SolrCloud instance
The data _is_ separated from the code. It's all relative to solr_home, which need not have any relation to where the code is executing from. For instance, I can start Solr like

    java -Dsolr.solr.home=/Users/Erick/testdir/solr -jar start.jar

and have my war in a completely different place.

Best,
Erick

On Tue, Nov 26, 2013 at 1:08 AM, adfel70 adfe...@gmail.com wrote:

> Thanks for the reply, Erick.
> Actually, I didn't think this through. I just thought it would be a good idea to separate the data from the application code.
> I guess I'll leave it without setting the datadir parameter and add a symlink.
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/Setting-solr-data-dir-for-SolrCloud-instance-tp4103052p4103228.html
> Sent from the Solr - User mailing list archive at Nabble.com.
Re: Setting solr.data.dir for SolrCloud instance
The problem we had was that we tried to run:

    java -Dsolr.data.dir=/opt/solr/data -Dsolr.solr.home=/opt/solr/home -jar start.jar

and Solr handled these 2 params differently. We created 2 collections, which created 2 cores. We then got 2 home dirs for the cores, as expected:

    /opt/solr/home/collection1_shard1_replica1
    /opt/solr/home/collection2_shard1_replica1

but instead of creating 2 data dirs like:

    /opt/solr/data/collection1_shard1_replica1
    /opt/solr/data/collection2_shard1_replica1

Solr had both cores' data dirs pointing to the same directory: /opt/solr/data

When we tried putting a relative path in -Dsolr.data.dir, it worked as expected.

I don't know if this is a bug, but we thought of 2 solutions in our case:

1. Point -Dsolr.data.dir to a relative path and symlink that path to the absolute path we wanted in the first place.
2. Don't provide -Dsolr.data.dir at all; Solr then puts the data dir inside the home dir, which, as said, works with relative paths.

We chose the first option for now.

Erick Erickson wrote:

> The data _is_ separated from the code. It's all relative to solr_home, which need not have any relation to where the code is executing from. For instance, I can start Solr like
>
>     java -Dsolr.solr.home=/Users/Erick/testdir/solr -jar start.jar
>
> and have my war in a completely different place.
>
> Best,
> Erick
>
> On Tue, Nov 26, 2013 at 1:08 AM, adfel70 <adfel70@...> wrote:
>
>> Thanks for the reply, Erick.
>> Actually, I didn't think this through. I just thought it would be a good idea to separate the data from the application code.
>> I guess I'll leave it without setting the datadir parameter and add a symlink.
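The first workaround in this message (a relative data dir that is symlinked to an absolute location) could be scripted roughly as follows. This is only a sketch under assumptions: `link_core_data_dirs` is a hypothetical helper, not part of Solr, and the layout of one subdirectory per core under the data root is an assumed convention taken from the paths the poster said they expected. It would presumably need to run while Solr is stopped and before a core has written a real `data` directory.

```python
import os
import tempfile
from pathlib import Path


def link_core_data_dirs(solr_home: Path, data_root: Path, rel_data_dir: str = "data") -> dict:
    """Symlink <solr_home>/<core>/<rel_data_dir> to <data_root>/<core> for each core dir.

    Hypothetical helper, not part of Solr. The per-core layout under
    data_root is an assumption made for illustration.
    """
    links = {}
    for core_dir in sorted(p for p in solr_home.iterdir() if p.is_dir()):
        target = data_root / core_dir.name
        target.mkdir(parents=True, exist_ok=True)
        link = core_dir / rel_data_dir
        # Leave existing directories and existing links untouched.
        if not link.exists() and not link.is_symlink():
            os.symlink(target, link, target_is_directory=True)
        links[core_dir.name] = link
    return links


if __name__ == "__main__":
    # Demo in a temporary directory standing in for /opt/solr.
    with tempfile.TemporaryDirectory() as tmp:
        home = Path(tmp) / "home"
        data = Path(tmp) / "data"
        for core in ("collection1_shard1_replica1", "collection2_shard1_replica1"):
            (home / core).mkdir(parents=True)
        for core, link in link_core_data_dirs(home, data).items():
            print(core, "->", os.readlink(link))
```

With this in place, Solr would be started with a relative value such as `-Dsolr.data.dir=data`, so each core follows its own link to a separate directory under the data root.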
Re: Setting solr.data.dir for SolrCloud instance
On Nov 25, 2013, at 8:12 AM, adfel70 adfe...@gmail.com wrote:

> I was expecting that the path I sent would serve as the BASE path for all cores that the node hosts

When running Solr on HDFS, there is a similar property you can use: -Dsolr.hdfs.home. If you set that, all data dirs are created nicely under it. We talked about wanting a similar option for SolrCloud and the local filesystem a while back. If there is no JIRA issue for it, please file one!

- Mark
Re: Setting solr.data.dir for SolrCloud instance
On 11/26/2013 9:19 AM, adfel70 wrote:

> The problem we had was that we tried to run:
>
>     java -Dsolr.data.dir=/opt/solr/data -Dsolr.solr.home=/opt/solr/home -jar start.jar
>
> and Solr handled these 2 params differently. We created 2 collections, which created 2 cores. We then got 2 home dirs for the cores, as expected:
>
>     /opt/solr/home/collection1_shard1_replica1
>     /opt/solr/home/collection2_shard1_replica1
>
> but instead of creating 2 data dirs like:
>
>     /opt/solr/data/collection1_shard1_replica1
>     /opt/solr/data/collection2_shard1_replica1
>
> Solr had both cores' data dirs pointing to the same directory: /opt/solr/data
>
> When we tried putting a relative path in -Dsolr.data.dir, it worked as expected.
>
> I don't know if this is a bug, but we thought of 2 solutions in our case:
>
> 1. Point -Dsolr.data.dir to a relative path and symlink that path to the absolute path we wanted in the first place.
> 2. Don't provide -Dsolr.data.dir at all; Solr then puts the data dir inside the home dir, which, as said, works with relative paths.
>
> We chose the first option for now.

The dataDir is a per-core setting; you cannot set it for the entire application. If you make it relative, then it will be relative to each individual instanceDir. It defaults to ./data, so you get $instanceDir/data as the location.

Thanks,
Shawn
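The resolution rule described in this reply can be modelled in a few lines. This is a sketch of the behaviour as stated in the thread, not Solr's actual code: `effective_data_dir` is a hypothetical function name, and the core paths are the ones quoted above.

```python
from pathlib import PurePosixPath


def effective_data_dir(instance_dir: str, data_dir: str = "data") -> str:
    """Model of the per-core dataDir rule described in this thread (not Solr code)."""
    # Joining an absolute path onto another path discards the left side,
    # so an absolute data_dir comes back unchanged for every core.
    # A relative data_dir is appended to the core's own instance dir.
    return str(PurePosixPath(instance_dir) / data_dir)


cores = [
    "/opt/solr/home/collection1_shard1_replica1",
    "/opt/solr/home/collection2_shard1_replica1",
]

for core in cores:
    # Absolute value: both cores end up at the same directory.
    print("absolute:", effective_data_dir(core, "/opt/solr/data"))
    # Relative value: each core gets its own directory.
    print("relative:", effective_data_dir(core, "data"))
```

The absolute case prints `/opt/solr/data` for both cores, which matches the collision reported earlier in the thread, while the relative case yields one `data` directory per instanceDir.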
Re: Setting solr.data.dir for SolrCloud instance
The first thing I'd do is not send an absolute path. What happens if you just send -Dsolr.data.dir=data (no '/')?

We had this discussion a while ago when we were working on auto-discovery, and it turns out that there _are_ legitimate cases in which more than one core/collection can point to the same data dir. You have to very carefully control who writes to the core, and I wouldn't do it unless there was no choice, but some people find it useful.

And, in general, I wouldn't mix and match the _core_ admin API with the _collections_ API unless you're very confident in what you are doing.

Why isn't the default data.dir location working for you? There are good reasons to make it explicit; I'm mostly just checking that you're not over-thinking the problem. Usually the data dirs will be located in a reasonable place.

Best,
Erick

On Mon, Nov 25, 2013 at 8:12 AM, adfel70 adfe...@gmail.com wrote:

> I found something strange while trying to create more than one collection in SolrCloud:
> I am running every instance with -Dsolr.data.dir=/data
> If I look at the Core Admin section, I can see that I have one core and its dataDir is set to this fixed location. The problem is, if I create a new collection, another core is created, but with this fixed index location again.
> I was expecting that the path I sent would serve as the BASE path for all cores that the node hosts. The current behaviour seems like a bug to me, because obviously one collection will see data that was not indexed to it.
> Is there a way to overcome this? I mean, change the default data dir location, but still be able to create more than one collection correctly?
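The symptom in the original report, two cores showing the same dataDir, can be checked mechanically once the per-core values are collected. The sketch below assumes the values have already been gathered by hand (for example from the Core Admin page); `find_shared_data_dirs` is a hypothetical helper and the sample mapping is illustrative, not output captured from a real node.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def find_shared_data_dirs(core_data_dirs: dict) -> dict:
    """Return {data_dir: [core names]} for every data dir used by more than one core."""
    by_dir = defaultdict(list)
    for core, data_dir in core_data_dirs.items():
        # PurePosixPath drops trailing slashes and "." segments, so
        # "/data" and "/data/" are treated as the same directory.
        by_dir[str(PurePosixPath(data_dir))].append(core)
    return {d: sorted(cores) for d, cores in by_dir.items() if len(cores) > 1}


# Illustrative values only, as they might be read off the Core Admin page.
reported = {
    "collection1_shard1_replica1": "/data",
    "collection2_shard1_replica1": "/data/",
}
print(find_shared_data_dirs(reported))
```

An empty result means every core has its own data dir. A non-empty one is not automatically an error, since sharing can be deliberate as noted above, but it is worth confirming that it is intended.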
Re: Setting solr.data.dir for SolrCloud instance
Thanks for the reply, Erick.
Actually, I didn't think this through. I just thought it would be a good idea to separate the data from the application code.
I guess I'll leave it without setting the datadir parameter and add a symlink.