[jira] [Commented] (SOLR-11508) Make coreRootDirectory configurable via an environment variable (SOLR_CORE_HOME)

2017-12-06 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16280584#comment-16280584
 ] 

Shawn Heisey commented on SOLR-11508:
-

bq. What benefit exactly would a "rename" of solr.data.home to solr.index.home 
give? 

The idea would be to clear up confusion.  Based on how this issue started and 
progressed, it seems that there's some confusion about what "data" means.  The 
initial expectation seems to have been that it would cover ALL of Solr's data, 
including the conf directory, but in fact it only deals with the *index* data, 
so solr.index.home seems like a better name for the property.

That confusion is also the reason that I mentioned the possibility of replacing 
solr.solr.home with solr.data.home.  Although the idea passes a sniff test, it 
might cause confusion of a different kind for veterans, so it wouldn't be my 
first preference.

Currently we have three things that can be configured, in chronological order:  
solr.solr.home, coreRootDirectory, and solr.data.home.  All of these have uses, 
but I think the end result is particularly confusing for novices.

The reason I think we should kill coreRootDirectory: When I take a step back 
and think about everything, I find little value in separating what's in the 
solr home (solr.xml and configsets) from the rest of the configuration data.

I do find value in separating the config from the index data.  That makes it a 
lot easier to keep configurations in source control, and if you find yourself 
in a place where you want to delete all index data but leave all the cores 
intact, it's REALLY easy.

If I think about what the best option would be if we could start over, I come 
up with the notion of having two configurations -- one for everything that's 
not read-only (the solr home), and one for index data (currently 
solr.data.home).

Accommodating an empty data volume for the solr home location is the last 
wrinkle, and is solved by not *requiring* solr.xml.  SolrCloud can already 
handle an empty solr home; standalone should too.


> Make coreRootDirectory configurable via an environment variable 
> (SOLR_CORE_HOME)
> 
>
> Key: SOLR-11508
> URL: https://issues.apache.org/jira/browse/SOLR-11508
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Marc Morissette
>
> (Heavily edited)
> Since Solr 7, it is possible to store Solr cores in separate disk locations 
> using solr.data.home (see SOLR-6671). This is very useful when running Solr 
> in Docker where data must be stored in a directory which is independent from 
> the rest of the container.
> While this works well in standalone mode, it doesn't in Cloud mode as the 
> core.properties automatically created by Solr are still stored in 
> coreRootDirectory and cores created that way disappear when the Solr Docker 
> container is redeployed.
> The solution is to configure coreRootDirectory to an empty directory that can 
> be mounted outside the Docker container.
> The incoming patch makes this easier to do by allowing coreRootDirectory to 
> be configured via a solr.core.home system property and SOLR_CORE_HOME 
> environment variable.
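
The lookup order described in the issue summary could be sketched roughly as 
follows. This is only an illustration, not the code from the patch: the class 
name CoreRootResolver is hypothetical, and the precedence (system property 
first, then environment variable, then the solr home as the default) is an 
assumption.

```java
import java.nio.file.Path;
import java.util.Map;

/** Hypothetical helper; not part of Solr. */
final class CoreRootResolver {
    private CoreRootResolver() {}

    /**
     * Picks the core root directory.
     * Assumed precedence: solr.core.home system property, then the
     * SOLR_CORE_HOME environment variable, then the solr home itself.
     * Relative values are resolved against the solr home.
     */
    static Path resolve(Map<String, String> sysProps, Map<String, String> env, Path solrHome) {
        String value = sysProps.get("solr.core.home");
        if (value == null || value.isEmpty()) {
            value = env.get("SOLR_CORE_HOME");
        }
        if (value == null || value.isEmpty()) {
            return solrHome; // default: cores live under the solr home
        }
        // Path.resolve returns the argument unchanged when it is absolute.
        return solrHome.resolve(value).normalize();
    }
}
```

Passing the two maps in explicitly (for example System.getenv() for the 
environment) keeps the sketch easy to test without touching the real process 
environment.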



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11508) Make coreRootDirectory configurable via an environment variable (SOLR_CORE_HOME)

2017-12-06 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-11508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16280377#comment-16280377
 ] 

Jan Høydahl commented on SOLR-11508:


I still don't get it. What benefit exactly would a "rename" of 
{{solr.data.home}} to {{solr.index.home}} give? What would be the difference 
between current 7.2 {{solr.data.home}} and your new {{solr.index.home}}?

Sounds like a lot of confusion could be handled with some documentation patches 
(describe SOLR_DATA_HOME mainly in context of standalone mode), and code 
changes to *not* require {{solr.xml}} or {{zoo.cfg}} files in {{SOLR_HOME}} at 
all.

Then Cloud users would separate code/config/data by installing binaries in a 
R/O {{/opt/solr}}, keep all config in Zookeeper, even {{solr.xml}}, and then 
point SOLR_HOME to some writeable location of choice (just as they have always 
done in all Solr versions).

Standalone users can choose to separate code from config+data by having R/O 
binaries in {{/opt/solr}}, and choose a writeable SOLR_HOME 
({{/var/solr/home}}) for core config and data, just as they have always done in 
all Solr versions. If they also want to separate data from config, they must 
additionally configure SOLR_DATA_HOME (e.g. {{/mnt/largeDisk/solr-data}}), 
because standalone users store their config locally on disk.

Both of these scenarios will work on Docker if we do not require any 
pre-existing files in SOLR_HOME?




[jira] [Commented] (SOLR-11508) Make coreRootDirectory configurable via an environment variable (SOLR_CORE_HOME)

2017-12-04 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16278034#comment-16278034
 ] 

Shawn Heisey commented on SOLR-11508:
-

If Solr no longer requires solr.xml to start, then the solr home can be 
completely empty on startup, fulfilling the requirements mentioned for the data 
volume on Docker.

If you're in standalone mode and you create a core from the command line, the 
script will copy a config to ${instanceDir}/conf (where the 
instanceDir is inside the solr home), and when Solr is informed about the new 
core with the CoreAdmin API, part of the core startup will create 
${instanceDir}/data, and under that directory, Lucene will create the index 
directory.
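
The directory layout described above can be written down as a small sketch. 
The class name StandaloneCoreLayout is hypothetical; it assumes the instanceDir 
is named after the core and that the index directory uses the default name 
"index" under the data directory.

```java
import java.nio.file.Path;

/** Hypothetical illustration of the default standalone core layout. */
final class StandaloneCoreLayout {
    final Path instanceDir; // ${solrHome}/${coreName} (assumed naming)
    final Path confDir;     // config copied here when the core is created
    final Path dataDir;     // created during core startup
    final Path indexDir;    // created by Lucene under the data directory

    StandaloneCoreLayout(Path solrHome, String coreName) {
        this.instanceDir = solrHome.resolve(coreName);
        this.confDir = instanceDir.resolve("conf");
        this.dataDir = instanceDir.resolve("data");
        this.indexDir = dataDir.resolve("index");
    }
}
```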

bq. but it would condemn cloud mode users to choose between sticking to the 
default settings or mixing their configuration and data.

In cloud mode, Solr doesn't mix config and data.  The config is not on disk at 
all.  It's in zookeeper.  Even solr.xml can be in zookeeper when running in 
cloud mode.  Which means that cloud mode can ALREADY work with no solr.xml file 
in the solr home, just like I am describing.

bq. It's either that or we would need to externalize every configuration 
parameter available in solr.xml

I agree that it would be important to make sure that a certain critical subset 
of solr.xml configuration parameters is configurable with system 
properties, which should definitely include the various home directories, but I 
don't think that *everything* needs to be configurable that way.  Even though 
solr.xml would not be REQUIRED to start Solr, you'd still be able to create 
that file and have Solr honor its settings.

So to reiterate my modified proposal, for the master branch only:

{quote}
Eliminate coreRootDirectory entirely.

Make sure there are only two "home" properties.  One of them should be 
solr.index.home, and the other will either be solr.solr.home or solr.data.home. 
 The latter makes a lot of sense, but the former has historical value and will 
support older configs better.  If both solr.solr.home and solr.data.home are 
set in an upgraded configuration, Solr should log an error and refuse to start.

Solr should be able to start up without a solr.xml file, but if one is found 
(either in zookeeper or the solr home) then it will be honored.
{quote}
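
The "refuse to start if both are set" rule in the proposal might look roughly 
like this. It is a sketch under stated assumptions: HomePropertyCheck and its 
method are hypothetical names, and it treats solr.data.home as a possible new 
name for the config home, as the proposal suggests.

```java
import java.util.Map;

/** Hypothetical startup check; not existing Solr code. */
final class HomePropertyCheck {
    private HomePropertyCheck() {}

    /**
     * Returns the configured home directory, or null when neither property
     * is set (the caller would then fall back to a default).
     * Throws if both the historical and the renamed property are present.
     */
    static String resolveHome(Map<String, String> props) {
        String legacy = props.get("solr.solr.home");
        String renamed = props.get("solr.data.home");
        if (legacy != null && renamed != null) {
            throw new IllegalStateException(
                "Both solr.solr.home and solr.data.home are set; refusing to start");
        }
        return legacy != null ? legacy : renamed;
    }
}
```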

We could backport these ideas to 7.x, but the "new" way of configuring things 
would have to be explicitly enabled, probably with a system property, and the 
current way of configuring things would need to remain supported for all of 7.x.





[jira] [Commented] (SOLR-11508) Make coreRootDirectory configurable via an environment variable (SOLR_CORE_HOME)

2017-12-04 Thread Marc Morissette (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16277849#comment-16277849
 ] 

Marc Morissette commented on SOLR-11508:


[~elyograg] This is an interesting idea but I'm not sure how this solves the 
problem. It would be nice if Solr could start without solr.xml but it would 
condemn cloud mode users to choose between sticking to the default settings or 
mixing their configuration and data. 

It's either that or we would need to externalize every configuration parameter 
available in solr.xml (and there are a lot).




[jira] [Commented] (SOLR-11508) Make coreRootDirectory configurable via an environment variable (SOLR_CORE_HOME)

2017-12-04 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16277818#comment-16277818
 ] 

Shawn Heisey commented on SOLR-11508:
-

I'm starting to get an idea of the problem that you want to solve.

What do you think of altering my proposal in one small way:  Making sure that 
Solr starts even if solr.xml cannot be found.  This would allow you to point 
solr.solr.home to a location that's completely empty and still have Solr start. 
 You would then have the option of adding a solr.xml if you desired some 
changes there, and even adding configsets if you wanted to run Solr in 
standalone mode but still have common configs like SolrCloud.

I did try starting Solr 7.0.0 (already had it downloaded) without a solr.xml, 
and it refuses to start.  I think this is not how it should behave.  Having 
Solr log a warning (and possibly even output a message to the console) 
mentioning the missing solr.xml would be a good idea.  I created a minimal 
solr.xml (containing just an empty <solr> element on one line) and Solr did 
start, so it's not like it must have any config there.
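
The "warn instead of refusing to start" behaviour suggested above could be 
sketched like this. OptionalSolrXml is a hypothetical class, not Solr's actual 
startup code, and the use of java.util.logging is only to keep the example 
self-contained.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;
import java.util.logging.Logger;

/** Hypothetical loader that treats solr.xml as optional. */
final class OptionalSolrXml {
    private static final Logger LOG = Logger.getLogger(OptionalSolrXml.class.getName());

    private OptionalSolrXml() {}

    /**
     * Returns the contents of solr.xml if it exists in the solr home.
     * When the file is missing, logs a warning and returns empty so that
     * the caller can continue with built-in defaults.
     */
    static Optional<String> load(Path solrHome) {
        Path solrXml = solrHome.resolve("solr.xml");
        if (!Files.isRegularFile(solrXml)) {
            LOG.warning("No solr.xml found in " + solrHome + "; using default settings");
            return Optional.empty();
        }
        try {
            return Optional.of(new String(Files.readAllBytes(solrXml), StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```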

I have noticed that the stock solr.xml included with 7.x has a lot of config in 
it that uses various system properties, with defaults.  I have no idea whether 
the default settings for these things are reasonable or not.  We would need to 
make sure that the defaults are reasonable.





[jira] [Commented] (SOLR-11508) Make coreRootDirectory configurable via an environment variable (SOLR_CORE_HOME)

2017-12-04 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16277497#comment-16277497
 ] 

David Smiley commented on SOLR-11508:
-

Perhaps solr.data.dir should be deprecated in SolrCloud mode (displaying a 
warning at startup)?  In 8.0 we could remove the "-t" convenience parameter to 
bin/solr, leaving it as more of an internal setting.




[jira] [Commented] (SOLR-11508) Make coreRootDirectory configurable via an environment variable (SOLR_CORE_HOME)

2017-12-04 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16277474#comment-16277474
 ] 

David Smiley commented on SOLR-11508:
-

coreRootDirectory is not a new concept and is already configurable.  You're 
merely making it _easier_ (and more consistent) to configure.

I think making a setting like this dependent on whether you're in SolrCloud mode 
or not makes this more confusing, but I understand your motivation.  I think 
more documentation can add guidance on the use of these.  SOLR_DATA_DIR can be 
a gotcha for a Docker user and that advice can be in the Solr ref guide.  
Future Docker images can set things up correctly OOTB using a "volume".




[jira] [Commented] (SOLR-11508) Make coreRootDirectory configurable via an environment variable (SOLR_CORE_HOME)

2017-12-04 Thread Marc Morissette (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16277362#comment-16277362
 ] 

Marc Morissette commented on SOLR-11508:


I've started work on a patch that adds the ability to set coreRootDirectory via 
an environment variable and command line option: 
https://github.com/morissm/lucene-solr/commit/95cbd1410fb4bdf97fd9ffec8737117a7931054d

I'm starting to have second thoughts though. Solr already has a steep learning 
curve and I'm loath to add yet another option if there is a way to avoid it.

What if core.properties files were stored in SOLR_DATA_HOME only when Solr is 
in cloud mode? Unless I'm mistaken, all configuration is stored in Zookeeper in 
cloud mode so that is the only file that matters. As I've argued earlier, 
core.properties files in cloud mode are mostly an implementation detail and 
belong with the data. 
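
The alternative floated here, placing core.properties under SOLR_DATA_HOME only 
in cloud mode, could be sketched as below. CorePropertiesLocation is a 
hypothetical name, and the per-core subdirectory layout is an assumption made 
for illustration.

```java
import java.nio.file.Path;

/** Hypothetical sketch of a mode-dependent core.properties location. */
final class CorePropertiesLocation {
    private CorePropertiesLocation() {}

    /**
     * In cloud mode with a data home configured, core.properties lives with
     * the data; otherwise it stays under coreRootDirectory as it does today.
     * dataHome may be null when SOLR_DATA_HOME is not set.
     */
    static Path resolve(boolean cloudMode, Path coreRootDirectory, Path dataHome, String coreName) {
        Path base = (cloudMode && dataHome != null) ? dataHome : coreRootDirectory;
        return base.resolve(coreName).resolve("core.properties");
    }
}
```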

The only issue would be how to handle the transition for people who have set 
SOLR_DATA_HOME in cloud mode pre 7.2. I've thought of many automated ways to 
handle the transition but this might not be easy to accomplish without 
introducing some potential unintended behaviours.

Comments?




[jira] [Commented] (SOLR-11508) Make coreRootDirectory configurable via an environment variable (SOLR_CORE_HOME)

2017-12-03 Thread Varun Thacker (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16276274#comment-16276274
 ] 

Varun Thacker commented on SOLR-11508:
--

+1 to the overall approach. 

In the test case we should add another core creation where we specify a 
different dataDir (
http://lucene.apache.org/solr/guide/7_1/defining-core-properties.html#defining-core-properties-files
) and then make sure core discovery works fine.  This use-case might not make 
a lot of sense (like why specify "solr.data.home" and then go create a core at 
a different place) but maybe there are SolrCloud users who want to add a 
replica at a later stage to another disk. I think the current approach doesn't 
break this but having a test will be nice.
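
The case such a test would cover can be illustrated with a small sketch of how 
a dataDir entry in core.properties might interact with solr.data.home. CoreDataDir 
is a hypothetical class, and the resolution rules (an absolute dataDir is used 
as-is, a relative one is placed under the data home in a directory named after 
the instanceDir) are assumptions, not a statement of what Solr does.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

/** Hypothetical sketch; the resolution rules are assumptions. */
final class CoreDataDir {
    private CoreDataDir() {}

    /** Parses the text of a core.properties file. */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /**
     * Resolves the data directory for a core.
     * dataHome may be null when solr.data.home is not set.
     */
    static Path resolve(Properties coreProps, Path instanceDir, Path dataHome) {
        Path configured = Paths.get(coreProps.getProperty("dataDir", "data"));
        if (configured.isAbsolute()) {
            return configured.normalize(); // explicit location, e.g. another disk
        }
        Path base = (dataHome != null) ? dataHome.resolve(instanceDir.getFileName()) : instanceDir;
        return base.resolve(configured).normalize();
    }
}
```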

PS : I won't have time anytime soon to thoroughly go through it and commit it.
