Re: Roadmap for fixing features broken by core autodiscovery

2013-10-05 Thread Erick Erickson
Right, let's move this discussion to SOLR-4779. There's some history
here. Sharing named config sets got a bit wrapped up in sharing the
underlying solrconfig object. This latter has been taken off the
table, but we should discuss fixing Trey's issues up. Here's what the
thinking was:
There would be a directory like solr_home/configs/configset1,
solr_home/configs/configset2, etc. Then a new parameter for create or
whatever, like configset=configset1, that would be smart enough to look
in solr_home/configs for an entire conf directory named configset1.
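
To make the lookup concrete, here is a minimal sketch of how a configset name might be resolved to a directory under solr_home/configs. The function name and the error handling are hypothetical and only illustrate the layout described above; this is not Solr code.

```python
from pathlib import Path


def resolve_configset(solr_home: str, configset: str) -> Path:
    """Return the directory for a named config set.

    Hypothetical helper: mirrors the proposed layout
    solr_home/configs/<configset>; not actual Solr code.
    """
    base = (Path(solr_home) / "configs").resolve()
    candidate = (base / configset).resolve()
    # Refuse names that escape the configs directory (e.g. "../other").
    if base not in candidate.parents:
        raise ValueError(f"invalid configset name: {configset!r}")
    if not candidate.is_dir():
        raise FileNotFoundError(f"no config set named {configset!r} in {base}")
    return candidate
```

With this shape, configset=configset1 would map to solr_home/configs/configset1, and a missing or malformed name fails early instead of silently falling back to some other directory.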

Does that work for your case? If so, please add your comments to 4779
and we can take it from there. FWIW, I don't think this is especially
hard, but time is always at a premium.



Roadmap for fixing features broken by core autodiscovery

2013-10-04 Thread Trey Grainger
There are two use-cases that appear broken with the new core auto-discovery
mechanism:

*1) The Core Admin Handler's CREATE command no longer works to create brand
new cores*
(unless you have logged on the box and created the core's directory
structure manually, which largely defeats the purpose of the CREATE
command).  With the old Solr.xml format, we could spin up as many cores as
we wanted to dynamically with the following command:

In the new core discovery mode, this exception is now thrown:
Error CREATEing SolrCore 'newCore1': Could not create a new core in
solr/collection1/as another core is already defined there

The exception is being thrown intentionally because a file already
exists in solr/collection1 (and only one can exist per directory).
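
For illustration only, a CoreAdmin CREATE request of the kind described in (1) could be built as sketched below. The host, port, and dataDir value are assumptions made for the example, and this is not the command from the original message; action, name, instanceDir and dataDir are the CoreAdmin parameters relied on here.

```python
from urllib.parse import urlencode


def core_create_url(base_url: str, name: str, instance_dir: str, data_dir: str) -> str:
    """Build a CoreAdmin CREATE request URL (old-style solr.xml usage)."""
    params = {
        "action": "CREATE",
        "name": name,
        "instanceDir": instance_dir,  # shared by every core in this setup
        "dataDir": data_dir,          # unique per core
    }
    return f"{base_url.rstrip('/')}/admin/cores?{urlencode(params)}"


# Assumed values: a local Solr at the default port, sharing collection1's instanceDir.
print(core_create_url("http://localhost:8983/solr", "newCore1",
                      "collection1", "newCore1/data"))
```

The point of the sketch is that only name and dataDir change from core to core, while instanceDir stays the same, which is exactly the sharing that core discovery rejects.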

*2) Having a shared configuration directory (instanceDir) across many cores
no longer works*.
Every core has to have its own conf/ directory, and this doesn't seem to
be overridable any longer.  Previously, it was possible to have many cores
share the same instanceDir (and just override their dataDir for obvious
reasons).  Now, it is necessary to copy and paste identical config files
for each Solr core.

I don't know if there's already a current roadmap for fixing this.  I saw
a JIRA ticket that suggested replacing instanceDir with the ability to
specify a named configSet.  This solves problem 2, but not problem 1 (since
you still can't have multiple files in the same folder).  Based on Erick's
comments in the JIRA ticket, it also sounds like this ticket is dead at the
moment.

There is definitely a need to have a shared config directory - whether that
is through a configSet or an explicit indexDir doesn't matter to me.
 There's also a need to be able to dynamically create Solr cores from
external systems.  I currently can't upgrade to core auto discovery because
it doesn't allow dynamic core creation.  Does anyone have some thoughts on
how to best get these features working again under core autodiscovery?
 Adding instanceDir to seems like an easy solution, but
there must be a desire not to do that or it would probably have already
been done.

I'm happy to contribute some time to resolving this if there is an
agreed-upon path forward.



Re: Roadmap for fixing features broken by core autodiscovery

2013-10-04 Thread Shawn Heisey
On 10/4/2013 7:21 PM, Trey Grainger wrote:
 There are two use-cases that appear broken with the new core
 auto-discovery mechanism:
 *1) The Core Admin Handler's CREATE command no longer works to create
 brand new cores* 
 (unless you have logged on the box and created the core's directory
 structure manually, which largely defeats the purpose of the CREATE
 command).  With the old Solr.xml format, we could spin up as many cores
 as we wanted to dynamically with the following command:
 In the new core discovery mode, this exception is now thrown:
 Error CREATEing SolrCore 'newCore1': Could not create a new core in
 solr/collection1/as another core is already defined there

The CREATE action has *always* required that you have your configuration
on the disk before you call it.  You are sharing the instanceDir, which
is the only reason you can skip that step.

If you want completely dynamic creation, use SolrCloud, which keeps the
config in zookeeper and requires ZERO config information to exist on the

 *2) Having a shared configuration directory (instanceDir) across many
 cores no longer works*.  
 Every core has to have its own conf/ directory, and this doesn't seem
 to be overridable any longer.  Previously, it was possible to have many
 cores share the same instanceDir (and just override their dataDir for
 obvious reasons).  Now, it is necessary to copy and paste identical
 config files for each Solr core.

From what I understand talking to the people that worked on this, the
lack of a shared instanceDir was completely deliberate.  It's the only
way that core discovery can work in any kind of predictable and sane
manner.  The entire point of it is that every core is self-contained and
solr.xml isn't used to tell Solr about them.
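
As a rough sketch of why sharing breaks this model: discovery amounts to walking the Solr home and treating each directory that holds a marker file as one self-contained core. The marker name core.properties is the file core discovery looks for; the function itself is a hypothetical illustration, not the actual Solr implementation.

```python
import os


def discover_cores(solr_home: str, marker: str = "core.properties") -> list:
    """Return directories under solr_home that contain the marker file."""
    found = []
    for dirpath, dirnames, filenames in os.walk(solr_home):
        if marker in filenames:
            found.append(dirpath)
            dirnames[:] = []  # a core is self-contained: do not look inside it
    return sorted(found)
```

Since a directory can only yield one core in a walk like this, two cores pointing at the same instanceDir cannot both be discovered.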

I personally have never tried to share the instanceDir.  I do have
shared configs, though - my corename/conf directories have symlinks to a
shared config directory.  I also don't dynamically create cores - I have
seven shards, each of which has a live core and a build core.  There are
two other cores that serve as frontends, with the shards parameter in
the request handlers.
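
A minimal sketch of one way to set up that kind of symlinked layout, assuming each core's conf entry is itself a symlink to a single shared directory (one reading of the setup above). The function name and the directory names in the usage note are made up for the example.

```python
import os
from pathlib import Path
from typing import Iterable


def link_shared_conf(solr_home: str, shared_conf: str, core_names: Iterable[str]) -> None:
    """Give each core a conf entry that is a symlink to one shared directory."""
    target = Path(shared_conf).resolve()
    for name in core_names:
        core_dir = Path(solr_home) / name
        core_dir.mkdir(parents=True, exist_ok=True)
        link = core_dir / "conf"
        if link.is_symlink() or link.exists():
            continue  # leave any existing conf directory or link untouched
        os.symlink(target, link, target_is_directory=True)
```

For example, link_shared_conf("/opt/solr_home", "/opt/solr_home/shared_conf", ["s0live", "s0build"]) would leave every core with its own directory, which is what discovery wants, while the config files exist only once on disk.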

 I don't know if there's already a current roadmap for fixing this.  I
 saw a JIRA ticket that suggested replacing instanceDir with the ability
 to specify a named configSet.  This solves problem 2, but not problem 1
 (since you still can't have multiple files in the same folder).  Based
 on Erick's comments in the JIRA ticket, it also sounds like this ticket
 is dead at the moment.
 There is definitely a need to have a shared config directory - whether
 that is through a configSet or an explicit indexDir doesn't matter to
 me.  There's also a need to be able to dynamically create Solr cores
 from external systems.  I currently can't upgrade to core auto discovery
 because it doesn't allow dynamic core creation.  Does anyone have some
 thoughts on how to best get these features working again under core
 autodiscovery?  Adding instanceDir to seems like an easy
 solution, but there must be a desire not to do that or it would probably
 have already been done.

Thankfully, you do not need to upgrade to core discovery anytime soon.
All future 4.x versions will support the old format, and any problems
with that will be considered bugs.  It will be mandatory in Solr 5.0,
which currently doesn't have any kind of release roadmap or timeframe.
I suspect that what we currently call SolrCloud will also be mandatory
in 5.0, and that gives you shared configs with zookeeper.  Requiring
zookeeper allows completely dynamic core/collection creation, because
the only thing that will be on the disk is the index and transaction log

