Erick Erickson created SOLR-11503:
-------------------------------------

             Summary: Collections created with legacyCloud=true cannot be 
opened if legacyCloud=false
                 Key: SOLR-11503
                 URL: https://issues.apache.org/jira/browse/SOLR-11503
             Project: Solr
          Issue Type: Bug
      Security Level: Public (Default Security Level. Issues are Public)
    Affects Versions: 6.6.1
            Reporter: Erick Erickson
            Priority: Critical


SOLR-11122 introduced a bug starting with 6.6.1 that means if you create a 
collection with legacyCloud=true then switch to legacyCloud=false, you get an 
NPE because coreNodeName is not defined in core.properties.

Since the default for legacyCloud changed from true to false between 6.6.1 and 
7.x, this means that any attempt to upgrade Solr with existing collections 
*created with Solr 6.6.1 or 6.6.2 will fail* if the default value for 
legacyCloud is used in both. Collections created with 6.6.0 would work. 
Collections created in 6.6.1 or 6.6.2 with legacyCloud=false will work.

This is not as egregious with any collections *created with 7.0* since if the 
default legacyCloud=false is present when the core is created, properties are 
persisted with coreNodeName. However, if someone switches legacyCloud to true, 
creates a collection, then changes legacyCloud back to false, they'll 
hit this even in 7.0+.

This happened because a bit of reordering switched the order of the calls below. 
coreNodeName is added to the descriptor in create/createFromDescriptor(this, 
cd) via zkController.preRegister, so coresLocator.create(this, cd), which now 
runs first, persists core.properties without coreNodeName.

_original order_
SolrCore core = createFromDescriptor(cd, true, newCollection);
coresLocator.create(this, cd);

(NOTE: private calls to create were renamed to createFromDescriptor in 
SOLR-11122).
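
To make the ordering problem concrete, here is a minimal sketch of why the call order matters. Nothing in it is Solr code: the descriptor is modeled as a plain map, "persisting" is modeled as taking a snapshot of that map, and the class name, method bodies, and sample values (core_node1, the replica name) are all made up for illustration. It only models the legacyCloud=true case, where coreNodeName gets added as a side effect of creating the core.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-ins, not Solr classes: a descriptor is a mutable map,
// and "persisting" means snapshotting whatever it holds at that moment.
final class CreateOrderSketch {

    // Plays the role of create/createFromDescriptor: as a side effect
    // (zkController.preRegister in the real code) it adds coreNodeName.
    static void createFromDescriptor(Map<String, String> cd) {
        cd.put("coreNodeName", "core_node1"); // made-up value
    }

    // Plays the role of coresLocator.create: writes out the descriptor
    // exactly as it looks right now.
    static Map<String, String> persist(Map<String, String> cd) {
        return new LinkedHashMap<>(cd);
    }

    static Map<String, String> newDescriptor() {
        Map<String, String> cd = new LinkedHashMap<>();
        cd.put("name", "collection1_shard1_replica1"); // made-up value
        return cd;
    }

    // Original order: create the core first, then persist.
    static Map<String, String> originalOrder() {
        Map<String, String> cd = newDescriptor();
        createFromDescriptor(cd);
        return persist(cd);
    }

    // Switched order: persist first, then create the core.
    static Map<String, String> switchedOrder() {
        Map<String, String> cd = newDescriptor();
        Map<String, String> written = persist(cd);
        createFromDescriptor(cd); // too late, the snapshot is already written
        return written;
    }

    public static void main(String[] args) {
        // prints "original: true"
        System.out.println("original: " + originalOrder().containsKey("coreNodeName"));
        // prints "switched: false"
        System.out.println("switched: " + switchedOrder().containsKey("coreNodeName"));
    }
}
```

In the sketch, only the switched order leaves the persisted properties without coreNodeName, which is the state that later trips up a node running with legacyCloud=false.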

I've got a fix in the works for creating cores. I'll attach a preliminary patch 
w/o tests in a bit for discussion, but the question is really what to do about 
6.6.1 and 6.6.2, and 7.1 for that matter. 

This is compounded by the fact that with the CVE, there's strong incentive to 
move to 6.6.2. siiiigh.

There are two parts to fixing this completely:
1> create core.properties correctly
2> deal with coreNodeName not being in the core.properties file by going to ZK 
and getting it (and persisting it). I haven't worked that part out yet, so it's 
not in the first patch. One point to note: if it works as I hope, it will 
update the core.properties files the first time they're opened.
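
As a rough illustration of what part 2 might look like, here is a sketch under stated assumptions, not the planned patch. The ZK cluster state is modeled as a simple map from core name to coreNodeName, and the class and method names are hypothetical. The real lookup would have to go through Solr's cluster state, and the caller would be responsible for rewriting core.properties when the method reports a change.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical sketch: fill in a missing coreNodeName from a stand-in for ZK.
final class CoreNodeNameFallbackSketch {

    // Returns true if coreProps was changed and should be written back to
    // core.properties, false if coreNodeName was already present.
    static boolean ensureCoreNodeName(Properties coreProps,
                                      Map<String, String> zkCoreNodeNames) {
        if (coreProps.getProperty("coreNodeName") != null) {
            return false; // nothing to repair
        }
        String coreName = coreProps.getProperty("name");
        String fromZk = zkCoreNodeNames.get(coreName);
        if (fromZk == null) {
            // Fail with a clear message instead of an NPE further down.
            throw new IllegalStateException(
                "No coreNodeName found in ZK for core " + coreName);
        }
        coreProps.setProperty("coreNodeName", fromZk);
        return true;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("name", "collection1_shard1_replica1"); // made-up value

        Map<String, String> zk = new HashMap<>();
        zk.put("collection1_shard1_replica1", "core_node1"); // made-up value

        boolean changed = ensureCoreNodeName(props, zk);
        System.out.println(changed + " " + props.getProperty("coreNodeName"));
    }
}
```

Returning a "changed" flag keeps the repair a one-time event: the first open rewrites the file, and later opens find coreNodeName already there.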


Here are the options I see for each part:

*part1 create the core.properties correctly*
> Release 6.6.3 and/or 7.1.1 with this fix. This still leaves 7.0 as a problem.
> Recommend people not install 7x over collections created with 6x until they 
> have a version with fixes (7.1.1? 7.2?). Switching legacyCloud values and 
> creating collections is at your own risk.
> Recommend that people set legacyCloud=true in 7.x until they start working 
> with a fixed version, which one TBD.

*part2 deal with coreNodeName not being in the core.properties*

> Don't backport, and release with 7.2? Set legacyCloud=true until then.
> Backport to point releases like 7.1.1? 6.6.3?
> And what about 7.0? I don't think many people will be affected by 7.0 since 
> 7.1 came out so soon after. And setting legacyCloud=true will let people get 
> by.

Fixing the two parts is not in question; they both need to be fixed. The real 
question is whether we need to create a point release that incorporates one or 
both, or whether it's enough to say "you must set legacyCloud=true prior to 
Solr version 7.# in order to work with any collections created with Solr 
versions 6.6.1 through 7.#".
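
For anyone needing the legacyCloud=true workaround in the meantime: legacyCloud is a cluster property, and one way to set it is the Collections API CLUSTERPROP action. The sketch below only builds the request URL. The helper name is hypothetical and the base URL (http://localhost:8983/solr) is an assumption about a default local install, so adjust it for your cluster.

```java
// Hypothetical helper: builds the Collections API URL that sets the
// legacyCloud cluster property via the CLUSTERPROP action.
final class LegacyCloudPropSketch {

    static String clusterPropUrl(String solrBaseUrl, boolean legacyCloud) {
        return solrBaseUrl
            + "/admin/collections?action=CLUSTERPROP"
            + "&name=legacyCloud"
            + "&val=" + legacyCloud;
    }

    public static void main(String[] args) {
        // Base URL is an assumption (default local install).
        System.out.println(clusterPropUrl("http://localhost:8983/solr", true));
    }
}
```

Issuing a GET against the printed URL (with curl or a browser) should set the property cluster-wide.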

Let's hear opinions......




