[ 
https://issues.apache.org/jira/browse/SOLR-11488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16204733#comment-16204733
 ] 

Erick Erickson commented on SOLR-11488:
---------------------------------------

bq: Do you mean someone wants to delete data or add data and it ends up in the 
wrong collection?

Well, that's possible too, but it's not the case I was worrying about. I mean 
that if the code changes such that the delete collection logic follows the 
alias, the new collection would get deleted.

At this point, having an alias with the same name as a collection only does 
the right thing on delete by chance, and there are no safeguards in the test 
suite ensuring that behavior. Some innocent change could very easily delete the 
wrong collection!
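
To make the hazard concrete, here is a toy model. It is not Solr code: plain 
dicts stand in for cluster state, and both delete functions are hypothetical. 
It only shows how the same request deletes a different collection once delete 
starts following the alias.

```python
# Toy model only: dicts stand in for cluster state; none of this is Solr code.

def delete_literal(collections, aliases, name):
    """Delete the collection literally named `name`, ignoring aliases."""
    remaining = dict(collections)
    remaining.pop(name, None)
    return remaining


def delete_following_alias(collections, aliases, name):
    """Hypothetical changed behavior: resolve the alias first, then delete."""
    target = aliases.get(name, name)
    remaining = dict(collections)
    remaining.pop(target, None)
    return remaining


collections = {"old": ["doc1"], "new": ["doc2"]}
aliases = {"old": "new"}  # the alias shares its name with a real collection

print(sorted(delete_literal(collections, aliases, "old")))          # ['new']
print(sorted(delete_following_alias(collections, aliases, "old")))  # ['old']
```

In the second call the request named "old" but "new" is the collection that 
disappears.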

Interesting idea (the underscore bit). Off the top of my head, it would solve 
the ambiguity problem. Under the covers, it has a similar way of eliminating 
problems: there would never be a collection and an alias with the same name.
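
As I understand the underscore idea, it could be sketched roughly like this. 
Everything here is an assumption made for illustration: the class, the method 
names, and the rule that user-facing names may not start with "_" are 
hypothetical and are not taken from any patch.

```python
class UnderscoreCluster:
    """Toy model: physical collections are '_'-prefixed, users see aliases."""

    def __init__(self):
        self.collections = set()  # physical names, always start with "_"
        self.aliases = {}         # user-facing name -> physical name

    def create_collection(self, name):
        # Assumption: user-facing names never start with "_", so an alias
        # can never collide with a physical collection name.
        if name.startswith("_"):
            raise ValueError("user-facing names may not start with '_'")
        physical = "_" + name
        if physical in self.collections:
            raise ValueError("collection already exists: " + name)
        self.collections.add(physical)
        self.aliases[name] = physical  # alias is created automatically
        return physical

    def repoint(self, alias, target_name):
        """Point an existing user-facing name at another collection."""
        physical = "_" + target_name
        if physical not in self.collections:
            raise KeyError(target_name)
        self.aliases[alias] = physical

    def resolve(self, name):
        """Physical names pass through; anything else must be an alias."""
        if name in self.collections:
            return name
        return self.aliases[name]


cluster = UnderscoreCluster()
cluster.create_collection("foo")
cluster.create_collection("foo_v2")
cluster.repoint("foo", "foo_v2")  # re-index, then swap the alias
print(cluster.resolve("foo"))     # _foo_v2
print(cluster.resolve("_foo"))    # _foo
```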

I guess there'd be some heartburn because the state.json file would have the 
underscore, but we've already changed the shard naming convention and as long 
as the users didn't notice (since they could use "foo") that would be an 
implementation detail. 

There's a little additional level of indirection, and we have clients with 
1000s of collections. Does that matter enough to worry about? I suppose if 
someone really wanted to not use the aliases, they could reference the 
collection name with the underscore, and then if they told us "we can't use 
aliases for reindexing because all our applications use _foo" I wouldn't be 
too sympathetic.

I think it boils down to whether we're willing to tell people "in order to 
re-index, you have to create an alias to your old collection first and have 
your client use that". I'm waffling, frankly. I just know that there'll be some 
clients who'll reply "we can't change the client". This can still be worked 
around, but would take more effort, to wit:
> create your new collection and index
> plan a service interruption
> delete the old collection (backup first!)
> create an alias with the old collection name pointing to the new collection.
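
The steps above map onto Collections API calls roughly as follows. This is a 
sketch that only builds the request URLs and sends nothing. The host and port, 
numShards=1, the backup name, and the backup location are placeholder 
assumptions; indexing into the new collection and the service interruption 
itself are not API calls and are left out.

```python
from urllib.parse import urlencode

# Assumed host/port; adjust for the actual cluster.
BASE = "http://localhost:8983/solr/admin/collections"


def collections_api_url(action, **params):
    """Build a Collections API URL for `action` with the given parameters."""
    return BASE + "?" + urlencode({"action": action, **params})


def reindex_workaround(old, new, backup_location):
    """Return the admin requests for the workaround, in order."""
    return [
        # Create the new collection (index into it before going further).
        collections_api_url("CREATE", name=new, numShards=1),
        # Back up the old collection before deleting it.
        collections_api_url("BACKUP", name=old + "_backup",
                            collection=old, location=backup_location),
        # Delete the old collection during the planned interruption.
        collections_api_url("DELETE", name=old),
        # Reuse the old name as an alias pointing at the new collection.
        collections_api_url("CREATEALIAS", name=old, collections=new),
    ]


for url in reindex_workaround("old", "new", "/path/to/backups"):
    print(url)
```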

I don't particularly like this as it takes away the fallback of going back to 
the old collection.  It'd work though.

I suppose the alternative is to leave things as they are and create a boatload 
of tests that ensure that aliases take precedence over names everywhere, 
everywhen. Updates, deletes, collection admin operations, queries, basically 
anywhere we operate on a collection by name. I'd _much_ rather have a systemic 
fix, either this JIRA or your idea (or some other for that matter).
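
For comparison, the check proposed in this JIRA amounts to something like the 
sketch below. It is a toy model, not the attached patch: the class, the 
exception, and the method names are hypothetical.

```python
class NameConflictError(ValueError):
    """Raised when a collection and an alias would share a name."""


class GuardedCluster:
    """Toy model: a name may be a collection or an alias, never both."""

    def __init__(self):
        self.collections = set()
        self.aliases = {}  # alias name -> collection name

    def create_collection(self, name):
        if name in self.aliases:
            raise NameConflictError("an alias named %r already exists" % name)
        if name in self.collections:
            raise ValueError("collection %r already exists" % name)
        self.collections.add(name)

    def create_alias(self, alias, target):
        if alias in self.collections:
            raise NameConflictError(
                "a collection named %r already exists" % alias)
        if target not in self.collections:
            raise KeyError(target)
        self.aliases[alias] = target

    def delete_collection(self, name):
        self.collections.discard(name)


cluster = GuardedCluster()
cluster.create_collection("old")
cluster.create_collection("new")
try:
    cluster.create_alias("old", "new")  # step 4 in the description: rejected
except NameConflictError as err:
    print("rejected:", err)
cluster.delete_collection("old")
cluster.create_alias("old", "new")      # allowed once "old" is gone
print(cluster.aliases)                  # {'old': 'new'}
```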

I'm going to create a new JIRA for the underscore idea and link it here so we 
can track them separately.

> Do not allow collections and aliases to have the same name
> ----------------------------------------------------------
>
>                 Key: SOLR-11488
>                 URL: https://issues.apache.org/jira/browse/SOLR-11488
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Erick Erickson
>            Assignee: Erick Erickson
>         Attachments: SOLR-11488.patch
>
>
> Currently you can define an alias with the same name as a collection and 
> (perhaps) vice-versa. The more I think about this the worse idea it seems. 
> See the discussion at the linked JIRAs.
> Proposal: We should fail to create a collection if an alias already exists 
> with the same name and vice-versa.
> This should depend on SOLR-11444 and supersede SOLR-11218, this JIRA will 
> include tests that define the intended behavior making SOLR-11218 obsolete. 
> We'll close SOLR-11218 as "contained by" this JIRA.
> This _will_ take away the ability to
> 1> create a collection, call it "old" and index to it.
> 2> decide you want to change the schema
> 3> create a collection call it "new" and index to it.
> 4> create an alias old->new THIS WILL FAIL.
> 5> delete the "old" collection
> People will have to create an alias pointing to "old" and change their 
> clients to use it, then they can do step 4 above....
> This is kind of a pain, but much better than following an alias and deleting 
> "new". I'd also argue that it's a maintenance problem to have collections and 
> aliases with the same name.
> What do people think? I'll try to work up a preliminary patch. If we do this, 
> we should probably coordinate committing this and SOLR-11444 and I'll also 
> change the docs to reflect this and upgrade notes.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
