Will White created SOLR-16820:
---------------------------------
Summary: PackageUtils collection validation is more restrictive
than CreateCollectionAPI allows
Key: SOLR-16820
URL: https://issues.apache.org/jira/browse/SOLR-16820
Project: Solr
Issue Type: Bug
Security Level: Public (Default Security Level. Issues are Public)
Components: Package Manager
Reporter: Will White
It's possible to create a collection via the CreateCollectionAPI whose name [passes
validation by the
SolrIdentifierValidator|https://github.com/apache/solr/blob/main/solr/solrj/src/java/org/apache/solr/client/solrj/util/SolrIdentifierValidator.java#L50-L52]
(a regex that allows the '.' character), but that same collection name
then fails validation during deploy/undeploy via the PackageTool because
of the [packagemanager.PackageUtils validateCollections()
method|https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/packagemanager/PackageUtils.java#L271].
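To make the mismatch concrete, here is a minimal sketch. Both patterns are assumptions based on a reading of the linked sources and should be checked against them rather than taken as verbatim copies; the class and method names are hypothetical and exist only for this illustration.

```java
import java.util.regex.Pattern;

public class CollectionNameMismatch {
  // Assumption: mirrors SolrIdentifierValidator.identifierPattern (allows '.', rejects a leading '-').
  static final Pattern IDENTIFIER_PATTERN = Pattern.compile("^(?!\\-)[\\._A-Za-z0-9\\-]+$");

  // Assumption: mirrors the regex used by PackageUtils (no '.', but a leading '-' is accepted).
  static final Pattern PACKAGE_UTILS_PATTERN = Pattern.compile("^[a-zA-Z0-9_-]*$");

  public static boolean passesIdentifierPattern(String name) {
    return IDENTIFIER_PATTERN.matcher(name).matches();
  }

  public static boolean passesPackageUtilsPattern(String name) {
    return PACKAGE_UTILS_PATTERN.matcher(name).matches();
  }

  public static void main(String[] args) {
    // Print how each sample name fares under the two patterns.
    for (String name : new String[] {"products", "products.v2", "-products"}) {
      System.out.printf(
          "%-12s identifier=%b packageUtils=%b%n",
          name, passesIdentifierPattern(name), passesPackageUtilsPattern(name));
    }
  }
}
```

Under these assumed patterns, a name containing '.' is the case that can be created but not used with the PackageTool.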
A change [like this, using the existing
SolrIdentifierValidator|https://github.com/apache/solr/commit/638fd768ebd7ed7908029ced08e56bed05a4a2a5]
would bring the two validation steps back in line, although there's presumably
a better approach.
*Potential risks*
As highlighted by Gus Heck [in this
thread|https://lists.apache.org/thread/h7hnksgqwxxl7nkwkhn01r6jn8xjkjjs],
changing the validation of collection names could be risky.
The PackageUtils regex appears to originate from
[https://github.com/apache/lucene-solr/pull/994], from before Solr split from
the Lucene project. It doesn't seem to have been crafted to deliberately
exclude the '.' character for a specific set of use cases; it just appears to
be the regex implemented at the time.
Taking the {{SolrIdentifierValidator}} approach mentioned above as an example:
apart from disallowing collection names that begin with a '-' character,
{{SolrIdentifierValidator.identifierPattern}} would be a strict expansion of
the collection names allowed by {{PackageUtils.validateCollections}}. Any
other solution (such as [this more naive
example|https://github.com/apache/solr/blame/998fffdccf51a0560589e2cb413e9da127a5f26e/solr/core/src/java/org/apache/solr/packagemanager/PackageUtils.java#L271])
could similarly mitigate much of the potential risk by only expanding the
allowed collection names.
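As one hypothetical illustration of an expansion-only change, a validator could accept a name when either pattern matches, so that no name accepted today (including one with a leading '-') would start being rejected. This is a sketch only, not what the linked commit or the naive example does; both patterns are assumptions to be checked against the linked sources, and the class and method names are invented for the example.

```java
import java.util.regex.Pattern;

public class ExpandOnlyValidator {
  // Assumption: the regex currently used by PackageUtils.
  static final Pattern LEGACY_PATTERN = Pattern.compile("^[a-zA-Z0-9_-]*$");

  // Assumption: mirrors SolrIdentifierValidator.identifierPattern.
  static final Pattern IDENTIFIER_PATTERN = Pattern.compile("^(?!\\-)[\\._A-Za-z0-9\\-]+$");

  // Accept the union of both patterns: every previously valid name stays valid,
  // and names containing '.' become valid as well.
  public static boolean isAllowed(String name) {
    return LEGACY_PATTERN.matcher(name).matches()
        || IDENTIFIER_PATTERN.matcher(name).matches();
  }

  public static void main(String[] args) {
    for (String name : new String[] {"products.v2", "-products", "bad name"}) {
      System.out.println(name + " allowed=" + isAllowed(name));
    }
  }
}
```

The trade-off in this sketch is that names the CreateCollectionAPI would reject, such as those with a leading '-', are still accepted here, which keeps existing behaviour unchanged but leaves the two validators not fully aligned.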