Thiruvalluvan M. G. created SOLR-16810:
------------------------------------------
Summary: Under certain situations Solr produces managed schema XML
that cannot be loaded
Key: SOLR-16810
URL: https://issues.apache.org/jira/browse/SOLR-16810
Project: Solr
Issue Type: Bug
Security Level: Public (Default Security Level. Issues are Public)
Components: Schema and Analysis
Affects Versions: 9.2.1
Reporter: Thiruvalluvan M. G.
While persisting the {{ManagedIndexSchema}} as XML, non-printable characters in
field names get escaped as {{#nn;}}, where {{nn}} is the decimal
representation of the non-printable character. For example, if the field name
contains the byte {{0x14}}, it gets escaped as {{#20;}}. This is
indistinguishable from the literal {{#20;}} in the field name. If we have two
fields, one with the non-printable character and the other with the literal
string, two fields get generated with the same name. Loading the resulting XML,
naturally, causes an exception. To fix this, any occurrence of a literal {{#}}
in the field name should itself be escaped, for example as {{##}}.
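The collision can be illustrated with a minimal sketch. This is not Solr's actual persistence code: the class and method names are hypothetical, and treating "non-printable" as {{Character.isISOControl}} is an assumption made only for the illustration.

```java
// Hypothetical illustration only; none of these names exist in Solr.
final class FieldNameEscapeSketch {

    // Scheme as described in this report: a control character becomes #nn;
    // (decimal), and every other character, including '#', is copied as-is.
    static String naiveEscape(String name) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (Character.isISOControl(c)) { // assumption: "non-printable" == ISO control
                sb.append('#').append((int) c).append(';');
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Suggested fix: additionally write a literal '#' as "##", so that an
    // escaped control character can no longer look like literal text.
    static String hashSafeEscape(String name) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (c == '#') {
                sb.append("##");
            } else if (Character.isISOControl(c)) {
                sb.append('#').append((int) c).append(';');
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

With this sketch, {{naiveEscape("f\u0014")}} and {{naiveEscape("f#20;")}} both yield {{f#20;}}, whereas {{hashSafeEscape}} yields {{f#20;}} and {{f##20;}} respectively, so the two field names stay distinct.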
A second problem is that while escaping happens when generating XML, the
corresponding unescaping does not happen on loading it. This asymmetry should
be fixed as well.
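A symmetric escape/unescape pair could look roughly like the following sketch. It assumes the {{##}} convention suggested above and again uses hypothetical names and {{Character.isISOControl}} as a stand-in for "non-printable"; it does not reflect how Solr actually loads the schema.

```java
// Hypothetical illustration only; none of these names exist in Solr.
final class FieldNameRoundTripSketch {

    // Writes '#' as "##" and control characters as #nn; (decimal).
    static String escape(String name) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (c == '#') {
                sb.append("##");
            } else if (Character.isISOControl(c)) { // assumption: "non-printable" == ISO control
                sb.append('#').append((int) c).append(';');
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Inverse of escape(): "##" becomes '#', and #nn; becomes the character
    // with decimal code nn. Malformed input raises IllegalArgumentException.
    static String unescape(String escaped) {
        StringBuilder sb = new StringBuilder();
        int i = 0;
        while (i < escaped.length()) {
            char c = escaped.charAt(i);
            if (c != '#') {
                sb.append(c);
                i++;
            } else if (i + 1 < escaped.length() && escaped.charAt(i + 1) == '#') {
                sb.append('#');
                i += 2;
            } else {
                int end = escaped.indexOf(';', i + 1);
                if (end < 0) {
                    throw new IllegalArgumentException("Unterminated escape at index " + i);
                }
                // parseInt throws NumberFormatException (an IllegalArgumentException)
                // if the text between '#' and ';' is not a decimal number.
                sb.append((char) Integer.parseInt(escaped.substring(i + 1, end)));
                i = end + 1;
            }
        }
        return sb.toString();
    }
}
```

Under these assumptions, {{unescape(escape(name))}} returns the original {{name}} for any field name, which is the property the load path currently lacks.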
--
This message was sent by Atlassian Jira
(v8.20.10#820010)