[ 
https://issues.apache.org/jira/browse/SOLR-3207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230689#comment-13230689
 ] 

Hoss Man commented on SOLR-3207:
--------------------------------

the giant elephant in the room that doesn't seem to have been discussed is that 
trying to validate that field names meet some strict criteria when loading 
schema.xml doesn't really address dynamic fields -- the patch ensures that 
<dynamicField name="..."/> configurations have names which are validated, but i 
don't see anything that would consider the actual field names people use 
with those dynamic fields -- ie: "*_i" might be a perfectly valid dynamicField 
at startup, but that startup validation isn't going to help me if i index a 
document containing the field name "{{$ - foo_i}}"

In general, i'm opposed to the idea of "locking down" what field names can be 
used across the board.  My preference would be to let people use any field name 
their heart desires, but provide better documentation on which field name 
restrictions apply to which features (ie: "using a field name in a 
function requires that the field name match ValidatorX; using a field name in 
fl requires that the field name conform to ValidatorX and 
ValidatorY; etc...").

If we want to provide automated "validation" of these things for people, then 
let's make it part of the LukeRequestHandler: for any field name returned by 
the LukeRequestHandler, let's include a warnings section advising them which 
validation rules that field name doesn't match, and what features depend on 
that validation rule -- this info could then easily be exposed in the admin UI.
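
As a rough illustration of the warnings idea, here is a minimal sketch in plain Java. It does not touch the real LukeRequestHandler or its response format; the FieldNameWarnings and Rule classes, the rule name, and the feature list passed in by the caller are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch: not Solr code, and not the LukeRequestHandler API.
final class FieldNameWarnings {
    private FieldNameWarnings() {}

    /** One named validation rule plus the features assumed to depend on it. */
    static final class Rule {
        final String name;
        final Predicate<String> check;
        final List<String> features;

        Rule(String name, Predicate<String> check, List<String> features) {
            this.name = name;
            this.check = check;
            this.features = features;
        }
    }

    /** Returns one warning per rule that the field name fails; empty if it passes all. */
    static List<String> warningsFor(String fieldName, List<Rule> rules) {
        List<String> warnings = new ArrayList<>();
        for (Rule rule : rules) {
            if (!rule.check.test(fieldName)) {
                warnings.add("field '" + fieldName + "' does not match " + rule.name
                        + "; affected features: " + String.join(", ", rule.features));
            }
        }
        return warnings;
    }
}
```

A field that passes every rule gets an empty list, so nothing changes for people who already use conventional names.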

We could also provide an optional UpdateProcessor people could configure with a 
list of individual Validators which could reject any document containing a 
field that didn't match the validator (or optionally: translate the field name 
to something that did conform) to help people enforce these even on dynamic 
fields.
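
To make the reject-or-translate behaviour concrete, here is a minimal sketch in plain Java. It models a document as a simple map rather than using Solr's update processor classes; FieldNameGuard, its constructor arguments, and the "replace anything outside [A-Za-z0-9_] with an underscore" translation are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch: a real implementation would live in an update
// processor chain; here a document is just a map of field name to value.
final class FieldNameGuard {
    private final List<Predicate<String>> validators;
    private final boolean translate;

    FieldNameGuard(List<Predicate<String>> validators, boolean translate) {
        this.validators = new ArrayList<>(validators);
        this.translate = translate;
    }

    private boolean conforms(String name) {
        for (Predicate<String> validator : validators) {
            if (!validator.test(name)) {
                return false;
            }
        }
        return true;
    }

    /** Returns a copy of the document with conforming field names, or throws. */
    Map<String, Object> process(Map<String, Object> doc) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : doc.entrySet()) {
            String name = entry.getKey();
            if (!conforms(name)) {
                if (!translate) {
                    throw new IllegalArgumentException("field name rejected: " + name);
                }
                // Assumed translation: replace every non [A-Za-z0-9_] char with '_'.
                String translated = name.replaceAll("[^A-Za-z0-9_]", "_");
                if (!conforms(translated)) {
                    throw new IllegalArgumentException("field name rejected: " + name);
                }
                name = translated;
            }
            // Note: two names that translate to the same string would collide here.
            result.put(name, entry.getValue());
        }
        return result;
    }
}
```

Whether to reject or translate is a per-installation choice, which is why the sketch takes it as a flag instead of hard-coding one behaviour.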

So by default: any field is allowed, but if i create one with a funky name 
(either explicitly or as a result of loading data using a dynamicField) the 
admin UI starts warning me that feature XYZ won't work with fields A, B, C; and 
if i want to ensure feature D works with all of my fields i add an update 
processor to ensure it.

                
> Add field name validation
> -------------------------
>
>                 Key: SOLR-3207
>                 URL: https://issues.apache.org/jira/browse/SOLR-3207
>             Project: Solr
>          Issue Type: Improvement
>    Affects Versions: 4.0
>            Reporter: Luca Cavanna
>             Fix For: 4.0
>
>         Attachments: SOLR-3207.patch
>
>
> Given the SOLR-2444 updated fl syntax and the SOLR-2719 regression, it would 
> be useful to add some kind of validation regarding the field names you can 
> use on Solr.
> The objective would be adding consistency, allowing only field names that you 
> can then use within fl, sorting etc.
> The rules, taken from the actual StrParser behaviour, seem to be the 
> following: 
> - same used for java identifiers (Character#isJavaIdentifierPart), plus the 
> use of trailing '.' and '-'
> - for the first character the rule is Character#isJavaIdentifierStart minus 
> '$' (The dash can't be used as first character (SOLR-3191) for example)
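
The two rules quoted above can be sketched as a small predicate. This is only one reading of the description, not the code from the attached patch: it assumes "trailing" means any position after the first character, and the FieldNameRules class and its method names are hypothetical.

```java
// Hypothetical sketch of the quoted rules; not taken from SOLR-3207.patch.
final class FieldNameRules {
    private FieldNameRules() {}

    /** First character: a Java identifier start, excluding '$'. */
    static boolean isValidStart(char c) {
        return c != '$' && Character.isJavaIdentifierStart(c);
    }

    /** Later characters: a Java identifier part, or '.' or '-'. */
    static boolean isValidPart(char c) {
        return Character.isJavaIdentifierPart(c) || c == '.' || c == '-';
    }

    /** True if the whole name satisfies both rules; null and empty names fail. */
    static boolean isValid(String name) {
        if (name == null || name.isEmpty()) {
            return false;
        }
        if (!isValidStart(name.charAt(0))) {
            return false;
        }
        for (int i = 1; i < name.length(); i++) {
            if (!isValidPart(name.charAt(i))) {
                return false;
            }
        }
        return true;
    }
}
```

Under this reading a name such as "foo_i" or "a.b-c" passes, while a leading dash, a leading '$', or a name containing braces or spaces fails.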
