[jira] [Commented] (SOLR-6937) In schemaless mode, field names with spaces should be converted
[ https://issues.apache.org/jira/browse/SOLR-6937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14276631#comment-14276631 ]

ASF subversion and git services commented on SOLR-6937:
---
Commit 1651589 from [~noble.paul] in branch 'dev/branches/lucene_solr_5_0'
[ https://svn.apache.org/r1651589 ]

SOLR-6937 In schemaless mode ,replace spaces and special characters with underscore

> In schemaless mode, field names with spaces should be converted
> ---
>
> Key: SOLR-6937
> URL: https://issues.apache.org/jira/browse/SOLR-6937
> Project: Solr
> Issue Type: Bug
> Components: Schema and Analysis
> Reporter: Grant Ingersoll
> Assignee: Noble Paul
> Fix For: 5.0
>
> Attachments: SOLR-6937.patch, SOLR-6937.patch
>
>
> Assuming spaces in field names are still bad, we should automatically convert
> them to not have spaces. For instance, I indexed Citibike public data set
> which has:
> {quote}
> "tripduration","starttime","stoptime","start station id","start station
> name","start station latitude","start station longitude","end station
> id","end station name","end station latitude","end station
> longitude","bikeid","usertype","birth year","gender"{quote}
> My vote would be to replace spaces w/ underscores.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6937) In schemaless mode, field names with spaces should be converted
[ https://issues.apache.org/jira/browse/SOLR-6937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14276630#comment-14276630 ]

ASF subversion and git services commented on SOLR-6937:
---
Commit 1651588 from [~noble.paul] in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1651588 ]

SOLR-6937 In schemaless mode ,replace spaces and special characters with underscore
[jira] [Commented] (SOLR-6937) In schemaless mode, field names with spaces should be converted
[ https://issues.apache.org/jira/browse/SOLR-6937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14276627#comment-14276627 ]

ASF subversion and git services commented on SOLR-6937:
---
Commit 1651587 from [~noble.paul] in branch 'dev/trunk'
[ https://svn.apache.org/r1651587 ]

SOLR-6937 In schemaless mode ,replace spaces and special characters with underscore
[jira] [Commented] (SOLR-6937) In schemaless mode, field names with spaces should be converted
[ https://issues.apache.org/jira/browse/SOLR-6937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14275418#comment-14275418 ]

Grant Ingersoll commented on SOLR-6937:
---
+1
[jira] [Commented] (SOLR-6937) In schemaless mode, field names with spaces should be converted
[ https://issues.apache.org/jira/browse/SOLR-6937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14275357#comment-14275357 ]

Noble Paul commented on SOLR-6937:
--
The only catch is that if there are multiple patterns to match, you need multiple {{}} tags. I hope that is OK.
[jira] [Commented] (SOLR-6937) In schemaless mode, field names with spaces should be converted
[ https://issues.apache.org/jira/browse/SOLR-6937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14275348#comment-14275348 ]

Erik Hatcher commented on SOLR-6937:

[~noble.paul] - looks good! The pattern should be expanded to include all the funky/problematic/illegal characters before committing, but in general +1.
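Erik's request to widen the pattern beyond spaces can be illustrated with a small regex-based sketch. This is plain Java and not the committed Solr patch: the class name FieldNameSanitizer, the sanitize method, and the choice of [^A-Za-z0-9_] as the set of "problematic" characters are all assumptions made for illustration.

```java
import java.util.regex.Pattern;

/** Hypothetical helper for illustration only; not part of Solr. */
final class FieldNameSanitizer {
    // Assumption: a "safe" field name contains only ASCII letters, digits, and underscores.
    private static final Pattern UNSAFE = Pattern.compile("[^A-Za-z0-9_]");

    private FieldNameSanitizer() {}

    /** Replaces every character outside the assumed safe set with an underscore. */
    static String sanitize(String fieldName) {
        return UNSAFE.matcher(fieldName).replaceAll("_");
    }

    public static void main(String[] args) {
        System.out.println(sanitize("start station id")); // start_station_id
        System.out.println(sanitize("birth year"));       // birth_year
        System.out.println(sanitize("tripduration"));     // tripduration (unchanged)
    }
}
```

A single character class covers spaces and other special characters in one pass, so names that are already safe are returned unchanged.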
[jira] [Commented] (SOLR-6937) In schemaless mode, field names with spaces should be converted
[ https://issues.apache.org/jira/browse/SOLR-6937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14273823#comment-14273823 ]

Hoss Man commented on SOLR-6937:

bq. Are there problems that would result when changing the name of a field in FieldMutatingUpdateProcessor?

i suspect i put that in as a sanity check to protect the surface area of the API -- i don't know if relaxing that will cause problems, or if it's just something that's there because the ramifications of allowing it aren't really well tested in the rest of the FieldMutating code paths.

in particular: what does it mean? should the old field name be removed? should the corresponding field:value pair be removed, but other instances of that field:value2 be left in (ie: what if the mutator renames one instance of the field but not another?)

easiest thing would probably be to implement field renaming as a complete one-off special UpdateProcessor w/o using the FieldMutating framework (ie: no config, just something barebones for use in schemaless that can maybe later be re-parented in the class hierarchy to support more config options)
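The semantic questions Hoss raises (is the old name removed, and what happens to values when names overlap) can be made concrete with a sketch of one possible answer. It assumes the old name is dropped entirely and that values from every field mapping to the same new name are merged in encounter order. It uses a plain Map as a stand-in for a document rather than any Solr class, and the names FieldRenamer and renameFields are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-alone sketch; a plain Map stands in for a Solr document. */
final class FieldRenamer {
    private FieldRenamer() {}

    /**
     * Returns a new document in which spaces in field names are replaced by
     * underscores. Assumption: the old name is dropped, and values of fields
     * that end up with the same name are merged in encounter order.
     */
    static Map<String, List<Object>> renameFields(Map<String, List<Object>> doc) {
        Map<String, List<Object>> renamed = new LinkedHashMap<>();
        for (Map.Entry<String, List<Object>> entry : doc.entrySet()) {
            String newName = entry.getKey().replace(' ', '_');
            renamed.computeIfAbsent(newName, k -> new ArrayList<>())
                   .addAll(entry.getValue());
        }
        return renamed;
    }

    public static void main(String[] args) {
        Map<String, List<Object>> doc = new LinkedHashMap<>();
        doc.put("birth year", new ArrayList<>(List.<Object>of(1980)));
        doc.put("birth_year", new ArrayList<>(List.<Object>of(1981)));
        doc.put("gender", new ArrayList<>(List.<Object>of(1)));
        System.out.println(renameFields(doc)); // {birth_year=[1980, 1981], gender=[1]}
    }
}
```

Building a fresh map sidesteps the "rename one instance but not another" ambiguity: every field passes through the same rule, and collisions become an explicit merge rather than an accident of iteration order.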
[jira] [Commented] (SOLR-6937) In schemaless mode, field names with spaces should be converted
[ https://issues.apache.org/jira/browse/SOLR-6937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14273713#comment-14273713 ]

Erik Hatcher commented on SOLR-6937:

[~hossman_luc...@fucit.org], I tried this:

{code}
public class NormalizeFieldNameUpdateProcessorFactory extends FieldMutatingUpdateProcessorFactory {
  @Override
  public UpdateRequestProcessor getInstance(SolrQueryRequest req, SolrQueryResponse rsp, UpdateRequestProcessor next) {
    return new FieldMutatingUpdateProcessor(getSelector(), next) {
      @Override
      protected SolrInputField mutate(SolrInputField src) {
        src.setName(src.getName().replace(' ', '_'));
        return src;
      }
    };
  }
}
{code}

And got this error:

{code}
mutate returned field with different name: field with spaces => field_with_spaces
org.apache.solr.common.SolrException: mutate returned field with different name: field with spaces => field_with_spaces...
{code}

Are there problems that would result when changing the name of a field in FieldMutatingUpdateProcessor?
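The error above suggests that the framework compares the field name before and after mutate and rejects any difference. The following stand-alone sketch shows what such a guard might look like. It is an assumption inferred only from the error message, not a copy of Solr's code: NameGuardDemo, its nested Field class, and applyMutation are hypothetical names, and IllegalStateException stands in for SolrException.

```java
import java.util.function.UnaryOperator;

/** Hypothetical illustration of a rename guard; not Solr's implementation. */
final class NameGuardDemo {
    /** Minimal stand-in for a named, mutable field (not SolrInputField). */
    static final class Field {
        private String name;
        Field(String name) { this.name = name; }
        String getName() { return name; }
        void setName(String name) { this.name = name; }
    }

    private NameGuardDemo() {}

    /** Applies the mutator and rejects the result if the field name changed. */
    static Field applyMutation(Field src, UnaryOperator<Field> mutator) {
        // Capture the name first: the mutator may modify src in place.
        String before = src.getName();
        Field dest = mutator.apply(src);
        if (dest != null && !before.equals(dest.getName())) {
            throw new IllegalStateException(
                "mutate returned field with different name: " + before + " => " + dest.getName());
        }
        return dest;
    }

    public static void main(String[] args) {
        try {
            applyMutation(new Field("field with spaces"), f -> {
                f.setName(f.getName().replace(' ', '_'));
                return f;
            });
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under this reading, a mutator is allowed to change a field's values but not its identity, which is why renaming needs either a relaxed check or a separate processor.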
[jira] [Commented] (SOLR-6937) In schemaless mode, field names with spaces should be converted
[ https://issues.apache.org/jira/browse/SOLR-6937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14270662#comment-14270662 ]

Hoss Man commented on SOLR-6937:

bq. My vote would be to replace spaces w/ underscores.

Could probably be solved with a ~6 line subclass of FieldMutatingUpdateProcessor