[jira] [Commented] (SOLR-6937) In schemaless mode, field names with spaces should be converted

2015-01-14 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14276631#comment-14276631
 ] 

ASF subversion and git services commented on SOLR-6937:
---

Commit 1651589 from [~noble.paul] in branch 'dev/branches/lucene_solr_5_0'
[ https://svn.apache.org/r1651589 ]

SOLR-6937 In schemaless mode, replace spaces and special characters with 
underscores

> In schemaless mode, field names with spaces should be converted
> ---
>
> Key: SOLR-6937
> URL: https://issues.apache.org/jira/browse/SOLR-6937
> Project: Solr
> Issue Type: Bug
> Components: Schema and Analysis
> Reporter: Grant Ingersoll
> Assignee: Noble Paul
> Fix For: 5.0
>
> Attachments: SOLR-6937.patch, SOLR-6937.patch
>
>
> Assuming spaces in field names are still bad, we should automatically convert 
> such names so they contain no spaces.  For instance, I indexed the Citibike 
> public data set, which has these columns: 
> {quote}
> "tripduration","starttime","stoptime","start station id","start station 
> name","start station latitude","start station longitude","end station 
> id","end station name","end station latitude","end station 
> longitude","bikeid","usertype","birth year","gender"{quote}
> My vote would be to replace spaces w/ underscores.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6937) In schemaless mode, field names with spaces should be converted

2015-01-14 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14276630#comment-14276630
 ] 

ASF subversion and git services commented on SOLR-6937:
---

Commit 1651588 from [~noble.paul] in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1651588 ]

SOLR-6937 In schemaless mode, replace spaces and special characters with 
underscores




[jira] [Commented] (SOLR-6937) In schemaless mode, field names with spaces should be converted

2015-01-14 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14276627#comment-14276627
 ] 

ASF subversion and git services commented on SOLR-6937:
---

Commit 1651587 from [~noble.paul] in branch 'dev/trunk'
[ https://svn.apache.org/r1651587 ]

SOLR-6937 In schemaless mode, replace spaces and special characters with 
underscores




[jira] [Commented] (SOLR-6937) In schemaless mode, field names with spaces should be converted

2015-01-13 Thread Grant Ingersoll (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14275418#comment-14275418
 ] 

Grant Ingersoll commented on SOLR-6937:
---

+1




[jira] [Commented] (SOLR-6937) In schemaless mode, field names with spaces should be converted

2015-01-13 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14275357#comment-14275357
 ] 

Noble Paul commented on SOLR-6937:
--

The only catch is that if there are multiple patterns to match, you need 
multiple {{}} tags. I hope that is OK.




[jira] [Commented] (SOLR-6937) In schemaless mode, field names with spaces should be converted

2015-01-13 Thread Erik Hatcher (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14275348#comment-14275348
 ] 

Erik Hatcher commented on SOLR-6937:


[~noble.paul] - looks good! The pattern should be expanded to include all 
the funky/problematic/illegal characters before committing, but in general +1.
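
To illustrate what an expanded pattern could look like, here is a minimal sketch in plain Java. It is not the pattern from the attached patch: the class name {{FieldNameSanitizer}}, the {{sanitize}} method, and the choice to treat everything outside letters, digits, and underscore as problematic are all assumptions made for this example.

```java
import java.util.regex.Pattern;

final class FieldNameSanitizer {
    // Assumption for this sketch: any character that is not an ASCII letter,
    // digit, or underscore counts as problematic in a field name.
    private static final Pattern UNSAFE = Pattern.compile("[^A-Za-z0-9_]");

    /** Replaces every problematic character with a single underscore. */
    static String sanitize(String fieldName) {
        return UNSAFE.matcher(fieldName).replaceAll("_");
    }

    public static void main(String[] args) {
        String[] names = {"start station id", "birth year", "price($)", "tripduration"};
        for (String name : names) {
            System.out.println(name + " -> " + sanitize(name));
        }
    }
}
```

Each problematic character becomes its own underscore, so names stay the same length; collapsing runs of underscores would be a separate design decision.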




[jira] [Commented] (SOLR-6937) In schemaless mode, field names with spaces should be converted

2015-01-12 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14273823#comment-14273823
 ] 

Hoss Man commented on SOLR-6937:


bq. Are there problems that would result when changing the name of a field in 
FieldMutatingUpdateProcessor?

i suspect i put that in as a sanity check to protect the surface area of 
the API -- i don't know if relaxing that will cause problems, or if it's just 
something that's there because the ramifications of allowing it aren't really 
well tested in the rest of the FieldMutating code paths.

in particular: what does it mean? should the old field name be removed? should 
the corresponding field:value pair be removed, but other instances of that 
field:value2 be left in (ie: what if the mutator renames one instance of the 
field but not another?)

easiest thing would probably be to implement field renaming as a complete 
one-off special UpdateProcessor w/o using the FieldMutating framework (ie: no 
config, just something barebones for use in schemaless that can maybe later be 
re-parented in the class hierarchy to support more config options)
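
One possible answer to the "what does it mean?" questions can be sketched without any Solr classes. In the example below a plain {{Map}} of field name to value list stands in for a Solr input document, and the names {{FieldRenamer}} and {{renameFields}} are hypothetical. The policy shown (drop the old name, and merge values when two names collide after renaming) is an assumption for illustration, not necessarily what the committed fix does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class FieldRenamer {
    /**
     * Returns a new document in which spaces in field names are replaced
     * by underscores. The input map is a stand-in for a Solr document.
     */
    static Map<String, List<Object>> renameFields(Map<String, List<Object>> doc) {
        Map<String, List<Object>> renamed = new LinkedHashMap<>();
        for (Map.Entry<String, List<Object>> entry : doc.entrySet()) {
            String newName = entry.getKey().replace(' ', '_');
            // The old name is not carried over. If two names collide after
            // renaming, their values are merged in encounter order.
            renamed.computeIfAbsent(newName, k -> new ArrayList<>())
                   .addAll(entry.getValue());
        }
        return renamed;
    }

    public static void main(String[] args) {
        Map<String, List<Object>> doc = new LinkedHashMap<>();
        doc.put("birth year", Arrays.asList(1985));
        doc.put("birth_year", Arrays.asList(1986));
        doc.put("gender", Arrays.asList(1));
        System.out.println(renameFields(doc));
    }
}
```

Because every field in the document goes through the same rename, the "one instance renamed but not another" case cannot arise in this sketch; a processor that renames selectively would still need an explicit rule for it.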




[jira] [Commented] (SOLR-6937) In schemaless mode, field names with spaces should be converted

2015-01-12 Thread Erik Hatcher (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14273713#comment-14273713
 ] 

Erik Hatcher commented on SOLR-6937:


[~hossman_luc...@fucit.org], I tried this:

{code}
public class NormalizeFieldNameUpdateProcessorFactory extends FieldMutatingUpdateProcessorFactory {
  @Override
  public UpdateRequestProcessor getInstance(SolrQueryRequest req, SolrQueryResponse rsp, UpdateRequestProcessor next) {
    return new FieldMutatingUpdateProcessor(getSelector(), next) {
      @Override
      protected SolrInputField mutate(SolrInputField src) {
        src.setName(src.getName().replace(' ', '_'));
        return src;
      }
    };
  }
}
{code}

And got this error: 

{code}
mutate returned field with different name: field with spaces => field_with_spaces
org.apache.solr.common.SolrException: mutate returned field with different name: field with spaces => field_with_spaces...
{code}

Are there problems that would result when changing the name of a field in 
FieldMutatingUpdateProcessor?




[jira] [Commented] (SOLR-6937) In schemaless mode, field names with spaces should be converted

2015-01-08 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14270662#comment-14270662
 ] 

Hoss Man commented on SOLR-6937:


bq. My vote would be to replace spaces w/ underscores.

Could probably be solved with a ~6 line subclass of FieldMutatingUpdateProcessor
