[jira] [Comment Edited] (LUCENE-4369) StringFields name is unintuitive and not helpful

Jack Krupansky (JIRA) Fri, 14 Sep 2012 08:14:11 -0700

    [ https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13455856#comment-13455856 ]


Jack Krupansky edited comment on LUCENE-4369 at 9/15/12 2:14 AM:
-----------------------------------------------------------------

At this stage of the discussion, is there any intention that the replacement 
for "string" fields will permit analysis, at least CharFilter analysis? After 
all, that is one of the main reasons people get confused. I'm okay with 
"ExactTextField" except that character filtering would be really nice and avoid 
confusion for Solr users. Of course, it would be ironic to call it "exact text" 
when it needs to be filtered.

OTOH, at the Lucene level, especially the Lucene query parser, I can see why 
you would want the "string" field to prevent analysis - because there is no 
field-specific analysis, just one analyzer for all fields. Hmmm... maybe that 
should be proposed for the Lucene query parser to sidestep that particular 
rationale for wanting strings to be unanalyzed - provide a map of 
field-specific analyzers.
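
To make the "map of field-specific analyzers" idea concrete, here is a minimal 
sketch in plain Java. It does not use the Lucene API: the class 
PerFieldAnalysisSketch, its register and analyze methods, and the two toy 
analyzers are all hypothetical names invented for illustration. The only point 
is that once the parser looks up an analyzer by field name, a "string"-style 
field can map to a keyword analyzer while other fields fall back to a default. 
(Lucene's PerFieldAnalyzerWrapper provides a similar per-field mapping at the 
Analyzer level.)

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Illustrative only: a toy model of per-field analysis, not Lucene code.
class PerFieldAnalysisSketch {

    // Keeps the whole value as a single term, like an unanalyzed "string" field.
    static final Function<String, List<String>> KEYWORD =
        text -> Collections.singletonList(text);

    // Lowercases and splits on whitespace, like a very basic text analyzer.
    static final Function<String, List<String>> WHITESPACE_LOWER =
        text -> Arrays.asList(text.toLowerCase(Locale.ROOT).trim().split("\\s+"));

    private final Map<String, Function<String, List<String>>> perField = new HashMap<>();
    private final Function<String, List<String>> fallback;

    PerFieldAnalysisSketch(Function<String, List<String>> fallback) {
        this.fallback = fallback;
    }

    // Associates a field name with the analyzer used for that field only.
    void register(String field, Function<String, List<String>> analyzer) {
        perField.put(field, analyzer);
    }

    // Uses the field-specific analyzer if one is registered, else the fallback.
    List<String> analyze(String field, String text) {
        return perField.getOrDefault(field, fallback).apply(text);
    }

    public static void main(String[] args) {
        PerFieldAnalysisSketch sketch = new PerFieldAnalysisSketch(WHITESPACE_LOWER);
        sketch.register("id", KEYWORD);
        System.out.println(sketch.analyze("id", "Foo Bar-1"));
        System.out.println(sketch.analyze("title", "Foo Bar-1"));
    }
}
```

With a lookup like this, the query parser would no longer need the field type 
itself to block analysis; an "exact" field would simply be mapped to the 
keyword analyzer.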

At the Solr schema level, we could simply have "string" be a TextField with 
only KeywordTokenizer and then users can copy and/or customize as they wish.

This raises the question of how or whether the Solr schema side of the house will 
rename the "string" field type, or keep it as string and simply change the 
StrField class name.

                
      was (Author: jkrupan):
    At this stage of the discussion, is there any intention that the 
replacement for "string" fields will permit analysis, at least CharFilter 
analysis? After all, that is one of the main reasons people get confused. I'm 
okay with "ExactTextField" except that character filtering would be really nice 
and avoid confusion for Solr users. Of course, it would be ironic to call it 
"exact text" when it needs to be filtered.

OTOH, at the Lucene level, especially the Lucene querey parser, I can see why 
you would want the "string" field to prevent analysis - because there is no 
field-specific analysis, just one analyzer for all fields. Hmmm... maybe that 
should be proposed for the Lucene query parser to side step that particular 
rationale for wanting strings to be unanalyzed - provide a map of 
field-specific analyzers.

At the Solr schema level, we could simply have "string" be a TextField with 
only KeywordTokenizer and then users can copy and/or customize as they wish.

This begs the question of how or whether the Solr schema side of the house will 
rename the "string" field type, or keep it as string and simply change the 
StrField class name.

                  
> StringFields name is unintuitive and not helpful
> ------------------------------------------------
>
>                 Key: LUCENE-4369
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4369
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Robert Muir
>         Attachments: LUCENE-4369.patch
>
>
> There's a huge difference between TextField and StringField: StringField 
> screws up scoring and bypasses your Analyzer.
> (see java-user thread "Custom Analyzer Not Called When Indexing" as an 
> example.)
> The name we use here is vital, otherwise people will get bad results.
> I think we should rename StringField to MatchOnlyField.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

