[jira] [Commented] (SOLR-7981) term based ValueSourceParsers should support an option to run an analyzer for the specified field on the input

Jason Gerlowski (JIRA) Sun, 01 Nov 2015 08:39:44 -0800

    [ 
https://issues.apache.org/jira/browse/SOLR-7981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14984446#comment-14984446
 ]


Jason Gerlowski commented on SOLR-7981:
---------------------------------------

As a slight follow-up here, I'm pretty certain that I'm right about function 
queries "defaulting" to the query analyzer.

It wouldn't be too difficult to allow users to specify which analyzer to use.  
That's still an option that makes sense IMO.  But we probably don't want to 
change the default analyzer (from 'query' to 'none').  With that in mind, the 
new behavior could look like this:

termfreq(foo_t, 'Bicycles');          // Uses the 'query' analyzer associated with that field.
termfreq(foo_t, 'Bicycles', index);   // Uses the index analyzer.
termfreq(foo_t, 'Bicycles', query);   // Uses the query analyzer.
termfreq(foo_t, 'bicycle', none);     // Explicitly uses no analyzer (just wrap in a TermQuery, don't tokenize).

Does that sound like reasonable behavior?  Just wanted to straighten out the 
intended behavior before I sit down to finish it up.
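
To make the proposal concrete, here is a minimal sketch of the argument handling in plain Java. It is not the patch and deliberately avoids the real Lucene/Solr classes: AnalyzerMode, resolveTerm, parseMode and TOY_ANALYZER are hypothetical names, and the toy analyzer (lowercase, split on whitespace, strip a trailing "s") only stands in for whatever analysis chain foo_t actually has. It assumes a missing third argument means 'query', and that an analyzer producing anything other than exactly one term should be rejected.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.function.Function;
import java.util.stream.Collectors;

class TermArgSketch {
    /** Hypothetical values for the optional third argument. */
    enum AnalyzerMode { QUERY, INDEX, NONE }

    /**
     * Toy stand-in for a field's analysis chain (an assumption, not a Solr API):
     * lowercases, splits on whitespace, and strips one trailing "s" per token.
     */
    static final Function<String, List<String>> TOY_ANALYZER = input ->
        Arrays.stream(input.trim().toLowerCase(Locale.ROOT).split("\\s+"))
              .filter(t -> !t.isEmpty())
              .map(t -> t.length() > 1 && t.endsWith("s") ? t.substring(0, t.length() - 1) : t)
              .collect(Collectors.toList());

    /** A missing third argument keeps the current default: the query analyzer. */
    static AnalyzerMode parseMode(String arg) {
        if (arg == null) {
            return AnalyzerMode.QUERY;
        }
        // valueOf throws IllegalArgumentException for anything but query/index/none.
        return AnalyzerMode.valueOf(arg.toUpperCase(Locale.ROOT));
    }

    /** Returns the single term text that the function would look up. */
    static String resolveTerm(String input, String modeArg,
                              Function<String, List<String>> queryAnalyzer,
                              Function<String, List<String>> indexAnalyzer) {
        AnalyzerMode mode = parseMode(modeArg);
        if (mode == AnalyzerMode.NONE) {
            return input; // use the raw input as the term, no tokenizing
        }
        List<String> terms =
            (mode == AnalyzerMode.INDEX ? indexAnalyzer : queryAnalyzer).apply(input);
        if (terms.size() != 1) {
            // The "special error checking" case from the issue description.
            throw new IllegalArgumentException(
                "Expected exactly one term from '" + input + "' but got " + terms.size());
        }
        return terms.get(0);
    }
}
```

With the toy analyzer plugged in for both roles, resolveTerm("Bicycles", null, ...) and resolveTerm("Bicycles", "index", ...) both resolve to the same term as resolveTerm("bicycle", "none", ...), while an input like "Red Bicycles" is rejected because it analyzes to two terms.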

> term based ValueSourceParsers should support an option to run an analyzer for 
> the specified field on the input
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-7981
>                 URL: https://issues.apache.org/jira/browse/SOLR-7981
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Hoss Man
>              Labels: newdev
>         Attachments: SOLR-7981.patch
>
>
> The following functions all take exactly 2 arguments: a field name, and a 
> term value...
> * idf
> * termfreq
> * tf
> * totaltermfreq
> ...we should consider adding an optional third argument to indicate if an 
> analyzer for the specified field should be used on the input to find the real 
> "Term" to consider.
> For example, the following might all result in equivalent numeric values for 
> all docs assuming simple plural stemming and lowercasing...
> {noformat}
> termfreq(foo_t,'Bicycles',query) // use the query analyzer for field foo_t on input Bicycles
> termfreq(foo_t,'Bicycles',index) // use the index analyzer for field foo_t on input Bicycles
> termfreq(foo_t,'bicycle',none) // no analyzer used to construct Term
> termfreq(foo_t,'bicycle') // legacy 2 arg syntax, same as 'none'
> {noformat}
> (Special error checking needed if analyzer creates more than one term for the 
> given input string)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
