One request handler per view?   

I think if you can make the view in use for the current request a single 
value (rather than all views that the user could use over time), it would keep 
the qf list down to a manageable size (e.g. specified within the request 
handler XML).  I'm not sure if this is feasible for you, but it seems like a 
reasonable approach given the use case you describe.
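
As a rough sketch of what that could look like in solrconfig.xml (the handler 
names "/select-viewA" and "/select-viewB" and the field names are placeholders 
I made up, not anything from your schema), each view gets its own handler with 
its own qf:

```xml
<!-- Hypothetical handler for view A; handler name and field names are placeholders -->
<requestHandler name="/select-viewA" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <!-- Only the fields that view A is allowed to search -->
    <str name="qf">A_FIELD_1 A_FIELD_2 A_FIELD_3</str>
  </lst>
</requestHandler>

<!-- Hypothetical handler for view B; same shape, different field list -->
<requestHandler name="/select-viewB" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <str name="qf">B_FIELD_1 B_FIELD_2</str>
  </lst>
</requestHandler>
```

The client would then send something like /select-viewA?q=apple and never pass 
qf itself.  If the field list must not be overridable from the URL, putting qf 
under <lst name="invariants"> instead of "defaults" should enforce that.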

Just a thought ...

-----Original Message-----
From: Steven White [mailto:swhite4...@gmail.com] 
Sent: Tuesday, May 26, 2015 4:48 PM
To: solr-user@lucene.apache.org
Subject: Re: When is too many fields in "qf" is too many?

Thanks Doug.  I might have to take you up on the hangout offer.  Let me refine the 
requirement further, and if I still see the need, I will let you know.

Steve

On Tue, May 26, 2015 at 2:01 PM, Doug Turnbull < 
dturnb...@opensourceconnections.com> wrote:

> How you have tie set is fine. Setting tie to 1 might give you reasonable 
> results. You could easily still have scores that are just always an 
> order of magnitude or two higher, but try it out!
>
> BTW, anything you put in the URL can also be put into a request handler.
>
> If you ever just want to have a 15 minute conversation via hangout, 
> happy to chat with you :) Might be fun to think through your prob together.
>
> -Doug
>
> On Tue, May 26, 2015 at 1:42 PM, Steven White <swhite4...@gmail.com>
> wrote:
>
> > Hi Doug,
> >
> > I'm back to this topic.  Unfortunately, due to my DB structure and business
> > needs, I will not be able to search against a single field (i.e., using
> > copyField).  Thus, I have to use a list of fields via "qf".  Given this, I
> > see you said above to use "tie=1.0".  Will that, more or less, address this
> > scoring issue?  Should "tie=1.0" be set on the request handler like so:
> >
> >   <requestHandler name="/select" class="solr.SearchHandler">
> >      <lst name="defaults">
> >        <str name="echoParams">explicit</str>
> >        <int name="rows">20</int>
> >        <str name="defType">edismax</str>
> >        <str name="qf">F1 F2 F3 F4 ... ... ...</str>
> >        <float name="tie">1.0</float>
> >        <str name="fl">_UNIQUE_FIELD_,score</str>
> >        <str name="wt">xml</str>
> >        <str name="indent">true</str>
> >      </lst>
> >   </requestHandler>
> >
> > Or must "tie" be passed as part of the URL?
> >
> > Thanks
> >
> > Steve
> >
> >
> > On Wed, May 20, 2015 at 2:58 PM, Doug Turnbull < 
> > dturnb...@opensourceconnections.com> wrote:
> >
> > > Yeah, a copyField into one could be a good space/time tradeoff. It can be
> > > more manageable to use an all field for both relevancy and performance, if
> > > you can handle the duplication of data.
> > >
> > > You could set tie=1.0, which effectively sums all the matches instead of
> > > picking the best match. You'll still have cases where one field's score
> > > just happens to be far off from another's and thus dominates the
> > > summation. But it's something easy to try if you want to keep playing with
> > > dismax.
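> > >
> > > As a rough sketch of the arithmetic (assuming the usual
> > > DisjunctionMaxQuery scoring per query term, where s_f is the score the
> > > term gets in field f and f* is the best-scoring field):
> > >
> > > ```latex
> > > \text{score} = s_{f^*} + \text{tie} \cdot \sum_{f \neq f^*} s_f,
> > > \qquad f^* = \arg\max_f s_f
> > > ```
> > >
> > > With made-up field scores of 4.0, 0.5 and 0.1, tie=0 gives 4.0 and
> > > tie=1.0 gives 4.0 + 0.5 + 0.1 = 4.6, so the 4.0 field still dominates
> > > the sum.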
> > >
> > > -Doug
> > >
> > > On Wed, May 20, 2015 at 2:56 PM, Steven White <swhite4...@gmail.com>
> > > wrote:
> > >
> > > > Hi Doug,
> > > >
> > > > Your blog write-up on relevancy is very interesting; I didn't know this.
> > > > Looks like I have to go back to the drawing board and figure out an
> > > > alternative solution: somehow get the data from those group-based fields
> > > > into a single field using copyField.
> > > >
> > > > Thanks
> > > >
> > > > Steve
> > > >
> > > > On Wed, May 20, 2015 at 11:17 AM, Doug Turnbull < 
> > > > dturnb...@opensourceconnections.com> wrote:
> > > >
> > > > > Steven,
> > > > >
> > > > > I'd be concerned about your relevance with that many qf fields. Dismax
> > > > > takes a "winner takes all" point of view to search. Field scores can vary
> > > > > by an order of magnitude (or even two) despite the attempts of query
> > > > > normalization. You can read more here:
> > > > >
> > > > > http://opensourceconnections.com/blog/2013/07/02/getting-dissed-by-dismax-why-your-incorrect-assumptions-about-dismax-are-hurting-search-relevancy/
> > > > >
> > > > > I'm about to win the "blasphemer" merit badge, but ad-hoc all-field-like
> > > > > searching over many fields is actually a good use case for
> > > > > Elasticsearch's cross field queries.
> > > > >
> > > > > https://www.elastic.co/guide/en/elasticsearch/guide/master/_cross_fields_queries.html
> > > > >
> > > > > http://opensourceconnections.com/blog/2015/03/19/elasticsearch-cross-field-search-is-a-lie/
> > > > > It wouldn't be hard (and actually a great feature for the project) to get
> > > > > the Lucene query associated with cross field search into Solr. You could
> > > > > easily write a plugin to integrate it into a query parser:
> > > > >
> > > > > https://github.com/elastic/elasticsearch/blob/master/src/main/java/org/apache/lucene/queries/BlendedTermQuery.java
> > > > > Hope that helps
> > > > > -Doug
> > > > > --
> > > > > Doug Turnbull | Search Relevance Consultant | OpenSource Connections,
> > > > > LLC | 240.476.9983 | http://www.opensourceconnections.com
> > > > > Author: Relevant Search <http://manning.com/turnbull> from Manning
> > > > > Publications
> > > > > This e-mail and all contents, including attachments, is considered to be
> > > > > Company Confidential unless explicitly stated otherwise, regardless of
> > > > > whether attachments are marked as such.
> > > > >
> > > > > On Wed, May 20, 2015 at 8:27 AM, Steven White <swhite4...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hi everyone,
> > > > > >
> > > > > > My solution requires that users in group-A can only search against a set
> > > > > > of fields-A, users in group-B can only search against a set of fields-B,
> > > > > > etc.  There can be several groups, as many as 100 or even more.  To meet
> > > > > > this need, I build my search by passing in the list of fields via "qf".
> > > > > > What goes into "qf" can be large: as many as 1500 fields, and each field
> > > > > > name averages 15 characters, so in effect the data passed via "qf" will
> > > > > > be over 20K characters.
> > > > > >
> > > > > > Given the above, besides the fact that a search for "apple" translates
> > > > > > into 20K characters passing over the network, what else within Solr and
> > > > > > Lucene should I be worried about, if anything?  Will I hit some kind of
> > > > > > limit?  Will each search now require more CPU cycles?  Memory?  Etc.
> > > > > >
> > > > > > If the network traffic becomes an issue, my alternative solution is to
> > > > > > create a /select handler for each group and in that handler list the
> > > > > > fields under "qf".
> > > > > >
> > > > > > I have considered creating a pseudo-field for each group and then using
> > > > > > copyField to populate it.  During search, I can then "qf" against that
> > > > > > one field.  Unfortunately, this is not ideal for my solution because the
> > > > > > fields that go into each group change dynamically (at least once a
> > > > > > month), and when they do, I have to re-index everything (which I have to
> > > > > > avoid) to sync that group field.
> > > > > >
> > > > > > I'm using "qf" with edismax and my Solr version is 5.1.
> > > > > >
> > > > > > Thanks
> > > > > >
> > > > > > Steve
> > > > > >

