Re: Permutations of entries in a multivalued field

2015-12-21 Thread Johannes Riedl

Thanks a lot for these useful hints.

Best,

Johannes



RE: Permutations of entries in a multivalued field

2015-12-18 Thread Allison, Timothy B.
Hi Johannes,
  I suspect that Scott's answer would be more efficient than the following, and 
I may be misunderstanding the problem!

 This type of search is supported at the Lucene level by a SpanNearQuery with 
inOrder set to false.
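
To make the idea of an unordered near-match concrete, here is a minimal pure-Python sketch. It is a simplified model and not Lucene code: the function name `unordered_near_match` and the window rule (window width minus number of terms must not exceed the slop) are assumptions chosen for illustration, and it assumes the query terms are distinct single tokens.

```python
from itertools import product


def unordered_near_match(tokens, terms, slop=0):
    """Return True if all terms occur inside one window, in any order.

    Simplified model: a window qualifies when its width minus the number
    of query terms is at most `slop`.
    """
    candidates = []
    for term in terms:
        hits = [i for i, tok in enumerate(tokens) if tok == term]
        if not hits:
            return False  # a required term is missing entirely
        candidates.append(hits)

    # Try every combination of one position per term.
    for combo in product(*candidates):
        if len(set(combo)) != len(combo):
            continue  # one token cannot stand in for two query terms
        width = max(combo) - min(combo) + 1
        if width - len(terms) <= slop:
            return True
    return False
```

With this model, the tokens B A C match the query terms A, B, C at slop 0, while A X B C only match once the slop is raised to 1.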
  
 So, how do you get a SpanQuery in Solr?  You might want to look at the 
SurroundQueryParser, and I have an alternate (LUCENE-5205/SOLR-5410) here: 
https://github.com/tballison/lucene-addons. 

 If you do find an appropriate parser, make sure that the position increment 
gap on your text field definition is > 0 (and larger than any slop you allow), 
so that you never incorrectly get a hit across field entries such as:

[0] A B
[1] C
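
The effect of the gap on the two entries shown here can be sketched in plain Python. This is only a model of how positions are assigned across the values of a multivalued field (in Solr the gap is set with the `positionIncrementGap` attribute of the field type); the helper names `index_positions` and `within_slop` are hypothetical, and the check assumes each term occurs at most once.

```python
def index_positions(values, gap):
    """Assign a token position to each term of a multivalued field.

    Inside a value, positions grow by 1. Between two values, the next
    position is shifted by an extra `gap`.
    """
    positions = {}
    pos = -1
    for n, value in enumerate(values):
        if n > 0:
            pos += gap  # extra distance between consecutive values
        for term in value.split():
            pos += 1
            positions.setdefault(term, []).append(pos)
    return positions


def within_slop(positions, terms, slop=0):
    """Unordered proximity check, using the first occurrence of each term."""
    if any(term not in positions for term in terms):
        return False
    chosen = [positions[term][0] for term in terms]
    return max(chosen) - min(chosen) + 1 - len(terms) <= slop


entries = ["A B", "C"]  # the two field entries [0] and [1]
no_gap = index_positions(entries, gap=0)
big_gap = index_positions(entries, gap=100)
```

In this model `within_slop(no_gap, ["A", "B", "C"])` is true, which is the unwanted hit across entries, while `within_slop(big_gap, ["A", "B", "C"])` is false. A slop of 100 would bridge the gap again, so the gap should be larger than any slop used at query time.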

Best,
   Tim

On Wed, Dec 16, 2015 at 8:38 AM, Johannes Riedl < 
johannes.ri...@uni-tuebingen.de> wrote:

> Hello all,
>
> we are facing the following problem: we use a multivalued string field 
> that contains entries of the kind A/B/C/, where A,B,C are terms.
> We are now looking for a simple way to also find all permutations of 
> A/B/C, so e.g. B/A/C. As a workaround we added a new field that 
> contains all entries alphabetically sorted and guarantee sorting on the user 
> side.
> However - since this is limited in some ways - is there a simple way 
> to either index in a way such that solely A/B/C and all permutations 
> are found (using e.g. type=text is not an option since a term could 
> occur in a different entry of the multivalued field) or trigger an 
> alphabetical sorting of incoming queries.
>
> Thanks a lot for your feedback, best regards
>
> Johannes
>
>




Re: Permutations of entries in a multivalued field

2015-12-18 Thread Scott Stults
Johannes,

I think your best bet is to create a QParserPlugin that orders the terms of
the incoming query. It sounds like you have control over the way that field
is indexed, so you could enforce the same ordering when the document comes
into Solr. If that's not the case then you'll also want to write an
UpdateRequestProcessor:

https://wiki.apache.org/solr/UpdateRequestProcessor
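
The ordering step itself is small. A rough Python sketch of the normalization that both the query side and the indexing side would have to share could look like this; `canonical_entry` is a hypothetical helper, not a Solr API, and it assumes entries are `/`-separated with an optional trailing slash.

```python
def canonical_entry(entry, sep="/"):
    """Return the terms of an entry such as 'B/A/C/' in sorted order.

    Empty pieces (for example from a trailing separator) are dropped, so
    every permutation of the same terms maps to the same string.
    """
    terms = [piece.strip() for piece in entry.split(sep) if piece.strip()]
    return sep.join(sorted(terms))


indexed = canonical_entry("A/B/C/")  # applied when the document is indexed
queried = canonical_entry("B/A/C")   # applied to the incoming query
```

Both calls yield the string A/B/C, so an exact match on a string field finds the entry regardless of the order the user typed. The sort here is Python's default string ordering; any ordering works as long as index and query use the same one.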

Using a phrase query is probably not an option, since you're likely working
with more than 3 terms and phrase slop wouldn't be able to extend past that.


Hope that helps!
-Scott




-- 
Scott Stults | Founder & Solutions Architect | OpenSource Connections, LLC
| 434.409.2780
http://www.opensourceconnections.com


Re: Permutations of entries in a multivalued field

2015-12-18 Thread Erick Erickson
The other thing to check is the ComplexPhraseQueryParser, see:
https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-ComplexPhraseQueryParser

It uses the Span queries to build up the query...

Best,
Erick



RE: Permutations of entries in a multivalued field

2015-12-18 Thread Allison, Timothy B.
Duh, I didn't realize you could set inOrder in Solr.  Yes, that's the better 
solution.
