Re: Permutations of entries in a multivalued field
Thanks a lot for these useful hints.

Best,
Johannes

On 18.12.2015 20:59, Allison, Timothy B. wrote:
> Duh, didn't realize you could set inOrder in Solr. Y, that's the better solution.
RE: Permutations of entries in a multivalued field
Hi Johannes,

I suspect that Scott's answer would be more efficient than the following, and I may be misunderstanding the problem!

This type of search is supported at the Lucene level by a SpanNearQuery with inOrder set to false.

So, how do you get a SpanQuery in Solr? You might want to look at the SurroundQueryParser, and I have an alternate (LUCENE-5205/SOLR-5410) here: https://github.com/tballison/lucene-addons.

If you do find an appropriate parser, make sure that your position increment gap is > 0 on your text field definition, and then you'd never incorrectly get a hit across field entries of:

[0] A B
[1] C

Best,
Tim
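
To make the position increment gap point concrete, here is a toy model in Python. It is not Lucene code: the functions `positions` and `unordered_near` are hypothetical helpers that only imitate how token positions are assigned across the entries of a multivalued field and how an unordered near match with a given slop might be evaluated. Under these assumptions, a gap of 0 lets A, B and C match across two entries, while a larger gap does not.

```python
from itertools import product


def positions(entries, gap):
    """Map each term to the token positions it occupies across all entries.

    Assumption of this toy model: the first token of the next entry lands
    at (last position + gap + 1).
    """
    index = {}
    pos = -1
    for entry in entries:
        for term in entry:
            pos += 1
            index.setdefault(term, []).append(pos)
        pos += gap  # extra distance before the next entry's first token
    return index


def unordered_near(index, terms, slop):
    """True if all terms occur, in any order, within the allowed slop."""
    if not terms:
        return False
    candidates = [index.get(t, []) for t in terms]
    for combo in product(*candidates):
        distinct = len(set(combo)) == len(terms)
        # Unused positions between the outermost matched tokens.
        holes = max(combo) - min(combo) - (len(terms) - 1)
        if distinct and holes <= slop:
            return True
    return False


# The problematic case from the message: entry [0] is "A B", entry [1] is "C".
split_entries = [["A", "B"], ["C"]]
print(unordered_near(positions(split_entries, 0), ["A", "B", "C"], 0))
print(unordered_near(positions(split_entries, 100), ["A", "B", "C"], 0))

# A permutation inside a single entry still matches.
print(unordered_near(positions([["B", "A", "C"]], 100), ["A", "B", "C"], 0))
```

In this model the gap has to be larger than any slop you intend to use, otherwise a sloppy match can still bridge two entries.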
Re: Permutations of entries in a multivalued field
Johannes,

I think your best bet is to create a QParserPlugin that orders the terms of the incoming query. It sounds like you have control over the way that field is indexed, so you could enforce the same ordering when the document comes into Solr. If that's not the case then you'll also want to write an UpdateRequestProcessor:

https://wiki.apache.org/solr/UpdateRequestProcessor

Using a phrase query is probably not an option since you're probably working with > 3 terms and phrase slop wouldn't be able to extend past that.

Hope that helps!
-Scott

On Wed, Dec 16, 2015 at 8:38 AM, Johannes Riedl <johannes.ri...@uni-tuebingen.de> wrote:

> Hello all,
>
> we are facing the following problem: we use a multivalued string field
> that contains entries of the kind A/B/C/, where A,B,C are terms.
> We are now looking for a simple way to also find all permutations of
> A/B/C, so e.g. B/A/C. As a workaround we added a new field that contains
> all entries alphabetically sorted and guarantee sorting on the user side.
> However - since this is limited in some ways - is there a simple way to
> either index in a way such that solely A/B/C and all permutations are found
> (using e.g. type=text is not an option since a term could occur in a
> different entry of the multivalued field) or trigger an alphabetical
> sorting of incoming queries.
>
> Thanks a lot for your feedback, best regards
>
> Johannes

--
Scott Stults | Founder & Solutions Architect | OpenSource Connections, LLC | 434.409.2780
http://www.opensourceconnections.com
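
The ordering idea can be sketched independently of Solr. The Python below is only an illustration of the normalization step that a QParserPlugin or UpdateRequestProcessor would have to perform; `canonical_entry` is a hypothetical helper, and both the "/" separator handling and the choice to drop the trailing separator are assumptions, not anything prescribed in the thread.

```python
def canonical_entry(entry: str, sep: str = "/") -> str:
    """Return the entry with its terms in one fixed order.

    Empty pieces (e.g. from a trailing separator) are dropped, and the
    result has no trailing separator. The sort is a plain code-point sort;
    any fixed order works as long as indexing and querying use the same one.
    """
    terms = [t for t in entry.split(sep) if t]
    return sep.join(sorted(terms))


# Apply the same function to the value being indexed and to the query value.
indexed = canonical_entry("A/B/C/")
query = canonical_entry("B/A/C/")
print(indexed == query)
```

Because every permutation collapses to the same string, an exact match on the normalized string field finds A/B/C and all of its permutations, and nothing from other entries of the multivalued field.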
Re: Permutations of entries in a multivalued field
The other thing to check is the ComplexPhraseQueryParser, see:
https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-ComplexPhraseQueryParser

It uses the Span queries to build up the query...

Best,
Erick

On Fri, Dec 18, 2015 at 11:23 AM, Allison, Timothy B. wrote:
> This type of search is supported at the Lucene level by a SpanNearQuery with
> inOrder set to false.
> [...]
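
As a rough sketch of what a ComplexPhraseQueryParser request might look like, the Python below builds the `q` parameter with the parser's `inOrder` local parameter set to false. The field name `entries_txt` and the helper `complexphrase_params` are hypothetical, the terms are not escaped, and the query has not been run against a real Solr instance, so treat it as a starting point only. It also assumes a tokenized field, not the string field from the original question.

```python
from urllib.parse import urlencode


def complexphrase_params(field, terms, in_order=False):
    """Build request parameters for an unordered complex phrase query.

    Assumes the terms contain no characters that need escaping.
    """
    phrase = " ".join(terms)
    local_params = "{!complexphrase inOrder=%s}" % str(in_order).lower()
    return {"q": '%s%s:"%s"' % (local_params, field, phrase)}


# 'entries_txt' is a made-up field name used only for illustration.
params = complexphrase_params("entries_txt", ["A", "B", "C"])
print(params["q"])
print(urlencode(params))  # URL-encoded form for appending to a select URL
```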
RE: Permutations of entries in a multivalued field
Duh, didn't realize you could set inOrder in Solr. Y, that's the better solution.

-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Friday, December 18, 2015 2:27 PM
To: solr-user <solr-user@lucene.apache.org>
Subject: Re: Permutations of entries in a multivalued field

The other thing to check is the ComplexPhraseQueryParser, see:
https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-ComplexPhraseQueryParser

It uses the Span queries to build up the query...

Best,
Erick