Re: Matching on a multi valued field

2011-04-05 Thread Michael Sokolov
Could you try creating fields dynamically: common_names_1, 
common_names_2, etc.


Keep track of the max number of fields and generate queries listing all 
the fields?


Gross, but it handles all the cases mentioned in the thread (wildcards, 
phrases, etc).
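
A minimal sketch of the query generation in Python, assuming the values were indexed into numbered fields and that the application tracks the highest number in use. The helper name and the `common_names_` prefix are illustrative, not part of Solr:

```python
def per_value_query(clause, max_fields, prefix="common_names_"):
    """Repeat one clause across common_names_1 .. common_names_N, OR-ed together.

    `max_fields` is the largest number of values any document holds; the
    application has to track it itself (assumption of this sketch).
    """
    fields = [f"{prefix}{i}" for i in range(1, max_fields + 1)]
    return " OR ".join(f"{name}:({clause})" for name in fields)


# Each numbered field holds exactly one value, so an AND inside the
# parentheses can only be satisfied by terms from the same value.
print(per_value_query("man's AND friend", 3))
```

Because the clause is passed through untouched, it can just as well be a wildcard or a quoted phrase.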


-Mike






Re: Matching on a multi valued field

2011-04-05 Thread Renaud Delbru

Hi,

you could try the SIREn plugin [1] which supports multi-valued fields.

[1] http://siren.sindice.com
--
Renaud Delbru






Re: Matching on a multi valued field

2011-04-04 Thread Brian Lamb
I just noticed Juan's response and I find that I am encountering that very
issue in a few cases. Boosting is a good way to put the more relevant
results at the top, but is it possible to have only the correct results
returned?






Re: Matching on a multi valued field

2011-04-04 Thread Juan Pablo Mora
I have not found any solution to this. The only option is to denormalize your 
multivalued field into several docs, each with a single-valued field.
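
A rough sketch of that denormalization in Python, done on the client side before indexing. The `parent_id` and `common_name` field names are hypothetical and would have to exist in the schema:

```python
def denormalize(doc, multi_field="common_names", single_field="common_name"):
    """Split one multi-valued document into one document per value.

    Each child keeps a pointer to the original id so results can be
    grouped back together by the application (names are illustrative).
    """
    return [
        {"id": f"{doc['id']}-{i}", "parent_id": doc["id"], single_field: value}
        for i, value in enumerate(doc[multi_field], start=1)
    ]


record1 = {"id": "RECORD1", "common_names": ["man's best friend", "pooch"]}
for child in denormalize(record1):
    print(child)
```

The cost is a larger index and the need to collapse several hits back into one parent record.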

Try ComplexPhraseQueryParser (https://issues.apache.org/jira/browse/SOLR-1604) 
if you are using Solr 1.4.








Re: Matching on a multi valued field

2011-04-04 Thread Jonathan Rochkind

On 4/4/2011 3:21 PM, Brian Lamb wrote:

I just noticed Juan's response and I find that I am encountering that very
issue in a few cases. Boosting is a good way to put the more relevant
results at the top, but is it possible to have only the correct results
returned?


Only what's already been said in the thread.  You can simulate a 
non-phrase non-wildcard search, forced to match all within the same 
value of a multi-valued, by using phrase queries with slop.  And it will 
only return hits that have all terms within the same value -- it's not a 
boosting solution.


But if you need wildcards, or you need to find an actual phrase in the 
same value as additional term(s) or phrase(s), no, you are out of luck 
in Solr.


That is exactly what Juan already said.

If someone can think of a clever way to write some Java to do this in a 
new query component, that would be useful. I am not sure how feasible 
that is; I'm not too familiar with the Solr/Lucene source. I guess you'd 
have to make sure that ALL matching tokens or phrases are within the 
positionIncrementGap of each other. At any rate, there's no way to do it 
out of the box with Solr.




Re: Matching on a multi valued field

2011-03-30 Thread Brian Lamb
Thank you all for your responses. The field had already been set up with
positionIncrementGap=100 so I just needed to add in the slop.





Matching on a multi valued field

2011-03-29 Thread Brian Lamb
Hi all,

I have a field set up like this:

<field name="common_names" multiValued="true" type="text" indexed="true"
stored="true" required="false" />

And I have some records:

RECORD1
<arr name="common_names">
  <str>man's best friend</str>
  <str>pooch</str>
</arr>

RECORD2
<arr name="common_names">
  <str>man's worst enemy</str>
  <str>friend to no one</str>
</arr>

Now if I do a search such as:
http://localhost:8983/solr/search/?q=*:*&fq={!q.op=AND df=common_names}man's
friend

Both records are returned. However, I only want RECORD1 returned. I
understand why RECORD2 is returned but how can I structure my query so that
only RECORD1 is returned?

Thanks,

Brian Lamb


Re: Matching on a multi valued field

2011-03-29 Thread Jonathan Rochkind
As far as I know, there's no support in Solr for "all words must match 
in the same value of a multi-valued field".

I agree it would be useful in some cases.

As long as you don't need to do an _actual_ phrase search, you can kind 
of fake it by using a phrase query, with the query slop set so high that 
it will encompass the whole field. Just make sure the 
positionIncrementGap in your schema.xml is higher than your phrase 
slop, to keep your phrase slop from leaking over into another value of 
the multi-valued field.


fq="man's friend"~1
(but url encode the value)
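
One way to get the encoding right is to let a library do it. The sketch below uses Python's standard `urllib.parse.urlencode`; the base URL and field name are taken from the original question, the helper name is illustrative, and the slop of 10 is only an assumed value that must stay below the field's positionIncrementGap:

```python
from urllib.parse import urlencode


def build_filter_url(base, field, phrase, slop):
    """Build a URL whose filter query is a sloppy phrase on one field."""
    params = {
        "q": "*:*",
        # The quotes make it a phrase query; ~slop relaxes the term distance.
        "fq": f'{field}:"{phrase}"~{slop}',
    }
    return f"{base}?{urlencode(params)}"


url = build_filter_url(
    "http://localhost:8983/solr/search/", "common_names", "man's friend", 10
)
print(url)
```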




Re: Matching on a multi valued field

2011-03-29 Thread Savvas-Andreas Moysidis
I assume you are using the Standard Handler?
In that case wouldn't something like:
q=common_names:(man's friend)&q.op=AND work?




Re: Matching on a multi valued field

2011-03-29 Thread Erick Erickson
Two things need to be done. First, define positionIncrementGap
(see http://wiki.apache.org/solr/SchemaXml) for the field.

Then use phrase searches with the slop less than what you've
defined for positionIncrementGap.

Of course you'll have to have a positionIncrementGap larger than the
number of tokens in any single entry in your multiValued field, and you'll
have to re-index.
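
To see why the gap has to exceed the slop, here is a simplified Python model of token positions and sloppy phrase matching. It is an approximation for illustration, not Lucene's implementation: tokens are split on whitespace, and each new value starts `gap` positions after the previous one ends.

```python
from itertools import product


def index_positions(values, gap=100):
    """Map each token to its positions, leaving `gap` between values."""
    positions = {}
    next_pos = 0
    for value in values:
        tokens = value.lower().split()
        for offset, token in enumerate(tokens):
            positions.setdefault(token, []).append(next_pos + offset)
        next_pos += len(tokens) + gap
    return positions


def sloppy_match(positions, phrase, slop):
    """True if the phrase terms occur within `slop` moves of phrase order."""
    terms = phrase.lower().split()
    if any(term not in positions for term in terms):
        return False
    for combo in product(*(positions[term] for term in terms)):
        # Shift each position by the term's place in the phrase; an exact
        # phrase match then has all shifted positions equal.
        shifted = [pos - i for i, pos in enumerate(combo)]
        if max(shifted) - min(shifted) <= slop:
            return True
    return False


record1 = index_positions(["man's best friend", "pooch"])
record2 = index_positions(["man's worst enemy", "friend to no one"])
print(sloppy_match(record1, "man's friend", 10))   # terms share one value
print(sloppy_match(record2, "man's friend", 10))   # terms sit in different values
print(sloppy_match(record2, "man's friend", 200))  # slop larger than the gap
```

In this model the third call matches RECORD2 again, which is the "leak" a slop larger than the gap produces.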

Best
Erick




Re: Matching on a multi valued field

2011-03-29 Thread Savvas-Andreas Moysidis
my bad..just realised your problem.. :D

On 29 March 2011 22:07, Savvas-Andreas Moysidis 
savvas.andreas.moysi...@googlemail.com wrote:

 I assume you are using the Standard Handler?
 In that case wouldn't something like:
 q=common_names:(man's friend)&q.op=AND work?






Re: Matching on a multi valued field

2011-03-29 Thread Markus Jelsma
Hi,

Your filter query is looking for a match of man's friend in a single field. 
Regardless of analysis of the common_names field, all terms are present in the 
common_names field of both documents. A multiValued field is actually a single 
field with all data separated with positionIncrement. Try setting that value 
high enough and use a PhraseQuery. 

That should work
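
As a toy illustration of why both records pass the AND filter, the Python sketch below treats a multi-valued field as one flat bag of terms. The whitespace tokenizer is an assumption that stands in for the real analyzer on the `text` type:

```python
def flattened_terms(values):
    """All tokens of a multi-valued field, with value boundaries ignored."""
    return {token for value in values for token in value.lower().split()}


def matches_all_terms(values, query):
    """Mimic q.op=AND: every query term must occur somewhere in the field."""
    return set(query.lower().split()) <= flattened_terms(values)


record1 = ["man's best friend", "pooch"]
record2 = ["man's worst enemy", "friend to no one"]

# Both print True: RECORD2 has "man's" in one value and "friend" in the other.
print(matches_all_terms(record1, "man's friend"))
print(matches_all_terms(record2, "man's friend"))
```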

Cheers,



Re: Matching on a multi valued field

2011-03-29 Thread Markus Jelsma
orly, all replies came in while sending =)



Re: Matching on a multi valued field

2011-03-29 Thread Juan Pablo Mora
 A multiValued field
 is actually a single field with all data separated with positionIncrement.
 Try setting that value high enough and use a PhraseQuery.


That is true but you cannot do things like:

q="bar* foo*"~10 with default query search.

and if you use dismax you will have the same problems with multivalued fields. 
Imagine the situation:

Doc1:
field A: [foo bar,dooh] 2 values

Doc2:
field A: [bar dooh, whatever] Another 2 values

the query:
qt=dismax  qf= fieldA  q = ( bar dooh )

will return both Doc1 and Doc2. The only thing you can do in this situation is 
boost phrase matches with the pf parameter, so that Doc2 appears in the first 
position of the results:

pf = fieldA^1
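
For reference, the request parameters above could be assembled like this in Python. The helper name is illustrative, and the boost of 1 simply mirrors the example; note that this only reorders results and does not filter out Doc1:

```python
from urllib.parse import urlencode


def dismax_params(query, field, phrase_boost=1):
    """Parameters for a dismax request that boosts whole-phrase matches."""
    return {
        "qt": "dismax",
        "qf": field,                        # field(s) the terms are searched in
        "q": query,
        "pf": f"{field}^{phrase_boost}",    # extra score when terms form a phrase
    }


params = dismax_params("bar dooh", "fieldA")
print(urlencode(params))
```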


Thanks,
JP.

