Re: Search only for single value of Solr multivalue field (part 2)
On Sun, Dec 16, 2018 at 05:44:30PM -0800, Erick Erickson wrote:
> No, the idea is that you have N single valued fields, one for each of
> the MV entries you have. The copyField dest would be MV, and only used
> in those cases you wanted to match across values. Not saying that's a
> great solution, or if it would even necessarily work but thought it
> worth mentioning.

OK, then my initial document with MV fields:

> "content_txt":["001 first","002 second"]

would become:

> "content1_t":"001 first"
> "content2_t":"002 second"
> "_copiedfield_":["001 first","002 second"]

And then the initial user query:

> content_txt:(first AND second)

would become:

> content1_t:(first AND second) OR content2_t:(first AND second)

Depending on the length of the initial array, each document will have a different number of contentx_t fields. This requires some management, such as a layer between the user and the parser that extends the query to the maximum possible number of contentx_t fields in the collection (with max=100 for performance reasons?).

QUESTION: is the MV limitation a *Solr parser* limitation or a *Lucene* limitation? If it is the latter, writing my own parser would be an option, wouldn't it?

--
nicolas
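For illustration, the document split and query expansion described in this message could be sketched as a thin client-side layer. This is only a sketch under assumptions: the function names `split_document` and `expand_query` are hypothetical, the catch-all field is filled client-side here (in Solr a copyField rule would normally do that), and the clause is passed through as an opaque string rather than parsed.

```python
def split_document(doc, mv_field="content_txt", prefix="content",
                   suffix="_t", catch_all="_copiedfield_"):
    """Turn one multivalued field into N single-valued fields plus a catch-all.

    Hypothetical helper: field names follow the example in the message.
    """
    values = doc[mv_field]
    # content1_t, content2_t, ... one field per original value (1-based).
    out = {f"{prefix}{i}{suffix}": value for i, value in enumerate(values, start=1)}
    # Catch-all copy, for searches that should ignore the value boundaries.
    out[catch_all] = list(values)
    return out


def expand_query(clause, max_fields, prefix="content", suffix="_t"):
    """Repeat the user's clause once per single-valued field, joined with OR."""
    return " OR ".join(
        f"{prefix}{i}{suffix}:({clause})" for i in range(1, max_fields + 1)
    )


doc = split_document({"content_txt": ["001 first", "002 second"]})
query = expand_query("first AND second", max_fields=2)
```

Here `max_fields` would have to be the largest number of values any document in the collection has, which is the management burden raised above.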
Re: Search only for single value of Solr multivalue field (part 2)
> multiple fields acts as a MV field

No, the idea is that you have N single valued fields, one for each of the MV entries you have. The copyField dest would be MV, and only used in those cases you wanted to match across values. Not saying that's a great solution, or if it would even necessarily work but thought it worth mentioning.

Best,
Erick

On Sun, Dec 16, 2018 at 1:14 PM Nicolas Paris wrote:
>
> On Sun, Dec 16, 2018 at 09:30:33AM -0800, Erick Erickson wrote:
> > Have you looked at ComplexPhraseQueryParser here?
> > https://lucene.apache.org/solr/guide/6_6/other-parsers.html
>
> Sure. However, I am using multi-word synonyms and so far the
> complexphrase does not handle them. (maybe soon ?)
>
> > Depending on how many of these you have, you could do something with
> > dynamic fields. Rather than use a single MV field, use N fields. You'd
> > probably have to copyField or some such to a catch-all field for
> > searches that you wanted to ignore the "mv nature" of the field.
>
> Problem with copyField from multiple fields acts as a MV field. So the
> problem remains: dealing with MV fields. Isn't ?
>
> Thanks
>
> --
> nicolas
Re: Search only for single value of Solr multivalue field (part 2)
On Sun, Dec 16, 2018 at 09:30:33AM -0800, Erick Erickson wrote:
> Have you looked at ComplexPhraseQueryParser here?
> https://lucene.apache.org/solr/guide/6_6/other-parsers.html

Sure. However, I am using multi-word synonyms, and so far the complexphrase parser does not handle them (maybe soon?).

> Depending on how many of these you have, you could do something with
> dynamic fields. Rather than use a single MV field, use N fields. You'd
> probably have to copyField or some such to a catch-all field for
> searches that you wanted to ignore the "mv nature" of the field.

The problem is that a copyField fed from multiple fields itself acts as an MV field. So the problem remains: dealing with MV fields, doesn't it?

Thanks

--
nicolas
Re: Search only for single value of Solr multivalue field (part 2)
Have you looked at ComplexPhraseQueryParser here?
https://lucene.apache.org/solr/guide/6_6/other-parsers.html

But no, there are no plans that I know of to include something that has the notion of searching within MV fields.

Depending on how many of these you have, you could do something with dynamic fields. Rather than use a single MV field, use N fields. You'd probably have to copyField or some such to a catch-all field for searches that you wanted to ignore the "mv nature" of the field.

I'd be nervous as the number of such fields got into the hundreds however.

Best,
Erick

On Sun, Dec 16, 2018 at 2:54 AM Nicolas Paris wrote:
>
> hi
>
> This question is highly related to a previous one found on the
> mailing-list archive [1].
>
> I have this document:
>
> "content_txt":["001 first","002 second"]
>
> I'd like the query below to return nothing:
>
> q=content_txt:(first AND second)
>
> The method proposed ([1]) by Erick works ok to look for a single value
> having BOTH first AND second by setting the field positionIncrementGap
> high enough:
>
> This query returns nothing as expected:
>
> q=content_txt:("first second"~99)
>
> However, this is based on *phrase search*. Phrase search does not allow
> the use of the simple query parser features below. That's a _HUGE_ limitation!
> - regexp
> - fuzzy
> - wildcard
> - ranges
>
> So the query below won't match the first value:
>
> q=content_txt:("[000 TO 001] first"~99)
>
> While this one does match the second and shouldn't!
>
> q=content_txt:([000 TO 001] AND "second")
>
> QUESTION:
>
> Is there a chance such a feature will be developed in a future Solr
> version? I mean something that allows multivalued fields to be
> considered independently. Would a new field attribute such as
> independentMultivalued=true be ok?
>
> Thanks,
>
> [1]: http://lucene.472066.n3.nabble.com/Search-only-for-single-value-of-Solr-multivalue-field-td4309850.html#a4309893
>
> --
> nicolas
Search only for single value of Solr multivalue field (part 2)
hi

This question is highly related to a previous one found on the mailing-list archive [1].

I have this document:

> "content_txt":["001 first","002 second"]

I'd like the query below to return nothing:

> q=content_txt:(first AND second)

The method proposed ([1]) by Erick works ok to look for a single value having BOTH first AND second by setting the field positionIncrementGap high enough:

This query returns nothing as expected:

> q=content_txt:("first second"~99)

However, this is based on *phrase search*. Phrase search does not allow the use of the simple query parser features below. That's a _HUGE_ limitation!
- regexp
- fuzzy
- wildcard
- ranges

So the query below won't match the first value:

> q=content_txt:("[000 TO 001] first"~99)

While this one does match the second and shouldn't!

> q=content_txt:([000 TO 001] AND "second")

QUESTION:

Is there a chance such a feature will be developed in a future Solr version? I mean something that allows multivalued fields to be considered independently. Would a new field attribute such as independentMultivalued=true be ok?

Thanks,

[1]: http://lucene.472066.n3.nabble.com/Search-only-for-single-value-of-Solr-multivalue-field-td4309850.html#a4309893

--
nicolas
Re: Search only for single value of Solr multivalue field
Hi Dorian,

Firstly, thanks for your response, but it does not seem to work.

Here is another example: I want to search for documents whose affiliations contain the NHM (Natural History Museum) of India. So I want to get only the document with id=2:

1
  NHM, Austria
  Annamalai Univ, India
2
  NHM, India
  IRD, FRANCE

If I implement your solution,

(NHM in affilliation OR India in affilliation) AND NOT (NHM in affilliation AND India in affilliation)

it doesn't return any document. Did I miss something in your explanation?

In the previous version of my application I had a solution for this with Oracle Full Text; it seems weird that Solr cannot provide one.

Best regards,
Léo.

On 15/12/2016 12:44, Dorian Hoxha wrote:
> You should be able to filter "(word1 in field OR word2 in field) AND
> NOT(word1 in field AND word2 in field)". Translate that into the right
> syntax.
> I don't know if lucene is smart enough to execute the filter only once (it
> should be i guess).
> Makes sense ?
>
> On Thu, Dec 15, 2016 at 12:12 PM, Leo BRUVRY-LAGADEC wrote:
>> Hi,
>>
>> I have a multivalued field in my schema called "idx_affilliation".
>>
>> IFREMER, Ctr Brest, DRO Geosci Marines, F-29280 Plouzane, France.
>> Univ Lisbon, Ctr Geofis, P-1269102 Lisbon, Portugal.
>> Univ Bretagne Occidentale, Inst Univ Europeen Mer, Lab Domaines Ocean, F-29280 Plouzane, France.
>> Total Explorat Prod Geosci Projets Nouveaux Exper, F-92078 Paris, France.
>>
>> I want to be able to do a query like: idx_affilliation:(IFREMER Portugal)
>> and not have this document returned. In other words, I do not want queries
>> to span individual values for the field.
>>
>> ---
>>
>> Here are some further examples using the document above of how I want this
>> to work:
>>
>> idx_affilliation:(IFREMER France) --> Returns it.
>> idx_affilliation:(IFREMER Plouzane) --> Returns it.
>> idx_affilliation:("Univ Bretagne Occidentale") --> Returns it.
>> idx_affilliation:("Univ Lisbon" Portugal) --> Returns it.
>> idx_affilliation:(IFREMER Portugal) --> DOES NOT RETURN IT.
>>
>> Does anyone know if it's possible to do this?
>>
>> Best regards,
>> Leo.
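A small sketch may help show why the "(A OR B) AND NOT (A AND B)" filter returns nothing in the NHM/India example. It assumes, as a simplification, that each boolean clause is evaluated against the whole document (all values of the field pooled together), which is how a term clause on a multivalued field behaves. The helper names and the tokenizer are hypothetical, not Solr APIs.

```python
import re


def tokens(value):
    """Lowercased word tokens of one field value (simplified analyzer)."""
    return set(re.findall(r"\w+", value.lower()))


def doc_has(values, term):
    """Document-level match: the term appears in ANY value of the field."""
    return any(term.lower() in tokens(v) for v in values)


def xor_filter(values, a, b):
    """(a OR b) AND NOT (a AND b), each clause evaluated per document."""
    has_a, has_b = doc_has(values, a), doc_has(values, b)
    return (has_a or has_b) and not (has_a and has_b)


def same_value(values, a, b):
    """The behaviour actually wanted: both terms inside one single value."""
    return any({a.lower(), b.lower()} <= tokens(v) for v in values)


docs = {
    1: ["NHM, Austria", "Annamalai Univ, India"],
    2: ["NHM, India", "IRD, FRANCE"],
}

xor_hits = [i for i, v in docs.items() if xor_filter(v, "NHM", "India")]
wanted_hits = [i for i, v in docs.items() if same_value(v, "NHM", "India")]
```

Under these assumptions both documents contain NHM and India somewhere, so the NOT clause excludes both and `xor_hits` is empty, while the per-value check selects only document 2.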
Re: Search only for single value of Solr multivalue field
Phrase queries and slop and positionIncrementGap ;)

The fieldType has a positionIncrementGap. This is the token delta between the end token of one entry and the beginning of the next. So in the first entry:

IFREMER, Ctr Brest, DRO Geosci Marines, F-29280 Plouzane, France

IFREMER would have a position of 1 and France would have a position of 9 or so. If the positionIncrementGap was 100, then in this entry:

Univ Lisbon, Ctr Geofis, P-1269102 Lisbon, Portugal.

Univ would have a position of 110.

Now if I search "IFREMER France"~99 it'd match the first one, but searching "IFREMER Lisbon"~99 would not match since the positions are > 99 apart.

So you configure the positionIncrementGap to be greater than the largest number of tokens you ever expect to have in a single entry.

HTH
Erick

On Thu, Dec 15, 2016 at 3:44 AM, Dorian Hoxha wrote:
> You should be able to filter "(word1 in field OR word2 in field) AND
> NOT(word1 in field AND word2 in field)". Translate that into the right
> syntax.
> I don't know if lucene is smart enough to execute the filter only once (it
> should be i guess).
> Makes sense ?
>
> On Thu, Dec 15, 2016 at 12:12 PM, Leo BRUVRY-LAGADEC wrote:
>
>> Hi,
>>
>> I have a multivalued field in my schema called "idx_affilliation".
>>
>> IFREMER, Ctr Brest, DRO Geosci Marines, F-29280 Plouzane, France.
>> Univ Lisbon, Ctr Geofis, P-1269102 Lisbon, Portugal.
>> Univ Bretagne Occidentale, Inst Univ Europeen Mer, Lab Domaines Ocean, F-29280 Plouzane, France.
>> Total Explorat Prod Geosci Projets Nouveaux Exper, F-92078 Paris, France.
>>
>> I want to be able to do a query like: idx_affilliation:(IFREMER Portugal)
>> and not have this document returned. In other words, I do not want queries
>> to span individual values for the field.
>>
>> ---
>>
>> Here are some further examples using the document above of how I want this
>> to work:
>>
>> idx_affilliation:(IFREMER France) --> Returns it.
>> idx_affilliation:(IFREMER Plouzane) --> Returns it.
>> idx_affilliation:("Univ Bretagne Occidentale") --> Returns it.
>> idx_affilliation:("Univ Lisbon" Portugal) --> Returns it.
>> idx_affilliation:(IFREMER Portugal) --> DOES NOT RETURN IT.
>>
>> Does anyone know if it's possible to do this?
>>
>> Best regards,
>> Leo.
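The position arithmetic in Erick's reply can be simulated with a short sketch. It is a simplified model, not Lucene code: positions start at 1 as in the reply (Lucene itself counts from 0), the tokenizer is a plain regex, `index_positions` and `within_slop` are hypothetical names, and the proximity test only compares the distance between two terms rather than reproducing Lucene's exact sloppy-phrase matching (which also penalizes out-of-order terms).

```python
import re


def index_positions(values, gap=100):
    """Assign a position to every token of a multivalued field.

    After each value, `gap` extra positions are skipped, mimicking
    positionIncrementGap.
    """
    positions = []
    pos = 0
    for value in values:
        for tok in re.findall(r"[\w-]+", value.lower()):
            pos += 1
            positions.append((tok, pos))
        pos += gap  # jump ahead before the next value starts
    return positions


def within_slop(positions, a, b, slop):
    """True if some occurrence of a and b has at most `slop` positions between them."""
    pa = [p for t, p in positions if t == a.lower()]
    pb = [p for t, p in positions if t == b.lower()]
    return any(abs(y - x) - 1 <= slop for x in pa for y in pb)


affiliations = [
    "IFREMER, Ctr Brest, DRO Geosci Marines, F-29280 Plouzane, France.",
    "Univ Lisbon, Ctr Geofis, P-1269102 Lisbon, Portugal.",
    "Univ Bretagne Occidentale, Inst Univ Europeen Mer, Lab Domaines Ocean, F-29280 Plouzane, France.",
    "Total Explorat Prod Geosci Projets Nouveaux Exper, F-92078 Paris, France.",
]

with_gap = index_positions(affiliations, gap=100)
no_gap = index_positions(affiliations, gap=0)

same_entry = within_slop(with_gap, "IFREMER", "France", 99)     # both in value 1
cross_entry = within_slop(with_gap, "IFREMER", "Portugal", 99)  # values 1 and 2
cross_no_gap = within_slop(no_gap, "IFREMER", "Portugal", 99)   # no gap configured
```

In this model France lands at position 9 and the first Univ at 110, matching the numbers in the reply. With the gap in place the cross-value pair is out of reach of a slop of 99, whereas with a gap of 0 the same pair falls within it.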
Re: Search only for single value of Solr multivalue field
You should be able to filter "(word1 in field OR word2 in field) AND NOT(word1 in field AND word2 in field)". Translate that into the right syntax.
I don't know if lucene is smart enough to execute the filter only once (it should be i guess).
Makes sense ?

On Thu, Dec 15, 2016 at 12:12 PM, Leo BRUVRY-LAGADEC wrote:

> Hi,
>
> I have a multivalued field in my schema called "idx_affilliation".
>
> IFREMER, Ctr Brest, DRO Geosci Marines, F-29280 Plouzane, France.
> Univ Lisbon, Ctr Geofis, P-1269102 Lisbon, Portugal.
> Univ Bretagne Occidentale, Inst Univ Europeen Mer, Lab Domaines Ocean, F-29280 Plouzane, France.
> Total Explorat Prod Geosci Projets Nouveaux Exper, F-92078 Paris, France.
>
> I want to be able to do a query like: idx_affilliation:(IFREMER Portugal)
> and not have this document returned. In other words, I do not want queries
> to span individual values for the field.
>
> ---
>
> Here are some further examples using the document above of how I want this
> to work:
>
> idx_affilliation:(IFREMER France) --> Returns it.
> idx_affilliation:(IFREMER Plouzane) --> Returns it.
> idx_affilliation:("Univ Bretagne Occidentale") --> Returns it.
> idx_affilliation:("Univ Lisbon" Portugal) --> Returns it.
> idx_affilliation:(IFREMER Portugal) --> DOES NOT RETURN IT.
>
> Does someone known if it's possible to do this ?
>
> Best regards,
> Leo.
Search only for single value of Solr multivalue field
Hi,

I have a multivalued field in my schema called "idx_affilliation".

IFREMER, Ctr Brest, DRO Geosci Marines, F-29280 Plouzane, France.
Univ Lisbon, Ctr Geofis, P-1269102 Lisbon, Portugal.
Univ Bretagne Occidentale, Inst Univ Europeen Mer, Lab Domaines Ocean, F-29280 Plouzane, France.
Total Explorat Prod Geosci Projets Nouveaux Exper, F-92078 Paris, France.

I want to be able to do a query like: idx_affilliation:(IFREMER Portugal) and not have this document returned. In other words, I do not want queries to span individual values for the field.

---

Here are some further examples using the document above of how I want this to work:

idx_affilliation:(IFREMER France) --> Returns it.
idx_affilliation:(IFREMER Plouzane) --> Returns it.
idx_affilliation:("Univ Bretagne Occidentale") --> Returns it.
idx_affilliation:("Univ Lisbon" Portugal) --> Returns it.
idx_affilliation:(IFREMER Portugal) --> DOES NOT RETURN IT.

Does anyone know if it's possible to do this?

Best regards,
Leo.