Re: docValues: Can we apply synonym
What I'm suggesting is that you have two fields, one for searching, one for faceting. You may find you can't use docValues for your field type, in which case Solr will just use caches to improve faceting performance.

Upayavira

On Sat, May 30, 2015, at 01:50 AM, Aman Tandon wrote:
Hi Upayavira, how will copyField help in my scenario, when I have to add the synonym to a docValues-enabled field?
With Regards
Aman Tandon
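The two-field layout Upayavira describes can be sketched as a Solr Schema API request body. This is only an illustration under assumptions: the field names `city` and `city_facet` and the field type `text_city` are hypothetical, and it presumes a managed schema that accepts the `add-field` and `add-copy-field` commands; with a classic schema.xml the same idea would be written as two field definitions and one copyField entry.

```python
import json


def build_schema_payload(search_field="city", facet_field="city_facet"):
    """Build a Schema API body: one analyzed search field, one raw facet copy."""
    return {
        "add-field": [
            # Searchable field. Synonyms would be configured in its
            # (hypothetical) analyzed field type, "text_city".
            {"name": search_field, "type": "text_city",
             "indexed": True, "stored": True},
            # Plain string copy used for faceting. Per the thread, docValues
            # is limited to non-analyzed types such as StrField.
            {"name": facet_field, "type": "string",
             "indexed": True, "stored": False, "docValues": True},
        ],
        # Every value sent to the search field is also copied, unanalyzed,
        # into the facet field.
        "add-copy-field": {"source": search_field, "dest": [facet_field]},
    }


payload = build_schema_payload()
print(json.dumps(payload, indent=2))
```

Searches would then go against `city` and `facet.field` would point at `city_facet`. Note that this alone does not merge alias spellings in the facet list, because the copy holds the raw value.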
Re: docValues: Can we apply synonym
Even if a little bit outdated, that query parser is really, really cool for managing synonyms! +1!

2015-05-29 1:01 GMT+01:00 Aman Tandon amantandon...@gmail.com:
Thanks Chris. Yes, we are using it to handle the multi-word synonym problem.
With Regards
Aman Tandon
Re: docValues: Can we apply synonym
Do take time for performance testing with that parser. It can be slow depending on your data, as I remember. That said, it solves the problem it set out to solve, so if it meets your SLAs it can be a life-saver.

Best,
Erick

On Fri, May 29, 2015 at 2:35 AM, Alessandro Benedetti benedetti.ale...@gmail.com wrote:
Even if a little bit outdated, that query parser is really, really cool for managing synonyms! +1!
Re: docValues: Can we apply synonym
Hi Upayavira,

How will copyField help in my scenario, when I have to add the synonym to a docValues-enabled field?

With Regards
Aman Tandon

On Sat, May 30, 2015 at 1:18 AM, Upayavira u...@odoko.co.uk wrote:
Use copyField to clone the field for faceting purposes.
Upayavira
Re: docValues: Can we apply synonym
Hi Erick,

Thanks for the suggestion. We are using this query parser plugin (*SynonymExpandingExtendedDismaxQParserPlugin*) to manage multi-word synonyms. So it works slower than edismax, and that's why it is not in contrib, right? (I am asking because we use it for all our searches, to handle 10 multi-word synonyms such as ice cube, icecube, etc.)

*Moreover, I thought of a solution for this docValues problem.*

I need to make the city field *multivalued*; by this I mean I will add the synonym (*mumbai, bombay*) as an extra value of that field, if present. Searching will then work as before.

*<field name="city">mumbai</field><field name="city">bombay</field>*

The only problem is that we have to remove the city alias/synonym facets when we provide results to the clients:

*mumbai, 1000*

With Regards
Aman Tandon

On Fri, May 29, 2015 at 7:26 PM, Erick Erickson erickerick...@gmail.com wrote:
Do take time for performance testing with that parser. It can be slow depending on your data, as I remember. That said, it solves the problem it set out to solve, so if it meets your SLAs it can be a life-saver.
Best,
Erick
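The clean-up step Aman mentions for the multivalued approach (hiding alias facets before results reach the client) could look roughly like the sketch below. It assumes the facet counts arrive in Solr's default flat JSON layout (`[value, count, value, count, ...]`) and that every document carrying an alias also carries the canonical name, so the canonical count already covers all matching documents. The alias table and function names are hypothetical.

```python
# Hypothetical alias table: alias spelling -> canonical city name.
CITY_ALIASES = {"bombay": "mumbai"}


def pairs_from_flat(flat):
    """Turn a flat [value, count, value, count, ...] list into (value, count) pairs."""
    return list(zip(flat[0::2], flat[1::2]))


def drop_alias_facets(flat_counts, aliases=CITY_ALIASES):
    """Return (value, count) pairs with alias spellings filtered out."""
    return [(value, count)
            for value, count in pairs_from_flat(flat_counts)
            if value.lower() not in aliases]


raw = ["mumbai", 1000, "bombay", 1000, "delhi", 400]
print(drop_alias_facets(raw))
```

This runs in the application layer, after Solr has answered, so the docValues field itself stays untouched. The cost is that the alias table has to be kept in sync with whatever adds the extra values at index time.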
Re: docValues: Can we apply synonym
Use copyField to clone the field for faceting purposes.

Upayavira

On Fri, May 29, 2015, at 08:06 PM, Aman Tandon wrote:
[...]
*Moreover, I thought of a solution for this docValues problem.*
I need to make the city field *multivalued*; by this I mean I will add the synonym (*mumbai, bombay*) as an extra value of that field, if present. Searching will then work as before.
*<field name="city">mumbai</field><field name="city">bombay</field>*
The only problem is that we have to remove the city alias/synonym facets when we provide results to the clients:
*mumbai, 1000*
With Regards
Aman Tandon
RE: docValues: Can we apply synonym
Again, I would recommend using Nolan Lawson's SynonymExpandingExtendedDismaxQParserPlugin.

http://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/

-----Original Message-----
From: Aman Tandon [mailto:amantandon...@gmail.com]
Sent: Wednesday, May 27, 2015 6:42 PM
To: solr-user@lucene.apache.org
Subject: Re: docValues: Can we apply synonym

Ok, and which synonym processor are you talking about? Maybe it could help.

With Regards
Aman Tandon

On Thu, May 28, 2015 at 4:01 AM, Reitzel, Charles charles.reit...@tiaa-cref.org wrote:

Sorry, my bad. The synonym processor I mention works differently. It's an extension of the EDisMax query processor and doesn't require field-level synonym configs.

-----Original Message-----
From: Reitzel, Charles [mailto:charles.reit...@tiaa-cref.org]
Sent: Wednesday, May 27, 2015 6:12 PM
To: solr-user@lucene.apache.org
Subject: RE: docValues: Can we apply synonym

But the query analysis isn't on a specific field, it is applied to the query string.

-----Original Message-----
From: Aman Tandon [mailto:amantandon...@gmail.com]
Sent: Wednesday, May 27, 2015 6:08 PM
To: solr-user@lucene.apache.org
Subject: Re: docValues: Can we apply synonym

Hi Charles,

The problem here is that docValues works only with primitive data types like String, int, etc. So how could we apply synonyms to a primitive data type?

With Regards
Aman Tandon

On Thu, May 28, 2015 at 3:19 AM, Reitzel, Charles charles.reit...@tiaa-cref.org wrote:

Is there any reason you cannot apply the synonyms at query time? Applying synonyms at indexing time has problems, e.g. polluting the term frequency for synonyms added, preventing distance queries, ...

Since city names often have multiple terms, e.g. New York, Den Hague, etc., I would recommend using Nolan Lawson's SynonymExpandingExtendedDismaxQParserPlugin. Tastes great, less filling.

http://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/

We found this to fix synonyms like ny for New York and vice versa. Haven't tried it with docValues, tho.

-----Original Message-----
From: Aman Tandon [mailto:amantandon...@gmail.com]
Sent: Tuesday, May 26, 2015 11:15 PM
To: solr-user@lucene.apache.org
Subject: Re: docValues: Can we apply synonym

Yes it could be :) Anyway thanks for helping.

With Regards
Aman Tandon

On Tue, May 26, 2015 at 10:22 PM, Alessandro Benedetti benedetti.ale...@gmail.com wrote:

I should investigate that, as synonyms are usually handled at the analysis stage. A simple way is to replace the word with all its synonyms (including the original word), but simply using this kind of processor will change the token positions and offsets, modifying the actual content of the document. "I am from Bombay" will become "I am from Bombay Mumbai", which can be annoying. So a clever approach must be investigated.

2015-05-26 17:36 GMT+01:00 Aman Tandon amantandon...@gmail.com:

Okay. So how could I do it with UpdateProcessors?

With Regards
Aman Tandon

On Tue, May 26, 2015 at 10:00 PM, Alessandro Benedetti benedetti.ale...@gmail.com wrote:

mmm this is different! Without any customisation, right now you could:
- use docValues to provide exact value facets.
- Then you can use a copy field, with the proper analysis, to search when a user clicks on a filter!

So you will see in your facets:
Mumbai(3)
Bombay(2)
And when clicking you see 5 results. A little bit misleading for the users …

On the other hand, if you want to apply the synonyms before the indexing pipeline (because a docValues field cannot be analysed), I think you should play with UpdateProcessors.

Cheers

2015-05-26 17:18 GMT+01:00 Aman Tandon amantandon...@gmail.com:

We are interested in using docValues for better memory utilization and speed. Currently we are faceting the search results on *city*. In city we have also added the synonym for cities like mumbai, bombay (these are Indian cities), so that results for mumbai are also eligible when somebody applies a filter of bombay to the search results. I need this functionality to work with a docValues-enabled field.

With Regards
Aman Tandon

On Tue, May 26, 2015 at 9:19 PM, Alessandro Benedetti benedetti.ale...@gmail.com wrote:

I checked in the Documentation to be sure, but apparently:

DocValues are only available for specific field types. The types chosen determine the underlying Lucene docValue type that will be used. The available Solr field types are:
- StrField and UUIDField.
- If the field is single-valued (i.e., multi-valued is false), Lucene
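Alessandro's UpdateProcessors suggestion amounts to rewriting the value before it reaches the docValues field, since that field cannot be analysed. The sketch below models only that logic in plain Python; it is not Solr's UpdateRequestProcessor API, and the alias table, the `city_facet` field name and the helper names are all hypothetical. Unlike the free-text case he warns about, a single-token city value can simply be replaced by its canonical form, so no positions or offsets are disturbed.

```python
from collections import Counter

# Hypothetical alias table: alias spelling -> canonical city name.
CANONICAL_CITY = {"bombay": "mumbai"}


def canonical_city(value):
    """Lowercase and trim a city name, then map known aliases to one spelling."""
    key = value.strip().lower()
    return CANONICAL_CITY.get(key, key)


def preprocess(doc):
    """Return a copy of the document with a canonical facet value added."""
    out = dict(doc)
    out["city_facet"] = canonical_city(doc["city"])
    return out


cities = ["Mumbai", "Bombay", "Mumbai", "Bombay", "Mumbai"]
docs = [{"id": i, "city": city} for i, city in enumerate(cities)]
indexed = [preprocess(d) for d in docs]

# Facet counts over the canonical field: one merged bucket instead of two.
facets = Counter(d["city_facet"] for d in indexed)

# A filter click on either spelling is normalised the same way before querying.
hits = [d for d in indexed if d["city_facet"] == canonical_city("Bombay")]

print(facets, len(hits))
```

With the five sample documents from the thread (three Mumbai, two Bombay), the facet list shows a single bucket of five and a filter on either spelling returns the same five documents, which avoids the misleading Mumbai(3) / Bombay(2) display. The lowercasing here is a simplification; a real setup would probably keep a display form for the canonical name.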
Re: docValues: Can we apply synonym
Thanks Chris. Yes, we are using it to handle the multi-word synonym problem.
With Regards
Aman Tandon

On Fri, May 29, 2015 at 12:38 AM, Reitzel, Charles charles.reit...@tiaa-cref.org wrote:
Again, I would recommend using Nolan Lawson's SynonymExpandingExtendedDismaxQParserPlugin. http://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/
Re: docValues: Can we apply synonym
Ok, and which synonym processor are you talking about? Maybe it could help.
With Regards
Aman Tandon
RE: docValues: Can we apply synonym
Is there any reason you cannot apply the synonyms at query time? Applying synonyms at indexing time has problems, e.g. polluting the term frequency for synonyms added, preventing distance queries, ... Since city names often have multiple terms, e.g. New York, Den Hague, etc., I would recommend using Nolan Lawson's SynonymExpandingExtendedDismaxQParserPlugin. Tastes great, less filling. http://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/ We found this to fix synonyms like ny for New York and vice versa. Haven't tried it with docValues, tho.
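One way to picture the query-time idea for an unanalysed docValues field is to expand the value a user picks into its synonym group before building the filter query. The sketch below is only an illustration in Python: the `CITY_SYNONYMS` table and the helper names are hypothetical, the field name `city` comes from this thread, and it does not use SynonymExpandingExtendedDismaxQParserPlugin itself. It also assumes the field values were indexed in lowercase.

```python
# Hypothetical synonym groups (an assumption for illustration);
# in practice they might be loaded from a synonyms file.
CITY_SYNONYMS = [
    {"mumbai", "bombay"},
    {"new york", "ny"},
]


def expand_city(value):
    """Return the sorted synonym group containing value, or just [value]."""
    key = value.strip().lower()  # assumes lowercase values in the index
    for group in CITY_SYNONYMS:
        if key in group:
            return sorted(group)
    return [key]


def build_city_fq(value, field="city"):
    """Build a filter query string that matches any synonym of value."""
    quoted = ['"{}"'.format(t.replace('"', '\\"')) for t in expand_city(value)]
    return "{}:({})".format(field, " OR ".join(quoted))


print(build_city_fq("Bombay"))
```

With this approach the docValues field keeps its exact values, so facet counts would still show the aliases separately; only the filter is widened.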
RE: docValues: Can we apply synonym
But the query analysis isn't on a specific field, it is applied to the query string.
Re: docValues: Can we apply synonym
Hi Charles,
The problem here is that docValues work only with primitive data types such as String, int, etc. So how could we apply synonyms to a primitive data type?
With Regards
Aman Tandon
RE: docValues: Can we apply synonym
Sorry, my bad. The synonym processor I mention works differently. It's an extension of the EDisMax query processor and doesn't require field level synonym configs.
Re: docValues: Can we apply synonym
Yes it could be :) Anyway, thanks for helping.
With Regards
Aman Tandon

On Tue, May 26, 2015 at 10:22 PM, Alessandro Benedetti benedetti.ale...@gmail.com wrote:
I should investigate that, as synonyms are usually applied at the analysis stage. A simple way is to replace the word with all its synonyms (including the original word), but simply using this kind of processor will change the token positions and offsets, modifying the actual content of the document. "I am from Bombay" will become "I am from Bombay Mumbai", which can be annoying. So a clever approach must be investigated.
Re: docValues: Can we apply synonym
To my understanding, docValues are just an uninverted index. That is, it contains the terms that are generated at the end of an analysis chain. Therefore, you simply enable docValues and include the SynonymFilterFactory in your analysis. Is that enough, or are you struggling with some other issue?
Upayavira

On Tue, May 26, 2015, at 12:03 PM, Aman Tandon wrote:
Hi, We have some field *city* in which the docValues are enabled. We need to add the synonym in that field so how could we do it?
With Regards
Aman Tandon
Re: docValues: Can we apply synonym
mmm this is different! Without any customisation, right now you could:
- use docValues to provide exact value facets.
- Then you can use a copy field, with the proper analysis, to search when a user clicks on a filter!
So you will see in your facets: Mumbai(3) Bombay(2). And when clicking you see 5 results. A little bit misleading for the users …
On the other hand, if you want to apply the synonyms earlier, in the indexing pipeline (because a docValues field cannot be analysed), I think you should play with UpdateProcessors.
Cheers

2015-05-26 17:18 GMT+01:00 Aman Tandon amantandon...@gmail.com:
We are interested in using docValues for better memory utilization and speed. Currently we are faceting the search results on *city*. In city we have also added the synonyms for cities like mumbai, bombay (these are Indian cities), so that results for mumbai are also eligible when somebody applies the filter bombay on the search results. I need this functionality to work with a docValues-enabled field.
With Regards
Aman Tandon

--
Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti
Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?
William Blake - Songs of Experience -1794 England
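The UpdateProcessor idea boils down to rewriting the city value before it reaches the docValues field, so that aliases collapse into one facet bucket. The following is a rough Python sketch of just that normalisation step, not a Solr UpdateRequestProcessor; the `CANONICAL_CITY` table and function names are hypothetical, and the five sample documents mirror the Mumbai(3) Bombay(2) example.

```python
from collections import Counter

# Hypothetical alias -> canonical name table (an assumption, not from the thread).
CANONICAL_CITY = {"bombay": "mumbai"}


def canonical_city(value):
    """Map a city alias to its canonical form; other values pass through lowercased."""
    key = value.strip().lower()
    return CANONICAL_CITY.get(key, key)


def normalise_doc(doc):
    """Return a copy of doc whose 'city' field holds the canonical name."""
    out = dict(doc)
    if "city" in out:
        out["city"] = canonical_city(out["city"])
    return out


# Three documents tagged Mumbai and two tagged Bombay, as in the facet example.
docs = [{"id": i, "city": c} for i, c in enumerate(
    ["Mumbai", "Mumbai", "Mumbai", "Bombay", "Bombay"])]

# Counting the normalised values imitates what faceting would then report.
facets = Counter(normalise_doc(d)["city"] for d in docs)
print(facets)
```

Normalising at index time trades away the original spelling in the facet field, so the facet list and the click-through result count agree; a copy of the raw value could be kept in a separate field if it is still needed.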
Re: docValues: Can we apply synonym
Okay, so how could I do it with UpdateProcessors?

With Regards
Aman Tandon

On Tue, May 26, 2015 at 10:00 PM, Alessandro Benedetti benedetti.ale...@gmail.com wrote:

Mmm, this is different! Without any customisation, right now you could:

- use docValues to provide exact-value facets;
- then use a copy field, with the proper analysis, to search when a user clicks on a filter.

So you will see in your facets:

Mumbai(3)
Bombay(2)

And when clicking, you see 5 results. A little bit misleading for the users …

On the other hand, if you want to apply the synonyms before the indexing pipeline (because a docValues field cannot be analysed), I think you should play with UpdateProcessors.

Cheers

2015-05-26 17:18 GMT+01:00 Aman Tandon amantandon...@gmail.com:

We are interested in using docValues for better memory utilization and speed. Currently we are faceting the search results on *city*. In city we have also added synonyms for cities like mumbai, bombay (These are Indian cities), so that results for mumbai are also eligible when somebody applies a filter of bombay on the search results. I need this functionality to work with a docValues-enabled field.

With Regards
Aman Tandon

On Tue, May 26, 2015 at 9:19 PM, Alessandro Benedetti benedetti.ale...@gmail.com wrote:

I checked in the documentation to be sure, but apparently:

DocValues are only available for specific field types. The types chosen determine the underlying Lucene docValue type that will be used. The available Solr field types are:

- StrField and UUIDField.
  - If the field is single-valued (i.e., multi-valued is false), Lucene will use the SORTED type.
  - If the field is multi-valued, Lucene will use the SORTED_SET type.
- Any Trie* numeric fields and EnumField.
  - If the field is single-valued (i.e., multi-valued is false), Lucene will use the NUMERIC type.
  - If the field is multi-valued, Lucene will use the SORTED_SET type.

This means you should not analyse a field where DocValues is enabled. Can you explain your use case to us? Why are you interested in synonyms at the DocValues level?

Cheers

2015-05-26 13:32 GMT+01:00 Upayavira u...@odoko.co.uk:

To my understanding, docValues are just an uninverted index. That is, it contains the terms that are generated at the end of an analysis chain. Therefore, you simply enable docValues and include the SynonymFilterFactory in your analysis. Is that enough, or are you struggling with some other issue?

Upayavira

On Tue, May 26, 2015, at 12:03 PM, Aman Tandon wrote:

Hi, We have a field *city* in which docValues are enabled. We need to add synonyms to that field, so how could we do it?

With Regards
Aman Tandon

--
Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?

William Blake - Songs of Experience - 1794 England
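To make the "misleading facets" point above concrete, here is a minimal Python sketch, not Solr code. It assumes a handful of made-up documents and a hypothetical synonym table (SYNONYMS); the function names are illustrative only. It mimics exact-value faceting on the raw stored strings alongside a filter that runs against a lowercased, synonym-expanded copy of the same field.

```python
from collections import Counter

# Hypothetical synonym group; in Solr this would come from a synonyms file
# applied in the analysis chain of the copy field.
SYNONYMS = {
    "mumbai": {"mumbai", "bombay"},
    "bombay": {"mumbai", "bombay"},
}

# Made-up raw city values, as they would sit in the un-analysed facet field.
docs = ["Mumbai", "Mumbai", "Mumbai", "Bombay", "Bombay", "Delhi"]


def facet_counts(values):
    """Exact-value facets: every distinct stored string is its own bucket."""
    return Counter(values)


def filter_by_city(values, clicked):
    """Filter against the analysed copy: lowercase, then expand synonyms."""
    key = clicked.lower()
    wanted = SYNONYMS.get(key, {key})
    return [v for v in values if v.lower() in wanted]


counts = facet_counts(docs)
hits = filter_by_city(docs, "Bombay")
print(counts["Mumbai"], counts["Bombay"], len(hits))  # 3 2 5
```

The facet list shows Mumbai(3) and Bombay(2) separately, yet clicking either one returns all 5 documents, which is the mismatch described in the message.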
Re: docValues: Can we apply synonym
I checked in the documentation to be sure, but apparently:

DocValues are only available for specific field types. The types chosen determine the underlying Lucene docValue type that will be used. The available Solr field types are:

- StrField and UUIDField.
  - If the field is single-valued (i.e., multi-valued is false), Lucene will use the SORTED type.
  - If the field is multi-valued, Lucene will use the SORTED_SET type.
- Any Trie* numeric fields and EnumField.
  - If the field is single-valued (i.e., multi-valued is false), Lucene will use the NUMERIC type.
  - If the field is multi-valued, Lucene will use the SORTED_SET type.

This means you should not analyse a field where DocValues is enabled. Can you explain your use case to us? Why are you interested in synonyms at the DocValues level?

Cheers
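The rules quoted from the documentation can be written down as a small lookup. This is only a Python sketch of the list as quoted in this message; lucene_docvalues_type is a hypothetical helper name, not a Solr or Lucene API, and other Solr versions may support more field types.

```python
def lucene_docvalues_type(field_type: str, multi_valued: bool) -> str:
    """Return the Lucene docValues type for a Solr field type, per the quoted list."""
    if field_type in {"StrField", "UUIDField"}:
        return "SORTED_SET" if multi_valued else "SORTED"
    # "Any Trie* numeric fields and EnumField"
    if field_type.startswith("Trie") or field_type == "EnumField":
        return "SORTED_SET" if multi_valued else "NUMERIC"
    # Anything else, e.g. an analysed TextField, is not on the quoted list.
    raise ValueError(f"docValues not listed as available for {field_type}")


print(lucene_docvalues_type("StrField", multi_valued=False))
print(lucene_docvalues_type("TrieIntField", multi_valued=True))
```

An analysed text type such as TextField falls through to the error branch, which is the reason given above for not putting a synonym filter directly on the docValues field.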
Re: docValues: Can we apply synonym
I should investigate that, as synonyms are usually applied at the analysis stage. A simple way is to replace the word with all its synonyms (including the original word), but simply using this kind of processor will change the token positions and offsets, modifying the actual content of the document. "I am from Bombay" will become "I am from Bombay Mumbai", which can be annoying. So a clever approach must be investigated.
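A rough Python sketch of the concern above, outside of Solr. expand_naively imitates the "replace the word with all its synonyms" idea and shows how the stored text changes. canonicalize_city is a different technique that the thread does not propose: it is an assumption, shown only as one possible direction for a single-value field like city, where each alias is mapped to one canonical name before indexing. Both tables and both function names are hypothetical.

```python
# Hypothetical synonym table: each word maps to the extra words to add.
SYNONYMS = {"bombay": ["Mumbai"], "mumbai": ["Bombay"]}

# Hypothetical alias table: every alias maps to one canonical city name.
CANONICAL = {"bombay": "Mumbai", "mumbai": "Mumbai"}


def expand_naively(text):
    """Append every synonym right after the matching word (changes content)."""
    out = []
    for word in text.split():
        out.append(word)
        out.extend(SYNONYMS.get(word.lower(), []))
    return " ".join(out)


def canonicalize_city(value):
    """Map a whole field value to its canonical name; unknown values pass through."""
    cleaned = value.strip()
    return CANONICAL.get(cleaned.lower(), cleaned)


print(expand_naively("I am from Bombay"))  # I am from Bombay Mumbai
print(canonicalize_city("bombay"))         # Mumbai
```

With canonical values in the facet field there would be a single Mumbai bucket, but the original spelling is no longer what gets faceted, so whether that trade-off is acceptable depends on the use case.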
Re: docValues: Can we apply synonym
We are interested in using docValues for better memory utilization and speed. Currently we are faceting the search results on *city*. In city we have also added synonyms for cities like mumbai, bombay (These are Indian cities), so that results for mumbai are also eligible when somebody applies a filter of bombay on the search results. I need this functionality to work with a docValues-enabled field.

With Regards
Aman Tandon