Re: custom sorting of search result
Recent versions of Solr have collapsing and expanding plugins, reranking plugins, and post-filters. Some combination of these might be relevant. And, of course, there is always Carrot2 clustering.

Regards,
Alex.
custom sorting of search result
Hello,

We need to order Solr search results according to specific rules. I will explain with an example. Let's say Solr returns 1000 results for the query "sport". These results must be divided into three buckets according to rules that come from a database. Then one doc must be taken from each bucket in turn and appended to the results until all buckets are empty.

One approach was to modify or override the Solr code where it gets the results, sorts them, and returns #rows elements. However, from the scoreAll function in Weight.java we see that docs carry only the internal document id and nothing else. We need the unique Solr document id in order to match documents with the custom scoring. We also see that Lucene code hands those doc ids to the scoreAll function. For now we do not want to modify Lucene code and would prefer to solve this as a Solr plugin.

Any ideas are welcome.

Thanks.
Alex.
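The bucket interleaving described above can be sketched in plain Java. This is only an illustration of the ordering itself, assuming the ranked results have already been fetched with their unique ids (for example in client code or a custom component); it does not use any Solr or Lucene API. The class name BucketInterleaver and the bucketOf rule are hypothetical stand-ins for the rules that come from the database.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;
import java.util.function.ToIntFunction;

/** Illustrative helper (hypothetical name): interleaves ranked results across rule-defined buckets. */
final class BucketInterleaver {

    /**
     * Splits the ranked results into bucketCount buckets using bucketOf, then
     * takes one item from each bucket in turn until all buckets are empty.
     * Order inside a bucket follows the incoming ranking.
     */
    static <T> List<T> interleave(List<T> ranked, int bucketCount, ToIntFunction<T> bucketOf) {
        List<Deque<T>> buckets = new ArrayList<>();
        for (int i = 0; i < bucketCount; i++) {
            buckets.add(new ArrayDeque<>());
        }
        for (T item : ranked) {
            buckets.get(bucketOf.applyAsInt(item)).addLast(item);
        }
        List<T> out = new ArrayList<>(ranked.size());
        // Each pass takes at most one item per bucket, so the loop ends
        // once every bucket has been drained.
        while (out.size() < ranked.size()) {
            for (Deque<T> bucket : buckets) {
                if (!bucket.isEmpty()) {
                    out.add(bucket.pollFirst());
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Stand-in bucket rule: the first letter (a, b, c) selects bucket 0, 1, 2.
        List<String> ranked = Arrays.asList("a1", "a2", "b1", "a3", "c1", "b2");
        System.out.println(interleave(ranked, 3, s -> s.charAt(0) - 'a'));
    }
}
```

Because the sketch works on whatever identifier the caller passes in, it sidesteps the internal-doc-id problem only if the unique id is available at the point where it runs.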
Boost function for custom sorting.
Hi,

I have some records which include a source_id field (an integer) and a datetime field. I want the records ordered so that adjacent records do not have the same source id. The ordering should perform a kind of round robin over the records, with source_id as the key, and within each source the records should be sorted by date.

For example: [1,1,1,1,1,2,2,2,2,2,2,3,3,3,3,4] after the sort should give [1,2,3,4,1,2,3,1,2,3,1,2,3,1,2,2], and these should be sorted by datetime, meaning the first '1' should be the latest '1' and the last '1' should be the oldest '1'.

I was wondering whether it is possible to write a boost function for such a requirement.

Thanks in advance!
Sai
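The requested order can be sketched as a post-processing step in Java. This is an assumption about where the logic could live, not a boost function: a function query computes a value per document, so it is unclear how it could see a document's neighbours. The class SourceRoundRobin and the Rec type are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative round robin over source ids, newest record first within each source. */
final class SourceRoundRobin {

    /** Minimal record stand-in: a source id plus an epoch-millis timestamp. */
    static final class Rec {
        final int sourceId;
        final long timestamp;

        Rec(int sourceId, long timestamp) {
            this.sourceId = sourceId;
            this.timestamp = timestamp;
        }
    }

    static List<Rec> roundRobin(List<Rec> records) {
        // Newest first, so each source's queue starts with its latest record.
        List<Rec> sorted = new ArrayList<>(records);
        sorted.sort(Comparator.comparingLong((Rec r) -> r.timestamp).reversed());

        // TreeMap visits the sources in ascending id order on every round.
        Map<Integer, Deque<Rec>> bySource = new TreeMap<>();
        for (Rec r : sorted) {
            bySource.computeIfAbsent(r.sourceId, k -> new ArrayDeque<>()).addLast(r);
        }

        List<Rec> out = new ArrayList<>(sorted.size());
        while (out.size() < sorted.size()) {
            for (Deque<Rec> queue : bySource.values()) {
                if (!queue.isEmpty()) {
                    out.add(queue.pollFirst());
                }
            }
        }
        return out;
    }
}
```

Note that once a single source is left, its remaining records are necessarily adjacent, as in the trailing "2,2" of the example.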
Custom sorting on facets
Hi,

I am using facets for suggestions. By default, facet sort is based only on index order and count. I now have a requirement that, based on a value in the Solr doc, some suggestions must appear at the top and the others after them.

Example:

<doc> <id>doc1</id> <name>ProductInstance</name> <machineId>hydraulic</machineId> </doc>
<doc> <id>doc2</id> <name>ProductInstance</name> <machineId>other hydraulic</machineId> </doc>
<doc> <id>doc3</id> <name>Product</name> <machineId>test hydraulic</machineId> </doc>
<doc> <id>doc4</id> <name>Product</name> <machineId>other test hydraulic</machineId> </doc>

These 4 Solr documents will have many more fields. For suggestions I want to create a facet on machineId, but I want to sort it based on the name field. Say I query for hy*: I should then get facet values from all 4 docs, but sorted by the name field.

PS: I can't use any other type of suggester, as I need to display the whole machineId text as a suggestion and not a single word.

Many Thanks,
Abhinav
Autosuggest - Custom sorting
Is there a way to sort the returned Autosuggest list based on a particular value (e.g. score)? I am trying to sort the returned suggestions based on a field that has been calculated manually, but I am not sure how to use that field for sorting suggestions.
Custom sorting of Solr Results
Dear Experts,

I have a requirement to rank exact matches first and apply alphabetical sorting thereafter. To illustrate, suppose there are 5 documents as below:

Doc 1 title: trees
Doc 2 title: plum trees
Doc 3 title: Money Trees (Legendary Trees)
Doc 4 title: Cork Trees
Doc 5 title: Old Trees

Then, if the user searches with the query term 'trees', the results should be in the following order:

Doc 1 trees - Highest Rank
Doc 4 Cork Trees - Alphabetical afterwards..
Doc 3 Money Trees (Legendary Trees)
Doc 5 Old Trees
Doc 2 plum trees

I can achieve the alphabetical sorting by adding the title sort parameter. However, Solr relevancy is higher for Doc 3 (because it matches in 2 terms), so Solr arranges Doc 3 above Docs 4, 5 and 2. The result looks like:

Doc 1 trees - Highest Rank
Doc 3 Money Trees (Legendary Trees)
Doc 4 Cork Trees - Alphabetical afterwards..
Doc 5 Old Trees
Doc 2 plum trees

Can you tell me an easy way to achieve this requirement please?

I'm using Solr 4.0 and the title field is defined as follows:

<fieldType name="text_wc" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" stemEnglishPossessive="0" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="1" splitOnCaseChange="1" splitOnNumerics="0" preserveOriginal="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" stemEnglishPossessive="0" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="1" splitOnCaseChange="1" splitOnNumerics="0" preserveOriginal="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

Many thanks in advance,
Sandeep
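The desired order (exact match first, then alphabetical ignoring case) can be pinned down with a small Java comparator. This is a sketch of the ordering rule only, applied to titles already in memory; it is not a Solr sort parameter, and the class name ExactMatchFirst is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Illustrative ordering: exact title matches first, everything else alphabetical. */
final class ExactMatchFirst {

    /** Exact (case-insensitive) matches of the query sort first; the rest sort alphabetically, ignoring case. */
    static Comparator<String> comparatorFor(String query) {
        return Comparator
                // Key is false for an exact match, and false sorts before true.
                .comparing((String title) -> !title.equalsIgnoreCase(query))
                .thenComparing(String.CASE_INSENSITIVE_ORDER);
    }

    public static void main(String[] args) {
        List<String> titles = new ArrayList<>(Arrays.asList(
                "trees", "plum trees", "Money Trees (Legendary Trees)", "Cork Trees", "Old Trees"));
        titles.sort(comparatorFor("trees"));
        titles.forEach(System.out::println);
    }
}
```

In Solr terms this suggests a two-level sort (an exact-match signal first, a case-insensitive title second) rather than sorting by relevancy score, since score is what lifts Doc 3 above Doc 4.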
Re: How to do custom sorting in Solr?
You may want to look at http://sujitpal.blogspot.com/2011/05/custom-sorting-in-solr-using-external.html. While it is not the same requirement, it should give you an idea of how to do custom sorting.

Thanks
Afroz

On Sun, Jun 10, 2012 at 4:43 PM, roz dev rozde...@gmail.com wrote:

Yes, these documents have lots of unique values, as the same product could be assigned to lots of other categories, and in a different sort order in each. We evaluated heap usage and found that with the kind of queries we generate, heap usage went up to 24-26 GB. I could trace it to the fact that fieldCache creates an array of size 2M for each of the sort fields. Since the same products are mapped to multiple categories, we incur significant memory overhead. Therefore, any solution that reduces memory consumption is a good one for me.

In fact, we have situations where the same product is mapped to more than one sub-category in the same category, like:

Books
-- Programming
- Java in a nutshell
-- Sale (40% off)
- Java in a nutshell

So another thought is to somehow use a second-pass collector to group books appropriately in the Programming and Sale categories, with the right sort order. But I have no clue about that piece :(

-Saroj
Re: How to do custom sorting in Solr?
Hi All,

I have an index which contains a catalog of Products and Categories, with Solr 4.0 from trunk.

Data is organized like this:

Category: Books
  Sub Category: Programming
    Products:
      Product # 1, Price: Regular, Sort Order: 1
      Product # 2, Price: Markdown, Sort Order: 2
      Product # 3, Price: Regular, Sort Order: 3
      Product # 4, Price: Regular, Sort Order: 4
      ...
      Product # 100, Price: Regular, Sort Order: 100
  Sub Category: Fiction
    Products:
      Product # 1, Price: Markdown, Sort Order: 1
      Product # 2, Price: Regular, Sort Order: 2
      Product # 3, Price: Regular, Sort Order: 3
      Product # 4, Price: Markdown, Sort Order: 4
      ...
      Product # 70, Price: Regular, Sort Order: 70

I want to query Solr and sort these products within each sub-category in such a way that products on markdown are at the bottom of the document list, and products at regular price are sorted as per their sort order in their sub-category.

Expected results are:

Category: Books
  Sub Category: Programming
    Products:
      Product # 1, Price: Regular, Sort Order: 1
      Product # 2, Price: Markdown, Sort Order: 101
      Product # 3, Price: Regular, Sort Order: 3
      Product # 4, Price: Regular, Sort Order: 4
      ...
      Product # 100, Price: Regular, Sort Order: 100
  Sub Category: Fiction
    Products:
      Product # 1, Price: Markdown, Sort Order: 71
      Product # 2, Price: Regular, Sort Order: 2
      Product # 3, Price: Regular, Sort Order: 3
      Product # 4, Price: Markdown, Sort Order: 71
      ...
      Product # 70, Price: Regular, Sort Order: 70

My query is like this: q=*:*&fq=category:Books

What are the options to implement custom sorting and how do I do it?

- Define a Custom Function query?
- Define a Custom Comparator? Or,
- Define a Custom Collector?

Please let me know the best way to go about it, and any pointers for customizing Solr 4.

Thanks
Saroj
Re: How to do custom sorting in Solr?
Skimming this, two options come to mind:

1. Simply apply primary, secondary, etc. sorts. Something like sort=subcategory asc,markdown_or_regular desc,sort_order asc

2. You could also use grouping to arrange things in groups and sort within those groups. This has the advantage of returning some members of each of the top N groups in the result set, which makes it easier to get some of each group rather than having to analyze the whole list.

But your example is somewhat contradictory. You say "products which are on markdown, are at the bottom of the documents list", but in your examples, products on markdown are intermingled.

Best
Erick
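The multi-level sort suggested in this thread (sort=subcategory asc,markdown_or_regular desc,sort_order asc) corresponds to a chained comparator. The sketch below shows the equivalent ordering on in-memory objects, assuming a boolean markdown flag in place of the markdown_or_regular field; the Product class and its field names are hypothetical and not part of any Solr API.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Illustrative in-memory equivalent of a three-level sort. */
final class CatalogSort {

    /** Hypothetical stand-in for an indexed product document. */
    static final class Product {
        final String name;
        final String subcategory;
        final boolean markdown;
        final int sortOrder;

        Product(String name, String subcategory, boolean markdown, int sortOrder) {
            this.name = name;
            this.subcategory = subcategory;
            this.markdown = markdown;
            this.sortOrder = sortOrder;
        }
    }

    /** Subcategory ascending, regular before markdown, then sort order ascending. */
    static final Comparator<Product> ORDER = Comparator
            .comparing((Product p) -> p.subcategory)
            .thenComparing((Product p) -> p.markdown) // false (regular) sorts before true (markdown)
            .thenComparingInt(p -> p.sortOrder);

    /** Returns the product names in sorted order without modifying the input list. */
    static List<String> sortedNames(List<Product> products) {
        List<Product> copy = new ArrayList<>(products);
        copy.sort(ORDER);
        List<String> names = new ArrayList<>();
        for (Product p : copy) {
            names.add(p.name);
        }
        return names;
    }
}
```

With the Fiction data from the question (products 1 and 4 on markdown), this yields products 2, 3, 1, 4, so markdown items end up at the bottom of their sub-category while keeping their relative sort order.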
Re: How to do custom sorting in Solr?
Thanks Erick for your quick feedback.

When products are assigned to a category or sub-category, they can be in any order, and the price type can be regular or markdown. So regular and markdown products are intermingled as per their assignment, but I want to sort them in such a way that all the products on markdown are at the bottom of the list.

I can use these multiple sorts, but I realize that they are costly in terms of heap used, as they use FieldCache. I have an index with 2M docs, and the docs are pretty big. So I don't want to use them unless there is no other option.

I am wondering if I can define a custom function query which works like this:

- check if the product is on markdown
- if yes, then change its sort order field to be the max value in the given sub-category, say 99
- else, use the sort order of the product in the sub-category

I have been looking at existing function queries but do not have a good handle on how to make one of my own.

Another option could be to use a custom sort comparator, but I am not sure how it works.

Any thoughts?

-Saroj
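The derived sort key proposed above can be written as a one-line rule. The sketch below is plain Java, not a Solr function query, and the names are hypothetical. It uses max + 1 rather than the max itself, an assumption made so that the result matches the expected-results listing in the original question (101 for a sub-category of 100 products, 71 for one of 70).

```java
/** Illustrative derived sort key: markdown products are pushed past the end of their sub-category. */
final class MarkdownSortKey {

    /**
     * Returns the sort order to use for a product.
     * maxSortOrderInSubcategory is assumed to be known up front, for example
     * computed at index time; this sketch does not compute it.
     */
    static int effectiveSortOrder(boolean onMarkdown, int sortOrder, int maxSortOrderInSubcategory) {
        return onMarkdown ? maxSortOrderInSubcategory + 1 : sortOrder;
    }

    public static void main(String[] args) {
        System.out.println(effectiveSortOrder(true, 2, 100));  // markdown product in a 100-item sub-category
        System.out.println(effectiveSortOrder(false, 3, 100)); // regular product keeps its own order
    }
}
```

Since every markdown product in a sub-category receives the same key, their relative order is undefined unless a secondary sort is added.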
Re: How to do custom sorting in Solr?
2M docs is actually pretty small. Sorting is sensitive to the number of _unique_ values in the sort fields, not necessarily the number of documents. And sorting only works on fields with a single value (i.e. the field can't have more than one token after analysis). So for each field you're only talking about 2M values at the very maximum, assuming that the field in question has a unique value per document, which I doubt very much given your problem description.

So with a corpus that size, I'd just try it.

Best
Erick
Custom sorting doesn't work properly.
Hello everybody,

I have a problem with custom sorting in Solr. The problem (incorrect sort order) happens only when 2 or more shards are used in the Solr configuration. I did the following.

I extended TrieIntField just to override the comparator:

    public class OperatingStatusFieldType extends TrieIntField {
        @Override
        public SortField getSortField(SchemaField field, boolean top) {
            return new SortField(field.getName(), new OperatingStatusComparatorSource(), top);
        }
    }

Custom comparator implementation:

    public class OperatingStatusComparatorSource extends FieldComparatorSource {
        private static int counter = 0;

        @Override
        public FieldComparator newComparator(String fieldname, int numHits, int sortPos, boolean reversed) throws IOException {
            return new OperatingStatusComparator(numHits, fieldname, FieldCache.NUMERIC_UTILS_INT_PARSER,
                    SolrSortConfiguration.getInstance().getOrderMapForField(SolrSortConfiguration.OPERATING_STATUS_SORT_PROPERTY));
        }

        private static class OperatingStatusComparator extends FieldComparator {
            private final int[] values;
            private int[] currentReaderValues;
            private final String field;
            private FieldCache.IntParser parser;
            private int bottom;
            private Map<String, Integer> mapValues = null;

            OperatingStatusComparator(int numHits, String field, FieldCache.Parser parser, Map<String, Integer> mapValues) {
                values = new int[numHits];
                this.field = field;
                this.parser = (FieldCache.IntParser) parser;
                this.mapValues = mapValues;
            }

            // Compares by the mapped rank of each value, not by the raw int.
            @Override
            public int compare(int slot1, int slot2) {
                final Integer vKey1 = values[slot1];
                final Integer vKey2 = values[slot2];
                final Integer val1 = mapValues.get(String.valueOf(vKey1));
                final Integer val2 = mapValues.get(String.valueOf(vKey2));
                System.out.println("vKey1=" + vKey1);
                System.out.println("vKey2=" + vKey2);
                System.out.println("val1=" + val1);
                System.out.println("val2=" + val2);
                counter++;
                System.out.println("counter = " + counter);
                if (val1 == null) {
                    if (val2 == null) {
                        return 0;
                    }
                    return -1;
                } else if (val2 == null) {
                    return 1;
                }
                return val1.compareTo(val2);
            }

            // Unchanged from the int comparator: compares raw values.
            @Override
            public int compareBottom(int doc) {
                final int v2 = currentReaderValues[doc];
                if (bottom > v2) {
                    return 1;
                } else if (bottom < v2) {
                    return -1;
                } else {
                    return 0;
                }
            }

            @Override
            public void copy(int slot, int doc) {
                values[slot] = currentReaderValues[doc];
            }

            @Override
            public void setNextReader(IndexReader reader, int docBase) throws IOException {
                currentReaderValues = FieldCache.DEFAULT.getInts(reader, field, parser);
            }

            @Override
            public void setBottom(final int bottom) {
                this.bottom = values[bottom];
            }

            @Override
            public Comparable<?> value(int slot) {
                return Integer.valueOf(values[slot]);
            }
        }
    }

In schema.xml the field type is defined as follows:

    <fieldType name="operatingStatusTint" class="com.dnb.daas.solr.custom.sorting.OperatingStatusFieldType" precisionStep="8" omitNorms="true" positionIncrementGap="0"/>

The custom comparator is a copy of the Solr int comparator; only the compare method (shown above) was modified.

Maybe somebody knows why the sort order is incorrect when Solr is configured with shards?

--
Best Regards,
Eugene Stherbin
Exadel Inc,
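One thing visible in the posted code is that compare() orders by the mapped rank, while compareBottom() and value() work with the raw int values. The sketch below, in plain Java with no Lucene classes, only demonstrates that these two orderings can disagree. Whether that disagreement is what breaks the order when shard results are merged is an assumption, not something confirmed in this thread. The rank map contents are made up for illustration.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;

/** Illustrative comparison of mapped-rank order against natural int order. */
final class MappedOrderDemo {

    /** Same null handling as the posted compare(): unmapped keys sort first. */
    static Comparator<Integer> mappedOrder(Map<String, Integer> rank) {
        return (a, b) -> {
            Integer ra = rank.get(String.valueOf(a));
            Integer rb = rank.get(String.valueOf(b));
            if (ra == null) {
                return rb == null ? 0 : -1;
            }
            if (rb == null) {
                return 1;
            }
            return ra.compareTo(rb);
        };
    }

    public static void main(String[] args) {
        // Hypothetical rank map: status 2 should come first, then 3, then 1.
        Map<String, Integer> rank = new HashMap<>();
        rank.put("1", 3);
        rank.put("2", 1);
        rank.put("3", 2);

        int mapped = mappedOrder(rank).compare(1, 2); // positive: 1 sorts after 2
        int natural = Integer.compare(1, 2);          // negative: 1 sorts before 2
        System.out.println("mapped=" + mapped + " natural=" + natural);
    }
}
```

Any code path that compares the raw values, rather than calling compare(), would therefore produce a different order from the one intended by the map.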
Custom sorting based on external (database) data
Hi,

Sorry for the possible double post. I wrote this up but used the incorrect sender address, so I am guessing that my previous message is going to be rejected by the list moderation daemon.

I am trying to figure out options for the following problem. I am on Solr 1.4.1 (Lucene 2.9.1).

I have search results which are going to be ranked by the user (using a thumbs up/down), which would translate to a score between -1 and +1.

This data is stored in a database table (unique_id, thumbs_up, thumbs_down, num_calls) as the thumbs up/down component is clicked.

We want to be able to sort the results by the following score = (thumbs_up - thumbs_down) / (num_calls). The unique_id field refers to the one referenced as uniqueId in schema.xml.

Based on the following conversation: http://www.mail-archive.com/solr-user@lucene.apache.org/msg06322.html ...my understanding is that I need to:

1) Subclass FieldType to create my own RankFieldType.
2) In this class, override the getSortField() method to return my custom FieldComparatorSource object.
3) Build the custom FieldComparatorSource object, which returns a custom FieldComparator object in newComparator().
4) Configure the field type of class RankFieldType (rank_t), and a field (called rank) of field type rank_t, in schema.xml.
5) Use sort=rank+desc to do the sort.

My question is: is there a simpler or more performant way? The number of database lookups seems like it's going to be pretty high with this approach. And it's hard to believe that my problem is new, so I am guessing this is either part of some Solr configuration I am missing, or there is some other (possibly simpler) approach I am overlooking.

Pointers to documentation or code (or even keywords I could google) would be much appreciated.

TIA for all your help,
Sujit
Re: Custom sorting based on external (database) data
Looks like it can be done with http://lucene.apache.org/solr/api/org/apache/solr/schema/ExternalFileField.html and http://wiki.apache.org/solr/FunctionQuery

You can dump your table into three text files. Issue a commit to load these changes.

Note, though, that sort by function query is only available from Solr 3.1, and you are on 1.4.1.
Re: Custom sorting based on external (database) data
Thank you Ahmet, looks like we could use this. Basically we would do periodic dumps of the (unique_id|computed_score) sorted by score and write it out to this file followed by a commit. Found some more info here, for the benefit of others looking for something similar: http://dev.tailsweep.com/solr-external-scoring/

On Thu, 2011-05-05 at 13:12 -0700, Ahmet Arslan wrote:

--- On Thu, 5/5/11, Sujit Pal sujit@comcast.net wrote:
From: Sujit Pal sujit@comcast.net
Subject: Custom sorting based on external (database) data
To: solr-user solr-user@lucene.apache.org
Date: Thursday, May 5, 2011, 11:03 PM

Hi, Sorry for the possible double post, I wrote this up but had the incorrect sender address, so I am guessing that my previous one is going to be rejected by the list moderation daemon.

I am trying to figure out options for the following problem. I am on Solr 1.4.1 (Lucene 2.9.1). I have search results which are going to be ranked by the user (using a thumbs up/down) and would translate to a score between -1 and +1. This data is stored in a database table (unique_id, thumbs_up, thumbs_down, num_calls) as the thumbs up/down component is clicked. We want to be able to sort the results by the following: score = (thumbs_up - thumbs_down) / (num_calls). The unique_id field refers to the one referenced as uniqueId in the schema.xml.

Based on the following conversation: http://www.mail-archive.com/solr-user@lucene.apache.org/msg06322.html ...my understanding is that I need to:
1) subclass FieldType to create my own RankFieldType.
2) In this class I override the getSortField() method to return my custom FieldSortComparatorSource object.
3) Build the custom FieldSortComparatorSource object which returns a custom FieldSortComparator object in newComparator().
4) Configure the field type of class RankFieldType (rank_t), and a field (called rank) of field type rank_t in schema.xml.
5) use sort=rank+desc to do the sort.

My question is: is there a simpler/more performant way? The number of database lookups seems like it's going to be pretty high with this approach. And it's hard to believe that my problem is new, so I am guessing this is either part of some Solr configuration I am missing, or there is some other (possibly simpler) approach I am overlooking. Pointers to documentation or code (or even keywords I could google) would be much appreciated.

Looks like it can be done with http://lucene.apache.org/solr/api/org/apache/solr/schema/ExternalFileField.html and http://wiki.apache.org/solr/FunctionQuery

You can dump your table into three text files. Issue a commit to load these changes. Sort by function query is available in Solr 3.1 though.
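A rough sketch of the periodic dump discussed in this thread, in plain Java with no Solr dependency. It computes score = (thumbs_up - thumbs_down) / num_calls per document and renders one line per unique_id. The class and method names are made up for illustration, and the `key=value` line shape is what I recall ExternalFileField expecting, so check the separator against the ExternalFileField docs before relying on it. Returning 0 when num_calls is 0 is also an assumption; the thread does not say what should happen there.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Solr.
class ThumbScoreDump {

    /** score = (thumbs_up - thumbs_down) / num_calls; 0 when nothing was recorded yet (assumption). */
    static float score(int thumbsUp, int thumbsDown, int numCalls) {
        if (numCalls <= 0) {
            return 0.0f;
        }
        return (thumbsUp - thumbsDown) / (float) numCalls;
    }

    /**
     * Renders one "unique_id=score" line per row.
     * Each int[] holds {thumbs_up, thumbs_down, num_calls}.
     */
    static String toExternalFile(Map<String, int[]> rows) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, int[]> e : rows.entrySet()) {
            int[] r = e.getValue();
            sb.append(e.getKey())
              .append('=')
              .append(String.format(Locale.ROOT, "%.4f", score(r[0], r[1], r[2])))
              .append('\n');
        }
        return sb.toString();
    }
}
```

The resulting text would be written to the external file and followed by a commit, as described above; the database is then read once per dump rather than once per comparison.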
RE: Custom Sorting
OK, thank you for the discussion. As I thought, it's not possible within performance limits. I think the way to go is to document some more stats at index time, and use them in boost queries. :)

Thanks
Mike

Date: Tue, 19 Apr 2011 15:12:00 -0400
Subject: Re: Custom Sorting
From: erickerick...@gmail.com
To: solr-user@lucene.apache.org

As I understand it, sorting by field is what caches are all about. You have a big list in memory of all of the terms for a field, indexed by Lucene doc ID, so fetching the term to compare by doc ID is fast. That is also why the caches need to be warmed, and why sort fields should be single-valued. If you try to do this yourself and fetch data from each document, you can incur a huge performance hit, since you'll be seeking all over your disk...

Score is special though since it's transient. Internally, all Lucene has to do is keep track of the top N scores encountered, where N is something like start + queryResultWindowSize, this latter from solrconfig.xml, with no seeks to disk at all...

Best
Erick

On Tue, Apr 19, 2011 at 2:50 PM, Jonathan Rochkind rochk...@jhu.edu wrote:

On 4/19/2011 1:43 PM, Jan Høydahl wrote: Hi, Not possible :) Lucene compares each matching document against the query and produces a score for each. Documents are not compared to each other like normal sort, that would be way too costly.

That might be true for sort by 'score' (although even if you have all the scores, it still seems like some kind of sort must be necessary to see which comes first), but when you sort by a field value, which is also possible, Lucene must be doing some kind of 'normal sort' algorithm, no? Ah, I guess it could just be using each term's position in the index, which is available in constant time, always kept track of in an index? Maybe, I don't know?
Custom Sorting
Hi, I want to be able to have a custom sorting algorithm such that for each comparison of document results (A v B) I can rank them, i.e. writing a comparator like I would normally do in Java (Compares its two arguments for order. Returns a negative integer, zero, or a positive integer as the first argument is less than, equal to, or greater than the second). In the comparator I want to be able to take into account the score of the results, as well as other fields in the documents. I've looked at using things such as the score/boost/bf parameters etc.; however, I want the flexibility of being able to code the comparator, so I can do if conditions and such. Is this possible? And if so, what's the best way of doing this? I've upgraded to the latest version, Solr 3.1, and of course for this use case would expect to have to build from source in order to add custom code. Or/and, when using the score/boost/bf parameters etc., is it possible to use the score parameter in functions, to say scale it between 0 and 1? Thanks Mike
Re: Custom Sorting
Hi, Not possible :) Lucene compares each matching document against the query and produces a score for each. Documents are not compared to eachother like normal sort, that would be way too costly. But if you explain your use case, I'm sure we can find ways to express your needs in other ways Perhaps it is possible for you to use Sort by Function? http://wiki.apache.org/solr/FunctionQuery#Sort_By_Function Then you can decide exactly what goes into your sort score. If you want to do conditional stuff, you may need to pre-process your documents a bit and create new fields which can be used in a FunctionQuery. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com On 19. apr. 2011, at 19.02, Michael Owen wrote: Hi, I want to able to have a custom sorting algorithm such that for each comparison of document results (A v B) I can rank them. i.e. writing a comparator like I would normally do in Java (Compares its two arguments for order. Returns a negative integer, zero, or a positive integer as the first argument is less than, equal to, or greater than the second). In the comparator I want to be able to take into account the score of the results, as well as other fields in the documents. I've looked at using things such as the score/boost/bf parameters etc, however, want the flexibility of being able to code the comparator, so I can do if conditions and such. Is this possible? And if so what's the best way of doing this? I've upgraded to use the latest version of Solr 3.1, and of course for this use case would expect to have to build from source, in order to add custom source. Or/and, when using the score/boost/bf parameters etc - is it possible to use the score parameter in functions, to say scale it between 0 and 1? Thanks Mike
Re: Custom Sorting
You could create a new Similarity class plugin that take in account every parameters you need. : http://wiki.apache.org/solr/SolrPlugins?highlight=%28similarity%29#Similarity but, as Jan said, be carefull with the cost of the the similarity function. Ludovic. 2011/4/19 Jan Høydahl / Cominvent [via Lucene] ml-node+2839526-2100261518-383...@n3.nabble.com Hi, Not possible :) Lucene compares each matching document against the query and produces a score for each. Documents are not compared to eachother like normal sort, that would be way too costly. But if you explain your use case, I'm sure we can find ways to express your needs in other ways Perhaps it is possible for you to use Sort by Function? http://wiki.apache.org/solr/FunctionQuery#Sort_By_Function Then you can decide exactly what goes into your sort score. If you want to do conditional stuff, you may need to pre-process your documents a bit and create new fields which can be used in a FunctionQuery. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com On 19. apr. 2011, at 19.02, Michael Owen wrote: Hi, I want to able to have a custom sorting algorithm such that for each comparison of document results (A v B) I can rank them. i.e. writing a comparator like I would normally do in Java (Compares its two arguments for order. Returns a negative integer, zero, or a positive integer as the first argument is less than, equal to, or greater than the second). In the comparator I want to be able to take into account the score of the results, as well as other fields in the documents. I've looked at using things such as the score/boost/bf parameters etc, however, want the flexibility of being able to code the comparator, so I can do if conditions and such. Is this possible? And if so what's the best way of doing this? I've upgraded to use the latest version of Solr 3.1, and of course for this use case would expect to have to build from source, in order to add custom source. 
Or/and, when using the score/boost/bf parameters etc - is it possible to use the score parameter in functions, to say scale it between 0 and 1? Thanks Mike - Jouve France. -- View this message in context: http://lucene.472066.n3.nabble.com/Custom-Sorting-tp2839375p2839593.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Custom Sorting
On 4/19/2011 1:43 PM, Jan Høydahl wrote: Hi, Not possible :) Lucene compares each matching document against the query and produces a score for each. Documents are not compared to each other like normal sort, that would be way too costly.

That might be true for sort by 'score' (although even if you have all the scores, it still seems like some kind of sort must be necessary to see which comes first), but when you sort by a field value, which is also possible, Lucene must be doing some kind of 'normal sort' algorithm, no? Ah, I guess it could just be using each term's position in the index, which is available in constant time, always kept track of in an index? Maybe, I don't know?
Re: Custom Sorting
As I understand it, sorting by field is what caches are all about. You have a big list in memory of all of the terms for a field, indexed by Lucene doc ID so fetching the term to compare by doc ID is fast, and also why the caches need to be warmed, and why sort fields should be single-valued. If you try to do this yourself and fetch data from each document, you can incur a huge performance hit, since you'll be seeking all over your disk... Score is special though since it's transient. Internally, all Lucene has to do is keep track of the top N scores encountered where N is something like start + queryResultWindowSize, this latter from solrconfig.xml, with no seeks to disk at all... Best Erick On Tue, Apr 19, 2011 at 2:50 PM, Jonathan Rochkind rochk...@jhu.edu wrote: On 4/19/2011 1:43 PM, Jan Høydahl wrote: Hi, Not possible :) Lucene compares each matching document against the query and produces a score for each. Documents are not compared to eachother like normal sort, that would be way too costly. That might be true for sort by 'score' (although even if you have all the scores, it still seems like some kind of sort must be neccesary to see which comes first), but when you sort by a field value, which is also possible, Lucene must be doing some kind of 'normal sort' algorithm, no? Ah, I guess it could just be using each term's position in the index, which is available in constant time, always kept track of in an index? Maybe, I don't know?
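To make the "keep track of the top N scores" point concrete, here is a small plain-Java illustration using a bounded min-heap. This is not Lucene's actual collector code (Lucene has its own priority queue and collector classes); the class name and method are invented for the sketch. It only shows why a full sort of all matches is not needed: each score is compared against the weakest score kept so far, and at most N values are held at any time.

```java
import java.util.PriorityQueue;

// Illustration only; not the Lucene implementation.
class TopN {

    /** Returns the n highest scores in descending order, holding at most n values at once. */
    static float[] topScores(float[] scores, int n) {
        if (n <= 0) {
            return new float[0];
        }
        // Min-heap: the weakest score that is still in the top n sits at the head.
        PriorityQueue<Float> heap = new PriorityQueue<>();
        for (float s : scores) {
            if (heap.size() < n) {
                heap.add(s);
            } else if (s > heap.peek()) {
                heap.poll();   // evict the current weakest
                heap.add(s);
            }
        }
        // Drain from weakest to strongest, filling the result back to front.
        float[] out = new float[heap.size()];
        for (int i = out.length - 1; i >= 0; i--) {
            out[i] = heap.poll();
        }
        return out;
    }
}
```

With N set to something like start + queryResultWindowSize, the work per matching document is a single comparison in the common case, which fits the "no seeks to disk" description above.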
Re: Custom Sorting in Solr
Ok i imagined that the double linked list would be far too complicated for solr. Now, how can i achieve that solr connects to a webservice and do the import? I'm sorry if i'm not clear, sometimes my english gets fuzzy :P On Fri, Oct 29, 2010 at 4:51 PM, Yonik Seeley yo...@lucidimagination.comwrote: On Fri, Oct 29, 2010 at 3:39 PM, Ezequiel Calderara ezech...@gmail.com wrote: Hi all guys! I'm in a weird situation here. We have index a set of documents which are ordered using a linked list (each documents has the reference of the previous and the next). Is there a way when sorting in the solr search, Use the linked list to sort? It seems like you should be able to encode this linked list as an integer instead, and sort by that? If there are multiple linked lists in the index, it seems like you could even use the high bits of the int to designate which list the doc belongs to, and the low order bits as the order in that list. -Yonik http://www.lucidimagination.com -- __ Ezequiel. Http://www.ironicnet.com
Custom Sorting in Solr
Hi all guys! I'm in a weird situation here. We have index a set of documents which are ordered using a linked list (each documents has the reference of the previous and the next). Is there a way when sorting in the solr search, Use the linked list to sort? If that is not possible, how can i use the DIH to access a Service in WCF or a Webservice? Should i develop my own DIH? -- __ Ezequiel. Http://www.ironicnet.com
RE: Custom Sorting in Solr
There's no way I know of to make Solr use that kind of data to create the sort order you want. Generally for 'custom' sorts, you want to create a field in your Solr index with possibly artificially constructed values that will 'naturally' sort the way you want. How to do that with a linked list seems kind of tricky, before you index you may have to write code to analyze your whole graph order and then just supply sort order keys. And then if you sometimes update just a few documents, but not your whole thing.. Geez, i'm not really sure. It's kind of a tricky problem. That kind of data is not really the expected use case for Solr sorting. Sorry, I'm not sure what this means or how it would help: use the DIH to access a Service in WCF or a Webservice? Maybe someone else will know exactly what you mean. Or maybe if you rephrase with more specificity as to how you think this will help you solve your problem, it will be more clear. Recall that you don't need to use DIH to index at all, it's just one of several methods, it simplifies things for common patterns, it's possible you fall out of the common pattern nad it would be simpler not to use DIH. Although even without DIH, I can't think of a particularly simple way to solve your problem. Just curious, but is your _entire_ corpus, your entire document set, part of a _single_ linked list? Or do you have several different linked lists in there? If several, what do you want to happen with sort if two documents in the result set aren't even part of the same linked list? This kind of thing is one reason translating the sort of data you have to a solr sort order starts to seem kind of confusing to me. From: Ezequiel Calderara [ezech...@gmail.com] Sent: Friday, October 29, 2010 3:39 PM To: Solr Mailing List Subject: Custom Sorting in Solr Hi all guys! I'm in a weird situation here. We have index a set of documents which are ordered using a linked list (each documents has the reference of the previous and the next). 
Is there a way when sorting in the solr search, Use the linked list to sort? If that is not possible, how can i use the DIH to access a Service in WCF or a Webservice? Should i develop my own DIH? -- __ Ezequiel. Http://www.ironicnet.com
Re: Custom Sorting in Solr
On Fri, Oct 29, 2010 at 3:39 PM, Ezequiel Calderara ezech...@gmail.com wrote: Hi all guys! I'm in a weird situation here. We have index a set of documents which are ordered using a linked list (each documents has the reference of the previous and the next). Is there a way when sorting in the solr search, Use the linked list to sort? It seems like you should be able to encode this linked list as an integer instead, and sort by that? If there are multiple linked lists in the index, it seems like you could even use the high bits of the int to designate which list the doc belongs to, and the low order bits as the order in that list. -Yonik http://www.lucidimagination.com
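A minimal sketch of the integer encoding Yonik suggests: the high bits hold the list a document belongs to, the low bits hold its position in that list, so a plain numeric sort on the field orders documents by list and then by position. The split of 11 bits for the list id and 20 bits for the position is an arbitrary assumption for the example, as are the class and method names.

```java
// Hypothetical helper for building a sortable int field value.
class ListOrderKey {

    // Assumption: up to 2^20 positions per list, leaving 11 bits for the list id
    // so the packed value stays a non-negative int.
    static final int ORDER_BITS = 20;
    static final int MAX_LIST_ID = (1 << (31 - ORDER_BITS)) - 1;
    static final int MAX_POSITION = (1 << ORDER_BITS) - 1;

    /** Packs list id (high bits) and position within the list (low bits) into one int. */
    static int encode(int listId, int position) {
        if (listId < 0 || listId > MAX_LIST_ID) {
            throw new IllegalArgumentException("listId out of range: " + listId);
        }
        if (position < 0 || position > MAX_POSITION) {
            throw new IllegalArgumentException("position out of range: " + position);
        }
        return (listId << ORDER_BITS) | position;
    }

    static int listId(int key) {
        return key >>> ORDER_BITS;
    }

    static int position(int key) {
        return key & MAX_POSITION;
    }
}
```

The positions would have to be computed by walking each linked list before indexing, and inserting a document in the middle of a list means re-encoding everything after it, which is the update cost raised elsewhere in this thread.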
Custom Sorting with function queries
I need to 'rank' the documents in a solr index based on some field values and the query. Is this possible using function queries? Two examples to illustrate what I am trying to achieve:

The index contains two fields min_rooms and max_rooms, both integers, both optional. If I query the index for a value (rooms) I would like the documents that place this value between min and max to be ranked higher than those that don't. The smaller the difference between min and max is, the more exact a match the document is and the higher the document will be ranked. If either min or max or both are not specified then the document gets a 'negative rank'.

The index contains a float field. If, and only if, the query contains a search for this field (field:1 or field:on), then the value of the field affects the ranking of the document. (1, on, yes, etc. can be solved with synonyms.)

Lastly, once this 'custom ranking' works, how do I switch off solr's built in ranking calculations?
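One way to pin down the min_rooms/max_rooms rule before trying to express it in Solr is to write it as a plain function. The sketch below is only my reading of the description: the concrete numbers (-1 for a missing bound, 0 for a value outside the range, 1 / (1 + max - min) inside the range) are assumptions chosen so that tighter ranges rank higher. The class name is invented, and how to map this onto Solr function queries is left open here.

```java
// Hypothetical ranking rule; the constants are illustrative assumptions.
class RoomRank {

    /** Higher is better. min and max are null when the document does not specify them. */
    static double rank(Integer min, Integer max, int rooms) {
        if (min == null || max == null) {
            return -1.0;                      // missing bound: the "negative rank" case
        }
        if (rooms < min || rooms > max) {
            return 0.0;                       // queried value falls outside the range
        }
        return 1.0 / (1 + (max - min));       // narrower range means a more exact match
    }
}
```

For a query of rooms=2, a document with min=2, max=2 gets 1.0, one with min=1, max=4 gets 0.25, one with min=3, max=5 gets 0.0, and one with a missing bound gets -1.0.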
Re: custom sorting / help overriding FieldComparator
Brad:

1) if you haven't already figured this out, i would suggest emailing the java-user mailing list. It's got a bigger collection of users who are familiar with the internals of the Lucene-Java API (that's the level it seems like you are having difficulty at)

2) Maybe you mentioned your sorting algorithm in a previous thread, but i'm not remembering it -- it's possible this is an XY problem, if you describe the algorithm you need (or show us the code for your Comparable impl) we might be able to suggest an efficient way to do this without any custom code in Solr... http://people.apache.org/~hossman/#xyproblem

: I'm trying to get my (overly complex and strange) product IDs sorting properly in Solr.
:
: Approaches I've tried so far, that I've given up on for various reasons:
: --Normalizing/padding the IDs so they naturally sort alphabetically/alphanumerically.
: --Splitting the ID into multiple Solr fields and sending a longer, multi-field sort argument in the GET request.
: --(both of those approaches do work most of the time, but aren't quite perfect)
:
: However, in another project, I already have a Comparable class defined in Java that represents a ProductID and does sort them correctly every time. It's not yet in lucene/solr, though. So I'm trying to make a FieldType plugin for Solr that uses the existing ProductID class/datatype.
:
: I need some help extending the lucene FieldComparator class. I don't know much about the rest of the solr / lucene codebase, so I'm fumbling around a bit, especially with the required setNextReader() method. setNextReader() looks like it checks the FieldCache to see if this value is there already, otherwise grabs a bunch of documents from the index. I think I should call some form of FieldCache.getCustom() for this, but FieldCache.getCustom() itself accepts a comparator as an argument, and is marked as @deprecated Please implement FieldComparatorSource directly, instead ... but isn't that what I'm doing?
:
: So, I'm just a bit confused. Any help? Specifically, any help implementing a setNextReader() method in a customComparator?
:
: (solr 1.4.1 / lucene 2.9.3)
:
: Thanks,
: Brad

-Hoss -- http://lucenerevolution.org/ ... October 7-8, Boston http://bit.ly/stump-hoss ... Stump The Chump!
custom sorting / help overriding FieldComparator
Hi, I'm trying to get my (overly complex and strange) product IDs sorting properly in Solr.

Approaches I've tried so far, that I've given up on for various reasons:
--Normalizing/padding the IDs so they naturally sort alphabetically/alphanumerically.
--Splitting the ID into multiple Solr fields and sending a longer, multi-field sort argument in the GET request.
--(both of those approaches do work most of the time, but aren't quite perfect)

However, in another project, I already have a Comparable class defined in Java that represents a ProductID and does sort them correctly every time. It's not yet in lucene/solr, though. So I'm trying to make a FieldType plugin for Solr that uses the existing ProductID class/datatype.

I need some help extending the lucene FieldComparator class. I don't know much about the rest of the solr / lucene codebase, so I'm fumbling around a bit, especially with the required setNextReader() method. setNextReader() looks like it checks the FieldCache to see if this value is there already, otherwise grabs a bunch of documents from the index. I think I should call some form of FieldCache.getCustom() for this, but FieldCache.getCustom() itself accepts a comparator as an argument, and is marked as @deprecated Please implement FieldComparatorSource directly, instead ... but isn't that what I'm doing?

So, I'm just a bit confused. Any help? Specifically, any help implementing a setNextReader() method in a customComparator?

(solr 1.4.1 / lucene 2.9.3)

Thanks, Brad
Custom sorting
Hello everyone, I'm new to Solr but have been asked to do an evaluation as an alternative for a commercial search engine. I have some experience with Lucene and a java background so I'm not afraid to dive into code :-)

The application now has a very particular way of sorting results using something called buckets. I'll try to explain with a bit of detail: In the interface they have 2 fields: what and where. Both fields are actually sets of fields (what = category, name, contact info... and where = country, state, region, city...) so the copyField feature of Solr immediately comes to mind.

Now, based on the field that generated the actual match, the result should end up in a specific bucket. In particular the first bucket contains all the result documents that have an exact match on the category field, the second bucket all exact matches on name, the third partial matches on category, the fourth partial matches on name, the fifth matches on contact info etc...

Then within each of those first tier buckets all results are placed in second tier buckets depending on what location was matched: city, then region, then province and so on.

To complicate things even more there is also a third tier bucket where results are placed according to the value of a ranking field: all documents with the value 1 in the ranking field go in bucket 1 and so on. And finally results should be randomized in the third tier bucket... On top of this they obviously want support for facets and paging.

My apologies for the long mail but I would greatly appreciate feedback and/or suggestions. I'm aware that this is a very particular problem but everything that points me in the right direction is helpful.

Cheers, Tom
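The three-tier bucket ordering described here can be sketched as a chained comparator over precomputed bucket numbers. The sketch below is plain Java and assumes each hit already carries its three bucket values (how to derive them from the matching field inside Solr is the hard part and is not addressed). The class names, field names, and the seeded shuffle are all assumptions made for the illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Random;

// Hypothetical post-processing step; not a Solr API.
class BucketSort {

    static final class Hit {
        final String id;
        final int matchBucket;     // tier 1: which "what" field matched (1 = exact category, ...)
        final int locationBucket;  // tier 2: which "where" field matched (1 = city, ...)
        final int rankBucket;      // tier 3: value of the ranking field

        Hit(String id, int matchBucket, int locationBucket, int rankBucket) {
            this.id = id;
            this.matchBucket = matchBucket;
            this.locationBucket = locationBucket;
            this.rankBucket = rankBucket;
        }
    }

    /** Orders hits by tier 1, then tier 2, then tier 3, with random order inside the last tier. */
    static List<Hit> order(List<Hit> hits, long seed) {
        List<Hit> out = new ArrayList<>(hits);
        // Shuffle first; the stable sort below keeps this random order among ties.
        Collections.shuffle(out, new Random(seed));
        out.sort(Comparator.comparingInt((Hit h) -> h.matchBucket)
                .thenComparingInt(h -> h.locationBucket)
                .thenComparingInt(h -> h.rankBucket));
        return out;
    }
}
```

Reusing the same seed for every page of one search would keep the randomized order consistent while the user pages through results; a fresh seed per request would reshuffle items between pages.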
Custom sorting
Hi, I have a requirement to do the following: For up to the first 10 results (i.e. only on the first page) show sponsored category ads, in order of bid, but no more than 2 per category, and only if all sponsored category ads are more than min% of the highest score. e.g. If I had the following:

min% = 1

doc  score  bid  cat_id  sponsored
1    100    x    x       0
2    55     x    x       0
3    50     2    2       1
4    20     2    2       1
5    05     2    2       1
6    80     1    1       1
7    70     1    1       1
8    60     1    1       1

x = don't care

sorted order would be: 3 4 6 7 1 8 2 5

I'm not sure if this can be implemented with a custom comparator as I need access to the final score to enforce min%. I'm thinking I'm probably going to have to implement a subclass of QParserPlugin with a custom sort, but was wondering if there were alternatives? Many thanks in advance. Dan
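As a way to check the rule against the example in the post, here is a plain-Java sketch of the reordering, run over an already scored result list. It reflects my reading of the requirement: sponsored ads whose score exceeds min% of the top score are taken in bid order (score as tie-breaker, which is an assumption), capped per category and in total, and everything else follows in score order. The class, field, and parameter names are invented, and the min% check is applied per ad, which may not be what "all sponsored ads" was meant to say.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical post-processing step; not a Solr API.
class SponsoredOrder {

    static final class Doc {
        final int id;
        final double score;
        final int bid;
        final int catId;
        final boolean sponsored;

        Doc(int id, double score, int bid, int catId, boolean sponsored) {
            this.id = id;
            this.score = score;
            this.bid = bid;
            this.catId = catId;
            this.sponsored = sponsored;
        }
    }

    /** Returns doc ids: selected sponsored ads first, then all remaining docs by score. */
    static List<Integer> order(List<Doc> docs, double minPercent, int maxAds, int maxPerCategory) {
        double top = 0;
        for (Doc d : docs) {
            top = Math.max(top, d.score);
        }
        double threshold = top * minPercent / 100.0;

        // Sponsored ads above the threshold, highest bid first, then highest score.
        List<Doc> candidates = new ArrayList<>();
        for (Doc d : docs) {
            if (d.sponsored && d.score > threshold) {
                candidates.add(d);
            }
        }
        candidates.sort(Comparator.comparingInt((Doc d) -> d.bid).reversed()
                .thenComparing(Comparator.comparingDouble((Doc d) -> d.score).reversed()));

        // Apply the per-category cap and the overall cap.
        Set<Integer> picked = new LinkedHashSet<>();
        Map<Integer, Integer> perCategory = new HashMap<>();
        for (Doc d : candidates) {
            if (picked.size() >= maxAds) {
                break;
            }
            int used = perCategory.getOrDefault(d.catId, 0);
            if (used < maxPerCategory) {
                picked.add(d.id);
                perCategory.put(d.catId, used + 1);
            }
        }

        // Everything not picked follows in plain score order.
        List<Doc> rest = new ArrayList<>();
        for (Doc d : docs) {
            if (!picked.contains(d.id)) {
                rest.add(d);
            }
        }
        rest.sort(Comparator.comparingDouble((Doc d) -> d.score).reversed());

        List<Integer> out = new ArrayList<>(picked);
        for (Doc d : rest) {
            out.add(d.id);
        }
        return out;
    }
}
```

Fed the eight documents from the example with minPercent=1, maxAds=10 and maxPerCategory=2, this yields 3 4 6 7 1 8 2 5, the order given in the post. Doing this over the whole result set outside Solr would not scale to large result sets, so it is a check of the logic rather than a suggested deployment.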
Re: Custom sorting
Hi Dan, It seems that you want a SearchComponent[1], something like the QueryElevationComponent[2]. Take a look at how it works and I think you can build your custom solution.

[1]- http://lucene.apache.org/solr/api/org/apache/solr/handler/component/SearchComponent.html
[2]- http://wiki.apache.org/solr/QueryElevationComponent

Cheers, -- Daniel Cassiano http://dcassiano.wordpress.com

On Wed, May 19, 2010 at 6:46 AM, dan sutton danbsut...@gmail.com wrote:

Hi, I have a requirement to do the following: For up to the first 10 results (i.e. only on the first page) show sponsored category ads, in order of bid, but no more than 2 per category, and only if all sponsored category ads are more than min% of the highest score. e.g. If I had the following:

min% = 1

doc  score  bid  cat_id  sponsored
1    100    x    x       0
2    55     x    x       0
3    50     2    2       1
4    20     2    2       1
5    05     2    2       1
6    80     1    1       1
7    70     1    1       1
8    60     1    1       1

x = don't care

sorted order would be: 3 4 6 7 1 8 2 5

I'm not sure if this can be implemented with a custom comparator as I need access to the final score to enforce min%. I'm thinking I'm probably going to have to implement a subclass of QParserPlugin with a custom sort, but was wondering if there were alternatives? Many thanks in advance. Dan
Re: AutoSuggest with custom sorting
This was extremely helpful. Thanks a lot. On 05/04/2010 05:30 PM, Chris Hostetter wrote: First off: i would suggest that instead of doing a simple prefix search, you look into using EdgeNGrams for this sort of thing. I'm also assuming since you need custom scoring for this, you aren't going to get what you need using the TermsComponent or any other simple solution using your main corpus -- it would make more sense to setup a special index consisting of one document per term to include in your autosuggest. : 1. Results matching field1 should be ranked higher. Results matching the easily done with dismax .. even if you are using EdgeNGrams (just make sure you have EdgeNGrams on at index time, but not at query time) : 2.The next sort parameter is the length of the word. So, if you are : searching for IR, Row2 (2 out of 4 ) matches higher than Row3 (2 out of 5). this can be accomplished by indexing a numeric field containing the length of the field as a number, and then doing a secondary sort on it. the fieldNorm typically takes care of this sort of thing for you, but is more of a generalized concept, and doesn't give you exact precision for small numbers. -Hoss Pink OTC Markets Inc. provides the leading inter-dealer quotation and trading system in the over-the-counter (OTC) securities market. We create innovative technology and data solutions to efficiently connect market participants, improve price discovery, increase issuer disclosure, and better inform investors. Our marketplace, comprised of the issuer-listed OTCQX and broker-quoted Pink Sheets, is the third largest U.S. equity trading venue for company shares. This document contains confidential information of Pink OTC Markets and is only intended for the recipient. Do not copy, reproduce (electronically or otherwise), or disclose without the prior written consent of Pink OTC Markets. 
If you receive this message in error, please destroy all copies in your possession (electronically or otherwise) and contact the sender above.
Re: AutoSuggest with custom sorting
First off: i would suggest that instead of doing a simple prefix search, you look into using EdgeNGrams for this sort of thing. I'm also assuming since you need custom scoring for this, you aren't going to get what you need using the TermsComponent or any other simple solution using your main corpus -- it would make more sense to setup a special index consisting of one document per term to include in your autosuggest. : 1. Results matching field1 should be ranked higher. Results matching the easily done with dismax .. even if you are using EdgeNGrams (just make sure you have EdgeNGrams on at index time, but not at query time) : 2.The next sort parameter is the length of the word. So, if you are : searching for IR, Row2 (2 out of 4 ) matches higher than Row3 (2 out of 5). this can be accomplished by indexing a numeric field containing the length of the field as a number, and then doing a secondary sort on it. the fieldNorm typically takes care of this sort of thing for you, but is more of a generalized concept, and doesn't give you exact precision for small numbers. -Hoss
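A small plain-Java illustration of the two ideas in this reply: edge n-grams (the leading substrings of a term, produced at index time only) and a secondary sort on term length. It is not Solr's EdgeNGram filter; the class and method names are invented, and the real ranking would also involve the dismax scoring across fields mentioned above.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only; not the Solr analysis chain.
class SuggestSketch {

    /** Leading substrings of term with length between minGram and maxGram (inclusive). */
    static List<String> edgeNGrams(String term, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        int limit = Math.min(maxGram, term.length());
        for (int n = Math.max(1, minGram); n <= limit; n++) {
            grams.add(term.substring(0, n));
        }
        return grams;
    }

    /** Terms whose edge n-grams contain the typed prefix, shortest term first. */
    static List<String> suggest(List<String> terms, String prefix) {
        List<String> hits = new ArrayList<>();
        for (String t : terms) {
            // The prefix is matched as-is, mirroring "n-grams at index time, not at query time".
            if (edgeNGrams(t, 1, t.length()).contains(prefix)) {
                hits.add(t);
            }
        }
        // Secondary sort on length, then alphabetical to keep the order deterministic.
        hits.sort((a, b) -> a.length() != b.length()
                ? Integer.compare(a.length(), b.length())
                : a.compareTo(b));
        return hits;
    }
}
```

Typing PI against PINKSHEETS, PINK and OTC returns PINK before PINKSHEETS, which is the behaviour asked for elsewhere in this thread. In Solr the length would be stored in its own numeric field and used as the second sort clause.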
Re: AutoSuggest with custom sorting
Chris Hostetter wrote: this can be accomplished by indexing a numeric field containing the length of the field as a number, and then doing a secondary sort on it. the fieldNorm typically takes care of this sort of thing for you, but is more of a generalized concept, and doesn't give you exact precision for small numbers Or see https://issues.apache.org/jira/browse/LUCENE-1360 if you don't want to index a field length. -Sean
Re: AutoSuggest with custom sorting
I guess my basic issue is that Solr scores all matches for prefix searches equally. Any way to score PINK over PINKSHEETS when you are searching for PI ? Thanks Papiya

Papiya Misra wrote: Hi I am supposed to implement auto suggest where the prefix matches are sorted based on the following criteria. We have two fields (max characters ~ 100) that we need to search. Field 1 has only one word (no spaces) where as Field2 has multiple words separated by spaces. Example -

Row1: Field1 - ROFL, Field2 - Rolls on the floor laughing
Row2: Field1 - IRLL, Field2 - Rolling
Row3: Field1 - IRLTR, Field2 - I Roll

1. Results matching field1 should be ranked higher. Results matching the first word of Field2 should be ranked higher than any subsequent matches. If you search for RO* in the above example the ranking should be Row1-Row2-Row3.
2. The next sort parameter is the length of the word. So, if you are searching for IR, Row2 (2 out of 4) matches higher than Row3 (2 out of 5).
3. The final sort parameter is an integer field that we already have as part of the schema.

Any help or pointers will be deeply appreciated. -Papiya
AutoSuggest with custom sorting
Hi I am supposed to implement auto suggest where the prefix matches are sorted based on the following criteria. We have two fields (max characters ~ 100) that we need to search. Field 1 has only one word (no spaces) where as Field2 has multiple words separated by spaces. Example -

Row1: Field1 - ROFL, Field2 - Rolls on the floor laughing
Row2: Field1 - IRLL, Field2 - Rolling
Row3: Field1 - IRLTR, Field2 - I Roll

1. Results matching field1 should be ranked higher. Results matching the first word of Field2 should be ranked higher than any subsequent matches. If you search for RO* in the above example the ranking should be Row1-Row2-Row3.
2. The next sort parameter is the length of the word. So, if you are searching for IR, Row2 (2 out of 4) matches higher than Row3 (2 out of 5).
3. The final sort parameter is an integer field that we already have as part of the schema.

Any help or pointers will be deeply appreciated. -Papiya
Custom Sorting Based on Relevancy
Hi There, I'm working on a sorting issue. Our site currently sorts by creation date descending, so users list similar products multiple times to show up at the top of the results. When sorting based on score, we want to move items by the same user with the same title down search results. It would be best if the first item stayed in place based on score, and each additional item is moved out (rows * repeated user/title). Is custom sorting the best way? or is there something else I'm not thinking about. At the moment I'm looking at doing roughly the opposite of the Query Elevate Search component. Thanks, David
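One possible reading of "moved out (rows * repeated user/title)" is sketched below in plain Java: results arrive in score order, the first item of each user/title pair keeps its position, and the k-th repeat is pushed down by rows * k positions, so repeats tend to land on later pages. This is an interpretation, not something the post spells out, and the class and field names are invented for the sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical post-processing step; not a Solr API.
class DuplicateDemoter {

    static final class Item {
        final String id;
        final String user;
        final String title;

        Item(String id, String user, String title) {
            this.id = id;
            this.user = user;
            this.title = title;
        }
    }

    /** items must already be in score order; rows is the page size. Returns reordered ids. */
    static List<String> demote(List<Item> items, int rows) {
        Map<List<String>, Integer> seen = new HashMap<>();
        List<long[]> keyed = new ArrayList<>();   // each entry: {sortKey, originalIndex}
        for (int i = 0; i < items.size(); i++) {
            Item it = items.get(i);
            // 0 for the first occurrence of a user/title pair, 1 for the second, ...
            int repeats = seen.merge(Arrays.asList(it.user, it.title), 1, Integer::sum) - 1;
            keyed.add(new long[] {i + (long) rows * repeats, i});
        }
        // Sort by demoted position; ties fall back to the original score order.
        keyed.sort((a, b) -> a[0] != b[0]
                ? Long.compare(a[0], b[0])
                : Long.compare(a[1], b[1]));
        List<String> out = new ArrayList<>();
        for (long[] k : keyed) {
            out.add(items.get((int) k[1]).id);
        }
        return out;
    }
}
```

Like the Query Elevation component mentioned in the post, this acts on the ordered result list rather than on the score itself, so it needs the full candidate list (or at least several pages of it) to be available at once.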
Re: Custom Sorting Based on Relevancy
Or you could collapse search results with SOLR-236. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: David Giffin da...@giffin.org To: solr-user@lucene.apache.org Sent: Monday, May 4, 2009 12:37:29 PM Subject: Custom Sorting Based on Relevancy Hi There, I'm working on a sorting issue. Our site currently sorts by creation date descending, so users list similar products multiple times to show up at the top of the results. When sorting based on score, we want to move items by the same user with the same title down search results. It would be best if the first item stayed in place based on score, and each additional item is moved out (rows * repeated user/title). Is custom sorting the best way? or is there something else I'm not thinking about. At the moment I'm looking at doing roughly the opposite of the Query Elevate Search component. Thanks, David
Re: Custom Sorting
I was successful with your hint and just need to solve another problem.

I have implemented a custom sort by following your advice to code a QParserPlugin and to create a custom comparator as described in your book, and it really works. But now I would also like to return those computed sort values by adding them to the SolrQueryResponse. I am calculating distances and would like to return the distance from the origin for each search result.

In your book you describe that this is possible by using this Lucene search function:

    TopFieldDocs docs = searcher.search(query, null, 3, sort);

and then reading the sort values:

    FieldDoc fieldDoc = (FieldDoc) docs.scoreDocs[0];
    return - fieldDoc.fields[0]

But how can I do this inside Solr? I am using the default QueryComponent and of course I don't want to make too many changes, because I don't understand the internals of Solr very well. It's quite big and complicated, and I didn't find many documents explaining it. Is there maybe a workaround? Can I just store all my sort values and add them to the SolrQueryResponse at the end?

Thanks, Markus

Erik Hatcher wrote:
> Markus,
>
> A couple of code pointers for you:
>
> * QueryComponent - this is where results are generated; it uses a SortSpec from the QParser.
> * QParser#getSort - by creating a custom QParser you'll be able to wire in your own custom sort.
>
> You can write your own QParserPlugin and QParser, configure them in solrconfig.xml, and you should be good to go. Subclassing existing classes, this should only be a handful of lines of code.
>
> Erik
>
> On Dec 16, 2008, at 3:54 AM, psyron wrote:
>> I have the same problem: I also need to plug in my customComparator, but as there is no explanation of the framework (how a RequestHandler works, what comes in, what comes out), it seems just impossible! Can someone explain where I have to add which code to get the same functionality as the StandardRequestHandler, plus custom sorting?
>>
>> Thanks, Markus
>>
>> hossman wrote:
>>> : Sort sort = new Sort(new SortField[]
>>> : { SortField.FIELD_SCORE, new SortField(customValue, SortField.FLOAT,
>>> : true) });
>>> : indexSearcher.search(q, sort)
>>>
>>> that appears to just be a sort on score with a secondary reversed float sort on whatever field name is in the variable customValue ... assuming the field name is FIELD, that's the same thing as...
>>>
>>> sort=score+desc,+FIELD+desc
>>>
>>> : Sort sort = new Sort(new SortField(customValue, customComparator))
>>> : indexSearcher.search(q, sort)
>>>
>>> this is using a custom SortComparatorSource -- code you (or someone else) has written which is not part of Lucene and which tells Lucene how to order the documents using whatever crazy logic it wants ... for obvious reasons Solr can't do that same logic (since it doesn't know what it is)
>>>
>>> although many things in Solr are easily customizable, just by writing a little factory and configuring it by class name, i'm afraid SortComparatorSources aren't one of them. You could write a custom RequestHandler which used your SortComparatorSource, or you could write a custom FieldType that used it anytime someone sorted on that field ... but those are the best options i can think of.
>>>
>>> -Hoss
Custom Sorting Algorithm
Is there an easy way to choose/create an alternate sorting algorithm? I'm frequently dealing with large result sets (a few million results) and I might be able to benefit from domain knowledge in my sort.
Re: Custom Sorting Algorithm
Hi,

You can use one of the existing function queries (if they fit your need) or write a custom function query to reorder the results of a query.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message -----
From: wojtekpia wojte...@hotmail.com
To: solr-user@lucene.apache.org
Sent: Wednesday, February 4, 2009 2:28:56 PM
Subject: Custom Sorting Algorithm

> Is there an easy way to choose/create an alternate sorting algorithm? I'm frequently dealing with large result sets (a few million results) and I might be able to benefit from domain knowledge in my sort.
Re: Custom Sorting Algorithm
That's not quite what I meant. I'm not looking for a custom comparator, I'm looking for a custom sorting algorithm. Is there a way to use quick sort or merge sort or... rather than the current algorithm? Also, what is the current algorithm?

Otis Gospodnetic wrote:
> You can use one of the existing function queries (if they fit your need) or write a custom function query to reorder the results of a query.
Re: Custom Sorting Algorithm
It would not be simple to use a new algorithm. The current implementation takes place at the Lucene level and uses a priority queue. When you ask for the top n results, every matching document is offered to a priority queue of size n, which keeps only the n best seen so far. The ordering in the priority queue is the sort. The non-Sort search method orders by relevance score; the Sort method orders by field, relevance, or doc id.

- Mark

wojtekpia wrote:
> That's not quite what I meant. I'm not looking for a custom comparator, I'm looking for a custom sorting algorithm. Is there a way to use quick sort or merge sort or... rather than the current algorithm? Also, what is the current algorithm?
>
> Otis Gospodnetic wrote:
>> You can use one of the existing function queries (if they fit your need) or write a custom function query to reorder the results of a query.
Re: Custom Sorting Algorithm
Ok, so maybe a better question is: should I bother trying to change the sorting algorithm? I'm concerned that with large data sets, sorting becomes a severe bottleneck (this is an assumption, I haven't profiled anything to verify). Does it become a severe bottleneck? Do you know if alternate sort algorithms have been tried during Lucene development?

markrmiller wrote:
> It would not be simple to use a new algorithm. The current implementation takes place at the Lucene level and uses a priority queue. When you ask for the top n results, every matching document is offered to a priority queue of size n, which keeps only the n best seen so far. The ordering in the priority queue is the sort. The non-Sort search method orders by relevance score; the Sort method orders by field, relevance, or doc id.
Re: Custom Sorting Algorithm
On Wed, Feb 4, 2009 at 4:45 PM, wojtekpia wojte...@hotmail.com wrote:
> Ok, so maybe a better question is: should I bother trying to change the sorting algorithm? I'm concerned that with large data sets, sorting becomes a severe bottleneck (this is an assumption, I haven't profiled anything to verify).

No... Lucene/Solr never sorts the complete result set. If you ask for the top 10 results, a priority queue (heap) of the current top 10 results is maintained... far more efficient and scalable than sorting all the hits at the end.

-Yonik
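The bounded-heap idea described above can be illustrated with the JDK's own PriorityQueue. This is a simplified sketch of the technique, not Lucene's actual collector code; the class and method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

/** Keeps only the n best scores while streaming over all hits. */
final class TopNCollector {

    /** Returns the n highest scores, best first, without sorting the full input. */
    static List<Float> topN(float[] scores, int n) {
        if (n <= 0) return new ArrayList<>();
        // Min-heap: the weakest of the current top n sits at the head.
        PriorityQueue<Float> heap = new PriorityQueue<>(n);
        for (float s : scores) {
            if (heap.size() < n) {
                heap.add(s);
            } else if (s > heap.peek()) {
                heap.poll();   // evict the weakest entry
                heap.add(s);   // and admit the better hit
            }
        }
        List<Float> out = new ArrayList<>(heap);
        out.sort(Collections.reverseOrder()); // only the n survivors are ever sorted
        return out;
    }
}
```

With m hits and a queue of size n, the work is roughly O(m log n) instead of the O(m log m) of a full sort, and memory stays at n entries no matter how many documents match.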
Re: Custom Sorting
Thanks Erik, that helped me a lot ... but there is still something I am not sure about.

I am using a custom sort, like the DistanceComparator example described in your book. I debugged the code, and it seems that the distances array is created for all indexed documents, not only for the search result. The compare function is then called only for the docs of the search result, right?

My problem now is that I wonder whether it is possible to compute only the distances for the documents in the search result. That should help performance if there are a lot of documents but the search result is usually very small, right?

Another point: of course it could also be interesting to compute the distances for all documents the first time a new start location is given, in case you want to run a lot of queries from the same location. But this would only make sense if all distances are cached together with the location value. I am not sure how things are actually handled in Lucene/Solr. What is cached, and at what time?

To compute distances only for the search result, I could:

- store the reader instance in a variable
- for every doc id seen in the compare function for the first time, compute the distance at that moment
- and then compare

Would this work? Or is there a better way to compute the distances only for the search result?

A lot of questions, I know. Thanks for the good book,

Markus

Erik Hatcher wrote:
> * QueryComponent - this is where results are generated; it uses a SortSpec from the QParser.
> * QParser#getSort - by creating a custom QParser you'll be able to wire in your own custom sort.
>
> You can write your own QParserPlugin and QParser, configure them in solrconfig.xml, and you should be good to go. Subclassing existing classes, this should only be a handful of lines of code.
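The "compute on first compare" steps listed above can be sketched without any Lucene classes. Everything here is hypothetical scaffolding: the distance lookup is passed in as a function, where a real comparator would read the stored coordinates for the doc id from the index reader.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntToDoubleFunction;

/** Computes a per-document distance lazily, the first time a doc id is compared. */
final class LazyDistanceCache {
    private final Map<Integer, Double> cache = new HashMap<>();
    private final IntToDoubleFunction computeDistance; // stand-in for the real index lookup
    private int computations = 0;

    LazyDistanceCache(IntToDoubleFunction computeDistance) {
        this.computeDistance = computeDistance;
    }

    /** Returns the cached distance, computing it on first use. */
    double distance(int docId) {
        Double cached = cache.get(docId);
        if (cached == null) {
            cached = computeDistance.applyAsDouble(docId);
            computations++;
            cache.put(docId, cached);
        }
        return cached;
    }

    /** Orders two documents by ascending distance. */
    int compare(int docA, int docB) {
        return Double.compare(distance(docA), distance(docB));
    }

    /** Number of distances actually computed so far. */
    int computations() {
        return computations;
    }
}
```

Only documents that take part in a comparison ever get a distance, so a small result set over a large index stays cheap. The cache is only valid for one origin point, so it would have to be created per query or keyed by location.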
Re: Custom Sorting
I have the same problem: I also need to plug in my customComparator, but as there is no explanation of the framework (how a RequestHandler works, what comes in, what comes out), it seems just impossible! Can someone explain where I have to add which code to get the same functionality as the StandardRequestHandler, plus custom sorting?

Thanks, Markus

hossman wrote:
> : Sort sort = new Sort(new SortField[]
> : { SortField.FIELD_SCORE, new SortField(customValue, SortField.FLOAT,
> : true) });
> : indexSearcher.search(q, sort)
>
> that appears to just be a sort on score with a secondary reversed float sort on whatever field name is in the variable customValue ... assuming the field name is FIELD, that's the same thing as...
>
> sort=score+desc,+FIELD+desc
>
> : Sort sort = new Sort(new SortField(customValue, customComparator))
> : indexSearcher.search(q, sort)
>
> this is using a custom SortComparatorSource -- code you (or someone else) has written which is not part of Lucene and which tells Lucene how to order the documents using whatever crazy logic it wants ... for obvious reasons Solr can't do that same logic (since it doesn't know what it is)
>
> although many things in Solr are easily customizable, just by writing a little factory and configuring it by class name, i'm afraid SortComparatorSources aren't one of them. You could write a custom RequestHandler which used your SortComparatorSource, or you could write a custom FieldType that used it anytime someone sorted on that field ... but those are the best options i can think of.
>
> -Hoss
Re: Custom Sorting
Markus,

A couple of code pointers for you:

* QueryComponent - this is where results are generated; it uses a SortSpec from the QParser.
* QParser#getSort - by creating a custom QParser you'll be able to wire in your own custom sort.

You can write your own QParserPlugin and QParser, configure them in solrconfig.xml, and you should be good to go. Subclassing existing classes, this should only be a handful of lines of code.

Erik

On Dec 16, 2008, at 3:54 AM, psyron wrote:
> I have the same problem: I also need to plug in my customComparator, but as there is no explanation of the framework (how a RequestHandler works, what comes in, what comes out), it seems just impossible! Can someone explain where I have to add which code to get the same functionality as the StandardRequestHandler, plus custom sorting?
>
> Thanks, Markus
>
> hossman wrote:
>> : Sort sort = new Sort(new SortField[]
>> : { SortField.FIELD_SCORE, new SortField(customValue, SortField.FLOAT,
>> : true) });
>> : indexSearcher.search(q, sort)
>>
>> that appears to just be a sort on score with a secondary reversed float sort on whatever field name is in the variable customValue ... assuming the field name is FIELD, that's the same thing as...
>>
>> sort=score+desc,+FIELD+desc
>>
>> : Sort sort = new Sort(new SortField(customValue, customComparator))
>> : indexSearcher.search(q, sort)
>>
>> this is using a custom SortComparatorSource -- code you (or someone else) has written which is not part of Lucene and which tells Lucene how to order the documents using whatever crazy logic it wants ... for obvious reasons Solr can't do that same logic (since it doesn't know what it is)
>>
>> although many things in Solr are easily customizable, just by writing a little factory and configuring it by class name, i'm afraid SortComparatorSources aren't one of them. You could write a custom RequestHandler which used your SortComparatorSource, or you could write a custom FieldType that used it anytime someone sorted on that field ... but those are the best options i can think of.
>>
>> -Hoss
Re: custom sorting
: If you went with the FunctionQuery approach for sorting by distance, would
: there be any way to use the output of the FunctionQuery to limit the
: documents to those within a certain radius? Or is it just for boosting
: documents, not for filtering?

FunctionQueries don't restrict the set of documents at all, so you would need to combine it with a separate query that limits the documents ... the simplest way would be as you say: by combining it with two range queries that would define a lat/lon bounding box

: Also, even if you're just using it for boosting, is there a way to avoid
: running the expensive function on all docs in the index? Could you somehow

that's the beauty of skipTo in query scoring ... BooleanQueries keep track of the next document each clause can match (in order by docid), and tell all of the other queries to skipTo that doc and not bother trying to score any doc ids below that.

-Hoss
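For the two range queries mentioned above, the box edges can be derived from the centre point and radius. A rough sketch, assuming a spherical Earth, degrees stored as plain numbers, and no handling of the poles or the 180-degree meridian; the field names passed to toFilter are whatever the schema uses.

```java
import java.util.Locale;

/** Derives a lat/lon bounding box that encloses a circle of the given radius. */
final class BoundingBox {
    static final double EARTH_RADIUS_KM = 6371.0; // mean Earth radius

    /** Returns {minLat, maxLat, minLon, maxLon} in degrees for a circle around (lat, lon). */
    static double[] around(double lat, double lon, double radiusKm) {
        double dLat = Math.toDegrees(radiusKm / EARTH_RADIUS_KM);
        // A degree of longitude shrinks with cos(latitude); this breaks down near the poles.
        double dLon = Math.toDegrees(radiusKm / (EARTH_RADIUS_KM * Math.cos(Math.toRadians(lat))));
        return new double[] {lat - dLat, lat + dLat, lon - dLon, lon + dLon};
    }

    /** Builds two range clauses in Lucene/Solr query syntax for the given box. */
    static String toFilter(String latField, String lonField, double[] box) {
        return String.format(Locale.ROOT, "%s:[%.4f TO %.4f] AND %s:[%.4f TO %.4f]",
                latField, box[0], box[1], lonField, box[2], box[3]);
    }
}
```

The box is a cheap pre-filter only: its corners lie outside the radius, so an exact distance check is still needed on the documents that pass if a true circle is required.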
Re: custom sorting
If you went with the FunctionQuery approach for sorting by distance, would there be any way to use the output of the FunctionQuery to limit the documents to those within a certain radius? Or is it just for boosting documents, not for filtering?

Also, even if you're just using it for boosting, is there a way to avoid running the expensive function on all docs in the index? Could you somehow nest bounding-box RangeQueries for latitude and longitude inside as ValueSources?

Thanks, Doug

hossman wrote:
> : leaks, etc.). (Speaking of which, could anyone with more Lucene/Solr
> : experience than I comment on the performance characteristics of the
> : locallucene implementation mentioned on the list recently? I've taken
> : a first look and it seems reasonable to me.)
>
> i can't speak for anyone else, but i haven't had a chance to drill into it yet.
>
> : Using a function query, as Yonik suggests above, is another approach.
> : But to get a true sort, you have to boost the original query to zero?
>
> or a very close approximation thereof (0.01 perhaps)
>
> keep in mind: a true distance sort, while easy to explain, may not be as useful as a sort by score where the distance is factored into the score ... there have been some threads about this on the java-user list in the past, and it's been discussed that a really relevant result 2 miles away is probably better than a mildly relevant result 1.5 miles away ... that's where a function query with well-chosen boosts might serve you better.
>
> : How does this impact the results returned by the original query? Will
> : the requirements (and boosts) of the original (now nested) query
> : remain intact, only sorted by the function? Also, is there any way to
>
> it should ... but i won't swear to that.
>
> : do this with the dismax handler?
>
> a strict sort on the value of a function? put the function in the bf param, don't bother with bq or pf params, and change your qf params to all have really small boosts.
>
> -Hoss
RE: custom sorting
I have been testing locallucene with our data for the last couple of days. One issue I faced when using geo sorting is that it seems to eat up all the memory, however much is available, and becomes progressively slower. Finally, after several requests (10 or so in my case), it throws a java.lang.OutOfMemoryError: Java heap space error. Is there a way to get around this?

-----Original Message-----
From: Jon Pierce [mailto:[EMAIL PROTECTED]]
Sent: 28 September 2007 15:48
To: solr-user@lucene.apache.org
Subject: Re: custom sorting

> Is the machinery in place to do this now (hook up a function query to be used in sorting)?
>
> I'm trying to figure out what's the best way to do a distance sort: custom comparator or function query.
>
> Using a custom comparator seems straightforward and reusable across both the standard and dismax handlers. But it also seems most likely to impact performance (or at least require the most work/knowledge to get right by minimizing calculations, caching, watching out for memory leaks, etc.). (Speaking of which, could anyone with more Lucene/Solr experience than I comment on the performance characteristics of the locallucene implementation mentioned on the list recently? I've taken a first look and it seems reasonable to me.)
>
> Using a function query, as Yonik suggests above, is another approach. But to get a true sort, you have to boost the original query to zero? How does this impact the results returned by the original query? Will the requirements (and boosts) of the original (now nested) query remain intact, only sorted by the function? Also, is there any way to do this with the dismax handler?
>
> Thanks,
> - Jon
>
> On 9/27/07, Yonik Seeley [EMAIL PROTECTED] wrote:
>> On 9/27/07, Erik Hatcher [EMAIL PROTECTED] wrote:
>>> Using something like this, how would the custom SortComparatorSource get a parameter from the request to use in sorting calculations?
>>
>> perhaps hook in via function query:
>>
>>     dist(10.4,20.2,geoloc)
>>
>> And either manipulate the score with that and sort by score,
>>
>>     q=+(foo bar)^0 dist(10.4,20.2,geoloc)
>>     sort=score asc
>>
>> or extend solr's sorting mechanisms to allow specifying a function to sort by.
>>
>>     sort=dist(10.4,20.2,geoloc) asc
>>
>> -Yonik

This email is confidential and may also be privileged. If you are not the intended recipient please notify us immediately by telephoning +44 (0)20 7452 5300 or email [EMAIL PROTECTED] You should not copy it or use it for any purpose nor disclose its contents to any other person. Touch Local cannot accept liability for statements made which are clearly the sender's own and are not made on behalf of the firm. Touch Local Limited Registered Number: 2885607 VAT Number: GB896112114 Cardinal Tower, 12 Farringdon Road, London EC1M 3NN +44 (0)20 7452 5300
Re: custom sorting
Hi all,

Regarding this issue, we tried using a custom request handler which in turn uses the CustomComparator. But this has a memory leak, and we are almost stuck at that point. As somebody mentioned, we are thinking of moving to a function query to achieve the same thing. Please let me know whether anybody has faced a similar issue, or whether we are doing something wrong.

The code that we added to the default handler is given below.

    if (myappRequestHandler.equalsIgnoreCase(requestHandler)) {
        sort = getSortCriteria(new SimpleSortComparatorSourceImpl());
    }

Thanks and Regards
Narayanan

On 9/28/07, Yonik Seeley [EMAIL PROTECTED] wrote:
> On 9/27/07, Erik Hatcher [EMAIL PROTECTED] wrote:
>> Using something like this, how would the custom SortComparatorSource get a parameter from the request to use in sorting calculations?
>
> perhaps hook in via function query:
>
>     dist(10.4,20.2,geoloc)
>
> And either manipulate the score with that and sort by score,
>
>     q=+(foo bar)^0 dist(10.4,20.2,geoloc)
>     sort=score asc
>
> or extend solr's sorting mechanisms to allow specifying a function to sort by.
>
>     sort=dist(10.4,20.2,geoloc) asc
>
> -Yonik
Re: custom sorting
Is the machinery in place to do this now (hook up a function query to be used in sorting)?

I'm trying to figure out what's the best way to do a distance sort: custom comparator or function query.

Using a custom comparator seems straightforward and reusable across both the standard and dismax handlers. But it also seems most likely to impact performance (or at least require the most work/knowledge to get right by minimizing calculations, caching, watching out for memory leaks, etc.). (Speaking of which, could anyone with more Lucene/Solr experience than I comment on the performance characteristics of the locallucene implementation mentioned on the list recently? I've taken a first look and it seems reasonable to me.)

Using a function query, as Yonik suggests above, is another approach. But to get a true sort, you have to boost the original query to zero? How does this impact the results returned by the original query? Will the requirements (and boosts) of the original (now nested) query remain intact, only sorted by the function? Also, is there any way to do this with the dismax handler?

Thanks,
- Jon

On 9/27/07, Yonik Seeley [EMAIL PROTECTED] wrote:
> On 9/27/07, Erik Hatcher [EMAIL PROTECTED] wrote:
>> Using something like this, how would the custom SortComparatorSource get a parameter from the request to use in sorting calculations?
>
> perhaps hook in via function query:
>
>     dist(10.4,20.2,geoloc)
>
> And either manipulate the score with that and sort by score,
>
>     q=+(foo bar)^0 dist(10.4,20.2,geoloc)
>     sort=score asc
>
> or extend solr's sorting mechanisms to allow specifying a function to sort by.
>
>     sort=dist(10.4,20.2,geoloc) asc
>
> -Yonik
Re: custom sorting
: Using something like this, how would the custom SortComparatorSource
: get a parameter from the request to use in sorting calculations?

in general: you wouldn't. you would have to specify all options as init params for the FieldType -- which makes it pretty horrible for distance calculations, and isn't something i considered when i posted that.

the only way i can think of that you can really solve the problem with a plugin at the moment (without some serious internal changes that yonik describes below) would be to use a dynamicField when you want geodistance sort, and encode the center lat/lon point in the field name, a la:

sort=geodist_-124.75_93.45

: or extend solr's sorting mechanisms to allow specifying a function to sort by.
:
: sort=dist(10.4,20.2,geoloc) asc

that would, in fact, kick ass. even if there is a better solution for the distance stuff, the idea of being able to specify a raw function as a sort would be pretty sick.

(NOTE: that's sick as in so good it's amazing ... since the last person i used that idiom with didn't understand and thought i meant bad)

-Hoss
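The field-name trick above needs some code on the plugin side to recover the centre point from the dynamic field name. A minimal sketch of just that parsing step, assuming the geodist_ prefix and underscore separators from the example; the class is hypothetical, and the two numbers are returned in the order they appear, since the post does not say which is latitude and which is longitude.

```java
/** Encodes and decodes a centre point in a dynamic field name such as geodist_-124.75_93.45. */
final class GeoFieldName {

    /** Builds the field name from two coordinates, in the order given. */
    static String format(double first, double second) {
        return "geodist_" + first + "_" + second;
    }

    /** Parses "geodist_<a>_<b>" into {a, b}; throws IllegalArgumentException on any other shape. */
    static double[] parse(String fieldName) {
        String[] parts = fieldName.split("_");
        if (parts.length != 3 || !parts[0].equals("geodist")) {
            throw new IllegalArgumentException("not a geodist field: " + fieldName);
        }
        // Double.parseDouble throws NumberFormatException, a subclass of IllegalArgumentException.
        return new double[] {Double.parseDouble(parts[1]), Double.parseDouble(parts[2])};
    }
}
```

A custom FieldType could call something like `parse` on the requested sort field name and hand the two numbers to its comparator. Every distinct centre point produces a new field name, which is part of why the approach is described above as a workaround.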
Re: custom sorting
: leaks, etc.). (Speaking of which, could anyone with more Lucene/Solr
: experience than I comment on the performance characteristics of the
: locallucene implementation mentioned on the list recently? I've taken
: a first look and it seems reasonable to me.)

i can't speak for anyone else, but i haven't had a chance to drill into it yet.

: Using a function query, as Yonik suggests above, is another approach.
: But to get a true sort, you have to boost the original query to zero?

or a very close approximation thereof (0.01 perhaps)

keep in mind: a true distance sort, while easy to explain, may not be as useful as a sort by score where the distance is factored into the score ... there have been some threads about this on the java-user list in the past, and it's been discussed that a really relevant result 2 miles away is probably better than a mildly relevant result 1.5 miles away ... that's where a function query with well-chosen boosts might serve you better.

: How does this impact the results returned by the original query? Will
: the requirements (and boosts) of the original (now nested) query
: remain intact, only sorted by the function? Also, is there any way to

it should ... but i won't swear to that.

: do this with the dismax handler?

a strict sort on the value of a function? put the function in the bf param, don't bother with bq or pf params, and change your qf params to all have really small boosts.

-Hoss
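The trade-off between relevance and distance described above comes down to how large the distance boost is. A toy illustration with made-up numbers: the reciprocal decay and the relevance values are assumptions chosen for the example, not how Solr computes scores.

```java
/** Toy model of a score with distance factored in via an additive boost. */
final class BlendedScore {

    /** Relevance plus a distance bonus that decays as 1 / (1 + distance). */
    static double blend(double relevance, double distanceMiles, double distanceBoost) {
        return relevance + distanceBoost * (1.0 / (1.0 + distanceMiles));
    }
}
```

With a boost of 1.0, a result with relevance 0.9 at 2 miles scores about 1.23 and beats a result with relevance 0.4 at 1.5 miles, which scores 0.8. Raise the boost to 10.0 and the order flips (about 4.23 versus 4.4), which is the behaviour of a near-strict distance sort.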
Re: custom sorting
On Sep 27, 2007, at 2:50 PM, Chris Hostetter wrote:
> to answer the broader question of using customized Lucene SortComparatorSource objects in solr -- it is in fact possible.
>
> In Solr, all decisions about how to sort are driven by FieldTypes. You can subclass any of the FieldTypes that come with Solr and override just the getSortField method to use whatever sort logic you want and then use your new FieldType as you would any other plugin...
>
> http://wiki.apache.org/solr/SolrPlugins
>
> In the case where you have a custom SortComparatorSource that is not field specific (or uses data from more than one field) you would need to make your field type smart enough to let you configure (via the fieldType declaration in the schema) which fields (if any) to get its data from, and then create a marker field of that type, which you don't use to index or store any data, but you use to indicate when to trigger your custom sort logic, ie...
>
>     <fieldType name="distance" class="solr.YourField" latFieldName="latitude" lonFieldName="longitude" stored="false" indexed="false" />
>     ...
>     <field name="latitude" type="sint" indexed="true" stored="true" />
>     <field name="longitude" type="sint" indexed="true" stored="true" />
>     <field name="distance" type="distance" />
>
> ...and then use sort=distance+asc in your query

Using something like this, how would the custom SortComparatorSource get a parameter from the request to use in sorting calculations?

I haven't looked under the covers of the local-solr stuff that flew by earlier, but it looks quite well done. I think I can speak for many who would love to have geo field types / sorting capability built into Solr.

Erik
Re: custom sorting
On 9/27/07, Erik Hatcher [EMAIL PROTECTED] wrote:
> Using something like this, how would the custom SortComparatorSource get a parameter from the request to use in sorting calculations?

perhaps hook in via function query:

    dist(10.4,20.2,geoloc)

And either manipulate the score with that and sort by score,

    q=+(foo bar)^0 dist(10.4,20.2,geoloc)
    sort=score asc

or extend solr's sorting mechanisms to allow specifying a function to sort by.

    sort=dist(10.4,20.2,geoloc) asc

-Yonik
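The dist(...) function in the message above is a proposal, and the thread does not say what formula it would use. One common choice for a distance between two lat/lon points is the haversine great-circle distance, sketched here as an assumption about what such a function might compute per document.

```java
/** Great-circle distance between two points given in degrees (haversine formula). */
final class Haversine {
    static final double EARTH_RADIUS_KM = 6371.0; // mean Earth radius

    /** Returns the distance in kilometres between (lat1, lon1) and (lat2, lon2). */
    static double distanceKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        // Clamp to 1.0 so rounding error cannot push asin outside its domain.
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }
}
```

Sorting ascending on a value like this puts the nearest documents first, which is why the examples above pair the function with `sort=score asc` or `asc` on the function itself.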
custom sorting
Hi Guys,

This question has been asked before, but I was unable to find an answer that's good for me, so I hope you can help again.

I am working on a website where we need to sort the results by distance from the location entered by the user. I have indexed the lat and long info for each record in Solr, and I can also get the lat and long of the location input by the user.

Previously we were using Lucene to do this: by using the SortComparatorSource we could sort the returned documents by distance nicely. We are now switching over to Solr because of the features it provides; however, I am not able to see a way to do this in Solr.

If someone can point me in the right direction I would be very grateful!

Thanks in advance,
Sandeep
Re: custom sorting
On 26-Sep-07, at 5:14 AM, Sandeep Shetty wrote:
> Hi Guys,
>
> This question has been asked before, but I was unable to find an answer that's good for me, so I hope you can help again.
>
> I am working on a website where we need to sort the results by distance from the location entered by the user. I have indexed the lat and long info for each record in Solr, and I can also get the lat and long of the location input by the user.
>
> Previously we were using Lucene to do this: by using the SortComparatorSource we could sort the returned documents by distance nicely. We are now switching over to Solr because of the features it provides; however, I am not able to see a way to do this in Solr.
>
> If someone can point me in the right direction I would be very grateful!
>
> Thanks in advance,
> Sandeep
>
> This email is confidential and may also be privileged. If you are not the intended recipient please notify us immediately by telephoning +44 (0)20 7452 5300 or email [EMAIL PROTECTED] You should not copy it or use it for any purpose nor disclose its contents to any other person. Touch Local cannot accept liability for statements made which are clearly the sender's own and are not made on behalf of the firm.

Sorry, I'm afraid the above email is already irrevocably publicly archived.

-Mike
Re: Custom Sorting
: Sort sort = new Sort(new SortField[]
: { SortField.FIELD_SCORE, new SortField(customValue, SortField.FLOAT,
: true) });
: indexSearcher.search(q, sort)

that appears to just be a sort on score with a secondary reversed float sort on whatever field name is in the variable customValue ... assuming the field name is FIELD, that's the same thing as...

sort=score+desc,+FIELD+desc

: Sort sort = new Sort(new SortField(customValue, customComparator))
: indexSearcher.search(q, sort)

this is using a custom SortComparatorSource -- code you (or someone else) has written which is not part of Lucene and which tells Lucene how to order the documents using whatever crazy logic it wants ... for obvious reasons Solr can't do that same logic (since it doesn't know what it is)

although many things in Solr are easily customizable, just by writing a little factory and configuring it by class name, i'm afraid SortComparatorSources aren't one of them. You could write a custom RequestHandler which used your SortComparatorSource, or you could write a custom FieldType that used it anytime someone sorted on that field ... but those are the best options i can think of.

-Hoss
Custom Sorting
Hi All,

We currently have an application which uses Lucene for text search, and we are in the process of migrating to Solr. In our Lucene code we set up the sort criteria like this:

    Sort sort = new Sort(new SortField[]
        { SortField.FIELD_SCORE, new SortField(customValue, SortField.FLOAT, true) });
    indexSearcher.search(q, sort)

and another code snippet is:

    Sort sort = new Sort(new SortField(customValue, customComparator))
    indexSearcher.search(q, sort)

In these two scenarios, how should I configure the schema to have my own sort definition for the query? I couldn't find any documentation which describes this kind of query-time sort definition. Can any of you please shed some light on this?

Thanks and Regards
Palasseri
custom sorting for multivalued field
Hi, We are trying to set up Solr to search documents with multiple keywords, which we have implemented as a multivalued field. Is it possible to assign a custom sorting value for each of the values in the multivalued field? So that the document gets sorted differently, depending on the matched value in the multivalued field. The other approach would be to store each document/keyword pair as a separate document with the sorting value as an explicit field. Is it possible to filter the results on the Solr end (based on the relevancy of the matched keyword), so that the same original document doesn't appear in the result set twice? Any ideas or suggestions would be greatly appreciated. Thanks!
custom sorting for multivalued field
Hi, We are trying to set up Solr to search documents with multiple keywords, which we have implemented as a multivalued field. Is it possible to assign a custom sorting value for each of the values in the multivalued field? So that the document gets sorted differently, depending on the matched value in the multivalued field. The other approach would be to store each document/keyword pair as a separate document with the sorting value as an explicit field. Is it possible to filter the results on the Solr end, so that the same original document doesn't appear in the result set twice? Any ideas or suggestions would be greatly appreciated. Thanks!