Savvas-Andreas Moysidis wrote:
> In my understanding sorting on a field for which analysis has yielded
> multiple terms just doesn't make sense..
> If you have document#1 with a field A which has the terms Epsilon, Alpha,
> and document#2 with field A which has the terms Beta, Delta and request
> an ascending sort on field A what order should you get and why?

In the couple of use cases where I've been asked for it, either...
  (a) returning each document only the first time it appeared:
      document 1 [alpha] followed by document 2 [beta]
  (b) or returning them with duplicates:
      doc1 [alpha], doc2 [beta], doc2 [delta], doc1 [epsilon]
... would have been an OK user experience.

The use case "show me documents relevant to things close to
a location" seems like a pretty broad use case that any
geospatial-aware search engine would like to handle; and I
imagine in many cases a single document might refer to
multiple addresses/locations.

In another case, I was asked if the application could
"sort the incidents by the age of rape victims".  And while
most incidents involved a single victim, some had 2 or more.
The idea wasn't to impose some total ordering but rather to
make it quick to find documents involving younger people.

I realize I can work around that one by adding a "min-age"
column.  For the spatial one, where different users might pick
different center points, I can't think of any good workaround
beyond Jonathan's idea of facets -- perhaps overlaying some map
grid on the data and using facets for that.

> On 27 October 2010 17:56, Jonathan Rochkind <rochk...@jhu.edu> wrote:
>
>> I would suggest that trying to sort on a multi-token/multi-value value in
>> the first place ought to always raise an exception. Are there any reasons
>> why you'd EVER want to do this, with the way it currently works? Letting
>> people do this and only _sometimes_ raise an exception, but never do
>> anything that's actually reasonable, just adds confusion for newbies.
>>
>> Alternately, perhaps sorting on a multi-valued or tokenized field ought to
>> sort only on the FIRST token found in the first value of , but not sure how
>> feasible that is to code.
>>
>> Ron, for your particular use case -- lucene sorting just can't really do
>> that, I'm not sure there's a WAY to code sorting that works on multi-valued
>> fields.
>> A given lucene/solr search results set only includes each document
>> ONCE. So where would that document appear in your sort on a multi-valued
>> field? A different solution is required. I too sometimes have similar use
>> cases, and my best ideas about how to solve them involve using faceting ---
>> you can facet on a multi-valued field, and you can sort facets--but you can
>> only sort facets by "index order", a strict byte-by-byte sort. Which
>> doesn't always work for me either. I haven't quite figured out the solution
>> to this sort of problem.
>>
>>
>> Ron Mayer wrote:
>>
>>> Lance Norskog wrote:
>>>
>>>
>>>> You may not sort on a tokenized field. You may not sort on a multiValued
>>>> field. You can only have one term in a field.
>>>>
>>>> If there are more search terms than documents, A) sorting doesn't mean
>>>> anything and B) Lucene will throw an exception.
>>>>
>>>>
>>>
>>> Is that considered a feature, or an annoyance/bug?
>>>
>>> One of the things I'm using Solr for is to store a whole bunch of
>>> documents about crime events that contain information roughly
>>> like this:
>>>
>>>   "the gang member ran the red light at 100 main st, and
>>>    continued driving to 500 main street where he hit a car.  He
>>>    fled his car and ran to 789 second avenue where he hijacked
>>>    another car and drove to his house at 654 someother st"
>>>
>>> If I do a search for the name of that gang member's gang,
>>> I'd really really like to be able to sort my documents by
>>> distance from a location -- for example to quickly find
>>> any documents referring to gang activity in a neighborhood.
>>>
>>> And I'd really like to see this document near the top
>>> of my search results whether the user chose 100 main, 500 main,
>>> 790 second, or 650 someother street as his center point for
>>> sorting his search.
>>>
>>>
>>> If I wanted that so badly I'd be willing to try coding it
>>> so you _could_ sort on multiValued fields, would people want
>>> that feature?
>>> If so - would someone know off the top of
>>> their head where I should get started looking in the code?
>>>
>>> Or is it considered a feature that solr currently disallows it?
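
P.S. To make orderings (a) and (b) from the reply at the top concrete, here is a minimal sketch in plain Python. It is not Solr or Lucene code; the `docs` dict and both function names are made up for illustration, and it assumes every document has at least one value in the field. It uses the two documents from Savvas's example.

```python
# Hypothetical in-memory stand-in for an index: doc id -> values of field A.
docs = {
    "doc1": ["epsilon", "alpha"],
    "doc2": ["beta", "delta"],
}


def sort_first_appearance(docs):
    """Option (a): each document once, ordered by its smallest value."""
    # Assumes every document has at least one value; min() fails on an empty list.
    return sorted(docs, key=lambda doc_id: min(docs[doc_id]))


def sort_with_duplicates(docs):
    """Option (b): one entry per (value, document) pair, ordered by value."""
    pairs = [(value, doc_id) for doc_id, values in docs.items() for value in values]
    return [(doc_id, value) for value, doc_id in sorted(pairs)]


print(sort_first_appearance(docs))
# ['doc1', 'doc2']
print(sort_with_duplicates(docs))
# [('doc1', 'alpha'), ('doc2', 'beta'), ('doc2', 'delta'), ('doc1', 'epsilon')]
```

Option (a) is just a sort on a precomputed per-document minimum, which is the same trick as the "min-age" column. It does not help with the spatial case, where the sort key depends on a center point chosen at query time.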