First step is to feed a Set with "collection" Second step is to sort it.
With a sortedSet, you can do that, isnt'it? M. Antoine Baudoux a écrit : > Could-you be more precise? I dont understand what you mean. > > > > On 15 Jun 2007, at 17:20, Mathieu Lecarme wrote: > >> Your request seems to be a two steps query. >> First step, you select image, and then collection >> Second step, you sort collection. >> >> BitVector can help you? >> >> M. >> Antoine Baudoux a écrit : >>> Hi, >>> >>> I'm developping an image database. Each lucene document >>> representing an image contains (among other fields ): >>> >>> - a date field >>> - a collection field containing the ID of the collection the image >>> belongs to. >>> >>> I want to be able to give a score to each collection. Collections >>> with a higher score appear first in the results. I want to avoid >>> re-indexing all the documents each time i change my collection scores. >>> >>> For example on day 1 I decide to give collection #1 a 5 score and >>> collection #3 a 10 score --> images belonging to collection #3 appear >>> first in search results. >>> One day 2 i give collection #3 a 2 score --> images belonging to >>> collection #1 appear first in search results. >>> >>> I have read the lucene docs, and from what i understand there are >>> many ways to achieve what I want : >>> >>> >>> - I can use a Very big Boolean query (OR query in fact) containing >>> one TermQuery per collection ID, setting the correct boost factor for >>> each termquery. The problem with this is that i have 300 collections, >>> so i have a boolean query with 300 terms that i append to each query i >>> make. I am afraid that it will be slow. >>> >>> - I can use a ValueSourceQuery, where for each document i compute >>> a custom score based on the value of the collection field. Will it be >>> faster than the first solution? >>> >>> - I can do advanced things such as writing a custom HitCollector, >>> or a custom Query. >>> >>> - I can add another field to each document, containing a computed >>> custom score, then i could sort on that field. But i want to avoid >>> this solution at all costs, since it would mean re-indexing all the >>> documents each time the collection scores change. >>> >>> What solution do you suggest? Is there another solution that i >>> didnt mention? >>> >>> More recent documents should also come first : In fact the final >>> sorting should be a ponderated sum between the collection score of an >>> image and the date of an image : most recent images from the >>> best-scored collections come first, then most recent from less-scrored >>> collections, then less recent from best scored, and so on. I would >>> also like to be able to adjust the balance between date/collection >>> score. >>> >>> What solution do you suggest? >>> >>> >>> I would also like to implement random-sorting. My solution is : i >>> create 12 new fields R1 -> R12 for each document, each containing a >>> random number between 1 and 12. To get a random sort, i sort each day >>> with a different combination of R1 .. R12. For example : >>> >>> Day 1 : i sort by R1 then R4 then R5.. >>> Day 2 : i sort by R10 then R9 then R2.... >>> etc... >>> >>> Is it a good solution? Is there another way to do it? >>> >>> >>> Very big thx in advance for your answers. >>> >>> Antoine >>> >>> --------------------------------------------------------------------- >>> To unsubscribe, e-mail: [EMAIL PROTECTED] >>> For additional commands, e-mail: [EMAIL PROTECTED] >>> >>> >> >> >> --------------------------------------------------------------------- >> To unsubscribe, e-mail: [EMAIL PROTECTED] >> For additional commands, e-mail: [EMAIL PROTECTED] >> >> > > --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]