Re: Grouping results by choosen field

karl wettin Fri, 17 Mar 2006 12:00:05 -0800


On 17 Mar 2006, at 16:36, Java Programmer wrote:

My problem concerns result grouping. The best example is Google search, where
results are sorted by relevance and also grouped by domain (they have a little
indent/margin). In my project I want to get similar functionality without
huge CPU consumption.

Can you share any helpful hints?

I do that. Basically, I marshal the hit documents into Java instances of Comparable. Then I just call plain old Collections.sort() on the documents in their object representation. Each document may contain classification weights. A weight points at a classification value, and the classification value points at a clazz.


UML class diagram:

[Persistent] <#>--- {0..*} -> [ClassificationWeight +compareTo() +weight:float]
    ---- {1} -> [Classification +compareTo() +value:String]
    --- {1} -> [Clazz +fieldName:String +compareTo()]

A clazz in this instance is the "group of domain names". The classification is the actual domain name. You can skip the weight if you only use domain names; I guess all weights would be 1.

The weight's comparison delegates to the classification, which in turn delegates to the clazz. If two weights are equal, I use the Lucene score.
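A minimal sketch of that comparison chain, under some assumptions: every class, field, and constructor below is hypothetical and only the roles come from the diagram; each document carries a single weight here (the diagram allows 0..*); heavier weights and higher Lucene scores sort first.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical classes; only their roles are taken from the diagram.
final class Clazz implements Comparable<Clazz> {
    final String fieldName;                        // e.g. "domain"
    Clazz(String fieldName) { this.fieldName = fieldName; }
    public int compareTo(Clazz o) { return fieldName.compareTo(o.fieldName); }
}

final class Classification implements Comparable<Classification> {
    final Clazz clazz;
    final String value;                            // e.g. the actual domain name
    Classification(Clazz clazz, String value) { this.clazz = clazz; this.value = value; }
    public int compareTo(Classification o) {
        int c = clazz.compareTo(o.clazz);          // clazz decides first
        return c != 0 ? c : value.compareTo(o.value);
    }
}

final class ClassificationWeight implements Comparable<ClassificationWeight> {
    final Classification classification;
    final float weight;
    ClassificationWeight(Classification classification, float weight) {
        this.classification = classification;
        this.weight = weight;
    }
    public int compareTo(ClassificationWeight o) {
        int c = classification.compareTo(o.classification);
        return c != 0 ? c : Float.compare(o.weight, weight);   // assumed: heavier first
    }
}

final class HitDocument implements Comparable<HitDocument> {
    final String id;
    final ClassificationWeight weight;             // simplified to one weight per document
    final float luceneScore;
    HitDocument(String id, ClassificationWeight weight, float luceneScore) {
        this.id = id;
        this.weight = weight;
        this.luceneScore = luceneScore;
    }
    public int compareTo(HitDocument o) {
        int c = weight.compareTo(o.weight);
        // Equal weights: fall back to the Lucene score, higher first.
        return c != 0 ? c : Float.compare(o.luceneScore, luceneScore);
    }
}

final class GroupingSketch {
    // Sorts a copy of the hits and returns their ids in grouped order.
    static List<String> sortedIds(List<HitDocument> hits) {
        List<HitDocument> copy = new ArrayList<>(hits);
        Collections.sort(copy);
        List<String> ids = new ArrayList<>();
        for (HitDocument h : copy) ids.add(h.id);
        return ids;
    }
}
```

With this ordering, hits that share a classification end up adjacent, which is what makes the indented per-domain display cheap to render.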

You might want to do several passes, or use a nested ordering, to get the group's top score as the natural order of each group cluster.
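One way the "several passes" idea could look, as a sketch only: a first pass records the best score per group, and a second pass sorts by that group top score, then by group, then by each hit's own score. ScoredHit and TopScoreGrouping are made-up names, and the group is reduced to a plain string for brevity.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical flat view of a hit: an id, its group (e.g. domain) and its score.
final class ScoredHit {
    final String id;
    final String group;
    final float score;
    ScoredHit(String id, String group, float score) {
        this.id = id;
        this.group = group;
        this.score = score;
    }
}

final class TopScoreGrouping {
    static List<ScoredHit> order(List<ScoredHit> hits) {
        // Pass 1: find the best score within each group.
        Map<String, Float> best = new HashMap<>();
        for (ScoredHit h : hits) {
            best.merge(h.group, h.score, Float::max);
        }
        // Pass 2: nested ordering.
        List<ScoredHit> sorted = new ArrayList<>(hits);
        sorted.sort((x, y) -> {
            // Groups with the higher top score come first.
            int c = Float.compare(best.get(y.group), best.get(x.group));
            if (c != 0) return c;
            // Keep groups with identical top scores from interleaving.
            c = x.group.compareTo(y.group);
            if (c != 0) return c;
            // Inside a group, higher score first.
            return Float.compare(y.score, x.score);
        });
        return sorted;
    }
}
```

The extra pass costs one linear scan over the hits, so the sort still dominates the running time.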

It handles 3000+ queries per minute on 120000+ documents in RAM, 24/7, on a dual core at 40% load. And I even use Lucene for persistence, even though I should not.

