Hey all,

I'm wondering if anyone has an idea for a solution to the following Lucene problem. We'd like to group search results into buckets, but I can't find an efficient way to do so short of modifying the source of the IndexSearcher class.

A bit of background: we use Lucene to search over messages in our forum application (Jive). Each message in the forum is a document with, among others, the following fields:

    subject, body, threadID, forumID

Multiple messages belong to the same thread. The problem is that search results are getting overwhelmed with multiple messages from the same thread. This isn't ideal, since multiple messages from a thread are displayed on the same page anyway (as an example, here's a thread page from a Jive forum: http://forums.java.sun.com/thread.jsp?forum=45&thread=77362 ). A much better solution would be to show only one result per thread in the list of hits.

I've played around with implementing this as a filter, but that approach doesn't seem feasible, since we can't know statically how the buckets should be defined. The following wouldn't work as a filter:

    // Loop through all documents in index and set buckets appropriately.
    for (int i=0; i<numDocs; i++) {
        Document doc = reader.document(i);
        String fieldValue = doc.get("threadID");
        if (fieldValue != null) {
            if (!buckets.containsKey(fieldValue)) {
                // First message seen for this thread: remember the thread ID.
                buckets.put(fieldValue, fieldValue);
            } else {
                // Thread already seen: mark this document's bit.
                bits.set(i);
            }
        }
    }

since we can't know in advance which messages in a thread will actually match the search query, and we'd ideally like the highest-rated message in the thread to be the one that comes through as a hit.

Instead, we need to create the buckets dynamically as the search results start coming through in the searcher. The algorithm would be:

 * Make an empty map.
 * As documents come through as hits, record their threadIDs in the map. If the map already holds a document with the same threadID, discard the new document if it has a lower hitValue; otherwise replace the stored one.
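The two steps above can be sketched roughly as follows. This is only a minimal, Lucene-independent sketch: the names ThreadBucketCollector, Hit, collect and bestHits are hypothetical, and it assumes the caller can supply each hit's threadID and score as the hits arrive (for example from inside a hit collector callback).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: keeps one hit per thread, chosen by score. */
class ThreadBucketCollector {

    /** One retained hit: document number, thread ID, and score. */
    static final class Hit {
        final int doc;
        final String threadID;
        final float score;

        Hit(int doc, String threadID, float score) {
            this.doc = doc;
            this.threadID = threadID;
            this.score = score;
        }
    }

    // threadID -> hit currently kept for that thread
    private final Map<String, Hit> buckets = new HashMap<>();

    /** Call once per matching document; assumes threadID is known to the caller. */
    void collect(int doc, String threadID, float score) {
        Hit current = buckets.get(threadID);
        // Discard the new hit only if it scores lower than the stored one;
        // otherwise (unseen thread, equal or higher score) it replaces it.
        if (current == null || score >= current.score) {
            buckets.put(threadID, new Hit(doc, threadID, score));
        }
    }

    /** Returns the retained hits, one per thread, highest score first. */
    List<Hit> bestHits() {
        List<Hit> result = new ArrayList<>(buckets.values());
        result.sort((a, b) -> Float.compare(b.score, a.score));
        return result;
    }
}
```

The map holds at most one entry per thread, so memory grows with the number of distinct threads that match rather than with the total number of hits. The sketch does not address how to look up the threadID for a document number cheaply, which is probably the expensive part in practice.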
So, I think what I'm essentially asking for is a more complicated hit collector, or a dynamic filter for the searcher. Does anybody know of a better solution than modifying the low-level source?

Thanks,
Matt

_______________________________________________
Lucene-users mailing list
[EMAIL PROTECTED]
http://lists.sourceforge.net/lists/listinfo/lucene-users