This has already been implemented in open source, though as far as I
know not with Lucene:
http://www.cs.put.poznan.pl/dweiss/carrot/
and
http://carrot2.sourceforge.net/
Dawid Weiss is an academic at Poznan University of Technology, Poland.
He and others have implemented a servlet-based web app built from
pipelined components that communicate over HTTP and implement a couple
of clustering algorithms.
Clustering, of course, can go way beyond search result presentation, and
there are some very suggestive examples at
http://www.sics.se/humle/socialcomputing/
where the encore project (Martin Svennson) is based on orthogonal
transformations of a large sparse matrix (a possible method for matrix
dimension reduction). I think it would be interesting to hook a
recommender system into Lucene, so that clustering would take place on
the basis of a user profile, which could be built up automatically by
accumulating clicks and comparing them to those of other visitors, with
some intelligent weighting of the node inputs.
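To make the click-profile idea a bit more concrete, here is a minimal
sketch in Python. It is not the encore project's matrix method and not
tied to Lucene; it just compares sparse click-count profiles with cosine
similarity and suggests documents that similar visitors clicked. The
names (cosine, recommend, the visitor and document ids) are made up for
illustration.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse click-count profiles."""
    dot = sum(count * b[doc] for doc, count in a.items() if doc in b)
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(visitor_id, profiles, top_n=3):
    """Suggest documents the visitor has not clicked yet.

    Each candidate document is scored by the clicks of other visitors,
    weighted by how similar those visitors are to this one.
    """
    target = profiles[visitor_id]
    scores = Counter()
    for other_id, other in profiles.items():
        if other_id == visitor_id:
            continue
        similarity = cosine(target, other)
        if similarity <= 0:
            continue  # nothing in common, so no evidence either way
        for doc, count in other.items():
            if doc not in target:
                scores[doc] += similarity * count
    return [doc for doc, _ in scores.most_common(top_n)]


# Hypothetical click logs: visitor id -> clicks per document id.
profiles = {
    "alice": Counter({"doc1": 3, "doc2": 1}),
    "bob": Counter({"doc1": 2, "doc3": 4}),
    "carol": Counter({"doc4": 5}),
}

print(recommend("alice", profiles))
```

Here only bob shares a clicked document with alice, so only his unseen
document is suggested. A real system would need smarter weighting (for
example, discounting very popular documents), which is where the
"intelligent weighting" would come in.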
This calls into question what a search really is: does it have to be
instigated by the user, or might their context and history suggest
enough to pull in additional material? This would sit on top of snippets
and would also influence which snippets are returned as well as their
presentation.
Cooler still would be to recognise the user without a login. This might
be implemented with cookies, but the aim would be to deal with the user
in terms of types of interests, as a series of faceted profiles, so that
portals could become fluidly dynamic. It sounds far-fetched, but I
actually think it is just round the corner.
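As a rough illustration of the cookie idea, the sketch below (Python
standard library only) assigns an anonymous visitor id via a cookie and
keeps per-facet interest counts for that id. The cookie name, the
in-memory store and the function names are all assumptions made for the
example, not part of any existing system.

```python
import uuid
from collections import Counter, defaultdict
from http.cookies import SimpleCookie

COOKIE_NAME = "visitor_id"  # hypothetical cookie name

# visitor id -> facet name -> counts of terms the visitor showed interest in
profiles = defaultdict(lambda: defaultdict(Counter))


def identify(cookie_header):
    """Return (visitor_id, set_cookie_value).

    set_cookie_value is None when the visitor already carries our cookie,
    otherwise it is the value to send back in a Set-Cookie header.
    """
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    if COOKIE_NAME in cookie:
        return cookie[COOKIE_NAME].value, None
    visitor_id = uuid.uuid4().hex
    out = SimpleCookie()
    out[COOKIE_NAME] = visitor_id
    out[COOKIE_NAME]["path"] = "/"
    return visitor_id, out[COOKIE_NAME].OutputString()


def record_click(visitor_id, facet, term):
    """Accumulate one click under a facet of the visitor's profile."""
    profiles[visitor_id][facet][term] += 1


def top_interests(visitor_id, facet, n=3):
    """Most frequent terms for one facet of a visitor's profile."""
    return [term for term, _ in profiles[visitor_id][facet].most_common(n)]
```

A first request without a cookie gets a fresh id plus a Set-Cookie
value; later requests that send the cookie back map to the same faceted
profile, with no login involved.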
Let me know if this is of interest.

Adam

> -----Original Message-----
> From: integer [daniel prawdzik] [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, January 26, 2005 5:17 PM
> To: lucene-dev@jakarta.apache.org
> Subject: -> Grouping Search Results by Clustering Snippets:
>
> Grouping Search Results by Clustering Snippets:
>
> Search engines typically present their results as long unsorted
> lists. Finding the page you’re looking for is often time-consuming
> and unsatisfying.
> Showing the results in groups of similar topics is a much more
> suitable way to give a user a quick overview of the results.
> This can be done with a technique called cluster analysis. I’m
> currently working on my diploma (master’s) thesis on this topic. In my
> view, it’s too nice to end up in the archive, so I want to
> implement this feature in open-source software. The coding of this
> program has already gone pretty far; I’ve run some tests and the
> results are impressive and might still get better [you can see some
> results at http://www.trist.de/CV/Text-Mining/ -> sorry, only in
> German]
>
> To make a long story short:
> I’m wondering whether this would be an attractive feature for the
> Lucene community?
>
> regards,
> integer
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]


