Re: Faceting on text fields

Jeffrey Tiong Thu, 11 Jun 2009 22:46:57 -0700

Thanks Otis!

Do you know under what circumstances, or for which applications, we should
cluster the whole corpus of documents rather than just the search results?


Jeffrey

On Fri, Jun 12, 2009 at 1:39 PM, Otis Gospodnetic <
otis_gospodne...@yahoo.com> wrote:

>
> Jeffrey,
>
> Are you looking to cluster a whole corpus of documents or just the search
> results?  If it's the latter, use Carrot2.  If it's the former, look at
> Mahout.  Clustering the top 1M matching documents doesn't really make sense.
> Usually the top 100-200 is sufficient.
>
>  Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
>
>
> ----- Original Message ----
> > From: Jeffrey Tiong <jeffrey.ti...@gmail.com>
> > To: solr-user@lucene.apache.org
> > Sent: Friday, June 12, 2009 12:44:55 AM
> > Subject: Re: Faceting on text fields
> >
> > Hi all,
> >
> > We are thinking of using Carrot2 clustering too, but we saw that Carrot2
> > may only be able to cluster up to 1000 search snippets. Does anyone know
> > how we can cluster many more snippets than that (maybe in the million
> > range)?
> >
> > And what is the difference between Mahout and Carrot2?
> >
> > Thanks!
> >
> > Jeffrey
> >
> > On Thu, Jun 11, 2009 at 9:47 PM, Michael Ludwig wrote:
> >
> > > Yao Ge schrieb:
> > >
> > >> BTW, Carrot2 has a very impressive Clustering Workbench (based on
> > >> Eclipse) that has built-in integration with Solr. If you have a Solr
> > >> service running, it is just a matter of pointing the workbench at it.
> > >> The clustering results and visualization are amazing.
> > >> (http://project.carrot2.org/download.html).
> > >>
> > >
> > > A new world opens up for me ...
> > >
> > > Thanks for pointing out how cool this is!
> > >
> > > Hint for other newcomers: Open the View Menu to configure the details of
> > > how you perform your search, e.g. your Solr URL in case it differs from
> > > the default, or your "summary field", which is what gets used to analyze
> > > the data in order to determine clusters, if I understand correctly.
> > >
> > > Michael Ludwig
> > >
>
>
