Hi Toke,

Thank you for the link.

I'm using Solr 5.2.1, but I think the Carrot2 bundled with it is a slightly
older version than the one I'm running: carrot2-workbench-3.10.3 is the
latest and was released only recently. I've changed settings such as
fragSize and desiredClusterCountBase to be the same on both sides, and I'm
now able to get very similar cluster results.

I've now increased carrot.fragSize to 75 and carrot.summarySnippets to 2,
and set carrot.produceSummary to true. With these settings, I mostly get
the cluster results back within 2 to 3 seconds when I set rows=200. I'm
still checking whether the cluster labels are OK, but in theory, do you
think these are suitable settings for improving the clustering results and
the performance at the same time?
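
For reference, here is a minimal sketch of the request these settings
correspond to. The host, the core name (collection1), the /clustering
handler path and the query are assumptions for illustration only; the
carrot.* values and rows=200 are the ones described above.

```python
from urllib.parse import urlencode, parse_qs, urlsplit


def build_clustering_url(base_url, query, rows=200, frag_size=75,
                         summary_snippets=2):
    """Build a Solr clustering request URL with the settings from this mail."""
    params = {
        "q": query,
        "rows": rows,                      # number of documents to cluster
        "wt": "json",
        "clustering": "true",              # enable the clustering component
        "carrot.produceSummary": "true",   # cluster on highlighted snippets
        "carrot.fragSize": frag_size,      # snippet size in characters
        "carrot.summarySnippets": summary_snippets,
    }
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint: adjust host, core and handler to your own setup.
url = build_clustering_url(
    "http://localhost:8983/solr/collection1/clustering", "*:*")
print(url)
```

With carrot.produceSummary=true, the clustering runs on up to
carrot.summarySnippets fragments of carrot.fragSize characters per
document instead of the full field content.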

Regards,
Edwin



On 26 August 2015 at 13:58, Toke Eskildsen <t...@statsbiblioteket.dk> wrote:

> On Wed, 2015-08-26 at 10:10 +0800, Zheng Lin Edwin Yeo wrote:
> > I'm currently trying out the Carrot2 Workbench and getting it to call
> > Solr to see how it does the clustering. Although the clustering still
> > takes some time, the resulting clusters are much better than mine. I
> > think it's probably due to different settings such as fragSize and
> > desiredClusterCountBase?
>
> Either that or the carrot bundled with Solr is an older version.
>
> > By the way, the link on the clustering example
> > https://cwiki.apache.org/confluence/display/solr/Result is not working,
> > as it says 'Page Not Found'.
>
> That is because it is too long for a single line. Try copy-pasting it:
>
> https://cwiki.apache.org/confluence/display/solr/Result+Clustering#ResultClustering-Configuration
>
> - Toke Eskildsen, State and University Library, Denmark
>
>
>
