[ 
https://issues.apache.org/jira/browse/SOLR-1692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12795914#action_12795914
 ] 

Stanislaw Osinski commented on SOLR-1692:
-----------------------------------------

I've had a quick look into this issue and have two questions to consider:

* Where should the configuration of the highlighter we use for clustering come 
from? Should it be the same as for the regular Solr highlighting or should we 
allow a clustering-specific configuration? My intuition is that we should go 
with the former. Otherwise, we may lose the clear relationship between cluster 
labels and documents in the output, because the clusters would be generated 
from text that differs from what the user is going to see.

* What should we do if the highlighter is not able to generate a summary? One 
option is to use the full contents of the field. Alternatively, we could use the 
first N (configurable) characters of the field. The answer really depends on the 
characteristics of the data we may get. If the total number of documents fed to 
Carrot2 doesn't exceed about 1000, longer documents shouldn't be too much of a 
problem, so I'd suggest the former option (use the full field text).
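
To make the two fallback options concrete, here is a minimal sketch in plain 
Java. It is not the actual CarrotClusteringEngine code: the class name 
SummaryFallback, the method textForClustering and the maxChars parameter are 
hypothetical names made up for illustration, and the snippets are assumed to 
arrive as a simple String array for one field.

```java
/**
 * Illustrative sketch only; all names here are hypothetical and not part of
 * Solr or Carrot2.
 */
final class SummaryFallback {

    private SummaryFallback() {
    }

    /**
     * Picks the text that would be passed to the clustering algorithm.
     *
     * @param snippets   highlighter output for one field; may be null or empty
     * @param fieldValue full stored value of the field; may be null
     * @param maxChars   fallback limit; a value <= 0 means "use the full field"
     */
    static String textForClustering(String[] snippets, String fieldValue, int maxChars) {
        // Prefer the highlighter's summary whenever it produced something usable.
        if (snippets != null && snippets.length > 0) {
            String joined = String.join(" ", snippets).trim();
            if (!joined.isEmpty()) {
                return joined;
            }
        }

        // No summary available: fall back to the field contents.
        if (fieldValue == null) {
            return "";
        }
        if (maxChars <= 0 || fieldValue.length() <= maxChars) {
            // Option 1: the full field text.
            return fieldValue;
        }
        // Option 2: only the first maxChars characters.
        return fieldValue.substring(0, maxChars);
    }
}
```

With maxChars set to 0 this behaves like the option suggested above (full 
field text); a positive value switches to the "first N characters" variant 
without changing the calling code.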

> CarrotClusteringEngine produce summary does nothing
> ---------------------------------------------------
>
>                 Key: SOLR-1692
>                 URL: https://issues.apache.org/jira/browse/SOLR-1692
>             Project: Solr
>          Issue Type: Bug
>          Components: contrib - Clustering
>            Reporter: Grant Ingersoll
>            Assignee: Grant Ingersoll
>             Fix For: 1.5
>
>
> In the CarrotClusteringEngine, the produceSummary option does nothing, as the 
> results of doing the highlighting are just ignored.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
