[jira] [Updated] (SOLR-9955) Add cluster Streaming Expression

Joel Bernstein (JIRA) Wed, 11 Jan 2017 09:06:04 -0800

     [ 
https://issues.apache.org/jira/browse/SOLR-9955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Joel Bernstein updated SOLR-9955:
---------------------------------
    Description: 
This ticket will add the *cluster* Streaming Expression to hook into the 
carrot2 clustering handler. Real-time clustering will fit nicely into the 
Streaming Expression library and should benefit from being able to interact 
with other streams. 

One interesting approach to clustering would be to seed the cluster expression 
with a stream.

For example:

{code}
cluster(collection, expr, walk="field->field")
{code}

The walk parameter would map a field from the underlying stream to the field 
that is searched to retrieve the documents to cluster.

So this could work:
{code}
cluster(collection, facet(collection..),  walk="author->author", rows=1000)
{code}

This would run a facet expression for authors and then form a query with the 
author facets on the author field, bringing back all the content that the 
author set has written, and then cluster these documents.

This would give a topic map for a specific set of authors.
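
The walk step described above can be sketched outside Solr. This is only an 
illustration of the idea, not the proposed implementation: the helper names 
parse_walk and build_walk_query are hypothetical, the sample tuples are made 
up, and the OR-of-phrases query shape is an assumption about how the author 
facets might be turned into a search.

```python
def parse_walk(walk):
    """Split a walk spec such as "author->author" into (source, target) field names."""
    source, sep, target = walk.partition("->")
    if not sep or not source.strip() or not target.strip():
        raise ValueError(f"walk must look like 'from->to', got {walk!r}")
    return source.strip(), target.strip()


def build_walk_query(tuples, walk):
    """Form a query on the target field from the seed stream's tuples (assumed shape)."""
    source, target = parse_walk(walk)
    values = []
    for t in tuples:
        value = t.get(source)
        if value is not None and value not in values:  # keep order, drop repeats
            values.append(value)
    if not values:
        return None  # nothing to search for, so nothing to cluster
    # Quote each value as a phrase, escaping backslashes and embedded double quotes.
    quoted = ['"' + str(v).replace("\\", "\\\\").replace('"', '\\"') + '"' for v in values]
    return f"{target}:(" + " OR ".join(quoted) + ")"


# Hypothetical tuples, shaped like the output of a facet expression on author.
facet_tuples = [
    {"author": "Jane Doe", "count(*)": 12},
    {"author": "John Smith", "count(*)": 7},
]
query = build_walk_query(facet_tuples, "author->author")
print(query)  # author:("Jane Doe" OR "John Smith")
```

The resulting query would then be run against the collection, with rows 
limiting how many matching documents are passed on for clustering.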

  




  was:
This ticket will add the *cluster* Streaming Expression to hook into the 
carrot2 clustering handler. Real-time clustering will fit nicely into the 
Streaming Expression library and should benefit from being able to interact 
with other streams. 



> Add cluster Streaming Expression
> --------------------------------
>
>                 Key: SOLR-9955
>                 URL: https://issues.apache.org/jira/browse/SOLR-9955
>             Project: Solr
>          Issue Type: New Feature
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Joel Bernstein



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
