[ https://issues.apache.org/jira/browse/SOLR-13047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joel Bernstein updated SOLR-13047:
----------------------------------
    Description: 
The current facet expression is a generic tool for creating multi-dimensional 
aggregations. The *facet2D* Streaming Expression has semantics specific to 
two-dimensional facets, which are designed to be *pivoted* into a matrix and 
operated on by *Math Expressions*. 

facet2D will use the JSON Facet API under the covers. 

Proposed syntax:
{code:java}
facet2D(medrecords, q=*:*, x=diseases, y=symptoms, dimensions="300, 10", 
count(*)){code}
The example above will return tuples containing the top 300 diseases and the 
top ten symptoms for each disease. 
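
As a rough illustration of how the two dimensions could map onto the JSON Facet API, the sketch below builds a nested terms facet in Python. It is only an assumption about the shape of the request: the helper name build_facet2d_request and the facet labels "x" and "y" are hypothetical, and the ticket does not specify the request that facet2D will actually send.

```python
import json


def build_facet2d_request(x_field, y_field, dimensions):
    """Build a nested JSON Facet API terms facet for a 2D aggregation.

    Hypothetical helper: the labels "x" and "y" are illustrative only.
    """
    # dimensions is the "rows, columns" string from the proposed syntax, e.g. "300, 10"
    x_limit, y_limit = (int(part.strip()) for part in dimensions.split(","))
    return {
        "x": {
            "type": "terms",
            "field": x_field,
            "limit": x_limit,
            # The nested facet is computed for each bucket of the outer facet.
            "facet": {
                "y": {"type": "terms", "field": y_field, "limit": y_limit}
            },
        }
    }


request = build_facet2d_request("diseases", "symptoms", "300, 10")
print(json.dumps(request, indent=2))
```

Terms facet buckets carry a document count by default, which would correspond to count(*) in the proposed syntax.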

Using Math Expressions, the tuples can be *pivoted* into a matrix where the 
rows of the matrix are the diseases, the columns of the matrix are the 
symptoms, and the cells in the matrix contain the counts. This matrix can then 
be *clustered* to find clusters of *diseases* that are correlated by 
*symptoms*. 
{code:java}
let(a=facet2D(medrecords, q=*:*, x=diseases, y=symptoms, dimensions="300, 10", 
count(*)),
    b=pivot(a, diseases, symptoms, count(*)),
    c=kmeans(b, 10)){code}
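
The pivot step can be pictured with a small Python sketch. It assumes each tuple is a flat record keyed by the two field names and by "count(*)", and that missing disease/symptom pairs become 0. The sample tuples are made up for illustration, and the function is not Solr's pivot implementation.

```python
def pivot(tuples, x_field, y_field, value_field):
    """Pivot flat (x, y, value) tuples into a dense row-by-column matrix."""
    rows, cols = {}, {}
    # Assign row and column indexes in order of first appearance.
    for t in tuples:
        rows.setdefault(t[x_field], len(rows))
        cols.setdefault(t[y_field], len(cols))
    # Cells with no matching tuple stay at 0.
    matrix = [[0] * len(cols) for _ in rows]
    for t in tuples:
        matrix[rows[t[x_field]]][cols[t[y_field]]] = t[value_field]
    return list(rows), list(cols), matrix


# Made-up sample tuples, shaped like the assumed facet2D output.
sample = [
    {"diseases": "flu", "symptoms": "fever", "count(*)": 40},
    {"diseases": "flu", "symptoms": "cough", "count(*)": 35},
    {"diseases": "cold", "symptoms": "cough", "count(*)": 50},
    {"diseases": "cold", "symptoms": "sneezing", "count(*)": 20},
]

row_labels, col_labels, matrix = pivot(sample, "diseases", "symptoms", "count(*)")
print(row_labels, col_labels, matrix)
```

Each row of the resulting matrix is a per-disease vector of symptom counts, which is the form a clustering step such as kmeans would operate on.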
 

*Implementation Note:*

The implementation plan for this ticket is to create a new stream called 
Facet2DStream. The FacetStream code is a good starting point for the new 
implementation and can be adapted for the Facet2D parameters. 

 

  was:
The current facet expression is a generic tool for creating multi-dimension 
aggregations. The *facet2D* Streaming Expression has semantics specific for 2 
dimensional facets which are designed to be *pivoted* into a matrix and 
operated on by *Math Expressions*. 

facet2D will use the json facet API under the covers. 

Proposed syntax:
{code:java}
facet2D(medrecords, q=*:*, x=diseases, y=symptoms, dimensions="300, 10", 
count(*)){code}
The example above will return tuples containing the top 300 diseases and the 
top ten symptoms for each disease. 

Using math expression the tuples can be *pivoted* into a matrix where the rows 
of the matrix are the diseases, the columns of the matrix are the symptoms and 
the cells in the matrix contain the counts. This matrix can then be *clustered* 
to find clusters of *diseases* that are correlated by *symptoms*. 
{code:java}
let(a=facet2D(medrecords, q=*:*, x=diseases, y=symptoms, dimensions="300, 10", 
count(*)),
    b=pivot(a, diseases, symptoms, count(*)),
    c=kmeans(b, 10)){code}
 

*Implementation Note:*

The implementation plan for this ticket is to create a new stream called 
Facet2DStream. The FacetStream code is a good starting point for the new 
implementation and can be adapt for the Facet2D parameters. 

 


> Add facet2D Streaming Expression
> --------------------------------
>
>                 Key: SOLR-13047
>                 URL: https://issues.apache.org/jira/browse/SOLR-13047
>             Project: Solr
>          Issue Type: New Feature
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Joel Bernstein
>            Assignee: Joel Bernstein
>            Priority: Major
>
> The current facet expression is a generic tool for creating multi-dimensional 
> aggregations. The *facet2D* Streaming Expression has semantics specific to 
> two-dimensional facets, which are designed to be *pivoted* into a matrix and 
> operated on by *Math Expressions*. 
> facet2D will use the JSON Facet API under the covers. 
> Proposed syntax:
> {code:java}
> facet2D(medrecords, q=*:*, x=diseases, y=symptoms, dimensions="300, 10", 
> count(*)){code}
> The example above will return tuples containing the top 300 diseases and the 
> top ten symptoms for each disease. 
> Using Math Expressions, the tuples can be *pivoted* into a matrix where the 
> rows of the matrix are the diseases, the columns of the matrix are the 
> symptoms, and the cells in the matrix contain the counts. This matrix can then 
> be *clustered* to find clusters of *diseases* that are correlated by 
> *symptoms*. 
> {code:java}
> let(a=facet2D(medrecords, q=*:*, x=diseases, y=symptoms, dimensions="300, 10", 
> count(*)),
>     b=pivot(a, diseases, symptoms, count(*)),
>     c=kmeans(b, 10)){code}
>  
> *Implementation Note:*
> The implementation plan for this ticket is to create a new stream called 
> Facet2DStream. The FacetStream code is a good starting point for the new 
> implementation and can be adapted for the Facet2D parameters. 
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
