[jira] [Comment Edited] (SOLR-13047) Add facet2D Streaming Expression

Nazerke Seidan (JIRA) Tue, 16 Apr 2019 03:51:22 -0700


    [ 
https://issues.apache.org/jira/browse/SOLR-13047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16818793#comment-16818793
 ]


Nazerke Seidan edited comment on SOLR-13047 at 4/16/19 10:50 AM:
-----------------------------------------------------------------

Regarding the implementation details, are the math expressions limited to 
metrics such as sum, count, max, min and avg?

 
I came up with the following implementation ideas:
 
This is what the constructor for this stream would look like:
facet2DStream(String collection, ModifiableSolrParams params, Bucket x,
Bucket y, String dimensions, Metric metric).
 
The basic idea is that I will first apply a count metric to the given buckets. 
Then I will internally sort the buckets in descending order of count. Finally, 
I will emit tuples until the x and y limits given in dimensions are reached. 
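
The three steps above (count, sort descending, cut off at the dimension limits) can be sketched in plain Java. This is only an in-memory illustration under stated assumptions, not the proposed Facet2DStream: the real stream would delegate the aggregation to the JSON Facet API, and the names Facet2DSketch, Cell and topCells are hypothetical. It also assumes that "dimensions" means "top N x buckets, then top M y buckets within each x bucket", and that the limits are non-negative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory sketch; not the Solr Facet2DStream implementation. */
final class Facet2DSketch {

    /** One output tuple: an x bucket, a y bucket, and the count for that pair. */
    static final class Cell {
        final String x;
        final String y;
        final long count;

        Cell(String x, String y, long count) {
            this.x = x;
            this.y = y;
            this.count = count;
        }

        @Override
        public String toString() {
            return x + "/" + y + "=" + count;
        }
    }

    /** Each doc is a two-element array: {xValue, yValue}. */
    static List<Cell> topCells(List<String[]> docs, int xLimit, int yLimit) {
        // Step 1: apply the count metric per (x, y) pair and per x bucket.
        Map<String, Map<String, Long>> pairCounts = new HashMap<>();
        Map<String, Long> xTotals = new HashMap<>();
        for (String[] doc : docs) {
            pairCounts.computeIfAbsent(doc[0], k -> new HashMap<>())
                      .merge(doc[1], 1L, Long::sum);
            xTotals.merge(doc[0], 1L, Long::sum);
        }

        // Step 2: sort x buckets by count, descending (ties broken by name).
        List<String> xs = new ArrayList<>(xTotals.keySet());
        xs.sort((a, b) -> {
            int c = Long.compare(xTotals.get(b), xTotals.get(a));
            return c != 0 ? c : a.compareTo(b);
        });

        // Step 3: emit tuples until the x and y limits are reached.
        List<Cell> out = new ArrayList<>();
        for (String x : xs.subList(0, Math.min(xLimit, xs.size()))) {
            Map<String, Long> yCounts = pairCounts.get(x);
            List<String> ys = new ArrayList<>(yCounts.keySet());
            ys.sort((a, b) -> {
                int c = Long.compare(yCounts.get(b), yCounts.get(a));
                return c != 0 ? c : a.compareTo(b);
            });
            for (String y : ys.subList(0, Math.min(yLimit, ys.size()))) {
                out.add(new Cell(x, y, yCounts.get(y)));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> docs = Arrays.asList(
            new String[] {"flu", "fever"}, new String[] {"flu", "fever"},
            new String[] {"flu", "fever"}, new String[] {"flu", "cough"},
            new String[] {"flu", "cough"}, new String[] {"flu", "rash"},
            new String[] {"cold", "cough"}, new String[] {"cold", "cough"},
            new String[] {"cold", "fever"}, new String[] {"gout", "pain"});
        // Top 2 x buckets, top 2 y buckets within each.
        System.out.println(topCells(docs, 2, 2));
    }
}
```

With limits of 2 and 2, the "gout" bucket and the "rash" sub-bucket are dropped, which mirrors how dimensions="300, 10" would trim the result in the proposed syntax.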
 
Any suggestions?


was (Author: snazerke):
Regarding the implementation details, are the math expressions limited to 
metrics such as sum, count, max, min and avg?

> Add facet2D Streaming Expression
> --------------------------------
>
>                 Key: SOLR-13047
>                 URL: https://issues.apache.org/jira/browse/SOLR-13047
>             Project: Solr
>          Issue Type: New Feature
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Joel Bernstein
>            Assignee: Joel Bernstein
>            Priority: Major
>
> The current facet expression is a generic tool for creating multi-dimension 
> aggregations. The *facet2D* Streaming Expression has semantics specific for 2 
> dimensional facets which are designed to be *pivoted* into a matrix and 
> operated on by *Math Expressions*. 
> facet2D will use the json facet API under the covers. 
> Proposed syntax:
> {code:java}
> facet2D(medrecords, q=*:*, x=diseases, y=symptoms, dimensions="300, 10", 
> count(*)){code}
> The example above will return tuples containing the top 300 diseases and the 
> top ten symptoms for each disease. 
> Using math expressions, the tuples can be *pivoted* into a matrix where the 
> rows of the matrix are the diseases, the columns of the matrix are the 
> symptoms and the cells in the matrix contain the counts. This matrix can then 
> be *clustered* to find clusters of *diseases* that are correlated by 
> *symptoms*. 
> {code:java}
> let(a=facet2D(medrecords, q=*:*, x=diseases, y=symptoms, dimensions="300, 10", count(*)),
>     b=pivot(a, diseases, symptoms, count(*)),
>     c=kmeans(b, 10)){code}
>  
> *Implementation Note:*
> The implementation plan for this ticket is to create a new stream called 
> Facet2DStream. The FacetStream code is a good starting point for the new 
> implementation and can be adapted for the Facet2D parameters. Tests similar 
> to those for FacetStream can be added to StreamExpressionTest.
>  
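
The pivot step in the issue description (rows are x values, columns are y values, cells hold the counts) can be illustrated with a small Java sketch. This is not Solr's pivot function. PivotSketch and its fields are hypothetical names, and the input is assumed to be three parallel arrays of x values, y values and counts. Pairs that never occur keep the default cell value of 0.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of pivoting (x, y, count) tuples into a matrix. */
final class PivotSketch {
    final List<String> rowLabels;   // distinct x values, in first-seen order
    final List<String> colLabels;   // distinct y values, in first-seen order
    final double[][] matrix;        // matrix[row][col] = count for that pair

    private PivotSketch(List<String> rowLabels, List<String> colLabels,
                        double[][] matrix) {
        this.rowLabels = rowLabels;
        this.colLabels = colLabels;
        this.matrix = matrix;
    }

    static PivotSketch pivot(String[] xs, String[] ys, long[] counts) {
        // Assign a row index to each distinct x and a column index to each y.
        Map<String, Integer> rowIndex = new LinkedHashMap<>();
        Map<String, Integer> colIndex = new LinkedHashMap<>();
        for (int i = 0; i < xs.length; i++) {
            rowIndex.putIfAbsent(xs[i], rowIndex.size());
            colIndex.putIfAbsent(ys[i], colIndex.size());
        }

        // Fill the cells; (x, y) pairs that were never seen stay at 0.0.
        double[][] matrix = new double[rowIndex.size()][colIndex.size()];
        for (int i = 0; i < xs.length; i++) {
            matrix[rowIndex.get(xs[i])][colIndex.get(ys[i])] = counts[i];
        }
        return new PivotSketch(new ArrayList<>(rowIndex.keySet()),
                               new ArrayList<>(colIndex.keySet()), matrix);
    }

    public static void main(String[] args) {
        PivotSketch p = pivot(
            new String[] {"flu", "flu", "cold", "cold"},
            new String[] {"fever", "cough", "cough", "fever"},
            new long[] {3, 2, 2, 1});
        System.out.println(p.rowLabels + " x " + p.colLabels);
        System.out.println(java.util.Arrays.deepToString(p.matrix));
    }
}
```

Each row of the resulting matrix is a vector describing one x value, which is the shape a clustering step such as kmeans would operate on.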



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
