[jira] [Comment Edited] (SOLR-13047) Add facet2D Streaming Expression

2019-06-13 Thread Nazerke Seidan (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16863160#comment-16863160
 ] 

Nazerke Seidan edited comment on SOLR-13047 at 6/13/19 2:56 PM:


[~ctargett], I closed PR #659.


was (Author: snazerke):
[~ctargett], closed the PR #660.

> Add facet2D Streaming Expression
> 
>
> Key: SOLR-13047
> URL: https://issues.apache.org/jira/browse/SOLR-13047
> Project: Solr
> Issue Type: New Feature
> Reporter: Joel Bernstein
> Assignee: Joel Bernstein
> Priority: Major
> Fix For: 8.2
>
> Time Spent: 1h
> Remaining Estimate: 0h
>
> The current facet expression is a generic tool for creating multi-dimensional 
> aggregations. The *facet2D* Streaming Expression has semantics specific to 
> two-dimensional facets, which are designed to be *pivoted* into a matrix and 
> operated on by *Math Expressions*. 
> facet2D will use the JSON Facet API under the covers. 
> Proposed syntax:
> {code:java}
> facet2D(medrecords, q=*:*, x=diseases, y=symptoms, dimensions="300, 10", 
> count(*)){code}
> The example above will return tuples containing the top 300 diseases and the 
> top ten symptoms for each disease. 
> Using math expressions, the tuples can be *pivoted* into a matrix where the 
> rows are the diseases, the columns are the symptoms, and the cells contain 
> the counts. This matrix can then be *clustered* to find clusters of 
> *diseases* that are correlated by *symptoms*. 
> {code:java}
> let(a=facet2D(medrecords, q=*:*, x=diseases, y=symptoms, dimensions="300, 10", count(*)),
>     b=pivot(a, diseases, symptoms, count(*)),
>     c=kmeans(b, 10)){code}
>  
> *Implementation Note:*
> The implementation plan for this ticket is to create a new stream called 
> Facet2DStream. The FacetStream code is a good starting point for the new 
> implementation and can be adapted for the Facet2D parameters. Tests similar 
> to those for FacetStream can be added to StreamExpressionTest.
>  
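
The pivot step described in the ticket can be sketched in plain Java. This is only an illustration under stated assumptions: the class PivotSketch, its Cell and Matrix types, and the sample disease and symptom values in main are hypothetical and are not part of Solr or of the pivot function named in the ticket. It shows how (x, y, count) tuples could map to matrix rows, columns, and cells, with pairs that never occur left at zero.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; this is not Solr's pivot implementation.
class PivotSketch {

    // One facet2D-style tuple: an x value, a y value, and the metric for that pair.
    static final class Cell {
        final String x;
        final String y;
        final double value;

        Cell(String x, String y, double value) {
            this.x = x;
            this.y = y;
            this.value = value;
        }
    }

    // Row labels, column labels, and the dense matrix built from the tuples.
    static final class Matrix {
        final List<String> rows;
        final List<String> cols;
        final double[][] data;

        Matrix(List<String> rows, List<String> cols, double[][] data) {
            this.rows = rows;
            this.cols = cols;
            this.data = data;
        }
    }

    static Matrix pivot(List<Cell> cells) {
        // Assign row and column indexes in first-seen order.
        Map<String, Integer> rowIndex = new LinkedHashMap<>();
        Map<String, Integer> colIndex = new LinkedHashMap<>();
        for (Cell c : cells) {
            rowIndex.putIfAbsent(c.x, rowIndex.size());
            colIndex.putIfAbsent(c.y, colIndex.size());
        }
        // Java zero-fills the array, so (x, y) pairs that never occur stay at 0.0.
        double[][] data = new double[rowIndex.size()][colIndex.size()];
        for (Cell c : cells) {
            data[rowIndex.get(c.x)][colIndex.get(c.y)] = c.value;
        }
        return new Matrix(new ArrayList<>(rowIndex.keySet()),
                          new ArrayList<>(colIndex.keySet()), data);
    }

    public static void main(String[] args) {
        // Made-up sample tuples, for illustration only.
        List<Cell> cells = Arrays.asList(
            new Cell("flu", "fever", 12),
            new Cell("flu", "cough", 9),
            new Cell("cold", "cough", 7));
        Matrix m = pivot(cells);
        System.out.println("columns: " + m.cols);
        for (int i = 0; i < m.rows.size(); i++) {
            System.out.println(m.rows.get(i) + " " + Arrays.toString(m.data[i]));
        }
    }
}
```

A dense double[][] is used here because a clustering step such as kmeans typically works on fixed-length numeric rows; a sparse layout might suit larger facet sets better.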



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-13047) Add facet2D Streaming Expression

2019-06-13 Thread Nazerke Seidan (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16863160#comment-16863160
 ] 

Nazerke Seidan commented on SOLR-13047:
---

[~ctargett], closed the PR #660.







[jira] [Comment Edited] (SOLR-13047) Add facet2D Streaming Expression

2019-04-16 Thread Nazerke Seidan (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16818793#comment-16818793
 ] 

Nazerke Seidan edited comment on SOLR-13047 at 4/16/19 10:50 AM:
-

Regarding the implementation details, are the math expressions limited to 
metrics such as sum, count, max, min and avg?

I came up with the following implementation ideas.

The constructor for this stream would look like this:
Facet2DStream(String collection, ModifiableSolrParams params, Bucket x, Bucket 
y, String dimensions, Metric metric)

The basic idea is to first apply a count metric to the given buckets, then 
sort the buckets internally in descending order, and then emit tuples until 
the x and y limits given in dimensions are reached. 

Any suggestions?
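
The sort-and-limit idea in this comment can be sketched in plain Java. Everything here is an assumption made for illustration: the class Facet2DLimitSketch, its Bucket type, and the helper names parseDimensions and top are hypothetical and do not come from the Solr code base. Since the ticket says facet2D will use the JSON Facet API under the covers, the sort and limit could also be delegated to the facet request itself; the sketch only shows the logic on the client side.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch; names are illustrative and not taken from Solr.
class Facet2DLimitSketch {

    // A bucket label together with its count.
    static final class Bucket {
        final String value;
        final long count;

        Bucket(String value, long count) {
            this.value = value;
            this.count = count;
        }
    }

    // Parses a dimensions string such as "300, 10" into {xLimit, yLimit}.
    static int[] parseDimensions(String dimensions) {
        String[] parts = dimensions.split(",");
        if (parts.length != 2) {
            throw new IllegalArgumentException(
                "expected two comma-separated limits: " + dimensions);
        }
        return new int[] {
            Integer.parseInt(parts[0].trim()),
            Integer.parseInt(parts[1].trim())
        };
    }

    // Sorts buckets by count in descending order and keeps at most `limit` of them.
    // The input list is left unmodified.
    static List<Bucket> top(List<Bucket> buckets, int limit) {
        List<Bucket> sorted = new ArrayList<>(buckets);
        sorted.sort(Comparator.comparingLong((Bucket b) -> b.count).reversed());
        return new ArrayList<>(sorted.subList(0, Math.min(limit, sorted.size())));
    }

    public static void main(String[] args) {
        int[] dims = parseDimensions("300, 10");
        // Made-up bucket counts, for illustration only.
        List<Bucket> xBuckets = Arrays.asList(
            new Bucket("a", 5), new Bucket("b", 9), new Bucket("c", 7));
        for (Bucket b : top(xBuckets, dims[0])) {
            System.out.println(b.value + " " + b.count);
        }
    }
}
```

The same top helper would be applied once to the x buckets with the first limit and then to the y buckets of each kept x bucket with the second limit.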


was (Author: snazerke):
Regarding the implementation details, are the math expressions limited to 
metrics such as sum, count, max, min and avg?







[jira] [Comment Edited] (SOLR-13047) Add facet2D Streaming Expression

2019-04-16 Thread Nazerke Seidan (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16818793#comment-16818793
 ] 

Nazerke Seidan edited comment on SOLR-13047 at 4/16/19 9:00 AM:


Regarding the implementation details, are the math expressions limited to 
metrics such as sum, count, max, min and avg?


was (Author: snazerke):
Regarding the math expressions, is it limited to metrics?







[jira] [Commented] (SOLR-13047) Add facet2D Streaming Expression

2019-04-16 Thread Nazerke Seidan (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16818793#comment-16818793
 ] 

Nazerke Seidan commented on SOLR-13047:
---

Regarding the math expressions, is it limited to metrics?







[jira] [Issue Comment Deleted] (SOLR-13047) Add facet2D Streaming Expression

2019-04-16 Thread Nazerke Seidan (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-13047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nazerke Seidan updated SOLR-13047:
--
Comment: was deleted

(was: Regarding the implementation details, are the math expressions limited to 
metrics such as count(*) , sum(col), max(col), min(col) and avg(col)? Why do we 
need count? as a parameter in the facet2D? Obviously, we are returning 300 
diseases containing 10 symptoms for each disease.)







[jira] [Comment Edited] (SOLR-13047) Add facet2D Streaming Expression

2019-04-16 Thread Nazerke Seidan (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16818790#comment-16818790
 ] 

Nazerke Seidan edited comment on SOLR-13047 at 4/16/19 8:57 AM:


Regarding the implementation details, are the math expressions limited to 
metrics such as count(*) , sum(col), max(col), min(col) and avg(col)? Why do we 
need count? as a parameter in the facet2D? Obviously, we are returning 300 
diseases containing 10 symptoms for each disease.


was (Author: snazerke):
Regarding the implementation details, are the math expressions limited to 
metrics such as count(*), sum(col), max(col), min(col) and avg(col)? Why do we 
need count(*)  as a parameter in the facet2D? Obviously, we are returning 300 
diseases containing 10 symptoms for each disease.  







[jira] [Commented] (SOLR-13047) Add facet2D Streaming Expression

2019-04-16 Thread Nazerke Seidan (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16818790#comment-16818790
 ] 

Nazerke Seidan commented on SOLR-13047:
---

Regarding the implementation details, are the math expressions limited to 
metrics such as count(*), sum(col), max(col), min(col) and avg(col)? Why do we 
need count(*)  as a parameter in the facet2D? Obviously, we are returning 300 
diseases containing 10 symptoms for each disease.  







[jira] [Comment Edited] (SOLR-13047) Add facet2D Streaming Expression

2019-04-16 Thread Nazerke Seidan (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16818790#comment-16818790
 ] 

Nazerke Seidan edited comment on SOLR-13047 at 4/16/19 8:56 AM:


Regarding the implementation details, are the math expressions limited to 
metrics such as count(*), sum(col), max(col), min(col) and avg(col)? Why do we 
need count(*)  as a parameter in the facet2D? Obviously, we are returning 300 
diseases containing 10 symptoms for each disease.  


was (Author: snazerke):
Regarding the implementation details, are the math expressions limited to 
metrics such as count(*), sum(col), max(col), min(col) and avg(col)? Why do we 
need count(*)  as a parameter in the facet2D? Obviously, we are returning 300 
diseases containing 10 symptoms for each disease.  







[jira] [Commented] (SOLR-13391) Add variance and standard deviation stream evaluators

2019-04-10 Thread Nazerke Seidan (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16814745#comment-16814745
 ] 

Nazerke Seidan commented on SOLR-13391:
---

[~joel.bernstein], I have code ready for this in the pull request. I used 
StatUtils.variance(data) and Math.sqrt(StatUtils.variance(data)) in place of 
the StatUtils.mean(data) call used in MeanEvaluator.java.
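
The two calculations mentioned in this comment can be sketched without any library dependency. This is an illustration under stated assumptions, not the code from the pull request: the class VarianceSketch and its methods are hypothetical, and the variance uses the bias-corrected (n - 1) denominator, which is how the Commons Math documentation describes StatUtils.variance. The handling of empty and single-element input is a choice made for this sketch.

```java
// Hypothetical sketch; not the evaluator code from the pull request.
class VarianceSketch {

    // Arithmetic mean of the values.
    static double mean(double[] data) {
        double sum = 0.0;
        for (double d : data) {
            sum += d;
        }
        return sum / data.length;
    }

    // Bias-corrected sample variance: squared deviations summed, divided by (n - 1).
    // Sketch-level choices: NaN for empty input, 0.0 for a single value.
    static double variance(double[] data) {
        if (data.length == 0) {
            return Double.NaN;
        }
        if (data.length == 1) {
            return 0.0;
        }
        double m = mean(data);
        double sumSq = 0.0;
        for (double d : data) {
            double dev = d - m;
            sumSq += dev * dev;
        }
        return sumSq / (data.length - 1);
    }

    // Standard deviation as the square root of the variance.
    static double stddev(double[] data) {
        return Math.sqrt(variance(data));
    }

    public static void main(String[] args) {
        // Same values as the array(1,3,3) example in the issue description.
        double[] data = {1, 3, 3};
        System.out.println("mean=" + mean(data));
        System.out.println("var=" + variance(data));
        System.out.println("stddev=" + stddev(data));
    }
}
```

For the values 1, 3, 3 the mean is 7/3, the squared deviations sum to 8/3, and dividing by n - 1 = 2 gives a variance of 4/3.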

> Add variance and standard deviation stream evaluators
> -
>
> Key: SOLR-13391
> URL: https://issues.apache.org/jira/browse/SOLR-13391
> Project: Solr
> Issue Type: Improvement
> Security Level: Public (Default Security Level. Issues are Public)
> Components: streaming expressions
> Reporter: Nazerke Seidan
> Priority: Minor
> Labels: pull-request-available
>
> It seems that variance and standard deviation stream evaluators are not 
> supported in any Solr version. For example, 
>     let(echo="m,v,sd", a=array(1,3,3), m=mean(a), v=var(a), sd=stddev(a))
> So far, only the mean function is implemented. I think it would be useful to 
> have var and stddev functions as separate stream evaluators. 






[jira] [Created] (SOLR-13391) Add variance and standard deviation stream evaluators

2019-04-10 Thread Nazerke Seidan (JIRA)
Nazerke Seidan created SOLR-13391:
-

 Summary: Add variance and standard deviation stream evaluators
 Key: SOLR-13391
 URL: https://issues.apache.org/jira/browse/SOLR-13391
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
  Components: streaming expressions
Reporter: Nazerke Seidan


It seems that variance and standard deviation stream evaluators are not 
supported in any Solr version. For example, 

    let(echo="m,v,sd", a=array(1,3,3), m=mean(a), v=var(a), sd=stddev(a))

So far, only the mean function is implemented. I think it would be useful to 
have var and stddev functions as separate stream evaluators. 






Re: Query about solr volunteers to mentor: GSoC 19

2019-03-05 Thread Nazerke Seidan
Hi David,

Thank you for your response!

Well, actually I have only recently started learning Solr, so I have not
explored it much. I am currently focusing on *streaming expressions in
SolrJ* and their use cases. It would be very useful to have more streaming
expression examples that use SolrJ and other clients, not only curl.

Of the topics on your list, the *UnifiedHighlighter* seems the most
interesting to me. From what I have read, this highlighter does not support
the surround parser. It might be useful to implement that, but I have no
clue about its implementation complexity. Adding a multi-highlighting
feature could also be useful.

Other than that, I was thinking of a *feature* for the Solr admin UI for
modifying solrconfig.xml without any command-line tools, but I am not
sure about the security implications.

--Nazerke

On Mon, Mar 4, 2019 at 10:31 PM David Smiley 
wrote:

> BTW another topic is the migration of Solr's admin UI to a more modern
> Angular JS -- or something like that -- I haven't been following that very
> closely.  I'm definitely not the right mentor for that but perhaps someone
> here could mentor if you choose to pick that up.
>
> ~ David Smiley
> Freelance Apache Lucene/Solr Search Consultant/Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Mon, Mar 4, 2019 at 4:16 PM David Smiley 
> wrote:
>
>> Hello Nazerke,
>>
>> Thanks for your interest and proactively reaching out to us!
>>
>> I am interested in being a mentor provided the topic interests me.
>> Topics of interest to me:
>> * spatial
>>ex: any open issue, such as:
>> https://issues.apache.org/jira/browse/SOLR-4242
>> * highlighting
>>ex: any open issue, esp. relating to the UnifiedHighlighter
>> * test infrastructure utilities
>> * benchmarking automation
>> * the build: migrate from Ant to Gradle
>> * refactorings related to technical debt
>>
>> And perhaps others might interest me if you propose something specific. I
>> know you commented on SOLR-10329 but I'd rather not mentor for that.
>>
>> Depending on the scope of some issue(s) there might be multiple actual
>> things to work on, perhaps ideally in the same subject area.
>>
>> What do you think?
>>
>> ~ David
>>
>> On Mon, Mar 4, 2019 at 11:35 AM Nazerke Seidan
>>  wrote:
>>
>>> Hi All,
>>>
>>> I am a final-year CS BSc student interested in participating in GSoC'19
>>> by contributing to the Apache Solr project. I was wondering if there are
>>> any volunteers from the Solr community to mentor a GSoC'19 project. I
>>> would like to discuss potential topics.
>>>
>>>
>>> Many thanks,
>>>
>>> Nazerke
>>>
>> --
>> Lucene/Solr Search Committer (PMC), Developer, Author, Speaker
>> LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
>> http://www.solrenterprisesearchserver.com
>>
>


[jira] [Commented] (SOLR-7229) Allow DIH to handle attachments as separate documents

2019-03-04 Thread Nazerke Seidan (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-7229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16783547#comment-16783547
 ] 

Nazerke Seidan commented on SOLR-7229:
--

Hi Tim,

I was wondering whether this project is still open. I would like to 
participate in GSoC'19 by contributing to the Solr community. 

> Allow DIH to handle attachments as separate documents
> -
>
> Key: SOLR-7229
> URL: https://issues.apache.org/jira/browse/SOLR-7229
> Project: Solr
> Issue Type: Improvement
> Reporter: Tim Allison
> Assignee: Alexandre Rafalovitch
> Priority: Minor
> Labels: gsoc2017
>
> With Tika 1.7's RecursiveParserWrapper, it is possible to maintain metadata 
> of individual attachments/embedded documents.  Tika's default handling was to 
> maintain the metadata of the container document and concatenate the contents 
> of all embedded files.  With SOLR-7189, we added the legacy behavior.
> It might be handy, for example, to be able to send an MSG file through DIH 
> and treat the container email as well as each attachment as separate (child?) 
> documents, or send a zip of jpeg files and correctly index the geo locations 
> for each image file.






[jira] [Commented] (SOLR-10329) Rebuild Solr examples

2019-03-04 Thread Nazerke Seidan (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-10329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16783533#comment-16783533
 ] 

Nazerke Seidan commented on SOLR-10329:
---

Hi Alexandre,

I was wondering whether this project is still open. I would like to 
participate in GSoC'19 by contributing to the Solr community. 

 

Many thanks!

> Rebuild Solr examples
> -
>
> Key: SOLR-10329
> URL: https://issues.apache.org/jira/browse/SOLR-10329
> Project: Solr
> Issue Type: Wish
> Components: examples
> Reporter: Alexandre Rafalovitch
> Priority: Major
> Labels: gsoc2017
>
> Apache Solr ships with a number of examples. They evolved from a kitchen sink 
> example and are rather large. When new Solr features are added, they are 
> often shoehorned into the most appropriate example and sometimes are not 
> represented at all. 
> Often, for new users, it is hard to tell which part of an example is relevant, 
> which part is default and which part is demonstrating something completely 
> different.
> It would take significant (and very appreciated) effort to review all the 
> examples and rebuild them to provide a clean way to showcase best practices 
> around base and most recent features.
> Specific issues are around kitchen sink vs. minimal examples, a better approach 
> to "schemaless" mode and creating examples and datasets that make it possible 
> to create both "hello world" and more-advanced tutorials.






Query about solr volunteers to mentor: GSoC 19

2019-03-04 Thread Nazerke Seidan
Hi All,

I am a final-year CS BSc student interested in participating in GSoC'19 by
contributing to the Apache Solr project. I was wondering if there are any
volunteers from the Solr community to mentor a GSoC'19 project. I would like
to discuss potential topics.


Many thanks,

Nazerke