[jira] [Commented] (SOLR-14916) Add split parameter to timeseries Streaming Expression

2020-10-06 Thread Aroop (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17209147#comment-17209147
 ] 

Aroop commented on SOLR-14916:
--

Thanks Joel

> Add split parameter to timeseries Streaming Expression
> --
>
> Key: SOLR-14916
> URL: https://issues.apache.org/jira/browse/SOLR-14916
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Priority: Major
>
> Currently the timeseries function only supports aggregations across the time 
> dimension. This ticket will add the *split* parameter, which adds a top-level 
> split by a categorical field to produce one time line per split value. The 
> split-limit and split-sort parameters will also be added to control the 
> number and order of values in the split field result. 
> Sample syntax:
> {code}
> timeseries(collection1,
>            q="*:*",
>            split="company",
>            split-limit=10,
>            split-sort="avg(price_f) desc",
>            field="timefield",
>            gap="+1DAY",
>            format="-dd-MM",
>            avg(price_f))
> {code}
> The output can easily be pivoted into a matrix and then correlated or 
> clustered, like the output of the *facet2D* function. The *diff* and 
> *minMaxScale* functions already support operations over matrix rows, so 
> clustering and similar analyses are straightforward on this output.
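
The pivot mentioned in the quoted description can be sketched outside Solr as follows. This is only an illustration under assumptions: the ticket does not show the output tuples, so the field names used here (`company`, `timefield`, `avg(price_f)`) and the helper `pivot_timeseries` are hypothetical, and missing cells are filled with 0.0 purely for convenience.

```python
from collections import defaultdict


def pivot_timeseries(tuples, split_field, time_field, value_field, fill=0.0):
    """Pivot flat (split value, time bucket, metric) tuples into a matrix.

    Returns (row_labels, column_labels, matrix) with one row per split value
    and one column per time bucket. Missing cells are set to `fill`.
    """
    cells = defaultdict(dict)
    buckets = set()
    for t in tuples:
        cells[t[split_field]][t[time_field]] = t[value_field]
        buckets.add(t[time_field])
    rows = sorted(cells)
    cols = sorted(buckets)
    matrix = [[cells[r].get(c, fill) for c in cols] for r in rows]
    return rows, cols, matrix


# Hypothetical tuples; the real output field names are an assumption.
sample = [
    {"company": "acme", "timefield": "2020-10-01", "avg(price_f)": 10.0},
    {"company": "acme", "timefield": "2020-10-02", "avg(price_f)": 12.0},
    {"company": "zeta", "timefield": "2020-10-02", "avg(price_f)": 7.5},
]
rows, cols, matrix = pivot_timeseries(sample, "company", "timefield", "avg(price_f)")
```

Each row of `matrix` is then one time line per split value, which is the shape that row-wise operations such as differencing or scaling expect.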



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14916) Add split parameter to timeseries Streaming Expression

2020-10-06 Thread Aroop (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17209108#comment-17209108
 ] 

Aroop commented on SOLR-14916:
--

[~jbernste] what values of "gap" will be supported, and will the valid values 
of "format" be documented, or an enum/constants file created to that effect?







[jira] [Commented] (SOLR-14916) Add split parameter to timeseries Streaming Expression

2020-10-06 Thread Aroop (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17209051#comment-17209051
 ] 

Aroop commented on SOLR-14916:
--

[~jbernste] this looks very neat!

> Add split parameter to timeseries Streaming Expression
> --
>
> Key: SOLR-14916
> URL: https://issues.apache.org/jira/browse/SOLR-14916
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Priority: Major
>
> Currently the timeseries function only supports aggregations across the time 
> dimension. This ticket will add the *split* parameter, which adds a top-level 
> split by a categorical field to produce one time line per split value. The 
> split-limit and split-sort parameters will also be added to control the 
> number and order of values in the split field result. 
> Sample syntax:
> {code}
> timeseries(collection1,
>            q="*:*",
>            split="company",
>            split-limit=10,
>            split-sort="avg(price_f) desc",
>            field="timefield",
>            gap="+1DAY",
>            format="-dd-MM",
>            avg(price_f))
> {code}
> The output can easily be pivoted into a matrix and then correlated or 
> clustered, like the output of the *facet2D* function. The *diff* function 
> already supports the serial differencing of matrix columns, so clustering 
> and similar analyses are straightforward on this output.






[jira] [Commented] (SOLR-14660) Migrating HDFS into a package

2020-07-29 Thread Aroop (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17167457#comment-17167457
 ] 

Aroop commented on SOLR-14660:
--

[~warper] this is a great start.

I have a few questions regarding the codebase for the HDFS backup/restore. 
These, as you may know, are collection APIs, and they use the 
HdfsBackupRepository bindings, which you found being configured via solr.xml 
(optionally, for those who need it). 

Have you foreseen any disruption to that call due to this move?

I am assuming the collection API handler for that call will now need to use a 
different import for the new path?

> Migrating HDFS into a package
> -
>
> Key: SOLR-14660
> URL: https://issues.apache.org/jira/browse/SOLR-14660
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Ishan Chattopadhyaya
>Priority: Major
>
> Following up on the deprecation of HDFS (SOLR-14021), we need to work on 
> isolating it away from Solr core and making a package for this. This issue is 
> to track the efforts for that.






[jira] [Updated] (SOLR-14614) Add Simplified Aggregation Interface to Streaming Expression

2020-07-01 Thread Aroop (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aroop updated SOLR-14614:
-
Description: 
For data analytics use cases, the standard workflow is:
 # Find a pattern
 # Then aggregate by certain dimensions
 # Then compute metrics (like count, sum, avg)
 # Sort by a dimension or metric
 # Look at the top-n
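
The five steps listed above can be sketched over in-memory records like this. It is only an illustration of the workflow, not of how Solr executes it; the helper `top_n_counts` and the sample records are hypothetical.

```python
from collections import Counter


def top_n_counts(records, predicate, dimension, n):
    """Filter, group by one dimension, count, sort by count, keep the top n."""
    # 1. Find a pattern: keep only the matching records.
    matching = (r for r in records if predicate(r))
    # 2-3. Aggregate by the dimension and compute the count metric.
    counts = Counter(r[dimension] for r in matching)
    # 4. Sort by the metric, descending (ties broken by dimension value).
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    # 5. Look at the top-n.
    return ranked[:n]


# Hypothetical sample data.
people = [
    {"name": "alex", "city": "Austin"},
    {"name": "alex", "city": "Boston"},
    {"name": "alex", "city": "Austin"},
    {"name": "sam", "city": "Austin"},
]
result = top_n_counts(people, lambda r: r["name"] == "alex", "city", 10)
```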

This functionality has been available through many different interfaces in 
Solr in the past, but only streaming expressions can deliver results in a 
scalable, performant and stable manner on big-data-scale systems.

However, one barrier to entry is that the streaming expression query interface 
is not simple enough.

To give an example of how involved a streaming expression can get when it has 
to work on large-scale systems, consider this query: {color:#4c9aff} _find the 
top 10 cities where someone named Alex works, with the respective counts_{color}
{code:java}
qt=/stream=facet=
select( top( rollup(sort(by%3D"city+asc",
   +plist( 
  
select(facet(collection1,+q%3D"(*:*+AND+name:alex)",+buckets%3D"city",+bucketSizeLimit%3D"2010",+bucketSorts%3D"count(*)+desc",+count(*)),+city,+count(*)+as+Nj3bXa),

  
select(facet(collection2,+q%3D"(*:*+AND+name:alex)",+buckets%3D"city",+bucketSizeLimit%3D"2010",+bucketSorts%3D"count(*)+desc",+count(*)),+city,+count(*)+as+Nj3bXa)
 )),
+over%3D"city",+sum(Nj3bXa)),
+n%3D"10",+sort%3D"sum(Nj3bXa)+desc"),
+city,+sum(Nj3bXa)+as+Nj3bXa)


{code}
This is a query on an alias with 2 collections behind it representing 2 data 
partitions, which is a requirement of sorts in big data systems. This is one of 
the only ways to get information from billions of records in a matter of 
seconds, which is awesome in terms of capability and performance.

But one can see how involved the syntax is in the current scheme, and that is 
a barrier to entry for new adopters.
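
What the expression above does across the two partitions can be simulated in plain Python: fetch per-partition counts in parallel (the role of plist), order them by city (sort), sum the counts per city (rollup), and keep the highest totals (top). This is a sketch under assumptions: `merge_partition_counts` and the two fetcher callables are hypothetical stand-ins for the per-collection facet queries.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import groupby
from operator import itemgetter


def merge_partition_counts(partition_fetchers, n):
    """Merge per-partition (city, count) tuples into a global top-n."""
    # plist: run each partition's facet query in parallel.
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(lambda fetch: fetch(), partition_fetchers))
    # sort(by="city asc"): grouping below needs input ordered by city.
    tuples = sorted((t for part in partials for t in part), key=itemgetter("city"))
    # rollup(over="city", sum(count)): one total per city.
    rolled = [
        {"city": city, "count": sum(t["count"] for t in group)}
        for city, group in groupby(tuples, key=itemgetter("city"))
    ]
    # top(n=..., sort="sum desc"): keep the n largest totals.
    return sorted(rolled, key=lambda t: (-t["count"], t["city"]))[:n]


# Hypothetical per-partition results.
collection1 = lambda: [{"city": "Austin", "count": 3}, {"city": "Boston", "count": 1}]
collection2 = lambda: [{"city": "Austin", "count": 2}, {"city": "Denver", "count": 4}]
merged = merge_partition_counts([collection1, collection2], n=2)
```

Because each partition only returns a bounded number of buckets (bucketSizeLimit in the expression above), merged totals may be approximate for cities that fall below that cutoff in one partition.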

 

This Jira is to track the work of creating a simplified analytics endpoint 
that augments streaming expressions.

A starting proposal is for the endpoint to take these query parameters:
{code:java}
/analytics?action=aggregate=*:*=name:alex=city=count=count=desc=10{code}
This is equivalent to the SQL that an analyst would write:
{code:java}
select city, count(*) from collection where name = 'alex'
group by city order by count(*) desc limit 10;{code}
On the Solr side this would be translated to the best possible streaming 
expression using *rollups, top, sort, plist* etc., all done transparently 
to the user.
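
A minimal sketch of such a translation, assuming a hypothetical helper `build_top_n_expression` that takes the simplified parameters and emits an expression shaped like the one earlier in this description. The function name, its parameters, and the `cnt` alias are all assumptions for illustration; it does no escaping of quotes in the query and does not URL-encode the result.

```python
def build_top_n_expression(collections, q, bucket, n, per_partition_limit=2010):
    """Build a top-n count expression over several collections (hypothetical)."""
    # One facet per collection, each renamed to a common alias for the rollup.
    facets = [
        f'select(facet({c}, q="{q}", buckets="{bucket}", '
        f'bucketSizeLimit="{per_partition_limit}", '
        f'bucketSorts="count(*) desc", count(*)), '
        f'{bucket}, count(*) as cnt)'
        for c in collections
    ]
    # Run the facets in parallel and order them by the bucket field.
    merged = f'sort(by="{bucket} asc", plist({", ".join(facets)}))'
    # Sum the per-collection counts, then keep the n largest totals.
    rolled = f'rollup({merged}, over="{bucket}", sum(cnt))'
    top = f'top({rolled}, n="{n}", sort="sum(cnt) desc")'
    return f'select({top}, {bucket}, sum(cnt) as cnt)'


expr = build_top_n_expression(
    ["collection1", "collection2"], "(*:* AND name:alex)", "city", 10
)
```

The user would only supply the filter, the grouping field, and the limit; the nesting of plist, sort, rollup and top stays hidden behind the endpoint.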

Here's to making the power of streaming expressions simpler to use for all.

 

 

  was:
For the Data Analytics use cases the standard use case is:
 # Find a pattern
 # Then Aggregate by certain dimensions
 # Then compute metrics (like count, sum, avg)
 # Sort by a dimension or metric
 # look at top-n

This functionality has been available over many different interfaces in the 
past on solr, but only streaming expressions have the ability to deliver 
results in a scalable, performant and stable manner for systems that have large 
data to the tune of Big data systems.

However, one barrier to entry is the query interface, not being simple enough 
in streaming expressions.

to give an example of how involved the corresponding streaming expression can 
get, to get it to work on large scale systems, _find me top 10 cities where 
someone named Alex works with the respective counts_
{code:java}
qt=/stream=facet=
select( top( rollup(sort(by%3D"city+asc",
   +plist( 
  
select(facet(collection1,+q%3D"(*:*+AND+name:alex)",+buckets%3D"city",+bucketSizeLimit%3D"2010",+bucketSorts%3D"count(*)+desc",+count(*)),+city,+count(*)+as+Nj3bXa),

  
select(facet(collection2,+q%3D"(*:*+AND+name:alex)",+buckets%3D"city",+bucketSizeLimit%3D"2010",+bucketSorts%3D"count(*)+desc",+count(*)),+city,+count(*)+as+Nj3bXa)
 )),
+over%3D"city",+sum(Nj3bXa)),
+n%3D"10",+sort%3D"sum(Nj3bXa)+desc"),
+city,+sum(Nj3bXa)+as+Nj3bXa)


{code}
This is a query on an alias with 2 collections behind it representing 2 data 
partitions, which is a requirement of sorts in big data systems. This is one of 
the only ways to get information from Billions of records in a matter of 
seconds. But one can see how involved this syntax can be in the current scheme 
and is a barrier to entry for new adopters.

 

This Jira is to track the work of creating a simplified analytics endpoint 
augmenting streaming expressions.

a starting proposal is to have the endpoint have these query parameters:
{code:java}
/analytics?action=aggregate=*:*=name:alex=city=count=count=desc=10{code}
This is equivalent to a sql that an analyst would write:
{code:java}
select city, count(*) from collection where name = 'alex'
group by city order by count(*) desc limit 10;{code}
On the solr side this would get translated to the best possible streaming 

[jira] [Updated] (SOLR-14614) Add Simplified Aggregation Interface to Streaming Expression

2020-07-01 Thread Aroop (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aroop updated SOLR-14614:
-
Description: 
For the Data Analytics use cases the standard use case is:
 # Find a pattern
 # Then Aggregate by certain dimensions
 # Then compute metrics (like count, sum, avg)
 # Sort by a dimension or metric
 # look at top-n

This functionality has been available over many different interfaces in the 
past on solr, but only streaming expressions have the ability to deliver 
results in a scalable, performant and stable manner for systems that have large 
data to the tune of Big data systems.

However, one barrier to entry is the query interface, not being simple enough 
in streaming expressions.

to give an example of how involved the corresponding streaming expression can 
get, to get it to work on large scale systems, _find me top 10 cities where 
someone named Alex works with the respective counts_
{code:java}
qt=/stream=facet=
select( top( rollup(sort(by%3D"city+asc",
   +plist( 
  
select(facet(collection1,+q%3D"(*:*+AND+name:alex)",+buckets%3D"city",+bucketSizeLimit%3D"2010",+bucketSorts%3D"count(*)+desc",+count(*)),+city,+count(*)+as+Nj3bXa),

  
select(facet(collection2,+q%3D"(*:*+AND+name:alex)",+buckets%3D"city",+bucketSizeLimit%3D"2010",+bucketSorts%3D"count(*)+desc",+count(*)),+city,+count(*)+as+Nj3bXa)
 )),
+over%3D"city",+sum(Nj3bXa)),
+n%3D"10",+sort%3D"sum(Nj3bXa)+desc"),
+city,+sum(Nj3bXa)+as+Nj3bXa)


{code}
This is a query on an alias with 2 collections behind it representing 2 data 
partitions, which is a requirement of sorts in big data systems. This is one of 
the only ways to get information from Billions of records in a matter of 
seconds. But one can see how involved this syntax can be in the current scheme 
and is a barrier to entry for new adopters.

 

This Jira is to track the work of creating a simplified analytics endpoint 
augmenting streaming expressions.

a starting proposal is to have the endpoint have these query parameters:
{code:java}
/analytics?action=aggregate=*:*=name:alex=city=count=count=desc=10{code}
This is equivalent to a sql that an analyst would write:
{code:java}
select city, count(*) from collection where name = 'alex'
group by city order by count(*) desc limit 10;{code}
On the solr side this would get translated to the best possible streaming 
expression using *rollups, top, sort, plist* etc.; but all done transparently 
to the user.

Here's to making the power of Streaming expressions simpler to use for all.

 

 

  was:
For the Data Analytics use cases the standard use case is:
 # Find a pattern
 # Then Aggregate by certain dimensions
 # Then compute metrics (like count, sum, avg)
 # Sort by a dimension or metric
 # look at top-n

This functionality has been available over many different interfaces in the 
past on solr, but only streaming expressions have the ability to deliver 
results in a scalable, performant and stable manner for systems that have large 
data to the tune of Big data systems.

However, one barrier to entry is the query interface, not being simple enough 
in streaming expressions.

to give an example of how involved the corresponding streaming expression can 
get, to get it to work on large scale systems, 
{code:java}
qt=/stream=facet=
select( top( rollup(sort(by%3D"city+asc",
   +plist( 
  
select(facet(collection1,+q%3D"(*:*+AND+name:alex)",+buckets%3D"city",+bucketSizeLimit%3D"2010",+bucketSorts%3D"count(*)+desc",+count(*)),+city,+count(*)+as+Nj3bXa),

  
select(facet(collection2,+q%3D"(*:*+AND+name:alex)",+buckets%3D"city",+bucketSizeLimit%3D"2010",+bucketSorts%3D"count(*)+desc",+count(*)),+city,+count(*)+as+Nj3bXa)
 )),
+over%3D"city",+sum(Nj3bXa)),
+n%3D"10",+sort%3D"sum(Nj3bXa)+desc"),
+city,+sum(Nj3bXa)+as+Nj3bXa)


{code}
This is a query on an alias with 2 collections behind it representing 2 data 
partitions, which is a requirement of sorts in big data systems. This is one of 
the only ways to get information from Billions of records in a matter of 
seconds. But one can see how involved this syntax can be in the current scheme 
and is a barrier to entry for new adopters.

 

This Jira is to track the work of creating a simplified analytics endpoint 
augmenting streaming expressions.

a starting proposal is to have the endpoint have these query parameters:
{code:java}
/analytics?action=aggregate=*:*=name:alex=city=count=count=desc=10{code}
This is equivalent to a sql that an analyst would write:
{code:java}
select city, count(*) from collection where name = 'alex'
group by city order by count(*) desc limit 10;{code}
On the solr side this would get translated to the best possible streaming 
expression using *rollups, top, sort, plist* etc.; but all done transparently 
to the user.

 

Here's to making the power of Streaming expressions simpler to use 

[jira] [Updated] (SOLR-14614) Add Simplified Aggregation Interface to Streaming Expression

2020-07-01 Thread Aroop (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aroop updated SOLR-14614:
-
Description: 
For the Data Analytics use cases the standard use case is:
 # Find a pattern
 # Then Aggregate by certain dimensions
 # Then compute metrics (like count, sum, avg)
 # Sort by a dimension or metric
 # look at top-n

This functionality has been available over many different interfaces in the 
past on solr, but only streaming expressions have the ability to deliver 
results in a scalable, performant and stable manner for systems that have large 
data to the tune of Big data systems.

However, one barrier to entry is the query interface, not being simple enough 
in streaming expressions.

to give an example of how involved the corresponding streaming expression can 
get, to get it to work on large scale systems, 
{code:java}
qt=/stream=facet=
select( top( rollup(sort(by%3D"city+asc",
   +plist( 
  
select(facet(collection1,+q%3D"(*:*+AND+name:alex)",+buckets%3D"city",+bucketSizeLimit%3D"2010",+bucketSorts%3D"count(*)+desc",+count(*)),+city,+count(*)+as+Nj3bXa),

  
select(facet(collection2,+q%3D"(*:*+AND+name:alex)",+buckets%3D"city",+bucketSizeLimit%3D"2010",+bucketSorts%3D"count(*)+desc",+count(*)),+city,+count(*)+as+Nj3bXa)
 )),
+over%3D"city",+sum(Nj3bXa)),
+n%3D"10",+sort%3D"sum(Nj3bXa)+desc"),
+city,+sum(Nj3bXa)+as+Nj3bXa)


{code}
This is a query on an alias with 2 collections behind it representing 2 data 
partitions, which is a requirement of sorts in big data systems. This is one of 
the only ways to get information from Billions of records in a matter of 
seconds. But one can see how involved this syntax can be in the current scheme 
and is a barrier to entry for new adopters.

 

This Jira is to track the work of creating a simplified analytics endpoint 
augmenting streaming expressions.

a starting proposal is to have the endpoint have these query parameters:
{code:java}
/analytics?action=aggregate=*:*=name:alex=city=count=count=desc=10{code}
This is equivalent to a sql that an analyst would write:
{code:java}
select city, count(*) from collection where name = 'alex'
group by city order by count(*) desc limit 10;{code}
On the solr side this would get translated to the best possible streaming 
expression using *rollups, top, sort, plist* etc.; but all done transparently 
to the user.

 

Here's to making the power of Streaming expressions simpler to use for all.

 

 

  was:
For the Data Analytics use cases the standard use case is:
 # Find a pattern
 # Then Aggregate by certain dimensions
 # Then compute metrics (like count, sum, avg)
 # Sort by a dimension or metric
 # look at top-n

This functionality has been available over many different interfaces in the 
past on solr, but only streaming expressions have the ability to deliver 
results in a scalable, performant and stable manner for systems that have large 
data to the tune of Big data systems.

However, one barrier to entry is the query interface, not being simple enough 
in streaming expressions.

This Jira is to track the work of creating a simplified analytics endpoint 
augmenting streaming expressions.

a starting proposal is to have the endpoint have these query parameters:
{code:java}
/analytics?action=aggregate=*:*=name:alex*=age,city=count=count=desc=10{code}
This is equivalent to a sql that an analyst would write:
{code:java}
select age, city, count(*) from collection where name like 'alex%'
group by age, city order by age desc limit 10;{code}
 

On the solr side this would get translated to the best possible streaming 
expression using *rollups, top, sort, plist* etc.; but all done transparently 
to the user.

 

 


> Add Simplified Aggregation Interface to Streaming Expression
> 
>
> Key: SOLR-14614
> URL: https://issues.apache.org/jira/browse/SOLR-14614
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: query, query parsers, streaming expressions
>Affects Versions: 7.7.2, 8.4.1
>Reporter: Aroop
>Priority: Major
>
> For the Data Analytics use cases the standard use case is:
>  # Find a pattern
>  # Then Aggregate by certain dimensions
>  # Then compute metrics (like count, sum, avg)
>  # Sort by a dimension or metric
>  # look at top-n
> This functionality has been available over many different interfaces in the 
> past on solr, but only streaming expressions have the ability to deliver 
> results in a scalable, performant and stable manner for systems that have 
> large data to the tune of Big data systems.
> However, one barrier to entry is the query interface, not being simple enough 
> in streaming expressions.
> to give an example of how involved the 

[jira] [Updated] (SOLR-14614) Add Simplified Aggregation Interface to Streaming Expression

2020-07-01 Thread Aroop (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aroop updated SOLR-14614:
-
Description: 
For the Data Analytics use cases the standard use case is:
 # Find a pattern
 # Then Aggregate by certain dimensions
 # Then compute metrics (like count, sum, avg)
 # Sort by a dimension or metric
 # look at top-n

This functionality has been available over many different interfaces in the 
past on solr, but only streaming expressions have the ability to deliver 
results in a scalable, performant and stable manner for systems that have large 
data to the tune of Big data systems.

However, one barrier to entry is the query interface, not being simple enough 
in streaming expressions.

This Jira is to track the work of creating a simplified analytics endpoint 
augmenting streaming expressions.

a starting proposal is to have the endpoint have these query parameters:
{code:java}
/analytics?action=aggregate=*:*=name:alex*=age,city=count=count=desc=10{code}
This is equivalent to a sql that an analyst would write:
{code:java}
select age, city, count(*) from collection where name like 'alex%'
group by age, city order by age desc limit 10;{code}
 

On the solr side this would get translated to the best possible streaming 
expression using *rollups, top, sort, plist* etc.; but all done transparently 
to the user.

 

 

  was:
For the Data Analytics use cases the standard use case is:
 # Find a pattern
 # Then Aggregate by certain dimensions
 # Then compute metrics (like count, sum, avg)
 # Sort by a dimension or metric
 # look at top-n

This functionality has been available over many different interfaces in the 
past on solr, but only streaming expressions have the ability to deliver 
results in a scalable, performant and stable manner for systems that have large 
data to the tune of Big data systems.

However, one barrier to entry is the query interface, not being simple enough 
in streaming expressions.

This Jira is to track the work of creating a simplified analytics endpoint 
augmenting streaming expressions.

a starting proposal is to have the endpoint have these query parameters:
{code:java}
/analytics?q=*:*=name:alex*=age,city=count=count=desc=10{code}
This is equivalent to a sql that an analyst would write:
{code:java}
select age, city, count(*) from collection where name like 'alex%'
group by age, city order by age desc limit 10;{code}
 

On the solr side this would get translated to the best possible streaming 
expression using *rollups, top, sort, plist* etc.; but all done transparently 
to the user.

 

 


> Add Simplified Aggregation Interface to Streaming Expression
> 
>
> Key: SOLR-14614
> URL: https://issues.apache.org/jira/browse/SOLR-14614
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: query, query parsers, streaming expressions
>Affects Versions: 7.7.2, 8.4.1
>Reporter: Aroop
>Priority: Major
>
> For the Data Analytics use cases the standard use case is:
>  # Find a pattern
>  # Then Aggregate by certain dimensions
>  # Then compute metrics (like count, sum, avg)
>  # Sort by a dimension or metric
>  # look at top-n
> This functionality has been available over many different interfaces in the 
> past on solr, but only streaming expressions have the ability to deliver 
> results in a scalable, performant and stable manner for systems that have 
> large data to the tune of Big data systems.
> However, one barrier to entry is the query interface, not being simple enough 
> in streaming expressions.
> This Jira is to track the work of creating a simplified analytics endpoint 
> augmenting streaming expressions.
> a starting proposal is to have the endpoint have these query parameters:
> {code:java}
> /analytics?action=aggregate=*:*=name:alex*=age,city=count=count=desc=10{code}
> This is equivalent to a sql that an analyst would write:
> {code:java}
> select age, city, count(*) from collection where name like 'alex%'
> group by age, city order by age desc limit 10;{code}
>  
> On the solr side this would get translated to the best possible streaming 
> expression using *rollups, top, sort, plist* etc.; but all done transparently 
> to the user.
>  
>  






[jira] [Updated] (SOLR-14614) Add Simplified Aggregation Interface to Streaming Expression

2020-07-01 Thread Aroop (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aroop updated SOLR-14614:
-
Description: 
For the Data Analytics use cases the standard use case is:
 # Find a pattern
 # Then Aggregate by certain dimensions
 # Then compute metrics (like count, sum, avg)
 # Sort by a dimension or metric
 # look at top-n

This functionality has been available over many different interfaces in the 
past on solr, but only streaming expressions have the ability to deliver 
results in a scalable, performant and stable manner for systems that have large 
data to the tune of Big data systems.

However, one barrier to entry is the query interface, not being simple enough 
in streaming expressions.

This Jira is to track the work of creating a simplified analytics endpoint 
augmenting streaming expressions.

a starting proposal is to have the endpoint have these query parameters:
{code:java}
/analytics?q=*:*=name:alex*=age,city=count=count=desc=10{code}
This is equivalent to a sql that an analyst would write:
{code:java}
select age, city, count(*) from collection where name like 'alex%'
group by age, city order by age desc limit 10;{code}
 

On the solr side this would get translated to the best possible streaming 
expression using _*rollups, top, sort, plist* etc.; b_ut all done transparently 
to the user.

 

 

  was:
For the Data Analytics use cases the standard use case is:
 # Find a pattern
 # Then Aggregate by certain dimensions
 # Then compute metrics (like count, sum, avg)
 # Sort by a dimension or metric
 # look at top-n

This functionality has been available over many different interfaces in the 
past on solr, but only streaming expressions have the ability to deliver 
results in a scalable, performant and stable manner for systems that have large 
data to the tune of Big data systems.

However, one barrier to entry is the query interface, not being simple enough 
in streaming expressions.

This Jira is to track the work of creating a simplified analytics endpoint 
augmenting streaming expressions.

a starting proposal is to have the endpoint have these query parameters:
{code:java}
/analytics=*:*=name:alex*=age,city=count=count=desc=10{code}
This is equivalent to a sql that an analyst would write:
{code:java}
select age, city, count(*) from collection where name like 'alex%'
group by age, city order by age desc limit 10;{code}
 

On the solr side this would get translated to the best possible streaming 
expression using _*rollups, top, sort, plist* etc.; b_ut all done transparently 
to the user.

 

 


> Add Simplified Aggregation Interface to Streaming Expression
> 
>
> Key: SOLR-14614
> URL: https://issues.apache.org/jira/browse/SOLR-14614
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: query, query parsers, streaming expressions
>Affects Versions: 7.7.2, 8.4.1
>Reporter: Aroop
>Priority: Major
>
> For the Data Analytics use cases the standard use case is:
>  # Find a pattern
>  # Then Aggregate by certain dimensions
>  # Then compute metrics (like count, sum, avg)
>  # Sort by a dimension or metric
>  # look at top-n
> This functionality has been available over many different interfaces in the 
> past on solr, but only streaming expressions have the ability to deliver 
> results in a scalable, performant and stable manner for systems that have 
> large data to the tune of Big data systems.
> However, one barrier to entry is the query interface, not being simple enough 
> in streaming expressions.
> This Jira is to track the work of creating a simplified analytics endpoint 
> augmenting streaming expressions.
> a starting proposal is to have the endpoint have these query parameters:
> {code:java}
> /analytics?q=*:*=name:alex*=age,city=count=count=desc=10{code}
> This is equivalent to a sql that an analyst would write:
> {code:java}
> select age, city, count(*) from collection where name like 'alex%'
> group by age, city order by age desc limit 10;{code}
>  
> On the solr side this would get translated to the best possible streaming 
> expression using _*rollups, top, sort, plist* etc.; b_ut all done 
> transparently to the user.
>  
>  






[jira] [Updated] (SOLR-14614) Add Simplified Aggregation Interface to Streaming Expression

2020-07-01 Thread Aroop (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aroop updated SOLR-14614:
-
Description: 
For the Data Analytics use cases the standard use case is:
 # Find a pattern
 # Then Aggregate by certain dimensions
 # Then compute metrics (like count, sum, avg)
 # Sort by a dimension or metric
 # look at top-n

This functionality has been available over many different interfaces in the 
past on solr, but only streaming expressions have the ability to deliver 
results in a scalable, performant and stable manner for systems that have large 
data to the tune of Big data systems.

However, one barrier to entry is the query interface, not being simple enough 
in streaming expressions.

This Jira is to track the work of creating a simplified analytics endpoint 
augmenting streaming expressions.

a starting proposal is to have the endpoint have these query parameters:
{code:java}
/analytics?q=*:*=name:alex*=age,city=count=count=desc=10{code}
This is equivalent to a sql that an analyst would write:
{code:java}
select age, city, count(*) from collection where name like 'alex%'
group by age, city order by age desc limit 10;{code}
 

On the Solr side this would be translated into the best possible streaming 
expression using *rollups, top, sort, plist* etc., but all done transparently 
to the user.

 

 

  was:
For the Data Analytics use cases the standard use case is:
 # Find a pattern
 # Then Aggregate by certain dimensions
 # Then compute metrics (like count, sum, avg)
 # Sort by a dimension or metric
 # look at top-n

This functionality has been available over many different interfaces in the 
past on solr, but only streaming expressions have the ability to deliver 
results in a scalable, performant and stable manner for systems that have large 
data to the tune of Big data systems.

However, one barrier to entry is the query interface, not being simple enough 
in streaming expressions.

This Jira is to track the work of creating a simplified analytics endpoint 
augmenting streaming expressions.

a starting proposal is to have the endpoint have these query parameters:
{code:java}
/analytics?q=*:*=name:alex*=age,city=count=count=desc=10{code}
This is equivalent to a sql that an analyst would write:
{code:java}
select age, city, count(*) from collection where name like 'alex%'
group by age, city order by age desc limit 10;{code}
 

On the solr side this would get translated to the best possible streaming 
expression using _*rollups, top, sort, plist* etc.; b_ut all done transparently 
to the user.

 

 


> Add Simplified Aggregation Interface to Streaming Expression
> 
>
> Key: SOLR-14614
> URL: https://issues.apache.org/jira/browse/SOLR-14614
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: query, query parsers, streaming expressions
>Affects Versions: 7.7.2, 8.4.1
>Reporter: Aroop
>Priority: Major
>
> For data analytics use cases, the standard workflow is:
>  # Find a pattern
>  # Aggregate by certain dimensions
>  # Compute metrics (like count, sum, avg)
>  # Sort by a dimension or metric
>  # Look at the top-n
> This functionality has been available through many different interfaces in 
> Solr in the past, but only streaming expressions can deliver results in a 
> scalable, performant and stable manner for systems holding data at big-data 
> scale.
> However, one barrier to entry is that the streaming expressions query 
> interface is not simple enough.
> This Jira tracks the work of creating a simplified analytics endpoint that 
> augments streaming expressions.
> A starting proposal is for the endpoint to accept these query parameters:
> {code:java}
> /analytics?q=*:*=name:alex*=age,city=count=count=desc=10{code}
> This is equivalent to the SQL an analyst would write:
> {code:java}
> select age, city, count(*) from collection where name like 'alex%'
> group by age, city order by age desc limit 10;{code}
>  
> On the Solr side this would be translated into the best possible streaming 
> expression using *rollups, top, sort, plist* etc., but all done transparently 
> to the user.
>  
>  






[jira] [Created] (SOLR-14614) Add Simplified Aggregation Interface to Streaming Expression

2020-07-01 Thread Aroop (Jira)
Aroop created SOLR-14614:


 Summary: Add Simplified Aggregation Interface to Streaming 
Expression
 Key: SOLR-14614
 URL: https://issues.apache.org/jira/browse/SOLR-14614
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
  Components: query, query parsers, streaming expressions
Affects Versions: 8.4.1, 7.7.2
Reporter: Aroop


For data analytics use cases, the standard workflow is:
 # Find a pattern
 # Aggregate by certain dimensions
 # Compute metrics (like count, sum, avg)
 # Sort by a dimension or metric
 # Look at the top-n

This functionality has been available through many different interfaces in 
Solr in the past, but only streaming expressions can deliver results in a 
scalable, performant and stable manner for systems holding data at big-data 
scale.

However, one barrier to entry is that the streaming expressions query 
interface is not simple enough.

This Jira tracks the work of creating a simplified analytics endpoint that 
augments streaming expressions.

A starting proposal is for the endpoint to accept these query parameters:
{code:java}
/analytics=*:*=name:alex*=age,city=count=count=desc=10{code}
This is equivalent to the SQL an analyst would write:
{code:java}
select age, city, count(*) from collection where name like 'alex%'
group by age, city order by age desc limit 10;{code}
 

On the Solr side this would be translated into the best possible streaming 
expression using *rollups, top, sort, plist* etc., but all done transparently 
to the user.
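
To make the five steps and the SQL above concrete, here is a minimal in-memory Java sketch of the same pipeline: match a name prefix, group by (age, city), count per group, sort by age descending, keep the top n. It is only an illustration of the semantics, not a Solr API and not a streaming expression. The class, method and field names and the sample data are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Solr.
class AggregationSketch {

    /** A document with the three fields used in the example query. */
    static final class Person {
        final String name;
        final int age;
        final String city;

        Person(String name, int age, String city) {
            this.name = name;
            this.age = age;
            this.city = city;
        }
    }

    /** One result row: the (age, city) dimensions plus the count metric. */
    static final class Row {
        final int age;
        final String city;
        long count;

        Row(int age, String city) {
            this.age = age;
            this.city = city;
        }
    }

    static List<Row> topGroups(List<Person> docs, String namePrefix, int limit) {
        // Steps 1-3: match the pattern, group by (age, city), count per group.
        Map<String, Row> groups = new LinkedHashMap<>();
        for (Person p : docs) {
            if (!p.name.startsWith(namePrefix)) {
                continue;
            }
            String key = p.age + "|" + p.city;
            groups.computeIfAbsent(key, k -> new Row(p.age, p.city)).count++;
        }
        // Step 4: sort by the age dimension, descending, as in the SQL.
        List<Row> rows = new ArrayList<>(groups.values());
        rows.sort(Comparator.comparingInt((Row r) -> r.age).reversed());
        // Step 5: keep only the top-n rows.
        return new ArrayList<>(rows.subList(0, Math.min(limit, rows.size())));
    }

    public static void main(String[] args) {
        List<Person> docs = List.of(
            new Person("alex", 30, "NYC"),
            new Person("alexa", 30, "NYC"),
            new Person("alex", 25, "LA"),
            new Person("bob", 40, "SF"),
            new Person("alexis", 35, "NYC"));
        for (Row r : topGroups(docs, "alex", 2)) {
            System.out.println(r.age + " " + r.city + " " + r.count);
        }
    }
}
```

A real implementation would push each of these steps down to the collection rather than materialising documents in memory; the sketch only pins down what the endpoint is expected to return.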

 

 






[jira] [Updated] (SOLR-14316) Remove unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() method

2020-03-11 Thread Aroop (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aroop updated SOLR-14316:
-
Description: 
There is an unchecked type conversion warning in JavaBinCodec's readMapEntry's 
equals() method. 

This change removes that warning by using a checked conversion and also adds 
tests for a previously untested API.

  was:There is an unchecked type conversion warning in JavaBinCodec's 
readMapEntry's equals() method. 


> Remove unchecked type conversion warning in JavaBinCodec's readMapEntry's 
> equals() method
> -
>
> Key: SOLR-14316
> URL: https://issues.apache.org/jira/browse/SOLR-14316
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrJ
>Affects Versions: 7.7.2, 8.4.1
>Reporter: Aroop
>Priority: Minor
>  Labels: patch
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> There is an unchecked type conversion warning in JavaBinCodec's 
> readMapEntry's equals() method. 
> This change removes that warning by using a checked conversion and also 
> adds tests for a previously untested API.
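
For readers unfamiliar with this kind of warning, the following is a minimal sketch of a checked conversion in a Map.Entry equals() method. It is not the attached patch and not the actual JavaBinCodec code. It assumes the warning came from casting the argument to a parameterized Map.Entry type, which is one common cause; the class names CheckedEntryEquals and PairEntry are hypothetical.

```java
import java.util.AbstractMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical illustration only; this is not the JavaBinCodec source.
class CheckedEntryEquals {

    /** A read-only map entry whose equals() avoids any unchecked cast. */
    static final class PairEntry implements Map.Entry<Object, Object> {
        private final Object key;
        private final Object value;

        PairEntry(Object key, Object value) {
            this.key = key;
            this.value = value;
        }

        @Override
        public Object getKey() {
            return key;
        }

        @Override
        public Object getValue() {
            return value;
        }

        @Override
        public Object setValue(Object newValue) {
            throw new UnsupportedOperationException("read-only entry");
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) {
                return true;
            }
            // Map.Entry<?, ?> is a reifiable type, so both the instanceof
            // test and the cast below are fully checked by the compiler.
            if (!(obj instanceof Map.Entry<?, ?>)) {
                return false;
            }
            Map.Entry<?, ?> other = (Map.Entry<?, ?>) obj;
            return Objects.equals(key, other.getKey())
                && Objects.equals(value, other.getValue());
        }

        @Override
        public int hashCode() {
            // Same formula as the Map.Entry contract specifies.
            return Objects.hashCode(key) ^ Objects.hashCode(value);
        }
    }

    public static void main(String[] args) {
        Map.Entry<Object, Object> a = new PairEntry("id", 1);
        Map.Entry<String, Integer> b = new AbstractMap.SimpleEntry<>("id", 1);
        System.out.println(a.equals(b));
        System.out.println(a.equals("id"));
    }
}
```

Casting to Map.Entry<Object, Object> instead would compile, but with an unchecked warning, because the type arguments cannot be verified at run time.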






[jira] [Updated] (SOLR-14316) Remove unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() method

2020-03-10 Thread Aroop (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aroop updated SOLR-14316:
-
Attachment: SOLR-14316.patch

> Remove unchecked type conversion warning in JavaBinCodec's readMapEntry's 
> equals() method
> -
>
> Key: SOLR-14316
> URL: https://issues.apache.org/jira/browse/SOLR-14316
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrJ
>Affects Versions: 7.7.2, 8.4.1
>Reporter: Aroop
>Priority: Minor
>  Labels: patch
>
> There is an unchecked type conversion warning in JavaBinCodec's 
> readMapEntry's equals() method. 






[jira] [Updated] (SOLR-14316) Remove unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() method

2020-03-10 Thread Aroop (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aroop updated SOLR-14316:
-
Attachment: (was: SOLR-14316.patch)

> Remove unchecked type conversion warning in JavaBinCodec's readMapEntry's 
> equals() method
> -
>
> Key: SOLR-14316
> URL: https://issues.apache.org/jira/browse/SOLR-14316
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrJ
>Affects Versions: 7.7.2, 8.4.1
>Reporter: Aroop
>Priority: Minor
>  Labels: patch
>
> There is an unchecked type conversion warning in JavaBinCodec's 
> readMapEntry's equals() method. 






[jira] [Updated] (SOLR-14316) Remove unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() method

2020-03-10 Thread Aroop (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aroop updated SOLR-14316:
-
Labels: patch  (was: patch warnings)

> Remove unchecked type conversion warning in JavaBinCodec's readMapEntry's 
> equals() method
> -
>
> Key: SOLR-14316
> URL: https://issues.apache.org/jira/browse/SOLR-14316
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrJ
>Affects Versions: 7.7.2, 8.4.1
>Reporter: Aroop
>Priority: Minor
>  Labels: patch
>
> There is an unchecked type conversion warning in JavaBinCodec's 
> readMapEntry's equals() method. 






[jira] [Updated] (SOLR-14316) Remove unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() method

2020-03-10 Thread Aroop (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aroop updated SOLR-14316:
-
Labels: patch warnings  (was: warnings)

> Remove unchecked type conversion warning in JavaBinCodec's readMapEntry's 
> equals() method
> -
>
> Key: SOLR-14316
> URL: https://issues.apache.org/jira/browse/SOLR-14316
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrJ
>Affects Versions: 7.7.2, 8.4.1
>Reporter: Aroop
>Priority: Minor
>  Labels: patch, warnings
>
> There is an unchecked type conversion warning in JavaBinCodec's 
> readMapEntry's equals() method. 






[jira] [Updated] (SOLR-14316) Remove unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() method

2020-03-09 Thread Aroop (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aroop updated SOLR-14316:
-
Summary: Remove unchecked type conversion warning in JavaBinCodec's 
readMapEntry's equals() method  (was: Remove there was an unchecked type 
conversion warning in JavaBinCodec's readMapEntry's equals() method)

> Remove unchecked type conversion warning in JavaBinCodec's readMapEntry's 
> equals() method
> -
>
> Key: SOLR-14316
> URL: https://issues.apache.org/jira/browse/SOLR-14316
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrJ
>Affects Versions: 7.7.2, 8.4.1
>Reporter: Aroop
>Priority: Minor
>  Labels: warnings
>
> There is an unchecked type conversion warning in JavaBinCodec's 
> readMapEntry's equals() method. 






[jira] [Created] (SOLR-14316) Remove there was an unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() method

2020-03-09 Thread Aroop (Jira)
Aroop created SOLR-14316:


 Summary: Remove there was an unchecked type conversion warning in 
JavaBinCodec's readMapEntry's equals() method
 Key: SOLR-14316
 URL: https://issues.apache.org/jira/browse/SOLR-14316
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
  Components: SolrJ
Affects Versions: 8.4.1, 7.7.2
Reporter: Aroop


There is an unchecked type conversion warning in JavaBinCodec's readMapEntry's 
equals() method. 


