[jira] [Updated] (SOLR-14614) Add Simplified Aggregation Interface to Streaming Expression

2020-07-01 Thread Aroop (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aroop updated SOLR-14614:
-
Description: 
For data analytics use cases, the standard workflow is:
 # Find a pattern
 # Aggregate by certain dimensions
 # Compute metrics (such as count, sum, avg)
 # Sort by a dimension or metric
 # Look at the top-n results
This functionality has been available through many different interfaces in 
Solr in the past, but only streaming expressions can deliver results in a 
scalable, performant and stable manner for systems that hold data at 
big-data scale.

However, one barrier to entry is that the query interface of streaming 
expressions is not simple enough.

To give an example of how involved the corresponding streaming expression can 
get when it has to work on large-scale systems, consider {color:#4c9aff}_find 
the top 10 cities where someone named Alex works, with the respective 
counts_{color}:
{code:java}
qt=/stream=facet=
select( top( rollup(sort(by%3D"city+asc",
   +plist( 
  
select(facet(collection1,+q%3D"(*:*+AND+name:alex)",+buckets%3D"city",+bucketSizeLimit%3D"2010",+bucketSorts%3D"count(*)+desc",+count(*)),+city,+count(*)+as+Nj3bXa),

  
select(facet(collection2,+q%3D"(*:*+AND+name:alex)",+buckets%3D"city",+bucketSizeLimit%3D"2010",+bucketSorts%3D"count(*)+desc",+count(*)),+city,+count(*)+as+Nj3bXa)
 )),
+over%3D"city",+sum(Nj3bXa)),
+n%3D"10",+sort%3D"sum(Nj3bXa)+desc"),
+city,+sum(Nj3bXa)+as+Nj3bXa)


{code}
This query runs against an alias backed by 2 collections that represent 2 
data partitions, which is more or less a requirement in big data systems. It 
is one of the few ways to get information out of billions of records in a 
matter of seconds, which is impressive in terms of both capability and 
performance.

However, one can see how involved this syntax is in the current scheme, and 
that makes it a barrier to entry for new adopters.
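
Conceptually, the expression above computes per-city counts in each partition, sums them per city, and keeps the largest totals. A minimal Python sketch of that merge step follows; `merge_partition_counts` and the sample counts are hypothetical and only mimic what the rollup and top stages do.

```python
from collections import Counter


def merge_partition_counts(partition_counts, n):
    """Sum per-partition counts by key, then return the n largest totals."""
    totals = Counter()
    for counts in partition_counts:
        # Adds the counts key by key, similar in spirit to rollup(..., sum(...)).
        totals.update(counts)
    # Equivalent in spirit to top(n=..., sort="sum(...) desc").
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


# Hypothetical per-partition facet results (city -> count).
partition_1 = {"Austin": 5, "Boston": 3}
partition_2 = {"Austin": 2, "Chicago": 4}

print(merge_partition_counts([partition_1, partition_2], 2))
# [('Austin', 7), ('Chicago', 4)]
```

Note that in the real expression each per-partition facet is capped by bucketSizeLimit, so a city that falls outside the cap in one partition would presumably be undercounted in the merged result.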

 

This Jira tracks the work of creating a simplified analytics endpoint that 
augments streaming expressions.

A starting proposal is for the endpoint to accept these query parameters:
{code:java}
/analytics?action=aggregate=*:*=name:alex=city=count=count=desc=10{code}
This is equivalent to the SQL that an analyst would write:
{code:java}
select city, count(*) from collection where name = 'alex'
group by city order by count(*) desc limit 10;{code}
On the Solr side this would be translated into the best possible streaming 
expression using *rollup, top, sort, plist*, etc., all done transparently to 
the user.

Here's to making the power of streaming expressions simpler for everyone to use.
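
A rough sketch of such a translation, in Python, is shown below. It assembles an expression with the same shape as the example quoted earlier (in decoded form). The function name, its arguments and the `cnt` alias are assumptions made for illustration; they are not part of the proposal, and the output has not been validated against a running Solr.

```python
def build_top_n_expression(collections, query, dimension, limit, bucket_size_limit=2010):
    """Build a top-n count expression over several collections (hypothetical helper)."""
    alias = "cnt"  # hypothetical alias for the per-partition count column
    # One facet per collection (data partition), projected to (dimension, count).
    facets = [
        f'select(facet({c}, q="{query}", buckets="{dimension}", '
        f'bucketSizeLimit="{bucket_size_limit}", bucketSorts="count(*) desc", count(*)), '
        f'{dimension}, count(*) as {alias})'
        for c in collections
    ]
    joined = ", ".join(facets)
    # Run the facets in parallel and sort by the dimension so rollup can group them.
    merged = f'sort(by="{dimension} asc", plist({joined}))'
    # Sum the per-partition counts for each dimension value.
    rolled = f'rollup({merged}, over="{dimension}", sum({alias}))'
    # Keep the n largest totals and rename the summed column.
    top = f'top({rolled}, n="{limit}", sort="sum({alias}) desc")'
    return f'select({top}, {dimension}, sum({alias}) as {alias})'


expr = build_top_n_expression(
    ["collection1", "collection2"], "(*:* AND name:alex)", "city", 10
)
print(expr)
```

The caller supplies only the collections, filter, dimension and limit; everything else is derived, which is the kind of simplification the endpoint is meant to offer.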

 

 


[jira] [Updated] (SOLR-14614) Add Simplified Aggregation Interface to Streaming Expression

2020-07-01 Thread Aroop (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aroop updated SOLR-14614:
-
Description: 
For data analytics use cases, the standard workflow is:
 # Find a pattern
 # Aggregate by certain dimensions
 # Compute metrics (such as count, sum, avg)
 # Sort by a dimension or metric
 # Look at the top-n results

This functionality has been available through many different interfaces in 
Solr in the past, but only streaming expressions can deliver results in a 
scalable, performant and stable manner for systems that hold data at 
big-data scale.

However, one barrier to entry is that the query interface of streaming 
expressions is not simple enough.

To give an example of how involved the corresponding streaming expression can 
get when it has to work on large-scale systems, consider _find me the top 10 
cities where someone named Alex works, with the respective counts_:
{code:java}
qt=/stream=facet=
select( top( rollup(sort(by%3D"city+asc",
   +plist( 
  
select(facet(collection1,+q%3D"(*:*+AND+name:alex)",+buckets%3D"city",+bucketSizeLimit%3D"2010",+bucketSorts%3D"count(*)+desc",+count(*)),+city,+count(*)+as+Nj3bXa),

  
select(facet(collection2,+q%3D"(*:*+AND+name:alex)",+buckets%3D"city",+bucketSizeLimit%3D"2010",+bucketSorts%3D"count(*)+desc",+count(*)),+city,+count(*)+as+Nj3bXa)
 )),
+over%3D"city",+sum(Nj3bXa)),
+n%3D"10",+sort%3D"sum(Nj3bXa)+desc"),
+city,+sum(Nj3bXa)+as+Nj3bXa)


{code}
This query runs against an alias backed by 2 collections that represent 2 
data partitions, which is more or less a requirement in big data systems. It 
is one of the few ways to get information out of billions of records in a 
matter of seconds. However, one can see how involved this syntax is in the 
current scheme, and that makes it a barrier to entry for new adopters.

This Jira tracks the work of creating a simplified analytics endpoint that 
augments streaming expressions.

A starting proposal is for the endpoint to accept these query parameters:
{code:java}
/analytics?action=aggregate=*:*=name:alex=city=count=count=desc=10{code}
This is equivalent to the SQL that an analyst would write:
{code:java}
select city, count(*) from collection where name = 'alex'
group by city order by count(*) desc limit 10;{code}
On the Solr side this would be translated into the best possible streaming 
expression using *rollup, top, sort, plist*, etc., all done transparently to 
the user.

Here's to making the power of streaming expressions simpler for everyone to use.

 

 


[jira] [Updated] (SOLR-14614) Add Simplified Aggregation Interface to Streaming Expression

2020-07-01 Thread Aroop (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aroop updated SOLR-14614:
-
Description: 
For data analytics use cases, the standard workflow is:
 # Find a pattern
 # Aggregate by certain dimensions
 # Compute metrics (such as count, sum, avg)
 # Sort by a dimension or metric
 # Look at the top-n results

This functionality has been available through many different interfaces in 
Solr in the past, but only streaming expressions can deliver results in a 
scalable, performant and stable manner for systems that hold data at 
big-data scale.

However, one barrier to entry is that the query interface of streaming 
expressions is not simple enough.

To give an example of how involved the corresponding streaming expression can 
get when it has to work on large-scale systems:
{code:java}
qt=/stream=facet=
select( top( rollup(sort(by%3D"city+asc",
   +plist( 
  
select(facet(collection1,+q%3D"(*:*+AND+name:alex)",+buckets%3D"city",+bucketSizeLimit%3D"2010",+bucketSorts%3D"count(*)+desc",+count(*)),+city,+count(*)+as+Nj3bXa),

  
select(facet(collection2,+q%3D"(*:*+AND+name:alex)",+buckets%3D"city",+bucketSizeLimit%3D"2010",+bucketSorts%3D"count(*)+desc",+count(*)),+city,+count(*)+as+Nj3bXa)
 )),
+over%3D"city",+sum(Nj3bXa)),
+n%3D"10",+sort%3D"sum(Nj3bXa)+desc"),
+city,+sum(Nj3bXa)+as+Nj3bXa)


{code}
This query runs against an alias backed by 2 collections that represent 2 
data partitions, which is more or less a requirement in big data systems. It 
is one of the few ways to get information out of billions of records in a 
matter of seconds. However, one can see how involved this syntax is in the 
current scheme, and that makes it a barrier to entry for new adopters.

This Jira tracks the work of creating a simplified analytics endpoint that 
augments streaming expressions.

A starting proposal is for the endpoint to accept these query parameters:
{code:java}
/analytics?action=aggregate=*:*=name:alex=city=count=count=desc=10{code}
This is equivalent to the SQL that an analyst would write:
{code:java}
select city, count(*) from collection where name = 'alex'
group by city order by count(*) desc limit 10;{code}
On the Solr side this would be translated into the best possible streaming 
expression using *rollup, top, sort, plist*, etc., all done transparently to 
the user.

Here's to making the power of streaming expressions simpler for everyone to use.

 

 


> Add Simplified Aggregation Interface to Streaming Expression
> 
>
> Key: SOLR-14614
> URL: https://issues.apache.org/jira/browse/SOLR-14614
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: query, query parsers, streaming expressions
>Affects Versions: 7.7.2, 8.4.1
>Reporter: Aroop
>Priority: Major
>

[jira] [Updated] (SOLR-14614) Add Simplified Aggregation Interface to Streaming Expression

2020-07-01 Thread Aroop (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aroop updated SOLR-14614:
-
Description: 
For data analytics use cases, the standard workflow is:
 # Find a pattern
 # Aggregate by certain dimensions
 # Compute metrics (such as count, sum, avg)
 # Sort by a dimension or metric
 # Look at the top-n results

This functionality has been available through many different interfaces in 
Solr in the past, but only streaming expressions can deliver results in a 
scalable, performant and stable manner for systems that hold data at 
big-data scale.

However, one barrier to entry is that the query interface of streaming 
expressions is not simple enough.

This Jira tracks the work of creating a simplified analytics endpoint that 
augments streaming expressions.

A starting proposal is for the endpoint to accept these query parameters:
{code:java}
/analytics?action=aggregate=*:*=name:alex*=age,city=count=count=desc=10{code}
This is equivalent to the SQL that an analyst would write:
{code:java}
select age, city, count(*) from collection where name like 'alex%'
group by age, city order by age desc limit 10;{code}
 

On the Solr side this would be translated into the best possible streaming 
expression using *rollup, top, sort, plist*, etc., all done transparently to 
the user.

 

 





--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14614) Add Simplified Aggregation Interface to Streaming Expression

2020-07-01 Thread Aroop (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aroop updated SOLR-14614:
-
Description: 
For data analytics use cases, the standard workflow is:
 # Find a pattern
 # Aggregate by certain dimensions
 # Compute metrics (such as count, sum, avg)
 # Sort by a dimension or metric
 # Look at the top-n results

This functionality has been available through many different interfaces in 
Solr in the past, but only streaming expressions can deliver results in a 
scalable, performant and stable manner for systems that hold data at 
big-data scale.

However, one barrier to entry is that the query interface of streaming 
expressions is not simple enough.

This Jira tracks the work of creating a simplified analytics endpoint that 
augments streaming expressions.

A starting proposal is for the endpoint to accept these query parameters:
{code:java}
/analytics?q=*:*=name:alex*=age,city=count=count=desc=10{code}
This is equivalent to the SQL that an analyst would write:
{code:java}
select age, city, count(*) from collection where name like 'alex%'
group by age, city order by age desc limit 10;{code}
 

On the Solr side this would be translated into the best possible streaming 
expression using *rollup, top, sort, plist*, etc., all done transparently to 
the user.

 

 

  was:
For data analytics use cases, the standard workflow is:
 # Find a pattern
 # Aggregate by certain dimensions
 # Compute metrics (such as count, sum, avg)
 # Sort by a dimension or metric
 # Look at the top-n results

This functionality has been available through many different interfaces in 
Solr in the past, but only streaming expressions can deliver results in a 
scalable, performant and stable manner for systems that hold data at 
big-data scale.

However, one barrier to entry is that the query interface of streaming 
expressions is not simple enough.

This Jira tracks the work of creating a simplified analytics endpoint that 
augments streaming expressions.

A starting proposal is for the endpoint to accept these query parameters:
{code:java}
/analytics=*:*=name:alex*=age,city=count=count=desc=10{code}
This is equivalent to the SQL that an analyst would write:
{code:java}
select age, city, count(*) from collection where name like 'alex%'
group by age, city order by age desc limit 10;{code}
 

On the Solr side this would be translated into the best possible streaming 
expression using *rollup, top, sort, plist*, etc., all done transparently to 
the user.

 

 








[jira] [Updated] (SOLR-14614) Add Simplified Aggregation Interface to Streaming Expression

2020-07-01 Thread Aroop (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aroop updated SOLR-14614:
-
Description: 
For data analytics use cases, the standard workflow is:
 # Find a pattern
 # Aggregate by certain dimensions
 # Compute metrics (such as count, sum, avg)
 # Sort by a dimension or metric
 # Look at the top-n results

This functionality has been available through many different interfaces in 
Solr in the past, but only streaming expressions can deliver results in a 
scalable, performant and stable manner for systems that hold data at 
big-data scale.

However, one barrier to entry is that the query interface of streaming 
expressions is not simple enough.

This Jira tracks the work of creating a simplified analytics endpoint that 
augments streaming expressions.

A starting proposal is for the endpoint to accept these query parameters:
{code:java}
/analytics?q=*:*=name:alex*=age,city=count=count=desc=10{code}
This is equivalent to the SQL that an analyst would write:
{code:java}
select age, city, count(*) from collection where name like 'alex%'
group by age, city order by age desc limit 10;{code}
 

On the Solr side this would be translated into the best possible streaming 
expression using *rollup, top, sort, plist*, etc., all done transparently to 
the user.

 

 




