Re: Solr Aggregation queries are way slower than Elastic Search

2017-12-12 Thread RAUNAK AGRAWAL
Thanks Yonik and Joel. I will try with JSON Facet API and update the
results here.

On Tue, Dec 12, 2017 at 10:56 PM, Yonik Seeley  wrote:

> On Tue, Dec 12, 2017 at 9:17 AM, RAUNAK AGRAWAL
>  wrote:
> > Hi Yonik,
> >
> > So if the query is fine then I guess even using JSON Facet API will not
> > help me here.
>
> As Joel mentioned, it's completely different code than the old stats API.
> This is a very simple use-case, so if we're slower than ES for some
> reason, it should be very easy to fix.
>
> -Yonik
>
>
> > On Tue, Dec 12, 2017 at 7:27 PM, Yonik Seeley  wrote:
> >
> >> OK great, so it's definitely not the main query (which is just a
> >> single term query in this example!)
> >>
> >> > Also I have looked into the JSON Facet API. If I have to use facets, I
> >> will
> >> > have to then define 3600 facets in a single query and I guess that
> would
> >> be
> >> > also slow.
> >>
> >> You can ask for any number of stats for a given facet (even the root
> >> facet bucket w/o faceting on any fields):
> >>
> >> cutl 'http://localhost:8983/solr/collection1.query?q=variable1:
> >> 290=0={
> >>   s1:"sum(metric_1)",
> >>   s2:"sum(metric_2)",
> >>   s3:"sum(metric_3)"
> >> }'
> >>
> >> -Yonik
> >>
> >>
> >> On Tue, Dec 12, 2017 at 5:40 AM, RAUNAK AGRAWAL
> >>  wrote:
> >> > Hi Yonik,
> >> >
> >> > As you asked here is the code snippet and the actual solr query.
> Please
> >> > have a look. I have included only 104 metrics but like this we can go
> >> upto
> >> > 3600 rollups.
> >> >
> >> > Also I have looked into the JSON Facet API. If I have to use facets, I
> >> will
> >> > have to then define 3600 facets in a single query and I guess that
> would
> >> be
> >> > also slow. Also is there any max limit on the number of facets we can
> >> > define in a single query?
> >> >
> >> > Code snippet:
> >> >
> >> > private SolrQuery buildQuery(Integer variable1, List metrics)
> {
> >> > SolrQuery query = new SolrQuery();
> >> > query.set("q", "variable1:" + variable1);
> >> > query.setRows(0);
> >> > metrics.forEach(
> >> > metric -> query.setGetFieldStatistics("{!sum=true }" +
> >> metric)
> >> > );
> >> > return query;
> >> > }
> >> >
> >> >
> >> > The generated query:
> >> >
> >> > {! q=variable1:290 rows=0 stats=true stats.field='{!sum=true
> >> > }metric_1' stats.field='{!sum=true }metric_2' stats.field='{!sum=true
> >> > }metric_3' stats.field='{!sum=true }metric_4' stats.field='{!sum=true
> >> > }metric_5' stats.field='{!sum=true }metric_6' stats.field='{!sum=true
> >> > }metric_7' stats.field='{!sum=true }metric_8' stats.field='{!sum=true
> >> > }metric_9' stats.field='{!sum=true }metric_10' stats.field='{!sum=true
> >> > }metric_11' stats.field='{!sum=true }metric_12'
> >> > stats.field='{!sum=true }metric_13' stats.field='{!sum=true
> >> > }metric_14' stats.field='{!sum=true }metric_15'
> >> > stats.field='{!sum=true }metric_16' stats.field='{!sum=true
> >> > }metric_17' stats.field='{!sum=true }metric_18'
> >> > stats.field='{!sum=true }metric_19' stats.field='{!sum=true
> >> > }metric_20' stats.field='{!sum=true }metric_21'
> >> > stats.field='{!sum=true }metric_22' stats.field='{!sum=true
> >> > }metric_23' stats.field='{!sum=true }metric_24'
> >> > stats.field='{!sum=true }metric_25' stats.field='{!sum=true
> >> > }metric_26' stats.field='{!sum=true }metric_27'
> >> > stats.field='{!sum=true }metric_28' stats.field='{!sum=true
> >> > }metric_29' stats.field='{!sum=true }metric_30'
> >> > stats.field='{!sum=true }metric_31' stats.field='{!sum=true
> >> > }metric_32' stats.field='{!sum=true }metric_33'
> >> > stats.field='{!sum=true }metric_34' stats.field='{!sum=true
> >> > }metric_35' stats.field='{!sum=true }metric_36'
> >> > stats.field='{!sum=true }metric_37' stats.field='{!sum=true
> >> > }metric_38' stats.field='{!sum=true }metric_39'
> >> > stats.field='{!sum=true }metric_40' stats.field='{!sum=true
> >> > }metric_41' stats.field='{!sum=true }metric_42'
> >> > stats.field='{!sum=true }metric_43' stats.field='{!sum=true
> >> > }metric_44' stats.field='{!sum=true }metric_45'
> >> > stats.field='{!sum=true }metric_46' stats.field='{!sum=true
> >> > }metric_47' stats.field='{!sum=true }metric_48'
> >> > stats.field='{!sum=true }metric_49' stats.field='{!sum=true
> >> > }metric_50' stats.field='{!sum=true }metric_51'
> >> > stats.field='{!sum=true }metric_52' stats.field='{!sum=true
> >> > }metric_53' stats.field='{!sum=true }metric_54'
> >> > stats.field='{!sum=true }metric_55' stats.field='{!sum=true
> >> > }metric_56' stats.field='{!sum=true }metric_57'
> >> > stats.field='{!sum=true }metric_58' stats.field='{!sum=true
> >> > }metric_59' stats.field='{!sum=true }metric_60'
> >> > stats.field='{!sum=true }metric_61' stats.field='{!sum=true
> >> > }metric_62' stats.field='{!sum=true }metric_63'
> >> > stats.field='{!sum=true }metric_64' stats.field='{!sum=true
> >> > }metric_65' 

Re: Solr Aggregation queries are way slower than Elastic Search

2017-12-12 Thread Yonik Seeley
On Tue, Dec 12, 2017 at 9:17 AM, RAUNAK AGRAWAL
 wrote:
> Hi Yonik,
>
> So if the query is fine then I guess even using JSON Facet API will not
> help me here.

As Joel mentioned, it's completely different code than the old stats API.
This is a very simple use-case, so if we're slower than ES for some
reason, it should be very easy to fix.

-Yonik


> On Tue, Dec 12, 2017 at 7:27 PM, Yonik Seeley  wrote:
>
>> OK great, so it's definitely not the main query (which is just a
>> single term query in this example!)
>>
>> > Also I have looked into the JSON Facet API. If I have to use facets, I
>> will
>> > have to then define 3600 facets in a single query and I guess that would
>> be
>> > also slow.
>>
>> You can ask for any number of stats for a given facet (even the root
>> facet bucket w/o faceting on any fields):
>>
>> cutl 'http://localhost:8983/solr/collection1.query?q=variable1:
>> 290=0={
>>   s1:"sum(metric_1)",
>>   s2:"sum(metric_2)",
>>   s3:"sum(metric_3)"
>> }'
>>
>> -Yonik
>>
>>
>> On Tue, Dec 12, 2017 at 5:40 AM, RAUNAK AGRAWAL
>>  wrote:
>> > Hi Yonik,
>> >
>> > As you asked here is the code snippet and the actual solr query. Please
>> > have a look. I have included only 104 metrics but like this we can go
>> upto
>> > 3600 rollups.
>> >
>> > Also I have looked into the JSON Facet API. If I have to use facets, I
>> will
>> > have to then define 3600 facets in a single query and I guess that would
>> be
>> > also slow. Also is there any max limit on the number of facets we can
>> > define in a single query?
>> >
>> > Code snippet:
>> >
>> > private SolrQuery buildQuery(Integer variable1, List metrics) {
>> > SolrQuery query = new SolrQuery();
>> > query.set("q", "variable1:" + variable1);
>> > query.setRows(0);
>> > metrics.forEach(
>> > metric -> query.setGetFieldStatistics("{!sum=true }" +
>> metric)
>> > );
>> > return query;
>> > }
>> >
>> >
>> > The generated query:
>> >
>> > {! q=variable1:290 rows=0 stats=true stats.field='{!sum=true
>> > }metric_1' stats.field='{!sum=true }metric_2' stats.field='{!sum=true
>> > }metric_3' stats.field='{!sum=true }metric_4' stats.field='{!sum=true
>> > }metric_5' stats.field='{!sum=true }metric_6' stats.field='{!sum=true
>> > }metric_7' stats.field='{!sum=true }metric_8' stats.field='{!sum=true
>> > }metric_9' stats.field='{!sum=true }metric_10' stats.field='{!sum=true
>> > }metric_11' stats.field='{!sum=true }metric_12'
>> > stats.field='{!sum=true }metric_13' stats.field='{!sum=true
>> > }metric_14' stats.field='{!sum=true }metric_15'
>> > stats.field='{!sum=true }metric_16' stats.field='{!sum=true
>> > }metric_17' stats.field='{!sum=true }metric_18'
>> > stats.field='{!sum=true }metric_19' stats.field='{!sum=true
>> > }metric_20' stats.field='{!sum=true }metric_21'
>> > stats.field='{!sum=true }metric_22' stats.field='{!sum=true
>> > }metric_23' stats.field='{!sum=true }metric_24'
>> > stats.field='{!sum=true }metric_25' stats.field='{!sum=true
>> > }metric_26' stats.field='{!sum=true }metric_27'
>> > stats.field='{!sum=true }metric_28' stats.field='{!sum=true
>> > }metric_29' stats.field='{!sum=true }metric_30'
>> > stats.field='{!sum=true }metric_31' stats.field='{!sum=true
>> > }metric_32' stats.field='{!sum=true }metric_33'
>> > stats.field='{!sum=true }metric_34' stats.field='{!sum=true
>> > }metric_35' stats.field='{!sum=true }metric_36'
>> > stats.field='{!sum=true }metric_37' stats.field='{!sum=true
>> > }metric_38' stats.field='{!sum=true }metric_39'
>> > stats.field='{!sum=true }metric_40' stats.field='{!sum=true
>> > }metric_41' stats.field='{!sum=true }metric_42'
>> > stats.field='{!sum=true }metric_43' stats.field='{!sum=true
>> > }metric_44' stats.field='{!sum=true }metric_45'
>> > stats.field='{!sum=true }metric_46' stats.field='{!sum=true
>> > }metric_47' stats.field='{!sum=true }metric_48'
>> > stats.field='{!sum=true }metric_49' stats.field='{!sum=true
>> > }metric_50' stats.field='{!sum=true }metric_51'
>> > stats.field='{!sum=true }metric_52' stats.field='{!sum=true
>> > }metric_53' stats.field='{!sum=true }metric_54'
>> > stats.field='{!sum=true }metric_55' stats.field='{!sum=true
>> > }metric_56' stats.field='{!sum=true }metric_57'
>> > stats.field='{!sum=true }metric_58' stats.field='{!sum=true
>> > }metric_59' stats.field='{!sum=true }metric_60'
>> > stats.field='{!sum=true }metric_61' stats.field='{!sum=true
>> > }metric_62' stats.field='{!sum=true }metric_63'
>> > stats.field='{!sum=true }metric_64' stats.field='{!sum=true
>> > }metric_65' stats.field='{!sum=true }metric_66'
>> > stats.field='{!sum=true }metric_67' stats.field='{!sum=true
>> > }metric_68' stats.field='{!sum=true }metric_69'
>> > stats.field='{!sum=true }metric_70' stats.field='{!sum=true
>> > }metric_71' stats.field='{!sum=true }metric_72'
>> > stats.field='{!sum=true }metric_73' stats.field='{!sum=true
>> > }metric_74' stats.field='{!sum=true 

Re: Solr Aggregation queries are way slower than Elastic Search

2017-12-12 Thread Joel Bernstein
The JSON facet API is a different aggregation implementation. So, it will
definitely have different performance characteristics.

Joel Bernstein
http://joelsolr.blogspot.com/

On Tue, Dec 12, 2017 at 9:17 AM, RAUNAK AGRAWAL 
wrote:

> Hi Yonik,
>
> So if the query is fine then I guess even using JSON Facet API will not
> help me here. Can you suggest me some other idea or further tuning which
> will help me in reducing the latency?
>
> On Tue, Dec 12, 2017 at 7:27 PM, Yonik Seeley  wrote:
>
> > OK great, so it's definitely not the main query (which is just a
> > single term query in this example!)
> >
> > > Also I have looked into the JSON Facet API. If I have to use facets, I
> > will
> > > have to then define 3600 facets in a single query and I guess that
> would
> > be
> > > also slow.
> >
> > You can ask for any number of stats for a given facet (even the root
> > facet bucket w/o faceting on any fields):
> >
> > cutl 'http://localhost:8983/solr/collection1.query?q=variable1:
> > 290=0={
> >   s1:"sum(metric_1)",
> >   s2:"sum(metric_2)",
> >   s3:"sum(metric_3)"
> > }'
> >
> > -Yonik
> >
> >
> > On Tue, Dec 12, 2017 at 5:40 AM, RAUNAK AGRAWAL
> >  wrote:
> > > Hi Yonik,
> > >
> > > As you asked here is the code snippet and the actual solr query. Please
> > > have a look. I have included only 104 metrics but like this we can go
> > upto
> > > 3600 rollups.
> > >
> > > Also I have looked into the JSON Facet API. If I have to use facets, I
> > will
> > > have to then define 3600 facets in a single query and I guess that
> would
> > be
> > > also slow. Also is there any max limit on the number of facets we can
> > > define in a single query?
> > >
> > > Code snippet:
> > >
> > > private SolrQuery buildQuery(Integer variable1, List metrics) {
> > > SolrQuery query = new SolrQuery();
> > > query.set("q", "variable1:" + variable1);
> > > query.setRows(0);
> > > metrics.forEach(
> > > metric -> query.setGetFieldStatistics("{!sum=true }" +
> > metric)
> > > );
> > > return query;
> > > }
> > >
> > >
> > > The generated query:
> > >
> > > {! q=variable1:290 rows=0 stats=true stats.field='{!sum=true
> > > }metric_1' stats.field='{!sum=true }metric_2' stats.field='{!sum=true
> > > }metric_3' stats.field='{!sum=true }metric_4' stats.field='{!sum=true
> > > }metric_5' stats.field='{!sum=true }metric_6' stats.field='{!sum=true
> > > }metric_7' stats.field='{!sum=true }metric_8' stats.field='{!sum=true
> > > }metric_9' stats.field='{!sum=true }metric_10' stats.field='{!sum=true
> > > }metric_11' stats.field='{!sum=true }metric_12'
> > > stats.field='{!sum=true }metric_13' stats.field='{!sum=true
> > > }metric_14' stats.field='{!sum=true }metric_15'
> > > stats.field='{!sum=true }metric_16' stats.field='{!sum=true
> > > }metric_17' stats.field='{!sum=true }metric_18'
> > > stats.field='{!sum=true }metric_19' stats.field='{!sum=true
> > > }metric_20' stats.field='{!sum=true }metric_21'
> > > stats.field='{!sum=true }metric_22' stats.field='{!sum=true
> > > }metric_23' stats.field='{!sum=true }metric_24'
> > > stats.field='{!sum=true }metric_25' stats.field='{!sum=true
> > > }metric_26' stats.field='{!sum=true }metric_27'
> > > stats.field='{!sum=true }metric_28' stats.field='{!sum=true
> > > }metric_29' stats.field='{!sum=true }metric_30'
> > > stats.field='{!sum=true }metric_31' stats.field='{!sum=true
> > > }metric_32' stats.field='{!sum=true }metric_33'
> > > stats.field='{!sum=true }metric_34' stats.field='{!sum=true
> > > }metric_35' stats.field='{!sum=true }metric_36'
> > > stats.field='{!sum=true }metric_37' stats.field='{!sum=true
> > > }metric_38' stats.field='{!sum=true }metric_39'
> > > stats.field='{!sum=true }metric_40' stats.field='{!sum=true
> > > }metric_41' stats.field='{!sum=true }metric_42'
> > > stats.field='{!sum=true }metric_43' stats.field='{!sum=true
> > > }metric_44' stats.field='{!sum=true }metric_45'
> > > stats.field='{!sum=true }metric_46' stats.field='{!sum=true
> > > }metric_47' stats.field='{!sum=true }metric_48'
> > > stats.field='{!sum=true }metric_49' stats.field='{!sum=true
> > > }metric_50' stats.field='{!sum=true }metric_51'
> > > stats.field='{!sum=true }metric_52' stats.field='{!sum=true
> > > }metric_53' stats.field='{!sum=true }metric_54'
> > > stats.field='{!sum=true }metric_55' stats.field='{!sum=true
> > > }metric_56' stats.field='{!sum=true }metric_57'
> > > stats.field='{!sum=true }metric_58' stats.field='{!sum=true
> > > }metric_59' stats.field='{!sum=true }metric_60'
> > > stats.field='{!sum=true }metric_61' stats.field='{!sum=true
> > > }metric_62' stats.field='{!sum=true }metric_63'
> > > stats.field='{!sum=true }metric_64' stats.field='{!sum=true
> > > }metric_65' stats.field='{!sum=true }metric_66'
> > > stats.field='{!sum=true }metric_67' stats.field='{!sum=true
> > > }metric_68' stats.field='{!sum=true }metric_69'
> > > stats.field='{!sum=true }metric_70' 

Re: Solr Aggregation queries are way slower than Elastic Search

2017-12-12 Thread RAUNAK AGRAWAL
Hi Yonik,

So if the query is fine then I guess even using JSON Facet API will not
help me here. Can you suggest me some other idea or further tuning which
will help me in reducing the latency?

On Tue, Dec 12, 2017 at 7:27 PM, Yonik Seeley  wrote:

> OK great, so it's definitely not the main query (which is just a
> single term query in this example!)
>
> > Also I have looked into the JSON Facet API. If I have to use facets, I
> will
> > have to then define 3600 facets in a single query and I guess that would
> be
> > also slow.
>
> You can ask for any number of stats for a given facet (even the root
> facet bucket w/o faceting on any fields):
>
> cutl 'http://localhost:8983/solr/collection1.query?q=variable1:
> 290=0={
>   s1:"sum(metric_1)",
>   s2:"sum(metric_2)",
>   s3:"sum(metric_3)"
> }'
>
> -Yonik
>
>
> On Tue, Dec 12, 2017 at 5:40 AM, RAUNAK AGRAWAL
>  wrote:
> > Hi Yonik,
> >
> > As you asked here is the code snippet and the actual solr query. Please
> > have a look. I have included only 104 metrics but like this we can go
> upto
> > 3600 rollups.
> >
> > Also I have looked into the JSON Facet API. If I have to use facets, I
> will
> > have to then define 3600 facets in a single query and I guess that would
> be
> > also slow. Also is there any max limit on the number of facets we can
> > define in a single query?
> >
> > Code snippet:
> >
> > private SolrQuery buildQuery(Integer variable1, List metrics) {
> > SolrQuery query = new SolrQuery();
> > query.set("q", "variable1:" + variable1);
> > query.setRows(0);
> > metrics.forEach(
> > metric -> query.setGetFieldStatistics("{!sum=true }" +
> metric)
> > );
> > return query;
> > }
> >
> >
> > The generated query:
> >
> > {! q=variable1:290 rows=0 stats=true stats.field='{!sum=true
> > }metric_1' stats.field='{!sum=true }metric_2' stats.field='{!sum=true
> > }metric_3' stats.field='{!sum=true }metric_4' stats.field='{!sum=true
> > }metric_5' stats.field='{!sum=true }metric_6' stats.field='{!sum=true
> > }metric_7' stats.field='{!sum=true }metric_8' stats.field='{!sum=true
> > }metric_9' stats.field='{!sum=true }metric_10' stats.field='{!sum=true
> > }metric_11' stats.field='{!sum=true }metric_12'
> > stats.field='{!sum=true }metric_13' stats.field='{!sum=true
> > }metric_14' stats.field='{!sum=true }metric_15'
> > stats.field='{!sum=true }metric_16' stats.field='{!sum=true
> > }metric_17' stats.field='{!sum=true }metric_18'
> > stats.field='{!sum=true }metric_19' stats.field='{!sum=true
> > }metric_20' stats.field='{!sum=true }metric_21'
> > stats.field='{!sum=true }metric_22' stats.field='{!sum=true
> > }metric_23' stats.field='{!sum=true }metric_24'
> > stats.field='{!sum=true }metric_25' stats.field='{!sum=true
> > }metric_26' stats.field='{!sum=true }metric_27'
> > stats.field='{!sum=true }metric_28' stats.field='{!sum=true
> > }metric_29' stats.field='{!sum=true }metric_30'
> > stats.field='{!sum=true }metric_31' stats.field='{!sum=true
> > }metric_32' stats.field='{!sum=true }metric_33'
> > stats.field='{!sum=true }metric_34' stats.field='{!sum=true
> > }metric_35' stats.field='{!sum=true }metric_36'
> > stats.field='{!sum=true }metric_37' stats.field='{!sum=true
> > }metric_38' stats.field='{!sum=true }metric_39'
> > stats.field='{!sum=true }metric_40' stats.field='{!sum=true
> > }metric_41' stats.field='{!sum=true }metric_42'
> > stats.field='{!sum=true }metric_43' stats.field='{!sum=true
> > }metric_44' stats.field='{!sum=true }metric_45'
> > stats.field='{!sum=true }metric_46' stats.field='{!sum=true
> > }metric_47' stats.field='{!sum=true }metric_48'
> > stats.field='{!sum=true }metric_49' stats.field='{!sum=true
> > }metric_50' stats.field='{!sum=true }metric_51'
> > stats.field='{!sum=true }metric_52' stats.field='{!sum=true
> > }metric_53' stats.field='{!sum=true }metric_54'
> > stats.field='{!sum=true }metric_55' stats.field='{!sum=true
> > }metric_56' stats.field='{!sum=true }metric_57'
> > stats.field='{!sum=true }metric_58' stats.field='{!sum=true
> > }metric_59' stats.field='{!sum=true }metric_60'
> > stats.field='{!sum=true }metric_61' stats.field='{!sum=true
> > }metric_62' stats.field='{!sum=true }metric_63'
> > stats.field='{!sum=true }metric_64' stats.field='{!sum=true
> > }metric_65' stats.field='{!sum=true }metric_66'
> > stats.field='{!sum=true }metric_67' stats.field='{!sum=true
> > }metric_68' stats.field='{!sum=true }metric_69'
> > stats.field='{!sum=true }metric_70' stats.field='{!sum=true
> > }metric_71' stats.field='{!sum=true }metric_72'
> > stats.field='{!sum=true }metric_73' stats.field='{!sum=true
> > }metric_74' stats.field='{!sum=true }metric_75'
> > stats.field='{!sum=true }metric_76' stats.field='{!sum=true
> > }metric_77' stats.field='{!sum=true }metric_78'
> > stats.field='{!sum=true }metric_79' stats.field='{!sum=true
> > }metric_80' stats.field='{!sum=true }metric_81'
> > stats.field='{!sum=true }metric_82' 

Re: Solr Aggregation queries are way slower than Elastic Search

2017-12-12 Thread Yonik Seeley
OK great, so it's definitely not the main query (which is just a
single term query in this example!)

> Also I have looked into the JSON Facet API. If I have to use facets, I will
> have to then define 3600 facets in a single query and I guess that would be
> also slow.

You can ask for any number of stats for a given facet (even the root
facet bucket w/o faceting on any fields):

cutl 
'http://localhost:8983/solr/collection1.query?q=variable1:290=0={
  s1:"sum(metric_1)",
  s2:"sum(metric_2)",
  s3:"sum(metric_3)"
}'

-Yonik


On Tue, Dec 12, 2017 at 5:40 AM, RAUNAK AGRAWAL
 wrote:
> Hi Yonik,
>
> As you asked here is the code snippet and the actual solr query. Please
> have a look. I have included only 104 metrics but like this we can go upto
> 3600 rollups.
>
> Also I have looked into the JSON Facet API. If I have to use facets, I will
> have to then define 3600 facets in a single query and I guess that would be
> also slow. Also is there any max limit on the number of facets we can
> define in a single query?
>
> Code snippet:
>
> private SolrQuery buildQuery(Integer variable1, List metrics) {
> SolrQuery query = new SolrQuery();
> query.set("q", "variable1:" + variable1);
> query.setRows(0);
> metrics.forEach(
> metric -> query.setGetFieldStatistics("{!sum=true }" + metric)
> );
> return query;
> }
>
>
> The generated query:
>
> {! q=variable1:290 rows=0 stats=true stats.field='{!sum=true
> }metric_1' stats.field='{!sum=true }metric_2' stats.field='{!sum=true
> }metric_3' stats.field='{!sum=true }metric_4' stats.field='{!sum=true
> }metric_5' stats.field='{!sum=true }metric_6' stats.field='{!sum=true
> }metric_7' stats.field='{!sum=true }metric_8' stats.field='{!sum=true
> }metric_9' stats.field='{!sum=true }metric_10' stats.field='{!sum=true
> }metric_11' stats.field='{!sum=true }metric_12'
> stats.field='{!sum=true }metric_13' stats.field='{!sum=true
> }metric_14' stats.field='{!sum=true }metric_15'
> stats.field='{!sum=true }metric_16' stats.field='{!sum=true
> }metric_17' stats.field='{!sum=true }metric_18'
> stats.field='{!sum=true }metric_19' stats.field='{!sum=true
> }metric_20' stats.field='{!sum=true }metric_21'
> stats.field='{!sum=true }metric_22' stats.field='{!sum=true
> }metric_23' stats.field='{!sum=true }metric_24'
> stats.field='{!sum=true }metric_25' stats.field='{!sum=true
> }metric_26' stats.field='{!sum=true }metric_27'
> stats.field='{!sum=true }metric_28' stats.field='{!sum=true
> }metric_29' stats.field='{!sum=true }metric_30'
> stats.field='{!sum=true }metric_31' stats.field='{!sum=true
> }metric_32' stats.field='{!sum=true }metric_33'
> stats.field='{!sum=true }metric_34' stats.field='{!sum=true
> }metric_35' stats.field='{!sum=true }metric_36'
> stats.field='{!sum=true }metric_37' stats.field='{!sum=true
> }metric_38' stats.field='{!sum=true }metric_39'
> stats.field='{!sum=true }metric_40' stats.field='{!sum=true
> }metric_41' stats.field='{!sum=true }metric_42'
> stats.field='{!sum=true }metric_43' stats.field='{!sum=true
> }metric_44' stats.field='{!sum=true }metric_45'
> stats.field='{!sum=true }metric_46' stats.field='{!sum=true
> }metric_47' stats.field='{!sum=true }metric_48'
> stats.field='{!sum=true }metric_49' stats.field='{!sum=true
> }metric_50' stats.field='{!sum=true }metric_51'
> stats.field='{!sum=true }metric_52' stats.field='{!sum=true
> }metric_53' stats.field='{!sum=true }metric_54'
> stats.field='{!sum=true }metric_55' stats.field='{!sum=true
> }metric_56' stats.field='{!sum=true }metric_57'
> stats.field='{!sum=true }metric_58' stats.field='{!sum=true
> }metric_59' stats.field='{!sum=true }metric_60'
> stats.field='{!sum=true }metric_61' stats.field='{!sum=true
> }metric_62' stats.field='{!sum=true }metric_63'
> stats.field='{!sum=true }metric_64' stats.field='{!sum=true
> }metric_65' stats.field='{!sum=true }metric_66'
> stats.field='{!sum=true }metric_67' stats.field='{!sum=true
> }metric_68' stats.field='{!sum=true }metric_69'
> stats.field='{!sum=true }metric_70' stats.field='{!sum=true
> }metric_71' stats.field='{!sum=true }metric_72'
> stats.field='{!sum=true }metric_73' stats.field='{!sum=true
> }metric_74' stats.field='{!sum=true }metric_75'
> stats.field='{!sum=true }metric_76' stats.field='{!sum=true
> }metric_77' stats.field='{!sum=true }metric_78'
> stats.field='{!sum=true }metric_79' stats.field='{!sum=true
> }metric_80' stats.field='{!sum=true }metric_81'
> stats.field='{!sum=true }metric_82' stats.field='{!sum=true
> }metric_83' stats.field='{!sum=true }metric_84'
> stats.field='{!sum=true }metric_85' stats.field='{!sum=true
> }metric_86' stats.field='{!sum=true }metric_87'
> stats.field='{!sum=true }metric_88' stats.field='{!sum=true
> }metric_89' stats.field='{!sum=true }metric_90'
> stats.field='{!sum=true }metric_91' stats.field='{!sum=true
> }metric_92' stats.field='{!sum=true }metric_93'
> stats.field='{!sum=true }metric_94' stats.field='{!sum=true
> }metric_95' 

Re: Solr Aggregation queries are way slower than Elastic Search

2017-12-12 Thread RAUNAK AGRAWAL
Hi Yonik,

As you asked here is the code snippet and the actual solr query. Please
have a look. I have included only 104 metrics but like this we can go upto
3600 rollups.

Also I have looked into the JSON Facet API. If I have to use facets, I will
have to then define 3600 facets in a single query and I guess that would be
also slow. Also is there any max limit on the number of facets we can
define in a single query?

Code snippet:

private SolrQuery buildQuery(Integer variable1, List metrics) {
SolrQuery query = new SolrQuery();
query.set("q", "variable1:" + variable1);
query.setRows(0);
metrics.forEach(
metric -> query.setGetFieldStatistics("{!sum=true }" + metric)
);
return query;
}


The generated query:

{! q=variable1:290 rows=0 stats=true stats.field='{!sum=true
}metric_1' stats.field='{!sum=true }metric_2' stats.field='{!sum=true
}metric_3' stats.field='{!sum=true }metric_4' stats.field='{!sum=true
}metric_5' stats.field='{!sum=true }metric_6' stats.field='{!sum=true
}metric_7' stats.field='{!sum=true }metric_8' stats.field='{!sum=true
}metric_9' stats.field='{!sum=true }metric_10' stats.field='{!sum=true
}metric_11' stats.field='{!sum=true }metric_12'
stats.field='{!sum=true }metric_13' stats.field='{!sum=true
}metric_14' stats.field='{!sum=true }metric_15'
stats.field='{!sum=true }metric_16' stats.field='{!sum=true
}metric_17' stats.field='{!sum=true }metric_18'
stats.field='{!sum=true }metric_19' stats.field='{!sum=true
}metric_20' stats.field='{!sum=true }metric_21'
stats.field='{!sum=true }metric_22' stats.field='{!sum=true
}metric_23' stats.field='{!sum=true }metric_24'
stats.field='{!sum=true }metric_25' stats.field='{!sum=true
}metric_26' stats.field='{!sum=true }metric_27'
stats.field='{!sum=true }metric_28' stats.field='{!sum=true
}metric_29' stats.field='{!sum=true }metric_30'
stats.field='{!sum=true }metric_31' stats.field='{!sum=true
}metric_32' stats.field='{!sum=true }metric_33'
stats.field='{!sum=true }metric_34' stats.field='{!sum=true
}metric_35' stats.field='{!sum=true }metric_36'
stats.field='{!sum=true }metric_37' stats.field='{!sum=true
}metric_38' stats.field='{!sum=true }metric_39'
stats.field='{!sum=true }metric_40' stats.field='{!sum=true
}metric_41' stats.field='{!sum=true }metric_42'
stats.field='{!sum=true }metric_43' stats.field='{!sum=true
}metric_44' stats.field='{!sum=true }metric_45'
stats.field='{!sum=true }metric_46' stats.field='{!sum=true
}metric_47' stats.field='{!sum=true }metric_48'
stats.field='{!sum=true }metric_49' stats.field='{!sum=true
}metric_50' stats.field='{!sum=true }metric_51'
stats.field='{!sum=true }metric_52' stats.field='{!sum=true
}metric_53' stats.field='{!sum=true }metric_54'
stats.field='{!sum=true }metric_55' stats.field='{!sum=true
}metric_56' stats.field='{!sum=true }metric_57'
stats.field='{!sum=true }metric_58' stats.field='{!sum=true
}metric_59' stats.field='{!sum=true }metric_60'
stats.field='{!sum=true }metric_61' stats.field='{!sum=true
}metric_62' stats.field='{!sum=true }metric_63'
stats.field='{!sum=true }metric_64' stats.field='{!sum=true
}metric_65' stats.field='{!sum=true }metric_66'
stats.field='{!sum=true }metric_67' stats.field='{!sum=true
}metric_68' stats.field='{!sum=true }metric_69'
stats.field='{!sum=true }metric_70' stats.field='{!sum=true
}metric_71' stats.field='{!sum=true }metric_72'
stats.field='{!sum=true }metric_73' stats.field='{!sum=true
}metric_74' stats.field='{!sum=true }metric_75'
stats.field='{!sum=true }metric_76' stats.field='{!sum=true
}metric_77' stats.field='{!sum=true }metric_78'
stats.field='{!sum=true }metric_79' stats.field='{!sum=true
}metric_80' stats.field='{!sum=true }metric_81'
stats.field='{!sum=true }metric_82' stats.field='{!sum=true
}metric_83' stats.field='{!sum=true }metric_84'
stats.field='{!sum=true }metric_85' stats.field='{!sum=true
}metric_86' stats.field='{!sum=true }metric_87'
stats.field='{!sum=true }metric_88' stats.field='{!sum=true
}metric_89' stats.field='{!sum=true }metric_90'
stats.field='{!sum=true }metric_91' stats.field='{!sum=true
}metric_92' stats.field='{!sum=true }metric_93'
stats.field='{!sum=true }metric_94' stats.field='{!sum=true
}metric_95' stats.field='{!sum=true }metric_96'
stats.field='{!sum=true }metric_97' stats.field='{!sum=true
}metric_98' stats.field='{!sum=true }metric_99'
stats.field='{!sum=true }metric_100' stats.field='{!sum=true
}metric_101' stats.field='{!sum=true }metric_102'
stats.field='{!sum=true }metric_103' stats.field='{!sum=true
}metric_104'}




On Tue, Dec 12, 2017 at 10:21 AM, RAUNAK AGRAWAL 
wrote:

> Hi Yonik,
>
> I will try the JSON Facet API and update here but my hunch is that
> querying mechanism is not the problem. Rather the problem lies with the
> solr aggregations.
>
> Thanks
>
> On Tue, Dec 12, 2017 at 6:31 AM, Yonik Seeley  wrote:
>
>> I think the SolrJ below uses the old stats component.
>> Hopefully the JSON Facet API would be faster for 

Re: Solr Aggregation queries are way slower than Elastic Search

2017-12-11 Thread RAUNAK AGRAWAL
Hi Yonik,

I will try the JSON Facet API and update here but my hunch is that querying
mechanism is not the problem. Rather the problem lies with the solr
aggregations.

Thanks

On Tue, Dec 12, 2017 at 6:31 AM, Yonik Seeley  wrote:

> I think the SolrJ below uses the old stats component.
> Hopefully the JSON Facet API would be faster for this, but it's not
> completely clear what the main query here looks like, and if it's the
> source of any bottleneck rather than the aggregations.
> What does the generated query string actually look like (it may be
> easiest just to pull this from the logs).
>
> -Yonik
>
>
> On Mon, Dec 11, 2017 at 7:32 PM, RAUNAK AGRAWAL
>  wrote:
> > Hi,
> >
> > We have a use case where there are 4-5 dimensions and around 3500 metrics
> > in a single document. We have indexed only 2 dimensions and made all the
> > metrics as doc_values so that we can run the aggregation queries.
> >
> > We have 6 million such documents and we are using solr cloud(6.6) on a 6
> > node cluster with 8 Vcores and 24 GB RAM each.
> >
> > On the same set of hardware in elastic search we were getting the
> response
> > in about 10ms whereas in solr we are getting response in around 300-400
> ms.
> >
> > This is how I am querying the data.
> >
> > private SolrQuery buildQuery(Integer variable1, List groups,
> > List metrics) {
> > SolrQuery query = new SolrQuery();
> > String groupQuery = " (" + groups.stream().map(g -> "group:" +
> g).collect
> > (Collectors.joining(" OR ")) + ")";
> > String finalQuery = "variable1:" + variable1 + " AND " + groupQuery;
> > query.set("q", finalQuery);
> > query.setRows(0);
> > metrics.forEach(
> > metric -> query.setGetFieldStatistics("{!sum=true }" +
> metric)
> > );
> > return query;
> > }
> >
> > Any help will be appreciated regarding this.
> >
> >
> > Thanks,
> >
> > Raunak
>


Re: Solr Aggregation queries are way slower than Elastic Search

2017-12-11 Thread Yonik Seeley
I think the SolrJ below uses the old stats component.
Hopefully the JSON Facet API would be faster for this, but it's not
completely clear what the main query here looks like, and if it's the
source of any bottleneck rather than the aggregations.
What does the generated query string actually look like (it may be
easiest just to pull this from the logs).

-Yonik


On Mon, Dec 11, 2017 at 7:32 PM, RAUNAK AGRAWAL
 wrote:
> Hi,
>
> We have a use case where there are 4-5 dimensions and around 3500 metrics
> in a single document. We have indexed only 2 dimensions and made all the
> metrics as doc_values so that we can run the aggregation queries.
>
> We have 6 million such documents and we are using solr cloud(6.6) on a 6
> node cluster with 8 Vcores and 24 GB RAM each.
>
> On the same set of hardware in elastic search we were getting the response
> in about 10ms whereas in solr we are getting response in around 300-400 ms.
>
> This is how I am querying the data.
>
> private SolrQuery buildQuery(Integer variable1, List groups,
> List metrics) {
> SolrQuery query = new SolrQuery();
> String groupQuery = " (" + groups.stream().map(g -> "group:" + g).collect
> (Collectors.joining(" OR ")) + ")";
> String finalQuery = "variable1:" + variable1 + " AND " + groupQuery;
> query.set("q", finalQuery);
> query.setRows(0);
> metrics.forEach(
> metric -> query.setGetFieldStatistics("{!sum=true }" + metric)
> );
> return query;
> }
>
> Any help will be appreciated regarding this.
>
>
> Thanks,
>
> Raunak


Solr Aggregation queries are way slower than Elastic Search

2017-12-11 Thread RAUNAK AGRAWAL
Hi,

We have a use case where there are 4-5 dimensions and around 3500 metrics
in a single document. We have indexed only 2 dimensions and made all the
metrics as doc_values so that we can run the aggregation queries.

We have 6 million such documents and we are using solr cloud(6.6) on a 6
node cluster with 8 Vcores and 24 GB RAM each.

On the same set of hardware in elastic search we were getting the response
in about 10ms whereas in solr we are getting response in around 300-400 ms.

This is how I am querying the data.

private SolrQuery buildQuery(Integer variable1, List groups,
List metrics) {
SolrQuery query = new SolrQuery();
String groupQuery = " (" + groups.stream().map(g -> "group:" + g).collect
(Collectors.joining(" OR ")) + ")";
String finalQuery = "variable1:" + variable1 + " AND " + groupQuery;
query.set("q", finalQuery);
query.setRows(0);
metrics.forEach(
metric -> query.setGetFieldStatistics("{!sum=true }" + metric)
);
return query;
}

Any help will be appreciated regarding this.


Thanks,

Raunak