Re: Total Collection Size in Solr 7

2018-06-27 Thread Aroop Ganguly
Ah, OK!

> On Jun 27, 2018, at 8:53 AM, Erick Erickson  wrote:
> 
> Just sum up the sizes of all the files in your index directory. Clumsy
> to be sure
> 



Re: Total Collection Size in Solr 7

2018-06-27 Thread Erick Erickson
Just sum up the sizes of all the files in your index directory. Clumsy
to be sure
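Summing the files by hand can be scripted. The following is a minimal sketch, assuming you have shell access to the node and know the core's index directory; the function name and the example path in the usage note are illustrative, not something from this thread.

```python
import os


def directory_size_bytes(index_dir):
    """Return the total size in bytes of all regular files under index_dir."""
    total = 0
    for root, _dirs, files in os.walk(index_dir):
        for name in files:
            path = os.path.join(root, name)
            try:
                total += os.path.getsize(path)
            except OSError:
                # A segment file can disappear mid-walk if a merge finishes;
                # skip it rather than fail the whole measurement.
                continue
    return total
```

Calling `directory_size_bytes("/var/solr/data/mycore/data/index")` (a hypothetical path) gives the size of one core; for a whole collection you would repeat this on every node and add the results, counting one replica per shard if you want the logical size rather than total disk usage.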

On Tue, Jun 26, 2018 at 3:12 PM, Aroop Ganguly  wrote:
> Hi Erick
>
> Thanks for the advice.
> One question is still open, about point 1 below: how do I get that magic
> number, the size in GB? :)
> As I am mostly using streaming expressions, most of my fields are DocValues
> and not stored.
>
> I will look at the health endpoint to see what it gives me in connection with 
> size.
>
> Thanks
> Aroop
>
>


Re: Total Collection Size in Solr 7

2018-06-26 Thread Aroop Ganguly
Hi Erick

Thanks for the advice.
One question is still open, about point 1 below: how do I get that magic
number, the size in GB? :)
As I am mostly using streaming expressions, most of my fields are DocValues and
not stored.

I will look at the health endpoint to see what it gives me in connection with 
size.

Thanks
Aroop


> On Jun 26, 2018, at 10:49 AM, Erick Erickson  wrote:
> 
> Aroop:
> 
> Not that I know of. You could do a reasonable approximation by
> 1> check the index size (manually) with, say, 10M docs
> 2> check it again with 20M docs
> 3> use a match all docs query and do the math.
> 
> That's clumsy but do-able. The reason I start with 10M and 20M is that
> index size does not go up linearly so I like to seed the index first.
> 
> That said, though, it's hard to generalize index size as meaning much.
> Is it 90% stored? 10% stored data? Those ratios have huge implications
> on whether you're straining anything except disk space.
> 
> Starting with Solr 6.4, there are a lot of metrics available
> that give you a much better view of Solr's health.
> 
> Best,
> Erick
> 
 



Re: Total Collection Size in Solr 7

2018-06-26 Thread Erick Erickson
Aroop:

Not that I know of. You could do a reasonable approximation by
1> check the index size (manually) with, say, 10M docs
2> check it again with 20M docs
3> use a match all docs query and do the math.

That's clumsy but do-able. The reason I start with 10M and 20M is that
index size does not go up linearly so I like to seed the index first.
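
The arithmetic in step 3 could look roughly like the sketch below. It assumes growth is close to linear beyond the 20M seed point, which is only an approximation; the function and parameter names are made up for illustration. The target document count would come from `numFound` on a match-all query (`q=*:*&rows=0`).

```python
def estimate_index_bytes(size_at_low, size_at_high, target_docs,
                         docs_low=10_000_000, docs_high=20_000_000):
    """Extrapolate index size from two measurements taken after seeding.

    size_at_low / size_at_high are the index sizes in bytes measured at
    docs_low and docs_high documents. The marginal bytes-per-document
    between the two points is applied to every document past docs_high.
    """
    per_doc = (size_at_high - size_at_low) / (docs_high - docs_low)
    return size_at_high + per_doc * (target_docs - docs_high)
```

For example, with 12 GB measured at 10M docs and 22 GB at 20M docs, the marginal cost is 1,000 bytes per document, so 50M docs would be estimated at about 52 GB.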

That said, though, it's hard to generalize index size as meaning much.
Is it 90% stored? 10% stored data? Those ratios have huge implications
on whether you're straining anything except disk space.

Starting with Solr 6.4, there are a lot of metrics available
that give you a much better view of Solr's health.
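
As a rough illustration of using those metrics for size: the sketch below assumes your Solr version exposes an `INDEX.sizeInBytes` value in each core's registry, and that a request such as `/solr/admin/metrics?group=core&prefix=INDEX.sizeInBytes&wt=json` returns a JSON map keyed by registry names of the form `solr.core.<collection>.<shard>.<replica>`. Both the response shape and the helper name are assumptions to check against your installation; the code only parses an already-fetched response from one node.

```python
def index_bytes_by_core(metrics_response, collection):
    """Map '<collection>.<shard>.<replica>' to index size in bytes.

    metrics_response is the decoded JSON body from one node's metrics
    endpoint (assumed shape: {"metrics": {registry_name: {metric: value}}}).
    """
    registry_prefix = "solr.core."
    wanted = f"{registry_prefix}{collection}."
    sizes = {}
    for registry, values in metrics_response.get("metrics", {}).items():
        if registry.startswith(wanted) and "INDEX.sizeInBytes" in values:
            sizes[registry[len(registry_prefix):]] = values["INDEX.sizeInBytes"]
    return sizes
```

Each node reports only its own cores, so the per-node results would need to be merged before summing across the collection.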

Best,
Erick

On Tue, Jun 26, 2018 at 9:21 AM, Aroop Ganguly  wrote:
> Hi Erick
>
> Sure, I will look those JIRAs up.
> In the interim, is what Susmit suggested the only way to get the size info? 
> Or is there something else you can recommend?
>
> Thanks
> Aroop
>
>
>


Re: Total Collection Size in Solr 7

2018-06-26 Thread Aroop Ganguly
Hi Erick

Sure, I will look those JIRAs up.
In the interim, is what Susmit suggested the only way to get the size info? Or 
is there something else you can recommend? 

Thanks
Aroop



> On Jun 26, 2018, at 6:53 AM, Erick Erickson  wrote:
> 
> Some work is being done on the admin UI, there are several JIRAs.
> Perhaps you'd like to join that conversation? We need to have input,
> especially in terms of what kinds of information would be useful from
> a practitioner's standpoint.
> 
> Best,
> Erick
> 


Re: Total Collection Size in Solr 7

2018-06-26 Thread Erick Erickson
Some work is being done on the admin UI, there are several JIRAs.
Perhaps you'd like to join that conversation? We need to have input,
especially in terms of what kinds of information would be useful from
a practitioner's standpoint.

Best,
Erick

On Mon, Jun 25, 2018 at 11:26 PM, Aroop Ganguly  wrote:
> I see. Thanks, Susmit.
> I hoped there was something simpler that could just be part of the
> collections view we now have in the Solr 7 admin UI, or at least a one-stop
> API call.
> I guess this will be added in a later release.
>
>


Re: Total Collection Size in Solr 7

2018-06-26 Thread Aroop Ganguly
I see. Thanks, Susmit.
I hoped there was something simpler that could just be part of the collections
view we now have in the Solr 7 admin UI, or at least a one-stop API call.
I guess this will be added in a later release.

> On Jun 25, 2018, at 11:20 PM, Susmit  wrote:
> 
> Hi Aroop, 
> I created a utility using the SolrZkClient API to read state.json,
> enumerated one replica for each shard, queried the /replication handler for
> each replica's index size, and added the sizes up.
> 
> Sent from my iPhone
> 



Re: Total Collection Size in Solr 7

2018-06-26 Thread Susmit
Hi Aroop, 
I created a utility using the SolrZkClient API to read state.json, enumerated
one replica for each shard, queried the /replication handler for each replica's
index size, and added the sizes up.
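
The utility itself was not posted, so what follows is only a sketch of the same idea in Python rather than the original code. It assumes you already have the collection's entry from state.json as a dict (with the usual `shards` / `replicas` / `base_url` / `core` / `state` / `leader` fields), and it takes a caller-supplied `fetch_size` function, a hypothetical hook that would call something like `<base_url>/<core>/replication?command=details` and return the index size in bytes.

```python
def one_replica_per_shard(collection_state):
    """Pick one active replica per shard, preferring the leader."""
    chosen = {}
    for shard_name, shard in collection_state["shards"].items():
        active = [r for r in shard["replicas"].values()
                  if r.get("state") == "active"]
        if not active:
            continue  # nothing usable to measure for this shard
        leader = next((r for r in active if r.get("leader") == "true"),
                      active[0])
        chosen[shard_name] = leader
    return chosen


def total_collection_bytes(collection_state, fetch_size):
    """Sum the index size of one replica per shard.

    fetch_size(base_url, core) -> int is supplied by the caller.
    """
    return sum(fetch_size(r["base_url"], r["core"])
               for r in one_replica_per_shard(collection_state).values())
```

Counting one replica per shard gives the logical size of the collection; multiplying by the replication factor (or summing every replica instead) gives total disk usage across the cluster.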

Sent from my iPhone

> On Jun 25, 2018, at 7:24 PM, Aroop Ganguly  wrote:
> 
> Hi Team
> 
> I am not sure how to ascertain the total size of a collection via the Solr UI
> on a Solr 7+ installation.
> The collection is sharded and replicated heavily, so it's tedious to have to
> look at each core and add up the sizes to get the size of the entire
> collection.
>
> Is there an API or UI section where this info can be obtained?
>
> On the flip side, it would be great to have a consolidated view of the
> collection size in GB along with the individual shard sizes. (Should this be
> a Jira? :))
> 
> Thanks
> Aroop


Total Collection Size in Solr 7

2018-06-25 Thread Aroop Ganguly
Hi Team

I am not sure how to ascertain the total size of a collection via the Solr UI
on a Solr 7+ installation.
The collection is sharded and replicated heavily, so it's tedious to have to
look at each core and add up the sizes to get the size of the entire
collection.

Is there an API or UI section where this info can be obtained?

On the flip side, it would be great to have a consolidated view of the
collection size in GB along with the individual shard sizes. (Should this be a
Jira? :))

Thanks
Aroop