Re: solr.NRTCachingDirectoryFactory

2016-08-26 Thread Rallavagu

Thanks Mikhail.

I am unable to locate bottleneck so far. Will try jstack and other tools.

On 8/25/16 11:40 PM, Mikhail Khludnev wrote:

Rough sampling under load makes sense as usual. JMC is one of the suitable
tools for this.
Sometimes even just jstack or looking at SolrAdmin/Threads is enough.
If only a small ratio of documents is updated and the bottleneck is the
filterCache, you can experiment with segmented filters, which suit NRT
better.
http://blog-archive.griddynamics.com/2014/01/segmented-filter-cache-in-solr.html


On Fri, Aug 26, 2016 at 2:56 AM, Rallavagu  wrote:


Follow up update ...

Set autowarm count to zero for caches for NRT and I could bring
latency down from 5 min to 2 min :)

However, still seeing high QTimes and wondering where else can I look?
Should I debug the code or run some tools to isolate bottlenecks (disk I/O,
CPU or Query itself). Looking for some tuning advice. Thanks.


On 7/26/16 9:42 AM, Erick Erickson wrote:


And, I might add, you should look through your old logs
and see how long it takes to open a searcher. Let's
say Shawn's lower bound is what you see, i.e.
it takes a minute each to execute all the autowarming
in filterCache and queryResultCache... So your current
latency is _at least_ 2 minutes between the time something
is indexed and it's available for search, just for autowarming.

Plus up to another 2 minutes for your soft commit interval
to expire.

So if your business people haven't noticed a 4 minute
latency yet, tell them they don't know what they're talking
about when they insist on the NRT interval being a few
seconds ;).

Best,
Erick

On Tue, Jul 26, 2016 at 7:20 AM, Rallavagu  wrote:




On 7/26/16 5:46 AM, Shawn Heisey wrote:



On 7/22/2016 10:15 AM, Rallavagu wrote:

<filterCache
 size="5000"
 initialSize="5000"
 autowarmCount="500"/>

<queryResultCache
 size="2"
 initialSize="2"
 autowarmCount="500"/>


As Erick indicated, these settings are incompatible with Near Real Time
updates.

With those settings, every time you commit and create a new searcher,
Solr will execute up to 1000 queries (potentially 500 for each of the
caches above) before that new searcher will begin returning new results.

I do not know how fast your filter queries execute when they aren't
cached... but even if they only take 100 milliseconds each, that could
take up to a minute for filterCache warming.  If each one takes two
seconds and there are 500 entries in the cache, then autowarming the
filterCache would take nearly 17 minutes. You would also need to wait
for the warming queries on queryResultCache.

The autowarmCount on my filterCache is 4, and warming that cache *still*
sometimes takes ten or more seconds to complete.

If you want true NRT, you need to set all your autowarmCount values to
zero.  The tradeoff with NRT is that your caches are ineffective
immediately after a new searcher is created.



Will look into this and make changes as suggested.



Looking at the "top" screenshot ... you have plenty of memory to cache
the entire index.  Unless your queries are extreme, this is usually
enough for good performance.

One possible problem is that cache warming is taking far longer than
your autoSoftCommit interval, and the server is constantly busy making
thousands of warming queries.  Reducing autowarmCount, possibly to zero,
*might* fix that. I would expect higher CPU load than what your
screenshot shows if this were happening, but it still might be the
problem.



Great point. Thanks for the help.



Thanks,
Shawn









Re: solr.NRTCachingDirectoryFactory

2016-08-25 Thread Mikhail Khludnev
Rough sampling under load makes sense as usual. JMC is one of the suitable
tools for this.
Sometimes even just jstack or looking at SolrAdmin/Threads is enough.
If only a small ratio of documents is updated and the bottleneck is the
filterCache, you can experiment with segmented filters, which suit NRT
better.
http://blog-archive.griddynamics.com/2014/01/segmented-filter-cache-in-solr.html
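One rough way to act on repeated jstack samples is to aggregate the hottest top-of-stack frames across dumps, a poor man's sampling profiler. A minimal sketch (the dump parsing is simplified, and the loop that would actually collect `jstack <pid>` output a second apart under load is left out):

```python
import re
from collections import Counter

def top_frames(dumps, depth=1):
    """Count the most common top-of-stack frames across a list of
    jstack-style thread dumps."""
    counts = Counter()
    for dump in dumps:
        for block in dump.split("\n\n"):
            # Frames look like:  at org.apache.solr.X.y(X.java:12)
            frames = re.findall(r"^\s+at\s+(\S+)\(", block, re.MULTILINE)
            if frames:
                counts[tuple(frames[:depth])] += 1
    return counts.most_common()
```

Feed it a handful of dumps taken while queries are slow; a frame that dominates the counts is a good bottleneck candidate.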



-- 
Sincerely yours
Mikhail Khludnev


Re: solr.NRTCachingDirectoryFactory

2016-08-25 Thread Rallavagu

Follow up update ...

Set autowarm count to zero for caches for NRT and I could bring
latency down from 5 min to 2 min :)


However, still seeing high QTimes and wondering where else can I look? 
Should I debug the code or run some tools to isolate bottlenecks (disk 
I/O, CPU or Query itself). Looking for some tuning advice. Thanks.







Re: solr.NRTCachingDirectoryFactory

2016-07-26 Thread Erick Erickson
And, I might add, you should look through your old logs
and see how long it takes to open a searcher. Let's
say Shawn's lower bound is what you see, i.e.
it takes a minute each to execute all the autowarming
in filterCache and queryResultCache... So your current
latency is _at least_ 2 minutes between the time something
is indexed and it's available for search, just for autowarming.

Plus up to another 2 minutes for your soft commit interval
to expire.

So if your business people haven't noticed a 4 minute
latency yet, tell them they don't know what they're talking
about when they insist on the NRT interval being a few
seconds ;).

Best,
Erick



Re: solr.NRTCachingDirectoryFactory

2016-07-26 Thread Rallavagu



On 7/26/16 5:46 AM, Shawn Heisey wrote:

On 7/22/2016 10:15 AM, Rallavagu wrote:

<filterCache
 size="5000"
 initialSize="5000"
 autowarmCount="500"/>

<queryResultCache
 size="2"
 initialSize="2"
 autowarmCount="500"/>

As Erick indicated, these settings are incompatible with Near Real Time
updates.

With those settings, every time you commit and create a new searcher,
Solr will execute up to 1000 queries (potentially 500 for each of the
caches above) before that new searcher will begin returning new results.

I do not know how fast your filter queries execute when they aren't
cached... but even if they only take 100 milliseconds each, that could
take up to a minute for filterCache warming.  If each one takes two
seconds and there are 500 entries in the cache, then autowarming the
filterCache would take nearly 17 minutes. You would also need to wait
for the warming queries on queryResultCache.

The autowarmCount on my filterCache is 4, and warming that cache *still*
sometimes takes ten or more seconds to complete.

If you want true NRT, you need to set all your autowarmCount values to
zero.  The tradeoff with NRT is that your caches are ineffective
immediately after a new searcher is created.

Will look into this and make changes as suggested.



Looking at the "top" screenshot ... you have plenty of memory to cache
the entire index.  Unless your queries are extreme, this is usually
enough for good performance.

One possible problem is that cache warming is taking far longer than
your autoSoftCommit interval, and the server is constantly busy making
thousands of warming queries.  Reducing autowarmCount, possibly to zero,
*might* fix that. I would expect higher CPU load than what your
screenshot shows if this were happening, but it still might be the problem.

Great point. Thanks for the help.



Thanks,
Shawn



Re: solr.NRTCachingDirectoryFactory

2016-07-26 Thread Shawn Heisey
On 7/22/2016 10:15 AM, Rallavagu wrote:
> <filterCache
>  size="5000"
>  initialSize="5000"
>  autowarmCount="500"/>
>
> <queryResultCache
>  size="2"
>  initialSize="2"
>  autowarmCount="500"/>

As Erick indicated, these settings are incompatible with Near Real Time
updates.

With those settings, every time you commit and create a new searcher,
Solr will execute up to 1000 queries (potentially 500 for each of the
caches above) before that new searcher will begin returning new results.

I do not know how fast your filter queries execute when they aren't
cached... but even if they only take 100 milliseconds each, that could
take up to a minute for filterCache warming.  If each one takes two
seconds and there are 500 entries in the cache, then autowarming the
filterCache would take nearly 17 minutes. You would also need to wait
for the warming queries on queryResultCache.

The autowarmCount on my filterCache is 4, and warming that cache *still*
sometimes takes ten or more seconds to complete.

If you want true NRT, you need to set all your autowarmCount values to
zero.  The tradeoff with NRT is that your caches are ineffective
immediately after a new searcher is created.
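Shawn's arithmetic above can be sketched as a quick back-of-envelope calculation (the per-query times are his illustrative figures, not measurements):

```python
def autowarm_seconds(entries, avg_query_secs):
    """Worst-case time to rebuild a cache's autowarmed entries,
    assuming the warming queries run one after another."""
    return entries * avg_query_secs

# The two scenarios for autowarmCount="500":
fast = autowarm_seconds(500, 0.1)  # 100 ms per filter query -> 50 s
slow = autowarm_seconds(500, 2.0)  # 2 s per filter query -> 1000 s (~17 min)
```

Compare the result against your soft commit interval: if warming takes longer than the interval, the server never stops warming.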

Looking at the "top" screenshot ... you have plenty of memory to cache
the entire index.  Unless your queries are extreme, this is usually
enough for good performance.

One possible problem is that cache warming is taking far longer than
your autoSoftCommit interval, and the server is constantly busy making
thousands of warming queries.  Reducing autowarmCount, possibly to zero,
*might* fix that. I would expect higher CPU load than what your
screenshot shows if this were happening, but it still might be the problem.

Thanks,
Shawn



Re: solr.NRTCachingDirectoryFactory

2016-07-22 Thread Rallavagu



On 7/22/16 9:56 AM, Erick Erickson wrote:

OK, scratch autowarming. In fact your autowarm counts
are quite high, I suspect far past "diminishing returns".
I usually see autowarm counts < 64, but YMMV.

Are you seeing actual hit ratios that are decent on
those caches (admin UI>>plugins/stats>>cache>>...)
And your cache sizes are also quite high in my experience,
it's probably worth measuring the utilization there as well.
And, BTW, your filterCache can occupy up to 2G of your heap.
That's probably not your central problem, but it's something
to consider.

Will look into it.


So I don't know why your queries are taking that long; my
assumption is that they may simply be very complex queries,
or you have grouping turned on.

Queries are a bit complex for sure.


I guess the next thing I'd do is start trying to characterize
what queries are slow. Grouping? Pivot Faceting? 'cause
from everything you've said so far it's surprising that you're
seeing queries take this long, something doesn't feel right
but what it is I don't have a clue.


Thanks



Best,
Erick




Re: solr.NRTCachingDirectoryFactory

2016-07-22 Thread Rallavagu

Also, here is the link to screenshot.

https://dl.dropboxusercontent.com/u/39813705/Screen%20Shot%202016-07-22%20at%2010.40.21%20AM.png

Thanks




Re: solr.NRTCachingDirectoryFactory

2016-07-22 Thread Rallavagu
Here is the snapshot of memory usage from "top" as you mentioned. First 
row is "solr" process. Thanks.


  PID USER    PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
29468 solr    20   0 27.536g 0.013t 3.297g S  45.7 27.6   4251:45 java
21366 root    20   0 14.499g 217824  12952 S   1.0  0.4 192:11.54 java
 2077 root    20   0 14.049g 190824   9980 S   0.7  0.4  62:44.00 java
  511 root    20   0  125792  56848  56616 S   0.0  0.1   9:33.23 systemd-journal
  316 splunk  20   0  232056  44284  11804 S   0.7  0.1  84:52.74 splunkd
 1045 root    20   0  257680  39956   6836 S   0.3  0.1   7:05.78 puppet
32631 root    20   0  360956  39292   4788 S   0.0  0.1   4:55.37 mcollectived
  703 root    20   0  250372   9000    976 S   0.0  0.0   1:35.52 rsyslogd
 1058 nslcd   20   0  454192   6004   2996 S   0.0  0.0  15:08.87 nslcd
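The snapshot above can be turned into rough page-cache arithmetic (the 0.6 GiB figure for the non-Solr processes is an eyeballed sum from the table, not an exact number):

```python
GIB = 1024 ** 3

total_ram  = 48 * GIB            # physical RAM stated earlier in the thread
solr_res   = 0.013 * 1024 * GIB  # Solr JVM RES: "0.013t" in top -> ~13.3 GiB
other_res  = 0.6 * GIB           # rough total RES of the remaining processes
index_size = 15 * GIB            # on-disk index size stated earlier

# Whatever is not resident in processes is available to the OS page cache,
# so the whole 15 GiB index can comfortably stay cached.
page_cache_room = total_ram - solr_res - other_res
```

That headroom (~34 GiB) is why Shawn concludes memory is not the problem here.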




Re: solr.NRTCachingDirectoryFactory

2016-07-22 Thread Erick Erickson
OK, scratch autowarming. In fact your autowarm counts
are quite high, I suspect far past "diminishing returns".
I usually see autowarm counts < 64, but YMMV.

Are you seeing actual hit ratios that are decent on
those caches (admin UI>>plugins/stats>>cache>>...)
And your cache sizes are also quite high in my experience,
it's probably worth measuring the utilization there as well.
And, BTW, your filterCache can occupy up to 2G of your heap.
That's probably not your central problem, but it's something
to consider.
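Erick's 2G figure follows from the usual rule of thumb that each cached filter can be a bitset of one bit per document. A hedged sketch (the ~3.4M document count is a hypothetical value chosen to reproduce the 2G estimate; the thread never states the actual doc count):

```python
def filter_cache_bytes(max_size, max_doc):
    """Rough upper bound on filterCache heap usage: each of the
    max_size cached filters may be a bitset of max_doc bits."""
    return max_size * (max_doc // 8)

# size="5000" with an index of ~3.4M docs gives roughly 2 GiB:
worst = filter_cache_bytes(5000, 3_400_000)
```

Plug in your own maxDoc (from the core's admin page) to see what the configured size could cost you.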

So I don't know why your queries are taking that long; my
assumption is that they may simply be very complex queries,
or you have grouping turned on.

I guess the next thing I'd do is start trying to characterize
what queries are slow. Grouping? Pivot Faceting? 'cause
from everything you've said so far it's surprising that you're
seeing queries take this long, something doesn't feel right
but what it is I don't have a clue.

Best,
Erick

On Fri, Jul 22, 2016 at 9:15 AM, Rallavagu  wrote:
>
>
> On 7/22/16 8:34 AM, Erick Erickson wrote:
>>
>> Mostly this sounds like a problem that could be cured with
>> autowarming. But two things are conflicting here:
>> 1> you say "We have a requirement to have updates available immediately
>> (NRT)"
>> 2> your docs aren't available for 120 seconds given your autoSoftCommit
>> settings unless you're specifying
>> -Dsolr.autoSoftCommit.maxTime=some_other_interval
>> as a startup parameter.
>>
> Yes. We have 120 seconds available.
>
>> So assuming you really do have a 120 second autocommit time, you should be
>> able to smooth out the spikes by appropriate autowarming. You also haven't
>> indicated what your filterCache and queryResultCache settings are. They
>> come with a default of 0 for autowarm. But what is their size? And do you
>> see a correlation between longer queries on every 2-minute interval? And
>> do you have some test harness in place (jmeter works well) to demonstrate
>> that differences in your configuration help or hurt? I can't
>> over-emphasize the
>> importance of this, otherwise if you rely on somebody simply saying "it's
>> slow"
>> you have no way to know what effect changes have.
>
>
> Here is the cache configuration.
>
> <filterCache
>  size="5000"
>  initialSize="5000"
>  autowarmCount="500"/>
>
> <queryResultCache
>  size="2"
>  initialSize="2"
>  autowarmCount="500"/>
>
> <documentCache
>  size="10"
>  initialSize="10"
>  autowarmCount="0"/>
>
> We have run load tests using JMeter with directory pointing to Solr and also
> tests that are pointing to the application that queries Solr. In both cases,
> we have noticed the results being slower.
>
> Thanks
>
>>
>> Best,
>> Erick
>>
>>
>> On Thu, Jul 21, 2016 at 11:22 PM, Shawn Heisey 
>> wrote:
>>>
>>> On 7/21/2016 11:25 PM, Rallavagu wrote:

 There is no other software running on the system and it is completely
 dedicated to Solr. It is running on Linux. Here is the full version.

 Linux version 3.8.13-55.1.6.el7uek.x86_64
 (mockbu...@ca-build56.us.oracle.com) (gcc version 4.8.3 20140911 (Red
 Hat 4.8.3-9) (GCC) ) #2 SMP Wed Feb 11 14:18:22 PST 2015
>>>
>>>
>>> Run the top program, press shift-M to sort by memory usage, and then
>>> grab a screenshot of the terminal window.  Share it with a site like
>>> dropbox, imgur, or something similar, and send the URL.  You'll end up
>>> with something like this:
>>>
>>> https://www.dropbox.com/s/zlvpvd0rrr14yit/linux-solr-top.png?dl=0
>>>
>>> If you know what to look for, you can figure out all the relevant memory
>>> details from that.
>>>
>>> Thanks,
>>> Shawn
>>>
>


Re: solr.NRTCachingDirectoryFactory

2016-07-22 Thread Rallavagu



On 7/22/16 8:34 AM, Erick Erickson wrote:

Mostly this sounds like a problem that could be cured with
autowarming. But two things are conflicting here:
1> you say "We have a requirement to have updates available immediately (NRT)"
2> your docs aren't available for 120 seconds given your autoSoftCommit
settings unless you're specifying
-Dsolr.autoSoftCommit.maxTime=some_other_interval
as a startup parameter.


Yes. We have 120 seconds available.


So assuming you really do have a 120 second autocommit time, you should be
able to smooth out the spikes by appropriate autowarming. You also haven't
indicated what your filterCache and queryResultCache settings are. They
come with a default of 0 for autowarm. But what is their size? And do you
see a correlation between longer queries on every 2-minute interval? And
do you have some test harness in place (jmeter works well) to demonstrate
that differences in your configuration help or hurt? I can't over-emphasize the
importance of this, otherwise if you rely on somebody simply saying "it's slow"
you have no way to know what effect changes have.


Here is the cache configuration.

<filterCache
 size="5000"
 initialSize="5000"
 autowarmCount="500"/>

<queryResultCache
 size="2"
 initialSize="2"
 autowarmCount="500"/>

<documentCache
 size="10"
 initialSize="10"
 autowarmCount="0"/>
We have run load tests using JMeter with directory pointing to Solr and 
also tests that are pointing to the application that queries Solr. In 
both cases, we have noticed the results being slower.


Thanks



Best,
Erick





Re: solr.NRTCachingDirectoryFactory

2016-07-22 Thread Erick Erickson
Mostly this sounds like a problem that could be cured with
autowarming. But two things are conflicting here:
1> you say "We have a requirement to have updates available immediately (NRT)"
2> your docs aren't available for 120 seconds given your autoSoftCommit
settings unless you're specifying
-Dsolr.autoSoftCommit.maxTime=some_other_interval
as a startup parameter.

So assuming you really do have a 120 second autocommit time, you should be
able to smooth out the spikes by appropriate autowarming. You also haven't
indicated what your filterCache and queryResultCache settings are. They
come with a default of 0 for autowarm. But what is their size? And do you
see a correlation between longer queries on every 2-minute interval? And
do you have some test harness in place (jmeter works well) to demonstrate
that differences in your configuration help or hurt? I can't over-emphasize the
importance of this, otherwise if you rely on somebody simply saying "it's slow"
you have no way to know what effect changes have.

Best,
Erick
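Erick's point about a test harness is easy to operationalize: record QTime from each response and compare percentiles before and after a change, instead of relying on "it's slow". A minimal nearest-rank percentile helper (the sample QTimes below are made-up illustration data):

```python
def percentile(samples, pct):
    """Nearest-rank percentile of a list of QTime samples (ms)."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    k = max(0, int(round(pct / 100.0 * len(ordered))) - 1)
    return ordered[k]

# QTimes collected from a JMeter run (hypothetical values):
qtimes = [12, 15, 14, 250, 18, 2100, 16, 13, 17, 19]
p50 = percentile(qtimes, 50)
p95 = percentile(qtimes, 95)
```

A healthy p50 with a terrible p95/p99 is the classic signature of commit-time cache churn rather than uniformly slow queries.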




Re: solr.NRTCachingDirectoryFactory

2016-07-21 Thread Shawn Heisey
On 7/21/2016 11:25 PM, Rallavagu wrote:
> There is no other software running on the system and it is completely
> dedicated to Solr. It is running on Linux. Here is the full version.
>
> Linux version 3.8.13-55.1.6.el7uek.x86_64
> (mockbu...@ca-build56.us.oracle.com) (gcc version 4.8.3 20140911 (Red
> Hat 4.8.3-9) (GCC) ) #2 SMP Wed Feb 11 14:18:22 PST 2015 

Run the top program, press shift-M to sort by memory usage, and then
grab a screenshot of the terminal window.  Share it with a site like
dropbox, imgur, or something similar, and send the URL.  You'll end up
with something like this:

https://www.dropbox.com/s/zlvpvd0rrr14yit/linux-solr-top.png?dl=0

If you know what to look for, you can figure out all the relevant memory
details from that.

Thanks,
Shawn



Re: solr.NRTCachingDirectoryFactory

2016-07-21 Thread Rallavagu



On 7/21/16 9:16 PM, Shawn Heisey wrote:

On 7/21/2016 9:37 AM, Rallavagu wrote:

I suspect swapping as well. But, for my understanding - are the index
files from disk memory mapped automatically at the startup time?


They are *mapped* at startup time, but they are not *read* at startup.
The mapping just sets up a virtual address space for the entire file,
but until something actually reads the data from the disk, it will not
be in memory.  Getting the data in memory is what makes mmap fast.

http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html


We are not performing "commit" after every update and here is the
configuration for softCommit and hardCommit.

<autoCommit>
   <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
   <openSearcher>false</openSearcher>
</autoCommit>

<autoSoftCommit>
   <maxTime>${solr.autoSoftCommit.maxTime:12}</maxTime>
</autoSoftCommit>

I am seeing QTimes (for searches) swing between 2 and 10 seconds. Some
queries were showing slowness caused by faceting (per debug=true). Since
we adjusted indexing, facet times have improved, but the basic query
QTime is still high, so I am wondering where I can look. Is there a way
to debug (instrument) a query on a Solr node?


Assuming you have not defined the maxTime system properties mentioned in
those configs, that config means you will potentially be creating a new
searcher every two minutes ... but if you are sending explicit commits
or using commitWithin on your updates, then the true situation may be
very different than what's configured here.


We have allocated significant amount of RAM (48G total
physical memory, 12G heap, Total index disk size is 15G)


Assuming there's no other software on the system besides the one
instance of Solr with a 12GB heap, this would mean that you have enough
room to cache the entire index.  What OS are you running on? With that
information, I may be able to relay some instructions that will help
determine what the complete memory situation is on your server.


There is no other software running on the system and it is completely 
dedicated to Solr. It is running on Linux. Here is the full version.


Linux version 3.8.13-55.1.6.el7uek.x86_64 
(mockbu...@ca-build56.us.oracle.com) (gcc version 4.8.3 20140911 (Red 
Hat 4.8.3-9) (GCC) ) #2 SMP Wed Feb 11 14:18:22 PST 2015


Thanks



Thanks,
Shawn



Re: solr.NRTCachingDirectoryFactory

2016-07-21 Thread Shawn Heisey
On 7/21/2016 9:37 AM, Rallavagu wrote:
> I suspect swapping as well. But, for my understanding - are the index
> files from disk memory mapped automatically at the startup time?

They are *mapped* at startup time, but they are not *read* at startup. 
The mapping just sets up a virtual address space for the entire file,
but until something actually reads the data from the disk, it will not
be in memory.  Getting the data in memory is what makes mmap fast.

http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
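Shawn's lazy-loading point can be demonstrated from the shell; a small sketch (POSIX-ish system, a 1 MiB temp file standing in for an index file):

```shell
# Stand-in for an index file: 1 MiB written to a temp file.
f=$(mktemp)
head -c 1048576 /dev/zero > "$f"
# First read: pages are faulted in from disk into the OS page cache.
dd if="$f" of=/dev/null bs=64k 2>/dev/null
# Second read: the same pages are served from the page cache, no disk I/O.
dd if="$f" of=/dev/null bs=64k 2>/dev/null
echo "read $(wc -c < "$f" | tr -d ' ') bytes twice"
rm -f "$f"
```

The second dd completes from the page cache without touching disk; at index scale, that is the difference between a cold and a warm query.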

> We are not performing "commit" after every update and here is the
> configuration for softCommit and hardCommit.
>
> <autoCommit>
>    <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
>    <openSearcher>false</openSearcher>
> </autoCommit>
>
> <autoSoftCommit>
>    <maxTime>${solr.autoSoftCommit.maxTime:12}</maxTime>
> </autoSoftCommit>
>
> I am seeing QTimes (for searches) swing between 2 and 10 seconds. Some
> queries were showing slowness due to faceting (debug=true). Since we
> have adjusted indexing, facet times have improved, but basic query
> QTime is still high, so I am wondering where else to look. Is there a
> way to debug (instrument) a query on a Solr node?

Assuming you have not defined the maxTime system properties mentioned in
those configs, that config means you will potentially be creating a new
searcher every two minutes ... but if you are sending explicit commits
or using commitWithin on your updates, then the true situation may be
very different than what's configured here.

>>> We have allocated significant amount of RAM (48G total
>>> physical memory, 12G heap, Total index disk size is 15G)

Assuming there's no other software on the system besides the one
instance of Solr with a 12GB heap, this would mean that you have enough
room to cache the entire index.  What OS are you running on? With that
information, I may be able to relay some instructions that will help
determine what the complete memory situation is on your server.

Thanks,
Shawn



Re: solr.NRTCachingDirectoryFactory

2016-07-21 Thread Rallavagu

Thanks Erick.

On 7/21/16 8:25 AM, Erick Erickson wrote:

bq: map index files so "reading from disk" will be as simple and quick
as reading from memory hence would not incur any significant
performance degradation.

Well, if
1> the read has already been done. First time a page of the file is
accessed, it must be read from disk.
2> You have enough physical memory that _all_ of the files can be held
in memory at once.

<2> is a little tricky since the big slowdown comes from swapping
eventually. But in an LRU scheme, that may be OK if the oldest pages
are the stored=true data which are only accessed to return the top N,
not to satisfy the search.

I suspect swapping as well. But, for my understanding - are the index
files from disk memory mapped automatically at the startup time?


What are your QTimes anyway? Define "optimal"

I'd really push back on this statement: "We have a requirement to have
updates available immediately (NRT)". Truly? You can't set
expectations that 5 seconds will be needed (or 10?). Often this is an
artificial requirement that does no real service to the user, it's
just something people think they want. If this means you're sending a
commit after every document, it's actually a really bad practice
that'll get you into trouble eventually. Plus you won't be able to do
any autowarming which will read data from disk into the OS memory and
smooth out any spikes


We are not performing "commit" after every update and here is the 
configuration for softCommit and hardCommit.



<autoCommit>
   <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
   <openSearcher>false</openSearcher>
</autoCommit>

<autoSoftCommit>
   <maxTime>${solr.autoSoftCommit.maxTime:12}</maxTime>
</autoSoftCommit>


I am seeing QTimes (for searches) swing between 2 and 10 seconds. Some
queries were showing slowness due to faceting (debug=true). Since we have
adjusted indexing, facet times have improved, but basic query QTime is
still high, so I am wondering where else to look. Is there a way to debug
(instrument) a query on a Solr node?
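One built-in way to instrument a single query is the debug parameter; a sketch of the request (host and collection names are hypothetical, adjust to your deployment):

```shell
# debug=timing asks Solr to report per-search-component prepare/process
# times (query, facet, highlight, ...) in the response's "debug" section,
# which separates faceting cost from the base query cost.
SOLR="http://localhost:8983/solr/collection1/select"
Q='q=*:*&rows=0&wt=json&debug=timing'
echo "curl \"${SOLR}?${Q}\""
```

debug=timing is cheaper than debug=true since it skips the explain output, so it is safer to run against a loaded node.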




FWIW,
Erick

On Thu, Jul 21, 2016 at 8:18 AM, Rallavagu  wrote:

Solr 5.4.1 with embedded jetty with cloud enabled

We have a Solr deployment (approximately 3 million documents) with both
write and search operations happening. We have a requirement to have updates
available immediately (NRT). Configured with default
"solr.NRTCachingDirectoryFactory" for directory factory. Considering the
fact that every time there is an update, caches are invalidated and re-built,
I assume that "solr.NRTCachingDirectoryFactory" would memory map index files
so "reading from disk" will be as simple and quick as reading from memory
hence would not incur any significant performance degradation. Am I right in
my assumption? We have allocated significant amount of RAM (48G total
physical memory, 12G heap, Total index disk size is 15G) but not sure if I
am seeing the optimal QTimes (for searches). Any inputs are welcome. Thanks
in advance.


Re: solr.NRTCachingDirectoryFactory

2016-07-21 Thread Erick Erickson
bq: map index files so "reading from disk" will be as simple and quick
as reading from memory hence would not incur any significant
performance degradation.

Well, if
1> the read has already been done. First time a page of the file is
accessed, it must be read from disk.
2> You have enough physical memory that _all_ of the files can be held
in memory at once.

<2> is a little tricky since the big slowdown comes from swapping
eventually. But in an LRU scheme, that may be OK if the oldest pages
are the stored=true data which are only accessed to return the top N,
not to satisfy the search.

What are your QTimes anyway? Define "optimal"

I'd really push back on this statement: "We have a requirement to have
updates available immediately (NRT)". Truly? You can't set
expectations that 5 seconds will be needed (or 10?). Often this is an
artificial requirement that does no real service to the user, it's
just something people think they want. If this means you're sending a
commit after every document, it's actually a really bad practice
that'll get you into trouble eventually. Plus you won't be able to do
any autowarming which will read data from disk into the OS memory and
smooth out any spikes.
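The autowarming being discussed here is configured per cache in solrconfig.xml; a sketch only (class and sizes are illustrative, not taken from this thread), where autowarmCount is the number of old-searcher entries replayed into each new searcher's cache:

```xml
<filterCache class="solr.FastLRUCache"
             size="512"
             initialSize="512"
             autowarmCount="128"/>
```

With very frequent soft commits, a large autowarmCount can make a searcher take longer to open than the commit interval itself, which is why the thread later converges on autowarmCount=0 for the NRT case.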

FWIW,
Erick

On Thu, Jul 21, 2016 at 8:18 AM, Rallavagu  wrote:
> Solr 5.4.1 with embedded jetty with cloud enabled
>
> We have a Solr deployment (approximately 3 million documents) with both
> write and search operations happening. We have a requirement to have updates
> available immediately (NRT). Configured with default
> "solr.NRTCachingDirectoryFactory" for directory factory. Considering the
> fact that every time there is an update, caches are invalidated and re-built,
> I assume that "solr.NRTCachingDirectoryFactory" would memory map index files
> so "reading from disk" will be as simple and quick as reading from memory
> hence would not incur any significant performance degradation. Am I right in
> my assumption? We have allocated significant amount of RAM (48G total
> physical memory, 12G heap, Total index disk size is 15G) but not sure if I
> am seeing the optimal QTimes (for searches). Any inputs are welcome. Thanks
> in advance.


solr.NRTCachingDirectoryFactory

2016-07-21 Thread Rallavagu

Solr 5.4.1 with embedded jetty with cloud enabled

We have a Solr deployment (approximately 3 million documents) with both 
write and search operations happening. We have a requirement to have 
updates available immediately (NRT). Configured with default 
"solr.NRTCachingDirectoryFactory" for directory factory. Considering the 
fact that every time there is an update, caches are invalidated and 
re-built, I assume that "solr.NRTCachingDirectoryFactory" would memory
map index files so "reading from disk" will be as simple and quick as 
reading from memory hence would not incur any significant performance 
degradation. Am I right in my assumption? We have allocated significant 
amount of RAM (48G total physical memory, 12G heap, Total index disk 
size is 15G) but not sure if I am seeing the optimal QTimes (for 
searches). Any inputs are welcome. Thanks in advance.