Re: Possible memory leaks with frequent replication
I hadn't looked at the code, am not familiar with Solr code, and can't say what that code does. But I have experienced issues that I _believe_ were caused by too-frequent commits causing overlapping searcher preparation. And I've definitely seen Solr documentation that suggests this is an issue. Let me find it now to see if the experts think these documented suggestions are still correct or not:

> On the other hand, autowarming (populating) a new collection could take a lot of time, especially since it uses only one thread and one CPU. If your settings fire off snapinstaller too frequently, then a Solr slave could be in the undesirable condition of handing-off queries to one (old) collection, and, while warming a new collection, a second “new” one could be snapped and begin warming! If we attempted to solve such a situation, we would have to invalidate the first “new” collection in order to use the second one, then when a “third” new collection would be snapped and warmed, we would have to invalidate the “second” new collection, and so on ad infinitum. A completely warmed collection would never make it to full term before it was aborted. This can be prevented with a properly tuned configuration so new collections do not get installed too rapidly.

http://wiki.apache.org/solr/SolrPerformanceFactors#Updates_and_Commit_Frequency_Tradeoffs

I think I've seen that same advice on another wiki page, not specifically about replication but about commit frequency balanced against auto-warming, leading to overlapping warming, leading to spiraling RAM/CPU usage -- but NOT an exception being thrown or an HTTP error delivered. I can't find it on the wiki, but here's a listserv post with someone reporting findings that match my understanding:

http://osdir.com/ml/solr-user.lucene.apache.org/2010-09/msg00528.html

How does this advice square with the code Lance found?
Is my understanding of how frequent commits can interact with the time it takes to warm a new collection correct? Appreciate any additional info.

Lance Norskog wrote:

Isn't that what this code does?

  onDeckSearchers++;
  if (onDeckSearchers < 1) {
    // should never happen... just a sanity check
    log.error(logid + "ERROR!!! onDeckSearchers is " + onDeckSearchers);
    onDeckSearchers = 1;  // reset
  } else if (onDeckSearchers > maxWarmingSearchers) {
    onDeckSearchers--;
    String msg = "Error opening new searcher. exceeded limit of maxWarmingSearchers=" + maxWarmingSearchers + ", try again later.";
    log.warn(logid + "" + msg);
    // HTTP 503==service unavailable, or 409==Conflict
    throw new SolrException(SolrException.ErrorCode.SERVICE_UNAVAILABLE, msg, true);
  } else if (onDeckSearchers > 1) {
    log.info(logid + "PERFORMANCE WARNING: Overlapping onDeckSearchers=" + onDeckSearchers);
  }

On Tue, Nov 2, 2010 at 10:02 AM, Jonathan Rochkind rochk...@jhu.edu wrote: It's definitely a known 'issue' that you can't replicate (or do any other kind of index change, including a commit) at a faster frequency than your warming queries take to complete, or you'll wind up with something like you've seen. [...]
Re: Possible memory leaks with frequent replication
Ah, but reading Peter's email message I reference more carefully, it seems that Solr already DOES provide an info-level log warning about overlapping warming, awesome. (But again, I'm pretty sure it does NOT throw or deliver an HTTP error in that condition, based on my and others' experience.)

> To check if your Solr environment is suffering from this, turn on INFO level logging, and look for: 'PERFORMANCE WARNING: Overlapping onDeckSearchers=x'.

Sweet, good to know, and I'll definitely add this to my debugging toolbox.

Peter's listserv message really ought to be a wiki page, I think. Any reason for me not to just add it as a new one with a title like "Commit frequency and auto-warming"? Unless it's already in the wiki somewhere I haven't found -- and assuming the wiki will let an ordinary user-created account add a new page.

Jonathan Rochkind wrote: I hadn't looked at the code, am not familiar with Solr code, and can't say what that code does. But I have experienced issues that I _believe_ were caused by too-frequent commits causing overlapping searcher preparation. [...]
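The "turn on INFO level logging and look for the warning" check described above is easy to script. A minimal sketch -- the log lines and the /tmp path here are hypothetical stand-ins; in practice you would grep whatever file your servlet container writes Solr's logs to:

```shell
# Build a small sample log excerpt (hypothetical lines), then count
# overlapping-searcher warnings in it with grep -c.
printf '%s\n' \
  'Nov 3, 2010 8:00:01 AM org.apache.solr.core.SolrCore getSearcher' \
  'INFO: [] PERFORMANCE WARNING: Overlapping onDeckSearchers=2' \
  'Nov 3, 2010 8:00:02 AM org.apache.solr.core.SolrCore registerSearcher' \
  > /tmp/solr-excerpt.log
grep -c 'PERFORMANCE WARNING: Overlapping onDeckSearchers' /tmp/solr-excerpt.log
```

A nonzero count means searchers were warming concurrently, i.e. commits or replications are arriving faster than warming completes.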
Re: Possible memory leaks with frequent replication
Do you use EmbeddedSolr in the query server? There is a memory leak that shows up when doing a lot of replications.

On Wed, Nov 3, 2010 at 8:28 AM, Jonathan Rochkind rochk...@jhu.edu wrote: Ah, but reading Peter's email message I reference more carefully, it seems that Solr already DOES provide an info-level log warning about overlapping warming, awesome. [...]
Re: Possible memory leaks with frequent replication
On Mon, Nov 01, 2010 at 05:42:51PM -0700, Lance Norskog said: You should query against the indexer. I'm impressed that you got 5s replication to work reliably. That's our current solution - I was just wondering if there was anything I was missing. Thanks!
Re: Possible memory leaks with frequent replication
On Tue, Nov 2, 2010 at 12:32 PM, Simon Wistow si...@thegestalt.org wrote:
> On Mon, Nov 01, 2010 at 05:42:51PM -0700, Lance Norskog said:
> > You should query against the indexer. I'm impressed that you got 5s
> > replication to work reliably.
> That's our current solution - I was just wondering if there was anything
> I was missing.

You could also try dialing down maxWarmingSearchers to 1 - that should prevent multiple searchers warming at the same time, which may be the source of your running out of memory.

-Yonik
http://www.lucidimagination.com
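Yonik's suggestion corresponds to a one-line setting in the slave's solrconfig.xml. A sketch of the relevant element (placement shown as in stock Solr configs of this era; verify against your own config file):

```xml
<config>
  <!-- Allow only one searcher to warm at a time. A commit/replication
       that would open a second warming searcher is rejected with a 503
       (per the SolrCore code quoted in this thread) instead of piling
       additional warming searchers into the heap. -->
  <maxWarmingSearchers>1</maxWarmingSearchers>
</config>
```

The trade-off is that rejected commits must be retried later, but the slave can no longer accumulate warming searchers until it runs out of memory.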
Re: Possible memory leaks with frequent replication
It's definitely a known 'issue' that you can't replicate (or do any other kind of index change, including a commit) at a faster frequency than your warming queries take to complete, or you'll wind up with something like you've seen. It's in some documentation somewhere I saw, for sure.

The advice to 'just query against the master' is kind of odd, because then... why have a slave at all, if you aren't going to query against it? I guess just for backup purposes. But even with just one Solr, or querying the master, if you commit at a rate such that commits come before the warming queries can complete, you're going to have the same issue.

The only answer I know of is: don't commit (or replicate) at a faster rate than it takes your warming to complete. You can reduce your warming queries/operations, or reduce your commit/replicate frequency.

It would be interesting/useful if Solr noticed this going on and gave you some kind of error in the log (or even an exception when started with a certain parameter, for testing): "Overlapping warming queries, you're committing too fast" or something. Because it's easy to make this happen without realizing it, and then your Solr does what Simon says: runs out of RAM and/or uses a whole lot of CPU and disk I/O.

Lance Norskog wrote: You should query against the indexer. I'm impressed that you got 5s replication to work reliably.

On Mon, Nov 1, 2010 at 4:27 PM, Simon Wistow si...@thegestalt.org wrote: We've been trying to get a setup in which a slave replicates from a master every few seconds (ideally every second but currently we have it set at every 5s). [...]
Re: Possible memory leaks with frequent replication
Isn't that what this code does?

  onDeckSearchers++;
  if (onDeckSearchers < 1) {
    // should never happen... just a sanity check
    log.error(logid + "ERROR!!! onDeckSearchers is " + onDeckSearchers);
    onDeckSearchers = 1;  // reset
  } else if (onDeckSearchers > maxWarmingSearchers) {
    onDeckSearchers--;
    String msg = "Error opening new searcher. exceeded limit of maxWarmingSearchers=" + maxWarmingSearchers + ", try again later.";
    log.warn(logid + "" + msg);
    // HTTP 503==service unavailable, or 409==Conflict
    throw new SolrException(SolrException.ErrorCode.SERVICE_UNAVAILABLE, msg, true);
  } else if (onDeckSearchers > 1) {
    log.info(logid + "PERFORMANCE WARNING: Overlapping onDeckSearchers=" + onDeckSearchers);
  }

On Tue, Nov 2, 2010 at 10:02 AM, Jonathan Rochkind rochk...@jhu.edu wrote: It's definitely a known 'issue' that you can't replicate (or do any other kind of index change, including a commit) at a faster frequency than your warming queries take to complete, or you'll wind up with something like you've seen. [...]

--
Lance Norskog
goks...@gmail.com
Possible memory leaks with frequent replication
We've been trying to get a setup in which a slave replicates from a master every few seconds (ideally every second, but currently we have it set at every 5s). Everything seems to work fine until, periodically, the slave just stops responding, from what looks like it running out of memory:

  org.apache.catalina.core.StandardWrapperValve invoke
  SEVERE: Servlet.service() for servlet jsp threw exception
  java.lang.OutOfMemoryError: Java heap space

(our monitoring seems to confirm this).

Looking around, my suspicion is that it takes new Readers longer to warm than the gap between replications, and thus they just build up until all memory is consumed (which, I suppose, isn't really memory 'leaking' per se, more just resource consumption). That said, we've tried turning off caching on the slave and that didn't help either, so it's possible I'm wrong.

Is there anything we can do about this? I'm reluctant to increase the heap space, since I suspect that will just mean a longer period between failures. Might Zoie help here? Or should we just query against the Master?

Thanks,

Simon
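Simon's build-up hypothesis is easy to quantify: if a searcher takes longer to warm than the replication interval, the number of searchers alive at once grows toward ceil(warm_time / interval), each holding its own caches on the heap. A back-of-the-envelope sketch (the 20s warm time is a hypothetical figure, not a number from this thread):

```shell
# Hypothetical numbers: a new searcher takes ~20s to warm, and replication
# triggers a commit every 5s. Steady-state count of concurrently warming
# searchers is the ceiling of warm/interval, computed here with integer math.
warm=20
interval=5
echo $(( (warm + interval - 1) / interval ))
```

With these numbers the slave holds four warming searchers plus the live one at any moment, which matches the "they just build up until all memory is consumed" symptom.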
Re: Possible memory leaks with frequent replication
You should query against the indexer. I'm impressed that you got 5s replication to work reliably.

On Mon, Nov 1, 2010 at 4:27 PM, Simon Wistow si...@thegestalt.org wrote: We've been trying to get a setup in which a slave replicates from a master every few seconds (ideally every second but currently we have it set at every 5s). [...]

--
Lance Norskog
goks...@gmail.com