Re: StatelessScriptUpdateProcessorFactory causing OOM errors?

Erick Erickson Thu, 13 Feb 2020 14:26:04 -0800

Robert:

My concern with fixing this by adding memory is that it may just be kicking the 
can down the road. Assuming there really is a leak, the leaked objects will 
eventually accumulate and you’ll hit another OOM. If that were the case, I’d 
expect even a cursory look at your memory usage to show it increasing steadily 
over time as your script is used. When I looked at your script, I didn’t see 
anything obvious...


Now, all that said, if you bump the memory and usage stays within some channel, 
maybe you were just running close to your limits before and had been getting 
“lucky”.
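
One rough way to tell a leak from simply running close to the limit is to 
compare the low points of heap usage early and late in an indexing run. The 
sketch below is only an illustration: the class name HeapTrend, the method 
troughsRising, and the 10% tolerance are made up for this example. It samples 
the heap of the JVM it runs in, so watching Solr itself would mean feeding it 
numbers taken from GC logs or JMX instead.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.util.Arrays;

// Hypothetical helper, not part of Solr.
final class HeapTrend {

    /**
     * Returns true when the lowest reading in the second half of the samples
     * exceeds the lowest reading in the first half by more than `tolerance`
     * (0.10 = 10%). Rising low points suggest a leak; flat low points suggest
     * usage is staying inside a channel.
     */
    static boolean troughsRising(long[] usedBytes, double tolerance) {
        if (usedBytes.length < 4) {
            return false; // too few samples to compare two halves
        }
        int mid = usedBytes.length / 2;
        long early = Arrays.stream(usedBytes, 0, mid).min().getAsLong();
        long late = Arrays.stream(usedBytes, mid, usedBytes.length).min().getAsLong();
        return late > early * (1.0 + tolerance);
    }

    /** Samples the used heap of the current JVM at a fixed interval. */
    static long[] sample(int count, long intervalMillis) throws InterruptedException {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        long[] samples = new long[count];
        for (int i = 0; i < count; i++) {
            samples[i] = memory.getHeapMemoryUsage().getUsed();
            if (i < count - 1) {
                Thread.sleep(intervalMillis);
            }
        }
        return samples;
    }

    public static void main(String[] args) throws InterruptedException {
        long[] samples = sample(6, 200);
        System.out.println("low points rising: " + troughsRising(samples, 0.10));
    }
}
```

A handful of samples over a second proves nothing; the comparison only means 
something across many GC cycles, for example a reading every minute over the 
course of a re-index.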

Here's another possibility:

- your commit interval is too long. While I constantly find them set too short, 
it’s also possible to set them too long. To support real-time get, Solr needs 
to keep pointers into the TLOGs for all documents that have been added since 
the last searcher was opened. I can’t really make this square with switching 
from a jar to a script, but…

You’d probably need to enable the OOM killer script and enable heap-dump-on-oom 
to really get to the bottom of this, or maybe just take a heap dump after a 
while when you’re indexing docs. 
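
For reference, the JVM side of this can be sketched in Java. This is a sketch 
under assumptions, not a Solr feature: the class HeapDumpSketch and its methods 
are hypothetical names, the HotSpotDiagnosticMXBean it uses exists only on 
HotSpot-based JVMs, and it inspects and dumps the JVM it runs in. To dump 
Solr’s heap the same bean would have to be reached inside the Solr process (for 
example over JMX), or an external tool used instead.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of Solr.
final class HeapDumpSketch {

    private static HotSpotDiagnosticMXBean diagnostics() {
        return ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
    }

    /**
     * Returns "true" if this JVM was started with
     * -XX:+HeapDumpOnOutOfMemoryError, otherwise "false".
     */
    static String dumpOnOomOption() {
        return diagnostics().getVMOption("HeapDumpOnOutOfMemoryError").getValue();
    }

    /**
     * Writes a dump of live objects to the given path. The file must not
     * already exist, and recent JDKs require the name to end in ".hprof".
     */
    static void dumpNow(String hprofPath) throws IOException {
        diagnostics().dumpHeap(hprofPath, true);
    }

    public static void main(String[] args) throws IOException {
        System.out.println("HeapDumpOnOutOfMemoryError = " + dumpOnOomOption());
        if (args.length == 1) {
            dumpNow(args[0]); // the output path is supplied by the caller
        }
    }
}
```

Taking two dumps some time apart during indexing and comparing the dominant 
object types is usually more telling than a single dump.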

Best,
Erick

> On Feb 13, 2020, at 2:45 PM, Jörn Franke <jornfra...@gmail.com> wrote:
> 
> I also had issues with this factory when creating atomic updates inside 
> it. They worked, but searchers were never closed; new ones were opened and 
> stayed open, with all the issues that entails. Maybe someone needs to look 
> into that in more detail. However, it is a script in the end, so it could 
> always be a bug in your script as well.
> 
>> On 13.02.2020 at 19:21, Haschart, Robert J (rh9ec) 
>> <rh...@virginia.edu> wrote:
>> 
>> Erick,
>> 
>> Sorry I didn't see this response; for some reason solr-users has stopped 
>> being delivered to my mailbox.
>> 
>> The script, which adds a field based on the value(s) in some other field, 
>> doesn't add a large number of different fields to the index.
>> The pool_f field has a total of only 11 different values, and any given 
>> record has a single value in that field, except for some rare cases that 
>> have two values.
>> 
>> I had previously implemented the same functionality by making a small jar 
>> file containing a customized version of TemplateUpdateProcessorFactory that 
>> could generate different field names, but since I needed another bit of 
>> functionality in the update chain I decided to port the original 
>> functionality to a script, since the "development overhead" of adding a 
>> script is less than that of adding multiple additional custom 
>> UpdateProcessorFactory objects.
>> 
>> I had been running Solr with the memory flag "-m 8G" and it had been 
>> running fine with that setting for at least a year, even recently when the 
>> customized Java version of TemplateUpdateProcessorFactory was being invoked 
>> to perform essentially the same processing step.
>> 
>> However, when I tried to accomplish the same thing via JavaScript through 
>> StatelessScriptUpdateProcessorFactory and started a re-index, it would die 
>> after about 1 million records had been indexed. And since it is merely my 
>> (massive) development machine, there are close to zero searches coming 
>> through while the re-index is happening.
>> 
>> I've managed to work around the issue on my dev box by upping the memory 
>> for Solr to 16G, and haven't had an OOM since doing that, but I'm hesitant 
>> to push these changes to our AWS-hosted production instances, since running 
>> out of memory and terminating there would be more of an issue.
>> 
>> -Bob
>> 
>> 
>> 
>> ________________________________
>>   From: Erick Erickson <erickerick...@gmail.com>
>>   Subject: Re: StatelessScriptUpdateProcessorFactory causing OOM errors?
>>   Date: Thu, 6 Feb 2020 09:18:41 -0500
>> 
>>   How many fields do you wind up having? It looks at a quick glance like
>>   it depends on the values of fields. While I’ve seen Solr/Lucene handle
>>   indexes with over 1M different fields, the results are unsatisfactory.
>> 
>>   What I’m wondering is if you are adding a zillion different fields to your
>>   docs as time passes and eventually the structures that are needed to
>>   maintain your field mappings are blowing up memory.
>> 
>>   If that’s the case, you need an alternative design because your
>>   performance will be unacceptable.
>> 
>>   May be off base, if so we can dig further.
>> 
>>   Best,
>>   Erick
>> 
>>> On Feb 5, 2020, at 3:41 PM, Haschart, Robert J (rh9ec) <rh...@virginia.edu> 
>>> wrote:
>>> 
>>> StatelessScriptUpdateProcessorFactory
>> 
>> 
>> 
>>
