Erick,

Sorry I didn't see this response; for some reason solr-users has stopped being 
delivered to my mailbox.

The script that adds a field based on the value(s) in some other field doesn't 
add a large number of different fields to the index.
The pool_f field only has a total of 11 different values, and except for some 
rare cases, any given record only has a single value in that field, and those 
rare cases will have two values.
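
For illustration, a script of this kind looks roughly like the sketch below. 
This is not the actual script: the generated field name pattern 
("pool_" + value + "_b") is made up for the example. Only pool_f and the 
processAdd(cmd) / cmd.solrDoc entry point come from the real setup.

```javascript
// Rough sketch of an update script for StatelessScriptUpdateProcessorFactory.
// ASSUMPTION: the derived field name "pool_<value>_b" is hypothetical and only
// stands in for whatever field name the real script generates.
function processAdd(cmd) {
  var doc = cmd.solrDoc;                    // SolrInputDocument being indexed
  var pools = doc.getFieldValues("pool_f"); // Collection of values, or null
  if (pools == null) {
    return;                                 // no pool_f: leave the doc alone
  }
  var values = pools.toArray();             // usually one value, rarely two
  for (var i = 0; i < values.length; i++) {
    // One derived field per pool value, so with 11 possible pool values
    // at most 11 distinct field names can ever be created.
    doc.addField("pool_" + values[i] + "_b", true);
  }
}

// Solr's example update-script.js defines all six hooks, so this sketch
// does the same; the ones below are intentionally empty.
function processDelete(cmd) { }
function processMergeIndexes(cmd) { }
function processCommit(cmd) { }
function processRollback(cmd) { }
function finish() { }
```

Since the field name depends only on the pool value, the number of distinct 
fields stays bounded by the number of pool values no matter how many records 
are indexed.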

I had previously implemented the same functionality by making a small jar file 
containing a customized version of TemplateUpdateProcessorFactory that could 
generate different field names, but since I needed another bit of functionality 
in the update chain, I decided to port the original functionality to a script, 
since the "development overhead" of adding a script is less than that of adding 
multiple additional custom UpdateProcessorFactory objects.

I had been running Solr with the memory flag "-m 8G", and it had been 
running fine with that setting for at least a year, even recently when the 
customized Java version of TemplateUpdateProcessorFactory was being invoked to 
perform essentially the same processing step.

However, when I tried to accomplish the same thing via JavaScript through 
StatelessScriptUpdateProcessorFactory and started a re-index, it would die after 
about 1 million records had been indexed. And since this is merely my (massive) 
development machine, there are close to zero searches coming through while the 
re-index is happening.

I've managed to work around the issue on my dev box by upping the memory 
for Solr to 16G, and haven't had an OOM since doing that, but I'm hesitant to 
push these changes to our AWS-hosted production instances, since running out of 
memory and terminating there would be more of an issue.

-Bob



________________________________
    From: Erick Erickson <erickerick...@gmail.com>
    Subject: Re: StatelessScriptUpdateProcessorFactory causing OOM errors?
    Date: Thu, 6 Feb 2020 09:18:41 -0500

    How many fields do you wind up having? It looks on a quick glance like
    it depends on the values of fields. While I’ve seen Solr/Lucene handle
    indexes with over 1M different fields, it’s unsatisfactory.

    What I’m wondering is if you are adding a zillion different fields to your
    docs as time passes and eventually the structures that are needed to
    maintain your field mappings are blowing up memory.

    If that’s the case, you need an alternative design because your
    performance will be unacceptable.

    May be off base, if so we can dig further.

    Best,
    Erick

    > On Feb 5, 2020, at 3:41 PM, Haschart, Robert J (rh9ec) <rh...@virginia.edu> wrote:
    >
    > StatelessScriptUpdateProcessorFactory



