Thank you Erick - it was a mistake for this collection to be running in schemaless mode; I will fix that, but right now the 'PROCESSOR_LOGS' schema only has 10 fields.  Another managed schema in the system has over 1,000.

Shawn - I did see a post about raising vm.max_map_count (it was at the default of 65530) and I increased it to 262144.  For the solr user, we're using 102,400 for open files and 65,000 for max user processes.
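For reference, here is roughly where those settings live on our boxes (assuming the limits come from pam/limits.conf rather than a systemd unit, where LimitNOFILE/LimitNPROC would take precedence):

    # /etc/sysctl.conf (apply with: sysctl -p)
    vm.max_map_count = 262144

    # /etc/security/limits.conf
    solr  soft  nofile  102400
    solr  hard  nofile  102400
    solr  soft  nproc   65000
    solr  hard  nproc   65000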

-Joe

On 12/10/2019 7:46 AM, Erick Erickson wrote:
One other red flag is you’re apparently running in “schemaless” mode, based on:

org.apache.solr.update.processor.AddSchemaFieldsUpdateProcessorFactory$AddSchemaFieldsUpdateProcessor.processAdd(AddSchemaFieldsUpdateProcessorFactory.java:475)

When running in schemaless mode, if Solr encounters a field in a doc that it 
hasn’t seen before, it adds that field to the schema, which updates the schema 
and reloads the collection. If this happens in the middle of “heavy indexing”, 
it’s going to clog up the works.

Please turn this off. See the message when you create a collection or look at 
the ref guide for how. The expense is one reason, but the second reason is that 
you have no control at all over how many fields you have in your index. Solr 
will merrily create these for _any_ new field. If you’re _lucky_, Solr will 
guess right. If you’re not lucky, Solr will start refusing to index documents 
due to field incompatibilities. Say the first value for a field is “1”. Solr 
guesses it’s an int. The next doc has “1.0”. Solr will fail the doc.
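If memory serves, the ref guide’s recipe for turning off automatic field guessing is to set update.autoCreateFields to false through the Config API, something along these lines (the host and collection name are placeholders):

    curl http://localhost:8983/solr/<collection>/config -d \
      '{"set-user-property": {"update.autoCreateFields": "false"}}'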

Next up: when Solr has thousands of fields, it starts to bog down due to housekeeping 
complexity. Do you have any idea how many fields have actually been realized in your index? 
5? 50? 100K? The admin UI>>core>>schema will give you an idea.
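If you’d rather not click through the UI, the Schema API should list every field that has made it into the managed schema, e.g. (placeholders again):

    curl http://localhost:8983/solr/<collection>/schema/fields

Counting the entries in the response tells you how many fields schemaless mode has created so far.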

Of course if your input docs are very tightly controlled, this really won’t be 
a problem, but in that case you don’t need schemaless anyway.

Why am I belaboring this? Because this may be the root of your thread issue. As 
you keep throwing docs at Solr, it has to queue them up if it’s making schema 
changes until the schema is updated and re-distributed to all replicas….

Best,
Erick

On Dec 10, 2019, at 2:25 AM, Walter Underwood <wun...@wunderwood.org> wrote:

We’ve run into this fatal problem with 6.6 in prod. It gets overloaded, makes 
4000 threads, runs out of memory, and dies.

Not an acceptable design. Excess load MUST be rejected, otherwise the system 
goes into a stable congested state.
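The standard way to get that behavior on the JVM is a bounded pool over a bounded queue that rejects work instead of spawning threads without limit. A minimal sketch of the idea (not Solr’s actual code; the sizes and the doIndexing() stub are made up):

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.RejectedExecutionException;
    import java.util.concurrent.ThreadPoolExecutor;
    import java.util.concurrent.TimeUnit;

    public class BoundedExecutorSketch {
        public static void main(String[] args) {
            // Hard caps: at most 200 worker threads and 1000 queued tasks.
            ThreadPoolExecutor pool = new ThreadPoolExecutor(
                    50, 200,                        // core and max pool size
                    60, TimeUnit.SECONDS,           // idle thread keep-alive
                    new ArrayBlockingQueue<>(1000),
                    new ThreadPoolExecutor.AbortPolicy()); // reject when full

            for (int i = 0; i < 10_000; i++) {
                try {
                    pool.execute(BoundedExecutorSketch::doIndexing);
                } catch (RejectedExecutionException e) {
                    // Excess load lands here; signal back-pressure to the
                    // caller (e.g. an HTTP 503) instead of piling up threads.
                }
            }
            pool.shutdown();
        }

        private static void doIndexing() {
            // stand-in for real indexing work
        }
    }

With AbortPolicy, overload produces an immediate rejection rather than a 4000-thread death spiral.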

I was working with John Nagle when he figured this out in the late 1980s.

https://www.researchgate.net/publication/224734039_On_Packet_Switches_with_Infinite_Storage

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

On Dec 9, 2019, at 11:14 PM, Mikhail Khludnev <m...@apache.org> wrote:

My experience with "OutOfMemoryError: unable to create new native thread"
is as follows: it occurs in environments where devs refuse to use thread pools
in favor of good old new Thread().
Then it turns rather interesting: if there is plenty of heap, GC doesn't
sweep the Thread instances. Since threads are native in Java, every one of them
holds some RAM for its native stack, and at some point that exhausts the native
stack space. So, check how many threads the JVM holds after this particular
OOME occurs with jstack; you can even force GC to release that native stack
space. Then rewrite the app, or reduce the heap so GC runs sooner.
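A quick way to do that check, assuming you have the Solr pid (jstack prints one
"java.lang.Thread.State:" line per Java thread):

    jstack <pid> | grep -c 'java.lang.Thread.State:'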

On Tue, Dec 10, 2019 at 9:44 AM Shawn Heisey <apa...@elyograg.org> wrote:

On 12/9/2019 2:23 PM, Joe Obernberger wrote:
Getting this error on some of the nodes in a SolrCloud cluster during heavy
indexing:
<snip>

Caused by: java.lang.OutOfMemoryError: unable to create new native thread
Java was not able to start a new thread.  Most likely this is caused by
the operating system imposing limits on the number of processes or
threads that a user is allowed to start.

On Linux, the default limit is usually 1024 processes.  It doesn't take
much for a Solr install to need more threads than that.

How to increase the limit will depend on what OS you're running on.
Typically on Linux, this is controlled by /etc/security/limits.conf.  If
you're not on Linux, then you'll need to research how to increase the
process limit.

As long as you're fiddling with limits, you'll probably also want to
increase the open file limit.
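One way to see what limits the running Solr process actually got (the pid is
whatever ps reports for Solr):

    cat /proc/<solr-pid>/limits    # look at the "Max processes" and "Max open files" rows
    # or, in a shell running as the solr user:
    ulimit -u    # max user processes
    ulimit -n    # open files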

Thanks,
Shawn


--
Sincerely yours
Mikhail Khludnev
