Hi Uwe,

Thanks for the response.  We've tried setting sharedArenaMaxPermits to 64;
I'll update this thread once we get some data.

One thing I don't understand is why the list of deleted mmapped files only
includes doc values files.  If your theory is correct and this is caused by
deletes being updated over and over, wouldn't we expect only .liv files to
be deleted?

Output:
```
7ed8a2754000-7ed8a2757000 r--s 00000000 08:10 80872480 /usr/share/opensearch/data/nodes/0/indices/Ci3MyIbNTceUmC67d1IlwQ/196/index/_9h0q_de_Lucene90_0.dvd (deleted)
7ed8a2757000-7ed8a275c000 r--s 00000000 08:10 78838113 /usr/share/opensearch/data/nodes/0/indices/Ci3MyIbNTceUmC67d1IlwQ/119/index/_912j_m9_Lucene90_0.dvd (deleted)
7ed8a275c000-7ed8a275f000 r--s 00000000 08:10 78830146 /usr/share/opensearch/data/nodes/0/indices/Ci3MyIbNTceUmC67d1IlwQ/126/index/_9buk_4e_Lucene90_0.dvd (deleted)
```
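
For anyone who wants to reproduce the breakdown, here is a minimal sketch of
how deleted mappings could be tallied by file extension from the text of
/proc/<pid>/maps. The function name tally_deleted_mappings is hypothetical
and this is not the exact command used to produce the output above; it
assumes the usual Linux maps layout (address, perms, offset, dev, inode,
pathname) with " (deleted)" appended to the pathname of unlinked files.

```python
import os
from collections import Counter


def tally_deleted_mappings(maps_text):
    """Count mappings of deleted files in /proc/<pid>/maps text, keyed by extension."""
    counts = Counter()
    for line in maps_text.splitlines():
        # Fields: address perms offset dev inode pathname.
        # The pathname may contain spaces, so split at most 5 times.
        parts = line.split(None, 5)
        if len(parts) < 6:
            continue  # anonymous mapping: no pathname column
        path = parts[5]
        if not path.endswith(" (deleted)"):
            continue  # file still exists on disk
        path = path[: -len(" (deleted)")]
        ext = os.path.splitext(path)[1] or "(no extension)"
        counts[ext] += 1
    return counts


# Usage (Linux only; replace 12345 with the real process id):
#   with open("/proc/12345/maps") as f:
#       for ext, n in tally_deleted_mappings(f.read()).most_common():
#           print(n, ext)
```

If the theory about live docs is right, one would expect .liv to show up in
such a tally; in our case it is almost entirely .dvd.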

Justin

On Wed, May 7, 2025 at 9:50 AM Uwe Schindler <u...@thetaphi.de> wrote:

> Hi,
>
> this could be related to a bug or limitation of the following change:
>
>  1. GITHUB#13570
>     <https://github.com/apache/lucene/pull/13570>,GITHUB#13574
>     <https://github.com/apache/lucene/pull/13574>,GITHUB#13535
>     <https://github.com/apache/lucene/pull/13535>: Avoid performance
>     degradation with closing shared Arenas. Closing many individual
>     index files can potentially lead to a degradation in execution
>     performance. Index files are mmapped one-to-one with the JDK's
>     foreign shared Arena. The JVM deoptimizes the top few frames of all
>     threads when closing a shared Arena (see JDK-8335480). We mitigate
>     this situation when running with JDK 21 and greater, by *1) using a
>     confined Arena where appropriate, and 2) grouping files from the
>     same segment to a single shared Arena*. A system property has been
>     added that allows to control the total maximum number of mmapped
>     files that may be associated with a single shared Arena. For
>     example, to set the max number of permits to 256, pass the following
>     on the command line
>     -Dorg.apache.lucene.store.MMapDirectory.sharedArenaMaxPermits=256.
>     Setting a value of 1 associates a single file to a single shared arena.
>     (Chris Hegarty, Michael Gibney, Uwe Schindler)
>
> Actually, it looks like there are many deletes on the same index segment,
> so the segment itself is not closed but the deletes are updated over and
> over. As the whole segment uses the same shared memory arena, it won't
> release any of its mappings (up to 1024, the default value) until the
> arena is closed, and these all count against the max map count limit.
>
> To work around the issue you can reduce the setting as described above
> by passing it as a separate system property on OpenSearch's command
> line. I'd recommend using a smaller value like 64 for systems with many
> indexes.
>
> Please tell us what you find out! Did reducing the
> sharedArenaMaxPermits limit help? Maybe a good idea would be to change
> Lucene / OpenSearch to open deletion files in a separate arena or use
> READONCE to load them to memory.
>
> Uwe
>
> On 07.05.2025 at 03:44, Justin Borromeo wrote:
> > Hi all,
> >
> > After upgrading our OpenSearch cluster from 2.16.0 to 2.19.1 (moving from
> > Lucene 9.10 to Lucene 9.12), our largest clusters started crashing with
> > the following error:
> >
> > # There is insufficient memory for the Java Runtime Environment to continue.
> >
> > # Native memory allocation (malloc) failed to allocate 2097152 bytes. Error detail: AllocateHeap
> >
> > We narrowed down the issue to the vm max map count (262144) being reached.
> > Prior to server crash, we see map count (measured by
> > `cat /proc/{pid}/maps | wc -l`) approach the 262144 limit we set.  Looking
> > at one of the outputs of `cat /proc/{pid}/maps`, we observed that 246K of
> > the 252K maps are for deleted doc values (.dvd) files.
> >
> > Is this expected?  If so, were there any changes in the Lucene codebase
> > between those two versions that could have caused this?  Any suggestions
> > on debugging?
> >
> > Thanks in advance and sorry if this is a better question for the OS
> > community or the Lucene developer list.
> >
> > Justin Borromeo
> >
> --
> Uwe Schindler
> Achterdiek 19, D-28357 Bremen
> https://www.thetaphi.de
> eMail:u...@thetaphi.de
>
