Re: Supporting multiple indexes in one collection

2020-07-01 Thread Raji N
Did the test a while back; revisiting this again. In standalone Solr we have
seen queries take more time when the data exists in 2 shards, and that's the
main reason this test was done. If anyone has experience with this, I'd like
to hear it.

On Tue, Jun 30, 2020 at 11:50 PM Jörn Franke  wrote:

> How many documents?
> The real difference was only a couple of ms?
>
> > On 01.07.2020 at 07:34, Raji N wrote:
> >
> > Had 2 indexes in 2 separate shards of one collection, and had the exact
> > same data published with the composite router with a prefix. Disabled all
> > caches. Issued the same query, a small query with q and fq parameters. The
> > number of queries executed (with the same threads, run for the same time)
> > was higher in the 2-indexes-in-2-separate-shards case. The 90th percentile
> > response time was also a few ms better.
> >
> > Thanks,
> > Raji
> >
> >> On Tue, Jun 30, 2020 at 10:06 PM Jörn Franke wrote:
> >>
> >> What did you test? Which queries? What were the exact results in terms of
> >> time?
> >>
> >>>> On 30.06.2020 at 22:47, Raji N wrote:
> >>>
> >>> Hi ,
> >>>
> >>>
> >>> Trying to place multiple smaller indexes in one collection (as we read
> >>> that SolrCloud performance degrades as the number of collections
> >>> increases). We are exploring two ways:
> >>>
> >>>
> >>> 1) Placing each index on a single shard of a collection.
> >>>
> >>>  In this case, placing documents for a single index is manual, and
> >>> automatic rebalancing is not done by Solr.
> >>>
> >>>
> >>> 2) Solr routing: composite router with a prefix.
> >>>
> >>>     In this case Solr doesn’t place all the docs with the same prefix in
> >>> one shard, so searches become distributed, but shard rebalancing is taken
> >>> care of by Solr.
> >>>
> >>>
> >>> We did a small perf test with both these setups. We saw that performance
> >>> for the first case (placing an index explicitly on a shard) is better.
> >>>
> >>>
> >>> Has anyone done anything similar? Can you please share your experience?
> >>>
> >>>
> >>> Thanks,
> >>>
> >>> Raji
> >>
>


Re: Supporting multiple indexes in one collection

2020-06-30 Thread Raji N
Had 2 indexes in 2 separate shards of one collection, and had the exact same
data published with the composite router with a prefix. Disabled all caches.
Issued the same query, a small query with q and fq parameters. The number of
queries executed (with the same threads, run for the same time) was higher in
the 2-indexes-in-2-separate-shards case. The 90th percentile response time was
also a few ms better.

Thanks,
Raji

On Tue, Jun 30, 2020 at 10:06 PM Jörn Franke  wrote:

> What did you test? Which queries? What were the exact results in terms of
> time?
>
> > On 30.06.2020 at 22:47, Raji N wrote:
> >
> > Hi ,
> >
> >
> > Trying to place multiple smaller indexes in one collection (as we read
> > that SolrCloud performance degrades as the number of collections
> > increases). We are exploring two ways:
> >
> >
> > 1) Placing each index on a single shard of a collection.
> >
> >   In this case, placing documents for a single index is manual, and
> > automatic rebalancing is not done by Solr.
> >
> >
> > 2) Solr routing: composite router with a prefix.
> >
> >  In this case Solr doesn’t place all the docs with the same prefix in one
> > shard, so searches become distributed, but shard rebalancing is taken care
> > of by Solr.
> >
> >
> > We did a small perf test with both these setups. We saw that performance
> > for the first case (placing an index explicitly on a shard) is better.
> >
> >
> > Has anyone done anything similar? Can you please share your experience?
> >
> >
> > Thanks,
> >
> > Raji
>


Supporting multiple indexes in one collection

2020-06-30 Thread Raji N
Hi ,


Trying to place multiple smaller indexes in one collection (as we read that
SolrCloud performance degrades as the number of collections increases). We are
exploring two ways:


1) Placing each index on a single shard of a collection.

   In this case, placing documents for a single index is manual, and
automatic rebalancing is not done by Solr.


2) Solr routing: composite router with a prefix.

  In this case Solr doesn’t place all the docs with the same prefix in one
shard, so searches become distributed, but shard rebalancing is taken care of
by Solr.


We did a small perf test with both these setups. We saw that performance for
the first case (placing an index explicitly on a shard) is better.


Has anyone done anything similar? Can you please share your experience?


Thanks,

Raji
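
For reference, a minimal sketch of the two setups described above (untested;
collection, shard, and field names are made up for illustration):

    # Approach 1: implicit router; each "index" is pinned to a named shard,
    # and updates are routed manually via the _route_ parameter.
    curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=multi_idx&router.name=implicit&shards=indexA,indexB"
    curl -X POST -H 'Content-Type: application/json' \
      "http://localhost:8983/solr/multi_idx/update?commit=true&_route_=indexA" \
      -d '[{"id": "doc1", "title_s": "example"}]'

    # Approach 2: default compositeId router; the "indexA!" id prefix is hashed
    # to choose the shard(s), and _route_ narrows queries to that prefix.
    curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=multi_idx2&numShards=2"
    curl -X POST -H 'Content-Type: application/json' \
      "http://localhost:8983/solr/multi_idx2/update?commit=true" \
      -d '[{"id": "indexA!doc1", "title_s": "example"}]'
    curl "http://localhost:8983/solr/multi_idx2/select?q=title_s:example&_route_=indexA!"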


Re: off-heap OOM

2020-05-01 Thread Raji N
Thanks for your reply. Sure, will take a look at the docker host log. But
even when we got the "unable to create new native thread" error, the heap
dump taken within the hour before the OOM (we have hourly heap generation)
did not have more than 150 to 160 threads. So it doesn't look like it happens
due to running out of threads; rather, we suspect it happens because there is
no native memory left.

Thanks,
Raji

On Fri, May 1, 2020 at 12:13 AM Mikhail Khludnev  wrote:

> > java.lang.OutOfMemoryError: unable to create new native thread
> Usually means a code flaw, but there is a workaround: trigger heap GC.
> It happens when an app creates threads instead of pooling them properly and
> no GC occurs, so Java Thread objects hang around in the heap in a stopped
> state, but every one of them holds a native thread handle; the system then
> runs out of native threads sooner or later. So, in this case, reducing the
> heap size frees native threads and the app is able to recycle them. But you
> are right, it's rather better to disable it.
> Also, check the docker host log; there's a specific error message for Java
> under Docker.
>
> On Fri, May 1, 2020 at 3:55 AM Raji N  wrote:
>
> > It used to occur every 3 days; we reduced the heap and it started
> > occurring every 5 days. From the logs we can't get much. Sometimes we see
> > "unable to create new native thread" in the logs, and many times there are
> > no exceptions at all.
> > When it says "unable to create native thread", we got the below
> > exceptions, as we use CDCR. To eliminate CDCR from this issue, we disabled
> > CDCR as well. But we still get OOM.
> >
> >  WARN  (cdcr-update-log-synchronizer-93-thread-1) [   ]
> > o.a.s.h.CdcrUpdateLogSynchronizer Caught unexpected exception
> >
> > java.lang.OutOfMemoryError: unable to create new native thread
> >
> >    at java.lang.Thread.start0(Native Method) ~[?:1.8.0_211]
> >    at java.lang.Thread.start(Thread.java:717) ~[?:1.8.0_211]
> >    at org.apache.http.impl.client.IdleConnectionEvictor.start(IdleConnectionEvictor.java:96) ~[httpclient-4.5.3.jar:4.5.3]
> >    at org.apache.http.impl.client.HttpClientBuilder.build(HttpClientBuilder.java:1219) ~[httpclient-4.5.3.jar:4.5.3]
> >    at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:319) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
> >    at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:330) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
> >    at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:268) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
> >    at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:255) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
> >    at org.apache.solr.client.solrj.impl.HttpSolrClient.<init>(HttpSolrClient.java:200) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
> >    at org.apache.solr.client.solrj.impl.HttpSolrClient$Builder.build(HttpSolrClient.java:957) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
> >    at org.apache.solr.handler.CdcrUpdateLogSynchronizer$UpdateLogSynchronisation.run(CdcrUpdateLogSynchronizer.java:139) [solr-core-7.6.0.jar:7.6.0-SNAPSHOT 34d82ed033cccd8120431b73e93554b85b24a278 - i843100 - 2019-09-30 14:02:46]
> >    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_211]
> >    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_211]
> >    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_211]
> >    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_211]

Re: off-heap OOM

2020-04-30 Thread Raji N
It used to occur every 3 days; we reduced the heap and it started occurring
every 5 days. From the logs we can't get much. Sometimes we see "unable to
create new native thread" in the logs, and many times there are no exceptions
at all.
When it says "unable to create native thread", we got the below exceptions,
as we use CDCR. To eliminate CDCR from this issue, we disabled CDCR as well.
But we still get OOM.

 WARN  (cdcr-update-log-synchronizer-93-thread-1) [   ]
o.a.s.h.CdcrUpdateLogSynchronizer Caught unexpected exception

java.lang.OutOfMemoryError: unable to create new native thread

   at java.lang.Thread.start0(Native Method) ~[?:1.8.0_211]
   at java.lang.Thread.start(Thread.java:717) ~[?:1.8.0_211]
   at org.apache.http.impl.client.IdleConnectionEvictor.start(IdleConnectionEvictor.java:96) ~[httpclient-4.5.3.jar:4.5.3]
   at org.apache.http.impl.client.HttpClientBuilder.build(HttpClientBuilder.java:1219) ~[httpclient-4.5.3.jar:4.5.3]
   at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:319) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
   at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:330) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
   at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:268) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
   at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:255) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
   at org.apache.solr.client.solrj.impl.HttpSolrClient.<init>(HttpSolrClient.java:200) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
   at org.apache.solr.client.solrj.impl.HttpSolrClient$Builder.build(HttpSolrClient.java:957) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
   at org.apache.solr.handler.CdcrUpdateLogSynchronizer$UpdateLogSynchronisation.run(CdcrUpdateLogSynchronizer.java:139) [solr-core-7.6.0.jar:7.6.0-SNAPSHOT 34d82ed033cccd8120431b73e93554b85b24a278 - i843100 - 2019-09-30 14:02:46]
   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_211]
   at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_211]
   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_211]
   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_211]
   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_211]
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_211]

Thanks,
Raji
On Thu, Apr 30, 2020 at 12:24 AM Mikhail Khludnev  wrote:

> Raji, what does that "OOM for solr occur in every 5 days" look like exactly?
> What is the error message? Where exactly is it occurring?
>
> On Thu, Apr 30, 2020 at 1:30 AM Raji N  wrote:
>
> > Thanks so much, Jan. Will try your suggestions; yes, we are also running
> > Solr inside Docker.
> >
> > Thanks,
> > Raji
> >
> > On Wed, Apr 29, 2020 at 1:46 PM Jan Høydahl wrote:
> >
> > > I have seen the same, but only in Docker.
> > > I think it does not relate to Solr’s off-heap usage for filters and
> other
> > > data structures, but rather how Docker treats memory-mapped files as
> > > virtual memory.
> > > As you know, when using MMapDirectoryFactory, you actually let Linux
> > > handle the loading and unloading of the index files, and Solr will
> access
> > > them as if they were in a huge virtual memory pool. Naturally the index
> > > files grow large, and there is something strange going on in the way
> > Docker
> > > handles this, leading to OOM, not for Java heap but for the process.
> > >
> > > I have no definitive answer, but so far my research has found a few
> > > possible settings
> > >
> > > Set env.var MALLOC_ARENA_MAX=2
> > > Try to limit -XX:MaxDirectMemorySize
> > > Lower mem swappiness in Docker (--memory-swappiness 0)
> > > More generic insight into java mem allocation in Docker:
> > > https://dzone.com/articles/native-memory-allocation-in-examples
> > >
> > > Have not yet found a silver bullet, so very interested in this thread.

Re: off-heap OOM

2020-04-29 Thread Raji N
Thanks so much, Jan. Will try your suggestions; yes, we are also running
Solr inside Docker.

Thanks,
Raji

On Wed, Apr 29, 2020 at 1:46 PM Jan Høydahl  wrote:

> I have seen the same, but only in Docker.
> I think it does not relate to Solr’s off-heap usage for filters and other
> data structures, but rather how Docker treats memory-mapped files as
> virtual memory.
> As you know, when using MMapDirectoryFactory, you actually let Linux
> handle the loading and unloading of the index files, and Solr will access
> them as if they were in a huge virtual memory pool. Naturally the index
> files grow large, and there is something strange going on in the way Docker
> handles this, leading to OOM, not for Java heap but for the process.
>
> I have no definitive answer, but so far my research has found a few
> possible settings
>
> Set env.var MALLOC_ARENA_MAX=2
> Try to limit -XX:MaxDirectMemorySize
> Lower mem swappiness in Docker (--memory-swappiness 0)
> More generic insight into java mem allocation in Docker:
> https://dzone.com/articles/native-memory-allocation-in-examples
>
> Have not yet found a silver bullet, so very interested in this thread.
>
> Jan
>
> > On 29 Apr 2020 at 19:26, Raji N wrote:
> >
> > Thank you for your reply. When OOM happens, somehow it doesn't generate a
> > dump file, so we have hourly heap dumps running to diagnose this issue.
> > The heap is around 700MB and threads around 150, but 29GB of native memory
> > is used up; it is consumed by java.nio.DirectByteBufferR (27GB, the major
> > consumer) and java.nio.DirectByteBuffer objects.
> >
> > We use Solr 7.6.0 in SolrCloud mode and the OS is Alpine. Java version:
> >
> > java -version
> >
> > Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF8
> >
> > java version "1.8.0_211"
> >
> > Java(TM) SE Runtime Environment (build 1.8.0_211-b12)
> >
> > Java HotSpot(TM) 64-Bit Server VM (build 25.211-b12, mixed mode)
> >
> >
> >
> > Thanks much for taking a look at it.
> >
> > Raji
> >
> >
> >
> > On Wed, Apr 29, 2020 at 10:04 AM Shawn Heisey wrote:
> >
> >> On 4/29/2020 2:07 AM, Raji N wrote:
> >>> Has anyone encountered off-heap OOM? We are thinking of reducing the
> >>> heap further and increasing the hard commit interval. Any other
> >>> suggestions? Please share your thoughts.
> >>
> >> It sounds like it's not heap memory that's running out.
> >>
> >> When the OutOfMemoryError is logged, it will also contain a message
> >> mentioning which resource ran out.
> >>
> >> A common message that might be logged with the OOME is "Unable to create
> >> native thread".  This type of error, if that's what's happening,
> >> actually has nothing at all to do with memory, OOME is just how Java
> >> happens to report it.
> >>
> >> You will need to know exactly which resource is running out before we
> >> can offer any assistance.
> >>
> >> If the OOME is logged, the message you're looking for will be in the
> >> solr log, not the tiny special log that is created when Solr is killed
> >> by an OOME.  What version of Solr are you running, and what OS is it
> >> running on?
> >>
> >> Thanks,
> >> Shawn
> >>
>
>
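
To make Jan's three suggestions concrete, a minimal sketch, assuming a stock
bin/solr start inside Docker (values are illustrative, not tuned):

    # Cap the malloc arenas glibc creates per process (reduces native-memory
    # fragmentation). Note: this is a glibc knob; Alpine's musl ignores it.
    export MALLOC_ARENA_MAX=2

    # Cap direct (off-heap) buffer allocations, e.g. via SOLR_OPTS in solr.in.sh:
    SOLR_OPTS="$SOLR_OPTS -XX:MaxDirectMemorySize=1g"

    # Pin memory swappiness when starting the container:
    docker run --memory=48g --memory-swappiness=0 solr:7.6.0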


Re: off-heap OOM

2020-04-29 Thread Raji N
Thank you for your reply. When OOM happens, somehow it doesn't generate a
dump file, so we have hourly heap dumps running to diagnose this issue. The
heap is around 700MB and threads around 150, but 29GB of native memory is
used up; it is consumed by java.nio.DirectByteBufferR (27GB, the major
consumer) and java.nio.DirectByteBuffer objects.

We use Solr 7.6.0 in SolrCloud mode and the OS is Alpine. Java version:

java -version

Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF8

java version "1.8.0_211"

Java(TM) SE Runtime Environment (build 1.8.0_211-b12)

Java HotSpot(TM) 64-Bit Server VM (build 25.211-b12, mixed mode)



Thanks much for taking a look at it.

Raji



On Wed, Apr 29, 2020 at 10:04 AM Shawn Heisey  wrote:

> On 4/29/2020 2:07 AM, Raji N wrote:
> > Has anyone encountered off-heap OOM? We are thinking of reducing the heap
> > further and increasing the hard commit interval. Any other suggestions?
> > Please share your thoughts.
>
> It sounds like it's not heap memory that's running out.
>
> When the OutOfMemoryError is logged, it will also contain a message
> mentioning which resource ran out.
>
> A common message that might be logged with the OOME is "Unable to create
> native thread".  This type of error, if that's what's happening,
> actually has nothing at all to do with memory, OOME is just how Java
> happens to report it.
>
> You will need to know exactly which resource is running out before we
> can offer any assistance.
>
> If the OOME is logged, the message you're looking for will be in the
> solr log, not the tiny special log that is created when Solr is killed
> by an OOME.  What version of Solr are you running, and what OS is it
> running on?
>
> Thanks,
> Shawn
>
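
If it helps, the quickest check for the resource message Shawn describes is
to grep the main log; a sketch assuming the default log location (adjust the
path for your install):

    grep -n 'OutOfMemoryError' /var/solr/logs/solr.log | tail -5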


Re: IdleTimeout setting in Jetty (Solr 7.7.1)

2020-04-29 Thread Raji N
Try starting like this.

 bin/solr start -Dsolr.jetty.threads.idle.timeout=2000 -z localhost:2181



Hope this helps

Raji

On Sun, Apr 26, 2020 at 11:24 PM Kommu, Vinodh K.  wrote:

> Can someone shed some light on the requirement below?
>
> Thanks & Regards,
> Vinodh
>
> From: Kommu, Vinodh K.
> Sent: Friday, April 24, 2020 11:34 PM
> To: solr-user@lucene.apache.org
> Subject: IdleTimeout setting in Jetty (Solr 7.7.1)
>
> Hi,
>
> Our clients are running streaming expressions on a 140M-doc collection,
> which holds relatively huge data, and the query is not completing; it times
> out after 120 secs (the default timeout in the jetty*.xml files). We changed
> the default timeout from 120s to 300s, which worked fine. To change the
> default timeout setting we had to modify the jetty files in the installation
> directory. To avoid this, is there any option/way available in the Solr
> start arguments to overwrite the default idleTimeout value with a custom
> value without modifying the actual jetty*.xml files?
>
> Thanks & Regards,
> Vinodh
>
>
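
The override can also be made persistent instead of being passed on every
start. A minimal sketch, assuming solr.in.sh is the startup config your
install reads and that this property is the one your jetty*.xml actually
references (worth verifying in your version's jetty files):

    # In solr.in.sh; the value is in milliseconds (300000 = the 300s that worked):
    SOLR_OPTS="$SOLR_OPTS -Dsolr.jetty.threads.idle.timeout=300000"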


off-heap OOM

2020-04-29 Thread Raji N
Hi,

We are using SolrCloud 7.6.0 and we have containerized Solr. We have around
30 collections and 7 Solr nodes in the cluster. Though we have containerized,
we have one ZooKeeper container and one Solr container running per host.
We have a 24GB heap and the container has 49GB of memory in total, which
leaves 25GB off-heap. We have set:

max user processes  (-u) unlimited

virtual memory  (kbytes, -v) unlimited

file locks  (-x) unlimited

max memory size (kbytes, -m) unlimited


An OOM for Solr occurs every 5 days or so. When we examined heap dumps, the
heap is only around 700MB, but off-heap memory usage is 29GB.

The major consumer is java.nio.DirectByteBufferR.


Major reference chains:

8,820,117Kb (1462.3%): java.nio.DirectByteBufferR: 64 objects
  ↖ sun.misc.Cleaner.referent
  ↖ sun.misc.Cleaner.{prev}
  ↖ java.nio.DirectByteBuffer.cleaner
  ↖ java.nio.ByteBuffer[]
  ↖ sun.nio.ch.Util$BufferCache.buffers
  ↖ j.l.ThreadLocal$ThreadLocalMap$Entry.value
  ↖ j.l.ThreadLocal$ThreadLocalMap$Entry[]
  ↖ j.l.ThreadLocal$ThreadLocalMap.table
  ↖ j.l.Thread.threadLocals
  ↖ j.l.Thread[]
  ↖ j.l.ThreadGroup.threads
  ↖ j.l.ThreadGroup[]
  ↖ j.l.ThreadGroup.groups
  ↖ Java Static sun.rmi.runtime.NewThreadAction.systemThreadGroup

3,534,863Kb (586.0%): java.nio.DirectByteBufferR: 22 objects
  ↖ sun.misc.Cleaner.referent
  ↖ sun.misc.Cleaner.{next}
  ↖ sun.nio.fs.NativeBuffer.cleaner
  ↖ sun.nio.fs.NativeBuffer[]
  ↖ j.l.ThreadLocal$ThreadLocalMap$Entry.value
  ↖ j.l.ThreadLocal$ThreadLocalMap$Entry[]
  ↖ j.l.ThreadLocal$ThreadLocalMap.table
  ↖ j.l.Thread.threadLocals
  ↖ j.l.Thread[]
  ↖ j.l.ThreadGroup.threads
  ↖ j.l.ThreadGroup[]
  ↖ j.l.ThreadGroup.groups
  ↖ Java Static sun.rmi.runtime.NewThreadAction.systemThreadGroup

3,145,728Kb (521.5%): java.nio.DirectByteBufferR: 3 objects
  ↖ java.nio.ByteBuffer[]
  ↖ org.apache.lucene.store.ByteBufferIndexInput$MultiBufferImpl.buffers
  ↖ org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.fieldsStream
  ↖ org.apache.lucene.index.SegmentCoreReaders.fieldsReaderOrig
  ↖ org.apache.lucene.index.SegmentReader.core
  ↖ org.apache.lucene.index.SegmentReader[]
  ↖ org.apache.lucene.index.StandardDirectoryReader.subReaders
  ↖ org.apache.solr.search.SolrIndexSearcher.rawReader
  ↖ {j.u.concurrent.ConcurrentHashMap}.values
  ↖ org.apache.solr.core.SolrCore.infoRegistry
  ↖ {j.u.LinkedHashMap}.values
  ↖ org.apache.solr.core.SolrCores.cores
  ↖ org.apache.solr.core.CoreContainer.solrCores
  ↖ org.apache.solr.cloud.RecoveringCoreTermWatcher.coreContainer
  ↖ {j.u.HashSet}
  ↖ org.apache.solr.cloud.ZkShardTerms.listeners
  ↖ {j.u.concurrent.ConcurrentHashMap}.keys
  ↖ Java Static org.apache.solr.common.util.ObjectReleaseTracker.OBJECTS

2,605,258Kb (431.9%): java.nio.DirectByteBufferR: 184 objects
  ↖ org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl.curBuf
  ↖ org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.fieldsStream
  ↖ org.apache.lucene.index.SegmentCoreReaders.fieldsReaderOrig
  ↖ org.apache.lucene.index.SegmentReader.core
  ↖ org.apache.lucene.index.SegmentReader[]
  ↖ org.apache.lucene.index.StandardDirectoryReader.subReaders
  ↖ org.apache.solr.search.SolrIndexSearcher.rawReader
  ↖ {j.u.concurrent.ConcurrentHashMap}.values
  ↖ org.apache.solr.core.SolrCore.infoRegistry
  ↖ {j.u.LinkedHashMap}.values
  ↖ org.apache.solr.core.SolrCores.cores
  ↖ org.apache.solr.core.CoreContainer.solrCores
  ↖ org.apache.solr.cloud.RecoveringCoreTermWatcher.coreContainer
  ↖ {j.u.HashSet}
  ↖ org.apache.solr.cloud.ZkShardTerms.listeners
  ↖ {j.u.concurrent.ConcurrentHashMap}.keys
  ↖ Java Static org.apache.solr.common.util.ObjectReleaseTracker.OBJECTS

1,790,441Kb (296.8%): java.nio.DirectByteBufferR: 70 objects
  ↖ org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl.curBuf
  ↖ org.apache.lucene.codecs.lucene50.Lucene50CompoundReader.handle
  ↖ org.apache.lucene.index.SegmentCoreReaders.cfsReader
  ↖ org.apache.lucene.index.SegmentReader.core
  ↖ org.apache.lucene.index.SegmentReader[]
  ↖ org.apache.lucene.index.StandardDirectoryReader.subReaders
  ↖ org.apache.solr.search.SolrIndexSearcher.rawReader
  ↖ {j.u.concurrent.ConcurrentHashMap}.values
  ↖ org.apache.solr.core.SolrCore.infoRegistry
  ↖ {j.u.LinkedHashMap}.values
  ↖ org.apache.solr.core.SolrCores.cores
  ↖ org.apache.solr.core.CoreContainer.solrCores
  ↖ org.apache.solr.cloud.RecoveringCoreTermWatcher.coreContainer
  ↖ {j.u.HashSet}
  ↖ org.apache.solr.cloud.ZkShardTerms.listeners
  ↖ {j.u.concurrent.ConcurrentHashMap}.keys
  ↖ Java Static org.apache.solr.common.util.ObjectReleaseTracker.OBJECTS

1,385,471Kb (229.7%): java.nio.DirectByteBufferR: 85 objects
  ↖ sun.misc.Cleaner.referent
  ↖ sun.misc.Cleaner.{next}
  ↖ java.nio.DirectByteBuffer.cleaner
  ↖ java.nio.ByteBuffer[]
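
The first two chains above end in sun.nio.ch.Util$BufferCache, the JDK's
per-thread cache of temporary direct buffers. A hedged sketch of JVM flags
that can bound that growth on JDK 8u102+ (illustrative values, not a
confirmed fix for this case):

    # Cap the size of direct buffers each thread may cache (bytes):
    SOLR_OPTS="$SOLR_OPTS -Djdk.nio.maxCachedBufferSize=262144"
    # Cap total direct allocations so the JVM fails fast instead of growing:
    SOLR_OPTS="$SOLR_OPTS -XX:MaxDirectMemorySize=2g"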


Re: Solrcloud 7.6 OOM due to unable to create native threads

2020-03-31 Thread Raji N
Hi Erick,

What are your recommendations for a SolrCloud DR strategy?

Thanks,
Raji

On Sun, Mar 29, 2020 at 6:25 PM Erick Erickson wrote:

> I don’t recommend CDCR at this point; I think there are better approaches.
>
> The root problem is that CDCR uses tlog files as a queueing mechanism.
> If the connection between the DCs is broken for any reason, the tlogs grow
> without limit. This could probably be fixed, but a better alternative is to
> use something designed to ensure messages (updates) are delivered to
> separate DCs rather than try to have CDCR re-invent that wheel.
>
> Best,
> Erick
>
> > On Mar 29, 2020, at 6:47 PM, S G  wrote:
> >
> > Is CDCR even recommended to be used in production?
> > Or was it abandoned before it could become production-ready?
> >
> > Thanks
> > SG
> >
> >
> >> On Sun, Mar 29, 2020 at 5:18 AM Erick Erickson wrote:
> >
> >> What that error usually means is that there are a zillion threads
> >> running.
> >>
> >> Try taking a thread dump. It’s _probable_ that it’s CDCR, but
> >> take a look at the thread dump to see if you have lots of
> >> threads that are running. And by “lots” here, I mean 100s of threads
> >> that reference the same component, in this case that have cdcr in
> >> the stack trace.
> >>
> >> CDCR is not getting active work at this point; you might want to
> >> consider another replication strategy if you’re not willing to fix
> >> the code.
> >>
> >> Best,
> >> Erick
> >>
> >>> On Mar 29, 2020, at 4:17 AM, Raji N  wrote:
> >>>
> >>> Hi All,
> >>>
> >>> We are running SolrCloud 7.6 (with the patch
> >>> https://issues.apache.org/jira/secure/attachment/12969150/SOLR-11724.patch)
> >>> in production on 7 hosts in containers. The container memory is 48GB,
> >>> and the heap is 24GB.
> >>> is 24GB.
> >>> ulimit -v
> >>>
> >>> unlimited
> >>>
> >>> ulimit -m
> >>>
> >>> unlimited
> >>> We don't have any custom code in Solr. We have set up bidirectional CDCR
> >>> between the primary and secondary datacenters. Our secondary DC is very
> >>> unstable, and many times many instances are down.
> >>>
> >>> We get the below exception quite often. Is this because the CDCR
> >>> connection is broken?
> >>>
> >>> WARN  (cdcr-update-log-synchronizer-80-thread-1) [   ]
> >>> o.a.s.h.CdcrUpdateLogSynchronizer Caught unexpected exception
> >>>
> >>> java.lang.OutOfMemoryError: unable to create new native thread
> >>>
> >>>   at java.lang.Thread.start0(Native Method) ~[?:1.8.0_211]
> >>>   at java.lang.Thread.start(Thread.java:717) ~[?:1.8.0_211]
> >>>   at org.apache.http.impl.client.IdleConnectionEvictor.start(IdleConnectionEvictor.java:96) ~[httpclient-4.5.3.jar:4.5.3]
> >>>   at org.apache.http.impl.client.HttpClientBuilder.build(HttpClientBuilder.java:1219) ~[httpclient-4.5.3.jar:4.5.3]
> >>>   at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:319) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
> >>>   at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:330) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
> >>>   at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:268) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
> >>>   at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:255) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
> >>>   at org.apache.solr.client.solrj.impl.HttpSolrClient.<init>(HttpSolrClient.java:200) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]

Re: Solrcloud 7.6 OOM due to unable to create native threads

2020-03-31 Thread Raji N
Thanks Erick. I don't see anywhere that CDCR is not recommended for
production use. Took a thread dump; seeing about 140 CDCR threads:


"cdcr-replicator-219-thread-8" #787 prio=5 os_prio=0 tid=0x7f7c34009000
nid=0x50a waiting on condition [0x7f7ec871b000]

   java.lang.Thread.State: WAITING (parking)

at sun.misc.Unsafe.park(Native Method)

- parking to wait for  <0x0001da724ca0> (a
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)

at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)

at
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)

at
java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)

at
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)

at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)

at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)

at java.lang.Thread.run(Thread.java:748)




"cdcr-update-log-synchronizer-157-thread-1" #240 prio=5 os_prio=0
tid=0x7f8782543800 nid=0x2e5 waiting on condition [0x7f82ad99c000]

   java.lang.Thread.State: WAITING (parking)

at sun.misc.Unsafe.park(Native Method)

- parking to wait for  <0x0001d7f9e8e8> (a
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)

at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)

at
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)

at
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1081)

at
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)

at
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)

at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)

at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)

at java.lang.Thread.run(Thread.java:748)


Thanks,

Raji
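
For anyone reproducing this, a quick way to count those threads from a dump
(the PID and output path are illustrative):

    jstack 12345 > /tmp/solr-threads.txt
    grep -c 'cdcr-replicator' /tmp/solr-threads.txt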

On Sun, Mar 29, 2020 at 5:18 AM Erick Erickson wrote:

> What that error usually means is that there are a zillion threads running.
>
> Try taking a thread dump. It’s _probable_ that it’s CDCR, but
> take a look at the thread dump to see if you have lots of
> threads that are running. And by “lots” here, I mean 100s of threads
> that reference the same component, in this case that have cdcr in
> the stack trace.
>
> CDCR is not getting active work at this point; you might want to
> consider another replication strategy if you’re not willing to fix
> the code.
>
> Best,
> Erick
>
> > On Mar 29, 2020, at 4:17 AM, Raji N  wrote:
> >
> > Hi All,
> >
> > We are running SolrCloud 7.6 (with the patch
> > https://issues.apache.org/jira/secure/attachment/12969150/SOLR-11724.patch)
> > in production on 7 hosts in containers. The container memory is 48GB, and
> > the heap is 24GB.
> > ulimit -v
> >
> > unlimited
> >
> > ulimit -m
> >
> > unlimited
> > We don't have any custom code in Solr. We have set up bidirectional CDCR
> > between the primary and secondary datacenters. Our secondary DC is very
> > unstable, and many times many instances are down.
> >
> > We get the below exception quite often. Is this because the CDCR
> > connection is broken?
> >
> > WARN  (cdcr-update-log-synchronizer-80-thread-1) [   ]
> > o.a.s.h.CdcrUpdateLogSynchronizer Caught unexpected exception
> >
> > java.lang.OutOfMemoryError: unable to create new native thread
> >
> >   at java.lang.Thread.start0(Native Method) ~[?:1.8.0_211]
> >   at java.lang.Thread.start(Thread.java:717) ~[?:1.8.0_211]
> >   at org.apache.http.impl.client.IdleConnectionEvictor.start(IdleConnectionEvictor.java:96) ~[httpclient-4.5.3.jar:4.5.3]
> >   at org.apache.http.impl.client.HttpClientBuilder.build(HttpClientBuilder.java:1219) ~[httpclient-4.5.3.jar:4.5.3]
> >   at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:319) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
> >   at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:330) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]

Solrcloud 7.6 OOM due to unable to create native threads

2020-03-29 Thread Raji N
Hi All,

We are running SolrCloud 7.6 (with the patch
https://issues.apache.org/jira/secure/attachment/12969150/SOLR-11724.patch)
in production on 7 hosts in containers. The container memory is 48GB, and the
heap is 24GB.
ulimit -v

unlimited

ulimit -m

unlimited
We don't have any custom code in Solr. We have set up bidirectional CDCR
between the primary and secondary datacenters. Our secondary DC is very
unstable, and many times many instances are down.

We get the below exception quite often. Is this because the CDCR connection
is broken?

 WARN  (cdcr-update-log-synchronizer-80-thread-1) [   ]
o.a.s.h.CdcrUpdateLogSynchronizer Caught unexpected exception

java.lang.OutOfMemoryError: unable to create new native thread

   at java.lang.Thread.start0(Native Method) ~[?:1.8.0_211]
   at java.lang.Thread.start(Thread.java:717) ~[?:1.8.0_211]
   at org.apache.http.impl.client.IdleConnectionEvictor.start(IdleConnectionEvictor.java:96) ~[httpclient-4.5.3.jar:4.5.3]
   at org.apache.http.impl.client.HttpClientBuilder.build(HttpClientBuilder.java:1219) ~[httpclient-4.5.3.jar:4.5.3]
   at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:319) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
   at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:330) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
   at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:268) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
   at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:255) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
   at org.apache.solr.client.solrj.impl.HttpSolrClient.<init>(HttpSolrClient.java:200) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
   at org.apache.solr.client.solrj.impl.HttpSolrClient$Builder.build(HttpSolrClient.java:957) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
   at org.apache.solr.handler.CdcrUpdateLogSynchronizer$UpdateLogSynchronisation.run(CdcrUpdateLogSynchronizer.java:139) [solr-core-7.6.0.jar:7.6.0-SNAPSHOT 34d82ed033cccd8120431b73e93554b85b24a278 - i843100 - 2019-09-30 14:02:46]
   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_211]
   at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_211]
   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_211]
   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_211]
   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_211]
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_211]
   at java.lang.Thread.run(Thread.java:748) [?:1.8.0_211]

 Thanks,
 Raji