Christopher, FYI, I have created an issue on the Accumulo GitHub page:
https://github.com/apache/accumulo/issues/1578

Thanks,
Karthick

On Wed, 1 Apr 2020 at 11:58, karthick rn <karthick.narend...@gmail.com>
wrote:

> Hi Christopher, I talked through this issue with Keith internally but
> haven't raised it in an official channel for discussion, because I suspected
> the issue could be related to the ZK / Netty framework, as we only see it
> after enabling TLS.
> Maybe I should have opened an issue on the Accumulo GitHub first. I'm doing
> this now.
>
> Thanks,
> Karthick
>
> On Wed, 1 Apr 2020 at 08:51, Christopher <ctubb...@apache.org> wrote:
>
>> Karthick, I haven't seen this discussed in the Accumulo community. Can you
>> point me to the conversation there?
>>
>> On Wed, Apr 1, 2020 at 2:19 AM Andor Molnar <an...@apache.org> wrote:
>>
>> > Why would they need to be daemon threads?
>> > I’m not an expert in Java threading, but AFAIK I/O threads should not be
>> > daemon threads in most cases.
>> >
>> > Also, those threads are Netty internal threads, so this question is
>> > better asked in the Netty community.
>> > The ZK threads reported in jstack are just waiting for input to
>> > send/receive.
>> > Do you know at which point Accumulo gets stuck?
>> >
>> > Andor
>> >
>> >
>> >
>> > > On 2020. Mar 31., at 14:27, karthick rn <karthick.narend...@gmail.com>
>> > > wrote:
>> > >
>> > > Hi Enrico,
>> > >
>> > > Yes, I have already run this past the Accumulo folks. They looked at the
>> > > jstack output and advised checking with the ZK devs whether those 2
>> > > threads (#27 & #30) are expected to be non-daemon threads.
>> > > Also, in this cluster we have wire encryption enabled only for ZK, and
>> > > with it disabled we don't encounter this issue.
>> > > There are no error messages reported in the ZK server log. Below are the
>> > > INFO messages logged when running the "accumulo-service master start"
>> > > command:
>> > >
>> > > 2020-03-31 12:16:28,626 [myid:2] - INFO [nioEventLoopGroup-7-5:X509AuthenticationProvider@172] - Authenticated Id 'CN=host2' for Scheme 'x509'
>> > >
>> > > 2020-03-31 12:16:28,676 [myid:2] - INFO [nioEventLoopGroup-7-5:ZooKeeperServer@1095] - got auth packet /<host2 IP>:46332
>> > >
>> > > 2020-03-31 12:16:28,676 [myid:2] - INFO [nioEventLoopGroup-7-5:ZooKeeperServer@1113] - auth success /<host2 IP>:46332
>> > >
>> > > This issue is reproducible every time I start the Accumulo master. Let me
>> > > know if you need any further details.
>> > >
>> > > Many thanks
>> > >
>> > > Regards,
>> > > Karthick
>> > >
>> > >
>> > >
>> > >
>> > >
>> > >
>> > >
>> > > On Tue, 31 Mar 2020 at 07:23, Enrico Olivelli <eolive...@gmail.com>
>> > > wrote:
>> > >
>> > >> Hi,
>> > >> Did you check with the Accumulo community?
>> > >> Do you see errors or informational messages in ZK server logs?
>> > >>
>> > >> Enrico
>> > >>
>> > >> On Tue, 31 Mar 2020 at 01:12, karthick rn <karthick.narend...@gmail.com>
>> > >> wrote:
>> > >>
>> > >>> Hello dev team,
>> > >>>
>> > >>> We are using Hadoop, Accumulo & ZooKeeper in our environment. After
>> > >>> enabling TLS for ZK, we noticed that starting the Accumulo master service
>> > >>> hangs in an intermediate process, as shown below, and we have to kill
>> > >>> that process in order for the Accumulo master to start.
>> > >>>
>> > >>> [user1@host1 ~]$ jps -m
>> > >>>
>> > >>> 23314 JournalNode
>> > >>>
>> > >>> 23011 NameNode
>> > >>>
>> > >>> 23539 DFSZKFailoverController
>> > >>>
>> > >>> *84118 Main org.apache.accumulo.master.state.SetGoalState NORMAL*
>> > >>>
>> > >>> 22590 QuorumPeerMain
>> > >>>
>> > >>> 89790 Jps -m
>> > >>>
>> > >>>
>> > >>> [user1@host1 ~]$ *kill -9 84118*
>> > >>>
>> > >>> [user1@host1 ~]$ jps -m
>> > >>>
>> > >>> 23314 JournalNode
>> > >>>
>> > >>> 23011 NameNode
>> > >>>
>> > >>> 23539 DFSZKFailoverController
>> > >>>
>> > >>> 89892 Jps -m
>> > >>>
>> > >>> *89847 Main master*
>> > >>>
>> > >>> 22590 QuorumPeerMain
>> > >>>
>> > >>> [user1@host1 ~]$
>> > >>>
>> > >>> The jstack output collected during the hang shows 2 non-daemon threads
>> > >>> (#27 & #30), while the rest are daemon threads. I would like to check
>> > >>> with the dev team whether "nioEventLoopGroup" threads are expected to be
>> > >>> non-daemon. If so, any thoughts on what else might be causing the issue?
>> > >>> I have copied only a portion of the jstack output; let me know in case
>> > >>> you need the full output. FYI, I'm using Apache ZooKeeper 3.5.7, Hadoop
>> > >>> 3.2.1 & Accumulo 2.0. Let me know if you need any further details. Many
>> > >>> thanks.
>> > >>>
>> > >>>
>> > >>> "org.apache.accumulo.master.state.SetGoalState-SendThread(host1:2281)" #25 daemon prio=5 os_prio=0 cpu=127.90ms elapsed=95.38s tid=0x0000000003a10800 nid=0x1624e waiting on condition  [0x00007f5c7bd67000]
>> > >>>   java.lang.Thread.State: TIMED_WAITING (parking)
>> > >>> at jdk.internal.misc.Unsafe.park(java.base@11.0.6/Native Method)
>> > >>> - parking to wait for  <0x000000070f0acf38> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>> > >>> at java.util.concurrent.locks.LockSupport.parkNanos(java.base@11.0.6/LockSupport.java:234)
>> > >>> at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(java.base@11.0.6/AbstractQueuedSynchronizer.java:2123)
>> > >>> at java.util.concurrent.LinkedBlockingDeque.pollFirst(java.base@11.0.6/LinkedBlockingDeque.java:513)
>> > >>> at java.util.concurrent.LinkedBlockingDeque.poll(java.base@11.0.6/LinkedBlockingDeque.java:675)
>> > >>> at org.apache.zookeeper.ClientCnxnSocketNetty.doTransport(ClientCnxnSocketNetty.java:278)
>> > >>> at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223)
>> > >>>
>> > >>>   Locked ownable synchronizers:
>> > >>> - None
>> > >>>
>> > >>> "org.apache.accumulo.master.state.SetGoalState-EventThread" #26 daemon prio=5 os_prio=0 cpu=0.88ms elapsed=95.38s tid=0x0000000003a16800 nid=0x1624f waiting on condition  [0x00007f5c7bc66000]
>> > >>>   java.lang.Thread.State: WAITING (parking)
>> > >>> at jdk.internal.misc.Unsafe.park(java.base@11.0.6/Native Method)
>> > >>> - parking to wait for  <0x000000070f0f8af0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>> > >>> at java.util.concurrent.locks.LockSupport.park(java.base@11.0.6/LockSupport.java:194)
>> > >>> at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(java.base@11.0.6/AbstractQueuedSynchronizer.java:2081)
>> > >>> at java.util.concurrent.LinkedBlockingQueue.take(java.base@11.0.6/LinkedBlockingQueue.java:433)
>> > >>> at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:506)
>> > >>>
>> > >>>   Locked ownable synchronizers:
>> > >>> - None
>> > >>>
>> > >>> "*nioEventLoopGroup-2-1*" #27 prio=10 os_prio=0 cpu=718.22ms elapsed=95.28s tid=0x0000000003cd2800 nid=0x16250 runnable  [0x00007f5c7d6b9000]
>> > >>>   java.lang.Thread.State: RUNNABLE
>> > >>> at sun.nio.ch.EPoll.wait(java.base@11.0.6/Native Method)
>> > >>> at sun.nio.ch.EPollSelectorImpl.doSelect(java.base@11.0.6/EPollSelectorImpl.java:120)
>> > >>> at sun.nio.ch.SelectorImpl.lockAndDoSelect(java.base@11.0.6/SelectorImpl.java:124)
>> > >>> - locked <0x000000070f079a28> (a io.netty.channel.nio.SelectedSelectionKeySet)
>> > >>> - locked <0x000000070f06ca80> (a sun.nio.ch.EPollSelectorImpl)
>> > >>> at sun.nio.ch.SelectorImpl.select(java.base@11.0.6/SelectorImpl.java:141)
>> > >>> at io.netty.channel.nio.SelectedSelectionKeySetSelector.select(SelectedSelectionKeySetSelector.java:68)
>> > >>> at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:803)
>> > >>> at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:457)
>> > >>> at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
>> > >>> at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
>> > >>> at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>> > >>> at java.lang.Thread.run(java.base@11.0.6/Thread.java:834)
>> > >>>
>> > >>>   Locked ownable synchronizers:
>> > >>> - None
>> > >>>
>> > >>>
>> > >>> "org.apache.accumulo.master.state.SetGoalState-SendThread(host2:2281)" #28 daemon prio=5 os_prio=0 cpu=8.27ms elapsed=94.31s tid=0x0000000005942800 nid=0x16259 waiting on condition  [0x00007f5c7b145000]
>> > >>>   java.lang.Thread.State: TIMED_WAITING (parking)
>> > >>> at jdk.internal.misc.Unsafe.park(java.base@11.0.6/Native Method)
>> > >>> - parking to wait for  <0x000000070f2acbc0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>> > >>> at java.util.concurrent.locks.LockSupport.parkNanos(java.base@11.0.6/LockSupport.java:234)
>> > >>> at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(java.base@11.0.6/AbstractQueuedSynchronizer.java:2123)
>> > >>> at java.util.concurrent.LinkedBlockingDeque.pollFirst(java.base@11.0.6/LinkedBlockingDeque.java:513)
>> > >>> at java.util.concurrent.LinkedBlockingDeque.poll(java.base@11.0.6/LinkedBlockingDeque.java:675)
>> > >>> at org.apache.zookeeper.ClientCnxnSocketNetty.doTransport(ClientCnxnSocketNetty.java:278)
>> > >>> at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223)
>> > >>>
>> > >>>   Locked ownable synchronizers:
>> > >>> - None
>> > >>>
>> > >>> "org.apache.accumulo.master.state.SetGoalState-EventThread" #29 daemon prio=5 os_prio=0 cpu=0.25ms elapsed=94.31s tid=0x0000000005943800 nid=0x1625a waiting on condition  [0x00007f5c7b044000]
>> > >>>   java.lang.Thread.State: WAITING (parking)
>> > >>> at jdk.internal.misc.Unsafe.park(java.base@11.0.6/Native Method)
>> > >>> - parking to wait for  <0x000000070f2adff8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>> > >>> at java.util.concurrent.locks.LockSupport.park(java.base@11.0.6/LockSupport.java:194)
>> > >>> at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(java.base@11.0.6/AbstractQueuedSynchronizer.java:2081)
>> > >>> at java.util.concurrent.LinkedBlockingQueue.take(java.base@11.0.6/LinkedBlockingQueue.java:433)
>> > >>> at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:506)
>> > >>>
>> > >>>   Locked ownable synchronizers:
>> > >>> - None
>> > >>>
>> > >>> "*nioEventLoopGroup-3-1*" #30 prio=10 os_prio=0 cpu=297.70ms elapsed=94.30s tid=0x00000000044af800 nid=0x1625b runnable  [0x00007f5c7b646000]
>> > >>>   java.lang.Thread.State: RUNNABLE
>> > >>> at sun.nio.ch.EPoll.wait(java.base@11.0.6/Native Method)
>> > >>> at sun.nio.ch.EPollSelectorImpl.doSelect(java.base@11.0.6/EPollSelectorImpl.java:120)
>> > >>> at sun.nio.ch.SelectorImpl.lockAndDoSelect(java.base@11.0.6/SelectorImpl.java:124)
>> > >>> - locked <0x000000070f2ab868> (a io.netty.channel.nio.SelectedSelectionKeySet)
>> > >>> - locked <0x000000070f2ab640> (a sun.nio.ch.EPollSelectorImpl)
>> > >>> at sun.nio.ch.SelectorImpl.select(java.base@11.0.6/SelectorImpl.java:141)
>> > >>> at io.netty.channel.nio.SelectedSelectionKeySetSelector.select(SelectedSelectionKeySetSelector.java:68)
>> > >>> at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:803)
>> > >>> at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:457)
>> > >>> at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
>> > >>> at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
>> > >>> at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>> > >>> at java.lang.Thread.run(java.base@11.0.6/Thread.java:834)
>> > >>>
>> > >>>   Locked ownable synchronizers:
>> > >>> - None
>> > >>>
>> > >>
>> >
>> >
>>
>
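
Note on the daemon question raised in this thread: a JVM only exits on its own once every non-daemon thread has finished (or something calls System.exit). A still-running non-daemon thread, such as the two nioEventLoopGroup threads in the dump above that carry no "daemon" marker, could therefore be enough to keep SetGoalState alive after its main work is done. The following is a minimal, self-contained Java sketch of that behaviour and of one way to list such threads from inside a process. It is not code from Accumulo, ZooKeeper, or Netty; the class name, method names, and thread names are all hypothetical, and it uses only the JDK.

```java
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.stream.Collectors;

public class NonDaemonThreads {

    /** Returns the names of all live threads that are NOT daemon threads. */
    public static List<String> liveNonDaemonThreadNames() {
        return Thread.getAllStackTraces().keySet().stream()
                .filter(t -> t.isAlive() && !t.isDaemon())
                .map(Thread::getName)
                .sorted()
                .collect(Collectors.toList());
    }

    /** Starts a thread that blocks on the latch, standing in for an idle event loop. */
    public static Thread startBlockedThread(String name, boolean daemon, CountDownLatch release) {
        Thread t = new Thread(() -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, name);
        t.setDaemon(daemon); // the daemon flag must be set before start()
        t.start();
        return t;
    }

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch release = new CountDownLatch(1);
        // Hypothetical thread names, chosen only for this illustration.
        Thread daemon = startBlockedThread("hypothetical-daemon-worker", true, release);
        Thread loop = startBlockedThread("hypothetical-event-loop", false, release);

        // If main() returned at this point, the JVM would stay up because
        // "hypothetical-event-loop" is a live non-daemon thread.
        System.out.println("Non-daemon threads still alive: " + liveNonDaemonThreadNames());

        // Releasing the latch plays the role of shutting the event loop down.
        release.countDown();
        loop.join();
        daemon.join();
    }
}
```

In the sketch, the blocked daemon thread never appears in the list, while the blocked non-daemon thread does until it is released. Whether the client-side Netty event loop in this report is expected to be shut down explicitly, and by whom, is the open question of the thread and is not answered by this example.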
