[ 
https://issues.apache.org/jira/browse/KUDU-2589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-2589:
------------------------------
    Component/s: test

> ITClient is flaky under stress when leadership changes
> ------------------------------------------------------
>
>                 Key: KUDU-2589
>                 URL: https://issues.apache.org/jira/browse/KUDU-2589
>             Project: Kudu
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 1.8.0
>            Reporter: William Berkeley
>            Priority: Major
>
> Saw this failure in the ITClient test:
> {{noformat}}
> 00:34:51.362 [DEBUG - New I/O worker #10] (AsyncKuduScanner.java:492) Can not open scanner
> org.apache.kudu.client.NonRecoverableException: Tablet hasn't heard from leader, or there hasn't been a stable leader for: 0.757s secs, (max is 0.750s):
>       at org.apache.kudu.client.RpcProxy.dispatchTSError(RpcProxy.java:320)
>       at org.apache.kudu.client.RpcProxy.responseReceived(RpcProxy.java:242)
>       at org.apache.kudu.client.RpcProxy.access$000(RpcProxy.java:59)
>       at org.apache.kudu.client.RpcProxy$1.call(RpcProxy.java:131)
>       at org.apache.kudu.client.RpcProxy$1.call(RpcProxy.java:127)
>       at org.apache.kudu.client.Connection.messageReceived(Connection.java:391)
>       at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
>       at org.apache.kudu.client.Connection.handleUpstream(Connection.java:243)
>       at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
>       at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
>       at org.jboss.netty.handler.timeout.ReadTimeoutHandler.messageReceived(ReadTimeoutHandler.java:184)
>       at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
>       at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
>       at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
>       at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
>       at org.jboss.netty.handler.codec.oneone.OneToOneDecoder.handleUpstream(OneToOneDecoder.java:70)
>       at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
>       at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
>       at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
>       at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462)
>       at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443)
>       at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
>       at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
>       at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
>       at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
>       at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
>       at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
>       at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
>       at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
>       at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
>       at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
>       at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
>       at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
>       at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at java.lang.Thread.run(Thread.java:745)
> {{noformat}}
> A new leader had been elected just before:
> {{noformat}}
> 00:34:50.953 [INFO - cluster stderr printer] (MiniKuduCluster.java:526) I0925 
> 00:34:50.953722 18257 catalog_manager.cc:3758] T 
> 07e99535cba24d8e991829485de22275 P 060cf49269fb4f6f901696d741e69303 reported 
> cstate change: term changed from 1 to 2, leader changed from 
> acaedc4ec505489dbc853d8b32bfc147 (127.16.196.2) to 
> 060cf49269fb4f6f901696d741e69303 (127.16.196.3). New cstate: current_term: 2 
> leader_uuid: "060cf49269fb4f6f901696d741e69303" committed_config { 
> opid_index: -1 OBSOLETE_local: false peers { permanent_uuid: 
> "acaedc4ec505489dbc853d8b32bfc147" member_type: VOTER last_known_addr { host: 
> "127.16.196.2" port: 58760 } health_report { overall_health: UNKNOWN } } 
> peers { permanent_uuid: "060cf49269fb4f6f901696d741e69303" member_type: VOTER 
> last_known_addr { host: "127.16.196.3" port: 37181 } health_report { 
> overall_health: HEALTHY } } peers { permanent_uuid: 
> "2f0ee942338c4fb7bce908b1c32f0bcb" member_type: VOTER last_known_addr { host: 
> "127.16.196.1" port: 47865 } health_report { overall_health: UNKNOWN } } }
> {{noformat}}
> However, it seems the 0.75-second window in which the async scanner waits 
> for a leader partially overlapped with the election, and the remainder of 
> that window was not long enough to learn which replica was the new leader.
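
The timing problem described above can be illustrated with a generic retry loop. This is only a minimal sketch of how a caller might tolerate a short leaderless window by retrying with backoff until a larger time budget runs out. The names LeaderRetrySketch, NoStableLeaderException, and retryUntilDeadline are hypothetical and are not part of the Kudu client API, and the sketch is not necessarily how this issue was or should be resolved.

```java
import java.time.Duration;
import java.util.concurrent.Callable;

/** Hypothetical helper for illustration; not part of the Kudu client API. */
public final class LeaderRetrySketch {

    /** Hypothetical stand-in for a "no stable leader yet" error. */
    public static final class NoStableLeaderException extends Exception {
        public NoStableLeaderException(String message) {
            super(message);
        }
    }

    private LeaderRetrySketch() {}

    /**
     * Runs op, retrying with exponential backoff while it reports that
     * there is no stable leader, until the time budget is used up.
     * Any other exception is propagated immediately.
     */
    public static <T> T retryUntilDeadline(
            Callable<T> op, Duration budget, Duration initialBackoff) throws Exception {
        long deadlineNanos = System.nanoTime() + budget.toNanos();
        long backoffMillis = Math.max(1, initialBackoff.toMillis());
        while (true) {
            try {
                return op.call();
            } catch (NoStableLeaderException e) {
                long remainingMillis = (deadlineNanos - System.nanoTime()) / 1_000_000;
                if (remainingMillis <= 0) {
                    throw e; // budget exhausted: surface the last error
                }
                // Never sleep past the deadline.
                Thread.sleep(Math.min(backoffMillis, remainingMillis));
                backoffMillis *= 2;
            }
        }
    }
}
```

With a budget comfortably longer than an election (an assumption that depends on the cluster's configuration), an operation that fails once or twice during a leadership change would succeed on a later attempt instead of failing the test.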



--
This message was sent by Atlassian Jira
(v8.3.4#803005)