Il lun 12 mar 2018, 19:37 Sijie Guo <guosi...@gmail.com> ha scritto: > Thanks Enrico! > > On Mon, Mar 12, 2018 at 4:21 AM, Enrico Olivelli <eolive...@gmail.com> > wrote: > > > Summary of my findings: > > > > The problem is about clients which get messed up and are not able to read > > and write to bookies after rolling restarts of an application, > > the problem appears only on a cluster of 6 machines (reduced to 3 in > order > > to narrow down the search) of my colleagues which are performing "manual" > > tests on a new version of an application, no changes at BK client level > are > > present in the application > > > > Does the problem only exist in that cluster? Or have you tried other > machines with same setup? >
I have only those machines which reproduce the issue. We have three other similar clusters which do not have the problem. My colleagues are comparing the machines but they are all on the same vmware subsystem, apparently the only difference as far as I can know at the moment is that that cluster have a higher application load, not directly on BK, but CPUs are busier. My colleagues are excluding hardware issues but currently they are still investigating. It is interesting that there is a way to trigger the problem systematically, that is by restarting one or at most two bookies. It is interesting that the problems is on 'readers' and it seems that the PCBC seems corrupted and even writes (if the broker is promoted to 'leader') are able to go on after the reads broke the client. I was trying to reproduce the issue on my laptop without success. I will do further investigation on BK code in next days Stay tuned Enrico > As what you said, you didn't see this error in your production. I am > wondering the corruption is coming from your hardware/os. > > > - Sijie > > > > > > > I am able to reproduce the problem quite simply on the those machines, it > > is enough to restart bookies to mess the cluster, this is my reproducer: > > 1) there is a client which is continuously writing to BK, on a WAL (same > > pattern as in the BK tutorial) > > 2) there are two other clients (other machines) which are 'tailing' the > WAL > > (as in the tutorial), these are the 'readers' which have the problem > > 3) there are 3 bookies, co-located in the same machines as the clients > > 4) running BK 4.6.1 on both clients and servers, using v3 protocol > > 5) ledgers are written with WQ=2 and AQ=2 (over a cluster of 3) > > 6) restart one or two bookies and the readers start being prevented to > > read: readers start receiving invalid v2 responses (com.google.protobuf. > > InvalidProtocolBufferException: Protocol message contained an invalid tag > > (zero)) [before Sijie's patch this was "Invalid opcode..."] > > 7) on bookie side we have the error below, > > "com.google.protobuf.InvalidProtocolBufferException > > $InvalidWireTypeException: > > Protocol message tag had invalid wire type", > > BookieProtoEncoding#RequestDecoder falls back to v2 protocol and start > > answering with wrong encoding > > > > I am not able to create a reproducer on BK test suite. > > > > Notes: > > - when I "restart" bookies I issue a kill -9 (I think this could be the > > reason why I can't reproduce the issue on testcases) > > - the issue starts when bookie comes up again > > - when a reader is promoted to leader (writer) it is no more able to > write > > entries, all writes time out > > - I am running on Linux with Epool, no Auth, no TLS > > - Ivan took a look and some network dumps and found corrupted messages > > (thank you Ivan !!) > > > > Beside Additional (not important) notes from the history of this email > > thread: > > - I have never seen this kind of errors in production, and we have been > > running 4.6.1 in production for a month > > - I am running with jdk9 but even reverting to jdk8 the problem is the > same > > - I was running with Netty 4.1.21Final but I have reverted to 4.1.12Final > > and the problem is the same > > > > I think it may be an issue on PerChannelBookieClient, but to me it is not > > very clear how "connection resets" are handled, I am digging into code. > > > > Bolow there is the stacktrace or the error on bookie side > > > > (line numbers may not match 4.6.1 tag because I have added plenty of > debug, > > but we are at the point when BookieProtoEncoding#RequestDecoder switches > > to > > v2 protocol) > > > > Thank you very much to all of you that is helping with this problem > > > > Enrico > > > > > > 18-03-12-11-08-34 error decoding msg PooledSlicedByteBuf(ridx: 145, > > widx: 145, cap: 145/145, unwrapped: PooledUnsafeDirectByteBuf(ridx: 149, > > widx: 298, cap: 65536)) > > 18-03-12-11-08-34 > > com.google.protobuf.InvalidProtocolBufferException > > $InvalidWireTypeException: > > Protocol message tag had invalid wire type. > > com.google.protobuf.InvalidProtocolBufferException > > $InvalidWireTypeException: > > Protocol message tag had invalid wire type. > > at > > com.google.protobuf.InvalidProtocolBufferException.invalidWireType( > > InvalidProtocolBufferException.java:115) > > at > > com.google.protobuf.UnknownFieldSet$Builder.mergeFieldFrom( > > UnknownFieldSet.java:551) > > at > > com.google.protobuf.GeneratedMessageV3.parseUnknownField( > > GeneratedMessageV3.java:293) > > at > > org.apache.bookkeeper.proto.BookkeeperProtocol$AddRequest. > > <init>(BookkeeperProtocol.java:4536) > > at > > org.apache.bookkeeper.proto.BookkeeperProtocol$AddRequest. > > <init>(BookkeeperProtocol.java:4497) > > at > > org.apache.bookkeeper.proto.BookkeeperProtocol$AddRequest$ > > 1.parsePartialFrom(BookkeeperProtocol.java:5378) > > at > > org.apache.bookkeeper.proto.BookkeeperProtocol$AddRequest$ > > 1.parsePartialFrom(BookkeeperProtocol.java:5373) > > at > > com.google.protobuf.CodedInputStream$StreamDecoder.readMessage( > > CodedInputStream.java:2362) > > at > > org.apache.bookkeeper.proto.BookkeeperProtocol$Request.< > > init>(BookkeeperProtocol.java:1367) > > at > > org.apache.bookkeeper.proto.BookkeeperProtocol$Request.< > > init>(BookkeeperProtocol.java:1296) > > at > > > org.apache.bookkeeper.proto.BookkeeperProtocol$Request$1.parsePartialFrom( > > BookkeeperProtocol.java:3265) > > at > > > org.apache.bookkeeper.proto.BookkeeperProtocol$Request$1.parsePartialFrom( > > BookkeeperProtocol.java:3260) > > at > > com.google.protobuf.AbstractParser.parsePartialFrom( > > AbstractParser.java:221) > > at com.google.protobuf.AbstractParser.parseFrom( > > AbstractParser.java:239) > > at com.google.protobuf.AbstractParser.parseFrom( > > AbstractParser.java:49) > > at > > com.google.protobuf.GeneratedMessageV3.parseWithIOException( > > GeneratedMessageV3.java:320) > > at > > org.apache.bookkeeper.proto.BookkeeperProtocol$Request. > > parseFrom(BookkeeperProtocol.java:1905) > > at > > > org.apache.bookkeeper.proto.BookieProtoEncoding$RequestEnDecoderV3.decode( > > BookieProtoEncoding.java:309) > > at > > org.apache.bookkeeper.proto.BookieProtoEncoding$RequestDecoder.decode( > > BookieProtoEncoding.java:408) > > at > > io.netty.handler.codec.MessageToMessageDecoder.channelRead( > > MessageToMessageDecoder.java:88) > > at > > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead( > > AbstractChannelHandlerContext.java:362) > > at > > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead( > > AbstractChannelHandlerContext.java:348) > > at > > io.netty.channel.AbstractChannelHandlerContext.fireChannelRead( > > AbstractChannelHandlerContext.java:340) > > at > > io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead( > > ByteToMessageDecoder.java:310) > > at > > io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead( > > ByteToMessageDecoder.java:297) > > at > > io.netty.handler.codec.ByteToMessageDecoder.callDecode( > > ByteToMessageDecoder.java:413) > > at > > io.netty.handler.codec.ByteToMessageDecoder.channelRead( > > ByteToMessageDecoder.java:265) > > at > > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead( > > AbstractChannelHandlerContext.java:362) > > at > > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead( > > AbstractChannelHandlerContext.java:348) > > at > > io.netty.channel.AbstractChannelHandlerContext.fireChannelRead( > > AbstractChannelHandlerContext.java:340) > > at > > io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead( > > DefaultChannelPipeline.java:1334) > > at > > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead( > > AbstractChannelHandlerContext.java:362) > > at > > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead( > > AbstractChannelHandlerContext.java:348) > > at > > io.netty.channel.DefaultChannelPipeline.fireChannelRead( > > DefaultChannelPipeline.java:926) > > at > > io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe. > > epollInReady(AbstractEpollStreamChannel.java:979) > > at > > io.netty.channel.epoll.EpollEventLoop.processReady( > > EpollEventLoop.java:401) > > at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:306) > > at > > io.netty.util.concurrent.SingleThreadEventExecutor$5. > > run(SingleThreadEventExecutor.java:858) > > at java.base/java.lang.Thread.run(Thread.java:844) > > > > > > > > > > 2018-03-12 11:17 GMT+01:00 Enrico Olivelli <eolive...@gmail.com>: > > > > > I will send a report soon. > > > With new debug I have some finding, I am looking into problems during > > > restarts of bookies. Maybe there is some problem in error handling in > > PCBC. > > > > > > Thank you > > > Enrico > > > > > > 2018-03-12 10:58 GMT+01:00 Ivan Kelly <iv...@apache.org>: > > > > > >> Enrico, could you summarize what the state of things is now? What are > > >> you running, what problems are you seeing and how are the problems > > >> manifesting themselves. > > >> > > >> Regards, > > >> Ivan > > >> > > >> On Mon, Mar 12, 2018 at 10:15 AM, Enrico Olivelli < > eolive...@gmail.com> > > >> wrote: > > >> > Applyed Sijie's fixes and added some debug: > > >> > > > >> > Problem is triggered when you restart a bookie (I have a cluster of > 3 > > >> > bookies, WQ = 2 and AQ = 2) > > >> > > > >> > Below a new error on client side ("tailing" reader) > > >> > > > >> > Enrico > > >> > > > >> > this is a new error on client side: > > >> > 18-03-12-09-11-45 Unexpected exception caught by bookie client > > >> channel > > >> > handler > > >> > 18-03-12-09-11-45 io.netty.handler.codec.DecoderException: > > >> > com.google.protobuf.InvalidProtocolBufferException: Protocol message > > >> > contained an invalid tag (zero). > > >> > io.netty.handler.codec.DecoderException: > > >> > com.google.protobuf.InvalidProtocolBufferException: Protocol message > > >> > contained an invalid tag (zero). > > >> > at > > >> > io.netty.handler.codec.MessageToMessageDecoder.channelRead(M > > >> essageToMessageDecoder.java:98) > > >> > at > > >> > io.netty.channel.AbstractChannelHandlerContext.invokeChannel > > >> Read(AbstractChannelHandlerContext.java:362) > > >> > at > > >> > io.netty.channel.AbstractChannelHandlerContext.invokeChannel > > >> Read(AbstractChannelHandlerContext.java:348) > > >> > at > > >> > io.netty.channel.AbstractChannelHandlerContext.fireChannelRe > > >> ad(AbstractChannelHandlerContext.java:340) > > >> > at > > >> > io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead( > > >> ByteToMessageDecoder.java:310) > > >> > at > > >> > io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead( > > >> ByteToMessageDecoder.java:297) > > >> > at > > >> > io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteT > > >> oMessageDecoder.java:413) > > >> > at > > >> > io.netty.handler.codec.ByteToMessageDecoder.channelRead(Byte > > >> ToMessageDecoder.java:265) > > >> > at > > >> > io.netty.channel.AbstractChannelHandlerContext.invokeChannel > > >> Read(AbstractChannelHandlerContext.java:362) > > >> > at > > >> > io.netty.channel.AbstractChannelHandlerContext.invokeChannel > > >> Read(AbstractChannelHandlerContext.java:348) > > >> > at > > >> > io.netty.channel.AbstractChannelHandlerContext.fireChannelRe > > >> ad(AbstractChannelHandlerContext.java:340) > > >> > at > > >> > io.netty.channel.DefaultChannelPipeline$HeadContext. > > >> channelRead(DefaultChannelPipeline.java:1334) > > >> > at > > >> > io.netty.channel.AbstractChannelHandlerContext.invokeChannel > > >> Read(AbstractChannelHandlerContext.java:362) > > >> > at > > >> > io.netty.channel.AbstractChannelHandlerContext.invokeChannel > > >> Read(AbstractChannelHandlerContext.java:348) > > >> > at > > >> > io.netty.channel.DefaultChannelPipeline.fireChannelRead(Defa > > >> ultChannelPipeline.java:926) > > >> > at > > >> > io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStrea > > >> mUnsafe.epollInReady(AbstractEpollStreamChannel.java:979) > > >> > at > > >> > io.netty.channel.epoll.EpollEventLoop.processReady(EpollEven > > >> tLoop.java:401) > > >> > at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop. > > >> java:306) > > >> > at > > >> > io.netty.util.concurrent.SingleThreadEventExecutor$5.run( > > >> SingleThreadEventExecutor.java:858) > > >> > at java.base/java.lang.Thread.run(Thread.java:844) > > >> > Caused by: com.google.protobuf.InvalidProtocolBufferException: > > Protocol > > >> > message contained an invalid tag (zero). > > >> > at > > >> > com.google.protobuf.InvalidProtocolBufferException. > > >> invalidTag(InvalidProtocolBufferException.java:105) > > >> > at > > >> > com.google.protobuf.CodedInputStream$StreamDecoder.readTag(C > > >> odedInputStream.java:2060) > > >> > at > > >> > org.apache.bookkeeper.proto.BookkeeperProtocol$ReadResponse. > > >> <init>(BookkeeperProtocol.java:10009) > > >> > at > > >> > org.apache.bookkeeper.proto.BookkeeperProtocol$ReadResponse. > > >> <init>(BookkeeperProtocol.java:9975) > > >> > at > > >> > org.apache.bookkeeper.proto.BookkeeperProtocol$ReadResponse$ > > >> 1.parsePartialFrom(BookkeeperProtocol.java:10869) > > >> > at > > >> > org.apache.bookkeeper.proto.BookkeeperProtocol$ReadResponse$ > > >> 1.parsePartialFrom(BookkeeperProtocol.java:10864) > > >> > at > > >> > com.google.protobuf.CodedInputStream$StreamDecoder. > > >> readMessage(CodedInputStream.java:2362) > > >> > at > > >> > org.apache.bookkeeper.proto.BookkeeperProtocol$Response.<ini > > >> t>(BookkeeperProtocol.java:7852) > > >> > at > > >> > org.apache.bookkeeper.proto.BookkeeperProtocol$Response.<ini > > >> t>(BookkeeperProtocol.java:7782) > > >> > at > > >> > org.apache.bookkeeper.proto.BookkeeperProtocol$Response$1.pa > > >> rsePartialFrom(BookkeeperProtocol.java:9887) > > >> > at > > >> > org.apache.bookkeeper.proto.BookkeeperProtocol$Response$1.pa > > >> rsePartialFrom(BookkeeperProtocol.java:9882) > > >> > at > > >> > com.google.protobuf.AbstractParser.parsePartialFrom(Abstract > > >> Parser.java:221) > > >> > at com.google.protobuf.AbstractParser.parseFrom(AbstractParser. > > >> java:239) > > >> > at com.google.protobuf.AbstractParser.parseFrom(AbstractParser. > > >> java:49) > > >> > at > > >> > com.google.protobuf.GeneratedMessageV3.parseWithIOException( > > >> GeneratedMessageV3.java:320) > > >> > at > > >> > org.apache.bookkeeper.proto.BookkeeperProtocol$Response.pars > > >> eFrom(BookkeeperProtocol.java:8454) > > >> > at > > >> > org.apache.bookkeeper.proto.BookieProtoEncoding$ResponseEnDe > > >> coderV3.decode(BookieProtoEncoding.java:329) > > >> > at > > >> > org.apache.bookkeeper.proto.BookieProtoEncoding$ResponseDeco > > >> der.decode(BookieProtoEncoding.java:470) > > >> > at > > >> > io.netty.handler.codec.MessageToMessageDecoder.channelRead(M > > >> essageToMessageDecoder.java:88) > > >> > ... 19 more > > >> > > > >> > > > >> > 2018-03-09 22:37 GMT+01:00 Sijie Guo <guosi...@gmail.com>: > > >> > > > >> >> Enrico, > > >> >> > > >> >> I would suggest you applied my fixes and then debug from there. In > > this > > >> >> way, you will have a better sense where the first corruption is > from. > > >> >> > > >> >> Sijie > > >> >> > > >> >> On Fri, Mar 9, 2018 at 11:48 AM Enrico Olivelli < > eolive...@gmail.com > > > > > >> >> wrote: > > >> >> > > >> >> > Il ven 9 mar 2018, 19:30 Enrico Olivelli <eolive...@gmail.com> > ha > > >> >> scritto: > > >> >> > > > >> >> > > Thank you Ivan! > > >> >> > > I hope I did not mess up the dump and added ZK ports. We are > not > > >> using > > >> >> > > standard ports and in that 3 machines there is also the 3 > nodes > > zk > > >> >> > > ensemble which is supporting BK and all the other parts of the > > >> >> > application > > >> >> > > > > >> >> > > So one explanation would be that something is connecting to the > > >> bookie > > >> >> > and > > >> >> > > this makes the bookie switch in a corrupted state by double > > >> releasing a > > >> >> > > bytebuf? > > >> >> > > > > >> >> > > > >> >> > I did some experiments and it is easy to reproduce the bookie > side > > >> error > > >> >> > and the double release with a forged sequence of bytes (just > using > > nc > > >> >> from > > >> >> > the shell) > > >> >> > But this seems not enough to break the bookie. > > >> >> > I guess there is some corruption on client side and the error on > > the > > >> >> bookie > > >> >> > is only and effect, as Ivan is saying. > > >> >> > My colleagues left the system running with a deep level of debug > > >> during > > >> >> > next weekend, hopefully we will get some other stacktrace > > >> >> > > > >> >> > Enrico > > >> >> > > > >> >> > > > >> >> > > Enrico > > >> >> > > > > >> >> > > > > >> >> > > Il ven 9 mar 2018, 18:23 Ivan Kelly <iv...@apache.org> ha > > scritto: > > >> >> > > > > >> >> > >> I need to sign off for the day. I've done some analysis of a > > >> tcpdump > > >> >> > >> enrico sent to me out of band (may contain sensitive info so > > best > > >> not > > >> >> > >> to post on public forum). > > >> >> > >> > > >> >> > >> I've attached a dump of just first bit of the header. Format > is > > >> >> > >> <sequence in dump> <whether a request or response>(<remote > > port>) > > >> >> > >> <hexdump of payload> > > >> >> > >> > > >> >> > >> There are definitely corrupt packets coming from somewhere. > > >> Search for > > >> >> > >> lines with CORRUPT. > > >> >> > >> > > >> >> > >> <snip> > > >> >> > >> 0247 - req (049546) - 00:00:00:08:ff:ff:ff:fe:00:00:00:0b > > >> CORRUPT > > >> >> > >> </snip> > > >> >> > >> > > >> >> > >> It's not clear whether these are originating at a valid client > > or > > >> not. > > >> >> > >> These trigger corrupt responses from the server, which I guess > > is > > >> the > > >> >> > >> double free manifesting itself. Strangely the > > >> >> > >> corrupt message seems to have a lot of data in common with > what > > >> seems > > >> >> > >> like an ok message (it's clearer on fixed width font). > > >> >> > >> > > >> >> > >> <snip> > > >> >> > >> 0248 - resp(049720) - > > >> >> > >> > > >> >> > >> > > >> >> > 00:00:00:54:00:03:00:89:00:00:02:86:00:07:e2:b1:00:00:00:00: > > >> >> 00:00:02:86:00:05:e9:76:00:00 > > >> >> > >> 0249 - resp(049546) - > > >> >> > >> 00:00:00:10:ff:ff:ff:fe:00:00:02:86:00:07:e2:b1:00:00:00:00 > > >> >> CORRUPT > > >> >> > >> </snip> > > >> >> > >> > > >> >> > >> There's also some other weird traffic. Correct BK protobuf > > traffic > > >> >> > >> should be <4 bytes len>:00:03:.... > > >> >> > >> There seems to be other traffic which is being accepted at the > > >> same > > >> >> > >> port, but looks like ZK traffic. > > >> >> > >> > > >> >> > >> Anyhow, I'll dig more on monday. > > >> >> > >> > > >> >> > >> -Ivan > > >> >> > >> > > >> >> > >> > > >> >> > >> On Fri, Mar 9, 2018 at 3:27 PM, Ivan Kelly <iv...@apache.org> > > >> wrote: > > >> >> > >> > On Fri, Mar 9, 2018 at 3:20 PM, Enrico Olivelli < > > >> >> eolive...@gmail.com> > > >> >> > >> wrote: > > >> >> > >> >> Bookies > > >> >> > >> >> 10.168.10.117:1822 -> bad bookie with 4.1.21 > > >> >> > >> >> 10.168.10.116:1822 -> bookie with 4.1.12 > > >> >> > >> >> 10.168.10.118:1281 -> bookie with 4.1.12 > > >> >> > >> >> > > >> >> > >> >> 10.168.10.117 client machine on which I have 4.1.21 client > > >> >> (different > > >> >> > >> >> process than the bookie one) > > >> >> > >> > Oh. This dump won't have the stream we need then, as that > will > > >> be on > > >> >> > >> > loopback. Try adding "-i any" to the tcpdump. Sorry, I > didn't > > >> >> realize > > >> >> > >> > your clients and servers are colocated. > > >> >> > >> > > > >> >> > >> > -Ivan > > >> >> > >> > > >> >> > > -- > > >> >> > > > > >> >> > > > > >> >> > > -- Enrico Olivelli > > >> >> > > > > >> >> > -- > > >> >> > > > >> >> > > > >> >> > -- Enrico Olivelli > > >> >> > > > >> >> > > >> > > > > > > > > > -- -- Enrico Olivelli