Sijie,

Thank you very much!

We will be trying all of this.
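
For the logging suggestion, our plan for now is simply to raise the noisy
logger to WARN in log4j.properties until the fixes land. A minimal sketch of
what we have in mind (assuming the noisy logger is the PerChannelBookieClient
class that appears in the stack traces below; adjust the package/class name if
it differs in your build):

    log4j.logger.org.apache.bookkeeper.proto.PerChannelBookieClient=WARN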

We’ve also encountered some issues with the manual recovery tool.


First, it's supposed to be able to take two arguments (the ZK node and the BK
node) and then pick a random target bookie for each replicated ledger, but
running it with just those two arguments crashes with an
IndexOutOfBoundsException while parsing the arguments.
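
For reference, this is roughly how we invoke it (class name and argument order
reproduced from memory, so treat this as an illustrative sketch rather than the
exact command; the ZooKeeper address is made up):

    bin/bookkeeper org.apache.bookkeeper.tools.BookKeeperTools zk1:2181 10.3.2.158:3181

i.e. just the ZooKeeper servers and the source bookie, which is the form that
dies with the IndexOutOfBoundsException.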


Secondly, even when we specify the target bookie to replicate ledgers to, it
still just distributes the ledgers randomly across all nodes.
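
For comparison, the three-argument form we tried looks roughly like this (again
only a sketch; 10.3.2.57 is simply one of our live bookies, picked for
illustration):

    bin/bookkeeper org.apache.bookkeeper.tools.BookKeeperTools zk1:2181 10.3.2.158:3181 10.3.2.57:3181

We expected everything to be re-replicated to 10.3.2.57, yet the ledgers still
end up spread across the cluster.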


Again, thank you very much for all the help.

Cheers,

Sebastian

On Mon, Nov 28, 2016 at 5:52 PM Sijie Guo <[email protected]> wrote:

> Sebastian,
>
> Sorry for the late response. (I didn't have network access over the weekend.)
>
> Let me try to answer your questions in sequence and then I will suggest a
> solution for you.
>
> *Re "2016-11-25 13:55:36,775 - ERROR - [New I/O boss
> #9:PerChannelBookieClient$1@**255] - Could not connect to bookie: [id:
> 0x3dc4f6b8]/10.3.3.41:3181 <http://10.3.3.41:3181>, current state
> CONNECTING :"*
>
> The logging here is expected. It is a bit noisy. I created
> https://issues.apache.org/jira/browse/BOOKKEEPER-978  to turn that
> logging into debug.
>
> *Re "The strange thing is that 10.3.3.41 is the autorecovery node, it
> seems to be trying to replicate data to itself, is this ok?"*
>
> It is a problem when running autorecovery on separate nodes, as it will use
> itself as the target bookie. This is a bug and we've fixed it at
> https://github.com/twitter/bookkeeper/commit/fc7e171135a58cddb9e91f5f614b3ceb6f9f9fee
> . I will port this change to master in
> https://issues.apache.org/jira/browse/BOOKKEEPER-977
>
> *Re "Is it normal for the recovery processes to run out of direct memory?"*
>
> I don't think it is normal. I suspect it might be related to replicating too
> many ledgers at the same time. Do you mind me asking how many Pulsar topics
> you have?
>
>
> *Re "*
> *2016-11-25 18:15:44,792 - INFO  - 
> [main-EventThread:**LedgerFragmentReplicator@139]
> - Fragment :Fragment(LedgerID: 14401, FirstEntryID: 0[-1],
> LastKnownEntryID: -1[-1], Host: 10.3.2.*
> *158:3181, Closed: true) is split into sub fragments :[Fragment(LedgerID:
> 14401, FirstEntryID: 0[-1], LastKnownEntryID: -1[-1], Host: 10.3.2.158:3181
> <http://10.3.2.158:3181>, Closed: true)]*
> *2016-11-25 18:15:44,792 - ERROR - [main-EventThread:*
> *LedgerFragmentReplicator$2@251**] - *
>
> *BK error reading ledger entry: -1"*
> You are right. The replicator worker should skip replicating empty
> ledgers. (https://issues.apache.org/jira/browse/BOOKKEEPER-979) I am
> going to port the fix today.
>
>
> I would suggest ignoring the logging or setting the logging level to WARN for
> now. Then try to run the recovery tool to replicate to a specific bookie
> (specify the target bookie), or try to run autorecovery as a daemon inside the
> bookie server for now. Once I've finished porting the fixes for the above
> JIRAs, I will let you know, and you can try the fixes from master to run
> autorecovery as separate daemons.
>
> Let me know if this helps.
>
> - Sijie
>
>
>
>
>
>
>
> On Fri, Nov 25, 2016 at 6:19 AM, Sebastián Schepens <
> [email protected]> wrote:
>
> Guys,
> Still seeing weird behavior.
> Trying autorecovery on separate nodes: they've been running for a long time
> just spitting error logs while apparently moving data, but they never seem to
> finish and eventually throw OutOfDirectMemory errors (we have assigned 20G of
> direct memory to them...).
>
> I have now started autorecovery processes of version 4.4 (the rest of the
> cluster is 4.3); they throw the following errors and do not seem to be moving
> data, and all nodes' I/O metrics are flat:
>
> 2016-11-25 13:55:36,775 - ERROR - [New I/O boss
> #9:PerChannelBookieClient$1@255] - Could not connect to bookie: [id:
> 0x3dc4f6b8]/10.3.3.41:3181, current state CONNECTING :
> java.net.ConnectException: Connection refused: /10.3.3.41:3181
>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>         at
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
>         at
> org.jboss.netty.channel.socket.nio.NioClientBoss.connect(NioClientBoss.java:152)
>         at
> org.jboss.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:105)
>         at
> org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:79)
>         at
> org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
>         at
> org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)
>         at
> org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
>         at
> org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
>         at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
>
> 2016-11-25 13:46:46,458 - ERROR -
> [BookKeeperClientWorker-orderedsafeexecutor-1-0:BookkeeperInternalCallbacks$MultiCallback@125]
> - Error in multi callback : -8
> 2016-11-25 13:46:46,458 - ERROR -
> [BookKeeperClientWorker-orderedsafeexecutor-1-0:LedgerFragmentReplicator$2$1@274]
> - BK error writing entry for ledgerId: 9449, entryId: 2536703, b
> ookie: 10.3.3.41:3181
>
> org.apache.bookkeeper.client.BKException$BKBookieHandleNotAvailableException
>         at
> org.apache.bookkeeper.client.BKException.create(BKException.java:62)
>         at
> org.apache.bookkeeper.client.LedgerFragmentReplicator$2$1.writeComplete(LedgerFragmentReplicator.java:279)
>         at
> org.apache.bookkeeper.proto.BookieClient$1$1.safeRun(BookieClient.java:189)
>         at
> org.apache.bookkeeper.util.SafeRunnable.run(SafeRunnable.java:31)
>         at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
>
> The strange thing is that 10.3.3.41 is the autorecovery node; it seems to be
> trying to replicate data to itself. Is this OK?
>
> It's also attempting to connect to the failed node:
>
> 2016-11-25 13:55:36,777 - ERROR - [New I/O boss
> #9:PerChannelBookieClient$1@255] - Could not connect to bookie: [id:
> 0xb1b27b5b]/10.3.2.158:3181, current state CONNECTING :
> java.net.ConnectException: Connection refused: /10.3.2.158:3181
>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>         at
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
>         at
> org.jboss.netty.channel.socket.nio.NioClientBoss.connect(NioClientBoss.java:152)
>         at
> org.jboss.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:105)
>         at
> org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:79)
>         at
> org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
>         at
> org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)
>         at
> org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
>         at
> org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
>         at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
>
> If I execute a manual recovery I get the same errors:
> 2016-11-25 14:16:56,438 - ERROR - [New I/O boss
> #9:PerChannelBookieClient$1@255] - Could not connect to bookie: [id:
> 0x3e258d5b]/10.3.2.158:3181, current state CONNECTING :
> java.net.ConnectException: Connection refused: /10.3.2.158:3181
>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>         at
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
>         at
> org.jboss.netty.channel.socket.nio.NioClientBoss.connect(NioClientBoss.java:152)
>         at
> org.jboss.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:105)
>         at
> org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:79)
>         at
> org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
>         at
> org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)
>         at
> org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
>         at
> org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
>         at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
>
> But it seems to be moving data now.
>
> Could you guys give me some insight?
> Are these logs normal? If so, why are they at the ERROR level? Why is the
> autorecovery process trying to connect to itself instead of replicating data
> to other nodes? Is there a way to track the progress of the recovery process?
> Is it normal for the recovery processes to run out of direct memory?
>
> Thanks,
> Sebastian
>
>
> On Wed, Nov 23, 2016 at 11:33 AM Sebastián Schepens <
> [email protected]> wrote:
>
> Sijie,
> Yes, that's precisely what I meant, we're running separate autorecovery
> processes, not daemons on all nodes.
>
> The autorecovery processes run quietly until I stop a node; as soon as I do,
> they're plagued with logs like the following, where the stopped node is
> 10.3.2.56:
>
> 2016-11-22 17:34:06,085 - ERROR -
> [bookkeeper-io-1-1:PerChannelBookieClient$2@284] - Could not connect to
> bookie: [id: 0xd3b0c759, L:/10.3.3.42:45164]/10.3.2.56:3181, current
> state CONNECTING :
> java.net.ConnectException: syscall:getsockopt(...): /10.3.2.56:3181
>
> There seem to be waves of thousands and thousands of these logs while some
> data movement seems to be occurring, but it's really weird that it keeps
> trying to connect to the failed node. Couldn't it tell that the node is down
> from the fact that it's no longer shown as available in ZooKeeper?
>
> We also see a few of the following logs, but very few of them compared to the
> previous ones.
>
> 2016-11-23 14:28:01,661 - WARN  -
> [ReplicationWorker:RackawareEnsemblePlacementPolicy@543] - Failed to
> choose a bookie: excluded [<Bookie:10.3.2.57:3181>, <Bookie:
> 10.3.2.195:3181>
> , <Bookie:10.3.2.158:3181>], fallback to choose bookie randomly from the
> cluster.
>
>
> The cluster currently has 6 nodes, and as I said before we're using
> ensemble size 3, write quorum 3 and ack quorum 2.
>
> Thanks,
> Sebastian
>
> On Tue, Nov 22, 2016 at 2:10 PM Sijie Guo <[email protected]> wrote:
>
> I think what Sebastian said is that manual recovery didn't even work, which
> seems a bit strange to me. Autorecovery will check whether the bookie is
> available or not; after that, it should re-replicate the data from the other
> nodes in the ensemble. This seems to indicate something is broken.
> Sebastian, can you point us to some logs?
>
> Sijie
>
> On Nov 19, 2016 9:46 AM, "Rithin Shetty" <[email protected]> wrote:
>
> A few things to note: make sure 'autoRecoveryDaemonEnabled' is set to true in
> all the bookie conf files; by default this is false, and recovery will not
> work otherwise. The autorecovery process tries to make sure that the data
> doesn't exist on the source node before replicating to the destination. That
> might be the reason why it is trying to talk to the source node.
>
> --Rithin
>
> On Fri, Nov 18, 2016 at 12:00 PM, Sebastián Schepens <
> [email protected]> wrote:
>
> Hi guys,
> I'm running into some issues while trying to recover a decommissioned node.
> Both the recovery command and autorecovery processes fail trying to
> connect to the failing node, which seems reasonable because the node is
> down.
> But I don't get why it's trying to connect to that node; according to the
> documentation it should pull ledger data from the other nodes in the ensemble
> (3) and replicate it.
> The recovery command also seems to completely ignore the destination node
> given as the third argument.
>
> Could someone give us some help?
> Thanks,
> Sebastian
>
>
>
>
