java.io.EOFException

2017-05-26 Thread I PVP
How can I recover from the following error, which started happening after a 
server crash?
ZooKeeper won't start, and the following messages show up repeatedly in the 
log.

2017-05-27 01:02:08,072 [myid:] - INFO  [main:Environment@100] - Server 
environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2017-05-27 01:02:08,072 [myid:] - INFO  [main:Environment@100] - Server 
environment:java.io.tmpdir=/tmp
2017-05-27 01:02:08,072 [myid:] - INFO  [main:Environment@100] - Server 
environment:java.compiler=
2017-05-27 01:02:08,072 [myid:] - INFO  [main:Environment@100] - Server 
environment:os.name=Linux
2017-05-27 01:02:08,072 [myid:] - INFO  [main:Environment@100] - Server 
environment:os.arch=amd64
2017-05-27 01:02:08,073 [myid:] - INFO  [main:Environment@100] - Server 
environment:os.version=3.10.0-514.16.1.el7.x86_64
2017-05-27 01:02:08,073 [myid:] - INFO  [main:Environment@100] - Server 
environment:user.name=zookeeper
2017-05-27 01:02:08,073 [myid:] - INFO  [main:Environment@100] - Server 
environment:user.home=/opt/zookeeper
2017-05-27 01:02:08,073 [myid:] - INFO  [main:Environment@100] - Server 
environment:user.dir=/
2017-05-27 01:02:08,074 [myid:] - INFO  [main:ZooKeeperServer@829] - tickTime 
set to 2000
2017-05-27 01:02:08,074 [myid:] - INFO  [main:ZooKeeperServer@838] - 
minSessionTimeout set to -1
2017-05-27 01:02:08,074 [myid:] - INFO  [main:ZooKeeperServer@847] - 
maxSessionTimeout set to -1
2017-05-27 01:02:08,080 [myid:] - INFO  [main:NIOServerCnxnFactory@89] - 
binding to port 0.0.0.0/0.0.0.0:2181
2017-05-27 01:02:08,385 [myid:] - ERROR [main:Util@239] - Last transaction was 
partial.
2017-05-27 01:02:08,400 [myid:] - ERROR [main:Util@239] - Last transaction was 
partial.
2017-05-27 01:02:08,403 [myid:] - ERROR [main:Util@239] - Last transaction was 
partial.
2017-05-27 01:02:08,403 [myid:] - ERROR [main:Util@239] - Last transaction was 
partial.
2017-05-27 01:02:08,404 [myid:] - ERROR [main:Util@239] - Last transaction was 
partial.
2017-05-27 01:02:08,404 [myid:] - ERROR [main:ZooKeeperServerMain@64] - 
Unexpected exception, exiting abnormally
java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at 
org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
at 
org.apache.zookeeper.server.persistence.FileHeader.deserialize(FileHeader.java:64)
at 
org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.inStreamCreated(FileTxnLog.java:585)
at 
org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.createInputArchive(FileTxnLog.java:604)
at 
org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.goToNextLog(FileTxnLog.java:570)
at 
org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:652)
at 
org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:166)
at 
org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:223)
at 
org.apache.zookeeper.server.ZooKeeperServer.loadData(ZooKeeperServer.java:283)
at 
org.apache.zookeeper.server.ZooKeeperServer.startdata(ZooKeeperServer.java:410)
at 
org.apache.zookeeper.server.NIOServerCnxnFactory.startup(NIOServerCnxnFactory.java:118)
at 
org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:119)
at 
org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:87)
at 
org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:53)
at 
org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:116)
at 
org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
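
From the stack trace it looks like startup dies while reading a transaction
log file header, so my guess is that the crash left a log.* file in the
dataLogDir that is truncated (too short to even hold the 16-byte file header).
Here is a rough sketch I put together to list such files before deciding what
to do with them (the directory path is illustrative, not our actual
dataLogDir):

import java.io.File;

// Rough sketch (not a fix): list transaction log files that are too short to
// contain the 16-byte file header (magic, version, dbid), which is what
// FileHeader.deserialize() fails to read in the stack trace above.
// The directory below is illustrative; substitute the real dataLogDir/version-2.
public class FindTruncatedTxnLogs {
    public static void main(String[] args) {
        File logDir = new File("/opt/zookeeper/data/version-2");
        File[] files = logDir.listFiles();
        if (files == null) {
            System.err.println("Cannot list " + logDir);
            return;
        }
        for (File f : files) {
            // Transaction logs are named log.<zxid>
            if (f.getName().startsWith("log.") && f.length() < 16) {
                System.out.println("Suspect file (shorter than the header): "
                        + f + " (" + f.length() + " bytes)");
            }
        }
    }
}

I'm not sure that simply moving such a file aside (after a full backup of the
data directory) is the right recovery, which is why I'm asking here.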


Any help is appreciated.

Best,
IPVP


Re: Recovering from zxid rollover

2017-05-26 Thread Patrick Hunt
On Wed, May 24, 2017 at 8:08 AM, Mike Heffner  wrote:

> On Tue, May 23, 2017 at 10:21 PM, Patrick Hunt  wrote:
>
> > On Tue, May 23, 2017 at 3:47 PM, Mike Heffner  wrote:
> >
> > > Hi,
> > >
> > > I'm curious what the best practices are for handling zxid rollover in a
> > > ZK ensemble. We have a few five-node ZK ensembles (some 3.4.8 and some
> > > 3.3.6), and they periodically roll over their zxid. We see the following
> > > in the system logs on the leader node:
> > >
> > > 2017-05-22 12:54:14,117 [myid:15] - ERROR [ProcessThread(sid:15
> > > cport:-1)::ZooKeeperCriticalThread@49] - Severe unrecoverable error, from
> > > thread : ProcessThread(sid:15 cport:-1):
> > > org.apache.zookeeper.server.RequestProcessor$RequestProcessorException:
> > > zxid lower 32 bits have rolled over, forcing re-election, and therefore
> > > new epoch start
> > >
> > > From my best understanding of the code, this exception will end up
> > > causing the leader to enter shutdown():
> > >
> > > https://github.com/apache/zookeeper/blob/09cd5db55446a4b390f82e3548b929f19e33430d/src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java#L464-L464
> > >
> > > This shuts down the ZooKeeper instance so it stops servicing requests,
> > > but the JVM is still actually running. What we experience is that while
> > > this ZK instance is still running, the remaining follower nodes can't
> > > re-elect a leader (at least within 15 mins) and quorum is offline. Our
> > > remediation so far has been to restart the original leader node, at
> > > which point the cluster recovers.
> > >
> > > The two questions I have are:
> > >
> > > 1. Should the remaining 4 nodes be able to re-elect a leader after zxid
> > > rollover without intervention (restarting)?
> > >
> > >
> > Hi Mike.
> >
> > That is the intent. Originally the epoch would roll over and cause the
> > cluster to hang (similar to what you are reporting); the JIRA is here:
> > https://issues.apache.org/jira/browse/ZOOKEEPER-1277
> > However, the patch, which calls shutdown on the leader, was intended to
> > force a re-election before the epoch could roll over.
> >
>
> Should the leader JVM actually exit during this shutdown, thereby allowing
> the init system to restart it?
>
>
IIRC it should not be necessary, but it's been some time since I looked at
it.


>
> >
> >
> > > 2. If the leader enters shutdown() state after a zxid rollover, is there
> > > any scenario where it will return to started? If not, how are others
> > > handling this scenario -- maybe a healthcheck that kills/restarts an
> > > instance that is in shutdown state?
> > >
> > >
> > I have run into very few people who have seen the zxid rollover, and
> > testing under real conditions is not easily done. We have unit tests, but
> > that code is just not exercised sufficiently in everyday use. You're not
> > seeing what's intended; please create a JIRA and include any additional
> > details you can (e.g. config, logs).
> >
>
> Sure, I've opened one here:
> https://issues.apache.org/jira/browse/ZOOKEEPER-2791
>
>
> >
> > What I heard people (well, really one user; I have personally only seen
> > this at one site) were doing prior to 1277 was monitoring the epoch
> > number, and when it got close to rolling over (within 10%, say) they would
> > force the current leader to restart by restarting the process. The intent
> > of 1277 was to effectively do this automatically.
> >
>
> We are looking at doing something similar, maybe once a week finding the
> current leader and restarting it. From testing, this quickly re-elects a new
> leader and resets the zxid to zero, so it should avoid the rollover that
> occurs after a few weeks of uptime.
>
>
Exactly. This is pretty much the same scenario that I've seen in the past,
along with a similar workaround.
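
In case it's useful, here is a rough, untested sketch of what that kind of
check could look like using the 'srvr' four letter word on the client port
(the host, port and 90% threshold below are illustrative):

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Rough sketch: ask a server for its current zxid via "srvr" and report how
// close the low 32 bits (the part that rolls over) are to wrapping around.
public class ZxidRolloverCheck {
    public static void main(String[] args) throws Exception {
        try (Socket s = new Socket("zk-leader.example.com", 2181)) {
            OutputStream out = s.getOutputStream();
            out.write("srvr".getBytes(StandardCharsets.US_ASCII));
            out.flush();
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(s.getInputStream(), StandardCharsets.US_ASCII));
            String line;
            while ((line = in.readLine()) != null) {
                if (line.startsWith("Zxid:")) {
                    long zxid = Long.decode(line.split("\\s+")[1]);
                    long counter = zxid & 0xffffffffL;
                    double used = counter / (double) 0xffffffffL;
                    System.out.printf("zxid counter at %.1f%% of rollover%n", used * 100);
                    if (used > 0.90) {
                        System.out.println("Close to rollover; consider restarting the leader.");
                    }
                }
            }
        }
    }
}

Something like that could be run periodically against the current leader (the
server whose srvr output reports "Mode: leader") and used to trigger a
restart of the leader process before the counter gets too close to the limit.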

You might want to take a look at the work Benedict Jin has done here:
https://issues.apache.org/jira/browse/ZOOKEEPER-2789
Given that you are seeing this so frequently, it might be something you could
collaborate on with the author of the patch? I have not looked at it in
great detail, but it may allow you to run longer w/o seeing the issue. I
have not thought through all the implications though... (including b/w
compat).

Patrick


>
> >
> > Patrick
> >
> >
> > >
> > > Cheers,
> > >
> > > Mike
> > >
> > >
> >
>
> Mike
>


Re: Yet another "two datacenter" discussion

2017-05-26 Thread Shawn Heisey
On 5/26/2017 9:48 AM, Jordan Zimmerman wrote:
> In ZK 3.4.x if you have configuration differences amongst your instances you 
> are susceptible to a split brain. See this email thread, "Rolling Config 
> Change Considered Harmful":
>
> http://zookeeper-user.578899.n2.nabble.com/Rolling-config-change-considered-harmful-td7578761.html
>
> In ZK 3.5.x I'm not even sure it would work. 

Thank you for your reply.

I don't fully understand everything being discussed in that thread, but
it sounds like bad things could happen once connectivity is restored. 
If DC1 and DC2 were both operational from a client perspective, but
unable to communicate with each other, I think the potential for bad
things would be even higher, because there could be confusion about
which Solr servers are leaders, as well as which ZK server is the leader.

Thanks,
Shawn



Re: Yet another "two datacenter" discussion

2017-05-26 Thread Jordan Zimmerman
In ZK 3.4.x if you have configuration differences amongst your instances you 
are susceptible to a split brain. See this email thread, "Rolling Config Change 
Considered Harmful":

http://zookeeper-user.578899.n2.nabble.com/Rolling-config-change-considered-harmful-td7578761.html


In ZK 3.5.x I'm not even sure it would work. 

-JZ

> On May 26, 2017, at 5:43 PM, Shawn Heisey  wrote:
> 
> I feel fairly certain that this thread will be an annoyance.  I don't
> know enough about zookeeper to answer the questions that are being
> asked, so I apologize for needing to relay questions about ZK fault
> tolerance in two datacenters.
> 
> It seems that everyone wants to avoid the expense of a tie-breaker ZK VM
> in a third datacenter.
> 
> The scenario, which this list has seen over and over:
> 
> DC1 - three ZK servers, one or more Solr servers.
> DC2 - two ZK servers, one or more Solr servers.
> 
> I've already explained that if DC2 goes down, everything's fine, but if
> DC1 goes down, Solr goes read-only, and there's no way to prevent that.
> 
> The conversation went further, and I'm sure you guys have seen this
> before too:  "Is there any way we can get DC2 back to operational with
> manual intervention if DC1 goes down?"  I explained that any manual
> intervention would briefly take Solr down ... at which point the
> following proposal was mentioned:
> 
> Add an observer node to DC2, and in the event DC1 goes down, run a
> script that reconfigures all the ZK servers to change the observer to a
> voting member and does rolling restarts.
> 
> Will their proposal work?  What happens when DC1 comes back online?  As
> you know, DC1 will contain a partial ensemble that still has quorum,
> about to rejoin what it THINKS is a partial ensemble *without* quorum,
> which is not what it will find.  I'm guessing that ZK assumes the
> question of who has the "real" quorum shouldn't ever need to be
> negotiated, because the rules prevent multiple partitions from gaining
> quorum.
> 
> Solr currently ships with 3.4.6, but the next version of Solr (about to
> drop any day now) will have 3.4.10.  Once 3.5 is released and Solr is
> updated to use it, does the situation I've described above change in any
> meaningful way?
> 
> Thanks,
> Shawn
> 



Yet another "two datacenter" discussion

2017-05-26 Thread Shawn Heisey
I feel fairly certain that this thread will be an annoyance.  I don't
know enough about zookeeper to answer the questions that are being
asked, so I apologize for needing to relay questions about ZK fault
tolerance in two datacenters.

It seems that everyone wants to avoid the expense of a tie-breaker ZK VM
in a third datacenter.

The scenario, which this list has seen over and over:

DC1 - three ZK servers, one or more Solr servers.
DC2 - two ZK servers, one or more Solr servers.

I've already explained that if DC2 goes down, everything's fine, but if
DC1 goes down, Solr goes read-only, and there's no way to prevent that.
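
(To spell out the quorum arithmetic I'm basing that on, here is a tiny
sketch; the server counts come straight from the scenario above:)

// Quorum arithmetic for the scenario: 5 voting servers, 3 in DC1, 2 in DC2.
public class QuorumMath {
    public static void main(String[] args) {
        int voters = 5;
        int majority = voters / 2 + 1;   // 3 servers are needed for quorum
        System.out.println("Majority needed: " + majority);
        System.out.println("DC1 alone (3 voters) can keep quorum: " + (3 >= majority));
        System.out.println("DC2 alone (2 voters) can keep quorum: " + (2 >= majority));
    }
}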

The conversation went further, and I'm sure you guys have seen this
before too:  "Is there any way we can get DC2 back to operational with
manual intervention if DC1 goes down?"  I explained that any manual
intervention would briefly take Solr down ... at which point the
following proposal was mentioned:

Add an observer node to DC2, and in the event DC1 goes down, run a
script that reconfigures all the ZK servers to change the observer to a
voting member and does rolling restarts.

Will their proposal work?  What happens when DC1 comes back online?  As
you know, DC1 will contain a partial ensemble that still has quorum,
about to rejoin what it THINKS is a partial ensemble *without* quorum,
which is not what it will find.  I'm guessing that ZK assumes the
question of who has the "real" quorum shouldn't ever need to be
negotiated, because the rules prevent multiple partitions from gaining
quorum.

Solr currently ships with 3.4.6, but the next version of Solr (about to
drop any day now) will have 3.4.10.  Once 3.5 is released and Solr is
updated to use it, does the situation I've described above change in any
meaningful way?

Thanks,
Shawn