Re: Is the value of $MYID allowed to change across runs in an HA ZK deployment?

2018-02-05 Thread Alexander Shraer
Hi Jay,

Perhaps it also depends on how the restart is done. If it is done gradually,
a leader could, for example, be in the middle of collecting votes when one of
the voters gets a new id and votes twice instead of once. If the restart is a
barrier, where all servers are shut down and then restarted, this shouldn't
happen.

In 3.5, cluster membership is written into the ZK database as well as
configuration files, and contains server ids and parameters (ports, IPs,
etc.). If ids change, it sounds like the membership information may be
wrong.

Perhaps there are also some implications for the security-related configs?
Someone else may want to comment on these.

In general, changing ids doesn't feel like a very safe method to me...

Cheers,
Alex




On Mon, Feb 5, 2018 at 10:47 AM, Jay Taylor wrote:

> Greetings Zookeepers,
>
> I'm investigating possible ways for Zookeeper to run safely on top of
> Kubernetes clusters.
>
> When the zookeeper containers come online, the value for $MYID is
> initially derived from the Kubernetes pod name.  All active pod names
> are guaranteed to be unique within the cluster at any given point in time.
>
> Example values:
>
> - zookeeper-0
> - zookeeper-1
> - zookeeper-2
>
> and the formula for $MYID is ((the trailing number of the pod name) + 1):
>
> - zookeeper-0 => $MYID=1
> - zookeeper-1 => $MYID=2
> - zookeeper-2 => $MYID=3
>
> The part I'm uncertain of is the relationship between $MYID and ensuring
> each zookeeper data set stays in sync with the rest of the cluster,
> particularly across container restarts.  Restarts can lead to a Zookeeper
> data set being launched with a different value of $MYID compared with
> the previous run.  I.e., Zookeeper may have already run on any given
> data set in the past when the myid file contained a different value.
>
> Is it part of the mechanism used to ensure all follower members are in
> sync with the current leader?  It seems to me that if the leader (or
> followers) keep track of their peers via myid and it gets changed, there
> could be problems.
>
> Initial testing (without much load) has gone fine and things seem to
> work fine when launched with updated $MYID values.  I've also been
> perusing the ZK source code and inspecting how myid is used, and nothing
> stood out to indicate that this will lead to future problems.  However,
> experience dictates that with distributed systems the devil is often in
> nuanced details, so I'm hoping the experts out there may be able to shed
> light on the internal dependencies on the value of myid.
>
> Specific questions:
>
> - Is myid relied on to never change, or does it only need to be
> unique within the cluster at any given time?
>
> - What are the risks with changing myid in relation to ZK data set
> directories across runs?
>
> Your insights will be greatly appreciated!
>
> Kind regards,
> Jay Taylor
>
>
>


Re: ZooKeeper 3.4.11 bug: dataDir and dataLogDir swapped

2018-02-05 Thread Patrick Hunt
This is a good point, Andor. I've updated the release page on the website to
reflect the regression addressed in ZOOKEEPER-2960 and the upcoming fix.

Thanks!

Patrick

On Fri, Feb 2, 2018 at 1:07 AM, Andor Molnar wrote:

> Hi all,
>
> Please be aware that 3.4.11 has a quite unfortunate bug which causes
> ZooKeeper to swap the dataDir and dataLogDir parameters. If you configured
> ZK to use separate txn and snapshot folders in these two options and plan
> to upgrade, you'll find that ZK tries to load transaction logs from the
> snapshot folder and vice versa.
>
> A fix is on the way: 3.4.12 will be released soon, and it's recommended to
> postpone upgrading ZooKeeper until then.
>
> *dev*
> I think it'd be useful to add a similar warning message to the Releases
> page too.
>
> Regards,
> Andor
>
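
A rough way to check whether an installation has been affected by the swap
described above is to look at which file types ended up in which directory.
The sketch below is only an illustration: it assumes the usual on-disk
layout (files named snapshot.<zxid> and log.<zxid> under a version-2
subdirectory) and that dataDir and dataLogDir are configured as separate
directories. The helper names list_zk_files and misplaced_files are
hypothetical and not part of ZooKeeper.

```python
from pathlib import Path


def list_zk_files(directory):
    """Return file names under <directory>/version-2, or [] if it is absent."""
    version_dir = Path(directory) / "version-2"
    if not version_dir.is_dir():
        return []
    return sorted(p.name for p in version_dir.iterdir() if p.is_file())


def misplaced_files(data_dir_files, data_log_dir_files):
    """Return (txn logs found in dataDir, snapshots found in dataLogDir).

    Assumes separate directories: snapshots belong in dataDir and
    transaction logs belong in dataLogDir.
    """
    logs_in_data_dir = [n for n in data_dir_files if n.startswith("log.")]
    snaps_in_log_dir = [n for n in data_log_dir_files if n.startswith("snapshot.")]
    return logs_in_data_dir, snaps_in_log_dir
```

Non-empty results from misplaced_files(list_zk_files(data_dir),
list_zk_files(data_log_dir)) would suggest that files were written with the
two settings swapped; it is a hint for further inspection, not proof.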


Re: replacing one zookeeper machine with brand new machine

2018-02-05 Thread Check Peck
Is there an option to remove a zookeeper node from exhibitor? I am not sure
there is one.

On Mon, Feb 5, 2018 at 10:21 AM, Washko, Daniel wrote:

> The steps are the same whether Exhibitor is in the mix or not. Exhibitor
> will take care of management, though. I would recommend backing up the data
> in your Zookeeper ensemble just to be safe.
>
> 1) Spin up a new zookeeper and configure it to use exhibitor.
> 2) Let exhibitor bring it into the ensemble.
> 3) Use exhibitor to remove the old node.
> 4) Terminate the old node when exhibitor says it is no longer in the
> ensemble or that it is down.
>
> It has been a few years since I have worked with Exhibitor. It should
> automatically pull the new node into the ensemble. I believe there is an
> option to remove a node. You will be presented with a choice on how you
> want to initiate the changes - a rolling restart or restart all at once. I
> would recommend a rolling restart if you want to keep the ensemble live
> while you make the changes.
>
> If you have a problem with removing one of the nodes, you can edit the
> node list in exhibitor, remove that node, and save the configuration.
> Again, this will prompt for a rolling restart or parallel restart.
>
> Without exhibitor these are the steps I follow:
>
> 1) Back up the data
> 2) Spin up a new zookeeper
> 3) Identify the master
> 4) Alter the configuration on each zookeeper to add the new node and to
> add the other nodes to the new zookeeper. Be aware of the zookeeper ID, it
> has to be unique.
> 5) Perform a rolling restart of each node with the master last.
> 6) Verify the new master and that the data stored in zookeeper has
> migrated successfully to the new node.
> 7) Remove the old node from each config.
> 8) Stop zookeeper on the old node and do a rolling restart of the
> remaining zookeeper nodes with the master last.
> 9) Terminate the old node.
>
> --
> Daniel S Washko
> Solutions Architect
>
>
>
> Phone: 757 667 1463
> dwas...@gannett.com
>
> On 2/2/18, 3:20 PM, "Check Peck" wrote:
>
> I have a zookeeper ensemble of 5 servers and I am using exhibitor on
> top of
> it. And I installed exhibitor and setup zookeeper by following this
> link:
>
> https://github.com/soabase/exhibitor/wiki/Building-Exhibitor
>
> Below is how all my zookeeper machines are setup in exhibitor
>
> S:1:machineA,
> S:2:machineB,
> S:3:machineC,
> S:4:machineD,
> S:5:machineE
>
> Now for some reasons, I need to replace "machineE" with brand new
> "machineF". What is the best way by which I can safely remove one
> machine
> and replace it with new machine?
>
>
>


SASL jaas.conf principal="*" problem

2018-02-05 Thread Botond Hejj
Hi,

Java 8 introduced the possibility of using * for the principal in treadmill,
which is great: it would allow us to run treadmill behind multiple
interfaces, and SASL would pick the right keytab.

Unfortunately, this doesn't work in ZooKeeper. I have dug into the code a bit
and found that ZooKeeper uses DIGEST-MD5 in that case, even though I don't
use the DigestLoginModule. The reason for that is line 251 here:
https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/util/SecurityUtils.java

It falls back to Digest if the principal list is empty, which is the case
when * is specified.
Why is that, and why isn't the login type checked?
Anyway, this can only be fixed in a non-backward-compatible way, or with a
flag in a backward-compatible way.

I could prepare a patch.
I would just like to understand the reason behind the implementation. Is
there any particular reason why this fallback is there? What would the
implications be if I removed it? If I understand the reason, maybe I could
patch it without breaking backward compatibility.

There is also a comment: TODO: use 'authMech=' value in zoo.cfg.

Is there any jira or patch for that?

Regards,
Botond Hejj
Morgan Stanley | Technology
Lechner Odon fasor 8 | Floor 07
Budapest, 1095
Phone: +36 1 881-3962
botond.h...@morganstanley.com


Is the value of $MYID allowed to change across runs in an HA ZK deployment?

2018-02-05 Thread jay . taylor
Greetings Zookeepers,

I'm investigating possible ways for Zookeeper to run safely on top of
Kubernetes clusters.

When the zookeeper containers come online, the value for $MYID is
initially derived from the Kubernetes pod name.  All active pod names
are guaranteed to be unique within the cluster at any given point in time.

Example values:

    - zookeeper-0
    - zookeeper-1
    - zookeeper-2

and the formula for $MYID is ((the trailing number of the pod name) + 1):

    - zookeeper-0 => $MYID=1
    - zookeeper-1 => $MYID=2
    - zookeeper-2 => $MYID=3
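
The mapping above can be sketched in a few lines. This is only an
illustration, assuming pod names always end in a hyphen followed by the
ordinal; the function name myid_from_pod_name is hypothetical.

```python
import re


def myid_from_pod_name(pod_name: str) -> int:
    """Return (trailing number of the pod name) + 1."""
    match = re.search(r"-(\d+)$", pod_name)
    if match is None:
        raise ValueError(f"pod name has no trailing ordinal: {pod_name!r}")
    return int(match.group(1)) + 1


if __name__ == "__main__":
    for name in ("zookeeper-0", "zookeeper-1", "zookeeper-2"):
        print(f"{name} => $MYID={myid_from_pod_name(name)}")
```

With this mapping, a given data set keeps the same $MYID only if it is
always mounted by a pod with the same ordinal.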

The part I'm uncertain of is the relationship between $MYID and ensuring
each zookeeper data set stays in sync with the rest of the cluster,
particularly across container restarts.  Restarts can lead to a Zookeeper
data set being launched with a different value of $MYID compared with
the previous run.  I.e., Zookeeper may have already run on any given
data set in the past when the myid file contained a different value.

Is it part of the mechanism used to ensure all follower members are in
sync with the current leader?  It seems to me that if the leader (or
followers) keep track of their peers via myid and it gets changed, there
could be problems.

Initial testing (without much load) has gone fine and things seem to
work fine when launched with updated $MYID values.  I've also been
perusing the ZK source code and inspecting how myid is used, and nothing
stood out to indicate that this will lead to future problems.  However,
experience dictates that with distributed systems the devil is often in
nuanced details, so I'm hoping the experts out there may be able to shed
light on the internal dependencies on the value of myid.

Specific questions:

    - Is myid relied on to never change, or does it only need to be
unique within the cluster at any given time?

    - What are the risks with changing myid in relation to ZK data set
directories across runs?

Your insights will be greatly appreciated!

Kind regards,
Jay Taylor




Re: replacing one zookeeper machine with brand new machine

2018-02-05 Thread Washko, Daniel
The steps are the same whether Exhibitor is in the mix or not. Exhibitor will 
take care of management, though. I would recommend backing up the data in your 
Zookeeper ensemble just to be safe.

1) Spin up a new zookeeper and configure it to use exhibitor. 
2) Let exhibitor bring it into the ensemble. 
3) Use exhibitor to remove the old node.
4) Terminate the old node when exhibitor says it is no longer in the ensemble
or that it is down.

It has been a few years since I have worked with Exhibitor. It should 
automatically pull the new node into the ensemble. I believe there is an option 
to remove a node. You will be presented with a choice on how you want to 
initiate the changes - a rolling restart or restart all at once. I would 
recommend a rolling restart if you want to keep the ensemble live while you 
make the changes. 

If you have a problem with removing one of the nodes, you can edit the node 
list in exhibitor, remove that node, and save the configuration. Again, this 
will prompt for a rolling restart or parallel restart.

Without exhibitor these are the steps I follow:

1) Back up the data
2) Spin up a new zookeeper
3) Identify the master
4) Alter the configuration on each zookeeper to add the new node and to add the 
other nodes to the new zookeeper. Be aware of the zookeeper ID, it has to be 
unique.
5) Perform a rolling restart of each node with the master last. 
6) Verify the new master and that the data stored in zookeeper has migrated 
successfully to the new node. 
7) Remove the old node from each config.
8) Stop zookeeper on the old node and do a rolling restart of the remaining 
zookeeper nodes with the master last.
9) Terminate the old node.
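
Steps 3 and 5 above (identify the master, then restart it last) can be
sketched as follows. This is a rough illustration, not a tested operational
tool: it assumes the "srvr" four-letter command is enabled on the client
port (2181 here) and that its reply contains a "Mode:" line. The helper
names fetch_srvr, parse_mode and rolling_restart_order are hypothetical.

```python
import socket


def fetch_srvr(host: str, port: int = 2181, timeout: float = 5.0) -> str:
    """Send the 'srvr' four-letter command and return the raw reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"srvr")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mode(srvr_output: str) -> str:
    """Extract the value of the 'Mode:' line, or 'unknown' if missing."""
    for line in srvr_output.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return "unknown"


def rolling_restart_order(modes: dict) -> list:
    """Return hosts in their given order, with the leader moved to the end."""
    # sorted() is stable, and False (not leader) sorts before True (leader).
    return sorted(modes, key=lambda host: modes[host] == "leader")


if __name__ == "__main__":
    # Host names are placeholders taken from the example ensemble.
    hosts = ["machineA", "machineB", "machineC", "machineD", "machineF"]
    modes = {h: parse_mode(fetch_srvr(h)) for h in hosts}
    print(rolling_restart_order(modes))
```

The order only reflects the leader at the time of the query; a leader
election during the rolling restart would require querying again.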

-- 
Daniel S Washko
Solutions Architect



Phone: 757 667 1463
dwas...@gannett.com

On 2/2/18, 3:20 PM, "Check Peck" wrote:

I have a zookeeper ensemble of 5 servers and I am using exhibitor on top of
it. And I installed exhibitor and setup zookeeper by following this link:

https://github.com/soabase/exhibitor/wiki/Building-Exhibitor

Below is how all my zookeeper machines are setup in exhibitor

S:1:machineA,
S:2:machineB,
S:3:machineC,
S:4:machineD,
S:5:machineE

Now for some reasons, I need to replace "machineE" with brand new
"machineF". What is the best way by which I can safely remove one machine
and replace it with new machine?