Re: Ephemeral znodes not getting removed

2019-08-05 Thread John Lindwall
Thanks for the response! My direct access to this zk cluster is 
limited.  I'll see about getting a copy of the logs to examine.  I'll 
also try to coordinate your experiment of creating a znode in each node 
in turn and checking the cluster-wide view of that data.  If we see a 
situation where the "global view" is inconsistent what would be the next 
step?


I did receive output from each cluster node containing the results of 
these 4-letter words: dump, cons, mntr, and stat.  For one of the 
ephemerals in question we could see a record of it in the "dump" output 
for one of the 3 cluster nodes (the leader) but not in the other 2 nodes' 
dump output.  Weirdly, the session id associated with that ephemeral 
znode does not appear in the "cons" output for any of the cluster 
members.  So this appears to be an ephemeral that has survived the 
termination of its associated zk session (!?)
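
The cross-check described above (ephemeral owners from "dump" versus live session ids from "cons") can be scripted. The following is a minimal sketch, not an official tool. It assumes that "dump" lists each owning session on its own line as "0x<hex>:" and that "cons" reports each connection's session as "sid=0x<hex>"; both formats should be verified against the actual output of the ZooKeeper version in use. The function names are hypothetical.

```python
import re
import socket


def four_letter_word(host, port, cmd, timeout=5.0):
    """Send a 4lw command (e.g. "dump" or "cons") and return the raw reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(cmd.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def ephemeral_owners(dump_text):
    """Session ids that own ephemerals according to "dump" output.

    Assumption: each owner appears alone on a line as "0x<hex>:".
    """
    pattern = r"^[ \t]*(0x[0-9a-fA-F]+):[ \t]*$"
    return {m.group(1).lower() for m in re.finditer(pattern, dump_text, re.M)}


def connected_sessions(cons_text):
    """Session ids with a live connection according to "cons" output.

    Assumption: each connection line contains "sid=0x<hex>".
    """
    return {m.group(1).lower()
            for m in re.finditer(r"sid=(0x[0-9a-fA-F]+)", cons_text)}


def orphaned_owners(dump_text, cons_texts):
    """Ephemeral owners that have no connection on any of the servers."""
    live = set()
    for text in cons_texts:
        live |= connected_sessions(text)
    return ephemeral_owners(dump_text) - live
```

A session id returned by orphaned_owners is only a candidate: a client that is briefly disconnected but whose session has not yet expired would also show up, so the result is worth re-checking after the session timeout has passed.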


Thanks for any advice or feedback,
John

Patrick Hunt wrote on 8/2/19 9:38 AM:

The jira you ref'd is the only one that comes to mind. In terms of
troubleshooting - try connecting a client to each of the servers in turn
and see if it's a situation where they have a different view of the world
wrt those znodes. You might also have the client create separate znodes on
each server and ensure that they are consistent. The logs are also
typically a good source of information - check against the session id.

Patrick

On Wed, Jul 31, 2019 at 5:54 PM John Lindwall wrote:


ZooKeeper 3.4.6-1569965

In our environment we seem to have a situation where ephemeral znodes
are not getting removed after the zookeeper session has been
terminated.  We can see examples of znodes that were created 3-4 days
ago that still exist, though the zk sessions bound to those znodes
should no longer exist.

Note that we've had this cluster running for about 4 years and have not
seen this problem until recently.

1. I am wondering if there are any known issues that would affect our
zookeeper version that may cause this behavior?
2. Is it possible our servers are simply in a "bad state" and a simple
reboot might clean things up?
3. Any tips on diagnosing this?

We noticed this issue from 2011 but that seems to have been fixed in our
branch.


https://issues.apache.org/jira/browse/ZOOKEEPER-1208

Thanks,
John Lindwall






Re: Issues with using ZooKeeper 3.5.5 together with Solr 8.2.0

2019-08-05 Thread Patrick Hunt
It sounds to me like a regression. We always had the properties format for
4lw, this (membership:) breaks that. I'd recommend fixing it in the next
3.5/3.6, i.e. output the membership on a single line "membership:  \n".
Should be a pretty simple change - anyone interested in taking it on?
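
To illustrate the separator problem discussed in this thread, here is a minimal sketch of a parser that accepts either "=" or ":" as the key/value separator, loosely in the spirit of java.util.Properties. It is not Solr's or ZooKeeper's actual code; parse_conf is a hypothetical name, and the sample "conf" text in the usage note is illustrative rather than captured from a real server.

```python
def parse_conf(text):
    """Parse "conf"-style output into a dict.

    Each line is split on the first '=' or ':' (whichever comes first).
    This only approximates java.util.Properties, which additionally
    handles whitespace separators, escapes, and line continuations.
    """
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue  # skip blank lines and comments
        positions = [i for i in (line.find("="), line.find(":")) if i != -1]
        if not positions:
            result[line] = ""  # a key with no separator and no value
            continue
        cut = min(positions)
        result[line[:cut].strip()] = line[cut + 1:].strip()
    return result
```

For a hypothetical input containing the lines "clientPort=2181", "membership: " and "server.1=zk1:2888:3888:participant", this yields the keys clientPort, membership (with an empty value) and server.1, whereas a parser that only splits on "=" finds no separator on the membership line.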

Also agree that folks should move off 4lw to the new (better) options, esp
as we plan to deprecate 4lw at some point.
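
For readers moving off 4lw, a minimal sketch of querying the AdminServer over HTTP follows. It assumes the AdminServer is enabled on its default port 8080 and serves commands under /commands/<name>; the command name "configuration" used in the usage note and the shape of the returned JSON are assumptions to check against the documentation for the release in use. The helper names are hypothetical.

```python
import json
from urllib.request import urlopen


def admin_url(host, command, port=8080):
    """Build the AdminServer URL for a command (assumed path: /commands/<name>)."""
    return f"http://{host}:{port}/commands/{command}"


def fetch_command(host, command, port=8080, timeout=5.0):
    """Fetch a command's output and decode it as JSON."""
    with urlopen(admin_url(host, command, port), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

A call such as fetch_command("zk1", "configuration") would return a dict that can be read by key, which avoids any line-splitting logic on the client side.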

Patrick

On Sun, Aug 4, 2019 at 12:15 PM Enrico Olivelli  wrote:

> On Sat, Aug 3, 2019, 21:41 Shawn Heisey wrote:
>
> > On 8/2/2019 10:33 AM, Patrick Hunt wrote:
> > > Right, it prints the membership of the quorum, see (for the majority case,
> > > which is typical):
> > > org.apache.zookeeper.server.quorum.flexible.QuorumMaj#toString
> > >
> >
> https://github.com/apache/zookeeper/blob/faa7cec71fddfb959a7d67923acffdb67d93c953/zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/flexible/QuorumMaj.java#L112
> >
> > For our purposes (the Solr project) the output of the "conf" 4lw command
> > is inconsistent, changing when there is a multi-server ensemble.  All of
> > the lines except the "membership: " one use an equals sign as a
> > separator.  Our parsing code fails on that line because there is no
> > equals sign.
> >
> > Whether or not the ZK project should consider this a bug is the question
> > that I am asking.
> >
> > While getting to the bottom of that question, another one arises:  Who
> > are the intended audiences of the "conf" 4lw output?  If one of those
> > audiences is ZK itself, then the output of the command probably will
> > work perfectly for that audience, as ZK uses Java's "properties" API to
> > read its config file, which means that both = and : will work as
> > separators.
> >
> > The current output also works great for a human audience.  Humans are
> > quite flexible.
> >
> > The difficulty is machine-based parsers like the one in Solr, which is
> > very simple and just splits lines on an equal sign.  How much
> > consistency can an audience like this expect?  I would personally say
> > that the way "membership: " is output is a bug.  That line probably
> > should be entirely removed, or the colon could be replaced with an equal
> > sign.  I think that the line only makes sense for a human audience, and
> > that audience probably doesn't really need it.
> >
> > An alternate path:  One statement in the documentation would remove all
> > difficulty, without any code changes in ZK:
> >
> > "The output from the conf 4lw command should be parsed by the Java
> > Properties API for best results."
> >
>
> I think the best option is to switch to the Admin server (HTTP + JSON based), as it
> integrates better with other automated tools.
> We are working on docs for 3.6 (especially the HTTP admin server).
> We also added many new 'commands' to the admin API, which is supposed to be
> the future for the mid/long term.
>
> Enrico
>
>
>
> > If that statement is added, then Solr just needs to utilize the
> > Properties API, which is very easy to do, and all is well again.
> >
> > So... I'm thinking we should open an issue in Jira, and then leave it up
> > to the ZK committers whether it's better to change the output or adjust
> > the documentation.  I can supply a patch either way.  What does the
> > community think?
> >
> > Thanks,
> > Shawn
> >
>


Re: Can SSL capability be satisfied by a smaller dependency than netty-all?

2019-08-05 Thread Norbert Kalmar
Thanks for bringing this up Shawn.

I also checked on my fork, netty-transport-native-epoll is the one actually
needed. But yeah, netty-all is overkill.
I created a jira:
https://issues.apache.org/jira/browse/ZOOKEEPER-3494

I will upload my PR soon.

Regards,
Norbert

On Fri, Aug 2, 2019 at 2:07 AM Michael Han  wrote:

> >> SSL capability can be satisfied by one of the smaller netty jars, rather
> than netty-all
>
> A brief look at the imports indicates that we might only need the handler
> and transport jars from Netty. I'd suggest creating a JIRA to request this
> change.
>
> On Tue, Jul 30, 2019 at 1:11 PM Shawn Heisey  wrote:
>
> > We neglected to notice that netty is a required dependency for ZK SSL
> > when we upgraded to ZK 3.5.5 in Solr.  We have an issue to track this:
> >
> > https://issues.apache.org/jira/browse/SOLR-13665
> >
> > I was noticing that the netty-all jar included in ZK is nearly 4MB ...
> > and we will have to include it twice in the Solr download because it is
> > needed for the SolrJ client as well as the Solr server.  The Solr
> > download is already quite large ... increasing it by another 7MB is
> > painful.
> >
> > I'm hoping that ZK's SSL capability can be satisfied by one of the
> > smaller netty jars, rather than netty-all.  Is that a question that can
> > be answered here on the ZK list?  The specific class that is mentioned
> > by the error is included in netty-transport.
> >
> > Thanks,
> > Shawn
> >
>


Re: Issues with using ZooKeeper 3.5.5 together with Solr 8.2.0

2019-08-05 Thread Enrico Olivelli
On Mon, Aug 5, 2019, 00:57 Jan Høydahl wrote:

> Will admin server be folded in and exposed on same port as main client
> port in the future? If not, clients will need to have one config for zkHost
> plus one more for zkAdminServer.


Personally I hope we won't do this. I hope we continue investing in the
client endpoint's performance, and mixing it with an HTTP server would
complicate things. That said, it is possible in theory to merge them.

> I asked in another thread whether the admin server port number will have a
> better/more unique default than 8080 in the future, such as 2188 or
> whatever?
>

+1
I don't know how much this can impact downstream bundles.
I am not a user of the admin server yet; I will switch as soon as 3.6 is out.


Enrico


> Jan Høydahl
>
> > On Aug 4, 2019, at 21:15, Enrico Olivelli wrote:
> >
> > On Sat, Aug 3, 2019, 21:41 Shawn Heisey wrote:
> >
> >>> On 8/2/2019 10:33 AM, Patrick Hunt wrote:
> >>> Right, it prints the membership of the quorum, see (for the majority case,
> >>> which is typical):
> >>> org.apache.zookeeper.server.quorum.flexible.QuorumMaj#toString
> >>>
> >>
> https://github.com/apache/zookeeper/blob/faa7cec71fddfb959a7d67923acffdb67d93c953/zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/flexible/QuorumMaj.java#L112
> >>
> >> For our purposes (the Solr project) the output of the "conf" 4lw command
> >> is inconsistent, changing when there is a multi-server ensemble.  All of
> >> the lines except the "membership: " one use an equals sign as a
> >> separator.  Our parsing code fails on that line because there is no
> >> equals sign.
> >>
> >> Whether or not the ZK project should consider this a bug is the question
> >> that I am asking.
> >>
> >> While getting to the bottom of that question, another one arises:  Who
> >> are the intended audiences of the "conf" 4lw output?  If one of those
> >> audiences is ZK itself, then the output of the command probably will
> >> work perfectly for that audience, as ZK uses Java's "properties" API to
> >> read its config file, which means that both = and : will work as
> >> separators.
> >>
> >> The current output also works great for a human audience.  Humans are
> >> quite flexible.
> >>
> >> The difficulty is machine-based parsers like the one in Solr, which is
> >> very simple and just splits lines on an equal sign.  How much
> >> consistency can an audience like this expect?  I would personally say
> >> that the way "membership: " is output is a bug.  That line probably
> >> should be entirely removed, or the colon could be replaced with an equal
> >> sign.  I think that the line only makes sense for a human audience, and
> >> that audience probably doesn't really need it.
> >>
> >> An alternate path:  One statement in the documentation would remove all
> >> difficulty, without any code changes in ZK:
> >>
> >> "The output from the conf 4lw command should be parsed by the Java
> >> Properties API for best results."
> >>
> >
> > I think the best option is to switch to the Admin server (HTTP + JSON based), as it
> > integrates better with other automated tools.
> > We are working on docs for 3.6 (especially the HTTP admin server).
> > We also added many new 'commands' to the admin API, which is supposed to be
> > the future for the mid/long term.
> >
> > Enrico
> >
> >
> >
> >> If that statement is added, then Solr just needs to utilize the
> >> Properties API, which is very easy to do, and all is well again.
> >>
> >> So... I'm thinking we should open an issue in Jira, and then leave it up
> >> to the ZK committers whether it's better to change the output or adjust
> >> the documentation.  I can supply a patch either way.  What does the
> >> community think?
> >>
> >> Thanks,
> >> Shawn
> >>
>