[jira] [Updated] (ZOOKEEPER-3041) Typo in error message, affects log analysis

2018-05-12 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ZOOKEEPER-3041:
--
Labels: pull-request-available  (was: )

> Typo in error message, affects log analysis
> ---
>
> Key: ZOOKEEPER-3041
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3041
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.5.3
>Reporter: Hugh O'Brien
>Priority: Trivial
>  Labels: pull-request-available
>
> simple typo
>  
> PR here: 
> https://github.com/apache/zookeeper/pull/498/commits/a8cb7f668d31a7bcf12481409328a886231020f6



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] zookeeper issue #498: fix typo, charater --> character

2018-05-12 Thread hughobrienjet
Github user hughobrienjet commented on the issue:

https://github.com/apache/zookeeper/pull/498
  
Thanks @maoling , I created ZOOKEEPER-3041


---


[jira] [Created] (ZOOKEEPER-3041) Typo in error message, affects log analysis

2018-05-12 Thread Hugh O'Brien (JIRA)
Hugh O'Brien created ZOOKEEPER-3041:
---

 Summary: Typo in error message, affects log analysis
 Key: ZOOKEEPER-3041
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3041
 Project: ZooKeeper
  Issue Type: Bug
Affects Versions: 3.5.3
Reporter: Hugh O'Brien


simple typo

 

PR here: 
https://github.com/apache/zookeeper/pull/498/commits/a8cb7f668d31a7bcf12481409328a886231020f6



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (ZOOKEEPER-3008) Potential NPE in SaslQuorumAuthLearner#authenticate and SaslQuorumAuthServer#authenticate

2018-05-12 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ZOOKEEPER-3008:
--
Labels: pull-request-available  (was: )

> Potential NPE in SaslQuorumAuthLearner#authenticate and 
> SaslQuorumAuthServer#authenticate
> -
>
> Key: ZOOKEEPER-3008
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3008
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.6.0
>Reporter: lujie
>Priority: Major
>  Labels: pull-request-available
>
> Inspired by ZK-3006 , I develop a simple static analysis tool to find other 
> Potential NPE like ZK-3006. This bug is found by this tool ,and I have 
> carefully studied it.  But i am a newbie at here so i may be wrong, hope 
> someone could confirm it and help me improve this tool.
> h2. Bug description:
> callee :SecurityUtils#createSaslClient will return null while encounter 
> exception
> {code:java}
> // code placeholder
> catch (Exception e) {
>   LOG.error("Exception while trying to create SASL client", e);
>   return null;
> }
> {code}
> but its caller has no null check just like:
> {code:java}
> // code placeholder
> sc = SecurityUtils.createSaslClient();
> if (sc.hasInitialResponse()) {
>responseToken = createSaslToken(new byte[0], sc, learnerLogin);
> }
> {code}
> I think we should add null check in caller while callee return null



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] zookeeper issue #496: ZOOKEEPER-3008: Potential NPE in SaslQuorumAuthLearner...

2018-05-12 Thread lujiefsi
Github user lujiefsi commented on the issue:

https://github.com/apache/zookeeper/pull/496
  
@anmolnar  Local run is ok, @brettKK could you trigger another build


---


Re: Name resolution in StaticHostProvider

2018-05-12 Thread Norbert Kalmar
Hi,

In my opinion, when the client calls next(), and sees that this call could
return with an UnknownHostException, it will assume next() is giving back
the next available address in the list, whether it's a valid address or not.
If next() would throw something like "NoValidAddressFoundException", then
it would now it iterated the whole list, and there is no chance there is
any valid address left.

So the way I see it using checked exception helps here to understand next,
and to signal it to the caller that it should prepare that there was a
failed lookup, and do something about it.

I would go with checked exception, but I don't know all that well the
callers or the mechanic here.

The rules of thumb I follow with checked vs unchecked exception is that "Is
it reasonable for the caller to recover from this?" If yes, throw a checked
exception. In this case, if next() only looks one element from the list,
but there's an error with that address. then I think it is reasonable to
expect the caller to recover.

But I'm also open to pros and cons :)

Regards,
Norbert



On Sat, May 12, 2018 at 4:09 PM Andor Molnar  wrote:

> Hi team,
>
> Anyone else has thoughts about whether we should use checked or unchecked
> exception here?
>
> Thanks,
> Andor
>
>
>
>
> On Thu, May 10, 2018 at 8:19 AM, Andor Molnar  wrote:
>
> > Interesting idea. What difference you think it could make comparing to
> > checked?
> >
> >
> >
> > On Wed, May 9, 2018 at 8:01 AM, Flavio Junqueira  wrote:
> >
> >> I'm actually now wondering whether we should be using an unchecked
> >> exception instead. A lot of things have changed with exception handling
> >> since we wrote this code base initially. An unchecked exception would
> >> actually match better my current mental model of what that signature
> should
> >> look like.
> >>
> >> -Flavio
> >>
> >> > On 9 May 2018, at 16:44, Flavio Junqueira  wrote:
> >> >
> >> > I like the idea of indicating to the application that there is
> >> something wrong with the list of servers so that it has a chance to look
> >> into it. With the current code in `ClientCnxn`, we will log at warn
> level
> >> and hope that someone sees it, but we are not really stopping the
> client.
> >> Throwing might actually be an improvement as it will output a log
> message,
> >> but I'm now wondering if we should propagate it all the way to the
> >> application. Responding to myself, one reason for not doing it is that
> it
> >> is not a fatal error unless no server can be resolved.
> >> >
> >> > -Flavio
> >> >
> >> >> On 8 May 2018, at 16:06, Andor Molnar  wrote:
> >> >>
> >> >> Hi,
> >> >>
> >> >> Updating this thread, because the PR is still being review on GitHub.
> >> >>
> >> >> So, the reason why I refactored the original behaviour of
> >> >> StaticHostProvider is that I believe that it's trying to do something
> >> which
> >> >> is not its responsibility. Please tell me if there's a good
> historical
> >> >> reason for that.
> >> >>
> >> >> My approach is giving the user the following to options:
> >> >> 1- Use static IP addresses, if you don't want to deal with DNS
> >> resolution
> >> >> at all - we guarantee that no DNS logic will involved in this case at
> >> all.
> >> >> 2- Use DNS hostnames if you have a reliable DNS service for
> resolution
> >> >> (with HA, secondary servers, backups, etc.) - we must use DNS in the
> >> right
> >> >> way in this case e.g. do NOT cache IP address for a longer period
> that
> >> DNS
> >> >> server allows to and re-resolve after TTL expries, because it's
> >> mandatory
> >> >> by protocol.
> >> >>
> >> >> My 2 cents here:
> >> >> - the fix which was originally posted for re-resolution is a
> >> workaround and
> >> >> doesn't satisfy the requirement for #2,
> >> >> - the solution is already built-in in JDK and DNS clients in the
> right
> >> way
> >> >> - can't see a reason why we shouldn't use that
> >> >>
> >> >> I checked this in some other projects as well and found very similar
> >> >> approach in hadoop-common's SecurityUtil.java. It has 2 built-in
> >> plugins
> >> >> for that:
> >> >> - Standard resolver uses java's built-in getByName().
> >> >> - Qualified resolver still uses getByName(), but adds some logic to
> >> avoid
> >> >> incorrect re-resolutions and reverse IP lookups.
> >> >>
> >> >> Please let me know your thoughts.
> >> >>
> >> >> Regards,
> >> >> Andor
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> On Tue, Mar 6, 2018 at 8:12 AM, Andor Molnar 
> >> wrote:
> >> >>
> >> >>> Hi Abe,
> >> >>>
> >> >>> Unfortunately we haven't got any feedback yet. What do you think of
> >> >>> implementing Option #3?
> >> >>>
> >> >>> Regards,
> >> >>> Andor
> >> >>>
> >> >>>
> >> >>> On Thu, Feb 22, 2018 at 6:06 PM, Andor Molnar 
> >> wrote:
> >> >>>
> >>  Did anybody happen to take a quick look by any chance?
> >> 
> >>  I 

Re: Name resolution in StaticHostProvider

2018-05-12 Thread Andor Molnar
Hi team,

Anyone else has thoughts about whether we should use checked or unchecked
exception here?

Thanks,
Andor




On Thu, May 10, 2018 at 8:19 AM, Andor Molnar  wrote:

> Interesting idea. What difference you think it could make comparing to
> checked?
>
>
>
> On Wed, May 9, 2018 at 8:01 AM, Flavio Junqueira  wrote:
>
>> I'm actually now wondering whether we should be using an unchecked
>> exception instead. A lot of things have changed with exception handling
>> since we wrote this code base initially. An unchecked exception would
>> actually match better my current mental model of what that signature should
>> look like.
>>
>> -Flavio
>>
>> > On 9 May 2018, at 16:44, Flavio Junqueira  wrote:
>> >
>> > I like the idea of indicating to the application that there is
>> something wrong with the list of servers so that it has a chance to look
>> into it. With the current code in `ClientCnxn`, we will log at warn level
>> and hope that someone sees it, but we are not really stopping the client.
>> Throwing might actually be an improvement as it will output a log message,
>> but I'm now wondering if we should propagate it all the way to the
>> application. Responding to myself, one reason for not doing it is that it
>> is not a fatal error unless no server can be resolved.
>> >
>> > -Flavio
>> >
>> >> On 8 May 2018, at 16:06, Andor Molnar  wrote:
>> >>
>> >> Hi,
>> >>
>> >> Updating this thread, because the PR is still being review on GitHub.
>> >>
>> >> So, the reason why I refactored the original behaviour of
>> >> StaticHostProvider is that I believe that it's trying to do something
>> which
>> >> is not its responsibility. Please tell me if there's a good historical
>> >> reason for that.
>> >>
>> >> My approach is giving the user the following to options:
>> >> 1- Use static IP addresses, if you don't want to deal with DNS
>> resolution
>> >> at all - we guarantee that no DNS logic will involved in this case at
>> all.
>> >> 2- Use DNS hostnames if you have a reliable DNS service for resolution
>> >> (with HA, secondary servers, backups, etc.) - we must use DNS in the
>> right
>> >> way in this case e.g. do NOT cache IP address for a longer period that
>> DNS
>> >> server allows to and re-resolve after TTL expries, because it's
>> mandatory
>> >> by protocol.
>> >>
>> >> My 2 cents here:
>> >> - the fix which was originally posted for re-resolution is a
>> workaround and
>> >> doesn't satisfy the requirement for #2,
>> >> - the solution is already built-in in JDK and DNS clients in the right
>> way
>> >> - can't see a reason why we shouldn't use that
>> >>
>> >> I checked this in some other projects as well and found very similar
>> >> approach in hadoop-common's SecurityUtil.java. It has 2 built-in
>> plugins
>> >> for that:
>> >> - Standard resolver uses java's built-in getByName().
>> >> - Qualified resolver still uses getByName(), but adds some logic to
>> avoid
>> >> incorrect re-resolutions and reverse IP lookups.
>> >>
>> >> Please let me know your thoughts.
>> >>
>> >> Regards,
>> >> Andor
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> On Tue, Mar 6, 2018 at 8:12 AM, Andor Molnar 
>> wrote:
>> >>
>> >>> Hi Abe,
>> >>>
>> >>> Unfortunately we haven't got any feedback yet. What do you think of
>> >>> implementing Option #3?
>> >>>
>> >>> Regards,
>> >>> Andor
>> >>>
>> >>>
>> >>> On Thu, Feb 22, 2018 at 6:06 PM, Andor Molnar 
>> wrote:
>> >>>
>>  Did anybody happen to take a quick look by any chance?
>> 
>>  I don't want to push this too hard, because I know it's a time
>> consuming
>>  topic to think about, but this is a blocker in 3.5 which has been
>> hanging
>>  around for a while and any feedback would be extremely helpful to
>> close it
>>  quickly.
>> 
>>  Thanks,
>>  Andor
>> 
>> 
>> 
>>  On Mon, Feb 19, 2018 at 12:18 PM, Andor Molnar 
>>  wrote:
>> 
>> > Hi all,
>> >
>> > We need more eyes and brains on the following PR:
>> >
>> > https://github.com/apache/zookeeper/pull/451
>> >
>> > I added a comment few days ago about the way we currently do DNS
>> name
>> > resolution in this class and a suggestion on how we could simplify
>> things a
>> > little bit. We talked about it with Abe Fine, but we're a little
>> bit unsure
>> > and cannot get a conclusion. It would be extremely handy to get more
>> > feedback from you.
>> >
>> > To add some colour to it, let me elaborate on the situation here:
>> >
>> > In general, the task that StaticHostProvider does is to get a list
>> of
>> > potentially unresolved InetSocketAddress objects, resolve them and
>> iterate
>> > over the resolved objects by calling next() method.
>> >
>> > *Option #1 (current logic)*
>> > - Resolve addresses with getAllByName() which returns a list of IP
>> 

[GitHub] zookeeper issue #150: ZOOKEEPER-2184: Zookeeper Client should re-resolve hos...

2018-05-12 Thread anmolnar
Github user anmolnar commented on the issue:

https://github.com/apache/zookeeper/pull/150
  
@tophei @glasser This issue has become top priority quite recently, because 
more and more users reporting similar issues that you're facing with, so I'd 
like to get #451 in as soon as possible.

But we also have to be careful. #451 fixes it in a completely different way 
and I'd like to be confident about that the fix is right. I'll see if I can put 
together a Kafka build for you for testing.

@tophei Yes, #451 has a different approach. It should not have conflicts 
with 3.4.12 or you can just grab the latest from my branch and build it. Please 
drop me an email if you need help.


---


ZooKeeper_branch35_jdk8 - Build # 961 - Still Failing

2018-05-12 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper_branch35_jdk8/961/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 60.83 KB...]
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
0.064 sec, Thread: 3, Class: org.apache.zookeeper.test.SaslClientTest
[junit] Running org.apache.zookeeper.test.ServerCnxnTest in thread 3
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
1.446 sec, Thread: 7, Class: org.apache.zookeeper.test.SaslSuperUserTest
[junit] Running org.apache.zookeeper.test.SessionInvalidationTest in thread 
7
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
3.363 sec, Thread: 3, Class: org.apache.zookeeper.test.ServerCnxnTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
1.684 sec, Thread: 7, Class: org.apache.zookeeper.test.SessionInvalidationTest
[junit] Running org.apache.zookeeper.test.SessionTest in thread 3
[junit] Running org.apache.zookeeper.test.SessionTimeoutTest in thread 7
[junit] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
1.999 sec, Thread: 7, Class: org.apache.zookeeper.test.SessionTimeoutTest
[junit] Running org.apache.zookeeper.test.SessionTrackerCheckTest in thread 
7
[junit] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
0.082 sec, Thread: 7, Class: org.apache.zookeeper.test.SessionTrackerCheckTest
[junit] Running org.apache.zookeeper.test.SessionUpgradeTest in thread 7
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
247.364 sec, Thread: 8, Class: org.apache.zookeeper.test.RecoveryTest
[junit] Running org.apache.zookeeper.test.StandaloneTest in thread 8
[junit] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
13.576 sec, Thread: 3, Class: org.apache.zookeeper.test.SessionTest
[junit] Running org.apache.zookeeper.test.StatTest in thread 3
[junit] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
2.616 sec, Thread: 8, Class: org.apache.zookeeper.test.StandaloneTest
[junit] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
1.894 sec, Thread: 3, Class: org.apache.zookeeper.test.StatTest
[junit] Running org.apache.zookeeper.test.StaticHostProviderTest in thread 8
[junit] Running org.apache.zookeeper.test.StringUtilTest in thread 3
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
0.119 sec, Thread: 3, Class: org.apache.zookeeper.test.StringUtilTest
[junit] Running org.apache.zookeeper.test.SyncCallTest in thread 3
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
0.95 sec, Thread: 3, Class: org.apache.zookeeper.test.SyncCallTest
[junit] Tests run: 13, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
2.342 sec, Thread: 8, Class: org.apache.zookeeper.test.StaticHostProviderTest
[junit] Running org.apache.zookeeper.test.TruncateTest in thread 8
[junit] Running org.apache.zookeeper.test.WatchEventWhenAutoResetTest in 
thread 3
[junit] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
19.902 sec, Thread: 7, Class: org.apache.zookeeper.test.SessionUpgradeTest
[junit] Running org.apache.zookeeper.test.WatchedEventTest in thread 7
[junit] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
0.099 sec, Thread: 7, Class: org.apache.zookeeper.test.WatchedEventTest
[junit] Running org.apache.zookeeper.test.WatcherFuncTest in thread 7
[junit] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
4.35 sec, Thread: 7, Class: org.apache.zookeeper.test.WatcherFuncTest
[junit] Running org.apache.zookeeper.test.WatcherTest in thread 7
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
194.382 sec, Thread: 1, Class: org.apache.zookeeper.test.RestoreCommittedLogTest
[junit] Running org.apache.zookeeper.test.X509AuthTest in thread 1
[junit] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
0.147 sec, Thread: 1, Class: org.apache.zookeeper.test.X509AuthTest
[junit] Running org.apache.zookeeper.test.ZkDatabaseCorruptionTest in 
thread 1
[junit] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
19.956 sec, Thread: 3, Class: 
org.apache.zookeeper.test.WatchEventWhenAutoResetTest
[junit] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
20.435 sec, Thread: 8, Class: org.apache.zookeeper.test.TruncateTest
[junit] Running org.apache.zookeeper.test.ZooKeeperQuotaTest in thread 3
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
4.257 sec, Thread: 3, Class: org.apache.zookeeper.test.ZooKeeperQuotaTest
[junit] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
18.649 sec, Thread: 1, Class: org.apache.zookeeper.test.ZkDatabaseCorruptionTest

[GitHub] zookeeper issue #150: ZOOKEEPER-2184: Zookeeper Client should re-resolve hos...

2018-05-12 Thread tophei
Github user tophei commented on the issue:

https://github.com/apache/zookeeper/pull/150
  
@glasser Thanks for the sharing. Yeah, a static ip indeed would help work 
around the issue. However, assign static ip to service in k8s seems to be a bit 
heavy and less flexible when you need to setup massive of clusters.


---


[GitHub] zookeeper issue #150: ZOOKEEPER-2184: Zookeeper Client should re-resolve hos...

2018-05-12 Thread tophei
Github user tophei commented on the issue:

https://github.com/apache/zookeeper/pull/150
  
@anmolnar Thanks for the comment. Actually I was trying to patch 451 into 
release tag 3.4.10 too , but seems it has much conflicts. By the way, may patch 
451 have additional changes that may have related fix to above connection 
reading issue compared with patch 150?


---