> On Nov. 30, 2017, 5:33 p.m., András Piros wrote:
> > core/src/main/java/org/apache/oozie/service/ZKLocksService.java
> > Lines 239 (patched)
> > <https://reviews.apache.org/r/63739/diff/2/?file=1901904#file1901904line239>
> >
> >     Yep, blocking a thread for 30-ish minutes doesn't seem like a good 
> > idea. E.g. ZooKeeper outage lasts for only a few seconds under heavy load - 
> > it can happen that Oozie JVM ends up blocking lots of `Thread`s in 
> > `TIMED_WAITING` state.
> >     
> >     Can you think of some, more elaborate solution? Maybe having an 
> > `ExecutorService` just to try to unlock, and `Callable` instances that 
> > would try unlocking. In that case, the number of `Thread`s blocked in 
> > `TIMED_WAITING` state would be:
> >     * configurable
> >     * limited to a manageable extent

There are couple of cases where ZK outages require Ops team restarting the 
quorum. From finding the issue to having it restarted, 30 mins would easily 
pass. Instead of exponential retry, the retry should be every 10 seconds. 
Retrying after 2,4,8 and 16 minutes is waste of time and blocking thread 
unnecessarily as you mentioned if ZK is back up. Otherwise it does not matter 
if retry continues for 30 mins or 1 hour (NN, RM retry time limit is also about 
the same). Everything in Oozie is going to be totally blocked if ZK is down as 
you can't acquire locks for any command. Having higher retry time avoids having 
to restart Oozie.


> On Nov. 30, 2017, 5:33 p.m., András Piros wrote:
> > core/src/test/java/org/apache/oozie/service/TestZKLocksService.java
> > Lines 579-598 (patched)
> > <https://reviews.apache.org/r/63739/diff/2/?file=1901906#file1901906line579>
> >
> >     Extract nested class w/ a catchy name.

Prefer to disagree on this one. I think it is a overkill to extract into one 
method and one class for 20 lines of test code. Unless the method is being 
reused or it is a big chunk of code, it does not make sense to extract into 
methods and that too in a test class.

More methods + more classes != Clean code. It kills readability as well when 
one has to unnecessarily jump to multiple places to finish reading logic of one 
method.


- Rohini


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63739/#review192309
-----------------------------------------------------------


On Nov. 27, 2017, 7:22 p.m., Satish Saley wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63739/
> -----------------------------------------------------------
> 
> (Updated Nov. 27, 2017, 7:22 p.m.)
> 
> 
> Review request for oozie.
> 
> 
> Bugs: OOZIE-3113
>     https://issues.apache.org/jira/browse/OOZIE-3113
> 
> 
> Repository: oozie-git
> 
> 
> Description
> -------
> 
> OOZIE-3113 Retry for ZK lock release
> 
> 
> Diffs
> -----
> 
>   core/src/main/java/org/apache/oozie/service/ZKLocksService.java 2c71c001f 
>   core/src/main/resources/oozie-default.xml 8285df0a7 
>   core/src/test/java/org/apache/oozie/service/TestZKLocksService.java 
> d04f04e80 
> 
> 
> Diff: https://reviews.apache.org/r/63739/diff/2/
> 
> 
> Testing
> -------
> 
> Tested locally
> 
> 
> Thanks,
> 
> Satish Saley
> 
>

Reply via email to