> On Nov. 30, 2017, 9:33 a.m., András Piros wrote:
> > core/src/main/java/org/apache/oozie/service/ZKLocksService.java
> > Lines 239 (patched)
> > <https://reviews.apache.org/r/63739/diff/2/?file=1901904#file1901904line239>
> >
> >     Yep, blocking a thread for 30-ish minutes doesn't seem like a good 
> > idea. E.g. ZooKeeper outage lasts for only a few seconds under heavy load - 
> > it can happen that Oozie JVM ends up blocking lots of `Thread`s in 
> > `TIMED_WAITING` state.
> >     
> >     Can you think of some, more elaborate solution? Maybe having an 
> > `ExecutorService` just to try to unlock, and `Callable` instances that 
> > would try unlocking. In that case, the number of `Thread`s blocked in 
> > `TIMED_WAITING` state would be:
> >     * configurable
> >     * limited to a manageable extent
> 
> Rohini Palaniswamy wrote:
>     There are couple of cases where ZK outages require Ops team restarting 
> the quorum. From finding the issue to having it restarted, 30 mins would 
> easily pass. Instead of exponential retry, the retry should be every 10 
> seconds. Retrying after 2,4,8 and 16 minutes is waste of time and blocking 
> thread unnecessarily as you mentioned if ZK is back up. Otherwise it does not 
> matter if retry continues for 30 mins or 1 hour (NN, RM retry time limit is 
> also about the same). Everything in Oozie is going to be totally blocked if 
> ZK is down as you can't acquire locks for any command. Having higher retry 
> time avoids having to restart Oozie.

If ZooKeeper outages lasts for only a few seconds then the thread won't be 
blocked for 30 minutes. When it wakes up from sleep, it will release the lock 
successfully.


> On Nov. 30, 2017, 9:33 a.m., András Piros wrote:
> > core/src/test/java/org/apache/oozie/service/TestZKLocksService.java
> > Lines 563 (patched)
> > <https://reviews.apache.org/r/63739/diff/2/?file=1901906#file1901906line563>
> >
> >     What about `Services.get().get(ZKLocksService.class)`?

Similar to other test cases in this class because we are not initializing 
ZKLocksService in setup method.


> On Nov. 30, 2017, 9:33 a.m., András Piros wrote:
> > core/src/test/java/org/apache/oozie/service/TestZKLocksService.java
> > Lines 567-574 (patched)
> > <https://reviews.apache.org/r/63739/diff/2/?file=1901906#file1901906line567>
> >
> >     Extract method `setupSilentLock()`.

Overkill, dropping as Rohini suggested below.


- Satish


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63739/#review192309
-----------------------------------------------------------


On Nov. 27, 2017, 11:22 a.m., Satish Saley wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63739/
> -----------------------------------------------------------
> 
> (Updated Nov. 27, 2017, 11:22 a.m.)
> 
> 
> Review request for oozie.
> 
> 
> Bugs: OOZIE-3113
>     https://issues.apache.org/jira/browse/OOZIE-3113
> 
> 
> Repository: oozie-git
> 
> 
> Description
> -------
> 
> OOZIE-3113 Retry for ZK lock release
> 
> 
> Diffs
> -----
> 
>   core/src/main/java/org/apache/oozie/service/ZKLocksService.java 2c71c001f 
>   core/src/main/resources/oozie-default.xml 8285df0a7 
>   core/src/test/java/org/apache/oozie/service/TestZKLocksService.java 
> d04f04e80 
> 
> 
> Diff: https://reviews.apache.org/r/63739/diff/2/
> 
> 
> Testing
> -------
> 
> Tested locally
> 
> 
> Thanks,
> 
> Satish Saley
> 
>

Reply via email to