abdullah alamoudi has posted comments on this change. Change subject: ASTERIXDB-1642,ASTERIXDB-1657,ASTERIXDB-1658 Fix Task Failure Handling ......................................................................
Patch Set 6: (7 comments) https://asterix-gerrit.ics.uci.edu/#/c/1197/6/hyracks-fullstack/hyracks/hyracks-control/hyracks-control-nc/src/main/java/org/apache/hyracks/control/nc/Joblet.java File hyracks-fullstack/hyracks/hyracks-control/hyracks-control-nc/src/main/java/org/apache/hyracks/control/nc/Joblet.java: Line 350: taskInitializationThreads.interrupt(); > why do we need to interrupt the work queue thread? Maybe I misunderstand t It is because sometimes the task initialization blocks and we need to notify it somehow to return. otherwise, the nc hangs. https://asterix-gerrit.ics.uci.edu/#/c/1197/6/hyracks-fullstack/hyracks/hyracks-control/hyracks-control-nc/src/main/java/org/apache/hyracks/control/nc/work/AbortTasksWork.java File hyracks-fullstack/hyracks/hyracks-control/hyracks-control-nc/src/main/java/org/apache/hyracks/control/nc/work/AbortTasksWork.java: Line 53: } > +1 -1 Line 53: } > Why do this in the constructor? Isn't that risky? because if the task initialization is blocking the abort will never get a chance to run. this way, we ensure the task initialization aborts since the constructor is called from another thread. https://asterix-gerrit.ics.uci.edu/#/c/1197/6/hyracks-fullstack/hyracks/hyracks-control/hyracks-control-nc/src/main/java/org/apache/hyracks/control/nc/work/StartTasksWork.java File hyracks-fullstack/hyracks/hyracks-control/hyracks-control-nc/src/main/java/org/apache/hyracks/control/nc/work/StartTasksWork.java: Line 110: joblet.setTaskInitializationThreads(Thread.currentThread()); > is Thread.currentThread() the single work queue thread? mmm, seems like it is Line 184: LOGGER.log(Level.SEVERE, "Failure acquiring a joblet. Task has most likely been aborted", e); > Is an aborted job always something SEVERE, or can it be an expected outcome I believe it is always severe. should not happen unless something is seriously wrong. Line 205: throw new NullPointerException("Joblet was not found. This job was most likely aborted!!"); > what else could this be other than aborted? anything other than a bug? It is either an abort or a bug! Line 263: { > why newline? stupid formatter -- To view, visit https://asterix-gerrit.ics.uci.edu/1197 To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings Gerrit-MessageType: comment Gerrit-Change-Id: I2ec2c798b704ca426d5937f22e6d2bd394a9095a Gerrit-PatchSet: 6 Gerrit-Project: asterixdb Gerrit-Branch: master Gerrit-Owner: abdullah alamoudi <[email protected]> Gerrit-Reviewer: Jenkins <[email protected]> Gerrit-Reviewer: Michael Blow <[email protected]> Gerrit-Reviewer: Till Westmann <[email protected]> Gerrit-Reviewer: Yingyi Bu <[email protected]> Gerrit-Reviewer: abdullah alamoudi <[email protected]> Gerrit-HasComments: Yes
