[ 
https://issues.apache.org/jira/browse/TEZ-1752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493493#comment-14493493
 ] 

Siddharth Seth commented on TEZ-1752:
-------------------------------------

[~rajesh.balamohan] - thanks for taking this up. Haven't looked at the entire 
patch yet (mostly skipped bits relating to cleanup).

- Re-setting the interrupt status in most places looks good. It probably needs 
to be set in a few more places, and some parts could throw an 
InterruptedException itself.

- In LogicalIOProcessorRuntimeTask - it would be useful to log the interrupt 
status between each close invocation, and potentially set it if the I/O/P being 
closed ends up unsetting it.
- cleanup would behave differently if initialize hasn't been invoked. We may 
need to track which I O Ps have been initialized - and close just those - In 
the MergeManager, InterruptedException thrown by MergeThraed.close likely needs 
to be handled (otherwise it'll end up skipping cleanup?)
- In the invocation of finalMerge - an IOException is caught, are there 
specific cases here where this IO exception is actually masking an interrupt ? 
(and as a result the interrupt status needs to be set)
- The TezMerger change - should we just change the interface to throw 
InterruptedException, instead of setting the flag. That's a private method, and 
will force consumers within the IOs to handle it.
- UnorderedPartitionedKVWriter / others - in the close method, instead of 
returning an empty event list - should this just throw an InterruptedException 
back ?
- Is the change in the TaskReporter required ? taskFailed shouldn't be invoked 
after the currentTask has been unregistered.

We likely need to ensure that cleanup / close methods aren't called twice - 
once during regular cleanup, second during an interrupt while the cleanup is in 
progress.

Not directly related to interrupts - but an invocation on Task.close() (regular 
flow) can cause exceptions during Processor close or Input / Output close - 
which would prevent subsequent Inputs / Outputs from being closed. Do we need 
to make sure that close() gets invoked on subsequent Inputs / Outputs despite a 
prior exception ?

Eventually, We should consider changing the API on Readers / Writers to throw 
an InterruptedException rather than relying on IOException - so that Processors 
can find out if an Input / Output gets interrupted - rather than ending up with 
an error. That'll likely be 0.8 though.

Was talking to [~hitesh] offline about this for a bit - for this to work with 
regular containers, a lookup on the Inputs / Outputs / Processors to find out 
whether they are actually interruptable would be useful. Either via an 
annotation or via an explicit API lookup (input.isInterrutable). I think this 
can be a separate follow up jira though.



> Inputs / Outputs in the Runtime library should be interruptable
> ---------------------------------------------------------------
>
>                 Key: TEZ-1752
>                 URL: https://issues.apache.org/jira/browse/TEZ-1752
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Siddharth Seth
>         Attachments: TEZ-1752.1.patch
>
>
> Not possible to preempt tasks without killing containers without this.
> There's still the problem of Processors not supporting interrupts. We may 
> need API enhancements to either query IPOs on whether they are interrupbtible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to