Kuhu Shukla commented on TEZ-3437:

Thanks a lot for the review [~sseth]. 
Earlier, this timeout handling would rely upon the processor explicitly setting 
progress - which would typically be done after n rows. That's no longer the 
case. Even if the processor keeps reporting a progress of 0.0f for a long time 
(while it waits, for instance), it'll end up timing out.
With the check completely removed, we have a thread running which reports 
progress, and that makes the processor's 'report after n entries' fairly useless.

Since we already have the lazySet for the AtomicBoolean for progress update, 
how about we just do the float value set through the processor thread 
invocation and let {{notifyProgress()}} and 
{{ProcessorContext#setProgress}} remain as they are? I can add a getter for 
{{runtimeTask}} in {{TezTaskContextImpl}} and call 
{{LogicalIOProcessorRuntimeTask#setProgress()}} directly in the 
{{ProgressHelper#monitorProgress}} thread. Thoughts?
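
To make the proposal concrete, here is a rough sketch of one possible reading of it. It is not taken from any of the attached patches: {{SketchRuntimeTask}} and {{SketchProgressHelper}} are hypothetical stand-ins for {{LogicalIOProcessorRuntimeTask}} and {{ProgressHelper}}, and the method names {{monitorOnce()}} and {{getAndClearProgressNotification()}} are assumptions made for illustration. The idea shown is that the monitor thread only stores the float value, while {{notifyProgress()}} keeps flipping the AtomicBoolean via lazySet, so the timeout check still depends on the processor itself.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

// Hypothetical stand-in for LogicalIOProcessorRuntimeTask; NOT the Tez class.
class SketchRuntimeTask {
  // Written by the monitor thread, read by the heartbeat thread.
  private volatile float progress;
  // Set by the processor to signal liveness; consulted by the timeout check.
  private final AtomicBoolean progressNotified = new AtomicBoolean(false);

  // Stores the progress value only; deliberately does not touch the flag.
  void setProgress(float value) {
    progress = value;
  }

  // Called by the processor; lazySet avoids a full fence on the hot path.
  void notifyProgress() {
    progressNotified.lazySet(true);
  }

  float getProgress() {
    return progress;
  }

  // Heartbeat side: returns whether the processor signalled since last call.
  boolean getAndClearProgressNotification() {
    return progressNotified.getAndSet(false);
  }
}

// Hypothetical stand-in for ProgressHelper; NOT the Tez class.
class SketchProgressHelper {
  private final SketchRuntimeTask task;
  private final Supplier<Float> inputProgress;
  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor(r -> {
        Thread t = new Thread(r, "progress-monitor");
        t.setDaemon(true);
        return t;
      });

  SketchProgressHelper(SketchRuntimeTask task, Supplier<Float> inputProgress) {
    this.task = task;
    this.inputProgress = inputProgress;
  }

  // One monitoring pass: copy the computed progress onto the task directly,
  // without going through notifyProgress().
  void monitorOnce() {
    task.setProgress(inputProgress.get());
  }

  // ScheduledExecutorService in place of a TimerTask, as the issue suggests.
  void start(long periodMillis) {
    scheduler.scheduleWithFixedDelay(
        this::monitorOnce, 0, periodMillis, TimeUnit.MILLISECONDS);
  }

  void shutdown() {
    scheduler.shutdownNow();
  }
}
```

With this split, a processor that is stuck never calls {{notifyProgress()}}, so the flag stays false and the timeout can still fire even though the monitor thread keeps updating the float value.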

I will address other comments in the upcoming patch.

> Improve synchronization and the progress report behavior for Inputs from 
> TEZ-3317
> ---------------------------------------------------------------------------------
>                 Key: TEZ-3437
>                 URL: https://issues.apache.org/jira/browse/TEZ-3437
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Kuhu Shukla
>            Assignee: Kuhu Shukla
>         Attachments: TEZ-3437.001.patch, TEZ-3437.002.patch, 
> TEZ-3437.003.patch
> Follow up from TEZ-3317 to improve the getProgress thread synchronization and 
> replace timerTasks with ScheduledExecutorService. 

This message was sent by Atlassian JIRA