[jira] [Updated] (TEZ-3437) Improve synchronization and the progress report behavior for Inputs from TEZ-3317
[ https://issues.apache.org/jira/browse/TEZ-3437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated TEZ-3437: Fix Version/s: 0.8.5 0.7.2 > Improve synchronization and the progress report behavior for Inputs from > TEZ-3317 > - > > Key: TEZ-3437 > URL: https://issues.apache.org/jira/browse/TEZ-3437 > Project: Apache Tez > Issue Type: Bug >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla > Fix For: 0.7.2, 0.9.0, 0.8.5 > > Attachments: TEZ-3437.001.patch, TEZ-3437.002.patch, > TEZ-3437.003.patch, TEZ-3437.004.patch, TEZ-3437.005.patch, > TEZ-3437.006.patch, TEZ-3437.007.patch, TEZ-3437.008.patch, > TEZ-3437.009.patch, TEZ-3437.branch-0.7.001.patch, > TEZ-3437.branch-0.8.001.patch > > > Follow up from TEZ-3317 to improve the getProgress thread synchronization and > replace timerTasks with ScheduledExecutorService. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-3437) Improve synchronization and the progress report behavior for Inputs from TEZ-3317
[ https://issues.apache.org/jira/browse/TEZ-3437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kuhu Shukla updated TEZ-3437: - Attachment: TEZ-3437.branch-0.8.001.patch TEZ-3437.branch-0.7.001.patch Attaching branch-0.8, 0.7 versions of the patch. Let me know if you have any additional comments. Thanks! These changes are on top of version patches on TEZ-3317, which need to go in first. > Improve synchronization and the progress report behavior for Inputs from > TEZ-3317 > - > > Key: TEZ-3437 > URL: https://issues.apache.org/jira/browse/TEZ-3437 > Project: Apache Tez > Issue Type: Bug >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla > Fix For: 0.9.0 > > Attachments: TEZ-3437.001.patch, TEZ-3437.002.patch, > TEZ-3437.003.patch, TEZ-3437.004.patch, TEZ-3437.005.patch, > TEZ-3437.006.patch, TEZ-3437.007.patch, TEZ-3437.008.patch, > TEZ-3437.009.patch, TEZ-3437.branch-0.7.001.patch, > TEZ-3437.branch-0.8.001.patch > > > Follow up from TEZ-3317 to improve the getProgress thread synchronization and > replace timerTasks with ScheduledExecutorService. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-3437) Improve synchronization and the progress report behavior for Inputs from TEZ-3317
[ https://issues.apache.org/jira/browse/TEZ-3437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kuhu Shukla updated TEZ-3437: - Attachment: TEZ-3437.009.patch Thank you [~sseth], sorry I missed that change. Updated patch addressing comments above, also added IOException and InterruptedException to {{getProgress}} for readers (which was added in TEZ-3317), now modeled after {{MRReader}}. Let me know if this seems reasonable since some of the readers' {{getProgress}} call would not actually throw these exceptions, for example : {{ConcatenatedMergedKeyValuesReader#getProgress}}. If the Input uses the reader's progress, an IOException is thrown as ProgressFailedException. An InterruptedException is thrown as-is. Includes null check for {{scheduledExecutorService}} which is now volatile in ProgressHelper. Appreciate more comments/review. > Improve synchronization and the progress report behavior for Inputs from > TEZ-3317 > - > > Key: TEZ-3437 > URL: https://issues.apache.org/jira/browse/TEZ-3437 > Project: Apache Tez > Issue Type: Bug >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla > Attachments: TEZ-3437.001.patch, TEZ-3437.002.patch, > TEZ-3437.003.patch, TEZ-3437.004.patch, TEZ-3437.005.patch, > TEZ-3437.006.patch, TEZ-3437.007.patch, TEZ-3437.008.patch, TEZ-3437.009.patch > > > Follow up from TEZ-3317 to improve the getProgress thread synchronization and > replace timerTasks with ScheduledExecutorService. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-3437) Improve synchronization and the progress report behavior for Inputs from TEZ-3317
[ https://issues.apache.org/jira/browse/TEZ-3437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kuhu Shukla updated TEZ-3437: - Attachment: TEZ-3437.008.patch Addressed comments from [~sseth] but left {{setProgress}} as it was in the previous iteration from last patch. Let me know if we want to decouple that. Added Interrupted Exception to getProgress in Inputs. Also removed {{throws}} clause from getProgress of {{UnorderedKVReader}} as none of the operations tend to throw those exceptions. Added {{processorName}} to ProgressHelper. It is used with VertexName for the threadName initialization. Instantiated progressHelper for FilterByWordInputProcessor. Also wrapped {{progressHelper.shutDownProgressTaskService}} with null check. > Improve synchronization and the progress report behavior for Inputs from > TEZ-3317 > - > > Key: TEZ-3437 > URL: https://issues.apache.org/jira/browse/TEZ-3437 > Project: Apache Tez > Issue Type: Bug >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla > Attachments: TEZ-3437.001.patch, TEZ-3437.002.patch, > TEZ-3437.003.patch, TEZ-3437.004.patch, TEZ-3437.005.patch, > TEZ-3437.006.patch, TEZ-3437.007.patch, TEZ-3437.008.patch > > > Follow up from TEZ-3317 to improve the getProgress thread synchronization and > replace timerTasks with ScheduledExecutorService. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-3437) Improve synchronization and the progress report behavior for Inputs from TEZ-3317
[ https://issues.apache.org/jira/browse/TEZ-3437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kuhu Shukla updated TEZ-3437: - Attachment: TEZ-3437.007.patch Thank you so much [~sseth] for the quick and detailed comments. I have added the setProgress check with 0.001f epsilon. Is that a good granularity? Also, for adding InterruptedException to the interface, I added it to the getProgress call in AbstractLogicalInput/MergedLogicalInput. Is that what you meant? Also throwing a runtimeException would bubble up and may cause failures if the progress update fails. Is that ok to have? Appreciate more comments. Thanks! > Improve synchronization and the progress report behavior for Inputs from > TEZ-3317 > - > > Key: TEZ-3437 > URL: https://issues.apache.org/jira/browse/TEZ-3437 > Project: Apache Tez > Issue Type: Bug >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla > Attachments: TEZ-3437.001.patch, TEZ-3437.002.patch, > TEZ-3437.003.patch, TEZ-3437.004.patch, TEZ-3437.005.patch, > TEZ-3437.006.patch, TEZ-3437.007.patch > > > Follow up from TEZ-3317 to improve the getProgress thread synchronization and > replace timerTasks with ScheduledExecutorService. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-3437) Improve synchronization and the progress report behavior for Inputs from TEZ-3317
[ https://issues.apache.org/jira/browse/TEZ-3437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Eagles updated TEZ-3437: - Summary: Improve synchronization and the progress report behavior for Inputs from TEZ-3317 (was: hhjerekretudubjdkbktjvgjrd) > Improve synchronization and the progress report behavior for Inputs from > TEZ-3317 > - > > Key: TEZ-3437 > URL: https://issues.apache.org/jira/browse/TEZ-3437 > Project: Apache Tez > Issue Type: Bug >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla > Attachments: TEZ-3437.001.patch, TEZ-3437.002.patch, > TEZ-3437.003.patch, TEZ-3437.004.patch, TEZ-3437.005.patch, TEZ-3437.006.patch > > > Follow up from TEZ-3317 to improve the getProgress thread synchronization and > replace timerTasks with ScheduledExecutorService. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-3437) Improve synchronization and the progress report behavior for Inputs from TEZ-3317
[ https://issues.apache.org/jira/browse/TEZ-3437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kuhu Shukla updated TEZ-3437: - Attachment: TEZ-3437.006.patch Cleaned up MapProcessor to use the default progressUpdateThread. > Improve synchronization and the progress report behavior for Inputs from > TEZ-3317 > - > > Key: TEZ-3437 > URL: https://issues.apache.org/jira/browse/TEZ-3437 > Project: Apache Tez > Issue Type: Bug >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla > Attachments: TEZ-3437.001.patch, TEZ-3437.002.patch, > TEZ-3437.003.patch, TEZ-3437.004.patch, TEZ-3437.005.patch, TEZ-3437.006.patch > > > Follow up from TEZ-3317 to improve the getProgress thread synchronization and > replace timerTasks with ScheduledExecutorService. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-3437) Improve synchronization and the progress report behavior for Inputs from TEZ-3317
[ https://issues.apache.org/jira/browse/TEZ-3437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kuhu Shukla updated TEZ-3437: - Attachment: TEZ-3437.005.patch Added new Exception {{ProgressFailedException}} for getProgress. > Improve synchronization and the progress report behavior for Inputs from > TEZ-3317 > - > > Key: TEZ-3437 > URL: https://issues.apache.org/jira/browse/TEZ-3437 > Project: Apache Tez > Issue Type: Bug >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla > Attachments: TEZ-3437.001.patch, TEZ-3437.002.patch, > TEZ-3437.003.patch, TEZ-3437.004.patch, TEZ-3437.005.patch > > > Follow up from TEZ-3317 to improve the getProgress thread synchronization and > replace timerTasks with ScheduledExecutorService. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-3437) Improve synchronization and the progress report behavior for Inputs from TEZ-3317
[ https://issues.apache.org/jira/browse/TEZ-3437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kuhu Shukla updated TEZ-3437: - Attachment: TEZ-3437.004.patch Addressed some of the review comments, the following one still remains. {quote} Individual inputs don't need to throw the Exception. The question was really around why IOException and InterruptedException. I'd change this to either Exception - similar to the other methods (but not very useful), or InterruptedException + a TezCheckedException or a custom ProgressFailedException. Anyone else has thoughts on this? Processors may want to handle this, hence a checked Exception. {quote} > Improve synchronization and the progress report behavior for Inputs from > TEZ-3317 > - > > Key: TEZ-3437 > URL: https://issues.apache.org/jira/browse/TEZ-3437 > Project: Apache Tez > Issue Type: Bug >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla > Attachments: TEZ-3437.001.patch, TEZ-3437.002.patch, > TEZ-3437.003.patch, TEZ-3437.004.patch > > > Follow up from TEZ-3317 to improve the getProgress thread synchronization and > replace timerTasks with ScheduledExecutorService. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-3437) Improve synchronization and the progress report behavior for Inputs from TEZ-3317
[ https://issues.apache.org/jira/browse/TEZ-3437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kuhu Shukla updated TEZ-3437: - Attachment: TEZ-3437.003.patch Fixing some more findbugs. > Improve synchronization and the progress report behavior for Inputs from > TEZ-3317 > - > > Key: TEZ-3437 > URL: https://issues.apache.org/jira/browse/TEZ-3437 > Project: Apache Tez > Issue Type: Bug >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla > Attachments: TEZ-3437.001.patch, TEZ-3437.002.patch, > TEZ-3437.003.patch > > > Follow up from TEZ-3317 to improve the getProgress thread synchronization and > replace timerTasks with ScheduledExecutorService. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-3437) Improve synchronization and the progress report behavior for Inputs from TEZ-3317
[ https://issues.apache.org/jira/browse/TEZ-3437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kuhu Shukla updated TEZ-3437: - Attachment: TEZ-3437.002.patch Updated patch to fix find-bugs warnings and make progressHelper in Processors private. > Improve synchronization and the progress report behavior for Inputs from > TEZ-3317 > - > > Key: TEZ-3437 > URL: https://issues.apache.org/jira/browse/TEZ-3437 > Project: Apache Tez > Issue Type: Bug >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla > Attachments: TEZ-3437.001.patch, TEZ-3437.002.patch > > > Follow up from TEZ-3317 to improve the getProgress thread synchronization and > replace timerTasks with ScheduledExecutorService. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-3437) Improve synchronization and the progress report behavior for Inputs from TEZ-3317
[ https://issues.apache.org/jira/browse/TEZ-3437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kuhu Shukla updated TEZ-3437: - Attachment: TEZ-3437.001.patch > Improve synchronization and the progress report behavior for Inputs from > TEZ-3317 > - > > Key: TEZ-3437 > URL: https://issues.apache.org/jira/browse/TEZ-3437 > Project: Apache Tez > Issue Type: Bug >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla > Attachments: TEZ-3437.001.patch > > > Follow up from TEZ-3317 to improve the getProgress thread synchronization and > replace timerTasks with ScheduledExecutorService. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-3437) Improve synchronization and the progress report behavior for Inputs from TEZ-3317
[ https://issues.apache.org/jira/browse/TEZ-3437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kuhu Shukla updated TEZ-3437: - Attachment: (was: TEZ-3437.001.patch) > Improve synchronization and the progress report behavior for Inputs from > TEZ-3317 > - > > Key: TEZ-3437 > URL: https://issues.apache.org/jira/browse/TEZ-3437 > Project: Apache Tez > Issue Type: Bug >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla > > Follow up from TEZ-3317 to improve the getProgress thread synchronization and > replace timerTasks with ScheduledExecutorService. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-3437) Improve synchronization and the progress report behavior for Inputs from TEZ-3317
[ https://issues.apache.org/jira/browse/TEZ-3437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kuhu Shukla updated TEZ-3437: - Attachment: TEZ-3437.001.patch Uploading first version of the patch addressing comments from TEZ-3317 inline below. [~sseth], appreciate more comments/review on the patch and questions below. Thanks a lot! bq. Progress in case of 0 inputs. Not sure if this is handled (suspect it's returning 0.0f) Yes. For 0 inputs how should the progress be reported? bq. TezProcessorContextImpl - != comparison on a float. Will this work as expected? Fixed. bq. getProgress on the processor throws an IOException and InterruptedException. Why is IOException special cased? Fixed method signatures to be consistent. Let me know if I am missing something here. bq. The Timer threads that are setup don't have any kind of synchronization. There's no synchronization on the processors when accessing the progress either. Think visibility of variables within the timed thread that reports progress can be a problem. e.g. totalBytesRead and totalFileBytes are updated by the worker thread, but read by the thread started by the timer in UnorderedKVReader My assumption here is that totalBytesRead and totalFileBytes will be used as and when available. As a background thread for progress update, should it matter that the read and writes to progress value is not synchronized? Appreciate some more comments on how we can make this better and what would you do to fix this. bq.Does the time interval in the MapProcessor (100ms at the moment) need to be linked to any other config property / made configurable? Since some processors dont have the conf objects, should we make a more elaborate change for the processors to pick up conf as well(add conf as a member maybe?). bq. [Nit] UnorderedKVReader - return 0l instead of 0.0f? Fixed. bq. TestMapProcessor - generateInputSplit, writeSplitFiles. This looks very similar to what exists in MapUtils. Can that be re-used? Fixed with changes to MapUtils method signature to read the number of KV pairs to write. bq. ShuffleManager.getNumCompletedInputs. Expose the int value, instead of exposing the internal AtomicInteger Fixed to return float value straight up. bq. The Processors seem to share code for reporting timed progress. Can this be moved into a common base, or a helper? Fixed. Appreciate more comments. bq. Specific reason to use a Timer? Otherwise can we move to a ScheduledExecutorService? Fixed. > Improve synchronization and the progress report behavior for Inputs from > TEZ-3317 > - > > Key: TEZ-3437 > URL: https://issues.apache.org/jira/browse/TEZ-3437 > Project: Apache Tez > Issue Type: Bug >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla > > Follow up from TEZ-3317 to improve the getProgress thread synchronization and > replace timerTasks with ScheduledExecutorService. -- This message was sent by Atlassian JIRA (v6.3.4#6332)