[ 
https://issues.apache.org/jira/browse/TEZ-1909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14361426#comment-14361426
 ] 

Hitesh Shah commented on TEZ-1909:
----------------------------------

Comments:

- is there a reason why "Set<String> getDagIDs()" is needed in DAGAppMaster?
- the "if (skipAllOtherEvents) {" check is probably also needed at the top of 
the file loop, to prevent further files from being opened and read (in 
addition to short-circuiting the read of the remaining events in the current 
file). Maybe just log a message that other files were present and skipped.
- I do not see TEZ_AM_RECOVERY_HANDLE_REMAINING_EVENT_WHEN_STOPPED being used 
anywhere apart from being set to true in one of the tests.
- please replace "import com.sun.tools.javac.util.List;" with "import java.util.List;"
- testCorruptedLastRecord should also verify that the DAG submitted event was 
seen.
- also, we should add a test that writes corrupt data to the summary stream 
and ensures that processing it fails.
- there may not be a need to add "getDAGNames()". Instead, you can just use 
"dagAppMaster.dagNames.add(dagSummaryData.dagName);" as dagNames should be 
package-private.
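
To illustrate the second comment above, here is a minimal sketch of checking 
the flag at the top of the file loop as well as inside the event loop. Only 
the flag name skipAllOtherEvents comes from the patch under review; the class 
RecoverySkipSketch, the RecoveryFile and ReplayResult types, the replay() 
method and the stopEvent parameter are hypothetical stand-ins, and recovery 
files are modelled as in-memory lists rather than real streams. Whether the 
event that triggers the stop should itself be processed is also an assumption.

```java
import java.util.ArrayList;
import java.util.List;

public class RecoverySkipSketch {

    /** Hypothetical stand-in for a recovery file: a name plus its events in order. */
    static final class RecoveryFile {
        final String name;
        final List<String> events;

        RecoveryFile(String name, List<String> events) {
            this.name = name;
            this.events = events;
        }
    }

    /** Events that were processed, and files that were never opened. */
    static final class ReplayResult {
        final List<String> processedEvents = new ArrayList<>();
        final List<String> skippedFiles = new ArrayList<>();
    }

    static ReplayResult replay(List<RecoveryFile> files, String stopEvent) {
        ReplayResult result = new ReplayResult();
        boolean skipAllOtherEvents = false;

        for (RecoveryFile file : files) {
            // Check at the top of the file loop so later files are never opened.
            if (skipAllOtherEvents) {
                result.skippedFiles.add(file.name);
                continue;
            }
            for (String event : file.events) {
                result.processedEvents.add(event);
                if (event.equals(stopEvent)) {
                    // Short-circuit the remaining events in the current file.
                    skipAllOtherEvents = true;
                    break;
                }
            }
        }

        // Log once that other files were present and skipped.
        if (!result.skippedFiles.isEmpty()) {
            System.out.println("Skipped recovery files: " + result.skippedFiles);
        }
        return result;
    }
}
```

With only the inner check, the second and later files would still be opened 
and have their first event read before the flag took effect; the outer check 
avoids that and gives one place to log the skipped files.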




 

> Remove need to copy over all events from attempt 1 to attempt 2 dir
> -------------------------------------------------------------------
>
>                 Key: TEZ-1909
>                 URL: https://issues.apache.org/jira/browse/TEZ-1909
>             Project: Apache Tez
>          Issue Type: Sub-task
>            Reporter: Hitesh Shah
>            Assignee: Jeff Zhang
>         Attachments: TEZ-1909-1.patch, TEZ-1909-2.patch
>
>
> Use of file versions should prevent the need for copying over data into a 
> second attempt dir. Care needs to be taken to handle "last corrupt record" 
> handling. 


