[ https://issues.apache.org/jira/browse/TEZ-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14134200#comment-14134200 ]
Jonathan Eagles commented on TEZ-1563:
--------------------------------------

I am ok with this short-term approach, and agree with Sid that, long term, internal modifications should not be part of the client object serializations.

> TezClient.submitDAGSession alters DAG local resources regardless of DAG
> submission
> ----------------------------------------------------------------------------------
>
>                 Key: TEZ-1563
>                 URL: https://issues.apache.org/jira/browse/TEZ-1563
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.5.0
>            Reporter: Josh Elser
>            Assignee: Bikas Saha
>         Attachments: TEZ-1563.1.patch, TEZ-1563.2.patch
>
>
> In {{TezClient#submitDAGSession(DAG)}}, a {{DAGPlan}} is created from the
> {{DAG}} before the {{DAGClientAMProtocolBlockingPB}} is instantiated. When
> the application isn't running, {{waitForProxy()}} will throw a
> {{SessionNotRunning}} exception.
> The problem is that the internal state of the {{DAG}} is modified, regardless
> of whether the DAG is actually run or not.
> {code}
> DAGPlan dagPlan = dag.createDag(amConfig.getTezConfiguration());
> {code}
> The {{createDag}} method will ultimately call {{addTaskLocalFiles}} for each
> {{Vertex}} in the {{DAG}}:
> {code}
> // add common task files for this DAG
> vertex.addTaskLocalFiles(commonTaskLocalFiles);
> {code}
> Because the {{DAG}}'s state is modified, {{Vertex#addTaskLocalFiles(Map)}}
> will fail if any resources are added multiple times. As such, if the
> application is not running and {{SessionNotRunning}} is thrown, that same DAG
> cannot be passed in again to run it after the application is restarted.
> Additionally, {{DAG}} is missing a {{getTaskLocalFiles}} method as compared to
> {{Vertex}}, which would be good to add to make the two classes more uniform.
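
For anyone skimming the thread, the failure mode described in the issue can be reproduced with a toy model. The sketch below does not use the Tez API and is not the attached patch: {{ToyVertex}}, {{ToyDag}}, {{createPlanMutating}}, {{createPlanWithoutMutation}}, and the {{job.jar}} resource are all hypothetical names made up for illustration. It assumes only what the description states, namely that plan creation pushes the DAG's common files into each vertex and that adding the same resource twice fails. The second method shows one possible way to avoid the problem, by merging into a copy instead of into the vertex.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DagResubmitSketch {

    /** Toy stand-in for a vertex: rejects any resource name it already holds. */
    static final class ToyVertex {
        private final Map<String, String> taskLocalFiles = new LinkedHashMap<>();

        void addTaskLocalFiles(Map<String, String> files) {
            for (Map.Entry<String, String> entry : files.entrySet()) {
                if (taskLocalFiles.containsKey(entry.getKey())) {
                    throw new IllegalStateException("Duplicate resource: " + entry.getKey());
                }
                taskLocalFiles.put(entry.getKey(), entry.getValue());
            }
        }

        Map<String, String> getTaskLocalFiles() {
            // Defensive copy so callers cannot change the vertex's state.
            return new LinkedHashMap<>(taskLocalFiles);
        }
    }

    /** Toy stand-in for a DAG with a single vertex and a set of common files. */
    static final class ToyDag {
        private final ToyVertex vertex = new ToyVertex();
        private final Map<String, String> commonTaskLocalFiles = new LinkedHashMap<>();

        void addCommonTaskLocalFile(String name, String path) {
            commonTaskLocalFiles.put(name, path);
        }

        /** Mirrors the reported behaviour: building the plan mutates the vertex. */
        Map<String, String> createPlanMutating() {
            vertex.addTaskLocalFiles(commonTaskLocalFiles);
            return vertex.getTaskLocalFiles();
        }

        /** Hypothetical alternative: merge into a copy and leave the vertex alone. */
        Map<String, String> createPlanWithoutMutation() {
            Map<String, String> merged = vertex.getTaskLocalFiles();
            merged.putAll(commonTaskLocalFiles);
            return merged;
        }
    }

    static ToyDag newDagWithOneCommonFile() {
        ToyDag dag = new ToyDag();
        dag.addCommonTaskLocalFile("job.jar", "/tmp/job.jar"); // made-up resource
        return dag;
    }

    /** First call stands in for a submission that failed after the plan was built. */
    static boolean mutatingPlanFailsOnSecondCall() {
        ToyDag dag = newDagWithOneCommonFile();
        dag.createPlanMutating();
        try {
            dag.createPlanMutating(); // resubmitting the same DAG object
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    static boolean nonMutatingPlanIsRepeatable() {
        ToyDag dag = newDagWithOneCommonFile();
        Map<String, String> first = dag.createPlanWithoutMutation();
        Map<String, String> second = dag.createPlanWithoutMutation();
        return first.equals(second) && first.size() == 1;
    }

    public static void main(String[] args) {
        System.out.println("mutating plan fails on second call: "
                + mutatingPlanFailsOnSecondCall());
        System.out.println("non-mutating plan is repeatable: "
                + nonMutatingPlanIsRepeatable());
    }
}
```

In the toy model the vertex stays unchanged across attempts, so the same DAG object can be handed in again after a failed submission. Whether the real fix copies the DAG, defers the merge, or makes the add idempotent is up to the patch and is not implied here.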