[ 
https://issues.apache.org/jira/browse/TEZ-2410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14531143#comment-14531143
 ] 

Bikas Saha commented on TEZ-2410:
---------------------------------

bq. Do you mean to add statement like this ?
bq. historyEventHandler.verifyVertexGroupCommitStartedEvent("v1", 0);
Right. Because today these events are being logged for non-group case too.

bq. If the commit is onDAGSuccess, then VertexGroupFinished should not be logged
Why is this the case? Why do we not log vertex group finished when we commit 
groups upon dag completion? Is there any harm even if we do? If not then why 
not do it? I am wary that such conditional logging will end up creating bugs as 
other semantics change. Vertex group commit should always write its logs and 
the reader of the logs can decide whether to use that information or not. 
However, if we change this then it should be a separate jira.

Please add some comments before the test checks to explain what is being 
checked - eg. - If the commit is onVertexSuccess, then VertexGroupFinished 
should be logged once

bq. The original purpose of VertexGroupStatus is to track the status of commits 
in VertexGroup
Sorry for not being clear. I mean the following.
{code}
VertexGroupStatus - not committing <===> group.committed = false
VertexGroupStatus - committing <===> group.committed = true and 
group.successfulCommits < total
VertexGroupStatus - committed <===> group.committed = true and 
group.successfulCommits = total
{code}
So we dont need to add a duplicate enum because its logic is completely 
captured by other means.
e.g. this is redundant code
{code} 
+        if (vertexGroup.successfulCommits == vertexGroup.outputs.size()) {
+          vertexGroup.status = VertexGroupInfo.VertexGroupStatus.COMMITTED;
+          if (!commitAllOutputsOnSuccess)
{code} as it can simply be 
{code}
+        if (vertexGroup.isCommitComplete()) {
+          if (!commitAllOutputsOnSuccess)
{code}

Regardless of TEZ-2413, counting of successful commits would be needed. IMO, 
TEZ-2413 does not need to be fixed because it is ok for the DAG state machine 
to not assume that vertex state machine will not rerun in this case.


> VertexGroupCommitFinishedEvent is not logged correctly
> ------------------------------------------------------
>
>                 Key: TEZ-2410
>                 URL: https://issues.apache.org/jira/browse/TEZ-2410
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>            Reporter: Jeff Zhang
>            Assignee: Jeff Zhang
>            Priority: Blocker
>             Fix For: 0.7.0
>
>         Attachments: TEZ-2410-1.patch, TEZ-2410-1.patch
>
>
> VertexGroupCommitFinishedEvent may be logged for non-vertex group commits.
> VertexGroupCommitFinishedEvent may be logged for each member vertex of the 
> group instead of once per group.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to