[
https://issues.apache.org/jira/browse/TEZ-2581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14996461#comment-14996461
]
Jeff Zhang commented on TEZ-2581:
---------------------------------
bq. Are we sure? From what I see, a transform is just a view. So it is
evaluated afresh every time it is iterated and produces new objects. Could you
please check in a debugger? If this is indeed the case, then we should avoid
the getVertex() lock contention. The same thing would hold inside
VertexGroupCommitFinishedEvent.java and other places that are using transforms.
You are right, my mistake. It is lazily evaluated. Will fix it.
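To illustrate the point about lazy views, here is a minimal, stdlib-only sketch. It does not use the Tez code or the actual transform call from the patch. lazyTransform() is a hypothetical stand-in that behaves the way the reviewer describes (the function is re-applied on every access), and the counter stands in for a lookup such as getVertex() that may take a lock. Copying the view once into an ArrayList is one possible way to avoid the repeated lookups; it is not necessarily the fix that will go into the patch.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

public class LazyViewDemo {

    // Hypothetical stand-in for a lazy transform: a read-only view that
    // applies fn to the backing element on every get(), caching nothing.
    public static <F, T> List<T> lazyTransform(final List<F> source,
                                               final Function<? super F, ? extends T> fn) {
        return new AbstractList<T>() {
            @Override public T get(int index) { return fn.apply(source.get(index)); }
            @Override public int size() { return source.size(); }
        };
    }

    // Iterates the lazy view `passes` times and returns how many lookups ran.
    public static int lookupsForView(int passes) {
        AtomicInteger lookups = new AtomicInteger();
        List<String> ids = Arrays.asList("v1", "v2", "v3");
        List<String> view = lazyTransform(ids, id -> {
            lookups.incrementAndGet();      // stands in for a lock-taking lookup
            return "vertex-" + id;
        });
        int seen = 0;
        for (int i = 0; i < passes; i++) {
            for (String v : view) {
                seen += v.length();         // touch each element
            }
        }
        return lookups.get();
    }

    // Copies the view into an ArrayList once, then iterates the copy.
    public static int lookupsForCopy(int passes) {
        AtomicInteger lookups = new AtomicInteger();
        List<String> ids = Arrays.asList("v1", "v2", "v3");
        List<String> copy = new ArrayList<>(lazyTransform(ids, id -> {
            lookups.incrementAndGet();
            return "vertex-" + id;
        }));
        int seen = 0;
        for (int i = 0; i < passes; i++) {
            for (String v : copy) {
                seen += v.length();
            }
        }
        return lookups.get();
    }

    public static void main(String[] args) {
        System.out.println("view: " + lookupsForView(2));   // prints "view: 6"
        System.out.println("copy: " + lookupsForCopy(2));   // prints "copy: 3"
    }
}
```

With three elements and two passes, the view runs the lookup once per element per pass, while the copy runs it once per element in total.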
bq. logJobHistoryTaskStartedEvent() has the side effect of setting the
scheduleTime (code moved from InitialScheduleTransition). IMO, these methods
should be side effect free.
bq. From the code it looks like if taskRecovery fails then we end up calling
addAndScheduleAttempt() instead of failing. Am I missing something?
Yes. If the task output cannot be recovered, the task schedules a new task
attempt to rerun it.
bq. Sure. But today, logVertexInitGeneratedEvent will be logged multiple times,
once for each input and we have multi-input vertices in Hive/Pig. Will that
work today?
Yes, that works. We just need to make sure the VertexInitGenerated event is
logged before VertexInitializedEvent. Any concerns about that?
> Umbrella for Tez Recovery Redesign
> ----------------------------------
>
> Key: TEZ-2581
> URL: https://issues.apache.org/jira/browse/TEZ-2581
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Jeff Zhang
> Assignee: Jeff Zhang
> Attachments: TEZ-2581-WIP-1.patch, TEZ-2581-WIP-10.patch,
> TEZ-2581-WIP-2.patch, TEZ-2581-WIP-3.patch, TEZ-2581-WIP-4.patch,
> TEZ-2581-WIP-5.patch, TEZ-2581-WIP-6.patch, TEZ-2581-WIP-7.patch,
> TEZ-2581-WIP-8.patch, TEZ-2581-WIP-9.patch, TezRecoveryRedesignProposal.pdf,
> TezRecoveryRedesignV1.1.pdf
>
>