[ 
https://issues.apache.org/jira/browse/TEZ-993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14298287#comment-14298287
 ] 

Jeff Zhang commented on TEZ-993:
--------------------------------

bq. This will imply that all the events will still be sent to the recovery 
service ( which adds some level of overhead ). Also, I am assuming that there 
are code changes needed to make sure these events are filtered out before being 
added to the queue?
Since we currently support executing only one DAG at a time, filtering 
should be fast. Even if multiple DAGs are supported in the future, the number 
of DAGs should not be large, so filtering efficiency should not be a problem. 
In any case, we need some place to decide whether a recovery event needs to be 
logged: either in the recovery store or in the entity 
(DAG/Vertex/Task/TaskAttempt). IMO, putting it in the recovery store makes 
more sense. [~hitesh] Any thoughts?
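
To make the "filter in the recovery store" option concrete, here is a minimal 
sketch. It is only an illustration and not the code in the attached patches: 
RecoveryEvent, FilteringRecoveryStore, skipDag and handle are hypothetical 
names and do not correspond to actual Tez classes or methods. It assumes each 
event carries the ID of its DAG and that the store keeps a small set of DAG 
IDs (for example a pre-warm DAG) whose events should not be logged.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical event type: assumes every recovery event knows its DAG ID.
final class RecoveryEvent {
    final String dagId;
    final String payload;

    RecoveryEvent(String dagId, String payload) {
        this.dagId = dagId;
        this.payload = payload;
    }
}

// Hypothetical store that decides by itself whether an event is logged.
final class FilteringRecoveryStore {
    // DAGs whose events are dropped, e.g. a pre-warm DAG (assumption).
    private final Set<String> skippedDagIds = new HashSet<>();
    // Stand-in for the real recovery log; here just an in-memory list.
    private final List<RecoveryEvent> log = new ArrayList<>();

    void skipDag(String dagId) {
        skippedDagIds.add(dagId);
    }

    // Returns true if the event was logged, false if it was filtered out.
    boolean handle(RecoveryEvent event) {
        if (skippedDagIds.contains(event.dagId)) {
            return false; // one hash lookup per event
        }
        log.add(event);
        return true;
    }

    List<RecoveryEvent> loggedEvents() {
        return Collections.unmodifiableList(log);
    }
}
```

With a single running DAG, or a small number of them, the per-event cost in 
this sketch is one set lookup, which is why filtering efficiency should not 
be a concern.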



> Remove application logic from RecoveryService
> ---------------------------------------------
>
>                 Key: TEZ-993
>                 URL: https://issues.apache.org/jira/browse/TEZ-993
>             Project: Apache Tez
>          Issue Type: Sub-task
>            Reporter: Bikas Saha
>            Assignee: Jeff Zhang
>         Attachments: TEZ-993-3.patch, Tez-993-2.patch, Tez-993.patch
>
>
> Currently the RecoveryService storage logic knows a lot about the DAG, such 
> as which DAG is a pre-warm DAG and does not need to be stored, which events 
> need special treatment, etc. This kind of logic couples the DAG and the 
> storage more than is probably necessary and can be a source of complications 
> down the road. The storage should ideally just store a sequence of arbitrary 
> records delimited by a marker.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
