[
https://issues.apache.org/jira/browse/TEZ-3271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15596820#comment-15596820
]
Hitesh Shah commented on TEZ-3271:
----------------------------------
Comments:
- generateEmptyEventsForSourceTask in EdgeManagerPlugin should not be an
abstract function. Given that CartesianProductEdgeManager needs changing, this
is an incompatible change. Instead, a default implementation could throw an
appropriate exception to indicate that the EM plugin in use does not support
the failure threshold percent feature.
- I think we can add a fail-safe in the edge plugins to generate the events
only for known outputs (maybe if they belong to the Tez runtime package?), i.e.
if someone ends up writing a new output that uses a different payload, we would
need to throw an error, at least with the current impl. We still need to
figure out how the EM plugin can generate an empty event that the Input
understands. One option here would be to enhance the DME meta info to indicate
an empty/null payload, or to invoke an API on the Output to generate the empty
data event.
- As for event generation, I have a doubt with respect to recovery, given that
we expect all DME events to be generated before a task completes. This should
be tested more carefully on recovery to check that events are generated
correctly when a failed vertex is recovered or replayed.
- The unit test could be moved to TestTezJobs. At some point we probably need
to get rid of a lot of the TestMRR* minicluster tests.
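
To make the first two comments concrete, here is a rough sketch of a
non-abstract hook that throws by default, plus a fail-safe that only accepts
known outputs. Every type and name below (EdgePluginSketch, EdgePlugin,
LegacyPlugin, OneToOnePlugin, isKnownOutput, the String event type, the
package prefix) is a simplified assumption for illustration and is not the
actual Tez API or the code in the patch.

```java
import java.util.Collections;
import java.util.List;

// Simplified sketch; none of these types are the real Tez classes.
class EdgePluginSketch {

    // Hypothetical stand-in for an edge manager plugin base class.
    abstract static class EdgePlugin {
        // Stand-in for the contract that existing plugins already implement.
        abstract int getNumDestinationTasks();

        // Deliberately NOT abstract: plugins that do not override this still
        // compile, and fail only when the threshold feature is actually used.
        List<String> generateEmptyEventsForSourceTask(int sourceTaskIndex) {
            throw new UnsupportedOperationException(getClass().getSimpleName()
                + " does not support the failure threshold percent feature");
        }
    }

    // An existing plugin that was written before the new hook existed.
    static class LegacyPlugin extends EdgePlugin {
        @Override
        int getNumDestinationTasks() {
            return 4;
        }
    }

    // A plugin that opts in by overriding the hook. Events are modeled as
    // plain strings here; a real event type would carry payload metadata.
    static class OneToOnePlugin extends EdgePlugin {
        @Override
        int getNumDestinationTasks() {
            return 1;
        }

        @Override
        List<String> generateEmptyEventsForSourceTask(int sourceTaskIndex) {
            return Collections.singletonList(
                "empty-event-for-source-task-" + sourceTaskIndex);
        }
    }

    // Assumed package prefix for "known" outputs; the real check may differ.
    static final String KNOWN_OUTPUT_PREFIX = "org.apache.tez.runtime.";

    // Fail-safe: only outputs from the known package are eligible for
    // plugin-generated empty events.
    static boolean isKnownOutput(String outputClassName) {
        return outputClassName != null
            && outputClassName.startsWith(KNOWN_OUTPUT_PREFIX);
    }

    public static void main(String[] args) {
        System.out.println(new OneToOnePlugin().generateEmptyEventsForSourceTask(3));
        try {
            new LegacyPlugin().generateEmptyEventsForSourceTask(3);
        } catch (UnsupportedOperationException e) {
            System.out.println("unsupported: " + e.getMessage());
        }
        System.out.println(isKnownOutput("com.example.CustomOutput"));
    }
}
```

With this shape, a plugin that is not updated keeps compiling, and a job that
enables the threshold on an unsupported edge fails with a clear message
instead of silently missing events.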
> Provide mapreduce failures.maxpercent equivalent
> ------------------------------------------------
>
> Key: TEZ-3271
> URL: https://issues.apache.org/jira/browse/TEZ-3271
> Project: Apache Tez
> Issue Type: New Feature
> Reporter: Jonathan Eagles
> Assignee: Jonathan Eagles
> Attachments: Succeeded with Failures.png, TEZ-3271.1.patch,
> TEZ-3271.2.patch, TEZ-3271.3.patch, TEZ-3271.4.patch, TEZ-3271.5.patch,
> TEZ-3271.6.patch, TEZ-3271.7.patch
>
>
> There is a certain category of work that need not have 100% of tasks succeed
> to cause the work to be considered a success. To meet that end, I propose we
> provide a tez equivalent of mapreduce.map.failures.maxpercent and
> mapreduce.reduce.failures.maxpercent. In this way a vertex will be considered
> a success if the number of failures is below a configured threshold.
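
As a rough illustration of the proposed check, the sketch below treats a
vertex as successful while the percentage of failed tasks stays within a
configured limit. The class and method names are hypothetical, and the
inclusive comparison (a failure percentage exactly equal to the limit still
counts as success) is an assumption, not something stated in this issue.

```java
// Hypothetical helper; not part of Tez.
class FailureThresholdSketch {

    // Returns true when the vertex may still be considered a success.
    static boolean withinFailureThreshold(int failedTasks, int totalTasks,
                                          double maxFailedPercent) {
        if (totalTasks <= 0) {
            // A vertex with no tasks has nothing that can exceed the limit.
            return failedTasks == 0;
        }
        double failedPercent = 100.0 * failedTasks / totalTasks;
        // Assumption: the limit is inclusive.
        return failedPercent <= maxFailedPercent;
    }

    public static void main(String[] args) {
        // 3 of 100 tasks failed against a 5 percent limit.
        System.out.println(withinFailureThreshold(3, 100, 5.0));
        // 10 of 100 tasks failed against the same limit.
        System.out.println(withinFailureThreshold(10, 100, 5.0));
    }
}
```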