[ https://issues.apache.org/jira/browse/TEZ-1447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Siddharth Seth updated TEZ-1447:
--------------------------------
Priority: Blocker (was: Major)
Target Version/s: 0.5.1
> Handle parallelism updates and versioning w/ custom InputInitializerEvents
> --------------------------------------------------------------------------
>
> Key: TEZ-1447
> URL: https://issues.apache.org/jira/browse/TEZ-1447
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Gunther Hagleitner
> Assignee: Bikas Saha
> Priority: Blocker
> Fix For: 0.5.0
>
>
> I'm trying to do dynamic partition pruning through input initializer events
> in Hive. That means that the initializer of a table scan vertex has to
> receive events from all tasks in another vertex (which contain the pruning
> info) before generating tasks to run.
> The problem with the current API I ran into:
> getNumTasks: I'm currently using a busy loop to wait for the number of tasks
> for a vertex to be decided (-1 -> x). There's no way around it, because it's
> the only way to find out how many events to expect (0 is a valid number of
> tasks, so I can't wait for the first task to complete).
> With auto-reducer parallelism I have to employ another busy loop, because I
> might initially be expecting 10 events, a number that later gets knocked
> down to 5. Since there's no event associated with this change, I have to
> periodically check whether I have enough events.
> Versioning: Events have a version number, but I don't know which task they
> are coming from. Thus I can't de-dup events.
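
To make the request above concrete, here is a rough sketch of the bookkeeping an initializer could do if it were notified of parallelism changes and if each event identified its source task. None of this is the Tez API: the class PruningEventTracker and its methods are hypothetical, and it assumes that an event carries a source task index, that a higher version number supersedes a lower one, and that a parallelism drop leaves only the task indices below the new count.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical helper, NOT part of the Tez API. Assumes every event carries
 * the index of the task that produced it plus a version number.
 */
final class PruningEventTracker {
    // Highest version seen so far, keyed by source task index.
    private final Map<Integer, Integer> latestVersionByTask = new HashMap<>();
    // -1 means the source vertex parallelism has not been decided yet.
    private int expectedTasks = -1;

    /** Called when the source parallelism is decided or changed (e.g. 10 -> 5). */
    synchronized void onParallelismUpdated(int numTasks) {
        if (numTasks < 0) {
            throw new IllegalArgumentException("numTasks must be >= 0");
        }
        expectedTasks = numTasks;
        // Assumption: task indices at or above the new count no longer exist.
        latestVersionByTask.keySet().removeIf(idx -> idx >= numTasks);
        notifyAll();
    }

    /** Returns true if the event is new or supersedes an older version. */
    synchronized boolean onEvent(int taskIndex, int version) {
        if (expectedTasks >= 0 && taskIndex >= expectedTasks) {
            return false; // from a task index that is no longer expected
        }
        Integer previous = latestVersionByTask.get(taskIndex);
        if (previous != null && previous >= version) {
            return false; // duplicate or older version from the same task
        }
        latestVersionByTask.put(taskIndex, version);
        notifyAll();
        return true;
    }

    /** True once parallelism is known and every expected task has reported. */
    synchronized boolean isComplete() {
        return expectedTasks >= 0 && latestVersionByTask.size() == expectedTasks;
    }

    /** Blocks until complete; woken by updates instead of polling. */
    synchronized void awaitComplete() throws InterruptedException {
        while (!isComplete()) {
            wait();
        }
    }
}
```

With callbacks like these, a vertex with 0 tasks completes as soon as its parallelism is reported, and a drop from 10 to 5 expected events is picked up immediately, so neither busy loop would be needed.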
--
This message was sent by Atlassian JIRA
(v6.2#6252)