Gopal V created TEZ-2105:
----------------------------
Summary: Totally Sorted Edge with auto-parallelism
Key: TEZ-2105
URL: https://issues.apache.org/jira/browse/TEZ-2105
Project: Apache Tez
Issue Type: New Feature
Reporter: Gopal V
Pig-on-Tez supports an edge configuration that pairs a sampled output with a
vertex manager to estimate parallelism automatically.
This is described in the Pig-on-Tez Hadoop Summit presentation:
http://www.slideshare.net/Hadoop_Summit/pig-on-tez-low-latency-etl-with-big-data/19
Migrating that plan-model into Tez as a native edge type would allow for much
more efficient scheduling of the downstream edges and effectively turn the
auto-parallelism implementation into a runtime skew-correcting mechanism within
this edge.
The Tez edge has enough information to sample the data, determine the
partitioning order, and correct the parallelism.
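As a rough illustration of the idea (not the Pig-on-Tez or Tez implementation),
the sketch below picks a partition count and range boundaries from a sample of
keys, then routes each key to a partition so that partitions are totally
ordered. The function names, the sampling_rate and target_keys_per_task
parameters, and the use of evenly spaced sample quantiles are all assumptions
made for this example.

```python
import bisect


def choose_boundaries(sample, sampling_rate, target_keys_per_task):
    """Estimate parallelism from a key sample and return range boundaries.

    Hypothetical helper: sampling_rate is the fraction of keys that were
    sampled, target_keys_per_task is the desired load per downstream task.
    """
    ordered = sorted(sample)
    if not ordered:
        return []
    # Scale the sample size up to an estimate of the full key count.
    estimated_total = len(ordered) / sampling_rate
    num_partitions = max(1, round(estimated_total / target_keys_per_task))
    boundaries = []
    for i in range(1, num_partitions):
        candidate = ordered[i * len(ordered) // num_partitions]
        # A heavily repeated key yields duplicate quantiles; keep boundaries
        # strictly increasing, which reduces the effective parallelism.
        if not boundaries or candidate > boundaries[-1]:
            boundaries.append(candidate)
    return boundaries


def partition_for(key, boundaries):
    """Return the partition index; keys in partition i sort before i + 1."""
    return bisect.bisect_right(boundaries, key)
```

With a uniform sample the boundaries fall at the quartiles; with a sample
dominated by one key the duplicate quantiles collapse, so fewer partitions are
produced than the initial estimate. The number of partitions actually used is
len(boundaries) + 1.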