Gopal V created TEZ-2105:
----------------------------

             Summary: Totally Sorted Edge with auto-parallelism
                 Key: TEZ-2105
                 URL: https://issues.apache.org/jira/browse/TEZ-2105
             Project: Apache Tez
          Issue Type: New Feature
            Reporter: Gopal V


Pig-on-Tez supports an edge configuration that uses a sampled Output along with
a vertex manager for automatic parallelism estimation.

This is referred to in the Pig-on-Tez Hadoop Summit presentation.

http://www.slideshare.net/Hadoop_Summit/pig-on-tez-low-latency-etl-with-big-data/19

Migrating that plan model into Tez as a native edge type would allow much more
efficient scheduling of the downstream edges, and would effectively turn the
auto-parallelism implementation into a runtime skew-correcting mechanism within
this edge.

The Tez Edge has enough information to sample the data, determine the
partitioning order, and correct the parallelism.
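
As a rough illustration of the idea (not the Pig-on-Tez or Tez implementation),
the sketch below shows sample-based range partitioning in plain Python: estimate
a task count from the input size, pick partition boundaries at even quantiles of
a key sample, and route each key to a range so that concatenating the sorted
partitions yields a total order. All names here (choose_parallelism,
split_points, partition_for) and the sizing heuristic are hypothetical and
assumed for illustration only.

```python
import bisect
import random


def choose_parallelism(estimated_bytes, bytes_per_task, max_tasks):
    """Pick a task count from an estimated input size (assumed heuristic)."""
    tasks = -(-estimated_bytes // bytes_per_task)  # ceiling division
    return max(1, min(max_tasks, tasks))


def split_points(sample, num_partitions):
    """Return num_partitions - 1 boundary keys at even quantiles of the sample."""
    ordered = sorted(sample)
    if not ordered or num_partitions <= 1:
        return []
    return [ordered[(i * len(ordered)) // num_partitions]
            for i in range(1, num_partitions)]


def partition_for(key, boundaries):
    """Map a key to its range; a key equal to a boundary goes to the right."""
    return bisect.bisect_right(boundaries, key)


def demo():
    rng = random.Random(0)
    keys = [rng.randint(0, 10_000) for _ in range(5_000)]
    sample = rng.sample(keys, 200)  # small sample stands in for the sampled Output

    # Assume roughly 100 bytes per record and a 64 KB target per task.
    n = choose_parallelism(len(keys) * 100, 64_000, max_tasks=32)
    bounds = split_points(sample, n)

    parts = [[] for _ in range(n)]
    for k in keys:
        parts[partition_for(k, bounds)].append(k)

    # Sorting each partition independently and concatenating in partition
    # order gives the same result as one global sort.
    merged = [k for p in parts for k in sorted(p)]
    return n, merged == sorted(keys)
```

Because the boundaries come from a sample of the actual keys, a skewed key
distribution produces narrower ranges where the keys are dense, which is the
skew-correcting behaviour described above.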



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
