Hi all,

I have recently been studying the Tez shuffle logic. I read "Apache Tez: Dynamic Graph Reconfiguration" and have two questions:

(1) I am not clear about this passage: "The data samples could be sent via the VertexManager events to the vertex manager that can create the key-range histogram and determine the correct number of partitions. It can then assign the appropriate key-ranges to each partition". How does Tez assign the appropriate key ranges to each partition? Is it done through events?

(2) Before auto-reduce, the partitioner decides which reduce task each map task output goes to. After auto-reduce, the number of reduce tasks decreases. How does Tez then decide which reduce task a map task's outputs are routed to?

Thanks in advance for any reply.

Maria.Lu
