Hello Tez devs,

I have a question about how the OutputDescriptor of a data sink attached to a 
vertex in the DAG is used.

We attach a data sink to the vertex and specify a DataSinkDescriptor. The 
OutputDescriptor in the DataSinkDescriptor specifies the output type that the 
vertex's processor/task will receive. However, in VertexImpl, the function 
setAdditionalOutputs creates the OutputSpec with physicalEdgeCount set to 0. 
As a result, the LogicalOutput that the task receives in that vertex has 
numPhysicalOutputs equal to 0.
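
To make the behaviour described above concrete, here is a minimal, 
self-contained sketch. The classes and names in it (SinkOutputSketch, 
OutputSpecModel, additionalOutputSpec, com.example.GlobalOutput) are 
simplified stand-ins written for illustration only; they are not the actual 
Tez classes, and the real code in VertexImpl may differ in detail.

```java
// Simplified stand-ins for illustration only; these are NOT the Tez classes.
final class SinkOutputSketch {

    // Models the per-output spec a task receives: name, output class, channel count.
    static final class OutputSpecModel {
        final String outputName;
        final String outputClassName;
        final int physicalEdgeCount;

        OutputSpecModel(String outputName, String outputClassName, int physicalEdgeCount) {
            this.outputName = outputName;
            this.outputClassName = outputClassName;
            this.physicalEdgeCount = physicalEdgeCount;
        }
    }

    // Models the behaviour described for setAdditionalOutputs: the class name
    // from the descriptor is carried over, but the edge count is fixed at 0.
    static OutputSpecModel additionalOutputSpec(String sinkName, String outputClassName) {
        return new OutputSpecModel(sinkName, outputClassName, 0);
    }

    public static void main(String[] args) {
        // "com.example.GlobalOutput" is a hypothetical LogicalOutput class name.
        OutputSpecModel spec = additionalOutputSpec("globalSink", "com.example.GlobalOutput");
        System.out.println(spec.outputName + " -> " + spec.outputClassName
                + ", physicalEdgeCount=" + spec.physicalEdgeCount);
    }
}
```

In this model the caller has no parameter through which to pass a count, 
which is the limitation the rest of this message asks about.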

Is there any other way to set the LogicalOutput's numPhysicalOutputs to a 
value determined by our internal routing logic, given that the OutputSpec for 
this LogicalOutput is derived from the OutputDescriptor of the data sink? We 
need this because the LogicalOutput is later used in the task to parameterize 
our call to an internal framework that produces global output.

Also, why is physicalEdgeCount manually set to 0? What purpose does the 
OutputDescriptor in the DataSinkDescriptor serve if the LogicalOutput will 
always have 0 channels?

Thank you,
Gurleen
