marin-ma opened a new issue, #10618: URL: https://github.com/apache/incubator-gluten/issues/10618
### Description

For the "time of input iterator" metric in `InputIteratorTransformer`, what the time represents depends on the preceding operator. I have observed three different cases:

1. **When the previous operator is shuffle:** The time is primarily the total shuffle read time, including fetch wait time and native reader processing time (such as decompression and deserialization).
2. **When the previous operator is broadcast:** The time is nearly zero because the broadcast has already been executed before the pipeline starts.
3. **For other cases (e.g., ColumnarUnion or fallback operators within the same Spark stage as the previous Velox pipelines):** Since `wallTimeNanos` in Velox is measured around the driver’s `getOutput`, the time of the previous pipelines is included in the `getOutput` call from the `ValueStreamNode`. In this case, the time of input iterators represents the total time counted from the beginning of the current stage.

This difference in behavior across cases is not documented and may confuse users. It would be better to document it and highlight it in the metrics description.

### Gluten version

None

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
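To illustrate why the same metric can mean different things, here is a toy model of a pull-based input iterator whose wall time is measured around each `next()` call. It is not Gluten or Velox code: `FakeClock`, `upstream`, `TimedInputIterator`, and `drain` are hypothetical names, and the per-batch cost of 1,000 ns is an arbitrary assumption. The sketch only shows that work done lazily inside the pull is charged to the input iterator, while work finished before the pipeline starts is not.

```python
from typing import Iterator


class FakeClock:
    """Deterministic clock (assumption: stands in for a real wall clock)."""

    def __init__(self) -> None:
        self.now_ns = 0

    def advance(self, ns: int) -> None:
        self.now_ns += ns


def upstream(clock: FakeClock, batches: int, cost_ns: int) -> Iterator[int]:
    """Hypothetical upstream producer: each batch costs `cost_ns` when pulled."""
    for i in range(batches):
        clock.advance(cost_ns)  # the work happens lazily, inside next()
        yield i


class TimedInputIterator:
    """Hypothetical input iterator whose metric wraps every next() call."""

    def __init__(self, source: Iterator[int], clock: FakeClock) -> None:
        self._source = source
        self._clock = clock
        self.wall_time_ns = 0

    def __iter__(self) -> "TimedInputIterator":
        return self

    def __next__(self) -> int:
        start = self._clock.now_ns
        try:
            return next(self._source)
        finally:
            # Everything the upstream did during this pull is charged here.
            self.wall_time_ns += self._clock.now_ns - start


def drain(source: Iterator[int], clock: FakeClock) -> int:
    """Consume the source and return the time charged to the input iterator."""
    timed = TimedInputIterator(source, clock)
    for _ in timed:
        pass
    return timed.wall_time_ns


# Lazy upstream (analogous to shuffle read or an earlier pipeline in the
# same stage): the upstream cost shows up in the input iterator's time.
lazy_clock = FakeClock()
lazy_ns = drain(upstream(lazy_clock, 4, 1_000), lazy_clock)

# Upstream already materialized before the pipeline starts (analogous to
# broadcast): the input iterator's time is zero in this model.
eager_clock = FakeClock()
materialized = list(upstream(eager_clock, 4, 1_000))
eager_ns = drain(iter(materialized), eager_clock)

print(lazy_ns, eager_ns)  # 4000 0
```

In this model the metric is identical code in both runs; only the point at which the upstream work happens changes what it reports, which matches the discrepancy described in the issue.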
