HeartSaVioR opened a new pull request, #38719:
URL: https://github.com/apache/spark/pull/38719

   ### What changes were proposed in this pull request?
   
   This PR fixes incorrect metrics for streaming queries that use DSv1 and 
DSv2 streaming sources together. When a streaming query has both a DSv1 
streaming source and a DSv2 streaming source, only the DSv1 streaming source 
reports correct metrics.
   
   The bug is in ProgressReporter: it tries to match the logical node for a 
DSv2 streaming source against OffsetHolder, and that match never succeeds. 
Because the physical node for a DSv2 streaming source contains both the source 
information and the metrics, we can deduce all the necessary information from 
the physical node instead of looking the source up in the association map.
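
The difference between the two lookup strategies can be illustrated with a small toy model. This is only a sketch in plain Python, not Spark's actual code: `LogicalLeaf`, `PhysicalLeaf`, both functions, the source names, and the row counts are hypothetical stand-ins chosen for illustration. It assumes the association map registers the DSv2 source under an OffsetHolder-like placeholder, while the plan itself contains a different logical leaf for that source.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class LogicalLeaf:
    """Hypothetical stand-in for a leaf node of the logical plan."""
    name: str


@dataclass(frozen=True)
class PhysicalLeaf:
    """Hypothetical stand-in for a leaf node of the physical plan."""
    logical: LogicalLeaf          # logical leaf this node was planned from
    num_output_rows: int          # row-count metric recorded on the node
    source: Optional[str] = None  # set only for DSv2-style scan nodes


def rows_via_association_map(
    leaves: List[PhysicalLeaf], leaf_to_source: Dict[LogicalLeaf, str]
) -> Dict[str, int]:
    """Attribute rows to sources only through the logical-leaf association map."""
    rows: Dict[str, int] = {}
    for leaf in leaves:
        source = leaf_to_source.get(leaf.logical)
        if source is not None:  # leaves with no match are silently dropped
            rows[source] = rows.get(source, 0) + leaf.num_output_rows
    return rows


def rows_via_physical_nodes(
    leaves: List[PhysicalLeaf], leaf_to_source: Dict[LogicalLeaf, str]
) -> Dict[str, int]:
    """Prefer the source carried by the physical node; fall back to the map."""
    rows: Dict[str, int] = {}
    for leaf in leaves:
        source = leaf.source if leaf.source is not None else leaf_to_source.get(leaf.logical)
        if source is not None:
            rows[source] = rows.get(source, 0) + leaf.num_output_rows
    return rows


# Assumed setup: the map knows the DSv2 source only under a placeholder leaf,
# which never appears among the leaves of the plan that actually ran.
v1_relation = LogicalLeaf("v1_relation")
v2_relation = LogicalLeaf("v2_relation")
offset_holder = LogicalLeaf("offset_holder")

leaf_to_source = {v1_relation: "dsv1_source", offset_holder: "dsv2_source"}
physical_leaves = [
    PhysicalLeaf(v1_relation, 10),
    PhysicalLeaf(v2_relation, 7, source="dsv2_source"),
]

old_counts = rows_via_association_map(physical_leaves, leaf_to_source)
new_counts = rows_via_physical_nodes(physical_leaves, leaf_to_source)
print(old_counts)
print(new_counts)
```

In this model the map-only lookup loses the DSv2 rows because the placeholder key never matches a leaf in the plan, while reading the source from the physical node recovers them without changing the DSv1 path.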
   
   ### Why are the changes needed?
   
   The metrics-collection logic does not properly collect metrics for DSv2 
streaming sources when the query also has DSv1 streaming sources.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   New test case.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

