JingGe commented on pull request #18825:
URL: https://github.com/apache/flink/pull/18825#issuecomment-1046946122


   > Aha. The SinkWriterOperator passes data to the SinkWriter (which counts 
records as it writes them to Kafka), but then also sends a committable to the 
co-located committer, which of course also counts as a record.
   > 
   > So, question 1: This doesn't seem specific to Kafka but a fundamental flaw 
in how the SinkWriterOperator works. Why are you only addressing it for Kafka?
   
   Yes, every connector that implements Sink V2 should have the same issue. 
So far we have only observed it with Kafka and possibly the File connector. I 
pushed the fix for Kafka only, to address the urgent bug and gather feedback. 
Once the approach is approved, it will be applied to the other connectors.
   
   > 
   > Question 2: How did you guys think this should be exposed to the user? 
What you are running into is an unanswered question w.r.t. metrics, as you have 
1 operator writing data to the outside while also writing data to the next 
operator. Now what is the numRecordsOut for that operator? It's clearly not 
either/or. If you drop the records sent to the committer then you have an 
operator (the committer) with 0 input. If you drop the Writer count, well then 
users complain that we only "wrote 1 record" or something like that.
   
   We will keep both metrics. `numRecordsOut` keeps its existing meaning: the 
records emitted to the next operator. A new metric, `numRecordsSend`, will 
count the records sent to the external downstream system.
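
   To make the split concrete, here is a minimal, self-contained sketch of the
two counters. It does not use Flink's classes: `SimpleCounter`,
`WriterOperatorSketch`, `write`, and `checkpoint` are hypothetical names made
up for illustration, and only the metric names `numRecordsOut` and
`numRecordsSend` come from this discussion. It assumes one committable is
emitted per checkpoint.

```java
import java.util.ArrayList;
import java.util.List;

public class SinkMetricsSketch {

    /** Hypothetical stand-in for a metric counter (not Flink's Counter interface). */
    static final class SimpleCounter {
        private long count;

        void inc() {
            count++;
        }

        long getCount() {
            return count;
        }
    }

    /** Hypothetical model of a writer operator that feeds a co-located committer. */
    static final class WriterOperatorSketch {
        // Records emitted to the next operator (here: committables for the committer).
        final SimpleCounter numRecordsOut = new SimpleCounter();
        // Records sent to the external system (e.g. Kafka).
        final SimpleCounter numRecordsSend = new SimpleCounter();

        private final List<String> pending = new ArrayList<>();

        /** Simulates writing one record to the external system. */
        void write(String record) {
            pending.add(record);
            numRecordsSend.inc();
        }

        /** Simulates a checkpoint: one committable goes downstream (assumption). */
        String checkpoint() {
            String committable = "committable(" + pending.size() + " records)";
            pending.clear();
            numRecordsOut.inc();
            return committable;
        }
    }

    public static void main(String[] args) {
        WriterOperatorSketch op = new WriterOperatorSketch();
        op.write("a");
        op.write("b");
        op.write("c");
        op.checkpoint();

        // Prints: numRecordsSend=3, numRecordsOut=1
        System.out.println("numRecordsSend=" + op.numRecordsSend.getCount()
                + ", numRecordsOut=" + op.numRecordsOut.getCount());
    }
}
```

   With three writes and one checkpoint, the two counters diverge (3 vs. 1),
which is why a single `numRecordsOut` cannot describe both directions.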
   
   



