[GitHub] [flink] StephanEwen commented on pull request #13920: [FLINK-19743] Add metric definitions in FLIP-33 and report some of them by default.

GitBox Fri, 06 Nov 2020 09:01:11 -0800


StephanEwen commented on pull request #13920:
URL: https://github.com/apache/flink/pull/13920#issuecomment-723189562



   Sorry to be late to the game here, but could you share a bit more 
information on what the original setup was?
   Specifically, which checkpoint storage system gave you such poor stream 
read performance? Was it HDFS? OSS? S3?
   
   Looking at this change here, it seems very big (40 files) for "just" 
introducing a buffer in a stream.
   So I tend to be -1 on the change as it is.
   
   Two other options to solve this:
   
   (1) Input stream buffering is a property of the `CheckpointStorage`. The 
buffered stream is created there, so the state backends do not each have to 
wrap the stream themselves.
   
   (2) Alternatively, we can make it a contract that all FileSystem 
implementations return well-buffered streams. Some already do this by default, 
so wrapping them in another buffered stream just adds another layer and extra 
copying of bytes, which costs performance.
   
   At first glance, I'd say let's go with option (2) if possible, otherwise 
option (1).
   Hence the question: which FS did you use that had such poorly performing 
streams?
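
   To illustrate the trade-off between the two options, here is a minimal, 
self-contained sketch using only `java.io`. It is not Flink code: the class 
`BufferingSketch`, the helper `ensureBuffered`, and the 4096-byte buffer size 
are hypothetical names and values chosen for this example. It counts how many 
read calls reach the underlying stream with and without a buffer, and shows a 
single place that wraps a stream only when it is not already a 
`BufferedInputStream`.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class BufferingSketch {

    /** Counts the read calls that reach the wrapped (stand-in for "slow") stream. */
    static final class CountingInputStream extends InputStream {
        private final InputStream delegate;
        int readCalls;

        CountingInputStream(InputStream delegate) {
            this.delegate = delegate;
        }

        @Override
        public int read() throws IOException {
            readCalls++;
            return delegate.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            readCalls++;
            return delegate.read(b, off, len);
        }
    }

    /**
     * Hypothetical helper: adds a buffer in exactly one place and skips the
     * wrap if the stream is already a BufferedInputStream. This check cannot
     * detect streams that buffer internally under a different type, which is
     * why a contract on the file system side (option 2) avoids the question.
     */
    public static InputStream ensureBuffered(InputStream in, int bufferSize) {
        return (in instanceof BufferedInputStream)
                ? in
                : new BufferedInputStream(in, bufferSize);
    }

    /** Reads the data byte by byte and returns the number of underlying read calls. */
    public static int underlyingReads(byte[] data, boolean buffer) {
        CountingInputStream counted =
                new CountingInputStream(new ByteArrayInputStream(data));
        InputStream in = buffer ? ensureBuffered(counted, 4096) : counted;
        try {
            while (in.read() != -1) {
                // Drain the stream one byte at a time, as a naive reader would.
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return counted.readCalls;
    }

    public static void main(String[] args) {
        byte[] data = new byte[10_000];
        System.out.println("unbuffered underlying reads: " + underlyingReads(data, false));
        System.out.println("buffered underlying reads:   " + underlyingReads(data, true));
    }
}
```

With byte-wise reads, the unbuffered variant hits the underlying stream once 
per byte, while the buffered variant only hits it once per buffer refill. 
Wrapping a stream that is already buffered would not reduce those calls any 
further; it would only add another copy of each byte.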


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
