ramkrish86 commented on PR #21508:
URL: https://github.com/apache/flink/pull/21508#issuecomment-1373364230

   We used the streaming WordCount program as the test case. 
   We generated close to 2G of data and collected the resulting word counts as 
a reference sample. Using an ABFS file sink as the output, we verified the 
result in two ways: by running the job without restarting any TMs, and by 
restarting TMs intermittently, to ensure that the job recovers and produces 
the same word counts. Total TMs: 5. Parallelism: 4. 
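
   The comparison between the two runs can be sketched roughly as follows. This 
is only an illustration, not the harness we actually used: the class 
`WordCountCheck`, its methods, and the `word count` line format are all 
assumptions made for the example.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of Flink or of this PR.
final class WordCountCheck {

    // Reduces sink output lines to one final count per word.
    // Assumes each non-empty line looks like "word count". A streaming word
    // count emits running totals, so the highest count seen per word is kept.
    static Map<String, Long> finalCounts(List<String> lines) {
        Map<String, Long> counts = new HashMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            String[] parts = trimmed.split("\\s+");
            if (parts.length != 2) {
                throw new IllegalArgumentException("Unexpected line: " + line);
            }
            counts.merge(parts[0], Long.parseLong(parts[1]), Long::max);
        }
        return counts;
    }

    // True when both runs end with the same count for every word,
    // regardless of the order in which the lines were written.
    static boolean sameCounts(List<String> baseline, List<String> recovered) {
        return finalCounts(baseline).equals(finalCounts(recovered));
    }
}
```

   Here `baseline` would hold the lines read from the run without TM restarts 
and `recovered` the lines from the run with restarts. Comparing final counts 
rather than raw lines keeps the check independent of output ordering.
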
   
   We tested with the default rolling policy, where in-progress files are 
certain to be created and need truncation on recovery from the latest 
checkpoint. We also tested with the checkpoint-based rolling policy; in that 
case no truncation was needed, because in-progress files were always committed. 
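
   To make the truncation step concrete, here is a minimal sketch of the idea on 
a local file. It is not the ABFS code path in this PR: the class 
`InProgressFileRecovery` and its method are hypothetical, and it assumes the 
checkpoint recorded the length of the in-progress file at checkpoint time.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration; not part of Flink or of this PR.
final class InProgressFileRecovery {

    // Cuts the in-progress file back to the length recorded at the last
    // checkpoint, discarding any bytes written after that checkpoint.
    static void truncateToCheckpoint(Path inProgressFile, long checkpointedLength)
            throws IOException {
        try (FileChannel channel =
                FileChannel.open(inProgressFile, StandardOpenOption.WRITE)) {
            if (channel.size() < checkpointedLength) {
                // Data covered by the checkpoint is missing; truncation cannot fix that.
                throw new IOException(
                        "File is shorter than the checkpointed length: " + inProgressFile);
            }
            channel.truncate(checkpointedLength);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("part-", ".inprogress");
        Files.write(file, "to 1\nbe 2\n".getBytes(StandardCharsets.UTF_8));
        long checkpointedLength = Files.size(file); // state as of the "checkpoint"

        // Bytes written after the checkpoint and before the simulated failure.
        Files.write(file, "or 1\n".getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.APPEND);

        truncateToCheckpoint(file, checkpointedLength);
        System.out.print(new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
        Files.delete(file);
    }
}
```

   With a checkpoint-based rolling policy this step is never reached, since no 
in-progress file outlives a checkpoint. 
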
   We have not extensively tested with ORC/Parquet formats. 
   I can commit to any further improvements, enhancements, or bug fixes that 
need to be done here. 
   
   Created the TODO as agreed upon: 
https://issues.apache.org/jira/browse/FLINK-30588. FYI @xintongsong 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
