Hi guys! I have a question regarding the new StreamingFileSink (introduced in version 1.6). We use this sink to write data in Parquet format, but I ran into an issue when trying to run the job on a YARN cluster and save the result to HDFS. In our case we use the latest Cloudera distribution (CDH 5.15), which ships Hadoop 2.6.0. This version does not support the truncate() method. I would like to create a pull request, but first I want to ask your advice on how best to design this fix and which ideas are behind that decision. I saw a similar PR for BucketingSink: https://github.com/apache/flink/pull/6108. Maybe I could also add support for valid-length files for older Hadoop versions?
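
Roughly, I imagine the fallback along these lines. This is just a sketch to make the idea concrete, not a proposed implementation: the class and method names (TruncateOrValidLength, findTruncate, restoreToValidLength) are hypothetical, modeled on how BucketingSink reflectively probes for truncate() and writes ".valid-length" companion files when it is missing:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.io.IOException;
import java.lang.reflect.Method;

public class TruncateOrValidLength {

    // Hypothetical helper: truncate(Path, long) only exists on Hadoop >= 2.7,
    // so look it up reflectively instead of calling it directly.
    static Method findTruncate(FileSystem fs) {
        try {
            return fs.getClass().getMethod("truncate", Path.class, long.class);
        } catch (NoSuchMethodException e) {
            return null; // Hadoop 2.6 or older: no truncate available
        }
    }

    // On restore, either truncate the part file to the checkpointed length,
    // or (on older Hadoop) record that length in a ".valid-length" marker
    // file so downstream readers can ignore trailing, uncommitted bytes.
    static void restoreToValidLength(FileSystem fs, Path partFile, long validLength) throws IOException {
        Method truncate = findTruncate(fs);
        if (truncate != null) {
            try {
                truncate.invoke(fs, partFile, validLength);
                return;
            } catch (Exception e) {
                throw new IOException("Truncate failed for " + partFile, e);
            }
        }
        // Fallback for Hadoop < 2.7: write a companion marker file holding
        // the number of valid bytes in the part file.
        Path marker = new Path(partFile.getParent(), "." + partFile.getName() + ".valid-length");
        try (FSDataOutputStream out = fs.create(marker, true)) {
            out.writeUTF(String.valueOf(validLength));
        }
    }
}

Would something in this direction fit the design of StreamingFileSink, or is there a reason this approach was left out deliberately?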
P.S. Unfortunately, CDH 5.15 (with Hadoop 2.6) is the latest version of the Cloudera distribution, and we can't upgrade to Hadoop 2.7.

Best regards,
Artsem