Hi Kostas, Please help ignore my previous email about the issue with security. It seems to I had mixed version of shaded and unshaded jars.
However, I'm now facing another issue with writing parquet files: only the first part is closed. All the subsequent parts are kept in in-progress state forever. My settings are to have checkpoint every 3 minutes. Sink parallelism set to 1 (my tries to set to 4 or 30 showed no difference). BucketID assigner is using event-timestamp. I only got this issue when running Flink on a yarn cluster, either writing to file:/// or to S3. When I ran it on my laptop, I got one part for every single checkpoint. TM logs says something like "*BucketState ... has pending files for checkpoints: {2 }*" Could you please help on how can I further debug this? Here below is the TM log: 2018-10-06 14:39:01.197 [Sink: Unnamed (1/1)] DEBUG o.a.flink.streaming.api.functions.sink.filesystem.Bucket - Subtask 0 closing in-progress part file for bucket id=dt=2018-09-22 due to element SdcRecord(1537656300000,meter0219838,R1.S1.LT1.P25). 2018-10-06 14:39:01.197 [Sink: Unnamed (1/1)] DEBUG o.a.flink.streaming.api.functions.sink.filesystem.Bucket - Subtask 0 closing in-progress part file for bucket id=dt=2018-09-22 due to element SdcRecord(1537656300000,meter0219838,R1.S1.LT1.P25). 2018-10-06 14:39:01.984 [Sink: Unnamed (1/1)] DEBUG o.a.flink.streaming.api.functions.sink.filesystem.Bucket - Subtask 0 opening new part file "part-0-0" for bucket id=dt=2018-09-22. 2018-10-06 14:39:01.984 [Sink: Unnamed (1/1)] DEBUG o.a.flink.streaming.api.functions.sink.filesystem.Bucket - Subtask 0 opening new part file "part-0-0" for bucket id=dt=2018-09-22. 2018-10-06 14:40:17.855 [Sink: Unnamed (1/1)] INFO o.a.flink.streaming.api.functions.sink.filesystem.Buckets - Subtask 0 checkpointing for checkpoint with id=2 (max part counter=1). 2018-10-06 14:40:17.855 [Sink: Unnamed (1/1)] INFO o.a.flink.streaming.api.functions.sink.filesystem.Buckets - Subtask 0 checkpointing for checkpoint with id=2 (max part counter=1). 2018-10-06 14:40:17.855 [Sink: Unnamed (1/1)] DEBUG o.a.flink.streaming.api.functions.sink.filesystem.Bucket - Subtask 0 closing in-progress part file for bucket id=dt=2018-09-22 on checkpoint. 2018-10-06 14:40:17.855 [Sink: Unnamed (1/1)] DEBUG o.a.flink.streaming.api.functions.sink.filesystem.Bucket - Subtask 0 closing in-progress part file for bucket id=dt=2018-09-22 on checkpoint. 2018-10-06 14:40:18.254 [Sink: Unnamed (1/1)] DEBUG o.a.flink.streaming.api.functions.sink.filesystem.Buckets - Subtask 0 checkpointing: BucketState for bucketId=dt=2018-09-22 and bucketPath=s3a://assn-averell/Test/output/dt=2018-09-22, has pending files for checkpoints: {2 } 2018-10-06 14:40:18.254 [Sink: Unnamed (1/1)] DEBUG o.a.flink.streaming.api.functions.sink.filesystem.Buckets - Subtask 0 checkpointing: BucketState for bucketId=dt=2018-09-22 and bucketPath=s3a://assn-averell/Test/output/dt=2018-09-22, has pending files for checkpoints: {2 } 2018-10-06 14:40:44.069 [Async calls on Sink: Unnamed (1/1)] INFO o.a.flink.streaming.api.functions.sink.filesystem.Buckets - Subtask 0 received completion notification for checkpoint with id=2. 2018-10-06 14:40:44.069 [Async calls on Sink: Unnamed (1/1)] INFO o.a.flink.streaming.api.functions.sink.filesystem.Buckets - Subtask 0 received completion notification for checkpoint with id=2. 2018-10-06 14:40:46.691 [Sink: Unnamed (1/1)] DEBUG o.a.flink.streaming.api.functions.sink.filesystem.Bucket - Subtask 0 closing in-progress part file for bucket id=dt=2018-09-22 due to element SdcRecord(1537656300000,meter0207081,R1.S1.LT1.P25). 2018-10-06 14:40:46.691 [Sink: Unnamed (1/1)] DEBUG o.a.flink.streaming.api.functions.sink.filesystem.Bucket - Subtask 0 closing in-progress part file for bucket id=dt=2018-09-22 due to element SdcRecord(1537656300000,meter0207081,R1.S1.LT1.P25). 2018-10-06 14:40:46.765 [Sink: Unnamed (1/1)] DEBUG o.a.flink.streaming.api.functions.sink.filesystem.Bucket - Subtask 0 opening new part file "part-0-1" for bucket id=dt=2018-09-22. 2018-10-06 14:40:46.765 [Sink: Unnamed (1/1)] DEBUG o.a.flink.streaming.api.functions.sink.filesystem.Bucket - Subtask 0 opening new part file "part-0-1" for bucket id=dt=2018-09-22. 2018-10-06 14:43:17.831 [Sink: Unnamed (1/1)] INFO o.a.flink.streaming.api.functions.sink.filesystem.Buckets - Subtask 0 checkpointing for checkpoint with id=3 (max part counter=2). 2018-10-06 14:43:17.831 [Sink: Unnamed (1/1)] INFO o.a.flink.streaming.api.functions.sink.filesystem.Buckets - Subtask 0 checkpointing for checkpoint with id=3 (max part counter=2). 2018-10-06 14:43:17.831 [Sink: Unnamed (1/1)] DEBUG o.a.flink.streaming.api.functions.sink.filesystem.Bucket - Subtask 0 closing in-progress part file for bucket id=dt=2018-09-22 on checkpoint. 2018-10-06 14:43:17.831 [Sink: Unnamed (1/1)] DEBUG o.a.flink.streaming.api.functions.sink.filesystem.Bucket - Subtask 0 closing in-progress part file for bucket id=dt=2018-09-22 on checkpoint. 2018-10-06 14:43:18.401 [Sink: Unnamed (1/1)] DEBUG o.a.flink.streaming.api.functions.sink.filesystem.Buckets - Subtask 0 checkpointing: BucketState for bucketId=dt=2018-09-22 and bucketPath=s3a://assn-averell/Test/output/dt=2018-09-22, has pending files for checkpoints: {3 } 2018-10-06 14:43:18.401 [Sink: Unnamed (1/1)] DEBUG o.a.flink.streaming.api.functions.sink.filesystem.Buckets - Subtask 0 checkpointing: BucketState for bucketId=dt=2018-09-22 and bucketPath=s3a://assn-averell/Test/output/dt=2018-09-22, has pending files for checkpoints: {3 } 2018-10-06 14:45:59.276 [Sink: Unnamed (1/1)] DEBUG o.a.flink.streaming.api.functions.sink.filesystem.Bucket - Subtask 0 closing in-progress part file for bucket id=dt=2018-09-22 due to element SdcRecord(1537657200000,meter0218455,R1.S1.LT1.P10). 2018-10-06 14:45:59.276 [Sink: Unnamed (1/1)] DEBUG o.a.flink.streaming.api.functions.sink.filesystem.Bucket - Subtask 0 closing in-progress part file for bucket id=dt=2018-09-22 due to element SdcRecord(1537657200000,meter0218455,R1.S1.LT1.P10). 2018-10-06 14:45:59.334 [Sink: Unnamed (1/1)] DEBUG o.a.flink.streaming.api.functions.sink.filesystem.Bucket - Subtask 0 opening new part file "part-0-2" for bucket id=dt=2018-09-22. 2018-10-06 14:45:59.334 [Sink: Unnamed (1/1)] DEBUG o.a.flink.streaming.api.functions.sink.filesystem.Bucket - Subtask 0 opening new part file "part-0-2" for bucket id=dt=2018-09-22. 2018-10-06 14:46:17.825 [Sink: Unnamed (1/1)] INFO o.a.flink.streaming.api.functions.sink.filesystem.Buckets - Subtask 0 checkpointing for checkpoint with id=4 (max part counter=3). 2018-10-06 14:46:17.825 [Sink: Unnamed (1/1)] INFO o.a.flink.streaming.api.functions.sink.filesystem.Buckets - Subtask 0 checkpointing for checkpoint with id=4 (max part counter=3). 2018-10-06 14:46:17.825 [Sink: Unnamed (1/1)] DEBUG o.a.flink.streaming.api.functions.sink.filesystem.Bucket - Subtask 0 closing in-progress part file for bucket id=dt=2018-09-22 on checkpoint. 2018-10-06 14:46:17.825 [Sink: Unnamed (1/1)] DEBUG o.a.flink.streaming.api.functions.sink.filesystem.Bucket - Subtask 0 closing in-progress part file for bucket id=dt=2018-09-22 on checkpoint. 2018-10-06 14:46:18.228 [Sink: Unnamed (1/1)] DEBUG o.a.flink.streaming.api.functions.sink.filesystem.Buckets - Subtask 0 checkpointing: BucketState for bucketId=dt=2018-09-22 and bucketPath=s3a://assn-averell/Test/output/dt=2018-09-22, has pending files for checkpoints: {3 4 } 2018-10-06 14:46:18.228 [Sink: Unnamed (1/1)] DEBUG o.a.flink.streaming.api.functions.sink.filesystem.Buckets - Subtask 0 checkpointing: BucketState for bucketId=dt=2018-09-22 and bucketPath=s3a://assn-averell/Test/output/dt=2018-09-22, has pending files for checkpoints: {3 4 } 2018-10-06 14:46:25.041 [Sink: Unnamed (1/1)] DEBUG o.a.flink.streaming.api.functions.sink.filesystem.Bucket - Subtask 0 closing in-progress part file for bucket id=dt=2018-09-22 due to element SdcRecord(1537657200000,meter0209471,R1.S1.LT1.P25). 2018-10-06 14:46:25.041 [Sink: Unnamed (1/1)] DEBUG o.a.flink.streaming.api.functions.sink.filesystem.Bucket - Subtask 0 closing in-progress part file for bucket id=dt=2018-09-22 due to element SdcRecord(1537657200000,meter0209471,R1.S1.LT1.P25). 2018-10-06 14:46:25.186 [Sink: Unnamed (1/1)] DEBUG o.a.flink.streaming.api.functions.sink.filesystem.Bucket - Subtask 0 opening new part file "part-0-3" for bucket id=dt=2018-09-22. 2018-10-06 14:46:25.186 [Sink: Unnamed (1/1)] DEBUG o.a.flink.streaming.api.functions.sink.filesystem.Bucket - Subtask 0 opening new part file "part-0-3" for bucket id=dt=2018-09-22. 2018-10-06 14:49:17.848 [Sink: Unnamed (1/1)] INFO o.a.flink.streaming.api.functions.sink.filesystem.Buckets - Subtask 0 checkpointing for checkpoint with id=5 (max part counter=4). 2018-10-06 14:49:17.848 [Sink: Unnamed (1/1)] INFO o.a.flink.streaming.api.functions.sink.filesystem.Buckets - Subtask 0 checkpointing for checkpoint with id=5 (max part counter=4). 2018-10-06 14:49:17.849 [Sink: Unnamed (1/1)] DEBUG o.a.flink.streaming.api.functions.sink.filesystem.Bucket - Subtask 0 closing in-progress part file for bucket id=dt=2018-09-22 on checkpoint. 2018-10-06 14:49:17.849 [Sink: Unnamed (1/1)] DEBUG o.a.flink.streaming.api.functions.sink.filesystem.Bucket - Subtask 0 closing in-progress part file for bucket id=dt=2018-09-22 on checkpoint. 2018-10-06 14:49:18.385 [Sink: Unnamed (1/1)] DEBUG o.a.flink.streaming.api.functions.sink.filesystem.Buckets - Subtask 0 checkpointing: BucketState for bucketId=dt=2018-09-22 and bucketPath=s3a://assn-averell/Test/output/dt=2018-09-22, has pending files for checkpoints: {3 4 5 } 2018-10-06 14:49:18.385 [Sink: Unnamed (1/1)] DEBUG o.a.flink.streaming.api.functions.sink.filesystem.Buckets - Subtask 0 checkpointing: BucketState for bucketId=dt=2018-09-22 and bucketPath=s3a://assn-averell/Test/output/dt=2018-09-22, has pending files for checkpoints: {3 4 5 } 2018-10-06 14:52:17.824 [Sink: Unnamed (1/1)] INFO o.a.flink.streaming.api.functions.sink.filesystem.Buckets - Subtask 0 checkpointing for checkpoint with id=6 (max part counter=4). 2018-10-06 14:52:17.824 [Sink: Unnamed (1/1)] INFO o.a.flink.streaming.api.functions.sink.filesystem.Buckets - Subtask 0 checkpointing for checkpoint with id=6 (max part counter=4). 2018-10-06 14:52:17.825 [Sink: Unnamed (1/1)] DEBUG o.a.flink.streaming.api.functions.sink.filesystem.Buckets - Subtask 0 checkpointing: BucketState for bucketId=dt=2018-09-22 and bucketPath=s3a://assn-averell/Test/output/dt=2018-09-22, has pending files for checkpoints: {3 4 5 } 2018-10-06 14:52:17.825 [Sink: Unnamed (1/1)] DEBUG o.a.flink.streaming.api.functions.sink.filesystem.Buckets - Subtask 0 checkpointing: BucketState for bucketId=dt=2018-09-22 and bucketPath=s3a://assn-averell/Test/output/dt=2018-09-22, has pending files for checkpoints: {3 4 5 } Thanks and best regards, Averell -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/