Hi Wu,

If you are using the FileSink, it generates a random UUID on startup, and a new 
one is generated
after failover, so new records would be written to files named like 
<prefix>-<new unique id>-<new count>.suffix.
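
As a rough illustration of the naming pattern above (a standalone sketch, not Flink's actual code; the class and method names here are made up for the example, and the prefix "part" and suffix ".txt" are arbitrary placeholders):

```java
import java.util.UUID;

// Illustrative only: mimics the "<prefix>-<unique id>-<count><suffix>" pattern
// described in this mail. It is NOT the FileSink implementation.
final class PartFileNameSketch {

    // Assembles a part file name from its four components.
    static String partFileName(String prefix, String uniqueId, long count, String suffix) {
        return prefix + "-" + uniqueId + "-" + count + suffix;
    }

    public static void main(String[] args) {
        // Assumption: a fresh random UUID is drawn each time the writer starts.
        String idBeforeFailover = UUID.randomUUID().toString();
        String idAfterFailover = UUID.randomUUID().toString();

        // The same counter value yields different file names, because the ids differ.
        System.out.println(partFileName("part", idBeforeFailover, 0, ".txt"));
        System.out.println(partFileName("part", idAfterFailover, 0, ".txt"));
    }
}
```

Because the id changes after a restart, a file written after recovery does not reuse the name of a file written before it, even if the counter value happens to repeat.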

Logically, the sink in 1.14 should be able to recover smoothly from an existing 
savepoint taken with a previous version, since
as far as I know we have not changed the state in recent versions. Namely, you could use 
stop-with-savepoint to take a savepoint while
running the old version, and then start the new version's job from this savepoint. 

Best,
Yun




------------------------------------------------------------------
From:wu shaoj <shao...@gmail.com>
Send Time:2021 Oct. 9 (Sat.) 09:49
To:Yun Gao <yungao...@aliyun.com>; dev@flink.apache.org <dev@flink.apache.org>
Subject:Re: What's the purpose of uniqueId in FileWriterBucket

If the Flink job recovers from a previous checkpoint/savepoint, will it re-output 
records to a different file with the same partFileIndex? And can we upgrade Flink 
to 1.14 smoothly?

From: Yun Gao <yungao...@aliyun.com>
Date: Friday, October 8, 2021 at 22:43
To: wu shaoj <shao...@gmail.com>, dev@flink.apache.org <dev@flink.apache.org>
Subject: Re: What's the purpose of uniqueId in FileWriterBucket
Hi Wu,

The uid is used to distinguish between the different subtasks. If it were removed, the 
different subtasks
of the FileSink would have name conflicts when they write to the same bucket, 
so the uid is
necessary if there are multiple subtasks.
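
A small standalone sketch of the conflict (not Flink code; the class name, the "part" prefix, the ".txt" suffix, and the assumption that every subtask starts its counter at 0 are all hypothetical, chosen only to make the collision visible):

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

// Illustrative only: simulates several writers naming their first file
// in the same bucket, with and without a per-writer unique id.
final class SubtaskCollisionSketch {

    // Returns how many distinct file names the writers produce.
    static int distinctNames(int subtasks, boolean withUniqueId) {
        Set<String> names = new HashSet<>();
        for (int i = 0; i < subtasks; i++) {
            // Assumption: each writer draws its own random id and starts counting at 0.
            String idPart = withUniqueId ? "-" + UUID.randomUUID() : "";
            names.add("part" + idPart + "-0.txt");
        }
        return names.size();
    }

    public static void main(String[] args) {
        // Without an id, all writers choose the same name and would overwrite each other.
        System.out.println("without uid: " + distinctNames(4, false));
        // With an id, each writer gets its own name.
        System.out.println("with uid:    " + distinctNames(4, true));
    }
}
```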

Best,
Yun

 
------------------Original Mail ------------------
Sender:wu shaoj <shao...@gmail.com>
Send Date:Fri Oct 8 14:18:34 2021
Recipients:dev@flink.apache.org <dev@flink.apache.org>
Subject:What's the purpose of uniqueId in FileWriterBucket
Hi, folks,

Since 1.14, the file sink adds a uid to each committed file. Would you mind telling 
me what the purpose of this field is? Can it be removed safely?
https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/connectors/datastream/file_sink/#part-file-configuration



