[ 
https://issues.apache.org/jira/browse/FLINK-16574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-16574:
-----------------------------------
      Labels: auto-deprioritized-major auto-deprioritized-minor  (was: 
auto-deprioritized-major stale-minor)
    Priority: Not a Priority  (was: Minor)

This issue was labeled "stale-minor" 7 days ago and has not received any 
updates so it is being deprioritized. If this ticket is actually Minor, please 
raise the priority and ask a committer to assign you the issue or revive the 
public discussion.


> StreamingFileSink should rename files or fail if destination file already 
> exists
> --------------------------------------------------------------------------------
>
>                 Key: FLINK-16574
>                 URL: https://issues.apache.org/jira/browse/FLINK-16574
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / FileSystem
>    Affects Versions: 1.9.1
>         Environment: We're using Flink 1.9.1 on YARN with Horton HDP 2.7.3.
>            Reporter: static-max
>            Priority: Not a Priority
>              Labels: auto-deprioritized-major, auto-deprioritized-minor
>
> I switched from BucketingSink to StreamingFileSink so my state could not be 
> restored after starting from a savepoint.
> Upon start of the job there were already part-0-0 and part-0-1 files in the 
> HDFS destination folder. The StreamingFileSink then creates a file like 
> .part-0-0.inprogress.d1849354-39d4-4634-8fb3-dfb8e2083857{color:#172b4d}. 
> When the file is rolled Flink tries to rename it to part-0-0, but that file 
> already exists. NameNode logs "WARN hdfs.StateChange 
> (FSDirRenameOp.java:unprotectedRenameTo(174)) - DIR* 
> FSDirectory.unprotectedRenameTo: failed to rename XXXX to XXXXbeca
> use destination exists".{color}
> Flink does not care and creates a new file like .part-0-1.inprogress.d 
> {color:#172b4d}for the next bucket and the game continues until the part 
> index counter is so high the file can be renamed. But now I'm left with a lot 
> of .part-xxx.inprogress.xxx that I need to rename by hand if I don't want to 
> lose the data.{color}
>  
> I would expect Flink to either fail if the file cannot be renamed, or 
> auto-rename it to filename that does not exists yet.
> The same happens when not starting from a savepoint. IIRC the 
> BucketingFileSink did not have this problem.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

Reply via email to