[ https://issues.apache.org/jira/browse/FLINK-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16198727#comment-16198727 ]

Gabor Gevay commented on FLINK-1268:
------------------------------------

This issue just happened to me. I ran my job locally with parallelism 8, then later with parallelism 4, and then spent an hour debugging to figure out what went wrong.

> FileOutputFormat with overwrite does not clear local output directories
> -----------------------------------------------------------------------
>
>                 Key: FLINK-1268
>                 URL: https://issues.apache.org/jira/browse/FLINK-1268
>             Project: Flink
>          Issue Type: Bug
>          Components: Batch Connectors and Input/Output Formats
>            Reporter: Till Rohrmann
>            Priority: Minor
>
> I noticed that the FileOutputFormat does not clear the output directories if
> it writes to local disk. As a consequence, partitions from a previous run
> remain in the directory if one decreases the DOP between subsequent runs.
> If one then reads the data from this directory, more partitions are read in
> than were actually written. This can lead to incorrect user code behaviour
> that is hard to debug. I'm aware that in the case of distributed execution
> the TaskManagers or the Tasks have to be responsible for the cleanup, and
> that if multiple Tasks are running on a TaskManager, the cleanup has to be
> coordinated.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
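
For anyone hitting the same thing, the stale-partition effect described in the issue can be reproduced without Flink. The following is only a minimal sketch using plain java.nio, not Flink's FileOutputFormat code: the class name StalePartitionDemo, the helper names, and the one-file-per-subtask naming "1".."N" are assumptions made for illustration. It simulates a run with parallelism 8 followed by a run with parallelism 4 into the same local directory, then shows a manual cleanup as a possible local workaround.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

public class StalePartitionDemo {

    // Hypothetical stand-in for a job run: writes one file per parallel subtask,
    // named "1".."N" (assumed naming), overwriting files that already exist.
    static void writePartitions(Path outputDir, int parallelism) throws IOException {
        Files.createDirectories(outputDir);
        for (int i = 1; i <= parallelism; i++) {
            byte[] data = ("partition " + i + "\n").getBytes(StandardCharsets.UTF_8);
            Files.write(outputDir.resolve(Integer.toString(i)), data);
        }
    }

    // Deletes the directory and everything below it; children are removed first.
    static void deleteRecursively(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return;
        }
        try (Stream<Path> paths = Files.walk(dir)) {
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
    }

    // Counts the entries directly inside the directory.
    static long countFiles(Path dir) throws IOException {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries.count();
        }
    }

    public static void main(String[] args) throws IOException {
        Path out = Files.createTempDirectory("flink-1268-demo").resolve("out");

        writePartitions(out, 8);   // first run, parallelism 8
        writePartitions(out, 4);   // second run, parallelism 4; files 5..8 are left over
        System.out.println("without cleanup: " + countFiles(out)); // prints 8

        deleteRecursively(out);    // manual cleanup before the next run
        writePartitions(out, 4);
        System.out.println("with cleanup: " + countFiles(out));    // prints 4
    }
}
```

Deleting the directory by hand like this only makes sense for local, single-process runs. As the issue description notes, a distributed setup would need the cleanup to be coordinated between the Tasks on each TaskManager.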