[ https://issues.apache.org/jira/browse/BEAM-2826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172818#comment-16172818 ]
Eugene Kirpichov commented on BEAM-2826:
----------------------------------------

This will be addressed as part of the FileIO.write() effort. However, what Luke suggests above will also work in practice as a workaround.

> Need to generate a single XML file when write is performed on small amount of
> data
> ----------------------------------------------------------------------------------
>
>                 Key: BEAM-2826
>                 URL: https://issues.apache.org/jira/browse/BEAM-2826
>             Project: Beam
>          Issue Type: New Feature
>          Components: beam-model
>    Affects Versions: 2.0.0
>            Reporter: Balajee Venkatesh
>            Assignee: Eugene Kirpichov
>
> I'm trying to write an XML file where the source is a text file stored in
> GCS. The code runs fine, but instead of a single XML file it generates
> multiple XML files. (The number of XML files seems to follow the total number
> of records in the source text file.) I have observed this behavior when
> using 'DataflowRunner'.
> When I run the same code locally, two files are generated. The first one
> contains all the records with the proper elements, and the second one
> contains only the opening and closing root element.
> As I learned, it is expected that the write may produce multiple files: e.g.
> if the runner chooses to process your data by parallelizing it into 3 tasks
> ("bundles"), you'll get 3 files. Some of the parts may turn out empty in some
> cases, but the total data written will always add up to the expected data.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
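
The behavior described in the quoted issue (one output file per bundle, some containing only the root element) can be illustrated with a small sketch. This is plain Python, not Beam code: `write_shards`, the round-robin split, and the file-name pattern are all hypothetical stand-ins chosen for illustration, and a real runner decides bundle boundaries on its own.

```python
from xml.sax.saxutils import escape


def write_shards(records, num_bundles, root="records", element="record"):
    """Simulate a sharded XML write: one document per bundle.

    Returns a dict mapping a (hypothetical) file name to its XML text.
    """
    # Assumption: a round-robin split stands in for however a runner
    # might divide the input into bundles.
    bundles = [records[i::num_bundles] for i in range(num_bundles)]

    files = {}
    for index, bundle in enumerate(bundles):
        body = "".join(
            f"  <{element}>{escape(record)}</{element}>\n" for record in bundle
        )
        # Illustrative naming only; not taken from the issue.
        name = f"output-{index:05d}-of-{num_bundles:05d}.xml"
        # Every shard gets the root element, even when the bundle is empty.
        files[name] = f"<{root}>\n{body}</{root}>\n"
    return files


# Three records spread over five bundles: two shards end up empty.
files = write_shards(["a", "b", "c"], num_bundles=5)
for name, text in files.items():
    print(name)
    print(text)
```

With more bundles than records, some shards contain only the opening and closing root element, while the records across all shards still add up to the full input. Forcing a single bundle (`num_bundles=1` in this sketch) yields one file, at the cost of giving up parallelism in the write step.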