[ https://issues.apache.org/jira/browse/BEAM-2826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kenneth Knowles reassigned BEAM-2826:
-------------------------------------
Assignee: Eugene Kirpichov (was: Kenneth Knowles)
> Need to generate a single XML file when write is performed on small amount of
> data
> ----------------------------------------------------------------------------------
>
> Key: BEAM-2826
> URL: https://issues.apache.org/jira/browse/BEAM-2826
> Project: Beam
> Issue Type: New Feature
> Components: beam-model
> Affects Versions: 2.0.0
> Reporter: Balajee Venkatesh
> Assignee: Eugene Kirpichov
>
> I'm trying to write an XML file where the source is a text file stored in
> GCS. The code runs fine, but instead of a single XML file it generates
> multiple XML files. (The number of XML files seems to follow the total number
> of records in the source text file.) I observed this while using
> 'DataflowRunner'.
> When I run the same code locally, two files are generated. The first
> contains all the records with the proper elements, and the second contains
> only the opening and closing root element.
> As I understand it, producing multiple files is expected: e.g. if the
> runner chooses to parallelize processing of the data into 3 tasks
> ("bundles"), you'll get 3 files. Some of the parts may turn out empty in some
> cases, but the total data written will always add up to the expected data.
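
The behaviour described above can be sketched without Beam at all. The following is a plain-Python simulation, not the Beam API: the function names, the `records`/`record` element names, and the bundle split are all hypothetical and chosen only for illustration. It shows why one file per bundle can leave a file holding nothing but the root element, and how regrouping everything into one bundle before writing yields a single file.

```python
from xml.sax.saxutils import escape


def write_shard(records, root="records", element="record"):
    """Render one bundle as a complete XML document (one output file)."""
    lines = [f"<{root}>"]
    lines += [f"  <{element}>{escape(r)}</{element}>" for r in records]
    lines.append(f"</{root}>")
    return "\n".join(lines)


def write_sharded(bundles):
    """One file per bundle; an empty bundle still yields the root tags only."""
    return [write_shard(b) for b in bundles]


def write_single(bundles):
    """Regroup every record into one bundle first, then write one file."""
    merged = [r for b in bundles for r in b]
    return [write_shard(merged)]


# Hypothetical split of three records into three bundles, one of them empty.
bundles = [["a", "b"], ["c"], []]
sharded = write_sharded(bundles)
single = write_single(bundles)
print(len(sharded), len(single))
```

In Beam itself, file-based writes such as `TextIO.write()` expose `withNumShards(1)` to force a single output file, at the cost of write parallelism. Whether the XML sink in 2.0.0 offers the same option is not confirmed here.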