[ 
https://issues.apache.org/jira/browse/BEAM-2826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Kirpichov closed BEAM-2826.
----------------------------------
       Resolution: Fixed
    Fix Version/s: 2.2.0

This is indeed addressed by FileIO.write, which I think was added in 2.2.

> Need to generate a single XML file when write is performed on small amount of 
> data
> ----------------------------------------------------------------------------------
>
>                 Key: BEAM-2826
>                 URL: https://issues.apache.org/jira/browse/BEAM-2826
>             Project: Beam
>          Issue Type: New Feature
>          Components: beam-model
>    Affects Versions: 2.0.0
>            Reporter: Balajee Venkatesh
>            Assignee: Eugene Kirpichov
>            Priority: Major
>             Fix For: 2.2.0
>
>
> I'm trying to write an XML file where the source is a text file stored in
> GCS. The code runs fine, but instead of a single XML file it generates
> multiple XML files. (The number of XML files seems to follow the total
> number of records in the source text file.) I observed this while using
> 'DataflowRunner'.
> When I run the same code locally, two files are generated. The first
> contains all the records with proper elements, and the second contains
> only the opening and closing root element.
> As I learned, multiple files are expected: e.g. if the runner chooses to
> process your data by parallelizing it into 3 tasks ("bundles"), you'll
> get 3 files. Some of the parts may turn out empty in some cases, but the
> total data written will always add up to the expected data.
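
To illustrate the behaviour described in the quoted report, here is a small plain-Java sketch. It is not Beam code and not how any runner actually assigns records to bundles; the class name ShardedXmlSketch, the method writeShards, the round-robin assignment and the <record> element name are all assumptions made up for this example. It only shows why N shards yield N XML documents, why a shard can end up holding nothing but the root element, and why forcing a single shard yields a single file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ShardedXmlSketch {

    // Hypothetical helper: distributes records round-robin over numShards
    // shards and renders each shard as its own XML document. A real runner
    // picks bundle boundaries itself; round-robin is only an assumption here.
    // Record values are not XML-escaped, to keep the sketch short.
    static List<String> writeShards(List<String> records, int numShards, String root) {
        List<StringBuilder> shards = new ArrayList<>();
        for (int i = 0; i < numShards; i++) {
            // Every shard opens the root element, even if it gets no records.
            shards.add(new StringBuilder("<" + root + ">"));
        }
        for (int i = 0; i < records.size(); i++) {
            shards.get(i % numShards)
                  .append("<record>")
                  .append(records.get(i))
                  .append("</record>");
        }
        List<String> files = new ArrayList<>();
        for (StringBuilder shard : shards) {
            files.add(shard.append("</").append(root).append(">").toString());
        }
        return files;
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList("a", "b", "c");

        // Three shards: three separate XML documents, one record each.
        for (String file : writeShards(records, 3, "records")) {
            System.out.println(file);
        }

        // One shard: a single XML document containing every record.
        for (String file : writeShards(records, 1, "records")) {
            System.out.println(file);
        }
    }
}
```

Calling writeShards with one record and two shards reproduces the local-run symptom from the report: the second document contains only the opening and closing root element. In Beam itself, my understanding is that the equivalent of numShards = 1 is requesting a fixed shard count on the write transform (for example withNumShards(1) on FileIO.write), at the cost of losing write parallelism.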



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
