Thanks Jeff, I will look into it. I am trying to write Thrift objects,
so maybe a newline delimiter can work.
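
One caveat with a newline delimiter: serialized binary data can itself contain the byte 0x0A, so raw records cannot safely be split on newlines. A minimal sketch of one workaround, assuming each object has already been serialized to a byte[] (for example with Thrift's TSerializer) and that the extra size from Base64 is acceptable. The class name Base64Lines is hypothetical and is not part of Beam or Thrift:

```java
import java.util.Arrays;
import java.util.Base64;

/** Hypothetical helper (not part of Beam or Thrift): one serialized record per text line. */
final class Base64Lines {
  private Base64Lines() {}

  /** Encodes serialized bytes as one line of text; the basic Base64 encoder emits no line breaks. */
  static String encode(byte[] serializedRecord) {
    return Base64.getEncoder().encodeToString(serializedRecord);
  }

  /** Recovers the original serialized bytes from one line of text. */
  static byte[] decode(String line) {
    return Base64.getDecoder().decode(line);
  }

  public static void main(String[] args) {
    // Sample payload that deliberately contains 0x0A (newline) bytes.
    byte[] record = {0x0b, 0x0a, 0x00, 0x0a, 0x7f};
    String line = encode(record);
    System.out.println("contains newline: " + line.contains("\n"));
    System.out.println("round trip ok: " + Arrays.equals(record, decode(line)));
  }
}
```

The encoded strings could then go through the existing newline-delimited text sink, with decode applied to each line on the read side.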

On Fri, Apr 26, 2019 at 12:50 PM Jeff Klukas <[email protected]> wrote:

> You can use the read* and write* methods of FileIO to read and write
> arbitrary binary files. The examples in the Javadoc for FileIO [0] include
> an example of reading the entire contents of a file as a string into a Beam
> record, along with metadata about the file.
>
> If a one-to-one mapping of files to records is fine for your use case,
> then it should be fairly straightforward to read and write byte arrays.
>
> If your files are large and contain many logical records, then you need a
> way to understand the format of a binary file in order to break it up into
> records.
>
> To support writing batched records in an arbitrary file format, you could
> build a custom implementation of FileIO.Sink [1]. There are existing
> pre-built sinks for newline-delimited text (TextIO.Sink), Avro, XML, etc.,
> which may or may not meet your needs.
>
> [0]
> https://beam.apache.org/releases/javadoc/2.12.0/org/apache/beam/sdk/io/FileIO.html
> [1]
> https://beam.apache.org/releases/javadoc/2.12.0/org/apache/beam/sdk/io/FileIO.Sink.html
>
> On Fri, Apr 26, 2019 at 3:13 PM Nikhil Goyal <[email protected]> wrote:
>
>> Hi,
>>
>> Is there a way to read and write binary files in Beam?
>>
>> Thanks
>> Nikhil
>>
>

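Jeff's point about needing a known format to break a large binary file into records can be illustrated with length-prefixed framing, where each record is preceded by its size. This is only a sketch in plain Java under stated assumptions: the class name LengthPrefixedRecords and the 4-byte big-endian length prefix are choices made for illustration, not a Beam or Thrift format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: frames each record as a 4-byte length followed by the payload. */
final class LengthPrefixedRecords {
  private LengthPrefixedRecords() {}

  /** Writes all records into one binary blob, each preceded by its length. */
  static byte[] writeAll(List<byte[]> records) {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(buffer)) {
      for (byte[] record : records) {
        out.writeInt(record.length); // big-endian 4-byte length prefix
        out.write(record);           // payload bytes, may contain any value
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return buffer.toByteArray();
  }

  /** Splits a blob produced by writeAll back into the original records. */
  static List<byte[]> readAll(byte[] fileContents) {
    List<byte[]> records = new ArrayList<>();
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(fileContents))) {
      while (in.available() > 0) {
        int length = in.readInt();
        byte[] record = new byte[length];
        in.readFully(record); // throws EOFException if the file is truncated
        records.add(record);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return records;
  }
}
```

In a pipeline, the write loop would presumably live inside a custom FileIO.Sink<byte[]> (its open, write, and flush methods), and readAll could run over the bytes of each matched file. That wiring is not shown here.
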