Thanks, Jeff. I will look into it. I am trying to write Thrift objects, so maybe a newline delimiter could work.
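
One caveat with a newline delimiter: binary Thrift encodings can contain the newline byte inside a record, so splitting on it may cut records apart. A common alternative is to prefix each record with its length. Below is a minimal sketch of that idea using only the Java standard library. It assumes each Thrift object has already been serialized to a byte[] by whatever serializer is in use. The class and method names are made up for illustration and are not part of Beam or Thrift. The same framing could probably sit inside a custom FileIO.Sink, but that wiring is not shown here.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not a Beam or Thrift API: frames opaque byte[] records
// as a 4-byte big-endian length followed by the payload.
final class LengthPrefixedFraming {

  // Writes every record as [length][payload] so payloads may contain any byte,
  // including '\n'.
  static void writeRecords(List<byte[]> records, OutputStream out) throws IOException {
    DataOutputStream data = new DataOutputStream(out);
    for (byte[] record : records) {
      data.writeInt(record.length);
      data.write(record);
    }
    data.flush();
  }

  // Reads records until the stream ends. For simplicity, an EOF while reading
  // the length prefix (even a partial one) is treated as the end of the file.
  static List<byte[]> readRecords(InputStream in) throws IOException {
    DataInputStream data = new DataInputStream(in);
    List<byte[]> records = new ArrayList<>();
    while (true) {
      int length;
      try {
        length = data.readInt();
      } catch (EOFException endOfFile) {
        return records;
      }
      if (length < 0) {
        throw new IOException("Corrupt length prefix: " + length);
      }
      byte[] record = new byte[length];
      // readFully throws EOFException if the payload is truncated.
      data.readFully(record);
      records.add(record);
    }
  }
}
```

Passing the serialized objects to writeRecords and later feeding the same bytes to readRecords should give back the original byte arrays in order, which can then be deserialized into Thrift objects.
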

On Fri, Apr 26, 2019 at 12:50 PM Jeff Klukas <[email protected]> wrote:

> You can use the read* and write* methods of FileIO to read and write
> arbitrary binary files. The examples in the Javadoc for FileIO [0] include
> an example of reading the entire contents of a file as a string into a Beam
> record, along with metadata about the file.
>
> If a one-to-one mapping of files to records is fine for your use case,
> then it should be fairly straightforward to read and write byte arrays.
>
> If your files are large and contain many logical records, then you need a
> way to understand the format of a binary file in order to break it up into
> records.
>
> To support writing batched records in an arbitrary file format, you could
> build a custom implementation of FileIO.Sink [1]. There are existing
> pre-built sinks for newline-delimited text (TextIO.Sink), Avro, xml, etc.
> which may or may not meet your needs.
>
> [0]
> https://beam.apache.org/releases/javadoc/2.12.0/org/apache/beam/sdk/io/FileIO.html
> [1]
> https://beam.apache.org/releases/javadoc/2.12.0/org/apache/beam/sdk/io/FileIO.Sink.html
>
> On Fri, Apr 26, 2019 at 3:13 PM Nikhil Goyal <[email protected]> wrote:
>
>> Hi,
>>
>> Is there a way to read and write binary files in beam?
>>
>> Thanks
>> Nikhil
