TextIO doesn't retain file names. It looks like the proposed API for reading whole files [1] retains file names, so you should be able to use it to produce a PCollection of KV<filename, data> once it's available.
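Until then, one possible workaround is to do the pairing yourself. Below is a minimal plain-Python sketch of the per-file logic, not Beam API: the function name lines_with_filename is hypothetical, and it assumes the NFS paths are readable from wherever the code runs. In a pipeline, the same loop body could live inside a DoFn that receives one file name per element.

```python
import os
import tempfile
from typing import Iterable, Iterator, Tuple


def lines_with_filename(file_names: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (file_name, line) pairs, streaming each file line by line.

    Hypothetical helper for illustration; not part of any Beam SDK.
    """
    for file_name in file_names:
        # Iterating over the file object reads one line at a time, so a
        # multi-GB file is never loaded into memory all at once.
        with open(file_name, "r", encoding="utf-8") as handle:
            for line in handle:
                yield file_name, line.rstrip("\n")


if __name__ == "__main__":
    # Small demo: a temporary file stands in for a file on NFS.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.txt")
        with open(path, "w", encoding="utf-8") as out:
            out.write("first\nsecond\n")
        for name, line in lines_with_filename([path]):
            print(os.path.basename(name), line)
```

Because the generator emits one pair at a time, memory use stays flat whether a file is a few KBs or a few GBs. Note that a single large file is still read sequentially by one worker in this sketch.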
- Cham

[1] https://issues.apache.org/jira/browse/BEAM-2750

On Mon, Aug 21, 2017 at 9:59 PM Siddharth Mittal <[email protected]> wrote:

> Hi Team,
>
> I have a use case where I will get a PCollection of file names.
>
> Files are present on NFS and file size may vary from a few KBs to a few GBs.
>
> We want to transform a PCollection of file names into a PCollection of <FileName, File Line>.
>
> Please suggest how to handle this type of use case.
>
> Thanks & Regards
>
> Siddharth Mittal
> Senior Associate | Sapient
