This is not specific to the batch work. It is more generic functionality that streaming applications can benefit from as well.
The separate port carries both the actual tuple and the metadata.

~ Bhupesh

_______________________________________________________

Bhupesh Chawda

E: bhup...@datatorrent.com | Twitter: @bhupeshsc

www.datatorrent.com | apex.apache.org

On Sat, Jun 3, 2017 at 9:37 AM, Thomas Weise <t...@apache.org> wrote:

> How does this relate to the batch control tuples work?
>
> With a separate port, how can a downstream operator relate the metadata to
> the tuples emitted from the primary port?
>
> --
> sent from mobile
> On Jun 2, 2017 12:06 PM, "Bhupesh Chawda" <bhup...@datatorrent.com> wrote:
>
> > Hi,
> >
> > Emitting file information for a file based source like a file input
> > operator in malhar seems like a good feature to provide. It is useful
> > information for any downstream operator to know that a data tuple
> > belongs to a certain file, for instance.
> >
> > We propose to add capability in the abstract file input operator to emit
> > file control tuples. These control tuples can include filenames as well as
> > any metadata that the user wishes to include along with it.
> >
> > To link this meta data to each tuple, we can add another port to the input
> > operator which would carry the meta data along with the actual tuple. We
> > can try to reduce the amount of meta data that goes with each tuple by
> > having some sort of meta encoding in the control tuple.
> >
> > ~ Bhupesh
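
To make the "meta encoding" idea from the original proposal concrete, here is a minimal sketch in plain Java. It does not use the Apex or Malhar APIs. Every name in it (FileMetaSketch, FileControlTuple, TaggedTuple, Emitter, Resolver) is hypothetical, and the port is modelled as an in-memory list. It assumes the control tuple for a file reaches the downstream operator before that file's data tuples on the same port, so each data tuple only needs to carry a small file id.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only; none of these types exist in Apex or Malhar.
final class FileMetaSketch {

    /** Control tuple: sent once per file, binds a compact id to the file's metadata. */
    static final class FileControlTuple {
        final int fileId;
        final String fileName;
        final Map<String, String> metadata;

        FileControlTuple(int fileId, String fileName, Map<String, String> metadata) {
            this.fileId = fileId;
            this.fileName = fileName;
            this.metadata = metadata;
        }
    }

    /** Data tuple on the extra port: the payload plus only the compact file id. */
    static final class TaggedTuple {
        final int fileId;
        final String payload;

        TaggedTuple(int fileId, String payload) {
            this.fileId = fileId;
            this.payload = payload;
        }
    }

    /** Stand-in for the input operator; the list plays the role of the extra port. */
    static final class Emitter {
        final List<Object> port = new ArrayList<>();
        private int nextId = 0;

        /** Emits the control tuple for a new file and returns the id assigned to it. */
        int beginFile(String fileName, Map<String, String> metadata) {
            int id = nextId++;
            port.add(new FileControlTuple(id, fileName, metadata));
            return id;
        }

        /** Emits one data tuple tagged with the id of the file it came from. */
        void emit(int fileId, String payload) {
            port.add(new TaggedTuple(fileId, payload));
        }
    }

    /** Stand-in for a downstream operator that resolves ids back to file names. */
    static final class Resolver {
        private final Map<Integer, FileControlTuple> files = new HashMap<>();
        final List<String> resolved = new ArrayList<>();

        void process(Object tuple) {
            if (tuple instanceof FileControlTuple) {
                FileControlTuple c = (FileControlTuple) tuple;
                files.put(c.fileId, c);
            } else if (tuple instanceof TaggedTuple) {
                TaggedTuple t = (TaggedTuple) tuple;
                FileControlTuple c = files.get(t.fileId);
                // A data tuple whose control tuple was never seen cannot be resolved.
                String name = (c == null) ? "<unknown>" : c.fileName;
                resolved.add(name + ":" + t.payload);
            }
        }
    }

    /** Runs two files through the emitter and resolves every data tuple downstream. */
    public static List<String> run() {
        Emitter in = new Emitter();
        int a = in.beginFile("a.txt", Collections.singletonMap("owner", "etl"));
        in.emit(a, "line1");
        in.emit(a, "line2");
        int b = in.beginFile("b.txt", Collections.<String, String>emptyMap());
        in.emit(b, "line1");

        Resolver down = new Resolver();
        for (Object tuple : in.port) {
            down.process(tuple);
        }
        return down.resolved;
    }

    public static void main(String[] args) {
        for (String line : run()) {
            System.out.println(line);
        }
    }
}
```

Because the control tuple and the tagged data tuples travel on the same port in this sketch, the downstream side never has to correlate two streams. That is the property the reply above relies on.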