The way this is supposed to work: if the downstream operator wants both data and metadata, it connects to the new port ("...which would carry the meta data along with the actual tuple..."); otherwise it connects to the legacy port, which continues to behave the same way. Note that each tuple on the new port carries its metadata with it.
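To make the two-port idea concrete, here is a rough sketch in plain Java. It deliberately does not use the Apex or Malhar classes: `SketchPort`, `TupleWithMeta`, `FileInputSketch`, and `emitLine` are all hypothetical names made up for illustration, and the only metadata assumed is the file name. It is meant to show the shape of the proposal, not how the abstract file input operator would actually implement it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Small demo: one consumer on the legacy port, one on the new port.
final class TwoPortDemo {
    public static void main(String[] args) {
        FileInputSketch op = new FileInputSketch();
        op.output.connect(line -> System.out.println("legacy: " + line));
        op.outputWithMeta.connect(
            t -> System.out.println("with meta: " + t.fileName + " -> " + t.data));
        op.emitLine("a.txt", "first line");
    }
}

// Hypothetical stand-in for an output port (NOT the Apex port API).
final class SketchPort<T> {
    private final List<Consumer<T>> sinks = new ArrayList<>();

    void connect(Consumer<T> sink) { sinks.add(sink); }

    boolean isConnected() { return !sinks.isEmpty(); }

    void emit(T tuple) {
        for (Consumer<T> sink : sinks) {
            sink.accept(tuple);
        }
    }
}

// Hypothetical wrapper: the data tuple plus the metadata that travels with it.
final class TupleWithMeta<T> {
    final T data;
    final String fileName; // assumed metadata; a real operator could carry more

    TupleWithMeta(T data, String fileName) {
        this.data = data;
        this.fileName = fileName;
    }
}

// Hypothetical input operator exposing both ports.
final class FileInputSketch {
    // Legacy port: data only, unchanged behavior.
    final SketchPort<String> output = new SketchPort<>();
    // New port: every tuple carries its metadata.
    final SketchPort<TupleWithMeta<String>> outputWithMeta = new SketchPort<>();

    void emitLine(String fileName, String line) {
        if (output.isConnected()) {
            output.emit(line);
        }
        if (outputWithMeta.isConnected()) {
            outputWithMeta.emit(new TupleWithMeta<>(line, fileName));
        }
    }
}
```

In this sketch a downstream operator picks exactly one of the two ports, so existing applications wired to the legacy port see no change.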
+1 for the proposal.

On Sat, Jun 3, 2017 at 9:37 AM, Thomas Weise <t...@apache.org> wrote:

> How does this relate to the batch control tuples work?
>
> With a separate port, how can a downstream operator relate the metadata to
> the tuples emitted from the primary port?
>
> --
> sent from mobile
>
> On Jun 2, 2017 12:06 PM, "Bhupesh Chawda" <bhup...@datatorrent.com> wrote:
>
> > Hi,
> >
> > Emitting file information for a file based source like a file input
> > operator in malhar seems like a good feature to provide. It is useful
> > information for any downstream operator to know that a data tuple
> > belongs to a certain file, for instance.
> >
> > We propose to add capability in the abstract file input operator to emit
> > file control tuples. These control tuples can include filenames as well as
> > any metadata that the user wishes to include along with it.
> >
> > To link this meta data to each tuple, we can add another port to the input
> > operator which would carry the meta data along with the actual tuple. We
> > can try to reduce the amount of meta data that goes with each tuple by
> > having some sort of meta encoding in the control tuple.
> >
> > ~ Bhupesh
> >
> > _______________________________________________________
> > Bhupesh Chawda
> > E: bhup...@datatorrent.com | Twitter: @bhupeshsc
> > www.datatorrent.com | apex.apache.org