Would this give users the impression that, for a batch use case, they must
use only batch-aware operators? If so, is that what we intend? I am not
familiar with how we implement batch scenarios today, so this may be a basic
question.

-Priyanka

On Mon, Jan 16, 2017 at 12:02 PM, Bhupesh Chawda <bhup...@datatorrent.com>
wrote:

> Hi All,
>
> While the design and implementation of custom control tuples is ongoing, I
> thought it would be a good idea to consider their usefulness in one of the
> use cases: batch applications.
>
> This is a proposal to adapt / extend existing operators in the Apache Apex
> Malhar library so that it is easy to use them in batch use cases.
> Naturally, this would apply to only a subset of operators, such as those
> for files, JDBC and NoSQL databases.
> For example, for a file-based store (say, HDFS), we could have
> FileBatchInput and FileBatchOutput operators which allow easy integration
> into a batch application. These operators would be extended from their
> existing implementations and would be "Batch Aware", in that they may
> understand the meaning of some specific control tuples that flow through
> the DAG. Start batch and end batch seem to be the obvious candidates. On
> receipt of such control tuples, the operators may modify their behavior,
> for example by reinitializing some metrics or finalizing an output file.
>
> We can discuss the potential control tuples and actions in detail, but
> first I would like to understand the views of the community for this
> proposal.
>
> ~ Bhupesh
>
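
To make the quoted proposal concrete, here is a minimal sketch in plain Java
of what a "Batch Aware" output operator might look like. It does not use the
Apex or Malhar APIs: ControlTuple, FileOutputSketch, FileBatchOutputSketch
and their methods are hypothetical names made up for illustration, and
"finalizing a file" is only simulated by snapshotting an in-memory buffer.

```java
import java.util.ArrayList;
import java.util.List;

// Small driver showing one batch flowing through the hypothetical operator.
class BatchSketchDemo {
    public static void main(String[] args) {
        FileBatchOutputSketch out = new FileBatchOutputSketch();
        out.onControl(ControlTuple.START_BATCH);
        out.process("a");
        out.process("b");
        out.onControl(ControlTuple.END_BATCH);
        System.out.println(out.finalizedFiles()); // prints [[a, b]]
    }
}

// Hypothetical control signals; the thread only names "start batch" and "end batch".
enum ControlTuple { START_BATCH, END_BATCH }

// Hypothetical stand-in for an existing output operator (not the Malhar API).
class FileOutputSketch {
    protected final List<String> pending = new ArrayList<>();
    protected long tuplesInBatch = 0;

    // Buffer a data tuple and update a simple metric.
    void process(String tuple) {
        pending.add(tuple);
        tuplesInBatch++;
    }
}

// "Batch aware" extension: reacts to control tuples in addition to data tuples.
class FileBatchOutputSketch extends FileOutputSketch {
    private final List<List<String>> finalizedFiles = new ArrayList<>();

    void onControl(ControlTuple control) {
        switch (control) {
            case START_BATCH:
                // Reinitialize per-batch state and metrics.
                pending.clear();
                tuplesInBatch = 0;
                break;
            case END_BATCH:
                // Simulate finalizing the output file for this batch.
                finalizedFiles.add(new ArrayList<>(pending));
                pending.clear();
                break;
        }
    }

    List<List<String>> finalizedFiles() { return finalizedFiles; }

    long tuplesInBatch() { return tuplesInBatch; }
}
```

In this sketch the base class still works on its own for streaming use, and
only the extended class knows about batch boundaries, which is one way to
read the "extended from their existing implementations" part of the proposal.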
