Please see my comments inline.

Thank you,

Vlad

On 4/1/17 09:13, Sanjay Pujare wrote:
Sounds like a good idea, +1

Couple of questions/comments:
- "...re-open the port and wait till the stream becomes active again...".
How is the operator informed that the stream is active again or is it just
a property of the output port that the operator needs to check?
There are multiple options, and we will need to see which one gives the most flexibility and the best performance. One of them is to let an operator register a callback in the open-port call. Until the stream is active, emitting on the port should raise an exception.
- Say an operator has 2 input ports one of which becomes inactive as per
this scenario. It cannot be undeployed because of the other port but the
operator needs to behave differently because it's a different DAG now. Will
a control tuple inform it that the input port is closing?
Again, the details will need to be fleshed out. It may be an EOS (end of stream) control tuple. An operator with 2 input ports may not even care that one of its input ports is connected to an inactive stream. It should behave as if the upstream operator does not emit anything on that port.
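
To make the callback option a bit more concrete, here is a rough sketch in Java. Everything in it is hypothetical: ToggleableOutputPort, StreamActiveListener, streamActivated() and the three states are names made up for illustration and are not part of the Apex API. The list only stands in for the downstream operators, and streamActivated() stands in for whatever signal the platform would send once the downstream operators are redeployed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only: none of these names exist in the Apex API.
class ToggleableOutputPort<T> {

    /** Hypothetical callback, invoked once the stream is active again. */
    public interface StreamActiveListener {
        void onStreamActive();
    }

    private enum State { CLOSED, OPENING, ACTIVE }

    // Ports are open by default, as in the proposal.
    private State state = State.ACTIVE;
    private StreamActiveListener listener;
    // Stands in for the downstream operators.
    private final List<T> sink = new ArrayList<>();

    /** The operator declares that it will not emit; the stream becomes inactive. */
    public void close() {
        state = State.CLOSED;
        listener = null;
    }

    /** The operator asks to re-open the port and registers a callback. */
    public void open(StreamActiveListener callback) {
        if (state == State.ACTIVE) {
            // Nothing to wait for, so notify immediately.
            if (callback != null) {
                callback.onStreamActive();
            }
            return;
        }
        listener = callback;
        state = State.OPENING;
    }

    /** Stand-in for the platform reporting that downstream is deployed again. */
    public void streamActivated() {
        if (state != State.OPENING) {
            return;
        }
        state = State.ACTIVE;
        StreamActiveListener callback = listener;
        listener = null;
        if (callback != null) {
            callback.onStreamActive();
        }
    }

    /** Emitting on a stream that is not active raises an exception. */
    public void emit(T tuple) {
        if (state != State.ACTIVE) {
            throw new IllegalStateException("stream is not active");
        }
        sink.add(tuple);
    }

    public boolean isActive() {
        return state == State.ACTIVE;
    }

    public List<T> emitted() {
        return sink;
    }
}
```

With something along these lines, an operator would call close(), later call open(callback), and only start emitting again from the callback. Whether a callback, a property to poll, or a control tuple is the better fit is exactly what still needs to be evaluated.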


On Sat, Apr 1, 2017 at 8:12 AM, Vlad Rozov <v.ro...@datatorrent.com> wrote:

All,

Currently Apex assumes that an operator can emit on any defined output
port and all streams defined by a DAG are active. I'd like to propose an
ability for an operator to open and close output ports. By default all
ports defined by an operator will be open. In the case an operator for any
reason decides that it will not emit tuples on the output port, it may
close it. This will make the stream inactive and the application master may
undeploy the downstream (for that input stream) operators. If this leads to
containers that don't have any active operators, those containers may be
undeployed as well leading to better cluster resource utilization and
better Apex elasticity. Later, the operator may be in a state where it
needs to emit tuples on the closed port. In this case, it needs to re-open
the port and wait till the stream becomes active again before emitting
tuples on that port. Making an inactive stream active again requires the
application master to re-allocate containers and re-deploy the downstream
operators.

It should also be possible for an application designer to mark streams as
inactive when an application starts. This will allow the application master
to avoid reserving all containers when the application starts. Later, the
port can be opened and the inactive stream becomes active.

Thank you,

Vlad


