All,

Currently, Apex assumes that an operator can emit on any defined output port and that all streams defined by a DAG are active. I'd like to propose giving an operator the ability to open and close its output ports. By default, all ports defined by an operator will be open. If an operator decides, for any reason, that it will not emit tuples on an output port, it may close that port. This makes the stream inactive, and the application master may undeploy the downstream operators that consume that stream. If this leaves containers without any active operators, those containers may be undeployed as well, leading to better cluster resource utilization and better Apex elasticity.

Later, the operator may reach a state where it needs to emit tuples on the closed port. In that case, it needs to re-open the port and wait until the stream becomes active again before emitting tuples on that port. Making an inactive stream active again requires the application master to re-allocate containers and re-deploy the downstream operators.

It should also be possible for an application designer to mark streams as inactive when an application starts. This will allow the application master to avoid reserving all containers when the application starts. Later, the port can be opened and the inactive stream becomes active.
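
To make the proposed port lifecycle concrete, here is a rough Java sketch of the states a port could go through. It is only an illustration: the class `PortLifecycleSketch`, its `State` enum, and the methods `open()`, `close()`, `onStreamActive()` and `emit()` are all hypothetical names and are not part of the existing Apex API. The sketch assumes the platform notifies the port (here via `onStreamActive()`) once the application master has re-deployed the downstream operators.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of the proposed port states; not an Apex class. */
final class PortLifecycleSketch {
    /** OPEN: stream active. CLOSED: stream inactive. OPENING: re-open requested, stream not yet active. */
    enum State { OPEN, CLOSED, OPENING }

    private State state;
    private final List<String> emitted = new ArrayList<>();

    /** A designer could mark the stream inactive at start by passing false. */
    PortLifecycleSketch(boolean initiallyOpen) {
        this.state = initiallyOpen ? State.OPEN : State.CLOSED;
    }

    State state() { return state; }

    List<String> emitted() { return emitted; }

    /** Operator decides not to emit; downstream operators may now be undeployed. */
    void close() { state = State.CLOSED; }

    /** Operator asks to emit again; the stream is not active yet. */
    void open() {
        if (state == State.CLOSED) {
            state = State.OPENING;
        }
    }

    /** Assumed callback: downstream operators have been re-deployed. */
    void onStreamActive() {
        if (state == State.OPENING) {
            state = State.OPEN;
        }
    }

    /** Accepts a tuple only while the stream is active; returns false otherwise. */
    boolean emit(String tuple) {
        if (state != State.OPEN) {
            return false;
        }
        emitted.add(tuple);
        return true;
    }
}
```

In this sketch, `emit()` simply rejects tuples while the port is closed or still opening. Whether the real behavior should be to reject, block, or buffer in that window is an open design question.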

Thank you,

Vlad
