Hi All,

I am planning to provide support for adding modules to the JSON and property file specification of a DAG. We will go with the same syntax as discussed in the following mail thread:
https://mail-archives.apache.org/mod_mbox/incubator-apex-dev/201512.mbox/%3C565D0E2C.2000805%40datatorrent.com%3E

There are multiple choices for the design, and I would like suggestions from the community about how to go about adding this support.

To support the JSON/property format, Module should support the following functionality:
- The property format uses the setSource and addSink methods of StreamMeta, which are not properly supported for Module.
- Module needs the capability of extracting port objects given their names. Operator provides this functionality through Operators.describe.

Approach 1)
Module meta and operator meta share common fields such as name and port information. We can separate these out into a common class NodeMeta and derive OperatorMeta and ModuleMeta from it. This class will handle extracting port information from the object. The changes involved are:
- Split the OperatorMeta object.
- PortMeta objects will contain NodeMeta references rather than OperatorMeta references; in places where the code expects an OperatorMeta from the PortMeta, we will have to perform an unchecked cast from NodeMeta to OperatorMeta.
- Support the setSource and addSink methods.
- Change Operators.describe to accept a Module or an Operator and return the port mapping.
- Add instanceof checks in a few places to determine whether an object is a Module or an Operator before calling the specific API while working with property and JSON files.
- Change the signature of a few methods that accept Operator to accept Object, as Module and Operator do not share a common parent but require common processing.
- Replace OperatorMeta with NodeMeta in most of the classes.

Approach 2)
Make Module extend Operator and make ModuleMeta extend OperatorMeta. The changes involved will be:
- Change the Module interface to extend Operator.
- Add support for setSource and addSink for Modules.
- addOperator will inspect the type of the object and call addModule if the operator is a module.

Disadvantages:
- This will break compatibility, but this should not be a problem as Module is still an evolving API.
- Operator lifecycle methods will not get executed for a module, and we can document this.

Advantages:
- This will avoid code duplication in the multiple places where Module and Operator are handled similarly, and there is a lot of similarity while defining the logical DAG.
- This will also automatically add Module support to APIs being developed for operators, for example the high-level API.
- This will make module and operator interchangeable in an application.

I prefer Approach 2.

Regards,
-Tushar.
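
P.S. To make Approach 2 a bit more concrete, here is a rough sketch of the addOperator dispatch. It is only an illustration under assumptions: the Operator, Module, and LogicalPlan types below are simplified, hypothetical stand-ins defined inside the sketch, not the real Apex interfaces, and the method signatures are not meant to match the actual API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ModuleSketch {
    // Hypothetical stand-in for an operator; setup() represents a lifecycle method.
    interface Operator {
        default void setup() {}
    }

    // Approach 2: Module extends Operator. Lifecycle methods such as setup()
    // would simply never be invoked on a module.
    interface Module extends Operator {
        void populateDAG(LogicalPlan dag);
    }

    // Hypothetical stand-in for the logical DAG.
    static class LogicalPlan {
        final Map<String, Operator> operators = new LinkedHashMap<>();
        final Map<String, Module> modules = new LinkedHashMap<>();

        // addOperator inspects the type and delegates to addModule for modules.
        <T extends Operator> T addOperator(String name, T operator) {
            if (operator instanceof Module) {
                addModule(name, (Module) operator);
            } else {
                operators.put(name, operator);
            }
            return operator;
        }

        <T extends Module> T addModule(String name, T module) {
            modules.put(name, module);
            return module;
        }
    }

    public static void main(String[] args) {
        LogicalPlan dag = new LogicalPlan();
        dag.addOperator("reader", new Operator() {});
        dag.addOperator("parserModule", (Module) d -> {});
        // prints: operators=[reader] modules=[parserModule]
        System.out.println("operators=" + dag.operators.keySet()
                + " modules=" + dag.modules.keySet());
    }
}
```

With this shape, code that parses the JSON or property file could call addOperator for every node without checking whether it is a module first.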
