This brings one more issue that we did not cover in the module design discussions. What happens when new versions of operators embedded into a module become available? I believe we all go with the assumption that module will pickup version available on the classpath at run-time and it is operator developer responsibility to provide full binary compatibility. Another possible behavior is to consider module as completely independent unit and package all necessary libraries along with the module.

Thank you,

Vlad

On 11/3/15 09:40, Pramod Immaneni wrote:
I am not suggesting we leak the internals or compromise access modifiers. I
want the module developer to have the ability (not mandatory) to make
available all or a subset of the properties of an operator easily if they
desire without having to create setter/getter for each of them. You don't
have to expose the operator they belong to. My preference would also be to
preserve the namespace of the properties in some way for example by
grouping them by operator name. Think about scenario where people have
built modules using kafka input operator and there is a new kafka
connection property. Without having this ability the modules have to be
changed to support this property. With this feature the module developers
have a choice whether to keep the list of kafka properties fixed in the
module or allow new properties.

Thanks

On Tue, Nov 3, 2015 at 9:31 AM, Thomas Weise <tho...@datatorrent.com> wrote:

There is also the option to inherit common module properties through a base
class.

I don't see how this is any different from an operator. The developer
decides what gets exposed and has the same options to control it.

Encapsulation is good practice, by leaking the module internals the using
code becomes brittle.

Thomas

On Tue, Nov 3, 2015 at 9:22 AM, Amol Kekre <a...@datatorrent.com> wrote:

The same goes for a Java or C++ class that changes its api. In general
this
is left to the developer, and these languages have internals as private
by
default for precisely the same purpose. The module developer must have
the
right to change internals, keep api clean/constant and expect user code
to
not break.

Thks,
Amol


On Tue, Nov 3, 2015 at 9:19 AM, Pramod Immaneni <pra...@datatorrent.com>
wrote:

For 3 and 4 can't we strike a balance between not having to expose the
operators underneath and at the same time not having to write
boilerplate
code for all the properties that the module wants to make available
outside. It can quickly become unmanageable. For example, an input
operator
has a new connection property which can be used outside and now all the
modules using that operator, their code has to be modified to just add
a
pass through setter/getter. How about treating the operator name as a
group
name and ability for module developers to easily make available/specify
all
or a subset of the properties of an operator to the user without having
to
explicitly make each of them a module property.

On Mon, Nov 2, 2015 at 5:00 PM, Amol Kekre <a...@datatorrent.com>
wrote:
3,4 should follow conventions where the creator decides the api
(including
accessibility). In general only those properties exposed by module
creator
should be settable. What the module internally does with them is
module
designer's call. Accessing internals of module from outside is
uncommon.
For exampe in Java (or C++) private fields/members are not to be
accessed.
Properties (setter and getter) are the api that module designer gives
to
the module user. It is dangerous and has unintended consequences if
module
user starts to access internals outside the api.

Partitioning should be next phase. As long as current design does not
halt
partitioning it should be ok (which I believe is true).

Thks,
Amol


On Mon, Nov 2, 2015 at 3:44 PM, Vlad Rozov <v.ro...@datatorrent.com>
wrote:

1. +1, though passing original DAG to module's populateDAG() it is
not
by
design and is the current pull request implementation details.

2. While I agree that both Module and StreamingApplication let's
module/application designer to expose DAG design reuse pattern and
StreamingApplication interface may be extending Module, it does not
seem
to
buy us much. Do we want to allow certain applications to be reused
as
Modules in other applications or should application package be
different
from Module package? The current approach is to distribute Modules
as
part
of .jar for example as part of Malhar library without necessarily
providing
all necessary dependencies. Application package on other side must
include
all dependencies not provided by the platform.

3, 4. While this will help Module designer, it may complicate
Module
maintenance and how Modules are used. What if Module designer wants
to
change it's implementation and replace one operator implementation
with
another operator? Does StreamingApplication designer need to know
internal
structure of Modules? Should Module be considered as a black box
during
Application design time as it was initially planned?

5, 6, 7 +1. This is currently proposed behavior of Module
functionality
the way I understand it.

8. We need to see what Module designer can specify for
partitioning.
One
of supported cases should be ability to specify cascading
partitioning
scheme.

Thank you,

Vlad


On 11/2/15 10:30, Pramod Immaneni wrote:

I have some comments and suggestions on the module design. I think
these
need to be taken into account before we can merge the
implementation
provided below into the mainline code. I apologize if these should
have
been brought up earlier as for some reason or the other I was out
of
loop
on this one

      https://github.com/apache/incubator-apex-core/pull/148
<

https://github.com/apache/incubator-apex-core/pull/148#issuecomment-153104963
      1. DAG scoping currently in the implementation is global for
modules,
each module's populateDAG sees the entire DAG. It should be
locally
scoped
as one module does not and should not know about another.
      2. The module has a populateDAG method with exact same syntax
as
in
StreamingApplication. Is StreamingApplication also a module,
should
it
extend that interface.
      3. Setting properties for modules is too verbose. Module
developer
needs to repeat every property they want exposed with a setter and
getter
in JAVA. I don't disagree that module developer should be able to
choose
which properties from which operators need to be exposed but the
current
way seems to duplicate code. Here is a suggestion.
           a. Allow modules to specify which operators and
properties
can
be
accessible from outside. One way is in the "populateDAG" method of
the
module when adding the operator have the ability to specify if
this
operator can be accessible from outside and which or all
properties
can
be
accessible.
           b. Provide methods in ModuleMeta or elsewhere to set
property
values by specifying the operator name (friendly name) inside the
module
and property name. If this is allowed by a. above it is successful
else
it
should fail.
           c. Allow a syntax in property files to specify the
property
in
b.
Example syntax
dt.module.<modulename>.operator.<operatorname>.prop.<
propname>
      4. For attributes same mechanism as in 3 should apply for the
operators
that are exposed by the module.  For property file, example syntax
dt.module.<modulename>.operator.<operatorname>.attr.<attrname>
      5. Module developers in addition to 3. and 4. above may
choose
to
support module level properties and attributes. These should not
be
the
default when 3. and 4. are possible but complementary, in addition
to
them.
In this case for properties they can implement setters and getters
in
the
module. For attributes the user should still be able to set the
attributes
using the dag setAttribute method. You could introduce a method in
the
module to process attributes that can get called by the engine
once
everything is set.
      6. For 5. above setting global properties and attributes for
module
is
akin to ideas that have been proposed for the application as
well. A
consistent way must be possible for applications as well even if
it
is
not
implemented now.
      7. For 5. or 6. above there should be a property file way of
specifying
the global module properties and attributes. Example syntax
dt.module.<modulename>.prop.<propname>,
dt.module.<modulename>.attr.<attrname>.
Notice the difference with 3. c. and 4 above that there is no
operator
keyword here.
      8. Partitioning needs to be consistent with what the user
will
expect
when they see module as an entity. I will send an image of
possible
examples of how the user will expect the physical plan to look in
certain
cases.

Thanks



Reply via email to