Simple Operators within Malhar (MLHR-1914)

York, Brennon Wed, 09 Dec 2015 13:44:22 -0800

All, I’ve been working on the JIRA ticket MLHR-1914 (at 
https://malhar.atlassian.net/projects/MLHR/issues/MLHR-1914?filter=allopenissues)
and I wanted to send this out to describe what I’ve been doing and get 
feedback now that it’s in a state that we can discuss ;)


Before going into depth, here is the code in my repo:
https://github.com/brennonyork/incubator-apex-malhar/tree/MLHR-1914/library/src/main/java/com/datatorrent/lib/complex
https://github.com/brennonyork/incubator-apex-malhar/tree/MLHR-1914/library/src/main/java/com/datatorrent/lib/simple
The tests are in the corresponding test directories.

So, the biggest impetus for this JIRA is that there should be a set of 
operators that (1) standardize the input and output ports and (2) make it very 
simple for a developer to implement just a ‘process’ method and forget the 
rest. Given all of this, I found that there were two sets of operators based on 
the complexity of the ports and how they map to each other. I gave them the 
package names ‘simple’ and ‘complex’ for lack of a better idea at the time. 
Feel free to propose something better :)

Under ‘simple’ are three operators:

 *   SingleInputOutput: This abstracts the input and output port (defined as 
‘input’ and ‘output’) and merely allows a user to implement a process method.
 *   SingleInputMultiOutput: Like above, but the return value from the 
‘process’ method is emitted to N output ports where N defaults to 2.
 *   MultiInputSingleOutput: N inputs are mapped into a single ‘process’ method 
with a single output port with N defaulting to 2.
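
To make the ‘simple’ shapes concrete, here is a minimal sketch of the first two. 
It is not the code in the branch: the class names ending in ‘Sketch’, the 
‘UpperCase’ and ‘Length’ examples, and the plain lists standing in for Apex 
ports are all assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for SingleInputOutput: one 'input', one 'output'.
// A plain list plays the role of the output port.
abstract class SingleInputOutputSketch<I, O> {
  final List<O> output = new ArrayList<>();

  // The only method a developer would implement.
  abstract O process(I tuple);

  // Stand-in for the 'input' port: run process() and emit the result.
  final void input(I tuple) {
    output.add(process(tuple));
  }
}

// Hypothetical stand-in for SingleInputMultiOutput: the single return
// value of process() is emitted to every one of the N output ports.
abstract class SingleInputMultiOutputSketch<I, O> {
  final List<List<O>> outputs = new ArrayList<>();

  SingleInputMultiOutputSketch(int numOutputs) {
    for (int i = 0; i < numOutputs; i++) {
      outputs.add(new ArrayList<>());
    }
  }

  // N defaults to 2, as described above.
  SingleInputMultiOutputSketch() {
    this(2);
  }

  abstract O process(I tuple);

  final void input(I tuple) {
    O result = process(tuple);
    for (List<O> port : outputs) {
      port.add(result);
    }
  }
}

// Example operators: each one only supplies process().
class UpperCase extends SingleInputOutputSketch<String, String> {
  @Override
  String process(String tuple) {
    return tuple.toUpperCase();
  }
}

class Length extends SingleInputMultiOutputSketch<String, Integer> {
  @Override
  Integer process(String tuple) {
    return tuple.length();
  }
}
```

With this shape, feeding "apex" to UpperCase puts "APEX" on its single output, 
and feeding "malhar" to Length puts 6 on both of its default two outputs.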

Under ‘complex’ are four operators:

 *   SingleInputListOutput: a single input port and ‘process’ method where the 
return value of the ‘process’ method is a list of values, with each value in the 
list emitted to the matching one of the N output ports, with N defaulting to 2.
 *   DirectMultiInputOutput: This maps N inputs to N outputs processed under a 
single ‘process’ method with N defaulting to 2.
 *   AllWayMultiInputOutput: maps N inputs to M outputs such that the ‘process’ 
method is called for each input and its return value is sent to each of the M 
output ports, with M and N defaulting to 2.
 *   AllWayMultiInputListOutput: like above except that, instead of having the 
‘process’ method return value emit to each of the M output ports, the return 
value from ‘process’ is a list with each element in the list emitting to a 
different output port. Concretely, v[0] => O[0], v[1] => O[1], etc. where v[] 
is the array of values from the ‘process’ method and O[] is the array of output 
ports.
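
The list-output mapping (v[0] => O[0], v[1] => O[1], ...) could look roughly 
like the sketch below. Again, this is not the code in the branch: 
‘ListOutputSketch’ and ‘SplitSign’ are hypothetical names, plain lists stand in 
for Apex ports, and throwing when the list length does not match the number of 
ports is only one possible policy that I am assuming here for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a list-output operator: process() returns a
// list v, and v[i] is emitted to output port O[i].
abstract class ListOutputSketch<I, O> {
  // outputs.get(i) stands in for output port O[i].
  final List<List<O>> outputs = new ArrayList<>();

  ListOutputSketch(int numOutputs) {
    for (int i = 0; i < numOutputs; i++) {
      outputs.add(new ArrayList<>());
    }
  }

  abstract List<O> process(I tuple);

  final void input(I tuple) {
    List<O> v = process(tuple);
    // Assumed policy: a length mismatch between v[] and O[] is an error.
    if (v.size() != outputs.size()) {
      throw new IllegalStateException(
          "process() returned " + v.size() + " values for "
              + outputs.size() + " output ports");
    }
    for (int i = 0; i < v.size(); i++) {
      outputs.get(i).add(v.get(i));
    }
  }
}

// Example: emit the magnitude on O[0] and the sign on O[1].
class SplitSign extends ListOutputSketch<Integer, Integer> {
  SplitSign() {
    super(2);
  }

  @Override
  List<Integer> process(Integer tuple) {
    return Arrays.asList(Math.abs(tuple), Integer.signum(tuple));
  }
}
```

The all-way, non-list variants differ only in the final loop: they would add the 
single return value to every port instead of indexing into a list.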

I’m still working through the tests and error cases (say, where 
v[].len != O[].len), but I’d love to get feedback on everything thus far! 
Also, I forgot to mention above that this work is closely related to, and will 
be the base of, MLHR-1915, whereby we can build higher-level operators such as 
‘map’, ‘filter’, ‘reduce’, ‘join’, etc. Thoughts?
