sanha commented on issue #2: [NEMO-7] Intra-TaskGroup pipelining
URL: https://github.com/apache/incubator-nemo/pull/2#issuecomment-371376883
 
 
   Regarding @johnyangk's comment (avoiding hashing and object creation), I'd 
suggest the following model.
   - Build a DAG of `TaskWrapper` (or something like that) from `taskGroupDag` 
of `ScheduledTaskGroup` when a `TaskGroup` is scheduled.
     - This DAG should connect its vertices through direct references 
(pointers) rather than through `Map` and `List` lookups, unlike our current 
`DAG` implementation.
     - Each `TaskWrapper` should hold a `Callable` that consumes an input 
element and produces output.
       - This `Callable` can be built from the `Transform` of `Task` that the 
wrapper wraps.
     - The `TaskWrapper` can also hold the other state that is currently 
stored in `TaskDataHandler`.
   - After this, each element of the input `Iterable` can be processed through 
this DAG of `TaskWrapper`s without computing any hash or creating any extra 
object.
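
   A minimal sketch of the wiring described above, under stated assumptions: 
the `TaskWrapper` shape, the `connectTo` and `process` methods, and the 
`PipelineSketch` driver are all hypothetical names for illustration, not 
existing Nemo classes. It uses `java.util.function.Function` in place of a 
`Callable`, because `java.util.concurrent.Callable` takes no argument. The 
point is only that each element is pushed through direct child references, 
with no `Map` lookup on the per-element path (the boxing in this toy example 
still allocates; a real `Transform` would differ).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical wrapper: holds the per-task logic and direct references to its children.
final class TaskWrapper {
  // Stands in for the logic built from the wrapped Task's Transform (assumption).
  private final Function<Object, Object> transform;
  // Downstream vertices are held as plain references, wired once at scheduling time.
  private TaskWrapper[] children = new TaskWrapper[0];
  // Hypothetical output sink, used only when this wrapper has no children.
  private final List<Object> sink;

  TaskWrapper(final Function<Object, Object> transform, final List<Object> sink) {
    this.transform = transform;
    this.sink = sink;
  }

  void connectTo(final TaskWrapper... downstream) {
    this.children = downstream;
  }

  // Processes one element and forwards the result by following references only.
  void process(final Object element) {
    final Object output = transform.apply(element);
    if (children.length == 0) {
      sink.add(output);
      return;
    }
    for (final TaskWrapper child : children) {
      child.process(output);
    }
  }
}

final class PipelineSketch {
  // Builds a two-vertex chain (add one, then double) and streams the input through it.
  static List<Object> run(final Iterable<?> input) {
    final List<Object> results = new ArrayList<>();
    final TaskWrapper root = new TaskWrapper(x -> ((Integer) x) + 1, null);
    final TaskWrapper leaf = new TaskWrapper(x -> ((Integer) x) * 2, results);
    root.connectTo(leaf);
    for (final Object element : input) {
      root.process(element);
    }
    return results;
  }

  public static void main(final String[] args) {
    System.out.println(run(Arrays.asList(1, 2, 3)));
  }
}
```

   In this sketch the DAG is built once per scheduled `TaskGroup`, so the cost 
of resolving edges is paid at wiring time rather than for every element.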

