[ 
https://issues.apache.org/jira/browse/SAMZA-1659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16438980#comment-16438980
 ] 

ASF GitHub Bot commented on SAMZA-1659:
---------------------------------------

GitHub user nickpan47 opened a pull request:

    https://github.com/apache/samza/pull/475

    SAMZA-1659: Serializable OperatorSpec

    This change is to make the user supplied functions serializable. Hence, 
making the full user defined DAG serializable.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/nickpan47/samza 
serializable-opspec-only-Jan-24-18

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/samza/pull/475.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #475
    
----
commit 5573a069e95f6467b92d73b3cba37f035e067ae2
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date:   2017-06-12T04:20:36Z

    WIP: new api revision

commit 8bb975204bd8a5ad3459642609c9e4992c495701
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date:   2017-06-12T07:38:40Z

    WIP: proto-type of input/output stream/system specs

commit aeb457303779e7f31b114ad23bfa9ebaafeddab6
Author: Xinyu Liu <xiliu@...>
Date:   2017-06-29T00:16:10Z

    SAMZA-1321: Propagate end-of-stream and watermark messages
    
    The patch completes the end-of-stream work flow across multi-stage 
pipeline. It also contains initial commit for supporting watermarks. For 
watermark, there are issues raised in the review feedback and will be addressed 
by further prs. The main logic this patch adds:
    
    - EndOfStreamManager aggregates the end-of-stream control messages, 
propagate the result to to downstream intermediate topics based on the topology 
of the IO in the StreamGraph.
    
    - WatermarkManager aggregates the watermark control messages from the 
upstage tasks, pass it through the operators, and propagate it to downstream.
    
    In operator impl, I implemented similar watermark logic as Beam for 
watermark propagation:
    * InputWatermark(op) = min { OutputWatermark(op') | op1 is upstream of op}
    * OutputWatermark(op) = min { InputWatermark(op), OldestWorkTime(op) }
    
    Add quite a few unit tests and integration test. The code is 100% covered 
as reported by Intellij. Both control messages work as expected.
    
    Author: Xinyu Liu <xi...@xiliu-ld.linkedin.biz>
    
    Reviewers: Yi Pan <nickpa...@gmail.com>
    
    Closes #236 from xinyuiscool/SAMZA-1321

commit ae3dc6ff133fa3f0c7d9f27b6f724c652105680c
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date:   2017-06-12T04:20:36Z

    WIP: new api revision

commit 91f364f1e3e350c4c9d6413e620d9506a1da91c8
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date:   2017-06-12T07:38:40Z

    WIP: proto-type of input/output stream/system specs

commit b898e6c037e4fb4c9ddf1baee95095045e6500ab
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date:   2017-06-13T01:43:50Z

    WIP: new-api-v2

commit cd528c1c30646a3307c81770f3dc82482bba1aba
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date:   2017-06-21T16:12:41Z

    WIP: updated spec and user DAG API

commit 0bc7ee7babd0a64ed4d7cf6b7fab24be3a3447ae
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date:   2017-06-30T16:55:09Z

    WIP: update the user code example on new APIs

commit 51541e133216b9c730f57653f5f7afb6c56d21a5
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date:   2017-07-11T16:07:32Z

    WIP: cleanup StreamDescriptor

commit 3c50629eb93b674731398f9f8c756e499da5faf2
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date:   2017-07-27T22:05:07Z

    WIP: adding support for low-level task APIs

commit 4a6a58dcb80fb2919f5bad42f6f1979960826d62
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date:   2017-07-31T09:51:30Z

    WIP: updated w/ low-level task API and global var ingestion/metrics reporter

commit 256155ad530c48af0cf60ba8945e8face0133a90
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date:   2017-07-31T17:29:22Z

    WIP: new API user code examples

commit f227380f20503734decca75c0f34aa5810b7f1c0
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date:   2017-08-02T07:50:54Z

    WIP: update the app runner classes

commit e6fb96e574aa9ee78b3beedd6921e4a680fa9c73
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date:   2017-08-08T14:35:39Z

    WIP: merged all application types into StreamApplications

commit 525d8bc1b96149299ad831d1ba3d458d6dd3e4b1
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date:   2017-08-09T18:02:31Z

    Merge branch '0.14.0' into new-api-v2
    
    Conflicts:
        samza-core/src/main/java/org/apache/samza/operators/StreamGraphImpl.java
        
samza-core/src/main/java/org/apache/samza/operators/impl/OutputOperatorImpl.java
        samza-core/src/main/java/org/apache/samza/processor/StreamProcessor.java
        
samza-core/src/main/java/org/apache/samza/runtime/LocalContainerRunner.java

commit 6fc6d4c09d13132ae3c915de6bb6223206cb4ca2
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date:   2017-08-25T16:49:18Z

    WIP: experiment code to implement an end-to-end working example for new APIs

commit d7df6ed0ef3f6ca8ad5ee4cb7e7307c314a71619
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date:   2017-08-31T22:41:41Z

    WIP: added all unit test for OperatorSpec#copy methods.

commit dde1ab14bee0ff4ebbc15500a938b191295dbd90
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date:   2017-09-06T16:43:37Z

    WIP: first end-to-end test

commit 50201728e964cbe3c997b4e335df3cd2a46ed076
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date:   2017-10-19T23:48:56Z

    Merge branch 'experiment-new-api-v2' into new-api-v2-0.14
    
    Conflicts:
        samza-api/src/main/java/org/apache/samza/operators/MessageStream.java
        samza-api/src/main/java/org/apache/samza/operators/StreamGraph.java
        samza-api/src/main/java/org/apache/samza/operators/windows/Windows.java
        
samza-api/src/main/java/org/apache/samza/operators/windows/internal/WindowInternal.java
        samza-api/src/main/java/org/apache/samza/serializers/Serde.java
        samza-api/src/main/java/org/apache/samza/system/StreamSpec.java
        
samza-core/src/main/java/org/apache/samza/operators/MessageStreamImpl.java
        samza-core/src/main/java/org/apache/samza/operators/StreamGraphImpl.java
        
samza-core/src/main/java/org/apache/samza/operators/functions/PartialJoinFunction.java
        
samza-core/src/main/java/org/apache/samza/operators/impl/InputOperatorImpl.java
        
samza-core/src/main/java/org/apache/samza/operators/impl/OperatorImplGraph.java
        
samza-core/src/main/java/org/apache/samza/operators/impl/OutputOperatorImpl.java
        
samza-core/src/main/java/org/apache/samza/operators/impl/WindowOperatorImpl.java
        
samza-core/src/main/java/org/apache/samza/operators/spec/InputOperatorSpec.java
        
samza-core/src/main/java/org/apache/samza/operators/spec/JoinOperatorSpec.java
        
samza-core/src/main/java/org/apache/samza/operators/spec/OperatorSpec.java
        
samza-core/src/main/java/org/apache/samza/operators/spec/OperatorSpecs.java
        
samza-core/src/main/java/org/apache/samza/operators/spec/OutputOperatorSpec.java
        
samza-core/src/main/java/org/apache/samza/operators/spec/OutputStreamImpl.java
        
samza-core/src/main/java/org/apache/samza/operators/spec/SinkOperatorSpec.java
        
samza-core/src/main/java/org/apache/samza/operators/spec/StreamOperatorSpec.java
        
samza-core/src/main/java/org/apache/samza/operators/spec/WindowOperatorSpec.java
        
samza-core/src/main/java/org/apache/samza/runtime/AbstractApplicationRunner.java
        
samza-core/src/main/java/org/apache/samza/runtime/LocalApplicationRunner.java
        
samza-core/src/main/java/org/apache/samza/runtime/RemoteApplicationRunner.java
        samza-core/src/main/java/org/apache/samza/task/StreamOperatorTask.java
        
samza-core/src/test/java/org/apache/samza/control/TestEndOfStreamManager.java
        samza-core/src/test/java/org/apache/samza/control/TestIOGraph.java
        
samza-core/src/test/java/org/apache/samza/control/TestWatermarkManager.java
        samza-core/src/test/java/org/apache/samza/example/BroadcastExample.java
        samza-core/src/test/java/org/apache/samza/example/MergeExample.java
        
samza-core/src/test/java/org/apache/samza/execution/TestExecutionPlanner.java
        
samza-core/src/test/java/org/apache/samza/execution/TestJobGraphJsonGenerator.java
        
samza-core/src/test/java/org/apache/samza/operators/TestJoinOperator.java
        
samza-core/src/test/java/org/apache/samza/operators/TestMessageStreamImpl.java
        
samza-core/src/test/java/org/apache/samza/operators/TestStreamGraphImpl.java
        
samza-core/src/test/java/org/apache/samza/operators/TestWindowOperator.java
        
samza-core/src/test/java/org/apache/samza/operators/impl/TestOperatorImpl.java
        
samza-core/src/test/java/org/apache/samza/operators/impl/TestOperatorImplGraph.java
        
samza-core/src/test/java/org/apache/samza/operators/spec/TestWindowOperatorSpec.java
        
samza-core/src/test/java/org/apache/samza/runtime/TestApplicationRunnerMain.java
        
samza-core/src/test/java/org/apache/samza/runtime/TestLocalApplicationRunner.java
        
samza-test/src/main/java/org/apache/samza/example/KeyValueStoreExample.java
        
samza-test/src/main/java/org/apache/samza/example/OrderShipmentJoinExample.java
        
samza-test/src/main/java/org/apache/samza/example/PageViewCounterExample.java
        
samza-test/src/main/java/org/apache/samza/example/RepartitionExample.java
        samza-test/src/main/java/org/apache/samza/example/WindowExample.java
        
samza-test/src/test/java/org/apache/samza/test/controlmessages/EndOfStreamIntegrationTest.java
        
samza-test/src/test/java/org/apache/samza/test/controlmessages/WatermarkIntegrationTest.java
        
samza-test/src/test/java/org/apache/samza/test/operator/RepartitionWindowApp.java
        
samza-test/src/test/java/org/apache/samza/test/operator/SessionWindowApp.java
        
samza-test/src/test/java/org/apache/samza/test/operator/StreamApplicationIntegrationTestHarness.java
        
samza-test/src/test/java/org/apache/samza/test/operator/TestRepartitionWindowApp.java
        
samza-test/src/test/java/org/apache/samza/test/operator/TumblingWindowApp.java
        
samza-test/src/test/java/org/apache/samza/test/processor/TestZkLocalApplicationRunner.java

commit bf1ce907645e4a131c5765c4612c158e1855730a
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date:   2017-10-20T00:43:33Z

    WIP: removing StreamDescriptor first

commit d46403294aecc2ebdbc3a890da8c40acfa35f448
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date:   2017-10-23T16:05:48Z

    WIP: fixing unit tests after merge

commit 6a14b2afad8d57cccb17e5926cd44743b6ebd034
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date:   2017-10-30T22:25:03Z

    WIP: fixed unit test failure for Windows

commit 475a46bced8d84c7fe09b57590d3a72e0587c3c8
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date:   2017-12-15T18:59:29Z

    WIP: fixed TestZkLocalApplicationRunner. Debugging issues w/ 
TestRepartitionWindowApp (i.e. missing changelog creation step when directly 
running LocalApplicationRunner)

commit dc7da87e2eec618bb060acfec695c5b1aa280259
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date:   2017-12-22T18:27:21Z

    WIP: unit tests for serialization

commit 4102aa8ced2b5630edf3ac84e0c6298956a999b5
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date:   2018-01-02T17:36:29Z

    WIP: continued working on potential offspring integration

commit 1670aff0cefb569b109fad3928f04190a7b4e0a1
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date:   2018-01-10T22:20:22Z

    WIP: class-loading of user program logic and main() method based user 
program logic are both included in 
ThreadJobFactory/ProcessJobFactory/YarnJobFactory. ThreadJobFactory test suite 
to be fixed.

commit 0ebebfc3ba49de6d6589cd3b82eb8ce3882d034d
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date:   2018-01-19T23:11:26Z

    WIP: serialization only change

commit aca423085ed29452adc89c48aee4aa1126874328
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date:   2018-01-25T07:52:16Z

    WIP: Serialize OperatorSpec only w/o StreamApplication interface change. 
Passed all build and tests.

commit b973b105ddb8cd49ea5f084ceebea19b27254f36
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date:   2018-02-08T08:12:58Z

    Merge branch 'master' into serializable-opspec-only-Jan-24-18
    
    Conflicts:
        samza-api/src/main/java/org/apache/samza/operators/MessageStream.java
        
samza-core/src/main/java/org/apache/samza/operators/MessageStreamImpl.java
        samza-core/src/main/java/org/apache/samza/operators/StreamGraphImpl.java
        
samza-core/src/main/java/org/apache/samza/operators/impl/OperatorImpl.java
        
samza-core/src/main/java/org/apache/samza/operators/impl/OperatorImplGraph.java
        
samza-core/src/main/java/org/apache/samza/operators/impl/OutputOperatorImpl.java
        
samza-core/src/main/java/org/apache/samza/operators/impl/PartialJoinOperatorImpl.java
        
samza-core/src/main/java/org/apache/samza/operators/impl/PartitionByOperatorImpl.java
        
samza-core/src/main/java/org/apache/samza/operators/impl/WindowOperatorImpl.java
        
samza-core/src/main/java/org/apache/samza/operators/spec/InputOperatorSpec.java
        
samza-core/src/main/java/org/apache/samza/operators/spec/OperatorSpec.java
        
samza-core/src/main/java/org/apache/samza/operators/spec/OperatorSpecs.java
        
samza-core/src/main/java/org/apache/samza/operators/spec/OutputStreamImpl.java
        
samza-core/src/main/java/org/apache/samza/operators/spec/PartitionByOperatorSpec.java
        
samza-core/src/main/java/org/apache/samza/operators/spec/StreamOperatorSpec.java
        
samza-core/src/main/java/org/apache/samza/operators/stream/IntermediateMessageStreamImpl.java
        
samza-core/src/main/java/org/apache/samza/runtime/LocalApplicationRunner.java
        
samza-core/src/main/scala/org/apache/samza/coordinator/JobModelManager.scala
        
samza-core/src/test/java/org/apache/samza/execution/TestExecutionPlanner.java
        
samza-core/src/test/java/org/apache/samza/execution/TestJobGraphJsonGenerator.java
        
samza-core/src/test/java/org/apache/samza/operators/TestJoinOperator.java
        
samza-core/src/test/java/org/apache/samza/operators/TestMessageStreamImpl.java
        
samza-core/src/test/java/org/apache/samza/operators/TestStreamGraphImpl.java
        
samza-core/src/test/java/org/apache/samza/operators/impl/TestOperatorImpl.java
        
samza-core/src/test/java/org/apache/samza/operators/impl/TestOperatorImplGraph.java
        
samza-core/src/test/java/org/apache/samza/operators/impl/TestWindowOperator.java
        
samza-core/src/test/java/org/apache/samza/operators/spec/TestWindowOperatorSpec.java
        
samza-test/src/main/java/org/apache/samza/example/KeyValueStoreExample.java
        
samza-test/src/main/java/org/apache/samza/example/OrderShipmentJoinExample.java
        
samza-test/src/main/java/org/apache/samza/example/PageViewCounterExample.java
        
samza-test/src/main/java/org/apache/samza/example/RepartitionExample.java
        samza-test/src/main/java/org/apache/samza/example/WindowExample.java
        
samza-test/src/test/java/org/apache/samza/test/controlmessages/EndOfStreamIntegrationTest.java
        
samza-test/src/test/java/org/apache/samza/test/controlmessages/WatermarkIntegrationTest.java
        
samza-test/src/test/java/org/apache/samza/test/operator/RepartitionJoinWindowApp.java
        
samza-test/src/test/java/org/apache/samza/test/operator/SessionWindowApp.java
        
samza-test/src/test/java/org/apache/samza/test/operator/TestRepartitionJoinWindowApp.java
        
samza-test/src/test/java/org/apache/samza/test/operator/TumblingWindowApp.java
        
samza-test/src/test/java/org/apache/samza/test/processor/TestZkLocalApplicationRunner.java

commit 7c8d1591cef6326f32b2ffd061cc52b902e37214
Author: Yi Pan (Data Infrastructure) <nickpan47@...>
Date:   2018-02-12T08:29:26Z

    WIP: working on unit tests for trigger, broadcast, join, table, and SQL UDF 
function serialization

----


> Making user functions serializable in high-level APIs
> -----------------------------------------------------
>
>                 Key: SAMZA-1659
>                 URL: https://issues.apache.org/jira/browse/SAMZA-1659
>             Project: Samza
>          Issue Type: Improvement
>            Reporter: Yi Pan (Data Infrastructure)
>            Priority: Major
>              Labels: fluent-api
>             Fix For: 0.15.0
>
>
> In order to be able to fully serialize the user DAG in high-level APIs, we 
> need to make sure that the user-supplied functions are serializable as well. 
> This will enable:
>  # Deserialize and instantiation of per-task user functions become completely 
> managed by the system. No need to invoke StreamApplication.init() in each task
>  # Fully serialized user DAG can be distributed across the network to 
> multiple containers
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to