[GitHub] flink pull request: [FLINK-1927][py] Operator distribution rework

2015-07-30 Thread mxm
Github user mxm commented on the pull request: https://github.com/apache/flink/pull/931#issuecomment-126218526 I think we can merge this later on. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not

[GitHub] flink pull request: [FLINK-1927][py] Operator distribution rework

2015-07-30 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/931

[GitHub] flink pull request: [FLINK-1927][py] Operator distribution rework

2015-07-29 Thread mxm
Github user mxm commented on the pull request: https://github.com/apache/flink/pull/931#issuecomment-125979125 Thanks for the pull request @zentol! +1 for removing the dill library. As far as I can see, we handle all the serialization ourselves now. We only used the Dill

[GitHub] flink pull request: [FLINK-1927][py] Operator distribution rework

2015-07-29 Thread zentol
Github user zentol commented on the pull request: https://github.com/apache/flink/pull/931#issuecomment-125983186 Thanks for the review @mxm. I've addressed the cosmetic issue you mentioned, and added a small fix for a separate issue as well (error reporting was partially

[GitHub] flink pull request: [FLINK-1927][py] Operator distribution rework

2015-07-29 Thread mxm
Github user mxm commented on a diff in the pull request: https://github.com/apache/flink/pull/931#discussion_r35763245 --- Diff: flink-staging/flink-language-binding/flink-python/src/main/java/org/apache/flink/languagebinding/api/java/python/streaming/PythonStreamer.java --- @@

[GitHub] flink pull request: [FLINK-1927][py] Operator distribution rework

2015-07-22 Thread zentol
GitHub user zentol opened a pull request: https://github.com/apache/flink/pull/931 [FLINK-1927][py] Operator distribution rework Python operators are no longer serialized and shipped across the cluster. Instead the plan file is executed on each node, followed by usage of the

[GitHub] flink pull request: [FLINK-1927] [py] Operator distribution rework

2015-05-20 Thread zentol
Github user zentol closed the pull request at: https://github.com/apache/flink/pull/638

[GitHub] flink pull request: [FLINK-1927] [py] Operator distribution rework

2015-05-20 Thread zentol
Github user zentol commented on the pull request: https://github.com/apache/flink/pull/638#issuecomment-103813351 Implementing this in a clean way has become trickier than I initially expected, so I'll postpone it and close this PR for now.

[GitHub] flink pull request: [FLINK-1927] [py] Operator distribution rework

2015-04-29 Thread mxm
Github user mxm commented on the pull request: https://github.com/apache/flink/pull/638#issuecomment-97353102 Wow! Great to see that we get rid of the only python-side dependency. It was a bit unclear under which terms we could ship the library anyways. Have you done any measurements

[GitHub] flink pull request: [FLINK-1927] [py] Operator distribution rework

2015-04-29 Thread zentol
Github user zentol commented on the pull request: https://github.com/apache/flink/pull/638#issuecomment-97389037 I didn't check performance, but it shouldn't have any noticeable effect on it.

[GitHub] flink pull request: [FLINK-1927] [py] Operator distribution rework

2015-04-29 Thread zentol
Github user zentol commented on the pull request: https://github.com/apache/flink/pull/638#issuecomment-97417603 @aljoscha That variable must be declared somewhere within the plan file. During the plan rebuild this would be executed as well, so I don't think this is a problem. In

[GitHub] flink pull request: [FLINK-1927] [py] Operator distribution rework

2015-04-29 Thread aljoscha
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/638#issuecomment-97412959 But doesn't this mean that the lambdas now must be stateless, i.e. if a user refers to some variable outside the lambda this will not be serialised with the closure

[GitHub] flink pull request: [FLINK-1927] [py] Operator distribution rework

2015-04-29 Thread zentol
Github user zentol commented on the pull request: https://github.com/apache/flink/pull/638#issuecomment-97418740 Oh snap, I just noticed a big flaw... Well, let's put this PR on hold for a bit. I'm simply re-executing the plan file on each node, but forgot to deal with

[GitHub] flink pull request: [FLINK-1927] [py] Operator distribution rework

2015-04-28 Thread zentol
GitHub user zentol opened a pull request: https://github.com/apache/flink/pull/638 [FLINK-1927] [py] Operator distribution rework Python operators are no longer serialized and instead rebuilt on each node. This also means that the dill library is no longer necessary. You can merge