[GitHub] flink pull request: Collect(): Fixing the akka.framesize size limi...
Github user kl0u closed the pull request at: https://github.com/apache/flink/pull/887 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: Collect(): Fixing the akka.framesize size limi...
Github user kl0u commented on the pull request: https://github.com/apache/flink/pull/887#issuecomment-123752371 Hi @mxm. Thanks a lot! I don't have your email unfortunately. Could you somehow send it to me? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: Collect(): Fixing the akka.framesize size limi...
Github user mxm commented on the pull request: https://github.com/apache/flink/pull/887#issuecomment-123756678 @kl0u Sure, I've sent you an email. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: Collect(): Fixing the akka.framesize size limi...
Github user kl0u commented on the pull request: https://github.com/apache/flink/pull/887#issuecomment-123756855 Thanks a lot! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: Collect(): Fixing the akka.framesize size limi...
Github user mxm commented on the pull request: https://github.com/apache/flink/pull/887#issuecomment-123745096 Hi @kl0u, I'm sorry that my changes overlap so much with yours. Let me know if you have any questions. I think you have my email address? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: Collect(): Fixing the akka.framesize size limi...
Github user kl0u commented on the pull request: https://github.com/apache/flink/pull/887#issuecomment-123738197 Hello, The latest changes are pretty invasive and have a big overlap with the ones in my pull request. More specifically, the abstraction of the AccumulatorRegistry changes my implementation a lot. Consequently I have to re-implement much of my previous code. This may take some time. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: Collect(): Fixing the akka.framesize size limi...
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/887#issuecomment-123361546 Now that #896 is in, this becomes mergeable... --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: Collect(): Fixing the akka.framesize size limi...
Github user kl0u commented on the pull request: https://github.com/apache/flink/pull/887#issuecomment-119140624 Ok, sounds good! Could you give the number of the ticket of the changes @mxm is doing? Just to have a look. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: Collect(): Fixing the akka.framesize size limi...
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/887#issuecomment-119137396 I think all in all, this looks good. We have another change by @mxm pending that changes the accumulators to do live updates during runtime. Since @mxm's change is more invasive, I would suggest that we wait for it to be merged before merging this one. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: Collect(): Fixing the akka.framesize size limi...
Github user StephanEwen commented on a diff in the pull request: https://github.com/apache/flink/pull/887#discussion_r33896123 --- Diff: .gitignore --- @@ -21,3 +21,4 @@ docs/api build-target flink-staging/flink-avro/src/test/java/org/apache/flink/api/io/avro/generated/ atlassian-ide-plugin.xml +/notes/* --- End diff -- Can you remove this entry? Seems very specific to your setup, not a generic entry. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: Collect(): Fixing the akka.framesize size limi...
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/887#issuecomment-118636188 Nice to see this happening. The change is a bit more elaborate, so we'll probably need a bit for the review. Please create a JIRA ticket for the change and tag version and components (JobManager, TaskManager). All changes need to be tracked. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: Collect(): Fixing the akka.framesize size limi...
Github user StephanEwen commented on a diff in the pull request: https://github.com/apache/flink/pull/887#discussion_r33896212 --- Diff: flink-examples/flink-java-examples/src/main/java/org/apache/flink/examples/java/misc/testing/CollectTesting.java --- @@ -0,0 +1,40 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * License); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an AS IS BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.examples.java.misc.testing; + +import org.apache.flink.api.java.DataSet; +import org.apache.flink.api.java.ExecutionEnvironment; + +public class CollectTesting implements java.io.Serializable { --- End diff -- Few comments on the test: - Tests should not print messages to sysout. - Tests that include actor messaging need to be executed with multiple actor systems. Otherwise, no message serialization happens. You can start a `ForableFlinkMiniCluster` and tell it to use different actor systems for JobManager and TaskManager. The simplest thing would be to simply add the oversized accumulator test to https://github.com/apache/flink/blob/master/flink-tests/src/test/java/org/apache/flink/test/misc/MiscellaneousIssuesITCase.java - Tests need to be deterministic on multiple setups. That includes hard-setting the degree of parallelism. If not explicitly specified, the parallelism becomes the number of CPU cores in the local executor, making the test behave differently on different testing machines. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: Collect(): Fixing the akka.framesize size limi...
Github user StephanEwen commented on a diff in the pull request: https://github.com/apache/flink/pull/887#discussion_r33896142 --- Diff: flink-examples/flink-java-examples/pom.xml --- @@ -319,7 +319,30 @@ under the License. /includes /configuration /execution - + --- End diff -- I think the `collect()` example needs not be packaged into a JAR file. We package only examples as JAR files that compute something meaningful. Other examples are only includes as source. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: Collect(): Fixing the akka.framesize size limi...
Github user kl0u commented on the pull request: https://github.com/apache/flink/pull/887#issuecomment-118642591 Hello, I have integrated the changes you suggested, so now: 1) notes are no longer in the .gitignore 2) the collect example is not in the created jars, if fact it is no longer in the examples 3) the oversized accumulator test is added in the MiscellaneousIssuesITCase. And also a JIRA ticket is created. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: Collect(): Fixing the akka.framesize size limi...
Github user chiwanpark commented on the pull request: https://github.com/apache/flink/pull/887#issuecomment-118644497 FLINK-2319, I leave a commit with the JIRA ticket to track changes of this PR. :) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: Collect(): Fixing the akka.framesize size limi...
GitHub user kl0u opened a pull request: https://github.com/apache/flink/pull/887 Collect(): Fixing the akka.framesize size limitation. In Apache Flink the results of the collect() call were returned through akka to the client. This led to an inherent limitation to the size of the output of a job, as this could not exceed the akka.framesize size. In other case, akka would drop the message. To alleviate this, without dropping the benefits brought by akka and its out-of-the-box efficiency for small-sized results, we decided to keep forwarding the non-oversized (i.e. smaller than the akka.framesize) results through akka, and use the BlobCache module for the forwarding the oversized (large) ones. Now the JobManager receives end merges the small accumulators (as before), and simply forwards to the Client the keys to the blobs storing the oversized ones. Now it is the responsibility of the Client to do the final merging between oversized and non-oversized accumulators. You can merge this pull request into a Git repository by running: $ git pull https://github.com/kl0u/flink collect_fix Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/887.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #887 commit f417e2585fda1aca936b8e0637618d44cd0b81ca Author: Kostas Kloudas kklou...@gmail.com Date: 2015-07-04T14:50:48Z A first working version of collect() with unbounded Accumulator sizes. commit bf52a091b0fbb04426fa61949334cc44c548d6c2 Author: Kostas Kloudas kklou...@gmail.com Date: 2015-07-04T15:31:47Z Cleaned up the TaskManaget side. commit f0de184b0a3aac64bcaa753db0917778e031883e Author: Kostas Kloudas kklou...@gmail.com Date: 2015-07-04T18:54:08Z Cleaned up till the JobManager side. commit 10faf14c4df168da533a35fefa495c1b860ddf1d Author: Kostas Kloudas kklou...@gmail.com Date: 2015-07-04T22:56:09Z Cleaned up the code. Missing the Stringified result. commit 9cd35f46dcf5e6494196185621413ba793da0913 Author: Kostas Kloudas kklou...@gmail.com Date: 2015-07-05T00:37:12Z Fixed a version for the Stringified result. commit e5787c74e48a9bed7c503e5d2e90c51b5f33d24f Author: Kostas Kloudas kklou...@gmail.com Date: 2015-07-05T01:45:38Z Fixed a sanity check in the SerializedJobExecutionResult. commit c36bab2c54f1e6a9f401be6eb1e9a75171342212 Author: Kostas Kloudas kklou...@gmail.com Date: 2015-07-05T02:35:17Z Fixed the cleaning up of the BlobCache after the end of the job. commit 764f8bda9fbda58d3df7cac51f5b1b2c1cee10de Author: Kostas Kloudas kklou...@gmail.com Date: 2015-07-05T03:41:44Z Fixed a test bug. commit 1c1701a0bd8e4eef742d18494875176136f35233 Author: Kostas Kloudas kklou...@gmail.com Date: 2015-07-05T14:52:06Z Fixed a comment in the RuntimeEnvironment. commit 1471bc22bd32675be91c96ec5e0e8ce884fc0bd0 Author: Kostas Kloudas kklou...@gmail.com Date: 2015-07-05T15:01:59Z Fixed some method and class renaming. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---