[GitHub] flink pull request: Collect(): Fixing the akka.framesize size limi...

2015-07-24 Thread kl0u
Github user kl0u closed the pull request at:

https://github.com/apache/flink/pull/887


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: Collect(): Fixing the akka.framesize size limi...

2015-07-22 Thread kl0u
Github user kl0u commented on the pull request:

https://github.com/apache/flink/pull/887#issuecomment-123752371
  
Hi @mxm. Thanks a lot!
I don't have your email unfortunately. 
Could you somehow send it to me?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: Collect(): Fixing the akka.framesize size limi...

2015-07-22 Thread mxm
Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/887#issuecomment-123756678
  
@kl0u Sure, I've sent you an email.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: Collect(): Fixing the akka.framesize size limi...

2015-07-22 Thread kl0u
Github user kl0u commented on the pull request:

https://github.com/apache/flink/pull/887#issuecomment-123756855
  
Thanks a lot!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: Collect(): Fixing the akka.framesize size limi...

2015-07-22 Thread mxm
Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/887#issuecomment-123745096
  
Hi @kl0u, I'm sorry that my changes overlap so much with yours. Let me know 
if you have any questions. I think you have my email address?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: Collect(): Fixing the akka.framesize size limi...

2015-07-22 Thread kl0u
Github user kl0u commented on the pull request:

https://github.com/apache/flink/pull/887#issuecomment-123738197
  
Hello,

The latest changes are pretty invasive and have a big overlap with the ones 
in my pull request. 
More specifically, the abstraction of the AccumulatorRegistry changes my 
implementation a lot. 

Consequently I have to re-implement much of my previous code. 
This may take some time.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: Collect(): Fixing the akka.framesize size limi...

2015-07-21 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/887#issuecomment-123361546
  
Now that #896 is in, this becomes mergeable...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: Collect(): Fixing the akka.framesize size limi...

2015-07-07 Thread kl0u
Github user kl0u commented on the pull request:

https://github.com/apache/flink/pull/887#issuecomment-119140624
  
Ok, sounds good! 
Could you give the number of the ticket of the changes @mxm is doing? 
Just to have a look.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: Collect(): Fixing the akka.framesize size limi...

2015-07-07 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/887#issuecomment-119137396
  
I think all in all, this looks good.

We have another change by @mxm pending that changes the accumulators to do 
live updates during runtime. Since @mxm's change is more invasive, I would 
suggest that we wait for it to be merged before merging this one.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: Collect(): Fixing the akka.framesize size limi...

2015-07-05 Thread StephanEwen
Github user StephanEwen commented on a diff in the pull request:

https://github.com/apache/flink/pull/887#discussion_r33896123
  
--- Diff: .gitignore ---
@@ -21,3 +21,4 @@ docs/api
 build-target
 
flink-staging/flink-avro/src/test/java/org/apache/flink/api/io/avro/generated/
 atlassian-ide-plugin.xml
+/notes/*
--- End diff --

Can you remove this entry? Seems very specific to your setup, not a generic 
entry.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: Collect(): Fixing the akka.framesize size limi...

2015-07-05 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/887#issuecomment-118636188
  
Nice to see this happening.
The change is a bit more elaborate, so we'll probably need a bit for the 
review.

Please create a JIRA ticket for the change and tag version and components 
(JobManager, TaskManager). All changes need to be tracked.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: Collect(): Fixing the akka.framesize size limi...

2015-07-05 Thread StephanEwen
Github user StephanEwen commented on a diff in the pull request:

https://github.com/apache/flink/pull/887#discussion_r33896212
  
--- Diff: 
flink-examples/flink-java-examples/src/main/java/org/apache/flink/examples/java/misc/testing/CollectTesting.java
 ---
@@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * License); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.examples.java.misc.testing;
+
+import org.apache.flink.api.java.DataSet;
+import org.apache.flink.api.java.ExecutionEnvironment;
+
+public class CollectTesting implements java.io.Serializable {
--- End diff --

Few comments on the test:
 
 - Tests should not print messages to sysout.

 - Tests that include actor messaging need to be executed with multiple 
actor systems. Otherwise, no message serialization happens. You can start a 
`ForableFlinkMiniCluster` and tell it to use different actor systems for 
JobManager and TaskManager. The simplest thing would be to simply add the 
oversized accumulator test to 
https://github.com/apache/flink/blob/master/flink-tests/src/test/java/org/apache/flink/test/misc/MiscellaneousIssuesITCase.java

 - Tests need to be deterministic on multiple setups. That includes 
hard-setting the degree of parallelism. If not explicitly specified, the 
parallelism becomes the number of CPU cores in the local executor, making the 
test behave differently on different testing machines.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: Collect(): Fixing the akka.framesize size limi...

2015-07-05 Thread StephanEwen
Github user StephanEwen commented on a diff in the pull request:

https://github.com/apache/flink/pull/887#discussion_r33896142
  
--- Diff: flink-examples/flink-java-examples/pom.xml ---
@@ -319,7 +319,30 @@ under the License.
/includes
/configuration
/execution
-   
+
--- End diff --

I think the `collect()` example needs not be packaged into a JAR file. 

We package only examples as JAR files that compute something meaningful. 
Other examples are only includes as source.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: Collect(): Fixing the akka.framesize size limi...

2015-07-05 Thread kl0u
Github user kl0u commented on the pull request:

https://github.com/apache/flink/pull/887#issuecomment-118642591
  
Hello,

I have integrated the changes you suggested, so now:
1) notes are no longer in the .gitignore
2) the collect example is not in the created jars, if fact it is no  longer 
in the examples
3) the oversized accumulator test is added in the MiscellaneousIssuesITCase.

And also a JIRA ticket is created.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: Collect(): Fixing the akka.framesize size limi...

2015-07-05 Thread chiwanpark
Github user chiwanpark commented on the pull request:

https://github.com/apache/flink/pull/887#issuecomment-118644497
  
FLINK-2319,
I leave a commit with the JIRA ticket to track changes of this PR. :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: Collect(): Fixing the akka.framesize size limi...

2015-07-05 Thread kl0u
GitHub user kl0u opened a pull request:

https://github.com/apache/flink/pull/887

Collect(): Fixing the akka.framesize size limitation.

In Apache Flink the results of the collect() call were returned through 
akka to the client. This led to an inherent limitation to the size of the 
output of a job, as this could not exceed the akka.framesize size. In other 
case, akka would drop the message.

To alleviate this, without dropping the benefits brought by akka and its 
out-of-the-box efficiency for small-sized results, we decided to keep 
forwarding the non-oversized (i.e. smaller than the akka.framesize) results 
through akka, and use the BlobCache module for the forwarding the oversized 
(large) ones.

Now the JobManager receives end merges the small accumulators (as before), 
and simply forwards to the Client the keys to the blobs storing the oversized 
ones. Now it is the responsibility of the Client to do the final merging 
between oversized and non-oversized accumulators.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kl0u/flink collect_fix

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/887.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #887


commit f417e2585fda1aca936b8e0637618d44cd0b81ca
Author: Kostas Kloudas kklou...@gmail.com
Date:   2015-07-04T14:50:48Z

A first working version of collect() with unbounded Accumulator sizes.

commit bf52a091b0fbb04426fa61949334cc44c548d6c2
Author: Kostas Kloudas kklou...@gmail.com
Date:   2015-07-04T15:31:47Z

Cleaned up the TaskManaget side.

commit f0de184b0a3aac64bcaa753db0917778e031883e
Author: Kostas Kloudas kklou...@gmail.com
Date:   2015-07-04T18:54:08Z

Cleaned up till the JobManager side.

commit 10faf14c4df168da533a35fefa495c1b860ddf1d
Author: Kostas Kloudas kklou...@gmail.com
Date:   2015-07-04T22:56:09Z

Cleaned up the code. Missing the Stringified result.

commit 9cd35f46dcf5e6494196185621413ba793da0913
Author: Kostas Kloudas kklou...@gmail.com
Date:   2015-07-05T00:37:12Z

Fixed a version for the Stringified result.

commit e5787c74e48a9bed7c503e5d2e90c51b5f33d24f
Author: Kostas Kloudas kklou...@gmail.com
Date:   2015-07-05T01:45:38Z

Fixed a sanity check in the SerializedJobExecutionResult.

commit c36bab2c54f1e6a9f401be6eb1e9a75171342212
Author: Kostas Kloudas kklou...@gmail.com
Date:   2015-07-05T02:35:17Z

Fixed the cleaning up of the BlobCache after the end of the job.

commit 764f8bda9fbda58d3df7cac51f5b1b2c1cee10de
Author: Kostas Kloudas kklou...@gmail.com
Date:   2015-07-05T03:41:44Z

Fixed a test bug.

commit 1c1701a0bd8e4eef742d18494875176136f35233
Author: Kostas Kloudas kklou...@gmail.com
Date:   2015-07-05T14:52:06Z

Fixed a comment in the RuntimeEnvironment.

commit 1471bc22bd32675be91c96ec5e0e8ce884fc0bd0
Author: Kostas Kloudas kklou...@gmail.com
Date:   2015-07-05T15:01:59Z

Fixed some method and class renaming.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---