Re: [VOTE] Release Apache Flink 0.8.0 (RC1)
On Mon, Jan 12, 2015 at 11:22 AM, Stephan Ewen se...@apache.org wrote: It would be good to have the patch, but it is also a very tricky patch, so pushing it hastily may be problematic. I agree. @Till: would that be OK with you? If yes, I think we are good to go for the next RC.
Re: Flink @ Heise
Hey Andreas, thanks for sharing. Due to the press announcement by the ASF today, there is quite some attention in the news. For all non-germans: Heise is one of the biggest IT-news sites here. On Mon, Jan 12, 2015 at 3:10 PM, Andreas Kunft andreas.ku...@tu-berlin.de wrote: FYI http://www.heise.de/newsticker/meldung/Big-Data- Apache-Flink-wird-Top-Level-Projekt-2516177.html
Re: [VOTE] Release Apache Flink 0.8.0 (RC1)
It would be good to have the patch, but it is also a very tricky patch, so pushing it hastily may be problematic. We could put out 0.8 and very soon a 0.8.1 with that fix. Am 12.01.2015 11:12 schrieb Till Rohrmann trohrm...@apache.org: I only have to fix a last dead lock and then the fix should be working. On Mon, Jan 12, 2015 at 11:06 AM, Ufuk Celebi u...@apache.org wrote: Thanks, Marton. Yes, I think https://issues.apache.org/jira/browse/FLINK-1376. Till has a fix coming up for it. Should we wait for this or postpone it to 0.8.1? – Ufuk On 12 Jan 2015, at 10:16, Márton Balassi balassi.mar...@gmail.com wrote: Based on the requests listed in this thread Stephan and myself cherry-picked the following commits: 7634310 [FLINK-1266] Generalize DistributedFileSystem implementation (rmetzger) 40a2705 [FLINK-1266] Update mongodb link and address pull request comments (rmetzger) cd66ced [FLINK-1266] Properly pass the fs.defaulFS setting when initializing … (rmetzger) 3bf2d21 [FLINK-1266] More dependency exclusions (rmetzger) e8ab5b7 [FLINK-1266] Backport fix to 0.8 (rmetzger) 7cd0f47 [docs] Prepare documentation for 0.8 release (rmetzger) e83ccd0 [FLINK-1385] Print warning if not resources are _currently_ available… (rmetzger) 4f9dcae [FLINK-1378] [scala] Fix type extraction for nested type parameters (aljoscha) fced2eb [FLINK-1378] Add support for Throwables in KryoSerializer (aljoscha) ada35eb [FLINK-1378] [scala] Add support for Try[A] (Success/Failure) (aljoscha) a64abe1 [docs] Update README and internals (scheduling) for graduation and fi… (SephanEwen) 4f95ce7 Fix typo in README.md (uce) Both of us committed some other fixes mainly for the dependencies and licences, the distribution and the quickstarts. Pushing the results as soon as the tests pass. Does anyone have any other issue that has to be included in the next release candidate? On Sat, Jan 10, 2015 at 7:34 PM, Robert Metzger rmetz...@apache.org wrote: I've tested it on an empty YARN cluster, allocating more containers than available. Flink will then allocate as many containers as possible. On Sat, Jan 10, 2015 at 7:31 PM, Stephan Ewen se...@apache.org wrote: Seems reasonable. Have you tested it on a cluster with concurrent YARN jobs? On Sat, Jan 10, 2015 at 7:28 PM, Robert Metzger rmetz...@apache.org wrote: I would really like to include this commit into the 0.8 release as well: https://github.com/apache/flink/commit/ec2bb573d185429f8b3efe111850b8f0e67f2704 A user is affected by this issue. If you agree, I can merge it. On Sat, Jan 10, 2015 at 7:25 PM, Stephan Ewen se...@apache.org wrote: I have gone through the code, cleaned up dependencies and made sure that all licenses are correctly handled. The changes are in the public release-0.8 branch. From that side, the code is now good to go in my opinion. Stephan On Sat, Jan 10, 2015 at 12:30 PM, Robert Metzger rmetz...@apache.org wrote: I've updated the docs/_config.yml variables to reflect that hadoop2 is now the default profile: https://github.com/apache/flink/pull/294 On Fri, Jan 9, 2015 at 8:52 PM, Stephan Ewen se...@apache.org wrote: Just as a heads up. I am almost through with checking dependencies and licenses. Will commit that later tonight or tomorrow. On Fri, Jan 9, 2015 at 7:09 PM, Stephan Ewen se...@apache.org wrote: I vote to include it as well. It is sort of vital for advanced use of the Scala API. It is also an isolated change that does not affect other components, so it should be testable very well. On Fri, Jan 9, 2015 at 7:05 PM, Robert Metzger rmetz...@apache.org wrote: I'd say yes because its affecting a user. On Fri, Jan 9, 2015 at 7:00 PM, Aljoscha Krettek aljos...@apache.org wrote: I have a fix for this issue: https://issues.apache.org/jira/browse/FLINK-1378?jql=project%20%3D%20FLINK and I think this should also make it into the 0.8.0 release. What do you think? On Fri, Jan 9, 2015 at 3:28 PM, Márton Balassi balassi.mar...@gmail.com wrote: Sure, thanks Ufuk. On Fri, Jan 9, 2015 at 3:15 PM, Ufuk Celebi u...@apache.org wrote: Marton, could you also cherry-pick 7f659f6 and 7e08fa1 for the next RC? It's a minor update to the README describing the IDE setup. I will closed the respective issue FLINK-1109. On 08 Jan 2015, at 23:50, Henry Saputra henry.sapu...@gmail.com wrote: Marton, could you close this VOTE thread by replying to the original email and append [CANCEL] in the subject line. - Henry On Thu, Jan 8, 2015 at 9:35 AM, Márton Balassi
Re: [VOTE] Release Apache Flink 0.8.0 (RC1)
Thanks, Marton. Yes, I think https://issues.apache.org/jira/browse/FLINK-1376. Till has a fix coming up for it. Should we wait for this or postpone it to 0.8.1? – Ufuk On 12 Jan 2015, at 10:16, Márton Balassi balassi.mar...@gmail.com wrote: Based on the requests listed in this thread Stephan and myself cherry-picked the following commits: 7634310 [FLINK-1266] Generalize DistributedFileSystem implementation (rmetzger) 40a2705 [FLINK-1266] Update mongodb link and address pull request comments (rmetzger) cd66ced [FLINK-1266] Properly pass the fs.defaulFS setting when initializing … (rmetzger) 3bf2d21 [FLINK-1266] More dependency exclusions (rmetzger) e8ab5b7 [FLINK-1266] Backport fix to 0.8 (rmetzger) 7cd0f47 [docs] Prepare documentation for 0.8 release (rmetzger) e83ccd0 [FLINK-1385] Print warning if not resources are _currently_ available… (rmetzger) 4f9dcae [FLINK-1378] [scala] Fix type extraction for nested type parameters (aljoscha) fced2eb [FLINK-1378] Add support for Throwables in KryoSerializer (aljoscha) ada35eb [FLINK-1378] [scala] Add support for Try[A] (Success/Failure) (aljoscha) a64abe1 [docs] Update README and internals (scheduling) for graduation and fi… (SephanEwen) 4f95ce7 Fix typo in README.md (uce) Both of us committed some other fixes mainly for the dependencies and licences, the distribution and the quickstarts. Pushing the results as soon as the tests pass. Does anyone have any other issue that has to be included in the next release candidate? On Sat, Jan 10, 2015 at 7:34 PM, Robert Metzger rmetz...@apache.org wrote: I've tested it on an empty YARN cluster, allocating more containers than available. Flink will then allocate as many containers as possible. On Sat, Jan 10, 2015 at 7:31 PM, Stephan Ewen se...@apache.org wrote: Seems reasonable. Have you tested it on a cluster with concurrent YARN jobs? On Sat, Jan 10, 2015 at 7:28 PM, Robert Metzger rmetz...@apache.org wrote: I would really like to include this commit into the 0.8 release as well: https://github.com/apache/flink/commit/ec2bb573d185429f8b3efe111850b8f0e67f2704 A user is affected by this issue. If you agree, I can merge it. On Sat, Jan 10, 2015 at 7:25 PM, Stephan Ewen se...@apache.org wrote: I have gone through the code, cleaned up dependencies and made sure that all licenses are correctly handled. The changes are in the public release-0.8 branch. From that side, the code is now good to go in my opinion. Stephan On Sat, Jan 10, 2015 at 12:30 PM, Robert Metzger rmetz...@apache.org wrote: I've updated the docs/_config.yml variables to reflect that hadoop2 is now the default profile: https://github.com/apache/flink/pull/294 On Fri, Jan 9, 2015 at 8:52 PM, Stephan Ewen se...@apache.org wrote: Just as a heads up. I am almost through with checking dependencies and licenses. Will commit that later tonight or tomorrow. On Fri, Jan 9, 2015 at 7:09 PM, Stephan Ewen se...@apache.org wrote: I vote to include it as well. It is sort of vital for advanced use of the Scala API. It is also an isolated change that does not affect other components, so it should be testable very well. On Fri, Jan 9, 2015 at 7:05 PM, Robert Metzger rmetz...@apache.org wrote: I'd say yes because its affecting a user. On Fri, Jan 9, 2015 at 7:00 PM, Aljoscha Krettek aljos...@apache.org wrote: I have a fix for this issue: https://issues.apache.org/jira/browse/FLINK-1378?jql=project%20%3D%20FLINK and I think this should also make it into the 0.8.0 release. What do you think? On Fri, Jan 9, 2015 at 3:28 PM, Márton Balassi balassi.mar...@gmail.com wrote: Sure, thanks Ufuk. On Fri, Jan 9, 2015 at 3:15 PM, Ufuk Celebi u...@apache.org wrote: Marton, could you also cherry-pick 7f659f6 and 7e08fa1 for the next RC? It's a minor update to the README describing the IDE setup. I will closed the respective issue FLINK-1109. On 08 Jan 2015, at 23:50, Henry Saputra henry.sapu...@gmail.com wrote: Marton, could you close this VOTE thread by replying to the original email and append [CANCEL] in the subject line. - Henry On Thu, Jan 8, 2015 at 9:35 AM, Márton Balassi balassi.mar...@gmail.com wrote: Cherry-picked and tested: found no duplicate dependencies in lib, yarn uberjar build goes without the mentioned warns. Travis tests are passing, pushing soon. On Thu, Jan 8, 2015 at 4:57 PM, Stephan Ewen se...@apache.org wrote: Nice. @Marton: As soon as as you are done, I make a pass over the licenses... Stephan On Thu, Jan 8, 2015 at 4:42 PM, Robert Metzger rmetz...@apache.org wrote: Allright. The travis tests are green and I tested it again with Tachyon on a cluster. My pull request also fixes some of the issues mentioned earlier in this thread by Stephan (the warnings from shading regarding duplicate classes).
Re: Gather a distributed dataset
Hey Alexander, On 12 Jan 2015, at 11:42, Alexander Alexandrov alexander.s.alexand...@gmail.com wrote: Hi there, I wished for intermediate datasets, and Santa Ufuk made my wishes come true (thank you, Santa)! Now that FLINK-986 is in the mainline, I want to ask some practical questions. In Spark, there is a way to put a value from the local driver to the distributed runtime via val x = env.parallelize(...) // expose x to the distributed runtime val y = dataflow(env, x) // y is produced by a dataflow which reads from x and also to get a value from the distributed runtime back to the driver val z = env.collect(y) As far as I know, in Flink we have an equivalent for parallelize val x = env.fromCollection(...) but not for collect. Is this still the case? Yes, but this will change soon. If yes, how hard would it be to add this feature at the moment? Can you give me some pointers? There is a alpha version/hack of this using accumulators. See https://github.com/apache/flink/pull/210. The problem is that each collect call results in a new program being executed from the sources. I think Stephan is working on the scheduling to support this kind of programs. From the runtime perspective, it is not a problem to transfer the produced intermediate results back to the job manager. The job manager can basically use the same mechanism that the task managers use. Even the accumulator version should be fine as a initial version.
Re: [VOTE] Release Apache Flink 0.8.0 (RC1)
Yeah I agree with that. On Mon, Jan 12, 2015 at 11:30 AM, Ufuk Celebi u...@apache.org wrote: On Mon, Jan 12, 2015 at 11:22 AM, Stephan Ewen se...@apache.org wrote: It would be good to have the patch, but it is also a very tricky patch, so pushing it hastily may be problematic. I agree. @Till: would that be OK with you? If yes, I think we are good to go for the next RC.
[VOTE] Release Apache Flink 0.8.0 (RC2)
Please vote on releasing the following candidate as Apache Flink version 0.8.0 This release will be the first major release for Flink as a top level project. - The commit to be voted on is in the branch release-0.8.0-rc2 (commit 9ee74f3): https://git-wip-us.apache.org/repos/asf/flink/commit/9ee74f3 The release artifacts to be voted on can be found at: http://people.apache.org/~mbalassi/flink-0.8.0-rc2/ Release artifacts are signed with the following key: https://people.apache.org/keys/committer/mbalassi.asc The staging repository for this release can be found at: https://repository.apache.org/content/repositories/orgapacheflink-1022 - Please vote on releasing this package as Apache Flink 0.8.0. The vote is open for the next 72 hours and passes if a majority of at least three +1 PMC votes are cast. [ ] +1 Release this package as Apache Flink 0.8.0 [ ] -1 Do not release this package because ...
[jira] [Created] (FLINK-1391) Kryo fails to properly serialize avro collection types
Robert Metzger created FLINK-1391: - Summary: Kryo fails to properly serialize avro collection types Key: FLINK-1391 URL: https://issues.apache.org/jira/browse/FLINK-1391 Project: Flink Issue Type: Improvement Affects Versions: 0.8, 0.9 Reporter: Robert Metzger Before FLINK-610, Avro was the default generic serializer. Now, special types coming from Avro are handled by Kryo .. which seems to cause errors like: {code} Exception in thread main org.apache.flink.runtime.client.JobExecutionException: java.lang.NullPointerException at org.apache.avro.generic.GenericData$Array.add(GenericData.java:200) at com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:116) at com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:22) at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:761) at org.apache.flink.api.java.typeutils.runtime.KryoSerializer.deserialize(KryoSerializer.java:143) at org.apache.flink.api.java.typeutils.runtime.KryoSerializer.deserialize(KryoSerializer.java:148) at org.apache.flink.api.java.typeutils.runtime.PojoSerializer.deserialize(PojoSerializer.java:244) at org.apache.flink.runtime.plugable.DeserializationDelegate.read(DeserializationDelegate.java:56) at org.apache.flink.runtime.io.network.serialization.AdaptiveSpanningRecordDeserializer.getNextRecord(AdaptiveSpanningRecordDeserializer.java:71) at org.apache.flink.runtime.io.network.channels.InputChannel.readRecord(InputChannel.java:189) at org.apache.flink.runtime.io.network.gates.InputGate.readRecord(InputGate.java:176) at org.apache.flink.runtime.io.network.api.MutableRecordReader.next(MutableRecordReader.java:51) at org.apache.flink.runtime.operators.util.ReaderIterator.next(ReaderIterator.java:53) at org.apache.flink.runtime.operators.DataSinkTask.invoke(DataSinkTask.java:170) at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:257) at java.lang.Thread.run(Thread.java:744) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)