Re: Website documentation minor bug
Seems like my smart data-crawling web mail stripped the linked images out. So here we go again:

New: http://i.imgur.com/KK7fhiR.png
Old: http://i.imgur.com/kP2LPnY.png

On Tue, Mar 10, 2015 at 11:17 AM, Stephan Ewen se...@apache.org wrote:

Looks the same to me ;-) The mailing lists do not support attachments...

On Tue, Mar 10, 2015 at 11:15 AM, Maximilian Michels m...@apache.org wrote:

So here are the proposed changes (New / Old). If there are no objections, I will merge this by the end of the day. Best regards, Max

On Mon, Mar 9, 2015 at 4:22 PM, Hermann Gábor reckone...@gmail.com wrote:

Thanks Gyula, that helps a lot :D Nice solution. Thank you Max! I also support the reduced header size! Cheers, Gabor

On Mon, Mar 9, 2015 at 3:36 PM, Márton Balassi balassi.mar...@gmail.com wrote:

+1 for the proposed solution from Max. +1 for decreasing the size, but let's have a preview; I also think that the current one is a bit too large.

On Mon, Mar 9, 2015 at 2:16 PM, Maximilian Michels m...@apache.org wrote:

We can fix this for the headings by adding the following CSS rule:

    h1, h2, h3, h4 {
      padding-top: 100px;
      margin-top: -100px;
    }

In the course of changing this, we could also reduce the size of the navigation header in the docs. It occupies too much space and doesn't have a lot of functionality. I'd suggest halving its size. The positioning at the top is fine for me. Kind regards, Max

On Mon, Mar 9, 2015 at 2:08 PM, Hermann Gábor reckone...@gmail.com wrote:

I think the navigation looks nice this way. It's rather a small CSS/HTML problem that the header shades the title when clicking on an anchor link. (The content starts at the top of the page, but the header covers it.) I'm not much into web stuff, but I would gladly fix it. Can someone help me with this?

On Sun, Mar 8, 2015 at 9:52 PM, Stephan Ewen se...@apache.org wrote:

I agree, it is not optimal. What would be a better way to do this? Have the main navigation (currently on the left) at the top, and the per-page navigation on the side? Do you want to take a stab at this?

On Sun, Mar 8, 2015 at 7:08 PM, Hermann Gábor reckone...@gmail.com wrote:

Hey, currently following an anchor link (e.g. http://ci.apache.org/projects/flink/flink-docs-master/programming_guide.html#transformations) results in the header occupying the top of the page, so the title and some of the first lines cannot be seen. This is not a big deal, but it's user-facing and a bit irritating. Can someone fix it, please? (I tried it on Firefox and Chromium on Ubuntu 14.10.) Cheers, Gabor
Re: Website documentation minor bug
Looks the same to me ;-) The mailing lists do not support attachments...
Re: [DISCUSS] Make a release to be announced at ApacheCon
On the streaming side:

Must have:
* Tests for the fault tolerance (my first priority this week)
* Merging Gyula's recent windowing PR [1]

Really needed:
* Self-join for DataStreams (Gabor has a prototype, PR coming today) [2]
* ITCase tests for streaming examples (Peter & myself, review and clean-up pending) [3]
* Different streaming/batch cluster memory settings (Stephan) [4]
* Make the projection operator chainable (Gabor Gevay, a prospective GSoC student, PR coming soon) [5]
* Parallel time discretization (Gyula, PR coming tomorrow) [6]

Would be nice to have:
* Complex integration test for streaming (Peter) [7]
* Extend streaming aggregation tests to include POJOs [8]
* Iteration bug for large input [9]

We would also need a general pass over the streaming API for javadocs. This is not one week of work, but we can hopefully fit it into two weeks.

[1] https://github.com/apache/flink/pull/465
[2] https://issues.apache.org/jira/browse/FLINK-1594
[3] https://issues.apache.org/jira/browse/FLINK-1560
[4] https://issues.apache.org/jira/browse/FLINK-1368
[5] https://issues.apache.org/jira/browse/FLINK-1641
[6] https://issues.apache.org/jira/browse/FLINK-1618
[7] https://issues.apache.org/jira/browse/FLINK-1595
[8] https://issues.apache.org/jira/browse/FLINK-1544
[9] https://issues.apache.org/jira/browse/FLINK-1239

On Tue, Mar 10, 2015 at 11:20 AM, Robert Metzger rmetz...@apache.org wrote:

Hey, what's the status on this? There is one week left until we fork off a branch for 0.9, if we stick to the suggested timeline. The initial email said: "I am very much in favor of doing this, under the strong condition that we are very confident that the master has grown to be stable enough." I think it is time to evaluate whether we are confident that the master is stable. Best, Robert

On Wed, Mar 4, 2015 at 9:42 AM, Robert Metzger rmetz...@apache.org wrote:

+1 for Marton as release manager. Thank you!

On Tue, Mar 3, 2015 at 7:56 PM, Henry Saputra henry.sapu...@gmail.com wrote:

Ah, thanks Márton. So we are moving toward a concept similar to Spark RDD staged execution =P I suppose there will be a runtime configuration or hint to tell the Flink JobManager which execution mode is preferred? - Henry

On Tue, Mar 3, 2015 at 2:09 AM, Márton Balassi balassi.mar...@gmail.com wrote:

Hi Henry, batch mode is a new execution mode for batch Flink jobs where, instead of pipelining the whole execution, the job is scheduled in stages, thus materializing the intermediate results before continuing to the next operators. For implications see [1].

[1] http://www.slideshare.net/KostasTzoumas/flink-internals, pages 18-21.

On Mon, Mar 2, 2015 at 11:39 PM, Henry Saputra henry.sapu...@gmail.com wrote:

Hi Stephan, what is the "Batch mode" feature in the list? - Henry

On Mon, Mar 2, 2015 at 5:03 AM, Stephan Ewen se...@apache.org wrote:

Hi all! ApacheCon is coming up, and it is the 15th anniversary of the Apache Software Foundation. In the course of the conference, Apache would like to make a series of announcements. If we manage to make a release during (or shortly before) ApacheCon, they will announce it through their channels. I am very much in favor of doing this, under the strong condition that we are very confident that the master has grown to be stable enough (there are major changes in the distributed runtime since version 0.8 that we are still stabilizing). No use in a widely announced build that does not have the quality.

Flink now has many new features that warrant a release soon (once we fix the last quirks in the new distributed runtime). Notable new features are:
- Gelly
- Streaming windows
- Flink on Tez
- Expression API
- Distributed runtime on Akka
- Batch mode
- Maybe even a first ML library version
- Some streaming fault tolerance

Robert proposed a feature freeze in mid-March. His key dates were:
- Feature freeze (forking off release-0.9): March 17
- RC1 vote: March 24

The RC1 vote is 20 days before ApacheCon (April 13). For the last three releases, the average voting time was 20 days:
- 0.8.0 -- 14 days
- 0.7.0 -- 22 days
- 0.6 -- 26 days

Please share your opinion on this! Greetings, Stephan
[jira] [Created] (FLINK-1671) Add execution modes for programs
Stephan Ewen created FLINK-1671:
Summary: Add execution modes for programs
Key: FLINK-1671
URL: https://issues.apache.org/jira/browse/FLINK-1671
Project: Flink
Issue Type: Bug
Affects Versions: 0.9
Reporter: Stephan Ewen
Assignee: Stephan Ewen
Fix For: 0.9

Currently, there is a single way that programs get executed: pipelined. With the new code for batch shuffles (https://github.com/apache/flink/pull/471), we have much more flexibility, and I would like to expose that. I suggest adding more execution modes that can be chosen on the {{ExecutionEnvironment}}:

- {{BATCH}}: A mode where every shuffle is executed in a batch way, meaning preceding operators must finish before their successors start. Only for batch programs (d'oh).
- {{PIPELINED}}: The mode corresponding to the current execution behavior. It pipelines where possible and batches where deadlocks would otherwise happen. Initially, I would make this the default (to stay close to the current behavior). Only available for batch programs.
- {{PIPELINED_WITH_BATCH_FALLBACK}}: This would start out with pipelined shuffles and fall back to batch shuffles upon failure and recovery, or once it sees that not enough slots are available to bring up all operators at once (a requirement for pipelining).
- {{STREAMING}}: The default and only mode for streaming programs. All communication is pipelined, and the special streaming checkpointing code is activated.
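For illustration, a minimal sketch of how such a setting could look on the user side. The enum values mirror the proposal above, but the setter and its placement are made-up placeholders, not the final API:

    // Sketch only; the setter and its location are hypothetical.
    public enum ExecutionMode {
        BATCH,                          // batch every shuffle (batch programs only)
        PIPELINED,                      // pipeline where possible, batch to avoid deadlocks
        PIPELINED_WITH_BATCH_FALLBACK,  // pipeline first, fall back to batch shuffles
        STREAMING                       // always pipelined, streaming checkpoints active
    }

    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    env.setExecutionMode(ExecutionMode.PIPELINED);  // hypothetical setter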
[jira] [Created] (FLINK-1670) Collect method for streaming
Márton Balassi created FLINK-1670:
Summary: Collect method for streaming
Key: FLINK-1670
URL: https://issues.apache.org/jira/browse/FLINK-1670
Project: Flink
Issue Type: New Feature
Components: Streaming
Affects Versions: 0.9
Reporter: Márton Balassi
Priority: Minor

A convenience method for streaming the results of a job back to the client. As the client itself is a bottleneck anyway, an easy solution would be to provide a socket sink with a degree of parallelism of 1, from which a client utility can read.
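The client side of such a utility could be as small as the following sketch. The port and the line-oriented wire format are assumptions for illustration, not part of the proposal:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.ServerSocket;
    import java.net.Socket;

    // Hypothetical client utility: accept the connection from the job's
    // parallelism-1 socket sink and stream results back record by record.
    public class StreamingCollectClient {
        public static void main(String[] args) throws Exception {
            try (ServerSocket server = new ServerSocket(9999);
                 Socket sink = server.accept();
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(sink.getInputStream()))) {
                String record;
                while ((record = in.readLine()) != null) {
                    System.out.println(record); // hand each record to the caller
                }
            }
        }
    }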
Re: Website documentation minor bug
Looks good! +1 for the new one. Regards. Chiwan Park (Sent with iPhone)
Re: Website documentation minor bug
Looks nice, +1 for the new one.
Re: [DISCUSS] Offer Flink with Scala 2.11
Hey Alex, I don't know the exact status of the Scala 2.11 integration, but I wanted to point you to https://github.com/apache/flink/pull/454, which changes a huge portion of our Maven build infrastructure. If you haven't started yet, it might make sense to base your integration on that pull request. Otherwise, let me know if you have trouble rebasing your changes.

On Mon, Mar 2, 2015 at 9:13 PM, Chiwan Park chiwanp...@icloud.com wrote:

+1 for Scala 2.11. Regards. Chiwan Park (Sent with iPhone)

On Mar 3, 2015, at 2:43 AM, Robert Metzger rmetz...@apache.org wrote:

I'm +1 if this doesn't affect existing Scala 2.10 users. I would also suggest adding a Scala 2.11 build to Travis to ensure everything works with the different Hadoop/JVM versions. It shouldn't be a big deal to offer scala_version x hadoop_version builds for newer releases. You only need to add more builds here: https://github.com/apache/flink/blob/master/tools/create_release_files.sh#L131

On Mon, Mar 2, 2015 at 6:17 PM, Till Rohrmann trohrm...@apache.org wrote:

+1 for Scala 2.11

On Mon, Mar 2, 2015 at 5:02 PM, Alexander Alexandrov alexander.s.alexand...@gmail.com wrote:

Spark currently only provides pre-builds for 2.10 and requires a custom build for 2.11. Not sure whether this is the best idea, but I can see the benefits from a project management point of view... Would you prefer to have {scala_version} × {hadoop_version} builds integrated on the website?

2015-03-02 16:57 GMT+01:00 Aljoscha Krettek aljos...@apache.org:

+1, I also like it. We just have to figure out how we can publish two sets of release artifacts.

On Mon, Mar 2, 2015 at 4:48 PM, Stephan Ewen se...@apache.org wrote:

Big +1 from my side! Does it have to be a Maven profile, or does a Maven property work? (A profile may be needed for the quasiquotes dependency?)

On Mon, Mar 2, 2015 at 4:36 PM, Alexander Alexandrov alexander.s.alexand...@gmail.com wrote:

Hi there, since I'm relying on Scala 2.11.4 in a project I've been working on, I created a branch which updates the Scala version used by Flink from 2.10.4 to 2.11.4: https://github.com/stratosphere/flink/commits/scala_2.11

Everything seems to work fine, and the PR contains minor changes compared to Spark's: https://issues.apache.org/jira/browse/SPARK-4466

If you're interested, I can rewrite this as a Maven profile and open a PR so people can build Flink with 2.11 support. I suggest doing this sooner rather than later in order to:
* keep the number of code changes enforced by the migration small and tractable;
* discourage the use of deprecated or 2.11-incompatible source code in future commits.

Regards, A.
Re: [DISCUSS] Add method for each Akka message
+1. Let's change this lazily: whenever we work on an action/message, we pull the handling out into a dedicated method.

On Tue, Mar 10, 2015 at 11:49 AM, Ufuk Celebi u...@apache.org wrote:

Hey all, I currently find it a little bit frustrating to navigate between different task manager operations like cancelling or submitting a task. Some of these operations are done directly in the event loop (e.g. cancelling), whereas others forward the message to a method (e.g. submitting). For me, navigating to methods is way easier than manually scanning the event loop. Therefore, I would prefer to forward all messages to a corresponding method. Can I get some opinions on this? Would someone be opposed? [Or is there a way in IntelliJ to do this navigation more efficiently? I couldn't find anything.] – Ufuk
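A sketch of the proposed pattern, using Akka's Java API. The actor and message names are invented for illustration and are not Flink's actual classes:

    import akka.actor.UntypedActor;

    // The event loop only dispatches; every message type gets a dedicated,
    // individually navigable handler method.
    public class TaskManagerActor extends UntypedActor {

        @Override
        public void onReceive(Object message) {
            if (message instanceof SubmitTask) {
                submitTask((SubmitTask) message);
            } else if (message instanceof CancelTask) {
                cancelTask((CancelTask) message);
            } else {
                unhandled(message);
            }
        }

        private void submitTask(SubmitTask msg) { /* ... */ }
        private void cancelTask(CancelTask msg) { /* ... */ }

        // Hypothetical message types, for the sketch only.
        public static class SubmitTask {}
        public static class CancelTask {}
    }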
Re: [jira] [Commented] (FLINK-1106) Deprecate old Record API
Yeah, I spotted a good number of optimizer tests that depend on the Record API. I implemented the last optimizer tests with the new API and would volunteer to port the other optimizer tests.

2015-03-10 16:32 GMT+01:00 Stephan Ewen (JIRA) j...@apache.org:

[ https://issues.apache.org/jira/browse/FLINK-1106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355063#comment-14355063 ]

Stephan Ewen commented on FLINK-1106: A bit of test coverage depends on the deprecated API. We would need to port at least some of the tests to the new API. We can probably drop some subsumed/obsolete tests.

Deprecate old Record API
Key: FLINK-1106
URL: https://issues.apache.org/jira/browse/FLINK-1106
Project: Flink
Issue Type: Task
Components: Java API
Affects Versions: 0.7.0-incubating
Reporter: Robert Metzger
Assignee: Robert Metzger
Priority: Critical
Fix For: 0.7.0-incubating

For the upcoming 0.7 release, we should mark all user-facing methods of the old Record Java API as deprecated, with a warning that we are going to remove it at some point. I would suggest waiting one or two releases after the 0.7 release (given our current release cycle). I'll start a mailing-list discussion at some point regarding this.
Re: [gelly] Tests fail, but build succeeds
Great, thanks Andra!

On Mar 10, 2015 5:52 PM, Andra Lungu lungu.an...@gmail.com wrote:

Fixed. My bad. I had a trailing local job manager still running.

On Tue, Mar 10, 2015 at 5:36 PM, Andra Lungu lungu.an...@gmail.com wrote:

So the sysout suppression worked like a charm. Thanks! However, this mini cluster setup is giving me a bit of a rough time. I added this to the test suite:

    @BeforeClass
    public static void setupCluster() {
        Configuration config = new Configuration();
        config.setInteger(ConfigConstants.LOCAL_INSTANCE_MANAGER_NUMBER_TASK_MANAGER, 2);
        config.setInteger(ConfigConstants.TASK_MANAGER_NUM_TASK_SLOTS, 2);
        config.setString(ConfigConstants.AKKA_WATCH_HEARTBEAT_PAUSE, "2 s");
        cluster = new ForkableFlinkMiniCluster(config, false);
    }

And then I got: org.jboss.netty.channel.ChannelException: Failed to bind to: /127.0.0.1:6123, because the address was already in use. Also, do I have to use something like ExecutionEnvironment.createRemoteEnvironment("localhost", cluster.getJobManagerRPCPort()) instead of getExecutionEnvironment()? I get the same error in both cases. Thank you! Andra

On Tue, Mar 10, 2015 at 5:30 PM, Vasiliki Kalavri vasilikikala...@gmail.com wrote:

I think all other Gelly tests extend MultipleProgramsTestBase, which is already using the mini-cluster setup :-)

On 10 March 2015 at 17:21, Stephan Ewen se...@apache.org wrote:

I would suggest doing this for all tests that have more than one ExecutionEnvironment.getExecutionEnvironment() call. It does not have to happen at once; it can be migrated bit by bit, e.g. whenever a test is touched anyway. This should speed up the tests, and it has the added benefit for the system as a whole that we add more tests where multiple programs run on the same cluster. That way we test cluster stability, leaks, and whether the system cleans up properly after finished program executions.

On Tue, Mar 10, 2015 at 4:32 PM, Andra Lungu lungu.an...@gmail.com wrote:

Hello Stephan, would you like the mini-cluster setup for all the Gelly tests, or just for the one dumping its output? Andra

On Tue, Mar 10, 2015 at 3:22 PM, Stephan Ewen se...@apache.org wrote:

Ah, I see. One thing to definitely fix in the near future is the follow-up exceptions from cancelling. They should not swamp the log like this. If you want to suppress sysout printing, have a look here; we redirect sysout and syserr for this reason in some tests (flink-clients/src/test/java/org/apache/flink/client/program/ExecutionPlanAfterExecutionTest.java).

You can also significantly speed up your tests by reusing one mini cluster across multiple tests. The way you do it right now spawns a local executor for each test (bringing up actor systems, memory, etc.). By starting one in a @BeforeClass method, you can use the same Flink instance across multiple tests, making each individual test go like zooom. Have a look here for an example: flink-tests/src/test/java/org/apache/flink/test/recovery/SimpleRecoveryITCase.java

It is even faster if you start it like this:

    public static void setupCluster() {
        Configuration config = new Configuration();
        config.setInteger(ConfigConstants.TASK_MANAGER_NUM_TASK_SLOTS, PARALLELISM);
        cluster = new ForkableFlinkMiniCluster(config);
    }

Cheers, Stephan

On Tue, Mar 10, 2015 at 3:05 PM, Vasiliki Kalavri vasilikikala...@gmail.com wrote:

Hi Stephan, what you see isn't a test failure; it comes from http://github.com/apache/flink/pull/440 and it's testing that an exception is thrown. The output isn't coming from logging, it's sysout coming from the JobClient, so I couldn't turn it off. I was actually meaning to start a discussion about changing this, but I forgot, sorry. Any suggestions on this, then? Thanks! -Vasia.

On 10 March 2015 at 14:53, Stephan Ewen se...@apache.org wrote:

It seems JobExecution failures are not recognized in some of the Gelly tests. Also, the tests are logging quite a bit; it would be nice to make them a bit more quiet. How is the logging created, btw? The log4j-tests.properties have the log level set to OFF afaik. Here is a log from my latest build:

Running org.apache.flink.graph.test.operations.FromCollectionITCase
Running org.apache.flink.graph.test.operations.JoinWithVerticesITCase
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.844
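The sysout/syserr redirection Stephan refers to can be done in a few lines. A generic sketch (not the exact code from ExecutionPlanAfterExecutionTest):

    import java.io.OutputStream;
    import java.io.PrintStream;

    // Swallow System.out/System.err during noisy tests and restore afterwards.
    public final class StdStreamSilencer {
        private static PrintStream originalOut;
        private static PrintStream originalErr;

        public static void silence() {
            originalOut = System.out;
            originalErr = System.err;
            PrintStream devNull = new PrintStream(new OutputStream() {
                @Override
                public void write(int b) { /* discard */ }
            });
            System.setOut(devNull);
            System.setErr(devNull);
        }

        public static void restore() {
            System.setOut(originalOut);
            System.setErr(originalErr);
        }
    }

Calling silence() in a @BeforeClass method and restore() in @AfterClass ensures later tests see the real streams again.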
Re: [DISCUSS] Offer Flink with Scala 2.11
Yes, will do.

2015-03-10 16:39 GMT+01:00 Robert Metzger rmetz...@apache.org:

Can you also add additional build profiles to Travis for Scala 2.11?
Re: [DISCUSS] Deprecate Spargel API for 0.9
Thanks for bringing this up for discussion, Vasia. I am +1 for deprecating Spargel in the 0.9 release. It is confusing for a newcomer to Flink (well, even for me) to find out that there are two sets of graph APIs. We could use the 0.9 release as a stabilization period for Gelly, which is why Spargel would be deprecated and not removed, and by the next release we have more time to flesh it out; hopefully we could then remove Spargel (or maybe keep it deprecated for one more release). But I think there should be only ONE graph API that Flink promotes, and I think it should be Gelly at this point. - Henry

On Tue, Mar 10, 2015 at 2:02 PM, Vasiliki Kalavri vasilikikala...@gmail.com wrote:

Hi all, I would like your opinion on whether we should deprecate the Spargel API in 0.9. Gelly doesn't depend on Spargel; it actually contains it -- we have copied the relevant classes over. I think it would be a good idea to deprecate Spargel in 0.9, so that we can inform existing Spargel users that we'll eventually remove it. Also, I think the fact that we have two graph APIs in the documentation might be a bit confusing for newcomers. One might wonder why we have them both and when to use one over the other. It might be a good idea to add a note in the Spargel guide suggesting to use Gelly instead, and a corresponding note at the beginning of the Gelly guide explaining that Spargel is part of Gelly now. Or maybe a "Gelly or Spargel?" section. What do you think?

The only thing that worries me is that the Gelly API is not very stable. Of course, we are mostly adding things, but we are planning to make some changes as well, and I'm sure more will be needed the more we use it. Looking forward to your thoughts! Cheers, Vasia.
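Mechanically, the deprecation itself is a small change. A sketch of what it could look like on a Spargel entry point (assuming VertexCentricIteration as that entry point; the Javadoc wording is illustrative and the class body is elided):

    /**
     * @deprecated As of 0.9, Spargel's functionality is contained in Gelly;
     *             use Gelly's vertex-centric iteration support instead.
     */
    @Deprecated
    public class VertexCentricIteration {
        // existing implementation unchanged; the annotation raises a
        // compile-time warning for users until the class is removed
    }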
[jira] [Created] (FLINK-1675) Rework Accumulators
Stephan Ewen created FLINK-1675:
Summary: Rework Accumulators
Key: FLINK-1675
URL: https://issues.apache.org/jira/browse/FLINK-1675
Project: Flink
Issue Type: Bug
Components: JobManager, TaskManager
Affects Versions: 0.9
Reporter: Stephan Ewen
Fix For: 0.9

The accumulators need an overhaul to address various issues:

1. User-defined Accumulator classes crash the client, because it is not using the user-code classloader to decode the received message.
2. They should be attached to the ExecutionGraph, not the dedicated AccumulatorManager. That makes them accessible also for archived execution graphs.
3. Accumulators should be sent periodically, as part of the heartbeat that sends metrics. This allows them to be updated in real time.
4. Accumulators should be stored fine-grained (per ExecutionVertex, or per Execution) and the final value should only be computed by merging all involved ones. This allows users to access the per-subtask accumulators, which is often interesting.
5. Accumulators should subsume the aggregators by allowing them to be versioned with a superstep. The versioned ones should be redistributed to the cluster after each superstep.
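Point 4 boils down to a fold over the per-subtask values. A sketch against Flink's Accumulator interface (assuming the per-subtask instances are retained somewhere; the helper itself is made up):

    import java.util.List;
    import org.apache.flink.api.common.accumulators.Accumulator;

    public final class AccumulatorMerging {
        // Derive the job-level result from fine-grained per-subtask
        // accumulators instead of keeping a single pre-merged value.
        // Note: merges in place into the first accumulator; copy it first
        // if the per-subtask values must stay untouched.
        public static <V, R> R mergeSubtasks(List<Accumulator<V, R>> perSubtask) {
            Accumulator<V, R> merged = perSubtask.get(0);
            for (Accumulator<V, R> other : perSubtask.subList(1, perSubtask.size())) {
                merged.merge(other);
            }
            return merged.getLocalValue();
        }
    }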
Re: [DISCUSS] Offer Flink with Scala 2.11
Very nice work. The changes are probably somewhat easy to merge; except for the version properties in the parent pom, there should not be a bigger issue. Can you also add additional build profiles to Travis for Scala 2.11?
[jira] [Created] (FLINK-1678) Extend internals documentation: program flow, optimizer, ...
Robert Metzger created FLINK-1678:
Summary: Extend internals documentation: program flow, optimizer, ...
Key: FLINK-1678
URL: https://issues.apache.org/jira/browse/FLINK-1678
Project: Flink
Issue Type: Task
Reporter: Robert Metzger
Re: [jira] [Commented] (FLINK-1106) Deprecate old Record API
+1 for removal of the old API.

On Mar 10, 2015 5:41 PM, Fabian Hueske fhue...@gmail.com wrote:

And I'm +1 for removing the old API with the next release.
[jira] [Created] (FLINK-1682) Port Record-API based optimizer tests to new Java API
Fabian Hueske created FLINK-1682:
Summary: Port Record-API based optimizer tests to new Java API
Key: FLINK-1682
URL: https://issues.apache.org/jira/browse/FLINK-1682
Project: Flink
Issue Type: Task
Components: Optimizer
Reporter: Fabian Hueske
Assignee: Fabian Hueske
Priority: Minor
Fix For: 0.9
[jira] [Created] (FLINK-1683) Scheduling preferences for non-unary tasks are not correctly computed
Fabian Hueske created FLINK-1683:
Summary: Scheduling preferences for non-unary tasks are not correctly computed
Key: FLINK-1683
URL: https://issues.apache.org/jira/browse/FLINK-1683
Project: Flink
Issue Type: Bug
Components: JobManager
Affects Versions: 0.8.1, 0.9
Reporter: Fabian Hueske
Assignee: Fabian Hueske
Fix For: 0.9, 0.8.2

When computing scheduling preferences for an execution task, the JobManager looks at the assigned instances of all its input execution tasks and returns a preference only if no more than 8 instances have been found (if the input of a task is distributed across more than 8 tasks, local scheduling won't help a lot in any case). However, the JobManager treats all input execution tasks the same and does not distinguish between different logical inputs. The effect is that a join task with one broadcast input and one locally forwarded input is not assigned locally to its forwarded input. This can have a significant impact on the performance of tasks that have more than one input and rely on local forwarding and co-located task scheduling.
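The fix amounts to computing preferences per logical input and ignoring inputs whose shipping pattern makes locality irrelevant. A simplified sketch with stand-in types (the real logic lives in the JobManager's scheduling code; all names here are illustrative):

    import java.util.HashSet;
    import java.util.List;
    import java.util.Set;

    public final class SchedulingPreferences {

        enum ShipStrategy { FORWARD, BROADCAST, REPARTITION }

        static final class LogicalInput {
            ShipStrategy strategy;
            List<String> producerInstances; // where this input's producers run
        }

        // Only inputs that benefit from locality (e.g. forward channels)
        // contribute preferences; a broadcast input is shipped to every
        // instance anyway, so it must not dilute the preference set.
        static Set<String> preferredLocations(List<LogicalInput> inputs, int maxSources) {
            Set<String> preferred = new HashSet<String>();
            for (LogicalInput input : inputs) {
                if (input.strategy == ShipStrategy.FORWARD) {
                    preferred.addAll(input.producerInstances);
                }
            }
            // Beyond a handful of distinct producer instances, locality won't help.
            return preferred.size() > maxSources ? new HashSet<String>() : preferred;
        }
    }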
Re: [DISCUSS] Deprecate Spargel API for 0.9
Big +1 for deprecating Spargel :D
Re: [DISCUSS] Offer Flink with Scala 2.11
The PR is here: https://github.com/apache/flink/pull/477 Cheers!