[jira] [Created] (FLINK-1436) Command-line interface verbose option (-v)

2015-01-22 Thread Max Michels (JIRA)
Max Michels created FLINK-1436: -- Summary: Command-line interface verbose option (-v) Key: FLINK-1436 URL: https://issues.apache.org/jira/browse/FLINK-1436 Project: Flink Issue Type: Improvement

Re: ClassLoader issue when submitting Flink Streaming programs through the web client

2015-01-22 Thread Robert Metzger
Didn't we have a similar issue before the 0.7.0-incubating release as well? I thought I had tested submitting a streaming program with the web frontend for the 0.8 release and it worked. On Thu, Jan 22, 2015 at 2:31 PM, Gyula Fóra gyf...@apache.org wrote: Hey, While trying to add support for

Re: [flink-streaming] Regarding loops in the Job Graph

2015-01-22 Thread Paris Carbone
Thanks for the quick answers! It is possible to use iterations: we could detect cycles while building the samoa topology and convert them into iterations. It is perhaps the proper way to go. I just wondered whether we could hack around it, but we had better avoid messing with cyclic dependencies.
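Detecting cyclic dependencies while a topology is being built can be sketched as a depth-first search over the operator graph. The snippet below is plain Java and only an illustration: the class `TopologyCycleCheck` and its methods are hypothetical and are not part of the Flink or SAMOA APIs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper (not Flink or SAMOA API): a small directed graph of
// named operators with a depth-first cycle check.
class TopologyCycleCheck {
    private final Map<String, List<String>> edges = new HashMap<>();

    // Registers a directed edge; both endpoints become known nodes.
    public void connect(String from, String to) {
        edges.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
        edges.computeIfAbsent(to, k -> new ArrayList<>());
    }

    // Returns true if some path leads back to a node that is still on the DFS stack.
    public boolean hasCycle() {
        Set<String> done = new HashSet<>();
        Set<String> onStack = new HashSet<>();
        for (String node : edges.keySet()) {
            if (visit(node, done, onStack)) {
                return true;
            }
        }
        return false;
    }

    private boolean visit(String node, Set<String> done, Set<String> onStack) {
        if (onStack.contains(node)) {
            return true;   // back edge: the node is its own descendant
        }
        if (done.contains(node)) {
            return false;  // already fully explored, no cycle through here
        }
        onStack.add(node);
        for (String next : edges.get(node)) {
            if (visit(next, done, onStack)) {
                return true;
            }
        }
        onStack.remove(node);
        done.add(node);
        return false;
    }

    public static void main(String[] args) {
        TopologyCycleCheck topology = new TopologyCycleCheck();
        topology.connect("source", "learner");
        topology.connect("learner", "evaluator");
        System.out.println("cyclic before feedback edge: " + topology.hasCycle());
        topology.connect("evaluator", "learner"); // feedback edge closes a loop
        System.out.println("cyclic after feedback edge: " + topology.hasCycle());
    }
}
```

A builder could run such a check whenever an edge is added and, on a positive result, route the feedback edge through an iteration construct instead of a plain connection.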

Re: [jira] [Commented] (FLINK-1410) Integrate Flink version variables into website layout

2015-01-22 Thread Till Rohrmann
+1 On Thu, Jan 22, 2015 at 2:42 PM, Max Michels (JIRA) j...@apache.org wrote: [ https://issues.apache.org/jira/browse/FLINK-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14287423#comment-14287423 ] Max Michels commented on FLINK-1410:

Re: Adding non-core API features to Flink

2015-01-26 Thread Max Michels
+1 for having an optional flink-contrib maven dependency and an extension repository in the long run. On Mon, Jan 26, 2015 at 12:00 PM, Robert Metzger rmetz...@apache.org wrote: I've added a JIRA issue to create the module: https://issues.apache.org/jira/browse/FLINK-1452 On Mon, Jan 26,

[jira] [Created] (FLINK-1455) ExternalSortLargeRecordsITCase.testSortWithShortMediumAndLargeRecords: Potential Memory leak

2015-01-26 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-1455: - Summary: ExternalSortLargeRecordsITCase.testSortWithShortMediumAndLargeRecords: Potential Memory leak Key: FLINK-1455 URL: https://issues.apache.org/jira/browse/FLINK-1455

[jira] [Created] (FLINK-1454) CliFrontend blocks for 100 seconds when submitting to a non-existent JobManager

2015-01-26 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-1454: - Summary: CliFrontend blocks for 100 seconds when submitting to a non-existent JobManager Key: FLINK-1454 URL: https://issues.apache.org/jira/browse/FLINK-1454

Tweets Custom Input Format

2015-01-23 Thread Mustafa Elbehery
Hi, I have created a custom InputFormat for tweets on Flink, based on JSON-Simple event driven parser. I would like to contribute my work into Flink, Regards. -- Mustafa Elbehery EIT ICT Labs Master School http://www.masterschool.eitictlabs.eu/home/ +49(0)15218676094 skype: mustafaelbehery87

Re: Adding non-core API features to Flink

2015-01-24 Thread Ted Dunning
As the community of flink add-ons grows, a CPAN or maven-like mechanism might be a nice option. That would let people download and install extensions very fluidly. The argument for making Apache contributions is definitely valid, but the argument for the agility of fostering independent projects

[jira] [Created] (FLINK-1445) Add support to enforce local input split assignment

2015-01-24 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-1445: Summary: Add support to enforce local input split assignment Key: FLINK-1445 URL: https://issues.apache.org/jira/browse/FLINK-1445 Project: Flink Issue

Re: YARN ITCases fail, master broken?

2015-01-24 Thread Vasiliki Kalavri
Hi, mvn clean verify fails for me on Ubuntu with deleted .m2 repository. I'm getting the following: Results : Failed tests: YARNSessionFIFOITCase.setup:56-YarnTestBase.startYARNWithConfig:249 null YARNSessionCapacitySchedulerITCase.setup:42-YarnTestBase.startYARNWithConfig:249 null Tests

Re: Adding non-core API features to Flink

2015-01-24 Thread Kostas Tzoumas
Thanks Fabian for starting the discussion. I would be biased towards option (1) that Stephan highlighted for the following reasons: - A separate github project is one more infrastructure to manage, and it lives outside the ASF. I would like to bring as much code as possible to the Apache

Re: Adding non-core API features to Flink

2015-01-24 Thread Fabian Hueske
I am also more in favor of option 1). 2015-01-24 20:27 GMT+01:00 Kostas Tzoumas ktzou...@apache.org: Thanks Fabian for starting the discussion. I would be biased towards option (1) that Stephan highlighted for the following reasons: - A separate github project is one more infrastructure to

[jira] [Created] (FLINK-1456) Projection to fieldnames, keyselectors

2015-01-26 Thread JIRA
Márton Balassi created FLINK-1456: - Summary: Projection to fieldnames, keyselectors Key: FLINK-1456 URL: https://issues.apache.org/jira/browse/FLINK-1456 Project: Flink Issue Type: New

YARN ITCases fail, master broken?

2015-01-23 Thread Fabian Hueske
Hi all, I tried to build the current master (mvn clean install) and some tests in the flink-yarn-tests module fail: Failed tests: YARNSessionCapacitySchedulerITCase.testClientStartup:50-YarnTestBase.runWithArgs:314 During the timeout period of 60 seconds the expected string did not show up

Re: Naming of semantic annotations

2015-01-23 Thread Vasiliki Kalavri
Hi, +1 for ForwardedFields. I like it much more than ConstantFields. I think it makes it clear what the feature does. It's a very cool feature and indeed not advertised a lot. I use it when I remember, but most of the times I forget it exists ;) -V. On 23 January 2015 at 22:12, Fabian Hueske

Re: Naming of semantic annotations

2015-01-23 Thread Chesnay Schepler
+1 ForwardedFields On 23.01.2015 22:38, Vasiliki Kalavri wrote: Hi, +1 for ForwardedFields. I like it much more than ConstantFields. I think it makes it clear what the feature does. It's a very cool feature and indeed not advertised a lot. I use it when I remember, but most of the times I

Naming of semantic annotations

2015-01-23 Thread Fabian Hueske
Hi all, I have a pending pull request (#311) to fix and enable semantic information for functions with nested and Pojo types. Semantic information is used to tell the optimizer about the behavior of user-defined functions. The optimizer can use this information to generate more efficient

Re: Kicking off the Machine Learning Library

2015-02-03 Thread Dmitriy Lyubimov
Perhaps one good way to go about it is to look at the spark module of mahout. The minimum that is needed is the stuff in sparkbindings.SparkEngine and CheckpointedDRM support. The idea is simple. When expressions are written, they are translated into logical operators implementing DrmLike[K]

Re: Drafting a roadmap for Flink

2015-02-03 Thread Henry Saputra
Hi All, I am not sure about the interactive Scala shell. I just feel that adding a feature like this may look like following Spark. We could probably focus more on the CLI and client libraries, and on a better and more scalable backend execution engine compared to Spark. - Henry On Thu, Jan 8, 2015 at 1:57 AM,

Re: Design Question in Expression API

2015-02-03 Thread Max Michels
If we want to have a tight integration with our existing API we have to hide the results of the expressions behind a wrapper. This enables us to change the internal implementation at any time and support future Flink API changes and features. +1 for not directly exposing the results as a row

[jira] [Created] (FLINK-1472) Web frontend config overview shows wrong value

2015-02-03 Thread Ufuk Celebi (JIRA)
Ufuk Celebi created FLINK-1472: -- Summary: Web frontend config overview shows wrong value Key: FLINK-1472 URL: https://issues.apache.org/jira/browse/FLINK-1472 Project: Flink Issue Type: Bug

[jira] [Created] (FLINK-1475) Minimize log output of yarn test cases

2015-02-04 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-1475: Summary: Minimize log output of yarn test cases Key: FLINK-1475 URL: https://issues.apache.org/jira/browse/FLINK-1475 Project: Flink Issue Type: Bug

Re: Drafting a roadmap for Flink

2015-02-04 Thread Henry Saputra
Thanks Stephan, Kostas, sounds good to me On Wed, Feb 4, 2015 at 2:57 AM, Kostas Tzoumas ktzou...@apache.org wrote: Yeah, this is something that nobody is working on AFAIK and not a major focus of the project. I moved it to the interesting projects page (and cleaned up the page a bit removing

Re: [jira] [Created] (FLINK-1462) Add documentation guide for the graph API

2015-02-04 Thread Ufuk Celebi
I think it is GitHub flavored markdown. GH has a reference with the additions. On Wednesday, February 4, 2015, Vasia Kalavri (JIRA) j...@apache.org wrote: [

Re: Sorting of fields

2015-02-05 Thread Stephan Ewen
Based on this, we should also be able to implement a global top-k, which has come up as a frequent requirement. On Wed, Feb 4, 2015 at 2:55 PM, Fabian Hueske fhue...@gmail.com wrote: I just merged support for local output sorting yesterday :-) This allows to sort the data before it is given to

Re: Timeout while requesting InputSplit

2015-01-30 Thread Till Rohrmann
I've updated the corresponding jira ticket. On Fri, Jan 30, 2015 at 5:46 PM, Till Rohrmann trohrm...@apache.org wrote: I looked into the problem and the problem is a deserialization issue on the TaskManager side. Somehow the system is not capable of sending InputSplits around whose classes are

Re: ReduceGroup fails on server

2015-01-30 Thread Aljoscha Krettek
Hi Arvid, I have a fix that I hope fixes your problem: https://github.com/aljoscha/flink/tree/serializer-factories-fix Could you try building it and running your example? Cheers, Aljoscha On Fri, Jan 30, 2015 at 3:30 PM, Aljoscha Krettek aljos...@apache.org wrote: We have a bit of a divide in

Re: ReduceGroup fails on server

2015-02-02 Thread Aljoscha Krettek
Hi, we have some bug fixes queued up. So a 0.8.1 bug fix release should be expected in the upcoming weeks. Cheers, Aljoscha On Mon, Feb 2, 2015 at 12:48 PM, Arvid Heise arvid.he...@gmail.com wrote: OK, patch indeed worked for my workflow. Thank you very much. Any idea when this patch will be

[jira] [Created] (FLINK-1469) Initialize network environment at task manager startup

2015-02-02 Thread Ufuk Celebi (JIRA)
Ufuk Celebi created FLINK-1469: -- Summary: Initialize network environment at task manager startup Key: FLINK-1469 URL: https://issues.apache.org/jira/browse/FLINK-1469 Project: Flink Issue Type:

[jira] [Created] (FLINK-1468) Config parse failure fails task manager startup w/o an error message

2015-02-02 Thread Ufuk Celebi (JIRA)
Ufuk Celebi created FLINK-1468: -- Summary: Config parse failure fails task manager startup w/o an error message Key: FLINK-1468 URL: https://issues.apache.org/jira/browse/FLINK-1468 Project: Flink

Re: Task manager memory configuration with intermediate results

2015-02-03 Thread Stephan Ewen
I like this approach and would suggest to make the ratio configurable. The default could be 50/50 or 60/40 (op heap / net heap) On Mon, Feb 2, 2015 at 6:45 PM, Ufuk Celebi u.cel...@fu-berlin.de wrote: Currently, the memory configuration of a task manager encompasses two things: 1) NETWORK
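The configurable ratio suggested above amounts to simple arithmetic over the available memory. A minimal sketch in plain Java follows; the class `HeapSplit`, the method `split`, and the idea of passing the ratio as a fraction are assumptions made for illustration and do not correspond to an actual Flink configuration key.

```java
// Hypothetical illustration (not Flink code): divide a memory budget between
// operator buffers and network buffers according to a configurable fraction.
class HeapSplit {

    // Returns {operatorBytes, networkBytes}; operatorFraction must be in [0, 1].
    public static long[] split(long totalBytes, double operatorFraction) {
        if (operatorFraction < 0.0 || operatorFraction > 1.0) {
            throw new IllegalArgumentException("fraction must be between 0 and 1");
        }
        long operatorBytes = Math.round(totalBytes * operatorFraction);
        // The network share is the remainder, so both parts always sum to the total.
        return new long[] {operatorBytes, totalBytes - operatorBytes};
    }

    public static void main(String[] args) {
        long total = 1024L * 1024L * 1024L; // 1 GiB, an arbitrary example budget
        long[] even = split(total, 0.5);    // the 50/50 default mentioned above
        long[] skewed = split(total, 0.6);  // the 60/40 alternative
        System.out.println("50/50: op=" + even[0] + " net=" + even[1]);
        System.out.println("60/40: op=" + skewed[0] + " net=" + skewed[1]);
    }
}
```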

Re: Task manager memory configuration with intermediate results

2015-02-03 Thread Max Michels
+1 The static memory assignment of the network buffer pool caused some problems for users in the past. Ultimately, dynamic memory management would be desirable. Until then, let's remove the absolute value configuration for the network buffers and introduce a parameter to divide the heap memory

Re: Drafting a roadmap for Flink

2015-02-04 Thread Kostas Tzoumas
Yeah, this is something that nobody is working on AFAIK and not a major focus of the project. I moved it to the interesting projects page (and cleaned up the page a bit removing streaming, tez, and mahout as these are ongoing efforts). On Wed, Feb 4, 2015 at 9:34 AM, Stephan Ewen se...@apache.org

Cluster execution - Jobmanager unreachable

2015-02-04 Thread Chesnay Schepler
Hello, I'm trying to run python jobs with the latest master on a cluster and get the following exception: Error: The program execution failed: JobManager not reachable anymore. Terminate waiting for job answer. org.apache.flink.client.program.ProgramInvocationException: The program

Re: Sorting of fields

2015-02-04 Thread Timo Walther
Ok, I found an earlier discussion about it. Sorry for the mail. However, I think this is a very important feature and it should be added soon. On 04.02.2015 14:38, Timo Walther wrote: Hey, is it correct that we currently do not support sorting without any grouping? I had this question by 2

January 2015 in the Flink community

2015-02-04 Thread Kostas Tzoumas
Here is a digestible read on some January activity in the Flink community: http://flink.apache.org/news/2015/02/04/january-in-flink.html Highlights: - Flink 0.8.0 was released - The Flink community published a technical roadmap for 2015 - Flink was used to scale matrix factorization to

Re: Sorting of fields

2015-02-04 Thread Fabian Hueske
I just merged support for local output sorting yesterday :-) This allows to sort the data before it is given to the OutputFormat. It is done like this: myData.write(myOF).sortLocalOutput(1, Order.ASCENDING); See the programming guide for details (only in master, not online). Full sorting can be

[jira] [Created] (FLINK-1473) Simplify SplittableIterator interface

2015-02-04 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-1473: - Summary: Simplify SplittableIterator interface Key: FLINK-1473 URL: https://issues.apache.org/jira/browse/FLINK-1473 Project: Flink Issue Type: Task

[jira] [Created] (FLINK-1479) The spawned threads in the sorter have no context class loader

2015-02-05 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-1479: --- Summary: The spawned threads in the sorter have no context class loader Key: FLINK-1479 URL: https://issues.apache.org/jira/browse/FLINK-1479 Project: Flink

Planning Release 0.8.1

2015-02-05 Thread Robert Metzger
Hi guys, I would like to bundle a minor bugfix release for Flink soon. Some users were complaining about incomplete Kryo support, in particular for Avro. Also, we fixed some other issues which are easy to port to 0.8.1 (some of them are already in the branch). I would like to start the vote

Re: Planning Release 0.8.1

2015-02-05 Thread Stephan Ewen
I think we need to make a pass through the recent 0.9 commits and cherry pick some more into 0.8.1. There were quite a few bug fixes. Also, this one is rather critical and pending: https://github.com/apache/flink/pull/318 On Thu, Feb 5, 2015 at 2:27 PM, Robert Metzger rmetz...@apache.org wrote:

Re: [DISCUSS] Be more patient with PR and patches in the review

2015-02-05 Thread Henry Saputra
Ah awesome, I did not know about that, thanks for letting me know. Mea culpa from me. I think I saw only a couple of cases but thought I would raise the discussion before I forgot =P Thanks for addressing this so quickly, Stephan. - Henry On Thu, Feb 5, 2015 at 8:09 AM, Stephan Ewen se...@apache.org wrote:

Re: [DISCUSS] Be more patient with PR and patches in the review

2015-02-05 Thread Max Michels
Hi Henry, I forgot to leave a message stating that I'm fine with Stephan's changes that would soon be merged into the master. Stephan did not push to the master immediately, so further comments could have been made to the pull request. It would have been more transparent if we had posted the

Re: suitable implementation tasks / student projects?

2015-02-05 Thread Stephan Ewen
Hi Adnan! If you are looking for a bigger involvement (and a project of your own), you can have a look here at the roadmap and figure out the direction that interests you most: https://cwiki.apache.org/confluence/display/FLINK/Flink+Roadmap I think that a cool project would be the static code

Re: Cluster execution - Jobmanager unreachable

2015-02-05 Thread Till Rohrmann
It looks to me that the TaskManager does not receive a ConsumerNotificationResult after having sent the ScheduleOrUpdateConsumers message. This can either mean that something went wrong in the ExecutionGraph.scheduleOrUpdateConsumers method or the connection was disassociated for some reason. The

Re: Cluster execution - Jobmanager unreachable

2015-02-05 Thread Stephan Ewen
I suspect that this is one of the cases where an exception in an actor causes the actor to die (here the job manager) On Thu, Feb 5, 2015 at 10:40 AM, Till Rohrmann trohrm...@apache.org wrote: It looks to me that the TaskManager does not receive a ConsumerNotificationResult after having sent

[jira] [Created] (FLINK-1478) Add strictly local input split assignment

2015-02-05 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-1478: --- Summary: Add strictly local input split assignment Key: FLINK-1478 URL: https://issues.apache.org/jira/browse/FLINK-1478 Project: Flink Issue Type: New

Eclipse JDT, Java 8, lambdas

2015-02-06 Thread Nam-Luc Tran
Hello, I am trying to use Java 8 lambdas in my project and hit the following error: Exception in thread "main" org.apache.flink.api.common.functions.InvalidTypesException: The generic type parameters of 'Tuple2' are missing. It seems that your compiler has not stored them into the .class file.

Re: Planning Release 0.8.1

2015-02-06 Thread Aljoscha Krettek
I have a fix for this user-discovered bug: https://issues.apache.org/jira/browse/FLINK-1463?jql=project%20%3D%20FLINK%20AND%20assignee%20%3D%20currentUser()%20AND%20resolution%20%3D%20Unresolved in this PR: https://github.com/apache/flink/pull/353 This should probably also be back-ported to

Re: Eclipse JDT, Java 8, lambdas

2015-02-06 Thread Robert Metzger
Hi, looking at your code, it seems that you are creating a DataSet for each file in the directory. Flink can also read entire directories. Now regarding the actual problem: How are you starting the Flink job? Out of your IDE, or using the ./bin/flink run tool? Best, Robert On Fri, Feb 6,

Re: Eclipse JDT, Java 8, lambdas

2015-02-06 Thread Robert Metzger
Sorry, I didn't see the pathID which is added in the map() method. Then your approach looks good for using Flink locally. On Fri, Feb 6, 2015 at 3:03 PM, Nam-Luc Tran namluc.t...@euranova.eu wrote: Thank you for your replies. @Stephen Updating to 0.9-SNAPSHOT and using the return statement

Re: Planning Release 0.8.1

2015-02-06 Thread Robert Metzger
@Aljoscha, can you merge the backported fix to the release-0.8 branch when it's ready? On Fri, Feb 6, 2015 at 11:39 AM, Aljoscha Krettek aljos...@apache.org wrote: I have a fix for this user-discovered bug:

Fwd: Google Summer of Code 2015 is coming

2015-02-05 Thread Henry Saputra
I have seen some interest from students in Flink. Maybe we should officially submit a proposal to Google Summer of Code this year? -- Forwarded message -- From: Ulrich Stärk u...@apache.org Date: Mon, Feb 2, 2015 at 2:44 PM Subject: Google Summer of Code 2015 is coming To:

Streaming fault tolerance with event sending

2015-02-06 Thread Hermann Gábor
Hey, We've been implementing a simple Storm-like fault tolerance system with persisting source records and keeping track of all the records (whether they've been processed), and replaying them if they fail. The straightforward way to do this was creating a special AbstractJobVertex (and

Re: Google Summer of Code 2015 is coming

2015-02-08 Thread Fabian Hueske
I think it would be good to participate in GSoC and would be available as a mentor this year as well. The following projects from our project wiki page could serve as nice GSoC projects, IMO: - Improving monitoring (I hope we make some progress in that direction until GSoC starts, but there will

Re: [jira] [Commented] (FLINK-1319) Add static code analysis for UDFs

2015-02-08 Thread Fabian Hueske
Timo, thanks for picking up this very cool feature! I think as well that an integrated approach would be the better solution, if it can be done with reasonable effort. +1 implementing a prototype using ASM. Let me know, if I can help somehow. Cheers, Fabian 2015-02-05 14:31 GMT+01:00 Timo

Re: YARN ITCases fail, master broken?

2015-02-02 Thread Ufuk Celebi
What's the state of this? I got an error, which I didn't see before: https://s3.amazonaws.com/archive.travis-ci.org/jobs/49194479/log.txt On 27 Jan 2015, at 13:35, Fabian Hueske fhue...@gmail.com wrote: Robert, thanks for fixing the MacOS build! Building on my Ubuntu VM is still failing

Re: Task manager memory configuration with intermediate results

2015-02-03 Thread Fabian Hueske
Yes, I would really like to get rid of the distinction between operator and network buffers. Having all buffers taken from the same pool is a good step towards that goal. Until the assignment is dynamic, I prefer to have a config option for the network / operator ratio. +1 for the proposal

Memory segment error when migrating functional code from Flink 0.9 to 0.8

2015-02-09 Thread Andra Lungu
Hello everyone, I am implementing a graph algorithm as part of a course and I will also add it to the Flink-Gelly examples. My problem is that I started developing it in the Gelly repository, which runs on flink 0.9. It works like a charm there, but in order to test it on a cluster to see its

Re: Eclipse JDT, Java 8, lambdas

2015-02-09 Thread Nam-Luc Tran
I did try the 4.5 M4 release and it was not straightforward.

Re: Memory segment error when migrating functional code from Flink 0.9 to 0.8

2015-02-09 Thread Till Rohrmann
Hi Andra, have you tried increasing the number of network buffers in your cluster? You can control it with the configuration value: taskmanager.network.numberOfBuffers: #numberBuffers Greets, Till On Mon, Feb 9, 2015 at 9:56 AM, Andra Lungu lungu.an...@gmail.com wrote: Hello everyone, I am

Re: Memory segment error when migrating functional code from Flink 0.9 to 0.8

2015-02-09 Thread Stephan Ewen
This is actually a problem of the number of memory segments available to the hash table for the solution set. For complex pipelines, memory currently gets too fragmented. There are two workarounds, until we do the dynamic memory management, or break it into shorter pipelines: Break the job up

Re: Eclipse JDT, Java 8, lambdas

2015-02-09 Thread Timo Walther
Hey, it seems that 4.4.2 also includes the fix (https://projects.eclipse.org/projects/eclipse/releases/4.4.2/bugs) and will be released at the end of February. I will try Eclipse Luna SR2 RC2 today and check if it is working. Regards, Timo On 09.02.2015 10:05, Nam-Luc Tran wrote: I did try the

Re: Kicking off the Machine Learning Library

2015-01-14 Thread Henry Saputra
Thanks Ted, I have reached out to them as well. The more requests the merrier I suppose =) - Henry On Wed, Jan 14, 2015 at 11:45 AM, Ted Dunning ted.dunn...@gmail.com wrote: On Thu, Jan 8, 2015 at 5:09 PM, Henry Saputra henry.sapu...@gmail.com wrote: I am trying to hook us up with H2O guys,

Re: [VOTE] Release Apache Flink 0.8.0 (RC2)

2015-01-14 Thread Henry Saputra
Hi Robert, yes you are right. Totally forgot about packaging the binaries. - Henry On Wed, Jan 14, 2015 at 11:27 AM, Robert Metzger rmetz...@apache.org wrote: Yes, there are two NOTICE files. They differ because bin and src releases require different licensing notices. (The bin NOTICE is

Re: [VOTE] Release Apache Flink 0.8.0 (RC2)

2015-01-14 Thread Robert Metzger
Yes, there are two NOTICE files. They differ because bin and src releases require different licensing notices. (The bin NOTICE is bigger) On Wed, Jan 14, 2015 at 8:13 PM, Henry Saputra henry.sapu...@gmail.com wrote: Ah, we have 2 copies of NOTICE file? Can binary dist just use the same one

Re: Kicking off the Machine Learning Library

2015-01-14 Thread Ted Dunning
On Thu, Jan 8, 2015 at 5:09 PM, Henry Saputra henry.sapu...@gmail.com wrote: I am trying to hook us up with H2O guys, let's hope it pays off =) I know the CEO and CTO reasonably well and will ping them.

Upgrading to Scala 2.11.x?

2015-01-15 Thread Alexander Alexandrov
Currently, Flink uses Scala 2.10.4 and relies on the macro paradise compiler plugin to get the quasi-quotes functionality. This makes the code incompatible with third-party add-ons that use macros written against a newer version of Scala. Scala 2.11 has been around for almost a year already. It

Re: [VOTE] Release Apache Flink 0.8.0 (RC3)

2015-01-16 Thread Till Rohrmann
Hi, I found an issue with the yarn binaries. In flink-0.8.0-bin-hadoop2-yarn.tgz the plan visualizer does not work. The reason is that the resources folder with the javascript files is not copied to flink-dist. I'm a little bit undecided whether this is a blocker or not. It is definitely a bad

Re: [VOTE] Release Apache Flink 0.8.0 (RC3)

2015-01-16 Thread Henry Saputra
Don't think that is a blocker for the release. We could add a release note to indicate this problem. On Fri, Jan 16, 2015 at 8:25 AM, Till Rohrmann trohrm...@apache.org wrote: Hi, I found an issue with the yarn binaries. In flink-0.8.0-bin-hadoop2-yarn.tgz the plan visualizer does not work.

Re: [VOTE] Release Apache Flink 0.8.0 (RC3)

2015-01-16 Thread Henry Saputra
The checksum files of the source artifact look good. The signature files are good. Source compiled and tests passed. Local tests work. NOTICE, LICENSE files look good. No 3rd-party executables in source artifact. License header exists. +1 - Henry On Thu, Jan 15, 2015 at 3:10 AM, Márton Balassi

Re: Gather a distributed dataset

2015-01-16 Thread Alexander Alexandrov
Thanks, I will have a look at your comments tomorrow and create a PR which should supersede 210. BTW, is there already a test case where I can see the suggested way to do staged execution with the new ExecutionEnvironment API? I thought about your second remark as well. The following lines

Keeping around temp datasets

2015-01-20 Thread Alexander Alexandrov
Hi there, I have to implement some generic fallback strategy on top of a more abstract DSL in order to keep datasets in a temp space (e.g. Tachyon). My implementation is based on the 0.8 release. At the moment I am undecided between three options: - BinaryInputFormat / BinaryOutputFormat -

Representing Scala base types in the Flink RT

2015-01-20 Thread Alexander Alexandrov
Hi there, I cannot figure out how the Scala base types (e.g. scala.Int, scala.Double, etc.) are mapped to the Flink runtime. It seems that they are not treated the same as their Java counterparts (e.g. java.lang.Integer, java.lang.Double). For example, if I write the following code: val

Re: Turn lazy operator execution off for streaming jobs

2015-01-21 Thread Gyula Fóra
Thank you! I will play around with it. On Wed, Jan 21, 2015 at 3:50 PM, Ufuk Celebi u...@apache.org wrote: Hey Gyula, On 21 Jan 2015, at 15:41, Gyula Fóra gyf...@apache.org wrote: Hey Guys, I think it would make sense to turn lazy operator execution off for streaming programs because

Very strange behaviour of groupBy() - sort() - first()

2015-01-21 Thread Felix Neutatz
Hi, my use case is the following: I have a Tuple2<String, Long>. I want to group by the String and sum up the Long values accordingly. This works fine with these lines: DataSet<Lineitem> lineitems = getLineitemDataSet(env); lineitems.project(new int[]{3,0}).groupBy(0).aggregate(Aggregations.SUM,

Re: [flink-streaming] Regarding loops in the Job Graph

2015-01-21 Thread Stephan Ewen
Hi Paris! The Streaming API allows you to define iterations, where parts of the stream are fed back. Do those work for you? In general, cyclic flows are a tricky thing, as the topological order of operators is needed for scheduling (may not be important for continuous streams) but also for a

[jira] [Created] (FLINK-1430) Add test for streaming scala api completeness

2015-01-21 Thread JIRA
Márton Balassi created FLINK-1430: - Summary: Add test for streaming scala api completeness Key: FLINK-1430 URL: https://issues.apache.org/jira/browse/FLINK-1430 Project: Flink Issue Type:

Re: Turn lazy operator execution off for streaming jobs

2015-01-21 Thread Stephan Ewen
I think that this is a fairly delicate thing. The execution graph / scheduling is the most delicate part of the system. I would not feel too well about a quick fix there, so let's think this through a little bit. The logic currently does the following: 1) It schedules the sources (see

Re: Very strange behaviour of groupBy() - sort() - first()

2015-01-21 Thread Fabian Hueske
Chesnay is right. Right now, it is not possible to do what you want in a straightforward way because Flink does not support fully sorting a data set (there are several related issues in JIRA). A workaround would be to attach a constant value to each tuple, group on that (all tuples are sent to

Re: Very strange behaviour of groupBy() - sort() - first()

2015-01-21 Thread Chesnay Schepler
If I remember correctly, first() returns the first n values for every group. The Javadocs actually don't make this behaviour very clear. On 21.01.2015 19:18, Felix Neutatz wrote: Hi, my use case is the following: I have a Tuple2<String, Long>. I want to group by the String and sum up the Long
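The per-group behaviour described above can be illustrated without Flink. The sketch below is plain Java that mimics "keep the first n values of every group"; the class `FirstPerGroup` and its method are hypothetical and only model the semantics, they are not the Flink implementation.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model (not Flink code) of a grouped first-n: every key keeps
// its own first n values, so the result can hold up to n * (number of keys) rows.
class FirstPerGroup {

    public static Map<String, List<Long>> firstPerGroup(
            List<Map.Entry<String, Long>> rows, int n) {
        Map<String, List<Long>> result = new LinkedHashMap<>();
        for (Map.Entry<String, Long> row : rows) {
            List<Long> group = result.computeIfAbsent(row.getKey(), k -> new ArrayList<>());
            if (group.size() < n) {
                group.add(row.getValue()); // only the first n values per key are kept
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Long>> rows = new ArrayList<>();
        rows.add(new AbstractMap.SimpleEntry<>("a", 1L));
        rows.add(new AbstractMap.SimpleEntry<>("b", 5L));
        rows.add(new AbstractMap.SimpleEntry<>("a", 2L));
        rows.add(new AbstractMap.SimpleEntry<>("b", 7L));
        rows.add(new AbstractMap.SimpleEntry<>("a", 3L));
        // With n = 2, both groups contribute two values each: four rows in total.
        System.out.println(firstPerGroup(rows, 2));
    }
}
```

This is why a grouped first-n yields more than n rows overall: a global top-n needs either a non-grouped operator or a single artificial group.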

Re: Very strange behaviour of groupBy() - sort() - first()

2015-01-21 Thread Stephan Ewen
Chesnay is right. What you want is a non-grouped sort/first, which would need to be added... Stephan On 21.01.2015 11:25, Chesnay Schepler chesnay.schep...@fu-berlin.de wrote: If I remember correctly first() returns the first n values for every group. The Javadocs actually don't make this

[jira] [Created] (FLINK-1428) Typos in Java code example for RichGroupReduceFunction

2015-01-21 Thread Felix Neutatz (JIRA)
Felix Neutatz created FLINK-1428: Summary: Typos in Java code example for RichGroupReduceFunction Key: FLINK-1428 URL: https://issues.apache.org/jira/browse/FLINK-1428 Project: Flink Issue

Re: Master not building and how to notice it faster in the future

2015-01-21 Thread Henry Saputra
Would it be better to use Github Jenkins plugin [1] to connect to ASF Jenkins cluster? [1] https://wiki.jenkins-ci.org/display/JENKINS/GitHub+pull+request+builder+plugin [2] http://events.linuxfoundation.org/sites/events/files/slides/Jenkins_at_ASF_2014.pdf On Tue, Jan 20, 2015 at 2:57 PM,

Re: Implementing a list accumulator

2015-01-21 Thread Stephan Ewen
True, that is tricky. The user code does not necessarily respect the non-reuse mode. That may be true for any user code. Can the list accumulator immediately serialize the objects and send over a byte array? That should solve it reliably without adding overhead (serialization will happen anyways).
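The idea of serializing each element as soon as it is added can be sketched as follows. This is a plain Java illustration under stated assumptions: it uses standard Java serialization as a stand-in for whatever serializer the runtime would use, and the class `SerializingListAccumulator` is hypothetical, not Flink's accumulator interface.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch (not Flink API): a list accumulator that stores a
// serialized copy of every element, so later mutation or reuse of the
// caller's object cannot change what was accumulated.
class SerializingListAccumulator<T extends Serializable> {
    private final List<byte[]> serialized = new ArrayList<>();

    // Serializes the value immediately; java.io serialization is an assumption here.
    public void add(T value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        serialized.add(bytes.toByteArray());
    }

    // Deserializes all stored elements into fresh objects.
    public List<T> getAll() {
        List<T> result = new ArrayList<>();
        for (byte[] data : serialized) {
            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
                @SuppressWarnings("unchecked")
                T value = (T) in.readObject();
                result.add(value);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            } catch (ClassNotFoundException e) {
                throw new IllegalStateException(e);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        SerializingListAccumulator<StringBuilder> acc = new SerializingListAccumulator<>();
        StringBuilder reused = new StringBuilder("first");
        acc.add(reused);
        reused.setLength(0);      // user code reuses the same object
        reused.append("second");
        acc.add(reused);
        System.out.println(acc.getAll()); // both values survive despite the reuse
    }
}
```

Because the accumulated data has to be serialized anyway before it is sent, doing so at insertion time moves the cost rather than adding to it.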

Re: [ANNOUNCE] Apache Flink 0.8.0 released

2015-01-22 Thread Fabian Hueske
Awesome! Thank you very much Marton and Robert! Cheers, Fabian 2015-01-22 9:04 GMT+01:00 Robert Metzger rmetz...@apache.org: The Apache Flink team is proud to announce the next version of Apache Flink. Find the blogpost with the change log here:

[jira] [Created] (FLINK-1432) CombineTaskTest.testCancelCombineTaskSorting sometimes fails

2015-01-22 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-1432: - Summary: CombineTaskTest.testCancelCombineTaskSorting sometimes fails Key: FLINK-1432 URL: https://issues.apache.org/jira/browse/FLINK-1432 Project: Flink

Re: How to use org.apache.hadoop.mapreduce.lib.input.MultipleInputs in Flink

2015-01-17 Thread Fabian Hueske
Why don't you just create two data sources that each wrap the ParquetFormat using a HadoopInputFormat and join them as for example done in the TPCH Q3 example [1] I always found the MultipleInputFormat to be an ugly workaround for Hadoop's deficiency to read data from multiple sources. AFAIK,

Re: Gather a distributed dataset

2015-01-15 Thread Ufuk Celebi
On 13 Jan 2015, at 16:50, Stephan Ewen se...@apache.org wrote: Hi! To follow up on what Ufuk explained: - Ufuk is right, the problem is not getting the data set. https://github.com/apache/flink/pull/210 does that for anything that is not too gigantic, which is a good start. I think we

Re: Gather a distributed dataset

2015-01-15 Thread Alexander Alexandrov
@Stephan: yes, I would like to contribute (e.g. I can design the interfaces and merge 210). Please reply with more information once you have the branch, I can find some time for that next week (at the expense of FLINK-1347 https://issues.apache.org/jira/browse/FLINK-1347 which hopefully can wait

Re: [VOTE] Release Apache Flink 0.8.0 (RC2)

2015-01-14 Thread Henry Saputra
We need to use 2014-2015 [1] [1] http://www.apache.org/dev/licensing-howto.html On Wednesday, January 14, 2015, Márton Balassi mbala...@apache.org wrote: Hey guys, Thanks for the updates. Let me update the notice file to 2015 and also bump the inception year in the pom. [1] By the way

Re: Turn lazy operator execution off for streaming jobs

2015-01-22 Thread Ufuk Celebi
On 22 Jan 2015, at 11:37, Till Rohrmann trohrm...@apache.org wrote: I'm not sure whether it is currently possible to schedule first the receiver and then the sender. Recently, I had to fix the TaskManagerTest.testRunWithForwardChannel test case where this was exactly the case. Due to first

Re: Master not building and how to notice it faster in the future

2015-01-21 Thread Robert Metzger
Is the git hook something we can control for everybody? I thought it's more like a personal thing everybody can set up if wanted? I'm against enforcing something like this for every committer. I don't want to wait for 15 minutes for pushing a typo fix to the documentation. On Wed, Jan 21, 2015

Re: Master not building and how to notice it faster in the future

2015-01-21 Thread Max Michels
Hi Robert, I like your solution using Travis and Google App Engine. However, I think there's a much simpler solution which can prevent committers from pushing code that does not compile or fails tests to the master in the first place. Committers could simply install a git pre-push hook in their git

[jira] [Created] (FLINK-1425) Turn lazy operator execution off for streaming programs

2015-01-21 Thread Gyula Fora (JIRA)
Gyula Fora created FLINK-1425: - Summary: Turn lazy operator execution off for streaming programs Key: FLINK-1425 URL: https://issues.apache.org/jira/browse/FLINK-1425 Project: Flink Issue Type:

Re: Master not building and how to notice it faster in the future

2015-01-21 Thread Ufuk Celebi
Thanks for the nice script. I've just installed it :-) On 21 Jan 2015, at 13:57, Max Michels m...@data-artisans.com wrote: I've created a pre-push hook that does what I described (and a bit more). It does only enforce a check for the remote flink master branch and doesn't disturb you on your

Turn lazy operator execution off for streaming jobs

2015-01-21 Thread Gyula Fóra
Hey Guys, I think it would make sense to turn lazy operator execution off for streaming programs because it would make life simpler for windowing. I also created a JIRA issue here https://issues.apache.org/jira/browse/FLINK-1425. Can anyone give me some quick pointers how to do this? Its

[jira] [Created] (FLINK-1422) Missing usage example for withParameters

2015-01-20 Thread Alexander Alexandrov (JIRA)
Alexander Alexandrov created FLINK-1422: --- Summary: Missing usage example for withParameters Key: FLINK-1422 URL: https://issues.apache.org/jira/browse/FLINK-1422 Project: Flink Issue

Re: [RESULT] [VOTE] Release Apache Flink 0.8.0 (RC3)

2015-01-18 Thread Márton Balassi
The vote has passed with +6 binding votes from the PMC. +1 votes are from: Aljoscha Krettek Robert Metzger Vasiliki Kalavri Henry Saputra Fabian Hueske Stephan Ewen Thank you for checking the release. I'll publish the release now On Thu, Jan 15, 2015 at 12:10 PM, Márton Balassi
