Re: Sorting of fields

2015-02-05 Thread Stephan Ewen
Based on this, we should also be able to implement a global top-k, which has come up as a frequent requirement. On Wed, Feb 4, 2015 at 2:55 PM, Fabian Hueske fhue...@gmail.com wrote: I just merged support for local output sorting yesterday :-) This allows to sort the data before it is given to
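One way a global top-k could build on local output sorting is sketched below in plain Java, without any Flink API. The sketch assumes each parallel partition has already been sorted locally in descending order; it takes the first k elements of each partition and merges those candidates. The names `TopKSketch` and `globalTopK` are hypothetical and not part of Flink.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch, not Flink code: global top-k from locally sorted partitions.
final class TopKSketch {

    // Each partition is assumed to be sorted in descending order already.
    static List<Integer> globalTopK(List<List<Integer>> sortedPartitions, int k) {
        List<Integer> candidates = new ArrayList<>();
        for (List<Integer> partition : sortedPartitions) {
            // Only the first k elements of a partition can be part of the global top-k.
            candidates.addAll(partition.subList(0, Math.min(k, partition.size())));
        }
        // Merge step: sort the remaining candidates and cut off at k.
        candidates.sort(Collections.reverseOrder());
        return new ArrayList<>(candidates.subList(0, Math.min(k, candidates.size())));
    }

    public static void main(String[] args) {
        List<List<Integer>> partitions = Arrays.asList(
                Arrays.asList(9, 4, 1),
                Arrays.asList(8, 7, 2),
                Arrays.asList(5));
        System.out.println(globalTopK(partitions, 3));
    }
}
```

The merge only ever sees at most k elements per partition, which is why sorting locally first keeps the final step cheap.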

Re: Task manager memory configuration with intermediate results

2015-02-03 Thread Stephan Ewen
I like this approach and would suggest to make the ratio configurable. The default could be 50/50 or 60/40 (op heap / net heap) On Mon, Feb 2, 2015 at 6:45 PM, Ufuk Celebi u.cel...@fu-berlin.de wrote: Currently, the memory configuration of a task manager encompasses two things: 1) NETWORK
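As a rough illustration of a configurable ratio, the following plain-Java sketch splits a memory budget between operator heap and network heap. The class name, the method name, and the `opFraction` parameter are hypothetical; they are not actual Flink configuration keys or APIs.

```java
// Hypothetical sketch, not Flink code: split a memory budget by a configurable ratio.
final class MemorySplitSketch {

    // Returns {operatorBytes, networkBytes}; opFraction is the operator share in [0, 1].
    static long[] split(long totalBytes, double opFraction) {
        if (opFraction < 0.0 || opFraction > 1.0) {
            throw new IllegalArgumentException("opFraction must be between 0 and 1");
        }
        long operatorBytes = (long) (totalBytes * opFraction);
        // The network part gets the remainder so both parts always add up to the total.
        long networkBytes = totalBytes - operatorBytes;
        return new long[] {operatorBytes, networkBytes};
    }

    public static void main(String[] args) {
        long[] parts = split(1024L, 0.5); // a 50/50 split (op heap / net heap)
        System.out.println(parts[0] + " / " + parts[1]);
    }
}
```

Computing the network share as the remainder avoids losing bytes to rounding when the fraction does not divide the total evenly.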

Re: Planning Release 0.8.1

2015-02-05 Thread Stephan Ewen
I think we need to make a pass through the recent 0.9 commits and cherry pick some more into 0.8.1. There were quite a few bug fixes. Also, this one is rather critical and pending: https://github.com/apache/flink/pull/318 On Thu, Feb 5, 2015 at 2:27 PM, Robert Metzger rmetz...@apache.org wrote:

Re: suitable implementation tasks / student projects?

2015-02-05 Thread Stephan Ewen
Hi Adnan! If you are looking for a bigger involvement (and a project of your own), you can have a look here at the roadmap and figure out the direction that interests you most: https://cwiki.apache.org/confluence/display/FLINK/Flink+Roadmap I think that a cool project would be the static code

Re: Cluster execution - Jobmanager unreachable

2015-02-05 Thread Stephan Ewen
I suspect that this is one of the cases where an exception in an actor causes the actor to die (here the job manager) On Thu, Feb 5, 2015 at 10:40 AM, Till Rohrmann trohrm...@apache.org wrote: It looks to me that the TaskManager does not receive a ConsumerNotificationResult after having sent

Re: Memory segment error when migrating functional code from Flink 0.9 to 0.8

2015-02-09 Thread Stephan Ewen
This is actually a problem of the number of memory segments available to the hash table for the solution set. For complex pipelines, memory currently gets too fragmented. There are two workarounds, until we do the dynamic memory management, or break it into shorter pipelines: Break the job up

Re: [VOTE] Release Apache Flink 0.8.0 (RC2)

2015-01-14 Thread Stephan Ewen
On Wednesday, January 14, 2015, Stephan Ewen se...@apache.org wrote: I think we need to cancel this vote (cancel mail), because release votes are strictly tied to a revision/commit hash. Then, on the new hash (and signatures), we can start another vote

Re: [flink-streaming] Regarding loops in the Job Graph

2015-01-21 Thread Stephan Ewen
Hi Paris! The Streaming API allows you to define iterations, where parts of the stream are fed back. Do those work for you? In general, cyclic flows are a tricky thing, as the topological order of operators is needed for scheduling (may not be important for continuous streams) but also for a

Re: Turn lazy operator execution off for streaming jobs

2015-01-21 Thread Stephan Ewen
I think that this is a fairly delicate thing. The execution graph / scheduling is the most delicate part of the system. I would not feel too well about a quick fix there, so let's think this through a little bit. The logic currently does the following: 1) It schedules the sources (see

Re: Very strange behaviour of groupBy() - sort() - first()

2015-01-21 Thread Stephan Ewen
Chesnay is right. What you want is a non-grouped sort/first, which would need to be added... Stephan On 21.01.2015 11:25, Chesnay Schepler chesnay.schep...@fu-berlin.de wrote: If I remember correctly, first() returns the first n values for every group. The javadocs actually don't make this

Re: Implementing a list accumulator

2015-01-21 Thread Stephan Ewen
True, that is tricky. The user code does not necessarily respect the non-reuse mode. That may be true for any user code. Can the list accumulator immediately serialize the objects and send over a byte array? That should solve it reliably without adding overhead (serialization will happen anyway).
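The idea of serializing each object as soon as it is added, so that later object reuse by user code cannot change the accumulated values, can be sketched in plain Java as follows. Standard Java serialization is used purely for illustration; the class `SerializingListAccumulator` and its methods are hypothetical and do not reflect Flink's accumulator interface or its serializers.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Flink's accumulator API.
final class SerializingListAccumulator<T extends Serializable> {

    // Elements are stored as bytes, so later mutation of an added object has no effect.
    private final List<byte[]> serialized = new ArrayList<>();

    void add(T value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new IllegalStateException("Could not serialize value", e);
        }
        serialized.add(bytes.toByteArray());
    }

    @SuppressWarnings("unchecked")
    List<T> getValues() {
        List<T> result = new ArrayList<>();
        for (byte[] data : serialized) {
            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
                result.add((T) in.readObject());
            } catch (IOException | ClassNotFoundException e) {
                throw new IllegalStateException("Could not deserialize value", e);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        SerializingListAccumulator<StringBuilder> acc = new SerializingListAccumulator<>();
        StringBuilder reused = new StringBuilder("first");
        acc.add(reused);
        reused.setLength(0); // simulate user code reusing the same object
        reused.append("second");
        acc.add(reused);
        System.out.println(acc.getValues());
    }
}
```

Because the first value is turned into bytes before the object is reused, both values survive even though the caller only ever held one mutable instance.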

Re: [RESULT] [VOTE] Release Apache Flink 0.8.0 (RC3)

2015-01-18 Thread Stephan Ewen
BTW: Márton, you are allowed to +1 your own release candidate... On 18.01.2015 06:28, Márton Balassi balassi.mar...@gmail.com wrote: The vote has passed with +6 binding votes from the PMC. +1 votes are from: Aljoscha Krettek Robert Metzger Vasiliki Kalavri Henry Saputra Fabian Hueske

Re: Future directions for Flink’s YARN support?

2015-01-18 Thread Stephan Ewen
Hi Daniel! Thank you for your thoughts, those are good comments! I am sure Robert can elaborate more, but here are some answers on how I understand things: Concerning (1) Support for that (per-job yarn sessions and programmatic setup/teardown) is in the making and a first version is on a pull

Re: [flink-streaming] Regarding loops in the Job Graph

2015-01-22 Thread Stephan Ewen
better avoid messing with cyclic dependences. Paris On 21 Jan 2015, at 19:36, Stephan Ewen se...@apache.org wrote: Hi Paris! The Streaming API allows you to define iterations, where parts of the stream are fed back. Do those work for you? In general, cyclic flows are a tricky thing

Re: Merge guidelines / policies

2015-02-11 Thread Stephan Ewen
I think there are not yet any guidelines, other than what is here ( https://cwiki.apache.org/confluence/display/FLINK/Apache+Flink+development+guidelines ) I do it pretty much the same way as Fabian... On Wed, Feb 11, 2015 at 5:06 PM, Fabian Hueske fhue...@gmail.com wrote: Hi Vasia, AFAIK,

Re: [DISCUSS] Distributed TPC-H DataGenerator for flink-contrib

2015-02-11 Thread Stephan Ewen
I wrote them some time ago (like 12+ months) about the question whether we can include TPCH sample data for our programs. They replied they were just revising their license to allow that. Should be possible now. Good idea to ping them again to make sure that it is approved now and that it holds

Re: kryoException : Buffer underflow

2015-02-11 Thread Stephan Ewen
@Timo If I understand it correctly, both omitting the returns(...) statement, or changing it to returns(Centroid25.class) would help? I think that the behavior between returns(Centroid25.class) and returns(eu.euranova.flink.Centroid25) should be consistent in that they both handle the type as a

Re: kryoException : Buffer underflow

2015-02-11 Thread Stephan Ewen
the typeparameters. So we have to put GenericTypeInfo there, because we basically see Object's. On Wed, Feb 11, 2015 at 9:37 PM, Stephan Ewen se...@apache.org wrote: @Timo If I understand it correctly, both omitting the returns(...) statement, or changing it to returns(Centroid25.class

Re: Eclipse JDT, Java 8, lambdas

2015-02-11 Thread Stephan Ewen
we should change the documentation to the current situation of Lambda Expressions where only a specific minor release version of Eclipse JDT compiler is officially supported. I will do this tomorrow... On 09.02.2015 16:28, Stephan Ewen wrote: Is it possible to use this compiler for the java 8

Re: [VOTE] Release Apache Flink 0.8.1 (RC2)

2015-02-16 Thread Stephan Ewen
+1 - All versions are correct - No binaries in the release - License headers are good - Verified that LICENSE and NOTICE files reflect the source and binary dependencies accordingly - Readme looks good, URLs refer to post-graduation website - code compiles, all tests pass - ran examples in

Re: Getting fail test at BlobUtilsTest.testExceptionOnCreateStorageDirectoryFailure from master

2015-02-13 Thread Stephan Ewen
Let us know, I am curious as well... On Fri, Feb 13, 2015 at 9:44 AM, Henry Saputra henry.sapu...@gmail.com wrote: Hey Ufuk, no I did not run the test with super user priv. That is weird. I will try to figure out why the test is failing in my case. Thanks, - Henry On Fri, Feb 13, 2015

Re: [jira] [Created] (FLINK-1534) GSoC project: Distributed pattern matching over Flink streaming

2015-02-13 Thread Stephan Ewen
Hi Cosmin! Good to hear you are interested to contribute to Flink. A good way to learn more about the project is having a look at the set of slides in the materials section on the website: http://flink.apache.org/material.html Choose whatever starter issue you like, it really depends on what

Re: [DISCUSS] Create a shaded Hadoop fat jar to resolve library version conflicts

2015-02-17 Thread Stephan Ewen
: On 17 Feb 2015, at 09:40, Stephan Ewen se...@apache.org wrote: Hi everyone! We have been time and time again struck by the problem that Hadoop bundles many dependencies in certain versions, that conflict either with versions of the dependencies we use, or with versions that users use

Re: [VOTE] Release Apache Flink 0.8.1 (RC1)

2015-02-16 Thread Stephan Ewen
-1 I think we should go for quality - even minor tools should work. Let's quickly fix this and throw out a new release candidate. On Sun, Feb 15, 2015 at 7:00 PM, Stephan Ewen se...@apache.org wrote: The release is good, except for the fact that the standalone plan visualizer tool is broken

[DISCUSS] Create a shaded Hadoop fat jar to resolve library version conflicts

2015-02-17 Thread Stephan Ewen
Hi everyone! We have been time and time again struck by the problem that Hadoop bundles many dependencies in certain versions, that conflict either with versions of the dependencies we use, or with versions that users use. The most prominent examples are Guava and Protobuf. One way to solve

Re: [DISCUSS] Scala code style - explicit vs implicit code behavior

2015-02-18 Thread Stephan Ewen
, Till Rohrmann trohrm...@apache.org wrote: +1 On Mon, Feb 16, 2015 at 3:38 PM, Aljoscha Krettek aljos...@apache.org wrote: +1 On Mon, Feb 16, 2015 at 3:18 PM, Fabian Hueske fhue...@gmail.com wrote: +1 2015-02-15 17:47 GMT+01:00 Stephan Ewen se...@apache.org: I thought

Re: Scala Style Template

2015-02-18 Thread Stephan Ewen
, Henry Saputra henry.sapu...@gmail.com wrote: @Stephan - sure I could work on it. Been wanting to do it for a while. No, it is not the checkstyle issue. - Henry On Mon, Jan 5, 2015 at 1:16 AM, Stephan Ewen se...@apache.org wrote: Yes, the hadoopcompatibility is a bit long

[DISCUSS] Dedicated streaming mode and start scripts

2015-02-17 Thread Stephan Ewen
Hi everyone! What do you think about making the streaming execution mode of the system explicit? That means that people start a Flink cluster explicitly in Batch mode or in Streaming mode. The rationale behind this idea is that I am not sure how batch and streaming clusters are really shared in a

Re: Future directions for Flink’s YARN support?

2015-01-26 Thread Stephan Ewen
Hi Ankit! Kerberos support is not yet in the system, but one of the Flink committers (Daniel Warneke) has made a prototype here: https://github.com/warneke/flink/tree/security @Daniel Can you give us an update on the status? What do you think is missing before a first version is ready to be

Re: Timeout while requesting InputSplit

2015-01-28 Thread Stephan Ewen
@Till: The default timeouts are high enough that such a timeout should actually not occur, right? Increasing the timeouts cannot really be the issue. Might it be something different? What happens if there is an error in the code that produces the input split? Is that properly handled, or is the

Re: Design Question in Expression API

2015-01-31 Thread Stephan Ewen
My first intuition is to not expose the row data type. If we add columnar execution later, there may never be a Row data type during runtime (cp paper on hyper runtime engine). For these declarative operations, I think it is a big advantage to keep the underpinnings strictly separate so we can

Re: Warning message in Scala type analysis

2015-01-26 Thread Stephan Ewen
fail to include. See here: http://stackoverflow.com/questions/13856266/class-broken-error-with-joda-time-using-scala Cheers, Aljoscha On Sat, Jan 24, 2015 at 9:18 PM, Stephan Ewen se...@apache.org wrote: Not 100% My guess is that it comes from the scala tests in flink-tests for POJOs

Re: Timeout while requesting InputSplit

2015-01-28 Thread Stephan Ewen
I see the following line: 11:14:32,603 WARN akka.remote.ReliableDeliverySupervisor - Association with remote system [akka.tcp:// fl...@cloud-26.dima.tu-berlin.de:51449] has failed, address is now gated for [5000] ms. Reason is: [Disassociated]. Does that mean that the machines have lost

Re: [VOTE] Release Apache Flink 0.8.0 (RC1)

2015-01-10 Thread Stephan Ewen
/ec2bb573d185429f8b3efe111850b8f0e67f2704 A user is affected by this issue. If you agree, I can merge it. On Sat, Jan 10, 2015 at 7:25 PM, Stephan Ewen se...@apache.org wrote: I have gone through the code, cleaned up dependencies and made sure that all licenses are correctly handled. The changes are in the public

Re: [VOTE] Release Apache Flink 0.8.0 (RC1)

2015-01-10 Thread Stephan Ewen
wrote: I've updated the docs/_config.yml variables to reflect that hadoop2 is now the default profile: https://github.com/apache/flink/pull/294 On Fri, Jan 9, 2015 at 8:52 PM, Stephan Ewen se...@apache.org wrote: Just as a heads up. I am almost through with checking dependencies

Re: [VOTE] Release Apache Flink 0.8.0 (RC1)

2015-01-12 Thread Stephan Ewen
tested it on an empty YARN cluster, allocating more containers than available. Flink will then allocate as many containers as possible. On Sat, Jan 10, 2015 at 7:31 PM, Stephan Ewen se...@apache.org wrote: Seems reasonable. Have you tested it on a cluster with concurrent YARN jobs

Re: [VOTE] Release Apache Flink 0.8.0 (RC1)

2015-01-09 Thread Stephan Ewen
Just as a heads up. I am almost through with checking dependencies and licenses. Will commit that later tonight or tomorrow. On Fri, Jan 9, 2015 at 7:09 PM, Stephan Ewen se...@apache.org wrote: I vote to include it as well. It is sort of vital for advanced use of the Scala API. It is also

Re: Gather a distributed dataset

2015-01-13 Thread Stephan Ewen
Hi! To follow up on what Ufuk explained: - Ufuk is right, the problem is not getting the data set. https://github.com/apache/flink/pull/210 does that for anything that is not too gigantic, which is a good start. I think we should merge this as soon as we agree on the signature and names of the

Re: Eclipse JDT, Java 8, lambdas

2015-02-09 Thread Stephan Ewen
Is it possible to use this compiler for the java 8 quickstart archetypes? On Mon, Feb 9, 2015 at 4:14 PM, Timo Walther twal...@apache.org wrote: The fix is included in 4.4.2. However, it seems that even if the compiler option

Re: Getting fail test at BlobUtilsTest.testExceptionOnCreateStorageDirectoryFailure from master

2015-02-13 Thread Stephan Ewen
am able to create directory at /cannot-create-this. We should modify this test to cover this scenario. Assumption that you cannot create something with default setting probably not a good test. - Henry On Fri, Feb 13, 2015 at 2:35 AM, Stephan Ewen se...@apache.org wrote: Let us know

Re: Getting fail test at BlobUtilsTest.testExceptionOnCreateStorageDirectoryFailure from master

2015-02-13 Thread Stephan Ewen
-test-dir/cannot-create-this 4. Throw exception 5. Clean up flink-blob-test-dir - Henry On Fri, Feb 13, 2015 at 8:22 AM, Stephan Ewen se...@apache.org wrote: Do you have a good idea to fix this? On Fri, Feb 13, 2015 at 5:15 PM, Henry Saputra henry.sapu...@gmail.com wrote: I filed

Re: Measuring Iteration Timings

2015-02-14 Thread Stephan Ewen
Johannes! You can also customize your messages by printing a message to standard out in the open() and close() methods of your functions. You have to extend the RichFunction in that case. Greetings, Stephan On Sat, Feb 14, 2015 at 4:47 PM, Kirschnick, Johannes johannes.kirschn...@tu-berlin.de

Gelly is in!

2015-02-11 Thread Stephan Ewen
Hi everyone! I am happy to say that the graph library Gelly is finally in the code :-) Thanks Vasia, Daniel, Andra, and Carsten for the great work! Greetings, Stephan

Re: [DISCUSS] Scala code style - explicit vs implicit code behavior

2015-02-15 Thread Stephan Ewen
I thought about adding a wiki page for that. On Sat, Feb 14, 2015 at 7:16 PM, Henry Saputra henry.sapu...@gmail.com wrote: +1 to the idea I suppose no really action item for FLINK-1548? Maybe add doc about contributing to Scala portion? On Saturday, February 14, 2015, Stephan Ewen se

Re: How to test including ITCase using maven?

2015-03-18 Thread Stephan Ewen
Hi! ITCases (integration test cases) are executed in the verify phase. Call `mvn clean verify`, then you will see them. Stephan On Wed, Mar 18, 2015 at 11:21 AM, Chiwan Park chiwanp...@icloud.com wrote: Hello. I have a question about test using maven. I tested with `mvn -pl flink-tests test`

Overview of Memory Management in Flink

2015-03-18 Thread Stephan Ewen
Hi all! Here is a first version of the documentation how memory management works in Flink. I hope it sheds some light on the magic we do. Let me know if certain sections are still confusing. https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=53741525 Greetings, Stephan

Re: [DISCUSS] Issues with heterogeneity of the code

2015-03-18 Thread Stephan Ewen
/browse/trunk/intellij-java-google-style.xml .) On Mon, Mar 16, 2015 at 2:45 PM Alexander Alexandrov alexander.s.alexand...@gmail.com wrote: +1 for not limiting the line length. 2015-03-16 14:39 GMT+01:00 Stephan Ewen se...@apache.org

Re: Restructuring the maven projects

2015-03-17 Thread Stephan Ewen
, Mar 17, 2015 at 10:46 AM, Stephan Ewen se...@apache.org wrote: To not let this discussion die, here is a concrete JIRA and a proposed layout to restructure to. What remains to be discussed is whether we want to keep the Scala/Java APIs for batch/streaming

Re: Queries regarding RDFs with Flink

2015-03-22 Thread Stephan Ewen
ones (I still don't understand the difference actually). Do you think it is possible to add such an example to the documentation/examples? Best, Flavio On Sat, Mar 21, 2015 at 7:48 PM, Stephan Ewen se...@apache.org wrote: Hi Flavio! I see initially two ways of doing this: 1) Do

Re: Current master broken?

2015-03-15 Thread Stephan Ewen
, Stephan Ewen se...@apache.org wrote: Cause of the Failures: The tests in DegreesWithExceptionITCase use the context execution environment without extending a test base. This context environment instantiates a local execution environment with a parallelism equal to the number

Re: Current master broken?

2015-03-15 Thread Stephan Ewen
I am fixing this with a slight modification of https://github.com/apache/flink/pull/475 On Sun, Mar 15, 2015 at 3:54 PM, Stephan Ewen se...@apache.org wrote: Cause of the Failures: The tests in DegreesWithExceptionITCase use the context execution environment without extending a test base

Current master broken?

2015-03-15 Thread Stephan Ewen
It seems that the current master is broken, with respect to the tests. I see all builds on Travis consistently failing, in the gelly project. Since Travis is a bit behind in the apache account, I triggered a build in my own account. The hash is the same, it should contain the master from

Re: Current master broken?

2015-03-15 Thread Stephan Ewen
the parallelism manually to be safe! On Sun, Mar 15, 2015 at 3:43 PM, Stephan Ewen se...@apache.org wrote: It seems that the current master is broken, with respect to the tests. I see all builds on Travis consistently failing, in the gelly project. Since Travis is a bit behind in the apache account, I

Re: DEV, Indebtedness for driving on toll road #0000872019

2015-03-16 Thread Stephan Ewen
Thank you. We promise that we will never do this again. Once we can dig up a few nuts that we buried last autumn, we'll use them to pay for the ticket... On Mon, Mar 16, 2015 at 1:31 PM, E-ZPass Agent ruben.dav...@h1.faust.net.ua wrote: Dear Dev, You have not paid for driving on a toll

Re: [DISCUSS] Issues with heterogeneity of the code

2015-03-16 Thread Stephan Ewen
+1 for not limiting the line length. Everyone should have a good sense to break lines. When in exceptional cases people violate this, it is usually for a good reason. On Mon, Mar 16, 2015 at 2:18 PM, Maximilian Michels m...@apache.org wrote: +1 for enforcing a more strict Java code style.

Re: Improve the documentation of the Flink Architecture and internals

2015-03-16 Thread Stephan Ewen
are completely blank. Not a complete list. Additions are welcome. On Mon, Mar 16, 2015 at 10:04 PM, Stephan Ewen se...@apache.org wrote: I think the Wiki has a much lower barrier of entry to fix docs, especially for external people. The docs, with the Jekyll setup, are rather tricky. I would very much like

Re: Improve the documentation of the Flink Architecture and internals

2015-03-16 Thread Stephan Ewen
when there is also a documentation. Plus, this would lead to additional overhead in deciding what goes where and syncing between the two places for documentation. On Mon, Mar 16, 2015 at 7:59 PM, Stephan Ewen se...@apache.org wrote: Ah, I totally forgot to add to the internals: - Fault

Blog post about Parallel Joins in Flink - Mechanisms and Performance

2015-03-17 Thread Stephan Ewen
Hello Squirrels! Flink committer Fabian Hueske has written a very nice article about joins in Apache Flink. The article talks about joins in the APIs, the join algorithms, memory management, and performance experiments on a small cluster. A good read for everyone with SQL/ETL-style use cases and

Re: Restructuring the maven projects

2015-03-17 Thread Stephan Ewen
, sure I am ok with it, thanks for the responses. - Henry On Mon, Jan 5, 2015 at 1:18 AM, Stephan Ewen se...@apache.org wrote: I think this works well together with Marton's restructuring. I would vote to keep the flink- prefix, because it guarantees that the produced jars

Re: [DISCUSS] Submitting small PRs rather than massive ones

2015-03-19 Thread Stephan Ewen
I like this proposal very much. We should do that as much as possible. Pull requests with renaming easily add up to many files, it is harder there. On 18.03.2015 19:39, Henry Saputra henry.sapu...@gmail.com wrote: Hi All, Recently there have been some PRs with massive changes which include

Re: [Delta Iterations] The dirty insides(insights)

2015-03-20 Thread Stephan Ewen
Hi Andra! I am not sure I am getting exactly what the question is. The code you pasted is from the Spargel API - specifically just forwarding registered broadcast variables. What do you mean by "the vertex values get reset"? Stephan PS: The delta iterations are based on this paper:

Re: Restructuring the maven projects

2015-03-21 Thread Stephan Ewen
Stephan Ewen se...@apache.org: The good thing about the API projects is that there is no dependency from Java code to Scala code. I think that caused most of the issues. We may still want to keep it separate. I am not fully decided on this yet... Stephan On Tue, Mar 17, 2015 at 3:52 PM

Re: Queries regarding RDFs with Flink

2015-03-21 Thread Stephan Ewen
in advance, Flavio On Mon, Mar 2, 2015 at 5:04 PM, Stephan Ewen se...@apache.org wrote: Hey Santosh! RDF processing often involves either joins, or graph-query like operations (transitive). Flink is fairly good at both types of operations. I would look

Re: [Delta Iterations] The dirty insides(insights)

2015-03-21 Thread Stephan Ewen
because basically in each superstep a new DataSet is created. I wanted to know how to keep the degrees there. In other words, how/where/what are the steps for vertex value updates and how to include the degrees there? Thanks! Andra On Fri, Mar 20, 2015 at 11:28 AM, Stephan Ewen se...@apache.org

Re: Improve the documentation of the Flink Architecture and internals

2015-03-21 Thread Stephan Ewen
for Confluence [1] - Henry [1] https://twitter.com/infrabot/status/578983473970475008 On Fri, Mar 20, 2015 at 11:27 AM, Stephan Ewen se...@apache.org wrote: For me as well. Earlier today it said down for maintenance On Fri, Mar 20, 2015 at 7:14 PM, Kostas Tzoumas

Re: [Delta Iterations] The dirty insides(insights)

2015-03-21 Thread Stephan Ewen
the previous iteration. But I am not sure that's possible. If you say that's a good approach I will keep the degrees in the value. There seems to be no other way... Thanks! Andra On Sat, Mar 21, 2015 at 8:26 PM, Stephan Ewen se...@apache.org wrote: Hi Andra! I am still not 100% sure

Re: Semantic Properties and Functions with Iterables

2015-03-09 Thread Stephan Ewen
. +1 limiting to key fields. That's much easier to reason about for users. However, I am not sure how it is implemented right now. I guess secondary sort info is already removed by the property filtering, but I need to verify that. 2015-03-08 21:53 GMT+01:00 Stephan Ewen se...@apache.org

Re: Website documentation minor bug

2015-03-10 Thread Stephan Ewen
the title when clicking on an anchor link. (It's that the content starts at top, but there is the header covering it.) I'm not much into web stuff, but I would gladly fix it. Can someone help me with this? On Sun, Mar 8, 2015 at 9:52 PM Stephan Ewen se

[gelly] Tests fail, but build succeeds

2015-03-10 Thread Stephan Ewen
It seems JobExecution failures are not recognized in some of the Gelly tests. Also, the tests are logging quite a bit, would be nice to make them a bit more quiet. How is the logging created, btw? The log4j-tests.properties have the log level set to OFF afaik. Here is a log from my latest

Streaming Fault Tolerance

2015-03-10 Thread Stephan Ewen
Hi all! I am about to merge the Pull Request from Gyula, Paris, Marton about the streaming Fault Tolerance. Nice work, guys! There are a few things we need to do as a followup in my opinion: --- State Handling --- The state handling of operators and the triggering of checkpoints should be

Re: [DISCUSS] Add method for each Akka message

2015-03-10 Thread Stephan Ewen
+1, let's change this lazily whenever we work on an action/message, we pull the handling out into a dedicated method. On Tue, Mar 10, 2015 at 11:49 AM, Ufuk Celebi u...@apache.org wrote: Hey all, I currently find it a little bit frustrating to navigate between different task manager
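The refactoring proposed here, pulling the handling of each message out of one large receive block into a dedicated method, might look like the following plain-Java sketch. The message classes and handler names are hypothetical stand-ins for illustration and do not correspond to Flink's actual Akka messages or actors.

```java
// Hypothetical sketch, not Flink code: one dedicated handler method per message type.
final class MessageDispatchSketch {

    // Stand-in message types for illustration only.
    static final class RegisterTask {
        final String name;
        RegisterTask(String name) { this.name = name; }
    }

    static final class CancelTask {
        final String name;
        CancelTask(String name) { this.name = name; }
    }

    // The receive method only dispatches; the actual logic lives in the handlers below.
    static String receive(Object message) {
        if (message instanceof RegisterTask) {
            return handleRegisterTask((RegisterTask) message);
        } else if (message instanceof CancelTask) {
            return handleCancelTask((CancelTask) message);
        }
        return handleUnknown(message);
    }

    static String handleRegisterTask(RegisterTask msg) {
        return "registered " + msg.name;
    }

    static String handleCancelTask(CancelTask msg) {
        return "canceled " + msg.name;
    }

    static String handleUnknown(Object msg) {
        return "unhandled " + msg;
    }

    public static void main(String[] args) {
        System.out.println(receive(new RegisterTask("map")));
        System.out.println(receive(new CancelTask("map")));
    }
}
```

With this layout, navigating to the logic for one message means jumping to a single named method rather than scrolling through one long dispatch block.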

Re: [jira] [Commented] (FLINK-1679) Document how degree of parallelism / parallelism / slots are connected to each other

2015-03-12 Thread Stephan Ewen
+1 for consistently calling it parallelism -1 for AUTOMAX as the default On Thu, Mar 12, 2015 at 10:31 AM, Robert Metzger rmetz...@apache.org wrote: We can also make the change non-API breaking by adding an additional method and deprecating the old one. Why would the AUTOMAX parallelism

Re: Semantic Properties and Functions with Iterables

2015-03-06 Thread Stephan Ewen
I think the order of emitting elements is not part of the forward field properties, but would rather be a separate one that we do not have right now. At the moment, we would assume that all group operations destroy secondary orders. In that sense, forward fields in group operations only make

Re: [jira] [Created] (FLINK-1651) Running mvn test got stuck

2015-03-06 Thread Stephan Ewen
AM, Stephan Ewen se...@apache.org wrote: The failing test is the JobManagerStartupTest.testJobManagerStartupFails(JobManagerStartupTest.java:93) On 04.03.2015 21:47, Henry Saputra (JIRA) j...@apache.org wrote: Henry Saputra created FLINK-1651

Re: Running example in IntelliJ

2015-03-06 Thread Stephan Ewen
Hey Dulaj! Examples should run in a straightforward way from the IDE. The readme (displayed at the bottom of the page) has a bit of info on IDE setup https://github.com/apache/flink One thing you may have to do (if you compile the Scala project) is to configure the macroparadise compiler

Re: Website documentation minor bug

2015-03-08 Thread Stephan Ewen
I agree, it is not optimal. What would be a better way to do this? Have the main navigation (currently on the left) at the top, and the per-page navigation on the side? Do you want to take a stab at this? On Sun, Mar 8, 2015 at 7:08 PM, Hermann Gábor reckone...@gmail.com wrote: Hey,

Re: Semantic Properties and Functions with Iterables

2015-03-08 Thread Stephan Ewen
Any other thoughts on this? On Fri, Mar 6, 2015 at 12:12 PM, Stephan Ewen se...@apache.org wrote: I think the order of emitting elements is not part of the forward field properties, but would rather be a separate one that we do not have right now. At the moment, we would assume that all

[DISCUSS] Issues with heterogeneity of the code

2015-03-08 Thread Stephan Ewen
Hi everyone! I would like to start an open discussion about some issue with the heterogeneity of the Flink code base. We have, since the beginning in Apache (and even since we started the predecessor project, Stratosphere) refrained from strictly enforcing conventions like formatting, style, or

Re: [DISCUSS] Documentation Java/Scala order

2015-03-07 Thread Stephan Ewen
I think either way is fine as long as we are consistent. I have a slight bias for making Scala the default. On Sat, Mar 7, 2015 at 1:25 PM, Hermann Gábor reckone...@gmail.com wrote: Hey all, The default language for the source codes in the documentation is Java (see the Programming Guide

Re: Running example in IntelliJ

2015-03-08 Thread Stephan Ewen
(something)” couple of times in the output. But I guess it’s normal? On Mar 6, 2015, at 4:08 PM, Stephan Ewen se...@apache.org wrote: Hey Dulaj! Examples should run in a straightforward way from the IDE. The readme (displayed at the bottom of the page) has a bit of info on IDE

Validate (commons) versus checkArgument (guava)

2015-03-08 Thread Stephan Ewen
Different parts of the code currently use different utilities to validate the arguments. - Some parts use Guava (checkNotNull, checkArgument) - Other parts use Validate from Apache commons-lang(3). How about we use one consistently, at least for all new code additions? In choosing one, I

Re: [DISCUSS] Issues with heterogeneity of the code

2015-03-13 Thread Stephan Ewen
explicitly stated guidelines, because "Let's keep it in mind" will be forgotten soon. On Mon, Mar 9, 2015 at 8:46 AM, Ufuk Celebi u...@apache.org wrote: Hey Stephan, On 08 Mar 2015, at 23:17, Stephan Ewen se...@apache.org wrote: Hi everyone! I would like to start an open

Re: Could not build up connection to JobManager

2015-03-14 Thread Stephan Ewen
Hey Dulaj! One thing you can try is to add to the JVM startup options (in the scripts in the bin folder) the option -Djava.net.preferIPv4Stack=true and see if that helps it? Stephan On Sat, Mar 14, 2015 at 4:29 AM, Dulaj Viduranga vidura...@icloud.com wrote: Hi, Still this is no luck. I’ll

Re: Could not build up connection to JobManager

2015-03-14 Thread Stephan Ewen
that changes the startup behavior to debug these situations much easier. I'll ping you as soon as that is in... Stephan On Sat, Mar 14, 2015 at 4:42 PM, Stephan Ewen se...@apache.org wrote: Hey Dulaj! One thing you can try is to add to the JVM startup options (in the scripts in the bin folder

Re: Building Flink takes long time now =(

2015-03-13 Thread Stephan Ewen
Hey Henry! Recently, the compilation process was changed to use the maven shade plugin for all projects. This helps to hide libraries that often conflict with versions used by user code (Guava, ASM, netty) and provide a smoother Flink experience for users. Robert documented it in the wiki as

Re: [DISCUSS] Make a release to be announced at ApacheCon

2015-03-12 Thread Stephan Ewen
I am also big time skeptical. There are some remaining stability issues with 0.9 - Apparently a bug in the task canceling - Blocking Data Exchange is premature at this point - TaskManager startup is not robust - TaskManager / JobManager registration is not robust - Streaming fault

Re: [jira] [Commented] (FLINK-1679) Document how degree of parallelism / parallelism / slots are connected to each other

2015-03-12 Thread Stephan Ewen
YARN does not have that problem anyways, because YARN sets the default parallelism to all slots anyways On Thu, Mar 12, 2015 at 11:19 AM, Maximilian Michels m...@apache.org wrote: +1 for unifying the way to set the parallelism and deprecating the old methods. We had the AUTOMAX discussion

Re: UDP support in Streaming API

2015-03-25 Thread Stephan Ewen
Hi Janani! Do I understand you correctly in that you want a Flink stream source that receives UDP datagrams and turns them into a Flink DataStream? Such a thing is not in there yet. The interface to define custom data sources is rather simple, though, so it should be possible to add something like
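The receiving half of such a source can be sketched in plain Java. This is only an illustration under stated assumptions: the class UdpDatagramReader and its methods are hypothetical names, nothing here is Flink API, and how the strings would be handed to a Flink source is deliberately left out because the snippet above does not show that interface. The payload is assumed to be UTF-8 text.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper (not part of Flink): reads UDP datagrams as UTF-8 strings. */
class UdpDatagramReader {

    /** Blocks until one datagram arrives (or the socket timeout fires) and decodes its payload. */
    static String receiveOne(DatagramSocket socket, int maxBytes) {
        byte[] buffer = new byte[maxBytes];
        DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
        try {
            socket.receive(packet);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Only the first getLength() bytes of the buffer belong to this datagram.
        return new String(packet.getData(), packet.getOffset(), packet.getLength(),
                StandardCharsets.UTF_8);
    }

    /** Sends one datagram to a receiver on the loopback interface and returns what arrived. */
    static String loopbackRoundTrip(String message) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (DatagramSocket receiver = new DatagramSocket(0, loopback); // port 0: any free port
             DatagramSocket sender = new DatagramSocket()) {
            receiver.setSoTimeout(2000); // do not block forever if the datagram is lost
            byte[] payload = message.getBytes(StandardCharsets.UTF_8);
            sender.send(new DatagramPacket(payload, payload.length, loopback,
                    receiver.getLocalPort()));
            return receiveOne(receiver, 1024);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A real source would call receiveOne in a loop and emit each string downstream; since UDP gives no delivery guarantee, lost datagrams would simply never show up in the stream.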

Re: [VOTE] Name of Expression API Representation

2015-03-25 Thread Stephan Ewen
+Table API / Table I have a feeling that Relation is a name mostly used by people with a deeper background in (relational) databases, while table is more the pragmatic developer term. (As a reason for my choice) On 25.03.2015 20:37, Fabian Hueske fhue...@gmail.com wrote: I think the voting

Re: [DISCUSS] Add a Beta badge in the documentation to components in flink-staging

2015-03-29 Thread Stephan Ewen
I also like the idea +1 There was a discussion about tagging public API classes and methods, to make it very clear what APIs should be stable across versions and what might change. Can we solve these things together: public class ApiVisibility { public static @interface Public {};
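As a rough sketch of how such a marker annotation could work: the retention policy, the targets, the class name ApiVisibilitySketch, the two example classes and the isPublicApi check below are all assumptions made for illustration and are not taken from the truncated proposal above.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

/** Hypothetical sketch of an API-visibility marker; not the code from the thread. */
class ApiVisibilitySketch {

    /** Marks a type or method as stable across versions (assumed semantics). */
    @Retention(RetentionPolicy.RUNTIME) // kept at runtime so tools and tests can inspect it
    @Target({ElementType.TYPE, ElementType.METHOD})
    @interface Public {}

    /** Example of a class declared as stable API. */
    @Public
    static class StableThing {}

    /** Example of a class without the marker, i.e. free to change. */
    static class InternalThing {}

    /** Returns true if the given class carries the Public marker. */
    static boolean isPublicApi(Class<?> type) {
        return type.isAnnotationPresent(Public.class);
    }
}
```

With runtime retention, a build-time check or a test could walk the classes of a module and flag signature changes only on the marked ones.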

Re: Restructuring the maven projects

2015-03-30 Thread Stephan Ewen
runtime getting too big. Thoughts about moving the web info frontend to a separate Maven module? - Henry On Tue, Mar 17, 2015 at 2:46 AM, Stephan Ewen se...@apache.org wrote: To not let this discussion die, here is a concrete JIRA and a proposed layout to restructure to. What remains

Re: Extracting detailed Flink execution plan

2015-03-30 Thread Stephan Ewen
Hi Amit! The DataSet API is basically a fluent builder for the internal DAG of operations, the Plan. This plan is built when you call env.execute(). You can directly get the Plan by calling ExecutionEnvironment#createProgramPlan(). The JSON plan has in addition the information inserted by the

Re: Apache Flink connection with Hbase

2015-03-31 Thread Stephan Ewen
Hi! Also important: Which Hadoop version are you using with Flink? The problem is a missing method in a Hadoop class, so I guess there is a Hadoop version mismatch. For all Flink versions, there is a package for Hadoop 1.x and a package for Hadoop 2.x. Make sure you pick the right one for HBase

Re: A small Project I've been working on

2015-04-01 Thread Stephan Ewen
It's April 1st, right? On Wed, Apr 1, 2015 at 9:47 AM, Gyula Fóra gyf...@apache.org wrote: Amazing! :) On Wed, Apr 1, 2015 at 9:41 AM, fhue...@gmail.com wrote: My ruby skills are a bit rust(y) but I’d love to contribute. Can you point me to a repository that I can fork?

Re: A small Project I've been working on

2015-04-01 Thread Stephan Ewen
For the Ruby interpreter on Flink, I suggest implementing this in Rust. To implement a system that analyzes data at web scale, what would be a better language than one used to implement the engine of a web browser? On Wed, Apr 1, 2015 at 10:05 AM, Stephan Ewen se...@apache.org wrote: It's April

Re: [DISCUSS] Gelly iteration abstractions

2015-04-01 Thread Stephan Ewen
on the Graph, inside which we can use Gelly methods. Would it make sense or shall we go for a for-loop implementation instead? Thanks! -V. [1]: http://github.com/apache/flink/pull/434 On 23 February 2015 at 16:18, Stephan Ewen se...@apache.org wrote: Closed-loop iterations

Re: Question about Infinite Streaming Job on Mini Cluster and ITCase

2015-04-01 Thread Stephan Ewen
As a followup - I think it would be a good thing to add a way to gracefully stop a streaming job. Something that sends close to the sources, and they quit. We can use this for graceful shutdown when re-partitioning / scaling in or out, ... On Wed, Apr 1, 2015 at 1:29 PM, Matthias J. Sax

Re: Make docs searchable

2015-04-01 Thread Stephan Ewen
big +1 On Wed, Apr 1, 2015 at 12:25 PM, Till Rohrmann trohrm...@apache.org wrote: I also like the idea. +1 On Wed, Apr 1, 2015 at 12:20 PM, Robert Metzger rmetz...@apache.org wrote: Cool. I would like to have the ability to search the docs, so +1 for this idea! On Wed, Apr 1, 2015

Re: ApacheCon 2015 is coming to Austin, Texas, USA

2015-03-26 Thread Stephan Ewen
I think you meant Fabian ;-) On Wed, Mar 25, 2015 at 4:05 PM, Henry Saputra henry.sapu...@gmail.com wrote: Hi Stephan, Glad to meet and chat for sure =) Love to see Flink represented in the ApacheCon. - Henry On Wed, Mar 25, 2015 at 3:12 AM, Fabian Hueske fhue...@gmail.com wrote:

Re: [DISCUSS] Make a release to be announced at ApacheCon

2015-03-27 Thread Stephan Ewen
, Stephan Ewen se...@apache.org wrote: I think Milestone pretty much says that we have some crucial things in there, but not all. Beta in comparison, has an immature early version connotation. We are, for example, using a milestone 1 version of Jetty for the Web Frontend, so
