Re: Gelly and ML for Streaming

2015-05-12 Thread Vasiliki Kalavri
Hi Suminda, indeed this is a very exciting idea and we have been working on both Gelly streaming and ML streaming for a while here in Stockholm. Daniel has been looking into graph streaming for his thesis together with Paris and myself. We have evaluated existing streaming models and algorithms

Re: [DISCUSS] Merging Storm compatibility to Flink-contrib

2015-05-12 Thread Robert Metzger
Hi, Thank you for starting the discussion Marton! I would really like to merge the storm compat to our source repo. I think that code which is not merged there will not get enough attention. I'm against splitting flink-contrib into small maven modules. I totally understand your reasoning (mixed

[DISCUSS] Merging Storm compatibility to Flink-contrib

2015-05-12 Thread Márton Balassi
The purpose of flink-contrib currently is to hold contributions to the project that we do not consider part of the core Flink functionality, but that provide useful tools around it. In general, code placed here has to meet fewer requirements in terms of covering all corner cases if it provides a nice

Re: New project website

2015-05-12 Thread Kostas Tzoumas
Good points raised by Stephan, Felix, and Volker. Do you think we can achieve this by iterating on the new design? One thing we can indeed do is de-clutter a bit. Apart from that, what high-level points would you like to see on the front page? On Tue, May 12, 2015 at 6:16 AM, Markl, Volker, Prof.

[DISCUSS] Access to Time and Window in Streaming Operations

2015-05-12 Thread Aljoscha Krettek
Hi, I'll try to make it quick this time. I think we need to make information about the event time of an element and information about windows in which it resides accessible to the user. A simple example would be the aggregation of some user behaviour, for example: in = clickSource() analysedData

[jira] [Created] (FLINK-2004) Memory leak in presence of failed checkpoints in KafkaSource

2015-05-12 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-2004: --- Summary: Memory leak in presence of failed checkpoints in KafkaSource Key: FLINK-2004 URL: https://issues.apache.org/jira/browse/FLINK-2004 Project: Flink

Re: Migrating our website from SVN to Git

2015-05-12 Thread Fabian Hueske
Thanks Max! Happy to have the website on Git :-) 2015-05-11 18:56 GMT+02:00 Maximilian Michels m...@apache.org: We're now on Git for our website! Instructions for changing the website have been updated in the How to contribute guide:

Re: [DISCUSS] Access to Time and Window in Streaming Operations

2015-05-12 Thread Gyula Fóra
Hi, This was the exact need that motivated me to rework the windowing and introduce the StreamWindow abstraction which can hold any metadata that represents the current window. At this moment it only contains a unique id but this could be extended easily. When the user created a

Re: New project website

2015-05-12 Thread Stephan Ewen
I think starting with Ufuk's draft, de-cluttering it is a good start. - Having a more prominent tag line - Add some pictures for features back - align the content as per Felix' suggestion That should give us a faster quick overview, but keep some serious content in the starting page. I

Re: New project website

2015-05-12 Thread Alexander Alexandrov
PS. Is there a particular reason why the APIs are stacked above each other in the picture (ML on top of Gelly on top of the Table API)? I was actually picturing the three next to each other... 2015-05-12 12:08 GMT+02:00 Alexander Alexandrov alexander.s.alexand...@gmail.com: I suggest to change

Re: [DISCUSS] Naming and Functionality of Stream Operators and Tasks

2015-05-12 Thread Aljoscha Krettek
Every vote counts. :D On Tue, May 12, 2015 at 11:04 AM, Matthias J. Sax mj...@informatik.hu-berlin.de wrote: I like it. Not sure if my vote counts ;) On 05/12/2015 07:18 AM, Aljoscha Krettek wrote: My proposal for the runtime classes (per my Pull Request is this): StreamTask: base of

[jira] [Created] (FLINK-2003) Building on some encrypted filesystems leads to File name too long error

2015-05-12 Thread Theodore Vasiloudis (JIRA)
Theodore Vasiloudis created FLINK-2003: -- Summary: Building on some encrypted filesystems leads to File name too long error Key: FLINK-2003 URL: https://issues.apache.org/jira/browse/FLINK-2003

Re: New project website

2015-05-12 Thread Alexander Alexandrov
I suggest to change the layout of the bottom half in the following way (will solve the alignment issue): - 2 column layout in 1:1 ratio for *Getting Started*, 1st column with the text and the download button, second column with the maven code snippets - 2 column layout in 1:1 ratio for the

Re: [jira] [Created] (FLINK-1986) Group by fails on iterative data streams

2015-05-12 Thread Szabó Péter
The problem is that the StreamIterationHead is not created, because only IterativeDataStream.transform(...) can create it. groupBy() on an IterativeDataStream does not call transform(), hence the exception. All methods of DataStream that are supported for iterations and do not call

[Question]Test failed in cluster mode

2015-05-12 Thread Yi ZHOU
Hello, Thanks Andra for the Gaussian sequence generation. It is a little tricky, so I will leave this part for future work. I have run into another problem in the AffinityPropogation algorithm. I wrote some test code for it.

Flink on Tez Test stuck

2015-05-12 Thread Stephan Ewen
I have observed that a Flink-on-Tez test job stalls in two cases on the Travis CI server. https://travis-ci.org/StephanEwen/incubator-flink/jobs/62302207 It looks like a shuffle fetch is simply not continuing, but freezing. The stack traces suggest at first glance that this is actually a Tez

[jira] [Created] (FLINK-2005) Remove dependencies on Record APIs for flink-jdbc module

2015-05-12 Thread Henry Saputra (JIRA)
Henry Saputra created FLINK-2005: Summary: Remove dependencies on Record APIs for flink-jdbc module Key: FLINK-2005 URL: https://issues.apache.org/jira/browse/FLINK-2005 Project: Flink Issue

Re: Migrating our website from SVN to Git

2015-05-12 Thread Henry Saputra
Awesome work, thanks Max ! =) - Henry On Mon, May 11, 2015 at 9:56 AM, Maximilian Michels m...@apache.org wrote: We're now on Git for our website! Instructions for changing the website have been updated in the How to contribute guide:

Re: About Interplay of Merged Streams, Output Selectors and Checkpoint Barriers (and Watermarks)

2015-05-12 Thread Gyula Fóra
Hi, Checkpoint barriers are handled directly on top of the network layer and you are right, they work similarly: input channels are blocked until the barrier has arrived from all of them. A way of implementing this on the operator level would be by adding a way to ask the input reader the channel
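
The blocking behaviour described in this message can be illustrated with a minimal sketch. This is not Flink's actual barrier handling code; the class name BarrierAligner and its methods are hypothetical, and buffering of held-back records is left out. It only shows the bookkeeping: a channel is blocked once its barrier arrives, and all channels are released when every channel has delivered the barrier.

```java
// Hypothetical sketch of barrier alignment; names are illustrative only
// and do not correspond to classes in the Flink code base.
final class BarrierAligner {
    private final boolean[] blocked; // blocked[i] is true once channel i delivered the barrier
    private int blockedCount;

    BarrierAligner(int numChannels) {
        if (numChannels <= 0) {
            throw new IllegalArgumentException("numChannels must be positive");
        }
        this.blocked = new boolean[numChannels];
    }

    /** Returns true if records from this channel must be held back for now. */
    boolean isBlocked(int channel) {
        return blocked[channel];
    }

    /**
     * Registers a barrier on the given channel. Returns true once the barrier
     * has arrived on all channels; at that point every channel is unblocked.
     */
    boolean onBarrier(int channel) {
        if (!blocked[channel]) {
            blocked[channel] = true;
            blockedCount++;
        }
        if (blockedCount < blocked.length) {
            return false; // still waiting for barriers on other channels
        }
        // Alignment complete: release all channels for the next checkpoint.
        for (int i = 0; i < blocked.length; i++) {
            blocked[i] = false;
        }
        blockedCount = 0;
        return true;
    }
}
```

A reader built along these lines would skip any channel for which isBlocked returns true and would trigger the checkpoint when onBarrier returns true.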

Re: About Interplay of Merged Streams, Output Selectors and Checkpoint Barriers (and Watermarks)

2015-05-12 Thread Matthias J. Sax
Hi, I don't understand why we need the same mechanism twice in the code... Could checkpoint barriers and low watermarks be unified (or one built on top of / by using the other)? -Matthias On 05/12/2015 02:47 PM, Gyula Fóra wrote: Hi, Checkpoint barriers are handled directly on top of the network

Re: About Interplay of Merged Streams, Output Selectors and Checkpoint Barriers (and Watermarks)

2015-05-12 Thread Aljoscha Krettek
What Stephan mentioned is exactly how I'm planning to implement it, yes. How do the barriers work with chained tasks and OutputSelectors? Or is there no special-case code required? On Tue, May 12, 2015 at 2:53 PM, Gyula Fóra gyula.f...@gmail.com wrote: It's actually a very different mechanism as

Re: About Interplay of Merged Streams, Output Selectors and Checkpoint Barriers (and Watermarks)

2015-05-12 Thread Stephan Ewen
Watermarks also don't need to flush buffers; they can simply be queued in as special stream records, if we want to. On Tue, May 12, 2015 at 2:53 PM, Gyula Fóra gyula.f...@gmail.com wrote: It's actually a very different mechanism as watermarks will not block the computations On Tue, May

About Interplay of Merged Streams, Output Selectors and Checkpoint Barriers (and Watermarks)

2015-05-12 Thread Aljoscha Krettek
Hi Folks, as I said in the subject. How will this work? I'm in the process of thinking about how to implement low watermarks in Streaming. I'm thinking that the implementation should be quite similar to how the checkpointing barriers will be implemented since they also flush out stuff. Now I'm

Re: About Interplay of Merged Streams, Output Selectors and Checkpoint Barriers (and Watermarks)

2015-05-12 Thread Stephan Ewen
I would like to refrain from adding additional tasks as much as possible. I agree with Gyula that extending the reader to track watermarks and call a handler whenever the watermark advances would be a nice way to implement this. On Tue, May 12, 2015 at 2:40 PM, Aljoscha Krettek
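
The idea in this message, a reader that tracks watermarks and calls a handler whenever the watermark advances, can be sketched roughly as follows. This is only an illustration under assumptions: WatermarkTracker and its Handler interface are hypothetical names, not Flink classes, and the sketch assumes the combined low watermark is the minimum of the latest watermarks seen on each input channel.

```java
// Hypothetical sketch of per-channel watermark tracking; not Flink code.
final class WatermarkTracker {
    /** Callback invoked when the combined (minimum) watermark advances. */
    interface Handler {
        void onWatermarkAdvanced(long watermark);
    }

    private final long[] channelWatermarks; // latest watermark seen per channel
    private final Handler handler;
    private long current = Long.MIN_VALUE;  // combined low watermark so far

    WatermarkTracker(int numChannels, Handler handler) {
        if (numChannels <= 0) {
            throw new IllegalArgumentException("numChannels must be positive");
        }
        this.channelWatermarks = new long[numChannels];
        for (int i = 0; i < numChannels; i++) {
            channelWatermarks[i] = Long.MIN_VALUE; // nothing seen yet
        }
        this.handler = handler;
    }

    /** Records a watermark from one channel and notifies the handler on progress. */
    void onWatermark(int channel, long watermark) {
        if (watermark <= channelWatermarks[channel]) {
            return; // watermarks never move backwards; ignore stale values
        }
        channelWatermarks[channel] = watermark;
        long min = Long.MAX_VALUE;
        for (long w : channelWatermarks) {
            min = Math.min(min, w);
        }
        if (min > current) {
            current = min;
            handler.onWatermarkAdvanced(min);
        }
    }

    long current() {
        return current;
    }
}
```

Unlike barrier alignment, nothing is blocked here: records keep flowing and the handler fires only when the slowest channel catches up, which matches the point made in the thread that watermarks will not block the computation.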

Re: Gelly and ML for Streaming

2015-05-12 Thread Paris Carbone
Hi again Suminda! The stream ML api will be along the lines of the batch ML library and will have some interesting features. We consider re-using as much as possible from the batch ML (e.g. the same data structures and general abstractions etc.). Faye and Martha (CCed) are looking into

[jira] [Created] (FLINK-2002) Iterative test fails when run with other tests in the same environment

2015-05-12 Thread JIRA
Péter Szabó created FLINK-2002: -- Summary: Iterative test fails when run with other tests in the same environment Key: FLINK-2002 URL: https://issues.apache.org/jira/browse/FLINK-2002 Project: Flink

Re: [DISCUSS] Merging Storm compatibility to Flink-contrib

2015-05-12 Thread Stephan Ewen
+1 to merge it to contrib. We should keep a path open so that - inactive code gets dropped from contrib after a while - very active code may get its own project (inside or outside the Flink codebase, whatever fits better) On Tue, May 12, 2015 at 9:52 AM, Márton Balassi balassi.mar...@gmail.com

Re: [DISCUSS] Merging Storm compatibility to Flink-contrib

2015-05-12 Thread Matthias J. Sax
Hi, some UnsupportedOperationExceptions are required, because Storm interfaces are implemented but Flink cannot support that functionality. Some others are simply not implemented yet. A few others could be removed (in case an interface is not implemented, but only mimicked), by removing the

Re: Gelly and ML for Streaming

2015-05-12 Thread sirinath
In the streaming API you can gain efficiency because only some of the data changes. I am looking into moving-window calculations on real-time market data to generate trading signals