Re: Result comparison from 2 DataStream Sources

2016-05-26 Thread iñaki williams
Hi again! Exactly, I am working on the final project of my degree, and I am using Apache Flink to look for surebets, only checking live tennis matches. Thanks for your tips, I will try it and report back with the outcomes. Thanks! 2016-05-26 20:33 GMT+02:00 Konstantin Knauf

Re: Result comparison from 2 DataStream Sources

2016-05-26 Thread Konstantin Knauf
Hi, interesting use case, you are looking for sure bets, I guess ;) Well, I think what you want to do then is probably to use a ConnectedStream, which you keyBy on the "name" of both streams. Then you can use a CoFlatMap for the comparison, and a KeyValueState to save prices. In each map you can
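[The approach suggested above — connect the two streams, keyBy the match name, and compare prices in a CoFlatMap with keyed state — can be sketched as follows. This is a simplified, self-contained illustration of the comparison logic only: the Flink ConnectedStream/CoFlatMapFunction wiring is omitted, a plain HashMap stands in for Flink's per-key state, and the class and method names are illustrative.]

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Sketch of the CoFlatMap comparison logic: keep the latest price per
// match name from each source, and compare whenever both are present.
// In real Flink code, onSourceA/onSourceB correspond to flatMap1/flatMap2
// of a CoFlatMapFunction, and the maps would be keyed ValueState.
public class OddsComparator {
    private final Map<String, Double> pricesA = new HashMap<>();
    private final Map<String, Double> pricesB = new HashMap<>();

    // Called for elements of the first stream.
    public Optional<Double> onSourceA(String name, double price) {
        pricesA.put(name, price);
        return compare(name);
    }

    // Called for elements of the second stream.
    public Optional<Double> onSourceB(String name, double price) {
        pricesB.put(name, price);
        return compare(name);
    }

    // Emits the price difference once both sources have reported this match.
    private Optional<Double> compare(String name) {
        Double a = pricesA.get(name);
        Double b = pricesB.get(name);
        if (a == null || b == null) {
            return Optional.empty();
        }
        return Optional.of(a - b);
    }

    public static void main(String[] args) {
        OddsComparator cmp = new OddsComparator();
        // Only one side has reported, so there is nothing to compare yet.
        System.out.println(cmp.onSourceA("Nadal-Federer", 1.8).isPresent()); // false
        // Once the second site reports, the price gap is emitted.
        System.out.println(cmp.onSourceB("Nadal-Federer", 2.1).orElse(0.0));
    }
}
```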

Re: Result comparison from 2 DataStream Sources

2016-05-26 Thread iñaki williams
Hi! I will explain it with more details: I am comparing real-time sports odds from two different betting webpages. Assuming that I get just one Java object per DataStream (in reality I should get a List of in-play matches), and assuming that the name is the same, of course, what I want to do
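[For context, the surebet condition behind this use case reduces to simple arithmetic: for a two-outcome market such as a tennis match, taking the best decimal odds for each player across the two sites, an arbitrage exists when the inverse odds sum to less than 1. A minimal sketch, with illustrative method name and example odds:]

```java
// Surebet check for a two-outcome market: an arbitrage exists when
// 1/oddsP1 + 1/oddsP2 < 1, because a stake split proportional to the
// inverse odds then guarantees the same payout whichever player wins.
public class SurebetCheck {

    // Returns the guaranteed profit margin (e.g. 0.04 = 4%),
    // or a negative value when no surebet exists.
    static double margin(double oddsP1, double oddsP2) {
        double inverseSum = 1.0 / oddsP1 + 1.0 / oddsP2;
        return 1.0 / inverseSum - 1.0;
    }

    public static void main(String[] args) {
        // Site A offers 2.10 on player 1, site B offers 2.05 on player 2:
        // 1/2.10 + 1/2.05 ≈ 0.964 < 1, so this is a surebet.
        System.out.printf("margin = %.4f%n", margin(2.10, 2.05));
    }
}
```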

Re: Result comparison from 2 DataStream Sources

2016-05-26 Thread Konstantin Knauf
Hi, let me first check if I understand your requirements correctly. I assume you want to compare the price attribute only for objects with the same name, right? Further, I assume the objects are some kind of offer/bid with a timestamp? I think the solution heavily depends on how the records, which

Re: Collect output of transformations on a custom source in real time

2016-05-26 Thread Stephan Ewen
Hi! I am not sure I understand the problem exactly, but one problem I see in your code is that you call "env.execute()" and then "DataStreamUtils.collect(datastream);". The first call to "env.execute()" will start the program (source and filter) and the results will simply go nowhere. Then you

Re: whats is the purpose or impact of -yst( --yarnstreaming ) argument

2016-05-26 Thread prateekarora
Thanks for the information Regards Prateek -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/whats-is-the-purpose-or-impact-of-yst-yarnstreaming-argument-tp7183p7206.html Sent from the Apache Flink User Mailing List archive. mailing list

Re: How to perform multiple stream join functionality

2016-05-26 Thread prateekarora
Hi Thanks for the information. It would be good if in the future you provide an API to implement such use cases more pleasantly. Regards Prateek -- View this message in context:

Result comparison from 2 DataStream Sources

2016-05-26 Thread iñaki williams
Hi! I am working on something quite similar to the stockPrice example that is posted on the webpage ( https://flink.apache.org/news/2015/02/09/streaming-example.html) I am extracting some data from 2 different webpages and I represent the result using a java object. The diagram could be

Collect output of transformations on a custom source in real time

2016-05-26 Thread Ahmed Nader
Hello, I have defined a custom source function for an infinite stream source, where in my overridden run() method I have a while(true) loop to keep listening for the input. I want to apply some transformations on the resulting DataStream from my source and collect the output so far of these

Re: Dynamic partitioning for stream output

2016-05-26 Thread Aljoscha Krettek
Hi, while I think it would be possible to do it by creating a "meta sink" that contains several RollingSinks, I think the approach of integrating it into the current RollingSink is better. I think it's mostly a question of style and architectural purity but also of resource consumption and

Re: Debugging watermarks?

2016-05-26 Thread Niels Basjes
Thanks guys, Using the above code as a reference I was quickly able to find the problems in my code. Niels Basjes On Sun, May 22, 2016 at 2:00 PM, Stephan Ewen wrote: > Hi Niels! > > It may also be interesting for you to know that with the extension of the > metrics and the

Re: Incremental updates

2016-05-26 Thread Aljoscha Krettek
Hi, newly added nodes would sit idle, yes. Only when we finish the rescaling work mentioned in the link will we be able to dynamically adapt. The internal implementation of this will in fact hash keys to a larger number of partitions than the number of individual partitions and use these "key
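[The key-group idea described above — hash keys to a fixed, larger number of partitions and redistribute those partitions, not individual keys, when rescaling — can be illustrated with a small sketch. This is a simplified model of the scheme, not Flink's real internals (which, among other things, use a different hash function); names are illustrative.]

```java
public class KeyGroupAssignment {

    // Assign a key to one of maxParallelism "key groups". The group a key
    // belongs to never changes, regardless of the current parallelism.
    static int keyGroupFor(Object key, int maxParallelism) {
        return Math.floorMod(key.hashCode(), maxParallelism);
    }

    // Map a key group onto one of the currently running operator instances.
    // Only this mapping changes when the job is rescaled, so whole groups
    // of keys (and their state) move between instances, not single keys.
    static int operatorFor(int keyGroup, int maxParallelism, int parallelism) {
        return keyGroup * parallelism / maxParallelism;
    }

    public static void main(String[] args) {
        int maxParallelism = 128;
        int group = keyGroupFor("match-42", maxParallelism);
        // Same key group, different operator assignment at different scales.
        System.out.println(operatorFor(group, maxParallelism, 2));
        System.out.println(operatorFor(group, maxParallelism, 4));
    }
}
```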

Re: Logging with slf4j

2016-05-26 Thread Stephan Ewen
Hi! If the loggers are configured to log to files, all your output will show up in the ".log" files in the "log" directory. If the loggers are configured to log to console, all your output will show up in the ".out" files in the "log" directory. There is no functionality built in to take log
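[A minimal log4j.properties along the lines described above — all loggers routed to a file appender, so output lands in the ".log" files under the log directory — might look as follows. This is an illustrative sketch: the appender name is arbitrary, and it assumes a setup (as in Flink's shipped configuration of that era) where the scripts pass the target file as -Dlog.file.]

```
# Illustrative log4j.properties: route all loggers to a file appender.
log4j.rootLogger=INFO, file

log4j.appender.file=org.apache.log4j.FileAppender
log4j.appender.file.file=${log.file}
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{HH:mm:ss,SSS} %-5p %c - %m%n

# Switching to org.apache.log4j.ConsoleAppender instead would make the
# output show up in the ".out" files in the log directory.
```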

Re: Incremental updates

2016-05-26 Thread Malgorzata Kudelska
Hi, So is there any possibility to utilize an extra node that joins the cluster or will it remain idle? What if I use a custom key function that matches the key variable to a number of keys bigger than the initial number of nodes (following the idea from your link)? What about running flink on

Re: Weird Kryo exception (Unable to find class: java.ttil.HashSet)

2016-05-26 Thread Flavio Pompermaier
Still not able to reproduce the error locally, only remotely :) Any suggestions about how to try to reproduce it locally on a subset of the data? This time I had: com.esotericsoftware.kryo.KryoException: Unable to find class: ^Z^A at

Re: Apache Beam and Flink

2016-05-26 Thread Maximilian Michels
Small addition: The Flink Runner translates into the DataSet or DataStream API depending on the "streaming" flag of the PipelineOptions. The default mode is batch. Ultimately, this flag will be removed and replaced with an automated decision depending on the sources used. On Thu, May 26, 2016 at

Re: Logging with slf4j

2016-05-26 Thread simon peyer
Hi, I'm using log4j, running locally in cluster mode with sh start_cluster.sh. No warnings in the command line. Please find attached the logging files (log4j-cli.properties). Where am I supposed to find the logging information? In the log directory, right? Cheers Simon

Re: whats is the purpose or impact of -yst( --yarnstreaming ) argument

2016-05-26 Thread Maximilian Michels
The "-yst" or "-yarnstreaming" parameter doesn't have an effect anymore because the streaming mode has been removed. I filed an issue some weeks ago: https://issues.apache.org/jira/browse/FLINK-3890 On Wed, May 25, 2016 at 10:27 PM, Aljoscha Krettek wrote: > Hi Prateek, >

Re: How to perform this join operation?

2016-05-26 Thread Till Rohrmann
Hi Elias, I like the idea of having a trailing / sliding window assigner to perform your join. However, the result should not be entirely correct wrt your initial join specification. Given an events data set which contains the elements e1 = (4000, 1, 1) and e2 = (4500, 2, 2) and a changes data

Re: How to perform multiple stream join functionality

2016-05-26 Thread Aljoscha Krettek
Hi, I think it is currently the right approach. I hope that we can in the future provide APIs to make such cases more pleasant to implement. Cheers, Aljoscha On Wed, 25 May 2016 at 22:13 prateekarora wrote: > Hi > > I am trying to port my spark application in flink.

Re: Apache Beam and Flink

2016-05-26 Thread Slim Baltagi
Hi Ashutosh There is a related open JIRA: Enable DataSet and DataStream Joins https://issues.apache.org/jira/browse/FLINK-2320 Slim > On May 26, 2016, at 3:05 AM, Fabian Hueske wrote: > > No, that is not supported yet.

Re: Apache Beam and Flink

2016-05-26 Thread Fabian Hueske
No, that is not supported yet. Beam provides a common API, but the Flink runner translates programs against batch sources into DataSet API programs and programs against streaming sources into DataStream programs. It is not possible to mix both. 2016-05-26 10:00 GMT+02:00 Ashutosh Kumar

Re: Apache Beam and Flink

2016-05-26 Thread Ashutosh Kumar
Thanks. So if we use the Beam API with the Flink engine, can we get interaction between batch and stream? As far as I know, currently in Flink a DataSet and a DataStream cannot talk to each other. Is this correct? Thanks Ashutosh On Thu, May 26, 2016 at 1:09 PM, Slim Baltagi wrote: > Hi Ashutosh > >

Re: Apache Beam and Flink

2016-05-26 Thread Slim Baltagi
Hi Ashutosh Apache Beam provides a unified API for batch and streaming. It also supports multiple ‘runners’: local, Apache Spark, Apache Flink and Google Cloud Dataflow (a commercial service). It is not an alternative to Flink because it is an API and you still need an execution engine. It can

Apache Beam and Flink

2016-05-26 Thread Ashutosh Kumar
How does Apache Beam fit with Flink? Is it an alternative to Flink or complementary to it? Thanks Ashutosh