Kafka producer sink message loss?

2016-06-03 Thread Elias Levy
I am correct in assuming that the Kafka producer sink can lose message? I don't expect exactly-once semantics using Kafka as a sink given Kafka publishing guarantees, but I do expect at least once. I gather from reading the source that the producer is publishing messages asynchronously, as

WindowedStream aggregation methods pre-aggregate?

2016-05-27 Thread Elias Levy
Can someone confirm whether the org.apache.flink.streaming.api.scala.WindowedStream methods other than "apply" (e.g. "sum") perform pre-aggregation? The API docs are silent on this.

Re: How to perform this join operation?

2016-05-20 Thread Elias Levy
. In my data set keys come and go, and many will never be observed again. That will lead to continuous state growth over time. On Mon, May 2, 2016 at 6:06 PM, Elias Levy <fearsome.lucid...@gmail.com> wrote: > Thanks for the suggestion. I ended up implementing it a different way. > > [

Re: How to perform this join operation?

2016-05-03 Thread Elias Levy
Till, Thanks again for putting this together. It is certainly along the lines of what I want to accomplish, but I see some problem with it. In your code you use a ValueStore to store the priority queue. If you are expecting to store a lot of values in the queue, then you are likely to be using

TimeWindow overload?

2016-05-02 Thread Elias Levy
Looking over the code, I see that Flink creates a TimeWindow object each time the WindowAssigner is created. I have not yet tested this, but I am wondering if this can become problematic if you have a very long sliding window with a small slide, such as a 24 hour window with a 1 minute slide. It

Re: How to perform this join operation?

2016-05-02 Thread Elias Levy
Thanks for the suggestion. I ended up implementing it a different way. What is needed is a mechanism to give each stream a different window assigner, and then let Flink perform the join normally given the assigned windows. Specifically, for my use case what I need is a sliding window for one

Re: How to perform this join operation?

2016-04-14 Thread Elias Levy
tState. As I > understand the drawback is that the list state is not maintained in the > managed memory. > I'm interested to hear about the right way to implement this. > > On Wed, Apr 13, 2016 at 3:53 PM, Elias Levy <fearsome.lucid...@gmail.com> > wrote: > >> I am wondering h

Re: Does Kafka connector leverage Kafka message keys?

2016-04-13 Thread Elias Levy
On Wed, Apr 13, 2016 at 2:10 AM, Stephan Ewen wrote: > If you want to use Flink's internal key/value state, however, you need to > let Flink re-partition the data by using "keyBy()". That is because Flink's > internal sharding of state (including the re-sharding to adjust

Does Kafka connector leverage Kafka message keys?

2016-04-09 Thread Elias Levy
I am wondering if the Kafka connectors leverage Kafka message keys at all? Looking through the docs my impression is that it does not. E.g. if I use the connector to consume from a partitioned Kafka topic, what I will get back is a DataStream, rather than a KeyedStream. And if I want access to

Access an event's TimeWindow?

2016-04-08 Thread Elias Levy
Is there an API to access an event's time window? When you are computing aggregates over a time window, you usually want to output the window along with the aggregate. I could compute the Window on my own, but this seems like an API that should exist.

WindowedStream sum behavior

2016-04-08 Thread Elias Levy
I am wondering about the expected behavior of the sum method. Obviously it sums a specific field in a tuple or POJO. But what should one expect in other fields? Does sum keep the first field, last field or there aren't any guarantees?

WindowedStream operation questions

2016-04-07 Thread Elias Levy
An observation and a couple of question from a novice. The observation: The Flink web site makes available ScalaDocs for org.apache.flink.api.scala but not for org.apache.flink.streaming.api.scala. Now for the questions: Why can't you use map to transform a data stream, say convert all the

<    1   2