Ordering expectations of data

2016-08-12 Thread Bart Wyatt
Hello all, We have a Kafka topic with lots of partitions where data is partitioned by an upstream publisher on "session". In Flink we read this topic and another single-partition topic which contains configuration definitions for a little flatMap-based operation. We also do a little bit

Re: stream keyBy without repartition

2016-05-25 Thread Bart Wyatt
15:54 Bart Wyatt <bart.wy...@dsvolition.com> wrote: I will give this a shot this morning. Considering this and the other email "Does Kafka connector leverage Kafka message keys?" which also ends up talking about hacking around KeyedStream

enableObjectReuse and multiple downstream operators

2016-05-25 Thread Bart Wyatt
(For reference, I'm in 1.0.3) I have a job that looks like this: DataStream input = ... input .map(MapFunction...) .addSink(...); input .map(MapFunction...) .addSink(...); If I do not call enableObjectReuse() it works, if I do call enableObjectReuse() it

Re: stream keyBy without repartition

2016-05-25 Thread Bart Wyatt
of the pipeline that can be optimized. For example, given that you are concerned with the serialization overhead, it may be worth seeing if there are better alternatives to use. Kostas On May 24, 2016, at 4:22 PM, Bart Wyatt <bart.wy...@dsvolition.com>

stream keyBy without repartition

2016-05-24 Thread Bart Wyatt
(migrated from IRC) Hello All, My situation is this: I have a large amount of data partitioned in Kafka by "session" (natural partitioning). After I read the data, I would like to do as much as possible before incurring re-serialization or network traffic due to the size of the data. I