Re: Scope of windows?

2019-04-30 Thread Robert Bradshaw
In the original version of the dataflow model, windowing was not annotated on each PCollection, rather it was inferred based on tracing up the graph to the latest WindowInto operation. This tracing logic was put in the SDK for simplicity. I agree that there is room for a variety of SDK/DSL choices

Re: Will Beam add any overhead or lack certain API/functions available in Spark/Flink?

2019-04-30 Thread Kenneth Knowles
It is worth noting that Beam isn't solely a portability layer that exposes underlying API features, but a feature-rich layer in its own right, with carefully coherent abstractions. For example, quite early on the SparkRunner supported streaming aspects of the Beam model - watermarks, windowing, tri

Re: Scope of windows?

2019-04-30 Thread Kenneth Knowles
+dev@, since this has taken a turn in that direction. SDK/DSL consistency is nice. But each SDK/DSL being the best thing it can be is more important IMO. I'm including DSLs to be clear that this is a construction issue having little/nothing to do with the SDK in the sense of the per-run-time coprocessor…

Re: Will Beam add any overhead or lack certain API/functions available in Spark/Flink?

2019-04-30 Thread kant kodali
Staying behind doesn't imply one is better than the other, and I didn't mean that in any way, but I fail to see how an abstraction framework like Beam can stay ahead of the underlying execution engines. For example, if a new feature is added to the underlying execution engine that doesn't fit the…

Re: Beam Summit at ApacheCon

2019-04-30 Thread Austin Bennett
Hi Users and Devs, The CfP deadline approaches. Do submit your technical and/or use case talks, etc. Feel free to reach out if you have any questions. Cheers, Austin On Tue, Apr 23, 2019 at 2:49 AM Maximilian Michels wrote: > Hi Austin, > > Thanks for the heads-up! I just want to highlight…

Re: How to deserialise Avro messages from Kafka using KafkaAvroDeserializer

2019-04-30 Thread Yohei Onishi
Thanks. I have already found the second solution in the mail archive. I also provided it as an answer to this question: https://stackoverflow.com/a/55917157/1872639 Yohei Onishi On Tue, Apr 30, 2019 at 10:13 PM Alexey Romanenko wrote: > Hi Yohei, > > In general, this code is correct but it requires…

Re: How to deserialise Avro messages from Kafka using KafkaAvroDeserializer

2019-04-30 Thread Alexey Romanenko
Hi Yohei, In general, this code is correct, but it requires additional casting and extracting the result of KafkaIO.read() into a local variable, like this: KafkaIO.Read read = KafkaIO.read().withKeyDeserializer(LongDeserializer.class) .withValueDeserializerAndCoder((Class) KafkaAvroDeserializer…
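
A fuller sketch of the workaround outlined above, assuming Confluent's KafkaAvroDeserializer, GenericRecord values, and a schema registry. The bootstrap servers, topic, and registry URL are placeholders, and the Avro schema is passed in by the caller.

    import java.util.Collections;
    import io.confluent.kafka.serializers.KafkaAvroDeserializer;
    import org.apache.avro.Schema;
    import org.apache.avro.generic.GenericRecord;
    import org.apache.beam.sdk.coders.AvroCoder;
    import org.apache.beam.sdk.io.kafka.KafkaIO;
    import org.apache.kafka.common.serialization.Deserializer;
    import org.apache.kafka.common.serialization.LongDeserializer;

    public class KafkaAvroReadSketch {
      @SuppressWarnings("unchecked")
      static KafkaIO.Read<Long, GenericRecord> buildRead(Schema valueSchema) {
        // KafkaAvroDeserializer implements Deserializer<Object>, so its class
        // literal has to be cast to the value type KafkaIO expects, and a coder
        // must be supplied explicitly because it cannot be inferred.
        return KafkaIO.<Long, GenericRecord>read()
            .withBootstrapServers("localhost:9092")   // placeholder
            .withTopic("my-topic")                    // placeholder
            .withKeyDeserializer(LongDeserializer.class)
            .withValueDeserializerAndCoder(
                (Class<? extends Deserializer<GenericRecord>>)
                    (Class<?>) KafkaAvroDeserializer.class,
                AvroCoder.of(GenericRecord.class, valueSchema))
            .updateConsumerProperties(
                Collections.<String, Object>singletonMap(
                    "schema.registry.url", "http://localhost:8081"));
      }
    }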

Re: Will Beam add any overhead or lack certain API/functions available in Spark/Flink?

2019-04-30 Thread Maximilian Michels
"I wouldn't say one is, or will always be, in front of or behind another." That's a great way to phrase it. I think it is very common to jump to the conclusion that one system is better than the other. In reality it's often much more complicated. For example, one of the things Beam has focused on…

Re: Scope of windows?

2019-04-30 Thread Maximilian Michels
While it might be debatable whether "continuation triggers" are part of the model, the goal should be to provide a consistent experience across SDKs. I don't see a reason why the Java SDK would use continuation triggers while the Python SDK doesn't. This makes me think that trigger behavior across…
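
A rough Java SDK fragment of the situation under discussion, assuming `input` is an unbounded PCollection of timestamped strings: the trigger is declared only on the first aggregation, and in the Java SDK the second aggregation runs with a derived "continuation" trigger instead of requiring the trigger to be restated.

    import org.apache.beam.sdk.transforms.Count;
    import org.apache.beam.sdk.transforms.Values;
    import org.apache.beam.sdk.transforms.windowing.AfterProcessingTime;
    import org.apache.beam.sdk.transforms.windowing.AfterWatermark;
    import org.apache.beam.sdk.transforms.windowing.FixedWindows;
    import org.apache.beam.sdk.transforms.windowing.Window;
    import org.apache.beam.sdk.values.KV;
    import org.apache.beam.sdk.values.PCollection;
    import org.joda.time.Duration;

    public class ContinuationTriggerSketch {
      static PCollection<KV<Long, Long>> countTwice(PCollection<String> input) {
        // The trigger is declared here, once.
        PCollection<KV<String, Long>> firstCounts =
            input
                .apply(Window.<String>into(FixedWindows.of(Duration.standardMinutes(1)))
                    .triggering(
                        AfterWatermark.pastEndOfWindow()
                            .withEarlyFirings(
                                AfterProcessingTime.pastFirstElementInPane()
                                    .plusDelayOf(Duration.standardSeconds(10))))
                    .withAllowedLateness(Duration.ZERO)
                    .discardingFiredPanes())
                .apply(Count.perElement());

        // No trigger is restated here. In the Java SDK this second aggregation
        // runs with the continuation of the trigger above, so early panes keep
        // flowing downstream; whether every SDK should behave the same way is
        // the question raised in this thread.
        return firstCounts
            .apply(Values.<Long>create())
            .apply(Count.perElement());
      }
    }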

Re: Will Beam add any overhead or lack certain API/functions available in Spark/Flink?

2019-04-30 Thread Robert Bradshaw
Though we all certainly have our biases, I think it's fair to say that all of these systems are constantly innovating, borrowing ideas from one another, and have their strengths and weaknesses. I wouldn't say one is, or will always be, in front of or behind another. Take, as the given example, Spark…