Re: Kafka Offset handling for Restart/failure scenarios.

2017-03-24 Thread Jins George
Currently org.apache.beam.runners.flink.FlinkPipelineOptions does not have a way to configure externalized checkpoints. Is that something on the roadmap for FlinkRunner? Thanks, Jins George On 03/23/2017 10:27 AM, Aljoscha Krettek wrote: For this you would use externalised checkpoints:
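For context on the thread above: in Flink itself (outside of Beam's FlinkPipelineOptions), externalized checkpoints require a checkpoint directory to be configured. A minimal sketch of the relevant flink-conf.yaml entry, with a purely illustrative path:

```yaml
# flink-conf.yaml -- directory where externalized checkpoint metadata is kept
# (the hdfs:// path below is an illustrative placeholder, not from the thread)
state.checkpoints.dir: hdfs:///flink/externalized-checkpoints
```

Enabling externalized checkpoints on a per-job basis additionally needs a call on Flink's CheckpointConfig; the thread's point is that Beam's FlinkPipelineOptions offered no knob for this at the time.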

Slack Channel Request

2017-03-24 Thread Prabeesh K.
Hi, Can someone please add me to the Apache Beam slack channel? Regards, Prabeesh K.

Re: guava collections and kryo under spark runner

2017-03-24 Thread Aviem Zur
Oh yes I see your second version now, that indeed reproduces the issue, thanks! I'll update the gist to include this change. On Fri, Mar 24, 2017 at 3:42 PM Antony Mayi wrote: > Hi Aviem, > > Apologies for the confusion - did you see my second version of the file I > sent

Re: guava collections and kryo under spark runner

2017-03-24 Thread Antony Mayi
Hi Aviem, Apologies for the confusion - did you see my second version of the file I sent shortly after the first one? That second one had the Row class included (using just "implements Serializable"). Thanks, a. On Friday, 24 March 2017, 13:36, Aviem Zur wrote: Hi

Re: guava collections and kryo under spark runner

2017-03-24 Thread Aviem Zur
Hi Antony, Thanks for sharing your code! I created a test that uses the exact pipeline. I couldn't find the `Row` class referred to in your pipeline so I created it as a POJO and registered its coder as `AvroCoder`. Unfortunately this test passes so it does not reproduce the issue you are

Re: guava collections and kryo under spark runner

2017-03-24 Thread Antony Mayi
Sorry, wrong version of the file. Now corrected: a. On Friday, 24 March 2017, 13:06, Antony Mayi wrote: Hi Aviem, it took me a while to narrow it down to a simple reproducible case but here it is. The problem appears to be related to Combine.globally(). Attached is

Re: HDFS IO - Sink implemented or not?

2017-03-24 Thread Jean-Baptiste Onofré
Hi Jonas, We made good improvements recently on the HDFS Sink. It's awaiting a refactoring of the IOChannelFactory that should happen pretty soon. So, basically, you can already use the HDFS Sink, but be aware that it's going to change pretty soon (and it will be flagged as "deprecated" as it is right

Re: guava collections and kryo under spark runner

2017-03-24 Thread Aviem Zur
Hi Antony. Spark uses serializers to serialize data; however, this clashes with Beam's concept of coders, so we should be using coders instead of Spark's serializer (specifically, in our configuration, Kryo is used as Spark's serializer). From your stack trace it seems that Kryo is being used to
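A minimal sketch of the approach discussed in this thread: steering Beam toward a coder (rather than Spark's Kryo serializer) for a custom POJO. The Row class below is hypothetical, reconstructed from the thread's description; the @DefaultCoder annotation is Beam's way of declaring which coder encodes the type, though the exact registration API has shifted across Beam versions.

```java
import java.io.Serializable;

import org.apache.beam.sdk.coders.AvroCoder;
import org.apache.beam.sdk.coders.DefaultCoder;

// Hypothetical POJO standing in for the Row class from the thread.
// @DefaultCoder tells Beam to encode this type with AvroCoder, so the
// Spark runner's Kryo serializer is not consulted for elements of this type.
@DefaultCoder(AvroCoder.class)
public class Row implements Serializable {
  public String key;
  public long value;

  // Avro's reflection-based schema inference needs a no-arg constructor.
  public Row() {}
}
```

An alternative to the annotation is registering the coder explicitly on the pipeline's CoderRegistry; either way, the point in the thread is that Beam coders, not Kryo, should govern element serialization.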

Re: guava collections and kryo under spark runner

2017-03-24 Thread Antony Mayi
I am on 0.6.0. Thx, a. On Friday, 24 March 2017, 8:58, Jean-Baptiste Onofré wrote: Hi Antony, which Beam version are you using? We did some improvement about Guava shading recently, wanted to check if it's related. Regards JB On 03/24/2017 08:03 AM, Antony Mayi