Re: Output from Beam (on Flink) to Kafka

2016-03-19 Thread Emanuele Cesena
Hi William, I had the same issue and hacked together a simple KafkaIO. It hasn’t been hard. Unfortunately right now I’m traveling with limited connection for the next 2 weeks or so, but here’s some advice. 0. Note that I worked on flink-dataflow, I have no idea how code has evolved into beam, b

Re: Force pipe executions to run on same node

2016-05-23 Thread Emanuele Cesena
Hi Ben, With (Beam over) Spark over Yarn, data locality is taken into account when containers are created to process data. So, in principle, say you have your cad file in hdfs on hostA, then the processing will likely happen on hostA or “near enough”. Supported localities vary from same process

Re: Beam presentation materials

2016-05-31 Thread Emanuele Cesena
tl;dr: I’m with Frances, Beam. My 2 cents. Having worked for 2 companies that tried to write their name lowercase, I strongly recommend to give up and just call it Beam. It’s a lost battle, nobody (especially press) will ever stick with the lowercase name. In addition, you’d have to distinguish

Re: [user] Announcing 0.1.0-incubating release

2016-06-15 Thread Emanuele Cesena
All right, congrats to all that worked hard on this first release! -- Emanuele Cesena Il corpo non ha ideali > On Jun 15, 2016, at 11:03, Davor Bonaci wrote: > > Hi everyone, > I’m happy to announce that Apache Beam has officially released its first > version – 0.1.

A quick demo of Apache Beam with Docker

2016-06-22 Thread Emanuele Cesena
Hi, I just published a "quick start" with Beam and wanted to share: https://medium.com/@ecesena/a-quick-demo-of-apache-beam-with-docker-da98b99a502a Related repos: https://github.com/ecesena/docker-beam-flink https://github.com/ecesena/beam-starter Any feedback is more than welcome! Best, E.

Re: A quick demo of Apache Beam with Docker

2016-06-22 Thread Emanuele Cesena
gt; of the project itself. We are not quite there yet, but perhaps you can think > about contributing in this area. > > Thanks, > Davor > > [1] > http://search.maven.org/#search%7Cga%7C1%7Cg%3Aorg.apache.beam%20archetypes > > On Wed, Jun 22, 2016 at 1:18 PM, Emanuele

Re: A quick demo of Apache Beam with Docker

2016-06-23 Thread Emanuele Cesena
sult to different docker depending of the backend. > > I'm working on new "concrete" Beam samples showing that: > > https://github.com/jbonofre/beam-samples > > Great work anyway ! > > Regards > JB > > On 06/22/2016 10:18 PM, Emanuele Cesena wrote: >

Re: A quick demo of Apache Beam with Docker

2016-06-23 Thread Emanuele Cesena
shows Beam with Flink. Maybe we can enhance a bit showing how the > same pipeline can result to different docker depending of the backend. > > I'm working on new "concrete" Beam samples showing that: > > https://github.com/jbonofre/beam-samples > > Great work anyway

Re: A quick demo of Apache Beam with Docker

2016-06-23 Thread Emanuele Cesena
n Thu, Jun 23, 2016 at 7:21 PM Emanuele Cesena wrote: > Thank you Aljoscha! > > > On Jun 23, 2016, at 1:19 AM, Aljoscha Krettek wrote: > > > > It's a very nice write up indeed! Thanks for sharing. :-) > > > > On Thu, 23 Jun 2016 at 07:35 Jean-Bapt

Re: A quick demo of Apache Beam with Docker

2016-06-28 Thread Emanuele Cesena
I know many companies (small-medium) > > companies that prefer spawning standalone per use case/s. I'm currently > > biased now towards large clusters because of my current work place ;) which > > relates better to my previous comment. > > > > > > On Fri, Jun 24,

Re: Event time processing with Flink runner and Kafka source

2016-07-08 Thread Emanuele Cesena
later in the pipeline fails. I decided to backtrack and look at >>> how the timestamps are even being assigned initially, since the Flink >>> source has no concept of the structure of my messages and thus shouldn’t >>> know how to assign any time at all. I found that it turns out

Re: Example: pass Runner at command line

2016-07-25 Thread Emanuele Cesena
olleagues, > Is there a simple genetic example where the Runner is passed at the command > line, the Beam code sets it in the generic Beam Options.set Runner(), and > Pipeline.create() is? > No mention of ANY specific Runner in the code like FlinkPipelineOptions . > > Thanks. &

Re: Example: pass Runner at command line

2016-07-26 Thread Emanuele Cesena
tra work. > A true "unified" example. > Did you solve your State issue? I had the same questions sometime ago. > For now, I use Redis to persist run-time state. Kinda poor man's way :-) > works for now, but doesn't scale as I want it. > Cheers > > From:

Re: Appropriate Kafka & Hadoop versions...

2016-08-11 Thread Emanuele Cesena
p; > eventually Spark. > Whats the appropriate versions of Kafka & Hadoop to install so there wont be > any potential short ckt? > > Thanks > Amir- -- Emanuele Cesena, Data Eng. http://www.shopkick.com Il corpo non ha ideali

Re: Appropriate Kafka & Hadoop versions...

2016-08-12 Thread Emanuele Cesena
link as well? > Cheers > > > Sent from my iPhone > >> On Aug 11, 2016, at 10:48 PM, Emanuele Cesena wrote: >> >> Hi Amir, >> >> Which version of Spark are you using, 1.6 or 2? >> >> On hadoop I don’t think there’s any issue and I’d go

Re: Some Compilation errors

2016-08-15 Thread Emanuele Cesena
Hi Amir, FlinkPipelineRunner has been renamed FlinkRunner, you have to change your source code. Something similar for the ParDo, it should be documented somewhere in the release notes (sorry, can't check, I'm on the go). Best, -- Emanuele Cesena Il corpo non ha ideali > On

Issues with simple KafkaIO-read pipeline -- where to write?

2016-09-18 Thread Emanuele Cesena
there another recommended “simple sink” to use? Thank you much! Best, -- Emanuele Cesena, Data Eng. http://www.shopkick.com Il corpo non ha ideali

Re: Issues with simple KafkaIO-read pipeline -- where to write?

2016-09-18 Thread Emanuele Cesena
gt; You have to use a window to create a bounded collection from an unbounded > source. > > Regards > JB > > On Sep 18, 2016, at 21:04, Emanuele Cesena wrote: > Hi, > > I wrote a while ago about a simple example I was building to test KafkaIO: > https://github.com/e

Re: Issues with simple KafkaIO-read pipeline -- where to write?

2016-09-18 Thread Emanuele Cesena
fka. However, the pipeline will finish once X number of > records are read. > > Regards > Sumit Chawla > > > On Sun, Sep 18, 2016 at 2:00 PM, Emanuele Cesena > wrote: > Hi, > > Thanks for the hint - I’ll debug better but I thought I did that: > https://git

Re: Issues with simple KafkaIO-read pipeline -- where to write?

2016-09-19 Thread Emanuele Cesena
dSource. I'm not if it's supported. > What's the stack trace ? Does the exception originate in Flink translation > code or Beam code ? That should provide a good hint. > > On Mon, Sep 19, 2016, 00:00 Emanuele Cesena wrote: > Hi, > > Thanks for the hint - I’l

Re: Issues with simple KafkaIO-read pipeline -- where to write?

2016-09-19 Thread Emanuele Cesena
I'm only mentioning UnboundedFlinkSink for completeness, I would not > recommend using it since your program will only work on the Flink runner. The > way to go, IMHO, would be to write to Kafka and then take the data from there > and ship it to some final location such as HDFS. &

Re: Issues with simple KafkaIO-read pipeline -- where to write?

2016-09-19 Thread Emanuele Cesena
> > Regards > JB > > On 09/19/2016 05:59 PM, Emanuele Cesena wrote: >> Hi, >> >> This is a great insight. Is there any plan to support unbounded sink in Beam? >> >> On the temp kafka->kafka solution, this is exactly what we’re doing (and I >

Re: Strata+Hadoop World

2016-09-29 Thread Emanuele Cesena
ses and sample solutions you can do on your own. > > Thanks, > > Jesse -- Emanuele Cesena, Data Eng. http://www.shopkick.com Il corpo non ha ideali