Hi William,
I had the same issue and hacked together a simple KafkaIO. It wasn’t hard.
Unfortunately right now I’m traveling with limited connection for the next 2
weeks or so, but here’s some advice.
0. Note that I worked on flink-dataflow; I have no idea how the code has evolved
into Beam, b
Hi Ben,
With (Beam over) Spark over Yarn, data locality is taken into account when
containers are created to process data.
So, in principle, say you have your cad file in HDFS on hostA, then the
processing will likely happen on hostA or “near enough”.
Supported localities vary from same process
tl;dr: I’m with Frances, Beam.
My 2 cents. Having worked for 2 companies that tried to write their name
lowercase, I strongly recommend giving up and just calling it Beam. It’s a lost
battle; nobody (especially the press) will ever stick with the lowercase name. In
addition, you’d have to distinguish
All right, congrats to all that worked hard on this first release!
--
Emanuele Cesena
Il corpo non ha ideali
> On Jun 15, 2016, at 11:03, Davor Bonaci wrote:
>
> Hi everyone,
> I’m happy to announce that Apache Beam has officially released its first
> version – 0.1.
Hi,
I just published a "quick start" with Beam and wanted to share:
https://medium.com/@ecesena/a-quick-demo-of-apache-beam-with-docker-da98b99a502a
Related repos:
https://github.com/ecesena/docker-beam-flink
https://github.com/ecesena/beam-starter
Any feedback is more than welcome!
Best,
E.
> of the project itself. We are not quite there yet, but perhaps you can think
> about contributing in this area.
>
> Thanks,
> Davor
>
> [1]
> http://search.maven.org/#search%7Cga%7C1%7Cg%3Aorg.apache.beam%20archetypes
>
> On Wed, Jun 22, 2016 at 1:18 PM, Emanuele
sult in different Docker images depending on the backend.
>
> I'm working on new "concrete" Beam samples showing that:
>
> https://github.com/jbonofre/beam-samples
>
> Great work anyway!
>
> Regards
> JB
>
> On 06/22/2016 10:18 PM, Emanuele Cesena wrote:
>
shows Beam with Flink. Maybe we can enhance it a bit by showing how the
> same pipeline can result in different Docker images depending on the backend.
>
> I'm working on new "concrete" Beam samples showing that:
>
> https://github.com/jbonofre/beam-samples
>
> Great work anyway
On Thu, Jun 23, 2016 at 7:21 PM Emanuele Cesena wrote:
> Thank you Aljoscha!
>
> > On Jun 23, 2016, at 1:19 AM, Aljoscha Krettek wrote:
> >
> > It's a very nice write up indeed! Thanks for sharing. :-)
> >
> > On Thu, 23 Jun 2016 at 07:35 Jean-Bapt
I know many (small-medium)
> > companies that prefer spawning standalone clusters per use case. I'm currently
> > biased towards large clusters because of my current workplace ;) which
> > relates better to my previous comment.
> >
> >
> > On Fri, Jun 24,
later in the pipeline fails. I decided to backtrack and look at
>>> how the timestamps are even being assigned initially, since the Flink
>>> source has no concept of the structure of my messages and thus shouldn’t
>>> know how to assign any time at all. I found that it turns out
Colleagues,
> Is there a simple generic example where the Runner is passed at the command
> line, the Beam code sets it via the generic PipelineOptions.setRunner(), and
> Pipeline.create() is called?
> No mention of ANY specific Runner in the code like FlinkPipelineOptions.
>
> Thanks.
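A minimal runner-agnostic sketch of what the question asks for (class name and element values here are hypothetical; only the `--runner` flag on the command line selects the backend, and no runner class appears in the code):

```java
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Create;

public class RunnerAgnosticExample {
  public static void main(String[] args) {
    // --runner=FlinkRunner (or DirectRunner, SparkRunner, ...) is parsed
    // from the command line; no runner-specific class is referenced here.
    PipelineOptions options =
        PipelineOptionsFactory.fromArgs(args).withValidation().create();
    Pipeline p = Pipeline.create(options);
    p.apply(Create.of("hello", "beam"));
    p.run();
  }
}
```

Run it with e.g. `--runner=FlinkRunner`; the chosen runner's jar just needs to be on the classpath.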
tra work.
> A true "unified" example.
> Did you solve your State issue? I had the same questions sometime ago.
> For now, I use Redis to persist run-time state. Kinda poor man's way :-)
> works for now, but doesn't scale as I want it to.
> Cheers
>
> From:
> eventually Spark.
> What are the appropriate versions of Kafka & Hadoop to install so there won't be
> any potential short circuit?
>
> Thanks
> Amir-
--
Emanuele Cesena, Data Eng.
http://www.shopkick.com
Il corpo non ha ideali
link as well?
> Cheers
>
>
> Sent from my iPhone
>
>> On Aug 11, 2016, at 10:48 PM, Emanuele Cesena wrote:
>>
>> Hi Amir,
>>
>> Which version of Spark are you using, 1.6 or 2?
>>
>> On hadoop I don’t think there’s any issue and I’d go
Hi Amir,
FlinkPipelineRunner has been renamed to FlinkRunner, so you have to change your
source code. Something similar applies to ParDo; it should be documented somewhere
in the release notes (sorry, can't check, I'm on the go).
Best,
--
Emanuele Cesena
Il corpo non ha ideali
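For anyone hitting the same error, the rename amounts to swapping the class name when setting the runner (a sketch, assuming the Flink runner module is on the classpath):

```java
import org.apache.beam.runners.flink.FlinkRunner;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;

public class FlinkRunnerRename {
  public static void main(String[] args) {
    PipelineOptions options = PipelineOptionsFactory.fromArgs(args).create();
    // Before the rename this was FlinkPipelineRunner.class:
    options.setRunner(FlinkRunner.class);
  }
}
```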
> On
Is there another recommended “simple sink” to use?
Thank you much!
Best,
--
Emanuele Cesena, Data Eng.
http://www.shopkick.com
Il corpo non ha ideali
> You have to use a window to create a bounded collection from an unbounded
> source.
>
> Regards
> JB
>
> On Sep 18, 2016, at 21:04, Emanuele Cesena wrote:
> Hi,
>
> I wrote a while ago about a simple example I was building to test KafkaIO:
> https://github.com/e
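JB's point about windowing the unbounded source can be sketched like this (broker address and topic name are placeholders, and the deserializer-based KafkaIO setters follow later SDK versions, so the exact method names may differ in 0.x):

```java
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.kafka.KafkaIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Count;
import org.apache.beam.sdk.transforms.windowing.FixedWindows;
import org.apache.beam.sdk.transforms.windowing.Window;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PCollection;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.joda.time.Duration;

public class WindowedKafkaCount {
  public static void main(String[] args) {
    Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());
    PCollection<KV<String, String>> records =
        p.apply(KafkaIO.<String, String>read()
            .withBootstrapServers("localhost:9092")   // placeholder
            .withTopic("beamdemo")                    // placeholder
            .withKeyDeserializer(StringDeserializer.class)
            .withValueDeserializer(StringDeserializer.class)
            .withoutMetadata());
    // Windowing chops the unbounded stream into finite per-window panes,
    // so aggregations like Count can actually emit results.
    records
        .apply(Window.<KV<String, String>>into(
            FixedWindows.of(Duration.standardMinutes(1))))
        .apply(Count.perKey());
    p.run();
  }
}
```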
fka. However, the pipeline will finish once X number of
> records are read.
>
> Regards
> Sumit Chawla
>
>
> On Sun, Sep 18, 2016 at 2:00 PM, Emanuele Cesena
> wrote:
> Hi,
>
> Thanks for the hint - I’ll debug better but I thought I did that:
> https://git
dSource. I'm not sure if it's supported.
> What's the stack trace ? Does the exception originate in Flink translation
> code or Beam code ? That should provide a good hint.
>
> On Mon, Sep 19, 2016, 00:00 Emanuele Cesena wrote:
> Hi,
>
> Thanks for the hint - I’l
I'm only mentioning UnboundedFlinkSink for completeness, I would not
> recommend using it since your program will only work on the Flink runner. The
> way to go, IMHO, would be to write to Kafka and then take the data from there
> and ship it to some final location such as HDFS.
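The write-to-Kafka half of that suggestion might look like the following (broker, topic, and the `output` collection are placeholders, and the serializer-based setters are from later KafkaIO versions):

```java
import org.apache.beam.sdk.io.kafka.KafkaIO;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PCollection;
import org.apache.kafka.common.serialization.StringSerializer;

public class KafkaSinkSketch {
  static void writeOut(PCollection<KV<String, String>> output) {
    // Runner-independent sink: results land in Kafka, and a separate
    // consumer job then ships them to a final location such as HDFS.
    output.apply(KafkaIO.<String, String>write()
        .withBootstrapServers("localhost:9092")   // placeholder
        .withTopic("results")                     // placeholder
        .withKeySerializer(StringSerializer.class)
        .withValueSerializer(StringSerializer.class));
  }
}
```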
>
> Regards
> JB
>
> On 09/19/2016 05:59 PM, Emanuele Cesena wrote:
>> Hi,
>>
>> This is a great insight. Is there any plan to support unbounded sinks in Beam?
>>
>> On the temp kafka->kafka solution, this is exactly what we’re doing (and I
>
ses and sample solutions you can do on your own.
>
> Thanks,
>
> Jesse
--
Emanuele Cesena, Data Eng.
http://www.shopkick.com
Il corpo non ha ideali