Re: [Call for Speakers] Apache Beam Meetup @ San Francisco, CA on Oct. 24th
Got it. I can def look into recording the talks and sharing them later; in case that's not possible, I'll definitely share the slides. I can also look into organizing something in Seattle later on :)
Re: [Call for Speakers] Apache Beam Meetup @ San Francisco, CA on Oct. 24th
It's just that I'm in Seattle (and I guess a lot of us are not in SF). The talks seem pretty interesting. :)

Can you help share the slides after the talk then?

Thanks,
Derek

--
Derek Hao Hu
Software Engineer | Snapchat
Snap Inc.
Re: [Call for Speakers] Apache Beam Meetup @ San Francisco, CA on Oct. 24th
Hi Derek - I could look into that but so far we're not planning on it. Would you like to watch them later?
Re: Slack invitation
Invited, welcome to join.
Slack invitation
Hi, could you please invite me to the Slack channel? Thanks in advance /Artur
Re: October Apache Beam Newsletter
Cool, very informational, thanks!
Re: [Call for Speakers] Apache Beam Meetup @ San Francisco, CA on Oct. 24th
Hi Griselda,

Will the talks be recorded?

Thanks,
Derek
[Call for Speakers] Apache Beam Meetup @ San Francisco, CA on Oct. 24th
Hi everyone,

I'm reaching out because I'm organizing a Meetup in San Francisco, California, on October 24th, and I'm looking for speakers. This is the agenda:

*Talk 1: Apache Beam Overview*
*Talk 2: Exactly-once processing with Dataflow*

We will have 30 min for each talk + 15 min of Q&A. The event will be hosted at Google San Francisco and will be promoted on the San Francisco Cloud Mafia Meetup group.

*Request: If you're interested in giving either of these two talks, reply to this message and we can arrange details.* The date is a bit flexible in case someone is interested but can't make the date.

Thanks!
G
October Apache Beam Newsletter
Hi Apache Beam Community,

Our first Apache Beam Newsletter is here! I'm sharing a table of contents of what's in this edition, which covers everything that has happened in the project from June 2017 until October 2017.

You can find the full content in this Google Doc: https://docs.google.com/document/d/1BbpQne-9ng93G-_-UKH2C4UNafcEQcLALtu38qsXfI8/edit?usp=sharing

Enjoy!

* * * * * October 2017 Newsletter Table of Contents * * * * *

>> What's Been Done
- Beam SQL DSL APIs
- Nexmark
- Splittable DoFn
- Improvements to reading and writing files
- Improvements to BigQueryIO
- New I/O connectors
- Docker development images and reproducible-builds
- Website updates: New Beam Execution Model page & the Mobile Gaming Walkthrough was updated w/ new sample code for Python

>> What We Are Working On
- Portability
- Splittable DoFn for Python SDK
- Website: Improve CoGroupByKey docs & website navigation/usability

>> What's Planned
- FileIO.write()

>> New Members (Welcome!)
- Daniel Harper, BBC, London (UK)

>> Talks & Meetups
- Talks @ YOW Data Sydney and Strata NY
- Meetups @ London & dinner in NY
- Speakers & Meetup Founders group

>> Resources
- Capability Matrix
- Contribution Guide
- Featured talk & sample intro talk deck
Re: Duplicate metric names when using Flink runner + Graphite reporter
Hey Aljoscha,

Thanks for replying! Yes, I have three TextIO sources and two HBaseIO sources :) I believe duplicate naming also occurs for the intermediate transforms, though. The node names that are assigned to the "user-level" transforms, I believe, never make it to Beam's primitive transforms (I think they are called that?) and consequently never make it to the task and operator names in Flink.

Since I posted this problem I have replaced Flink metrics with a single, manual reporting step after the pipeline runs, so I can't currently verify this hunch, should you need more information.

Reinier
Re: Duplicate metric names when using Flink runner + Graphite reporter
Hi Reinier,

Do you have several sources in your pipeline? I think it's a problem with the Beam Flink runner, which does not assign unique names that could be used to deduplicate the operator names used in the metric names.

Best,
Aljoscha

> On 29. Sep 2017, at 20:17, Reinier Kip wrote:
>
> Hi all,
>
> I'm running a Beam pipeline on Flink and sending metrics via the Graphite reporter. I get repeated exceptions on the slaves, which try to register the same metric multiple times. These duplicates all concern task and operator metrics. I have given all pipeline nodes unique names.
>
> I am using Beam 2.1.0, and am thus running Flink 1.3.0.
>
> Below you'll find the error+stacktrace, Flink's metrics configuration, and several examples of duplicate metric names concerning both tasks and operators.
>
> Is this a Beam problem? Should Beam give Flink tasks and operators unique names so they'll have unique metric names? Is this a Flink problem? Should Flink or Flink's Graphite reporter support duplicate metric names?
>
> Reinier
>
> ##
>
> Log message: [ERROR] Error while registering metric.
> Stack trace:
>
> java.lang.IllegalArgumentException: A metric named bla.hdp-slave-019.taskmanager.3067db835689cdd8b6087dae79ec088f.bla-0929174053-aa405356.operator.DataSource (at Read(CompressedSource) (org-apache-beam-runners-flink-translation-wrappers-SourceInputFormat)).5.numSplitsProcessed already exists
>     at com.codahale.metrics.MetricRegistry.register(MetricRegistry.java:91)
>     at org.apache.flink.dropwizard.ScheduledDropwizardReporter.notifyOfAddedMetric(ScheduledDropwizardReporter.java:151)
>     at org.apache.flink.runtime.metrics.MetricRegistry.register(MetricRegistry.java:294)
>     at org.apache.flink.runtime.metrics.groups.AbstractMetricGroup.addMetric(AbstractMetricGroup.java:370)
>     at org.apache.flink.runtime.metrics.groups.AbstractMetricGroup.meter(AbstractMetricGroup.java:336)
>     at org.apache.flink.runtime.metrics.groups.OperatorIOMetricGroup.<init>(OperatorIOMetricGroup.java:42)
>     at org.apache.flink.runtime.metrics.groups.OperatorMetricGroup.<init>(OperatorMetricGroup.java:45)
>     at org.apache.flink.runtime.metrics.groups.TaskMetricGroup.addOperator(TaskMetricGroup.java:133)
>     at org.apache.flink.runtime.operators.chaining.ChainedDriver.setup(ChainedDriver.java:72)
>     at org.apache.flink.runtime.operators.BatchTask.initOutputs(BatchTask.java:1299)
>     at org.apache.flink.runtime.operators.BatchTask.initOutputs(BatchTask.java:1015)
>     at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:256)
>     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:702)
>     at java.lang.Thread.run(Thread.java:748)
>
> ##
>
> metrics.reporters: graphite
> metrics.reporter.graphite.class: org.apache.flink.metrics.graphite.GraphiteReporter
> metrics.reporter.graphite.host: something
> metrics.reporter.graphite.port: 2003
> metrics.reporter.graphite.protocol: TCP
> metrics.reporter.graphite.interval: 1 SECONDS
> metrics.scope.jm: bla..jobmanager
> metrics.scope.jm.job: bla..jobmanager.
> metrics.scope.tm: bla..taskmanager.
> metrics.scope.tm.job: bla..taskmanager..
> metrics.scope.task: bla..taskmanager...task..
> metrics.scope.operator: bla..taskmanager...operator..
>
> ##
>
> bla.hdp-slave-019.taskmanager.3067db835689cdd8b6087dae79ec088f.bla-0929174053-aa405356.operator.DataSource (at Read(CompressedSource) (org-apache-beam-runners-flink-translation-wrappers-SourceInputFormat)).5.numSplitsProcessed
> bla.hdp-slave-019.taskmanager.3067db835689cdd8b6087dae79ec088f.bla-0929174053-aa405356.operator.DataSource (at Read(CompressedSource) (org-apache-beam-runners-flink-translation-wrappers-SourceInputFormat)).5.numRecordsOutPerSecond
> bla.hdp-slave-019.taskmanager.3067db835689cdd8b6087dae79ec088f.bla-0929174053-aa405356.operator.DataSource (at Read(CompressedSource) (org-apache-beam-runners-flink-translation-wrappers-SourceInputFormat)).5.numRecordsOut
> bla.hdp-slave-019.taskmanager.3067db835689cdd8b6087dae79ec088f.bla-0929174053-aa405356.operator.DataSource (at Read(CompressedSource) (org-apache-beam-runners-flink-translation-wrappers-SourceInputFormat)).5.numRecordsInPerSecond
> bla.hdp-slave-019.taskmanager.3067db835689cdd8b6087dae79ec088f.bla-0929174053-aa405356.operator.DataSource (at Read(CompressedSource) (org-apache-beam-runners-flink-translation-wrappers-SourceInputFormat)).5.numRecordsIn
> bla.hdp-slave-019.taskmanager.3067db835689cdd8b6087dae79ec088f.bla-0929174053-aa405356.task.DataSource (at Read(CompressedSource) (org-apache-beam-runners-flink-translation-wrappers-SourceInputFormat)).5.numRecordsOutPerSecond
> bla.hdp-slave-019.taskmanager.3067db835689cdd8b6087dae79ec088f.bla-0929174053-aa405356.task.DataSource (at Read(CompressedSource) (org-apache-beam-runners-flink-translation-wrappers-SourceInputFormat)).5.numRecordsOut
> bla.hdp-slave-019
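
For readers following the thread, the collision discussed above can be illustrated with a small, self-contained Python sketch. It is only a simplified model and not Flink's, Beam's, or Dropwizard's actual code: `SimpleRegistry`, `metric_name`, `register_sources`, the host/task-manager/job placeholders, and the `#0`/`#1` suffix are all hypothetical names made up for illustration. The sketch assumes a metric name is just the scope components joined by dots, so two operators that end up with the same generic name (and the same subtask index on the same task manager) produce the same metric name, and the second registration is rejected.

```python
# Hypothetical, simplified model of a metric registry that rejects duplicate
# names, loosely mirroring the behaviour seen in the stack trace in this thread.
# None of these names are Flink, Beam, or Dropwizard APIs.


class DuplicateMetricError(ValueError):
    """Raised when a metric name is registered twice."""


class SimpleRegistry:
    def __init__(self):
        self._metrics = {}

    def register(self, name, metric):
        # Refuse a second registration under an existing name.
        if name in self._metrics:
            raise DuplicateMetricError(f"A metric named {name} already exists")
        self._metrics[name] = metric
        return metric


def metric_name(prefix, host, tm_id, job_name, kind, component_name, subtask, metric):
    """Join scope components with dots, the way the names in the thread look."""
    parts = [prefix, host, tm_id, job_name, kind, component_name, str(subtask), metric]
    return ".".join(parts)


def register_sources(registry, operator_names):
    """Register numRecordsOut for each operator; return the names that collided."""
    collisions = []
    for op in operator_names:
        # Placeholder scope values; only the operator name varies.
        name = metric_name("bla", "host-1", "tm-1", "job-1", "operator", op, 5, "numRecordsOut")
        try:
            registry.register(name, 0)
        except DuplicateMetricError:
            collisions.append(name)
    return collisions


# Two sources translated to the same generic operator name collide...
generic = ["DataSource (at Read(CompressedSource))"] * 2
same_name_collisions = register_sources(SimpleRegistry(), generic)

# ...while distinct operator names (here with a made-up suffix) do not.
unique = [f"DataSource (at Read(CompressedSource)) #{i}" for i in range(2)]
unique_name_collisions = register_sources(SimpleRegistry(), unique)

print(len(same_name_collisions), len(unique_name_collisions))  # prints "1 0"
```

Under these assumptions, the model agrees with the suggestion in the thread that unique operator names would be enough to avoid the duplicates; whether the runner or the reporter is the right place for such a fix is exactly the open question raised above.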