Re: [Call for Speakers] Apache Beam Meetup @ San Francisco, CA on Oct. 24th

2017-10-09 Thread Griselda Cuevas
Got it.

I can definitely look into recording the talks and sharing them later. If
that's not possible, I'll share the slides.

I can also look into organizing something in Seattle later on :)

On Oct 9, 2017 10:59 PM, "Derek Hao Hu"  wrote:

> It's just that I'm in Seattle (and I guess a lot of us are not in SF). The
> talks seem pretty interesting. :)
>
> Can you help share the slides after the talk then?
>
> Thanks,
>
> Derek
>
> On Mon, Oct 9, 2017 at 10:57 PM, Griselda Cuevas  wrote:
>
>> Hi Derek - I could look into that but so far we're not planning on it.
>> Would you like to watch them later?
>>
>> On 9 October 2017 at 16:36, Derek Hao Hu  wrote:
>>
>>> Hi Griselda,
>>>
>>> Will the talks be recorded?
>>>
>>> Thanks,
>>>
>>> Derek
>>>
>>> On Mon, Oct 9, 2017 at 12:25 PM, Griselda Cuevas 
>>> wrote:
>>>
>>>> Hi everyone,
>>>>
>>>> I'm reaching out because I'm organizing a Meetup in San Francisco,
>>>> California, on October 24th, and I'm looking for speakers. This is the
>>>> agenda:
>>>>
>>>> *Talk 1: Apache Beam Overview*
>>>> *Talk 2: Exactly-once processing with Dataflow*
>>>>
>>>> We will have 30 min for each talk + 15 min of Q&A. The event will be
>>>> hosted at Google San Francisco and will be promoted on the San Francisco
>>>> Cloud Mafia Meetup group.
>>>>
>>>> *Request: If you're interested in giving either of these two talks, reply
>>>> to this message and we can arrange details.* The date is a bit
>>>> flexible in case someone is interested but can't make the date.
>>>>
>>>> Thanks!
>>>> G
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Derek Hao Hu
>>>
>>> Software Engineer | Snapchat
>>> Snap Inc.
>>>
>>
>>
>
>
> --
> Derek Hao Hu
>
> Software Engineer | Snapchat
> Snap Inc.
>


Re: [Call for Speakers] Apache Beam Meetup @ San Francisco, CA on Oct. 24th

2017-10-09 Thread Derek Hao Hu
It's just that I'm in Seattle (and I guess a lot of us are not in SF). The
talks seem pretty interesting. :)

Can you help share the slides after the talk then?

Thanks,

Derek

On Mon, Oct 9, 2017 at 10:57 PM, Griselda Cuevas  wrote:

> Hi Derek - I could look into that but so far we're not planning on it.
> Would you like to watch them later?
>
> On 9 October 2017 at 16:36, Derek Hao Hu  wrote:
>
>> Hi Griselda,
>>
>> Will the talks be recorded?
>>
>> Thanks,
>>
>> Derek
>>
>> On Mon, Oct 9, 2017 at 12:25 PM, Griselda Cuevas  wrote:
>>
>>> Hi everyone,
>>>
>>> I'm reaching out because I'm organizing a Meetup in San Francisco,
>>> California, on October 24th, and I'm looking for speakers. This is the
>>> agenda:
>>>
>>> *Talk 1: Apache Beam Overview*
>>> *Talk 2: Exactly-once processing with Dataflow*
>>>
>>> We will have 30 min for each talk + 15 min of Q&A. The event will be
>>> hosted at Google San Francisco and will be promoted on the San Francisco
>>> Cloud Mafia Meetup group.
>>>
>>> *Request: If you're interested in giving either of these two talks, reply
>>> to this message and we can arrange details.* The date is a bit flexible
>>> in case someone is interested but can't make the date.
>>>
>>> Thanks!
>>> G
>>>
>>>
>>>
>>
>>
>> --
>> Derek Hao Hu
>>
>> Software Engineer | Snapchat
>> Snap Inc.
>>
>
>


-- 
Derek Hao Hu

Software Engineer | Snapchat
Snap Inc.


Re: [Call for Speakers] Apache Beam Meetup @ San Francisco, CA on Oct. 24th

2017-10-09 Thread Griselda Cuevas
Hi Derek - I could look into that but so far we're not planning on it.
Would you like to watch them later?

On 9 October 2017 at 16:36, Derek Hao Hu  wrote:

> Hi Griselda,
>
> Will the talks be recorded?
>
> Thanks,
>
> Derek
>
> On Mon, Oct 9, 2017 at 12:25 PM, Griselda Cuevas  wrote:
>
>> Hi everyone,
>>
>> I'm reaching out because I'm organizing a Meetup in San Francisco,
>> California, on October 24th, and I'm looking for speakers. This is the
>> agenda:
>>
>> *Talk 1: Apache Beam Overview*
>> *Talk 2: Exactly-once processing with Dataflow*
>>
>> We will have 30 min for each talk + 15 min of Q&A. The event will be
>> hosted at Google San Francisco and will be promoted on the San Francisco
>> Cloud Mafia Meetup group.
>>
>> *Request: If you're interested in giving either of these two talks, reply to
>> this message and we can arrange details.* The date is a bit flexible in
>> case someone is interested but can't make the date.
>>
>> Thanks!
>> G
>>
>>
>>
>
>
> --
> Derek Hao Hu
>
> Software Engineer | Snapchat
> Snap Inc.
>


Re: Slack invitation

2017-10-09 Thread Pei HE
Invited. Welcome!

On Tue, Oct 10, 2017 at 12:31 PM, Artur Mrozowski  wrote:

> Hi,
> could you please invite me to the Slack channel?
>
> Thanks in advance
>
> /Artur
>


Slack invitation

2017-10-09 Thread Artur Mrozowski
Hi,
could you please invite me to the Slack channel?

Thanks in advance

/Artur


Re: October Apache Beam Newsletter

2017-10-09 Thread James
Cool, very informative, thanks!

On Tue, Oct 10, 2017 at 2:39 AM Griselda Cuevas  wrote:

> Hi Apache Beam Community,
>
> Our first Apache Beam Newsletter is here! I'm sharing a table of contents
> of what's in this edition, which covers everything that has happened in the
> project from June 2017 until October 2017.
>
> You can find the full content in this Google Doc:
> https://docs.google.com/document/d/1BbpQne-9ng93G-_-UKH2C4UNafcEQcLALtu38qsXfI8/edit?usp=sharing
>
> Enjoy!
>
> * * * * * October 2017 Newsletter Table of Contents * * * * *
>
> >> What's Been Done
> - Beam SQL DSL APIs
> - Nexmark
> - Splittable DoFn
> - Improvements to reading and writing files
> - Improvements to BigQueryIO
> - New I/O connectors
> - Docker development images and reproducible-builds
> - Website updates: New Beam Execution Model page & the Mobile Gaming
> Walkthrough was updated w/ new sample code for Python
>
> >> What We Are Working On
> - Portability
> - Splittable DoFn for Python SDK
> - Website: Improve CoGroupByKey docs & website navigation/usability
>
> >> What's Planned
> - FileIO.write()
>
> >>  New Members (Welcome!)
> - Daniel Harper, BBC, London (UK)
>
> >>  Talks & Meetups
> - Talks @ YOW Data Sydney and Strata NY
> - Meetups @ London & dinner in NY
> - Speakers & Meetup Founders group
>
> >>  Resources
> - Capability Matrix
> - Contribution Guide
> - Featured talk & sample intro talk deck
>


Re: [Call for Speakers] Apache Beam Meetup @ San Francisco, CA on Oct. 24th

2017-10-09 Thread Derek Hao Hu
Hi Griselda,

Will the talks be recorded?

Thanks,

Derek

On Mon, Oct 9, 2017 at 12:25 PM, Griselda Cuevas  wrote:

> Hi everyone,
>
> I'm reaching out because I'm organizing a Meetup in San Francisco,
> California, on October 24th, and I'm looking for speakers. This is the
> agenda:
>
> *Talk 1: Apache Beam Overview*
> *Talk 2: Exactly-once processing with Dataflow*
>
> We will have 30 min for each talk + 15 min of Q&A. The event will be
> hosted at Google San Francisco and will be promoted on the San Francisco
> Cloud Mafia Meetup group.
>
> *Request: If you're interested in giving either of these two talks, reply to
> this message and we can arrange details.* The date is a bit flexible in
> case someone is interested but can't make the date.
>
> Thanks!
> G
>
>
>


-- 
Derek Hao Hu

Software Engineer | Snapchat
Snap Inc.


[Call for Speakers] Apache Beam Meetup @ San Francisco, CA on Oct. 24th

2017-10-09 Thread Griselda Cuevas
Hi everyone,

I'm reaching out because I'm organizing a Meetup in San Francisco,
California, on October 24th, and I'm looking for speakers. This is the
agenda:

*Talk 1: Apache Beam Overview*
*Talk 2: Exactly-once processing with Dataflow*

We will have 30 min for each talk + 15 min of Q&A. The event will be hosted
at Google San Francisco and will be promoted on the San Francisco Cloud
Mafia Meetup group.

*Request: If you're interested in giving either of these two talks, reply to
this message and we can arrange details.* The date is a bit flexible in
case someone is interested but can't make the date.

Thanks!
G


October Apache Beam Newsletter

2017-10-09 Thread Griselda Cuevas
Hi Apache Beam Community,

Our first Apache Beam Newsletter is here! I'm sharing a table of contents
of what's in this edition, which covers everything that has happened in the
project from June 2017 until October 2017.

You can find the full content in this Google Doc:
https://docs.google.com/document/d/1BbpQne-9ng93G-_-UKH2C4UNafcEQcLALtu38qsXfI8/edit?usp=sharing

Enjoy!

* * * * * October 2017 Newsletter Table of Contents * * * * *

>> What's Been Done
- Beam SQL DSL APIs
- Nexmark
- Splittable DoFn
- Improvements to reading and writing files
- Improvements to BigQueryIO
- New I/O connectors
- Docker development images and reproducible-builds
- Website updates: New Beam Execution Model page & the Mobile Gaming
Walkthrough was updated w/ new sample code for Python

>> What We Are Working On
- Portability
- Splittable DoFn for Python SDK
- Website: Improve CoGroupByKey docs & website navigation/usability

>> What's Planned
- FileIO.write()

>>  New Members (Welcome!)
- Daniel Harper, BBC, London (UK)

>>  Talks & Meetups
- Talks @ YOW Data Sydney and Strata NY
- Meetups @ London & dinner in NY
- Speakers & Meetup Founders group

>>  Resources
- Capability Matrix
- Contribution Guide
- Featured talk & sample intro talk deck


Re: Duplicate metric names when using Flink runner + Graphite reporter

2017-10-09 Thread Reinier Kip
Hey Aljoscha,


Thanks for replying! Yes, I have three TextIO sources and two HBaseIO sources 
:) I believe duplicate naming also occurs for the intermediate transforms, 
though. The node names that are assigned to the "user-level" transforms, I 
believe, never make it to Beam's primitive transforms (I think they are called 
that?) and consequently never make it to the task and operator names in Flink.


Since I posted this problem I have replaced Flink metrics with a single, manual 
reporting step after the pipeline runs, so I currently can't verify this hunch, 
in case you need more information.


Reinier


From: Aljoscha Krettek 
Sent: 09 October 2017 16:56:26
To: user@beam.apache.org
Subject: Re: Duplicate metric names when using Flink runner + Graphite reporter

Hi Reinier,

Do you have several sources in your pipeline? I think the problem is that the 
Beam Flink runner does not assign unique names that could be used to 
deduplicate the operator names used in the metric names.

Best,
Aljoscha

On 29. Sep 2017, at 20:17, Reinier Kip <r...@bol.com> wrote:


Hi all,


I'm running a Beam pipeline on Flink and sending metrics via the Graphite 
reporter. I get repeated exceptions on the slaves, which try to register the 
same metric multiple times. These duplicates all concern task and operator 
metrics. I have given all pipeline nodes unique names.

I am using Beam 2.1.0, and am thus running Flink 1.3.0.

Below you'll find the error+stacktrace, Flink's metrics configuration, and 
several examples of duplicate metric names concerning both tasks and operators.

Is this a Beam problem? Should Beam give Flink tasks and operators unique names 
so they'll have unique metric names? Is this a Flink problem? Should Flink or 
Flink's Graphite reporter support duplicate metric names?

Reinier

##

Log message: [ERROR] Error while registering metric.
Stack trace:

java.lang.IllegalArgumentException: A metric named bla.hdp-slave-019.taskmanager.3067db835689cdd8b6087dae79ec088f.bla-0929174053-aa405356.operator.DataSource (at Read(CompressedSource) (org-apache-beam-runners-flink-translation-wrappers-SourceInputFormat)).5.numSplitsProcessed already exists
    at com.codahale.metrics.MetricRegistry.register(MetricRegistry.java:91)
    at org.apache.flink.dropwizard.ScheduledDropwizardReporter.notifyOfAddedMetric(ScheduledDropwizardReporter.java:151)
    at org.apache.flink.runtime.metrics.MetricRegistry.register(MetricRegistry.java:294)
    at org.apache.flink.runtime.metrics.groups.AbstractMetricGroup.addMetric(AbstractMetricGroup.java:370)
    at org.apache.flink.runtime.metrics.groups.AbstractMetricGroup.meter(AbstractMetricGroup.java:336)
    at org.apache.flink.runtime.metrics.groups.OperatorIOMetricGroup.<init>(OperatorIOMetricGroup.java:42)
    at org.apache.flink.runtime.metrics.groups.OperatorMetricGroup.<init>(OperatorMetricGroup.java:45)
    at org.apache.flink.runtime.metrics.groups.TaskMetricGroup.addOperator(TaskMetricGroup.java:133)
    at org.apache.flink.runtime.operators.chaining.ChainedDriver.setup(ChainedDriver.java:72)
    at org.apache.flink.runtime.operators.BatchTask.initOutputs(BatchTask.java:1299)
    at org.apache.flink.runtime.operators.BatchTask.initOutputs(BatchTask.java:1015)
    at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:256)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:702)
    at java.lang.Thread.run(Thread.java:748)
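
The trace shows a registry keyed by metric name refusing a second registration under the same name. A minimal Python sketch of that behavior follows. ToyMetricRegistry and the sample name are hypothetical stand-ins for illustration, not Flink or Dropwizard code.

```python
class ToyMetricRegistry:
    """Toy name-keyed registry (hypothetical; not Flink or Dropwizard code)."""

    def __init__(self):
        self._metrics = {}

    def register(self, name, metric):
        # A second registration under an existing name is rejected,
        # which is the situation the error message above describes.
        if name in self._metrics:
            raise ValueError(f"A metric named {name} already exists")
        self._metrics[name] = metric
        return metric

    def names(self):
        return sorted(self._metrics)


registry = ToyMetricRegistry()
# Hypothetical metric identifier, shortened from the ones in this thread.
name = "bla.host-1.taskmanager.tm-1.job-1.operator.DataSource.5.numRecordsOut"
registry.register(name, 0)

try:
    # A second operator that ends up with the same identifier.
    registry.register(name, 0)
    duplicate_error = None
except ValueError as exc:
    duplicate_error = str(exc)
```

In this toy model the first registration wins and the second one only produces an error, so the second operator's values would never be reported.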

##

metrics.reporters: graphite
metrics.reporter.graphite.class: org.apache.flink.metrics.graphite.GraphiteReporter
metrics.reporter.graphite.host: something
metrics.reporter.graphite.port: 2003
metrics.reporter.graphite.protocol: TCP
metrics.reporter.graphite.interval: 1 SECONDS
metrics.scope.jm: bla.<host>.jobmanager
metrics.scope.jm.job: bla.<host>.jobmanager.<job_name>
metrics.scope.tm: bla.<host>.taskmanager.<tm_id>
metrics.scope.tm.job: bla.<host>.taskmanager.<tm_id>.<job_name>
metrics.scope.task: bla.<host>.taskmanager.<tm_id>.<job_name>.task.<task_name>.<subtask_index>
metrics.scope.operator: bla.<host>.taskmanager.<tm_id>.<job_name>.operator.<operator_name>.<subtask_index>
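
The scope settings determine the prefix of every metric identifier. The sketch below shows how two operators collide once they share an operator name and subtask index. It assumes a scope format with angle-bracket variables such as `<host>` and `<operator_name>`. The format string, the helper functions and all values are hypothetical, modeled on the metric names in this thread, and are not Flink code.

```python
import re

# Assumed operator scope format (hypothetical, modeled on the names in this thread).
OPERATOR_SCOPE = (
    "bla.<host>.taskmanager.<tm_id>.<job_name>"
    ".operator.<operator_name>.<subtask_index>"
)


def expand_scope(scope_format, variables):
    """Replace each <variable> placeholder with its value."""
    return re.sub(r"<(\w+)>", lambda m: str(variables[m.group(1)]), scope_format)


def metric_identifier(scope_format, variables, metric_name):
    """Full identifier: expanded scope, a dot, then the metric name."""
    return expand_scope(scope_format, variables) + "." + metric_name


common = {
    "host": "hdp-slave-019",
    "tm_id": "tm-1",
    "job_name": "job-1",
    "subtask_index": 5,
}
# Two different sources that were given the same generated operator name.
source_a = dict(common, operator_name="DataSource (at Read(CompressedSource))")
source_b = dict(common, operator_name="DataSource (at Read(CompressedSource))")

id_a = metric_identifier(OPERATOR_SCOPE, source_a, "numRecordsOut")
id_b = metric_identifier(OPERATOR_SCOPE, source_b, "numRecordsOut")
collides = id_a == id_b
```

Nothing in the expanded identifier distinguishes the two sources, so any variable that differs per operator would have to appear in the scope format, or in the operator name itself, to keep the identifiers apart.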


##


bla.hdp-slave-019.taskmanager.3067db835689cdd8b6087dae79ec088f.bla-0929174053-aa405356.operator.DataSource (at Read(CompressedSource) (org-apache-beam-runners-flink-translation-wrappers-SourceInputFormat)).5.numSplitsProcessed
bla.hdp-slave-019.taskmanager.3067db835689cdd8b6087dae79ec088f.bla-0929174053-aa405356.operator.DataSource (at Read(CompressedSource) (org-apache-beam-runners-flink-translation-wrappers-SourceInputFormat)).5.numRecordsOutPerSecond
bla.hdp-slave-019.taskmanager.3067db835689cdd8b6087dae79ec088f.bla-0929174053-aa405356.operator.DataSource (at Read(CompressedSource) (org-apache-beam-runners-flink-translation-wrappers-SourceInputFormat)).5.numRecordsOut
bla.hdp-slave-019.taskmanager.3067db835689cdd8b6087dae79ec088f.bla-0929174053-aa405356.operator.DataSource (at Read(CompressedSource) (org-apache-beam-runners-flink-translation-wrappers-SourceInputFormat)).5.numRecordsInPerSecond
bla.hdp-slave-019.taskmanager.3067db835689cdd8b6087da

Re: Duplicate metric names when using Flink runner + Graphite reporter

2017-10-09 Thread Aljoscha Krettek
Hi Reinier,

Do you have several sources in your pipeline? I think the problem is that the 
Beam Flink runner does not assign unique names that could be used to 
deduplicate the operator names used in the metric names.
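
One way to picture the deduplication described here is to append a numeric suffix to every repeated name. The sketch below is only an illustration of that idea. The `uniquify` helper and the sample names are hypothetical, and this is not how the Flink runner assigns names.

```python
def uniquify(names):
    """Return the names in order, suffixing repeats with -2, -3, ... until unique."""
    used = set()
    result = []
    for name in names:
        candidate, n = name, 1
        # Keep bumping the suffix until the candidate is not taken yet.
        while candidate in used:
            n += 1
            candidate = f"{name}-{n}"
        used.add(candidate)
        result.append(candidate)
    return result


# Three sources that all received the same generated operator name.
operator_names = ["DataSource (at Read(CompressedSource))"] * 3
unique_names = uniquify(operator_names)
```

With distinct operator names, the metric identifiers derived from them no longer collide. The suffixes depend on the order in which operators are visited, so they would only be stable across runs if that order is.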

Best,
Aljoscha

> On 29. Sep 2017, at 20:17, Reinier Kip  wrote:
> 
> Hi all,
> 
> I'm running a Beam pipeline on Flink and sending metrics via the Graphite 
> reporter. I get repeated exceptions on the slaves, which try to register the 
> same metric multiple times. These duplicates all concern task and operator 
> metrics. I have given all pipeline nodes unique names.
> 
> I am using Beam 2.1.0, and am thus running Flink 1.3.0.
> 
> Below you'll find the error+stacktrace, Flink's metrics configuration, and 
> several examples of duplicate metric names concerning both tasks and 
> operators.
> 
> Is this a Beam problem? Should Beam give Flink tasks and operators unique 
> names so they'll have unique metric names? Is this a Flink problem? Should 
> Flink or Flink's Graphite reporter support duplicate metric names?
> 
> Reinier
> 
> ##
> 
> Log message: [ERROR] Error while registering metric.
> Stack trace:
> 
> java.lang.IllegalArgumentException: A metric named bla.hdp-slave-019.taskmanager.3067db835689cdd8b6087dae79ec088f.bla-0929174053-aa405356.operator.DataSource (at Read(CompressedSource) (org-apache-beam-runners-flink-translation-wrappers-SourceInputFormat)).5.numSplitsProcessed already exists
>     at com.codahale.metrics.MetricRegistry.register(MetricRegistry.java:91)
>     at org.apache.flink.dropwizard.ScheduledDropwizardReporter.notifyOfAddedMetric(ScheduledDropwizardReporter.java:151)
>     at org.apache.flink.runtime.metrics.MetricRegistry.register(MetricRegistry.java:294)
>     at org.apache.flink.runtime.metrics.groups.AbstractMetricGroup.addMetric(AbstractMetricGroup.java:370)
>     at org.apache.flink.runtime.metrics.groups.AbstractMetricGroup.meter(AbstractMetricGroup.java:336)
>     at org.apache.flink.runtime.metrics.groups.OperatorIOMetricGroup.<init>(OperatorIOMetricGroup.java:42)
>     at org.apache.flink.runtime.metrics.groups.OperatorMetricGroup.<init>(OperatorMetricGroup.java:45)
>     at org.apache.flink.runtime.metrics.groups.TaskMetricGroup.addOperator(TaskMetricGroup.java:133)
>     at org.apache.flink.runtime.operators.chaining.ChainedDriver.setup(ChainedDriver.java:72)
>     at org.apache.flink.runtime.operators.BatchTask.initOutputs(BatchTask.java:1299)
>     at org.apache.flink.runtime.operators.BatchTask.initOutputs(BatchTask.java:1015)
>     at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:256)
>     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:702)
>     at java.lang.Thread.run(Thread.java:748)
> 
> ##
> 
> metrics.reporters: graphite
> metrics.reporter.graphite.class: org.apache.flink.metrics.graphite.GraphiteReporter
> metrics.reporter.graphite.host: something
> metrics.reporter.graphite.port: 2003
> metrics.reporter.graphite.protocol: TCP
> metrics.reporter.graphite.interval: 1 SECONDS
> metrics.scope.jm: bla.<host>.jobmanager
> metrics.scope.jm.job: bla.<host>.jobmanager.<job_name>
> metrics.scope.tm: bla.<host>.taskmanager.<tm_id>
> metrics.scope.tm.job: bla.<host>.taskmanager.<tm_id>.<job_name>
> metrics.scope.task: bla.<host>.taskmanager.<tm_id>.<job_name>.task.<task_name>.<subtask_index>
> metrics.scope.operator: bla.<host>.taskmanager.<tm_id>.<job_name>.operator.<operator_name>.<subtask_index>
> 
> ##
> 
> bla.hdp-slave-019.taskmanager.3067db835689cdd8b6087dae79ec088f.bla-0929174053-aa405356.operator.DataSource (at Read(CompressedSource) (org-apache-beam-runners-flink-translation-wrappers-SourceInputFormat)).5.numSplitsProcessed
> bla.hdp-slave-019.taskmanager.3067db835689cdd8b6087dae79ec088f.bla-0929174053-aa405356.operator.DataSource (at Read(CompressedSource) (org-apache-beam-runners-flink-translation-wrappers-SourceInputFormat)).5.numRecordsOutPerSecond
> bla.hdp-slave-019.taskmanager.3067db835689cdd8b6087dae79ec088f.bla-0929174053-aa405356.operator.DataSource (at Read(CompressedSource) (org-apache-beam-runners-flink-translation-wrappers-SourceInputFormat)).5.numRecordsOut
> bla.hdp-slave-019.taskmanager.3067db835689cdd8b6087dae79ec088f.bla-0929174053-aa405356.operator.DataSource (at Read(CompressedSource) (org-apache-beam-runners-flink-translation-wrappers-SourceInputFormat)).5.numRecordsInPerSecond
> bla.hdp-slave-019.taskmanager.3067db835689cdd8b6087dae79ec088f.bla-0929174053-aa405356.operator.DataSource (at Read(CompressedSource) (org-apache-beam-runners-flink-translation-wrappers-SourceInputFormat)).5.numRecordsIn
> bla.hdp-slave-019.taskmanager.3067db835689cdd8b6087dae79ec088f.bla-0929174053-aa405356.task.DataSource (at Read(CompressedSource) (org-apache-beam-runners-flink-translation-wrappers-SourceInputFormat)).5.numRecordsOutPerSecond
> bla.hdp-slave-019.taskmanager.3067db835689cdd8b6087dae79ec088f.bla-0929174053-aa405356.task.DataSource (at Read(CompressedSource) (org-apache-beam-runners-flink-translation-wrappers-SourceInputFormat)).5.numRecordsOut
> bla.hdp-slave-019