kafkaIO Run with Spark Runner: "streaming-job-executor-0"

2018-06-13 Thread linrick
Dear all, I am using the KafkaIO in my project (Beam 2.0.0 with Spark runner). My running environment is: OS: Ubuntu 14.04.3 LTS. The versions of these tools: Java: JDK 1.8; Beam 2.0.0 (Spark runner in standalone mode); Spark 1.6.0. Standalone mode: one driver node: ubuntu7; one master

Re: SQL Filter Pushdowns in Apache Beam SQL

2018-06-13 Thread Lukasz Cwik
It is currently the latter: all the data is read and then filtered within the pipeline. Note that this doesn't mean all the data is loaded into memory, since how the join is executed depends on the Runner that is powering the pipeline. Kenn had shared this doc[1] which is starting
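Lukasz's description (a full read at the source, with the filter applied later as an ordinary transform in the pipeline) can be illustrated with a toy sketch in plain Python. This is a conceptual model only, not the Beam API; the row shape and field names are invented for illustration.

```python
# Conceptual sketch: "read everything, then filter in the pipeline".
# Plain Python stand-ins, not actual Beam API.

def read_source():
    # The connector reads *all* rows; no predicate reaches it.
    return [{"id": i, "region": "EU" if i % 2 else "US"} for i in range(6)]

def run_pipeline(rows):
    # The filter runs as an ordinary transform after the read,
    # so every row is handed to the pipeline first.
    return [r for r in rows if r["region"] == "EU"]

rows_read = read_source()
result = run_pipeline(rows_read)
print(len(rows_read), len(result))  # 6 rows cross the read step, 3 survive the filter
```

The cost is in the rows that cross the read step only to be discarded; predicate pushdown (discussed below in this thread) would avoid emitting them at all.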

Re: kafkaIO Run with Spark Runner: "streaming-job-executor-0"

2018-06-13 Thread Raghu Angadi
Can you check the logs on the worker? On Wed, Jun 13, 2018 at 2:26 AM wrote: > Dear all, > I am using the KafkaIO in my project (Beam 2.0.0 with Spark runner). > My running environment is: > OS: Ubuntu 14.04.3 LTS > The versions of these tools: > Java: JDK 1.8 > Beam

Re: [FYI] New Apache Beam Swag Store!

2018-06-13 Thread Ismaël Mejía
Great! Thanks Gris and Matthias for putting this in place. Hope to get that hoodie soon. As a suggestion: more colors too, and eventually a t-shirt with just the big B logo. On Mon, Jun 11, 2018 at 6:50 PM Mikhail Gryzykhin wrote: > That's nice! > More colors are appreciated :) > --Mikhail

Re: kafkaIO Run with Spark Runner: "streaming-job-executor-0"

2018-06-13 Thread Ismaël Mejía
Can you please update Beam to at least version 2.2.0? There were some important fixes to streaming after the 2.0.0 release, so this could be related. Ideally you should use the latest released version (2.4.0). Remember that starting with Beam 2.3.0 the Spark runner is based on Spark

SQL Filter Pushdowns in Apache Beam SQL

2018-06-13 Thread Harshvardhan Agrawal
Hi, We are currently playing with Apache Beam’s SQL extension on top of Flink. One of the features we are interested in is the SQL predicate pushdown feature that Spark provides. Does Beam support that? For example: I have an unbounded dataset that I want to join with some static reference data

[Call for Speakers] Deep Learning in Production Meetup, Boston Area on June 26th

2018-06-13 Thread Griselda Cuevas
Hi Beam Community, Eila Arich-Landkof (from OrielResearch) and I are co-hosting the next edition of the Deep Learning in Production Meetup on June 26th at the Google Office in Cambridge, Massachusetts. *We are looking for speakers who would

Beam Metrics not shown in Flink Web UI

2018-06-13 Thread Abdul Qadeer
Hi! I am using the Beam 2.4.0 SDK with the Flink 1.4.0 runner. I have created a "Counter" object in my pipeline (a DoFn subclass) with the "Metrics.counter" method. However, I am not able to see this metric under the "Accumulators" tab; I only see the following [inline screenshot not preserved]. It doesn't show up in "Metrics" -> "Add Metric"

Re: Apache Beam June Newsletter

2018-06-13 Thread Pablo Estrada
Thanks Gris! Lots of interesting things. Best -P. On Wed, Jun 13, 2018 at 4:40 PM Griselda Cuevas wrote: > Hi Beam Community! > Here [1] is the June Edition of our Apache Beam

Apache Beam June Newsletter

2018-06-13 Thread Griselda Cuevas
Hi Beam Community! Here [1] is the June Edition of our Apache Beam Newsletter. This edition was curated by our community of contributors, committers and PMCs. Generally, it contains the work done

Re: SQL Filter Pushdowns in Apache Beam SQL

2018-06-13 Thread Kenneth Knowles
This has come up in a couple of in-person conversations. Pushing filtering and projection into connectors is something we intend to do. Calcite's optimizer is designed to support this; we just don't have it set up. Your use case sounds like one that might test the limits of that, since the
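The pushdown Kenn describes would mean the planner hands the filter (and projection) to the connector, so non-matching rows are never emitted from the source. A toy sketch of that idea in plain Python — conceptual only, not the Beam or Calcite API:

```python
# Conceptual sketch of predicate pushdown: the source receives the
# predicate and skips non-matching rows at read time.
# Plain Python, not Beam/Calcite API.

TABLE = [{"id": i, "region": "EU" if i % 2 else "US"} for i in range(6)]

def read_with_pushdown(predicate=None):
    emitted = []
    for row in TABLE:
        if predicate is None or predicate(row):
            emitted.append(row)   # only matching rows ever leave the source
    return emitted

eu_rows = read_with_pushdown(lambda r: r["region"] == "EU")
print(len(eu_rows))  # 3 -- the US rows were dropped before entering the pipeline
```

Compared with filtering inside the pipeline, the saving is in the rows that never cross the source boundary at all, which matters most for remote or columnar sources.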

Re: Beam Metrics not shown in Flink Web UI

2018-06-13 Thread Abdul Qadeer
I followed this on Stack Overflow too, but it didn't help. On Wed, Jun 13, 2018 at 5:23 PM, Abdul Qadeer wrote: > Hi! > > I am using the Beam 2.4.0 SDK with the Flink 1.4.0 runner. > I have created a

Re: Beam Metrics not shown in Flink Web UI

2018-06-13 Thread Abdul Qadeer
Not sure if the inline image is visible; here is a link. On Wed, Jun 13, 2018 at 5:24 PM, Abdul Qadeer wrote: > I followed this on Stack Overflow too, but it didn't help.

Re: SQL Filter Pushdowns in Apache Beam SQL

2018-06-13 Thread Harshvardhan Agrawal
I would assume that in the case where we don’t go the SQL route we would have 2 options: 1) Store the reference data and supply it as a side input. This solution would not be feasible in cases where I have to join against, say, 10 different datasets, since I don’t want to have so much data in
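Option 1 above (broadcasting small reference data to every element as a side input) can be sketched in plain Python. This is a conceptual model, not the Beam side-input API, and the trade/reference field names are invented for illustration:

```python
# Conceptual sketch of a side-input join: small static reference data
# is made available in-memory to every element being processed.
# Plain Python, not Beam API.

reference = {"AAPL": "Apple", "GOOG": "Alphabet"}   # static reference data

def enrich(trade, ref):
    # ref acts like a map-valued side input visible to every element
    issuer = ref.get(trade["symbol"], "UNKNOWN")
    return {**trade, "issuer": issuer}

trades = [{"symbol": "AAPL", "qty": 10}, {"symbol": "MSFT", "qty": 5}]
enriched = [enrich(t, reference) for t in trades]
print([e["issuer"] for e in enriched])  # ['Apple', 'UNKNOWN']
```

The whole reference map is held in memory on each worker, which is exactly the concern raised here: with ~10 reference datasets the broadcast cost may become prohibitive.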