Re: New contributor to BEAM SQL

2019-09-16 Thread Lukasz Cwik
Welcome Kirill, I have granted you the JIRA permissions you requested. On Mon, Sep 16, 2019 at 10:59 AM Kirill Kozlov wrote: > Hello everyone! > > My name is Kirill Kozlov, I recently joined a Dataflow team at Google and > will be working on SQL filter pushdown. > Can I get permission to work

Re: portableWordCountBatch and portableWordCountStreaming failing in Python PreCommit

2019-09-16 Thread Mark Liu
Thank you Hannah! BTW, the fix is https://github.com/apache/beam/pull/9588. Since this affects the release, https://github.com/apache/beam/pull/9595 will be cherry-picked to the release branch. On Mon, Sep 16, 2019 at 9:47 PM Hannah Jiang wrote: > The fix is merged. I tested with PRs which used to

Re: portableWordCountBatch and portableWordCountStreaming failing in Python PreCommit

2019-09-16 Thread Hannah Jiang
The fix is merged. I tested with PRs which used to fail and the failures are fixed now. Please rerun the test if your PR is affected. On Mon, Sep 16, 2019 at 6:16 PM Mark Liu wrote: > Thanks for letting me know. I'll keep tracking this issue since it's a > release blocker. Please update

Re: portableWordCountBatch and portableWordCountStreaming failing in Python PreCommit

2019-09-16 Thread Mark Liu
Thanks for letting me know. I'll keep tracking this issue since it's a release blocker. Please update here/jira if you have any progress. Mark On Mon, Sep 16, 2019 at 5:04 PM Hannah Jiang wrote: > For the issue with the Flink image, I re-opened a ticket which is currently > blocking

Re: The state of external transforms in Beam

2019-09-16 Thread Chamikara Jayalath
Thanks for the nice write up Chad. On Mon, Sep 16, 2019 at 12:17 PM Robert Bradshaw wrote: > Thanks for bringing this up again. My thoughts on the open questions below. > > On Mon, Sep 16, 2019 at 11:51 AM Chad Dombrova wrote: > > That commit solves 2 problems: > > > > Adds the pubsub Java

Re: portableWordCountBatch and portableWordCountStreaming failing in Python PreCommit

2019-09-16 Thread Hannah Jiang
For the issue with the Flink image, I re-opened a ticket which is currently blocking the release (BEAM-8165). On Mon, Sep 16, 2019 at 5:00 PM Ahmet Altay wrote: > > > On Mon, Sep 16, 2019 at 2:07 PM Kyle Weaver wrote: > >> The original issue ("GetJobMetrics is unimplemented") is still probably >> hiding

Re: portableWordCountBatch and portableWordCountStreaming failing in Python PreCommit

2019-09-16 Thread Ahmet Altay
On Mon, Sep 16, 2019 at 2:07 PM Kyle Weaver wrote: > The original issue ("GetJobMetrics is unimplemented") is still probably > hiding under the docker issue, so I filed > https://issues.apache.org/jira/browse/BEAM-8245 for it > Thank you. Who should be assigned to this issue? If metrics are

Re: MQTT to Python SDK

2019-09-16 Thread Brian Hulette
If you go down the cross-language route you may find Chad Dombrova's experience developing a cross-language PubSubIO helpful. It's being discussed in another thread: https://lists.apache.org/thread.html/6e2e3b8c2becdf22303ed231ebeda73550a3ce9acbf2f73ccf1982f2@%3Cdev.beam.apache.org%3E On Mon, Sep

Re: Next LTS?

2019-09-16 Thread Valentyn Tymofieiev
I support nominating 2.16.0 as the LTS release since it has robust Python 3 support compared with prior releases, and also because of the pending Python 2 deprecation. This has been discussed before [1]. As Robert pointed out in that thread, LTS nomination in Beam is currently retroactive. If we keep

Next LTS?

2019-09-16 Thread Austin Bennett
Hi All, According to our policies page [1]: "There will be at least one new LTS release in a 12 month period, and LTS releases are considered deprecated after 12 months" The last LTS was released 2018-10-02 [2]. Does that mean the next release (2.16) should be the next LTS? It looks like we
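The date arithmetic behind the question can be sketched as follows. This is only an illustration, assuming the quoted policy means the next LTS is due 12 calendar months after the previous one; the helper name `lts_deadline` is hypothetical and not part of any Beam tooling.

```python
from datetime import date


def lts_deadline(last_lts, months=12):
    """Return the date `months` calendar months after `last_lts`.

    Hypothetical helper. Assumes the day of month exists in the target
    month (true for the 2nd); date.replace raises ValueError otherwise.
    """
    total = last_lts.month - 1 + months
    return last_lts.replace(year=last_lts.year + total // 12,
                            month=total % 12 + 1)


last_lts = date(2018, 10, 2)   # last LTS release date, from the message
asked_on = date(2019, 9, 16)   # date of this thread

deadline = lts_deadline(last_lts)
days_left = (deadline - asked_on).days
print(deadline.isoformat(), days_left)
```

Under that reading, the 12-month window closes a little over two weeks after the thread was started.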

Re: portableWordCountBatch and portableWordCountStreaming failing in Python PreCommit

2019-09-16 Thread Kyle Weaver
The original issue ("GetJobMetrics is unimplemented") is still probably hiding under the docker issue, so I filed https://issues.apache.org/jira/browse/BEAM-8245 for it Kyle Weaver | Software Engineer | github.com/ibzib | kcwea...@google.com On Mon, Sep 16, 2019 at 1:14 PM Hannah Jiang wrote:

Re: portableWordCountBatch and portableWordCountStreaming failing in Python PreCommit

2019-09-16 Thread Hannah Jiang
If we try to create a flink image from master branch, it will create *apachebeam/flink-job-server*, while the code is expecting *jenkins-docker-apache.bintray.io/beam/flink-job-server:latest*. This should be introduced when we

Re: MQTT to Python SDK

2019-09-16 Thread Chamikara Jayalath
Regarding cross-language transforms support, documentation is in flux at the moment since the API is still being updated and runner support is in development. If you want to try it out, I'd say the cross-language wordcount example is a good starting point:

Re: portableWordCountBatch and portableWordCountStreaming failing in Python PreCommit

2019-09-16 Thread Lukasz Cwik
I'm also being impacted by this on my PR[1]. I found BEAM-6316[2] that has a similar error but it was resolved Dec 2018. 1: https://github.com/apache/beam/pull/9583 2: https://issues.apache.org/jira/browse/BEAM-6316 On Mon, Sep 16, 2019 at 12:43 PM Ning Kang wrote: > A new check renders

Re: portableWordCountBatch and portableWordCountStreaming failing in Python PreCommit

2019-09-16 Thread Ning Kang
A new check renders a clearer message: Unable to find image 'jenkins-docker-apache.bintray.io/beam/flink-job-server:latest' locally docker: Error response from daemon: unknown: Repo 'apache' was not found. See 'docker run --help'. ERROR:root:Starting job service with ['docker', 'run', '-v',

Re: portableWordCountBatch and portableWordCountStreaming failing in Python PreCommit

2019-09-16 Thread Ning Kang
To Ahmet, these are warnings, I'm not able to identify the errors yet. Thanks everyone! I'm watching the Jira now. On Mon, Sep 16, 2019 at 12:07 PM Chad Dombrova wrote: > Ning, if you're having trouble making sense of the preCommit errors, you > may be interested in this Jira: >

Re: The state of external transforms in Beam

2019-09-16 Thread Robert Bradshaw
Thanks for bringing this up again. My thoughts on the open questions below. On Mon, Sep 16, 2019 at 11:51 AM Chad Dombrova wrote: > That commit solves 2 problems: > > Adds the pubsub Java deps so that they’re available in our portable pipeline > Makes the coder for the PubsubIO message-holder

Re: MQTT to Python SDK

2019-09-16 Thread Jean-Baptiste Onofré
Regarding Java SDK, you have MqttIO available. Regards JB On 16/09/2019 21:07, Lucas Magalhães wrote: > Thanks Altay.. Do you know where I could find more about cross language > transforms? Documentation and examples as well. > > thanks again > > On Mon, Sep 16, 2019 at 4:00 PM Ahmet Altay

Re: MQTT to Python SDK

2019-09-16 Thread Lucas Magalhães
Thanks Altay.. Do you know where I could find more about cross language transforms? Documentation and examples as well. thanks again On Mon, Sep 16, 2019 at 4:00 PM Ahmet Altay wrote: > A framework for python sdk to use a native unbounded connector does not > exist yet. You might be able to

Re: portableWordCountBatch and portableWordCountStreaming failing in Python PreCommit

2019-09-16 Thread Chad Dombrova
Ning, if you're having trouble making sense of the preCommit errors, you may be interested in this Jira: https://issues.apache.org/jira/browse/BEAM-8213# On Mon, Sep 16, 2019 at 12:02 PM Kyle Weaver wrote: > Python 2 isn't the reason the test is failing, that's just a warning. The > actual

Re: portableWordCountBatch and portableWordCountStreaming failing in Python PreCommit

2019-09-16 Thread Kyle Weaver
Python 2 isn't the reason the test is failing, that's just a warning. The actual error is at the very end of the log (it looks familiar to me, though I don't see a JIRA for it): <_Rendezvous of RPC that terminated with: status = StatusCode.UNIMPLEMENTED details = "Method

Re: portableWordCountBatch and portableWordCountStreaming failing in Python PreCommit

2019-09-16 Thread Ahmet Altay
To clarify, are they errors or warnings? There is a plan to stop supporting Python 2 by ~end of the year. +Valentyn Tymofieiev shared details about it earlier on the dev@ list. On Mon, Sep 16, 2019 at 11:34 AM Ning Kang wrote: > Hi! I've been seeing some errors during "Python PreCommit". > I'm

Re: MQTT to Python SDK

2019-09-16 Thread Ahmet Altay
A framework for python sdk to use a native unbounded connector does not exist yet. You might be able to use the same connector from Java using cross language transforms. /cc +Chamikara Jayalath On Mon, Sep 16, 2019 at 11:00 AM Lucas Magalhães < lucas.magalh...@paralelocs.com.br> wrote: > Hello

The state of external transforms in Beam

2019-09-16 Thread Chad Dombrova
Hi all, There was some interest in this topic at the Beam Summit this week (btw, great job to everyone involved!), so I thought I’d try to summarize the current state of things. First, let me explain the idea behind external transforms for the uninitiated. Problem: - there’s a transform

portableWordCountBatch and portableWordCountStreaming failing in Python PreCommit

2019-09-16 Thread Ning Kang
Hi! I've been seeing some errors during "Python PreCommit". I'm seeing "UserWarning: You are using Apache Beam with Python 2. New releases of Apache Beam will soon support Python 3 only. 'You are using Apache Beam with Python 2. '" Is there any plan to remove py2 tests from the pre-commit check

MQTT to Python SDK

2019-09-16 Thread Lucas Magalhães
Hello dears! I'm starting a new project here and the main source is MQTT. I couldn't find any documentation about how to develop an unbounded connector. Could anyone send me some instructions or guidelines? Thanks a lot -- Lucas Magalhães, CTO Paralelo CS - Consultoria e Serviços

New contributor to BEAM SQL

2019-09-16 Thread Kirill Kozlov
Hello everyone! My name is Kirill Kozlov, I recently joined a Dataflow team at Google and will be working on SQL filter pushdown. Can I get permission to work on issues in JIRA? My username is: kirillkozlov. Looking forward to developing Beam together! Thank you, Kirill Kozlov

Re: using avro instead of json for BigQueryIO.Write

2019-09-16 Thread Steve Niemitz
Our experience has actually been that avro is more efficient than even parquet, but that might also be skewed from our datasets. I might try to take a crack at this, I found https://issues.apache.org/jira/browse/BEAM-2879 tracking it (which coincidentally references my thread from a couple years

Re: using avro instead of json for BigQueryIO.Write

2019-09-16 Thread Reuven Lax
It's been talked about, but nobody's done anything. There are some difficulties related to type conversion (json and avro don't support the same types), but if those are overcome then an avro version would be much more efficient. I believe Parquet files would be even more efficient if you wanted to
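The type-conversion difficulty can be illustrated with a small sketch. It uses only Python's standard `json` module and does not touch BigQueryIO; the row, its field names, and the helper `json_unsupported` are invented for illustration. Avro defines a `bytes` primitive and `date`/`decimal` logical types, whereas the default JSON encoder rejects the corresponding Python values, so a JSON-based path would need a per-type conversion.

```python
import json
from datetime import date
from decimal import Decimal


def json_unsupported(row):
    """Return names of fields whose values the default JSON encoder rejects.

    Hypothetical helper for illustration only.
    """
    rejected = []
    for name, value in row.items():
        try:
            json.dumps(value)
        except TypeError:
            rejected.append(name)
    return rejected


# Made-up row mixing JSON-friendly values with ones that have
# a direct Avro representation but no native JSON one.
row = {
    "id": 1,
    "name": "a",
    "payload": b"\x00\x01",      # Avro: bytes
    "day": date(2019, 9, 16),    # Avro: date logical type
    "price": Decimal("1.10"),    # Avro: decimal logical type
}

unsupported = json_unsupported(row)
print(unsupported)
```

Each rejected field would need an agreed string or numeric encoding on the JSON side, which is the kind of mismatch the message refers to.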

using avro instead of json for BigQueryIO.Write

2019-09-16 Thread Steve Niemitz
Has anyone investigated using avro rather than json to load data into BigQuery using BigQueryIO (+ FILE_LOADS)? I'd be interested in enhancing it to support this, but I'm curious if there's any prior work here.

Beam Dependency Check Report (2019-09-16)

2019-09-16 Thread Apache Jenkins Server
High Priority Dependency Updates Of Beam Python SDK:
Dependency Name | Current Version | Latest Version | Release Date Of the Current Used Version | Release Date Of The Latest Release | JIRA Issue
mock | 2.0.0 | 3.0.5 | 2019-05-20