Re: How to use "PortableRunner" in Python SDK?

2018-11-07 Thread Ruoyun Huang
Thanks Ankur and Maximilian. Just for reference, in case other people encounter the same error message: the "permission denied" error in my original email is exactly due to the Docker-inside-Docker issue that Ankur mentioned. Thanks Ankur! Didn't make the link when you said it, had to discover

Re: [BEAM-5442] Store duplicate unknown (runner) options in a list argument

2018-11-07 Thread Maximilian Michels
+1 If the preferred approach is to eventually have the JobServer serve the options, then the best intermediate solution is to replicate common options in the SDKs. If we went down the "--runner_option" path, we would end up with multiple ways of specifying the same options. We would

Re: [Euphoria] Looking for a reviewer.

2018-11-07 Thread Maximilian Michels
Yes, I'd still like to help out where possible but I missed your mail, David. Feel free to reach out to me via mail/Slack. Or simply mention me on the pull request. I'd leave this one to JB for now but will have a look tomorrow. Cheers, Max On 07.11.18 17:47, Lukasz Cwik wrote: Welcome back

Re: Performance of BeamFnData between Python and Java

2018-11-07 Thread Lukasz Cwik
gRPC folks provide a bunch of benchmarks for different scenarios: https://grpc.io/docs/guides/benchmarking.html You would be most interested in the streaming throughput benchmarks since the Data API is written on top of the gRPC streaming APIs. 200KB/s does seem pretty small. Have you captured

Re: Running SpannerWriteIT on dataflow

2018-11-07 Thread Lukasz Cwik
You want to run this task[1] (either on your machine or by opening a GitHub PR and using a trigger phrase). Tracing back from that task, you'll find that the root ":javaPostCommit"[2] task is responsible for running that task and a bunch of others and that the Java SDK Post Commit Tests[3]

Performance of BeamFnData between Python and Java

2018-11-07 Thread Hai Lu
Hi, This is Hai from LinkedIn. I'm currently working on the Portable API for the Samza Runner. I was able to make Python work with a Samza container reading from Kafka. However, I'm seeing a severe performance issue with my setup, achieving only ~200KB throughput between the Samza runner on the Java side

Re: sick today

2018-11-07 Thread Rui Wang
Hope you feel better! -Rui On Wed, Nov 7, 2018 at 2:02 AM Maximilian Michels wrote: > Hi Etienne, > > Hope you're feeling better soon! :) > > Cheers, > Max > > On 07.11.18 10:25, Etienne Chauchot wrote: > > Hi guys, > > I'm sick today, so I might be a bit unresponsive :) > > Etienne >

Re: [Euphoria] Looking for a reviewer.

2018-11-07 Thread Lukasz Cwik
Welcome back and thanks for picking it up. On Wed, Nov 7, 2018 at 8:35 AM Jean-Baptiste Onofré wrote: > Hi, > > Yes, I'm on it. Sorry, I'm just back from 2 weeks of vacation ;) > > Regards > JB > On 07/11/2018 17:07, Scott Wegner wrote: > > Václav Plajt had previously reached out [1] looking

Re: [DISCUSS] SplittableDoFn Java SDK User Facing API

2018-11-07 Thread Lukasz Cwik
On Wed, Nov 7, 2018 at 8:33 AM Robert Bradshaw wrote: > I think that not returning the user's specific subclass should be fine. > Does the removal of markDone imply that the consumer always knows a > "final" key to claim on any given restriction? > Yes, each restriction needs to support claiming

Re: [Euphoria] Looking for a reviewer.

2018-11-07 Thread Jean-Baptiste Onofré
Hi, Yes, I'm on it. Sorry, I'm just back from 2 weeks of vacation ;) Regards JB On 07/11/2018 17:07, Scott Wegner wrote: > Václav Plajt had previously reached out [1] looking for reviewers for > Euphoria, and a few individuals volunteered. JB, Max, are you still > able to help out with Euphoria

Re: [DISCUSS] SplittableDoFn Java SDK User Facing API

2018-11-07 Thread Robert Bradshaw
I think that not returning the user's specific subclass should be fine. Does the removal of markDone imply that the consumer always knows a "final" key to claim on any given restriction? On Wed, Nov 7, 2018 at 1:45 AM Lukasz Cwik wrote: > > I have started to work on how to change the user facing

Nexmark Phrase Triggering

2018-11-07 Thread Łukasz Gajowy
Hi, recent experience with Nexmark crashes made enabling Phrase Triggering in Nexmark suites even more urgent. If you have any opinions in this area feel free to share them. Here's the link to a corresponding JIRA issue: https://issues.apache.org/jira/browse/BEAM-6011 Łukasz

Re: [Euphoria] Looking for a reviewer.

2018-11-07 Thread Scott Wegner
Václav Plajt had previously reached out [1] looking for reviewers for Euphoria, and a few individuals volunteered. JB, Max, are you still able to help out with Euphoria reviews? [1]

Running SpannerWriteIT on dataflow

2018-11-07 Thread Wout Scheepers
Hey all, I’m still running into a bug when streaming into Spanner, which I describe in the comments of https://issues.apache.org/jira/browse/BEAM-4796. I think the cause is a missing equals method on SpannerSchema, for which I get a warning in the worker logs when running on Dataflow. To

Re: sick today

2018-11-07 Thread Maximilian Michels
Hi Etienne, Hope you're feeling better soon! :) Cheers, Max On 07.11.18 10:25, Etienne Chauchot wrote: Hi guys, I'm sick today, so I might be a bit unresponsive :) Etienne

sick today

2018-11-07 Thread Etienne Chauchot
Hi guys, I'm sick today, so I might be a bit unresponsive :) Etienne

Re: [DISCUSS] More precision supported by DATETIME field in Schema

2018-11-07 Thread Robert Bradshaw
Yes, microseconds is a good compromise for covering a long enough timespan that there's little reason it could be hit (even for processing historical data). Regarding backwards compatibility, could we just change the internal representation of Beam's element timestamps, possibly with new APIs to
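
The timespan point above can be checked with a little arithmetic. This is only a sketch: it assumes timestamps are stored as a signed 64-bit count of ticks since an epoch, which the thread excerpt does not state, and the helper name representable_years is made up for illustration.

```python
# Assumption (not stated in the thread): a timestamp is a signed 64-bit
# integer counting ticks since some epoch.
MAX_INT64 = 2**63 - 1
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # Julian year, close enough here


def representable_years(ticks_per_second: int) -> float:
    """Years that fit on one side of the epoch at the given resolution."""
    return MAX_INT64 / ticks_per_second / SECONDS_PER_YEAR


# Hypothetical comparison of three candidate resolutions.
millis_years = representable_years(1_000)
micros_years = representable_years(1_000_000)
nanos_years = representable_years(1_000_000_000)

for name, years in [("millis", millis_years),
                    ("micros", micros_years),
                    ("nanos", nanos_years)]:
    print(f"{name}: about {years:,.0f} years either side of the epoch")
```

Under that assumption, microseconds cover roughly 292 thousand years on each side of the epoch, while nanoseconds cover only about 292 years, which is consistent with microseconds being described as a compromise.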