Re: Possible Python SDK performance regression

2019-09-06 Thread Ahmet Altay
I agree, let's investigate. Thomas, could you file JIRAs once you have additional information? Valentyn, I think the performance regression could be investigated now by running whatever benchmarks are available against 2.14, 2.15, and head to see whether the same regression can be reproduced.
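A minimal sketch of the comparison suggested above, assuming runtimes for the same benchmark have already been collected per release. The function name `find_regressions`, the 10% threshold, and the sample numbers are hypothetical placeholders for illustration, not Beam tooling or real measurements.

```python
from statistics import median


def find_regressions(timings, baseline, threshold=0.10):
    """Return {version: slowdown ratio} for versions slower than the baseline.

    timings   -- dict mapping a version label to a list of runtimes in seconds
    baseline  -- the version label to compare against
    threshold -- fractional slowdown that counts as a regression (assumed 10%)
    """
    base = median(timings[baseline])
    slower = {}
    for version, samples in timings.items():
        if version == baseline:
            continue
        ratio = median(samples) / base
        if ratio > 1.0 + threshold:
            slower[version] = ratio
    return slower


# Placeholder runtimes, NOT real benchmark results.
example = {
    "2.14": [100.0, 102.0, 98.0],
    "2.15": [101.0, 99.0, 100.0],
    "head": [130.0, 128.0, 135.0],
}

flagged = find_regressions(example, baseline="2.14")
print(sorted(flagged))
```

Using the median of several runs rather than a single run keeps one noisy run from hiding or faking a regression.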

Re: [discuss] How we support our users on Slack / Mailing list / StackOverflow

2019-09-06 Thread Ahmet Altay
I agree Slack can be used by Beam users, and it would be good to meet users where they are. If I understand correctly, the issue Pablo is raising is that there are not enough people online in Slack who can answer Python questions. We also need to help people who ask questions and who can answer

Re: Possible Python SDK performance regression

2019-09-06 Thread Valentyn Tymofieiev
Sounds like these regressions need to be investigated ahead of the 2.16.0 release. On Fri, Sep 6, 2019 at 6:44 PM Thomas Weise wrote: > On Fri, Sep 6, 2019 at 6:23 PM Ahmet Altay wrote: >> On Fri, Sep 6, 2019 at 6:17 PM Thomas Weise wrote: >>> On Fri, Sep 6, 2019 at 2:24 PM

Re: Possible Python SDK performance regression

2019-09-06 Thread Thomas Weise
On Fri, Sep 6, 2019 at 6:23 PM Ahmet Altay wrote: > On Fri, Sep 6, 2019 at 6:17 PM Thomas Weise wrote: >> On Fri, Sep 6, 2019 at 2:24 PM Valentyn Tymofieiev wrote: >>> +Mark Liu has added some benchmarks running across multiple Python versions. Specifically we run 1 GB

Re: Possible Python SDK performance regression

2019-09-06 Thread Valentyn Tymofieiev
On Fri, Sep 6, 2019 at 6:23 PM Ahmet Altay wrote: > On Fri, Sep 6, 2019 at 6:17 PM Thomas Weise wrote: >> On Fri, Sep 6, 2019 at 2:24 PM Valentyn Tymofieiev wrote: >>> +Mark Liu has added some benchmarks running across multiple Python versions. Specifically we run 1 GB

Re: Possible Python SDK performance regression

2019-09-06 Thread Ahmet Altay
On Fri, Sep 6, 2019 at 6:17 PM Thomas Weise wrote: > On Fri, Sep 6, 2019 at 2:24 PM Valentyn Tymofieiev wrote: >> +Mark Liu has added some benchmarks running across multiple Python versions. Specifically we run 1 GB wordcount job on Dataflow runner on Python 2.7, 3.5-3.7. The

Re: Possible Python SDK performance regression

2019-09-06 Thread Thomas Weise
On Fri, Sep 6, 2019 at 2:24 PM Valentyn Tymofieiev wrote: > +Mark Liu has added some benchmarks running across multiple Python versions. Specifically we run 1 GB wordcount job on Dataflow runner on Python 2.7, 3.5-3.7. The benchmarks do not have configured alerting and to my knowledge are

Re: [discuss] How we support our users on Slack / Mailing list / StackOverflow

2019-09-06 Thread Austin Bennett
I see no reason Slack can't be suitable for Beam users; other open source projects use Slack for user chatter, too. What it could be, though, is different from how it is currently used. There are 173 accounts in #beam-python, and a decent portion of recent conversations (at a quick glance) look

Re: Hackathon @BeamSummit @ApacheCon

2019-09-06 Thread Austin Bennett
+u...@beam.apache.org On Fri, Sep 6, 2019 at 5:24 PM Austin Bennett wrote: > Ah, yes. We'll definitely be in Hackathon space 2-3p on Monday and Tuesday (and can stay longer if needed). We aren't scheduling anything official on Wed and Thurs, given the multiple Beam tracks that are

Re: Hackathon @BeamSummit @ApacheCon

2019-09-06 Thread Austin Bennett
Ah, yes. We'll definitely be in Hackathon space 2-3p on Monday and Tuesday (and can stay longer if needed). We aren't scheduling anything official on Wed and Thurs, given the multiple Beam tracks that are occurring. On Fri, Sep 6, 2019 at 4:46 PM Mikhail Gryzykhin wrote: > I'll be in most of

Re: [discuss] How we support our users on Slack / Mailing list / StackOverflow

2019-09-06 Thread Ahmet Altay
Both StackOverflow and the mailing lists have better answer rates for Python questions. Suggesting either one of them makes sense. I also find StackOverflow easier to use, but that is a personal preference. The original problem is the lack of support within Slack. Both mailing list and StackOverflow

Re: Hackathon @BeamSummit @ApacheCon

2019-09-06 Thread Mikhail Gryzykhin
I'll be in most of the week and will join gladly. On Thu, Sep 5, 2019, 14:32 Chad Dombrova wrote: > Has a date and time been picked for this? I'll be there for part of the week and would love to join. > On Tue, Sep 3, 2019 at 11:31 AM Brian Hulette wrote: >> I will be around all week as

Re: [discuss] How we support our users on Slack / Mailing list / StackOverflow

2019-09-06 Thread Kenneth Knowles
+1 to StackOverflow first, though I'm not important for Beam Python users. Udi has a good point about discussions. If an SO question has a lot of back and forth, or no response, then it is good to point to other channels the user might try next. Kenn On Fri, Sep 6, 2019 at 2:20 PM Robert

Re: Interactive Beam - support for caching and introspection of PCollections

2019-09-06 Thread Ahmet Altay
(I believe you wanted to add +David Yan.) I am happy to see there are multiple related efforts. Both are introducing concepts. I would hope that, beyond avoiding conflicts, we are not creating duplication and are building a coherent experience. Could you point to the discussions where this was agreed upon?

Re: Possible Python SDK performance regression

2019-09-06 Thread Valentyn Tymofieiev
+Mark Liu has added some benchmarks running across multiple Python versions. Specifically, we run a 1 GB wordcount job on the Dataflow runner on Python 2.7 and 3.5-3.7. The benchmarks do not have alerting configured and, to my knowledge, are not actively monitored yet. The zoom buttons on the dashboard [1]

Re: [discuss] How we support our users on Slack / Mailing list / StackOverflow

2019-09-06 Thread Robert Bradshaw
I would also suggest SO as the best alternative, especially due to its indexability and searchability. If discussion is needed, the users list (my preference) or Slack can be good options, and ideally the resolution is brought back to SO. On Fri, Sep 6, 2019 at 1:10 PM Udi Meiri wrote: > I

Re: Interactive Beam - support for caching and introspection of PCollections

2019-09-06 Thread Ning Kang
Thanks Alexey! The materialization of PCollection data directly from cache instead of going through the pipeline result would be very helpful for what we want to achieve! On Fri, Sep 6, 2019 at 12:31 PM Alexey Strokach wrote: > Hi everyone, > I have recently finished my internship at Google,

Re: Improve container support

2019-09-06 Thread Hannah Jiang
Hi team, I haven't received any objections, so I will proceed with the settings mentioned in a previous email. A reminder to PMC members: please let me know your Docker Hub ID if you want to be an admin. Thanks, Hannah On Thu, Sep 5, 2019 at 5:02 PM Ankur Goenka wrote: > Please ignore the previous

Re: [discuss] How we support our users on Slack / Mailing list / StackOverflow

2019-09-06 Thread Udi Meiri
I don't go on Slack, but I will be notified of mentions. It has the advantage of being an informal space. SO can feel just as intimidating as the mailing list IMO. Unlike the others, it doesn't lend itself very well to discussions (you can only post comments or answers). On Fri, Sep 6, 2019 at

clickhouse tests failing

2019-09-06 Thread Elliotte Rusty Harold
At head I noticed the following:
$ ./gradlew -p sdks/java/io/ check
Configuration on demand is an incubating feature.
> Task :sdks:java:io:clickhouse:test
org.apache.beam.sdk.io.clickhouse.ClickHouseIOTest > classMethod FAILED
    java.lang.IllegalStateException

Interactive Beam - support for caching and introspection of PCollections

2019-09-06 Thread Alexey Strokach
Hi everyone, I have recently finished my internship at Google, which involved doing some work with Apache Beam in a Jupyter Notebook environment. One limitation that I encountered with my workflow is the lack of support for introspecting the contents of a PCollection and excessive boilerplate

[discuss] How we support our users on Slack / Mailing list / StackOverflow

2019-09-06 Thread Pablo Estrada
Hello all, THE SITUATION: It was brought to my attention recently that Python users in Slack are not getting much support, because most of the people knowledgeable about Beam Python are not on Slack. Unfortunately, on the Beam site, we do refer people to Slack for assistance[1]. Java users do receive

Re: [report] Understanding how people use the Apache Beam website!

2019-09-06 Thread Kenneth Knowles
On Fri, Sep 6, 2019 at 10:48 AM Pablo Estrada wrote:
> - Move de-facto documentation into official documentation (I'm looking at you State : ))
> - Encourage use of the Blog
Well, which is it?!? :-p
- Encourage use of the Blog [but not for reference documentation] :-)
Kenn
>

[report] Understanding how people use the Apache Beam website!

2019-09-06 Thread Pablo Estrada
Hello all, I've put together a report analyzing how people have been using the Apache Beam website. The report is relatively simple, but it does show some interesting insights about how users navigate the website and which pages are most popular, and it compiles a few action items (ideas) to

Re: installing Apache Beam on Pycharm with Python 3.7

2019-09-06 Thread Rakesh Kumar
Hi Priti, It would be helpful if you could provide more information about your environment and the error message. You can also ask this question on StackOverflow with the 'apache-beam' tag for better visibility. On Thu, Sep 5, 2019 at 9:43 AM Priti Badami <pbadami.srdataengin...@gmail.com> wrote: >

Re: Possible Python SDK performance regression

2019-09-06 Thread Ahmet Altay
+Valentyn Tymofieiev, do we have benchmarks for different Python versions? Was there a recent change that is specific to Python 3.x? On Fri, Sep 6, 2019 at 8:36 AM Thomas Weise wrote: > The issue is only visible with Python 3.6, not 2.7. > If there is a framework in place to add a streaming

Re: Possible Python SDK performance regression

2019-09-06 Thread Thomas Weise
The issue is only visible with Python 3.6, not 2.7. If there is a framework in place to add a streaming test, that would be great. We would use what we have internally as a starting point. On Thu, Sep 5, 2019 at 5:00 PM Ahmet Altay wrote: > On Thu, Sep 5, 2019 at 4:15 PM Thomas Weise wrote: