Re: New contributor: Michał Walenia

2019-01-30 Thread Michał Walenia
Hi all, thanks for the warm welcome :) Michał > Message written by Ahmet Altay on 30.01.2019, at > 21:32: > > Welcome Michał! > > On Wed, Jan 30, 2019 at 11:38 AM Kenneth Knowles > wrote: > Welcome Michał! > > Kenn > > *And maybe your system uses a

Findbugs -> Spotbugs ?

2019-01-30 Thread Kenneth Knowles
Over the last few hours I activated findbugs on the Dataflow Java worker and fixed or suppressed the errors. They started around 60 but fixing some uncovered others, etc. You can see the result at https://github.com/apache/beam/pull/7684. It has convinced me that findbugs still adds value, beyond

Re: Example project configuration (maven or gradle) for projects depending on BeamSQL sdk extensions

2019-01-30 Thread Kenneth Knowles
Wow, thanks for the great report. Your configuration looks good to me. I filed https://issues.apache.org/jira/browse/BEAM-6558 to figure this out. Kenn On Wed, Jan 30, 2019 at 7:01 PM Yi Pan wrote: > Hi, all, > > Newbie here trying to figure out how to use published >

Example project configuration (maven or gradle) for projects depending on BeamSQL sdk extensions

2019-01-30 Thread Yi Pan
Hi, all, Newbie here trying to figure out how to use published beam-sdks-java-extensions-sql-2.9.0 in my own project. I tried to create a gradle project to use BeamSQL sdk libraries. Here is the build.gradle I have: {code} plugins { id 'java' } group 'com.mycompany.myproject' version

Re: Another new contributor!

2019-01-30 Thread Kenneth Knowles
Welcome! On Wed, Jan 30, 2019, 17:30 Connell O'Callaghan Welcome on board Brian! > > On Wed, Jan 30, 2019 at 5:29 PM Ahmet Altay wrote: > >> Welcome Brian! >> >> On Wed, Jan 30, 2019 at 5:26 PM Brian Hulette >> wrote: >> >>> Hi everyone, >>> I'm Brian Hulette, I just switched roles at

Another new contributor!

2019-01-30 Thread Brian Hulette
Hi everyone, I'm Brian Hulette, I just switched roles at Google and I'll be contributing to Beam Portability as part of my new position. For now I'm just going through documentation and getting familiar with Beam from the user perspective, so if anything I'll just be suggesting minor edits to

Re: Another new contributor!

2019-01-30 Thread Connell O'Callaghan
Welcome on board Brian! On Wed, Jan 30, 2019 at 5:29 PM Ahmet Altay wrote: > Welcome Brian! > > On Wed, Jan 30, 2019 at 5:26 PM Brian Hulette wrote: > >> Hi everyone, >> I'm Brian Hulette, I just switched roles at Google and I'll be >> contributing to Beam Portability as part of my new

Re: Another new contributor!

2019-01-30 Thread Ahmet Altay
Welcome Brian! On Wed, Jan 30, 2019 at 5:26 PM Brian Hulette wrote: > Hi everyone, > I'm Brian Hulette, I just switched roles at Google and I'll be > contributing to Beam Portability as part of my new position. For now I'm > just going through documentation and getting familiar with Beam from

Re: [DISCUSS] Should File based IOs implement readAll() or just readFiles()

2019-01-30 Thread Chamikara Jayalath
Thanks for the clarification Ismaël and Eugene. +1 for deprecating existing FooIO.readAll() transforms in favor of FooIO.readFiles(). On Wed, Jan 30, 2019 at 3:25 PM Eugene Kirpichov wrote: > TextIO.read() and AvroIO.read() indeed perform better than match() + > readMatches() + readFiles(), due

Re: [DISCUSS] Should File based IOs implement readAll() or just readFiles()

2019-01-30 Thread Eugene Kirpichov
TextIO.read() and AvroIO.read() indeed perform better than match() + readMatches() + readFiles(), due to DWR - so for these two in particular I would not recommend such a refactoring. However, new file-based IOs that do not support DWR should only provide readFiles(). Those that do, should provide

Re: [DISCUSS] Should File based IOs implement readAll() or just readFiles()

2019-01-30 Thread Ismaël Mejía
I meant TextIO.readFiles(). As you mention Cham they are in reality the 'same'. TextIO.readAll = FileIO.match() . FileIO.readMatches . TextIO.readFiles And this pattern can be repeated to all other file-based IOs. The whole point of this discussion is to decide if such 'wrapper' transform
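
The composition Ismaël describes (readAll = match . readMatches . readFiles) can be illustrated with a toy model. This is only a sketch in plain Python, not Beam code: the in-memory `FILES` table and the `match`, `read_matches`, `read_files` and `read_all` functions are hypothetical stand-ins for the transforms named in the message.

```python
from fnmatch import fnmatchcase

# Hypothetical in-memory "filesystem" used only for this illustration.
FILES = {
    "logs/a.txt": "one\ntwo",
    "logs/b.txt": "three",
    "data/c.csv": "x,y",
}

def match(patterns):
    """Expand each file pattern into the names of matching files."""
    for pattern in patterns:
        for name in sorted(FILES):
            if fnmatchcase(name, pattern):
                yield name

def read_matches(names):
    """Turn each matched name into a (name, content) pair."""
    for name in names:
        yield name, FILES[name]

def read_files(files):
    """Emit the individual lines of every readable file."""
    for _name, content in files:
        yield from content.splitlines()

def read_all(patterns):
    """The 'wrapper': nothing more than the three steps chained together."""
    return read_files(read_matches(match(patterns)))

lines = list(read_all(["logs/*.txt"]))
```

In this toy model `read_all` adds no behaviour of its own, which is the argument for exposing only the last step and letting users compose the first two themselves.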

Re: [DISCUSS] Should File based IOs implement readAll() or just readFiles()

2019-01-30 Thread Chamikara Jayalath
On Wed, Jan 30, 2019 at 2:37 PM Chamikara Jayalath wrote: > > > On Wed, Jan 30, 2019 at 2:33 PM Ismaël Mejía wrote: > >> Ups slight typo, in the first line of the previous email I meant read >> instead of readAll >> >> On Wed, Jan 30, 2019 at 11:32 PM Ismaël Mejía wrote: >> > >> > Reuven is

Re: [DISCUSS] Should File based IOs implement readAll() or just readFiles()

2019-01-30 Thread Chamikara Jayalath
On Wed, Jan 30, 2019 at 2:33 PM Ismaël Mejía wrote: > Ups slight typo, in the first line of the previous email I meant read > instead of readAll > > On Wed, Jan 30, 2019 at 11:32 PM Ismaël Mejía wrote: > > > > Reuven is right for the example, readAll at this moment may be faster > > and also

Re: [DISCUSS] Should File based IOs implement readAll() or just readFiles()

2019-01-30 Thread Ismaël Mejía
Ups slight typo, in the first line of the previous email I meant read instead of readAll On Wed, Jan 30, 2019 at 11:32 PM Ismaël Mejía wrote: > > Reuven is right for the example, readAll at this moment may be faster > and also supports Dynamic Work Rebalancing (DWR), but the performance > of the

Re: [DISCUSS] Should File based IOs implement readAll() or just readFiles()

2019-01-30 Thread Ismaël Mejía
Reuven is right for the example, readAll at this moment may be faster and also supports Dynamic Work Rebalancing (DWR), but the performance of the other approach may (and must) be improved to be equal, once the internal implementation of TextIO.read moves to an SDF version instead of the

Re: [BEAM-6551] Python precommit is broken due to pyLint issues

2019-01-30 Thread Alex Amato
(Renaming subject and updating jira to reflect pyLint issue.) Unfortunately it occurred again when I reran the python precommit. https://jira.apache.org/jira/browse/BEAM-6551?filter=-2 04:58:37 > Task :beam-sdks-python:lintPy27 FAILED

Re: New contributor

2019-01-30 Thread Tao Feng
Thanks. Great to hear from you, Xinyu :) On Wed, Jan 30, 2019 at 12:29 PM Xinyu Liu wrote: > Welcome and glad to see you here, Tao! > > Xinyu > > On Wed, Jan 30, 2019 at 12:00 PM Kenneth Knowles wrote: > >> Done. Welcome! >> >> Kenn >> >> On Wed, Jan 30, 2019 at 11:53 AM Tao Feng wrote: >>

Re: New contributor: Michał Walenia

2019-01-30 Thread Ahmet Altay
Welcome Michał! On Wed, Jan 30, 2019 at 11:38 AM Kenneth Knowles wrote: > Welcome Michał! > > Kenn > > *And maybe your system uses a compose key. Ubuntu: > https://help.ubuntu.com/community/ComposeKey. It is a composition of L and > / just like it looks. (unless I can't see it clearly) > > > On

Re: New contributor

2019-01-30 Thread Xinyu Liu
Welcome and glad to see you here, Tao! Xinyu On Wed, Jan 30, 2019 at 12:00 PM Kenneth Knowles wrote: > Done. Welcome! > > Kenn > > On Wed, Jan 30, 2019 at 11:53 AM Tao Feng wrote: > >> Hi, >> >> I would like to contribute to beam and work on some tickets in my spare >> time. Could you please

Re: [DISCUSS] Should File based IOs implement readAll() or just readFiles()

2019-01-30 Thread Robert Bradshaw
Yes, this is precisely the goal of SDF. On Wed, Jan 30, 2019 at 8:41 PM Kenneth Knowles wrote: > > So is the latter intended for splittable DoFn but not yet using it? The > promise of SDF is precisely this composability, isn't it? > > Kenn > > On Wed, Jan 30, 2019 at 10:16 AM Jeff Klukas

Re: New contributor

2019-01-30 Thread Kenneth Knowles
Done. Welcome! On Wed, Jan 30, 2019 at 10:34 AM Bharath Kumara Subramanian < codin.mart...@gmail.com> wrote: > Hi, > I would like to contribute to beam and start picking up tickets. > Can you please add me to the beam project in JIRA? > > username: bharathkk > > Thanks, > Bharath >

Re: New contributor

2019-01-30 Thread Kenneth Knowles
Done. Welcome! Kenn On Wed, Jan 30, 2019 at 11:53 AM Tao Feng wrote: > Hi, > > I would like to contribute to beam and work on some tickets in my spare > time. Could you please add me to the beam project in JIRA? > > My jira user name is TaoFeng. > > Thanks, > -Tao >

Re: 2.7.1 (LTS) release?

2019-01-30 Thread Kenneth Knowles
Sounds good to me to target 2.7.1 and 2.10.0. I will have to re-roll RC2 after confirming fixes for the latest blockers that were found. These are not regressions from 2.9.0. But they seem severe enough that they are worth taking an extra day or two, because 2.9.0 had enough problems that I would

New contributor

2019-01-30 Thread Tao Feng
Hi, I would like to contribute to beam and work on some tickets in my spare time. Could you please add me to the beam project in JIRA? My jira user name is TaoFeng. Thanks, -Tao

Re: Portable metrics work and open questions

2019-01-30 Thread Robert Bradshaw
I think v1 of the querying API should be just "give me *all* the metrics." Shortly thereafter, we should have a v2 that allows for requesting just a subset of metrics, possibly pre-aggregated. (My preference would be a filter like {URN: regex, label: [label_name: regex]} and all matching counters
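
The filter shape suggested here, {URN: regex, label: [label_name: regex]}, might look roughly like the following. This is a minimal sketch in plain Python, not an existing Beam API: the `matches` and `filter_metrics` helpers, the dict layout of a metric, and the sample URNs and label names are all assumptions made up for the example.

```python
import re

def matches(metric, urn_pattern, label_patterns):
    """Return True if the URN and every requested label match their regex."""
    if re.fullmatch(urn_pattern, metric["urn"]) is None:
        return False
    for name, pattern in label_patterns.items():
        value = metric["labels"].get(name)
        # A metric that lacks a requested label does not match.
        if value is None or re.fullmatch(pattern, value) is None:
            return False
    return True

def filter_metrics(metrics, urn_pattern, label_patterns=None):
    """Keep only the metrics selected by the URN regex and label regexes."""
    return [m for m in metrics if matches(m, urn_pattern, label_patterns or {})]

# Hypothetical sample data; the URNs and label names are illustrative only.
metrics = [
    {"urn": "example:metric:element_count", "labels": {"PTRANSFORM": "Read"}, "value": 10},
    {"urn": "example:metric:element_count", "labels": {"PTRANSFORM": "Write"}, "value": 7},
    {"urn": "example:metric:user_counter", "labels": {"PTRANSFORM": "Read"}, "value": 3},
]

selected = filter_metrics(metrics, r"example:metric:element_.*", {"PTRANSFORM": "Re.*"})
```

Calling `filter_metrics(metrics, r".*")` with no label patterns corresponds to the v1 behaviour of returning every metric.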

Re: [DISCUSS] Should File based IOs implement readAll() or just readFiles()

2019-01-30 Thread Kenneth Knowles
So is the latter intended for splittable DoFn but not yet using it? The promise of SDF is precisely this composability, isn't it? Kenn On Wed, Jan 30, 2019 at 10:16 AM Jeff Klukas wrote: > Reuven - Is TextIO.read().from() a more complex case than the topic Ismaël > is bringing up in this

Re: New contributor: Michał Walenia

2019-01-30 Thread Kenneth Knowles
Welcome Michał! Kenn *And maybe your system uses a compose key. Ubuntu: https://help.ubuntu.com/community/ComposeKey. It is a composition of L and / just like it looks. (unless I can't see it clearly) On Wed, Jan 30, 2019 at 10:20 AM Rui Wang wrote: > Welcome! Welcome! > > -Rui > > On Wed, Jan

Re: [BEAM-6551] Python precommit is broken cannot parse value of 'http_proxy' env var

2019-01-30 Thread Michael Luckey
Lucky u. I always have to exclude the python run from my Gradle builds locally… it seems to work reliably only if not run in parallel :( I always thought there is some parallelism issue - possibly on the virtual envs… But as precommit is doing well on Jenkins… Never had the time to look into the

Re: [BEAM-6551] Python precommit is broken cannot parse value of 'http_proxy' env var

2019-01-30 Thread Alex Amato
I'm just going to re-run the precommit and see if it happens again. I can't find the error that explains why it failed, and it won't fail when I run locally... On Wed, Jan 30, 2019 at 10:08 AM Michael Luckey wrote: > There is (probably?): > > *04:58:37* >* Task :beam-sdks-python:lintPy27* FAILED > > > Dunno

New contributor

2019-01-30 Thread Bharath Kumara Subramanian
Hi, I would like to contribute to beam and start picking up tickets. Can you please add me to the beam project in JIRA? username: bharathkk Thanks, Bharath

Re: New contributor: Michał Walenia

2019-01-30 Thread Rui Wang
Welcome! Welcome! -Rui On Wed, Jan 30, 2019 at 9:22 AM Łukasz Gajowy wrote: > Impressive, so many ways! I didn't know the mac trick though, thanks > Ankur. :D > > On Wed, Jan 30, 2019 at 17:24 Ismaël Mejía wrote: > >> Welcome Michał! >> >> For more copy/pasteable foreign-language characters:

Re: [DISCUSSION] ParDo Async Java API

2019-01-30 Thread Xinyu Liu
I put the asks and email discussions in this JIRA to track the Async API: https://jira.apache.org/jira/browse/BEAM-6550. Bharath on the SamzaRunner side is willing to take a stab at this. He will come up with some design doc based on our discussions. Will update the thread once it's ready. Really

Re: [DISCUSS] Should File based IOs implement readAll() or just readFiles()

2019-01-30 Thread Jeff Klukas
Reuven - Is TextIO.read().from() a more complex case than the topic Ismaël is bringing up in this thread? I'm surprised to hear that the two examples have different performance characteristics. Reading through the implementation, I guess the fundamental difference is whether a given configuration

Re: [BEAM-6551] Python precommit is broken cannot parse value of 'http_proxy' env var

2019-01-30 Thread Michael Luckey
There is (probably?): *04:58:37* >* Task :beam-sdks-python:lintPy27* FAILED Dunno if more is failing On Wed, Jan 30, 2019 at 6:51 PM Alex Amato wrote: > Ah, my mistake. then I suppose I could not locate the failure in my build > output. There should be some magic "failure" keyword or

Re: [VOTE] Release 2.10.0, release candidate #1

2019-01-30 Thread Chamikara Jayalath
FYI, created another blocker: https://issues.apache.org/jira/browse/BEAM-6552 Thanks, Cham On Tue, Jan 29, 2019 at 4:38 PM Ahmet Altay wrote: > -1, I ran into a new blocking issue: > https://issues.apache.org/jira/browse/BEAM-6545 > > On Tue, Jan 29, 2019 at 4:08 PM Kenneth Knowles wrote: >

Re: [BEAM-6551] Python precommit is broken cannot parse value of 'http_proxy' env var

2019-01-30 Thread Alex Amato
Ah, my mistake. Then I suppose I could not locate the failure in my build output. There should be some magic "failure" keyword or something, but I really don't know what it is. Plus the build scan says no failures. On Wed, Jan 30, 2019 at 9:45 AM Michael Luckey wrote: > Don't think this is

Re: [BEAM-6551] Python precommit is broken cannot parse value of 'http_proxy' env var

2019-01-30 Thread Michael Luckey
Don't think this is failing the build... E.g. here ( https://scans.gradle.com/s/qxtrr65etbcem/console-log#L4674 ) we seem to have the same trace on a successful build? On Wed, Jan 30, 2019 at 6:30 PM Alex Amato wrote: > JIRA link: > > https://jira.apache.org/jira/browse/BEAM-6551?filter=-2 > >

Re: [DISCUSS] Should File based IOs implement readAll() or just readFiles()

2019-01-30 Thread Reuven Lax
Jeff, what you did here is not simply a refactoring. These two are quite different, and will likely have different performance characteristics. The first evaluates the wildcard, and allows the runner to pick appropriate bundling. Bundles might contain multiple files (if they are small), and the

Re: [DISCUSS] Should File based IOs implement readAll() or just readFiles()

2019-01-30 Thread Jeff Klukas
I would prefer we move towards option [2]. I just tried the following refactor in my own code from: return input .apply(TextIO.read().from(fileSpec)); to: return input .apply(FileIO.match().filepattern(fileSpec)) .apply(FileIO.readMatches())

Re: New contributor: Michał Walenia

2019-01-30 Thread Łukasz Gajowy
Impressive, so many ways! I didn't know the mac trick though, thanks Ankur. :D On Wed, Jan 30, 2019 at 17:24 Ismaël Mejía wrote: > Welcome Michał! > > For more copy/pasteable foreign-language characters: > http://polish.typeit.org/ > > Yay for more people with crazy accents, (yes I know I can

Re: Portable metrics work and open questions

2019-01-30 Thread Alex Amato
Okay, yeah I was tossing and turning last night thinking the same thing. The querying API needs to be relatively simple, and not use a structure similar to the URNs/MonitoringInfo structure. But there should be a way to pass through metrics so that they can be queried out. I think that is missing

Re: New contributor: Michał Walenia

2019-01-30 Thread Ismaël Mejía
Welcome Michał! For more copy/pasteable foreign-language characters: http://polish.typeit.org/ Yay for more people with crazy accents, (yes I know I can be biased :P) Ismaël On Wed, Jan 30, 2019 at 3:30 PM Ankur Goenka wrote: > > Welcome Michał! > > long press "l" on mac to type "ł" :) > >

[DISCUSS] Should File based IOs implement readAll() or just readFiles()

2019-01-30 Thread Ismaël Mejía
Hello, A ‘recent’ pattern of use in Beam is to have in file based IOs a `readAll()` implementation that basically matches a `PCollection` of file patterns and reads them, e.g. `TextIO`, `AvroIO`. `ReadAll` is implemented by an expand function that matches files with FileIO and then reads them

Re: New contributor: Michał Walenia

2019-01-30 Thread Ankur Goenka
Welcome Michał! long press "l" on mac to type "ł" :) On Wed, Jan 30, 2019 at 7:57 PM Maximilian Michels wrote: > Welcome Michał! > > I do have to find out how to type ł without copy/pasting it every time ;) > > On 30.01.19 15:22, Łukasz Gajowy wrote: > > Hi all, > > > > a new fellow joined

Re: New contributor: Michał Walenia

2019-01-30 Thread Maximilian Michels
Welcome Michał! I do have to find out how to type ł without copy/pasting it every time ;) On 30.01.19 15:22, Łukasz Gajowy wrote: Hi all, a new fellow joined Kasia Kucharczyk and me to work on integration and load testing areas. Welcome, Michał! Łukasz

New contributor: Michał Walenia

2019-01-30 Thread Łukasz Gajowy
Hi all, a new fellow joined Kasia Kucharczyk and me to work on integration and load testing areas. Welcome, Michał! Łukasz

Re: Portable metrics work and open questions

2019-01-30 Thread Robert Bradshaw
Thanks for writing this up. I left some comments in the doc, but at a high level I am in favor of the "more deeply overhaul SDKs' metrics/querying structures to use MonitoringInfos / URNs" option, at least over the Jobs API, for consistency and completeness. The SDK can provide whatever

2.7.1 (LTS) release?

2019-01-30 Thread Maximilian Michels
Hi everyone, I know we are in the midst of releasing 2.10.0, but with the release process taking its time I consider creating a patch release for this issue in the FlinkRunner: https://jira.apache.org/jira/browse/BEAM-5386 Initially I thought it would be good to do a 2.9.1 release, but since

Re: [BEAM-5442] Store duplicate unknown (runner) options in a list argument

2019-01-30 Thread Maximilian Michels
Thomas was kind enough to implement Option 3) in https://github.com/apache/beam/pull/7597 Heads-up to the Go SDK people to eventually implement the new DescribePipelineOptionsRequest. Tracking issue: https://issues.apache.org/jira/browse/BEAM-6549 Also related, we will have to follow up with

Re: BEAM-6324 / #7340: "I've pretty much given up on the PR being merged. I use my own fork for my projects"

2019-01-30 Thread Łukasz Gajowy
Wow. I missed the sentence. Judging from the fact that others also proposed adding it, I think it might need some care. I proposed a PR here: https://github.com/apache/beam/pull/7670 Łukasz On Wed, Jan 30, 2019 at 00:39 Kenneth Knowles wrote: > > > On Mon, Jan 28, 2019 at 5:25 AM Łukasz Gajowy