Re: [DISCUSS] [BEAM-438] Rename one of PTransform.apply or PInput.apply

2016-12-07 Thread Tyler Akidau
+1 On Thu, Dec 8, 2016 at 1:10 PM Jean-Baptiste Onofré wrote: > +1 > > Regards > JB > > On 12/07/2016 10:37 PM, Kenneth Knowles wrote: > > Hi all, > > > > I want to bring up another major backwards-incompatible change before it > is > > too late, to resolve [BEAM-438]. > > >

Re: Build failed in Jenkins: beam_PostCommit_Python_Verify #838

2016-12-07 Thread Ahmet Altay
This timeout is tracked in https://issues.apache.org/jira/browse/BEAM-1109. On Wed, Dec 7, 2016 at 12:24 PM, Apache Jenkins Server < jenk...@builds.apache.org> wrote: > See Verify/838/changes> > > Changes: > > [robertwb] [BEAM-1077]

Container Orchestration software for hosting data stores

2016-12-07 Thread Stephen Sisk
Hi, I wanted to give a quick update on my investigation of container management systems/orchestration software - we want to do this to allow us to host instances of data stores for IO transform testing as discussed in [1]. We wanted to compare: * kubernetes - the completely open source version

Re: [DISCUSS] [BEAM-438] Rename one of PTransform.apply or PInput.apply

2016-12-07 Thread Ben Chambers
+1 to pushing all remaining "major" (read: likely to affect everyone) breaks through in a single release. On Wed, Dec 7, 2016 at 3:56 PM Dan Halperin wrote: > +user@, because this is a user-impacting change and they might not all be > paying attention to the dev@

Re: [DISCUSS] [BEAM-438] Rename one of PTransform.apply or PInput.apply

2016-12-07 Thread Dan Halperin
+user@, because this is a user-impacting change and they might not all be paying attention to the dev@ list. +1 I'm mildly reluctant because this will break all users that have written composite transforms -- and I'm the jerk that filed the issue (a few times now, on different iterations of the

Re: [DISCUSS] [BEAM-438] Rename one of PTransform.apply or PInput.apply

2016-12-07 Thread Aljoscha Krettek
+1 I've seen this mistake myself in some PRs. On Thu, 8 Dec 2016 at 06:10 Ben Chambers wrote: > +1 -- This seems like the best option. It's a mechanical change, and the > compiler will let users know it needs to be made. It will make the mistake > much less

Re: [DISCUSS] [BEAM-438] Rename one of PTransform.apply or PInput.apply

2016-12-07 Thread Ben Chambers
+1 -- This seems like the best option. It's a mechanical change, and the compiler will let users know it needs to be made. It will make the mistake much less common, and when it occurs it will be much clearer what is wrong. It would be great if we could make the mis-use a compiler problem or a

[DISCUSS] [BEAM-438] Rename one of PTransform.apply or PInput.apply

2016-12-07 Thread Kenneth Knowles
Hi all, I want to bring up another major backwards-incompatible change before it is too late, to resolve [BEAM-438]. Summary: Leave PInput.apply the same but rename PTransform.apply to PTransform.expand. I have opened [PR #1538] just for reference (it took 30 seconds using IDE automated

Re: Naming and API for executing shell commands

2016-12-07 Thread Eugene Kirpichov
I think it makes sense as a separate module, since I'd hesitate to call it an "IO" because importing and exporting data is not the main thing about executing shell commands. Let's continue discussion of the API per se: how can we design an API to encompass all these use cases? WDYT? On Wed, Dec

Re: [DISCUSS] ExecIO

2016-12-07 Thread Eugene Kirpichov
(discussion continues on a thread called "Naming and API for executing shell commands") On Wed, Dec 7, 2016 at 1:32 AM Jean-Baptiste Onofré wrote: > By the way, just to elaborate a bit why I provided it as an IO: > > 1. From a user experience perspective, I think we have to

Re: Naming and API for executing shell commands

2016-12-07 Thread Jean-Baptiste Onofré
Hi Eugene, I like your ShellCommands.execute().withCommand("foo")! And you listed valid points and usages, especially around the input/output of the command. My question is where do we put such a ShellCommands extension? As a module under IO? As a new extensions module? Regards JB On

Re: HiveIO

2016-12-07 Thread Jean-Baptiste Onofré
Yes that's the first idea ;) Regards JB On Dec 7, 2016, 17:27, at 17:27, Vinoth Chandar wrote: >Interesting. So all the planning & execution is done by Hive, and Beam >will >process the results of the query? > >On Wed, Dec 7, 2016 at 8:24 AM, Jean-Baptiste Onofré

Re: HiveIO

2016-12-07 Thread Vinoth Chandar
Interesting. So all the planning & execution is done by Hive, and Beam will process the results of the query? On Wed, Dec 7, 2016 at 8:24 AM, Jean-Baptiste Onofré wrote: > Hi > > The HiveIO will directly use the native API and HiveQL. That's the plan on > which we are

Re: HiveIO

2016-12-07 Thread Jean-Baptiste Onofré
Hi The HiveIO will directly use the native API and HiveQL. That's the plan on which we are working right now. Regards JB On Dec 7, 2016, 17:18, at 17:18, Vinoth Chandar wrote: >Hi, > >I am not looking for a way to actually execute the query on Hive. I >would >like to do

Re: HiveIO

2016-12-07 Thread Vinoth Chandar
Hi, I am not looking for a way to actually execute the query on Hive. I would like to do something similar to Spark SQL/HiveContext, but with Beam. Just have a HiveIO that reads metadata from Hive metastore, and then later use a Spark runner to execute the query. So, HiveJDBC is not an option I

Re: DataCamp II Salzburg

2016-12-07 Thread Sergio Fernández
The slides I used are available at http://www.slideshare.net/Wikier/introduction-to-apache-beam-incubating-datacamp-salzburg-7-dec-2016 People really like it ;-) On Fri, Dec 2, 2016 at 8:21 AM, Davor Bonaci wrote: > This is great! (Please share any recording after the

Re: [DISCUSS] ExecIO

2016-12-07 Thread Jean-Baptiste Onofré
By the way, just to elaborate a bit why I provided it as an IO: 1. From a user experience perspective, I think we have to provide a convenient way to write pipelines. Any syntax simplifying this is valuable. I think it's easier to write: pipeline.apply(ExecIO.read().withCommand("foo")) than: