Re: Proposal: Beam to use GCP Libraries BOM

2020-03-05 Thread Ismaël Mejía
+1 Sounds like a good improvement for users and maintainers! On Thu, Mar 5, 2020 at 6:59 AM Alex Van Boxel wrote: > > +1, I can remember the countless hours that we fought with Google > dependencies. > > On Thu, Mar 5, 2020, 04:07 Chamikara Jayalath wrote: >> >> +1 for this. >> >> This will

Re: [Discuss] Propose Calcite Vendor Release (1.22.0)

2020-03-05 Thread Ismaël Mejía
The calcite vote already passed so this is good to go, thanks for volunteering Rui. https://lists.apache.org/thread.html/r4962a4a2bacf481f2ee1064806b78829d96385c2e4a3c0ecb24a55a2%40%3Cdev.calcite.apache.org%3E On Thu, Mar 5, 2020 at 8:10 AM Kai Jiang wrote: > > Thanks, Rui! Big +1 for calcite

[DISCUSS] Query external resources as Tables with Beam SQL

2020-03-05 Thread Taher Koitawala
Hi All, We have been using Apache Beam extensively to process huge amounts of data. While Beam is really powerful and can solve a huge number of use cases, a Beam job's development and testing time is significantly high. This gap can be filled with Beam SQL, where a complete SQL based

Re: [EXTERNAL] Re: Java Build broken

2020-03-05 Thread Maximilian Michels
Good find, Thomas! It looks like it is for testing releases because they are staged to this repository. IMHO there is no need for it to be enabled by default. -Max On 04.03.20 23:06, Thomas Weise wrote: I ran into this problem today and found that removing

Re: [DISCUSS] Query external resources as Tables with Beam SQL

2020-03-05 Thread Andrew Pilloud
I believe we have this functionality already: https://beam.apache.org/documentation/dsls/sql/extensions/create-external-table/ Existing GCP tables can also be loaded through the GCP datacatalog metastore. What are you proposing that is new? Andrew On Thu, Mar 5, 2020, 12:29 AM Taher Koitawala

Re: [DISCUSS] Query external resources as Tables with Beam SQL

2020-03-05 Thread Taher Koitawala
Also, auto-creation is not there. On Thu, Mar 5, 2020 at 3:59 PM Taher Koitawala wrote: > Proposal is to add more sources and also have event time or > processing time enhancements further on them > > On Thu, Mar 5, 2020 at 3:50 PM Andrew Pilloud wrote: > >> I believe we have this functionality

Re: [DISCUSS] Query external resources as Tables with Beam SQL

2020-03-05 Thread Taher Koitawala
Proposal is to add more sources and also have event time or processing time enhancements further on them On Thu, Mar 5, 2020 at 3:50 PM Andrew Pilloud wrote: > I believe we have this functionality already: > https://beam.apache.org/documentation/dsls/sql/extensions/create-external-table/ > >

No space left on device - beam-jenkins 1 and 7

2020-03-05 Thread Michał Walenia
Hi there, it seems we have a problem with Jenkins workers again. Nodes 1 and 7 both fail jobs with "No space left on device". Who is the best person to contact in these cases (someone with access permissions to the workers)? I also noticed that such errors are becoming more and more frequent

Re: Proposal: Beam to use GCP Libraries BOM

2020-03-05 Thread Filipe Regadas
Big +1, this is a step in the right direction, and checking against Beam's other direct and transitive deps is crucial since the referenced BOM only covers a small part of them. Apache Commons, Jackson, `com.google.{api, apis, cloud}`, and slf4j come to mind. Filipe Regadas On Thu, Mar 5, 2020 at 3:33 AM

Re: [VOTE] Upgrade gradle to 6.2

2020-03-05 Thread Alex Van Boxel
I will _/ _/ Alex Van Boxel On Thu, Mar 5, 2020 at 8:17 PM Ismaël Mejía wrote: > Looks like we have consensus on this one. Can you create a JIRA to > track this, Alex? > I found this interesting presentation and associated repo, for those > interested in the improvements we can gain with the

Re: [Discuss] Propose Calcite Vendor Release (1.22.0)

2020-03-05 Thread Xinyu Liu
Thanks, Rui! We've been waiting for the new version of Calcite which has the fix to unflatten the fields. Seems this version will come with it. Thanks, Xinyu On Thu, Mar 5, 2020 at 12:41 AM Ismaël Mejía wrote: > The calcite vote already passed so this is good to go, thanks for > volunteering

Re: Proposal: Beam to use GCP Libraries BOM

2020-03-05 Thread Tomo Suzuki
> Do Spark or Flink have BOMs? Not that I know of. I couldn't find "bom" in their artifacts [1, 2]. [1]: https://search.maven.org/search?q=g:org.apache.flink [2]: https://search.maven.org/search?q=g:org.apache.spark On Thu, Mar 5, 2020 at 1:46 PM Kenneth Knowles wrote: > +1 and you have

Re: [VOTE] Upgrade gradle to 6.2

2020-03-05 Thread Alex Van Boxel
https://issues.apache.org/jira/browse/BEAM-9456 I'll take it step-by-step. Expect a slow process, as until now I haven't focused on Python, Go and other runners. So be patient. _/ _/ Alex Van Boxel On Thu, Mar 5, 2020 at 9:13 PM Jean-Baptiste Onofre wrote: > Fair enough, we have the consensus,

Re: [DISCUSS] Query external resources as Tables with Beam SQL

2020-03-05 Thread Andrew Pilloud
For BigQueryIO, "CREATE EXTERNAL TABLE" does exactly what you describe in "CREATE TABLE". You could add a table property to set the CreateDisposition if you wanted to change that behavior. Andrew On Thu, Mar 5, 2020 at 11:10 AM Rui Wang wrote: > "CREATE TABLE" can be used to indicate if a

Re: [VOTE] Vendored Dependencies Release gRPC 1.26.0 v0.3 for BEAM-9288 RC #3

2020-03-05 Thread Ismaël Mejía
+1 (binding) Verified signatures Verified that there are no conscrypt classes or binaries in jar Verified pom.xml has runtime dependency on conscrypt On Thu, Mar 5, 2020 at 9:14 PM Jean-Baptiste Onofre wrote: > > +1 (binding) > > Regards > JB > > On Mar 5, 2020, at 19:55, Luke Cwik wrote: > >

Re: [DISCUSS] Query external resources as Tables with Beam SQL

2020-03-05 Thread Rui Wang
"CREATE TABLE" can be used to indicate if a table does not exist, BeamSQL will help create it in storage systems if allowed, while "CREATE EXTERNAL TABLE" can be used only for registering a table, no matter if the table exists or not. BeamSQL provides a finer-grained way to distinct different

Re: [VOTE] Upgrade gradle to 6.2

2020-03-05 Thread Jean-Baptiste Onofre
Fair enough, we have consensus, so I agree to create a Jira and move forward with this update. Regards JB > On Mar 5, 2020, at 20:16, Ismaël Mejía wrote: > > Looks like we have consensus on this one. Can you create a JIRA to > track this, Alex? > I found this interesting presentation and

Re: [VOTE] Vendored Dependencies Release gRPC 1.26.0 v0.3 for BEAM-9288 RC #3

2020-03-05 Thread Jean-Baptiste Onofre
+1 (binding) Regards JB > On Mar 5, 2020, at 19:55, Luke Cwik wrote: > > Please review the release of the following artifacts that we vendor: > * beam-vendor-grpc-1_26_0 > > Hi everyone, > Please review and vote on the release candidate #1 for the version 0.3, as > follows: > [ ] +1,

Re: [DISCUSS] Query external resources as Tables with Beam SQL

2020-03-05 Thread Rui Wang
Back to this proposal, I think it's ok if there is a need to further distinguish the create/not create behaviour by either options or using "create external table/create table". -Rui On Thu, Mar 5, 2020 at 11:19 AM Andrew Pilloud wrote: > For BigQueryIO, "CREATE EXTERNAL TABLE" does exactly

Re: [VOTE] Upgrade gradle to 6.2

2020-03-05 Thread Ismaël Mejía
Looks like we have consensus on this one. Can you create a JIRA to track this, Alex? I found this interesting presentation and associated repo, for those interested in the improvements we can gain with the move to version 6.x.x https://melix.github.io/gradle-6-whats-new/#/

Re: [Discuss] Propose Calcite Vendor Release (1.22.0)

2020-03-05 Thread Robin Qiu
+1 Thanks Rui for proposing this. Bringing in the newest version of Calcite will also simplify our codebase [1] and resolve some existing issues [2] [1] https://issues.apache.org/jira/browse/BEAM-9190 [2] https://issues.apache.org/jira/browse/BEAM-9191 On Thu, Mar 5, 2020 at 11:42 AM Xinyu Liu

Run Python PreCommit break?

2020-03-05 Thread Rui Wang
Hi Community, Is python precommit breaking? I have observed a consistent test case failure from apache_beam.runners.portability.portable_runner_test.PortableRunnerTest.test_group_by_key [1] in the release branch. It might have been fixed in the master branch. Does anyone have insight on it?

Re: Contributing Twister2 runner to Apache Beam

2020-03-05 Thread Kenneth Knowles
I agree with both of you, mostly :-) The monorepo approach doesn't work/scale well for shipped libraries (name a Google library that silently just works and never causes any dependency problems) and the pain we feel has been constant and increasing, but I don't think we are at the breaking point.

Re: Contributing Twister2 runner to Apache Beam

2020-03-05 Thread Robert Bradshaw
I think we will get to a point where it makes sense for runners to live in their own repositories, with their own release cadence, but we're not at that point yet. One prerequisite is a stable API--we're closing in on that with the portability protos, but many (java) runners actually share the

Re: Run Python PreCommit break?

2020-03-05 Thread Robert Bradshaw
https://github.com/apache/beam/pull/11021 for getting rid of these vestigial error logs. On Thu, Mar 5, 2020 at 1:21 PM Rui Wang wrote: > > Hi Community, > > Is python precommit breaking? I have observed a consistent test case failure > from >

Re: [Discuss] Propose Calcite Vendor Release (1.22.0)

2020-03-05 Thread Kenneth Knowles
+1 thanks! On Thu, Mar 5, 2020 at 12:50 PM Robin Qiu wrote: > +1 > > Thanks Rui for proposing this. Bringing in the newest version of Calcite > will also simplify our codebase [1] and resolve some existing issues [2] > > [1] https://issues.apache.org/jira/browse/BEAM-9190 > [2]

Re: [VOTE] Vendored Dependencies Release gRPC 1.26.0 v0.3 for BEAM-9288 RC #2

2020-03-05 Thread Luke Cwik
Cancelling this release. I made a mistake with the commit id I built from, which I should have caught before sending this out. On Thu, Mar 5, 2020 at 10:45 AM Luke Cwik wrote: > Please review the release of the following artifacts that we vendor: > * beam-vendor-grpc-1_26_0 > > Hi

Re: Proposal: Beam to use GCP Libraries BOM

2020-03-05 Thread Tomo Suzuki
> How would the Apache Beam BOM and GCP BOM work together? I envision there will be (new) "Beam GCP BOM" that imports (existing) Beam BOM and GCP Libraries BOM with necessary overwrites (such as Guava version). This clarifies which versions of Google libraries should be compatible with Beam's

[VOTE] Vendored Dependencies Release gRPC 1.26.0 v0.3 for BEAM-9288 RC #3

2020-03-05 Thread Luke Cwik
Please review the release of the following artifacts that we vendor: * beam-vendor-grpc-1_26_0 Hi everyone, Please review and vote on the release candidate #1 for the version 0.3, as follows: [ ] +1, Approve the release [ ] -1, Do not approve the release (please provide specific comments) The

Re: [DISCUSS] Query external resources as Tables with Beam SQL

2020-03-05 Thread Andrew Pilloud
I'm not following the "CREATE TABLE" vs "CREATE EXTERNAL TABLE" distinction. We added the "EXTERNAL" to make it clear that Beam wasn't storing the table. Most of our current table providers will create the underlying table as needed. Andrew On Thu, Mar 5, 2020 at 10:47 AM Rui Wang wrote: >

Re: [VOTE] Vendored Dependencies Release gRPC 1.26.0 v0.3 for BEAM-9288 RC #3

2020-03-05 Thread Luke Cwik
+1 (binding) Verified that conscrypt jars and .so files don't appear in the jar. On Thu, Mar 5, 2020 at 10:55 AM Luke Cwik wrote: > Please review the release of the following artifacts that we vendor: > * beam-vendor-grpc-1_26_0 > > Hi everyone, > Please review and vote on the release

Re: Proposal: Beam to use GCP Libraries BOM

2020-03-05 Thread Luke Cwik
How would the Apache Beam BOM and GCP BOM work together? On Thu, Mar 5, 2020 at 7:25 AM Filipe Regadas wrote: > Big +1, this is a step in the right direction, and checking against > Beam's other direct and transitive deps is crucial since the referenced BOM only > covers a small part of them. Apache

Re: [DISCUSS] Query external resources as Tables with Beam SQL

2020-03-05 Thread Rui Wang
There are two new things in the proposal: 1. A Spanner source in SQL (contributions welcome). 2. A CREATE TABLE statement rather than CREATE EXTERNAL TABLE (the difference is whether the table is assumed to exist or not). There is a table property in the statement already that you can reuse to save

[VOTE] Vendored Dependencies Release gRPC 1.26.0 v0.3 for BEAM-9288 RC #2

2020-03-05 Thread Luke Cwik
Please review the release of the following artifacts that we vendor: * beam-vendor-grpc-1_26_0 Hi everyone, Please review and vote on the release candidate #1 for the version 0.3, as follows: [ ] +1, Approve the release [ ] -1, Do not approve the release (please provide specific comments) The

Re: Proposal: Beam to use GCP Libraries BOM

2020-03-05 Thread Kenneth Knowles
+1 and you have phrased the benefits and limitations well. We have plenty of not-Google-related dependencies that use Guava and protobuf (I know of Calcite, Cassandra, Kinesis, and Spark) so there's still work in managing deps, but the BOM should make it a lot easier to upgrade all these tightly