[beam] branch master updated (88d687b -> 08a9d54)
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from 88d687b Merge pull request #14228: [BEAM-7092] Fix artifact names for Spark separated modules + upgrade to Spark 3.1.1 add b2ce15e [BEAM-8221] Fix NPE while reading from non-existent Kafka topic new 08a9d54 Merge pull request #14217: [BEAM-8221] Fix NPE while reading from non-existent Kafka topic The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: .../beam/sdk/io/kafka/KafkaUnboundedSource.java| 7 +- .../org/apache/beam/sdk/io/kafka/KafkaIOTest.java | 26 ++ 2 files changed, 32 insertions(+), 1 deletion(-)
[beam] 01/01: Merge pull request #14217: [BEAM-8221] Fix NPE while reading from non-existent Kafka topic
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git commit 08a9d54b60cd4743afeb61e697fa6016279b1ac6 Merge: 88d687b b2ce15e Author: Ismaël Mejía AuthorDate: Mon Mar 15 11:39:55 2021 +0100 Merge pull request #14217: [BEAM-8221] Fix NPE while reading from non-existent Kafka topic .../beam/sdk/io/kafka/KafkaUnboundedSource.java| 7 +- .../org/apache/beam/sdk/io/kafka/KafkaIOTest.java | 26 ++ 2 files changed, 32 insertions(+), 1 deletion(-)
[beam] branch master updated (7ff7ceb -> 25bad0e)
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from 7ff7ceb Create 2.30.0 entry in CHANGES.md (#14250) add 3d4518f [BEAM-11726] Bump Clickhouse version to "0.2.6" add 25bad0e Merge pull request #14246: [BEAM-11726] Bump Clickhouse version to 0.2.6 No new revisions were added by this update. Summary of changes: sdks/java/io/clickhouse/build.gradle | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
[beam] branch master updated (25bad0e -> 8d6fa737)
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from 25bad0e Merge pull request #14246: [BEAM-11726] Bump Clickhouse version to 0.2.6 add 7af3d13 [BEAM-11764] Bump com.amazonaws version to 1.11.974 add 8d6fa737 Merge pull request #14247: [BEAM-11764] Bump com.amazonaws version to 1.11.974 No new revisions were added by this update. Summary of changes: buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
[beam] branch master updated (540957e -> c4f62ca)
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from 540957e Merge pull request #14253: [BEAM-11992] Run CrossLanguage ValidatesRunner tests only for Spark 2 add 6db22c6 [BEAM-8778] Bump software.amazon.awssdk version to 2.15.31 new c4f62ca Merge pull request #14264: [BEAM-8778] Bump software.amazon.awssdk version to 2.15.31 The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: .../main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy | 5 - sdks/java/io/{kinesis => amazon-web-services}/OWNERS | 0 sdks/java/io/{kinesis => amazon-web-services2}/OWNERS | 0 sdks/java/io/amazon-web-services2/build.gradle | 10 -- 4 files changed, 8 insertions(+), 7 deletions(-) copy sdks/java/io/{kinesis => amazon-web-services}/OWNERS (100%) copy sdks/java/io/{kinesis => amazon-web-services2}/OWNERS (100%)
[beam] 01/01: Merge pull request #14264: [BEAM-8778] Bump software.amazon.awssdk version to 2.15.31
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git commit c4f62ca933b570ec300f501dc02f360054d5634a Merge: 540957e 6db22c6 Author: Ismaël Mejía AuthorDate: Thu Mar 18 12:40:45 2021 +0100 Merge pull request #14264: [BEAM-8778] Bump software.amazon.awssdk version to 2.15.31 .../main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy | 5 - sdks/java/io/amazon-web-services/OWNERS| 4 sdks/java/io/amazon-web-services2/OWNERS | 4 sdks/java/io/amazon-web-services2/build.gradle | 10 -- 4 files changed, 16 insertions(+), 7 deletions(-)
[beam] 01/01: Merge pull request #14275: [BEAM-11023] Fix GroupByKeyTest testLargeKeys100MB and testGroupByKeyWithBadEqualsHashCode failing on Spark Structured Streaming runner
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git commit 5c9c8c0d1ef0f8224fac77265873d5f6afc431d2 Merge: e96f69a 3ac902a Author: Ismaël Mejía AuthorDate: Thu Mar 18 21:53:03 2021 +0100 Merge pull request #14275: [BEAM-11023] Fix GroupByKeyTest testLargeKeys100MB and testGroupByKeyWithBadEqualsHashCode failing on Spark Structured Streaming runner runners/spark/spark_runner.gradle | 2 ++ .../src/test/java/org/apache/beam/sdk/transforms/GroupByKeyTest.java | 4 ++-- 2 files changed, 4 insertions(+), 2 deletions(-)
[beam] branch master updated (e96f69a -> 5c9c8c0)
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from e96f69a Revert "Increase the timeout for ValidatesContainer suite" (#14219) add ff4366d [BEAM-11023] Fix testGroupByKeyWithBadEqualsHashCode failing on Spark Structured Streaming runner add 96bf00a [BEAM-11023] Fix testLargeKeys100MB on Spark Structured Streaming runner add 3ac902a [BEAM-11023] Change access level GroupByKeyTest new 5c9c8c0 Merge pull request #14275: [BEAM-11023] Fix GroupByKeyTest testLargeKeys100MB and testGroupByKeyWithBadEqualsHashCode failing on Spark Structured Streaming runner The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: runners/spark/spark_runner.gradle | 2 ++ .../src/test/java/org/apache/beam/sdk/transforms/GroupByKeyTest.java | 4 ++-- 2 files changed, 4 insertions(+), 2 deletions(-)
[beam] 01/01: Merge pull request #14270: [BEAM-9283] Add Java 11 Jpms compatibility tests for Spark runner
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git commit aee24ad4aa78d5b1a5258a1003c08714808edcd5 Merge: 3fc2ab1 9113c6f Author: Ismaël Mejía AuthorDate: Thu Mar 18 19:13:05 2021 +0100 Merge pull request #14270: [BEAM-9283] Add Java 11 Jpms compatibility tests for Spark runner .../job_PostCommit_Java_Jpms_Spark_Java11.groovy | 49 ++ sdks/java/testing/jpms-tests/build.gradle | 8 2 files changed, 57 insertions(+)
[beam] branch master updated (3fc2ab1 -> aee24ad)
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from 3fc2ab1 [BEAM-9547] Add NDFrame to doctests, implement a few more operations (#14236) add 9113c6f [BEAM-9283] Add Java 11 Jpms compatibility tests for Spark runner new aee24ad Merge pull request #14270: [BEAM-9283] Add Java 11 Jpms compatibility tests for Spark runner The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: ...Java11.groovy => job_PostCommit_Java_Jpms_Spark_Java11.groovy} | 8 sdks/java/testing/jpms-tests/build.gradle | 8 2 files changed, 12 insertions(+), 4 deletions(-) copy .test-infra/jenkins/{job_PostCommit_Java_Jpms_Flink_Java11.groovy => job_PostCommit_Java_Jpms_Spark_Java11.groovy} (85%)
[beam] branch master updated (c391aba -> 8408e38)
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from c391aba [BEAM-11033] Updates Dataflow metrics handling to support portable job submission (#14158) add 1de8ef9 [BEAM-11913] Add support for Hadoop configuration on ParquetIO add 8408e38 Merge pull request #14171: [BEAM-11913] Add support for Hadoop configuration on ParquetIO No new revisions were added by this update. Summary of changes: .../org/apache/beam/sdk/io/parquet/ParquetIO.java | 58 -- .../apache/beam/sdk/io/parquet/ParquetIOTest.java | 33 +++- 2 files changed, 76 insertions(+), 15 deletions(-)
[beam] branch master updated (8408e38 -> 21feb59)
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from 8408e38 Merge pull request #14171: [BEAM-11913] Add support for Hadoop configuration on ParquetIO add 4ce3176 [BEAM-11941] Upgrade Flink runner to Flink version 1.12.2 add 21feb59 Merge pull request #14173: [BEAM-11941] Upgrade Flink runner to Flink version 1.12.2 No new revisions were added by this update. Summary of changes: runners/flink/1.12/build.gradle | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
[beam] branch master updated (22a8b18 -> 40eef35)
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from 22a8b18 Add an option to create Dataflow piplines from a snapshot for python sdk (#14278) add ba14ca2 [BEAM-11992] Run CrossLanguage ValidatesRunner for Spark 3 new 40eef35 Merge pull request #14269: [BEAM-11992] Run CrossLanguage ValidatesRunner for Spark 3 The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: ...ob_PostCommit_CrossLanguageValidatesRunner_Spark3.groovy} | 12 ++-- README.md| 2 +- release/src/main/scripts/mass_comment.py | 1 + 3 files changed, 8 insertions(+), 7 deletions(-) copy .test-infra/jenkins/{job_PostCommit_CrossLanguageValidatesRunner_Direct.groovy => job_PostCommit_CrossLanguageValidatesRunner_Spark3.groovy} (84%)
[beam] 01/01: Merge pull request #14269: [BEAM-11992] Run CrossLanguage ValidatesRunner for Spark 3
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git commit 40eef355c8321bf6f376c68043db4c589bad78d9 Merge: 22a8b18 ba14ca2 Author: Ismaël Mejía AuthorDate: Fri Mar 19 18:42:30 2021 +0100 Merge pull request #14269: [BEAM-11992] Run CrossLanguage ValidatesRunner for Spark 3 ...mmit_CrossLanguageValidatesRunner_Spark3.groovy | 49 ++ README.md | 2 +- release/src/main/scripts/mass_comment.py | 1 + 3 files changed, 51 insertions(+), 1 deletion(-)
[beam] branch master updated (8ac1eb5 -> 5bfdc3b)
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from 8ac1eb5 Merge pull request #14282: [BEAM-7093] Update some out-of-date Gradle Spark instructions. add b661b87 [BEAM-7078] Bump com.amazonaws:amazon-kinesis-client to version 1.14.2 add 5bfdc3b Merge pull request #14283: [BEAM-7078] Bump com.amazonaws:amazon-kinesis-client to version 1.14.2 No new revisions were added by this update. Summary of changes: sdks/java/io/kinesis/build.gradle | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
[beam] branch master updated (d6b020a -> 8ac1eb5)
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from d6b020a Merge pull request #14279 from apache/tvalentyn-patch-1 add 209fd5d [BEAM-7093] Update some out-of-date Gradle Spark instructions. add 8ac1eb5 Merge pull request #14282: [BEAM-7093] Update some out-of-date Gradle Spark instructions. No new revisions were added by this update. Summary of changes: sdks/python/apache_beam/runners/portability/spark_runner.py | 2 +- .../python/apache_beam/runners/portability/spark_uber_jar_job_server.py | 2 +- website/www/site/content/en/get-started/quickstart-go.md| 2 +- 3 files changed, 3 insertions(+), 3 deletions(-)
[beam] branch master updated (9791ef9 -> 540957e)
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from 9791ef9 Merge pull request #14237 from [BEAM-11979] Ignore not serializable filter fields in python MongoDBI… add abc7b20 [BEAM-11992] Run CrossLanguage ValidatesRunner tests only for Spark 2 new 540957e Merge pull request #14253: [BEAM-11992] Run CrossLanguage ValidatesRunner tests only for Spark 2 The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: .../jenkins/job_PostCommit_CrossLanguageValidatesRunner_Spark.groovy | 1 - 1 file changed, 1 deletion(-)
[beam] 01/01: Merge pull request #14253: [BEAM-11992] Run CrossLanguage ValidatesRunner tests only for Spark 2
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git commit 540957effe63ebe093dd1da82ae7f0d8f9aa482a Merge: 9791ef9 abc7b20 Author: Ismaël Mejía AuthorDate: Thu Mar 18 07:12:57 2021 +0100 Merge pull request #14253: [BEAM-11992] Run CrossLanguage ValidatesRunner tests only for Spark 2 .../jenkins/job_PostCommit_CrossLanguageValidatesRunner_Spark.groovy | 1 - 1 file changed, 1 deletion(-)
[beam] branch master updated (83543b2 -> b710eed)
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from 83543b2 Merge pull request #14030: [BEAM-11415] Adding ValidatesRunner w/ Python ULR tests to Go Precommit add b498952 [BEAM-11213] Instantiate SparkListenerApplicationStart in a Spark 3 compatible way add b710eed Merge pull request #14132: [BEAM-11213] Instantiate SparkListenerApplicationStart in a Spark 3 compatible way No new revisions were added by this update. Summary of changes: .../beam/runners/spark/SparkPipelineRunner.java| 18 +-- .../beam/runners/spark/util/SparkCompat.java | 63 ++ 2 files changed, 65 insertions(+), 16 deletions(-)
[beam] branch master updated (7b2c4dc -> 92ddb5c)
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from 7b2c4dc [BEAM-11770] bump google-cloud-pubsub to 1.110.3, because we follwed dependencies in google_cloud_platform_libraries_bom sets version. (#13994) add 157ebc4 [BEAM-11852] Update the title of our use-case add 92ddb5c Merge pull request #14042: [BEAM-11852] Update the title of our use-case on the new Beam Website No new revisions were added by this update. Summary of changes: website/www/site/content/en/powered-by/commercial/ricardo.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
[beam] branch master updated: [BEAM-7929] Support column projection for Parquet Tables
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/master by this push: new d79cd82 [BEAM-7929] Support column projection for Parquet Tables new 2447679 Merge pull request #14117: [BEAM-7929] Support column projection for Parquet Tables d79cd82 is described below commit d79cd82943c90dad518b705b7e81bcd2d2fc0f21 Author: Ismaël Mejía AuthorDate: Mon Mar 1 10:05:32 2021 +0100 [BEAM-7929] Support column projection for Parquet Tables --- sdks/java/extensions/sql/build.gradle | 1 + .../sql/meta/provider/parquet/ParquetTable.java| 132 + .../provider/parquet/ParquetTableProvider.java | 22 ++-- .../provider/parquet/ParquetTableProviderTest.java | 35 +- .../sdk/io/parquet/ParquetSchemaIOProvider.java| 127 5 files changed, 172 insertions(+), 145 deletions(-) diff --git a/sdks/java/extensions/sql/build.gradle b/sdks/java/extensions/sql/build.gradle index 6de73f2..6758e4b 100644 --- a/sdks/java/extensions/sql/build.gradle +++ b/sdks/java/extensions/sql/build.gradle @@ -79,6 +79,7 @@ dependencies { provided project(":sdks:java:io:kafka") provided project(":sdks:java:io:google-cloud-platform") compile project(":sdks:java:io:mongodb") + compile library.java.avro provided project(":sdks:java:io:parquet") provided library.java.jackson_dataformat_xml provided library.java.hadoop_client diff --git a/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/meta/provider/parquet/ParquetTable.java b/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/meta/provider/parquet/ParquetTable.java new file mode 100644 index 000..b2282ff --- /dev/null +++ b/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/meta/provider/parquet/ParquetTable.java @@ -0,0 +1,132 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.sdk.extensions.sql.meta.provider.parquet; + +import java.io.Serializable; +import java.util.ArrayList; +import java.util.List; +import java.util.Map; +import org.apache.avro.Schema; +import org.apache.avro.Schema.Field; +import org.apache.avro.generic.GenericRecord; +import org.apache.beam.sdk.annotations.Internal; +import org.apache.beam.sdk.extensions.sql.impl.BeamTableStatistics; +import org.apache.beam.sdk.extensions.sql.meta.BeamSqlTableFilter; +import org.apache.beam.sdk.extensions.sql.meta.ProjectSupport; +import org.apache.beam.sdk.extensions.sql.meta.SchemaBaseBeamTable; +import org.apache.beam.sdk.extensions.sql.meta.Table; +import org.apache.beam.sdk.io.FileIO; +import org.apache.beam.sdk.io.parquet.ParquetIO; +import org.apache.beam.sdk.io.parquet.ParquetIO.Read; +import org.apache.beam.sdk.options.PipelineOptions; +import org.apache.beam.sdk.schemas.transforms.Convert; +import org.apache.beam.sdk.schemas.utils.AvroUtils; +import org.apache.beam.sdk.values.PBegin; +import org.apache.beam.sdk.values.PCollection; +import org.apache.beam.sdk.values.PCollection.IsBounded; +import org.apache.beam.sdk.values.POutput; +import org.apache.beam.sdk.values.Row; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +@Internal +@SuppressWarnings({"nullness"}) +class ParquetTable extends SchemaBaseBeamTable implements Serializable { + private static final Logger LOG = LoggerFactory.getLogger(ParquetTable.class); + + private final Table table; + + ParquetTable(Table table) { +super(table.getSchema()); +this.table = table; + } + + @Override + public PCollection buildIOReader(PBegin begin) { +final Schema schema = AvroUtils.toAvroSchema(table.getSchema()); +Read read = ParquetIO.read(schema).withBeamSchemas(true).from(table.getLocation() + "/*"); +return begin.apply("ParquetIORead", read).apply("ToRows", Convert.toRows()); + } + + @Override + public PCollection buildIOReader( + PBegin begin, BeamSqlTableFilter filters, List fieldNames) { +fina
[beam] branch master updated (86ac487 -> 968abf4)
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from 86ac487 [BEAM-11961] InfluxDBIOIT failing with unauthorized error (#14215) add e3d2654 [BEAM-12066] Bump classgraph to 4.8.104 add 968abf4 Merge pull request #14443: [BEAM-12066] Bump classgraph to 4.8.104 No new revisions were added by this update. Summary of changes: buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
[beam] 01/01: Merge pull request #14203: [BEAM-11948] Drop support for Flink 1.8 and 1.9
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git commit 572a99bab07e53e043887243e2b1e69120563be5 Merge: cb31b7b 05b3fd3 Author: Ismaël Mejía AuthorDate: Thu Apr 8 06:25:51 2021 +0200 Merge pull request #14203: [BEAM-11948] Drop support for Flink 1.8 and 1.9 CHANGES.md | 1 + gradle.properties | 2 +- runners/flink/1.10/build.gradle| 4 +- .../beam/runners/flink/FlinkCapabilities.java | 34 .../streaming/AbstractStreamOperatorCompat.java| 0 .../beam/runners/flink/RemoteMiniClusterImpl.java | 0 .../runners/flink/SourceTransformationCompat.java | 0 runners/flink/1.11/build.gradle| 4 +- runners/flink/1.12/build.gradle| 4 +- runners/flink/1.8/build.gradle | 34 .../flink/1.8/job-server-container/build.gradle| 26 - runners/flink/1.8/job-server/build.gradle | 31 --- .../beam/runners/flink/FlinkCapabilities.java | 34 .../streaming/io/BeamStoppableFunction.java| 29 -- .../beam/runners/flink/FlinkRunnerTestCompat.java | 43 --- .../runners/flink/streaming/StreamSources.java | 50 - runners/flink/1.9/build.gradle | 33 .../flink/1.9/job-server-container/build.gradle| 26 - runners/flink/1.9/job-server/build.gradle | 31 --- .../runners/flink/streaming/StreamSources.java | 62 -- .../flink/FlinkBatchTransformTranslators.java | 10 +--- .../beam/runners/flink/FlinkPipelineRunner.java| 8 --- .../org/apache/beam/runners/flink/FlinkRunner.java | 7 --- .../translation/functions/FlinkDoFnFunction.java | 12 + .../translation/types/CoderTypeSerializer.java | 0 .../translation/types/EncodedValueSerializer.java | 0 .../streaming/io/BeamStoppableFunction.java| 0 .../flink/batch/NonMergingGroupByKeyTest.java | 5 -- .../translation/types/CoderTypeSerializerTest.java | 0 settings.gradle.kts| 8 --- .../site/content/en/documentation/runners/flink.md | 25 ++--- 31 files changed, 29 insertions(+), 494 deletions(-) diff --cc runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkPipelineRunner.java index 396000b,96b8781..b911567 --- a/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkPipelineRunner.java +++ b/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkPipelineRunner.java @@@ -43,10 -43,9 +43,9 @@@ import org.apache.beam.sdk.metrics.Metr import org.apache.beam.sdk.metrics.MetricsOptions; import org.apache.beam.sdk.options.PipelineOptions; import org.apache.beam.sdk.options.PipelineOptionsFactory; -import org.apache.beam.vendor.grpc.v1p26p0.com.google.protobuf.Struct; +import org.apache.beam.vendor.grpc.v1p36p0.com.google.protobuf.Struct; import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions; import org.apache.flink.api.common.JobExecutionResult; - import org.apache.flink.runtime.util.EnvironmentInformation; import org.checkerframework.checker.nullness.qual.Nullable; import org.kohsuke.args4j.CmdLineException; import org.kohsuke.args4j.CmdLineParser;
[beam] branch master updated (cb31b7b -> 572a99b)
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from cb31b7b [BEAM-7372] cleanup codes for py2 compatibility from apache_beam/examples/snippets/*.py and apache_beam/examples/*.py (#1) add 05b3fd3 [BEAM-11948] Drop support for Flink 1.8 and 1.9 new 572a99b Merge pull request #14203: [BEAM-11948] Drop support for Flink 1.8 and 1.9 The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: CHANGES.md | 1 + gradle.properties | 2 +- runners/flink/1.10/build.gradle| 4 +- .../beam/runners/flink/FlinkCapabilities.java | 34 .../streaming/AbstractStreamOperatorCompat.java| 0 .../beam/runners/flink/RemoteMiniClusterImpl.java | 0 .../runners/flink/SourceTransformationCompat.java | 0 runners/flink/1.11/build.gradle| 4 +- runners/flink/1.12/build.gradle| 4 +- runners/flink/1.8/build.gradle | 34 .../flink/1.8/job-server-container/build.gradle| 26 - runners/flink/1.8/job-server/build.gradle | 31 --- .../beam/runners/flink/FlinkCapabilities.java | 34 .../streaming/io/BeamStoppableFunction.java| 29 -- .../beam/runners/flink/FlinkRunnerTestCompat.java | 43 --- .../runners/flink/streaming/StreamSources.java | 50 - runners/flink/1.9/build.gradle | 33 .../flink/1.9/job-server-container/build.gradle| 26 - runners/flink/1.9/job-server/build.gradle | 31 --- .../runners/flink/streaming/StreamSources.java | 62 -- .../flink/FlinkBatchTransformTranslators.java | 10 +--- .../beam/runners/flink/FlinkPipelineRunner.java| 8 --- .../org/apache/beam/runners/flink/FlinkRunner.java | 7 --- .../translation/functions/FlinkDoFnFunction.java | 12 + .../translation/types/CoderTypeSerializer.java | 0 .../translation/types/EncodedValueSerializer.java | 0 .../streaming/io/BeamStoppableFunction.java| 0 .../flink/batch/NonMergingGroupByKeyTest.java | 5 -- .../translation/types/CoderTypeSerializerTest.java | 0 settings.gradle.kts| 8 --- .../site/content/en/documentation/runners/flink.md | 25 ++--- 31 files changed, 29 insertions(+), 494 deletions(-) delete mode 100644 runners/flink/1.10/src/main/java/org/apache/beam/runners/flink/FlinkCapabilities.java rename runners/flink/{1.8 => 1.10}/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/AbstractStreamOperatorCompat.java (100%) rename runners/flink/{1.8 => 1.10}/src/test/java/org/apache/beam/runners/flink/RemoteMiniClusterImpl.java (100%) rename runners/flink/{1.8 => 1.10}/src/test/java/org/apache/beam/runners/flink/SourceTransformationCompat.java (100%) delete mode 100644 runners/flink/1.8/build.gradle delete mode 100644 runners/flink/1.8/job-server-container/build.gradle delete mode 100644 runners/flink/1.8/job-server/build.gradle delete mode 100644 runners/flink/1.8/src/main/java/org/apache/beam/runners/flink/FlinkCapabilities.java delete mode 100644 runners/flink/1.8/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/io/BeamStoppableFunction.java delete mode 100644 runners/flink/1.8/src/test/java/org/apache/beam/runners/flink/FlinkRunnerTestCompat.java delete mode 100644 runners/flink/1.8/src/test/java/org/apache/beam/runners/flink/streaming/StreamSources.java delete mode 100644 runners/flink/1.9/build.gradle delete mode 100644 runners/flink/1.9/job-server-container/build.gradle delete mode 100644 runners/flink/1.9/job-server/build.gradle delete mode 100644 runners/flink/1.9/src/test/java/org/apache/beam/runners/flink/streaming/StreamSources.java rename runners/flink/{1.8 => }/src/main/java/org/apache/beam/runners/flink/translation/types/CoderTypeSerializer.java (100%) rename runners/flink/{1.8 => }/src/main/java/org/apache/beam/runners/flink/translation/types/EncodedValueSerializer.java (100%) rename runners/flink/{1.9 => }/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/io/BeamStoppableFunction.java (100%) rename runners/flink/{1.8 => }/src/test/java/org/apache/beam/runners/flink/translation/types/CoderTypeSerializerTest.java (100%)
[beam] branch master updated (62ada38 -> 752798e)
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from 62ada38 Merge pull request #14506: [BEAM-11903] Bump objenesis to 3.2 add f725953 [BEAM-12151] Bump Apache Parquet to 1.12.0 add 752798e Merge pull request #14509: [BEAM-12151] Bump Apache Parquet to 1.12.0 No new revisions were added by this update. Summary of changes: sdks/java/io/parquet/build.gradle | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
[beam] branch master updated (97af077 -> 8e66956)
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from 97af077 [BEAM-9547] DataFrame.corr cleanup (#14327) add f9d4805 [BEAM-12091] Make file staging uniform among runners add 8e66956 Merge pull request #14520: [BEAM-12091] Make file staging uniform among runners No new revisions were added by this update. Summary of changes: .../construction/resources/PipelineResources.java | 37 +- .../resources/PipelineResourcesTest.java | 78 +++--- .../flink/FlinkPipelineExecutionEnvironment.java | 18 + .../SparkStructuredStreamingRunner.java| 20 +- .../runners/spark/SparkCommonPipelineOptions.java | 27 +--- .../beam/runners/spark/SparkPipelineRunner.java| 8 --- .../org/apache/beam/runners/spark/SparkRunner.java | 18 + .../beam/runners/twister2/Twister2Runner.java | 33 + .../beam/sdk/util/common/ReflectHelpers.java | 2 +- 9 files changed, 114 insertions(+), 127 deletions(-)
[beam] branch master updated (a86dc06 -> b10ce99)
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from a86dc06 Bump container pandas version to 1.2.4 (#14524) add 992c378 [BEAM-2888] Added packages.confluent.io maven repo add b10ce99 Merge pull request #14545: [BEAM-2888] Added "packages.confluent.io" maven repo No new revisions were added by this update. Summary of changes: .test-infra/validate-runner/build.gradle | 3 +++ 1 file changed, 3 insertions(+)
[beam] branch master updated (c18e3cf -> e398a16)
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from c18e3cf Merge pull request #14503 from [BEAM-12143] Fix PubsubReader to populate message id correctly add e398a16 Merge pull request #14472: [BEAM-12148] Align Spark runner jackson dependency version with Beam's No new revisions were added by this update. Summary of changes: .../src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy | 2 ++ runners/spark/spark_runner.gradle | 6 +- 2 files changed, 7 insertions(+), 1 deletion(-)
[beam] branch master updated (253bf38 -> b908f59)
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from 253bf38 Merge pull request #14464 from ibzib/BEAM-12123 add ca05c57 [BEAM-2303] Support SpecificData in AvroCoder add b908f59 Merge pull request #14410: [BEAM-2303] Support SpecificData in AvroCoder No new revisions were added by this update. Summary of changes: .../java/org/apache/beam/sdk/coders/AvroCoder.java | 23 .../org/apache/beam/sdk/coders/AvroCoderTest.java | 43 -- 2 files changed, 55 insertions(+), 11 deletions(-)
[beam] 02/04: [BEAM-11712] Add options for input/output paths, make it run via SparkRunner
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git commit a407d79680d01c35760f3fe4e76cd4192e34edd1 Author: Alexey Romanenko AuthorDate: Tue Mar 30 18:04:22 2021 +0200 [BEAM-11712] Add options for input/output paths, make it run via SparkRunner --- sdks/java/testing/tpcds/README.md | 68 ++ sdks/java/testing/tpcds/build.gradle | 11 +++- .../apache/beam/sdk/tpcds/BeamSqlEnvRunner.java| 14 ++--- .../java/org/apache/beam/sdk/tpcds/BeamTpcds.java | 18 +- .../org/apache/beam/sdk/tpcds/QueryReader.java | 49 +--- .../apache/beam/sdk/tpcds/SqlTransformRunner.java | 9 +-- .../beam/sdk/tpcds/TableSchemaJSONLoader.java | 43 +- .../org/apache/beam/sdk/tpcds/TpcdsOptions.java| 14 + .../beam/sdk/tpcds/TableSchemaJSONLoaderTest.java | 4 +- 9 files changed, 123 insertions(+), 107 deletions(-) diff --git a/sdks/java/testing/tpcds/README.md b/sdks/java/testing/tpcds/README.md new file mode 100644 index 000..89f8073 --- /dev/null +++ b/sdks/java/testing/tpcds/README.md @@ -0,0 +1,68 @@ + + +# TPC-DS Benchmark + +## Google Dataflow Runner + +To execute TPC-DS benchmark for 1Gb dataset on Google Dataflow, run the following example command from the command line: + +```bash +./gradlew :sdks:java:testing:tpcds:run -Ptpcds.args="--dataSize=1G \ + --runner=DataflowRunner \ + --queries=3,26,55 \ + --tpcParallel=2 \ + --dataDirectory=/path/to/tpcds_data/ \ + --project=apache-beam-testing \ + --stagingLocation=gs://beamsql_tpcds_1/staging \ + --tempLocation=gs://beamsql_tpcds_2/temp \ + --dataDirectory=/path/to/tpcds_data/ \ + --region=us-west1 \ + --maxNumWorkers=10" +``` + +To run a query using ZetaSQL planner (currently Query96 can be run using ZetaSQL), set the plannerName as below. If not specified, the default planner is Calcite. + +```bash +./gradlew :sdks:java:testing:tpcds:run -Ptpcds.args="--dataSize=1G \ + --runner=DataflowRunner \ + --queries=96 \ + --tpcParallel=2 \ + --dataDirectory=/path/to/tpcds_data/ \ + --plannerName=org.apache.beam.sdk.extensions.sql.zetasql.ZetaSQLQueryPlanner \ + --project=apache-beam-testing \ + --stagingLocation=gs://beamsql_tpcds_1/staging \ + --tempLocation=gs://beamsql_tpcds_2/temp \ + --region=us-west1 \ + --maxNumWorkers=10" +``` + +## Spark Runner + +To execute TPC-DS benchmark with Query3 for 1Gb dataset on Apache Spark 2.x, run the following example command from the command line: + +```bash +./gradlew :sdks:java:testing:tpcds:run -Ptpcds.runner=":runners:spark:2" -Ptpcds.args=" \ + --runner=SparkRunner \ + --queries=3 \ + --tpcParallel=1 \ + --dataDirectory=/path/to/tpcds_data/ \ + --dataSize=1G \ + --resultsDirectory=/path/to/tpcds_results/" +``` diff --git a/sdks/java/testing/tpcds/build.gradle b/sdks/java/testing/tpcds/build.gradle index 6237776..79fb1e8 100644 --- a/sdks/java/testing/tpcds/build.gradle +++ b/sdks/java/testing/tpcds/build.gradle @@ -33,7 +33,7 @@ def tpcdsArgsProperty = "tpcds.args" def tpcdsRunnerProperty = "tpcds.runner" def tpcdsRunnerDependency = project.findProperty(tpcdsRunnerProperty) ?: ":runners:direct-java" -def shouldProvideSpark = ":runners:spark".equals(tpcdsRunnerDependency) +def shouldProvideSpark = ":runners:spark:2".equals(tpcdsRunnerDependency) def isDataflowRunner = ":runners:google-cloud-dataflow-java".equals(tpcdsRunnerDependency) def runnerConfiguration = ":runners:direct-java".equals(tpcdsRunnerDependency) ? "shadow" : null @@ -88,6 +88,15 @@ if (shouldProvideSpark) { } } +// Execute the TPC-DS queries or suites via Gradle. +// +// Parameters: +// -Ptpcds.runner +// Specify a runner subproject, such as ":runners:spark:2" or ":runners:flink:1.10" +// Defaults to ":runners:direct-java" +// +// -Ptpcds.args +// Specify the command line for invoking org.apache.beam.sdk.tpcds.BeamTpcds task run(type: JavaExec) { def tpcdsArgsStr = project.findProperty(tpcdsArgsProperty) ?: "" def tpcdsArgsList = new ArrayList() diff --git a/sdks/java/testing/tpcds/src/main/java/org/apache/beam/sdk/tpcds/BeamSqlEnvRunner.java b/sdks/java/testing/tpcds/src/main/java/org/apache/beam/sdk/tpcds/BeamSqlEnvRunner.java index 43b97d2..69e676f 100644 --- a/sdks/java/testing/tpcds/src/main/java/org/apache/beam/sdk/tpcds/BeamSqlEnvRunner.java +++ b/sdks/java/testing/tpcds/src/main/java/org/apache/beam/sdk/tpcds/BeamSqlEnvRunner.java @@ -17,6 +17,8 @@ */ package org.apache.beam.sdk.tpcds; +import static org.apache.beam.sdk.util.Preconditions.checkArgumentNotNull; + import com.alibaba.fastjson.JSONObject; import java.util.ArrayList; import java.util.Arrays; @@ -6
[beam] 04/04: Merge pull request #14373: [BEAM-11712] Run TPC-DS via BeamSQL and Spark runner
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git commit b3ef2035abf9ca2dd94a11a1a6aa4440df28adb9 Merge: f805f1c 8fe0c5c Author: Ismaël Mejía AuthorDate: Tue Apr 13 14:22:12 2021 +0200 Merge pull request #14373: [BEAM-11712] Run TPC-DS via BeamSQL and Spark runner sdks/java/testing/tpcds/README.md | 68 + sdks/java/testing/tpcds/build.gradle | 108 +- .../apache/beam/sdk/tpcds/BeamSqlEnvRunner.java| 327 +++-- .../java/org/apache/beam/sdk/tpcds/BeamTpcds.java | 50 +- .../java/org/apache/beam/sdk/tpcds/CsvToRow.java | 47 +- .../org/apache/beam/sdk/tpcds/QueryReader.java | 51 +- .../java/org/apache/beam/sdk/tpcds/RowToCsv.java | 38 +- .../apache/beam/sdk/tpcds/SqlTransformRunner.java | 314 +++-- .../apache/beam/sdk/tpcds/SummaryGenerator.java| 219 ++-- .../beam/sdk/tpcds/TableSchemaJSONLoader.java | 162 +-- .../org/apache/beam/sdk/tpcds/TpcdsOptions.java| 40 +- .../beam/sdk/tpcds/TpcdsOptionsRegistrar.java | 10 +- .../beam/sdk/tpcds/TpcdsParametersReader.java | 136 +- .../java/org/apache/beam/sdk/tpcds/TpcdsRun.java | 54 +- .../org/apache/beam/sdk/tpcds/TpcdsRunResult.java | 120 +- .../org/apache/beam/sdk/tpcds/TpcdsSchemas.java| 1336 ++-- ...pcdsOptionsRegistrar.java => package-info.java} | 16 +- .../org/apache/beam/sdk/tpcds/QueryReaderTest.java | 361 +++--- .../beam/sdk/tpcds/TableSchemaJSONLoaderTest.java | 261 ++-- .../beam/sdk/tpcds/TpcdsParametersReaderTest.java | 110 +- .../apache/beam/sdk/tpcds/TpcdsSchemasTest.java| 183 ++- 21 files changed, 2158 insertions(+), 1853 deletions(-)
[beam] 03/04: [BEAM-11712] Fix static analysis warnings and typos on TPC-DS module
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git commit 8fe0c5cee1bab62680ebd92a51aed8f3da80e190 Author: Ismaël Mejía AuthorDate: Tue Apr 13 14:20:54 2021 +0200 [BEAM-11712] Fix static analysis warnings and typos on TPC-DS module --- .../apache/beam/sdk/tpcds/BeamSqlEnvRunner.java| 3 +- .../java/org/apache/beam/sdk/tpcds/CsvToRow.java | 4 +- .../org/apache/beam/sdk/tpcds/QueryReader.java | 3 +- .../java/org/apache/beam/sdk/tpcds/RowToCsv.java | 2 +- .../apache/beam/sdk/tpcds/SqlTransformRunner.java | 2 +- .../beam/sdk/tpcds/TableSchemaJSONLoader.java | 6 +- .../beam/sdk/tpcds/TpcdsParametersReader.java | 4 +- .../org/apache/beam/sdk/tpcds/TpcdsRunResult.java | 11 +-- .../org/apache/beam/sdk/tpcds/TpcdsSchemas.java| 102 ++--- .../beam/sdk/tpcds/TableSchemaJSONLoaderTest.java | 3 +- .../apache/beam/sdk/tpcds/TpcdsSchemasTest.java| 6 +- 11 files changed, 68 insertions(+), 78 deletions(-) diff --git a/sdks/java/testing/tpcds/src/main/java/org/apache/beam/sdk/tpcds/BeamSqlEnvRunner.java b/sdks/java/testing/tpcds/src/main/java/org/apache/beam/sdk/tpcds/BeamSqlEnvRunner.java index 69e676f..304fdd2 100644 --- a/sdks/java/testing/tpcds/src/main/java/org/apache/beam/sdk/tpcds/BeamSqlEnvRunner.java +++ b/sdks/java/testing/tpcds/src/main/java/org/apache/beam/sdk/tpcds/BeamSqlEnvRunner.java @@ -211,8 +211,7 @@ public class BeamSqlEnvRunner { // Transform the result from PCollection into PCollection, and write it to the // location where results are stored. PCollection rowStrings = -rows.apply( -MapElements.into(TypeDescriptors.strings()).via((Row row) -> row.toString())); + rows.apply(MapElements.into(TypeDescriptors.strings()).via(Row::toString)); rowStrings.apply( TextIO.write() .to( diff --git a/sdks/java/testing/tpcds/src/main/java/org/apache/beam/sdk/tpcds/CsvToRow.java b/sdks/java/testing/tpcds/src/main/java/org/apache/beam/sdk/tpcds/CsvToRow.java index d66b128..d6c8ed8 100644 --- a/sdks/java/testing/tpcds/src/main/java/org/apache/beam/sdk/tpcds/CsvToRow.java +++ b/sdks/java/testing/tpcds/src/main/java/org/apache/beam/sdk/tpcds/CsvToRow.java @@ -33,8 +33,8 @@ import org.apache.commons.csv.CSVFormat; */ public class CsvToRow extends PTransform, PCollection> implements Serializable { - private Schema schema; - private CSVFormat csvFormat; + private final Schema schema; + private final CSVFormat csvFormat; public CSVFormat getCsvFormat() { return csvFormat; diff --git a/sdks/java/testing/tpcds/src/main/java/org/apache/beam/sdk/tpcds/QueryReader.java b/sdks/java/testing/tpcds/src/main/java/org/apache/beam/sdk/tpcds/QueryReader.java index 7b00a37..c6f3253 100644 --- a/sdks/java/testing/tpcds/src/main/java/org/apache/beam/sdk/tpcds/QueryReader.java +++ b/sdks/java/testing/tpcds/src/main/java/org/apache/beam/sdk/tpcds/QueryReader.java @@ -35,7 +35,6 @@ public class QueryReader { */ public static String readQuery(String queryFileName) throws Exception { String path = "queries/" + queryFileName + ".sql"; -String query = Resources.toString(Resources.getResource(path), Charsets.UTF_8); -return query; +return Resources.toString(Resources.getResource(path), Charsets.UTF_8); } } diff --git a/sdks/java/testing/tpcds/src/main/java/org/apache/beam/sdk/tpcds/RowToCsv.java b/sdks/java/testing/tpcds/src/main/java/org/apache/beam/sdk/tpcds/RowToCsv.java index 40a8cc5..a087948 100644 --- a/sdks/java/testing/tpcds/src/main/java/org/apache/beam/sdk/tpcds/RowToCsv.java +++ b/sdks/java/testing/tpcds/src/main/java/org/apache/beam/sdk/tpcds/RowToCsv.java @@ -33,7 +33,7 @@ import org.apache.commons.csv.CSVFormat; */ public class RowToCsv extends PTransform, PCollection> implements Serializable { - private CSVFormat csvFormat; + private final CSVFormat csvFormat; public RowToCsv(CSVFormat csvFormat) { this.csvFormat = csvFormat; diff --git a/sdks/java/testing/tpcds/src/main/java/org/apache/beam/sdk/tpcds/SqlTransformRunner.java b/sdks/java/testing/tpcds/src/main/java/org/apache/beam/sdk/tpcds/SqlTransformRunner.java index 4f56c1a..bea0261 100644 --- a/sdks/java/testing/tpcds/src/main/java/org/apache/beam/sdk/tpcds/SqlTransformRunner.java +++ b/sdks/java/testing/tpcds/src/main/java/org/apache/beam/sdk/tpcds/SqlTransformRunner.java @@ -189,7 +189,7 @@ public class SqlTransformRunner { try { tables .apply(SqlTransform.query(queryString)) -.apply(MapElements.into(TypeDescriptors.strings()).via((Row row) -> row.toString())) + .apply(MapElements.into(TypeDescriptors.strings()).via(Row::toString)) .apply( TextIO.write()
[beam] branch master updated (f805f1c -> b3ef203)
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from f805f1c Merge pull request #14499 from [BEAM-11408, BEAM-11772] Add explicit output typehints to ensure coder determinism for BQ with auto-sharding new 28eec3f [BEAM-11712] Make up-to-date build file and codestyle new a407d79 [BEAM-11712] Add options for input/output paths, make it run via SparkRunner new 8fe0c5c [BEAM-11712] Fix static analysis warnings and typos on TPC-DS module new b3ef203 Merge pull request #14373: [BEAM-11712] Run TPC-DS via BeamSQL and Spark runner The 4 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: sdks/java/testing/tpcds/README.md | 68 + sdks/java/testing/tpcds/build.gradle | 108 +- .../apache/beam/sdk/tpcds/BeamSqlEnvRunner.java| 327 +++-- .../java/org/apache/beam/sdk/tpcds/BeamTpcds.java | 50 +- .../java/org/apache/beam/sdk/tpcds/CsvToRow.java | 47 +- .../org/apache/beam/sdk/tpcds/QueryReader.java | 51 +- .../java/org/apache/beam/sdk/tpcds/RowToCsv.java | 38 +- .../apache/beam/sdk/tpcds/SqlTransformRunner.java | 314 +++-- .../apache/beam/sdk/tpcds/SummaryGenerator.java| 219 ++-- .../beam/sdk/tpcds/TableSchemaJSONLoader.java | 162 +-- .../org/apache/beam/sdk/tpcds/TpcdsOptions.java| 40 +- .../beam/sdk/tpcds/TpcdsOptionsRegistrar.java | 10 +- .../beam/sdk/tpcds/TpcdsParametersReader.java | 136 +- .../java/org/apache/beam/sdk/tpcds/TpcdsRun.java | 54 +- .../org/apache/beam/sdk/tpcds/TpcdsRunResult.java | 120 +- .../org/apache/beam/sdk/tpcds/TpcdsSchemas.java| 1336 ++-- .../org/apache/beam/sdk/tpcds}/package-info.java |4 +- .../org/apache/beam/sdk/tpcds/QueryReaderTest.java | 361 +++--- .../beam/sdk/tpcds/TableSchemaJSONLoaderTest.java | 261 ++-- .../beam/sdk/tpcds/TpcdsParametersReaderTest.java | 110 +- .../apache/beam/sdk/tpcds/TpcdsSchemasTest.java| 183 ++- 21 files changed, 2159 insertions(+), 1840 deletions(-) create mode 100644 sdks/java/testing/tpcds/README.md copy sdks/java/testing/{load-tests/src/main/java/org/apache/beam/sdk/loadtests => tpcds/src/main/java/org/apache/beam/sdk/tpcds}/package-info.java (92%)
[beam] branch master updated (0416017 -> 62ada38)
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from 0416017 Merge pull request #14450: [BEAM-12040, BEAM-11934] Remove the option withRunnerDeterminedShardingUnboundedInternal; add a check for merging windows add 0b35b27 [BEAM-11903] Bump objenesis to 3.2 add 62ada38 Merge pull request #14506: [BEAM-11903] Bump objenesis to 3.2 No new revisions were added by this update. Summary of changes: sdks/java/extensions/kryo/build.gradle | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
[beam] branch master updated (667ec40 -> 985e2f0)
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from 667ec40 Merge pull request #14504: Refactor PR template to separate test types and label test variants add 3ea492d [BEAM-12172] Bump gradle to 6.8.3 add 985e2f0 Merge pull request #14543: [BEAM-12172] Bump gradle to 6.8.3 No new revisions were added by this update. Summary of changes: gradle/wrapper/gradle-wrapper.properties | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
[beam] branch master updated (9601bde -> e6767c1)
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from 9601bde [BEAM-11227] Try reverting #14295: Moving from vendored gRPC 1.26 to 1.36 (#14466) add 2cca8f1 [BEAM-12092] Bump jedis to 3.5.2 add e6767c1 Merge pull request #14471: [BEAM-12092] Bump jedis to 3.5.2 No new revisions were added by this update. Summary of changes: sdks/java/io/redis/build.gradle | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
[beam] 01/01: Merge pull request #13981: [BEAM-11124] bump joda-time to 2.10.10
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git commit 3930d12d47a613458988a7421d789ca1abffa853 Merge: bfc858a 9631007 Author: Ismaël Mejía AuthorDate: Mon Feb 15 21:17:21 2021 +0100 Merge pull request #13981: [BEAM-11124] bump joda-time to 2.10.10 buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
[beam] branch master updated (bfc858a -> 3930d12)
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from bfc858a [BEAM-11806] Explicit Partition Support for KafkaIO.WriteRecords (#13975) add 9631007 [BEAM-11124] bump joda-time to 2.10.10 new 3930d12 Merge pull request #13981: [BEAM-11124] bump joda-time to 2.10.10 The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
[beam] branch master updated (b6a59b3 -> c76e835)
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from b6a59b3 Merge #13558. [BEAM-11494][BEAM-11821] FileIO stops overwriting files on retries add 3ede658 [BEAM-11801] Don't set useCachedDataPool if an emulator host is set new c76e835 Merge pull request #13964: [BEAM-11801] Don't set useCachedDataPool if an emulator host is set The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: .../java/org/apache/beam/sdk/io/gcp/bigtable/BigtableConfig.java | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-)
[beam] 01/01: Merge pull request #13964: [BEAM-11801] Don't set useCachedDataPool if an emulator host is set
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git commit c76e835f69b7b2328e44a1e46e3f299748959017 Merge: b6a59b3 3ede658 Author: Ismaël Mejía AuthorDate: Thu Feb 18 14:46:23 2021 +0100 Merge pull request #13964: [BEAM-11801] Don't set useCachedDataPool if an emulator host is set .../java/org/apache/beam/sdk/io/gcp/bigtable/BigtableConfig.java | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-)
[beam] branch master updated (b8dd9cb -> 88918f1)
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from b8dd9cb Update go version to 1.12.7 (#13996) add 4178d26 [BEAM-11125] bump checkerframework to 3.10.0 new 88918f1 Merge pull request #13978: [BEAM-11125] bump checkerframework to 3.10.0 The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: buildSrc/build.gradle.kts | 2 +- .../src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy| 2 +- sdks/java/expansion-service/build.gradle | 2 +- sdks/java/extensions/schemaio-expansion-service/build.gradle | 4 ++-- sdks/java/extensions/sql/build.gradle | 2 +- 5 files changed, 6 insertions(+), 6 deletions(-)
[beam] 01/01: Merge pull request #13978: [BEAM-11125] bump checkerframework to 3.10.0
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git commit 88918f10e5740baa8f2c2612dd43ace944325003 Merge: b8dd9cb 4178d26 Author: Ismaël Mejía AuthorDate: Thu Feb 18 07:35:18 2021 +0100 Merge pull request #13978: [BEAM-11125] bump checkerframework to 3.10.0 buildSrc/build.gradle.kts | 2 +- .../src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy| 2 +- sdks/java/expansion-service/build.gradle | 2 +- sdks/java/extensions/schemaio-expansion-service/build.gradle | 4 ++-- sdks/java/extensions/sql/build.gradle | 2 +- 5 files changed, 6 insertions(+), 6 deletions(-)
[beam] branch master updated (edae900 -> 6e4adca)
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from edae900 Merge pull request #14019 from [BEAM-11791] Fixing FnApiRunner Microbenchmarks to export to influxDB add 1c4b70e [BEAM-11120] bump Gradle License Report plugin to 1.16 add 6e4adca Merge pull request #14011: [BEAM-11120] bump Gradle License Report plugin to 1.16 No new revisions were added by this update. Summary of changes: sdks/java/container/build.gradle | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
[beam] branch master updated (a7503f6 -> 4497aaf)
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from a7503f6 Merge pull request #13939: OnMergeContextImpl.deleteTimer should delete timers not set them. add 2bdeeb3 [BEAM-9112] Bump jboss-module to 1.11.0.Final add 4497aaf Merge pull request #14032: [BEAM-9112] Bump jboss-module to 1.11.0.Final No new revisions were added by this update. Summary of changes: .../src/main/groovy/org/apache/beam/gradle/GrpcVendoring_1_26_0.groovy | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
[beam] branch master updated (f15294a -> 3246690)
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from f15294a [BEAM-8691] Upgrading bigtable-client-core to 1.19.1 add 62e0f38 [BEAM-12210] Use formatting string for checkArgument to avoid excess String appends add fc45ad4 Fix spotless add 3246690 Merge pull request #14620: [BEAM-12210] Use formatting string for checkArgument to avoid excess … No new revisions were added by this update. Summary of changes: .../src/main/java/org/apache/beam/sdk/schemas/utils/AvroUtils.java | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
[beam] branch master updated (cbbebcd -> 50d8a85)
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from cbbebcd Merge pull request #15335 from [BEAM-12751] Set clientRequestId for Dataflow python job creation add 41d515d [BEAM-12270] TPC-DS: Add schema projection for Parquet source add 50d8a85 Merge pull request #15361: [BEAM-12270] TPC-DS: Add schema projection for Parquet source No new revisions were added by this update. Summary of changes: .../org/apache/beam/sdk/tpcds/QueryReader.java | 20 ++ .../apache/beam/sdk/tpcds/SqlTransformRunner.java | 72 +++--- .../org/apache/beam/sdk/tpcds/QueryReaderTest.java | 28 + 3 files changed, 111 insertions(+), 9 deletions(-)
[beam] branch master updated (98747fd -> 3537f7e)
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from 98747fd Bump Python FnAPI beam-master container #15283 add e5c25a0 [BEAM-12628] Add Avro reflect-based Coder option add 3537f7e Merge pull request #15292: [BEAM-12628] Add Avro reflect-based Coder option No new revisions were added by this update. Summary of changes: .../java/org/apache/beam/sdk/coders/AvroCoder.java | 32 +++--- .../org/apache/beam/sdk/coders/AvroCoderTest.java | 21 ++ 2 files changed, 49 insertions(+), 4 deletions(-)
[beam] branch master updated (9e0aa6b -> eea81f4)
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from 9e0aa6b Merge pull request #17056 from [BEAM-14076] [SnowflakeIO] Add support for GEOGRAPHY column add 1aba87d [BEAM-13981] Remove Spark Runner specific code for event logging add eea81f4 Merge pull request #17073: [BEAM-13981] Remove Spark Runner specific code for event logging No new revisions were added by this update. Summary of changes: .../beam/runners/spark/SparkPipelineRunner.java| 17 - .../org/apache/beam/runners/spark/SparkRunner.java | 22 +- .../runners/spark/metrics/SparkBeamMetric.java | 10 --- .../beam/runners/spark/util/SparkCommon.java | 79 -- .../beam/runners/spark/util/SparkCompat.java | 61 - 5 files changed, 1 insertion(+), 188 deletions(-) delete mode 100644 runners/spark/src/main/java/org/apache/beam/runners/spark/util/SparkCommon.java
[beam] branch master updated (b2f2128 -> 6e98dd4)
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from b2f2128 Merge pull request #16865: Create test category for UsesProcessingTimeTimers are exclude from Samza add add7bbc [BEAM-13202] Fix typos on tests names for VarianceFnTest add 1105c34 [BEAM-13202] Add Coder to CountIfFn.Accum add cf75357 [BEAM-13202] Reuse Count transform code since CountIf is a specific case add 6e98dd4 Merge pull request #16856: [BEAM-13202] Add Coder to CountIfFn.Accum No new revisions were added by this update. Summary of changes: .../extensions/sql/impl/transform/agg/CountIf.java | 53 +++ .../sql/impl/transform/agg/CountIfTest.java| 78 ++ .../sql/impl/transform/agg/VarianceFnTest.java | 4 +- 3 files changed, 104 insertions(+), 31 deletions(-) create mode 100644 sdks/java/extensions/sql/src/test/java/org/apache/beam/sdk/extensions/sql/impl/transform/agg/CountIfTest.java
[beam] branch master updated (3cd1f7f -> 75c25f0)
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from 3cd1f7f [BEAM-13960] Add support for more types when converting from between row and proto (#16875) add fa32292 Bump org.mongodb:mongo-java-driver to 3.12.10 add 75c25f0 Merge pull request #16989: [BEAM-5577] Bump org.mongodb:mongo-java-driver to 3.12.10 No new revisions were added by this update. Summary of changes: buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
[beam] branch master updated (7fa5387 -> c73066c)
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from 7fa5387 Regenerate python container base_image_requirements.txt (#16832) add ed693e5 [BEAM-9195] Bump org.testcontainers to 1.16.3 add c73066c Merge pull request #16661: [BEAM-9195] Bump org.testcontainers to 1.16.3 No new revisions were added by this update. Summary of changes: .../src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy | 2 +- sdks/java/io/debezium/build.gradle | 6 +++--- sdks/java/io/elasticsearch-tests/elasticsearch-tests-5/build.gradle | 2 +- sdks/java/io/elasticsearch-tests/elasticsearch-tests-6/build.gradle | 2 +- sdks/java/io/elasticsearch-tests/elasticsearch-tests-7/build.gradle | 2 +- .../io/elasticsearch-tests/elasticsearch-tests-common/build.gradle | 2 +- 6 files changed, 8 insertions(+), 8 deletions(-)
[beam] branch master updated (d352d60 -> 0a68801)
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from d352d60 [BEAM-14071] Enabling Flink on Dataproc for Interactive Beam (#17044) add 7257e37 [BEAM-4106] Remove filesToStage from Flink pipeline option list. add 0a68801 Merge pull request #17143: [BEAM-4106] Remove filesToStage from Flink pipeline option list. No new revisions were added by this update. Summary of changes: website/www/site/layouts/shortcodes/flink_java_pipeline_options.html | 5 - .../www/site/layouts/shortcodes/flink_python_pipeline_options.html | 5 - 2 files changed, 10 deletions(-)
[beam] branch master updated (5beae2a -> 9434c4d)
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git. from 5beae2a [BEAM-13605] Add support for pandas 1.4.0 (#16590) add 99b903f Change links to Books from Amazon to Publisher add 9434c4d Merge pull request #16718: [website] Change links to Books from Amazon to Publisher No new revisions were added by this update. Summary of changes: .../www/site/content/en/documentation/resources/learning-resources.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
[beam] 01/01: Merge pull request #23214: Use avro DataFileReader to read avro container files
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git commit 483a0c95734c528aa45419596e9f27e9e650c5d7 Merge: 762edd7f3a6 a6cda1370b3 Author: Ismaël Mejía AuthorDate: Thu Sep 22 18:58:10 2022 +0200 Merge pull request #23214: Use avro DataFileReader to read avro container files .../java/org/apache/beam/sdk/io/AvroSource.java| 343 ++--- .../org/apache/beam/sdk/io/AvroSourceTest.java | 166 -- 2 files changed, 86 insertions(+), 423 deletions(-)
[beam] branch master updated (762edd7f3a6 -> 483a0c95734)
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/beam.git from 762edd7f3a6 Improved pipeline translation in SparkStructuredStreamingRunner (#22446) add a6cda1370b3 use avro DataFileReader to read avro container files new 483a0c95734 Merge pull request #23214: Use avro DataFileReader to read avro container files The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: .../java/org/apache/beam/sdk/io/AvroSource.java| 343 ++--- .../org/apache/beam/sdk/io/AvroSourceTest.java | 166 -- 2 files changed, 86 insertions(+), 423 deletions(-)
[beam] branch master updated: Fix #22466 Add github actions dependency updates with dependabot
This is an automated email from the ASF dual-hosted git repository. iemejia pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/beam.git The following commit(s) were added to refs/heads/master by this push: new 3031a3d2aca Fix #22466 Add github actions dependency updates with dependabot 3031a3d2aca is described below commit 3031a3d2aca8e81b219364ca43cbf811abd68445 Author: Ismaël Mejía AuthorDate: Wed Aug 10 22:02:55 2022 +0200 Fix #22466 Add github actions dependency updates with dependabot --- .github/dependabot.yml | 8 1 file changed, 8 insertions(+) diff --git a/.github/dependabot.yml b/.github/dependabot.yml index 334414df9db..248e8d6a69b 100644 --- a/.github/dependabot.yml +++ b/.github/dependabot.yml @@ -42,3 +42,11 @@ updates: - dependency-name: "com.google.api.grpc:grpc-*" - dependency-name: "com.google.http-client:*" - dependency-name: "com.google.apis:google-api-services-*" + - package-ecosystem: "github-actions" +directory: "/" +schedule: + interval: "daily" +allow: + # Allow only automatic updates for official github actions + # Other github-actions require approval from INFRA + - dependency-name: "actions/*"