This is an automated email from the ASF dual-hosted git repository.

yihua pushed a change to branch release-1.2.0
in repository https://gitbox.apache.org/repos/asf/hudi.git


    from 5dbf6e4fda0d Add Apache Licence to hudi-notebooks/requirements.txt
     new d2ba868c5147 fix(schema): Handle BLOB and VARIANT in Hive-reader rewriteRecordWithNewSchema (#18580)
     new 002dd4cbea74 chore(deps): Pin AWS v1 SDK BOM to short-circuit transitive version-range walk (#18619)
     new b98ecf63d15c feat(examples): Add Hudi Unstructed Demo env (#18643)
     new bf5f4e3eb8b5 fix: filter spark.sql.* properties in SparkCatalogMetaStoreClient.toCatalogTable (#18654)
     new b2d6222bc577 fix: make SparkCatalogMetaStoreClient.setMetaConf a no-op (#18652)
     new bb001f243b95 fix: Honor SparkSession overrides for rebase mode and timezone in compaction tasks (#18675)
     new 6388ba6327a0 fix(ci): bump Maven heap to 8g to fix OOM in CI builds (#18618)
     new cdac700954c4 fix(lance): prevent file splitting for Lance base files to avoid duplicate reads (#18678)
     new 37ec7a1128c4 fix: filter EXTERNAL property in SparkCatalogMetaStoreClient.toCatalogTable (#18672)
     new dd2773a11212 fix(flink): Avoid emitting deletes for Flink source v2 batch reads (#18694)
     new 54601e27cdd9 fix(hive): Tolerate pruned ArrayWritable in nested BLOB projection (#18581)
     new 371e02d2abb7 fix: Fix reflection ctor signature for AwsGlueCatalogSyncTool in HiveSyncContext (#18697)
     new 715c6a4cdcd3 test(schema): Add MOR log-only compaction tests for custom types (#18583)
     new 55566a88515f fix(flink): Support read non-VECTOR columns from table containing VEC… (#18712)
     new f8431eb6cbb0 fix(metadata): Exclude Variant/Blob/Vector from V1 column stats (#18695)
     new 99c21cbc2190 fix: remove the pk check for Flink append only table (#18738)
     new 2929f7dea38d fix(variant): align Spark 4.1 MOR merge with PushVariantIntoScan and restore Spark 4.0 reads (#18674)
     new 657405a94480 feat(blob): RFC-100: Clarify inline vs out-of-line blob read behavior (#18728)
     new fcca5584da0a fix(flink): shade codehale in flink bundle (#18730)
     new 86c29d74667a fix: clear Hive work map after combine split failures (#18719)
     new 797b6dcaff16 fix: Fix Scanner resource leak in HiveIncrementalPuller (#18441)
     new a8bef0d28937 test(spark): Add tests for batch-mode blob reads (#18736)
     new e06932b9430e fix: Use the latest docker image in hudi-integ-test for Macbook (#18746)
     new e9120c9378a3 chore: Update scripts to deploy staging jars as part of the release process (#18748)
     new 0b90200e4312 chore: Enhance validate_staged_bundles.sh to validate bundles before closing the staging repo (#18749)
     new 39ac0a147582 chore: Add script to copy release artifacts to the same staging repo (#18747)
     new 807b44c4d793 fix(docker): tag base image per Java version to avoid latest collision (#18663)
     new 14c54b9c862f fix: Support data pruning using nested partition columns (#18126)
     new 1558c870a908 fix: Follow-ups to JsonKinesisSource: numeric sequence comparison and call-site fixes (#18689)
     new 059fa0bc1f47 fix: Enable schema merging for incremental and dfs sources (#18385)
     new fea734c6bac5 chore: Fix license for copy_staging_repo.sh (#18753)
     new 42ae8584a36c fix(flink): fix the write handle close for append write (#18756)
     new 9d15be48fa58 fix(lance): Support Lance file format on Spark 4.1 (#18760)
     new 441561befd0c chore: Harden workflow against command injection in PR title validation (#18771)
     new 0be0fbae8785 test(trino): de-flake TestHudi*FileOperations by polling for span stability (#18766)
     new 37df6e10f41e fix(lance): fail fast when write schema contains VARIANT columns (#18775)
     new 3dddb15ac610 test(azure): skip ITAzureStorageLockClientAzurite when MCR image pull fails (#18772)
     new 11a243752970 fix(aws): implement writer-version update in Glue sync client (#18707)
     new a17949b2ad2c fix: Enhance hudi-azure-bundle (#18472)
     new c07937a5e666 fix: Disable column stats and partition stats indices for Lance base files (#18588)
     new 685f4acbfbad fix: Skip pre-compaction rollback metadata reads in getValidInstantTimestamps (#18544)
     new 17d45ca763d3 fix: Fix SQL syntax parser for CREATE TABLE on Spark 4.1 (#18779)
     new 4a0a7f2797d6 fix(spark): handle Avro 1.12 logical type values in Spark 4.1 read path (#18773)
     new b1799290107e fix(spark): use HoodieStorageUtils factory in Spark 4.1 legacy parquet read (#18785)
     new 658b2fb12c58 perf: Improve global index performance for commit time ordering (#17797)

The 45 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .github/workflows/bot.yml                          |   4 +-
 .github/workflows/pr_title_validation.yml          |   6 +-
 azure-pipelines-20230430.yml                       |   3 +-
 docker/README.md                                   |  34 +-
 docker/build_docker_images.sh                      |   6 +-
 docker/hoodie/hadoop/datanode/Dockerfile           |   5 +-
 docker/hoodie/hadoop/historyserver/Dockerfile      |   5 +-
 docker/hoodie/hadoop/hive_base/Dockerfile          |   5 +-
 docker/hoodie/hadoop/namenode/Dockerfile           |   5 +-
 docker/hoodie/hadoop/prestobase/Dockerfile         |   3 +-
 docker/hoodie/hadoop/trinobase/Dockerfile          |   3 +-
 .../hudi/aws/sync/AWSGlueCatalogSyncClient.java    |  16 +
 .../hudi/aws/sync/TestAWSGlueSyncClient.java       |  51 ++
 .../apache/hudi/aws/sync/TestAwsGlueSyncTool.java  |  31 +
 .../lock/ITAzureStorageLockClientAzurite.java      |  72 +-
 .../org/apache/hudi/index/HoodieIndexUtils.java    |   5 +-
 .../keygen/TimestampBasedAvroKeyGenerator.java     |   2 +-
 .../apache/hudi/util/HoodieSchemaConverter.java    |  23 +
 .../hudi/util/TestHoodieSchemaConverter.java       |  54 ++
 .../io/storage/HoodieSparkFileWriterFactory.java   |   1 +
 .../hudi/io/storage/HoodieSparkLanceWriter.java    |  44 +
 .../hudi/io/storage/HoodieSparkParquetReader.java  |  11 +-
 .../storage/row/HoodieRowParquetWriteSupport.java  |  42 +-
 .../SparkFileFormatInternalRowReaderContext.scala  |  76 +-
 .../sql/avro/HoodieSparkSchemaConverters.scala     |  20 +-
 .../parquet/HoodieParquetReadSupport.scala         |  48 +-
 .../org/apache/spark/sql/hudi/SparkAdapter.scala   |  40 +-
 .../row/TestHoodieRowParquetWriteSupport.java      | 120 +++
 .../parquet/TestHoodieParquetReadSupport.scala     |  34 -
 .../java/org/apache/hudi/avro/HoodieAvroUtils.java | 145 +++-
 .../apache/hudi/avro/HoodieAvroWrapperUtils.java   |  26 +-
 .../hudi/common/engine/HoodieReaderContext.java    |  10 +
 .../hudi/common/schema/HoodieProjectionMask.java   | 199 +++++
 .../table/read/buffer/FileGroupRecordBuffer.java   |  27 +-
 .../buffer/PositionBasedFileGroupRecordBuffer.java |  10 +-
 .../hudi/metadata/HoodieTableMetadataUtil.java     |  22 +-
 .../hudi/metadata/MetadataPartitionType.java       |  13 +-
 .../org/apache/hudi/avro/TestHoodieAvroUtils.java  | 144 ++++
 .../hudi/common/schema/HoodieSchemaTestUtils.java  |  29 +
 .../hudi/metadata/TestHoodieTableMetadataUtil.java |  21 +
 .../vector_cross_engine_validation/README.md       |  66 ++
 .../vector_cross_engine_validation/vector_cow.zip  | Bin 0 -> 80614 bytes
 .../vector_cross_engine_validation/vector_mor.zip  | Bin 0 -> 77871 bytes
 .../src/test/python/vector_blob_demo/.gitignore    |  19 +-
 .../src/test/python/vector_blob_demo/README.md     | 752 +++++++++++++++++
 .../vector_blob_demo/hudi_blob_reader_demo.py      | 620 ++++++++++++++
 .../hudi_dataframe_vector_blob_demo.py             | 631 ++++++++++++++
 .../vector_blob_demo/hudi_sql_vector_blob_demo.py  | 601 ++++++++++++++
 .../vector_blob_demo/notebooks/00_main_demo.ipynb  | 558 +++++++++++++
 .../notebooks/01_blob_reader.ipynb                 | 917 +++++++++++++++++++++
 .../notebooks/02_sql_vector_search.ipynb           | 912 ++++++++++++++++++++
 .../notebooks/03_dataframe_vector_search.ipynb     | 826 +++++++++++++++++++
 .../python/vector_blob_demo/notebooks/README.md    | 157 ++++
 .../test/python/vector_blob_demo/requirements.txt  |  45 +
 .../src/test/python/vector_blob_demo/run_demos.sh  |  72 ++
 .../apache/hudi/configuration/OptionsResolver.java |  33 +-
 .../hudi/sink/append/AppendWriteFunction.java      |  11 +
 .../AppendWriteFunctionWithBIMBufferSort.java      |  12 +-
 .../apache/hudi/sink/bulk/AutoRowDataKeyGen.java   |   2 +-
 .../org/apache/hudi/sink/bulk/RowDataKeyGens.java  |  11 +-
 .../apache/hudi/sink/utils/HiveSyncContext.java    |   6 +-
 .../java/org/apache/hudi/sink/utils/Pipelines.java |   2 +-
 .../hudi/source/prune/PrimaryKeyPruners.java       |   2 +-
 .../org/apache/hudi/table/HoodieTableFactory.java  |  19 +-
 .../org/apache/hudi/table/HoodieTableSource.java   |   2 +-
 .../apache/hudi/table/catalog/HoodieCatalog.java   |   6 +-
 .../hudi/table/catalog/HoodieHiveCatalog.java      |   2 +-
 .../java/org/apache/hudi/util/StreamerUtil.java    |   1 -
 .../hudi/configuration/TestOptionsResolver.java    |  27 +
 .../hudi/sink/append/TestAppendWriteFunction.java  |  17 +
 .../TestAppendWriteFunctionWithBIMBufferSort.java  | 106 +++
 .../apache/hudi/sink/bulk/TestRowDataKeyGens.java  |  21 +-
 .../hudi/sink/utils/TestHiveSyncContext.java       |  16 +
 .../apache/hudi/table/ITTestHoodieDataSource.java  |  30 +-
 .../ITTestVectorCrossEngineCompatibility.java      | 101 +++
 .../apache/hudi/table/TestHoodieTableFactory.java  |  40 +
 .../hudi/table/catalog/TestHoodieCatalog.java      |  47 ++
 .../org/apache/hudi/utils/TestConfigurations.java  |  23 +-
 .../avro/AvroSchemaConverterWithTimestampNTZ.java  |  11 +-
 .../hudi/metadata/TestHoodieTableMetadataUtil.java | 108 +++
 .../hudi/hadoop/HiveHoodieReaderContext.java       |  12 +-
 .../hudi/hadoop/HoodieColumnProjectionUtils.java   |  99 +++
 .../hadoop/hive/HoodieCombineHiveInputFormat.java  | 174 ++--
 .../utils/HoodieArrayWritableSchemaUtils.java      | 182 ++--
 .../hadoop/TestHoodieColumnProjectionUtils.java    |  92 +++
 .../hive/TestHoodieCombineHiveInputFormat.java     |  32 +
 .../utils/TestHoodieArrayWritableSchemaUtils.java  | 182 ++++
 hudi-integ-test/pom.xml                            |   2 +-
 .../scala/org/apache/hudi/HoodieFileIndex.scala    | 122 ++-
 .../apache/hudi/SparkHoodieTableFileIndex.scala    | 165 +++-
 .../HoodieFileGroupReaderBasedFileFormat.scala     |  26 +-
 .../sql/hive/SparkCatalogMetaStoreClient.scala     |  23 +-
 .../spark/sql/hudi/blob/BatchedBlobReader.scala    |   4 +-
 .../hudi/TestSparkSqlHudiPackageStructure.java     |   1 +
 .../TestGlobalIndexCommitTimeOrdering.java         | 473 +++++++++++
 .../io/storage/TestHoodieSparkLanceWriter.java     |  72 ++
 .../org/apache/hudi/TestHoodieFileIndex.scala      |  94 ++-
 .../apache/hudi/blob/TestReadBlobBatching.scala    | 223 +++++
 .../apache/hudi/functional/TestCOWDataSource.scala | 296 ++++++-
 .../hudi/functional/TestLanceDataSource.scala      |  99 ++-
 .../apache/hudi/functional/TestMORDataSource.scala |   5 +
 .../TestSparkAdapterRebaseModeDefault.scala        |  97 +++
 .../TestSparkAdapterRebaseModePropagation.scala    | 157 ++++
 .../hudi/functional/TestVectorDataSource.scala     | 129 ++-
 .../sql/hive/TestSparkCatalogMetaStoreClient.scala |  26 +
 .../sql/hudi/blob/TestBatchedBlobReaderMerge.scala | 183 ++++
 .../spark/sql/hudi/ddl/TestCreateTable.scala       |  12 +-
 .../apache/spark/sql/hudi/ddl/TestSpark3DDL.scala  |   4 -
 .../sql/hudi/dml/others/TestMergeIntoTable.scala   |   4 +-
 .../sql/hudi/dml/schema/TestBlobDataType.scala     | 285 +++++++
 .../sql/hudi/dml/schema/TestVariantDataType.scala  | 127 ++-
 .../Spark3HoodiePruneFileSourcePartitions.scala    |  18 +-
 .../apache/spark/sql/adapter/Spark3_3Adapter.scala |   9 +-
 .../Spark33HoodiePruneFileSourcePartitions.scala   |   2 +-
 .../apache/spark/sql/adapter/Spark3_4Adapter.scala |   9 +-
 .../apache/spark/sql/adapter/Spark3_5Adapter.scala |  17 +-
 .../spark/sql/adapter/BaseSpark4Adapter.scala      |  19 +-
 .../Spark4HoodiePruneFileSourcePartitions.scala    |  16 +-
 .../apache/spark/sql/adapter/Spark4_0Adapter.scala |  25 +-
 .../parquet/Spark40HoodieParquetReadSupport.scala  | 115 +++
 .../Spark40LegacyHoodieParquetFileFormat.scala     |   4 +-
 .../datasources/parquet/Spark40ParquetReader.scala |   2 +-
 .../TestSpark40HoodieParquetReadSupport.scala      |  59 ++
 .../src/main/antlr4/imports/SqlBase.g4             |   2 +-
 .../apache/spark/sql/adapter/Spark4_1Adapter.scala |  70 +-
 .../apache/spark/sql/avro/AvroDeserializer.scala   |  22 +
 .../Spark41LegacyHoodieParquetFileFormat.scala     |   5 +-
 .../HoodieSpark4_1ExtendedSqlAstBuilder.scala      |   2 +-
 .../hudi/TestSpark4_1AvroLogicalTypeBytes.scala    | 127 +++
 .../hudi/TestHudiAlluxioCacheFileOperations.java   |  30 +-
 .../hudi/TestHudiMemoryCacheFileOperations.java    |  30 +-
 .../plugin/hudi/TestHudiNoCacheFileOperations.java |  30 +-
 .../hudi/utilities/HiveIncrementalPuller.java      |  44 +-
 .../hudi/utilities/config/CloudSourceConfig.java   |  13 +
 ...FSSourceConfig.java => ORCDFSSourceConfig.java} |  24 +-
 .../utilities/config/ParquetDFSSourceConfig.java   |  15 +-
 .../hudi/utilities/sources/JsonKinesisSource.java  |   4 +-
 .../hudi/utilities/sources/KinesisSource.java      |   6 +-
 .../hudi/utilities/sources/ORCDFSSource.java       |   6 +-
 .../hudi/utilities/sources/ParquetDFSSource.java   |   4 +-
 .../helpers/CloudObjectsSelectorCommon.java        |  31 +-
 .../sources/helpers/KinesisOffsetGen.java          |  22 +-
 .../sources/helpers/KinesisReadConfig.java         |   4 +-
 .../TestHiveIncrementalPullerExecuteSQL.java       | 125 +++
 .../hudi/utilities/sources/TestAvroDFSSource.java  | 103 +++
 .../helpers/TestCloudObjectsSelectorCommon.java    |  62 ++
 packaging/hudi-azure-bundle/pom.xml                |  39 +-
 packaging/hudi-flink-bundle/pom.xml                |   4 +
 pom.xml                                            |  18 +-
 release/release_guide.md                           |  40 +-
 rfc/rfc-100/rfc-100.md                             |  61 ++
 scripts/release/copy_staging_repo.sh               | 175 ++++
 scripts/release/deploy_staging_jars.sh             |  23 +-
 scripts/release/deploy_staging_jars_java11.sh      |  82 --
 scripts/release/deploy_staging_jars_java17.sh      |  11 +
 scripts/release/validate_staged_bundles.sh         | 103 ++-
 156 files changed, 12616 insertions(+), 665 deletions(-)
 create mode 100644 hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/io/storage/row/TestHoodieRowParquetWriteSupport.java
 create mode 100644 hudi-common/src/main/java/org/apache/hudi/common/schema/HoodieProjectionMask.java
 create mode 100644 hudi-common/src/test/resources/vector_cross_engine_validation/README.md
 create mode 100644 hudi-common/src/test/resources/vector_cross_engine_validation/vector_cow.zip
 create mode 100644 hudi-common/src/test/resources/vector_cross_engine_validation/vector_mor.zip
 copy hudi-spark-datasource/hudi-spark/src/test/resources/external-config/hudi-defaults.conf => hudi-examples/hudi-examples-spark/src/test/python/vector_blob_demo/.gitignore (63%)
 create mode 100644 hudi-examples/hudi-examples-spark/src/test/python/vector_blob_demo/README.md
 create mode 100644 hudi-examples/hudi-examples-spark/src/test/python/vector_blob_demo/hudi_blob_reader_demo.py
 create mode 100644 hudi-examples/hudi-examples-spark/src/test/python/vector_blob_demo/hudi_dataframe_vector_blob_demo.py
 create mode 100644 hudi-examples/hudi-examples-spark/src/test/python/vector_blob_demo/hudi_sql_vector_blob_demo.py
 create mode 100644 hudi-examples/hudi-examples-spark/src/test/python/vector_blob_demo/notebooks/00_main_demo.ipynb
 create mode 100644 hudi-examples/hudi-examples-spark/src/test/python/vector_blob_demo/notebooks/01_blob_reader.ipynb
 create mode 100644 hudi-examples/hudi-examples-spark/src/test/python/vector_blob_demo/notebooks/02_sql_vector_search.ipynb
 create mode 100644 hudi-examples/hudi-examples-spark/src/test/python/vector_blob_demo/notebooks/03_dataframe_vector_search.ipynb
 create mode 100644 hudi-examples/hudi-examples-spark/src/test/python/vector_blob_demo/notebooks/README.md
 create mode 100644 hudi-examples/hudi-examples-spark/src/test/python/vector_blob_demo/requirements.txt
 create mode 100755 hudi-examples/hudi-examples-spark/src/test/python/vector_blob_demo/run_demos.sh
 create mode 100644 hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/append/TestAppendWriteFunctionWithBIMBufferSort.java
 create mode 100644 hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/table/ITTestVectorCrossEngineCompatibility.java
 create mode 100644 hudi-spark-datasource/hudi-spark/src/test/java/org/apache/hudi/functional/TestGlobalIndexCommitTimeOrdering.java
 create mode 100644 hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/blob/TestReadBlobBatching.scala
 create mode 100644 hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestSparkAdapterRebaseModeDefault.scala
 create mode 100644 hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestSparkAdapterRebaseModePropagation.scala
 create mode 100644 hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/blob/TestBatchedBlobReaderMerge.scala
 create mode 100644 hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/dml/schema/TestBlobDataType.scala
 create mode 100644 hudi-spark-datasource/hudi-spark4.0.x/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/Spark40HoodieParquetReadSupport.scala
 create mode 100644 hudi-spark-datasource/hudi-spark4.0.x/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/TestSpark40HoodieParquetReadSupport.scala
 create mode 100644 hudi-spark-datasource/hudi-spark4.1.x/src/test/scala/org/apache/hudi/TestSpark4_1AvroLogicalTypeBytes.scala
 copy hudi-utilities/src/main/java/org/apache/hudi/utilities/config/{ParquetDFSSourceConfig.java => ORCDFSSourceConfig.java} (62%)
 create mode 100644 hudi-utilities/src/test/java/org/apache/hudi/utilities/TestHiveIncrementalPullerExecuteSQL.java
 create mode 100755 scripts/release/copy_staging_repo.sh
 delete mode 100755 scripts/release/deploy_staging_jars_java11.sh
