dependabot[bot] opened a new pull request, #9631: URL: https://github.com/apache/iceberg/pull/9631
Bumps [io.delta:delta-spark_2.12](https://github.com/delta-io/delta) from 3.0.0 to 3.1.0. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/delta-io/delta/releases">io.delta:delta-spark_2.12's releases</a>.</em></p> <blockquote> <h2>Delta Lake 3.1.0</h2> <p>We are excited to announce the release of Delta Lake 3.1.0. This release includes several exciting new features.</p> <h2>A Few Highlights</h2> <ul> <li><strong>Delta-Spark:</strong> <a href="https://redirect.github.com/delta-io/delta/issues/2426">Support for merge with deletion vectors</a> to reduce the write overhead for merge operations. This feature improves the performance of merge severalfold.</li> <li><strong>Delta-Spark:</strong> <a href="https://redirect.github.com/delta-io/delta/issues/2092">Support for optimizing min/max aggregation queries</a> using the table metadata, which improves the performance of simple aggregation queries (e.g., SELECT min(x) FROM deltaTable) by up to 100x.</li> <li><strong>Delta-Spark:</strong> Support for <a href="https://docs.delta.io/3.1.0/delta-sharing.html">querying</a> tables shared through the <a href="https://delta.io/sharing/">Delta Sharing</a> protocol.</li> <li><strong>Kernel:</strong> Support for data skipping for given query predicates to reduce the number of files read during the table scan.</li> <li><strong>Uniform:</strong> <a href="https://redirect.github.com/delta-io/delta/issues/2297">Enhanced Iceberg support</a> for Delta tables that enables MAP and LIST types, plus ease-of-use improvements for enabling Uniform on a Delta table.</li> <li><strong>Delta-Flink:</strong> Flink write job startup time latency improvement using Kernel.</li> </ul> <p>Details by each component.</p> <h2>Delta Spark</h2> <p>Delta Spark 3.1.0 is built on <a href="https://spark.apache.org/releases/spark-release-3-5-0.html">Apache Spark™ 3.5</a>. 
Similar to Apache Spark, we have released Maven artifacts for both Scala 2.12 and Scala 2.13.</p> <ul> <li>Documentation: <a href="https://docs.delta.io/3.1.0/index.html">https://docs.delta.io/3.1.0/index.html</a></li> <li>API documentation: <a href="https://docs.delta.io/3.1.0/delta-apidoc.html#delta-spark">https://docs.delta.io/3.1.0/delta-apidoc.html#delta-spark</a></li> <li>Maven artifacts: <a href="https://repo1.maven.org/maven2/io/delta/delta-spark_2.12/3.1.0/">delta-spark_2.12</a>, <a href="https://repo1.maven.org/maven2/io/delta/delta-spark_2.13/3.1.0/">delta-spark_2.13</a>, <a href="https://repo1.maven.org/maven2/io/delta/delta-contribs_2.12/3.1.0/">delta-contribs_2.12</a>, <a href="https://repo1.maven.org/maven2/io/delta/delta-contribs_2.13/3.1.0/">delta-contribs_2.13</a>, <a href="https://repo1.maven.org/maven2/io/delta/delta-storage/3.1.0/">delta-storage</a>, <a href="https://repo1.maven.org/maven2/io/delta/delta-storage-s3-dynamodb/3.1.0/">delta-storage-s3-dynamodb</a>, <a href="https://repo1.maven.org/maven2/io/delta/delta-iceberg_2.12/3.1.0/">delta-iceberg_2.12</a>, <a href="https://repo1.maven.org/maven2/io/delta/delta-iceberg_2.13/3.1.0/">delta-iceberg_2.13</a></li> <li>Python artifacts: <a href="https://pypi.org/project/delta-spark/3.1.0/">https://pypi.org/project/delta-spark/3.1.0/</a></li> </ul> <p>The key features of this release are:</p> <ul> <li><a href="https://redirect.github.com/delta-io/delta/issues/2426"><strong>Support for merge with deletion vectors</strong></a> to reduce the write overhead for merge operations. This feature improves the performance of merge severalfold. 
Refer to the <a href="https://docs.delta.io/3.1.0/delta-deletion-vectors.html">documentation</a> on deletion vectors for more information.</li> <li><a href="https://redirect.github.com/delta-io/delta/issues/2092"><strong>Support for optimizing min/max aggregation queries</strong></a> using the table metadata, which improves the performance of simple aggregation queries (e.g., SELECT min(x) FROM deltaTable) by up to 100x.</li> <li><a href="https://redirect.github.com/delta-io/delta/issues/1874"><strong>(Preview) Liquid clustering for better table layout</strong></a>. Delta now allows clustering the data in a Delta table for better data skipping. Currently this is an experimental feature. See the <a href="https://docs.delta.io/3.1.0/delta-clustering.html">documentation</a> and <a href="https://github.com/delta-io/delta/blob/branch-3.1/examples/scala/src/main/scala/example/Clustering.scala">example</a> for how to try out this feature.</li> <li><a href="https://redirect.github.com/delta-io/delta/issues/2238"><strong>Support for DEFAULT value columns</strong></a>. Delta supports defining default expressions for columns on Delta tables. Delta will generate default values for columns when users do not explicitly provide values for them when writing to such tables, or when the user explicitly specifies the DEFAULT SQL keyword for any such column. See the <a href="https://docs.delta.io/3.1.0/delta-default-columns.html">documentation</a> on how to enable this feature and try it out.</li> <li><a href="https://redirect.github.com/delta-io/delta/issues/1478"><strong>Support for Hive Metastore schema sync</strong></a>. Adds a mechanism for syncing the table schema to HMS. External tools can now directly consume the schema from HMS instead of accessing it from the Delta table directory. 
See the <a href="https://docs.delta.io/3.1.0/delta-batch.html#syncing-table-schema-and-properties-to-the-hive-metastore">documentation</a> on how to enable this feature.</li> <li><a href="https://redirect.github.com/delta-io/delta/pull/2414"><strong>Auto compaction</strong></a> to address the small files problem during table writes. Auto compaction, which runs at the end of the write query, combines small files within partitions into larger files to reduce the metadata size and improve query performance. See the <a href="https://docs.delta.io/3.1.0/optimizations-oss.html#auto-compaction">documentation</a> for details on how to enable this feature.</li> <li><a href="https://redirect.github.com/delta-io/delta/pull/2145"><strong>Optimized write</strong></a> is an optimization that repartitions and rebalances data before writing it out to a Delta table. Optimized writes improve file size and reduce the small file problem as data is written, and benefit subsequent reads on the table. See the <a href="https://docs.delta.io/3.1.0/optimizations-oss.html#optimized-write">documentation</a> for details on how to enable this feature.</li> </ul> <p>Other notable changes include:</p> <ul> <li><a href="https://redirect.github.com/delta-io/delta/pull/2536">Performance improvement</a> by removing redundant jobs when performing DML operations with deletion vectors.</li> <li><a href="https://redirect.github.com/delta-io/delta/pull/2456">Update command</a> now writes deletion vectors by default when the table has deletion vectors enabled.</li> <li><a href="https://github.com/delta-io/delta/commit/d4fd5e2a">Support</a> for writing partition columns to data files.</li> <li><a href="https://github.com/delta-io/delta/commit/bcd0ee2d">Support</a> for phaseout of v2 checkpoint table feature.</li> <li><a href="https://github.com/delta-io/delta/commit/61dd5d16">Fix</a> an issue with case-sensitive column names in Merge.</li> <li><a 
href="https://redirect.github.com/delta-io/delta/pull/2558">Make</a> the VACUUM command Delta protocol aware so that it only vacuums tables whose protocol it supports.</li> </ul> <h2>Delta Sharing Spark</h2> <ul> <li>Documentation: <a href="https://docs.delta.io/3.1.0/delta-sharing.html">https://docs.delta.io/3.1.0/delta-sharing.html</a></li> <li>Maven artifacts: <a href="https://repo1.maven.org/maven2/io/delta/delta-sharing-spark_2.12/3.1.0/">delta-sharing-spark_2.12</a>, <a href="https://repo1.maven.org/maven2/io/delta/delta-sharing-spark_2.13/3.1.0/">delta-sharing-spark_2.13</a></li> </ul> <p>This release of Delta <a href="https://redirect.github.com/delta-io/delta/issues/2291">adds</a> a new module called delta-sharing-spark, which enables reading Delta tables shared using the <a href="https://delta.io/sharing/">Delta Sharing</a> protocol in <a href="https://spark.apache.org/releases/spark-release-3-5-0.html">Apache Spark™</a>. It has been migrated from the <a href="https://github.com/delta-io/delta-sharing/tree/main/spark">https://github.com/delta-io/delta-sharing/tree/main/spark</a> repository to the <a href="https://github.com/delta-io/delta/tree/master/sharing">https://github.com/delta-io/delta/tree/master/sharing</a> repository. The last release of delta-sharing-spark from the previous location is 1.0.4. The next release of delta-sharing-spark ships with the current release of Delta, which is 3.1.0.</p> <p>Supported read types are: reading a snapshot of the table, incrementally reading the table using streaming, or reading the changes (Change Data Feed) between two versions of the table.</p> <!-- raw HTML omitted --> </blockquote> <p>... 
(truncated)</p> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/delta-io/delta/commit/71b09f0027c2940806ad2022a6b9fcd10505f3fd"><code>71b09f0</code></a> Setting version to 3.1.0</li> <li><a href="https://github.com/delta-io/delta/commit/12ee15240df4d33879a9dcbd3dc910f2fcfe4d05"><code>12ee152</code></a> [Spark][Sharing] Fix Delta Sharing DataFrame not updated for Snapshot Query</li> <li><a href="https://github.com/delta-io/delta/commit/121c1c8de6fd13232dcf38cac658119582f599dd"><code>121c1c8</code></a> [Doc][3.1] Add a link to the V2 Checkpoint specification in the DROP TABLE Fe...</li> <li><a href="https://github.com/delta-io/delta/commit/a2357ebd6a90b8cd49402f1b57968b1839fea28f"><code>a2357eb</code></a> [Docs] Add auto-compact docs</li> <li><a href="https://github.com/delta-io/delta/commit/98db14c074765548fe89c38a2563e8ed35c648c2"><code>98db14c</code></a> [Docs] Update version in docs</li> <li><a href="https://github.com/delta-io/delta/commit/8f6f3c76592bdd6fe6ffbd83c1553e1bf7f98db0"><code>8f6f3c7</code></a> [Spark][Sharing] Add doc for delta sharing</li> <li><a href="https://github.com/delta-io/delta/commit/6704f0a93ca093fa533090db42ef778fb50b7913"><code>6704f0a</code></a> [Docs] Fix documentation for default columns</li> <li><a href="https://github.com/delta-io/delta/commit/85c8cb7833e92f69cb3186c4078715fdef97faf5"><code>85c8cb7</code></a> [Docs] Add docs for dropping table feature</li> <li><a href="https://github.com/delta-io/delta/commit/559d1f84bd4aa96d2111728233ecff1b9ad4ba1f"><code>559d1f8</code></a> [Spark] Add Writer Protocol check in Vacuum Command</li> <li><a href="https://github.com/delta-io/delta/commit/e0c3bfd242a1847dddafcf288f5660b3b36ceadc"><code>e0c3bfd</code></a> [Kernel] Update the usage docs to reflect the recent API changes</li> <li>Additional commits viewable in <a href="https://github.com/delta-io/delta/compare/v3.0.0...v3.1.0">compare view</a></li> </ul> </details> <br /> 
Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

</details>

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
