[PR] Build: Bump net.snowflake:snowflake-jdbc from 3.14.5 to 3.15.1 [iceberg]
dependabot[bot] opened a new pull request, #10095: URL: https://github.com/apache/iceberg/pull/10095

Bumps [net.snowflake:snowflake-jdbc](https://github.com/snowflakedb/snowflake-jdbc) from 3.14.5 to 3.15.1.

Release notes
Sourced from [net.snowflake:snowflake-jdbc's releases](https://github.com/snowflakedb/snowflake-jdbc/releases).

- v3.15.1: Please Refer to Release Notes at https://docs.snowflake.com/en/release-notes/clients-drivers/jdbc
- v3.15.0: Please Refer to Release Notes at https://docs.snowflake.com/en/release-notes/clients-drivers/jdbc

Changelog
Sourced from [net.snowflake:snowflake-jdbc's changelog](https://github.com/snowflakedb/snowflake-jdbc/blob/master/CHANGELOG.rst).

- JDBC Driver 3.15.1: Please Refer to Release Notes at https://docs.snowflake.com/en/release-notes/clients-drivers/jdbc
- JDBC Driver 3.15.0: Please Refer to Release Notes at https://docs.snowflake.com/en/release-notes/clients-drivers/jdbc
- JDBC Driver 3.14.5: Please Refer to Release Notes at https://docs.snowflake.com/en/release-notes/clients-drivers/jdbc
- JDBC Driver 3.14.4: Please Refer to Release Notes at https://docs.snowflake.com/en/release-notes/clients-drivers/jdbc
- JDBC Driver 3.14.3: Please Refer to Release Notes at https://docs.snowflake.com/en/release-notes/clients-drivers/jdbc
- JDBC Driver 3.14.2: Please Refer to Release Notes at https://docs.snowflake.com/en/release-notes/clients-drivers/jdbc
- JDBC Driver 3.14.1: Please Refer to Release Notes at https://docs.snowflake.com/en/release-notes/clients-drivers/jdbc
- JDBC Driver 3.14.0: Please Refer to Release Notes at https://docs.snowflake.com/en/release-notes/clients-drivers/jdbc
- JDBC Driver 3.13.33: Please Refer to Release Notes at https://docs.snowflake.com/en/release-notes/clients-drivers/jdbc
- JDBC Driver 3.13.32: Please Refer to Release Notes at https://docs.snowflake.com/en/release-notes/clients-drivers/jdbc
- JDBC Driver 3.13.31: Please Refer to Release Notes at https://docs.snowflake.com/en/release-notes/clients-drivers/jdbc
- JDBC Driver 3.13.30: Please Refer to Release Notes at https://docs.snowflake.com/en/release-notes/clients-drivers/jdbc
- JDBC Driver 3.13.29 ... (truncated)

Commits

- [b4e92a4](https://github.com/snowflakedb/snowflake-jdbc/commit/b4e92a41f9d8f68d348ca8615c8a4126a9f3bfec) Bump version to 3.15.1 for release ([#1699](https://redirect.github.com/snowflakedb/snowflake-jdbc/issues/1699))
- [334b44f](https://github.com/snowflakedb/snowflake-jdbc/commit/334b44f61fee6ea085c62ef46caf659a21012fc5) SNOW-1299790 - handle nulls in structured types ([#1695](https://redirect.github.com/snowflakedb/snowflake-jdbc/issues/1695))
- [357b932](https://github.com/snowflakedb/snowflake-jdbc/commit/357b93256e540834fc4dcd5fb7f00c438488ed57) SNOW-969714: Add tests for max LOB size ([#1653](https://redirect.github.com/snowflakedb/snowflake-jdbc/issues/1653))
- [ebd0f06](https://github.com/snowflakedb/snowflake-jdbc/commit/ebd0f0627523f67f84a5a23a4c2c33fab5f2) SNOW-1234216 Add checks for structured types for getMap, getObject and getArr...
- [846c7f9](https://github.com/snowflakedb/snowflake-jdbc/commit/846c7f9ae3a265d96f4cc8a2395e82efa35e5232) SNOW-1281598: Bump nimbus-jose-jwt ([#1691](https://redirect.github.com/snowflakedb/snowflake-jdbc/issues/1691))
- [3e086e0](https://github.com/snowflakedb/snowflake-jdbc/commit/3e086e04b17210bdafc1e0a7646813c462886642) SNOW-1163212: InvalidPathException on Windows due to Nested file path
- [55d4e6c](https://github.com/snowflakedb/snowflake-jdbc/commit/55d4e6c90ece41fc70a1323deeaaefdcf497d3df) SNOW-1246554: Move public suffix list to internal package when shading ([#1690](https://redirect.github.com/snowflakedb/snowflake-jdbc/issues/1690))
- [1c4c7e8](https://github.com/snowflakedb/snowflake-jdbc/commit/1c4c7e8d850f6dd3b7a1710855672fcedeed5627) SNOW-1234216: Read
Re: [PR] Build: Bump com.google.cloud:libraries-bom from 26.28.0 to 26.35.0 [iceberg]
dependabot[bot] commented on PR #10070: URL: https://github.com/apache/iceberg/pull/10070#issuecomment-2041309006 Superseded by #10094. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org
[PR] Build: Bump com.google.cloud:libraries-bom from 26.28.0 to 26.37.0 [iceberg]
dependabot[bot] opened a new pull request, #10094: URL: https://github.com/apache/iceberg/pull/10094

Bumps [com.google.cloud:libraries-bom](https://github.com/googleapis/java-cloud-bom) from 26.28.0 to 26.37.0.

Release notes
Sourced from [com.google.cloud:libraries-bom's releases](https://github.com/googleapis/java-cloud-bom/releases).

v26.37.0
GCP Libraries BOM 26.37.0
Here are the differences from the previous version (26.36.0).
The group ID of the following artifacts is com.google.cloud.

Notable Changes
Other libraries
Version Upgrades
Minor Version Upgrades

- google-cloud-apigee-registry:0.41.0 (prev:0.40.0; Release Notes: [v0.41.0](https://github.com/googleapis/google-cloud-java/releases/tag/v1.35.0))
- google-cloud-video-intelligence:2.40.0 (prev:2.39.0; Release Notes: [v2.40.0](https://github.com/googleapis/google-cloud-java/releases/tag/v1.35.0))
- google-cloud-assured-workloads:2.41.0 (prev:2.40.0; Release Notes: [v2.41.0](https://github.com/googleapis/google-cloud-java/releases/tag/v1.35.0))
- google-cloud-speech:4.36.0 (prev:4.35.0; Release Notes: [v4.36.0](https://github.com/googleapis/google-cloud-java/releases/tag/v1.35.0))
- google-cloud-eventarc-publishing:0.41.0 (prev:0.40.0; Release Notes: [v0.41.0](https://github.com/googleapis/google-cloud-java/releases/tag/v1.35.0))
- google-cloud-workstations:0.29.0 (prev:0.28.0; Release Notes: [v0.29.0](https://github.com/googleapis/google-cloud-java/releases/tag/v1.35.0))
- google-cloud-alloydb-connectors:0.19.0 (prev:0.18.0; Release Notes: [v0.19.0](https://github.com/googleapis/google-cloud-java/releases/tag/v1.35.0))
- google-cloud-network-security:0.44.0 (prev:0.43.0; Release Notes: [v0.44.0](https://github.com/googleapis/google-cloud-java/releases/tag/v1.35.0))
- google-cloud-bare-metal-solution:0.41.0 (prev:0.40.0; Release Notes: [v0.41.0](https://github.com/googleapis/google-cloud-java/releases/tag/v1.35.0))
- google-cloud-chat:0.5.0 (prev:0.4.0; Release Notes: [v0.5.0](https://github.com/googleapis/google-cloud-java/releases/tag/v1.35.0))
- google-cloud-domains:1.38.0 (prev:1.37.0; Release Notes: [v1.38.0](https://github.com/googleapis/google-cloud-java/releases/tag/v1.35.0))
- google-cloud-advisorynotifications:0.30.0 (prev:0.29.0; Release Notes: [v0.30.0](https://github.com/googleapis/google-cloud-java/releases/tag/v1.35.0))
- google-cloud-recommendations-ai:0.48.0 (prev:0.47.0; Release Notes: [v0.48.0](https://github.com/googleapis/google-cloud-java/releases/tag/v1.35.0))
- google-cloud-gke-multi-cloud:0.40.0 (prev:0.39.0; Release Notes: [v0.40.0](https://github.com/googleapis/google-cloud-java/releases/tag/v1.35.0))
- google-cloud-accessapproval:2.42.0 (prev:2.41.0; Release Notes: [v2.42.0](https://github.com/googleapis/google-cloud-java/releases/tag/v1.35.0))
- google-cloud-service-management:3.39.0 (prev:3.38.0; Release Notes: [v3.39.0](https://github.com/googleapis/google-cloud-java/releases/tag/v1.35.0))
- google-cloud-contact-center-insights:2.41.0 (prev:2.40.0; Release Notes: [v2.41.0](https://github.com/googleapis/google-cloud-java/releases/tag/v1.35.0))
- google-cloud-shell:2.40.0 (prev:2.39.0; Release Notes: [v2.40.0](https://github.com/googleapis/google-cloud-java/releases/tag/v1.35.0))
- google-cloud-policy-troubleshooter:1.40.0 (prev:1.39.0; Release Notes: [v1.40.0](https://github.com/googleapis/google-cloud-java/releases/tag/v1.35.0))
- google-cloud-translate:2.41.0 (prev:2.40.0; Release Notes: [v2.41.0](https://github.com/googleapis/google-cloud-java/releases/tag/v1.35.0))
- google-cloud-recommender:2.43.0 (prev:2.42.0; Release Notes: [v2.43.0](https://github.com/googleapis/google-cloud-java/releases/tag/v1.35.0))
- google-cloud-compute:1.51.0 (prev:1.50.0; Release Notes: [v1.51.0](https://github.com/googleapis/google-cloud-java/releases/tag/v1.35.0))
- google-cloud-datacatalog:1.47.0 (prev:1.46.0; Release Notes: [v1.47.0](https://github.com/googleapis/google-cloud-java/releases/tag/v1.35.0))
- google-cloud-automl:2.41.0 (prev:2.40.0; Release Notes: [v2.41.0](https://github.com/googleapis/google-cloud-java/releases/tag/v1.35.0))
- google-cloud-binary-authorization:1.40.0 (prev:1.39.0; Release Notes: [v1.40.0](https://github.com/googleapis/google-cloud-java/releases/tag/v1.35.0))
- google-cloud-vpcaccess:2.42.0 (prev:2.41.0; Release Notes: [v2.42.0](https://github.com/googleapis/google-cloud-java/releases/tag/v1.35.0))
- google-cloud-language:2.42.0 (prev:2.41.0; Release Notes: [v2.42.0](https://github.com/googleapis/google-cloud-java/releases/tag/v1.35.0))
- google-cloud-publicca:0.38.0 (prev:0.37.0; Release Notes: [v0.38.0](https://github.com/googleapis/google-cloud-java/releases/tag/v1.35.0))
- google-cloud-biglake:0.29.0 (prev:0.28.0; Release Notes: [v0.29.0](https://github.com/googleapis/google-cloud-java/releases/tag/v1.35.0))
- google-cloud-api-gateway:2.41.0 (prev:2.40.0; Release Notes: [v2.41.0](https://github.com/googleapis/google-cloud-java/releases/tag/v1.35.0))
- google-cloud-run:0.41.0
[PR] Build: Bump software.amazon.awssdk:bom from 2.25.21 to 2.25.26 [iceberg]
dependabot[bot] opened a new pull request, #10093: URL: https://github.com/apache/iceberg/pull/10093 Bumps software.amazon.awssdk:bom from 2.25.21 to 2.25.26.
Re: [PR] Build: Bump com.google.cloud:libraries-bom from 26.28.0 to 26.35.0 [iceberg]
dependabot[bot] closed pull request #10070: Build: Bump com.google.cloud:libraries-bom from 26.28.0 to 26.35.0 URL: https://github.com/apache/iceberg/pull/10070
Re: [PR] feat: init iceberg writer [iceberg-rust]
marvinlanhenke commented on PR #275: URL: https://github.com/apache/iceberg-rust/pull/275#issuecomment-2041304545 @liurenjie1024 @ZENOTME What's the current status of this PR? It looks very promising, as does the framework outlined in #34. Since we have already completed some issues for read support (or they are in progress), I think it would be beneficial to outline the next steps for implementing write support, perhaps in another tracking issue (I don't think we have one yet?). I think we will have most of the writers in place once this PR is ready, but we have yet to 'orchestrate' them?
[PR] Build: Bump mkdocs-material from 9.5.15 to 9.5.17 [iceberg]
dependabot[bot] opened a new pull request, #10092: URL: https://github.com/apache/iceberg/pull/10092

Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 9.5.15 to 9.5.17.

Release notes
Sourced from [mkdocs-material's releases](https://github.com/squidfunk/mkdocs-material/releases).

mkdocs-material-9.5.17

- Updated Serbian translations
- Fixed [#7003](https://redirect.github.com/squidfunk/mkdocs-material/issues/7003): Confusing keyboard interaction for palette toggle
- Fixed [#7001](https://redirect.github.com/squidfunk/mkdocs-material/issues/7001): Blog posts now show time by default (9.5.16 regression)
- Fixed edge case in backport of social plugin font loading logic

mkdocs-material-9.5.16

- Updated Russian translations
- Improved error handling and reporting in social plugin
- Improved error handling and reporting in privacy plugin
- Fixed blog plugin not allowing to use time in format strings
- Fixed [#6983](https://redirect.github.com/squidfunk/mkdocs-material/issues/6983): Social plugin crashes because of Google Fonts API change

Thanks to [@kamilkrzyskow](https://github.com/kamilkrzyskow), [@Guts](https://github.com/Guts), [@szg-alex-payne](https://github.com/szg-alex-payne) and [@natakazakova](https://github.com/natakazakova) for their contributions

Changelog
Sourced from [mkdocs-material's changelog](https://github.com/squidfunk/mkdocs-material/blob/master/CHANGELOG).

mkdocs-material-9.5.17+insiders-4.53.6 (2024-04-05)

- Ensure working directory is set for projects when using projects plugin
- Fixed [#6970](https://redirect.github.com/squidfunk/mkdocs-material/issues/6970): Incorrect relative paths in git submodules with projects plugin

mkdocs-material-9.5.17+insiders-4.53.5 (2024-04-02)

- Fixed social plugin crashing when no colors are specified in palettes

mkdocs-material-9.5.17 (2024-04-02)

- Updated Serbian translations
- Fixed [#7003](https://redirect.github.com/squidfunk/mkdocs-material/issues/7003): Confusing keyboard interaction for palette toggle
- Fixed [#7001](https://redirect.github.com/squidfunk/mkdocs-material/issues/7001): Blog posts now show time by default (9.5.16 regression)
- Fixed edge case in backport of social plugin font loading logic

mkdocs-material-9.5.16+insiders-4.53.4 (2024-03-31)

- Fixed [#6973](https://redirect.github.com/squidfunk/mkdocs-material/issues/6973): Escaping issue in tags extra files deprecation helper

mkdocs-material-9.5.16 (2024-03-31)

- Updated Russian translations
- Improved error handling and reporting in social plugin
- Improved error handling and reporting in privacy plugin
- Fixed blog plugin not allowing to use time in format strings
- Fixed [#6983](https://redirect.github.com/squidfunk/mkdocs-material/issues/6983): Social plugin crashes because of Google Fonts API change

mkdocs-material-9.5.15+insiders-4.53.3 (2024-03-23)

- Added support for font variants in social plugin
- Improved resilience of font resolution in social plugin
- Fixed tag listing sometimes not being auto-populated
- Fixed tag listing scope not being correctly resolved
- Fixed [#6941](https://redirect.github.com/squidfunk/mkdocs-material/issues/6941): Meta plugin adding duplicate entries
- Fixed [#6928](https://redirect.github.com/squidfunk/mkdocs-material/issues/6928): Social plugin crashes for some fonts

mkdocs-material-9.5.15 (2024-03-23)

- Reverted fix for transparent iframes (9.5.14)
- Fixed [#6929](https://redirect.github.com/squidfunk/mkdocs-material/issues/6929): Interference of social plugin and auto dark mode
- Fixed [#6938](https://redirect.github.com/squidfunk/mkdocs-material/issues/6938): Giscus shows dark background in light mode (9.5.14 regression)

mkdocs-material-9.5.14+insiders-4.53.2 (2024-03-18)

- Fixed abort on first non-matching configuration in preview extension
- Fixed [#6914](https://redirect.github.com/squidfunk/mkdocs-material/issues/6914): Meta files take precedence over front matter

mkdocs-material-9.5.14 (2024-03-18)

... (truncated)

Commits

- [570161a](https://github.com/squidfunk/mkdocs-material/commit/570161ab3f0f7928c6d528e9a9bfcce2bef71fa5) Prepare 9.5.17 release
- [78e93ac](https://github.com/squidfunk/mkdocs-material/commit/78e93ac0af382b11d70ba91ed9b491455ffb) Improved keyboard interactions for palette toggle
- [a3655e8](https://github.com/squidfunk/mkdocs-material/commit/a3655e8307a12afc83a89432e3b4355fb805db4e) Updated Serbian translations
- [1041766](https://github.com/squidfunk/mkdocs-material/commit/1041766d81ccdd53200f206860d17e6a64d4e65b) Fixed time sneaking into default post format string
- [e741f80](https://github.com/squidfunk/mkdocs-material/commit/e741f80fbea58155b63048931b0fdb0ce65c5732) Documentation
- [7e13ae6](https://github.com/squidfunk/mkdocs-material/commit/7e13ae602f76635e9374ee645b146f41e85cb1d5)
[PR] Docs: Fix On-screen display issues and minor expressions on Branching and Tagging DDL [iceberg]
lawofcycles opened a new pull request, #10091: URL: https://github.com/apache/iceberg/pull/10091

I propose the following three modifications.

- Fix a broken bullet point display.
  https://github.com/apache/iceberg/assets/70102274/1ce9bfe1-4424-4b59-baec-ccf7e6a3fec7
  https://github.com/apache/iceberg/assets/70102274/1e6e4e9a-7ecb-43df-9536-a682a7226096
- Fix an inconsistency between the comment and the SQL:
> -- CREATE audit-branch at snapshot 1234, retain audit-branch for 31 days, and retain the latest 31 days. The latest 3
> snapshot snapshots, and 2 days worth of snapshots.
> ALTER TABLE prod.db.sample CREATE BRANCH `audit-branch`
> AS OF VERSION 1234 RETAIN 30 DAYS
> WITH SNAPSHOT RETENTION 3 SNAPSHOTS 2 DAYS
- Give a slightly clearer explanation of the respective options for the DDLs.

I would be happy for you to review my proposal.
Re: [PR] Parquet: Make row-group filters cooperate to filter [iceberg]
zhongyujiang commented on PR #6893: URL: https://github.com/apache/iceberg/pull/6893#issuecomment-2041294392 Replaced by #10090.
Re: [PR] Parquet: Make row-group filters cooperate to filter [iceberg]
zhongyujiang closed pull request #6893: Parquet: Make row-group filters cooperate to filter URL: https://github.com/apache/iceberg/pull/6893
Re: [PR] Docs: Fix inconsistency in branching and tagging scenario [iceberg]
lawofcycles commented on PR #9968: URL: https://github.com/apache/iceberg/pull/9968#issuecomment-2041291407 @bitsondatadev Following your comments, I pushed a new version that improves the entire explanation for Historical Tags. I kept the following points in mind:
- the assumption that the snapshots are compressed for each day
- the user must ensure, by choosing when to create the tag, that the target snapshot covers the intended period.

I hope this suggestion helps.
[PR] Parquet: Make row-group filters cooperate to filter [iceberg]
zhongyujiang opened a new pull request, #10090: URL: https://github.com/apache/iceberg/pull/10090

This PR refactors the three Parquet row-group filters so that each one computes a residual expression for the given row-group. The residual computed by one filter can be passed to the next, allowing the three Parquet row-group filters to work together. This improves the handling of some `OR` condition queries.

For example, assume we have a query `a = 'foo' OR b = 'bar'`, where column a is dictionary-encoded in a Parquet row-group, while column b is not entirely dictionary-encoded in all data pages but has a bloom filter. Therefore, `a = 'foo'` can only be evaluated by the dictionary filter, and `b = 'bar'` can only be evaluated by the bloom filter. Currently, even if each filter evaluates its own sub-expression as `ROWS_CANNOT_MATCH`, each filter can only evaluate one sub-expression, so the final output would still be `ROWS_MIGHT_MATCH` (assuming the metric filter evaluates both sub-expressions as `ROWS_MIGHT_MATCH`).

After refactoring into the form of computing residuals, the dictionary filter will compute the residual for `a = 'foo' OR b = 'bar'` as `b = 'bar'`. This residual expression is then passed to the bloom filter and evaluated as `Expressions.alwaysFalse()`. As a result, reading this row-group can be skipped.

This revives #6893 and can close #10029.

cc @cccs-jc @rdblue @huaxingao @amogh-jahagirdar @RussellSpitzer Could you please review this? Thanks!
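The residual hand-off described in #10090 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Iceberg's Java implementation: the `Eq` and `Or` classes, the `residual` and `should_read` helpers, and the row-group metadata (`DICTIONARIES`, `BLOOM_MEMBERS`) are all hypothetical names invented here, a plain set stands in for a real bloom filter, and Python's `True` / `False` stand in for `alwaysTrue()` / `alwaysFalse()`.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Union


@dataclass(frozen=True)
class Eq:
    """Predicate `column = value`."""
    column: str
    value: str


@dataclass(frozen=True)
class Or:
    left: "Expr"
    right: "Expr"


# Python's True / False stand in for alwaysTrue() / alwaysFalse().
Expr = Union[bool, Eq, Or]
# A filter returns False when it can prove no row matches, None when it cannot tell.
Filter = Callable[[Eq], Optional[bool]]

# Hypothetical row-group metadata, invented for this example.
DICTIONARIES = {"a": {"baz", "qux"}}  # column a is fully dictionary-encoded
BLOOM_MEMBERS = {"b": {"hello"}}      # a set stands in for column b's bloom filter


def dictionary_filter(pred: Eq) -> Optional[bool]:
    values = DICTIONARIES.get(pred.column)
    return False if values is not None and pred.value not in values else None


def bloom_filter(pred: Eq) -> Optional[bool]:
    members = BLOOM_MEMBERS.get(pred.column)
    return False if members is not None and pred.value not in members else None


def residual(expr: Expr, row_filter: Filter) -> Expr:
    """Return what is left of `expr` after `row_filter` has ruled out what it can."""
    if isinstance(expr, bool):
        return expr
    if isinstance(expr, Eq):
        return False if row_filter(expr) is False else expr
    left = residual(expr.left, row_filter)
    right = residual(expr.right, row_filter)
    if left is True or right is True:
        return True
    if left is False:
        return right
    if right is False:
        return left
    return Or(left, right)


def should_read(expr: Expr, filters: Sequence[Filter]) -> bool:
    """Chain the filters, feeding each one the residual left by the previous one."""
    for row_filter in filters:
        expr = residual(expr, row_filter)
        if expr is False:
            return False  # the row-group cannot contain matching rows
    return True


query = Or(Eq("a", "foo"), Eq("b", "bar"))
print(residual(query, dictionary_filter))                     # Eq(column='b', value='bar')
print(should_read(query, [dictionary_filter, bloom_filter]))  # False
```

Either filter on its own leaves a non-trivial residual, so the row-group is only skipped when the residual from the first filter is handed to the second.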
Re: [PR] Implement __getstate__ and __setstate__ on PyArrowFileIO and FsSpecFileIO so that they can be pickled [iceberg-python]
amogh-jahagirdar commented on code in PR #543: URL: https://github.com/apache/iceberg-python/pull/543#discussion_r1554762228

## tests/io/test_pyarrow.py: ##
@@ -256,6 +257,14 @@ def test_raise_on_opening_a_local_file_not_found() -> None:
     assert "[Errno 2] Failed to open local file" in str(exc_info.value)
+def test_pickle_pyarrow_file_io() -> None:

Review Comment: Sorry for the delay on this, I got busy with other work. Updated!
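For readers unfamiliar with the technique named in the PR title, here is a minimal sketch of how `__getstate__` and `__setstate__` can make an object picklable when it holds a member that cannot be pickled. The `CachingFileIO` class and its attributes are hypothetical and are not the PyArrowFileIO or FsSpecFileIO code from the PR; a `threading.Lock` is used only because it is a well-known unpicklable object.

```python
import pickle
import threading


class CachingFileIO:
    """Hypothetical stand-in for a FileIO that holds unpicklable runtime state."""

    def __init__(self, properties):
        self.properties = dict(properties)
        self._lock = threading.Lock()  # locks cannot be pickled
        self._fs_cache = {}            # per-process cache, not worth shipping

    def __getstate__(self):
        # Copy the instance dict and drop everything that cannot (or should not) travel.
        state = self.__dict__.copy()
        del state["_lock"]
        state["_fs_cache"] = {}
        return state

    def __setstate__(self, state):
        # Restore the plain attributes, then rebuild the runtime-only members.
        self.__dict__.update(state)
        self._lock = threading.Lock()


if __name__ == "__main__":
    io = CachingFileIO({"s3.region": "us-east-1"})
    clone = pickle.loads(pickle.dumps(io))
    print(clone.properties == io.properties)
```

Without the two methods, `pickle.dumps(io)` would fail on the lock; with them, only the configuration crosses the process boundary and each copy rebuilds its own runtime state.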
Re: [I] How do I fetch the latest ROADMAP [iceberg]
github-actions[bot] commented on issue #2469: URL: https://github.com/apache/iceberg/issues/2469#issuecomment-2041247971 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.
Re: [I] master branch - flink sql create hive catalog error [iceberg]
github-actions[bot] commented on issue #2468: URL: https://github.com/apache/iceberg/issues/2468#issuecomment-2041247963 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.
Re: [PR] [Bug Fix] Allow HiveCatalog to create table with TimestamptzType [iceberg-python]
Fokko commented on code in PR #585: URL: https://github.com/apache/iceberg-python/pull/585#discussion_r1554715441

## pyiceberg/catalog/hive.py: ##
@@ -199,6 +184,7 @@ def _annotate_namespace(database: HiveDatabase, properties: Properties) -> HiveD
     DateType: "date",
     TimeType: "string",
     TimestampType: "timestamp",
+    TimestamptzType: "timestamp",

Review Comment: Can we just set arbitrary strings in there? If so, I think `timestamp with local time zone` is more accurate. It would be good to validate using an integration test as well, since we have those tests already.
Re: [I] [BUG] Valid column characters fail on to_arrow() or to_pandas() ArrowInvalid: No match for FieldRef.Name [iceberg-python]
kevinjqliu commented on issue #584: URL: https://github.com/apache/iceberg-python/issues/584#issuecomment-2041172613

Thanks for reporting this and for providing a reproducible example! Here's what I've found.

In the `_task_to_table` function, the provided schema is modified before it is passed to the underlying Arrow Scanner. This is done in `sanitize_column_names` [here](https://github.com/apache/iceberg-python/blob/4148edb5e28ae88024a55e0b112238e65b873957/pyiceberg/io/pyarrow.py#L970). The `file_project_schema` field is then passed into the Scanner, which leads to the error above.

The schema name is changed from `TEST:A1B2.RAW.ABC-GG-1-A` to `TEST_x3AA1B2_x2ERAW_x2EABC_x2DGG_x2D1_x2DA`.

The read will work if you comment out the schema modification ([L981](https://github.com/apache/iceberg-python/blob/4148edb5e28ae88024a55e0b112238e65b873957/pyiceberg/io/pyarrow.py#L981)) and just return the `arrow_table` instead of [L1014](https://github.com/apache/iceberg-python/blob/4148edb5e28ae88024a55e0b112238e65b873957/pyiceberg/io/pyarrow.py#L1014).
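The renaming quoted in the comment on #584 is consistent with replacing every character outside `[A-Za-z0-9_]` by `_x` followed by its uppercase hexadecimal code point. The sketch below reproduces that mapping so the mismatch is easy to see. It is an assumption inferred from the single example in the comment, not a copy of pyiceberg's `sanitize_column_names`; the `sanitize_name` helper is a hypothetical name, and special handling of a leading character (for example a leading digit) is deliberately left out.

```python
def sanitize_name(name: str) -> str:
    """Replace characters outside [A-Za-z0-9_] with '_x' + uppercase hex code point.

    Hypothetical helper inferred from the example in issue #584; the real
    pyiceberg implementation may treat some cases differently.
    """
    return "".join(
        ch if (ch.isascii() and ch.isalnum()) or ch == "_" else "_x" + format(ord(ch), "X")
        for ch in name
    )


original = "TEST:A1B2.RAW.ABC-GG-1-A"
sanitized = sanitize_name(original)
print(sanitized)  # TEST_x3AA1B2_x2ERAW_x2EABC_x2DGG_x2D1_x2DA

# The file still stores the original name, so a lookup by the sanitized
# name finds nothing, which matches the "No match for FieldRef.Name" error.
file_columns = {original}
print(sanitized in file_columns)  # False
```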
Re: [PR] Add `BoundPredicateVisitor` trait [iceberg-rust]
marvinlanhenke commented on code in PR #320: URL: https://github.com/apache/iceberg-rust/pull/320#discussion_r1554654358

## crates/iceberg/src/expr/visitors/bound_predicate_visitor.rs: ##
@@ -0,0 +1,260 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::expr::{BoundPredicate, BoundReference, PredicateOperator};
+use crate::spec::Datum;
+use crate::Result;
+use fnv::FnvHashSet;
+
+pub enum OpLiteral<'a> {
+    None,
+    Single(&'a Datum),
+    Set(&'a FnvHashSet<Datum>),
+}
+
+pub trait BoundPredicateVisitor {
+    type T;
+
+    fn always_true(&mut self) -> Result<Self::T>;
+    fn always_false(&mut self) -> Result<Self::T>;
+
+    fn and(&mut self, lhs: Self::T, rhs: Self::T) -> Result<Self::T>;
+    fn or(&mut self, lhs: Self::T, rhs: Self::T) -> Result<Self::T>;
+    fn not(&mut self, inner: Self::T) -> Result<Self::T>;

Review Comment: I'm still unsure whether we should provide those in the trait, I guess for the same reasons @liurenjie1024 mentioned in the original discussion. Would `fn op` not be sufficient, with the rest handled in `fn visit` (it handles most of the traversal logic already)?
Re: [PR] Add `BoundPredicateVisitor` trait [iceberg-rust]
marvinlanhenke commented on code in PR #320:
URL: https://github.com/apache/iceberg-rust/pull/320#discussion_r1554653083

## crates/iceberg/src/expr/visitors/bound_predicate_visitor.rs: ##
@@ -0,0 +1,260 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::expr::{BoundPredicate, BoundReference, PredicateOperator};
+use crate::spec::Datum;
+use crate::Result;
+use fnv::FnvHashSet;
+
+pub enum OpLiteral<'a> {
+    None,

Review Comment:
Could we remove `None` and use Option instead? Also, I had to create nearly the same enum for the project transform PR #264. Perhaps it would make sense to have this in value.rs as an enum?

## crates/iceberg/src/expr/visitors/bound_predicate_visitor.rs: ##
@@ -0,0 +1,260 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::expr::{BoundPredicate, BoundReference, PredicateOperator};
+use crate::spec::Datum;
+use crate::Result;
+use fnv::FnvHashSet;
+
+pub enum OpLiteral<'a> {
+    None,
+    Single(&'a Datum),
+    Set(&'a FnvHashSet<Datum>),
+}
+
+pub trait BoundPredicateVisitor {
+    type T;
+
+    fn always_true(&mut self) -> Result<Self::T>;
+    fn always_false(&mut self) -> Result<Self::T>;
+
+    fn and(&mut self, lhs: Self::T, rhs: Self::T) -> Result<Self::T>;
+    fn or(&mut self, lhs: Self::T, rhs: Self::T) -> Result<Self::T>;
+    fn not(&mut self, inner: Self::T) -> Result<Self::T>;

Review Comment:
I'm still unsure if we should provide those in the trait? I guess due to the same reasons @liurenjie1024 mentioned in the original discussion. Would `fn op` not be sufficient - and the rest can be handled in `fn visit` (it handles most of the traversal logic already)?
Re: [I] Spark rewrite Files Action OOM [iceberg]
Zhanxiao-Ma commented on issue #10054:
URL: https://github.com/apache/iceberg/issues/10054#issuecomment-2041085307

> @nk1506 Echoing Russell's comments, how many small files are there in your OOM case? How much memory do you set up?

@RussellSpitzer I believe increasing memory is not a good solution for dealing with excessive information deletion because it is impossible to predict how much memory would be appropriate.
Re: [I] What is the meaning of `delete_rows_count` and `delete_data_count_file ` at manifest [iceberg]
wg1026688210 closed issue #2445: What is the meaning of `delete_rows_count` and `delete_data_count_file ` at manifest
URL: https://github.com/apache/iceberg/issues/2445
Re: [PR] Add Struct Accessors to BoundReferences [iceberg-rust]
liurenjie1024 merged PR #317:
URL: https://github.com/apache/iceberg-rust/pull/317
Re: [PR] Add Struct Accessors to BoundReferences [iceberg-rust]
sdd commented on PR #317:
URL: https://github.com/apache/iceberg-rust/pull/317#issuecomment-2041050879

@liurenjie1024 I've added a test as well now for build_accessors :-)
Re: [PR] Add Struct Accessors to BoundReferences [iceberg-rust]
liurenjie1024 commented on code in PR #317:
URL: https://github.com/apache/iceberg-rust/pull/317#discussion_r1554560769

## crates/iceberg/src/spec/schema.rs: ##
@@ -137,9 +142,55 @@ impl SchemaBuilder {
 name_to_id,
 lowercase_name_to_id,
 id_to_name,
+
+field_id_to_accessor,
 })
 }
+fn build_accessors() -> HashMap> {

Review Comment:
Sorry for the confusion; I actually meant the `r#struct` method in `SchemaVisitor`: https://github.com/apache/iceberg-rust/blob/1c2a20b13e67e4f91a44d49fa1b0e3432cd48432/crates/iceberg/src/spec/schema.rs#L324

But I agree that this can be left for later.
Re: [PR] Add `BoundPredicateVisitor` trait [iceberg-rust]
sdd commented on code in PR #320:
URL: https://github.com/apache/iceberg-rust/pull/320#discussion_r1554546747

## crates/iceberg/src/expr/visitors/bound_predicate_visitor.rs: ##
@@ -0,0 +1,366 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::expr::{BoundPredicate, BoundReference, PredicateOperator};
+use crate::spec::Datum;
+use crate::Result;
+use fnv::FnvHashSet;
+
+pub trait BoundPredicateVisitor {
+    type T;
+
+    fn always_true(&mut self) -> Result<Self::T>;
+    fn always_false(&mut self) -> Result<Self::T>;
+
+    fn and(&mut self, lhs: Self::T, rhs: Self::T) -> Result<Self::T>;
+    fn or(&mut self, lhs: Self::T, rhs: Self::T) -> Result<Self::T>;
+    fn not(&mut self, inner: Self::T) -> Result<Self::T>;
+
+    fn is_null(&mut self, reference: &BoundReference) -> Result<Self::T>;

Review Comment:
That still wouldn't work without collecting the set into a vec first for set ops. I've updated the PR with an alternative based on an enum, see what you think.
Re: [PR] Add Struct Accessors to BoundReferences [iceberg-rust]
sdd commented on code in PR #317:
URL: https://github.com/apache/iceberg-rust/pull/317#discussion_r1554542369

## crates/iceberg/src/spec/schema.rs: ##
@@ -137,9 +142,55 @@ impl SchemaBuilder {
 name_to_id,
 lowercase_name_to_id,
 id_to_name,
+
+field_id_to_accessor,
 })
 }
+fn build_accessors() -> HashMap> {

Review Comment:
Sorry Renjie, I still don't follow! `visit_struct` already exists and has a different signature to that? https://github.com/apache/iceberg-rust/blob/4e89ac71c3ea77b9270b71dac52b5c3e50c36526/crates/iceberg/src/spec/schema.rs#L365
Re: [PR] Add Struct Accessors to BoundReferences [iceberg-rust]
sdd commented on code in PR #317:
URL: https://github.com/apache/iceberg-rust/pull/317#discussion_r1554540831

## crates/iceberg/src/expr/accessor.rs: ##
@@ -0,0 +1,119 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::spec::{Datum, Literal, PrimitiveType, Struct};
+use serde_derive::{Deserialize, Serialize};
+use std::sync::Arc;
+
+#[derive(Debug, Serialize, Deserialize, Clone, PartialEq, Eq)]
+enum InnerOrType {
+    Inner(Box<StructAccessor>),
+    Type(PrimitiveType),
+}
+
+#[derive(Debug, Serialize, Deserialize, Clone, PartialEq, Eq)]
+pub struct StructAccessor {
+    position: usize,
+    r#type: PrimitiveType,
+    inner: Option<Box<StructAccessor>>,
+}
+
+pub(crate) type StructAccessorRef = Arc<StructAccessor>;
+
+impl StructAccessor {
+    pub(crate) fn new(position: usize, r#type: PrimitiveType) -> Self {
+        StructAccessor {
+            position,
+            r#type,
+            inner: None,
+        }
+    }
+
+    pub(crate) fn wrap(position: usize, inner: Box<StructAccessor>) -> Self {
+        StructAccessor {
+            position,
+            r#type: inner.r#type().clone(),
+            inner: Some(inner),
+        }
+    }
+
+    pub(crate) fn position(&self) -> usize {
+        self.position
+    }
+
+    pub(crate) fn r#type(&self) -> &PrimitiveType {
+        &self.r#type
+    }
+
+    pub(crate) fn get<'a>(&'a self, container: &'a Struct) -> Datum {
+        match &self.inner {
+            None => {
+                let Literal::Primitive(literal) = &container[self.position] else {
+                    panic!("Expected Literal to be Primitive");

Review Comment:
Sure, that's much nicer. Done.
Re: [PR] Add Struct Accessors to BoundReferences [iceberg-rust]
sdd commented on code in PR #317:
URL: https://github.com/apache/iceberg-rust/pull/317#discussion_r1554540767

## crates/iceberg/src/expr/accessor.rs: ##
@@ -0,0 +1,119 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::spec::{Datum, Literal, PrimitiveType, Struct};
+use serde_derive::{Deserialize, Serialize};
+use std::sync::Arc;
+
+#[derive(Debug, Serialize, Deserialize, Clone, PartialEq, Eq)]
+enum InnerOrType {

Review Comment:
Whoops! Done.
[I] Concern about possible consistency issue in HiveCatalog's _commit_table [iceberg-python]
HonahX opened a new issue, #588:
URL: https://github.com/apache/iceberg-python/issues/588

### Question

Currently, the HiveCatalog's `_commit_table` workflow looks like this:
1. load the current table metadata via `load_table`
2. construct the updated metadata
3. lock the hive table
4. alter the hive table
5. unlock the hive table

Suppose there are two processes, A and B, that try to commit changes to the same Iceberg table. The code may happen to execute in the following order:
1. process A loads the current table metadata
2. process A constructs the updated metadata
3. process B starts and finishes the **whole** `_commit_table`
4. process A locks the hive table
5. process A alters the hive table
6. process A unlocks the hive table

In this scenario, both processes commit successfully, because process B releases the lock before A tries to acquire it. But if `alter_table` does not support the [transactional check](https://issues.apache.org/jira/browse/HIVE-26882), the changes made by process B will be overwritten by A, whose update was built from metadata loaded before B's commit.

Since in Python we do not know which Hive version we are connecting to, I wonder if we need to update the code to lock the table before loading the current table metadata, as the [Java implementation](https://github.com/apache/iceberg/blob/main/hive-metastore/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java#L184) does.

BTW, there also seem to be some consistency issues with https://issues.apache.org/jira/browse/HIVE-26882 itself, and there is an open fix for that: https://github.com/apache/hive/pull/5129

Please correct me if I misunderstand something here. Thanks!

cc: @Fokko
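The interleaving described in issue #588 can be reproduced with a toy model. This is only a sketch under stated assumptions: `ToyMetastore`, `commit_load_then_lock`, and `commit_lock_then_load` are hypothetical names invented for illustration, not pyiceberg or Hive APIs, and the in-memory `alter` stands in for an `alter_table` that performs no transactional check.

```python
import threading


class ToyMetastore:
    """Hypothetical in-memory stand-in for a metastore whose alter has no transactional check."""

    def __init__(self):
        self.lock = threading.Lock()
        self.changes = []  # committed changes, standing in for table metadata

    def load(self):
        return list(self.changes)

    def alter(self, new_changes):
        # Blindly overwrites whatever is stored, like an unchecked alter_table.
        self.changes = list(new_changes)


def commit_load_then_lock(store, change, before_lock=None):
    """Ordering from the issue: load, build the update, then lock / alter / unlock."""
    base = store.load()
    updated = base + [change]
    if before_lock is not None:
        before_lock()  # lets another "process" run a whole commit right here
    with store.lock:
        store.alter(updated)


def commit_lock_then_load(store, change):
    """Proposed ordering: take the lock first, then load and alter under it."""
    with store.lock:
        store.alter(store.load() + [change])


# Process A loads, process B commits completely, then A locks and alters.
racy = ToyMetastore()
commit_load_then_lock(racy, "A", before_lock=lambda: commit_load_then_lock(racy, "B"))
lost_update_result = racy.changes  # B's change has been overwritten by A

# With lock-before-load, two concurrent committers both survive.
safe = ToyMetastore()
workers = [
    threading.Thread(target=commit_lock_then_load, args=(safe, name))
    for name in ("A", "B")
]
for w in workers:
    w.start()
for w in workers:
    w.join()
safe_result = sorted(safe.changes)
```

In the first run only A's change remains, which is the lost update the issue worries about. In the second run the load happens under the lock, so whichever committer goes second always builds on the other's result.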
[I] Suppress duplicate OAuth token fetching in rest catalog client [iceberg-python]
TennyZhuang opened a new issue, #587:
URL: https://github.com/apache/iceberg-python/issues/587

### Feature Request / Improvement

In the rest catalog client, we implemented the OAuth token refresh on top of the retry mechanism. https://github.com/apache/iceberg-python/blob/4148edb5e28ae88024a55e0b112238e65b873957/pyiceberg/catalog/rest.py#L126-L128

If there are concurrent calls to the rest client API, they may all fail at the same time, and `_refresh_token` may then be called thousands of times. The behavior is likely acceptable and raises no exceptions, but it is wasteful and not what one would expect.

A potential solution is to introduce a `threading.Lock` to protect the fetching process. Every call to `_refresh_token` should acquire the `Lock` first, and then check whether the current token is still the same as the expired token before fetching a new one.
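A minimal sketch of the lock-and-compare idea proposed above, assuming a hypothetical `TokenHolder` class and a fake `fake_fetch` callable in place of the real OAuth request. Neither name exists in pyiceberg; the real `_refresh_token` has its own signature and state.

```python
import threading


class TokenHolder:
    """Hypothetical holder that deduplicates concurrent token refreshes."""

    def __init__(self, fetch_token):
        self._fetch_token = fetch_token  # assumed callable that requests a fresh token
        self._lock = threading.Lock()
        self.token = fetch_token()

    def refresh(self, expired_token):
        # Every caller takes the lock, but only the first one still sees the
        # expired token; later callers find it already replaced and skip the fetch.
        with self._lock:
            if self.token == expired_token:
                self.token = self._fetch_token()
            return self.token


calls = []


def fake_fetch():
    # Stand-in for the network round trip; records how often it runs.
    calls.append(1)
    return f"token-{len(calls)}"


holder = TokenHolder(fake_fetch)
stale = holder.token

# Fifty requests fail at once and all ask for a refresh of the same stale token.
threads = [threading.Thread(target=holder.refresh, args=(stale,)) for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()

fetch_count = len(calls)
final_token = holder.token
```

Passing the expired token into `refresh` is what makes the check possible: a caller that lost the race can tell that someone else has already replaced the token and reuse it instead of fetching again. Here the fetch runs twice in total, once at construction and once for the refresh.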