[I] Getting java.lang.IllegalArgumentException while create Table using a S3 URI with Curly Brace [iceberg]

2025-09-02 Thread via GitHub
fivetran-krunalpande opened a new issue, #13982: URL: https://github.com/apache/iceberg/issues/13982 ### Apache Iceberg version 1.7.1 ### Query engine None ### Please describe the bug 🐞 Getting this error: java.lang.IllegalArgumentException: Illegal chara

Re: [PR] feat: avro schema add sanitize field name [iceberg-cpp]

2025-09-02 Thread via GitHub
wgtmac commented on PR #190: URL: https://github.com/apache/iceberg-cpp/pull/190#issuecomment-3247702773 Let me know when it is ready to review. Thanks! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

Re: [PR] Bump Avro to 0.20 [iceberg-rust]

2025-09-02 Thread via GitHub
Fokko commented on PR #1644: URL: https://github.com/apache/iceberg-rust/pull/1644#issuecomment-3247851737 @kevinjqliu I tried to bump the lower bound in the `Cargo.toml`, but was unsuccessful 😭

Re: [I] `DataScan` `count` method does not respect limit [iceberg-python]

2025-09-02 Thread via GitHub
tushar-choudhary-tc commented on issue #2121: URL: https://github.com/apache/iceberg-python/issues/2121#issuecomment-3247835870 Hi @rashampreet-singh, how can I help you get unblocked?

Re: [I] docs: add a table for data type conversion between arrow and iceberg types [iceberg-python]

2025-09-02 Thread via GitHub
tushar-choudhary-tc commented on issue #2226: URL: https://github.com/apache/iceberg-python/issues/2226#issuecomment-3247797715 I am picking this up today

[PR] Core: Add CountNull Aggregation Support [iceberg]

2025-09-02 Thread via GitHub
jackylee-ch opened a new pull request, #13981: URL: https://github.com/apache/iceberg/pull/13981 We're using `Aggregate` to generate per-column statistics from data files (metrics like value counts, min/max, and null counts) but discovered that null counts (CountNull) aren't currently support

Re: [PR] feat: avro schema add sanitize field name [iceberg-cpp]

2025-09-02 Thread via GitHub
MisterRaindrop commented on PR #190: URL: https://github.com/apache/iceberg-cpp/pull/190#issuecomment-3247788098 > Let me know when it is ready to review. Thanks! You can review, thanks

Re: [PR] Core, Spark: add snapshot properties for snapshot created by wap branches [iceberg]

2025-09-02 Thread via GitHub
yingjianwu98 commented on PR #13922: URL: https://github.com/apache/iceberg/pull/13922#issuecomment-3247732115 Thanks everyone for the review again. I have updated the PR based on suggestions and refactored the code. I have also added some more tests but not sure if this is the best p

[PR] Core: Use safeContainsKey to avoid NPE for CountNonNull [iceberg]

2025-09-02 Thread via GitHub
jackylee-ch opened a new pull request, #13980: URL: https://github.com/apache/iceberg/pull/13980 (no comment)

Re: [PR] Build: Bump Parquet-Java to 1.16.0 [iceberg]

2025-09-02 Thread via GitHub
Fokko commented on code in PR #13941: URL: https://github.com/apache/iceberg/pull/13941#discussion_r2317787821 ## api/src/main/java/org/apache/iceberg/variants/Variant.java: ## @@ -22,6 +22,10 @@ /** A variant metadata and value pair. */ public interface Variant { Review Co

Re: [PR] feat: Introduce ArrowArrayReader factory on FileScanTask [iceberg-cpp]

2025-09-02 Thread via GitHub
wgtmac commented on code in PR #200: URL: https://github.com/apache/iceberg-cpp/pull/200#discussion_r2315587583 ## src/iceberg/table_scan.h: ## @@ -185,4 +169,40 @@ class ICEBERG_EXPORT DataTableScan : public TableScan { Result>> PlanFiles() const override; }; +/// \brief

Re: [I] Add benchmark to the CI [iceberg-python]

2025-09-02 Thread via GitHub
sungwy commented on issue #27: URL: https://github.com/apache/iceberg-python/issues/27#issuecomment-3247724553 No stale

Re: [PR] Build: Bump Parquet-Java to 1.16.0 [iceberg]

2025-09-02 Thread via GitHub
Fokko commented on code in PR #13941: URL: https://github.com/apache/iceberg/pull/13941#discussion_r2317786465 ## spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/data/ParquetWithSparkSchemaVisitor.java: ## @@ -154,11 +155,10 @@ public static T visit(DataType sType, Type

Re: [PR] feat: Introduce ArrowArrayReader factory on FileScanTask [iceberg-cpp]

2025-09-02 Thread via GitHub
wgtmac commented on code in PR #200: URL: https://github.com/apache/iceberg-cpp/pull/200#discussion_r2315589964 ## test/CMakeLists.txt: ## @@ -122,5 +122,6 @@ if(ICEBERG_BUILD_BUNDLE) SOURCES parquet_data_test.cc parque

Re: [PR] feat: Introduce ArrowArrayReader factory on FileScanTask [iceberg-cpp]

2025-09-02 Thread via GitHub
wgtmac commented on code in PR #200: URL: https://github.com/apache/iceberg-cpp/pull/200#discussion_r2315581021 ## test/parquet_test.cc: ## @@ -28,7 +28,7 @@ #include #include "iceberg/arrow/arrow_fs_file_io_internal.h" -#include "iceberg/parquet/parquet_reader.h" +#include

Re: [PR] Test, Spark: Improve the Speed of Rewrite Tests [iceberg]

2025-09-02 Thread via GitHub
RussellSpitzer commented on PR #13947: URL: https://github.com/apache/iceberg/pull/13947#issuecomment-3246216275 Thanks @huaxingao for the review!

Re: [PR] feat: implement basic parquet writer and add roundtrip tests [iceberg-cpp]

2025-09-02 Thread via GitHub
HuaHuaY commented on code in PR #198: URL: https://github.com/apache/iceberg-cpp/pull/198#discussion_r2315530503 ## test/parquet_test.cc: ## @@ -17,38 +17,121 @@ * under the License. */ +#include + #include #include #include #include #include +#include #includ

Re: [PR] Arrow: Close child allocators [iceberg]

2025-09-02 Thread via GitHub
RussellSpitzer commented on PR #13976: URL: https://github.com/apache/iceberg/pull/13976#issuecomment-3246250205 For testing though I wonder if we should have it set up where each suite starts its own Root and closes it, just for testing?

Re: [I] Schema update fails when dropping column with highest field ID [iceberg]

2025-09-02 Thread via GitHub
fivetran-kostaszoumpatianos commented on issue #13850: URL: https://github.com/apache/iceberg/issues/13850#issuecomment-3247630202 According to the spec `last-column-id` is ```An integer; the highest assigned column ID for the table. This is used to ensure columns are always assigned an unu
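The quoted definition of `last-column-id` can be illustrated with a small toy model. This is only a sketch of one reading of the quote, namely that the counter only grows so that a dropped ID is not handed out again; `ToySchema` and its methods are hypothetical names for illustration and are not Iceberg classes or APIs.

```python
from dataclasses import dataclass, field


@dataclass
class ToySchema:
    """Hypothetical stand-in for table metadata; not an Iceberg class."""

    columns: dict = field(default_factory=dict)  # field ID -> column name
    last_column_id: int = 0  # highest ID ever assigned (assumed never to decrease)

    def add_column(self, name: str) -> int:
        # Always allocate from the counter, never from the current maximum ID.
        self.last_column_id += 1
        self.columns[self.last_column_id] = name
        return self.last_column_id

    def drop_column(self, field_id: int) -> None:
        # Dropping removes the column but leaves last_column_id untouched.
        del self.columns[field_id]


schema = ToySchema()
schema.add_column("id")       # gets field ID 1
schema.add_column("payload")  # gets field ID 2
schema.drop_column(2)         # drop the column with the highest field ID
new_id = schema.add_column("payload_v2")
print(new_id, schema.last_column_id, sorted(schema.columns))
```

In this model the counter stays at 2 after the drop even though the highest live field ID is 1, so the next column receives a fresh ID rather than reusing the dropped one.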

Re: [PR] Spark 4.0: Refactor Spark procedures to consistently use ProcedureInput for parameter handling. [iceberg]

2025-09-02 Thread via GitHub
slfan1989 commented on code in PR #13913: URL: https://github.com/apache/iceberg/pull/13913#discussion_r2317559815 ## spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/procedures/ExpireSnapshotsProcedure.java: ## @@ -104,26 +118,28 @@ public ProcedureParameter[] parameters

Re: [PR] Build: Bump mypy-boto3-dynamodb from 1.40.14 to 1.40.20 [iceberg-python]

2025-09-02 Thread via GitHub
kevinjqliu merged PR #2419: URL: https://github.com/apache/iceberg-python/pull/2419

Re: [PR] Build: Bump Parquet-Java to 1.16.0 [iceberg]

2025-09-02 Thread via GitHub
Fokko commented on code in PR #13941: URL: https://github.com/apache/iceberg/pull/13941#discussion_r2316981688 ## spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/data/ParquetWithSparkSchemaVisitor.java: ## @@ -154,11 +155,10 @@ public static T visit(DataType sType, Type

Re: [PR] Build: Bump Parquet-Java to 1.16.0 [iceberg]

2025-09-02 Thread via GitHub
Fokko commented on code in PR #13941: URL: https://github.com/apache/iceberg/pull/13941#discussion_r2316976868 ## spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/data/ParquetWithSparkSchemaVisitor.java: ## @@ -154,11 +155,10 @@ public static T visit(DataType sType, Type

Re: [PR] Spark 4.0: Add truncate transform tests (non-SPJ only). [iceberg]

2025-09-02 Thread via GitHub
slfan1989 commented on PR #13907: URL: https://github.com/apache/iceberg/pull/13907#issuecomment-3247616242 Thank you very much to @huaxingao, @singhpk234, and @RussellSpitzer for reviewing this PR and for your time. The comment below is no longer accurate, and I personally think

Re: [I] [epic] add all features of the IRC spec to the IRC reference implementation [iceberg]

2025-09-02 Thread via GitHub
kevinjqliu commented on issue #13707: URL: https://github.com/apache/iceberg/issues/13707#issuecomment-3247613225 Thanks @gaborkaszab you're right! I added it to the missing features section, cheers

Re: [PR] Bump Avro to 0.20.0 [iceberg-rust]

2025-09-02 Thread via GitHub
Fokko commented on code in PR #1644: URL: https://github.com/apache/iceberg-rust/pull/1644#discussion_r2314570970 ## Cargo.toml: ## @@ -40,7 +40,7 @@ rust-version = "1.85" [workspace.dependencies] anyhow = "1.0.72" -apache-avro = "0.17" +apache-avro = "0.20.0" Review Commen

Re: [I] Apache Iceberg SPJ Joins [iceberg]

2025-09-02 Thread via GitHub
sezruby commented on issue #13916: URL: https://github.com/apache/iceberg/issues/13916#issuecomment-3243570642 Try setting the configs before writing tables.

Re: [PR] feat: implement endian conversion utilities [iceberg-cpp]

2025-09-02 Thread via GitHub
Fokko merged PR #196: URL: https://github.com/apache/iceberg-cpp/pull/196

Re: [I] commit on expire_snapshot tries to remove snapshot from wrong table. [iceberg-python]

2025-09-02 Thread via GitHub
kevinjqliu commented on issue #2409: URL: https://github.com/apache/iceberg-python/issues/2409#issuecomment-3242902290 i suspect this is due to the stale `table` object passed to `expire_oldest_snapshots`. i added `table = table.refresh()` on the top of `expire_oldest_snapshots` and th
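The stale-handle hypothesis in this comment can be sketched with a toy model. Only the `table = table.refresh()` idiom comes from the comment; `ToyCatalog` and `ToyTable` are hypothetical classes invented for illustration, and the sketch does not reproduce the reported bug or PyIceberg's internals.

```python
class ToyCatalog:
    """Hypothetical in-memory catalog: table name -> list of snapshot IDs."""

    def __init__(self):
        self.snapshots = {}


class ToyTable:
    """Hypothetical table handle that caches snapshot IDs when it is loaded."""

    def __init__(self, catalog, name):
        self.catalog = catalog
        self.name = name
        self.snapshot_ids = list(catalog.snapshots[name])

    def refresh(self):
        # Re-read the current state from the catalog and return the handle.
        self.snapshot_ids = list(self.catalog.snapshots[self.name])
        return self


catalog = ToyCatalog()
catalog.snapshots["db.events"] = [1, 2]
table = ToyTable(catalog, "db.events")

# A commit made elsewhere adds a snapshot after the handle was loaded.
catalog.snapshots["db.events"].append(3)

stale_view = list(table.snapshot_ids)  # what the cached handle still believes
table = table.refresh()                # the idiom quoted in the comment
fresh_view = list(table.snapshot_ids)
print(stale_view, fresh_view)
```

The point of the sketch is only that any decision made from the cached view (for example, which snapshot is oldest) can be wrong until the handle is refreshed.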

Re: [PR] Test: Upgrade to Parquet 1.16.0rc [DO NOT MERGE] [iceberg]

2025-09-02 Thread via GitHub
aihuaxu commented on PR #13971: URL: https://github.com/apache/iceberg/pull/13971#issuecomment-3243270934 It's already tested in https://github.com/apache/iceberg/pull/13941.

Re: [PR] feat(datafusion): implement the partitioning node for DataFusion to define the partitioning [iceberg-rust]

2025-09-02 Thread via GitHub
fvaleye commented on code in PR #1620: URL: https://github.com/apache/iceberg-rust/pull/1620#discussion_r2315652561 ## crates/integrations/datafusion/src/physical_plan/repartition.rs: ## @@ -0,0 +1,805 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more

[PR] Build: Bump datafusion from 47.0.0 to 49.0.0 [iceberg-python]

2025-09-02 Thread via GitHub
dependabot[bot] opened a new pull request, #2420: URL: https://github.com/apache/iceberg-python/pull/2420 Bumps [datafusion](https://github.com/apache/datafusion-python) from 47.0.0 to 49.0.0. Commits https://github.com/apache/datafusion-python/commit/97a6ac392d6d5b46b97cc1ea1b

[I] Iceberg Connector Confluent: java.lang.ClassNotFoundException: org.apache.iceberg.IcebergBuild [iceberg]

2025-09-02 Thread via GitHub
santurini opened a new issue, #13964: URL: https://github.com/apache/iceberg/issues/13964 I am deploying a confluent stack in Kubernetes which install the Apache Iceberg Connector like this: ``` apiVersion: platform.confluent.io/v1beta1 kind: Connect metadata: name: connect

Re: [PR] chore: Add release automation infrastructure [iceberg-cpp]

2025-09-02 Thread via GitHub
wgtmac commented on code in PR #193: URL: https://github.com/apache/iceberg-cpp/pull/193#discussion_r2313367553 ## .github/workflows/rc.yml: ## @@ -0,0 +1,147 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOT

Re: [PR] feat(table): add fanout partition writer and rolling data writer [iceberg-go]

2025-09-02 Thread via GitHub
zeroshade commented on code in PR #524: URL: https://github.com/apache/iceberg-go/pull/524#discussion_r2314393801 ## manifest.go: ## @@ -1461,33 +1598,114 @@ func mapToAvroColMap[K comparable, V any](m map[K]V) *[]colMap[K, V] { return &out } -func avroPartitionData(

Re: [PR] feat(sqllogictest): Add sqllogictest schedule definition and parsing [iceberg-rust]

2025-09-02 Thread via GitHub
liurenjie1024 merged PR #1630: URL: https://github.com/apache/iceberg-rust/pull/1630

Re: [PR] Build: Bump jackson-bom from 2.19.2 to 2.20.0 and jackson-annotations to 2.20 [iceberg]

2025-09-02 Thread via GitHub
Fokko closed pull request #13961: Build: Bump jackson-bom from 2.19.2 to 2.20.0 and jackson-annotations to 2.20 URL: https://github.com/apache/iceberg/pull/13961

Re: [PR] chore: add release script and github workflow [iceberg-cpp]

2025-09-02 Thread via GitHub
HeartLinked commented on code in PR #193: URL: https://github.com/apache/iceberg-cpp/pull/193#discussion_r2315137488 ## .github/workflows/rc.yml: ## @@ -0,0 +1,144 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See th

Re: [I] Remove MockLocationGenerator and its usages [iceberg-rust]

2025-09-02 Thread via GitHub
CTTY commented on issue #1645: URL: https://github.com/apache/iceberg-rust/issues/1645#issuecomment-3246540061 Hi @vallimangai, please feel free to submit a PR, looking forward to it

Re: [PR] Allow V2 reader to read v1 manifests [iceberg-rust]

2025-09-02 Thread via GitHub
Fokko merged PR #1634: URL: https://github.com/apache/iceberg-rust/pull/1634

Re: [PR] Flink: Enhance DeleteFilesProcessor with Time Metrics and Unit Tests. [iceberg]

2025-09-02 Thread via GitHub
slfan1989 commented on PR #13831: URL: https://github.com/apache/iceberg/pull/13831#issuecomment-3242765238 @pvary Could you please take a look at this PR? Thank you very much!

Re: [PR] Set the ManifestEntryStatus [iceberg-python]

2025-09-02 Thread via GitHub
Fokko commented on PR #2408: URL: https://github.com/apache/iceberg-python/pull/2408#issuecomment-3243111992 Thanks for the suggestion @stevie9868 of adding a test, and after a second look, I believe the code is in a dead branch. Looks like we exclusively write new data with inheritance en

Re: [PR] Parquet: Test out the Parquet-Java 1.16.0 release [iceberg]

2025-09-02 Thread via GitHub
Fokko commented on code in PR #13941: URL: https://github.com/apache/iceberg/pull/13941#discussion_r2316538519 ## build.gradle: ## @@ -120,6 +120,9 @@ allprojects { repositories { mavenCentral() mavenLocal() +maven { + url = uri("https://repository.apache.o

Re: [PR] feat: add find field (by name) support to NestedType [iceberg-cpp]

2025-09-02 Thread via GitHub
HuaHuaY commented on code in PR #194: URL: https://github.com/apache/iceberg-cpp/pull/194#discussion_r2314812003 ## src/iceberg/util/macros.h: ## @@ -19,10 +19,13 @@ #pragma once -#define ICEBERG_RETURN_UNEXPECTED(result) \ - if (!result) [[unlikely]] {

Re: [PR] Azure: Don't fetch credential from endpoint if properties contain a valid credential [iceberg]

2025-09-02 Thread via GitHub
kevinjqliu commented on code in PR #13966: URL: https://github.com/apache/iceberg/pull/13966#discussion_r2316554209 ## azure/src/main/java/org/apache/iceberg/azure/adlsv2/VendedAdlsCredentialProvider.java: ## @@ -78,6 +81,31 @@ Mono credentialForAccount(String storageAccount) {

Re: [PR] feat(writer): Make LocationGenerator partition-aware [iceberg-rust]

2025-09-02 Thread via GitHub
CTTY commented on code in PR #1625: URL: https://github.com/apache/iceberg-rust/pull/1625#discussion_r2314794274 ## crates/iceberg/src/writer/file_writer/location_generator.rs: ## @@ -136,8 +159,17 @@ pub(crate) mod test { } impl LocationGenerator for MockLocationGen

Re: [PR] Reuse vectors across encoding switches in VectorizedArrowReader [iceberg]

2025-09-02 Thread via GitHub
stevenzwu commented on code in PR #13949: URL: https://github.com/apache/iceberg/pull/13949#discussion_r2317038609 ## arrow/src/main/java/org/apache/iceberg/arrow/vectorized/VectorizedArrowReader.java: ## @@ -217,6 +217,10 @@ public VectorHolder read(VectorHolder reuse, int num

Re: [PR] API, Spark 4.0: Add 'skip_file_list' option to RewriteTablePathProcedure for optional file-list generation. [iceberg]

2025-09-02 Thread via GitHub
dramaticlly commented on code in PR #13837: URL: https://github.com/apache/iceberg/pull/13837#discussion_r2316661941 ## spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/actions/RewriteTablePathSparkAction.java: ## @@ -90,12 +90,14 @@ public class RewriteTablePathSparkActi

Re: [PR] [docs] Move third-party integrations to root level of left-hand nav, add more catalogs [iceberg]

2025-09-02 Thread via GitHub
manuzhang commented on code in PR #13753: URL: https://github.com/apache/iceberg/pull/13753#discussion_r2314214853 ## docs/mkdocs.yml: ## @@ -59,40 +58,8 @@ nav: - flink-configuration.md - Kafka Connect: kafka-connect.md - Apache Hive: hive.md -- Third-party

Re: [PR] Core, Spark: add snapshot properties for snapshot created by wap branches [iceberg]

2025-09-02 Thread via GitHub
RussellSpitzer commented on code in PR #13922: URL: https://github.com/apache/iceberg/pull/13922#discussion_r2316714572 ## core/src/main/java/org/apache/iceberg/util/WapUtil.java: ## @@ -39,6 +41,33 @@ public static String publishedWapId(Snapshot snapshot) { : null;

Re: [PR] Spark 4.0: Refactor Spark procedures to consistently use ProcedureInput for parameter handling. [iceberg]

2025-09-02 Thread via GitHub
dramaticlly commented on code in PR #13913: URL: https://github.com/apache/iceberg/pull/13913#discussion_r2317235312 ## spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/procedures/ExpireSnapshotsProcedure.java: ## @@ -104,26 +118,28 @@ public ProcedureParameter[] paramete

Re: [PR] Core, Spark: add snapshot properties for snapshot created by wap branches [iceberg]

2025-09-02 Thread via GitHub
RussellSpitzer commented on code in PR #13922: URL: https://github.com/apache/iceberg/pull/13922#discussion_r2316721061 ## spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/SparkWriteConf.java: ## @@ -461,11 +461,22 @@ public boolean caseSensitive() { .parse();

Re: [PR] chore: add release script and github workflow [iceberg-cpp]

2025-09-02 Thread via GitHub
wgtmac commented on PR #193: URL: https://github.com/apache/iceberg-cpp/pull/193#issuecomment-3247549086 @Fokko @raulcd @zeroshade Do you have more comments on this PR?

Re: [PR] feat: implement basic parquet writer and add roundtrip tests [iceberg-cpp]

2025-09-02 Thread via GitHub
wgtmac commented on code in PR #198: URL: https://github.com/apache/iceberg-cpp/pull/198#discussion_r2317667583 ## src/iceberg/parquet/parquet_writer.cc: ## @@ -0,0 +1,167 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agree

Re: [PR] Core, Spark: add snapshot properties for snapshot created by wap branches [iceberg]

2025-09-02 Thread via GitHub
RussellSpitzer commented on code in PR #13922: URL: https://github.com/apache/iceberg/pull/13922#discussion_r2316740635 ## core/src/main/java/org/apache/iceberg/util/WapUtil.java: ## @@ -39,6 +41,33 @@ public static String publishedWapId(Snapshot snapshot) { : null;

[PR] Build: Bump mkdocs-autorefs from 1.4.2 to 1.4.3 [iceberg-python]

2025-09-02 Thread via GitHub
dependabot[bot] opened a new pull request, #2421: URL: https://github.com/apache/iceberg-python/pull/2421 Bumps [mkdocs-autorefs](https://github.com/mkdocstrings/autorefs) from 1.4.2 to 1.4.3. Release notes Sourced from https://github.com/mkdocstrings/autorefs/releases: mkdocs-aut

Re: [PR] Build: Bump Parquet-Java to 1.16.0 [iceberg]

2025-09-02 Thread via GitHub
aihuaxu commented on code in PR #13941: URL: https://github.com/apache/iceberg/pull/13941#discussion_r2317665752 ## api/src/main/java/org/apache/iceberg/variants/Variant.java: ## @@ -22,6 +22,10 @@ /** A variant metadata and value pair. */ public interface Variant { Review

Re: [PR] Core, Spark: add snapshot properties for snapshot created by wap branches [iceberg]

2025-09-02 Thread via GitHub
RussellSpitzer commented on PR #13922: URL: https://github.com/apache/iceberg/pull/13922#issuecomment-3246218538 I think my big comment here is I have no problem with introducing the wap.branch property, but I don't think we should couple the "staging" logic in the function which sets that

Re: [PR] Build: Bump Parquet-Java to 1.16.0 [iceberg]

2025-09-02 Thread via GitHub
stevenzwu commented on code in PR #13941: URL: https://github.com/apache/iceberg/pull/13941#discussion_r2317652922 ## spark/v4.0/spark/src/test/java/org/apache/iceberg/spark/actions/TestRewriteDataFilesAction.java: ## @@ -2140,8 +2141,10 @@ protected void shouldHaveMultipleFiles

Re: [PR] Spark 4.0: Implement check for destination table location overlap with source table location. [iceberg]

2025-09-02 Thread via GitHub
slfan1989 commented on code in PR #13962: URL: https://github.com/apache/iceberg/pull/13962#discussion_r2317632371 ## spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/actions/SnapshotTableSparkAction.java: ## @@ -222,4 +224,22 @@ public SnapshotTableSparkAction tableLocat

Re: [PR] Build: Bump Parquet-Java to 1.16.0 [iceberg]

2025-09-02 Thread via GitHub
kevinjqliu commented on PR #13941: URL: https://github.com/apache/iceberg/pull/13941#issuecomment-3247461567 push an empty commit to trigger ci, it takes a while :)

Re: [PR] Bump Avro to 0.20 [iceberg-rust]

2025-09-02 Thread via GitHub
kevinjqliu commented on PR #1644: URL: https://github.com/apache/iceberg-rust/pull/1644#issuecomment-3247468529 so do we need to wait for https://github.com/apache/avro-rs/commit/594ab0ac4cbaa59efc424ca801231fd6cc111b09 and a new avro release?

Re: [PR] Build: Bump Parquet-Java to 1.16.0 [iceberg]

2025-09-02 Thread via GitHub
kevinjqliu commented on PR #13941: URL: https://github.com/apache/iceberg/pull/13941#issuecomment-3247459288 1.16 is released! https://lists.apache.org/thread/nf0m7z256gtq16m3by78mf4w6tpffdqh https://repo.maven.apache.org/maven2/org/apache/parquet/parquet-avro/1.16.0/

Re: [PR] Core: Use time-travel schema when resolving partition spec in scan [iceberg]

2025-09-02 Thread via GitHub
chenjian2664 commented on PR #13301: URL: https://github.com/apache/iceberg/pull/13301#issuecomment-3247459140 @manuzhang Thank you for helping reopen it!

Re: [PR] Spark 4.0: Refactor Spark procedures to consistently use ProcedureInput for parameter handling. [iceberg]

2025-09-02 Thread via GitHub
slfan1989 commented on PR #13913: URL: https://github.com/apache/iceberg/pull/13913#issuecomment-3247452948 @dramaticlly Thank you very much for reviewing the code and for your valuable suggestions! I have adopted most of your suggestions. Regarding the suggestion to modify the field types,

Re: [PR] Spark 4.0: Refactor Spark procedures to consistently use ProcedureInput for parameter handling. [iceberg]

2025-09-02 Thread via GitHub
slfan1989 commented on code in PR #13913: URL: https://github.com/apache/iceberg/pull/13913#discussion_r2317568128 ## spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/procedures/ExpireSnapshotsProcedure.java: ## @@ -104,26 +118,28 @@ public ProcedureParameter[] parameters

Re: [PR] Spark 4.0: Refactor Spark procedures to consistently use ProcedureInput for parameter handling. [iceberg]

2025-09-02 Thread via GitHub
slfan1989 commented on code in PR #13913: URL: https://github.com/apache/iceberg/pull/13913#discussion_r2317558726 ## spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/procedures/CherrypickSnapshotProcedure.java: ## @@ -83,8 +85,10 @@ public ProcedureParameter[] parameters

Re: [PR] Spark 4.0: Implement check for destination table location overlap with source table location. [iceberg]

2025-09-02 Thread via GitHub
huaxingao commented on code in PR #13962: URL: https://github.com/apache/iceberg/pull/13962#discussion_r2317344043 ## spark/v4.0/spark/src/test/java/org/apache/iceberg/spark/actions/TestSnapshotTableAction.java: ## @@ -65,4 +68,81 @@ public void testSnapshotWithParallelTasks() t

Re: [PR] Partition Writer Support Part 1: add partition splitter [iceberg-rust]

2025-09-02 Thread via GitHub
liurenjie1024 commented on code in PR #1040: URL: https://github.com/apache/iceberg-rust/pull/1040#discussion_r2315618104 ## crates/iceberg/src/arrow/record_batch_partition_splitter.rs: ## @@ -0,0 +1,298 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or mo

Re: [I] Add properties support for HadoopTables.load() [iceberg]

2025-09-02 Thread via GitHub
github-actions[bot] closed issue #12251: Add properties support for HadoopTables.load() URL: https://github.com/apache/iceberg/issues/12251

[PR] Spark, Rest: Reference implementation of referenced-by in the loadTable call [iceberg]

2025-09-02 Thread via GitHub
singhpk234 opened a new pull request, #13979: URL: https://github.com/apache/iceberg/pull/13979 ### About the change This provides a reference implementation for passing the name of the view that references the table as part of the loadTable call. Details on th

Re: [PR] Build: Bump Parquet-Java to 1.16.0 [iceberg]

2025-09-02 Thread via GitHub
stevenzwu commented on code in PR #13941: URL: https://github.com/apache/iceberg/pull/13941#discussion_r2317186940 ## spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/data/ParquetWithSparkSchemaVisitor.java: ## @@ -154,11 +155,10 @@ public static T visit(DataType sType,

Re: [PR] Spark 4.0: Add Support for PartitionStatistics Files in RewriteTablePath. [iceberg]

2025-09-02 Thread via GitHub
huaxingao commented on PR #13956: URL: https://github.com/apache/iceberg/pull/13956#issuecomment-3247238062 also cc @szehon-ho @dramaticlly

Re: [I] Add properties support for HadoopTables.load() [iceberg]

2025-09-02 Thread via GitHub
github-actions[bot] commented on issue #12251: URL: https://github.com/apache/iceberg/issues/12251#issuecomment-3247212447 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [PR] Infra: update how-to-release.md doc on potential multiple staging repositories in a corp network with floating IPs for outbound requests [iceberg]

2025-09-02 Thread via GitHub
stevenzwu commented on PR #13978: URL: https://github.com/apache/iceberg/pull/13978#issuecomment-3247154811 Thanks community member "Peng Cheng" (pan3...@gmail.com) for pointing out the root cause.

Re: [PR] Infra: update how-to-release.md doc on potential multiple staging repositories in a corp network with floating IPs for outbound requests [iceberg]

2025-09-02 Thread via GitHub
stevenzwu commented on PR #13978: URL: https://github.com/apache/iceberg/pull/13978#issuecomment-3247152775 I have run the `stage-binaries.sh` script 3 times at home. The multiple staging repos problem did not occur. The `--no-parallel` gradle option works correctly. This confirms that the floati

Re: [I] PartitionSpec Schema error [iceberg]

2025-09-02 Thread via GitHub
amogh-jahagirdar commented on issue #13945: URL: https://github.com/apache/iceberg/issues/13945#issuecomment-3247072335 @Zhanxiao-Ma I'm not sure I completely followed the issue. Do you have a reproducible example via reference implementation or sequence of SQL for an engine? >When r

Re: [I] Issue with Data Scanning After Using AppendFiles and PartitionedFanoutWriter in Iceberg 1.5.2 [iceberg]

2025-09-02 Thread via GitHub
amogh-jahagirdar commented on issue #13973: URL: https://github.com/apache/iceberg/issues/13973#issuecomment-3247055283 Yeah +1 to @RussellSpitzer, this looks like whatever files were written are invalid.

Re: [PR] Spark 4.0: Implement check for destination table location overlap with source table location. [iceberg]

2025-09-02 Thread via GitHub
huaxingao commented on code in PR #13962: URL: https://github.com/apache/iceberg/pull/13962#discussion_r2317351027 ## spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/actions/SnapshotTableSparkAction.java: ## @@ -124,10 +126,10 @@ private SnapshotTable.Result doExecute()

Re: [PR] feat: Add `rest.Catalog` Option to provide custom `http.RoundTripper` [iceberg-go]

2025-09-02 Thread via GitHub
zeroshade merged PR #552: URL: https://github.com/apache/iceberg-go/pull/552

Re: [I] New rest.Catalog option to allow custom Transport [iceberg-go]

2025-09-02 Thread via GitHub
zeroshade closed issue #551: New rest.Catalog option to allow custom Transport URL: https://github.com/apache/iceberg-go/issues/551

Re: [PR] Spark 4.0: Implement check for destination table location overlap with source table location. [iceberg]

2025-09-02 Thread via GitHub
huaxingao commented on code in PR #13962: URL: https://github.com/apache/iceberg/pull/13962#discussion_r2317342861 ## spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/actions/SnapshotTableSparkAction.java: ## @@ -222,4 +224,22 @@ public SnapshotTableSparkAction tableLocat

Re: [PR] Arrow: Close child allocators [iceberg]

2025-09-02 Thread via GitHub
RussellSpitzer commented on PR #13976: URL: https://github.com/apache/iceberg/pull/13976#issuecomment-3246248370 I think this makes sense to me. I wonder if we should just ignore the root allocator? As long as we close all the Childs we allocate we should be safe

Re: [PR] Backport Parquet encoding tests for Spark 3.5 [iceberg]

2025-09-02 Thread via GitHub
eric-maynard commented on PR #13859: URL: https://github.com/apache/iceberg/pull/13859#issuecomment-3246988136 Hey @kevinjqliu, my mistake, this is really only porting the tests. So you're right that it should not be a blocker.

Re: [PR] Spark 4.0: RewriteTablePath: Update sizes of rewritten manifests in manifest lists [iceberg]

2025-09-02 Thread via GitHub
dramaticlly commented on PR #13720: URL: https://github.com/apache/iceberg/pull/13720#issuecomment-3246875020 > Hi @dramaticlly, just wanted to check in on this. Please let me know if you have any thoughts on the path forward when you get a chance. Thanks! I am going to raise this in

Re: [PR] Build: Bump Parquet-Java to 1.16.0 [iceberg]

2025-09-02 Thread via GitHub
Fokko commented on code in PR #13941: URL: https://github.com/apache/iceberg/pull/13941#discussion_r2317168008 ## spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/data/ParquetWithSparkSchemaVisitor.java: ## @@ -154,11 +155,10 @@ public static T visit(DataType sType, Type

Re: [PR] Core, Spark: add snapshot properties for snapshot created by wap branches [iceberg]

2025-09-02 Thread via GitHub
yingjianwu98 commented on code in PR #13922: URL: https://github.com/apache/iceberg/pull/13922#discussion_r2317090034 ## core/src/main/java/org/apache/iceberg/util/WapUtil.java: ## @@ -39,6 +41,33 @@ public static String publishedWapId(Snapshot snapshot) { : null; }

Re: [PR] Core, Spark: add snapshot properties for snapshot created by wap branches [iceberg]

2025-09-02 Thread via GitHub
yingjianwu98 commented on code in PR #13922: URL: https://github.com/apache/iceberg/pull/13922#discussion_r2317129473 ## core/src/main/java/org/apache/iceberg/util/WapUtil.java: ## @@ -39,6 +41,33 @@ public static String publishedWapId(Snapshot snapshot) { : null; }

Re: [PR] Arrow: Close child allocators [iceberg]

2025-09-02 Thread via GitHub
RussellSpitzer commented on PR #13976: URL: https://github.com/apache/iceberg/pull/13976#issuecomment-3246695299 > > I think this makes sense to me. I wonder if we should just ignore the root allocator? As long as we close all the Childs we allocate we should be safe > > Currently, we

Re: [PR] Arrow: Close child allocators [iceberg]

2025-09-02 Thread via GitHub
RussellSpitzer commented on PR #13976: URL: https://github.com/apache/iceberg/pull/13976#issuecomment-3246720794 > > We should use in both of these places a child allocator no? Or maybe root allocator is also fine, but in that case it is also important to close that too at the right tim

Re: [PR] Core, Spark: add snapshot properties for snapshot created by wap branches [iceberg]

2025-09-02 Thread via GitHub
RussellSpitzer commented on code in PR #13922: URL: https://github.com/apache/iceberg/pull/13922#discussion_r2317110363 ## core/src/main/java/org/apache/iceberg/util/WapUtil.java: ## @@ -39,6 +41,33 @@ public static String publishedWapId(Snapshot snapshot) { : null;

Re: [PR] Arrow: Close child allocators [iceberg]

2025-09-02 Thread via GitHub
nandorKollar commented on PR #13976: URL: https://github.com/apache/iceberg/pull/13976#issuecomment-3246710475 > > > I think this makes sense to me. I wonder if we should just ignore the root allocator? As long as we close all the Childs we allocate we should be safe > > > > > > C

Re: [PR] Core, Spark: add snapshot properties for snapshot created by wap branches [iceberg]

2025-09-02 Thread via GitHub
yingjianwu98 commented on code in PR #13922: URL: https://github.com/apache/iceberg/pull/13922#discussion_r2317092403 ## spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/SparkWriteConf.java: ## @@ -461,11 +461,22 @@ public boolean caseSensitive() { .parse(); }

Re: [I] commit on expire_snapshot tries to remove snapshot from wrong table. [iceberg-python]

2025-09-02 Thread via GitHub
QlikFrederic commented on issue #2409: URL: https://github.com/apache/iceberg-python/issues/2409#issuecomment-3245873409 Hi, adding `table = table.refresh()` doesn't work with the in-memory catalog (then it seems to start searching for a sql catalog instead). So I've adapted the script

Re: [PR] Reuse vectors across encoding switches in VectorizedArrowReader [iceberg]

2025-09-02 Thread via GitHub
RussellSpitzer commented on code in PR #13949: URL: https://github.com/apache/iceberg/pull/13949#discussion_r2317066770 ## arrow/src/main/java/org/apache/iceberg/arrow/vectorized/VectorizedArrowReader.java: ## @@ -217,6 +217,10 @@ public VectorHolder read(VectorHolder reuse, int

Re: [PR] Reuse vectors across encoding switches in VectorizedArrowReader [iceberg]

2025-09-02 Thread via GitHub
stevenzwu commented on code in PR #13949: URL: https://github.com/apache/iceberg/pull/13949#discussion_r2317038609 ## arrow/src/main/java/org/apache/iceberg/arrow/vectorized/VectorizedArrowReader.java: ## @@ -217,6 +217,10 @@ public VectorHolder read(VectorHolder reuse, int num

Re: [PR] [docs] Move third-party integrations to root level of left-hand nav, add more catalogs [iceberg]

2025-09-02 Thread via GitHub
kevinjqliu commented on code in PR #13753: URL: https://github.com/apache/iceberg/pull/13753#discussion_r2316937326 ## site/nav.yml: ## @@ -47,6 +47,42 @@ nav: - Rust: https://rust.iceberg.apache.org/ - Go: https://go.iceberg.apache.org/ - C++: https://githu

Re: [PR] Infra: update how-to-release.md doc on potential multiple staging repositories in a corp network with floating IPs for outbound requests [iceberg]

2025-09-02 Thread via GitHub
stevenzwu commented on PR #13978: URL: https://github.com/apache/iceberg/pull/13978#issuecomment-3246252076 will not merge this until I verified it from my home network.

Re: [PR] Build: Bump Parquet-Java to 1.16.0 [iceberg]

2025-09-02 Thread via GitHub
Fokko commented on code in PR #13941: URL: https://github.com/apache/iceberg/pull/13941#discussion_r2316884196 ## spark/v4.0/spark/src/test/java/org/apache/iceberg/spark/actions/TestRewriteDataFilesAction.java: ## @@ -2140,8 +2141,10 @@ protected void shouldHaveMultipleFiles(Tab
