[PR] refactor(catalog/internal): Improve error handling in WriteTableMetadata and WriteMetadata functions [iceberg-go]

2025-08-20 Thread via GitHub
xixipi-lining opened a new pull request, #541: URL: https://github.com/apache/iceberg-go/pull/541 ### Changes - Updated the `WriteTableMetadata` and `WriteMetadata` functions to use `errors.Join` for better error handling, ensuring that both JSON encoding errors and file close errors

Re: [PR] Materialized View Spec [iceberg]

2025-08-20 Thread via GitHub
JanKaul commented on code in PR #11041: URL: https://github.com/apache/iceberg/pull/11041#discussion_r2289941916 ## format/view-spec.md: ## @@ -42,12 +42,28 @@ An atomic swap of one view metadata file for another provides the basis for maki Writers create view metadata files

Re: [PR] feat(table): add fanout partition writer and rolling data writer [iceberg-go]

2025-08-20 Thread via GitHub
badalprasadsingh commented on code in PR #524: URL: https://github.com/apache/iceberg-go/pull/524#discussion_r2289914455 ## table/rolling_data_writer.go: ## @@ -0,0 +1,199 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreement

Re: [PR] refactor: enhance forward declarations in type_fwd.h to reduce compile dependencies [iceberg-cpp]

2025-08-20 Thread via GitHub
wgtmac commented on code in PR #187: URL: https://github.com/apache/iceberg-cpp/pull/187#discussion_r2289875032 ## src/iceberg/type_fwd.h: ## @@ -127,16 +127,29 @@ struct ManifestList; class ManifestReader; class ManifestListReader; +class ManifestWriter; +class ManifestList

Re: [PR] Bump `markdownlint-cli` in `.pre-commit-config.yaml` file [iceberg-python]

2025-08-20 Thread via GitHub
ayushjariyal commented on PR #2366: URL: https://github.com/apache/iceberg-python/pull/2366#issuecomment-3208977115 @Fokko, could you please review this PR and let me know why updating to `v0.45.0` caused `make lint` to fail -- This is an automated message from the Apache Git Service. To

Re: [PR] Parquet: Fix column pruning for deeply nested fields [iceberg]

2025-08-20 Thread via GitHub
sriharshaj commented on code in PR #12634: URL: https://github.com/apache/iceberg/pull/12634#discussion_r2289426616 ## parquet/src/main/java/org/apache/iceberg/parquet/PruneColumns.java: ## @@ -90,11 +90,11 @@ public Type struct(StructType expected, GroupType struct, List field

[PR] Bump `markdownlint-cli` in `.pre-commit-config.yaml` file [iceberg-python]

2025-08-20 Thread via GitHub
ayushjariyal opened a new pull request, #2366: URL: https://github.com/apache/iceberg-python/pull/2366 issue #2341 Initially, I updated `markdownlint-cli` from `v0.43.0 → v0.45.0`, but running `make lint` failed with errors: https://github.com/user-attachments/assets/73785f76-

Re: [PR] test: add parquet reader test [iceberg-cpp]

2025-08-20 Thread via GitHub
Xuanwo merged PR #184: URL: https://github.com/apache/iceberg-cpp/pull/184

Re: [I] Cannot migrate a bucketed table with Spark [iceberg]

2025-08-20 Thread via GitHub
lirui-apache commented on issue #13869: URL: https://github.com/apache/iceberg/issues/13869#issuecomment-3208968393 I think silently ignoring the bucketing can be a surprise to users. So I prefer an explicit check and error message. If users don't care about the bucketing, they can easily dro

[PR] refactor: enhance forward declarations in type_fwd.h to reduce compile dependencies [iceberg-cpp]

2025-08-20 Thread via GitHub
HeartLinked opened a new pull request, #187: URL: https://github.com/apache/iceberg-cpp/pull/187 Enhanced the forward declaration system in `type_fwd.h` to reduce compilation dependencies and improve build times. ## Changes Made - Added missing forward declarations for commonly use

Re: [PR] feat(inspect): `refs` [iceberg-rust]

2025-08-20 Thread via GitHub
emkornfield commented on code in PR #1489: URL: https://github.com/apache/iceberg-rust/pull/1489#discussion_r2289789274 ## crates/iceberg/src/inspect/refs.rs: ## @@ -0,0 +1,192 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agre

Re: [PR] feat(inspect): `refs` [iceberg-rust]

2025-08-20 Thread via GitHub
emkornfield commented on code in PR #1489: URL: https://github.com/apache/iceberg-rust/pull/1489#discussion_r2289786824 ## crates/iceberg/src/inspect/refs.rs: ## @@ -0,0 +1,192 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agre

Re: [PR] feat(inspect): `refs` [iceberg-rust]

2025-08-20 Thread via GitHub
emkornfield commented on code in PR #1489: URL: https://github.com/apache/iceberg-rust/pull/1489#discussion_r2289783890 ## crates/iceberg/src/inspect/refs.rs: ## @@ -0,0 +1,192 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agre

Re: [I] EPIC: Support parallel scan in iceberg-datafusion [iceberg-rust]

2025-08-20 Thread via GitHub
liurenjie1024 closed issue #1604: EPIC: Support parallel scan in iceberg-datafusion URL: https://github.com/apache/iceberg-rust/issues/1604

Re: [PR] feat: add manifest&manifest list writer [iceberg-cpp]

2025-08-20 Thread via GitHub
wgtmac commented on code in PR #176: URL: https://github.com/apache/iceberg-cpp/pull/176#discussion_r2289776405 ## src/iceberg/manifest_adapter.h: ## @@ -0,0 +1,66 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements.

Re: [PR] fix: Add support for unsigned Arrow datatypes in schema conversion [iceberg-rust]

2025-08-20 Thread via GitHub
emkornfield commented on code in PR #1617: URL: https://github.com/apache/iceberg-rust/pull/1617#discussion_r2289776451 ## crates/iceberg/src/arrow/schema.rs: ## @@ -1717,6 +1734,60 @@ mod tests { } } +#[test] Review Comment: this probably isn't the right

Re: [PR] fix: Add support for unsigned Arrow datatypes in schema conversion [iceberg-rust]

2025-08-20 Thread via GitHub
emkornfield commented on code in PR #1617: URL: https://github.com/apache/iceberg-rust/pull/1617#discussion_r2289775979 ## crates/iceberg/src/arrow/schema.rs: ## @@ -1717,6 +1734,60 @@ mod tests { } } +#[test] +fn test_unsigned_type_casting() { +/

Re: [PR] fix: Add support for unsigned Arrow datatypes in schema conversion [iceberg-rust]

2025-08-20 Thread via GitHub
emkornfield commented on code in PR #1617: URL: https://github.com/apache/iceberg-rust/pull/1617#discussion_r2289775400 ## crates/iceberg/src/arrow/schema.rs: ## @@ -1717,6 +1734,60 @@ mod tests { } } +#[test] +fn test_unsigned_type_casting() { +/

Re: [PR] fix: Add support for unsigned Arrow datatypes in schema conversion [iceberg-rust]

2025-08-20 Thread via GitHub
emkornfield commented on code in PR #1617: URL: https://github.com/apache/iceberg-rust/pull/1617#discussion_r2289774748 ## crates/iceberg/src/arrow/schema.rs: ## @@ -378,7 +378,24 @@ impl ArrowSchemaVisitor for ArrowSchemaConverter { DataType::Int8 | DataType::Int16

Re: [PR] fix: Add support for unsigned Arrow datatypes in schema conversion [iceberg-rust]

2025-08-20 Thread via GitHub
emkornfield commented on code in PR #1617: URL: https://github.com/apache/iceberg-rust/pull/1617#discussion_r2289774275 ## crates/iceberg/src/arrow/schema.rs: ## @@ -378,7 +378,24 @@ impl ArrowSchemaVisitor for ArrowSchemaConverter { DataType::Int8 | DataType::Int16

Re: [I] Undetermined behavior when fetching from iceberg table [iceberg]

2025-08-20 Thread via GitHub
liurenjie1024 commented on issue #13873: URL: https://github.com/apache/iceberg/issues/13873#issuecomment-3208874111 I've updated the data file and issue description to reflect latest changes, now it only contains several files and deletion files, but the error still happens.

Re: [PR] feat(type): add insensitive way to find schemafield & test [iceberg-cpp]

2025-08-20 Thread via GitHub
wgtmac commented on code in PR #180: URL: https://github.com/apache/iceberg-cpp/pull/180#discussion_r2284108623 ## src/iceberg/type.h: ## @@ -24,14 +24,20 @@ /// iceberg/type_fwd.h for the enum defining the list of types. #include +#include +#include #include +#include

Re: [I] Undetermined behavior when fetching from iceberg table [iceberg]

2025-08-20 Thread via GitHub
liurenjie1024 commented on issue #13873: URL: https://github.com/apache/iceberg/issues/13873#issuecomment-3208691665 > So I loaded this table and I got a completely different number of rows - 2048L It was also stable. I tested this by loading in an integration test within the Iceberg repo

[PR] Spark 4.0: Fix source location in stats file copy plan in RewriteTablePathSparkAction [iceberg]

2025-08-20 Thread via GitHub
anuragmantri opened a new pull request, #13881: URL: https://github.com/apache/iceberg/pull/13881 The `statsFileCopyPlan()` incorrectly uses staging directory instead of source directory for stats files. This PR fixes this.

Re: [PR] Spark 4.0: Fix source location in stats file copy plan in RewriteTablePathSparkAction [iceberg]

2025-08-20 Thread via GitHub
anuragmantri commented on PR #13881: URL: https://github.com/apache/iceberg/pull/13881#issuecomment-3208668490 @dramaticlly @szehon-ho - Could you take a look please?

Re: [PR] feat(datafusion): implement the project node to add the partition columns [iceberg-rust]

2025-08-20 Thread via GitHub
CTTY commented on code in PR #1602: URL: https://github.com/apache/iceberg-rust/pull/1602#discussion_r2289641476 ## crates/integrations/datafusion/src/physical_plan/project.rs: ## @@ -0,0 +1,661 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contri

Re: [PR] AWS: Support similar S3 Sync Client configurations for S3 Async Clients [iceberg]

2025-08-20 Thread via GitHub
guizmaii commented on PR #13387: URL: https://github.com/apache/iceberg/pull/13387#issuecomment-3208588702 not stale

Re: [PR] partition field names validation against schema field conflicts [iceberg-python]

2025-08-20 Thread via GitHub
Fokko commented on PR #2305: URL: https://github.com/apache/iceberg-python/pull/2305#issuecomment-3208019497 Let's move this forward, thanks @rutb327 for working on this, and thanks @kevinjqliu and @dingo4dev for the review 🙌

Re: [PR] Parquet: Fix column pruning for deeply nested fields [iceberg]

2025-08-20 Thread via GitHub
amogh-jahagirdar commented on code in PR #12634: URL: https://github.com/apache/iceberg/pull/12634#discussion_r2289600342 ## parquet/src/main/java/org/apache/iceberg/parquet/PruneColumns.java: ## @@ -90,11 +90,11 @@ public Type struct(StructType expected, GroupType struct, List

Re: [PR] fix: Add support for unsigned Arrow datatypes in schema conversion [iceberg-rust]

2025-08-20 Thread via GitHub
GeetKrishna commented on PR #1617: URL: https://github.com/apache/iceberg-rust/pull/1617#issuecomment-3208540482 @emkornfield Right, with the current approach, it has potential for silent data corruption because of Arrow's doc field dependency. @CTTY Thanks for the references, I will use th

Re: [PR] Docs: Update supported Flink versions [iceberg]

2025-08-20 Thread via GitHub
github-actions[bot] commented on PR #13611: URL: https://github.com/apache/iceberg/pull/13611#issuecomment-3208520216 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [PR] iceberg kafka metrics reporter [iceberg]

2025-08-20 Thread via GitHub
github-actions[bot] commented on PR #13291: URL: https://github.com/apache/iceberg/pull/13291#issuecomment-3208520014 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [PR] Flink: Make dynamic Iceberg sink agnostic to user types [iceberg]

2025-08-20 Thread via GitHub
github-actions[bot] commented on PR #13394: URL: https://github.com/apache/iceberg/pull/13394#issuecomment-3208520108 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [PR] AWS: Support similar S3 Sync Client configurations for S3 Async Clients [iceberg]

2025-08-20 Thread via GitHub
github-actions[bot] commented on PR #13387: URL: https://github.com/apache/iceberg/pull/13387#issuecomment-3208520053 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [I] Cherrypick the data rows [deleted or old values] from a past snapshot [iceberg]

2025-08-20 Thread via GitHub
github-actions[bot] commented on issue #12271: URL: https://github.com/apache/iceberg/issues/12271#issuecomment-3208519842 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [I] TestS3FileIO fails locally (on OSX with Docker Desktop) due to missing Content-MD5 header during delete [iceberg]

2025-08-20 Thread via GitHub
github-actions[bot] commented on issue #12237: URL: https://github.com/apache/iceberg/issues/12237#issuecomment-3208519736 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [PR] Kafka Connect: Add option to fail connector task after max retries [iceberg]

2025-08-20 Thread via GitHub
github-actions[bot] commented on PR #13268: URL: https://github.com/apache/iceberg/pull/13268#issuecomment-3208519978 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [I] Allow to configure the tables' namespace when using dynamic routing with Kafka Connect [iceberg]

2025-08-20 Thread via GitHub
github-actions[bot] commented on issue #12269: URL: https://github.com/apache/iceberg/issues/12269#issuecomment-3208519799 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [I] Support row filter & column masking in REST spec [iceberg]

2025-08-20 Thread via GitHub
github-actions[bot] commented on issue #10909: URL: https://github.com/apache/iceberg/issues/10909#issuecomment-3208519519 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [PR] fix: Add support for unsigned Arrow datatypes in schema conversion [iceberg-rust]

2025-08-20 Thread via GitHub
CTTY commented on PR #1617: URL: https://github.com/apache/iceberg-rust/pull/1617#issuecomment-3208490502 I have the same concern as @emkornfield, using `doc` to determine field type seems unsafe to me. I think casting the type should be fine. This way there would be type loss when convert

Re: [PR] Spark: Read/Write `UnknownType` [iceberg]

2025-08-20 Thread via GitHub
rdblue commented on code in PR #13445: URL: https://github.com/apache/iceberg/pull/13445#discussion_r2289533392 ## spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/data/SparkParquetWriters.java: ## @@ -79,6 +84,123 @@ public static ParquetValueWriter buildWriter(StructT

Re: [PR] Spark: Read/Write `UnknownType` [iceberg]

2025-08-20 Thread via GitHub
rdblue commented on code in PR #13445: URL: https://github.com/apache/iceberg/pull/13445#discussion_r2289516194 ## spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/TypeToSparkType.java: ## @@ -130,9 +131,10 @@ public DataType primitive(Type.PrimitiveType primitive) {

Re: [PR] [SPEC] Add FGAC enforcement instructions as part of loadTable [iceberg]

2025-08-20 Thread via GitHub
flyrain commented on code in PR #13879: URL: https://github.com/apache/iceberg/pull/13879#discussion_r2289505934 ## open-api/rest-catalog-open-api.yaml: ## @@ -3260,6 +3260,28 @@ components: additionalProperties: type: string +FineGrainedDataProtect

Re: [PR] Spark: Read/Write `UnknownType` [iceberg]

2025-08-20 Thread via GitHub
rdblue commented on code in PR #13445: URL: https://github.com/apache/iceberg/pull/13445#discussion_r2289504978 ## core/src/test/java/org/apache/iceberg/data/DataTestBase.java: ## @@ -131,6 +131,7 @@ protected boolean allowsWritingNullValuesForRequiredFields() { Types.

Re: [PR] [SPEC] Add FGAC enforcement instructions as part of loadTable [iceberg]

2025-08-20 Thread via GitHub
singhpk234 commented on code in PR #13879: URL: https://github.com/apache/iceberg/pull/13879#discussion_r2289495307 ## open-api/rest-catalog-open-api.yaml: ## @@ -3260,6 +3260,28 @@ components: additionalProperties: type: string +FineGrainedDataProt

Re: [PR] Flink custom source parallelism [iceberg]

2025-08-20 Thread via GitHub
swapna267 commented on code in PR #13878: URL: https://github.com/apache/iceberg/pull/13878#discussion_r2289493607 ## flink/v2.0/flink/src/test/java/org/apache/iceberg/flink/source/TestStreamScanSql.java: ## @@ -431,4 +435,36 @@ public void testConsumeFromStartTag() throws Excep

Re: [PR] Flink custom source parallelism [iceberg]

2025-08-20 Thread via GitHub
swapna267 commented on code in PR #13878: URL: https://github.com/apache/iceberg/pull/13878#discussion_r2289494481 ## flink/v2.0/flink/src/test/java/org/apache/iceberg/flink/source/TestStreamScanSql.java: ## @@ -431,4 +435,36 @@ public void testConsumeFromStartTag() throws Excep

Re: [PR] [SPEC] Add FGAC enforcement instructions as part of loadTable [iceberg]

2025-08-20 Thread via GitHub
amogh-jahagirdar commented on code in PR #13879: URL: https://github.com/apache/iceberg/pull/13879#discussion_r2289474477 ## open-api/rest-catalog-open-api.yaml: ## @@ -3260,6 +3260,28 @@ components: additionalProperties: type: string +FineGrainedDa

Re: [PR] [SPEC] Add FGAC enforcement instructions as part of loadTable [iceberg]

2025-08-20 Thread via GitHub
rdblue commented on code in PR #13879: URL: https://github.com/apache/iceberg/pull/13879#discussion_r2289491582 ## open-api/rest-catalog-open-api.yaml: ## @@ -3260,6 +3260,28 @@ components: additionalProperties: type: string +FineGrainedDataProtecti

Re: [PR] [SPEC] Add FGAC enforcement instructions as part of loadTable [iceberg]

2025-08-20 Thread via GitHub
rdblue commented on code in PR #13879: URL: https://github.com/apache/iceberg/pull/13879#discussion_r2289491058 ## open-api/rest-catalog-open-api.yaml: ## @@ -3260,6 +3260,28 @@ components: additionalProperties: type: string +FineGrainedDataProtecti

Re: [PR] [SPEC] Add FGAC enforcement instructions as part of loadTable [iceberg]

2025-08-20 Thread via GitHub
amogh-jahagirdar commented on code in PR #13879: URL: https://github.com/apache/iceberg/pull/13879#discussion_r2289483359 ## open-api/rest-catalog-open-api.yaml: ## @@ -3260,6 +3260,28 @@ components: additionalProperties: type: string +FineGrainedDa

Re: [PR] [SPEC] Add FGAC enforcement instructions as part of loadTable [iceberg]

2025-08-20 Thread via GitHub
singhpk234 commented on code in PR #13879: URL: https://github.com/apache/iceberg/pull/13879#discussion_r2289483532 ## open-api/rest-catalog-open-api.yaml: ## @@ -3260,6 +3260,28 @@ components: additionalProperties: type: string +FineGrainedDataProt

Re: [PR] Data, Flink, Spark: Use TestHelpers for FormatVersion [iceberg]

2025-08-20 Thread via GitHub
RussellSpitzer commented on code in PR #13880: URL: https://github.com/apache/iceberg/pull/13880#discussion_r2289460446 ## spark/v3.4/spark/src/test/java/org/apache/iceberg/spark/source/TestSparkReaderDeletes.java: ## @@ -107,14 +107,16 @@ public class TestSparkReaderDeletes ext

Re: [PR] [SPEC] Add FGAC enforcement instructions as part of loadTable [iceberg]

2025-08-20 Thread via GitHub
rdblue commented on code in PR #13879: URL: https://github.com/apache/iceberg/pull/13879#discussion_r2289478588 ## open-api/rest-catalog-open-api.yaml: ## @@ -3260,6 +3260,28 @@ components: additionalProperties: type: string +FineGrainedDataProtecti

Re: [PR] [SPEC] Add FGAC enforcement instructions as part of loadTable [iceberg]

2025-08-20 Thread via GitHub
rdblue commented on code in PR #13879: URL: https://github.com/apache/iceberg/pull/13879#discussion_r2289477353 ## open-api/rest-catalog-open-api.yaml: ## @@ -3260,6 +3260,28 @@ components: additionalProperties: type: string +FineGrainedDataProtecti

Re: [PR] [SPEC] Add FGAC enforcement instructions as part of loadTable [iceberg]

2025-08-20 Thread via GitHub
rdblue commented on code in PR #13879: URL: https://github.com/apache/iceberg/pull/13879#discussion_r2289475105 ## open-api/rest-catalog-open-api.yaml: ## @@ -3260,6 +3260,28 @@ components: additionalProperties: type: string +FineGrainedDataProtecti

Re: [PR] [SPEC] Add FGAC enforcement instructions as part of loadTable [iceberg]

2025-08-20 Thread via GitHub
rdblue commented on code in PR #13879: URL: https://github.com/apache/iceberg/pull/13879#discussion_r2289473856 ## open-api/rest-catalog-open-api.yaml: ## @@ -3260,6 +3260,28 @@ components: additionalProperties: type: string +FineGrainedDataProtecti

Re: [PR] [SPEC] Add FGAC enforcement instructions as part of loadTable [iceberg]

2025-08-20 Thread via GitHub
rdblue commented on code in PR #13879: URL: https://github.com/apache/iceberg/pull/13879#discussion_r2289476968 ## open-api/rest-catalog-open-api.yaml: ## @@ -3260,6 +3260,28 @@ components: additionalProperties: type: string +FineGrainedDataProtecti

Re: [PR] feat(catalog): Implement update_table for GlueCatalog [iceberg-rust]

2025-08-20 Thread via GitHub
CTTY commented on code in PR #1584: URL: https://github.com/apache/iceberg-rust/pull/1584#discussion_r2289456231 ## crates/catalog/glue/src/catalog.rs: ## @@ -635,10 +637,56 @@ impl Catalog for GlueCatalog { )) } -async fn update_table(&self, _commit: TableCo

Re: [PR] [SPEC] Add FGAC enforcement instructions as part of loadTable [iceberg]

2025-08-20 Thread via GitHub
rdblue commented on code in PR #13879: URL: https://github.com/apache/iceberg/pull/13879#discussion_r2289466825 ## open-api/rest-catalog-open-api.yaml: ## @@ -3319,6 +3341,8 @@ components: type: array items: $ref: '#/components/schemas/StorageC

Re: [PR] feat(catalog): Implement update_table for GlueCatalog [iceberg-rust]

2025-08-20 Thread via GitHub
emkornfield commented on code in PR #1584: URL: https://github.com/apache/iceberg-rust/pull/1584#discussion_r2289461776 ## crates/catalog/glue/tests/glue_catalog_test.rs: ## @@ -367,3 +368,69 @@ async fn test_list_namespace() -> Result<()> { Ok(()) } + +#[tokio::test] +a

Re: [PR] Data, Flink, Spark: Use TestHelpers for FormatVersion [iceberg]

2025-08-20 Thread via GitHub
RussellSpitzer commented on code in PR #13880: URL: https://github.com/apache/iceberg/pull/13880#discussion_r2289458961 ## flink/v1.19/flink/src/test/java/org/apache/iceberg/flink/sink/TestIcebergCommitter.java: ## @@ -157,7 +158,7 @@ class TestIcebergCommitter extends TestBase

Re: [PR] Data, Flink, Spark: Use TestHelpers for FormatVersion [iceberg]

2025-08-20 Thread via GitHub
RussellSpitzer commented on code in PR #13880: URL: https://github.com/apache/iceberg/pull/13880#discussion_r2289459254 ## flink/v1.20/flink/src/test/java/org/apache/iceberg/flink/actions/TestRewriteDataFilesAction.java: ## @@ -99,7 +100,7 @@ public static List parameters() {

[PR] Data, Flink, Spark: Use TestHelpers for FormatVersion [iceberg]

2025-08-20 Thread via GitHub
RussellSpitzer opened a new pull request, #13880: URL: https://github.com/apache/iceberg/pull/13880 While I was working on V4 Parquet Manifests I found a bunch of test classes that do not properly parameterize formatVersion and don't run on V4. I've fixed all the ones I could find.

Re: [PR] feat(catalog): Implement update_table for GlueCatalog [iceberg-rust]

2025-08-20 Thread via GitHub
CTTY commented on code in PR #1584: URL: https://github.com/apache/iceberg-rust/pull/1584#discussion_r2289458683 ## crates/catalog/glue/tests/glue_catalog_test.rs: ## @@ -367,3 +368,69 @@ async fn test_list_namespace() -> Result<()> { Ok(()) } + +#[tokio::test] +async fn

Re: [PR] Parquet: Fix column pruning for deeply nested fields [iceberg]

2025-08-20 Thread via GitHub
sriharshaj commented on code in PR #12634: URL: https://github.com/apache/iceberg/pull/12634#discussion_r2289456813 ## parquet/src/main/java/org/apache/iceberg/parquet/PruneColumns.java: ## @@ -90,11 +90,11 @@ public Type struct(StructType expected, GroupType struct, List field

Re: [PR] feat(catalog): Implement catalog loader for hms [iceberg-rust]

2025-08-20 Thread via GitHub
lliangyu-lin commented on code in PR #1612: URL: https://github.com/apache/iceberg-rust/pull/1612#discussion_r2289452490 ## crates/catalog/hms/src/catalog.rs: ## @@ -29,15 +29,110 @@ use iceberg::io::FileIO; use iceberg::spec::{TableMetadata, TableMetadataBuilder}; use iceberg

Re: [PR] Spark 4.0: RewriteTablePath: Update sizes of rewritten manifests in manifest lists [iceberg]

2025-08-20 Thread via GitHub
vaultah commented on code in PR #13720: URL: https://github.com/apache/iceberg/pull/13720#discussion_r2289426645 ## core/src/main/java/org/apache/iceberg/RewriteTablePathUtil.java: ## @@ -447,6 +551,58 @@ public static RewriteResult rewriteDeleteManifest( } } + /** +

Re: [PR] feat(transaction): Support snapshot validation [iceberg-rust]

2025-08-20 Thread via GitHub
CTTY commented on PR #1353: URL: https://github.com/apache/iceberg-rust/pull/1353#issuecomment-3208269751 This PR is stale for now; I've included a new version of snapshot validation in my big draft #1606. I'll port the changes over once it's ready.

Re: [PR] Spark 3.5, 3.4: Disable executor cache for delete files in RewriteDataFilesSparkAction [iceberg]

2025-08-20 Thread via GitHub
huaxingao commented on PR #13868: URL: https://github.com/apache/iceberg/pull/13868#issuecomment-3208258397 Merged. Thanks @anuragmantri for the PR! Thanks @ebyhr @stevenzwu for the review!

Re: [PR] Spark 3.5, 3.4: Disable executor cache for delete files in RewriteDataFilesSparkAction [iceberg]

2025-08-20 Thread via GitHub
huaxingao merged PR #13868: URL: https://github.com/apache/iceberg/pull/13868

Re: [PR] [SPEC] Add FGAC enforcement instructions as part of loadTable [iceberg]

2025-08-20 Thread via GitHub
singhpk234 commented on PR #13879: URL: https://github.com/apache/iceberg/pull/13879#issuecomment-3208233237 cc @rdblue @RussellSpitzer @flyrain

Re: [I] fix: validate naming conflicts between schema field names and partition field names during schema update and partition spec update [iceberg]

2025-08-20 Thread via GitHub
kevinjqliu commented on issue #13833: URL: https://github.com/apache/iceberg/issues/13833#issuecomment-3208203420 the refactor in #13835 actually caught this bug 👍

Re: [PR] Fix filesystem [iceberg-python]

2025-08-20 Thread via GitHub
Fokko commented on code in PR #2291: URL: https://github.com/apache/iceberg-python/pull/2291#discussion_r2289208284 ## pyiceberg/io/pyarrow.py: ## @@ -381,21 +381,38 @@ def to_input_file(self) -> PyArrowFile: class PyArrowFileIO(FileIO): fs_by_scheme: Callable[[str, Opti

Re: [PR] test: validate partition and schema field name conflicts on schema evolution [iceberg]

2025-08-20 Thread via GitHub
kevinjqliu closed pull request #13834: test: validate partition and schema field name conflicts on schema evolution URL: https://github.com/apache/iceberg/pull/13834

Re: [PR] test: validate partition and schema field name conflicts on schema evolution [iceberg]

2025-08-20 Thread via GitHub
kevinjqliu commented on PR #13834: URL: https://github.com/apache/iceberg/pull/13834#issuecomment-3208192897 Thanks for the review @RussellSpitzer. I'm going to close this and combine it with #13835

Re: [I] Partition field and schema field name conflicts not validated on schema evolution [iceberg]

2025-08-20 Thread via GitHub
kevinjqliu commented on issue #13833: URL: https://github.com/apache/iceberg/issues/13833#issuecomment-3208186526 That's a good one. Here's the call stack for this bug: `updateSpec()` [calls `PartitionSpec.Builder.add()`](https://github.com/apache/iceberg/blob/d5e3a56b3150b0e97adf526735261

Re: [PR] feat(catalog): Implement update_table for GlueCatalog [iceberg-rust]

2025-08-20 Thread via GitHub
emkornfield commented on code in PR #1584: URL: https://github.com/apache/iceberg-rust/pull/1584#discussion_r2289342970 ## crates/catalog/glue/tests/glue_catalog_test.rs: ## @@ -367,3 +368,69 @@ async fn test_list_namespace() -> Result<()> { Ok(()) } + +#[tokio::test] +a

Re: [PR] feat(catalog): Implement update_table for GlueCatalog [iceberg-rust]

2025-08-20 Thread via GitHub
emkornfield commented on code in PR #1584: URL: https://github.com/apache/iceberg-rust/pull/1584#discussion_r2289343440 ## crates/catalog/glue/src/catalog.rs: ## @@ -635,10 +637,56 @@ impl Catalog for GlueCatalog { )) } -async fn update_table(&self, _commit:

Re: [PR] perf: optimize `inspect.partitions` [iceberg-python]

2025-08-20 Thread via GitHub
Fokko commented on PR #2359: URL: https://github.com/apache/iceberg-python/pull/2359#issuecomment-3208173633 Thanks for fixing this @emilie-wang 🙌

Re: [PR] perf: optimize `inspect.partitions` [iceberg-python]

2025-08-20 Thread via GitHub
Fokko merged PR #2359: URL: https://github.com/apache/iceberg-python/pull/2359

Re: [I] docs: add a table for data type conversion between arrow and iceberg types [iceberg-python]

2025-08-20 Thread via GitHub
Fokko commented on issue #2226: URL: https://github.com/apache/iceberg-python/issues/2226#issuecomment-3208172263 The conversion can be found in code here: https://github.com/apache/iceberg-python/blob/5a781df5ac02575745c2dad45496db897372e32f/pyiceberg/io/pyarrow.py#L691-L785 I think

Re: [PR] Document null field handling for PyArrow [iceberg-python]

2025-08-20 Thread via GitHub
Fokko merged PR #2365: URL: https://github.com/apache/iceberg-python/pull/2365

Re: [I] When writing data from a PyArrow DataFrame, how should we handle 'null' Fields? [iceberg-python]

2025-08-20 Thread via GitHub
Fokko closed issue #2119: When writing data from a PyArrow DataFrame, how should we handle 'null' Fields? URL: https://github.com/apache/iceberg-python/issues/2119

Re: [PR] feat(catalog): Implement update_table for GlueCatalog [iceberg-rust]

2025-08-20 Thread via GitHub
emkornfield commented on code in PR #1584: URL: https://github.com/apache/iceberg-rust/pull/1584#discussion_r2289355481 ## crates/catalog/glue/src/catalog.rs: ## @@ -635,10 +637,56 @@ impl Catalog for GlueCatalog { )) } -async fn update_table(&self, _commit:

Re: [PR] feat(catalog): Implement update_table for GlueCatalog [iceberg-rust]

2025-08-20 Thread via GitHub
emkornfield commented on code in PR #1584: URL: https://github.com/apache/iceberg-rust/pull/1584#discussion_r2289354812 ## crates/catalog/glue/src/catalog.rs: ## @@ -635,10 +637,56 @@ impl Catalog for GlueCatalog { )) } -async fn update_table(&self, _commit:

[PR] [SPEC] Add FGAC enforcement instructions as part of loadTable [iceberg]

2025-08-20 Thread via GitHub
singhpk234 opened a new pull request, #13879: URL: https://github.com/apache/iceberg/pull/13879 ### About the proposal This aims to return the policy evaluation result (access decisions) for fine-grained access policies, based on the calling user, as part of the loadTable response.

Re: [PR] perf: optimize `inspect.partitions` [iceberg-python]

2025-08-20 Thread via GitHub
emilie-wang commented on code in PR #2359: URL: https://github.com/apache/iceberg-python/pull/2359#discussion_r2289341474 ## pyiceberg/table/inspect.py: ## @@ -288,64 +288,86 @@ def partitions(self, snapshot_id: Optional[int] = None) -> "pa.Table": table_schema =

Re: [PR] Document null field handling for PyArrow [iceberg-python]

2025-08-20 Thread via GitHub
kris-gaudel commented on PR #2365: URL: https://github.com/apache/iceberg-python/pull/2365#issuecomment-3208137218 Added 2 unit tests to check proper handling behaviour

Re: [PR] Core: Don't copy stats of delete files in DeleteFileIndex [iceberg]

2025-08-20 Thread via GitHub
ebyhr commented on PR #13161: URL: https://github.com/apache/iceberg/pull/13161#issuecomment-3208130507 Rebased on main to resolve conflicts.

Re: [PR] Spark 4.0: RewriteTablePath: Update sizes of rewritten manifests in manifest lists [iceberg]

2025-08-20 Thread via GitHub
stevenzwu commented on code in PR #13720: URL: https://github.com/apache/iceberg/pull/13720#discussion_r2289299893 ## core/src/main/java/org/apache/iceberg/RewriteTablePathUtil.java: ## @@ -335,7 +393,9 @@ public static RewriteResult rewriteDataManifest( * @param sourcePrefi

Re: [PR] Spark 4.0: RewriteTablePath: Update sizes of rewritten manifests in manifest lists [iceberg]

2025-08-20 Thread via GitHub
stevenzwu commented on code in PR #13720: URL: https://github.com/apache/iceberg/pull/13720#discussion_r2289310108 ## core/src/main/java/org/apache/iceberg/RewriteTablePathUtil.java: ## @@ -447,6 +551,58 @@ public static RewriteResult rewriteDeleteManifest( } } + /**

Re: [PR] Spark 3.5, 3.4: Disable executor cache for delete files in RewriteDataFilesSparkAction [iceberg]

2025-08-20 Thread via GitHub
anuragmantri commented on code in PR #13868: URL: https://github.com/apache/iceberg/pull/13868#discussion_r2289299084 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/actions/RewriteDataFilesSparkAction.java: ## @@ -101,6 +102,10 @@ public class RewriteDataFilesSparkAc
