[PR] refactor: clean up code to apply name mapping on avro [iceberg-cpp]

2025-08-25 Thread via GitHub
wgtmac opened a new pull request, #195: URL: https://github.com/apache/iceberg-cpp/pull/195 (no comment) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-m

[PR] feat: add case-insensitive field lookup for StructType, ListType, and… [iceberg-cpp]

2025-08-25 Thread via GitHub
nullccxsy opened a new pull request, #194: URL: https://github.com/apache/iceberg-cpp/pull/194 … MapType - Implemented case-insensitive GetFieldByName in NestedType subclasses. - Added lazy initialization for maps in StructType - Handled duplicate names/IDs with Statu

Re: [I] Does the add_files procedure add column lower and upper bounds statistics to manifest files? [iceberg]

2025-08-25 Thread via GitHub
royantman commented on issue #13218: URL: https://github.com/apache/iceberg/issues/13218#issuecomment-3222776676 @RussellSpitzer The fix is very small and simple, but I think the desired behavior needs to be clear - Is this a confirmed bug or intentional by some reasoning? (this is why I me

Re: [PR] feat: avro schema add sanitize field name [iceberg-cpp]

2025-08-25 Thread via GitHub
wgtmac commented on code in PR #190: URL: https://github.com/apache/iceberg-cpp/pull/190#discussion_r2299792892 ## src/iceberg/avro/avro_schema_util.cc: ## @@ -65,6 +65,31 @@ ::avro::CustomAttributes GetAttributesWithFieldId(int32_t field_id) { } // namespace +std::string

[PR] Doc: Flink Maintenance add Delete OrphansFiles part [iceberg]

2025-08-25 Thread via GitHub
Guosmilesmile opened a new pull request, #13923: URL: https://github.com/apache/iceberg/pull/13923 Since Flink now supports deleting orphan files, we should add that content to the docs.

[PR] add snapshot properties for snapshot created by wap branches [iceberg]

2025-08-25 Thread via GitHub
stevie9868 opened a new pull request, #13922: URL: https://github.com/apache/iceberg/pull/13922 add snapshot properties for wap created snapshots

Re: [PR] Flink: add unit test to check skewness across tasks for range partitioner [iceberg]

2025-08-25 Thread via GitHub
stevenzwu commented on code in PR #13900: URL: https://github.com/apache/iceberg/pull/13900#discussion_r2299710351 ## flink/v2.0/flink/src/test/java/org/apache/iceberg/flink/sink/shuffle/TestRangePartitionerSkew.java: ## @@ -0,0 +1,188 @@ +/* + * Licensed to the Apache Software

Re: [PR] feat: implement literal expressions with binary serialization support [iceberg-cpp]

2025-08-25 Thread via GitHub
wgtmac commented on code in PR #185: URL: https://github.com/apache/iceberg-cpp/pull/185#discussion_r2299673313 ## src/iceberg/util/literal_format.cc: ## @@ -0,0 +1,76 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreement

Re: [PR] Spark: Read/Write `UnknownType` [iceberg]

2025-08-25 Thread via GitHub
stevenzwu commented on code in PR #13445: URL: https://github.com/apache/iceberg/pull/13445#discussion_r2299694970 ## spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/data/ParquetWithSparkSchemaVisitor.java: ## @@ -182,18 +183,15 @@ private static T visitField( priv

[PR] chore: Add release automation infrastructure [iceberg-cpp]

2025-08-25 Thread via GitHub
HeartLinked opened a new pull request, #193: URL: https://github.com/apache/iceberg-cpp/pull/193 This contribution introduces release automation tooling including: - `/dev/release/README.md`: Release process documentation - `/dev/release/*.sh`: Release script utilities - `/dev/releas

Re: [PR] Docs: Apply spotless with flexmark plugin [iceberg]

2025-08-25 Thread via GitHub
manuzhang commented on PR #13908: URL: https://github.com/apache/iceberg/pull/13908#issuecomment-3222443986 @kevinjqliu We do include flink and kafka-connect conditionally. https://github.com/apache/iceberg/blob/5d5e0a3559946a94be2979f3ff9a09f3b8e67f19/settings.gradle#L110-L11

Re: [PR] Docs: Apply spotless with flexmark plugin [iceberg]

2025-08-25 Thread via GitHub
manuzhang commented on PR #13908: URL: https://github.com/apache/iceberg/pull/13908#issuecomment-3222438869 The title format doesn't look right. https://github.com/user-attachments/assets/1f894a60-d8d2-44b1-bed2-9ce847fa3c0c

Re: [PR] Docs: Apply spotless with flexmark plugin [iceberg]

2025-08-25 Thread via GitHub
manuzhang commented on code in PR #13908: URL: https://github.com/apache/iceberg/pull/13908#discussion_r2299635703 ## docs/docs/flink-connector.md: ## @@ -61,8 +64,7 @@ CREATE TABLE flink_table ( ``` !!! info -The underlying catalog database (`hive_db` in the above examp

Re: [PR] Spark 4.0: Add truncate transform tests (non-SPJ only). [iceberg]

2025-08-25 Thread via GitHub
slfan1989 commented on code in PR #13907: URL: https://github.com/apache/iceberg/pull/13907#discussion_r2299609547 ## spark/v4.0/spark/src/test/java/org/apache/iceberg/spark/sql/TestStoragePartitionedJoins.java: ## @@ -133,7 +133,44 @@ public void removeTables() { sql("DROP

Re: [PR] Spark 4.0: Add truncate transform tests (non-SPJ only). [iceberg]

2025-08-25 Thread via GitHub
slfan1989 commented on code in PR #13907: URL: https://github.com/apache/iceberg/pull/13907#discussion_r2299607087 ## spark/v4.0/spark/src/test/java/org/apache/iceberg/spark/sql/TestStoragePartitionedJoins.java: ## @@ -133,7 +133,44 @@ public void removeTables() { sql("DROP

Re: [PR] Flink: add unit test to check skewness across tasks for range partitioner [iceberg]

2025-08-25 Thread via GitHub
pvary commented on code in PR #13900: URL: https://github.com/apache/iceberg/pull/13900#discussion_r2299602608 ## flink/v2.0/flink/src/test/java/org/apache/iceberg/flink/sink/shuffle/TestRangePartitionerSkew.java: ## @@ -0,0 +1,188 @@ +/* + * Licensed to the Apache Software Foun

Re: [PR] Flink: add unit test to check skewness across tasks for range partitioner [iceberg]

2025-08-25 Thread via GitHub
pvary commented on code in PR #13900: URL: https://github.com/apache/iceberg/pull/13900#discussion_r2299588455 ## flink/v2.0/flink/src/test/java/org/apache/iceberg/flink/sink/shuffle/DataDistributionUtil.java: ## @@ -0,0 +1,177 @@ +/* + * Licensed to the Apache Software Foundati

Re: [I] check usage of `IntStream.rangeClosed` in codebase [iceberg]

2025-08-25 Thread via GitHub
JeonDaehong commented on issue #13921: URL: https://github.com/apache/iceberg/issues/13921#issuecomment-309130 Can I go ahead and submit a PR to help fix this bug?

Re: [PR] Spark: Read/Write `UnknownType` [iceberg]

2025-08-25 Thread via GitHub
kevinjqliu commented on code in PR #13445: URL: https://github.com/apache/iceberg/pull/13445#discussion_r2299412893 ## spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/data/SparkParquetWriters.java: ## @@ -613,13 +617,32 @@ private static class InternalRowWriter extends

Re: [I] Validation Error in ConfigResponse Model with RestCatalog in PyIceberg using Nessie REST API [iceberg]

2025-08-25 Thread via GitHub
github-actions[bot] commented on issue #11255: URL: https://github.com/apache/iceberg/issues/11255#issuecomment-3222121223 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

[I] check usage of `IntStream.rangeClosed` in codebase [iceberg]

2025-08-25 Thread via GitHub
kevinjqliu opened a new issue, #13921: URL: https://github.com/apache/iceberg/issues/13921 ### Apache Iceberg version None ### Query engine None ### Please describe the bug 🐞 `rangeClosed` vs `range` Context: https://github.com/apache/iceberg/pull/
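For readers unfamiliar with the distinction this issue title points at, the following is a minimal sketch of how `IntStream.range` and `IntStream.rangeClosed` differ in the Java standard library. The class and method names (`RangeVsRangeClosed`, `exclusive`, `inclusive`) are hypothetical and chosen only for illustration; they are not taken from the Iceberg codebase, and the snippet does not claim to reproduce the specific usage the issue refers to.

```java
import java.util.stream.IntStream;

public class RangeVsRangeClosed {
    // IntStream.range excludes the upper bound: start, ..., end - 1.
    static int[] exclusive(int start, int end) {
        return IntStream.range(start, end).toArray();
    }

    // IntStream.rangeClosed includes the upper bound: start, ..., end.
    static int[] inclusive(int start, int end) {
        return IntStream.rangeClosed(start, end).toArray();
    }

    public static void main(String[] args) {
        // Same arguments, different element counts.
        System.out.println(exclusive(0, 3).length);
        System.out.println(inclusive(0, 3).length);
    }
}
```

Swapping one call for the other changes the number of iterations by one, which is the usual source of off-by-one errors when the two are confused.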

Re: [I] Limit the delete file/records [iceberg]

2025-08-25 Thread via GitHub
github-actions[bot] commented on issue #12343: URL: https://github.com/apache/iceberg/issues/12343#issuecomment-3222121524 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [I] Kafka Connect config providers are not working when CVE-2024-31141 fix is applied [iceberg]

2025-08-25 Thread via GitHub
github-actions[bot] closed issue #12221: Kafka Connect config providers are not working when CVE-2024-31141 fix is applied URL: https://github.com/apache/iceberg/issues/12221

Re: [I] Kafka Connect config providers are not working when CVE-2024-31141 fix is applied [iceberg]

2025-08-25 Thread via GitHub
github-actions[bot] commented on issue #12221: URL: https://github.com/apache/iceberg/issues/12221#issuecomment-3222121391 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [PR] Materialized View Spec [iceberg]

2025-08-25 Thread via GitHub
stevenzwu commented on code in PR #11041: URL: https://github.com/apache/iceberg/pull/11041#discussion_r2299399276 ## format/view-spec.md: ## @@ -42,12 +42,28 @@ An atomic swap of one view metadata file for another provides the basis for maki Writers create view metadata fil

Re: [PR] Spark 4.0: RewriteTablePath: Update sizes of rewritten manifests in manifest lists [iceberg]

2025-08-25 Thread via GitHub
stevenzwu commented on code in PR #13720: URL: https://github.com/apache/iceberg/pull/13720#discussion_r2299391881 ## spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/actions/RewriteTablePathSparkAction.java: ## @@ -494,36 +483,60 @@ public RewriteContentFileResult appen

Re: [PR] feat(writer): Make LocationGenerator partition-aware [iceberg-rust]

2025-08-25 Thread via GitHub
CTTY commented on code in PR #1625: URL: https://github.com/apache/iceberg-rust/pull/1625#discussion_r2299302809 ## crates/iceberg/src/writer/file_writer/location_generator.rs: ## @@ -217,7 +249,82 @@ pub(crate) mod test { let location_generator = super::D

Re: [PR] feat(writer): Make LocationGenerator partition-aware [iceberg-rust]

2025-08-25 Thread via GitHub
CTTY commented on code in PR #1625: URL: https://github.com/apache/iceberg-rust/pull/1625#discussion_r2299295538 ## crates/iceberg/src/spec/partition.rs: ## @@ -176,6 +176,38 @@ impl PartitionSpec { } } +/// A partition key represents a specific partition in a table, con

Re: [PR] [SPEC] Add finer grained read restrictions as part of loadTable [iceberg]

2025-08-25 Thread via GitHub
singhpk234 commented on code in PR #13879: URL: https://github.com/apache/iceberg/pull/13879#discussion_r2299340423 ## open-api/rest-catalog-open-api.py: ## @@ -1240,6 +1240,24 @@ class ViewUpdate(BaseModel): ] +class ReadRestrictions(BaseModel): +""" +Read Rest

Re: [I] Avro: Allow reading ManifestList V1 using a V2 reader [iceberg-rust]

2025-08-25 Thread via GitHub
emkornfield commented on issue #1587: URL: https://github.com/apache/iceberg-rust/issues/1587#issuecomment-3222020915 @Kurtiscwright are you still working on this? if not I might have some bandwidth to try it out.

Re: [PR] [SPEC] Add finer grained read restrictions as part of loadTable [iceberg]

2025-08-25 Thread via GitHub
singhpk234 commented on code in PR #13879: URL: https://github.com/apache/iceberg/pull/13879#discussion_r2294486557 ## open-api/rest-catalog-open-api.yaml: ## @@ -3260,25 +3260,25 @@ components: additionalProperties: type: string -FineGrainedDataPro

Re: [PR] feat(catalog): Implement catalog loader for in memory [iceberg-rust]

2025-08-25 Thread via GitHub
lliangyu-lin commented on code in PR #1623: URL: https://github.com/apache/iceberg-rust/pull/1623#discussion_r2299215680 ## crates/iceberg/src/catalog/memory/catalog.rs: ## @@ -24,17 +24,84 @@ use futures::lock::{Mutex, MutexGuard}; use itertools::Itertools; use super::names

Re: [PR] Fix filesystem [iceberg-python]

2025-08-25 Thread via GitHub
mccormickt12 commented on code in PR #2291: URL: https://github.com/apache/iceberg-python/pull/2291#discussion_r2299306480 ## pyiceberg/io/pyarrow.py: ## @@ -381,21 +381,38 @@ def to_input_file(self) -> PyArrowFile: class PyArrowFileIO(FileIO): fs_by_scheme: Callable[[st

Re: [PR] Fix filesystem [iceberg-python]

2025-08-25 Thread via GitHub
mccormickt12 commented on code in PR #2291: URL: https://github.com/apache/iceberg-python/pull/2291#discussion_r2299304219 ## pyiceberg/io/pyarrow.py: ## @@ -381,21 +381,38 @@ def to_input_file(self) -> PyArrowFile: class PyArrowFileIO(FileIO): fs_by_scheme: Callable[[st

Re: [PR] feat(writer): Make LocationGenerator partition-aware [iceberg-rust]

2025-08-25 Thread via GitHub
CTTY commented on code in PR #1625: URL: https://github.com/apache/iceberg-rust/pull/1625#discussion_r2299291559 ## crates/iceberg/src/writer/file_writer/location_generator.rs: ## @@ -21,14 +21,24 @@ use std::sync::Arc; use std::sync::atomic::AtomicU64; use crate::Result; -u

Re: [PR] Data, Flink, Spark: Use TestHelpers for FormatVersion [iceberg]

2025-08-25 Thread via GitHub
RussellSpitzer commented on PR #13880: URL: https://github.com/apache/iceberg/pull/13880#issuecomment-3221919284 My current understanding is that at least within our compaction code (and rewrite delete files action code) we are leaking memory. The biggest contributor is TestCombineMixedFile

Re: [PR] Data, Flink, Spark: Use TestHelpers for FormatVersion [iceberg]

2025-08-25 Thread via GitHub
RussellSpitzer commented on PR #13880: URL: https://github.com/apache/iceberg/pull/13880#issuecomment-3221912680 Ok So Compaction is really leaking direct memory (at least in our tests) https://github.com/user-attachments/assets/3b0aaff7-0a8b-494f-a52a-d677a20a247b Spark 3

Re: [PR] Spark: Read/Write `UnknownType` [iceberg]

2025-08-25 Thread via GitHub
stevenzwu commented on code in PR #13445: URL: https://github.com/apache/iceberg/pull/13445#discussion_r2299246777 ## spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/data/ParquetWithSparkSchemaVisitor.java: ## @@ -182,18 +183,15 @@ private static T visitField( priv

Re: [PR] Docs: Add docs for Table Maintenance in Flink [iceberg]

2025-08-25 Thread via GitHub
JeonDaehong commented on PR #13853: URL: https://github.com/apache/iceberg/pull/13853#issuecomment-3221883143 > It is available here: https://iceberg.apache.org/docs/nightly/flink-maintenance/ Thanks :D

Re: [PR] Spark: Read/Write `UnknownType` [iceberg]

2025-08-25 Thread via GitHub
stevenzwu commented on code in PR #13445: URL: https://github.com/apache/iceberg/pull/13445#discussion_r2299240111 ## spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/data/ParquetWithSparkSchemaVisitor.java: ## @@ -182,18 +183,15 @@ private static T visitField( priv

Re: [PR] Spec: clarify the partition-spec metadata for Avro manifest file [iceberg]

2025-08-25 Thread via GitHub
RussellSpitzer commented on PR #13895: URL: https://github.com/apache/iceberg/pull/13895#issuecomment-3221855611 > I've also seen this in the past. I don't see any harm in making this more explicit 👍 Could you share who was generating the files? We had a user report this but never fi

Re: [PR] Data, Flink, Spark: Use TestHelpers for FormatVersion [iceberg]

2025-08-25 Thread via GitHub
RussellSpitzer commented on PR #13880: URL: https://github.com/apache/iceberg/pull/13880#issuecomment-3221864508 OK, the issue is with direct memory, so now I have to track down memory usage in our Arrow readers.

Re: [PR] Spark: Read/Write `UnknownType` [iceberg]

2025-08-25 Thread via GitHub
Fokko commented on code in PR #13445: URL: https://github.com/apache/iceberg/pull/13445#discussion_r2299199317 ## spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/data/ParquetWithSparkSchemaVisitor.java: ## @@ -182,18 +186,18 @@ private static T visitField( private

Re: [PR] Add serializer for AssertRefSnapshotId allowing null json value [iceberg-python]

2025-08-25 Thread via GitHub
kevinjqliu merged PR #2375: URL: https://github.com/apache/iceberg-python/pull/2375

Re: [PR] Add serializer for AssertRefSnapshotId allowing null json value [iceberg-python]

2025-08-25 Thread via GitHub
kevinjqliu commented on PR #2375: URL: https://github.com/apache/iceberg-python/pull/2375#issuecomment-3221822023 Thanks for the follow-up PR @ox and thanks @Fokko for the review

Re: [I] missing snapshot-id for assert-ref-snapshot-id requirement marshals to invalid JSON [iceberg-python]

2025-08-25 Thread via GitHub
kevinjqliu closed issue #2342: missing snapshot-id for assert-ref-snapshot-id requirement marshals to invalid JSON URL: https://github.com/apache/iceberg-python/issues/2342

Re: [PR] Spark: Read/Write `UnknownType` [iceberg]

2025-08-25 Thread via GitHub
stevenzwu commented on code in PR #13445: URL: https://github.com/apache/iceberg/pull/13445#discussion_r2299170244 ## spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/data/SparkParquetWriters.java: ## @@ -613,13 +617,32 @@ private static class InternalRowWriter extends P

Re: [PR] feate(table): Add time travel functionality [iceberg-go]

2025-08-25 Thread via GitHub
dttung2905 commented on code in PR #548: URL: https://github.com/apache/iceberg-go/pull/548#discussion_r2299160635 ## table/table.go: ## @@ -227,6 +228,45 @@ func (t Table) doCommit(ctx context.Context, updates []Update, reqs []Requiremen return New(t.identifier, newMet

Re: [PR] Spark: Read/Write `UnknownType` [iceberg]

2025-08-25 Thread via GitHub
Fokko commented on code in PR #13445: URL: https://github.com/apache/iceberg/pull/13445#discussion_r2299097640 ## spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/data/SparkParquetWriters.java: ## @@ -613,13 +617,32 @@ private static class InternalRowWriter extends Parqu

Re: [PR] Spark: Read/Write `UnknownType` [iceberg]

2025-08-25 Thread via GitHub
Fokko commented on code in PR #13445: URL: https://github.com/apache/iceberg/pull/13445#discussion_r2299141047 ## spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/data/SparkParquetWriters.java: ## @@ -613,13 +617,32 @@ private static class InternalRowWriter extends Parqu

[PR] feat(sqllogictest): Add sqllogictest schedule definition and parsing [iceberg-rust]

2025-08-25 Thread via GitHub
lliangyu-lin opened a new pull request, #1630: URL: https://github.com/apache/iceberg-rust/pull/1630 ## Which issue does this PR close? - Closes #1214 . ## What changes are included in this PR? * Add schedule definition * Parse schedule to engines involved and test

Re: [PR] Add serializer for AssertRefSnapshotId allowing null json value [iceberg-python]

2025-08-25 Thread via GitHub
kevinjqliu commented on code in PR #2375: URL: https://github.com/apache/iceberg-python/pull/2375#discussion_r2299106728 ## pyiceberg/table/update/__init__.py: ## @@ -21,9 +21,12 @@ from abc import ABC, abstractmethod from datetime import datetime from functools import single

Re: [PR] Spark: Read/Write `UnknownType` [iceberg]

2025-08-25 Thread via GitHub
stevenzwu commented on code in PR #13445: URL: https://github.com/apache/iceberg/pull/13445#discussion_r2299085169 ## spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/data/SparkParquetWriters.java: ## @@ -613,13 +617,32 @@ private static class InternalRowWriter extends P

Re: [PR] Spark: Read/Write `UnknownType` [iceberg]

2025-08-25 Thread via GitHub
stevenzwu commented on code in PR #13445: URL: https://github.com/apache/iceberg/pull/13445#discussion_r2299047920 ## spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/data/SparkParquetWriters.java: ## @@ -613,13 +617,32 @@ private static class InternalRowWriter extends P

Re: [PR] feat(cli): add token parameter for rest catalog [iceberg-go]

2025-08-25 Thread via GitHub
rodrigopv commented on PR #549: URL: https://github.com/apache/iceberg-go/pull/549#issuecomment-3221579556 > How is this different from the `credential` option that already exists? `credential` accepts a colon-separated client_id and client_secret to perform OAuth2 authentication, whi

Re: [PR] Test both vectorized and nonvectorized readers in Parquet golden file tests [iceberg]

2025-08-25 Thread via GitHub
huaxingao commented on PR #13890: URL: https://github.com/apache/iceberg/pull/13890#issuecomment-3221570663 Merged. Thanks @eric-maynard for the PR! Thanks @kevinjqliu for the review!

Re: [PR] Test both vectorized and nonvectorized readers in Parquet golden file tests [iceberg]

2025-08-25 Thread via GitHub
huaxingao merged PR #13890: URL: https://github.com/apache/iceberg/pull/13890

Re: [PR] Spark: Read/Write `UnknownType` [iceberg]

2025-08-25 Thread via GitHub
Fokko commented on code in PR #13445: URL: https://github.com/apache/iceberg/pull/13445#discussion_r2298924953 ## spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/data/SparkParquetWriters.java: ## @@ -96,14 +98,18 @@ public ParquetValueWriter message( public ParquetV

Re: [PR] Spark: Read/Write `UnknownType` [iceberg]

2025-08-25 Thread via GitHub
stevenzwu commented on code in PR #13445: URL: https://github.com/apache/iceberg/pull/13445#discussion_r2298901059 ## spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/data/SparkParquetWriters.java: ## @@ -96,14 +98,18 @@ public ParquetValueWriter message( public Parq

Re: [PR] Spark: Read/Write `UnknownType` [iceberg]

2025-08-25 Thread via GitHub
stevenzwu commented on code in PR #13445: URL: https://github.com/apache/iceberg/pull/13445#discussion_r2298863755 ## spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/data/ParquetWithSparkSchemaVisitor.java: ## @@ -182,18 +186,18 @@ private static T visitField( priv

Re: [PR] Fix filesystem [iceberg-python]

2025-08-25 Thread via GitHub
Fokko commented on code in PR #2291: URL: https://github.com/apache/iceberg-python/pull/2291#discussion_r2298831593 ## pyiceberg/io/pyarrow.py: ## @@ -381,21 +381,38 @@ def to_input_file(self) -> PyArrowFile: class PyArrowFileIO(FileIO): fs_by_scheme: Callable[[str, Opti

Re: [PR] [SPEC] Add finer grained read restrictions as part of loadTable [iceberg]

2025-08-25 Thread via GitHub
singhpk234 commented on code in PR #13879: URL: https://github.com/apache/iceberg/pull/13879#discussion_r2298822949 ## open-api/rest-catalog-open-api.yaml: ## @@ -3260,6 +3260,33 @@ components: additionalProperties: type: string +ReadRestrictions: +

Re: [PR] [SPEC] Add finer grained read restrictions as part of loadTable [iceberg]

2025-08-25 Thread via GitHub
singhpk234 commented on code in PR #13879: URL: https://github.com/apache/iceberg/pull/13879#discussion_r2298790689 ## open-api/rest-catalog-open-api.py: ## @@ -1240,6 +1240,24 @@ class ViewUpdate(BaseModel): ] +class ReadRestrictions(BaseModel): +""" +Read Rest

Re: [PR] Spec: clarify the partition-spec metadata for Avro manifest file [iceberg]

2025-08-25 Thread via GitHub
Fokko commented on code in PR #13895: URL: https://github.com/apache/iceberg/pull/13895#discussion_r2298781163 ## format/spec.md: ## @@ -609,14 +609,14 @@ A manifest stores files for a single partition spec. When a table’s partition A manifest file must store the partition s

Re: [PR] [SPEC] Add finer grained read restrictions as part of loadTable [iceberg]

2025-08-25 Thread via GitHub
RussellSpitzer commented on code in PR #13879: URL: https://github.com/apache/iceberg/pull/13879#discussion_r2298777374 ## open-api/rest-catalog-open-api.yaml: ## @@ -3260,6 +3260,33 @@ components: additionalProperties: type: string +ReadRestriction

Re: [I] Avro reader memory leak [iceberg-python]

2025-08-25 Thread via GitHub
kris-gaudel commented on issue #2325: URL: https://github.com/apache/iceberg-python/issues/2325#issuecomment-3221265037 @Declow I'm on macOS Sonoma 14.6.1 and Python 3.11.6. ```python from pyiceberg.catalog.memory import InMemoryCatalog import tracemalloc from datetime impo

Re: [PR] [SPEC] Add finer grained read restrictions as part of loadTable [iceberg]

2025-08-25 Thread via GitHub
RussellSpitzer commented on code in PR #13879: URL: https://github.com/apache/iceberg/pull/13879#discussion_r2298774061 ## open-api/rest-catalog-open-api.yaml: ## @@ -3260,6 +3260,33 @@ components: additionalProperties: type: string +ReadRestriction

Re: [PR] Spark 4.0: RewriteTablePath: Update sizes of rewritten manifests in manifest lists [iceberg]

2025-08-25 Thread via GitHub
vaultah commented on code in PR #13720: URL: https://github.com/apache/iceberg/pull/13720#discussion_r2298773311 ## spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/actions/RewriteTablePathSparkAction.java: ## @@ -494,36 +483,60 @@ public RewriteContentFileResult appendD

Re: [PR] [SPEC] Add finer grained read restrictions as part of loadTable [iceberg]

2025-08-25 Thread via GitHub
RussellSpitzer commented on code in PR #13879: URL: https://github.com/apache/iceberg/pull/13879#discussion_r2298770034 ## open-api/rest-catalog-open-api.yaml: ## @@ -3260,6 +3260,33 @@ components: additionalProperties: type: string +ReadRestriction

Re: [PR] Flink: add unit test to check skewness across tasks for range partitioner [iceberg]

2025-08-25 Thread via GitHub
stevenzwu commented on code in PR #13900: URL: https://github.com/apache/iceberg/pull/13900#discussion_r2298769043 ## flink/v2.0/flink/src/test/java/org/apache/iceberg/flink/sink/shuffle/TestRangePartitionerSkew.java: ## @@ -0,0 +1,247 @@ +/* + * Licensed to the Apache Software

Re: [PR] [SPEC] Add finer grained read restrictions as part of loadTable [iceberg]

2025-08-25 Thread via GitHub
RussellSpitzer commented on code in PR #13879: URL: https://github.com/apache/iceberg/pull/13879#discussion_r2298758463 ## open-api/rest-catalog-open-api.yaml: ## @@ -3260,6 +3260,33 @@ components: additionalProperties: type: string +ReadRestriction

Re: [I] Spark rewrite_data_files failing with java.lang.IllegalStateException: Connection pool shut down [iceberg]

2025-08-25 Thread via GitHub
troy-curtis commented on issue #12046: URL: https://github.com/apache/iceberg/issues/12046#issuecomment-3221213801 > [@troy-curtis](https://github.com/troy-curtis) are you sure the jobs succeed? In this case (rewrite data) the underlying failures did not propagate to a spark job failure. (I

Re: [PR] [SPEC] Add finer grained read restrictions as part of loadTable [iceberg]

2025-08-25 Thread via GitHub
RussellSpitzer commented on code in PR #13879: URL: https://github.com/apache/iceberg/pull/13879#discussion_r2298748280 ## open-api/rest-catalog-open-api.yaml: ## @@ -3260,6 +3260,33 @@ components: additionalProperties: type: string +ReadRestriction

Re: [PR] [SPEC] Add finer grained read restrictions as part of loadTable [iceberg]

2025-08-25 Thread via GitHub
RussellSpitzer commented on code in PR #13879: URL: https://github.com/apache/iceberg/pull/13879#discussion_r2298742292 ## open-api/rest-catalog-open-api.py: ## @@ -1240,6 +1240,24 @@ class ViewUpdate(BaseModel): ] +class ReadRestrictions(BaseModel): +""" +Read

Re: [I] Run GCP Integration tests for catalog [iceberg-python]

2025-08-25 Thread via GitHub
Fokko commented on issue #2368: URL: https://github.com/apache/iceberg-python/issues/2368#issuecomment-3221197663 @rambleraptor Could you raise this on the dev-list? If we have agreement here (I'm all in), then we can link that thread in an ASF Infra issue

Re: [I] Error when filtering by UUID in table scan [iceberg-python]

2025-08-25 Thread via GitHub
Fokko commented on issue #2372: URL: https://github.com/apache/iceberg-python/issues/2372#issuecomment-3221191849 If we need to convert from `uuid` to `fixed[16]` to fix this, I think that would be reasonable.

Re: [PR] [SPEC] Add finer grained read restrictions as part of loadTable [iceberg]

2025-08-25 Thread via GitHub
RussellSpitzer commented on code in PR #13879: URL: https://github.com/apache/iceberg/pull/13879#discussion_r2298731965 ## open-api/rest-catalog-open-api.py: ## @@ -1240,6 +1240,24 @@ class ViewUpdate(BaseModel): ] +class ReadRestrictions(BaseModel): +""" +Read

Re: [PR] [SPEC] Add finer grained read restrictions as part of loadTable [iceberg]

2025-08-25 Thread via GitHub
RussellSpitzer commented on code in PR #13879: URL: https://github.com/apache/iceberg/pull/13879#discussion_r2298729008 ## open-api/rest-catalog-open-api.py: ## @@ -1240,6 +1240,24 @@ class ViewUpdate(BaseModel): ] +class ReadRestrictions(BaseModel): +""" +Read

Re: [PR] Add serializer for AssertRefSnapshotId allowing null json value [iceberg-python]

2025-08-25 Thread via GitHub
kevinjqliu commented on code in PR #2375: URL: https://github.com/apache/iceberg-python/pull/2375#discussion_r2298728951 ## pyiceberg/table/update/__init__.py: ## @@ -727,6 +728,11 @@ class AssertRefSnapshotId(ValidatableTableRequirement): ref: str = Field(...) snapsho

Re: [PR] [SPEC] Add finer grained read restrictions as part of loadTable [iceberg]

2025-08-25 Thread via GitHub
RussellSpitzer commented on code in PR #13879: URL: https://github.com/apache/iceberg/pull/13879#discussion_r2298722364 ## open-api/rest-catalog-open-api.py: ## @@ -1240,6 +1240,24 @@ class ViewUpdate(BaseModel): ] +class ReadRestrictions(BaseModel): +""" +Read

Re: [PR] Docs: Apply spotless with flexmark plugin [iceberg]

2025-08-25 Thread via GitHub
kevinjqliu commented on code in PR #13908: URL: https://github.com/apache/iceberg/pull/13908#discussion_r2298712317 ## docs/docs/flink-connector.md: ## @@ -61,8 +64,7 @@ CREATE TABLE flink_table ( ``` !!! info -The underlying catalog database (`hive_db` in the above exam

Re: [PR] Core: Support Distributed Scan For Partitions Metadata Table [iceberg]

2025-08-25 Thread via GitHub
RussellSpitzer commented on code in PR #13903: URL: https://github.com/apache/iceberg/pull/13903#discussion_r2298705218 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/TestSparkDistributedPartitionsTable.java: ## @@ -0,0 +1,209 @@ +/* + * Licensed to the Apache Softwa

Re: [PR] Core: Support Distributed Scan For Partitions Metadata Table [iceberg]

2025-08-25 Thread via GitHub
RussellSpitzer commented on PR #13903: URL: https://github.com/apache/iceberg/pull/13903#issuecomment-3221135831 I think this is a great idea, but this is a relatively heavy approach to implementation. I think we really don't need to do much more than swapping the PartitionsTable implementa

Re: [PR] Add serializer for AssertRefSnapshotId allowing null json value [iceberg-python]

2025-08-25 Thread via GitHub
ox commented on code in PR #2375: URL: https://github.com/apache/iceberg-python/pull/2375#discussion_r2298686528 ## pyiceberg/table/update/__init__.py: ## @@ -727,6 +727,14 @@ class AssertRefSnapshotId(ValidatableTableRequirement): ref: str = Field(...) snapshot_id: Op

Re: [PR] docs: improve release docs [iceberg-python]

2025-08-25 Thread via GitHub
kevinjqliu commented on code in PR #2374: URL: https://github.com/apache/iceberg-python/pull/2374#discussion_r2298675249 ## mkdocs/docs/how-to-release.md: ## @@ -351,9 +351,9 @@ Send out an announcement on the dev mail list: ```text To: d...@iceberg.apache.org -Subject: [ANN

Re: [PR] docs: improve release docs [iceberg-python]

2025-08-25 Thread via GitHub
kevinjqliu commented on code in PR #2374: URL: https://github.com/apache/iceberg-python/pull/2374#discussion_r2298672766 ## mkdocs/docs/verify-release.md: ## @@ -48,13 +48,18 @@ Set an environment variable to the version to verify and path to use ```sh export PYICEBERG_VERS

Re: [PR] Spec: clarify the partition-spec metadata for Avro manifest file [iceberg]

2025-08-25 Thread via GitHub
stevenzwu commented on code in PR #13895: URL: https://github.com/apache/iceberg/pull/13895#discussion_r2298646328 ## format/spec.md: ## @@ -609,14 +609,14 @@ A manifest stores files for a single partition spec. When a table’s partition A manifest file must store the partiti

Re: [PR] Spark 4.0: Add truncate transform tests (non-SPJ only). [iceberg]

2025-08-25 Thread via GitHub
huaxingao commented on code in PR #13907: URL: https://github.com/apache/iceberg/pull/13907#discussion_r2298643947 ## spark/v4.0/spark/src/test/java/org/apache/iceberg/spark/sql/TestStoragePartitionedJoins.java: ## @@ -133,7 +133,44 @@ public void removeTables() { sql("DROP

Re: [PR] Spark 4.0: Add truncate transform tests (non-SPJ only). [iceberg]

2025-08-25 Thread via GitHub
huaxingao commented on code in PR #13907: URL: https://github.com/apache/iceberg/pull/13907#discussion_r2298637250 ## spark/v4.0/spark/src/test/java/org/apache/iceberg/spark/sql/TestStoragePartitionedJoins.java: ## @@ -133,7 +133,44 @@ public void removeTables() { sql("DROP

Re: [PR] Spark 4.0: Add truncate transform tests (non-SPJ only). [iceberg]

2025-08-25 Thread via GitHub
huaxingao commented on code in PR #13907: URL: https://github.com/apache/iceberg/pull/13907#discussion_r2298627317 ## spark/v4.0/spark/src/test/java/org/apache/iceberg/spark/sql/TestStoragePartitionedJoins.java: ## @@ -133,7 +133,44 @@ public void removeTables() { sql("DROP
