Re: [I] Flink: Not Writing [iceberg]

2024-05-02 Thread via GitHub
pvary commented on issue #8916: URL: https://github.com/apache/iceberg/issues/8916#issuecomment-2092323155 @parrik: The issue you have linked is for the cases where the checkpointing is not enabled. If I read the python code correctly, this is not the case here. Still, the issue seems

Re: [PR] Implement table_exists method with try-catch for NoSuchTableError [iceberg-python]

2024-05-02 Thread via GitHub
HonahX commented on code in PR #678: URL: https://github.com/apache/iceberg-python/pull/678#discussion_r1588651526 ## pyiceberg/catalog/__init__.py: ## @@ -661,6 +661,14 @@ def create_table_transaction( ) def table_exists(self, identifier: Union[str,

Re: [I] Add runtime module to enable concurrent load of manifest files. [iceberg-rust]

2024-05-02 Thread via GitHub
marvinlanhenke commented on issue #124: URL: https://github.com/apache/iceberg-rust/issues/124#issuecomment-2092098359 ... so as a first step - simple wrap tokio::spawn (for example) like [here](https://github.com/launchbadge/sqlx/blob/main/sqlx-core/src/rt/mod.rs#L61-L78) - and not even

Re: [I] Compatibility issues with `org.apache.iceberg:iceberg-spark-runtime-3.5_2.13:1.5.0` [iceberg-rust]

2024-05-02 Thread via GitHub
a-agmon commented on issue #338: URL: https://github.com/apache/iceberg-rust/issues/338#issuecomment-2092089365 > Can you share the metadata JSON? I don't think the field ID resolution is being applied, described in issue #353. `added_data_files_count` is the old name since in V2 it also

Re: [I] [feature request] Allow engines to time travel [iceberg-python]

2024-05-02 Thread via GitHub
corleyma commented on issue #600: URL: https://github.com/apache/iceberg-python/issues/600#issuecomment-2092076360 > More over, multiple different snapshots can also be committed between two consecutive metadata json files. In what situations would that occur? In my (possibly

Re: [I] Writing to S3 fails if the user is authenticated with `aws sso login` [iceberg-python]

2024-05-02 Thread via GitHub
github-actions[bot] commented on issue #39: URL: https://github.com/apache/iceberg-python/issues/39#issuecomment-2091940101 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' -- This is an automated message from the

Re: [I] Pass in the correct type for the VisitorWithParent [iceberg-python]

2024-05-02 Thread via GitHub
github-actions[bot] closed issue #58: Pass in the correct type for the VisitorWithParent URL: https://github.com/apache/iceberg-python/issues/58 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

Re: [I] Pass in the correct type for the VisitorWithParent [iceberg-python]

2024-05-02 Thread via GitHub
github-actions[bot] commented on issue #58: URL: https://github.com/apache/iceberg-python/issues/58#issuecomment-2091940082 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' -- This is an automated message from the

Re: [I] Writing to S3 fails if the user is authenticated with `aws sso login` [iceberg-python]

2024-05-02 Thread via GitHub
github-actions[bot] closed issue #39: Writing to S3 fails if the user is authenticated with `aws sso login` URL: https://github.com/apache/iceberg-python/issues/39 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

Re: [I] Caching Tables in SparkCatalog via CachingCatalog by default leads to stale data [iceberg]

2024-05-02 Thread via GitHub
github-actions[bot] closed issue #2319: Caching Tables in SparkCatalog via CachingCatalog by default leads to stale data URL: https://github.com/apache/iceberg/issues/2319 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

Re: [I] Caching Tables in SparkCatalog via CachingCatalog by default leads to stale data [iceberg]

2024-05-02 Thread via GitHub
github-actions[bot] commented on issue #2319: URL: https://github.com/apache/iceberg/issues/2319#issuecomment-2091938891 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' -- This is an automated message from the Apache

Re: [I] Compatibility issues with `org.apache.iceberg:iceberg-spark-runtime-3.5_2.13:1.5.0` [iceberg-rust]

2024-05-02 Thread via GitHub
zeodtr commented on issue #338: URL: https://github.com/apache/iceberg-rust/issues/338#issuecomment-2091929341 @Fokko if #354 is applied, iceberg-rust will no longer be able to read the manifest list files created by pre-1.5.0 Spark and pre-#354 iceberg-rust, since iceberg-rust does not

Re: [PR] Add Pagination To List Apis [iceberg]

2024-05-02 Thread via GitHub
rahil-c commented on PR #9782: URL: https://github.com/apache/iceberg/pull/9782#issuecomment-2091903148 @danielcweeks Seems to be green now -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

Re: [I] REST Catalog to support custom-catalog name like HMS/Glue [iceberg]

2024-05-02 Thread via GitHub
flyrain commented on issue #10205: URL: https://github.com/apache/iceberg/issues/10205#issuecomment-2091895186 IIUC, the `catalog-name` is a concept introduced in Hive 3.0 to support one additional layer on top of `database`. I will consider the multipart namespace in REST spec is a

[PR] Build: Bump cython from 3.0.8 to 3.0.10 [iceberg-python]

2024-05-02 Thread via GitHub
dependabot[bot] opened a new pull request, #697: URL: https://github.com/apache/iceberg-python/pull/697 Bumps [cython](https://github.com/cython/cython) from 3.0.8 to 3.0.10. Changelog Sourced from [cython's changelog](https://github.com/cython/cython/blob/master/CHANGES.rst).

Re: [PR] Add Pagination To List Apis [iceberg]

2024-05-02 Thread via GitHub
rahil-c commented on PR #9782: URL: https://github.com/apache/iceberg/pull/9782#issuecomment-2091882544 > Approved, pending checks. There's one that failed, but may have been a transient failure. @danielcweeks yea I think it is transient, since every time I go to check this link it

Re: [PR] Add ManifestFile Stats in snapshot summary. [iceberg]

2024-05-02 Thread via GitHub
amogh-jahagirdar commented on code in PR #10246: URL: https://github.com/apache/iceberg/pull/10246#discussion_r1588511039 ## core/src/main/java/org/apache/iceberg/FastAppend.java: ## @@ -156,6 +156,8 @@ public List apply(TableMetadata base, Snapshot snapshot) {

[PR] Build: Bump mkdocs-section-index from 0.3.8 to 0.3.9 [iceberg-python]

2024-05-02 Thread via GitHub
dependabot[bot] opened a new pull request, #696: URL: https://github.com/apache/iceberg-python/pull/696 Bumps [mkdocs-section-index](https://github.com/oprypin/mkdocs-section-index) from 0.3.8 to 0.3.9. Release notes Sourced from

Re: [PR] Add Pagination To List Apis [iceberg]

2024-05-02 Thread via GitHub
danielcweeks commented on PR #9782: URL: https://github.com/apache/iceberg/pull/9782#issuecomment-2091881609 Approved, pending checks. There's one that failed, but may have been a transient failure. -- This is an automated message from the Apache Git Service. To respond to the message,

Re: [PR] Core: Retry connections in JDBC catalog with user configured error code list [iceberg]

2024-05-02 Thread via GitHub
amogh-jahagirdar commented on code in PR #10140: URL: https://github.com/apache/iceberg/pull/10140#discussion_r1588457797 ## core/src/main/java/org/apache/iceberg/jdbc/JdbcUtil.java: ## @@ -40,6 +40,8 @@ final class JdbcUtil { // property to control if view support is added

Re: [PR] Add labeler [iceberg-python]

2024-05-02 Thread via GitHub
syun64 commented on PR #549: URL: https://github.com/apache/iceberg-python/pull/549#issuecomment-2091773603 I'm +1 for adding the sync-labels as well, it sounds like it could be helpful ``` name: "Pull Request Labeler" on: - pull_request_target jobs: labeler:

Re: [PR] Core: Retry connections in JDBC catalog with user configured error code list [iceberg]

2024-05-02 Thread via GitHub
amogh-jahagirdar commented on code in PR #10140: URL: https://github.com/apache/iceberg/pull/10140#discussion_r1588420304 ## core/src/main/java/org/apache/iceberg/jdbc/JdbcClientPool.java: ## @@ -21,17 +21,39 @@ import java.sql.Connection; import java.sql.DriverManager;

Re: [PR] Views, Spark: Add support for Materialized Views; Integrate with Spark SQL [iceberg]

2024-05-02 Thread via GitHub
singhpk234 commented on PR #9830: URL: https://github.com/apache/iceberg/pull/9830#issuecomment-2091715487 > For (2): We have not discussed incremental refresh plans in the Iceberg community, but [there is some relevant work

Re: [I] REST Catalog to support custom-catalog name like HMS/Glue [iceberg]

2024-05-02 Thread via GitHub
osscm commented on issue #10205: URL: https://github.com/apache/iceberg/issues/10205#issuecomment-2091666775 I also think catalog-name can be a separate entity. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

Re: [PR] Metadata Log Entries metadata table [iceberg-python]

2024-05-02 Thread via GitHub
syun64 commented on code in PR #667: URL: https://github.com/apache/iceberg-python/pull/667#discussion_r1588359700 ## pyiceberg/table/metadata.py: ## @@ -292,6 +292,13 @@ def snapshot_by_name(self, name: str) -> Optional[Snapshot]: return

Re: [I] [feature request] Allow engines to time travel [iceberg-python]

2024-05-02 Thread via GitHub
syun64 commented on issue #600: URL: https://github.com/apache/iceberg-python/issues/600#issuecomment-2091529571 > * the path to the metadata json file for a given snapshot id. > * I really wish this was a property of the Snapshot class; is that possible or does this break

Re: [I] Flink: Not Writing [iceberg]

2024-05-02 Thread via GitHub
parrik commented on issue #8916: URL: https://github.com/apache/iceberg/issues/8916#issuecomment-2091506520 https://github.com/apache/iceberg/pull/1515 fwiw -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

Re: [PR] Make `add_files` to support `snapshot_properties` argument [iceberg-python]

2024-05-02 Thread via GitHub
enkidulan commented on PR #695: URL: https://github.com/apache/iceberg-python/pull/695#issuecomment-2091494648 Thanks @syun64. I've added the tests -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

Re: [PR] Make `add_files` to support `snapshot_properties` argument [iceberg-python]

2024-05-02 Thread via GitHub
enkidulan commented on code in PR #695: URL: https://github.com/apache/iceberg-python/pull/695#discussion_r1588306192 ## tests/integration/test_add_files.py: ## @@ -122,8 +123,13 @@ def _create_table( return tbl +@pytest.fixture(name="format_version",

Re: [PR] Support partial deletes [iceberg-python]

2024-05-02 Thread via GitHub
Fokko commented on code in PR #569: URL: https://github.com/apache/iceberg-python/pull/569#discussion_r1588290938 ## pyiceberg/table/__init__.py: ## @@ -434,6 +456,9 @@ def overwrite( if table_arrow_schema != df.schema: df = df.cast(table_arrow_schema) +

Re: [PR] Support partial deletes [iceberg-python]

2024-05-02 Thread via GitHub
Fokko commented on code in PR #569: URL: https://github.com/apache/iceberg-python/pull/569#discussion_r1588286537 ## pyiceberg/table/__init__.py: ## @@ -292,7 +303,13 @@ def _apply(self, updates: Tuple[TableUpdate, ...], requirements: Tuple[TableRequ

Re: [PR] Support partial deletes [iceberg-python]

2024-05-02 Thread via GitHub
Fokko commented on code in PR #569: URL: https://github.com/apache/iceberg-python/pull/569#discussion_r1588266506 ## pyiceberg/table/__init__.py: ## @@ -235,6 +242,10 @@ class TableProperties: WRITE_PARTITION_SUMMARY_LIMIT = "write.summary.partition-limit"

Re: [I] Compatibility issues with `org.apache.iceberg:iceberg-spark-runtime-3.5_2.13:1.5.0` [iceberg-rust]

2024-05-02 Thread via GitHub
Fokko commented on issue #338: URL: https://github.com/apache/iceberg-rust/issues/338#issuecomment-2091358002 Can you share the metadata JSON? I don't think the field ID resolution is being applied, described in issue https://github.com/apache/iceberg-rust/issues/353.

Re: [PR] Parquet: page skipping using filtered row groups for non-vectorized read [iceberg]

2024-05-02 Thread via GitHub
wypoon commented on PR #10228: URL: https://github.com/apache/iceberg/pull/10228#issuecomment-2091334137 @sunchao @chenjunjiedada you may be interested in this. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

Re: [PR] feat: add `ExpressionEvaluator` [iceberg-rust]

2024-05-02 Thread via GitHub
marvinlanhenke commented on code in PR #363: URL: https://github.com/apache/iceberg-rust/pull/363#discussion_r1588191095 ## crates/iceberg/src/expr/visitors/manifest_evaluator.rs: ## @@ -260,30 +260,6 @@ mod test { Ok((Arc::new(schema), Arc::new(spec))) } -

Re: [PR] [WIP] feat: Add `ExpressionEvaluator` [iceberg-rust]

2024-05-02 Thread via GitHub
marvinlanhenke commented on PR #363: URL: https://github.com/apache/iceberg-rust/pull/363#issuecomment-2091294085 @Fokko @liurenjie1024 @sdd PTAL. Implementation based on [pyiceberg](https://github.com/apache/iceberg-python/blob/main/pyiceberg/expressions/visitors.py#L461). -- This is

Re: [I] Failing to create a table using pyiceberg [iceberg-python]

2024-05-02 Thread via GitHub
syun64 commented on issue #692: URL: https://github.com/apache/iceberg-python/issues/692#issuecomment-2091269646 @salexln I think in order to write it, it uses `head_object` to check if the file exists first:

[I] SnapshotTableProcedure to migrate iceberg tables from one namespace to another [iceberg]

2024-05-02 Thread via GitHub
Gowthami03B opened a new issue, #10262: URL: https://github.com/apache/iceberg/issues/10262 ### Feature Request / Improvement Hello The current snapshot procedure (https://iceberg.apache.org/docs/nightly/spark-procedures/?h=spark_catalog#snapshot) seems to be helpful in only

Re: [PR] Add PrePlanTable and PlanTable Endpoints to open api spec [iceberg]

2024-05-02 Thread via GitHub
rahil-c commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1588018334 ## open-api/rest-catalog-open-api.yaml: ## @@ -2106,6 +2213,32 @@ components: items: $ref: '#/components/schemas/PartitionStatisticsFile' +

Re: [I] Failing to create a table using pyiceberg [iceberg-python]

2024-05-02 Thread via GitHub
salexln commented on issue #692: URL: https://github.com/apache/iceberg-python/issues/692#issuecomment-2091087047 @syun64 Does it try to write the `0-5df640cc-b47c-4b39-b578-07113565dab5.metadata.json` file or read it? If it is trying to write, then you might be right and this is

Re: [PR] Add Pagination To List Apis [iceberg]

2024-05-02 Thread via GitHub
rahil-c commented on code in PR #9782: URL: https://github.com/apache/iceberg/pull/9782#discussion_r1587974890 ## core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java: ## @@ -494,22 +515,30 @@ public void createNamespace( @Override public List

Re: [I] Compatibility issues with `org.apache.iceberg:iceberg-spark-runtime-3.5_2.13:1.5.0` [iceberg-rust]

2024-05-02 Thread via GitHub
a-agmon commented on issue #338: URL: https://github.com/apache/iceberg-rust/issues/338#issuecomment-2091052966 Strangely, I was working with the Rust API on tables generated by Spark with no such issue, but when I tried to port to Rust some code that deals with tables generated by Trino,

Re: [PR] Add Pagination To List Apis [iceberg]

2024-05-02 Thread via GitHub
rahil-c commented on code in PR #9782: URL: https://github.com/apache/iceberg/pull/9782#discussion_r1587968616 ## core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java: ## @@ -136,6 +137,7 @@ public class RESTSessionCatalog extends BaseViewSessionCatalog

Re: [PR] A new implementation of an Iceberg Sink [WIP] that will be used with upcoming Flink Compaction jobs [iceberg]

2024-05-02 Thread via GitHub
pvary commented on code in PR #10179: URL: https://github.com/apache/iceberg/pull/10179#discussion_r1587953911 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/sink/writer/IcebergStreamWriter.java: ## @@ -0,0 +1,154 @@ +/* + * Licensed to the Apache Software

Re: [PR] A new implementation of an Iceberg Sink [WIP] that will be used with upcoming Flink Compaction jobs [iceberg]

2024-05-02 Thread via GitHub
pvary commented on code in PR #10179: URL: https://github.com/apache/iceberg/pull/10179#discussion_r1587950378 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/sink/writer/BaseDeltaTaskWriter.java: ## @@ -0,0 +1,125 @@ +/* + * Licensed to the Apache Software

Re: [PR] A new implementation of an Iceberg Sink [WIP] that will be used with upcoming Flink Compaction jobs [iceberg]

2024-05-02 Thread via GitHub
pvary commented on code in PR #10179: URL: https://github.com/apache/iceberg/pull/10179#discussion_r1587945650 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/sink/committer/SinkCommitter.java: ## @@ -0,0 +1,444 @@ +/* + * Licensed to the Apache Software Foundation

Re: [PR] Make `add_files` to support `snapshot_properties` argument [iceberg-python]

2024-05-02 Thread via GitHub
syun64 commented on PR #695: URL: https://github.com/apache/iceberg-python/pull/695#issuecomment-2090959978 @enkidulan - thank you for working on this. I think https://github.com/apache/iceberg-python/blob/main/tests/integration/test_add_files.py is the right place for these tests --

Re: [I] Support `snapshot_properties` argument for `add_files` function [iceberg-python]

2024-05-02 Thread via GitHub
enkidulan commented on issue #694: URL: https://github.com/apache/iceberg-python/issues/694#issuecomment-2090956456 I've created a draft PR - https://github.com/apache/iceberg-python/pull/695 -- This is an automated message from the Apache Git Service. To respond to the message, please

Re: [I] Flink: Not Writing [iceberg]

2024-05-02 Thread via GitHub
beingRealFrank commented on issue #8916: URL: https://github.com/apache/iceberg/issues/8916#issuecomment-2090945914 I am seeing a very similar issue when trying to write to S3 from pyflink using a query like the following. ``` INSERT INTO iceberg.devdb.table_name SELECT * FROM

Re: [I] Support `snapshot_properties` argument for `add_files` function [iceberg-python]

2024-05-02 Thread via GitHub
Fokko commented on issue #694: URL: https://github.com/apache/iceberg-python/issues/694#issuecomment-2090942304 @enkidulan Thanks for raising this, and I agree! Are you interested in contributing to this feature? -- This is an automated message from the Apache Git Service. To respond to

[I] Support `snapshot_properties` argument for `add_files` function [iceberg-python]

2024-05-02 Thread via GitHub
enkidulan opened a new issue, #694: URL: https://github.com/apache/iceberg-python/issues/694 ### Feature Request / Improvement It would be great to make the interface more aligned with `append` and `overwrite` function, which support `snapshot_properties` argument. The logic also

Re: [PR] A new implementation of an Iceberg Sink [WIP] that will be used with upcoming Flink Compaction jobs [iceberg]

2024-05-02 Thread via GitHub
pvary commented on code in PR #10179: URL: https://github.com/apache/iceberg/pull/10179#discussion_r1587875343 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/sink/committer/SinkAggregator.java: ## @@ -0,0 +1,129 @@ +/* + * Licensed to the Apache Software Foundation

Re: [PR] A new implementation of an Iceberg Sink [WIP] that will be used with upcoming Flink Compaction jobs [iceberg]

2024-05-02 Thread via GitHub
pvary commented on code in PR #10179: URL: https://github.com/apache/iceberg/pull/10179#discussion_r1587870550 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/sink/committer/IcebergManifestOutputFileFactory.java: ## @@ -0,0 +1,80 @@ +/* + * Licensed to the Apache

Re: [PR] A new implementation of an Iceberg Sink [WIP] that will be used with upcoming Flink Compaction jobs [iceberg]

2024-05-02 Thread via GitHub
pvary commented on code in PR #10179: URL: https://github.com/apache/iceberg/pull/10179#discussion_r1587862893 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/sink/committer/IcebergFlinkManifestUtil.java: ## @@ -0,0 +1,127 @@ +/* + * Licensed to the Apache Software

Re: [PR] A new implementation of an Iceberg Sink [WIP] that will be used with upcoming Flink Compaction jobs [iceberg]

2024-05-02 Thread via GitHub
pvary commented on code in PR #10179: URL: https://github.com/apache/iceberg/pull/10179#discussion_r1587857184 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/sink/SimpleTableSupplier.java: ## @@ -0,0 +1,45 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] A new implementation of an Iceberg Sink [WIP] that will be used with upcoming Flink Compaction jobs [iceberg]

2024-05-02 Thread via GitHub
pvary commented on code in PR #10179: URL: https://github.com/apache/iceberg/pull/10179#discussion_r1587851360 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/sink/SimpleTableSupplier.java: ## @@ -0,0 +1,45 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] A new implementation of an Iceberg Sink [WIP] that will be used with upcoming Flink Compaction jobs [iceberg]

2024-05-02 Thread via GitHub
pvary commented on code in PR #10179: URL: https://github.com/apache/iceberg/pull/10179#discussion_r1587845482 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/sink/IcebergSink.java: ## @@ -0,0 +1,771 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

Re: [I] Failing to create a table using pyiceberg [iceberg-python]

2024-05-02 Thread via GitHub
syun64 commented on issue #692: URL: https://github.com/apache/iceberg-python/issues/692#issuecomment-2090796822 Hi @salexln - the "ACCESS_DENIED" error in the log trace you provided gives me the impression that you don't have HeadObject permissions in alex-iceberg-test-storage bucket for

Re: [PR] A new implementation of an Iceberg Sink [WIP] that will be used with upcoming Flink Compaction jobs [iceberg]

2024-05-02 Thread via GitHub
pvary commented on code in PR #10179: URL: https://github.com/apache/iceberg/pull/10179#discussion_r1587796859 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/sink/writer/RowDataTaskWriterFactory.java: ## @@ -0,0 +1,245 @@ +/* + * Licensed to the Apache Software

Re: [PR] A new implementation of an Iceberg Sink [WIP] that will be used with upcoming Flink Compaction jobs [iceberg]

2024-05-02 Thread via GitHub
pvary commented on code in PR #10179: URL: https://github.com/apache/iceberg/pull/10179#discussion_r1587797538 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/sink/CachingTableSupplier.java: ## @@ -33,7 +34,8 @@ * table loader should be used carefully when used

Re: [PR] fix minor version for striclty libs versions [iceberg]

2024-05-02 Thread via GitHub
nastra closed pull request #9886: fix minor version for striclty libs versions URL: https://github.com/apache/iceberg/pull/9886 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

Re: [PR] Build: Bump com.fasterxml.jackson.core:jackson-annotations from 2.16.0 to 2.16.1 [iceberg]

2024-05-02 Thread via GitHub
dependabot[bot] commented on PR #9375: URL: https://github.com/apache/iceberg/pull/9375#issuecomment-2090720301 Looks like com.fasterxml.jackson.core:jackson-annotations is no longer a dependency, so this is no longer needed. -- This is an automated message from the Apache Git Service.

Re: [PR] Build: Bump com.fasterxml.jackson.core:jackson-annotations from 2.16.0 to 2.16.1 [iceberg]

2024-05-02 Thread via GitHub
dependabot[bot] closed pull request #9375: Build: Bump com.fasterxml.jackson.core:jackson-annotations from 2.16.0 to 2.16.1 URL: https://github.com/apache/iceberg/pull/9375 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

Re: [PR] Build: Bump com.fasterxml.jackson.core:jackson-annotations from 2.16.0 to 2.16.1 [iceberg]

2024-05-02 Thread via GitHub
nastra commented on PR #9375: URL: https://github.com/apache/iceberg/pull/9375#issuecomment-2090716097 @dependabot rebase -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

Re: [I] Renaming of ConfVars Enums in Apache Hive breaks compatibility of HiveCatalog dependency in Apache Iceberg [iceberg]

2024-05-02 Thread via GitHub
dom93dd commented on issue #10254: URL: https://github.com/apache/iceberg/issues/10254#issuecomment-2090551374 @nastra Thank you! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [I] Renaming of ConfVars Enums in Apache Hive breaks compatibility of HiveCatalog dependency in Apache Iceberg [iceberg]

2024-05-02 Thread via GitHub
dom93dd closed issue #10254: Renaming of ConfVars Enums in Apache Hive breaks compatibility of HiveCatalog dependency in Apache Iceberg URL: https://github.com/apache/iceberg/issues/10254 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

Re: [PR] [WIP] POC of runtime module [iceberg-rust]

2024-05-02 Thread via GitHub
odysa commented on PR #233: URL: https://github.com/apache/iceberg-rust/pull/233#issuecomment-2090540873 Sorry, my bad. I thought I marked this PR ready for review, but I didn't remove the [WIP]. There are some code changes needed to resolve conflicts. Let me know if you have any

Re: [I] Add runtime module to enable concurrent load of manifest files. [iceberg-rust]

2024-05-02 Thread via GitHub
liurenjie1024 commented on issue #124: URL: https://github.com/apache/iceberg-rust/issues/124#issuecomment-2090490939 Maybe currently we don't need a `Runtime` trait? From what we have learned, we currently need two methods: 1. spawn 2. block_on I think the method

[I] Failing to create a table using pyiceberg [iceberg-python]

2024-05-02 Thread via GitHub
salexln opened a new issue, #692: URL: https://github.com/apache/iceberg-python/issues/692 ### Question I am running this code ```python glue_database_name = "alex_iceberg_test_db" glue_catalog_uri = "s3://alex-iceberg-test-storage" my_namespace = 'alex_db' #

Re: [PR] Implement table_exists method with try-catch for NoSuchTableError [iceberg-python]

2024-05-02 Thread via GitHub
MehulBatra commented on code in PR #678: URL: https://github.com/apache/iceberg-python/pull/678#discussion_r1587445403 ## pyiceberg/catalog/__init__.py: ## @@ -661,6 +661,14 @@ def create_table_transaction( ) def table_exists(self, identifier: Union[str,

Re: [PR] Build: Bump coverage from 7.4.4 to 7.5.0 [iceberg-python]

2024-05-02 Thread via GitHub
Fokko merged PR #688: URL: https://github.com/apache/iceberg-python/pull/688 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

Re: [PR] Core: Retry connections in JDBC catalog with user configured error code list [iceberg]

2024-05-02 Thread via GitHub
jbonofre commented on code in PR #10140: URL: https://github.com/apache/iceberg/pull/10140#discussion_r1587385129 ## core/src/main/java/org/apache/iceberg/jdbc/JdbcClientPool.java: ## @@ -43,8 +65,18 @@ public JdbcClientPool(String dbUrl, Map props) { } public

Re: [PR] Test out Parquet 1.14.0 [iceberg]

2024-05-02 Thread via GitHub
Fokko commented on code in PR #10209: URL: https://github.com/apache/iceberg/pull/10209#discussion_r1587327951 ## build.gradle: ## @@ -24,9 +24,12 @@ import java.util.regex.Pattern buildscript { repositories { gradlePluginPortal() +maven { + url =

Re: [I] Unable to load an iceberg table from aws glue catalog [iceberg-python]

2024-05-02 Thread via GitHub
anatol-ju commented on issue #515: URL: https://github.com/apache/iceberg-python/issues/515#issuecomment-2089981937 We have the same problem here. My manager and I tried to get it to work in parallel and both ran into the same error. We assumed it is a permission issue, but even with

Re: [I] Renaming of ConfVars Enums in Apache Hive breaks compatibility of HiveCatalog dependency in Apache Iceberg [iceberg]

2024-05-02 Thread via GitHub
nastra commented on issue #10254: URL: https://github.com/apache/iceberg/issues/10254#issuecomment-2089980598 For Hive < 4 there is [iceberg-hive-runtime](https://mvnrepository.com/artifact/org.apache.iceberg/iceberg-hive-runtime) (which is maintained by the Iceberg community) For Hive

Re: [PR] Core: Prevent duplicate data/delete files [iceberg]

2024-05-02 Thread via GitHub
nastra commented on code in PR #10007: URL: https://github.com/apache/iceberg/pull/10007#discussion_r1587260164 ## core/src/main/java/org/apache/iceberg/FastAppend.java: ## @@ -83,9 +85,13 @@ protected Map summary() { @Override public FastAppend appendFile(DataFile

Re: [I] [feature request] Improve integration test reliance on docker [iceberg-python]

2024-05-02 Thread via GitHub
AronsonDan commented on issue #637: URL: https://github.com/apache/iceberg-python/issues/637#issuecomment-2089891275 @kevinjqliu Did you guys consider using testcontainers? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

Re: [PR] Add Pagination To List Apis [iceberg]

2024-05-02 Thread via GitHub
nastra commented on code in PR #9782: URL: https://github.com/apache/iceberg/pull/9782#discussion_r1587208312 ## core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java: ## @@ -494,22 +515,30 @@ public void createNamespace( @Override public List

Re: [PR] Add Pagination To List Apis [iceberg]

2024-05-02 Thread via GitHub
nastra commented on code in PR #9782: URL: https://github.com/apache/iceberg/pull/9782#discussion_r1587206338 ## core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java: ## @@ -228,6 +230,12 @@ public void initialize(String name, Map unresolved) {

Re: [I] Spark: Dropping partition column from old partition table corrupts entire table [iceberg]

2024-05-02 Thread via GitHub
Fokko commented on issue #10234: URL: https://github.com/apache/iceberg/issues/10234#issuecomment-2089822067 Thanks for sharing the steps on how to reproduce this. I think the problem here is that it is not the current spec, but it is still in use. This looks like a bug indeed and should

Re: [PR] Add Pagination To List Apis [iceberg]

2024-05-02 Thread via GitHub
nastra commented on code in PR #9782: URL: https://github.com/apache/iceberg/pull/9782#discussion_r1587205432 ## core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java: ## @@ -136,6 +137,7 @@ public class RESTSessionCatalog extends BaseViewSessionCatalog private

Re: [I] Spark: Dropping partition column from old partition table corrupts entire table [iceberg]

2024-05-02 Thread via GitHub
Fokko commented on issue #10234: URL: https://github.com/apache/iceberg/issues/10234#issuecomment-2089818384 @EXPEbdodla To clarify, are you dropping a column that's part of the current partition spec? I did some work to fix this on previous partition specs:

Re: [PR] define snowflake catalog [iceberg-python]

2024-05-02 Thread via GitHub
prabodh1194 commented on PR #687: URL: https://github.com/apache/iceberg-python/pull/687#issuecomment-2089807730 I have added the Apache license to the file headers now. The failing check should start passing now. -- This is an automated message from the Apache Git Service. To respond to

Re: [I] REST Catalog to support custom-catalog name like HMS/Glue [iceberg]

2024-05-02 Thread via GitHub
jbonofre commented on issue #10205: URL: https://github.com/apache/iceberg/issues/10205#issuecomment-2089794429 @flyrain my understanding of the question is not on the namespace, but more at catalog level. Maybe I'm wrong :) -- This is an automated message from the Apache Git Service.

Re: [I] `iceberg-spark-runtime-3.3_2.12-1.5.1` seems to be compiled with a mismatched scala version [iceberg]

2024-05-02 Thread via GitHub
ajantha-bhat commented on issue #10251: URL: https://github.com/apache/iceberg/issues/10251#issuecomment-2089794121 @wForget, @kapkiai, @pan3793 : Feel free to test the Iceberg 1.5.2 staged artifacts I just tested with Nessie and it can work fine now