Re: [I] Compatibility issues with `org.apache.iceberg:iceberg-spark-runtime-3.5_2.13:1.5.0` [iceberg-rust]

2024-05-09 Thread via GitHub
Fokko closed issue #338: Compatibility issues with `org.apache.iceberg:iceberg-spark-runtime-3.5_2.13:1.5.0` URL: https://github.com/apache/iceberg-rust/issues/338 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

Re: [I] Compatibility issues with `org.apache.iceberg:iceberg-spark-runtime-3.5_2.13:1.5.0` [iceberg-rust]

2024-05-07 Thread via GitHub
a-agmon commented on issue #338: URL: https://github.com/apache/iceberg-rust/issues/338#issuecomment-2099788232 Added a PR that proposes an interim, but more elegant, solution to the problem. I think. WDYT @Fokko @zeodtr @liurenjie1024

Re: [I] Compatibility issues with `org.apache.iceberg:iceberg-spark-runtime-3.5_2.13:1.5.0` [iceberg-rust]

2024-05-06 Thread via GitHub
a-agmon commented on issue #338: URL: https://github.com/apache/iceberg-rust/issues/338#issuecomment-2097495570 Thanks @zeodtr, we can certainly cache the manifest schema, and also recurse on the fields read from the file. Implementing a more efficient reader is also possible but

Re: [I] Compatibility issues with `org.apache.iceberg:iceberg-spark-runtime-3.5_2.13:1.5.0` [iceberg-rust]

2024-05-06 Thread via GitHub
zeodtr commented on issue #338: URL: https://github.com/apache/iceberg-rust/issues/338#issuecomment-2097101287 @a-agmon My concerns are as follows: 1. The `manifest_file_schema_fields` hashmap should be calculated only once in an application's lifetime (for performance). 2. There are
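
[Editor's sketch] The first concern above (compute the hashmap once per application lifetime) could be met with a lazily initialised static. The following is a minimal sketch using only the standard library. The function name reuses `manifest_file_schema_fields` from the comment purely for illustration; its signature, the key/value types, and the three entries shown are assumptions, not the actual iceberg-rust code.

```rust
use std::collections::HashMap;
use std::sync::OnceLock;

/// Hypothetical accessor: returns a process-wide map from Iceberg field ID
/// to field name. The map is built on the first call and reused afterwards.
fn manifest_file_schema_fields() -> &'static HashMap<i32, &'static str> {
    // OnceLock runs the initialiser at most once, even across threads.
    static FIELDS: OnceLock<HashMap<i32, &'static str>> = OnceLock::new();
    FIELDS.get_or_init(|| {
        // Illustrative subset only; a real map would cover every field
        // of the manifest-list schema.
        HashMap::from([
            (504, "added_files_count"),
            (505, "existing_files_count"),
            (506, "deleted_files_count"),
        ])
    })
}
```

Every caller receives a reference to the same map, so the construction cost is paid only once, on first use.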

Re: [I] Compatibility issues with `org.apache.iceberg:iceberg-spark-runtime-3.5_2.13:1.5.0` [iceberg-rust]

2024-05-06 Thread via GitHub
Fokko commented on issue #338: URL: https://github.com/apache/iceberg-rust/issues/338#issuecomment-2095451994 I think creating a field-id to a field-name map is a good (interim) solution. Keep in mind that the next Avro release is planned for this week:
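
[Editor's sketch] One way to picture the field-id to field-name map suggested above: look up each field read from the file by its ID and substitute the name the reader expects, so that a renamed field such as `added_data_files_count` still resolves. This is a rough sketch, not the PR's implementation. `FileField`, `expected_names`, and `resolve_names` are hypothetical names, and which spelling the reader treats as canonical is an assumption.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a field parsed from a manifest-list Avro schema.
#[derive(Debug, Clone, PartialEq)]
struct FileField {
    field_id: i32,
    name: String,
}

/// Names the reader expects, keyed by field ID (illustrative subset).
fn expected_names() -> HashMap<i32, &'static str> {
    HashMap::from([
        (504, "added_files_count"),
        (505, "existing_files_count"),
        (506, "deleted_files_count"),
    ])
}

/// Rewrites each field's name to the expected one when its ID is known;
/// fields with unknown IDs keep the name found in the file.
fn resolve_names(fields: &[FileField]) -> Vec<FileField> {
    let expected = expected_names();
    fields
        .iter()
        .map(|f| FileField {
            field_id: f.field_id,
            name: expected
                .get(&f.field_id)
                .map(|n| (*n).to_string())
                .unwrap_or_else(|| f.name.clone()),
        })
        .collect()
}
```

Because the lookup is keyed on the ID rather than the name, files written with either the old or the new field names resolve to the same result.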

Re: [I] Compatibility issues with `org.apache.iceberg:iceberg-spark-runtime-3.5_2.13:1.5.0` [iceberg-rust]

2024-05-06 Thread via GitHub
a-agmon commented on issue #338: URL: https://github.com/apache/iceberg-rust/issues/338#issuecomment-2095268461 Thanks, @Fokko and @zeodtr, for the clarifications and explanations! I think that it's important to fix this in the next release, as the current situation is that the Rust API

Re: [I] Compatibility issues with `org.apache.iceberg:iceberg-spark-runtime-3.5_2.13:1.5.0` [iceberg-rust]

2024-05-03 Thread via GitHub
zeodtr commented on issue #338: URL: https://github.com/apache/iceberg-rust/issues/338#issuecomment-2093913951 @a-agmon Since the problem is in the schema, IMO checking the schema itself before reading the record is more appropriate. And since the error could be the other one (for example,

Re: [I] Compatibility issues with `org.apache.iceberg:iceberg-spark-runtime-3.5_2.13:1.5.0` [iceberg-rust]

2024-05-03 Thread via GitHub
Fokko commented on issue #338: URL: https://github.com/apache/iceberg-rust/issues/338#issuecomment-2092909489 > @Fokko if https://github.com/apache/iceberg-rust/issues/354 is applied, iceberg-rust will no longer be able to read the manifest list files created by pre-1.5.0 Spark and

Re: [I] Compatibility issues with `org.apache.iceberg:iceberg-spark-runtime-3.5_2.13:1.5.0` [iceberg-rust]

2024-05-03 Thread via GitHub
a-agmon commented on issue #338: URL: https://github.com/apache/iceberg-rust/issues/338#issuecomment-2092863958 Another way to resolve this, in a less workaround-ish way, is simply to capture the fact that we have a V1 schema, a V2 schema, and a V2 compatibility schema, which is identical

Re: [I] Compatibility issues with `org.apache.iceberg:iceberg-spark-runtime-3.5_2.13:1.5.0` [iceberg-rust]

2024-05-02 Thread via GitHub
a-agmon commented on issue #338: URL: https://github.com/apache/iceberg-rust/issues/338#issuecomment-2092089365 > Can you share the metadata JSON? I don't think the field ID resolution is being applied, described in issue #353. `added_data_files_count` is the old name since in V2 it also

Re: [I] Compatibility issues with `org.apache.iceberg:iceberg-spark-runtime-3.5_2.13:1.5.0` [iceberg-rust]

2024-05-02 Thread via GitHub
zeodtr commented on issue #338: URL: https://github.com/apache/iceberg-rust/issues/338#issuecomment-2091929341 @Fokko if #354 is applied, iceberg-rust will no longer be able to read the manifest list files created by pre-1.5.0 Spark and pre-#354 iceberg-rust, since iceberg-rust does not

Re: [I] Compatibility issues with `org.apache.iceberg:iceberg-spark-runtime-3.5_2.13:1.5.0` [iceberg-rust]

2024-05-02 Thread via GitHub
Fokko commented on issue #338: URL: https://github.com/apache/iceberg-rust/issues/338#issuecomment-2091358002 Can you share the metadata JSON? I don't think the field ID resolution is being applied, described in issue https://github.com/apache/iceberg-rust/issues/353.

Re: [I] Compatibility issues with `org.apache.iceberg:iceberg-spark-runtime-3.5_2.13:1.5.0` [iceberg-rust]

2024-05-02 Thread via GitHub
a-agmon commented on issue #338: URL: https://github.com/apache/iceberg-rust/issues/338#issuecomment-2091052966 Strangely, I was working with the Rust API on tables generated by Spark with no such issue, but when I tried to port to Rust some code that deals with tables generated by Trino,

Re: [I] Compatibility issues with `org.apache.iceberg:iceberg-spark-runtime-3.5_2.13:1.5.0` [iceberg-rust]

2024-04-24 Thread via GitHub
martin-g commented on issue #338: URL: https://github.com/apache/iceberg-rust/issues/338#issuecomment-2074220391 There were some discussions about doing separate SDK releases - https://lists.apache.org/thread/2rfnszd4dk36jxynpj382b1717gbyv1y but nothing happened mainly due to the lack of

Re: [I] Compatibility issues with `org.apache.iceberg:iceberg-spark-runtime-3.5_2.13:1.5.0` [iceberg-rust]

2024-04-19 Thread via GitHub
martin-g commented on issue #338: URL: https://github.com/apache/iceberg-rust/issues/338#issuecomment-2066328840 > @martin-g Do you have any ETA on Avro Rust 0.17? The Rust SDK is released with all other SDKs, i.e. when 1.12.0/1.11.4 is released.

Re: [I] Compatibility issues with `org.apache.iceberg:iceberg-spark-runtime-3.5_2.13:1.5.0` [iceberg-rust]

2024-04-19 Thread via GitHub
Fokko commented on issue #338: URL: https://github.com/apache/iceberg-rust/issues/338#issuecomment-2066312852 @zeodtr Thanks for raising this issue. Looks like we need some proper Spark/Rust integration tests :) @martin-g Do you have any ETA on Avro Rust 0.17?