Fokko closed issue #338: Compatibility issues with
`org.apache.iceberg:iceberg-spark-runtime-3.5_2.13:1.5.0`
URL: https://github.com/apache/iceberg-rust/issues/338
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL
a-agmon commented on issue #338:
URL: https://github.com/apache/iceberg-rust/issues/338#issuecomment-2099788232
Added a PR that proposes an interim, but more elegant, solution to the
problem. I think.
WDYT @Fokko @zeodtr @liurenjie1024
a-agmon commented on issue #338:
URL: https://github.com/apache/iceberg-rust/issues/338#issuecomment-2097495570
Thanks @zeodtr,
We can certainly cache the manifest schema, and also recurse on the fields
read from the file.
Implementing a more efficient reader is also possible but
zeodtr commented on issue #338:
URL: https://github.com/apache/iceberg-rust/issues/338#issuecomment-2097101287
@a-agmon My concerns are as follows:
1. The `manifest_file_schema_fields` hashmap should be calculated only once
in an application's lifetime (for performance).
2. There are
Fokko commented on issue #338:
URL: https://github.com/apache/iceberg-rust/issues/338#issuecomment-2095451994
I think creating a field-id to a field-name map is a good (interim) solution.
Keep in mind that the next Avro release is planned for this week:
a-agmon commented on issue #338:
URL: https://github.com/apache/iceberg-rust/issues/338#issuecomment-2095268461
Thanks, @Fokko and @zeodtr, for the clarifications and explanations!
I think that it's important to fix this in the next release, as the current
situation is that the Rust API
zeodtr commented on issue #338:
URL: https://github.com/apache/iceberg-rust/issues/338#issuecomment-2093913951
@a-agmon Since the problem is in the schema, IMO checking the schema itself
before reading the record is more appropriate. And since the error could be the
other one (for example,
Fokko commented on issue #338:
URL: https://github.com/apache/iceberg-rust/issues/338#issuecomment-2092909489
> @Fokko if https://github.com/apache/iceberg-rust/issues/354 is applied,
> iceberg-rust will no longer be able to read the manifest list files created by
> pre-1.5.0 Spark and
a-agmon commented on issue #338:
URL: https://github.com/apache/iceberg-rust/issues/338#issuecomment-2092863958
Another way to resolve this, in a less workaround-ish way, is simply to
capture the fact that we have a V1 schema, a V2 schema, and a V2 compatibility
schema, which is identical
a-agmon commented on issue #338:
URL: https://github.com/apache/iceberg-rust/issues/338#issuecomment-2092089365
> Can you share the metadata JSON? I don't think the field ID resolution is
being applied, described in issue #353. `added_data_files_count` is the old
name since in V2 it also
zeodtr commented on issue #338:
URL: https://github.com/apache/iceberg-rust/issues/338#issuecomment-2091929341
@Fokko if #354 is applied, iceberg-rust will no longer be able to read the
manifest list files created by pre-1.5.0 Spark and pre-#354 iceberg-rust, since
iceberg-rust does not
Fokko commented on issue #338:
URL: https://github.com/apache/iceberg-rust/issues/338#issuecomment-2091358002
Can you share the metadata JSON? I don't think the field ID resolution is
being applied, described in issue
https://github.com/apache/iceberg-rust/issues/353.
a-agmon commented on issue #338:
URL: https://github.com/apache/iceberg-rust/issues/338#issuecomment-2091052966
Strangely, I was working with the Rust API on tables generated by Spark with
no such issue, but when I tried to port to Rust some code that deals with
tables generated by Trino,
martin-g commented on issue #338:
URL: https://github.com/apache/iceberg-rust/issues/338#issuecomment-2074220391
There were some discussions about doing separate SDK releases -
https://lists.apache.org/thread/2rfnszd4dk36jxynpj382b1717gbyv1y but nothing
happened mainly due to the lack of
martin-g commented on issue #338:
URL: https://github.com/apache/iceberg-rust/issues/338#issuecomment-2066328840
> @martin-g Do you have any ETA on Avro Rust 0.17?
The Rust SDK is released with all other SDKs, i.e. when 1.12.0/1.11.4 is
released.
Fokko commented on issue #338:
URL: https://github.com/apache/iceberg-rust/issues/338#issuecomment-2066312852
@zeodtr Thanks for raising this issue. Looks like we need some proper
Spark/Rust integration tests :)
@martin-g Do you have any ETA on Avro Rust 0.17?