[PR] test: add full types boundary consistency coverage [paimon-rust]

via GitHub Tue, 02 Jun 2026 00:50:46 -0700


QuakeWang opened a new pull request, #351:
URL: https://github.com/apache/paimon-rust/pull/351


   
   ### Purpose
   
   Linked issue: #258
   
   The existing Spark-provisioned full types test covered mixed Parquet, ORC, 
and Avro reads, but it used only ordinary non-null values and did not 
explicitly assert that the scan plan included all three data file formats. As a 
result, boundary values, top-level nulls, and selected nested null/empty 
collection cases were under-covered for Java/Spark-generated data files.
   
   This PR adds a focused consistency fixture and read test for those cases.
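
To illustrate why a pre-epoch date counts as a boundary case: Arrow's `Date32` stores a date as a signed count of days since 1970-01-01, so any earlier date decodes to a negative value. The sketch below is not code from this PR; `days_from_civil` is a hypothetical helper (a standard civil-calendar conversion) used only to show the sign behaviour a reader has to preserve.

```rust
/// Hypothetical helper: converts a proleptic Gregorian date to the number of
/// days since 1970-01-01, the representation used by Arrow's `Date32`.
fn days_from_civil(year: i64, month: i64, day: i64) -> i64 {
    // Treat January and February as the last months of the previous year,
    // so the leap day falls at the end of the shifted year.
    let y = if month <= 2 { year - 1 } else { year };
    // A 400-year era always contains exactly 146_097 days.
    let era = (if y >= 0 { y } else { y - 399 }) / 400;
    let year_of_era = y - era * 400; // 0..=399
    let shifted_month = (month + 9) % 12; // March = 0, February = 11
    let day_of_year = (153 * shifted_month + 2) / 5 + day - 1;
    let day_of_era = year_of_era * 365 + year_of_era / 4 - year_of_era / 100 + day_of_year;
    // 719_468 is the number of days from 0000-03-01 to 1970-01-01.
    era * 146_097 + day_of_era - 719_468
}
```

For example, `days_from_civil(1969, 12, 31)` is negative, which is the kind of value an unsigned or clamped decode path would get wrong.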
   
   ### Brief change log
   
     - Add `full_types_boundary_table` to the Spark provisioning script.
     - Write Java/Spark-generated Paimon data files in Parquet, ORC, and Avro 
formats.
     - Cover primitive min/max-style values, decimal boundaries, empty 
string/binary, pre-epoch date, timestamp micros, top-level nulls, nested nulls, 
and empty collections.
     - Add a Rust integration test that:
       - verifies the scan plan includes `parquet`, `orc`, and `avro` data 
files;
       - reads the table through `FileSystemCatalog`;
       - checks Arrow arrays and decoded row values.
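
The format-coverage assertion in the last item can be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the test added by this PR: `missing_formats` is a hypothetical helper, and it assumes the data file paths taken from the scan plan end in the format name as a file extension.

```rust
use std::collections::BTreeSet;
use std::path::Path;

/// Hypothetical helper: returns the required formats that do not appear as a
/// file extension in any of the given data file paths.
fn missing_formats<'a>(paths: &[&str], required: &[&'a str]) -> Vec<&'a str> {
    // Collect the lowercase extensions of all data files in the plan.
    let seen: BTreeSet<String> = paths
        .iter()
        .filter_map(|p| Path::new(p).extension())
        .filter_map(|ext| ext.to_str())
        .map(|ext| ext.to_ascii_lowercase())
        .collect();
    // Keep only the required formats that were never seen.
    required
        .iter()
        .copied()
        .filter(|fmt| !seen.contains(*fmt))
        .collect()
}
```

A test could then assert that `missing_formats(&paths, &["parquet", "orc", "avro"])` is empty, so a fixture that silently stops writing one format fails loudly instead of passing on the remaining two.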
   
   ### Tests
   
     - `cargo test -p paimon-integration-tests --test read_tables 
test_read_full_types_boundary_table -- --nocapture`
     - `cargo test -p paimon-integration-tests --test read_tables`
   
   ### API and Format
   
   <!-- Does this change affect API or storage format -->
   
   ### Documentation
   
   <!-- Does this change introduce a new feature or require documentation 
updates -->
   

