On Wed, Oct 9, 2019 at 12:11 PM Andy Grove <[email protected]> wrote:
> I'm very interested in helping to find a solution to this because we really
> do need integration tests for Rust to make sure we're compatible with other
> implementations... there is also the ongoing CI dockerization work that I
> feel is related.
>
> I haven't looked at the current integration tests yet and would appreciate
> some pointers on how all of this works (do we have docs?) or where to start
> looking.
>

I have a test in my latest PR: https://github.com/apache/arrow/pull/5523
And here is the generated data: https://github.com/apache/arrow-testing/pull/11

As for the program that generates this data, it's just a simple Java program.
I'm not sure whether we need to integrate it into Arrow.

> I imagine the integration test could follow the approach that Renjie is
> outlining where we call Java to generate some files and then call Rust to
> parse them?
>
> Thanks,
>
> Andy.
>
> On Tue, Oct 8, 2019 at 9:48 PM Renjie Liu <[email protected]> wrote:
>
> > Hi:
> >
> > I'm developing a Rust version of a reader which reads Parquet into Arrow
> > arrays. To verify the correctness of this reader, I use the following
> > approach:
> >
> > 1. Define the schema with protobuf.
> > 2. Generate JSON data for this schema using another language with a more
> >    sophisticated implementation (e.g. Java).
> > 3. Generate Parquet data for this schema using another language with a
> >    more sophisticated implementation (e.g. Java).
> > 4. Write tests that read the JSON file and the Parquet file into memory
> >    (Arrow arrays), then compare the JSON data with the Arrow data.
> >
> > I think with this method we can guarantee the correctness of the Arrow
> > reader, because the JSON format is ubiquitous and its implementations are
> > more stable.
> >
> > Any comment is appreciated.

--
Renjie Liu
Software Engineer, MVAD
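
For illustration, the comparison in step 4 of the approach quoted above could look roughly like the sketch below. This is not the code from the PR. It assumes the JSON file has already been decoded into rows and the Parquet file into columns, and it uses plain vectors of `Option<i64>` as stand-ins for both. The `Row` and `Column` types and the function name `compare_rows_to_columns` are hypothetical, chosen only for this example.

```rust
// Stand-in types (assumptions for this sketch): a real test would hold
// values decoded by a JSON library for the rows and Arrow arrays for the
// columns. `None` represents a null value.
type Row = Vec<Option<i64>>; // one record, as decoded from the JSON file
type Column = Vec<Option<i64>>; // one column, as decoded from the Parquet file

/// Compares row-oriented expected data with column-oriented actual data.
/// Returns `Ok(())` if every cell matches, otherwise a description of the
/// first problem found.
fn compare_rows_to_columns(expected: &[Row], actual: &[Column]) -> Result<(), String> {
    // Every column must hold exactly one value per expected row.
    for (col_idx, column) in actual.iter().enumerate() {
        if column.len() != expected.len() {
            return Err(format!(
                "column {} has {} values, expected {} rows",
                col_idx,
                column.len(),
                expected.len()
            ));
        }
    }

    for (row_idx, row) in expected.iter().enumerate() {
        // Every row must hold exactly one field per column that was read.
        if row.len() != actual.len() {
            return Err(format!(
                "row {} has {} fields, but {} columns were read",
                row_idx,
                row.len(),
                actual.len()
            ));
        }
        // Cell (row_idx, col_idx) of the row data corresponds to
        // element row_idx of column col_idx.
        for (col_idx, want) in row.iter().enumerate() {
            let got = &actual[col_idx][row_idx];
            if got != want {
                return Err(format!(
                    "mismatch at row {}, column {}: expected {:?}, got {:?}",
                    row_idx, col_idx, want, got
                ));
            }
        }
    }
    Ok(())
}
```

A test would call this once per generated file pair and fail with the returned message, which names the first row and column that differ. Nested or non-integer types would need a richer value type than the one assumed here.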
