Hello!
I've used Arrow a fair amount in Python and JS, but I'm pretty new to Rust. I'm trying to write a minimal WebAssembly binding for Rust's Parquet implementation in order to decode Parquet files to Arrow on the web. I have code <https://github.com/kylebarron/parquet-wasm/blob/main/src/lib.rs> that works, but only some of the time.
For example, this test data <https://github.com/kylebarron/parquet-wasm/blob/9495a87e00ae7073966d171bdcbfa1b87c63991b/data/works.parquet> (created here <https://github.com/kylebarron/parquet-wasm/blob/9495a87e00ae7073966d171bdcbfa1b87c63991b/data/generate_data.py#L40-L43>) seems to work with the JS arrow.RecordBatchReader <https://github.com/kylebarron/parquet-wasm/blob/79580c64c698570fd1a8a48b55698ca0be630aa8/www/index.js#L50-L52>, but other test data <https://github.com/kylebarron/parquet-wasm/blob/9495a87e00ae7073966d171bdcbfa1b87c63991b/data/not_work.parquet> (created here <https://github.com/kylebarron/parquet-wasm/blob/79580c64c698570fd1a8a48b55698ca0be630aa8/data/generate_data.py#L45-L48>) fails with "Error: Expected to read 1249648 metadata bytes, but only read 300.".
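For context on that error: as I understand the Arrow IPC streaming format, each message starts with a 4-byte continuation marker (0xFFFFFFFF) followed by a 4-byte little-endian metadata length, while the IPC file format starts with the "ARROW1" magic instead. Below is a rough, standard-library-only sketch of a check that could be run on the output buffer on the Rust side before it is handed to JS. The names `IpcHeader` and `inspect_ipc_header` are hypothetical and not part of any Arrow crate, and the framing details are my reading of the format rather than something verified against this project.

```rust
/// What the first bytes of a buffer look like, under the framing
/// assumptions stated above (hypothetical helper type).
#[derive(Debug, PartialEq)]
enum IpcHeader {
    /// Buffer starts with the "ARROW1" magic: IPC *file* format, not a stream.
    FileMagic,
    /// Continuation marker found; `metadata_len` is the declared length.
    Stream { metadata_len: i32 },
    /// No continuation marker; the first 4 bytes are read as the length,
    /// which is how a reader expecting the older framing would see them.
    Legacy { metadata_len: i32 },
    /// Fewer than 8 bytes, so no header can be decoded.
    TooShort,
}

/// Classify the start of `buf` without parsing the flatbuffer metadata.
fn inspect_ipc_header(buf: &[u8]) -> IpcHeader {
    if buf.len() >= 6 && &buf[..6] == b"ARROW1" {
        return IpcHeader::FileMagic;
    }
    if buf.len() < 8 {
        return IpcHeader::TooShort;
    }
    let first = [buf[0], buf[1], buf[2], buf[3]];
    if first == [0xFF; 4] {
        // Length prefix follows the continuation marker.
        let len = i32::from_le_bytes([buf[4], buf[5], buf[6], buf[7]]);
        IpcHeader::Stream { metadata_len: len }
    } else {
        IpcHeader::Legacy {
            metadata_len: i32::from_le_bytes(first),
        }
    }
}
```

If the length decoded on the Rust side looks sensible but JS reports a much larger number, that would suggest the bytes JS reads are not the bytes the writer produced, which narrows the search to the hand-off between WebAssembly memory and JS.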
Based on logging, it *seems* that parsing the Parquet file goes smoothly. Only writing the Arrow IPC format fails (on the JS side, when trying to verify it). I currently create the StreamWriter <https://github.com/kylebarron/parquet-wasm/blob/79580c64c698570fd1a8a48b55698ca0be630aa8/src/lib.rs#L122-L123>, write all the Arrow RecordBatches into the writer <https://github.com/kylebarron/parquet-wasm/blob/79580c64c698570fd1a8a48b55698ca0be630aa8/src/lib.rs#L127-L128>, finish the writer <https://github.com/kylebarron/parquet-wasm/blob/79580c64c698570fd1a8a48b55698ca0be630aa8/src/lib.rs#L142>, and send the output back to JS <https://github.com/kylebarron/parquet-wasm/blob/79580c64c698570fd1a8a48b55698ca0be630aa8/src/lib.rs#L145-L156>.
Has anyone seen a similar problem before, or does anyone have suggestions for where to debug further? Alternatively, if an end-to-end example exists of reading a Parquet file and returning an Arrow buffer, that would be very helpful to see.
Best,
Kyle Barron