Benoit Cantin created ARROW-17123:
-------------------------------------
Summary: [JS] Unable to open reader on .arrow file after fetch:
Uncaught (in promise) Error: Expected to read 1329865020 metadata bytes, but
only read 1123.
Key: ARROW-17123
URL: https://issues.apache.org/jira/browse/ARROW-17123
Project: Apache Arrow
Issue Type: Bug
Components: JavaScript
Affects Versions: 8.0.1
Reporter: Benoit Cantin
I created a file in the Arrow IPC format with the script given in the PyArrow
cookbook here:
[https://arrow.apache.org/cookbook/py/io.html#saving-arrow-arrays-to-disk]
In a Node.js application, this file can be read as follows:
{code:javascript}
const r = await RecordBatchReader.from(fs.createReadStream(filePath));
await r.open();
for (let i = 0; i < r.numRecordBatches; i++) {
  const rb = await r.readRecordBatch(i);
  if (rb !== null) {
    console.log(rb.numRows);
  }
}
{code}
However, this method loads the whole file into memory (is that a bug?), which
is not scalable.
To solve this scalability issue, I tried to load the data with fetch as
described in the
[README.md|https://github.com/apache/arrow/tree/master/js#load-data-with-fetch].
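For what it's worth, the reader can also be consumed incrementally with async iteration instead of indexing batches, which should avoid materializing the whole table at once. A minimal sketch, assuming the apache-arrow 8.x Node API ({{RecordBatchReader}} is async-iterable); {{filePath}} and {{countRows}} are illustrative names:

```javascript
// Sketch, assuming the apache-arrow 8.x Node API; filePath is a placeholder.
// const { RecordBatchReader } = require("apache-arrow");
// const fs = require("fs");

// Sum row counts by pulling record batches one at a time.
// `batches` can be any async iterable of record batches
// (i.e. objects exposing a numRows property).
async function countRows(batches) {
  let total = 0;
  for await (const batch of batches) {
    total += batch.numRows;
  }
  return total;
}

// Usage against a file (commented out; requires apache-arrow):
// const reader = await RecordBatchReader.from(fs.createReadStream(filePath));
// console.log(await countRows(reader));
```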
Both:
{code:javascript}
import { tableFromIPC } from "apache-arrow";

const table = await tableFromIPC(fetch(filePath));
console.table([...table]);
{code}
and
{code:javascript}
const r = await RecordBatchReader.from(await fetch(filePath));
await r.open();
{code}
fail with error:
Uncaught (in promise) Error: Expected to read 1329865020 metadata bytes, but
only read 1123.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)