trxcllnt commented on PR #438:
URL: https://github.com/apache/arrow-js/pull/438#issuecomment-4522054313
V8 supports ArrayBuffers larger than 4GiB:
```shell
# Allocate a 64GiB ArrayBuffer
$ { node -p 'var a = new ArrayBuffer(2**36); setTimeout(() => {}, 2000); a.byteLength / 1024' & }; pid=$!; sleep 1; ps -p $pid -o vsz,rss,cmd; wait $pid
[1] 1431480
VSZ RSS CMD
68109056 41444 node -p var a = new ArrayBuffer(2**36); setTimeout(() => {}, 2000); a.byteLength / 1024
67108864
[1]+ Done node -p 'var a = new ArrayBuffer(2**36); setTimeout(() => {}, 2000); a.byteLength / 1024'
```
And it's not unusual for other language implementations to create
RecordBatches larger than 4GiB (e.g. by using
[`table.combine_chunks()`](https://arrow.apache.org/docs/python/generated/pyarrow.Table.html#pyarrow.Table.combine_chunks)
from Python).
That said, it's unlikely anyone needs to address 8192 TiB of memory. I
agree with your point that 53-bit ints are fine for indices, thanks for walking
me through that :sweat_smile:.
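For reference, the 8192 TiB figure can be checked with a few lines of arithmetic. This is only a sketch; the constant names are made up for illustration and do not come from the PR.

```javascript
// Integers are exactly representable in a JS number up to 2**53 - 1.
const maxSafeIndex = Number.MAX_SAFE_INTEGER;

// Illustrative constant: bytes in one TiB.
const BYTES_PER_TIB = 2 ** 40;

// 2**53 bytes expressed in TiB.
const addressableTiB = 2 ** 53 / BYTES_PER_TIB;

console.log(maxSafeIndex === 2 ** 53 - 1); // true
console.log(addressableTiB);               // 8192

// Past that point, adjacent integers are no longer distinguishable.
console.log(2 ** 53 === 2 ** 53 + 1);      // true
```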
> JS's per-ArrayBuffer cap means a List<Float64> tops out around 2^29
> elements - a quarter of the spec ceiling
Could you explain this more? A List's offsets are stored in an Int32Array, which
means its child should be able to support up to 2**31 - 1 individual elements.
How did you arrive at 2**29 for the max child vector length?
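To make the two numbers concrete, here is a rough sketch of the arithmetic. The 4 GiB per-buffer cap is purely an assumption used to show one way 2**29 could arise; it is not stated in the quoted comment, and the variable names are illustrative.

```javascript
// Largest offset an Int32Array slot can hold.
const INT32_MAX = 2 ** 31 - 1;
const offsets = new Int32Array([0, INT32_MAX]);
console.log(offsets[1]); // 2147483647

// One past the maximum wraps around to a negative value.
const wrapped = new Int32Array([2 ** 31])[0];
console.log(wrapped); // -2147483648

// ASSUMPTION (hypothetical): a 4 GiB cap on a single values buffer.
const assumedBufferCapBytes = 2 ** 32;
const float64PerBuffer = assumedBufferCapBytes / Float64Array.BYTES_PER_ELEMENT;

console.log(float64PerBuffer === 2 ** 29); // true
console.log(float64PerBuffer / 2 ** 31);   // 0.25, i.e. a quarter of 2**31
```

Under that assumption the limit would come from the byte size of the Float64 values buffer rather than from the Int32 offsets.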