swallez opened a new pull request, #44247: URL: https://github.com/apache/arrow/pull/44247
### Rationale for this change

`Table.toArray()` offers the convenience of using Arrow tables as ordinary object arrays. However, it allocates an array the size of the table that holds proxies to the table rows. With large tables, this array can consume a significant amount of memory. An array proxy that wraps the table and creates struct row proxies on demand avoids this allocation and provides an object array view on top of a table at almost zero memory cost.

### What changes are included in this PR?

This PR adds a new `Table.toArrayView()` method that returns an array proxy to the table rows. It complements `Table.toArray()` rather than replacing it, for the following reasons:

* This is a read-only view, while `toArray()` returns a plain mutable array. Replacing it would be a breaking change.
* The proxy adds some overhead compared to direct array access. Tests on iterator loops show that access via the array proxy takes 5 times longer than direct array access. This can be a concern for some applications, even though it should generally be negligible, and applications seeking maximum performance should avoid these convenience wrappers anyway.

Also fixes #30863 by using a singleton proxy handler for `StructRowProxyHandler`, which further reduces memory usage.

### Are these changes tested?

Yes. New tests are added to `js/test/unit/table/table-test.ts` for both `toArrayView()` and `toArray()`, which was previously untested.

### Are there any user-facing changes?

This adds a new `Table.toArrayView()` method.
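To illustrate the idea of an on-demand array view, here is a minimal sketch. It is not the code in this PR and does not use the Arrow JS classes: `ColumnTable`, `rowAt`, `viewHandler`, and the standalone `toArrayView` function are hypothetical names invented for the example, and rows are built as plain objects rather than struct row proxies. The single module-level handler is shared by every view, which mirrors the singleton-handler approach described above.

```typescript
// Hypothetical column-oriented table; this is NOT the Arrow JS `Table` class.
interface ColumnTable {
  numRows: number;
  columns: Record<string, ArrayLike<unknown>>;
}

type Row = Record<string, unknown>;

// Builds a row object on demand instead of storing one object per row.
function rowAt(table: ColumnTable, index: number): Row {
  const row: Row = {};
  for (const name of Object.keys(table.columns)) {
    row[name] = table.columns[name][index];
  }
  return row;
}

// One handler shared by all views: the table is the proxy target,
// so the handler itself holds no per-view state.
const viewHandler: ProxyHandler<ColumnTable> = {
  get(target, key) {
    if (key === "length") return target.numRows;
    if (key === Symbol.iterator) {
      // Hand-written iterator that creates each row lazily.
      return () => {
        let i = 0;
        return {
          next(): IteratorResult<Row> {
            return i < target.numRows
              ? { value: rowAt(target, i++), done: false }
              : { value: undefined, done: true };
          },
        };
      };
    }
    // Property keys arrive as strings, so match canonical array indices.
    if (typeof key === "string" && /^(0|[1-9]\d*)$/.test(key)) {
      const index = Number(key);
      return index < target.numRows ? rowAt(target, index) : undefined;
    }
    return undefined;
  },
  // The view is read-only: reject writes and deletes.
  set() {
    return false;
  },
  deleteProperty() {
    return false;
  },
};

// Returns a read-only, array-like view; no array of rows is allocated.
function toArrayView(table: ColumnTable): ArrayLike<Row> & Iterable<Row> {
  return new Proxy(table, viewHandler) as unknown as ArrayLike<Row> & Iterable<Row>;
}

const table: ColumnTable = {
  numRows: 3,
  columns: { id: [1, 2, 3], name: ["a", "b", "c"] },
};
const view = toArrayView(table);
console.log(view.length); // 3
console.log(view[1]); // row object with id 2 and name "b"
```

Every indexed access goes through the `get` trap and builds a new row, which is where the access overhead relative to a plain array comes from in a design like this one.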
