Hi folks,

We are reaching out to better understand the performance of ArrowJS when viewing large amounts of data (> 1M records) in the browser’s DOM. Our backend (https://github.com/tenzir/vast) emits record batches, which we accumulate in the frontend with a RecordBatchReader.
As a first step, we only want to render the data fast, line by line, with minimal markup derived from the types in the schema. We use a virtual scrolling window to avoid overloading the DOM: we lazily convert the record batch data to DOM elements according to a scroll window defined by the user. As the user scrolls, elements outside the window are removed and new ones are added.

The data consists of one or more Tables that we pull in through the RecordBatchReader. We use the async iterator interface to go over the record batches and convert them into rows. This API feels suboptimal for our use case, where we want random access to the data.

Is there a faster or better way to do this? Does anyone have experience worth sharing from doing something similar? The DOM is the main bottleneck, but if there are clever things we can do with Arrow to pull out the data as efficiently as possible, that would be nice.

Matthias
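
P.S. To make the random-access question concrete, here is a minimal TypeScript sketch of the kind of lookup a scroll window needs. It is an illustration, not our actual code and not an ArrowJS API: `BatchLike` is a hypothetical stand-in that only assumes a batch exposes a row count and row-level access, and `BatchIndex`, `push`, `get`, and `slice` are names made up for this example. The idea is to record the global start offset of every batch as it arrives, so that a global row number can be mapped to a (batch, local row) pair with a binary search instead of re-iterating the stream.

```typescript
// Hypothetical stand-in for a record batch. Only a row count and
// row-level access are assumed; nothing here depends on ArrowJS itself.
interface BatchLike<Row> {
  numRows: number;
  get(index: number): Row | null;
}

// Accumulates batches and maps global row numbers to (batch, local row).
class BatchIndex<Row> {
  private batches: BatchLike<Row>[] = [];
  // starts[i] is the global row number of the first row in batches[i].
  private starts: number[] = [];
  private total = 0;

  get numRows(): number {
    return this.total;
  }

  // Append a batch as it arrives from the stream. Empty batches are skipped
  // so that every stored start offset is unique.
  push(batch: BatchLike<Row>): void {
    if (batch.numRows === 0) return;
    this.batches.push(batch);
    this.starts.push(this.total);
    this.total += batch.numRows;
  }

  // Binary search for the last batch whose start offset is <= row.
  private locate(row: number): number {
    let lo = 0;
    let hi = this.starts.length - 1;
    while (lo < hi) {
      const mid = (lo + hi + 1) >> 1;
      if (this.starts[mid] <= row) lo = mid;
      else hi = mid - 1;
    }
    return lo;
  }

  // Random access to a single row; returns null when out of range.
  get(row: number): Row | null {
    if (row < 0 || row >= this.total) return null;
    const b = this.locate(row);
    return this.batches[b].get(row - this.starts[b]);
  }

  // Rows in [start, end), clamped to the rows received so far.
  // This is what a virtual scroll window would ask for.
  slice(start: number, end: number): Row[] {
    const out: Row[] = [];
    const from = Math.max(0, start);
    const to = Math.min(this.total, end);
    if (from >= to) return out;
    let b = this.locate(from);
    for (let row = from; row < to; row++) {
      // Advance to the next batch once the current one is exhausted.
      while (b + 1 < this.starts.length && this.starts[b + 1] <= row) b++;
      const value = this.batches[b].get(row - this.starts[b]);
      if (value !== null) out.push(value);
    }
    return out;
  }
}
```

With something like this, the scroll handler would only call `slice(firstVisible, lastVisible)` and build DOM nodes for the returned rows. Whether ArrowJS already offers an equivalent, cheaper path is exactly what we would like to learn.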