Hi folks,

We are reaching out to better understand the performance of ArrowJS when it
comes to viewing large amounts of data (> 1M records) in the browser’s DOM.
Our backend (https://github.com/tenzir/vast) spits out record batches,
which we are accumulating in the frontend with a RecordBatchReader.

For now, we only want to render the data fast, line by line, with minimal
markup according to its types from the schema. We use a virtual scrolling
window to avoid overloading the DOM, that is, we lazily convert the record
batch data to DOM elements according to a scroll window defined by the
user. As the user scrolls, elements outside the window get removed and new
ones added.
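
To make the scroll window concrete, here is a minimal sketch of the kind of
computation involved. It is illustrative only: the function name, the fixed
row height, and the overscan parameter are assumptions made for this example,
not our actual code.

```typescript
// Hypothetical helper; all names and the fixed row height are assumptions.
interface RowWindow {
  first: number; // index of the first row to render (inclusive)
  last: number;  // index one past the last row to render (exclusive)
}

function visibleRows(
  scrollTop: number,      // pixels scrolled from the top of the list
  viewportHeight: number, // height of the visible area in pixels
  rowHeight: number,      // assumed constant height of one row in pixels
  totalRows: number,      // number of rows accumulated so far
  overscan = 5,           // extra rows rendered above and below the viewport
): RowWindow {
  const first = Math.max(0, Math.floor(scrollTop / rowHeight) - overscan);
  const last = Math.min(
    totalRows,
    Math.ceil((scrollTop + viewportHeight) / rowHeight) + overscan,
  );
  return { first, last };
}
```

Only rows in [first, last) need DOM elements; everything else stays in the
record batches until the user scrolls to it.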

The data consists of one or more Tables that we are pulling in through the
RecordBatchReader. We use the Async Iterator interface to go over the
record batches and convert them into rows. This API feels suboptimal for
our use cases, where we want random access to the data. Is there a
faster/better way to do this?
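
To illustrate the random access we have in mind, here is a rough sketch that
maps a global row index to a batch and an offset within it, using prefix sums
and a binary search. BatchLike is a hypothetical stand-in that assumes only a
row count per batch; BatchIndex and its methods are names invented for this
example and are not part of ArrowJS.

```typescript
// Minimal stand-in for a record batch; only a row count is assumed.
interface BatchLike {
  numRows: number;
}

// Hypothetical index over batches accumulated from the reader.
class BatchIndex<B extends BatchLike> {
  private batches: B[] = [];
  // starts[i] is the global index of the first row of batches[i].
  private starts: number[] = [];
  private total = 0;

  append(batch: B): void {
    if (batch.numRows === 0) return; // empty batches hold no rows to address
    this.batches.push(batch);
    this.starts.push(this.total);
    this.total += batch.numRows;
  }

  get length(): number {
    return this.total;
  }

  // Find the batch containing the global row, plus the row's offset in it.
  locate(row: number): { batch: B; offset: number } | undefined {
    if (!Number.isInteger(row) || row < 0 || row >= this.total) {
      return undefined;
    }
    let lo = 0;
    let hi = this.starts.length - 1;
    // Binary search for the last batch whose start is <= row.
    while (lo < hi) {
      const mid = (lo + hi + 1) >>> 1;
      if (this.starts[mid] <= row) lo = mid;
      else hi = mid - 1;
    }
    return { batch: this.batches[lo], offset: row - this.starts[lo] };
  }
}
```

With something like this, a scroll handler would only touch the batches that
overlap the visible window. Whether ArrowJS already offers an equivalent is
part of what we are asking.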

Does anyone have any experience worth sharing with doing something similar?
The DOM is the main bottleneck, but if there are some clever things we can
do with Arrow to pull out the data in the most efficient way, that would be
nice.

    Matthias
