> You can build contiguous buffers for each column while you are tokenizing.

That's true. But then you're doing strided memory writes, and I'm not sure that's better. (edit: I realize it's not exactly "strided", though; perhaps more along the lines of "multiple contiguous streams accessed in a round-robin fashion"...)

[ Full content available at: https://github.com/apache/arrow/pull/2576 ]
