> Doing the strided memory read here is not worse than doing it somewhere else.
What I'm actually saying is that we shouldn't do strided memory access at all. You can build contiguous buffers for each column while you are tokenizing. I'll spend some time working on this, flamegraphing and so on, so that I can convince myself of the best approach to tokenize-convert.

I have been unsatisfied with the pandas CSV reader for years, but was never motivated enough to do anything about it because of pandas's underlying internal problems. I'd like to turn this CSV reader into something that we can use for the next 20 years (!)

[ Full content available at: https://github.com/apache/arrow/pull/2576 ]
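
To make the "contiguous buffers per column" idea concrete, here is a rough sketch. It is not the code in this PR or anything from Arrow's C++ reader: the function names `tokenize_columnar` and `convert_int_column` are hypothetical, Python lists stand in for real contiguous byte buffers, and the splitting is deliberately naive (no quoting or escaping). The point is only that each field is appended to its column's buffer as it is tokenized, so the later conversion step walks one column sequentially instead of striding across rows.

```python
from typing import List


def tokenize_columnar(text: str, num_columns: int, delimiter: str = ",") -> List[List[str]]:
    """Split CSV text into one buffer of raw values per column.

    Hypothetical helper for illustration only: a list per column stands in
    for a contiguous buffer, and quoting/escaping is not handled.
    """
    columns: List[List[str]] = [[] for _ in range(num_columns)]
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line:
            continue  # skip blank lines
        fields = line.split(delimiter)
        if len(fields) != num_columns:
            raise ValueError(
                f"line {line_number}: expected {num_columns} fields, got {len(fields)}"
            )
        # Append each field to its own column buffer while tokenizing,
        # so no row-major intermediate is ever built.
        for column, field in zip(columns, fields):
            column.append(field)
    return columns


def convert_int_column(values: List[str]) -> List[int]:
    """Convert one column by reading its buffer sequentially (no strided access)."""
    return [int(value) for value in values]


sample = "1,a,10\n2,b,20\n3,c,30\n"
columns = tokenize_columnar(sample, num_columns=3)
ids = convert_int_column(columns[0])
```

With this layout, type conversion for a column touches only that column's buffer, which is the property I want to measure against the row-major alternative.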
