tustvold commented on issue #1718: URL: https://github.com/apache/arrow-rs/issues/1718#issuecomment-1707240666
Option 1 is likely the most tractable: ArrowWriter already encodes columns to separate memory regions and then stitches the encoded column chunks together, and I could conceive of doing something similar for a parallel writer. The biggest question in my mind is the mechanics of parallelism. A naive solution might be to just spawn tokio tasks for each column of each batch, but this would have very poor thread locality and high per-batch overheads, and in general feels a little off... I don't have a good solution here; typically we have avoided adding notions of threading into this crate...
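
To make the "encode to separate memory regions, then stitch" idea concrete, here is a rough, std-only sketch. It is not the ArrowWriter API: `encode_column` and `write_row_group` are hypothetical names, the "encoding" is just little-endian bytes, and it uses `std::thread::scope` instead of tokio purely so the example is self-contained. Spawning one thread per column is itself a naive choice and does not answer the question of where threading should live.

```rust
use std::thread;

/// Hypothetical stand-in for a column encoder: writes each value as
/// little-endian bytes into a buffer owned by this column alone.
fn encode_column(values: &[i64]) -> Vec<u8> {
    let mut buf = Vec::with_capacity(values.len() * 8);
    for v in values {
        buf.extend_from_slice(&v.to_le_bytes());
    }
    buf
}

/// Encode every column on its own scoped thread, then stitch the chunks
/// together in column order. Returns the stitched bytes plus the
/// (offset, length) of each column chunk within them.
fn write_row_group(columns: &[Vec<i64>]) -> (Vec<u8>, Vec<(usize, usize)>) {
    let chunks: Vec<Vec<u8>> = thread::scope(|s| {
        // Spawn all encoders first so they can run concurrently.
        let handles: Vec<_> = columns
            .iter()
            .map(|col| s.spawn(move || encode_column(col)))
            .collect();
        // Join in spawn order, which preserves column order.
        handles
            .into_iter()
            .map(|h| h.join().expect("encoder thread panicked"))
            .collect()
    });

    // Stitching is sequential: append each chunk and record where it landed.
    let mut out = Vec::new();
    let mut index = Vec::with_capacity(chunks.len());
    for chunk in chunks {
        index.push((out.len(), chunk.len()));
        out.extend_from_slice(&chunk);
    }
    (out, index)
}

fn main() {
    let columns = vec![vec![1, 2, 3], vec![10, 20], vec![]];
    let (bytes, index) = write_row_group(&columns);
    println!("{} bytes, chunks at {:?}", bytes.len(), index);
}
```

Because each encoder owns its buffer, no locking is needed until the stitch step, and the stitch step only needs the chunks in column order. That property holds regardless of how the encoding work is actually scheduled.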
