zhztheplayer commented on PR #4818: URL: https://github.com/apache/incubator-gluten/pull/4818#issuecomment-1984779010
> @zhztheplayer can you check how the memory is allocated during the conversion? Where the arrow memory is allocated? how many memcpy during the conversion? Is there onheap=>offheap copy?

@boneanxs If you'd like to address these questions as well, thanks.

I believe the patch reused our old `ArrowWritableColumnarVector` code to write Spark columnar data to native, so there should be a number of "onheap => offheap" copies. Ideally, we should count exactly how many copies the implementation makes. @boneanxs You can also check on this part.

What worries me is that `ArrowWritableColumnarVector` has not been under active maintenance for some time, so we should have more tests here, especially for complex data types. It would also be great if you could share your thoughts on the risk of memory leaks this approach may bring.

Overall the PR looks fine to me. We have removed most of the unsafe APIs, but there might still be some left. Let's check this part carefully too.
