rich7420 commented on PR #680: URL: https://github.com/apache/mahout/pull/680#issuecomment-3614951502
Looks good, but the current implementation forces a memory copy (a Vec allocation) even when we want to use Arrow directly. We should refactor io.rs so that read_parquet_to_arrow is the base implementation, which would give the pipeline true zero-copy behavior.

Current: Disk -> Arrow -> Vec (copy) -> Arrow (copy) -> GPU
Needed: Disk -> Arrow -> Arrow (zero-copy reference) -> GPU (through pointer)

That is my understanding; please correct me if I'm wrong.
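
To make the difference between the two data paths concrete, here is a minimal std-only sketch. It does not use the real Arrow or Mahout APIs: `Column`, `via_vec`, and `via_reference` are hypothetical names, and an `Arc<[f64]>` stands in for an Arrow buffer. The assumption is only that an Arrow array, like an `Arc`, can be shared by reference counting, so handing it on does not move the underlying bytes.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for an Arrow column: an immutable,
/// reference-counted buffer (not the real Arrow type).
#[derive(Clone)]
pub struct Column {
    data: Arc<[f64]>,
}

impl Column {
    /// Builds a column from owned values. Converting a Vec into an
    /// Arc<[f64]> allocates a new buffer and copies the values into it.
    pub fn from_vec(values: Vec<f64>) -> Self {
        Column { data: Arc::from(values) }
    }

    /// Address of the first element; in the pipeline this is the kind of
    /// pointer that would be handed to the GPU upload step.
    pub fn as_ptr(&self) -> *const f64 {
        self.data.as_ptr()
    }

    /// Number of values in the column.
    pub fn len(&self) -> usize {
        self.data.len()
    }
}

/// Copying path: Arrow -> Vec (copy) -> Arrow (copy).
pub fn via_vec(col: &Column) -> Column {
    let tmp: Vec<f64> = col.data.to_vec(); // first copy
    Column::from_vec(tmp) // second copy into a fresh buffer
}

/// Zero-copy path: only the reference count is incremented,
/// so both handles point at the same memory.
pub fn via_reference(col: &Column) -> Column {
    col.clone()
}
```

With this sketch, `via_reference(&col).as_ptr()` equals `col.as_ptr()`, while `via_vec(&col).as_ptr()` points at a different allocation. That pointer identity is the property a read_parquet_to_arrow-based implementation would need to preserve.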
