guan404ming commented on PR #680: URL: https://github.com/apache/mahout/pull/680#issuecomment-3616332604
> @guan404ming Great work on the io refactor! Thanks!
>
> We need to upgrade our amplitude.rs because encode_from_parquet currently falls back to encode_chunked, which merges all chunks into one huge Vec on the CPU. This breaks zero-copy for large files.
>
> We should override encode_chunked in AmplitudeEncoder to stream chunks directly to the GPU without merging. What do you think? I could send a PR for this part.

Thanks! I had planned to send a follow-up PR for this but forgot to mention it in the PR description. You could definitely help with this, thanks!
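
For illustration, here is a minimal sketch of the difference between the merging fallback and a streaming override, as described in the quoted comment. It is not the actual Mahout code: `DeviceBuffer`, the `Encoder` trait, `MergingEncoder`, `StreamingEncoder`, and all signatures are hypothetical, and the "GPU" is simulated with a host `Vec` so the example runs anywhere. It assumes amplitude encoding means scaling the input to unit L2 norm, so the streaming path accumulates the sum of squares while uploading and scales once at the end.

```rust
/// Hypothetical stand-in for a GPU allocation; `data` simulates device memory.
struct DeviceBuffer {
    data: Vec<f64>,
    host_to_device_copies: usize,
}

impl DeviceBuffer {
    fn new() -> Self {
        DeviceBuffer { data: Vec::new(), host_to_device_copies: 0 }
    }

    /// Simulates one host-to-device copy of a single chunk.
    fn copy_from_host(&mut self, chunk: &[f64]) {
        self.data.extend_from_slice(chunk);
        self.host_to_device_copies += 1;
    }

    /// Simulates an in-place scaling kernel running on the device.
    fn scale(&mut self, factor: f64) {
        for x in &mut self.data {
            *x *= factor;
        }
    }
}

/// Hypothetical encoder trait; the real trait's signatures may differ.
trait Encoder {
    fn encode(&self, data: &[f64]) -> DeviceBuffer;

    /// Default path: merge every chunk into one host Vec, then encode it.
    fn encode_chunked<I: Iterator<Item = Vec<f64>>>(&self, chunks: I) -> DeviceBuffer {
        let merged: Vec<f64> = chunks.flatten().collect();
        self.encode(&merged)
    }
}

/// Uploads the whole slice in one copy, then normalizes it to unit L2 norm.
fn normalize_and_upload(data: &[f64]) -> DeviceBuffer {
    let norm = data.iter().map(|x| x * x).sum::<f64>().sqrt();
    let mut buf = DeviceBuffer::new();
    buf.copy_from_host(data);
    if norm > 0.0 {
        buf.scale(1.0 / norm);
    }
    buf
}

/// Relies on the default (merging) `encode_chunked`.
struct MergingEncoder;

impl Encoder for MergingEncoder {
    fn encode(&self, data: &[f64]) -> DeviceBuffer {
        normalize_and_upload(data)
    }
}

/// Overrides `encode_chunked` so no merged host Vec is ever built.
struct StreamingEncoder;

impl Encoder for StreamingEncoder {
    fn encode(&self, data: &[f64]) -> DeviceBuffer {
        normalize_and_upload(data)
    }

    fn encode_chunked<I: Iterator<Item = Vec<f64>>>(&self, chunks: I) -> DeviceBuffer {
        let mut buf = DeviceBuffer::new();
        let mut sum_sq = 0.0;
        for chunk in chunks {
            // Accumulate the norm while uploading; each chunk is dropped after its copy.
            sum_sq += chunk.iter().map(|x| x * x).sum::<f64>();
            buf.copy_from_host(&chunk);
        }
        let norm = sum_sq.sqrt();
        if norm > 0.0 {
            buf.scale(1.0 / norm);
        }
        buf
    }
}

fn main() {
    let chunks = vec![vec![3.0, 0.0], vec![0.0, 4.0]];
    let merged = MergingEncoder.encode_chunked(chunks.clone().into_iter());
    let streamed = StreamingEncoder.encode_chunked(chunks.into_iter());
    println!("merged copies: {}", merged.host_to_device_copies);
    println!("streamed copies: {}", streamed.host_to_device_copies);
    println!("streamed data: {:?}", streamed.data);
}
```

In this sketch the streaming path holds at most one chunk in host memory at a time, at the cost of one copy per chunk and a final scaling pass on the device. Whether the real encoder can defer normalization this way depends on how the actual kernels are structured.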
