[GitHub] [systemds] kykrueger commented on pull request #1847: [SYSTEMDS-2834] python IO benchmarking

via GitHub Sun, 25 Jun 2023 04:39:15 -0700


kykrueger commented on PR #1847:
URL: https://github.com/apache/systemds/pull/1847#issuecomment-1606051393


   @Baunsgaard 
   Initially I thought there was a problem with my benchmark for loading data 
with pandas, but the script seems to run correctly. Instead, it looks like the 
converter from pandas DataFrames to frames cannot load large datasets in a 
reasonable amount of time. Am I mistaken, or is this a known problem?
   
   I'm pretty sure I could come up with an easy fix by extending 
Py4JConverterUtils to take some sort of array format directly from pandas, or 
by dumping the pandas columns to numpy to reuse part of the existing conversion 
code for matrices. That would definitely go beyond the scope of this issue, though.
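
   To illustrate the numpy idea, here is a rough sketch of only the pandas 
side. The helper name `columns_to_numpy` is hypothetical and not part of 
SystemDS, and how Py4JConverterUtils would consume the resulting arrays is 
left open. It assumes one array per column is a usable transfer unit.

```python
import numpy as np
import pandas as pd


def columns_to_numpy(df: pd.DataFrame) -> dict:
    """Hypothetical helper: dump each pandas column to one NumPy array."""
    arrays = {}
    for name in df.columns:
        col = df[name]
        if pd.api.types.is_numeric_dtype(col.dtype):
            # Numeric columns: convert the whole column in one call
            # instead of touching individual cells.
            arrays[name] = np.ascontiguousarray(col.to_numpy())
        else:
            # Assumption: all non-numeric columns are passed on as strings.
            arrays[name] = col.astype(str).to_numpy(dtype=object)
    return arrays


df = pd.DataFrame({"id": [1, 2, 3], "name": ["a", "b", "c"]})
arrays = columns_to_numpy(df)
```

   Each column is converted with a single call, so the per-cell Python 
overhead that would likely dominate on large datasets is avoided on this side 
of the bridge.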
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

