[GitHub] [arrow] ashsharma96 opened a new issue, #13320: Memory Consumption while calling a feather file in python dataframe(Urgent)

GitBox Mon, 06 Jun 2022 08:08:32 -0700


ashsharma96 opened a new issue, #13320:
URL: https://github.com/apache/arrow/issues/13320


   Hey Apache Arrow Team,
   I'm working on a project that uses a Feather file. I saved a dataframe in 
Feather format on my AWS Jupyter instance under the name code.production. 
Whenever I read this file on the instance, it uses far too much RAM: reading 
it takes up 28 GB of RAM, even though the file itself is only 844 MB. The 
dataframe stored in the file has 4.2 million rows and 50 columns. This is my 
code:
   
       tFilename1 = 'code.production'
       df_stored = feather.read_dataframe(tFilename1)
   
   This is what my dataframe looks like:
   
![image](https://user-images.githubusercontent.com/17443937/172187264-fcd3cd01-af11-43d3-ac4b-ba40f073ac73.png)
   
   All columns shown in this dataframe hold lists, except the first two.
   The library versions on my AWS Jupyter instance are:
   
       arrow                              1.2.1
       feather                            0.1.2
       pyarrow                            6.0.1
   
   Could you help me figure out how to read a Feather file with less RAM? 
The memory consumption sometimes causes the kernel to die. 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

