[GitHub] TriLoo opened a new issue #12050: How to get the `.csv` file from `.mat` for training properly?

GitBox Mon, 06 Aug 2018 08:13:38 -0700

TriLoo opened a new issue #12050: How to get the `.csv` file from `.mat` for 
training properly?
URL: https://github.com/apache/incubator-mxnet/issues/12050
 
 
   I have the `NYUv2 Dataset` in `.mat` format (~3GB), which includes the 
training data named `images`, `depths`, `labels`, *etc.* (1449 samples for 
each item). Now I want to convert it into `.csv` format and then use `CSVIter` 
to avoid reading the whole training data at once.
   
   What I have done is iteratively read in one *image* and one *depth* sample 
and then use `ravel()` and `numpy.concatenate` on them to get a row vector. 
Finally, I write this 1D vector into a `.csv` file using the `writerow()` 
method of `csv.writer`.
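
   A minimal sketch of the procedure described above, assuming the samples 
are already available as NumPy arrays. The function name `write_flat_rows`, 
the output file name, and the tiny synthetic arrays are hypothetical stand-ins 
for illustration; loading the real `.mat` file is not shown.

```python
import csv
import os
import tempfile

import numpy as np


def write_flat_rows(images, depths, path):
    """Write one CSV row per sample: flattened image, then flattened depth."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        for image, depth in zip(images, depths):
            # ravel() flattens each array; concatenate joins them into one 1D row.
            row = np.concatenate([image.ravel(), depth.ravel()])
            writer.writerow(row)


# Small synthetic stand-ins (assumption): 4 samples instead of 1449,
# and 6x8 pixels instead of the real image size.
rng = np.random.default_rng(0)
images = rng.random((4, 6, 8, 3), dtype=np.float32)
depths = rng.random((4, 6, 8), dtype=np.float32)

csv_path = os.path.join(tempfile.gettempdir(), "flat_samples.csv")
write_flat_rows(images, depths, csv_path)
```

   Each row then holds `6*8*3 + 6*8` values, so a reader of the file needs to 
know that split to separate the image part from the depth part again.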
   
   However, the original `.mat` file is only ~3GB, and the `.csv` file turned 
out to be more than 8GB. What happened in this process? Stranger still, if I 
write the `image` and `depth` data into two separate `.csv` files and do not 
flatten them, the `.csv` files are only ~5MB.  
   
   All data are of type `float32`.
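
   The size difference can be examined on a small scale. The sketch below 
compares the binary size of a `float32` array with the size of its CSV text, 
once flattened and once passed to `writerow()` without flattening. It assumes 
the unflattened case hands a multi-dimensional array directly to `writerow()`, 
which may not match what was actually done; the helper `csv_text` and the 
synthetic array are hypothetical.

```python
import csv
import io

import numpy as np


def csv_text(row):
    """Return the CSV text produced by writing a single row."""
    buf = io.StringIO()
    csv.writer(buf).writerow(row)
    return buf.getvalue()


# Synthetic stand-in for one image (assumption): 48 x 640 x 3 float32 values.
rng = np.random.default_rng(0)
image = rng.random((48, 640, 3), dtype=np.float32)

# Binary storage: 4 bytes per float32 value.
binary_bytes = image.nbytes

# Flattened: every value becomes a decimal string plus a separator.
flat_text = csv_text(image.ravel())
flat_bytes = len(flat_text)

# Not flattened: writerow() iterates over the first axis, so each field is
# str() of a 640 x 3 sub-array. NumPy abbreviates large arrays with "..."
# when printing, so most of the values never reach the file.
nested_text = csv_text(image)
nested_bytes = len(nested_text)

print("binary:", binary_bytes)
print("flattened CSV:", flat_bytes)
print("unflattened CSV:", nested_bytes)
```

   Under these assumptions the flattened text is larger than the binary data 
because each value takes several characters instead of 4 bytes, while the 
unflattened text is much smaller only because it is abbreviated and no longer 
contains the full data.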


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
