TriLoo opened a new issue #12050: How to get the `.csv` file from `.mat` for training properly?
URL: https://github.com/apache/incubator-mxnet/issues/12050

I have the `NYUv2 Dataset` in `.mat` format (~3GB), which includes training data named `images`, `depths`, `labels`, *etc.* (1449 samples for each item). Now I want to convert it to `.csv` format and then use `CSVIter` to avoid reading the whole training data at once.

What I did is iteratively read in one *image* and one *depth* sample, then use `ravel()` and `numpy.concatenate` on them to get a row vector. Finally, I write this 1D vector into a `.csv` file using `csv.writerrow()`.

However, the original `.mat` file is only ~3GB, and the `.csv` file turned out to be more than 8GB. What happened in this process? Stranger still, if I write the `image` and `depth` data into two separate `.csv` files and do not flatten them, the `.csv` files are only ~5MB. All data are of type `float32`.
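For reference, here is a minimal sketch of the conversion described above. It uses small random arrays as stand-ins for one NYUv2 image and depth sample. The array shapes, the helper names `sample_to_row` and `csv_size_in_bytes`, and the in-memory buffer are assumptions for illustration, not the original script. It compares the binary size of one flattened `float32` row with the size of the same row written as CSV text.

```python
import csv
import io

import numpy as np


def sample_to_row(image, depth):
    """Flatten one image and one depth map into a single 1D float32 row."""
    return np.concatenate([image.ravel(), depth.ravel()]).astype(np.float32)


def csv_size_in_bytes(rows):
    """Write rows to an in-memory CSV and return the encoded size in bytes."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    for row in rows:
        writer.writerow(row)  # each value is written as decimal text
    return len(buffer.getvalue().encode("utf-8"))


# Hypothetical stand-ins for one sample; real NYUv2 samples are much larger.
rng = np.random.default_rng(0)
image = rng.random((4, 6, 3), dtype=np.float32)
depth = rng.random((4, 6), dtype=np.float32)

row = sample_to_row(image, depth)
binary_bytes = row.nbytes                 # 4 bytes per float32 value
text_bytes = csv_size_in_bytes([row])     # digits plus a separator per value

print("values per row:", row.size)
print("binary bytes:", binary_bytes)
print("csv text bytes:", text_bytes)
```

In this sketch each `float32` value takes 4 bytes in binary but several characters plus a comma as text, so the CSV row comes out larger than the array it was built from. Whether this fully accounts for the 3GB to 8GB growth reported above depends on how the data is stored in the `.mat` file, which this sketch does not inspect.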
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] With regards, Apache Git Services
