
Re: [scikit-learn] Need help in dealing with large dataset

2018-03-05 Thread Sebastian Raschka

As Guillaume suggested, you don't want to load the whole array into memory if it's that large. There are many different ways to deal with this. The most naive way would be to break up your NumPy array into smaller NumPy arrays and load them iteratively with a running accuracy calculation
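To make the chunking idea concrete, here is a minimal sketch, not a definitive implementation. It assumes the data has already been split into several `.npz` files on disk and that some already-trained `predict` function exists; the names `running_accuracy`, `toy_predict`, and the `chunk_*.npz` file layout are hypothetical and only serve the illustration.

```python
import os
import tempfile

import numpy as np


def running_accuracy(predict, chunk_paths):
    """Load one chunk file at a time and accumulate the accuracy."""
    n_correct = 0
    n_seen = 0
    for path in chunk_paths:
        # Only the current chunk is held in memory.
        with np.load(path) as data:
            X_chunk = data["X"]
            y_chunk = data["y"]
        n_correct += int(np.sum(predict(X_chunk) == y_chunk))
        n_seen += X_chunk.shape[0]
    return n_correct / n_seen


# Toy data standing in for the large array (assumption: labels depend
# only on the sign of the first feature).
rng = np.random.default_rng(0)
X_full = rng.normal(size=(5000, 4))
y_full = (X_full[:, 0] > 0).astype(int)

# Break the array into smaller arrays and write each one to its own file.
tmp_dir = tempfile.mkdtemp()
chunk_paths = []
parts = zip(np.array_split(X_full, 5), np.array_split(y_full, 5))
for i, (X_part, y_part) in enumerate(parts):
    path = os.path.join(tmp_dir, f"chunk_{i}.npz")
    np.savez(path, X=X_part, y=y_part)
    chunk_paths.append(path)


def toy_predict(X_chunk):
    # Hypothetical stand-in for a fitted model's predict method.
    return (X_chunk[:, 0] > 0).astype(int)


acc = running_accuracy(toy_predict, chunk_paths)
print(acc)
```

Keeping a count of correct predictions and a count of samples seen, rather than averaging per-chunk accuracies, gives the right overall number even when the chunks have different sizes.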

Re: [scikit-learn] Need help in dealing with large dataset

2018-03-05 Thread Guillaume Lemaître

If you work with deep nets, you need to check the utilities from the deep learning library. For instance, in Keras you should create a batch generator if you need to deal with a large dataset. In PyTorch you can use the DataLoader and the ImageFolder from torchvision, which manage the loading for you.
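As a rough illustration of what a batch generator does, here is a framework-agnostic sketch in plain NumPy. It is not the Keras or PyTorch API; the function name `batch_generator` and its parameters are hypothetical, and how such a generator is wired into a given library depends on that library's version.

```python
import numpy as np


def batch_generator(X, y, batch_size=32, shuffle=True, seed=0):
    """Yield (X_batch, y_batch) pairs covering the data once."""
    indices = np.arange(X.shape[0])
    if shuffle:
        rng = np.random.default_rng(seed)
        rng.shuffle(indices)
    for start in range(0, len(indices), batch_size):
        batch_idx = indices[start:start + batch_size]
        # Only the rows of the current batch are materialized here.
        yield X[batch_idx], y[batch_idx]


# Small toy arrays; in practice X and y could be disk-backed arrays.
X = np.arange(20, dtype=float).reshape(10, 2)
y = np.arange(10)

batches = list(batch_generator(X, y, batch_size=4))
for X_batch, y_batch in batches:
    print(X_batch.shape, y_batch.shape)
```

The last batch is smaller when the number of samples is not a multiple of `batch_size`; whether to keep or drop it is a design choice that depends on the training loop.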