To clarify:
You have 2.3M samples
How many features?
How many active features on average per sample?
With 7k classes: is the problem multiclass or multilabel?
Have you tried limiting the depth of the forest? Have you tried embedding
your feature space into a smaller vector (pre-trained embeddings, hashing,
lda, PC
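
A rough sketch of the two suggestions above (capping the depth of the forest and projecting the features into a smaller dense space). The data here is a random toy stand-in, and the values `n_components=20` and `max_depth=8` are arbitrary assumptions chosen for illustration, not recommendations for the 2.3M-sample problem:

```python
import numpy as np
from scipy import sparse
from sklearn.decomposition import TruncatedSVD
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline

rng = np.random.RandomState(0)

# Toy stand-in data: 200 samples, 500 sparse features, 5 classes.
X = sparse.random(200, 500, density=0.02, format="csr", random_state=rng)
y = rng.randint(0, 5, size=200)

model = make_pipeline(
    # TruncatedSVD accepts sparse input and returns a small dense matrix.
    TruncatedSVD(n_components=20, random_state=0),
    # max_depth caps the size of every tree, which bounds memory use.
    RandomForestClassifier(
        n_estimators=50, max_depth=8, n_jobs=-1, random_state=0
    ),
)
model.fit(X, y)

# Reduced representation that the forest actually sees.
X_small = model.named_steps["truncatedsvd"].transform(X)
```

Both knobs trade accuracy for memory and speed, so they would need to be tuned on the real data.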
Ranjana,
have a look at this example
http://scikit-learn.org/stable/auto_examples/applications/plot_out_of_core_classification.html
Since you have a lot of RAM, you may not need to make the whole
classification pipeline out-of-core. A good start with your current code
could be to write a generator
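
As a minimal sketch of that idea, in the spirit of the linked example: a generator yields mini-batches, and an estimator that supports `partial_fit` is updated one batch at a time. The helper `iter_minibatches`, the toy data, and the choice of `HashingVectorizer` with `SGDClassifier` are assumptions made for illustration; your own data source and estimator would take their place.

```python
from itertools import islice

import numpy as np
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import SGDClassifier


def iter_minibatches(samples, batch_size):
    """Yield lists of at most batch_size items from any iterable."""
    iterator = iter(samples)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Toy stand-in for a large stream of (document, label) pairs.
toy_stream = [
    ("cheap pills buy now", 1),
    ("meeting moved to friday", 0),
    ("win money fast buy now", 1),
    ("lunch on friday with the team", 0),
] * 25

# HashingVectorizer is stateless, so it needs no fitting pass over the data.
vectorizer = HashingVectorizer(n_features=2**10)
clf = SGDClassifier(random_state=0)
all_classes = np.array([0, 1])  # partial_fit must know every class up front

for batch in iter_minibatches(toy_stream, batch_size=10):
    texts, labels = zip(*batch)
    X = vectorizer.transform(texts)
    clf.partial_fit(X, labels, classes=all_classes)
```

Only one batch is vectorized and held in memory at a time, so the same loop works when `toy_stream` is replaced by a generator that reads from disk.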
For neural network training, try one of TensorFlow, PyTorch, Chainer, or
MXNet. All of them parallelize the computations and can run them on
NVIDIA GPUs with CUDA.
Best regards,
Jeremiah
On Dec 20, 2017, at 11:45, Raphael C <drr...@gmail.com> wrote:
I believe tensorflow will do what you want.
Raphael
On 20 Dec 2017 16:43, "Luigi Lomasto" wrote:
> Hi all,
>
> I have a computational problem training my neural network. Can you
> tell me whether a parallel version of the MLP library exists?
___
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn
Thank you all for your interest!
To clarify the case, let me try to summarize what I'd like to put into the
pipeline as this sequence of steps:
#%%
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import DBSCAN
from sklear