Re: [scikit-learn] Generate data from trained naive bayes

2016-10-03 Thread klo uo
Great. Thanks for your time, Manoj. Cheers, Klo On Mon, Oct 3, 2016 at 8:20 PM, Manoj Kumar wrote: > Let's say you would like to generate just the first feature of 1000 > samples with label 0. > > The distribution of the first feature conditioned on label 0 follows a > Bernoulli distribution (

Re: [scikit-learn] Generate data from trained naive bayes

2016-10-03 Thread Manoj Kumar
Let's say you would like to generate just the first feature of 1000 samples with label 0. The distribution of the first feature conditioned on label 0 follows a Bernoulli distribution (as suggested by the name) with parameter "exp(feature_log_prob_[0, 0])". You could then generate the first featur
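
A minimal sketch of the step described above, assuming a `BernoulliNB` fitted on made-up binary data (the toy `X`, `y`, and the seed are illustrative and not from the thread). Row 0 of `feature_log_prob_` corresponds to `clf.classes_[0]`, which is label 0 here.

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB

rng = np.random.RandomState(0)

# Hypothetical toy data: 200 samples, 5 binary features, labels 0/1.
X = rng.randint(2, size=(200, 5))
y = rng.randint(2, size=200)
clf = BernoulliNB().fit(X, y)

# P(first feature = 1 | first class), recovered from the stored log-probability.
p = np.exp(clf.feature_log_prob_[0, 0])

# Draw the first feature for 1000 samples of that class.
first_feature = rng.binomial(1, p, size=1000)
```

The empirical mean of `first_feature` should be close to `p`, up to sampling noise.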

Re: [scikit-learn] Generate data from trained naive bayes

2016-10-03 Thread klo uo
Hi Manoj, thanks for your reply. Sorry to say, but I don't understand how to generate a new feature. In this example I have `X` with shape (1000, 64) with 5 unique classes. `feature_log_prob_` has shape (5, 64). I can generate, for example, uniform data with `r = np.random.rand(64)`. Now how can I gen

Re: [scikit-learn] Generate data from trained naive bayes

2016-10-03 Thread Manoj Kumar
Hi, feature_log_prob_ is an array of shape (n_classes, n_features). exp(feature_log_prob_[class_ind, feature_ind]) gives P(X_{feature_ind} = 1 | class_ind). Using the conditional independence assumptions of naive Bayes, you can use this to sample each feature independently given the class. Hope t
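
One way to sketch sampling all features at once for a chosen class, thresholding uniform noise against the per-feature probabilities. The helper name `sample_given_class`, its `seed` parameter, and the random toy data (shaped like the (1000, 64), 5-class example mentioned in the thread) are assumptions for illustration.

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB


def sample_given_class(clf, class_ind, n_samples, seed=None):
    """Draw binary feature vectors for the class at row `class_ind`.

    Hypothetical helper: each feature is sampled independently, as the
    naive Bayes conditional independence assumption permits.
    """
    rng = np.random.RandomState(seed)
    # P(X_j = 1 | class) for every feature j, shape (n_features,).
    p = np.exp(clf.feature_log_prob_[class_ind])
    # A uniform draw below p[j] yields 1 with probability p[j].
    u = rng.uniform(size=(n_samples, p.shape[0]))
    return (u < p).astype(int)


# Toy stand-in for the data described in the thread.
rng = np.random.RandomState(0)
X = rng.randint(2, size=(1000, 64))
y = rng.randint(5, size=1000)
clf = BernoulliNB().fit(X, y)

samples = sample_given_class(clf, class_ind=2, n_samples=10, seed=0)
```

Note that `class_ind` indexes rows of `feature_log_prob_`; the matching label is `clf.classes_[class_ind]`.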

Re: [scikit-learn] Generate data from trained naive bayes

2016-10-03 Thread klo uo
On Mon, Oct 3, 2016 at 5:08 PM, klo uo wrote: > I can see how can I sample from `feature_log_prob_`... > I meant I cannot see ___ scikit-learn mailing list scikit-learn@python.org https://mail.python.org/mailman/listinfo/scikit-learn

Re: [scikit-learn] Generate data from trained naive bayes

2016-10-03 Thread klo uo
Thanks Andy, I can follow up to the point "...and then sample from these Bernoulli distributions". From the data in `feature_log_prob_`, I would guess it contains a single feature (the feature means from the training data) for each class. I can see how can I sample from `feature_log_prob_`... On Mon,

Re: [scikit-learn] Generate data from trained naive bayes

2016-10-03 Thread Andreas Mueller
Hi Klo. Yes, you could, but as the model is very simple, that's usually not very interesting. It stores for each label an independent Bernoulli distribution for each feature. These are stored in feature_log_prob_. I would suggest you look at this attribute, rather than sample from the distribu
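
A short sketch of inspecting the stored parameters directly, assuming a `BernoulliNB` fitted on made-up data (the toy `X` and `y` are illustrative). Exponentiating `feature_log_prob_` gives one Bernoulli parameter per (class, feature) pair, and `class_log_prior_` holds the log class priors.

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB

rng = np.random.RandomState(0)

# Hypothetical toy data: 300 samples, 4 binary features, 3 classes.
X = rng.randint(2, size=(300, 4))
y = rng.randint(3, size=300)
clf = BernoulliNB().fit(X, y)

# Row i holds P(feature = 1 | clf.classes_[i]) for every feature.
probs = np.exp(clf.feature_log_prob_)
# Class priors P(class), one entry per class.
priors = np.exp(clf.class_log_prior_)

print(clf.classes_)
print(probs.shape)
print(np.round(probs, 2))
```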

[scikit-learn] Generate data from trained naive bayes

2016-10-03 Thread klo uo
Hi, because naive Bayes is a generative model, does that mean that I can somehow generate data based on a trained model? For example: `clf = BernoulliNB()` `clf.fit(train, labels)` Can I generate data for a specific label? Thanks, Klo