Re: [scikit-learn] Fitting Lognormal Distribution

2016-05-27 Thread Jacob Schreiber
Another option is to use pomegranate, which has probability distribution fitting with the same API as scikit-learn. You can see a tutorial here and it includes Lo
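
As a rough illustration of what fitting a lognormal distribution involves, here is a minimal sketch in plain NumPy. It is not pomegranate's API; it is a swapped-in stand-in that uses the closed-form maximum likelihood estimate (the mean and standard deviation of the log-transformed data). The synthetic data and the variable names are assumptions made for the example.

```python
import numpy as np

# Synthetic data (assumption): draws from a lognormal with known parameters.
rng = np.random.default_rng(0)
samples = rng.lognormal(mean=1.0, sigma=0.5, size=10_000)

# For a lognormal, the maximum likelihood estimates are simply the
# mean and standard deviation of log(x).
log_samples = np.log(samples)
mu_hat = log_samples.mean()
sigma_hat = log_samples.std()  # ddof=0 corresponds to the MLE

print(f"mu ~ {mu_hat:.3f}, sigma ~ {sigma_hat:.3f}")
```

The recovered parameters should land close to the values used to generate the data.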

Re: [scikit-learn] Artificial neural network not learning lower values of the training sample

2016-05-31 Thread Jacob Schreiber
Do you have any other baselines which you can compare to? It might be helpful in seeing if this is a problem which can be learned. On Tue, May 31, 2016 at 10:47 AM, muhammad waseem wrote: > Thanks for your reply. I have day, month, hour, temp, relative humidity, > Wind speed as my input variable

Re: [scikit-learn] Artificial neural network not learning lower values of the training sample

2016-05-31 Thread Jacob Schreiber
Using the same feature set? How well do other estimators work? (Linear regression, gradient boosting, etc...) On Tue, May 31, 2016 at 11:10 AM, muhammad waseem wrote: > This problem has been solved in the literature before, I can post papers. > > On Tue, May 31, 2016 at 7:07 PM, Jacob
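
A minimal sketch of the baseline comparison suggested here, assuming scikit-learn is available. The synthetic data stands in for the six weather-style inputs mentioned in the thread; the target function and all names are hypothetical.

```python
import numpy as np
from sklearn.dummy import DummyRegressor
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the real data (assumption): 6 inputs, 1 target.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 6))
y = 2.0 * X[:, 0] + np.sin(X[:, 1]) + 0.1 * rng.normal(size=300)

baselines = {
    "mean": DummyRegressor(strategy="mean"),  # predicts the training mean
    "linear": LinearRegression(),
    "gbr": GradientBoostingRegressor(random_state=0),
}

# Cross-validated R^2 for each baseline on the same feature set.
scores = {
    name: cross_val_score(est, X, y, cv=5, scoring="r2").mean()
    for name, est in baselines.items()
}
for name, score in scores.items():
    print(f"{name}: R^2 = {score:.3f}")
```

If simple baselines already beat the neural network, the problem is more likely in the network's setup than in the data.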

Re: [scikit-learn] The culture of commit squashing

2016-06-13 Thread Jacob Schreiber
My research work involves frequently contributing small changes. I like to keep these around as a record of what I've done, until I've finished with that part of the code. However, I also hate having large numbers of commits (I frequently commit 50+ times a day without much substantive progress)

Re: [scikit-learn] Tuning custom parameters using grid_search

2016-09-07 Thread Jacob Schreiber
You can use a pipeline object to contain both feature selection/transformation steps and an estimator. All elements of a pipeline can then be tuned using grid search. You can see a simple example here: http://scikit-learn.org/stable/modules/pipeline.html You may also be interested in seeing if the Fea
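
A minimal sketch of tuning a feature-selection step and an estimator together, assuming a reasonably recent scikit-learn. The step names `select` and `clf`, the toy dataset, and the grid values are illustrative choices, not taken from the thread.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

X, y = make_classification(
    n_samples=200, n_features=20, n_informative=5, random_state=0
)

pipe = Pipeline([
    ("select", SelectKBest(score_func=f_classif)),  # feature selection step
    ("clf", LogisticRegression(max_iter=1000)),     # final estimator
])

# Parameters of any step are addressed as "<step name>__<parameter>".
param_grid = {
    "select__k": [5, 10, 20],
    "clf__C": [0.1, 1.0, 10.0],
}

search = GridSearchCV(pipe, param_grid, cv=3)
search.fit(X, y)
print(search.best_params_)
```

Because the selection step is refit inside each cross-validation fold, the search does not leak information from the held-out data into feature selection.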

[scikit-learn] pomegranate v0.6.0 release

2016-09-10 Thread Jacob Schreiber
Hello everyone! I just released pomegranate v0.6.0, which focuses on probabilistic modelling for python. It currently implements basic distributions, naive bayes, markov chains, general mixture models, hidden Markov models, and Bayesian networks in a fast and extremely flexible manner. I have a mo

Re: [scikit-learn] Scikit-learn 0.18-rc2 release candidate!

2016-09-15 Thread Jacob Schreiber
Hooray everyone! On Thu, Sep 15, 2016 at 9:52 AM, Aakash Agarwal wrote: > Awesome work guys! Keep it up :) > > On Thu, Sep 15, 2016 at 7:14 PM, Gael Varoquaux < > gael.varoqu...@normalesup.org> wrote: > >> On Thu, Sep 15, 2016 at 09:34:55AM -0400, Andy wrote: >> > >Who writes a blog post? :) >>

Re: [scikit-learn] New contributor to scikit-learn

2016-09-15 Thread Jacob Schreiber
Welcome! If you're looking to get started, you might try sorting issues by those with "Needs contributor" and "easy" to begin with. I look forward to seeing your contributions. On Thu, Sep 15, 2016 at 3:35 PM, Kathleen Chen wrote: > Hi! I'm Kathy, a student at Penn taking an open source software

Re: [scikit-learn] Joining the Community

2016-09-15 Thread Jacob Schreiber
Hello as well! As I mentioned in the other two threads, you may want to take a stab at "Needs contributors" and "easy" marked issues first if you're new to the project. I look forward to seeing your contributions! Jacob On Thu, Sep 15, 2016 at 3:36 PM, Josh Karnofsky SEAS wrote: > Hi everyone,

Re: [scikit-learn] Fwd: Featuring scikit-learn in Hacktoberfest 2016

2016-09-16 Thread Jacob Schreiber
I think it's always good to get more contributors, even if they only add small amounts of code and disappear. However, I'd be worried about people submitting low quality PRs in order to claim "success" and/or not following up. On Fri, Sep 16, 2016 at 5:59 PM, Andreas Mueller wrote: > Hey all. >

Re: [scikit-learn] Welcome Raghav to the core-dev team

2016-10-03 Thread Jacob Schreiber
Congrats Raghav! On Mon, Oct 3, 2016 at 10:06 AM, Sebastian Raschka wrote: > Congrats Raghav! And thanks a lot for all the great work on the > model_selection module! > > > On Oct 3, 2016, at 12:53 PM, Siddharth Gupta < > siddharthgupta...@gmail.com> wrote: > > > > Congrats Raghav! :D > > > > >

Re: [scikit-learn] Missing data and decision trees

2016-10-13 Thread Jacob Schreiber
I think Raghav is working on it in this PR: https://github.com/scikit-learn/scikit-learn/pull/5974 The reason they weren't initially supported is likely that it involves a lot of work and design choices to handle missing values appropriately, and the discussion on the best way to handle it was pos

Re: [scikit-learn] Recurrent Decision Tree

2016-11-07 Thread Jacob Schreiber
It hasn't been investigated by the sklearn team to my knowledge. As Dale said, there may be an independent implementation out there but not officially related to sklearn. On Mon, Nov 7, 2016 at 9:17 AM, KevNo wrote: > This is nothing to do with Scikit guidelines criteria > > This is about sc

Re: [scikit-learn] Bayesian Gaussian Mixture

2016-11-25 Thread Jacob Schreiber
Typically this means that the model is so confident in its predictions it does not believe it possible for the sample to come from the other component. Do you get the same results with a regular GaussianMixture? On Fri, Nov 25, 2016 at 11:34 AM, Tommaso Costanzo < tommaso.costanz...@gmail.com> wro

Re: [scikit-learn] Markov Clustering?

2016-12-03 Thread Jacob Schreiber
I don't think anyone is working on this. Contributions are always very welcome, but be aware before you start that the process of getting a completely new algorithm into scikit-learn will take a lot of time and reviews. On Sat, Dec 3, 2016 at 9:19 AM, Allan Visochek wrote: > Hi there, > > My nam

Re: [scikit-learn] Why do DTs have a different fit protocol than NB and SVMs?

2016-12-13 Thread Jacob Schreiber
The fit method returns the object itself, so regardless of which way you do it, it will work. The reason the fit method returns itself is so that you can chain methods, like "preds = clf.fit(X, y).predict(X)" On Tue, Dec 13, 2016 at 12:14 PM, Graham Arthur Mackenzie < graham.arthur.macken...@gmail
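
A short sketch of both calling styles, assuming scikit-learn is installed; the toy dataset is an assumption for the example.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=100, random_state=0)

# Two-step style: construct, then fit.
clf = DecisionTreeClassifier(random_state=0)
returned = clf.fit(X, y)  # fit returns the estimator itself
preds_two_step = clf.predict(X)

# Chained style: works because fit returns self.
preds_chained = DecisionTreeClassifier(random_state=0).fit(X, y).predict(X)

print(returned is clf)
```

Both styles train the same model; chaining is only a convenience.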

Re: [scikit-learn] Scikit Learn Random Classifier - TPR and FPR plotted on matplotlib

2016-12-14 Thread Jacob Schreiber
To make a proper ROC curve you need to test all possible thresholds, not just a subset of them. You can do this easily in sklearn. import matplotlib.pyplot as plt from sklearn.metrics import roc_curve, roc_auc_score ... ... y_pred = clf.predict_proba(X) fpr, tpr, _ = roc_curve(y_true, y_pred) a
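
A self-contained sketch along the lines of the snippet above, assuming scikit-learn. One detail that is easy to miss: `predict_proba` returns one column per class, so the sketch keeps only the positive-class column before calling `roc_curve`. The classifier and the toy data are assumptions; plotting is left as a comment so the block runs without matplotlib.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Keep the score of the positive class only (column 1 for binary problems).
y_score = clf.predict_proba(X_test)[:, 1]

# roc_curve sweeps the thresholds implied by the scores themselves.
fpr, tpr, thresholds = roc_curve(y_test, y_score)
auc = roc_auc_score(y_test, y_score)
print(f"AUC = {auc:.3f}")

# To draw the curve: plt.plot(fpr, tpr) with matplotlib.pyplot imported as plt.
```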

Re: [scikit-learn] numpy.amin behaviour with multidimensionnal arrays

2016-12-29 Thread Jacob Schreiber
It means that instead of returning the minimum value anywhere in the entire matrix, it will return the minimum value for each column or each row depending on which axis you put in, so a vector instead of a scalar. On Thu, Dec 29, 2016 at 6:00 AM, greg g wrote: > Hi, > > I would like to understan
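
A small worked example of the behaviour described, using a made-up 2x3 array:

```python
import numpy as np

a = np.array([[3, 1, 4],
              [1, 5, 9]])

overall = np.amin(a)             # minimum over the whole array: a scalar
per_column = np.amin(a, axis=0)  # collapses the rows: one value per column
per_row = np.amin(a, axis=1)     # collapses the columns: one value per row

print(overall)      # 1
print(per_column)   # [1 1 4]
print(per_row)      # [1 1]
```

The `axis` argument names the dimension that is reduced away, which is why `axis=0` yields one value per column.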

Re: [scikit-learn] meta-estimator for multiple MLPRegressor

2017-01-07 Thread Jacob Schreiber
If you have such a small number of observations (with a much higher feature space) then why do you think you can accurately train not just a single MLP, but an ensemble of them without overfitting dramatically? On Sat, Jan 7, 2017 at 2:26 PM, Thomas Evangelidis wrote: > Regarding the evaluation,

Re: [scikit-learn] meta-estimator for multiple MLPRegressor

2017-01-07 Thread Jacob Schreiber
Sat, Jan 7, 2017 at 4:01 PM, Thomas Evangelidis wrote: > > > On 8 January 2017 at 00:04, Jacob Schreiber > wrote: > >> If you have such a small number of observations (with a much higher >> feature space) then why do you think you can accurately train not just a >

Re: [scikit-learn] Roc curve from multilabel classification has slope

2017-01-07 Thread Jacob Schreiber
Slope usually means there are ties in your predictions. Check your dataset to see if you have repeated predicted values (possibly 1 or 0). On Sat, Jan 7, 2017 at 4:32 PM, José Ismael Fernández Martínez < ismael...@ciencias.unam.mx> wrote: > But is not a scikit-learn classifier, is a keras classif

Re: [scikit-learn] meta-estimator for multiple MLPRegressor

2017-01-09 Thread Jacob Schreiber
Thomas, it can be difficult to fine tune L1/L2 regularization in the case where n_parameters >>> n_samples ~and~ n_features >> n_samples. If your samples are very similar to the training data, why are simpler models not working well? On Sun, Jan 8, 2017 at 8:08 PM, Joel Nothman wrote: > Btw, I

Re: [scikit-learn] Complex variables in Gaussian mixture models?

2017-01-09 Thread Jacob Schreiber
I'm not too familiar with how complex values are traditionally treated, but is it possible to make the complex component a real valued component and treat it just as having twice as many features? On Mon, Jan 9, 2017 at 11:34 AM, Rory Smith wrote: > Hi All, > > I’d like to set up a GMM using mix
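
One way to read this suggestion, sketched under assumptions: stack the real and imaginary parts side by side so that each complex feature becomes two real features, then fit an ordinary `GaussianMixture`. The synthetic data and the choice of two components are hypothetical, and whether this treatment is statistically appropriate for a given problem is a separate question.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Hypothetical complex-valued data: 200 samples, 3 complex features,
# drawn from two well-separated groups.
Z = np.concatenate([
    rng.normal(0, 1, (100, 3)) + 1j * rng.normal(0, 1, (100, 3)),
    rng.normal(5, 1, (100, 3)) + 1j * rng.normal(-5, 1, (100, 3)),
])

# Treat real and imaginary parts as separate real features: (200, 6).
X = np.hstack([Z.real, Z.imag])

gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
labels = gmm.predict(X)
print(X.shape, np.bincount(labels))
```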

Re: [scikit-learn] meta-estimator for multiple MLPRegressor

2017-01-09 Thread Jacob Schreiber
Even with a single layer with 10 neurons you're still trying to train over 6000 parameters using ~30 samples. Dropout is a concept common in neural networks, but doesn't appear to be in sklearn's implementation of MLPs. Early stopping based on validation performance isn't an "extra" step for reduci
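
The parameter count can be checked with a few lines of arithmetic. The number of input features is not stated in this preview, so the value of 600 below is purely an assumption chosen to be consistent with "over 6000 parameters"; the helper function is hypothetical.

```python
def mlp_param_count(n_features, hidden_layer_sizes, n_outputs=1):
    """Count weights and biases of a fully connected network."""
    sizes = [n_features, *hidden_layer_sizes, n_outputs]
    # Each layer has n_in * n_out weights plus n_out biases.
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes[:-1], sizes[1:]))


n_features = 600  # assumption: not given in the preview
n_params = mlp_param_count(n_features, hidden_layer_sizes=(10,))
print(n_params)  # 6021
```

With roughly 30 samples, that is about 200 parameters per observation, which is why overfitting is the central concern.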

Re: [scikit-learn] numpy integration with random forrest implementation

2017-01-21 Thread Jacob Schreiber
If what you're saying is that you have a variable length input, then most sklearn classifiers won't work on this data. They expect a fixed feature set. Perhaps you could try extracting a set of informative features to be fed into the classifier? On Sat, Jan 21, 2017 at 3:18 AM, Carlton Banks wrot

Re: [scikit-learn] numpy integration with random forrest implementation

2017-01-21 Thread Jacob Schreiber
constant both for the input and output. > > Den 21. jan. 2017 kl. 18.25 skrev Jacob Schreiber >: > > If what you're saying is that you have a variable length input, then most > sklearn classifiers won't work on this data. They expect a fixed feature > set. Perha

Re: [scikit-learn] GSOC call for mentors

2017-01-30 Thread Jacob Schreiber
I discussed this briefly with Gael and Joel. The consensus was that, unless we already know excellent students who would fit well, it is unlikely we will participate in GSoC. That being said, if someone (other than me) is willing to step up and organize it, I'd volunteer to be a mentor again. I t

Re: [scikit-learn] can we have a slack team for scikit-learn

2017-02-18 Thread Jacob Schreiber
I would support a slack channel --if-- we had channels for different groups of modules, like a tree channel and a linear methods channel, and developers involved in those sections populated the channels. This would allow people to ask questions to developers involved directly. However, I can easily

Re: [scikit-learn] GSOC call for mentors

2017-02-18 Thread Jacob Schreiber
I think we have de facto decided not to participate by not having someone step up by now and organize it like Raghav did last year. On Sat, Feb 18, 2017 at 10:54 AM, Olivier Grisel wrote: > Personally I don't feel like mentoring this year. I would really like > to focus my scikit-learn time on f

Re: [scikit-learn] Regarding scikit learn to take part in GSOC 2017

2017-02-18 Thread Jacob Schreiber
Hi Akshay Thanks for the note. We've had several threads discussing this, and appear to have come to the consensus that while there are some people who are willing to serve as mentors, no one has the time right now to organize the entire thing. The team always welcomes contributions and is willin

Re: [scikit-learn] Women in Machine Learning and Data Science Sprint next Weekend (also call for help)

2017-02-27 Thread Jacob Schreiber
I will try to carve out some time Saturday to review PRs. What time is it occurring? On Mon, Feb 27, 2017 at 8:50 PM, Andreas Mueller wrote: > Hey all. > > There's gonna be an introductory scikit-learn sprint at NYC on Saturday > that a local Women's DS/ML group is organizing with me. > I feel li

Re: [scikit-learn] GSoC, 2017 - Parallel Decision Tree Building

2017-02-28 Thread Jacob Schreiber
Hi Aman I responded to your other email, but I'm not sure if it actually went through. Thanks for your interest in the project, and your current PRs. If you're looking to apply, you should write a gist which follows the format that nelson-liu used here: https://github.com/scikit-learn/scikit-lear

Re: [scikit-learn] Women in Machine Learning and Data Science Sprint next Weekend (also call for help)

2017-02-28 Thread Jacob Schreiber
h setup etc. > (EST that is). > > Andy > > > On 02/27/2017 11:58 PM, Jacob Schreiber wrote: > > I will try to carve out some time Saturday to review PRs. What time is it > occurring? > > On Mon, Feb 27, 2017 at 8:50 PM, Andreas Mueller wrote: > >> Hey all.

Re: [scikit-learn] GSoc, 2017 (proposal idea and intro) .reg

2017-03-02 Thread Jacob Schreiber
Hi Shubham Thanks for your interest. I'm eager to see your contributions to sklearn in the future. However, I'm pretty sure kmeans++ is already implemented: http://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html Jacob On Thu, Mar 2, 2017 at 1:07 AM, SHUBHAM BHARDWAJ 15BCE07

Re: [scikit-learn] Scipy 2017

2017-03-03 Thread Jacob Schreiber
Do you still need someone to help with the tutorial? I may be able to attend. On Tue, Feb 28, 2017 at 9:43 AM, Nelson Liu wrote: > The conference generally (at least for the last three years) uploads > recordings of the tutorials afterwards, e.g. here >

Re: [scikit-learn] Logistic regression with elastic net regularization

2017-03-13 Thread Jacob Schreiber
Hi Stuart Take a look at this issue: https://github.com/scikit-learn/scikit-learn/issues/2968 On Mon, Mar 13, 2017 at 9:57 AM, Stuart Reynolds wrote: > Is there an implementation of logistic regression with elastic net > regularization in scikit? > (or pointers on implementing this - its seems

Re: [scikit-learn] Regarding Adaboost classifier

2017-03-18 Thread Jacob Schreiber
You really need to provide more details about what exactly you're stuck on. If you've extracted useful features from some image into a matrix X with binary labels y you can just do `clf.fit(X, y)` to train the classifier. On Sat, Mar 18, 2017 at 10:21 PM, Afzal Ansari wrote: > Hello Sir, > I w

[scikit-learn] GSoC 2017

2017-03-21 Thread Jacob Schreiber
Starting yesterday, students were able to submit their proposals on the GSoC website. Please review this site thoroughly before making a submission. We're eager to hear what prospective students have in mind for a

Re: [scikit-learn] GSoC 2017 : "Parallel Decision Tree Building"

2017-03-22 Thread Jacob Schreiber
Hi Aman Likely the easiest way to parallelize decision tree building is to parallelize the finding of the best split at each node, as it checks every non-constant feature for the best split. Several other approaches focus on how to parallelize tree building in the streaming or distributed cases, w
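
The per-node idea can be sketched as follows. This is an illustration of the structure only, not scikit-learn's implementation: the function names are hypothetical, the split criterion is plain squared error, and Python threads will not give a real speedup for this pure-Python loop (compiled code that releases the GIL would be needed for that).

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def best_split_for_feature(x, y):
    """Return (total child squared error, threshold) for the best split on x."""
    order = np.argsort(x)
    x_sorted, y_sorted = x[order], y[order]
    n = len(y)
    csum = np.cumsum(y_sorted)          # prefix sums of y
    csum_sq = np.cumsum(y_sorted ** 2)  # prefix sums of y^2
    best = (np.inf, None)
    for i in range(1, n):               # left child gets the first i samples
        if x_sorted[i] == x_sorted[i - 1]:
            continue                    # cannot split between equal values
        left_sse = csum_sq[i - 1] - csum[i - 1] ** 2 / i
        right_sum = csum[-1] - csum[i - 1]
        right_sse = (csum_sq[-1] - csum_sq[i - 1]) - right_sum ** 2 / (n - i)
        sse = left_sse + right_sse
        if sse < best[0]:
            best = (sse, (x_sorted[i - 1] + x_sorted[i]) / 2)
    return best


def best_split(X, y, n_workers=4):
    """Evaluate every feature independently, then keep the best one."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = list(
            pool.map(lambda j: best_split_for_feature(X[:, j], y), range(X.shape[1]))
        )
    feature = min(range(len(results)), key=lambda j: results[j][0])
    return feature, results[feature][1]


rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 5))
y = (X[:, 2] > 0.5).astype(float)  # only feature 2 carries signal
feature, threshold = best_split(X, y)
print(feature, round(float(threshold), 3))
```

The features are independent of one another at a given node, so the only synchronization point is the final reduction that picks the winning feature.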

Re: [scikit-learn] Regarding GSoC projects and mentors

2017-03-22 Thread Jacob Schreiber
Hi Jeff I would be overseeing the parallel decision tree building project, and Gael is overseeing the linear models project. This will end up being fairly fluid, as we're looking for the right combination of mentors and students. Jacob On Wed, Mar 22, 2017 at 8:37 AM, Jeff Lee wrote: > Hi, > >

Re: [scikit-learn] GSoC 2017 : "Parallel Decision Tree Building"

2017-03-26 Thread Jacob Schreiber
ated to detailing etc. I will need > little more time for that. Meanwhile, I await your feedback and guidance. > > Thank You > > > > On 23 March 2017 at 02:38, Jacob Schreiber > wrote: > >> Hi Aman >> >> Likely the easiest way to parallelize decisi

Re: [scikit-learn] GSoC proposal - linear model

2017-03-29 Thread Jacob Schreiber
Hi Konstantinos I likely won't be a mentor for the linear models project, but I looked over your proposal and have a few suggestions. In general it was a good write up! 1. You should include some equations in the write up, basically the softmax loss (which I think is a more common term than multi
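
For reference, the softmax (multinomial logistic) loss mentioned in point 1 is commonly written as below. The notation is an assumption made for this note, not taken from the proposal: $n$ samples $x_i$ with labels $y_i \in \{1, \dots, K\}$, per-class weight vectors $w_k$ and intercepts $b_k$.

```latex
L(W, b) = -\frac{1}{n} \sum_{i=1}^{n} \sum_{k=1}^{K}
  \mathbb{1}[y_i = k] \,
  \log \frac{\exp(w_k^\top x_i + b_k)}{\sum_{j=1}^{K} \exp(w_j^\top x_i + b_j)}
```

A regularization term on $W$ is usually added to this expression.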

Re: [scikit-learn] GSoC proposal - linear model

2017-03-31 Thread Jacob Schreiber
> please let me know. Ideally I would prefer not to leave it till the last > day. > > Kind regards, > Konstantinos > > > > > On 30/03/2017 07:45 πμ, Jacob Schreiber wrote: > > Hi Konstantinos > > I likely won't be a mentor for the linear models pr

Re: [scikit-learn] GSoC 2017

2017-04-02 Thread Jacob Schreiber
Less than 11 hours left in the application period! If you've asked for feedback and we haven't gotten back to you, make sure you submit anyway. If you don't get your submission in before the deadline (April 3rd, 9:00am PST) we won't be able to consider you. On Tue, Mar 21, 20

Re: [scikit-learn] GSoC 2017

2017-04-02 Thread Jacob Schreiber
Make sure that you tag your proposal with 'scikit-learn' when you submit it so that we can identify them easily. On Sun, Apr 2, 2017 at 10:47 PM, Jacob Schreiber wrote: > Less than 11 hours left in the application period! If you've asked for > feedback and we haven'

Re: [scikit-learn] urgent help in scikit-learn

2017-04-05 Thread Jacob Schreiber
Also, in general it's not appropriate to repeatedly ping someone on this mailing list for 'urgent help.' On Wed, Apr 5, 2017 at 8:30 AM, Shane Grigsby wrote: > Hi Shuchi, > You probably want to query the Statsmodels community for this; they have a > google groups board here: > > https://groups.g

Re: [scikit-learn] impurity criterion in gradient boosted regression trees

2017-05-11 Thread Jacob Schreiber
The blog post from Matthew Drury sums it up well. The feature importance is indeed the Gini impurity. On Tue, May 9, 2017 at 8:34 AM, Olga Lyashevska wrote: > Hi all, > > I am trying to understand differences in feature importance plots obtained > with R package gbm and sklearn. Having compared

Re: [scikit-learn] How to best understand scikit-learn and know its modules and methods?

2017-06-04 Thread Jacob Schreiber
Everything will disappear if you don't save it. However, if you do ```clf = LinearRegression().fit(X, y)``` then the model is saved in the variable `clf`. On Sun, Jun 4, 2017 at 4:06 PM, C W wrote: > Yes, they make a lot sense. Thanks! > > I wanted to ask a follow-up: > > > LinearRegression().fi

Re: [scikit-learn] Random Forest max_features and boostrap construction parameters interpretation

2017-06-05 Thread Jacob Schreiber
Howdy When doing bootstrapping, n samples are selected from the dataset WITH replacement, where n is the number of samples in the dataset. This leads to situations where some samples have a weight > 1 and others have a weight of 0. This is done separately for each tree. When selecting the number
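
A small sketch of the bootstrap step described above, in plain NumPy. The sample sizes and names are chosen for illustration; a forest would repeat the draw independently for each tree.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 10

# Draw n indices WITH replacement from the n available samples.
indices = rng.integers(0, n_samples, size=n_samples)

# The bootstrap is equivalent to giving each sample an integer weight:
# how many times it was drawn. Some weights exceed 1, others are 0.
weights = np.bincount(indices, minlength=n_samples)
print(weights, weights.sum())

# For large n, the fraction of samples never drawn approaches 1/e (~36.8%).
n_large = 100_000
drawn = rng.integers(0, n_large, size=n_large)
oob_fraction = 1.0 - np.unique(drawn).size / n_large
print(f"out-of-bag fraction ~ {oob_fraction:.3f}")
```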

Re: [scikit-learn] Random Forest max_features and boostrap construction parameters interpretation

2017-06-05 Thread Jacob Schreiber
ures can occur as many times as > necessary, unless a maximum depth is specified in the constructor. > Note that an informative feature can be re-applied to form a decision > criteria at more than node in the decision tree. > -- > > Adjustments welcome. > > Many thanks

Re: [scikit-learn] Documentation proposal

2017-06-14 Thread Jacob Schreiber
Hi Gael Thanks for the work! We are grateful for the work that other people do in providing these types of tutorials and introductions as they lower the barrier of entry for new people to get into machine learning. We generally don't include these in the official sklearn documentation, in no small

Re: [scikit-learn] Help with data parsing (link to stack exchange question)

2017-06-14 Thread Jacob Schreiber
It's unclear to me what exactly you want to do with the classification algorithm. Is your goal to take in a binary data matrix indicating the presence of certain k-mers and predict whether the present k-mers indicate a susceptible or resistant genome? If so, then you need to convert your sequen
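
A minimal sketch of building such a binary k-mer presence matrix in plain Python. The helper names, the DNA alphabet, the value of k, and the toy sequences are all assumptions; real genomes would need a much larger k and a sparse representation.

```python
from itertools import product


def build_vocabulary(k, alphabet="ACGT"):
    """All possible k-mers over the alphabet, in a fixed order."""
    return ["".join(p) for p in product(alphabet, repeat=k)]


def kmer_presence_row(sequence, vocabulary, k):
    """1 if the k-mer occurs anywhere in the sequence, else 0."""
    present = {sequence[i:i + k] for i in range(len(sequence) - k + 1)}
    return [1 if kmer in present else 0 for kmer in vocabulary]


genomes = ["ACGT", "AAAA"]  # toy sequences (assumption)
k = 2
vocabulary = build_vocabulary(k)
X = [kmer_presence_row(g, vocabulary, k) for g in genomes]

for genome, row in zip(genomes, X):
    print(genome, row)
```

Each row has one column per possible k-mer, so every genome maps to a fixed-length feature vector that can be paired with susceptible/resistant labels and passed to a classifier's `fit`.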

Re: [scikit-learn] Need Help Random Forest Imputation Model as in R

2017-06-15 Thread Jacob Schreiber
No. On Thu, Jun 15, 2017 at 4:13 PM, Akash Devgun wrote: > Please let me know Do you have random Forest Imputation model in > python-scikit learn similar to rfImpute in R has ? > > Thanks > > ___ > scikit-learn mailing list > scikit-learn@python.o

Re: [scikit-learn] Need Help Random Forest Imputation Model as in R

2017-06-15 Thread Jacob Schreiber
Most likely not. If there is a willing contributor, we would be happy to review a PR though. On Thu, Jun 15, 2017 at 5:26 PM, Akash Devgun wrote: > Will you have in future?? > > On Thu, Jun 15, 2017 at 5:14 PM Jacob Schreiber > wrote: > >> No. >> >> On Th

Re: [scikit-learn] Need Help Random Forest Imputation Model as in R

2017-06-18 Thread Jacob Schreiber
d estimator-based imputation. > The problem with fancyimpute is that it has no notion of test set, so you > can't apply it to new data. > > Cheers, > Andy > > > > On 06/15/2017 08:31 PM, Jacob Schreiber wrote: > > Most likely not. If there is a willing contributor,

Re: [scikit-learn] Scikit-learn at Data Intelligence this past weekend

2017-06-30 Thread Jacob Schreiber
Thanks for the summary. I was there as well, and it seemed like scikit-learn had a strong showing. It seemed as though many talks that weren't directly on scikit-learn still mentioned it or used the models during the presentation. On Fri, Jun 30, 2017 at 9:47 AM, Francois Dion wrote: > This past

Re: [scikit-learn] Moving average transformer

2017-07-06 Thread Jacob Schreiber
Hi Jeremy! Thanks for your offer to contribute. We're always looking for people to add good ideas to the package. Time series data can be tricky to handle appropriately, and so I think we generally try to pass it off to more specialized packages that focus on that. Andreas may have a more detailed

Re: [scikit-learn] Replacing the Boston Housing Prices dataset

2017-07-06 Thread Jacob Schreiber
Hi Tony As others have pointed out, I think that you may be misunderstanding the purpose of that "feature." We are in agreement that discrimination against protected classes is not OK, and that even outside complying with the law one should avoid discrimination, in model building or elsewhere. How

Re: [scikit-learn] Help with NLP

2017-07-07 Thread Jacob Schreiber
The scikit-learn mailing list is probably not the best place to be asking for help with another module. On Fri, Jul 7, 2017 at 9:28 AM Ariani A wrote: > Yes , it is. > regards > > On Fri, Jul 7, 2017 at 12:23 PM, Carlton Banks wrote: > >> NLP as is Natural language processing? >> >> Den 7. jul.

Re: [scikit-learn] Replacing the Boston Housing Prices dataset

2017-07-07 Thread Jacob Schreiber
pread use than Lena was in image processing. > > > You can argue about whether or not it's morally right or wrong to include > the > dataset. I see merit to both arguments. But "too many tutorials use it" is > very > similar in flavour to "the economy of the South

Re: [scikit-learn] Contribution

2017-07-10 Thread Jacob Schreiber
Howdy This question and the one right after in the FAQ are probably relevant re: inclusion of new algorithms: http://scikit-learn.org/stable/faq.html#what-are-the-inclusion-criteria-for-new-algorithms. The gist is that we only include well established algorithms, and there is no end to those. I t

Re: [scikit-learn] Contribution - Markov Clustering

2017-07-11 Thread Jacob Schreiber
You don't need our permission to submit a PR, go ahead! We welcome PRs. On Mon, Jul 10, 2017 at 9:36 PM, Uri Goren wrote: > I have, > The only criterion that I am unsure about is the number citations. > > In the literature Markov clustering is usually compared to affinity > prolongation, which a

Re: [scikit-learn] Agglomerative clustering problem

2017-07-15 Thread Jacob Schreiber
Typically when I think of limiting the number of points in a cluster I think of KD trees. I suppose that wouldn't work? On Tue, Jul 11, 2017 at 11:22 AM, Ariani A wrote: > Dear Uri, > Thanks. I just have a pairwise distance matrix and I want to implement it > so that each cluster has at least 4

Re: [scikit-learn] Classifiers for dataset with categorical features

2017-07-21 Thread Jacob Schreiber
Traditionally tree based methods are very good when it comes to categorical variables and can handle them appropriately. There is a current WIP PR to add this support to sklearn. I'm not exactly sure what you mean by "perform better" though. Estimators that ignore the categorical aspect of these

[scikit-learn] scikit-learn hits 20k github stars

2017-07-22 Thread Jacob Schreiber
Many thanks to everyone who has worked on and contributed to the project for the past decade to make it such a great tool! Also a special thanks to Joel Nothman, who has been on top of answering issues and reviewing PRs for years now. 🎆🎉 ___

Re: [scikit-learn] Decision Tree Regressor - DepthFirstTreeBuilder vs BestFirstTreeBuilder

2017-09-22 Thread Jacob Schreiber
Hi Hanna Thanks for the questions! 1) Best first tends to produce unbalanced but sparser trees, and frequently produces more generalizable models by only capturing the most important interactions. Unbalanced isn't necessarily bad either. You can imagine that in some parts of the tree where there
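
A small sketch contrasting the two growth strategies through the public API, assuming scikit-learn. Per the `DecisionTreeRegressor` documentation, setting `max_leaf_nodes` grows the tree in best-first fashion; leaving it unset gives the default depth-first growth. The dataset and the limits chosen are assumptions.

```python
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)

# Depth-first growth, limited by depth: up to 2**4 = 16 leaves, balanced.
depth_first = DecisionTreeRegressor(max_depth=4, random_state=0).fit(X, y)

# Best-first growth, limited by leaf count: the same leaf budget is spent
# wherever the impurity reduction is largest, so the tree may be unbalanced.
best_first = DecisionTreeRegressor(max_leaf_nodes=16, random_state=0).fit(X, y)

print("depth-first:", depth_first.get_depth(), depth_first.get_n_leaves())
print("best-first: ", best_first.get_depth(), best_first.get_n_leaves())
```

Comparing depth and leaf counts of the two fitted trees shows how the same budget can produce differently shaped trees.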

[scikit-learn] pomegranate v0.8.0 released

2017-10-09 Thread Jacob Schreiber
Howdy everyone! I am pleased to announce the release of pomegranate v0.8.0, for fast and flexible probabilistic modeling in Python. The core set of models in pomegranate include Bayesian networks, hidden Markov models, mixtures, and Bayes classifiers, among others. While no new models have been ad

Re: [scikit-learn] New core devs: Hanmin Qin, Guillaume Lemaître, and Roman Yurchak

2017-11-10 Thread Jacob Schreiber
Congrats! Welcome to the team, and thanks for your hard work so far. On Thu, Nov 9, 2017 at 8:36 AM, Olivier Grisel wrote: > Congrats to all three of you! Thank you very much for your contributions > and in particular in reviewing contributions by others. > > -- > Olivier > > > ___

[scikit-learn] pomegranate v0.9.0 released: probabilistic modeling for Python

2018-01-03 Thread Jacob Schreiber
Howdy all! I'm pleased to announce the release of pomegranate v0.9.0. The focus of this release is on missing value support across all model fitting / structure learning / inference methods and models. This enables you to do everything from fitting a multivariate Gaussian distribution to an incom

Re: [scikit-learn] VOTE: scikit-learn governance document

2019-02-10 Thread Jacob Schreiber
+1 from me as well. Thanks for putting in the time to write this all out. On Sun, Feb 10, 2019 at 4:54 PM Hanmin Qin wrote: > +1 (personally I still think it's better to keep the flow chart, it seems > useful for beginners) > > Hanmin Qin > > - Original Message - > From: Alexandre Gramfo