[Scikit-learn-general] Zillow Uses Scikit-Learn

2015-08-30 Thread Jason Wolosonovich
Hello All! I wanted to pass along this article from datanami.com about how Zillow uses Scikit-Learn (and some other tools) to model their Zestimate home value and also to detect fraudulent listings. Pretty cool stuff! http://www.datanami.com/2015/08/12/inside-the-zestimate-data-science-at-zillo

Re: [Scikit-learn-general] Estimator Overview / Summary

2015-06-07 Thread Jason Wolosonovich
+1 for that idea. A "cheat sheet" of sorts, or perhaps more like CliffsNotes. Great idea though for sure. -Original Message- From: Andy [mailto:t3k...@gmail.com] Sent: Saturday, June 06, 2015 2:12 PM To: scikit-learn-general@lists.sourceforge.net Subject: [Scikit-learn-general] Estimator

Re: [Scikit-learn-general] TSNE Memory Error

2015-04-20 Thread Jason Wolosonovich
oblem of the "MNIST implementation". I don't know how that could happen. :D On 04/20/2015 08:55 AM, Jason Wolosonovich wrote: > Oh wow, very cool. Thank you very much for the assistance and info Alexander! > > -Original Message- > From: afabisch [mailto:afabi...@m

Re: [Scikit-learn-general] TSNE Memory Error

2015-04-19 Thread Jason Wolosonovich
regards, Alexander Am 2015-04-18 01:48, schrieb Jason Wolosonovich: > Hello All, > > My dataset has 93 features and just under 62,000 observations (61,878 > to be exact). I'm running out of memory right after the mean sigma > value is computed/displayed. I've tried using

[Scikit-learn-general] TSNE Memory Error

2015-04-17 Thread Jason Wolosonovich
Hello All, My dataset has 93 features and just under 62,000 observations (61,878 to be exact). I'm running out of memory right after the mean sigma value is computed/displayed. I've tried using dimensionality reduction via TruncatedSVD with n_components set at different levels (78, 50 and 2 res
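One possible reading of the problem above, offered as an assumption since the message is cut off: an exact t-SNE works on pairwise quantities between all observations, so memory grows with the square of the number of rows, and reducing the number of features with TruncatedSVD does not change that. The sketch below does the arithmetic for 61,878 rows and then runs TruncatedSVD followed by TSNE on a small random subsample. The data is synthetic, and the subsample size and parameter values are illustrative choices, not taken from the thread.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.manifold import TSNE

n_samples, n_features = 61_878, 93

# A dense float64 pairwise matrix over all rows needs n_samples**2 * 8 bytes,
# regardless of how many features each row has.
pairwise_gb = n_samples ** 2 * 8 / 1e9
print(f"dense pairwise matrix: about {pairwise_gb:.1f} GB")

# Synthetic stand-in for the real dataset (hypothetical values).
rng = np.random.RandomState(0)
X = rng.rand(2_000, n_features)

# Reduce the feature dimension first, as described in the post.
X_reduced = TruncatedSVD(n_components=50, random_state=0).fit_transform(X)

# Embed only a random subsample so the quadratic cost stays manageable.
idx = rng.choice(X_reduced.shape[0], size=300, replace=False)
embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(
    X_reduced[idx]
)
print(embedding.shape)
```

Subsampling trades completeness for memory; whether it is acceptable depends on what the embedding is used for.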

Re: [Scikit-learn-general] adaboost parameters

2015-04-14 Thread Jason Wolosonovich
n't know what other tips or rules of thumb are available. Thanks, ________ From: Jason Wolosonovich [jmwol...@asu.edu] Sent: Monday, April 13, 2015 10:47 PM To: scikit-learn-general@lists.sourceforge.net Subject:

Re: [Scikit-learn-general] adaboost parameters

2015-04-13 Thread Jason Wolosonovich
use random forest, instead of decision tree? Thanks, From: Jason Wolosonovich [mailto:jmwol...@asu.edu] Sent: Saturday, April 11, 2015 9:13 AM To: scikit-learn-general@lists.sourceforge.net Subject: Re: [Scikit-learn-general] adaboost parame

Re: [Scikit-learn-general] adaboost parameters

2015-04-11 Thread Jason Wolosonovich
What is your dataset like? How are you building your individual classifier that you are ensembling with AdaBoost? A common-use case would be boosted decision stumps (one-level decision trees). http://en.wikipedia.org/wiki/Decision_stump http://lyonesse.stanford.edu/~langley/papers/stump.ml92.pd
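As a minimal sketch of the boosted decision stumps mentioned above: the base learner is a DecisionTreeClassifier limited to max_depth=1, passed to AdaBoostClassifier. The dataset is synthetic and the values of n_estimators and the train/test split are arbitrary illustrations, not recommendations from the thread. The base estimator is passed positionally because its keyword name has changed between scikit-learn releases.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data (hypothetical stand-in).
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A decision stump is a tree with a single split.
stump = DecisionTreeClassifier(max_depth=1)

# Boost up to 50 stumps; the base estimator is given positionally.
clf = AdaBoostClassifier(stump, n_estimators=50, random_state=0)
clf.fit(X_train, y_train)

accuracy = clf.score(X_test, y_test)
print(f"test accuracy: {accuracy:.3f}")
```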

Re: [Scikit-learn-general] CV with SVM

2015-04-07 Thread Jason Wolosonovich
Hi Roberto, I'm no expert by any means, but I was reading a blog post the other day that talked about using Random Search vs Grid Search. The gist of the article is that, since you can feed distributions to Random Search and it selects values randomly over the number of iterations you choose,
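To illustrate the idea of feeding distributions to a random search, here is a small sketch using RandomizedSearchCV with an SVC. The exponential distributions, their scales, the number of iterations, and the synthetic data are all assumptions chosen for the example; the blog post referred to above may have used something different.

```python
from scipy.stats import expon
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

# Synthetic data (hypothetical stand-in for a real problem).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Continuous distributions instead of a fixed grid of values.
param_distributions = {
    "C": expon(scale=100),
    "gamma": expon(scale=0.1),
}

# Draw 10 random parameter settings and evaluate each with 3-fold CV.
search = RandomizedSearchCV(
    SVC(kernel="rbf"),
    param_distributions,
    n_iter=10,
    cv=3,
    random_state=0,
)
search.fit(X, y)
print(search.best_params_)
```

The cost is fixed by n_iter rather than by the size of a grid, so adding another hyperparameter does not multiply the number of fits.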

Re: [Scikit-learn-general] For Devs/Web Site Admins Of Sklearn

2015-04-06 Thread Jason Wolosonovich
k for already overloaded people, with little clear benefit. If one of us is very bored, maybe it'll happen some time. I'm not sure we currently have the right legal infrastructure to actually handle any income, though. Cheers, Andy On 04/04/2015 04:38 AM, Jason Wolosonovich wrote:

[Scikit-learn-general] For Devs/Web Site Admins Of Sklearn

2015-04-04 Thread Jason Wolosonovich
Hello All, Have any of you (the developers/web site admins) considered placing the links to lectures and videos in a more prominent place on the Scikit homepage? Perhaps a diagonal little ribbon similar to the Github ribbon so it draws some attention? I ask because the videos of the presentatio

Re: [Scikit-learn-general] SOLVED: Scaling a Subset of Features in SKLEARN

2015-03-03 Thread Jason Wolosonovich
Andreas, Thank you very much for the response, your explanation makes sense. Pandas has the get_dummies() method that I've used (and then dropped one of each of the categorical indicators to prevent multicollinearity) but I'll check out One-Hot Encoder for that purpose as well. Sebastian, Than
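The get_dummies() approach described above, with one indicator dropped per categorical feature, can be sketched as follows. The column names and values are made up for illustration. The drop_first option is available in later pandas releases; in older ones the column would be dropped by hand, as the message describes.

```python
import pandas as pd

# Hypothetical data: one numeric column and one categorical column.
df = pd.DataFrame(
    {
        "size": [1.0, 2.5, 3.0, 4.2],
        "color": ["red", "green", "blue", "green"],
    }
)

# Encode "color" as indicator columns and drop the first category,
# so the remaining indicators are not perfectly collinear.
encoded = pd.get_dummies(df, columns=["color"], drop_first=True)
print(list(encoded.columns))
```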

[Scikit-learn-general] Scaling a Subset of Features in SKLEARN

2015-03-02 Thread Jason Wolosonovich
Hello All, When using any of the preprocessing options in sklearn, is it possible to select a subset of features (columns) in a dataset for preprocessing? Many datasets contain a mix of feature types (categorical, numerical, binary) and it doesn't seem like it would make sense to scale certain
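One way to scale only some columns, shown here as a sketch rather than as the answer given in the thread: scikit-learn releases from 0.20 onward include ColumnTransformer, which applies a transformer to selected columns and can pass the rest through unchanged. The array values and the choice of which columns count as numeric are made up for the example.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler

# Hypothetical data: columns 0 and 1 are numeric, column 2 is a binary flag.
X = np.array(
    [
        [50.0, 1200.0, 1.0],
        [35.0, 800.0, 0.0],
        [42.0, 1500.0, 1.0],
        [28.0, 950.0, 0.0],
    ]
)
numeric_columns = [0, 1]

# Scale only the numeric columns; every other column is passed through as-is.
preprocess = ColumnTransformer(
    [("scale", StandardScaler(), numeric_columns)],
    remainder="passthrough",
)
X_scaled = preprocess.fit_transform(X)

# Transformed columns come first in the output, then the passthrough columns.
print(X_scaled.shape)
```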