Hello All!
Wanted to pass along this cool article from datanami.com about how Zillow uses
scikit-learn (and some other tools) to model their Zestimate home value and
also detect fraudulent listings. Pretty cool stuff!
http://www.datanami.com/2015/08/12/inside-the-zestimate-data-science-at-zillo
+1 for that idea. A "cheat sheet" of sorts, or perhaps more like CliffsNotes.
A great idea, for sure.
-Original Message-
From: Andy [mailto:t3k...@gmail.com]
Sent: Saturday, June 06, 2015 2:12 PM
To: scikit-learn-general@lists.sourceforge.net
Subject: [Scikit-learn-general] Estimator
oblem
of the "MNIST implementation". I don't know how that could happen. :D
On 04/20/2015 08:55 AM, Jason Wolosonovich wrote:
> Oh wow, very cool. Thank you very much for the assistance and info, Alexander!
>
> -Original Message-
> From: afabisch [mailto:afabi...@m
regards,
Alexander
Am 2015-04-18 01:48, schrieb Jason Wolosonovich:
> Hello All,
>
> My dataset has 93 features and just under 62,000 observations (61,878
> to be exact). I'm running out of memory right after the mean sigma
> value is computed/displayed. I've tried using
Hello All,
My dataset has 93 features and just under 62,000 observations (61,878 to be
exact). I'm running out of memory right after the mean sigma value is
computed/displayed. I've tried using dimensionality reduction via TruncatedSVD
with n_components set at different levels (78, 50 and 2 res
n't
know whether other tips or rules of thumb are available.
Thanks,
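As a hedged sketch of the scenario above (synthetic data standing in for the real 61,878 × 93 matrix; the sizes come from the message, everything else is a placeholder): TruncatedSVD accepts sparse input and never centers the data, so keeping the matrix sparse and casting to float32 are two common ways to cut memory use.

```python
import numpy as np
from scipy import sparse
from sklearn.decomposition import TruncatedSVD

# Synthetic stand-in for the dataset described above: 61,878 rows, 93 features.
rng = np.random.RandomState(0)
X = rng.rand(61878, 93).astype(np.float32)  # float32 halves memory vs. float64

# TruncatedSVD works directly on sparse matrices and never centers the data,
# so it avoids the dense intermediates that full PCA would allocate.
svd = TruncatedSVD(n_components=50, random_state=0)
X_reduced = svd.fit_transform(sparse.csr_matrix(X))
print(X_reduced.shape)  # (61878, 50)
```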
________
From: Jason Wolosonovich [jmwol...@asu.edu]
Sent: Monday, April 13, 2015 10:47 PM
To:
scikit-learn-general@lists.sourceforge.net<mailto:scikit-learn-general@lists.sourceforge.net>
Subject:
use random forest, instead of decision tree?
Thanks,
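A minimal sketch of the comparison being asked about (the iris data is just a placeholder, and the imports follow the current scikit-learn module layout, not the 2015-era one):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# A single decision tree vs. an ensemble of randomized, bagged trees.
tree = DecisionTreeClassifier(random_state=0)
forest = RandomForestClassifier(n_estimators=100, random_state=0)

# The forest usually generalizes better by averaging many de-correlated trees.
print("tree:  ", cross_val_score(tree, X, y, cv=5).mean())
print("forest:", cross_val_score(forest, X, y, cv=5).mean())
```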
From: Jason Wolosonovich [mailto:jmwol...@asu.edu]
Sent: Saturday, April 11, 2015 9:13 AM
To:
scikit-learn-general@lists.sourceforge.net<mailto:scikit-learn-general@lists.sourceforge.net>
Subject: Re: [Scikit-learn-general] adaboost parameters
What is your dataset like? How are you building the individual classifiers that
you are ensembling with AdaBoost? A common use case is boosted decision
stumps (one-level decision trees).
http://en.wikipedia.org/wiki/Decision_stump
http://lyonesse.stanford.edu/~langley/papers/stump.ml92.pd
Hi Roberto,
I'm no expert by any means, but I was reading a blog post the other day that
talked about using Random Search vs Grid Search. The gist of the article is
that, since you can feed distributions to Random Search and it selects values
randomly over the number of iterations you choose,
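A sketch of the random-search idea described above, assuming scipy for the distributions; the estimator and parameter ranges here are arbitrary placeholders:

```python
from scipy.stats import randint, uniform
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = load_iris(return_X_y=True)

# Distributions rather than fixed grids: each of the n_iter iterations
# draws one random parameter combination from these.
param_dist = {
    "n_estimators": randint(10, 100),
    "max_depth": randint(2, 10),
    "max_features": uniform(0.1, 0.9),  # fraction of features per split
}
search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions=param_dist,
    n_iter=10,
    cv=3,
    random_state=0,
)
search.fit(X, y)
print(search.best_params_)
```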
k for already overloaded people, with
little clear benefit.
If one of us is very bored, maybe it'll happen some time. I'm not sure we
currently have the right legal infrastructure to actually handle any income,
though.
Cheers,
Andy
On 04/04/2015 04:38 AM, Jason Wolosonovich wrote:
Hello All,
Have any of you (the developers/web site admins) considered placing the links
to lectures and videos in a more prominent place on the Scikit homepage?
Perhaps a diagonal little ribbon similar to the Github ribbon so it draws some
attention? I ask because the videos of the presentatio
Andreas,
Thank you very much for the response; your explanation makes sense. pandas has
the get_dummies() method that I've used (and then dropped one of each of the
categorical indicators to prevent multicollinearity), but I'll check out
OneHotEncoder for that purpose as well.
Sebastian,
Than
Hello All,
When using any of the preprocessing options in sklearn, is it possible to
select a subset of features (columns) in a dataset for preprocessing? Many
datasets contain a mix of feature types (categorical, numerical, binary), and it
doesn't seem like it would make sense to scale certain
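This thread predates a built-in answer, but current scikit-learn provides ColumnTransformer for exactly this. A minimal sketch with made-up data, where the last two columns are binary and passed through unscaled:

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler

# Four features: columns 0-1 numeric (to be scaled), columns 2-3 binary.
X = np.array([[5.0, 100.0, 0, 1],
              [6.0, 150.0, 1, 0],
              [7.0, 200.0, 1, 1]])

ct = ColumnTransformer(
    [("scale", StandardScaler(), [0, 1])],
    remainder="passthrough",  # binary columns pass through untouched
)
X_out = ct.fit_transform(X)
print(X_out)  # scaled columns first, then the untouched binary columns
```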