I love SciKit and I'm going to contribute an SGD classifier for
semi-supervised problems.
I have already read through all of the contributor documentation and many of
the other docs.
I'm asking the list if I should model my code off of the style/quality
of the SGDClassifier class or if there is a bet
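For what it's worth, the scikit-learn API that SGDClassifier follows boils down to a fit/predict estimator whose constructor only stores its parameters. A bare-bones skeleton along those lines might look like the following; the class name, parameters, and the -1 convention for unlabeled samples are placeholders, not a proposed design.

import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin

class SemiSupervisedSGD(BaseEstimator, ClassifierMixin):  # hypothetical name
    def __init__(self, alpha=1e-4, n_iter=5):
        # the constructor only stores parameters; no validation or work here
        self.alpha = alpha
        self.n_iter = n_iter

    def fit(self, X, y):
        # attributes learned from data get a trailing underscore
        self.classes_ = np.unique(y[y != -1])  # assuming -1 marks unlabeled samples
        # ... the actual SGD updates would go here ...
        return self

    def predict(self, X):
        # placeholder: real code would use the learned weights
        return np.full(X.shape[0], self.classes_[0])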
I have never done exactly what you are suggesting, but there is an
inverse_transform method for PCA objects, which may do what you are looking
for.
On Fri, Jan 31, 2014 at 1:25 PM, Arman Eshaghi wrote:
> Dear all,
>
> I understand that scikit-learn is a general-purpose tool; however, I would
> app
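A minimal sketch of that inverse_transform suggestion, with random data standing in for flattened image rows:

import numpy as np
from sklearn.decomposition import PCA

X = np.random.RandomState(0).rand(100, 50)     # toy stand-in for flattened images

pca = PCA(n_components=10)
X_reduced = pca.fit_transform(X)               # (100, 10) scores in component space
X_restored = pca.inverse_transform(X_reduced)  # (100, 50) reconstruction in the original space
print(X_restored.shape)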
Dear all,
I understand that scikit-learn is a general-purpose tool; however, I would
appreciate it if you could point me to a webpage, tutorial, etc. that would help
me understand the basis of my problem. I work on 3D MRI images (or 4D). A
typical problem, for example, is when I use PCA decompos
thanks.
Fred
On Thu, Jan 30, 2014 at 8:28 PM, Patrick Mineault
wrote:
> Sure you can:
>
> http://www.cs.toronto.edu/~jasper/bayesopt.pdf
>
> And some python code:
>
> https://github.com/JasperSnoek/spearmint
>
>
> On Thu, Jan 30, 2014 at 7:53 PM, Frédéric Bastien wrote:
>>
>> I have a question
Here are some results on the 20 newsgroups dataset:
Classifier       train-time    test-time    error-rate
5-nn             0.0047s       13.6651s     0.5916
random forest    263.3146s     3.9985s      0.2459
sgd              0.2265s       0.0657s
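For anyone wanting to reproduce numbers like these, a rough sketch is below; the vectorizer settings and classifier parameters are guesses, not the ones behind the table, and note that the forest only accepts sparse input on recent scikit-learn (older versions needed a dense matrix).

from time import time

from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import SGDClassifier

train = fetch_20newsgroups(subset="train")
test = fetch_20newsgroups(subset="test")
vec = TfidfVectorizer()
X_train = vec.fit_transform(train.data)
X_test = vec.transform(test.data)

for name, clf in [("5-nn", KNeighborsClassifier(n_neighbors=5)),
                  ("random forest", RandomForestClassifier(n_estimators=100)),
                  ("sgd", SGDClassifier())]:
    t0 = time()
    clf.fit(X_train, train.target)
    train_time = time() - t0
    t0 = time()
    error = 1.0 - clf.score(X_test, test.target)  # error-rate = 1 - accuracy
    test_time = time() - t0
    print("%-14s %9.4fs %9.4fs %.4f" % (name, train_time, test_time, error))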
On Fri, Jan 31, 2014 at 12:16:53PM +0100, Arnaud Joly wrote:
> We should definitely remove the old list, since it biases applicants
> toward subjects that could be without mentors.
I am in favor of editing it quite violently to remove anything that
people cannot vouch for. Maybe not remove it comp
On Wed, Jan 22, 2014 at 9:48 AM, Mathieu Blondel wrote:
>
> Something I was wondering is whether sparse support in decision trees
> would actually be useful. Do decision trees (or ensembles of them like
> random forests) work better than linear models for high-dimensional data?
>
I share your poin
2014/1/31 Felipe Eltermann :
> OK, I finished reading _tree.pyx and now I understand the CSC and dense matrix
> formats.
> I have a general view of what is necessary to be implemented.
>
> I've never seriously used Cython. What are you guys using as a development
> environment?
Just a good text editor and a
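In case it helps, a generic setup.py for compiling a standalone .pyx while experimenting could look roughly like this; the package and module names are made up, and scikit-learn itself has its own build configuration.

import numpy
from setuptools import setup
from Cython.Build import cythonize

setup(
    name="tree_experiments",                 # hypothetical package name
    ext_modules=cythonize("_my_tree.pyx"),   # hypothetical Cython module
    include_dirs=[numpy.get_include()],      # needed when the .pyx cimports numpy
)

Running python setup.py build_ext --inplace after each change rebuilds the extension next to its source, so the usual test runner picks it up straight away.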
Hello,
Your contributions to scikit-learn are highly appreciated.
However, we only use the scikit-learn mailing list to discuss
GSoC ideas. At the moment, I don’t want to give any,
but I might give some in the near future.
We should definitely remove the old list, since it biases applicants to
OK, I finished reading _tree.pyx and now I understand the CSC and dense matrix
formats.
I have a general view of what is necessary to be implemented.
I've never seriously used Cython. What are you guys using as a development
environment? How do you easily code/compile/test?
On Thu, Jan 23, 2014 at 11:55 AM, Ol
There are smarter ways to speed up SVMs with parallel computation by
changing the algorithm, e.g.:
http://www.cs.utexas.edu/~cjhsieh/dcsvm/
But this is new and not implemented in scikit-learn, and it is too
recent to be implemented and maintained as part of scikit-learn.
However it could be implemente
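To give a flavour of the divide-and-conquer idea (this is only a rough illustration of partitioning the data and fitting local SVMs, not the algorithm from the paper): cluster the training set, fit one SVM per cluster, and route each test point to the SVM of its nearest cluster. The per-cluster fits are independent, which is where the parallelism comes from.

import numpy as np
from sklearn.cluster import MiniBatchKMeans
from sklearn.svm import SVC

def fit_partitioned_svms(X, y, n_clusters=8, random_state=0):
    """Cluster X, then fit one SVM per cluster (each fit is independent)."""
    km = MiniBatchKMeans(n_clusters=n_clusters, random_state=random_state).fit(X)
    svms = []
    for k in range(n_clusters):
        mask = km.labels_ == k
        if np.unique(y[mask]).size < 2:
            svms.append(None)  # degenerate cluster, handled at prediction time
        else:
            svms.append(SVC(kernel="rbf", C=1.0).fit(X[mask], y[mask]))
    return km, svms

def predict_partitioned(km, svms, X, y_train):
    """Route each test point to the SVM of its nearest cluster."""
    cluster_ids = km.predict(X)
    y_pred = np.empty(X.shape[0], dtype=y_train.dtype)
    for k in np.unique(cluster_ids):
        idx = cluster_ids == k
        if svms[k] is None:
            # fall back to the overall majority label
            labels, counts = np.unique(y_train, return_counts=True)
            y_pred[idx] = labels[np.argmax(counts)]
        else:
            y_pred[idx] = svms[k].predict(X[idx])
    return y_pred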
2014-01-31 Thomas Johnson :
> It's definitely the bottleneck for my particular use case. I spawn ~180
> processes for a grid search on my Google Compute Engine cluster, but still
> end up waiting >90 minutes just for a few individual long-running processes
> with high C values.
Well, have you trie
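For reference, a parallel grid search over C might look like the sketch below; the toy data, the cap on the largest C, and the import path (sklearn.model_selection on recent versions, sklearn.grid_search back then) are assumptions, not what the truncated reply above was going to suggest.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# toy stand-in for the user's data
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

# log-spaced C grid; keeping the very largest C values out is an illustrative choice
param_grid = {"C": np.logspace(-2, 3, 12)}

search = GridSearchCV(SVC(kernel="rbf"), param_grid, n_jobs=-1, cv=5)
search.fit(X, y)
print(search.best_params_)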
> if not isinstance(score, numbers.Number):
>     raise ValueError("scoring must return a number, got %s (%s)"
>                      " instead." % (str(score), type(score)))
I am not opposed to making this check more relaxed: we could add an 'or
(isinstance(score, np.ndarray) and score.dty
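The relaxed check being discussed (the quoted message is cut off) would presumably accept numeric ndarrays as well as plain numbers; one possible way to write it, as a sketch:

import numbers
import numpy as np

def _check_score(score):  # hypothetical helper, not the actual patch
    ok = isinstance(score, numbers.Number) or (
        isinstance(score, np.ndarray) and np.issubdtype(score.dtype, np.number))
    if not ok:
        raise ValueError("scoring must return a number, got %s (%s)"
                         " instead." % (str(score), type(score)))
    return score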
On Thu, Jan 30, 2014 at 07:53:16PM -0500, Frédéric Bastien wrote:
> I have a question about this type of algorithm for hyper-parameter
> optimization. With a grid search, we can run all jobs in parallel. But
> I have the impression that those algorithms remove that possibility. Is
> there a way to sample ma