Relatedly, I recently noticed
http://docs.scipy.org/doc/numpy/reference/routines.dual.html
On Saturday, October 19, 2013, Thomas Unterthiner wrote:
> Hi there!
>
> I've noticed that sklearn uses numpy.linalg instead of scipy.linalg for
> its linear algebra implementations (e.g. dot or svd). Ho
> way to work around relative imports in the notebook? They don't
> seem to work as absolute imports, but absolute imports are taken from the
> installed package. I want to import from the version I'm working on.
>
> -Maheshakya
>
>
> On Mon, Sep 2, 2013 at 3:58 AM, Kenne
In addition to what everyone else has responded: you'd probably enjoy
working in an ipython notebook.
-Ken
On Aug 29, 2013 6:22 AM, "Maheshakya Wijewardena"
wrote:
> Hi
> I'm trying to implement a general Bagging module for Scikit-learn for a
> university project. I want to test the methods an
I wonder if at some point it gets easier to make an Emscripten build of the
numeric Python ecosystem, then just run it on nodejs. Anybody tried that?
-Ken (on mobile)
On Jul 17, 2013 8:02 AM, "Olivier Grisel" wrote:
> NumPy is really easy to build compared to SciPy. AFAIK you need a
> gfortran r
I haven't been following the details of this thread, but I thought: why
automate? GridSearch could, e.g., take an OrderedDict of parameters, and
try combinations in C-array order. (For parallelism, maybe batches could be
queued up in the opposite (i.e., Fortran) order, though I haven't thought
that
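To make "C-array order" concrete: `itertools.product` already iterates this way (last axis varies fastest). A minimal sketch, with a hypothetical parameter grid -- the `OrderedDict`-accepting GridSearch API is just the proposal, not anything sklearn has:

```python
from collections import OrderedDict
from itertools import product

# Hypothetical parameter grid; OrderedDict fixes the axis order.
grid = OrderedDict([
    ("C", [0.1, 1.0, 10.0]),
    ("gamma", [0.01, 0.1]),
])

# product() varies the *last* key fastest -- i.e., C-array order.
combos = [dict(zip(grid, values)) for values in product(*grid.values())]

for params in combos:
    print(params)
```

Iterating the same product with the keys reversed would give the Fortran-order batching mentioned above.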
I'll note also that the Python standard library docs have started to
include some source links (e.g.,
http://docs.python.org/2/library/webbrowser.html or
http://docs.python.org/3.4/library/html.entities.html) but it is not
consistent between modules.
-Ken
On Tue, Apr 30, 2013 at 9:34 AM, Jaques
Some gists:
https://gist.github.com/kcarnold/5439917
https://gist.github.com/kcarnold/5439945
They are rather terribly documented, sorry.
Input to such algorithms is usually given as:
- a set of similarity and dissimilarity links,
- relative comparisons (x is closer to y than w is to z), or
-
I have implemented a few metric learning algorithms myself. The quality of
that code is nowhere near sklearn standards, but I may have some incentive
to improve it soon.
-Ken
On Sun, Apr 21, 2013 at 3:42 PM, John Collins wrote:
> Has anybody or does anybody have plans to implement metric learn
If you want a Mahalanobis distance, though, you can instead just transform
your data using the Cholesky decomposition of the distance matrix.
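A quick sketch of the trick: if M = L Lᵀ is the Cholesky factorization of the Mahalanobis matrix, then (x−y)ᵀM(x−y) = ‖Lᵀ(x−y)‖², so Euclidean distance on the transformed data equals Mahalanobis distance on the original data. (The matrix M here is just a random positive-definite stand-in for whatever you learned.)

```python
import numpy as np

rng = np.random.RandomState(0)
X = rng.randn(20, 3)

# Stand-in for a learned Mahalanobis matrix -- must be positive definite.
A = rng.randn(3, 3)
M = A @ A.T + 3 * np.eye(3)

# Cholesky: M = L @ L.T, so (x - y)^T M (x - y) = ||L^T (x - y)||^2.
L = np.linalg.cholesky(M)
X_t = X @ L  # Euclidean distance here == Mahalanobis distance in X

x, y = X[0], X[1]
d_mahal = np.sqrt((x - y) @ M @ (x - y))
d_euclid = np.linalg.norm(X_t[0] - X_t[1])
```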
-Ken
On Tue, Apr 2, 2013 at 3:09 PM, Andreas Mueller wrote:
> Hi Francis.
> No. It is highly non-trivial for most distance functions to do k-means as
>
On Thu, Mar 28, 2013 at 2:35 PM, Nelle Varoquaux
wrote:
> But in general, I don't think we can "force" the user to use sparse
> matrices. They are an absolute pain to work with because of the
> inconsistencies of interface with ndarray and conversion between sparse and
> dense can be time consumin
It was a pretty easy build on Mac -- I just used MacPorts to install and
select an llvm. Of course Anaconda is even easier.
I'd say Numba is a medium-term consideration. It's enough trouble getting
everybody using C compilers, so adding LLVM to the mix is probably way too
much of a change for the
tl;dr: Try Python 3.2 with MacPorts.
Unfortunately, Scipy 0.11.0 is broken on Python 3.3.
http://projects.scipy.org/scipy/ticket/1739
This is fixed in their master branch. I just made a successful build of
that branch on OS X
10.8: https://trac.macports.org/ticket/37400#comment:15
There may be some
In your code, 'document' is just a string, not a feature vector. You should
use the same Vectorizer that you used to train the classifier to begin with.
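Roughly what that looks like (toy corpus and labels made up for illustration -- the point is reusing the *same* fitted vectorizer at predict time):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

train_docs = ["spam spam eggs", "ham and eggs", "spam offers now"]  # toy data
labels = [1, 0, 1]

vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(train_docs)
clf = MultinomialNB().fit(X_train, labels)

document = "spam spam offers"             # a plain string...
X_new = vectorizer.transform([document])  # ...becomes a feature vector here
pred = clf.predict(X_new)
```

Note it's `transform`, not `fit_transform`, on new documents -- refitting would build a different vocabulary than the classifier was trained on.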
Trained classifier objects are generally not compatible across versions.
You should retrain the classifier using the new version (and who knows,
Why not use numpy arrays of strings all along? Their advantage here is
fancy indexing... Or use X = np.arange(N) and do the fancy indexing yourself
on demand?
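Both options in miniature (the `docs` list is a made-up stand-in for whatever non-array data you're splitting):

```python
import numpy as np

docs = ["the cat", "a dog", "some fish", "the bird"]  # e.g. raw strings
N = len(docs)

# Option 1: a numpy array of strings supports fancy indexing directly.
doc_arr = np.asarray(docs)
fold = doc_arr[[0, 2]]

# Option 2: split over np.arange(N), then index the originals on demand.
idx = np.arange(N)
train_idx = idx[[1, 3]]          # pretend a CV splitter produced this
train_docs = [docs[i] for i in train_idx]
```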
-Ken
On Jan 13, 2013 11:04 PM, "Robert Layton" wrote:
> When using cross_validation.X, all arrays are checked in the normal way --
> using
I use MacPorts for Python packages that are annoying to compile (like
py27-numpy and py27-matplotlib) or that depend on external C libraries
(py27-lxml), because in both cases the tooling and (often) pre-compiled
packages are helpful.
I install sklearn from source (or pip) because that's easy and
Btw, if at some point you do need to look at diffs for the cython generated
code, temporarily removing the gitattributes file should suffice. You
probably don't even need to commit the change, though be careful not to do
so by accident :)
-Ken
On Oct 28, 2012 4:32 PM, "Andreas Mueller"
wrote:
> Hey everybody.
>
This is actually not related to sklearn at all, but I run into it often
enough that I'm replying here anyway: pickle dumps an object (first
parameter) to a file (second parameter). I get those backwards all the time,
and used to keep a utility function that swapped the args when I mixed them up.
Also, it ex
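For the record, the correct order (and the binary file mode that trips people up almost as often):

```python
import os
import pickle
import tempfile

model = {"weights": [0.1, 0.2]}  # any picklable object

path = os.path.join(tempfile.gettempdir(), "model_demo.pkl")
with open(path, "wb") as f:     # note: binary mode
    pickle.dump(model, f)       # object first, file second

with open(path, "rb") as f:
    restored = pickle.load(f)
```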
setup.py could only try to regenerate the file (and thus require
cython) if the source has been modified. Here's an example, though
there are likely better ways to accomplish the same thing:
https://github.com/commonsense/divisi2/blob/master/setup.py#L57
-Ken
On Fri, Oct 12, 2012 at 2:59 PM, Jak
t 7:21 AM, Gael Varoquaux <
gael.varoqu...@normalesup.org> wrote:
> On Fri, Oct 05, 2012 at 09:15:29PM -0400, Kenneth C. Arnold wrote:
> > Another option: ignore them in development branches and add them back in
> the
> > release branches.
>
> The goal is that anybody clon
If the generated files were marked as binary, they wouldn't show in diffs.
(I thought they already were... )
Another option: ignore them in development branches and add them back in
the release branches. This would also make cherry picking from the
development branch less likely to have merge conf
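The binary marking would be a one-line `.gitattributes` change, roughly (pattern is illustrative -- in practice you'd scope it to only the generated files, not handwritten C):

```
# .gitattributes -- the built-in "binary" macro implies -diff -merge -text
*.c binary
```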
On Thu, Jan 19, 2012 at 11:03 AM, Satrajit Ghosh wrote:
> in one of my projects i use the scipy sparse library for turning a graph
> into a sparse dependency matrix and then manipulating this matrix
> (adding/subtracting columns/rows, setting elements to 0, ...). this is the
> only reason i have s
On Thu, Jan 19, 2012 at 3:05 AM, Olivier Grisel
wrote:
> Rather than improving the error message when passing sparse arrays to
> the dense impl of SVC we should refactor SVC to accept both dense and
> sparse representation and use the right wrapper as already done for
> SGD, LinearSVC, LogisticReg
It may be relevant to note that Cython has recently gained some OpenMP
support: http://docs.cython.org/src/userguide/parallelism.html -- I
haven't tried it, but perhaps it could help improve the scikit-learn
implementation.
-Ken
On Dec 23, 2011 7:31 AM, "Benjamin Hepp" wrote:
>
> Hi,
>
> I was
On Tue, Nov 29, 2011 at 4:53 PM, Olivier Grisel
wrote:
> Now back to you problem I think we should support fitting models with
> just one sample just for the sake of consistency / continuity even if
> there is no practical application of fitting models with a single
> sample: fitting models with
There is no maximum likelihood solution to a GP with a single training
point, but you can certainly draw samples from the posterior; in fact,
you can draw samples from the prior (without conditioning on data).
That may help you determine if your covariance function is reasonable:
samples from the p
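Drawing from the prior takes only a few lines of numpy -- here with a squared-exponential covariance as a stand-in for whatever kernel you're actually using:

```python
import numpy as np

rng = np.random.RandomState(0)
x = np.linspace(0, 5, 50)

# Squared-exponential covariance, length scale 1.0 -- swap in your own kernel.
K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2)

# Samples from the GP *prior*: no data, no conditioning, just the kernel.
jitter = 1e-8 * np.eye(len(x))  # for numerical stability
samples = rng.multivariate_normal(np.zeros(len(x)), K + jitter, size=3)
```

Plotting `samples` against `x` shows the kind of functions the kernel considers plausible, which is exactly the sanity check described above.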
a Mac side note:
I have found that MacPorts solves most of my
getting-things-running-on-Mac problems. Either you can just use their
packages directly, often with precompiled binary downloads, or at
least `port info py27-scipy` will show you the package names for the
dependencies. (I'm currently ru
On Thu, Nov 10, 2011 at 9:40 AM, Gael Varoquaux
wrote:
> I think that it might be an interesting addition. I say 'might' because I
> have given such ideas a try on general problems, and they actually often
> do not work well: the score as a function of parameters is often a nasty
> landscape. Firs
I had nothing to do with that page, but it's
https://github.com/scikit-learn/scikit-learn/wiki/Related-Projects.
-Ken
2011/11/4 Frédéric Bastien :
> On Fri, Nov 4, 2011 at 4:24 PM, Kenneth C. Arnold
> wrote:
>> +1 for the sklearn review process AND for cooperating with othe
On Fri, Nov 4, 2011 at 12:25 PM, Mathieu Blondel wrote:
> Another possibility is to host a Theano-based implementation as a
> side project on github and make the API scikit-learn compatible.
>
> # In general, I don't really buy the "why implement X if it already
> exists in Y" argument because it
On Wed, Nov 2, 2011 at 6:04 PM, Olivier Grisel wrote:
> 2011/11/2 Radim Rehurek :
>> If you decide to implement the randomized PCA, I can offer some observations:
>>
>> 1. oversampling does little, accuracy comes mostly from the extra power
>> iteration steps
>> 2. no power iterations result in m
:)
-Ken
> On Fri, Oct 28, 2011 at 11:00 PM, Conrad Lee wrote:
>>
>> Kenneth,
>>
>>
>> On Fri, Oct 28, 2011 at 3:44 PM, Kenneth C. Arnold
>> wrote:
>>>
>>> I just implemented Latent Dirichlet Allocation with collapsed Gibbs
>>> samp
I just implemented Latent Dirichlet Allocation with collapsed Gibbs
sampling and made a demo on 20 Newsgroups. If there's interest in
having this in sklearn, I could clean up the code for contribution.
I noticed there was same discussion back in January about PyMC that
didn't reach an actionable c
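For a sense of scale, the whole sampler fits in a page. This is an illustrative re-sketch of collapsed Gibbs for LDA (not the gist's actual code, and nothing like a polished sklearn estimator):

```python
import numpy as np

def lda_gibbs(docs, n_topics, n_vocab, n_iter=50, alpha=0.1, beta=0.01, seed=0):
    """Tiny collapsed Gibbs sampler for LDA.

    docs: list of lists of word ids in [0, n_vocab).
    Returns the topic-word count matrix.
    """
    rng = np.random.RandomState(seed)
    n_dt = np.zeros((len(docs), n_topics))   # doc-topic counts
    n_tw = np.zeros((n_topics, n_vocab))     # topic-word counts
    n_t = np.zeros(n_topics)                 # per-topic totals
    z = []                                   # current topic assignments
    for d, doc in enumerate(docs):           # random initialization
        zd = rng.randint(n_topics, size=len(doc))
        z.append(zd)
        for w, t in zip(doc, zd):
            n_dt[d, t] += 1; n_tw[t, w] += 1; n_t[t] += 1
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                t = z[d][i]                  # remove the word's current topic
                n_dt[d, t] -= 1; n_tw[t, w] -= 1; n_t[t] -= 1
                # collapsed conditional: p(t) ∝ (n_dt+α)(n_tw+β)/(n_t+Vβ)
                p = (n_dt[d] + alpha) * (n_tw[:, w] + beta) \
                    / (n_t + n_vocab * beta)
                t = rng.choice(n_topics, p=p / p.sum())
                z[d][i] = t                  # resample and restore counts
                n_dt[d, t] += 1; n_tw[t, w] += 1; n_t[t] += 1
    return n_tw
```

A pure-Python inner loop like this is slow on real corpora; a contribution-quality version would push it into Cython.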