Re: [Scikit-learn-general] [scikit-learn] ImportError: Please do not forget to run `make` first (#359)

2011-09-24 Thread Vlad Niculae
can be closed and I can be rehabilitated in the public eye :) Best, Vlad On Sat, Sep 24, 2011 at 3:25 PM, josef.p...@gmail.com wrote: On Sat, Sep 24, 2011 at 8:08 AM, Vlad Niculae v...@vene.ro wrote: The date should be 24th I think since I uploaded it late at night. You can get it from PyPI

Re: [Scikit-learn-general] [scikit-learn] Failed to install sklearn 0.9 on Windows XP (py 2.7) (#368) (fwd)

2011-09-28 Thread Vlad Niculae
Hopefully the positive reports from the initial discussion are actually accurate and the new installers are working. As far as I can tell, sourceforge still hosts the old, buggy installers. (looking at the date) If this turns out to be the case, who can update the sourceforge files? Vlad On

Re: [Scikit-learn-general] Passing attributes to the compiler when building win32 binaries

2011-10-03 Thread Vlad Niculae
I think that the binaries without statically linked libgcc and libstdc++ would have worked if run from a mingw32 environment (ie. they worked in mine). I wonder if it works the other way around (ie. no static linking, build with MSVC, run from within mingw32) Fabian, is the 0.8 release built with

[Scikit-learn-general] Gaussian processes missing from class reference

2011-10-12 Thread Vlad Niculae
Hello As far as I can tell (hope I'm not too tired and missing something), gaussian processes are missing from the class reference. They are not included in the classes.rst index. Is this just an omission? Because the module seems to have solid docstrings that deserve to be listed. Best, Vlad

Re: [Scikit-learn-general] OMP behavior

2011-10-18 Thread Vlad Niculae
Thank you for the observation. I have been looking into this since yesterday where the same thing has been reported on my blog by Bob L. Strum. At the moment I have no idea what the cause is. Does it behave in the same way if you use the gram solver instead? Best, Vlad On Tue, Oct 18, 2011 at

Re: [Scikit-learn-general] OMP behavior

2011-10-19 Thread Vlad Niculae
Interesting. I've been staring at the code but the algorithm itself shouldn't be losing precision. On the other hand, there are those stopping conditions that I had taken from the C implementation of the author of the Cholesky-OMP paper. If it's as you say, it could be that when it fails, OMP

Re: [Scikit-learn-general] Documentation linking

2011-10-19 Thread Vlad Niculae
Hi Jake, A while back I remember having that issue because my local version of sphinx was higher than 1.0.0 and thus unsupported by the scikit-learn docs, so the function links wouldn't work when I built it locally, but they would work in the online-generated version. Did the sphinx version used

Re: [Scikit-learn-general] Multi Layer Perceptron / Neural Network in Sklearn

2011-11-04 Thread Vlad Niculae
On Fri, Nov 4, 2011 at 4:54 PM, Andreas Müller amuel...@ais.uni-bonn.de wrote: On 11/04/2011 03:49 PM, Andreas Müller wrote: On 11/04/2011 03:42 PM, Alexandre Passos wrote: On Fri, Nov 4, 2011 at 10:34, Lars Buitinck l.j.buiti...@uva.nl wrote: 2011/11/4 Alexandre Passos alexandre...@gmail.com:

Re: [Scikit-learn-general] Parallel GridSearchCV on sparse.SVC fails with ValueError

2011-11-06 Thread Vlad Niculae
Yes, I was thinking of a sequential, exploratory IPython-style thing where you change something in your X and re-fit, when you don't want to clone and delete the old estimator. Hope this makes sense. Vlad 2011/11/6 Lars Buitinck l.j.buiti...@uva.nl: 2011/11/6 Vlad Niculae v...@vene.ro

Re: [Scikit-learn-general] pl.pcolormesh in the examples

2011-11-10 Thread Vlad Niculae
Very much +1, I would always cringe when seeing colormesh calls. I gave a talk using the IPython notebook and scikit-learn examples, setting it to SVG mode at the top, and I would have to switch back to PNG mode for all examples using pcolormesh because it would crash (well probably just take

Re: [Scikit-learn-general] December sprint planning (NIPS edition)

2011-11-14 Thread Vlad Niculae
Gael, can you give me the info (URL and telephone number) for the guest house? I would like to call them to make a reservation for me. I can make a reservation for other people too at the same time if needed (since Gael only reserved for 5 people). Yes and maybe we can benefit from Fabian's

Re: [Scikit-learn-general] NLP course at Stanford available for enrollment

2011-11-21 Thread Vlad Niculae
On Mon, Nov 21, 2011 at 2:08 PM, Lars Buitinck l.j.buiti...@uva.nl wrote: 2011/11/21 Jacob VanderPlas vanderp...@astro.washington.edu: I would recommend these: I'm currently taking the Machine Learning course, taught by Andrew Ng, which will be offered again in January. It's been a great intro

Re: [Scikit-learn-general] scikit test failure on osx

2011-11-27 Thread Vlad Niculae
Hello Massimo I believe this is an issue others, including me, have faced: https://github.com/scikit-learn/scikit-learn/issues/445 https://github.com/scikit-learn/scikit-learn/issues/330 I reverted to the stable versions of numpy and scipy from their website, and the bleeding-edge scikit-learn,

Re: [Scikit-learn-general] A new jenkins integration server is online for scikit-learn

2011-11-28 Thread Vlad Niculae
Hi Olivier, This is very cool. Could we plot average test coverage as well, similar to pep8? Is there a way to subscribe to the build reports, like with the buildbot? I signed up but still couldn't find one. Vlad On Mon, Nov 28, 2011 at 4:26 AM, Olivier Grisel olivier.gri...@ensta.org wrote:

Re: [Scikit-learn-general] Issue with gaussian processes

2011-11-29 Thread Vlad Niculae
On Tue, Nov 29, 2011 at 10:02 PM, Alexandre Gramfort alexandre.gramf...@inria.fr wrote: Hi Alex, I would say: if it makes sense to fit a GP with only one point: it should be fixed Note that even though it might not make any sense in practice, unless there's a mathematical reason that I'm

Re: [Scikit-learn-general] Specify rather than learn sparse coding dictionary?

2011-12-06 Thread Vlad Niculae
On Dec 6, 2011, at 11:04 , Gael Varoquaux wrote: On Tue, Dec 06, 2011 at 09:41:56AM +0200, Vlad Niculae wrote: This is actually exactly how the module is designed. Great design! I should have looked at it closer before writing my mail. We have BaseDictionaryLearning which only implements

Re: [Scikit-learn-general] Specify rather than learn sparse coding dictionary?

2011-12-06 Thread Vlad Niculae
On Tue, Dec 6, 2011 at 11:46 PM, Alexandre Gramfort alexandre.gramf...@inria.fr wrote: I do confirm that Lasso and LassoLars both minimize 1/(2n) ||y - Xw||_2^2 + alpha ||w||_1 and that the n should not be present in the sparse coding context. it means :
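To make the scaling issue in this message concrete, here is a small sketch that evaluates the objective both with and without the 1/n factor. It assumes the squared L2 norm for the data term, as in the scikit-learn Lasso documentation; the helper name `lasso_objective` and its `scale_by_n` flag are made up for illustration and are not part of any library.

```python
import numpy as np


def lasso_objective(X, y, w, alpha, scale_by_n=True):
    """Squared-error data term plus an L1 penalty on w.

    scale_by_n=True divides the data term by 2 * n_samples (regression
    convention); scale_by_n=False divides it by 2 only (no n).
    Hypothetical helper, written only for this illustration.
    """
    n_samples = X.shape[0]
    residual = y - X @ w
    denom = 2.0 * n_samples if scale_by_n else 2.0
    return residual @ residual / denom + alpha * np.abs(w).sum()


# Tiny hand-checkable problem: residual is [0, 1, 1, 0], so ||r||^2 = 2.
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0]])
y = np.array([1.0, 2.0, 3.0, 2.0])
w = np.array([1.0, 1.0])

with_n = lasso_objective(X, y, w, alpha=0.5)                       # 2/8 + 0.5*2
without_n = lasso_objective(X, y, w, alpha=0.5, scale_by_n=False)  # 2/2 + 0.5*2
```

With the 1/n factor, the same alpha weighs the penalty more heavily relative to the data term as n grows, which is presumably why the factor matters when the solver is reused for sparse coding.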

Re: [Scikit-learn-general] Image denoising example

2011-12-07 Thread Vlad Niculae
I think I know what happened here. An upstream change in scipy removed scipy.lena() and left only scipy.misc.lena(). I wonder if this affects other examples as well. I will try to check and patch this soon. Vlad On Wed, Dec 7, 2011 at 1:38 PM, Gael Varoquaux gael.varoqu...@normalesup.org

Re: [Scikit-learn-general] December sprint planning (NIPS edition)

2011-12-08 Thread Vlad Niculae
On Dec 8, 2011, at 20:11 , David Warde-Farley wrote: On Tue, Nov 15, 2011 at 03:13:53AM +0900, Mathieu Blondel wrote: Hi, Thanks heaps Gael. I'm planning to contact the guy by tomorrow. I think it would be easier for him if we don't contact him individually. I can make the reservations

Re: [Scikit-learn-general] using string features for classification

2012-01-03 Thread Vlad Niculae
On Jan 3, 2012, at 17:02 , Olivier Grisel wrote: 2012/1/3 Lars Buitinck l.j.buiti...@uva.nl: We probably need to extend the sklearn.feature_extraction.text package to make it more user friendly to work with pure categorical features occurrences: I'm not sure this belongs in

Re: [Scikit-learn-general] sklearn.test() weirdness

2012-01-05 Thread Vlad Niculae
On Jan 5, 2012, at 23:45 , Fabian Pedregosa wrote: and that was quite convenient for testing on systems on which nosetest fails (windows). Hi Fabian Could you please be more specific regarding this point, since as a former Windows user, I find that I don't know what you mean. On topic, I

[Scikit-learn-general] Development doc not being updated

2012-01-06 Thread Vlad Niculae
Hello all, especially Fabian. I've noticed that the new examples still don't show up in scikit-learn.org/dev/, in particular the multi-label one that I'd like to show off. Can somebody address this? Sorry if this should be discussed somewhere else. Best, Vlad

Re: [Scikit-learn-general] Development doc not being updated

2012-01-06 Thread Vlad Niculae
On Jan 6, 2012, at 17:39 , Olivier Grisel wrote: 2012/1/6 Vlad Niculae zephy...@gmail.com: Hello all, especially Fabian. I've noticed that the new examples still don't show up in scikit-learn.org/dev/, in particular the multi-label one that I'd like to show off. Can somebody address

[Scikit-learn-general] Verbose joblib output

2012-01-06 Thread Vlad Niculae
Hello everybody, This is something that has been bugging me for a while. I am not exactly sure what entity is printing these messages (I assume joblib) but when doing a verbose CV with n_jobs=1, the progress report looks something like: [Parallel(n_jobs=1)]: Done job 20 | elapsed: 28.3s

Re: [Scikit-learn-general] test time performance and sparse - dense dot products

2012-01-08 Thread Vlad Niculae
Olivier's solution sounds good. And it's easy to implement too :) @pprett can you confirm it solves your perf issue on your data? I'm talking without actually looking at the code but as long as after fit, the array will only be needed in F-order, this feels right. However afaik

Re: [Scikit-learn-general] Putting SVC and NuSVC into the same class

2012-01-08 Thread Vlad Niculae
A bit off topic but since we're talking about work on the SVM module, I noticed something wrong with the docs. http://scikit-learn.org/dev/modules/svm.html#tips-on-practical-use The scaling part makes reference to some Cookbook (I don't know what this is, it probably died before I joined you

Re: [Scikit-learn-general] sparse_encode implementation

2012-01-09 Thread Vlad Niculae
Short answer, no. sparse_encode is just a wrapper for functionality that existed in the scikit already (lasso, omp), with support for parallelization. We couldn't embed SPAMS anyway, because of the license IIRC. A benchmark would be interesting indeed. Vlad On 09.01.2012, at 18:02, Ian

Re: [Scikit-learn-general] Strange install instructions

2012-01-18 Thread Vlad Niculae
I am quoting from http://docs.python.org/distutils/builtdist.html By default the installer will display the cool “Python Powered” logo when it is run, but you can also supply your own 152x261 bitmap which must be a Windows .bmp file with the --bitmap option. I'm assuming -b is short for

Re: [Scikit-learn-general] Strange install instructions

2012-01-18 Thread Vlad Niculae
On Jan 18, 2012, at 20:23 , Andreas wrote: On 01/18/2012 07:19 PM, Vlad Niculae wrote: I am quoting from http://docs.python.org/distutils/builtdist.html By default the installer will display the cool “Python Powered” logo when it is run, but you can also supply your own 152x261 bitmap

Re: [Scikit-learn-general] GSoC 2012

2012-01-18 Thread Vlad Niculae
On Jan 19, 2012, at 00:23 , Gael Varoquaux wrote: On Wed, Jan 18, 2012 at 07:37:12PM +0900, Mathieu Blondel wrote: It would be nice if you could make a few contributions to scikit-learn before the application process starts. This will allow you to familiarize with the code base, us to

Re: [Scikit-learn-general] Announce: scikit-learn 0.10

2012-01-27 Thread Vlad Niculae
sorry, I don't have a Windows system at the moment, if you have a VM could you do it? If you're not set up either, I'll do it in a day or two. Best, Vlad Sent from my iPod On 27.01.2012, at 12:29, Fabian Pedregosa fabian.pedreg...@inria.fr wrote: @vene: do you have time to make the windows

Re: [Scikit-learn-general] Announce: scikit-learn 0.10

2012-01-27 Thread Vlad Niculae
My pleasure, I'm sorry for the delay! Sent from my iPod On 27.01.2012, at 20:49, Vincent Dubourg vincent.dubo...@gmail.com wrote: Thank you Vlad! After a slight upgrade of both numpy and scipy I managed to get a brand new working 0.10 sklearn! On 27/01/2012 17:32, Gael Varoquaux wrote: On

Re: [Scikit-learn-general] mean square error

2012-02-01 Thread Vlad Niculae
Sent from my iPod On 01.02.2012, at 15:43, Mathieu Blondel math...@mblondel.org wrote: On Wed, Feb 1, 2012 at 10:10 PM, David Warde-Farley warde...@iro.umontreal.ca wrote: I might suggest mean over training examples but sum over output dimensions, if there is more than one. Currently,
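The convention suggested in the quoted message (mean over training examples, sum over output dimensions) can be sketched in a few lines of NumPy. This is only an illustration of that suggestion, not the behaviour of any scikit-learn metric; the function name is hypothetical.

```python
import numpy as np


def mse_mean_samples_sum_outputs(y_true, y_pred):
    """Sum squared errors over outputs, then average over samples.

    Hypothetical helper illustrating the convention discussed above.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Treat 1-D targets as a single output column.
    y_true = y_true.reshape(y_true.shape[0], -1)
    y_pred = y_pred.reshape(y_pred.shape[0], -1)
    squared = (y_true - y_pred) ** 2
    return squared.sum(axis=1).mean()


y_true = np.array([[1.0, 2.0], [3.0, 4.0]])
y_pred = np.array([[1.0, 0.0], [2.0, 4.0]])

per_sample_sum = mse_mean_samples_sum_outputs(y_true, y_pred)  # (4 + 1) / 2
plain_mean = np.mean((y_true - y_pred) ** 2)                   # 5 / 4
```

The two conventions differ by a factor equal to the number of outputs, so they coincide for single-output regression.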

Re: [Scikit-learn-general] optimization with constraints

2012-02-03 Thread Vlad Niculae
A nice idea would be to extend the scipy NNLS in the ways needed to use it in scikit-learn's NMF instead of the _nls_subproblem code translated from C.J. Lin's code. The scipy NNLS is written in Fortran. I'd like to bench _nls_subproblem against it. Maybe we could have a cython projected sgd

Re: [Scikit-learn-general] optimization with constraints

2012-02-03 Thread Vlad Niculae
On Feb 3, 2012, at 18:07 , Mathieu Blondel wrote: On Fri, Feb 3, 2012 at 11:55 PM, Vlad Niculae zephy...@gmail.com wrote: The scipy NNLS is written in Fortran. I'd like to bench _nls_subproblem against it. Maybe we could have a cython projected sgd non-negative least square method

Re: [Scikit-learn-general] Windows vista installation error

2012-02-11 Thread Vlad Niculae
Hi Andre The installation instructions you are referring to apply only for installing scikit-learn from source. If you downloaded the binary installer (like you said) and ran it, there is no need to do `python setup.py install`. It should work for you to type `import sklearn` in the Python

Re: [Scikit-learn-general] strange behavior while using permutation_test_score

2012-02-29 Thread Vlad Niculae
On Feb 29, 2012, at 21:53 , Olivier Grisel wrote: 2012/2/29 Matthias Ekman matthias.ek...@googlemail.com: I did some further testing and could reproduce the error on several machines including a fresh install of debian squeeze using python 2.6.6. However the problem only occurs with the last

Re: [Scikit-learn-general] k nearest neighbour

2012-03-29 Thread Vlad Niculae
I will try to pick up the work on the one-hot transformer: https://github.com/scikit-learn/scikit-learn/pull/242 Vlad On Mar 29, 2012, at 11:36 , Andreas wrote: Hi Mohit. Generally all algorithms in sklearn assume that all features are continuous. Does discrete in your case mean categorial

Re: [Scikit-learn-general] GSoC 2012 pre-application

2012-04-04 Thread Vlad Niculae
Hello guys, Unfortunately I have come down with the flu, and therefore missed a good amount of time to work on gsoc 2012 proposals. I know that there's not much time left for review, but here is my pre-proposal for an overall speedup and benchmarking project.

Re: [Scikit-learn-general] Reminder GSoC Student Application Deadline is: April 06 at 19:00 UTC

2012-04-04 Thread Vlad Niculae
On Apr 4, 2012, at 17:11 , Olivier Grisel wrote: Detailed instructions and links on the wiki: https://github.com/scikit-learn/scikit-learn/wiki/A-list-of-topics-for-a-google-summer-of-code-%28gsoc%29-2012 Please write the draft proposal on a google document or some wiki page on your

Re: [Scikit-learn-general] GSoC 2012 pre-application

2012-04-05 Thread Vlad Niculae
Hi everyone I have updated my proposal thanks to your excellent suggestions. I also pointed out the style of optimization that will be applied by linking to my blog post on optimizing orthogonal matching pursuit code. Unfortunately this will also flash the bug I introduced before everyone's

Re: [Scikit-learn-general] gsoc application MLP

2012-04-05 Thread Vlad Niculae
Hi David, Like Gael said in the other thread, try to submit your proposal well before the deadline. You can still edit it on their site. I agree with everybody regarding the importance of testing and examples. They are not afterthoughts. The documentation, though, can be left until the final

Re: [Scikit-learn-general] gsoc application MLP

2012-04-05 Thread Vlad Niculae
On Apr 6, 2012, at 02:56 , Andreas Mueller wrote: On 04/05/2012 11:17 PM, Vlad Niculae wrote: I would like to see a reproduction of the standard neural net digits example: http://ufldl.stanford.edu/wiki/images/8/84/SelfTaughtFeatures.png That looks like the weights of an autoencoder

Re: [Scikit-learn-general] gsoc application MLP

2012-04-05 Thread Vlad Niculae
Actually I couldn't find the code but I found something better, the assignment notes: https://github.com/SaveTheRbtz/ml-class/blob/master/ex4.pdf If you ran more iterations it would only get better. Looking back this was a very good class. Vlad On Apr 6, 2012, at 06:54 , Vlad Niculae wrote

Re: [Scikit-learn-general] gsoc application MLP

2012-04-06 Thread Vlad Niculae
On Apr 6, 2012, at 10:19 , Andreas Mueller wrote: On 04/06/2012 08:04 AM, xinfan meng wrote: On Fri, Apr 6, 2012 at 1:57 PM, David Warde-Farley warde...@iro.umontreal.ca wrote: On 2012-04-05, at 5:17 PM, Vlad Niculae zephy...@gmail.com wrote: http://ufldl.stanford.edu/wiki/images

Re: [Scikit-learn-general] GSoC proposal for Bayesian networks: update

2012-04-06 Thread Vlad Niculae
Hi Shankar I am also following the PGM class and I would like to stress that the way they implement all the factor operations feels to me to be by no means efficient, way too much random memory indexing. However the class seems very insightful, maybe after it ends we will be illuminated as

Re: [Scikit-learn-general] SERIOUS BUG

2012-04-17 Thread Vlad Niculae
I think just moving from a train set to a test set would be problematic for small n_samples. Vlad On Apr 17, 2012, at 15:48 , Olivier Grisel wrote: Le 17 avril 2012 05:39, Gael Varoquaux gael.varoqu...@normalesup.org a écrit : On Tue, Apr 17, 2012 at 03:35:26PM +0300, Dimitrios Pritsos

Re: [Scikit-learn-general] GSOC 12' 3/3 !!!

2012-04-23 Thread Vlad Niculae
I am very flattered and happy! Thanks to everybody who helped provide this opportunity. I think it is very exciting for scikit-learn to have 3 GSoCers, and it's also a sign of our growth. Congratulations to David and Immanuel, great work so far, looking forward to interacting as much as we can

Re: [Scikit-learn-general] Regression with multiple outputs

2012-04-30 Thread Vlad Niculae
For now, lasso (and some others) can be invoked through the sparse_encode function and it does the multitarget wrapping automatically over multiple cores. Just pay attention to the shapes of the inputs since they need to be transposed (the function makes sense in a dictionary learning context).

Re: [Scikit-learn-general] Merging pyCRFSuite into scikit-learn

2012-05-02 Thread Vlad Niculae
There has been quite some interest in this in the last couple of months, so I'm sure it will get some momentum. The question is whether Jake and Olivier's points about the inappropriateness of the data structures can actually get a workaround or if this is (more or less) pointless. If crfsuite

[Scikit-learn-general] Need for Speed liftoff: Linear Regression models

2012-05-05 Thread Vlad Niculae
Hello everybody, I will start my effort for my GSoC project for this year, as discussed, with making the linear models faster where applicable, most importantly in multi-task regression problems. The plan (which will be piloted now, and towards the middle of the summer, hopefully will get

Re: [Scikit-learn-general] reproducing test failures

2012-05-09 Thread Vlad Niculae
I can confirm that that exact same test halted for me once too. I thought it was my old Windows PC that overheated. Sorry for not mentioning it. Vlad On May 9, 2012, at 04:11 , Yaroslav Halchenko wrote: On Wed, 09 May 2012, Olivier Grisel wrote: so if it fails for some specific seed, I

Re: [Scikit-learn-general] Need for Speed liftoff: Linear Regression models

2012-05-11 Thread Vlad Niculae
A significant part of this project will consist of the benchmark suite itself, that will need to be run by the CI we will deploy. The question is where to host the benchmark suite. Should I create a new repo in the scikit-learn project? scikit-learn/speed scikit-learn/scikit-learn-speed

Re: [Scikit-learn-general] Need for Speed liftoff: Linear Regression models

2012-05-12 Thread Vlad Niculae
On May 11, 2012, at 11:28 , Olivier Grisel wrote: 2012/5/11 Vlad Niculae zephy...@gmail.com: A significant part of this project will consist of the benchmark suite itself, that will need to be run by the CI we will deploy. The question is where to host the benchmark suite. Should I create

Re: [Scikit-learn-general] linear model benchmarking

2012-05-28 Thread Vlad Niculae
On May 28, 2012, at 13:50 , Immanuel B wrote: Hello, I could use some feedback on how to best set-up a benchmark for these models: l2 loss* log loss* multi-logit* with l1 and l1 l2 penalty Please have a look at the following file:

Re: [Scikit-learn-general] linear model benchmarking

2012-05-31 Thread Vlad Niculae
On May 31, 2012, at 12:42 , Immanuel B wrote: Does N mean n_samples and p n_features? yes What about number of targets, is it 1 everywhere? not sure what you mean... The first table contains binary classification data, in the second table the number of classes is given by #class. for

[Scikit-learn-general] What to do when the user's Gram matrix is likely wrong?

2012-06-26 Thread Vlad Niculae
This is a consistency question. I found that enet_path has a clever behaviour for this: https://github.com/scikit-learn/scikit-learn/tree/master/sklearn/linear_model/coordinate_descent.py#L561 The logic here is: if center_data changes X, then X wasn't centered. If this is the case, and the

Re: [Scikit-learn-general] congratulations to Peter and to scikit-learn!

2012-07-05 Thread Vlad Niculae
Congratulations Peter! Excellent work as always! Vlad On Jul 5, 2012, at 00:48 , Emanuele Olivetti wrote: Dear All, As some of you may have already noticed, Peter (Prettenhofer) has just won the Online Product Sales competition on kaggle.com beating 365 teams:

[Scikit-learn-general] Summary of my recent blog post and GSoC progress

2012-07-05 Thread Vlad Niculae
Hello friends, As the midterm evaluation is approaching, I pushed the pedal to the metal and my blog and github profile have seen a lot of activity recently. I would like to link to everything from one place, and that place will be this e-mail. So this is what happened: -- I wrote a couple of

Re: [Scikit-learn-general] Summary of my recent blog post and GSoC progress

2012-07-06 Thread Vlad Niculae
Progress update, I adapted the ml-benchmarks over at https://github.com/vene/scikit-learn-speed/tree/ml-benchmarks On Jul 5, 2012, at 15:14 , Olivier Grisel wrote: Thanks very much for the wrap up Vlad. Could you please document how to use the %memit and %mrun tools in the performance chapter

Re: [Scikit-learn-general] Incorporation of extra training examples

2012-07-09 Thread Vlad Niculae
Another (hackish) idea to try would be to keep the labels of the extra data but give it a sample_weight low enough not to override your good training data. On 09.07.2012, at 12:43, Philipp Singer kill...@gmail.com wrote: Hey! I am currently doing text classification. I have the following

Re: [Scikit-learn-general] Summary of my recent blog post and GSoC progress

2012-07-10 Thread Vlad Niculae
As per Gael's request, here is my progress compared to what was initially stated as mid-term goals. Overall the project is behind schedule, but not far, and I am fairly confident about its successful completion. -- GOAL: Set up a running performance benchmark such as speed.pypy.org or Wes

Re: [Scikit-learn-general] Incorporation of extra training examples

2012-07-11 Thread Vlad Niculae
On Jul 11, 2012, at 10:14 , Philipp Singer wrote: Am 11.07.2012 10:11, schrieb Olivier Grisel: LinearSVC is based on the liblinear C++ library which AFAIK does not support sample weight. Well, that's true. You should better have a look at SGDClassifier:

Re: [Scikit-learn-general] Summary of my recent blog post and GSoC progress

2012-07-11 Thread Vlad Niculae
This has been merged. https://github.com/vene/scikit-learn-speed There are now easy instructions you can follow to run the suite on your own machine. One step closer to running remotely. On Jul 6, 2012, at 18:33 , Vlad Niculae wrote: Progress update, I adapted the ml-benchmarks over

Re: [Scikit-learn-general] Summary of my recent blog post and GSoC progress

2012-07-12 Thread Vlad Niculae
On Jul 12, 2012, at 12:30 , Gael Varoquaux wrote: On Thu, Jul 12, 2012 at 12:16:50PM +0200, Olivier Grisel wrote: I get no results... I haven't followed too much the codebase (I should have but...). That said, I must confess that I am a bit frightened at the number of different

Re: [Scikit-learn-general] Summary of my recent blog post and GSoC progress

2012-07-12 Thread Vlad Niculae
On Jul 12, 2012, at 14:10 , Lars Buitinck wrote: 2012/7/12 Gael Varoquaux gael.varoqu...@normalesup.org: I haven't followed too much the codebase (I should have but...). That said, I must confess that I am a bit frightened at the number of different technologies that are being put together.

Re: [Scikit-learn-general] Summary of my recent blog post and GSoC progress

2012-07-15 Thread Vlad Niculae
on master I can run the suite again and we will have a line with 2 points, yay! V On Jul 12, 2012, at 14:25 , Gael Varoquaux wrote: On Thu, Jul 12, 2012 at 02:22:42PM +0200, Vlad Niculae wrote: For example a thing that hurts me is that for every 'predict' benchmark, the model is refitted

Re: [Scikit-learn-general] Summary of my recent blog post and GSoC progress

2012-07-15 Thread Vlad Niculae
, Jul 15, 2012 at 7:07 PM, Vlad Niculae zephy...@gmail.com wrote: After some bugfixes with Olivier's help, I published the output of the scikit-learn-speed here: http://vene.github.com/scikit-learn-speed/ Because there is only one data point, it looks like the plots are empty, but you can

Re: [Scikit-learn-general] Summary of my recent blog post and GSoC progress

2012-07-21 Thread Vlad Niculae
A preview of the HTML benchmarking report that will soon be deployed: http://blog.vene.ro/2012/07/20/scikit-learn-speed-html-reports-teaser/ Best, Vlad On Jul 16, 2012, at 20:05 , Peter Prettenhofer wrote: It seems like vbench is failing when it tries to run the following git command::

Re: [Scikit-learn-general] Summary of my recent blog post and GSoC progress

2012-07-29 Thread Vlad Niculae
! The tests just need some cleaning up, and scikit-learn-speed will soon be up and running! On Jul 21, 2012, at 15:54 , Vlad Niculae zephy...@gmail.com wrote: A preview of the HTML benchmarking report that will soon be deployed: http://blog.vene.ro/2012/07/20/scikit-learn-speed-html-reports

Re: [Scikit-learn-general] ndarray is not fortran contiguous

2012-08-02 Thread Vlad Niculae
Either way, is there a reason that I'm missing, why np.array([0]) should be both C- and F-contiguous, but np.array([[0]]) can only be one of them at a time? On Aug 2, 2012, at 17:26 , Olivier Grisel olivier.gri...@ensta.org wrote: 2012/8/2 Skipper Seabold jsseab...@gmail.com: On Thu, Aug 2,
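For readers following the contiguity question, the flags can be inspected directly on an array. The sketch below uses a 2x3 matrix rather than the single-element 2-D array from the message, because the flags reported for that degenerate case may depend on the NumPy version; the variable names are illustrative only.

```python
import numpy as np

vector = np.array([0, 1, 2])            # 1-D: a single stride, so both layouts apply
c_matrix = np.arange(6).reshape(2, 3)   # row-major (C order) by default
f_matrix = np.asfortranarray(c_matrix)  # same values, column-major copy


def layout(arr):
    """Return (is C-contiguous, is Fortran-contiguous) for an array."""
    return bool(arr.flags["C_CONTIGUOUS"]), bool(arr.flags["F_CONTIGUOUS"])


vector_layout = layout(vector)
c_layout = layout(c_matrix)
f_layout = layout(f_matrix)
```

A routine that requires Fortran order can call `np.asfortranarray` on its input; this returns the array unchanged when it is already F-contiguous and copies it otherwise.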

Re: [Scikit-learn-general] Summary of my recent blog post and GSoC progress

2012-08-06 Thread Vlad Niculae
A full benchmark suite has been run successfully: http://jenkins-scikit-learn.github.com/scikit-learn-speed/ The whole process (build sklearn, run benchmarks, generate output) took ~40 minutes, so I am scheduling it once a week. Best, Vlad On Jul 29, 2012, at 20:49 , Vlad Niculae zephy

Re: [Scikit-learn-general] multivariate regression with higher degree polynomials

2012-08-09 Thread Vlad Niculae
Andy, Mathieu: The docs are lacking guidelines and examples on how to tune SVR parameters. IIUC, C, gamma, etc should be used just as in SVC. The tricky part is epsilon, how should it be set? What are sensible defaults and a sensible grid search range? Thanks, Vlad On Aug 9, 2012, at 13:30 ,

Re: [Scikit-learn-general] linear_model.base.center_data always returns C_CONTIGUOUS array

2012-08-16 Thread Vlad Niculae
On Aug 16, 2012, at 18:57 , iBayer mane.d...@googlemail.com wrote: Hi, I know it sounds stupid but where is the code for ``as_float_array``? because of: It's in `validation.py`, you can find this out either by looking where it's imported from in the `__init__.py` or by using the
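One generic way to answer a "where does this name live?" question from Python itself is sketched below. It is not necessarily the tool the message goes on to name (the snippet is cut off), and it uses a standard-library class as a stand-in so that it runs without scikit-learn installed.

```python
import inspect
from json import JSONDecoder  # re-exported by the package's __init__.py

# The module where the object was actually defined, regardless of
# which package-level namespace it was imported from.
defining_module = JSONDecoder.__module__

# The path of the source file containing the definition.
source_file = inspect.getsourcefile(JSONDecoder)
```

The same two lookups work for any pure-Python function or class, so they could be applied to a helper such as ``as_float_array`` in an environment where scikit-learn is importable.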

Re: [Scikit-learn-general] Summary of my recent blog post and GSoC progress

2012-08-17 Thread Vlad Niculae
-learn-speed). We can then link to this from the homepage. What do you think? Best, Vlad On Aug 6, 2012, at 11:26 , Vlad Niculae zephy...@gmail.com wrote: A full benchmark suite has been run successfully: http://jenkins-scikit-learn.github.com/scikit-learn-speed/ The whole process (build

Re: [Scikit-learn-general] Summary of my recent blog post and GSoC progress

2012-08-19 Thread Vlad Niculae
On Aug 17, 2012, at 21:10 , Olivier Grisel olivier.gri...@ensta.org wrote: 2012/8/17 Vlad Niculae zephy...@gmail.com: If the build scheduled tonight runs successfully, with the newly added benchmarks, I would like to move the scikit-learn-speed codebase to scikit-learn/scikit-learn-speed

Re: [Scikit-learn-general] Summary of my recent blog post and GSoC progress

2012-08-19 Thread Vlad Niculae
On Aug 19, 2012, at 14:16 , Vlad Niculae zephy...@gmail.com wrote: On Aug 17, 2012, at 21:10 , Olivier Grisel olivier.gri...@ensta.org wrote: 2012/8/17 Vlad Niculae zephy...@gmail.com: If the build scheduled tonight runs successfully, with the newly added benchmarks, I would like

Re: [Scikit-learn-general] What does LarsCV do?

2012-08-20 Thread Vlad Niculae
This is confusing to me too. I wanted to copy it for the OMP CV, but it seems overly complicated. Vlad On Aug 20, 2012, at 17:19 , Andreas Müller amuel...@ais.uni-bonn.de wrote: Hi Alex. Thanks for the answer. So it estimates ``n_nonzero_coefs``. As far as I can see, you can not get this

Re: [Scikit-learn-general] Summary of my recent blog post and GSoC progress

2012-08-22 Thread Vlad Niculae
On Aug 20, 2012, at 04:06 , Vlad Niculae zephy...@gmail.com wrote: On Aug 19, 2012, at 14:16 , Vlad Niculae zephy...@gmail.com wrote: On Aug 17, 2012, at 21:10 , Olivier Grisel olivier.gri...@ensta.org wrote: 2012/8/17 Vlad Niculae zephy...@gmail.com: If the build scheduled tonight

Re: [Scikit-learn-general] Summary of my recent blog post and GSoC progress

2012-08-22 Thread Vlad Niculae
On Aug 22, 2012, at 20:38 , Olivier Grisel olivier.gri...@ensta.org wrote: FYI and as a side note for this GSoC project: I have just filled in the final evaluation form for Vlad's GSoC project and gave it a pass: congrats Vlad :) \o/ Thank you Olivier, this has been a very enjoyable project

Re: [Scikit-learn-general] Release

2012-08-31 Thread Vlad Niculae
We are all annoyed by warnings; we have a ton of them at the moment. Some of them are scheduled for removal, and others have even passed their deadline. I think we should go through them thoroughly before the release. I could volunteer for this. Best, Vlad -- Vlad N.

Re: [Scikit-learn-general] Release

2012-08-31 Thread Vlad Niculae
On Aug 31, 2012, at 18:27 , amuel...@ais.uni-bonn.de wrote: We do? Which warnings do you mean? I am not aware of any warnings in the tests or examples. Sorry, I exaggerated because I was looking at the latest release instead. The test suite is clean, but the codebase still has some leftover

Re: [Scikit-learn-general] Conceptual questions about linear_model.perceptron

2012-09-06 Thread Vlad Niculae
On Sep 6, 2012, at 18:08 , Mathieu Blondel math...@mblondel.org wrote: Hello, The Perceptron can be seen as a SGD algorithm optimizing the loss \sum_i max{t - y_i w^T x_i, 0} where t=0. On the other hand, online SVM optimizes the same loss but with t=1 (the advantage of setting t=1
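The loss family in the quoted message can be written out in NumPy to show how the threshold t separates the two cases. This is a minimal sketch of the formula as quoted, sum_i max{t - y_i w^T x_i, 0}; the function name `margin_loss` and the toy data are assumptions made for illustration.

```python
import numpy as np


def margin_loss(X, y, w, t):
    """Sum over samples of max(t - y_i * w^T x_i, 0).

    t = 0 gives the perceptron loss, t = 1 the hinge loss.
    Hypothetical helper mirroring the formula quoted above.
    """
    margins = y * (X @ w)
    return np.maximum(t - margins, 0.0).sum()


# Three positive samples with margins 2.0, 0.5 and -1.0 under w.
X = np.array([[2.0, 0.0], [0.5, 0.0], [-1.0, 0.0]])
y = np.array([1.0, 1.0, 1.0])
w = np.array([1.0, 0.0])

perceptron_loss = margin_loss(X, y, w, t=0.0)  # only the misclassified sample counts
hinge_loss = margin_loss(X, y, w, t=1.0)       # the margin-0.5 sample is penalised too
```

With t=0 only misclassified samples contribute, while t=1 also penalises correctly classified samples whose margin is below 1.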

Re: [Scikit-learn-general] Teaching materials

2012-10-01 Thread Vlad Niculae
On Oct 1, 2012, at 11:22 , Alexandre Gramfort alexandre.gramf...@inria.fr wrote: That's great news! Is this connected to your image processing seminar? It's related to my new position at ParisTech but image processing and ML are taught in different classes. That's great, congratulations

Re: [Scikit-learn-general] Compressed sensing and Lasso: L1 penalization == L1 minimization?

2012-10-04 Thread Vlad Niculae
Hi Jaidev, This seems relevant to your question: http://metaoptimize.com/qa/questions/7897/are-lasso-and-basis-pursuit-really-the-same-thing Vlad On Oct 5, 2012, at 01:16 , Jaidev Deshpande deshpande.jai...@gmail.com wrote: Hi, I've been going through the tomography reconstruction example

Re: [Scikit-learn-general] 0.12.1 Bugfix release

2012-10-05 Thread Vlad Niculae
Then, on Sunday, we can work on the release itself (building binaries, uploading the webpage...). How does that sound? Thanks Brian for volunteering to help with the Windows binaries. In case your schedule is tight I can step in too. Vlad If you have the time to do the cherry picking,

Re: [Scikit-learn-general] 0.12.1 Bugfix release

2012-10-07 Thread Vlad Niculae
If Brian is not available tomorrow morning/afternoon I can build and upload the win32 binaries. I didn't manage to set up win64 virtualenvs though. Vlad On Oct 7, 2012, at 21:58 , Gael Varoquaux gael.varoqu...@normalesup.org wrote: Sorry guys, I've had loads of stuff on and I might have a

Re: [Scikit-learn-general] Bugfix release 0.12.1

2012-10-08 Thread Vlad Niculae
I'm on the Win32 binaries. Vlad On Oct 9, 2012, at 00:14 , Gael Varoquaux gael.varoqu...@normalesup.org wrote: On Tue, Oct 09, 2012 at 12:27:54AM +0200, Gael Varoquaux wrote: I can't really give instruction, as I have never shipped windows binaries. The core idea is to run 'python setup.py

Re: [Scikit-learn-general] Bugfix release 0.12.1

2012-10-09 Thread Vlad Niculae
On Oct 9, 2012, at 09:17 , Andreas Mueller amuel...@ais.uni-bonn.de wrote: Thanks Gael for pulling off the release basically single-handedly :) And now, off we go towards 0.13! And/or 1.0! Vlad

Re: [Scikit-learn-general] rebuilding cython extensions from .pyx file

2012-10-15 Thread Vlad Niculae
More importantly, the process of building Windows binaries should not need make, or anything else outside of what `python setup.py` can do. My 2c, Vlad On Mon, Oct 15, 2012 at 7:59 AM, Gael Varoquaux gael.varoqu...@normalesup.org wrote: On Mon, Oct 15, 2012 at 07:49:23AM +0100, Brian Holt

Re: [Scikit-learn-general] Sklearn and PY2exe

2012-10-17 Thread Vlad Niculae
Do you think we should add a check so the concatenation is not performed if the docstring is None? The increased binary compression might matter enough for some users. Vlad On Wed, Oct 17, 2012 at 2:13 PM, Legault, Alain alain.lega...@rncan-nrcan.gc.ca wrote: Brent hit it right on! I had
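
A minimal sketch of the kind of check proposed above, assuming the problem is string concatenation onto a `__doc__` that is `None` because docstrings were stripped (as `python -OO` does). The helper name `append_to_docstring` is hypothetical and is not the actual scikit-learn code.

```python
def append_to_docstring(func, extra):
    """Append `extra` to func.__doc__, leaving a missing docstring alone."""
    if func.__doc__ is not None:
        # Only concatenate when there is a docstring to extend.
        func.__doc__ = func.__doc__ + extra
    return func


def documented():
    """Original docstring."""


def stripped():
    pass


stripped.__doc__ = None  # simulate a build where docstrings were stripped

append_to_docstring(documented, "\nExtra notes.")
append_to_docstring(stripped, "\nExtra notes.")  # no TypeError, docstring stays None
```

Without the `is not None` guard, the second call would raise `TypeError` when adding `None` and a string.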

Re: [Scikit-learn-general] How to save an array of models

2012-10-18 Thread Vlad Niculae
Also, since you're using scikit-learn, you could try giving the joblib `dump` and `load` a go. Joblib is bundled with scikit-learn: `from sklearn.externals.joblib import dump, load`. They support various degrees of compression and were designed for saving large models. Vlad On Wed, Oct 17,
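
A short sketch of the `dump`/`load` round trip mentioned above. It assumes the standalone `joblib` package is installed; the thread imported it through `sklearn.externals.joblib`, but that bundled copy was removed in later scikit-learn releases. The list of dictionaries is a hypothetical stand-in for a list of fitted models.

```python
import os
import tempfile

import joblib  # standalone package (assumption: installed separately)

# Hypothetical stand-ins for fitted estimators; any picklable objects work.
models = [{"name": "model_%d" % i, "coef": [0.1 * i, 0.2 * i]} for i in range(3)]

path = os.path.join(tempfile.mkdtemp(), "models.joblib")

# compress takes an integer level from 0 to 9; higher means smaller files, slower I/O.
joblib.dump(models, path, compress=3)
restored = joblib.load(path)
```

Saving the whole list in one call keeps the models together in a single file; dumping each model to its own file is an equally valid choice when they need to be loaded independently.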

Re: [Scikit-learn-general] Atlas configuration error OSX

2012-10-18 Thread Vlad Niculae
On Oct 18, 2012, at 16:58 , Gael Varoquaux gael.varoqu...@normalesup.org wrote: On Thu, Oct 18, 2012 at 05:56:23PM +0200, Olivier Grisel wrote: Even though it's not officially supported by Apple, the bug seems to have been fixed in 10.8. Awesome, that's good news! It's not completely

[Scikit-learn-general] API for multi-sample documents

2012-10-31 Thread Vlad Niculae
Hello, It seems I have once again run into a need that first became apparent when working with image patches last summer. Sometimes we don't have a 1-to-1 correspondence between samples (rows in X) and the actual documents we are interested in scoring over. Instead, each document consists of (a
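
One way to picture the problem described above is a `groups` array that maps each row of X to the document it belongs to, with per-sample scores reduced to one score per document. The sketch below is only an illustration under that assumption: the function name `document_scores`, the `groups` argument, and the choice of averaging are all hypothetical, and the thread does not say which aggregation or API was intended.

```python
import numpy as np


def document_scores(sample_scores, groups):
    """Average per-sample scores within each document id."""
    sample_scores = np.asarray(sample_scores, dtype=float)
    # inverse[i] is the position of groups[i] in the sorted unique ids
    doc_ids, inverse = np.unique(groups, return_inverse=True)
    sums = np.bincount(inverse, weights=sample_scores)
    counts = np.bincount(inverse)
    return doc_ids, sums / counts


# Six samples (rows of X) belonging to three documents.
groups = np.array([0, 0, 1, 1, 1, 2])
scores = np.array([0.2, 0.4, 1.0, 0.0, 0.5, 0.9])

doc_ids, doc_scores = document_scores(scores, groups)
```

The grouping array travels alongside the data untouched until the final scoring step, which is the part the follow-up message in this thread discusses.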

Re: [Scikit-learn-general] API for multi-sample documents

2012-11-02 Thread Vlad Niculae
and let it pass through when they do their job, so the grouping can be fed in with the dataset and used at the end during scoring. The text feature extraction sort of deals with this by using a list, right? I'm not sure what you mean by this. Cheers, Andy On 10/31/2012 01:13 PM, Vlad Niculae

Re: [Scikit-learn-general] Release schedule for 0.13

2012-11-14 Thread Vlad Niculae
On Nov 14, 2012, at 14:09 , Olivier Grisel olivier.gri...@ensta.org wrote: I would have also liked to implement a hashing text vectorizer but I am not sure I will find the time to do this week or the next week. I'd love to help with that next week! -- Vlad N. http://vene.ro

Re: [Scikit-learn-general] does anyone do dot( sparse vec, sparse vec ) ?

2012-12-28 Thread Vlad Niculae
In the matrix-matrix case (as opposed to vector-vector or matrix-vector), I played with Mathieu's dot-bench and it didn't beat Scipy's very efficient implementation. On Fri, Dec 28, 2012 at 7:51 AM, Mathieu Blondel math...@mblondel.org wrote: I forgot to mention that the multiplication of two
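
For reference, a small sketch of the two cases contrasted above using `scipy.sparse`: a sparse vector-vector dot product and a sparse matrix-matrix product. The shapes, densities, and variable names are arbitrary choices for illustration, and this is not the dot-bench code from the thread.

```python
import numpy as np
from scipy import sparse

# Vector-vector case: two 1 x 4 sparse row vectors.
a = sparse.csr_matrix(np.array([[0.0, 2.0, 0.0, 1.0]]))
b = sparse.csr_matrix(np.array([[3.0, 4.0, 0.0, 0.0]]))
vec_dot = a.dot(b.T)[0, 0]  # only the overlapping nonzero (index 1) contributes

# Matrix-matrix case: SciPy's built-in sparse product.
A = sparse.random(50, 40, density=0.1, format="csr", random_state=0)
B = sparse.random(40, 30, density=0.1, format="csr", random_state=1)
C = A.dot(B)  # the result stays sparse
```

Timing `A.dot(B)` against a hand-written routine on matrices of realistic size would be the natural way to repeat the comparison described in the message.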

Re: [Scikit-learn-general] Latent Semantic Indexing (LSI) with Sklearn

2013-01-03 Thread Vlad Niculae
Maybe my mind is not in its right place but how is that different from using the PCA transformer? On Thu, Jan 3, 2013 at 10:48 PM, Lars Buitinck l.j.buiti...@uva.nl wrote: 2013/1/3 Jack Alan j.o.alan2...@gmail.com: I'm working in document classification and I wonder if there is a way of

Re: [Scikit-learn-general] table showing time complexity of algorithms implemented in scikit-learn?

2013-01-10 Thread Vlad Niculae
PR #804 had some comments about generating the tables automatically, which would be nice. How about adding a consistently structured `Complexity` section to the docstrings, and using it to populate the table? On Thu, Jan 10, 2013 at 6:38 PM, Ronnie Ghose ronnie.gh...@gmail.com wrote: yes please. I was
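
A rough sketch of how such a `Complexity` section could be pulled out of docstrings to fill a table. It assumes a numpy-style section header (the title followed by a line of dashes); the function names, the example docstring, and the complexity figures in it are hypothetical and are not taken from PR #804 or from any real estimator.

```python
import re


def extract_complexity(docstring):
    """Return the text of a numpy-style 'Complexity' section, or None."""
    if not docstring:
        return None
    match = re.search(
        r"^\s*Complexity\s*\n\s*-{3,}\s*\n(.*?)(?=\n\s*\n|\Z)",
        docstring,
        flags=re.DOTALL | re.MULTILINE,
    )
    if match is None:
        return None
    # Collapse the indented section body into a single line.
    return " ".join(line.strip() for line in match.group(1).splitlines())


def fit_example():
    """Fit a hypothetical estimator.

    Complexity
    ----------
    O(n_samples * n_features) time,
    O(n_features) memory.

    Notes
    -----
    Illustrative only.
    """


def no_section():
    """A function whose docstring has no Complexity section."""


# One table row per function; None marks a missing section.
table = {f.__name__: extract_complexity(f.__doc__) for f in (fit_example, no_section)}
```

Entries that come back as `None` show which docstrings still lack the section, which would make gaps in the table easy to spot.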

Re: [Scikit-learn-general] PCA: first component too dominant?

2013-01-11 Thread Vlad Niculae
Olivier, the histogram plotting and data transformation is great, valuable practical advice that would be nice to have in the docs. I think it would go nicely as part of a tutorial, what do you think? Vlad On Fri, Jan 11, 2013 at 10:38 AM, Olivier Grisel olivier.gri...@ensta.org wrote:
