Re: [Scikit-learn-general] Flat is better than nested: Website edition

2013-03-05 Thread Andreas Mueller
As there is so much positive feedback, I might make something up tonight. As I like to make small steps, I'd get rid of the defunct search bar and add some more menu items instead (and adjust the respective pages obv.) ps: For those who are wondering: no, I didn't choose this time because Gael

Re: [Scikit-learn-general] numba, cython and relation to sklearn future

2013-03-05 Thread federico vaggi
For Windows, installing numba is a breeze using: http://www.lfd.uci.edu/~gohlke/pythonlibs/ Basically, all the gnarly extensions are available already compiled with all the dependencies handled properly. It's absolutely amazing and I strongly encourage everyone who uses Python on Windows

Re: [Scikit-learn-general] Flat is better than nested: Website edition

2013-03-05 Thread 党晓彬
+1 On Tue, Mar 5, 2013 at 4:19 PM, Alexandre Gramfort alexandre.gramf...@inria.fr wrote: ps: For those who are wondering: no, I didn't choose this time because Gael is offline. :) I'd rather like to have his input :-/ he is back in a month, right? 3 weeks Alex

Re: [Scikit-learn-general] numba, cython and relation to sklearn future

2013-03-05 Thread klo uo
So are you saying llvm isn't needed, if numba/llvmpy are installed from Christoph's packages? On Tue, Mar 5, 2013 at 9:21 AM, federico vaggi vaggi.feder...@gmail.com wrote: For Windows, installing numba is a breeze using: http://www.lfd.uci.edu/~gohlke/pythonlibs/ Basically, all the gnarly

Re: [Scikit-learn-general] Flat is better than nested: Website edition

2013-03-05 Thread Andreas Mueller
On 03/05/2013 09:11 AM, Andreas Mueller wrote: As there is so much positive feedback, I might make something up tonight. As I like to make small steps, I'd get rid of the defunct search bar and add some more menu items instead (and adjust the respective pages obv.) I made a page but the CSS

Re: [Scikit-learn-general] numba, cython and relation to sklearn future

2013-03-05 Thread federico vaggi
Yup - you can just install those packages, then try to run the default example/tests, and both pass for me! Other packages, like mysqldb, are a breeze to compile on Linux, but compiling them on Windows under 64 bit is incredibly painful. Here is a good guide if you want to do it on your

Re: [Scikit-learn-general] Flat is better than nested: Website edition

2013-03-05 Thread Andreas Mueller
Ok, working now: http://amueller.github.com/ All feedback welcome :) I'd like to avoid bombarding the user with long lists / pages as much as possible. The Getting Started and Development pages are now of a length that mostly fits on a screen and that I can still grasp. If we had an algorithms page

Re: [Scikit-learn-general] Flat is better than nested: Website edition

2013-03-05 Thread Olivier Grisel
This looks good. Maybe we could reintroduce a canonical snippet on the home page:
    from sklearn.datasets import load_digits
    from sklearn.cross_validation import train_test_split
    from sklearn.svm import LinearSVC
    digits = load_digits()
    X_train, X_test, y_train, y_test = train_test_split( ...

Re: [Scikit-learn-general] setup script refering to .c

2013-03-05 Thread amueller
Exactly. Not only would you need cython, it also needs to be a recent version. People with older versions would get cryptic error messages, leading to frustrated users and busy mailing lists. Matthieu Brucher matthieu.bruc...@gmail.com wrote: Hi, If I remember correctly, this is done to

[Scikit-learn-general] one class svm probability

2013-03-05 Thread Bill Power
hi all. just looking at the one class svm and I'd like to get a probability rather than a distance output. i know that in regular svms you can get parameters for the sigmoid function from five-fold cross validation and that's done by setting the probability=True in the constructor. i presume it's

Re: [Scikit-learn-general] one class svm probability

2013-03-05 Thread Lars Buitinck
2013/3/5 Bill Power bill.power...@gmail.com: investigating previous versions i saw that probability was available in version 0.9 with predict_proba and predict_log_proba functions http://scikit-learn.org/0.9/modules/generated/sklearn.svm.OneClassSVM.html but it's not here in the stable

Re: [Scikit-learn-general] one class svm probability

2013-03-05 Thread Bill Power
thanks lars i figured as much. do you know if there are any papers in the literature that i might be able to implement and then perhaps contribute the code to the package? or do i have to live with either using distances or a non-parameterised sigmoid function? thanks
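
As a rough illustration of the "non-parameterised sigmoid" option mentioned in this message, the sketch below squashes signed decision values into the interval (0, 1) with a fixed logistic function. The name `sigmoid_score` and the sample values are hypothetical and do not come from the thread or from scikit-learn. The result is only a monotone score, not a calibrated probability.

```python
import math


def sigmoid_score(decision_value):
    """Map a signed distance to (0, 1) with a fixed logistic function."""
    # Branch on the sign so that exp() never receives a large positive
    # argument, which would overflow for very negative decision values.
    if decision_value >= 0:
        return 1.0 / (1.0 + math.exp(-decision_value))
    z = math.exp(decision_value)
    return z / (1.0 + z)


# Hypothetical decision values (assumed: positive means "inside" the
# estimated support, negative means "outside").
decision_values = [-2.0, 0.0, 1.5]
scores = [sigmoid_score(d) for d in decision_values]
```

Because the sigmoid has no fitted slope or offset, the scores preserve the ranking of the distances but say nothing about how likely a point is to be an inlier.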

Re: [Scikit-learn-general] one class svm probability

2013-03-05 Thread Peter Prettenhofer
libsvm does not support probability outputs for one-class SVM. One-class SVM is an algorithm for support estimation (not proper density estimation) - i.e. you get a confidence that P(X) > t - where t is somewhat concealed in the nu parameter. 2013/3/5 Lars Buitinck l.j.buiti...@uva.nl: 2013/3/5

Re: [Scikit-learn-general] Flat is better than nested: Website edition

2013-03-05 Thread Jaques Grobler
I like your changes Andy. It's definitely easier to navigate. I'm currently also changing your graph from http://peekaboo-vision.blogspot.de/2013/01/machine-learning-cheat-sheet-for-scikit.html into a documentation-linking version that can be added to the documentation. I'll try to put an online build

Re: [Scikit-learn-general] Flat is better than nested: Website edition

2013-03-05 Thread Gilles Louppe
I feel like the About us section on the homepage shouldn't be there. I'd rather put an About link somewhere else than putting this in front on the home page. Also, I would use the space that we now have on the front page to highlight more important aspects of the package. On 5 March 2013 14:46,

Re: [Scikit-learn-general] Flat is better than nested: Website edition

2013-03-05 Thread Andreas Mueller
On 03/05/2013 02:55 PM, Gilles Louppe wrote: I feel like the About us section on the homepage shouldn't be there. I'd rather put an About link somewhere else than putting this in front on the home page. Also, I would use the space that we now have on the front page to highlight more important

Re: [Scikit-learn-general] Flat is better than nested: Website edition

2013-03-05 Thread Andreas Mueller
On 03/05/2013 02:46 PM, Jaques Grobler wrote: I like your changes Andy. It's definitely easier to navigate. I'm currently also changing your graph from http://peekaboo-vision.blogspot.de/2013/01/machine-learning-cheat-sheet-for-scikit.html into a documentation-linking version that can be

Re: [Scikit-learn-general] Flat is better than nested: Website edition

2013-03-05 Thread Ronnie Ghose
can you make the new website design part of a repo so we can submit PRs or issues against it? On Tue, Mar 5, 2013 at 9:39 AM, Andreas Mueller amuel...@ais.uni-bonn.de wrote: On 03/05/2013 03:18 PM, Nelle Varoquaux wrote: Hi everyone, I'm actually not convinced about the new layout (sorry

Re: [Scikit-learn-general] Flat is better than nested: Website edition

2013-03-05 Thread Andreas Mueller
On 03/05/2013 03:46 PM, Ronnie Ghose wrote: can you make the new website design part of a repo so we can submit PRs or issues against it? It is a branch in my sklearn fork, but the branch is not completely up to date, working on it.

[Scikit-learn-general] Suggested technique for 1 D clustering

2013-03-05 Thread nipun batra
Hi, What clustering technique (with implementation in sklearn) is recommended for 1d data?

Re: [Scikit-learn-general] Suggested technique for 1 D clustering

2013-03-05 Thread Ronnie Ghose
..does kmeans not work? On Tue, Mar 5, 2013 at 9:51 AM, nipun batra nipunredde...@gmail.com wrote: Hi, What clustering technique (with implementation in sklearn) is recommended for 1d data?

Re: [Scikit-learn-general] Suggested technique for 1 D clustering

2013-03-05 Thread Andreas Mueller
On 03/05/2013 03:51 PM, nipun batra wrote: Hi, What clustering technique (with implementation in sklearn) is recommended for 1d data? I'd recommend looking at it ;) It feels like there might be some sweeping algorithm to get the optimal solution for the k-means algorithm. KMeans should be

Re: [Scikit-learn-general] Suggested technique for 1 D clustering

2013-03-05 Thread nipun batra
It should. I would have straight away tried it, but read the following 2 posts: 1. http://stackoverflow.com/questions/11513484/1d-number-array-clustering 2. http://stats.stackexchange.com/questions/13781/clustering-1d-data Any thoughts? On Tue, Mar 5, 2013 at 8:24 PM, Ronnie Ghose

Re: [Scikit-learn-general] Flat is better than nested: Website edition

2013-03-05 Thread Andreas Mueller
On 03/05/2013 03:18 PM, Nelle Varoquaux wrote: Hi everyone, I'm actually not convinced about the new layout (sorry Andy :( ). I should also say, I'm not convinced about pandas' website. The menu is, I think, quite confusing. Overall, I think there are too many links which may refer to the

Re: [Scikit-learn-general] Flat is better than nested: Website edition

2013-03-05 Thread Nelle Varoquaux
On 5 March 2013 15:39, Andreas Mueller amuel...@ais.uni-bonn.de wrote: On 03/05/2013 03:18 PM, Nelle Varoquaux wrote: Hi everyone, I'm actually not convinced about the new layout (sorry Andy :( ). I should also say, I'm not convinced about pandas' website. The menu is, I think, quite

Re: [Scikit-learn-general] Flat is better than nested: Website edition

2013-03-05 Thread Jaques Grobler
Awesome, thanks. You have my inkscape file, right? I am trying to make the user guide also more flat by putting your JavaScript function in the file now :) I think as the user guide is much shorter now, it doesn't really hide things, but rather makes them easier to find. We'll see. @Andy

Re: [Scikit-learn-general] Announcement: scikit-image 0.8.0

2013-03-05 Thread Jaques Grobler
Congratulations :) Nice work 2013/3/4 Johannes Schönberger jschoenber...@demuc.de Announcement: scikit-image 0.8.0 We're happy to announce the 8th version of scikit-image! scikit-image is an image processing toolbox for SciPy that includes algorithms for

Re: [Scikit-learn-general] Suggested technique for 1 D clustering

2013-03-05 Thread Ronnie Ghose
interesting posts :). so 1) do we want a natural breaks method? https://en.wikipedia.org/wiki/Jenks_natural_breaks_optimization 2) have you considered looking at the distribution of the variable as they suggest? any small-d tends to allow this rather than the usual giant-d space. Do you have any
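
The "sweeping algorithm" idea raised earlier in this thread can be made concrete. For 1-D data, an optimal k-means partition consists of contiguous runs of the sorted values, so it can be found exactly by dynamic programming, which is closely related to the natural-breaks approach linked above. Below is a minimal pure-Python sketch under that assumption. The function name `optimal_1d_kmeans` is hypothetical and is not part of scikit-learn, and the cubic-free but still quadratic O(k * n^2) loop is meant for illustration, not for large inputs.

```python
def optimal_1d_kmeans(values, k):
    """Exact 1-D k-means via dynamic programming, O(k * n^2).

    Returns (total within-cluster sum of squares, list of clusters).
    """
    xs = sorted(values)
    n = len(xs)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(values)")

    # Prefix sums give the cost of any contiguous run in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, x in enumerate(xs):
        s1[i + 1] = s1[i] + x
        s2[i + 1] = s2[i] + x * x

    def sse(i, j):
        """Sum of squared deviations from the mean for xs[i:j]."""
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # cost[m][j]: best cost of splitting xs[:j] into m clusters.
    # cut[m][j]: start index of the last cluster in that best split.
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):
                c = cost[m - 1][i] + sse(i, j)
                if c < cost[m][j]:
                    cost[m][j] = c
                    cut[m][j] = i

    # Walk the stored cut points backwards to recover the clusters.
    clusters = []
    j = n
    for m in range(k, 0, -1):
        i = cut[m][j]
        clusters.append(xs[i:j])
        j = i
    clusters.reverse()
    return cost[k][n], clusters


total, clusters = optimal_1d_kmeans([1.0, 1.1, 0.9, 10.0, 10.2, 9.8], 2)
```

Unlike the usual iterative KMeans, this does not depend on initialisation, which is the main attraction in one dimension.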

Re: [Scikit-learn-general] Flat is better than nested: Website edition

2013-03-05 Thread Andreas Mueller
On 03/05/2013 04:04 PM, Nelle Varoquaux wrote: Maybe that is the core problem. The documentation has not been written to be without sections: before, the user guide was divided into three parts: 1. Installation 2. Tutorials: an overview of the scikit 3. Unsupervised learning 4.

Re: [Scikit-learn-general] Flat is better than nested: Website edition

2013-03-05 Thread Andreas Mueller
On 03/05/2013 04:11 PM, Jaques Grobler wrote: Awesome, thanks. You have my inkscape file, right? I am trying to make the user guide also more flat by putting your JavaScript function in the file now :) I think as the user guide is much shorter now, it doesn't really hide things, but

Re: [Scikit-learn-general] Easy way to handle .arff files in sklearn?

2013-03-05 Thread Christian
Hi Tom, recently I saw the arff package on PyPI. It seems to work:
    import arff
    import numpy as np
    barray = []
    # Collect every ARFF row as a plain list
    for row in arff.load('/home/chris/tools/weka-3-7-6/rd54_train.arff'):
        barray.append(list(row))
    nparray = np.array(barray)
    print(nparray.shape)  # (4940, 56)
HTH Christian I’m trying

Re: [Scikit-learn-general] Flat is better than nested: Website edition

2013-03-05 Thread Andreas Mueller
On 03/05/2013 04:04 PM, Nelle Varoquaux wrote: Maybe that is the core problem. The documentation has not been written to be without sections: before, the user guide was divided into three parts: 1. Installation 2. Tutorials: an overview of the scikit 3. Unsupervised learning

Re: [Scikit-learn-general] Flat is better than nested: Website edition

2013-03-05 Thread Andreas Mueller
On 03/05/2013 04:11 PM, Jaques Grobler wrote: Awesome, thanks. You have my inkscape file, right? I am trying to make the user guide also more flat by putting your JavaScript function in the file now :) I think as the user guide is much shorter now, it doesn't really hide things, but

Re: [Scikit-learn-general] Easy way to handle .arff files in sklearn?

2013-03-05 Thread Rob Zinkov
We really should have this support within the library. Does it make sense to just use the functionality in the arff-package? On Tue, Mar 5, 2013 at 7:34 AM, Christian mining.fa...@gmail.com wrote: Hi Tom, recently I saw the arff-package in pypi. Seems working. import arff import numpy as

Re: [Scikit-learn-general] Easy way to handle .arff files in sklearn?

2013-03-05 Thread Lars Buitinck
2013/3/5 Rob Zinkov cfour...@gmail.com: We really should have this support within the library. Does it make sense to just use the functionality in the arff-package? Is it better than the one in scipy.io? -- Lars Buitinck Scientific programmer, ILPS University of Amsterdam

Re: [Scikit-learn-general] Hierarchical Clustering

2013-03-05 Thread Robert McGibbon
The fastcluster project by Dan Mullner, a professor of math and statistics at Stanford, might be of interest. http://math.stanford.edu/~muellner/fastcluster.html These routines follow the same API of the hierarchical clustering routines in scipy, including single linkage and complete linkage,

Re: [Scikit-learn-general] Hierarchical Clustering

2013-03-05 Thread Olivier Grisel
2013/3/5 Robert McGibbon rmcgi...@gmail.com: The fastcluster project by Dan Mullner, a professor of math and statistics at Stanford, might be of interest. http://math.stanford.edu/~muellner/fastcluster.html These routines follow the same API of the hierarchical clustering routines in scipy,

Re: [Scikit-learn-general] Easy way to handle .arff files in sklearn?

2013-03-05 Thread Tom Fawcett
Thanks for your response, Christian. I experimented with the package. FYI, there’s a problem with the pypi arff reader. The package claims to handle numbers but it seems to encode everything (including numbers) as strings, like this: [['blonde' '17.2' '1' 'yes'] ['blue' '27.2' '2' 'yes']

Re: [Scikit-learn-general] Hierarchical Clustering

2013-03-05 Thread Robert McGibbon
On Mar 5, 2013, at 10:10 AM, Olivier Grisel wrote: This code is in C++ and the scikit-learn core maintainers are not all experts in C++ and prefer cython for optimized code. A cython rewrite of some of those algorithms would be of interest though. For anyone interested in either

Re: [Scikit-learn-general] Easy way to handle .arff files in sklearn?

2013-03-05 Thread Rob Zinkov
The import method doesn't support sparse representations: This function should be able to read most arff files. Not implemented functionality include: date type attributes string type attributes It can read files with numeric and nominal attributes. It cannot read files with sparse data ({}

[Scikit-learn-general] Loading libsvm data formats

2013-03-05 Thread Mohamed Radhouane Aniba
Hello everyone, I am new to the scikit-learn package and I am still trying some of the examples on the website. You have an example on RBF SVM parameters that is very interesting, but my only problem is that my data are in libsvm format. I know that you have an option for loading this format through

[Scikit-learn-general] Request for project to tackle

2013-03-05 Thread Jeff Van Voorst
Greetings, I have read the developer guidelines for scikit learn, and I would like to contribute (to boost my machine learning and python fu). Is there an outstanding, easy bug or feature that can be assigned to me or should I select one? Thanks, Jeff Van Voorst

Re: [Scikit-learn-general] Loading libsvm data formats

2013-03-05 Thread Ronnie Ghose
http://stackoverflow.com/questions/13590247/using-libsvm-format-in-scikit SO is amazing :) On Tue, Mar 5, 2013 at 2:19 PM, Mohamed Radhouane Aniba arad...@gmail.com wrote: Hello everyone, I am new to scikit-learn package, I am still trying some of the examples on the website. You have an
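
For readers unfamiliar with the libsvm text format discussed here: each line holds a label followed by sparse `index:value` pairs. scikit-learn ships `sklearn.datasets.load_svmlight_file` for reading such files. The pure-Python sketch below only illustrates the layout of the format. The function name `parse_libsvm_lines` and the sample lines are hypothetical, and it ignores extensions such as `qid:` tokens.

```python
def parse_libsvm_lines(lines):
    """Parse 'label index:value ...' lines into (labels, sparse rows)."""
    labels, rows = [], []
    for raw in lines:
        # Drop trailing '#' comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split()
        labels.append(float(parts[0]))
        row = {}
        for item in parts[1:]:
            index, value = item.split(":", 1)
            # Indices are kept exactly as written in the file.
            row[int(index)] = float(value)
        rows.append(row)
    return labels, rows


# Hypothetical sample data in libsvm format.
sample = ["+1 1:0.5 3:1.2", "-1 2:0.7  # comment", ""]
labels, rows = parse_libsvm_lines(sample)
```

Each row is kept as a dict of the non-zero features, mirroring the sparse nature of the format; a real loader would typically build a sparse matrix instead.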

Re: [Scikit-learn-general] Request for project to tackle

2013-03-05 Thread Andreas Mueller
Hi Jeff. Thanks for your will to contribute. As the dev guidelines state, there are certain issues that are tagged as easy: https://github.com/scikit-learn/scikit-learn/issues?labels=Easy&page=1&sort=updated&state=open These might still vary a lot. Maybe just browse around. How familiar are you with

Re: [Scikit-learn-general] Multivariate Adaptive Regression Splines (MARS, aka earth)

2013-03-05 Thread Jason Rudy
So I've finally got something to show. Gael, you were entirely correct about it being a mouthful. I've been developing it as a separate package for simplicity, but will be integrating with scikit-learn as soon as I get the time. Here is what I've got so far in case anyone wants to take a look:

Re: [Scikit-learn-general] numba, cython and relation to sklearn future

2013-03-05 Thread Kenneth C. Arnold
It was a pretty easy build on Mac -- I just used MacPorts to install and select an llvm. Of course Anaconda is even easier. I'd say Numba is a medium-term consideration. It's enough trouble getting everybody using C compilers, so adding LLVM to the mix is probably way too much of a change for the

Re: [Scikit-learn-general] Multivariate Adaptive Regression Splines (MARS, aka earth)

2013-03-05 Thread Andreas Mueller
On 03/05/2013 08:15 PM, Jason Rudy wrote: So I've finally got something to show. Gael, you were entirely correct about it being a mouthful. I've been developing it as a separate package for simplicity, but will be integrating with scikit-learn as soon as I get the time. Here is what I've

Re: [Scikit-learn-general] Request for project to tackle

2013-03-05 Thread Robert Layton
On 6 March 2013 06:26, Andreas Mueller amuel...@ais.uni-bonn.de wrote: Hi Jeff. Thanks for your will to contribute. As the dev guidelines state, there are certain issues that are tagged as easy: https://github.com/scikit-learn/scikit-learn/issues?labels=Easy&page=1&sort=updated&state=open

Re: [Scikit-learn-general] Easy way to handle .arff files in sklearn?

2013-03-05 Thread Christian
For me it works fine. Cheers, Christian
test.arff
    @relation 'test'
    @attribute v1 {blonde,blue}
    @attribute v2 numeric
    @attribute v3 numeric
    @attribute class {yes,no}
    @data
    blonde,17.2 ,1,yes
    blue,27.2,2,yes
    blue,18.2,3,no
end test.arff
barray [['blonde', 17.2, 1.0, 'yes'], ['blue', 27.2,
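
To make the string-versus-number question in this thread concrete, here is a minimal pure-Python sketch of parsing a dense ARFF file like the small test file quoted above, converting numeric attributes to floats and leaving nominal ones as strings. The function name `parse_simple_arff` is hypothetical. This is not the code of the PyPI arff package or of scipy.io, and it assumes no missing values, no quoted attribute names, and no sparse rows.

```python
def parse_simple_arff(text):
    """Parse a dense ARFF string with numeric and nominal attributes only."""
    attributes = []  # list of (name, is_numeric) pairs, in file order
    rows = []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and ARFF comments
        lower = line.lower()
        if in_data:
            fields = [f.strip() for f in line.split(",")]
            # Convert each field according to its declared attribute type.
            rows.append([
                float(f) if is_numeric else f
                for f, (_, is_numeric) in zip(fields, attributes)
            ])
        elif lower.startswith("@attribute"):
            _, name, kind = line.split(None, 2)
            is_numeric = kind.strip().lower() in ("numeric", "real", "integer")
            attributes.append((name, is_numeric))
        elif lower.startswith("@data"):
            in_data = True
    return [name for name, _ in attributes], rows


sample = """\
@relation 'test'
@attribute v1 {blonde,blue}
@attribute v2 numeric
@attribute v3 numeric
@attribute class {yes,no}
@data
blonde,17.2 ,1,yes
blue,27.2,2,yes
blue,18.2,3,no
"""
names, rows = parse_simple_arff(sample)
```

The type conversion is driven entirely by the `@attribute` declarations, so a reader that returns everything as strings is most likely skipping that step.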

[Scikit-learn-general] Multinomial HMM Issue #1158

2013-03-05 Thread David Reed
Hi, I added a comment to issue #1158 but since it is closed, I'm not sure if anyone would be alerted. I am not sure if this should be closed or perhaps a second issue should be opened. As already stated, the attribute n_symbols only gets created when an emission probability matrix is defined.

Re: [Scikit-learn-general] Multinomial HMM Issue #1158

2013-03-05 Thread Andreas Mueller
Hi. Should we just deprecate / remove the HMM? We deemed sequence prediction off-topic (Lars' words and I agree) and there is no core-dev maintaining them. Is there any project this could move to? statsmodels, pandas? There should be a go-to place for time-series modelling. There was

[Scikit-learn-general] Website down

2013-03-05 Thread Andreas Mueller
Hi everybody. The dns seems to be down again. Should we try to switch? Does Stefan still have the domain or did someone else take care of it? Cheers, Andy