[Scikit-learn-general] Suggestion: break up the metrics module

2014-10-14 Thread Robert Layton
Currently the word "metrics" is overloaded with at least two types of algorithms in that module. The first is evaluation metrics and the second is functions dealing with distance metrics. My suggestion is to: 1) Move the evaluation metrics to a new top level folder called "evaluation" 2) Move the d

Re: [Scikit-learn-general] Welcome new core contributors

2014-10-12 Thread Robert Layton
Congrats! On 13 October 2014 05:42, Manoj Kumar wrote: > Thanks Gaël, > > Its a pleasure. Looking forward to learning and contributing more. > > On Sun, Oct 12, 2014 at 5:24 PM, Gael Varoquaux < > gael.varoqu...@normalesup.org> wrote: > >> I am happy to welcome new core contributors to scikit-le

Re: [Scikit-learn-general] Final summary for GSoC 2014 posted

2014-08-21 Thread Robert Layton
Great work, and I hope to see it merged soon. Thanks! On 17 August 2014 23:54, Issam wrote: > Hi all, > > I finished writing the final summary of my work for GSoC 2014. It is > posted here: http://issamlaradji.blogspot.com/ > > Thank you! > > Best regards, > --Issam Laradji > > > -

Re: [Scikit-learn-general] [GSoC] Wrap up post

2014-08-21 Thread Robert Layton
Really interesting work, well done in GSoC! On 22 August 2014 09:35, Manoj Kumar wrote: > Hi, > > A quick wrap up post about my Summer of Code > > http://manojbits.wordpress.com/2014/08/21/gsoc-the-end-of-another-journey/ > > > -- > Godspeed, > Manoj Kumar, > Mech Undergrad > http://manojbits.w

Re: [Scikit-learn-general] [GSOC] Wrap up blog post

2014-08-21 Thread Robert Layton
Great effort Hamzeh -- your GSoC has dealt with problems everyone has put off until later and helped the community a lot, thanks! On 19 August 2014 11:32, Hamzeh Alsalhi wrote: > Hello, I am wrapping up my final blogpost and I want to say that this was > an awesome summer of code! It has been a

Re: [Scikit-learn-general] [GSOC] Dummy with sparse target

2014-07-28 Thread Robert Layton
> > On Mon, Jul 28, 2014 at 7:18 PM, Robert Layton > wrote: > >> Looks good Hamzeh! >> >> This may be a dumb question, but is there an expected (in the statistical >> sense) difference between a most-frequent and a stratified dummy predictor? >> >> >

Re: [Scikit-learn-general] [GSOC] Dummy with sparse target

2014-07-28 Thread Robert Layton
Looks good Hamzeh! This may be a dumb question, but is there an expected (in the statistical sense) difference between a most-frequent and a stratified dummy predictor? On 29 July 2014 11:53, Hamzeh Alsalhi wrote: > Hi! This week I wrote a post (with many benchmark plots) of the sparse > targe

[Scikit-learn-general] Building with python3 -- what are the minimum requirements?

2014-07-20 Thread Robert Layton
I believe that the minimum requirements for building on python3 are higher than they are listed: - Python (>= 2.6 or >= 3.3), - NumPy (>= 1.6.1), - SciPy (>= 0.9) I can't make from source (without errors) with scipy 0.9. I've upgraded to scipy 0.14, which allows a proper build. (this is

Re: [Scikit-learn-general] My talk was approved for PyCon AU 2014!

2014-07-19 Thread Robert Layton
; sprint topic yet! So if you're thinking of some sklearn sprinting, I might > be up for it! > > Juan. > > > On Tue, May 13, 2014 at 11:43 PM, Gael Varoquaux < > gael.varoqu...@normalesup.org> wrote: > >> On Wed, May 14, 2014 at 11:19:57AM +1000, Robert La

Re: [Scikit-learn-general] DBSCAN

2014-07-17 Thread Robert Layton
Hi Roberto, From the docs: X: array [n_samples, n_samples] or [n_samples, n_features] Array of distances between samples, or a feature array. The array is treated as a feature array unless the metric is given as 'precomputed'. In most cases, X is the
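A minimal sketch of the two input modes the quoted docstring describes. The data, `eps` and `min_samples` values are made up for illustration, and the calls follow present-day scikit-learn, which may differ in detail from the 2014 release under discussion.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.metrics import pairwise_distances

# Two well-separated groups of 2-D points (illustrative data only).
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])

# Mode 1: X is a feature array of shape (n_samples, n_features).
labels_features = DBSCAN(eps=0.5, min_samples=2).fit(X).labels_

# Mode 2: X is a square distance matrix of shape (n_samples, n_samples),
# which DBSCAN only interprets as distances when metric="precomputed".
D = pairwise_distances(X, metric="euclidean")
labels_precomputed = DBSCAN(eps=0.5, min_samples=2,
                            metric="precomputed").fit(D).labels_
```

With Euclidean distances both calls should describe the same clustering; the precomputed route is mainly useful when the distances come from somewhere other than a built-in metric.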

Re: [Scikit-learn-general] Clustering using TfidfVectorizer

2014-06-30 Thread Robert Layton
A bit more concretely, have a look at this class: http://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.TfidfTransformer.html It is a transformer, so you can apply it to any matrix (that doesn't mean it makes sense, just that you can): # Create original matrix X = creat
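As a hedged illustration of "you can apply it to any matrix": the sketch below feeds a small hand-written count matrix (not the matrix from the original message, which is cut off above) to `TfidfTransformer`. Default parameters are assumed.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfTransformer

# Any non-negative count-like matrix works as input: rows are samples,
# columns are "terms". These numbers are invented for illustration.
counts = np.array([[3, 0, 1],
                   [2, 0, 0],
                   [0, 4, 1]])

transformer = TfidfTransformer()
X_tfidf = transformer.fit_transform(counts)  # returns a sparse matrix

# With the default norm="l2", every non-empty row has unit length.
row_norms = np.linalg.norm(X_tfidf.toarray(), axis=1)
```

Whether the reweighting is meaningful depends on whether the columns really behave like term counts, which is the caveat the message makes.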

Re: [Scikit-learn-general] About weekly posts for GSoc 2014

2014-06-01 Thread Robert Layton
to us. I think it's great incentive to > think of your work in terms of what you could show to others. No matter > how little there is to show, if you try to give it that twist, you might > appreciate things in a new light. > > My 2c, > Vlad > > > On Mon Jun 2 0

Re: [Scikit-learn-general] About weekly posts for GSoc 2014

2014-06-01 Thread Robert Layton
I don't believe that weekly posts are a requirement of GSoC, but (Daniel and I as his mentors) we have asked Maheshakya to do posts at least weekly. So don't worry, but do keep posting regularly! On 2 June 2014 09:28, Issam wrote: > Hi scikit-team > > I am sorry that I was not aware that week

Re: [Scikit-learn-general] Progress of the GSoC - LSH

2014-05-22 Thread Robert Layton
Thanks Maheshakya, good work (already!) On 22 May 2014 05:51, Maheshakya Wijewardena wrote: > Hi, > I have been continuously communicating with my mentors, but it's important > that all others should also know what I have been doing. I have done some > blog posts regarding this. > blog site: ht

Re: [Scikit-learn-general] Community Suggestion for Contribution appropriate as Undergraduate Thesis submission

2014-05-22 Thread Robert Layton
Hi Stelios, Thanks for your interest. We had a survey recently of features people would like to see: https://www.mail-archive.com/scikit-learn-general%40lists.sourceforge.net/msg05970.html Take a look at that, and see if anything piques your interest. Make sure to check with the mailing list before

Re: [Scikit-learn-general] Graduate student contributing

2014-05-17 Thread Robert Layton
Thanks for your interest -- if you are looking for something to do, then have a look through bugs for starters -- we have a lot of PRs that need to be tested, bugs that need to be fixed. On 18 May 2014 12:42, Ronnie Ghose wrote: > do we have existing grad students contributing: yes > all you d

[Scikit-learn-general] My talk was approved for PyCon AU 2014!

2014-05-13 Thread Robert Layton
Hi all, Just letting you know that my talk for PyCon AU has been accepted! The topic title is: "Text mining online data with scikit-learn" and is 25 minutes long. PyConAU is August 1st to 5th, with the conference proper being the 2nd and 3rd. I probably won't be able to go to the sprints afterwar

Re: [Scikit-learn-general] Accepting as a GSoC project

2014-04-23 Thread Robert Layton
u have any > additional suggestions or anything that I need to do during this community > bonding period. > > > > On Wed, Apr 23, 2014 at 10:47 AM, Robert Layton wrote: > >> Thanks Maheshakya, you did well and earned your spot. >> >> I look forward to worki

Re: [Scikit-learn-general] Stratified k fold shuffle

2014-04-23 Thread Robert Layton
I did notice this a while ago, and I did start on a PR before I got distracted. I think it's a really good idea -- check out this thread for some of the discussion we had last time: http://sourceforge.net/p/scikit-learn/mailman/scikit-learn-general/thread/CALcD8sjf7jxe9juQma2QB22Jc-vqp6K49r8a%2BFMw

Re: [Scikit-learn-general] Welcome to GSoC students

2014-04-22 Thread Robert Layton
Thanks Gaël. The fact we received four students is testament to the hard work everyone has done before me! On 23 April 2014 15:46, Gael Varoquaux wrote: > Hi, > > It seems that the information is now out. I'd like to welcome the four > students that were accepted for the GSoC this year: > > * Is

Re: [Scikit-learn-general] Accepting as a GSoC project

2014-04-22 Thread Robert Layton
Thanks Maheshakya, you did well and earned your spot. I look forward to working with you -- have you got everything you need to get started? On 23 April 2014 14:57, Maheshakya Wijewardena wrote: > Thank you all for accepting my proposal. I congratulate and wish best of > luck to my fellow par

Re: [Scikit-learn-general] KFold cross validation strangely defaults to not shuffle

2014-04-22 Thread Robert Layton
What is the behaviour scikit-learn wide? Usually you would just pass whatever random_state is set to to the sklearn.utils.validation.check_random_state function. It may be possible to extend *that* function to accept "False" as an input, in which case it would not shuffle. (For example, calling ran
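For readers unfamiliar with the helper, here is a small sketch of how `check_random_state` normalises the usual `random_state` inputs. It uses the public `sklearn.utils` import path and shows only the behaviour as it exists in current releases, not the proposed extension discussed in the message.

```python
import numpy as np
from sklearn.utils import check_random_state

# None -> NumPy's global RandomState singleton.
rng_from_none = check_random_state(None)

# An int -> a fresh RandomState seeded with that value.
rng_from_int = check_random_state(42)

# An existing RandomState instance -> returned unchanged.
existing = np.random.RandomState(0)
rng_passthrough = check_random_state(existing)
```

Estimators typically call this once on their `random_state` parameter and then draw from the returned object.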

Re: [Scikit-learn-general] Website down?

2014-04-22 Thread Robert Layton
It's up for me now, it's slow, but I think that's just my connection today. On 23 April 2014 08:59, Andy wrote: > Hey everybody. > It looks like the website is down. > Does anyone have any idea what is going on? Is that sourceforge? > > Cheers, > Andy > > > -

Re: [Scikit-learn-general] KFold cross validation strangely defaults to not shuffle

2014-04-17 Thread Robert Layton
I think having it off by default is a good thing. Generally, you want as little as possible to happen when you use the defaults. For example, if you "preshuffle" the data for some reason, you just want the KFold to split it up for you. On 17 April 2014 19:57, Mathieu Blondel wrote: > I think the main reaso
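A short sketch of the default being defended here, written against the present-day `sklearn.model_selection.KFold` (in 2014 the class lived in `sklearn.cross_validation` with a different constructor). The toy data is an assumption.

```python
import numpy as np
from sklearn.model_selection import KFold

X = np.arange(6).reshape(6, 1)  # six samples, already in the order we want

# Default (shuffle=False): the test folds are contiguous blocks in data order.
plain_folds = [test.tolist() for _, test in KFold(n_splits=3).split(X)]

# Opting in to shuffling: the same samples, assigned to folds at random.
shuffled_folds = [test.tolist()
                  for _, test in KFold(n_splits=3, shuffle=True,
                                       random_state=0).split(X)]
```

If the data has been shuffled beforehand, the default simply slices it, which is the "as little as possible" behaviour argued for above.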

Re: [Scikit-learn-general] Speeding up K-means clustering model with fast approximate neighbor search methods

2014-04-16 Thread Robert Layton
hbor search methods, I think it will > not be much of a problem to apply ANN in DBSCAN. > > > On Wed, Apr 16, 2014 at 4:03 AM, Robert Layton wrote: > >> I wrote the original DBSCAN, in a time before I knew anything about >> sparse matrices (I know now a little), so there m

Re: [Scikit-learn-general] Speeding up K-means clustering model with fast approximate neighbor search methods

2014-04-15 Thread Robert Layton
I wrote the original DBSCAN, in a time before I knew anything about sparse matrices (I know now a little), so there may be artefacts in there that aren't scalable -- i.e. a separate iteration over the array for something or an operation that copies the matrix. It has since been updated though, and

Re: [Scikit-learn-general] Speeding up K-means clustering model with fast approximate neighbor search methods

2014-04-09 Thread Robert Layton
I think you are right. Updating k-means to be more efficient has a number of flow-on effects like, as you say, improving EM in turn. On 10 April 2014 02:58, Maheshakya Wijewardena wrote: > Hi, > > Currently in scikit-learn, Expectation maximization algorithm is used in > K-means clustering model

Re: [Scikit-learn-general] Implementing Locality Sensitivity Hashing to approximate neighbor search

2014-03-18 Thread Robert Layton
i Dong has informed me that there is another state-of-art approach > > he implemented to perform ANN search and asked to see if that can be > > used in the evaluating stage. This implementation is called KGraph > > <http://www.kgraph.org/index.php?n=Main.Python>. It has co

Re: [Scikit-learn-general] Implementing Locality Sensitivity Hashing to approximate neighbor search

2014-03-17 Thread Robert Layton
Thanks for sending this. The grammar needs updating, but I'll let you worry about proofreading. Comments below: - State some *evidence* for your experience with python, i.e. projects you worked on or links to big commits (this is covered late in the application, move some information forward). - D

Re: [Scikit-learn-general] Implementing Locality Sensitivity Hashing to approximate neighbor search

2014-03-13 Thread Robert Layton
Thanks Gael. My thinking was to implement "Basic LSH with basic data structures" and then spend some of the time working on seeing if moderate improvements (i.e. a more complex data structure) can deliver benefits. This way, we get the key deliverable, and spend some time trying to see if we can do

Re: [Scikit-learn-general] Implementing Locality Sensitivity Hashing to approximate neighbor search

2014-03-13 Thread Robert Layton
our > proposal (see [2]) which you should probably write as soon as possible, > the deadline is pretty close. > > How should an LSH project be evaluated? one proxy measurement is > improvement on kNN speed vs. accuracy. > > Daniel > > [1] supertech.csail.mit.edu/papers/

Re: [Scikit-learn-general] Implementing Locality Sensitivity Hashing to approximate neighbor search

2014-03-12 Thread Robert Layton
I apologise if I haven't been as active as I would have liked -- I had my wisdom teeth removed on Tuesday (ouch!). Daniel -- are you interested in becoming a mentor for this project? On the LSH forest, I don't have experience with it, but I think Daniel's point probably allows it to be included.

Re: [Scikit-learn-general] PyCon AU time 2014

2014-03-11 Thread Robert Layton
1 March 2014 06:03, Gael Varoquaux > wrote: > > On Sun, Mar 09, 2014 at 09:14:17PM +1100, Robert Layton wrote: > >> I'm planning to put in another talk on scikit-learn (last year I did a > mixture > >> alongside my research). > > > > Great! It's

Re: [Scikit-learn-general] PyCon AU time 2014

2014-03-11 Thread Robert Layton
I understand -- I can't do the same for the US/EU ones either! On 11 March 2014 06:03, Gael Varoquaux wrote: > On Sun, Mar 09, 2014 at 09:14:17PM +1100, Robert Layton wrote: > > I'm planning to put in another talk on scikit-learn (last year I did a > mixture >

[Scikit-learn-general] PyCon AU time 2014

2014-03-09 Thread Robert Layton
Hi All, The CFP for PyCon AU has just been released! http://2014.pycon-au.org/cfp I'm planning to put in another talk on scikit-learn (last year I did a mixture alongside my research). The conference is at the start of August, if anyone has plans to be there, let me know. Thanks, Robert ---

Re: [Scikit-learn-general] Implementing Locality Sensitivity Hashing to approximate neighbor search

2014-03-09 Thread Robert Layton
learn, and it looks like Forest LSH would be sufficiently well-used (Google tells me 138 citations), but I think we need the community's decision. On 9 March 2014 17:51, Maheshakya Wijewardena wrote: > Yes, I think Robert Layton is a possible mentor for this project. I have > already

Re: [Scikit-learn-general] GSoC 2014 Proposal - Improving Linear Models (First draft)

2014-03-06 Thread Robert Layton
I agree, it is a strong project and important stuff to do. I feel that the motivation is lacking purpose -- why bother doing this project? (that's not a rhetorical question). At present, the project feels like "here are some things to do, so I'll do them", without any real reason why they should be

Re: [Scikit-learn-general] GSoC: call for prospective mentors

2014-02-14 Thread Robert Layton
Hi Gael. Put me down but I haven't yet got permission from work to allocate the time. On Feb 15, 2014 3:43 AM, "Gael Varoquaux" wrote: > Hi scikiters (or learners?) > > I need (quickly :$) prospective mentors to send me a private mail saying > that they might be interested in mentoring students f

Re: [Scikit-learn-general] contributing to scikit

2014-02-01 Thread Robert Layton
Hi Joseph, In theory, you should be able to take any classifier in sklearn and base your implementation off that. That said, there are a few caveats. Some classifiers are older, before coding was more formalised. Others have a lot of cython code hooks, and can be difficult to read. That all said,

Re: [Scikit-learn-general] Google Summer of Code 2014

2014-01-28 Thread Robert Layton
ine learning. Is it basically > used for nearest neighbour methods? > > > On 28 January 2014 20:48, Robert Layton wrote: > >> In principle, I'm happy to be a mentor for LSH, as I've used it quite a >> bit and implemented nilsimsa in python and javascript, as

Re: [Scikit-learn-general] Google Summer of Code 2014

2014-01-28 Thread Robert Layton
In principle, I'm happy to be a mentor for LSH, as I've used it quite a bit and implemented nilsimsa in python and javascript, as well as tested a number of other algorithms. I don't know much about GSOC though. What would I need to do? On 28 January 2014 20:23, Alexandre Gramfort < alexandre.gra

Re: [Scikit-learn-general] Suggestion to add author names/emails at the bottom of module documentations

2014-01-16 Thread Robert Layton
I agree with Vlad. Further, if there is documentation or a module that none of the active developers can touch (due to complexity or lack of expertise), the preference has generally been to move to remove it from scikit-learn. On 17 January 2014 05:12, Vlad Niculae wrote: > I would rather have

Re: [Scikit-learn-general] K Nearest Neighbour with 3d array and custom distance metric

2014-01-09 Thread Robert Layton
I'd recommend Joel's suggestion for the short term. I wonder if that check could be removed -- as long as the input is fancy-indexable, the code should otherwise not have an issue (until it hits the distance metric, in which case you have that covered). On 10 January 2014 13:09, Joel Nothman wr

Re: [Scikit-learn-general] Cutting off HMMs

2013-12-07 Thread Robert Layton
I remember a while ago hearing of a (probably unofficial) scikit-learn experimental repository, where things that were sklearn-compatible, if not polished code, were put. Does that still exist? On 8 December 2013 11:43, Andy wrote: > On 12/08/2013 12:09 AM, Lars Buitinck wrote: > > 2013/12/7 G

[Scikit-learn-general] Fwd: [Broken] scikit-learn/scikit-learn#4168 (knnd - 6ec6346)

2013-10-16 Thread Robert Layton
69b> ) Build #4168 was broken. <https://travis-ci.org/scikit-learn/scikit-learn/builds/12651335> 7 minutes and 20 seconds *Robe

Re: [Scikit-learn-general] Custom distance metrics

2013-10-13 Thread Robert Layton
Thanks Andreas. It was a low-cost question, not an urgent one :) On 14 October 2013 12:29, Andreas Mueller wrote: > On 10/01/2013 07:30 PM, Robert Layton wrote: > > Quick question: Is there a way to find out which classifiers in > > scikit-learn accept a custom distance metric/

[Scikit-learn-general] Custom distance metrics

2013-10-01 Thread Robert Layton
Quick question: Is there a way to find out which classifiers in scikit-learn accept a custom distance metric/distance matrix? -- Public key at: http://pgp.mit.edu/ Search for this email address and select the key from "2011-08-19" (key id: 54BA8735)

Re: [Scikit-learn-general] Which scikit-learn contributors share common interests?

2013-09-27 Thread Robert Layton
This is a great idea! I wonder if my recent PR on licences skewed my result (in which I touched nearly every file...). Perhaps this should be an (automatically updating) example in the scikit-learn gallery? (that said, it could also be a good conference paper for you) On 26 September 2013 03:31

Re: [Scikit-learn-general] Problems with 'make'

2013-09-17 Thread Robert Layton
by copy paste. Can you elaborate. > > > > > On Tue, Sep 17, 2013 at 2:53 PM, Robert Layton wrote: > >> Also, are you able to copy paste your __init__ function? >> >> >> On 17 September 2013 19:14, Lars Buitinck wrote: >> >>> 2013/9/17 Maheshakya Wi

Re: [Scikit-learn-general] Problems with 'make'

2013-09-17 Thread Robert Layton
Also, are you able to copy paste your __init__ function? On 17 September 2013 19:14, Lars Buitinck wrote: > 2013/9/17 Maheshakya Wijewardena : > > I have initialized my estimator(those I created) with default classifiers > > and regressors. But still, I get the same error. > > Did you read and

Re: [Scikit-learn-general] Problems with 'make'

2013-09-16 Thread Robert Layton
and > RegressorMixin. > So what can be done to resolve this? > > > On Tue, Sep 17, 2013 at 10:09 AM, Robert Layton wrote: > >> If you used the Mixin classes to build a classifier, they will get tested >> here automatically. >> Is that what you did? >> >>

Re: [Scikit-learn-general] Problems with 'make'

2013-09-16 Thread Robert Layton
If you used the Mixin classes to build a classifier, they will get tested here automatically. Is that what you did? On 17 September 2013 14:35, Maheshakya Wijewardena wrote: > I added my own features and tried to run make command. But while the > process I get the following error trace and the m

Re: [Scikit-learn-general] Question about naming a clustering algorithm

2013-09-11 Thread Robert Layton
In the interests of a decision, can I push for renaming to SingleLinkageCluster, and then I'll work with Gael on a solution to either introduce a threshold cut to his implementation, or choose some other path? - Robert On 9 September 2013 20:22, Robert Layton wrote: > I haven't

Re: [Scikit-learn-general] Question about naming a clustering algorithm

2013-09-09 Thread Robert Layton
I haven't yet compared against scipy's implementation. The main reason for this is that they are different types of clusterers (with the MSTCluster here generating flat clusters). That said, they are easily convertible. Perhaps we should just drop the separate class altogether, and add an ability

Re: [Scikit-learn-general] Question about naming a clustering algorithm

2013-09-09 Thread Robert Layton
*SingleLinkageClustering On 9 September 2013 17:41, Robert Layton wrote: > Thanks for the comments everyone, and the praise Jake. > > Based on this conversation, I think a good avenue would be: > > 1) Rename MSTCluster to SingleLinkageCluster > 2) Merge (after checks, I st

Re: [Scikit-learn-general] Question about naming a clustering algorithm

2013-09-09 Thread Robert Layton
Thanks for the comments everyone, and the praise Jake. Based on this conversation, I think a good avenue would be: 1) Rename MSTCluster to SingleLinkageCluster 2) Merge (after checks, I still need to rebase again) 3) Gael's PR can either use this class, or replace it if he comes up with something

[Scikit-learn-general] Question about naming a clustering algorithm

2013-09-06 Thread Robert Layton
In my recent PR, I've implemented the MSTCluster algorithm. This algorithm finds a minimum spanning tree, then cuts any edge higher than a given threshold. This is equivalent to single linkage clustering. Olivier and I are talking about
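The equivalence mentioned here can be checked on a toy example. The sketch below is not the MSTCluster code from the PR; it is an independent illustration built from SciPy routines, with made-up points and a made-up threshold. Note that SciPy's dense graph routines treat zero entries as "no edge", so the sample points are chosen to be distinct.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.sparse.csgraph import connected_components, minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform

X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [10.0, 10.0], [10.0, 11.0], [30.0, 30.0]])
threshold = 2.0  # illustrative cut value

# Route 1: build the MST, drop edges above the threshold, and read the
# clusters off as connected components of what remains.
condensed = pdist(X)
mst = minimum_spanning_tree(squareform(condensed)).toarray()
mst[mst > threshold] = 0.0
n_clusters, mst_labels = connected_components(mst, directed=False)

# Route 2: single linkage, cut into flat clusters at the same distance.
single_labels = fcluster(linkage(condensed, method="single"),
                         t=threshold, criterion="distance")


def same_partition(a, b):
    """True if two labelings group the samples identically (ignoring label ids)."""
    a, b = np.asarray(a), np.asarray(b)
    return np.array_equal(a[:, None] == a[None, :], b[:, None] == b[None, :])
```

On this data both routes give three groups: the first three points, the next two, and the isolated last point.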

[Scikit-learn-general] EAC PR ready for review

2013-09-02 Thread Robert Layton
Well, after 5 months (I had to leave it for about 3 months, and there was some waiting on the MST algo pull), my PR on the EAC algorithm is ready for review. PR#1830 This implements the Evidence Accumulation Clustering algorithm

Re: [Scikit-learn-general] pdf manual

2013-09-01 Thread Robert Layton
2013 15:22, Gael Varoquaux wrote: > On Mon, Sep 02, 2013 at 10:37:24AM +1000, Robert Layton wrote: > > OK then, I'll build a copy this week and review it for errors. > > Thanks Robert! > > Gaël > > > -

Re: [Scikit-learn-general] pdf manual

2013-09-01 Thread Robert Layton
OK then, I'll build a copy this week and review it for errors. On 2 September 2013 04:32, Andreas Mueller wrote: > On 09/01/2013 12:44 PM, Sean Violante wrote: > > @robert layton > > > > Olivier gave the wrong command: > > > > it is > > > >

Re: [Scikit-learn-general] pdf manual

2013-08-31 Thread Robert Layton
What exactly is involved here? I tried make pdf, but with no luck. Also, do the PDF docs get downloaded at all? If the rate is very low, it may be worth dropping the concept altogether. On 29 August 2013 00:34, Andreas Mueller wrote: > On 08/28/2013 04:24 PM, Olivier Grisel wrote: > > 2013/8/28

[Scikit-learn-general] Stratified Cross Validation

2013-08-20 Thread Robert Layton
Should stratified cross validation be able to take a random_state parameter, that gives some variation to the returned folds? Currently, there is a reliance on the order of the data. Thoughts? -- Public key at: http://pgp.mit.edu/ Search for this email address and select the key from "2011-08
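For context, present-day scikit-learn exposes this option on `sklearn.model_selection.StratifiedKFold` through `shuffle` and `random_state`; that API postdates the question and is used here only as a sketch. The labels are invented.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

y = np.array([0] * 6 + [1] * 6)   # two balanced classes (toy labels)
X = np.zeros((12, 1))             # features are irrelevant to the split


def stratified_test_folds(seed):
    """Return the test indices of each fold for a given seed."""
    cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=seed)
    return [test.tolist() for _, test in cv.split(X, y)]


folds_seed0 = stratified_test_folds(0)
```

Each fold keeps the class proportions, while different seeds can assign different samples to each fold, so the result no longer depends only on the order of the data.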

Re: [Scikit-learn-general] Classification accuracy too high

2013-08-15 Thread Robert Layton
The first thing I'd do is publish the result (just kidding!). Try it with another data set first, especially one that has an example in the docs. If you are still getting top marks, it may be your "framework" around the code. (are you doing proper test/train splits, etc) If it drops, consider that

Re: [Scikit-learn-general] Name of a hierarchical agglomerative clustering object

2013-07-23 Thread Robert Layton
Divisive clustering. Intuitively, all points start in the same cluster. You then determine the best way to split that cluster. Recursively repeat until all points are in their own clusters. http://nlp.stanford.edu/IR-book/html/htmledition/divisive-clustering-1.html On 24 July 2013 10:01, Juan

Re: [Scikit-learn-general] Name of a hierarchical agglomerative clustering object

2013-07-23 Thread Robert Layton
I'd go with AgglomerativeClusterer. Adding the hierarchical bit will make it too long. While agglomerative doesn't have to mean hierarchical, the usage is consistent enough that I don't predict it causing much confusion. On 23 July 2013 19:54, Andreas Mueller wrote: > On 07/23/2013 10:52 AM, Al

Re: [Scikit-learn-general] Talk video; python-future package [was: Python 3 port]

2013-07-22 Thread Robert Layton
Hi Ed, That's an interesting project with a novel viewpoint - build on 3 and support 2. Thanks for the link, I know a few people at work that would like to see it. - Robert On 23 July 2013 01:34, Ed Schofield wrote: > Hi Robert, hi all! > > > Thanks for your talk, it was very informative. I

Re: [Scikit-learn-general] scikit-learn for Android?

2013-07-16 Thread Robert Layton
I would imagine scikit-learn's dependency on numpy and scipy would be the major hurdles. Get those working, and scikit-learn should be relatively straightforward from there. On 17 July 2013 10:41, John Novak wrote: > Hi, > > it is possible to do android development in python, and in principle >

Re: [Scikit-learn-general] All scikit-learn tests now pass under Python3 on master

2013-07-16 Thread Robert Layton
Fantastic :) Great work. On 16 July 2013 20:54, Olivier Grisel wrote: > Hi all, > > Thanks to the work by @justinvf we now have all the tests pass under > Python 3.3 on Jenkins: > > > https://jenkins.shiningpanda-ci.com/scikit-learn/job/python-3.3-numpy-1.7.1-scipy-0.12.0/ > > I have enabled em

Re: [Scikit-learn-general] My talk from pycon AU is up

2013-07-14 Thread Robert Layton
for k means > > > On Sun, Jul 14, 2013 at 7:52 PM, Robert Layton wrote: > >> Hi Ronnie, >> >> Thanks for your compliment. >> I don't remember saying AIC -- what point in the video was that? >> >> - Robert >> >> >> On 14 July

Re: [Scikit-learn-general] My talk from pycon AU is up

2013-07-14 Thread Robert Layton
; > > On Sun, Jul 14, 2013 at 12:51 AM, Robert Layton wrote: > >> Hi all, >> >> My talk from pyconAU last week is up:scikit-learn, machine learning and >> cybercrime attribution >> >> >> http://pyvideo.org/video/2228/scikit-learn-machine-learning

[Scikit-learn-general] My talk from pycon AU is up

2013-07-13 Thread Robert Layton
Hi all, My talk from pyconAU last week is up: scikit-learn, machine learning and cybercrime attribution http://pyvideo.org/video/2228/scikit-learn-machine-learning-and-cybercrime-att Thanks! Robert -- Public key at: http://pgp.mit.edu/ Search for this email address and select the key from "20

Re: [Scikit-learn-general] Pystruct website and mailing list

2013-07-11 Thread Robert Layton
Structured prediction in sklearn was one of the outcomes from the survey. Would it be a better idea to send people to pystruct, rather than implement it here? On 12 July 2013 03:12, Andreas Mueller wrote: > Hey everybody. > This is spam about my "new" project pystruct. > Pystruct is my shot at

Re: [Scikit-learn-general] Scikit-learn's website is down

2013-07-09 Thread Robert Layton
Can confirm, it works without the *www*, but doesn't work with it. On 10 July 2013 15:38, Nigel Legg wrote: > Total content on www.scikit-learn.org > This space is managed by SourceForge.net. You have attempted to access a > URL that either never existed or is no longer active. Please check the

Re: [Scikit-learn-general] # of jobs run by GridSearchCV?

2013-07-09 Thread Robert Layton
rid)) * > len(check_cv(search.cv)), and I think this should be output at the start > of the search if verbose >= 1, and perhaps should also be calculated by > some method, so a user can estimate the time before finalising the grid... > > - Joel > > > > On Wed, Jul 10, 201

Re: [Scikit-learn-general] # of jobs run by GridSearchCV?

2013-07-09 Thread Robert Layton
Hi Josh, This is decided by the param_grid that you give it. The actual internals are handled by the ParameterGrid class ( http://scikit-learn.org/dev/modules/generated/sklearn.grid_search.ParameterGrid.html ). The example on that page shows how you could calculate the number of runs based on you
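A rough sketch of that calculation. It imports `ParameterGrid` from `sklearn.model_selection`, where the class lives in current releases (the message links to the older `sklearn.grid_search` location). The grid values and fold count are assumptions chosen for illustration.

```python
from sklearn.model_selection import ParameterGrid

# Hypothetical search space: 3 values of C times 2 kernels.
param_grid = {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}
n_folds = 5  # assumed number of cross-validation folds

n_candidates = len(ParameterGrid(param_grid))  # parameter combinations
n_fits = n_candidates * n_folds                # one fit per combination per fold
```

Here that gives 6 candidates and 30 fits, not counting any final refit on the full data.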

Re: [Scikit-learn-general] Scikit-learn's website is down

2013-07-09 Thread Robert Layton
/stable and /dev are both up for me at this time (which was two hours since Josh's email). On 10 July 2013 06:29, Josh Wasserstein wrote: > FYI: The website seems to be currently down. > > Josh > > > -- > See everything

Re: [Scikit-learn-general] Python 3 port

2013-07-08 Thread Robert Layton
Hi Ed, Thanks for your talk, it was very informative. I'm sorry I didn't get a chance to speak to you more. I went to the sprints first thing on Monday, but had to leave just before lunch to catch my flight. As for the port -- if you have anything that can improve the transition, put it in as a P

[Scikit-learn-general] MRG - updating sparsetools

2013-07-07 Thread Robert Layton
Hi all, I just tried updating my PR on updating sparsetools, messed it up completely (seriously, have a look at PR 2037) and redid the PR into a new PR 2134. Would someone please be able to have a look at this PR and merge it? It updates the sparsetools backport, which is needed for my implementa

Re: [Scikit-learn-general] Data Compression

2013-07-07 Thread Robert Layton
Data compression is more a distance metric than a specific algorithm. A data compression algorithm generally learns some key K of patterns in a document D, then uses K to compress D. The intuition behind using data compression methods for machine learning is that if we learn K from one document, a
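One common way to turn this intuition into a number is the normalized compression distance (NCD). The sketch below is an assumption about what such a metric can look like, using `zlib` as the compressor; it is not necessarily the formulation the rest of the message describes, and the sample documents are invented.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: small when x and y share patterns."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Two documents with mostly shared wording, and one unrelated byte pattern.
doc_a = b"the quick brown fox jumps over the lazy dog " * 20
doc_b = b"the quick brown fox leaps over the lazy cat " * 20
doc_c = bytes(range(256)) * 4

dist_similar = ncd(doc_a, doc_b)
dist_different = ncd(doc_a, doc_c)
```

Patterns learned from `doc_a` help compress `doc_b` but not `doc_c`, so `dist_similar` should come out smaller than `dist_different`.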

[Scikit-learn-general] Pycon AU

2013-07-06 Thread Robert Layton
Hi guys, Yesterday I presented at pycon AU on scikit-learn. The talk went well -- I'm glad I went with the scope I did, and didn't get too deeply into anything. In talks afterwards, I found that quite a few people *knew* of scikit-learn, but few (in this crowd anyway) had actually used it. A slig

Re: [Scikit-learn-general] Adding Sparse Autoencoder to Scikit

2013-06-25 Thread Robert Layton
haha, oops. Thanks :) On 26 June 2013 13:29, Joel Nothman wrote: > > On Wed, Jun 26, 2013 at 9:28 AM, Robert Layton wrote: > >> The basics of cython are, and I'm not kidding here, quite easy to learn. >> Steps: >> 1) Rename .py file to .pyc >> > > Yo

Re: [Scikit-learn-general] Adding Sparse Autoencoder to Scikit

2013-06-25 Thread Robert Layton
The basics of cython are, and I'm not kidding here, quite easy to learn. Steps: 1) Rename .py file to .pyc 2) Put "int" in front of all object declarations that will be integers, "float" in front of things that are floats. (If you know java/C/C++ etc, this will feel really natural) 3) Compile with

Re: [Scikit-learn-general] sklearn.utils.cs_graph_components in a broken state?

2013-06-06 Thread Robert Layton
Andreas Mueller wrote: > On 06/06/2013 07:02 AM, Robert Layton wrote: > > OK, it's up as PR > > https://github.com/scikit-learn/scikit-learn/pull/2037 and ready to go. > > > Does that still assume a symmetric matrix? > The easy solution to you problem would have been

Re: [Scikit-learn-general] sklearn.utils.cs_graph_components in a broken state?

2013-06-05 Thread Robert Layton
OK, it's up as PR https://github.com/scikit-learn/scikit-learn/pull/2037 and ready to go. On 6 June 2013 13:09, Robert Layton wrote: > I was afraid you would say that, but if there was the problem, this seems > like the only solution :) > > I'll get on it. > >

Re: [Scikit-learn-general] sklearn.utils.cs_graph_components in a broken state?

2013-06-05 Thread Robert Layton
ely. It may be worth > back-porting that routine as well, and replacing the old > cs_graph_components utility routine. > Jake > > > On Wed, Jun 5, 2013 at 4:54 PM, Robert Layton wrote: > >> I'm using this workflow for a minimum spanning tree clusterer for my >

[Scikit-learn-general] sklearn.utils.cs_graph_components in a broken state?

2013-06-05 Thread Robert Layton
I'm using this workflow for a minimum spanning tree clusterer for my current PR on evidence accumulation clustering (see https://github.com/scikit-learn/scikit-learn/pull/1830 ) My understanding is that the output of minimum_spanning_tree (recently backported to sklearn) should plug directly into

Re: [Scikit-learn-general] My talk has been accepted at PyCon AU!

2013-06-01 Thread Robert Layton
Updated, new link at: https://docs.google.com/file/d/0B8FUzd86yYa1SWJXTlkyUF9idlU/edit?usp=sharing Only the updates here have been changed. On 27 May 2013 01:03, Lars Buitinck wrote: > 2013/5/26 Robert Layton > >> I've updated the slides for my talk at pycon AU and put

Re: [Scikit-learn-general] My talk has been accepted at PyCon AU!

2013-05-26 Thread Robert Layton
e fitting example > > http://scikit-learn.org/stable/auto_examples/linear_model/plot_polynomial_interpolation.html > > This example also illustrates that too complex models tend to overfit and > too simplistic models will underfit. > > Mathieu > > On Sun, May 26, 2013 at 12:3

Re: [Scikit-learn-general] My talk has been accepted at PyCon AU!

2013-05-25 Thread Robert Layton
I've updated the slides for my talk at pycon AU and put them on my Google Drive: https://docs.google.com/file/d/0B8FUzd86yYa1VnYtZ1ZHVV9ieUk/edit?usp=sharing If anyone has any comments, I'm happy to take them. Thanks! - Robert On 3 May 2013 13:18, Robert Layton wrote: > I s

Re: [Scikit-learn-general] Out of memory when running silhouette score function

2013-05-07 Thread Robert Layton
about 50 clusters. Would you mind explaining in more detail how to do the 'sampling'? I don't need a very strict > mathematical guarantee, just a way to estimate the score and choose the k > value. Thank you in advance for your help. > > Regards, > > T.Bao >

Re: [Scikit-learn-general] Out of memory when running silhouette score function

2013-05-07 Thread Robert Layton
Hi Bao, The Silhouette Function hasn't been written with this type of scalability in mind. It requires a pairwise distance matrix, which is prohibitive (as others have said). If the number of clusters is low, sampling should give you a good approximation of the silhouette score, although I can't
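As a rough illustration of the sampling idea, the sketch below scores only a random subset of the points, so nothing larger than one row of distances is held at a time. It is pure Python and the name `silhouette_on_sample` is hypothetical. In scikit-learn itself, `sklearn.metrics.silhouette_score` accepts `sample_size` and `random_state` arguments for the same purpose.

```python
import math
import random


def silhouette_on_sample(points, labels, sample_size, seed=0):
    """Mean silhouette coefficient computed on a random subset of the data.

    Points whose cluster has no other member in the sample, or samples that
    contain only one cluster, contribute a score of 0.
    """
    rng = random.Random(seed)
    chosen = rng.sample(range(len(points)), min(sample_size, len(points)))
    pts = [points[i] for i in chosen]
    lbs = [labels[i] for i in chosen]

    scores = []
    for i, (p, own) in enumerate(zip(pts, lbs)):
        # Distances from p to every other sampled point, grouped by cluster.
        by_cluster = {}
        for j, (q, other) in enumerate(zip(pts, lbs)):
            if i != j:
                by_cluster.setdefault(other, []).append(math.dist(p, q))
        if own not in by_cluster or len(by_cluster) < 2:
            scores.append(0.0)
            continue
        # a: mean distance within p's own cluster.
        a = sum(by_cluster[own]) / len(by_cluster[own])
        # b: mean distance to the nearest other cluster.
        b = min(sum(d) / len(d) for c, d in by_cluster.items() if c != own)
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Two well-separated square blobs of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

full_score = silhouette_on_sample(points, labels, sample_size=8)
sampled_score = silhouette_on_sample(points, labels, sample_size=6, seed=1)
```

Repeating the estimate over a few different seeds and averaging gives a steadier number for comparing values of k.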

Re: [Scikit-learn-general] My talk has been accepted at PyCon AU!

2013-05-02 Thread Robert Layton
> >> Congrats Robert! >> >> >> >> >> On Sun, Apr 28, 2013 at 7:56 AM, Robert Layton wrote: >> >>> I just received some good news. My talk "scikit-learn, machine learning >>> and cybercrime attribution" has been accepted! >

[Scikit-learn-general] My talk has been accepted at PyCon AU!

2013-04-28 Thread Robert Layton
I just received some good news. My talk "scikit-learn, machine learning and cybercrime attribution" has been accepted! I'll be presenting between the 5th and 7th of July. For those that missed the previous emails, my presentation will be sklearn-centric, with a light introduction to machine learn

Re: [Scikit-learn-general] [scikit-learn] plot SVM results and classification space

2013-04-25 Thread Robert Layton
As Andy said, you need to create some representation in two dimensions. You can easily do this by selecting just two features (e.g. the two most discriminating). PCA is another good option, but it can be difficult to interpret what the x-axis and y-axis mean. Keep in mind that classifica

[Scikit-learn-general] Book offer

2013-04-20 Thread Robert Layton
Hi all, I've had an offer to write a book on scipy, but I've turned it down (I really don't have the time to do such a thing well enough). The amount is not a huge amount, but the royalty seems to be a good rate. If anyone is interested, I'm happy to forward names onto the publisher. Either

[Scikit-learn-general] Feedback on API sought

2013-04-18 Thread Robert Layton
Hi all, I have a PR that I would like some very early feedback on, before I go ahead and finish the rest of the necessary stuff (like docs, tests etc). https://github.com/scikit-learn/scikit-learn/pull/1830 I've implemented a cluster ensemble algorithm, with a slightly unusual API, which is why

Re: [Scikit-learn-general] Extend Python with Go?

2013-04-07 Thread Robert Layton
Great :) Thanks. On 8 April 2013 09:23, Andreas Mueller wrote: > On 04/08/2013 12:52 AM, Robert Layton wrote: > > > > Profiling can help with the optimisation, but it would be nice to have > > a pep8 style "warning line 32: missed type declaration"

Re: [Scikit-learn-general] Extend Python with Go?

2013-04-07 Thread Robert Layton
now that the code could be a lot > faster. > > On Sun, Apr 7, 2013 at 6:39 PM, Robert Layton > wrote: > > Nice link. > > It would be interesting to see a speed comparison, but cython has a > number > > of advantages: (1) it looks almost like python code, (2)

Re: [Scikit-learn-general] Extend Python with Go?

2013-04-07 Thread Robert Layton
Nice link. It would be interesting to see a speed comparison, but cython has a number of advantages: (1) it looks almost like python code, (2) python code is a subset, meaning you can just drop python code into a cython file, compile it and get speed gains, (3) numpy integration. On 8 April 2013

Re: [Scikit-learn-general] PyCon 2013 scikit-learn tutorial videos online!

2013-04-03 Thread Robert Layton
I just went through the advanced one -- great tutorial! On 31 March 2013 09:05, Olivier Grisel wrote: > Hi all, > > The videos of the scikit-learn tutorials Jake and I gave at PyCon 10 > days ago are now both online (along with IPython notebooks and > exercise material on our respective github
