Re: [Scikit-learn-general] Library of pre-trained models

2015-07-02 Thread JAGANADH G
On Thu, Jul 2, 2015 at 1:49 AM, Matthieu Brucher wrote:
> The main interesting point is the date of filing the patent and the
> date of the publication of the paper... These are as interesting to
> have as the implementation date.
>
> The particular patent US 9037464 is not listed in "Open Patent

[Scikit-learn-general] Speed up transformation step with multiple 1 vs rest binary text classifiers.

2015-07-02 Thread nmura...@masonlive.gmu.edu
Hello, I have a text classification problem where I have about 50 classes and have 50 binary classifiers (1 per topic). The training set used to train each topic classifier is different (some instances might overlap). Each instance consists of a text snippet which is transformed using tf-idf

Re: [Scikit-learn-general] Is it possible to specify the order of splitting in decision tree with scikit-learn?

2015-07-02 Thread Rex
Sebastian, my question is similar to yours but somewhat different. In my case, I want to find the combined conditions that lead to some *clustering* pattern. For example, given four columns, [A, B, ID, VALUE], say A is a categorical attribute, B is some integer number, and "VALUE" is the target

Re: [Scikit-learn-general] Is it possible to specify the order of splitting in decision tree with scikit-learn?

2015-07-02 Thread Jacob Schreiber
It sounds like you might want an ensemble of classifiers, where you have a different classifier for each category in A, if you know you want to split on A like that a priori. That way the classifier would learn some function on B which maps B to VALUE, and learns this function independently for eac
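A minimal sketch of the per-category idea described above, assuming a toy table of (A, B, VALUE) rows. The data, the choice of DecisionTreeClassifier, and the helper name `predict` are illustrative assumptions and are not taken from the thread; the ID column is left out because the preview does not say how it is used.

```python
from collections import defaultdict

from sklearn.tree import DecisionTreeClassifier

# Made-up rows of (A, B, VALUE); A is categorical, B is an integer.
rows = [
    ("x", 1, 0), ("x", 2, 0), ("x", 8, 1), ("x", 9, 1),
    ("y", 1, 1), ("y", 2, 1), ("y", 8, 0), ("y", 9, 0),
]

# Group the rows by category of A: each group gets its own feature
# matrix (B only) and its own target list (VALUE).
grouped = defaultdict(lambda: ([], []))
for a, b, value in rows:
    grouped[a][0].append([b])
    grouped[a][1].append(value)

# Fit one independent classifier per category of A.
models = {
    a: DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
    for a, (X, y) in grouped.items()
}


def predict(a, b):
    """Hypothetical helper: route the sample to the model trained for category a."""
    return int(models[a].predict([[b]])[0])
```

Splitting the data by A before fitting has the same effect as forcing A to be the first split, while each per-category model stays free to learn its own mapping from B to VALUE.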

Re: [Scikit-learn-general] Speed up transformation step with multiple 1 vs rest binary text classifiers.

2015-07-02 Thread Artem
Hi Nikhil. Do you somehow do topic-specific TF-IDF transformations? Could you provide a small (pseudo) code snippet for what you're doing? I may be wrong, but judging from what you wrote, it doesn't look like you use scikit-learn's OneVsRestClassifier

Re: [Scikit-learn-general] Speed up transformation step with multiple 1 vs rest binary text classifiers.

2015-07-02 Thread Joel Nothman
TfidfVectorizer is just CountVectorizer followed by a TfidfTransformer. The Tfidf transformation tends to be cheap relative to tokenization, which is independent of the corpus you want to calculate TF.IDF over. If I understand correctly, you can perform CountVectorizer on all of your documents, the
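One way to read the suggestion above is sketched below: count tokens once over all documents, then fit a separate, cheap TfidfTransformer per topic on that topic's rows. The corpus, the topic names, and the `topic_rows` index sets are invented for illustration, and the rest of the original message is cut off, so this may not match exactly what the author went on to propose.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer

# Toy corpus; documents and topic row sets are made up for illustration.
docs = [
    "the cat sat on the mat",
    "dogs chase the cat",
    "stocks fell on monday",
    "the market rallied on friday",
]
# Hypothetical per-topic training sets (row indices into docs); they overlap.
topic_rows = {"pets": [0, 1, 2], "finance": [1, 2, 3]}

# Tokenize and count once over every document (the expensive step).
counter = CountVectorizer()
counts = counter.fit_transform(docs)  # sparse matrix, shape (n_docs, n_terms)

# Fit one TfidfTransformer per topic, using only that topic's rows,
# so each topic gets its own IDF weights without re-tokenizing.
tfidf_by_topic = {
    topic: TfidfTransformer().fit(counts[rows])
    for topic, rows in topic_rows.items()
}

# At prediction time, tokenize the new text once and reuse the counts
# for every topic-specific transformer.
new_counts = counter.transform(["the cat chased the market"])
features_by_topic = {
    topic: tfidf.transform(new_counts)
    for topic, tfidf in tfidf_by_topic.items()
}
```

Each entry of `features_by_topic` can then be passed to the binary classifier for that topic, so the tokenization cost is paid once rather than once per classifier.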

[Scikit-learn-general] Scikit-learn with gcc 4.2 on FreeBSD.

2015-07-02 Thread Nastaran Baradaran
Hi, I am trying to use scikit-learn with gcc 4.2 and am facing some runtime issues. I can build and import the modules (numpy/scipy/sklearn) without any problems; however, I see the following two errors when running a script: 1. ImportError: /usr/local/lib/libgfortran.so.3: version GFORTRAN_1.0 re