On 2011-12-29, at 3:18 PM, Bronco Zaurus wrote:
> Hello,
>
> I have a beginner's question: how do you classify using non-numerical
> features, concretely strings (for example: 'audi', 'bmw',
> 'chevrolet')?
>
> One way that comes to mind is to give each value a number. Is there a
> more straightforward way of using string features in sklearn?
There is actually work on embedding word senses into vector spaces; see "Word
representations: A simple and general method for semi-supervised learning",
for example.
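The idea behind such word representations can be sketched with a toy lookup table. The vectors below are made up for illustration only, not taken from any real model; in practice they would come from a pretrained embedding method like the one in the paper above.

```python
import numpy as np

# Toy embedding table: each string feature maps to a dense vector.
# These numbers are invented for illustration; a real table would be
# learned from a large corpus.
EMBEDDINGS = {
    "audi":      np.array([0.12, -0.40, 0.33]),
    "bmw":       np.array([0.10, -0.38, 0.35]),
    "chevrolet": np.array([-0.50, 0.22, 0.08]),
}

def embed(tokens):
    """Stack the embedding vectors of a list of string features."""
    return np.vstack([EMBEDDINGS[t] for t in tokens])

X = embed(["audi", "chevrolet", "bmw"])
print(X.shape)  # (3, 3): three samples, three embedding dimensions
```

The resulting matrix is ordinary numerical data that any scikit-learn estimator can consume.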
On Fri, Dec 30, 2011 at 6:26 AM, Robert Layton wrote:
> On 30 December 2011 08:57, Gael Varoquaux wrote:
>
>> On Thu, Dec 29, 2011 at 09:18:38PM +0100, Bronco Zaurus wrote:
On 30 December 2011 08:57, Gael Varoquaux wrote:
> On Thu, Dec 29, 2011 at 09:18:38PM +0100, Bronco Zaurus wrote:
> >I have a beginner's question: how do you classify using non-numerical
> >features, concretely strings (for example: 'audi', 'bmw',
> >'chevrolet')?
>
> You are in trouble as your input space is not metric: what's .5*('audi' +
> 'chevrolet')?
Hi Adnan,
probability=True performs a probability calibration on the decision
function.
In order to generate the ROC curve you could directly use the output of the
decision_function method and obtain exactly the same result as if you used
probability calibration (this is because calibration is a strictly monotone
transformation of the decision function, so it does not change the ranking of
the samples and hence leaves the ROC curve unchanged).
On Thu, Dec 29, 2011 at 12:46:36PM -0800, adnan rajper wrote:
>I use LinearSVC for text classification. My problem is that I want to
>generate a ROC curve for LinearSVC. Since LinearSVC does not output
>probabilities, is there any other way to generate a ROC curve for it?
>I have tried svm.SVC(kernel='linear', probabilities=True) but it gets
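The recipe above can be sketched in a few lines: roc_curve only needs a ranking score, and decision_function provides one. The dataset and split below are illustrative, not from Adnan's problem.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC
from sklearn.metrics import roc_curve, roc_auc_score

# Synthetic stand-in for a binary text-classification problem.
X, y = make_classification(n_samples=200, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LinearSVC().fit(X_train, y_train)

# Signed distances to the hyperplane -- not probabilities, but a
# valid ranking score for ROC analysis.
scores = clf.decision_function(X_test)

fpr, tpr, thresholds = roc_curve(y_test, scores)
print(roc_auc_score(y_test, scores))
```

Any strictly monotone rescaling of `scores` (such as Platt calibration) would produce the identical curve, which is the point of the reply above.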
On Thu, Dec 29, 2011 at 09:18:38PM +0100, Bronco Zaurus wrote:
>I have a beginner's question: how do you classify using non-numerical
>features, concretely strings (for example: 'audi', 'bmw',
>'chevrolet')?
You are in trouble as your input space is not metric: what's .5*('audi' +
'chevrolet')?
hi everybody,
I use LinearSVC for text classification. My problem is that I want to generate
a ROC curve for LinearSVC. Since LinearSVC does not output probabilities, is
there any other way to generate a ROC curve for LinearSVC?
I have tried svm.SVC(kernel='linear', probabilities=True) but it gets
Hello,
I have a beginner's question: how do you classify using non-numerical
features, concretely strings (for example: 'audi', 'bmw',
'chevrolet')?
One way that comes to mind is to give each value a number. Is there a
more straightforward way of using string features in sklearn?
---
On Thu, Dec 29, 2011 at 10:34:16AM -0800, Josh Bleecher Snyder wrote:
> If you want to experiment with more options, you might also play with
> blosc (http://blosc.pytables.org/trac). The compression level is not
> as good as heavier weight algorithms, but it is really zippy. I ended
> up using it
> Obviously the fine-tuning that I did is not needed for the scikit's
> storage of the datasets, but in general fast dump/load of Python objects
> is useful for scientific computing and big data (think caching or
> message-passing parallel computing).
If you want to experiment with more options, you might also play with
blosc (http://blosc.pytables.org/trac). The compression level is not
as good as heavier-weight algorithms, but it is really zippy. I ended
up using it
On Wed, Dec 28, 2011 at 05:21:39PM +0100, Alexandre Gramfort wrote:
> thanks Gael for the christmas present :)
I just couldn't help playing more. I have pushed a new update that makes it
possible to control the compression level and, in general, achieves better
compromises between speed and compression.
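joblib's compressed persistence is its own implementation; the stdlib sketch below only illustrates the general speed/size compromise that a compression level controls.

```python
import pickle
import time
import zlib

# A moderately large, compressible object.
obj = list(range(100000))
raw = pickle.dumps(obj)

for level in (1, 3, 9):
    t0 = time.perf_counter()
    blob = zlib.compress(raw, level)
    elapsed = time.perf_counter() - t0
    # The data must round-trip exactly regardless of level.
    assert pickle.loads(zlib.decompress(blob)) == obj
    print("level=%d  size=%d bytes  time=%.4fs" % (level, len(blob), elapsed))
```

Higher levels trade CPU time for smaller files; level 1 is the "really zippy" end of the spectrum, level 9 the heavyweight end.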
On Thu, Dec 29, 2011 at 04:55:48PM +0100, Andreas Müller wrote:
> I was wondering whether it is a good idea to use properties, as to me
> that seems very unlike the rest of the user-interface.
> Also, from the documentation it is not entirely clear which attributes are
> properties and which are not.
Hi everybody,
As you might have noticed, I am trying to get all the errors out of the docs.
One thing I noticed today is that there are two (or six, depending on how you
count) places where properties are used: the gmm and hmm modules.
I was wondering whether it is a good idea to use properties, as to me that
seems very unlike the rest of the user-interface.
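A hypothetical example (not the actual gmm/hmm code) of why this matters: a property is indistinguishable from a plain attribute at the call site, so nothing in the interface signals that code runs on every access.

```python
class PlainEstimator:
    """Attribute set directly, as in most of the scikit's estimators."""
    def __init__(self, means):
        self.means_ = means  # ordinary attribute


class PropertyEstimator:
    """Same public name, but access goes through a property."""
    def __init__(self, means):
        self._means = means

    @property
    def means_(self):
        # Code runs on every read, e.g. returning a defensive copy --
        # invisible to the caller unless documented.
        return list(self._means)

    @means_.setter
    def means_(self, value):
        self._means = value


p = PropertyEstimator([1.0, 2.0])
print(p.means_)  # looks exactly like PlainEstimator attribute access
```

That invisibility is Andreas's point: from the documentation alone you cannot tell which `foo_` attributes are plain data and which hide behavior.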