An infinite number of samples is fine. It is still true that you need to have training samples from all of the target categories.
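
For example, with OnlineLogisticRegression the target value has to vary across your train() calls. Here is a minimal sketch of what I mean; the class name, the feature size, the sample sentences and the hashed generateVector() are only stand-ins for whatever encoding you actually use, not your code:

    import org.apache.mahout.classifier.sgd.L1;
    import org.apache.mahout.classifier.sgd.OnlineLogisticRegression;
    import org.apache.mahout.math.RandomAccessSparseVector;
    import org.apache.mahout.math.Vector;

    public class SimilarSentenceSketch {

        // Size of the hashed feature space; an arbitrary choice for illustration.
        private static final int FEATURES = 1000;

        // Stand-in for your generateVector(): a crude hashed bag-of-words.
        static Vector generateVector(String sentence) {
            Vector v = new RandomAccessSparseVector(FEATURES);
            for (String word : sentence.toLowerCase().split("\\s+")) {
                int idx = Math.abs(word.hashCode()) % FEATURES;
                v.set(idx, v.get(idx) + 1.0);
            }
            return v;
        }

        public static void main(String[] args) {
            // Two categories: 0 = "not similar", 1 = "similar".
            OnlineLogisticRegression olr =
                new OnlineLogisticRegression(2, FEATURES, new L1());

            // Positive examples: sentences like the ones you want to recognize.
            for (String s : new String[] {"the quick brown fox", "a quick red fox"}) {
                olr.train(1, generateVector(s));
            }

            // Negative examples: the target must vary, so train category 0
            // explicitly with sentences that are NOT similar.
            for (String s : new String[] {"stock prices fell sharply",
                                          "the meeting starts at noon"}) {
                olr.train(0, generateVector(s));
            }

            // Single score for the binary case.
            double p = olr.classifyScalar(generateVector("an unknown quick fox sentence"));
            System.out.println("score = " + p);
        }
    }

Note that classifyScalar() on a two-category model returns the score for category 1, if I remember the javadoc correctly, so with this labeling a value near 1.0 means "similar".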
On Sun, Jun 12, 2011 at 2:53 PM, Joscha Feth <[email protected]> wrote:
> Hi Ted,
>
> I see. Only for the OLR or also for any other algorithm? What if my
> other category theoretically contains an infinite number of samples?
>
> Cheers,
> Joscha
>
> Am 12.06.2011 um 15:08 schrieb Ted Dunning <[email protected]>:
>
>> Joscha,
>>
>> There is no implicit training. You need to give negative examples as
>> well as positive.
>>
>>
>> On Sat, Jun 11, 2011 at 9:08 AM, Joscha Feth <[email protected]> wrote:
>>> Hello Ted,
>>>
>>> thanks for your response!
>>> What I want to accomplish is actually quite simple in theory: I have some
>>> sentences which have things in common (some similar words, for example).
>>> I want to train my model with these example sentences. Once it is
>>> trained, I want to give an unknown sentence to my classifier and would like
>>> to get back a percentage indicating how similar the unknown sentence is to the
>>> sentences I trained my model with. So basically I have two categories
>>> (sentence is similar and sentence is not similar). To my understanding it
>>> only makes sense to train my model with the positives (i.e. the sample
>>> sentences) and put them all into the same category (I chose category 0,
>>> because the .classifyScalar() method seems to return the probability for the
>>> first category, i.e. category 0). All other sentences are implicitly (but
>>> without being trained) in the second category (category 1).
>>>
>>> Does that make sense or am I completely off here?
>>>
>>> Kind regards,
>>> Joscha Feth
>>>
>>> On Sat, Jun 11, 2011 at 03:46, Ted Dunning <[email protected]> wrote:
>>>>
>>>> The target variable here is always zero.
>>>>
>>>> Shouldn't it vary?
>>>>
>>>> On Fri, Jun 10, 2011 at 9:54 AM, Joscha Feth <[email protected]> wrote:
>>>>> algorithm.train(0, generateVector(animal));
>>>>>
>>>
>>>
>
