An infinite number of samples is fine.

It is still true that you need to have training samples from all of
the target categories.
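
Just to make that concrete, something like this (a rough, untested
sketch -- generateVector(), numFeatures and the two sentence lists are
placeholders for whatever you already have on your side):

  import org.apache.mahout.classifier.sgd.L1;
  import org.apache.mahout.classifier.sgd.OnlineLogisticRegression;
  import org.apache.mahout.math.Vector;

  // two categories: 0 = "similar", 1 = "not similar"
  OnlineLogisticRegression algorithm =
      new OnlineLogisticRegression(2, numFeatures, new L1());

  // train on positives *and* negatives
  for (String sentence : similarSentences) {
    algorithm.train(0, generateVector(sentence));
  }
  for (String sentence : otherSentences) {
    algorithm.train(1, generateVector(sentence));
  }

  // score an unseen sentence; check which category the returned scalar
  // refers to in your Mahout version before interpreting it
  double score = algorithm.classifyScalar(generateVector(unknownSentence));

The negatives don't have to be exhaustive -- any convenient sample of
"everything else" is enough to give the model something to contrast
against.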

On Sun, Jun 12, 2011 at 2:53 PM, Joscha Feth <[email protected]> wrote:
> Hi Ted,
>
> I see. Only for the OLR or also for any other algorithm? What if my
> other category theoretically contains an infinite number of samples?
>
> Cheers,
> Joscha
>
> Am 12.06.2011 um 15:08 schrieb Ted Dunning <[email protected]>:
>
>> Joscha,
>>
>> There is no implicit training.  You need to give negative examples as
>> well as positive ones.
>>
>>
>> On Sat, Jun 11, 2011 at 9:08 AM, Joscha Feth <[email protected]> wrote:
>>> Hello Ted,
>>>
>>> thanks for your response!
>>> What I wanted to accomplish is actually quite simple in theory: I have some
>>> sentences which have things in common (some similar words, for example).
>>> I want to train my model with these example sentences. Once it is trained,
>>> I want to give an unknown sentence to my classifier and get back a score
>>> indicating how similar the unknown sentence is to the sentences I trained
>>> my model with. So basically I have two categories (sentence is similar and
>>> sentence is not similar). To my understanding it only makes sense to train
>>> my model with the positives (i.e. the sample sentences) and put them all
>>> into the same category (I chose category 0, because the .classifyScalar()
>>> method seems to return the probability for the first category, i.e.
>>> category 0). All other sentences fall implicitly (but without being
>>> trained on) into the second category (category 1).
>>>
>>> Does that make sense or am I completely off here?
>>>
>>> Kind regards,
>>> Joscha Feth
>>>
>>> On Sat, Jun 11, 2011 at 03:46, Ted Dunning <[email protected]> wrote:
>>>>
>>>> The target variable here is always zero.
>>>>
>>>> Shouldn't it vary?
>>>>
>>>> On Fri, Jun 10, 2011 at 9:54 AM, Joscha Feth <[email protected]> wrote:
>>>>>            algorithm.train(0, generateVector(animal));
>>>>>
>>>
>>>
>
