Re: [theano-users] General help regarding correctness of trained data

Frédéric Bastien Wed, 28 Sep 2016 07:30:33 -0700

This is probably related to the imbalanced training set. Search for how to
handle that; there are many techniques. You can resample your examples to
get a better class proportion in your training data, for example by
duplicating the positive samples or distorting them.
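
As a rough illustration of the duplication approach, here is a minimal NumPy sketch. The function name `oversample_positives`, the `target_ratio` parameter, and the toy data are assumptions made for this example; only the class counts (266 positives out of ~14k) come from the thread.

```python
import numpy as np

def oversample_positives(X, y, target_ratio=0.5, seed=0):
    """Duplicate positive samples (label 1) until they make up
    roughly `target_ratio` of the returned data set."""
    rng = np.random.RandomState(seed)
    pos_idx = np.flatnonzero(y == 1)
    neg_idx = np.flatnonzero(y == 0)
    if len(pos_idx) == 0:
        raise ValueError("no positive samples to duplicate")
    # Number of positives needed so that pos / (pos + neg) == target_ratio.
    n_pos_needed = int(round(target_ratio * len(neg_idx) / (1.0 - target_ratio)))
    n_pos_needed = max(n_pos_needed, len(pos_idx))
    # Draw the extra positives with replacement from the existing ones.
    extra = rng.choice(pos_idx, size=n_pos_needed - len(pos_idx), replace=True)
    idx = np.concatenate([neg_idx, pos_idx, extra])
    rng.shuffle(idx)
    return X[idx], y[idx]

# Toy data (hypothetical features) with the class counts from this thread.
X = np.random.RandomState(1).randn(14000, 8).astype("float32")
y = np.zeros(14000, dtype="int32")
y[:266] = 1

X_bal, y_bal = oversample_positives(X, y)
print(X_bal.shape, y_bal.mean())
```

Resample only the training split; validation and test data should keep their original class proportions so the reported numbers stay meaningful.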


This isn't the only possible improvement. You could pretrain with an
unsupervised model, then fine-tune with a resampling technique. I'm pretty
sure there are other techniques too.
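
One such alternative, raised earlier in this thread, is a weight in the cost function. The sketch below is plain NumPy rather than Theano, and the helper names `weighted_nll` and `inverse_frequency_weights` are made up for this example. It assumes every class appears at least once in the labels.

```python
import numpy as np

def inverse_frequency_weights(y, n_classes=2):
    """Weight each class by n_samples / (n_classes * class_count).
    Assumes every class occurs at least once in y."""
    counts = np.bincount(y, minlength=n_classes).astype("float64")
    return counts.sum() / (n_classes * counts)

def weighted_nll(probs, y, class_weights):
    """Negative log-likelihood where each sample is scaled by the
    weight of its true class, normalised by the total weight.

    probs: (n, n_classes) predicted probabilities
    y: (n,) integer labels
    class_weights: (n_classes,) per-class weights
    """
    eps = 1e-7  # avoid log(0) when the model outputs exactly 0
    p_true = np.clip(probs[np.arange(len(y)), y], eps, 1.0)
    w = class_weights[y]
    return float(np.sum(-w * np.log(p_true)) / np.sum(w))

# Labels with the class counts from this thread: 266 positives out of 14000.
y = np.zeros(14000, dtype="int64")
y[:266] = 1

# A degenerate model that always outputs [1.0, 0.0], as in the original question.
probs = np.tile(np.array([1.0, 0.0]), (len(y), 1))

unweighted = weighted_nll(probs, y, np.ones(2))
weighted = weighted_nll(probs, y, inverse_frequency_weights(y))
print(unweighted, weighted)
```

With the weights in place, always predicting the dominant class is penalised much more heavily, so it stops being an attractive solution for the optimiser.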

Fred

On Tue, Sep 27, 2016 at 4:14 PM, Mallika Agarwal <[email protected]>
wrote:

> Okay, so I tried with the complete training data.
>
> However there's a problem with it as well. I have only 266 positive
> samples out of ~14k samples. I tried this without any weight function, and
> even on the training data, my model is able to correctly identify only
> 88/266 positive samples.
>
> Is this because of the imbalanced training set or could there be some
> other problem as well?
>
> Again, any help would be appreciated. Thanks,
>
> On Tuesday, September 27, 2016 at 3:59:36 PM UTC+5:30, Arjun Jain wrote:
>>
>> Glad to be of help!
>>
>> On Tue, Sep 27, 2016 at 3:57 PM, Mallika Agarwal <[email protected]>
>> wrote:
>>
>>> Hi Arjun, thanks a lot for the quick response.
>>>
>>> But even then, I read somewhere that no label should get a probability
>>> of exactly 1, because the model will always predict every label with some
>>> non-zero probability - is that incorrect?
>>>
>>> Yes, I just tried it with the training data; it's predicting 0 (with
>>> probability 1) for all the training examples.
>>>
>>> But I think I found the problem. The training data just contains one
>>> sample with label 1 (like you said, imbalanced class distribution). I'll
>>> try training the model on more data, and get back. Thanks!
>>>
>>>
>>>
>>> On Monday, September 26, 2016 at 10:13:11 PM UTC+5:30, Arjun Jain wrote:
>>>>
>>>> Hi Mallika,
>>>>
>>>> It is indeed a non-Theano-specific question, but I will try to answer it
>>>> nevertheless. There can be a variety of reasons you are always getting the
>>>> first class. For example, if the first class is the dominant class (it has
>>>> more training examples and you do not use a weight in the cost function),
>>>> a good local minimum for the network to learn is to output the first class
>>>> all the time, as it would be a good guess.
>>>>
>>>> Have you tested your trained model on the training data?
>>>>
>>>> Best,
>>>> Arjun
>>>>
>>>> On Mon, Sep 26, 2016 at 8:11 PM, Mallika Agarwal <[email protected]
>>>> > wrote:
>>>>
>>>>> I suppose this isn't theano specific, let me know if I should shift
>>>>> the question somewhere else.
>>>>>
>>>>> I am trying to perform verification for face recognition. (Binary
>>>>> classification)
>>>>>
>>>>> I trained a CNN using ~1000 examples, and stored the model to file.
>>>>>
>>>>> I am now testing this on ~5000 examples. I seem to be getting
>>>>> something very incorrect: when I output the probabilities from logistic
>>>>> regression, I get the distribution [1.0, 0.0] for every test
>>>>> example.
>>>>>
>>>>> Is there a way I could check where the problem lies - in my trained
>>>>> parameters or in the test method?
>>>>>
>>>>> Any help would be appreciated, thanks. :S
>>>>>
>>>>> --
>>>>>
>>>>> ---
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "theano-users" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to [email protected].
>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>

