Sorry, that email was sent prematurely (thanks, Gmail).
I was saying that labs' dimensionality was indeed the source of the error.
Doing this solved it:
labs = labs[:, 0]
It may seem obvious, but having switched from MATLAB, the difference between
shapes (1000, 1) and (1000,) is still not perfectly clear to me, hence all
the time I spent on this.
Thanks!
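For anyone searching the archive later, a minimal sketch of the distinction (NumPy only; the array contents here are arbitrary):

```python
import numpy as np

labs = np.zeros((1000, 1), dtype=np.int8)   # 2-D column vector, shape (1000, 1)
flat = labs[:, 0]                           # 1-D array, shape (1000,)

print(labs.shape)   # (1000, 1)
print(flat.shape)   # (1000,)

# labs.ravel() is an equivalent way to get the 1-D form that
# scikit-learn expects for a label vector y.
print(labs.ravel().shape)   # (1000,)
```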
On Mon, Jul 22, 2013 at 11:18 AM, Arslan, Ali <[email protected]> wrote:
> Hi Andreas,
> Alas you were right.
>
>
> On Mon, Jul 22, 2013 at 9:07 AM, Andreas Mueller <[email protected]> wrote:
>
>> That looks good to me (I think labs might need to be (1000,), but I'm
>> not entirely sure).
>>
>> Can you reproduce your error on random / generated data?
>> A gist to reproduce the problem would be great.
>>
>> Cheers,
>> Andy
>>
>>
>>
>> On 07/22/2013 03:02 PM, Arslan, Ali wrote:
>>
>> Hi Andy,
>>
>> ipdb> feats.dtype
>> dtype('float64')
>>
>> ipdb> type(feats)
>> <type 'numpy.ndarray'>
>>
>> ipdb> feats.shape
>> (1000, 20)
>>
>> ipdb> labs.dtype
>> dtype('int8')
>>
>> ipdb> type(labs)
>> <type 'numpy.ndarray'>
>>
>> ipdb> labs.shape
>> (1000, 1)
>>
>>
>> I think it could also be related to the values inside the feats matrix,
>> but I don't know what would cause these errors. I made sure it's not full
>> of zeros, but that's the only thing I could think of.
>> Any ideas?
>> Thanks,
>> A
>>
>>
>> On Mon, Jul 22, 2013 at 4:43 AM, Andreas Mueller <[email protected]> wrote:
>>
>>> Hi Ali.
>>> What is the type and size of your input and output vectors?
>>> (type, dtype, shape)
>>>
>>> Cheers,
>>> Andy
>>>
>>>
>>> On 07/22/2013 01:24 AM, Arslan, Ali wrote:
>>>
>>> Hi,
>>> I'm trying to use AdaBoostClassifier with a decision tree stump as the
>>> base classifier. I noticed that the weight adjustment done by
>>> AdaBoostClassifier has been giving me errors both for SAMME.R and SAMME
>>> options.
>>>
>>> Here's a brief overview of what I'm doing:
>>>
>>> def train_adaboost(features, labels):
>>>     uniqLabels = np.unique(labels)
>>>     allLearners = []
>>>     for targetLab in uniqLabels:
>>>         runs = []
>>>         for rrr in xrange(10):
>>>             feats, labs = get_binary_sets(features, labels, targetLab)
>>>             baseClf = DecisionTreeClassifier(max_depth=1, min_samples_leaf=1)
>>>             baseClf.fit(feats, labs)
>>>             ada_real = AdaBoostClassifier(base_estimator=baseClf,
>>>                                           learning_rate=1,
>>>                                           n_estimators=20,
>>>                                           algorithm="SAMME")
>>>             runs.append(ada_real.fit(feats, labs))
>>>         allLearners.append(runs)
>>>     return allLearners
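A self-contained variant on generated data (replacing the thread-specific get_binary_sets with sklearn.datasets.make_classification; the shapes match the ones reported in this thread, and the default base estimator is already a depth-1 decision tree stump, so base_estimator and algorithm are left at their defaults here):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

# Generated stand-in for the real data: 1000 samples, 20 features.
feats, labs = make_classification(n_samples=1000, n_features=20, random_state=0)

ada = AdaBoostClassifier(n_estimators=20, learning_rate=1.0)
ada.fit(feats, labs)                    # labs has shape (1000,) here, so this fits cleanly

print(ada.predict(feats).shape)         # (1000,)
print(ada.predict_proba(feats).shape)   # (1000, 2)
```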
>>>
>>> I checked the fit of every single decision tree classifier, and each is
>>> able to predict some labels. When I use the AdaBoostClassifier built on
>>> this base classifier, however, I get errors from the weight boosting
>>> algorithm.
>>>
>>> def compute_confidence(allLearners, dada, labbo):
>>>     for ii, thisLab in enumerate(allLearners):
>>>         for jj, thisLearner in enumerate(thisLab):
>>>             # accessing thisLearner's methods here
>>>
>>> The methods give errors like these:
>>>
>>> ipdb> thisLearner.predict_proba(myData)
>>>
>>> PATHTOPACKAGE/lib/python2.7/site-packages/sklearn/ensemble/weight_boosting.py:727:
>>> RuntimeWarning: invalid value encountered in double_scalars
>>>   proba /= self.estimator_weights_.sum()
>>> *** ValueError: 'axis' entry is out of bounds
>>>
>>> ipdb> thisLearner.predict(myData)
>>>
>>> PATHTOPACKAGE/lib/python2.7/site-packages/sklearn/ensemble/weight_boosting.py:639:
>>> RuntimeWarning: invalid value encountered in double_scalars
>>>   pred /= self.estimator_weights_.sum()
>>> *** IndexError: 0-d arrays can only use a single () or a list of newaxes
>>> (and a single ...) as an index
>>>
>>> I also tried the SAMME.R algorithm, but in that case I can't even fit
>>> AdaBoost because of this error [...]
>>>
>>> File "PATH/sklearn/ensemble/weight_boosting.py", line 388, in fit
>>>     return super(AdaBoostClassifier, self).fit(X, y, sample_weight)
>>> File "PATH/sklearn/ensemble/weight_boosting.py", line 124, in fit
>>>     X_argsorted=X_argsorted)
>>> File "PATH/sklearn/ensemble/weight_boosting.py", line 435, in _boost
>>>     X_argsorted=X_argsorted)
>>> File "PATH/sklearn/ensemble/weight_boosting.py", line 498, in _boost_real
>>>     (estimator_weight < 0)))
>>>
>>> ValueError: non-broadcastable output operand with shape (1000) doesn't
>>> match the broadcast shape (1000,1000)
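The broadcast shape in that error can be reproduced in isolation: combining a (1000,) array in place with a (1000, 1) array broadcasts up to a (1000, 1000) result, which the 1-D output cannot hold. A minimal sketch:

```python
import numpy as np

pred = np.ones(1000)        # 1-D array, shape (1000,)
col = np.ones((1000, 1))    # 2-D column vector, shape (1000, 1)

# Out-of-place, broadcasting silently produces a (1000, 1000) result.
print((pred * col).shape)   # (1000, 1000)

# In place, the output operand cannot grow, hence the ValueError.
try:
    pred *= col
except ValueError as err:
    print(err)
```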
>>>
>>> The data's dimensions are actually compatible with the format the
>>> classifier expects, both before using AdaBoost and when I test the
>>> trained classifiers. What could these errors indicate?
>>>
>>> Thanks,
>>> Ali
>>>
>>>
>>>
>>
>>
>
>
--
Ali B Arslan, M.Sc.
Cognitive, Linguistic and Psychological Sciences
Brown University
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general