Dear Artur,
the shapes of your input arrays got muddled up. Unlike e.g. MATLAB, numpy
has true 1D arrays, and hstack simply concatenates these end to end. Thus
male_height, male_weight and male_age are all written into one single row
of 15 values, and trainingData ends up with shape (3, 15), which is why the
error message reports 15 features. What you are looking to hstack are
column vectors, which are 2D arrays of shape (5, 1). You can obtain these
in a number of ways. While there are more concise ways of doing this, the
most useful command (in the sense of teaching to fish) I can give you at
this stage is reshape (working through a full numpy tutorial can also be
very useful in general):
male_height = np.array([111,121,137,143,157]).reshape(-1, 1)
male_weight = np.array([60,70,88,99,75]).reshape(-1, 1)
male_age = np.array([41,32,73,54,35]).reshape(-1, 1)
The same needs to be done for the females and the nas.
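To see the difference in shapes for yourself, here is a small sketch using
the male arrays from your post. The names flat, males and males_alt are only
illustrative, and np.column_stack is shown as one of the more concise
alternatives mentioned above, not as the only way to do it:

```python
import numpy as np

male_height = np.array([111, 121, 137, 143, 157])
male_weight = np.array([60, 70, 88, 99, 75])
male_age = np.array([41, 32, 73, 54, 35])

# hstack on 1D arrays concatenates them end to end.
flat = np.hstack([male_height, male_weight, male_age])
print(flat.shape)  # (15,)

# Reshaping each array to a column first gives one row per person.
males = np.hstack([a.reshape(-1, 1)
                   for a in (male_height, male_weight, male_age)])
print(males.shape)  # (5, 3)

# column_stack turns 1D arrays into columns directly.
males_alt = np.column_stack([male_height, male_weight, male_age])
print(np.array_equal(males, males_alt))  # True
```

Each row of males is then one training example with the three features
height, weight and age.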
The next problem will be the label vector, which at the moment has only 3
entries but should have one entry per example, i.e. it should be
labels = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2])
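If you would rather not type the labels out by hand, a minimal sketch,
assuming 5 examples per class in the order males, females, nas as in your
post, could look like this (np.repeat and np.column_stack are my choice here,
any construction giving the same arrays will do):

```python
import numpy as np

# Data from the original post, one row per person: (height, weight, age).
males = np.column_stack([[111, 121, 137, 143, 157],
                         [60, 70, 88, 99, 75],
                         [41, 32, 73, 54, 35]])
females = np.column_stack([[91, 121, 135, 98, 90],
                           [32, 67, 98, 86, 56],
                           [51, 35, 33, 67, 61]])
nas = np.column_stack([[96, 127, 145, 99, 91],
                       [42, 97, 78, 76, 86],
                       [56, 35, 49, 64, 66]])

trainingData = np.vstack([males, females, nas])  # shape (15, 3)
labels = np.repeat([0, 1, 2], 5)                 # one label per row

# The number of rows must match the number of labels.
assert trainingData.shape[0] == labels.shape[0]
```

With these two arrays, clf.fit(trainingData, labels) gets one label per
example. Note that recent scikit-learn versions also expect the input to
predict to be 2D, e.g. clf.predict([[100, 100, 100]]).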
Hope this helps!
Michael
On Sun, Oct 12, 2014 at 3:57 PM, Artur Bercik <vbubbl...@gmail.com> wrote:
> Dear sklearn users:
> I am stuck on the following simple problem of fitting a support vector
> machine with numpy arrays. I would be grateful if someone could answer me.
>
> import numpy as np
> from sklearn import svm
>
> ##I have 3 classes/labels ('male', 'female','na') denoted as follows:
>
> labels = [0,1,2]
>
> ##Each class was defined by 3 variables ('height','weight','age') as the training data:
>
> male_height = np.array([111,121,137,143,157])
> male_weight = np.array([60,70,88,99,75])
> male_age = np.array([41,32,73,54,35])
>
> males = np.hstack([male_height,male_weight,male_age])
>
> female_height = np.array([91,121,135,98,90])
> female_weight = np.array([32,67,98,86,56])
> female_age = np.array([51,35,33,67,61])
>
> females = np.hstack([female_height,female_weight,female_age])
>
> na_height = np.array([96,127,145,99,91])
> na_weight = np.array([42,97,78,76,86])
> na_age = np.array([56,35,49,64,66])
>
> nas = np.hstack([na_height,na_weight,na_age])
>
> ##Now I have to fit the support vector machine method for the training data
> ##to predict the class given these 3 variables:
>
> height_weight_age = [100,100,100]
>
> clf = svm.SVC()
> trainingData = np.vstack([males,females,nas])
>
> clf.fit(trainingData, labels)
>
> result = clf.predict(height_weight_age)
>
> print result
>
> #Unfortunately, the following error occurs:
> # ValueError: X.shape[1] = 3 should be equal to 15, the number of features at training time
> #How should I modify the 'trainingData' and 'labels' to get the correct answer?
>
>
> Thanks in advance.
> Artur Bercik
>
>
>
> ------------------------------------------------------------------------------
> Meet PCI DSS 3.0 Compliance Requirements with EventLog Analyzer
> Achieve PCI DSS 3.0 Compliant Status with Out-of-the-box PCI DSS Reports
> Are you Audit-Ready for PCI DSS 3.0 Compliance? Download White paper
> Comply to PCI DSS 3.0 Requirement 10 and 11.5 with EventLog Analyzer
> http://p.sf.net/sfu/Zoho
> _______________________________________________
> Scikit-learn-general mailing list
> Scikit-learn-general@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>
>