> > given a list of features - e.g. dataDescrs[0] = (140.0, 2, 0.5 - and a
> > list of experimental observations - e.g. data_activities[0] = 0 - how do I
> > transform these lists to the scikit-learn nomenclature?
> 
> Depends on what these things represent, but if all tuples in
> dataDescrs have the same length and data_activities contains what you
> want to predict, then the simplest possible "transformation" would be
> 
>     X = dataDescrs
>     y = data_activities

I was trying to do a train/test split:

    from sklearn.cross_validation import train_test_split
    X_train, X_test, y_train, y_test = train_test_split(
        dataDescrs, data_activities, test_size=.4)

However, I found it strange that X_train.shape gives (373, 177) -
shouldn't the second dimension be the number of classes, i.e. 2?
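
A minimal stdlib-only sketch of the convention in question, assuming toy
data (the three-feature rows and the labels below are made up for
illustration, and shape_2d is a hypothetical helper, not a scikit-learn
function): X holds one row per sample and one column per feature, while the
classes are the distinct values found in y.

```python
# Toy data for illustration only; not the dataset from this thread.
# Each row of X is one sample, each column one feature.
X = [
    (140.0, 2, 0.5),
    (98.5, 1, 0.1),
    (210.3, 4, 0.9),
    (75.0, 0, 0.3),
]
# One label per sample; the classes are the distinct values in y.
y = [0, 1, 1, 0]


def shape_2d(rows):
    """Return (n_samples, n_features) for a list of equal-length tuples."""
    n_features = len(rows[0])
    if any(len(row) != n_features for row in rows):
        raise ValueError("all rows must have the same number of features")
    return (len(rows), n_features)


x_shape = shape_2d(X)
classes = sorted(set(y))

print(x_shape)       # (n_samples, n_features)
print(len(classes))  # number of classes, taken from y rather than from X
```

So the second entry of the shape would be expected to match the number of
features, and the number of classes would show up only in the labels.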

I also tried this:

    import numpy as np
    dataDescrs_array = np.array(dataDescrs)
    print(dataDescrs_array.shape)

which gives (622, 177).

BTW, 177 corresponds to the number of features, and
622 corresponds to the number of samples in my dataset.
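
These numbers look consistent with the split above. A small sketch of the
arithmetic, assuming the test count is rounded up and the remaining samples
go to training (split_sizes is a hypothetical helper written for this
message, not a scikit-learn function):

```python
import math


def split_sizes(n_samples, test_size):
    """Return (n_train, n_test) for a fractional test_size.

    Assumption: the test count is rounded up and whatever is left
    is used for training.
    """
    n_test = int(math.ceil(test_size * n_samples))
    return n_samples - n_test, n_test


n_features = 177
n_train, n_test = split_sizes(622, 0.4)

# Expected shapes of X_train and X_test under the assumption above.
print((n_train, n_features), (n_test, n_features))
```

Under that assumption, 40% of 622 samples rounds up to 249 test rows, which
leaves 373 training rows, each still carrying all 177 features.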


Cheers & thanks so far,
Paul


> 
> i.e., you should be able to feed your data to an estimator as-is. If
> that doesn't work, or gives very surprising results, then please
> report back with some more details of what your data looks like.
> 
> (You might have to call np.array on both; if you do and it works, also
> please report back, as it would likely mean there's an input
> validation bug in one of our estimators.)
> 
> -- 
> Lars Buitinck
> Scientific programmer, ILPS
> University of Amsterdam
> 


_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
