Hi Diego.
Welcome to the list :)
There really is no such thing as a scikit-learn format.
We do everything with numpy arrays (or sparse matrices).
Some of the built-in datasets come as a Bunch object with data and target
attributes, but that is really not necessary.
You just need to have a numpy array of shape (n_samples, n_features),
which we usually call X.
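As a minimal sketch of what that looks like (the variable names and the random values are placeholders I made up, not your data), stacking a list of 200-dimensional vectors gives an array of the right shape:

```python
import numpy as np

# Placeholder data: 50 vectors with 200 features each. The random values
# only stand in for the real, unlabeled vectors.
rng = np.random.RandomState(0)
vectors = [rng.rand(200) for _ in range(50)]

# Stack the vectors row-wise: one row per sample, one column per feature.
X = np.vstack(vectors)

n_samples, n_features = X.shape
print(X.shape)
```

Nothing else needs to be wrapped around X before passing it to an estimator.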
The target (usually called y) is user-specified, i.e. someone sat down,
looked at the digits in the digits dataset and said "this is an 8, this
is a 1".
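To make that concrete, here is a small sketch using the digits dataset that ships with scikit-learn. It only pulls the two arrays out of the returned Bunch; the names X and y are just the usual convention:

```python
from sklearn.datasets import load_digits

# load_digits returns a Bunch: a dict-like container with attribute access.
digits = load_digits()

X = digits.data    # features, shape (n_samples, n_features)
y = digits.target  # hand-assigned labels, one integer per sample

print(X.shape, y.shape)
```

The Bunch is only a convenient wrapper; the estimators themselves work on the plain arrays X and y.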
Does your data come with any labels?
If not, you will not be able to use most of the scores you quoted below:
all of them except silhouette_score compare the clustering against true labels.
(And if it does, why don't you do classification?)
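A rough sketch of the difference, on synthetic data that I generate here purely for illustration (two artificial blobs; the choice of KMeans with n_clusters=2 and the made-up y are assumptions, not a recommendation for your data):

```python
import numpy as np
from sklearn import metrics
from sklearn.cluster import KMeans

# Synthetic stand-in data: two well separated blobs, 200 features each.
rng = np.random.RandomState(0)
X = np.vstack([
    rng.normal(0.0, 1.0, size=(30, 200)),
    rng.normal(5.0, 1.0, size=(30, 200)),
])

estimator = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Needs only the data and the cluster assignments, no ground truth.
silhouette = metrics.silhouette_score(X, estimator.labels_, metric="euclidean")

# Needs true labels. Here y is known only because the data is synthetic.
y = np.array([0] * 30 + [1] * 30)
ari = metrics.adjusted_rand_score(y, estimator.labels_)

print(silhouette, ari)
```

Without a y of your own, only the first kind of score is available.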
Hope that helps.
Andy
On 10/08/2012 10:02 PM, Diego Casado wrote:
Hi guys!!
This is my first time posting here, and I'm new to scikit. Happy to say hi to
everyone! :)
I have a simple question. I'm trying to cluster my own test set of
unlabeled data.
The samples are vectors of 200 dimensions (features). I'd like to
convert this set to the specific format of a scikit dataset (.data, .shape
and .target info).
The idea behind getting the data into this format is to be able to compute the
metrics afterwards, to know how well my model performs on my data. Example:
labels = *.target
metrics.homogeneity_score(labels, estimator.labels_),
metrics.completeness_score(labels, estimator.labels_),
metrics.v_measure_score(labels, estimator.labels_),
metrics.adjusted_rand_score(labels, estimator.labels_),
metrics.adjusted_mutual_info_score(labels, estimator.labels_),
metrics.silhouette_score(data, estimator.labels_,
                         metric='euclidean',
                         sample_size=sample_size)
Loading a bundled dataset or generating samples from the API is simple,
but what method/function should I use if I want to build the same dictionary from
new data? Is there an already implemented call, or should it be done by hand?
If so... how is .target calculated from the incoming data?
Thanks in advance.
Dieguich
------------------------------------------------------------------------------
Don't let slow site performance ruin your business. Deploy New Relic APM
Deploy New Relic app performance management and know exactly
what is happening inside your Ruby, Python, PHP, Java, and .NET app
Try New Relic at no cost today and get our sweet Data Nerd shirt too!
http://p.sf.net/sfu/newrelic-dev2dev
_______________________________________________
Scikit-learn-general mailing list
scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general