2013/5/8 Ark <[email protected]>:
> I am using sgdclassifier for document classification.
[snip]
> -is there a way to predict next best match directly?
The decision_function method returns what you want: scores for the
individual classes, which can be combined with the labels using
something like
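The snippet that followed is cut off in this excerpt. As a minimal sketch of the idea (not the original code), assuming a fitted multiclass classifier named clf and toy data standing in for the document vectors, the scores can be sorted per sample and mapped back to labels through clf.classes_. The helper name ranked_labels is hypothetical. With only two classes, decision_function returns a 1-D array, so this sketch assumes three or more.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

# Toy data standing in for document feature vectors (an assumption for
# illustration; the thread's real data is not shown).
rng = np.random.RandomState(0)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
y = np.repeat(np.arange(3), 30)
X = centers[y] + rng.normal(scale=0.5, size=(90, 2))

clf = SGDClassifier(random_state=0).fit(X, y)


def ranked_labels(clf, X):
    """Return, per sample, all class labels ordered from best to worst score."""
    scores = clf.decision_function(X)    # shape (n_samples, n_classes)
    order = np.argsort(-scores, axis=1)  # column 0 holds the highest score
    return clf.classes_[order]


ranking = ranked_labels(clf, X[:5])
best = ranking[:, 0]       # same choice as clf.predict
next_best = ranking[:, 1]  # the "next best match" for each sample
```

If a category has to be ruled out after post-processing, one option is to walk along each row of ranking and take the first label that has not been tried yet.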
> -or is there a best way to switch to something like knn (which initially
Correction: -or is the best way to switch to something like knn?
I am using sgdclassifier for document classification.
where (n_samples, n_features) = (12000, 50).
In my project, in some cases the chosen category leads to post-processing
the document and predicting again, in which case it should not predict the
same category, but return th
Bao,
To compute the silhouette, scikit-learn precomputes the matrix of
distances between the elements of X (the samples). But it is possible to do
without this matrix and compute the distance between two samples only when
it is needed. This is the most naive implementation of the silhouette. Ther
Hi Alexandre,
Thanks for your feedback. Could you please clarify what you mean by
computing the distance between samples "on the fly"? In my case, the time
requirement is not very strict. If you can explain this, I think it would
be a suitable solution for my case.
Regards,
T.Bao
O
Hi Bao,
If I am not mistaken, precomputing the pairwise distances is a way to
speed up the silhouette computation and make the code simpler. It is also
possible to compute the silhouette by computing the distances between
samples "on the fly", that is, only when each one is needed.
This will be very slow indeed, but no additional memory is required.
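To make this concrete, here is a rough sketch, not scikit-learn's implementation. It assumes Euclidean distance, and the function name silhouette_low_memory is made up for illustration. Distances are computed one sample at a time and then discarded, so the full n_samples x n_samples matrix is never stored.

```python
import numpy as np


def silhouette_low_memory(X, labels):
    """Mean silhouette coefficient without the full pairwise distance matrix.

    Assumes Euclidean distance. Only the distances from one sample to all
    others are held at any time, at the cost of a slow Python-level loop.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    unique = np.unique(labels)
    if unique.size < 2:
        raise ValueError("the silhouette needs at least two clusters")
    masks = {k: labels == k for k in unique}
    scores = np.zeros(X.shape[0])
    for i in range(X.shape[0]):
        # Distances from sample i to every sample, computed on the fly.
        d = np.sqrt(((X - X[i]) ** 2).sum(axis=1))
        own = masks[labels[i]]
        n_own = own.sum()
        if n_own == 1:
            continue  # singleton cluster: score left at 0 by convention
        # Mean distance within the own cluster; d[i] is 0, so dividing by
        # n_own - 1 excludes the sample itself.
        a = d[own].sum() / (n_own - 1)
        # Smallest mean distance to any other cluster.
        b = min(d[masks[k]].mean() for k in unique if k != labels[i])
        scores[i] = (b - a) / max(a, b)
    return scores.mean()


# Tiny usage example: two well-separated clusters on a line.
X_demo = np.array([[0.0], [1.0], [10.0], [11.0]])
labels_demo = np.array([0, 0, 1, 1])
score = silhouette_low_memory(X_demo, labels_demo)
```

The running time is still quadratic in the number of samples; only the memory use changes. A middle ground would be to process the samples in blocks of a few hundred rows.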