Hey.
So scikit-learn treats all of its input as "unordered", and all the
clustering algorithms should work with the binary representation.
Maybe using an algorithm that allows L1 distance would make more sense,
but Jaccard distance would also be interesting.
[There is a gotcha with Jaccard distance, but for binary input it should
be fine.]
Maybe try DBSCAN?
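A rough sketch of what that could look like, in case it helps. The
encoding is an assumption and not something scikit-learn provides: each
data point becomes one binary vector with a feature per pair of bit
positions, set to 1 when some bitset in the point contains both
positions, so the order of the bitsets within a point does not matter.
The helper name encode, the toy data, and eps=0.5 are made up for
illustration and would need tuning on real data.

```python
from itertools import combinations

import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.metrics import pairwise_distances


def encode(point, n_bits):
    """Hypothetical order-invariant binary encoding of a set of bitsets.

    One feature per pair of bit positions (i, j); the feature is True
    when at least one bitset in the point has both bits set.
    """
    pair_index = {p: k for k, p in enumerate(combinations(range(n_bits), 2))}
    features = np.zeros(len(pair_index), dtype=bool)
    for bits in point:
        on = [i for i, ch in enumerate(bits) if ch == "1"]
        for pair in combinations(on, 2):
            features[pair_index[pair]] = True
    return features


# Toy data (made up): the second and fifth points are reorderings of
# the first and fourth, the third is a small variation of the first.
points = [
    ["1011000010", "0100000001", "0000001100", "0000110000"],
    ["0000110000", "0000001100", "0100000001", "1011000010"],
    ["1011000000", "0100000011", "0000001100", "0000110000"],
    ["1100000000", "0011000000", "0000111100", "0000000011"],
    ["0000000011", "0000111100", "0011000000", "1100000000"],
]
X = np.array([encode(p, n_bits=10) for p in points])

# Jaccard distance on boolean vectors. A point whose bitsets each have
# a single bit set would encode to all zeros, where Jaccard is 0/0, so
# such points would need separate handling.
D = pairwise_distances(X, metric="jaccard")

# eps and min_samples are illustrative guesses, not recommendations.
labels = DBSCAN(eps=0.5, min_samples=2, metric="precomputed").fit(D).labels_
print(labels)
```

Precomputing the distance matrix keeps the metric explicit and makes it
easy to swap in metric="manhattan" to try the L1 variant instead.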
Cheers,
Andy
On 04/30/2015 04:32 PM, Paul Frandsen wrote:
Hello,
I'm interested in clustering many unordered sets of bitsets. In
general, a data point would look like: {1011000010, 0100000001,
0000001100, 0000110000}, where every bitset has the same number of
digits and is itself ordered, but the set is unordered. Alternatively (with
this particular data set), I could represent the same data point as a
set of sets of integers: {{0,2,3,8},{1,9},{6,7},{4,5}}. Ideally, I'd
like to use k-means, but I imagine that figuring out centroids would
be difficult. Are there any clustering algorithms in scikit-learn that
could cluster data like these? I've looked through the docs, but I am
coming up short.
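For illustration, the two representations described above can be
converted into each other mechanically. A minimal Python sketch (the
helper name to_index_sets is hypothetical); using frozensets makes the
comparison independent of the order in which the bitsets are listed:

```python
def to_index_sets(bitsets):
    """Turn bitstrings into an unordered set of sets of bit positions."""
    return frozenset(
        frozenset(i for i, ch in enumerate(bits) if ch == "1")
        for bits in bitsets
    )


point = ["1011000010", "0100000001", "0000001100", "0000110000"]
as_sets = to_index_sets(point)

# Listing the same bitsets in another order gives an equal value.
same_point = to_index_sets(list(reversed(point)))
print(as_sets == same_point)
```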
Thank you,
Paul Frandsen
------------------------------------------------------------------------------
One dashboard for servers and applications across Physical-Virtual-Cloud
Widest out-of-the-box monitoring support with 50+ applications
Performance metrics, stats and reports that give you Actionable Insights
Deep dive visibility with transaction tracing using APM Insight.
http://ad.doubleclick.net/ddm/clk/290420510;117567292;y
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general