Hey.
So anything in scikit-learn is "unordered".
All the clustering algorithms should work with the binary representation. An algorithm that allows L1 distance might make more sense, but Jaccard distance would also be interesting. [There is a gotcha with Jaccard distance, but for binary input it should be fine.]
Maybe try DBSCAN?
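
For illustration, here is a minimal sketch of DBSCAN with Jaccard distance on binary input. The thread does not say how to turn a set of bitsets into a single binary vector, so the `encode` helper below is a hypothetical choice: one entry per index pair (i, j), set when i and j share an inner set, which makes the vector independent of the order of the sets. The sample points other than the first, and the `eps` and `min_samples` values, are made up for the demo.

```python
from itertools import combinations

import numpy as np
from scipy.spatial.distance import pdist, squareform
from sklearn.cluster import DBSCAN

N_BITS = 10  # width of each bitset in the example from the question


def encode(point, n_bits=N_BITS):
    """Hypothetical order-invariant encoding (an assumption, not from the thread).

    Returns a boolean vector with one entry per index pair (i, j), i < j,
    that is True when i and j belong to the same inner set.
    """
    pairs = list(combinations(range(n_bits), 2))
    together = {
        pair for group in point for pair in combinations(sorted(group), 2)
    }
    return np.array([pair in together for pair in pairs], dtype=bool)


points = [
    [{0, 2, 3, 8}, {1, 9}, {6, 7}, {4, 5}],     # the example from the question
    [{1, 9}, {4, 5}, {0, 2, 3, 8}, {6, 7}],     # same sets, different order
    [{0, 2, 3}, {8}, {1, 9}, {6, 7}, {4, 5}],   # small variation (made up)
    [{0, 1}, {2, 9}, {3, 4}, {5, 6}, {7, 8}],   # unrelated grouping (made up)
    [{0, 1}, {2, 9}, {3, 4}, {5, 6, 7, 8}],     # close to the previous one
]
X = np.vstack([encode(p) for p in points])

# Pairwise Jaccard distances on the boolean matrix, passed to DBSCAN
# as a precomputed distance matrix.
D = squareform(pdist(X, metric="jaccard"))

# eps and min_samples are arbitrary demo values and need tuning on real data.
labels = DBSCAN(eps=0.6, min_samples=2, metric="precomputed").fit_predict(D)
print(labels)
```

Precomputing the matrix keeps the distance step explicit, so swapping in another metric (for example `"cityblock"` for L1) only changes the `pdist` call.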

Cheers,
Andy

On 04/30/2015 04:32 PM, Paul Frandsen wrote:
Hello,

I'm interested in clustering many unordered sets of bitsets. In general, a data point would look like: {1011000010, 0100000001, 0000001100, 0000110000}, where every bitset has the same number of digits and the digits within a bitset are ordered, but the set itself is unordered. Alternatively (with this particular data set), I could represent the same data point as a set of sets of integers: {{0,2,3,8},{1,9},{6,7},{4,5}}. Ideally, I'd like to use k-means, but I imagine that figuring out centroids would be difficult. Are there any clustering algorithms in scikit-learn that could cluster data like these? I've looked through the docs, but I am coming up short.

Thank you,

Paul Frandsen





_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
