[scikit-learn] [semi-supervised learning] Using a pre-existing graph with LabelSpreading API

Delip Rao Thu, 01 Dec 2016 19:35:50 -0800

Hello,

I have an existing graph dataset in edge-list format:


node_i node_j weight

There are around 3.6M nodes and around 72M edges.
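
For concreteness, here is a minimal sketch of how an edge list like this could be turned into a sparse affinity matrix with SciPy. It assumes the node ids are already integers in the range 0..n_nodes-1 and that the graph is undirected; the helper name edges_to_affinity and the toy triples are hypothetical, not part of the dataset or of scikit-learn. At the sizes above, a symmetric matrix would hold about 144M stored entries, so roughly 1.2 GB with 4-byte values and 4-byte indices, before any temporaries.

```python
import numpy as np
import scipy.sparse as sp


def edges_to_affinity(rows, cols, weights, n_nodes):
    """Build a symmetric CSR affinity matrix from (node_i, node_j, weight) triples.

    Hypothetical helper: assumes integer node ids in [0, n_nodes).
    """
    rows = np.asarray(rows, dtype=np.int64)
    cols = np.asarray(cols, dtype=np.int64)
    # float32 halves the memory used for the weights compared with float64.
    weights = np.asarray(weights, dtype=np.float32)

    # COO is the natural container for triples; converting to CSR
    # sums the weights of any duplicated (i, j) pairs.
    W = sp.coo_matrix((weights, (rows, cols)), shape=(n_nodes, n_nodes)).tocsr()

    # Symmetrize, keeping the larger weight if an edge is listed in both directions.
    W = W.maximum(W.T)
    return W.tocsr()


# Toy example with three weighted edges on four nodes.
toy_rows = [0, 1, 2]
toy_cols = [1, 2, 3]
toy_weights = [1.0, 0.5, 2.0]
W_toy = edges_to_affinity(toy_rows, toy_cols, toy_weights, n_nodes=4)
```

For the real file, the three columns would first have to be read into arrays (for example in chunks), which is left out here.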

I also have some labeled data (around a dozen per class with 16 classes in
total), so overall, a perfect setting for label propagation or its
variants. In particular, I want to try the LabelSpreading implementation
for the regularization. I looked at the documentation and can't find a way
to plug in a pre-computed graph (or adjacency matrix). So two questions:

1. What scaling issues should I be aware of for a dataset of this size? I
can try sparsifying the graph, but would love to learn about any knobs I
should know of.
2. How do I plug in an existing weighted graph with the current API? Happy
to use any undocumented features.

Thanks in advance!
Delip

_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn
