Heya, sorry for not responding sooner.

Running those algorithms is expensive (O(n^3), from memory), so
that's going to be a big limiting factor, and I worry that your graph may
be too big for them. The max_iter param is available for tuning; it
trades accuracy of the result against running time. Totally speculating:
I don't think sparsifying would help much with these implementations.
These both create fully connected graphs as part of the graph construction
step. I think sparsification would help a lot if you directly simulated
the particle movements through the graph, rather than using these exact
solutions.
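
To make that concrete, here is a minimal sketch of iterating directly on a
sparse graph. It is not a literal particle simulation: it applies the
standard label spreading update F <- alpha * S * F + (1 - alpha) * Y0 with
a scipy.sparse matrix, so each pass costs roughly the number of edges times
the number of classes. The function name spread_labels and its arguments
are made up for illustration, and it assumes edge weights are non-negative
and unlabeled nodes are marked with -1.

```python
import numpy as np
from scipy import sparse


def spread_labels(rows, cols, weights, y, n_nodes,
                  alpha=0.2, max_iter=30, tol=1e-3):
    """Iterative label spreading on a sparse weighted graph (illustrative).

    rows, cols, weights: the edge list (node_i, node_j, weight).
    y: class labels per node, with -1 marking unlabeled nodes.
    """
    W = sparse.coo_matrix((weights, (rows, cols)),
                          shape=(n_nodes, n_nodes)).tocsr()
    W = W.maximum(W.T)  # symmetrize, in case each edge is listed once

    # Symmetric normalization S = D^-1/2 W D^-1/2; S stays sparse.
    degree = np.asarray(W.sum(axis=1)).ravel()
    degree[degree == 0] = 1.0  # isolated nodes: avoid division by zero
    d_inv_sqrt = sparse.diags(1.0 / np.sqrt(degree))
    S = d_inv_sqrt @ W @ d_inv_sqrt

    # One-hot matrix of the known labels; unlabeled rows stay all zero.
    y = np.asarray(y)
    classes = np.unique(y[y != -1])
    Y0 = np.zeros((n_nodes, classes.size))
    for k, c in enumerate(classes):
        Y0[y == c, k] = 1.0

    F = Y0.copy()
    for _ in range(max_iter):
        F_next = alpha * (S @ F) + (1 - alpha) * Y0
        delta = np.abs(F_next - F).sum()
        F = F_next
        if delta < tol:  # stop early once the scores stop changing
            break
    return classes[F.argmax(axis=1)]
```

Nodes that no labeled node can reach keep all-zero scores, so argmax
assigns them the first class; you would probably want to treat those
separately.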

For #2, what if you subclassed the LabelSpreading class and overrode
_build_graph
<https://github.com/scikit-learn/scikit-learn/blob/a5ab948/sklearn/semi_supervised/label_propagation.py#L449>
to inject the graph that you set up? It may be a big hack.
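
Roughly what I have in mind, as an untested-at-scale sketch. The class name
PrecomputedLabelSpreading and its affinity parameter are hypothetical, and
_build_graph is a private method, so this may break between scikit-learn
versions. It assumes the affinity matrix is symmetric with non-negative
weights, and that fit accepts a sparse matrix back from _build_graph.

```python
import numpy as np
from scipy import sparse
from sklearn.semi_supervised import LabelSpreading


class PrecomputedLabelSpreading(LabelSpreading):
    """LabelSpreading driven by a caller-supplied affinity matrix.

    Hypothetical helper: relies on the private _build_graph hook.
    """

    def __init__(self, affinity, alpha=0.2, max_iter=30, tol=1e-3):
        super().__init__(alpha=alpha, max_iter=max_iter, tol=tol)
        self.affinity = affinity

    def _build_graph(self):
        # Skip the kernel on X entirely and return the normalized
        # affinity D^-1/2 W D^-1/2 built from the supplied graph.
        W = sparse.csr_matrix(self.affinity, dtype=float)
        degree = np.asarray(W.sum(axis=1)).ravel()
        degree[degree == 0] = 1.0  # isolated nodes
        d_inv_sqrt = sparse.diags(1.0 / np.sqrt(degree))
        return (d_inv_sqrt @ W @ d_inv_sqrt).tocsr()


def fit_on_graph(affinity, y, **kwargs):
    """Fit on a graph; X is only a placeholder with one row per node."""
    model = PrecomputedLabelSpreading(affinity, **kwargs)
    dummy_X = np.zeros((affinity.shape[0], 1))
    model.fit(dummy_X, y)  # y uses -1 for unlabeled nodes
    return model
```

After fitting, only transduction_ and label_distributions_ would be
meaningful. predict on new points still goes through the kernel on the
placeholder X, so I would not rely on it.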

On Thu, Dec 1, 2016 at 7:33 PM, Delip Rao <delip...@gmail.com> wrote:

> Hello,
>
> I have an existing graph dataset in the edge format:
>
> node_i node_j weight
>
> The number of nodes are around 3.6M, and the number of edges are around
> 72M.
>
> I also have some labeled data (around a dozen per class with 16 classes in
> total), so overall, a perfect setting for label propagation or its
> variants. In particular, I want to try the LabelSpreading implementation
> for the regularization. I looked at the documentation and can't find a way
> to plug in a pre-computed graph (or adjacency matrix). So two questions:
>
> 1. What are any scaling issues I should be aware of for a dataset of this
> size? I can try sparsifying the graph, but would love to learn any knobs I
> should be aware of.
> 2. How do I plugin an existing weighted graph with the current API? Happy
> to use any undocumented features.
>
> Thanks in advance!
> Delip
>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn@python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
>
>
