Am 08.04.20 um 00:23 schrieb Deklan Webster: > Thanks for the quick reply > >> Note that link prediction itself is a valid criterion for model > selection, so you might just want to focus on that. > > Not sure how to interpret this. Are you saying that even if the > non-Poisson model has lower entropy, the Poisson model may still be > 'better' for community detection if it's better at link prediction > (which you suggest is most likely the case)?
Yes, it may have better *evidence* (which corresponds to the sum over many partitions), i.e. despite not yielding the best description length (which is the property of a single partition). Please read section IX of this book chapter: https://arxiv.org/abs/1705.10225 >> You should use logsumexp(): > Wouldn't that be non-incremental? Wouldn't the incremental version of > this be summing the exp(a) of the results at each step, and then taking > log at the end? That is exactly what this does, but in a numerically stable manner. > I'm assuming that `u = s.collect_marginal(u)` is taking care of counting > the non-edge appearances over every sample (and not just the last one). > I assume if u.edge(x, y) is None that means the non-edge was never > observed, correct? Yes. > Well, I ran it just now on a smaller graph to test. The graph is > directed with 475 nodes and 20,911 edges. Intuitively I would expect a > reasonable number of non-edges to be observed given that there are 21k > edges... maybe 500~ at least. Running that code above I see that > `non_edge_found_c` is 13. Only 13 non-edges are observed with non-zero > probability? The largest of those 13 probabilities is 0.07. And, > edge_not_found_c is 2. How should I interpret this situation where the > edge is in the original but not in (any of?) the marginals? > > Am I doing something wrong here? Do I need to adjust n, x, n_default, > x_default? If you actually know the number of edges removed, you can supply this to the algorithm via fn_params which controls the prior for the missing edge probabilities, by setting alpha and beta. If you have removed a fraction f of the edges, you need to choose alpha/(alpha+beta) = f, and the higher the values of both alpha and beta are, the stronger will be the role of the prior (which is a beta distribution). If you don't supply anything, a flat prior will be used, which means the inference will be guided by the network structure alone. Best, Tiago -- Tiago de Paula Peixoto <[email protected]> _______________________________________________ graph-tool mailing list [email protected] https://lists.skewed.de/mailman/listinfo/graph-tool
