> It's a sum in log-space, so it's incremental.

Can you show me in code how to use logsumexp in such a way that I don't
need to store 6,000,000 * k floats in memory, where k is the number of
sampling iterations?
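
To make sure I understand, here is a rough sketch of what I imagine you
mean, keeping only one running log-sum per pair. The edge_log_prob function
is just a stand-in for however the conditional log-probability is obtained
at a given sweep, and the pairs and values are dummies:

    import numpy as np
    from scipy.special import logsumexp

    def edge_log_prob(u, v):
        # stand-in for the conditional log-probability of (u, v) at the current sweep
        return -np.random.exponential()

    pairs = [(0, 1), (0, 2), (1, 2)]      # stand-in for my ~6M candidate pairs
    k = 100                               # number of sampling iterations

    acc = np.full(len(pairs), -np.inf)    # one running logsumexp per pair
    for _ in range(k):
        # ... one MCMC sweep would happen here ...
        for i, (u, v) in enumerate(pairs):
            lp = edge_log_prob(u, v)
            acc[i] = logsumexp([acc[i], lp])   # acc = log(exp(acc) + exp(lp))

    log_marginal = acc - np.log(k)        # log of the averaged probability

If that is the idea, memory stays at one float per pair rather than
6,000,000 * k.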

> Most other link prediction methods only provide a ranking or scores of
the edges. In order to actually predict edges, all these methods require
you to determine how many edges should be actually placed.

In your setup you still end up with a probability in the end. The
probability cutoff above which you consider something a "real missing edge"
is still arbitrary, correct? Using a binary classifier with whatever link
prediction features, you can also get probabilities in the end. For
instance, relating to your point about choosing that f ratio, I tallied the
predicted probabilities to get an estimate of the number of "actually
missing" edges (a sketch of how I tally and check this is included below).
I've confirmed that these probabilities are fairly well calibrated by
removing edges, predicting them, and so on.

39 with probability above 0.80
143 with probability above 0.70
318 with probability above 0.60

(Funny note, the 20th most likely link prediction is davidlazer ->
tiagopeixoto. This small network I'm testing with is my Twitter network.)
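
(For what it's worth, here is roughly how I tally those counts and check
calibration; the probabilities and held-out labels below are dummies
standing in for my classifier's output:)

    import numpy as np
    from sklearn.calibration import calibration_curve

    rng = np.random.default_rng(0)

    # dummy stand-in for the predicted probabilities of the candidate non-edges
    probs = rng.beta(1, 20, size=129_971)
    for t in (0.80, 0.70, 0.60):
        print(f"{np.sum(probs > t)} with probability above {t:.2f}")

    # rough calibration check on a held-out set of removed edges vs. true non-edges
    y_true = rng.integers(0, 2, size=5000)   # dummy held-out labels
    y_prob = rng.random(5000)                # dummy predicted probabilities
    frac_pos, mean_pred = calibration_curve(y_true, y_prob, n_bins=10)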

These counts are only over the 129,971 non-edges at distance 2. So,
roughly, somewhere around 100-400 missing edges seems reasonable. The
network reconstruction method with defaults and 1000 sampling iterations
gave me only 13 non-edges with non-zero probability, the highest of which
is 0.07.
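
(In case it matters, this is roughly how I enumerate the distance-2
non-edges; a sketch that uses a built-in dataset as a stand-in for my graph
and ignores edge direction:)

    import itertools
    import graph_tool.all as gt

    g = gt.collection.data["football"]    # stand-in for my Twitter graph

    # adjacency sets, ignoring direction
    adj = {int(v): set(map(int, g.get_all_neighbors(v))) for v in g.vertices()}

    # pairs that share at least one neighbour but are not themselves connected
    candidates = set()
    for v, nbrs in adj.items():
        for u, w in itertools.combinations(sorted(nbrs), 2):
            if w not in adj[u]:
                candidates.add((u, w))

    print(len(candidates), "non-edges at distance 2")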

> You should also try to consider using a different framework, where you
leave the missing edges "unobserved", instead of transforming them into
nonedges. The distinction is important, and changes the reconstruction
problem. You can achieve this with MeasuredBlockState by setting the
corresponding n value to zero.

Do you mean setting n_default=0? I just tried that, and it appears that
almost all of my existing edges are missing from the marginal graph. Not
sure what's going on there. I tried setting mu and nu to extreme values to
control this(?), but it doesn't appear to help much. With alpha=1, beta=1,
mu=1, nu=9999, 20,733 out of 20,911 of my original edges are missing.
What's happening here?
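
For reference, here is roughly how I am setting this up; a sketch, where I
am assuming the fn_params/fp_params keyword names for the (alpha, beta) and
(mu, nu) hyperparameters and collect_marginal() for accumulating the
marginal graph, going by my reading of the docs:

    import graph_tool.all as gt

    g = gt.collection.data["lesmis"].copy()   # stand-in for my observed graph

    n = g.new_ep("int", 1)    # each observed edge measured once...
    x = g.new_ep("int", 1)    # ...and seen once
    state = gt.MeasuredBlockState(g, n=n, x=x,
                                  n_default=0,   # leave non-edges unobserved
                                  x_default=0,
                                  fn_params=dict(alpha=1, beta=1),  # prior on false-negative prob
                                  fp_params=dict(mu=1, nu=9999))    # prior on false-positive prob

    u = None   # marginal graph accumulated during sampling
    def collect_marginals(s):
        global u
        u = s.collect_marginal(u)

    gt.mcmc_equilibrate(state, force_niter=1000, mcmc_args=dict(niter=10),
                        callback=collect_marginals)

    eprob = u.ep.eprob   # marginal edge probabilities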

> A bound prior like this is not possible, but you can get something close
with the beta prior, where it is centered on a specific value, say 0.1
with an arbitrary variance.

With this I do, at least, seem to be getting non-zero probabilities for
more edges.

Let m be the number of non-edges found with non-zero probability, and n be
the number of edges in the original graph not found in the marginal graph.
The following results are after 1000 sampling iterations:

alpha=10, beta=90, m=349, n=4
alpha=100, beta=900, m=2381, n=0
alpha=20, beta=80, m=632, n=17
alpha=200, beta=800, m=4031, n=10

Okay, so to summarize what's happening here: if alpha and beta are both 1,
then the prior on p is uniform on [0,1], and it appears the model favors a
very low p for my graph. By setting alpha and beta I can control the prior
distribution of p. By keeping the ratio alpha/(alpha+beta) fixed while
increasing alpha and beta, I decrease the variance, which forces p to stay
higher (for the same mean), since the model wants to push it into the lower
tail. Is that correct?
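
To check that intuition numerically, here is a quick look at the Beta
prior's mean and spread for the settings above (just scipy's moments):

    from scipy.stats import beta

    # same mean alpha / (alpha + beta); the spread shrinks as both parameters grow
    for a, b in [(1, 1), (10, 90), (100, 900), (20, 80), (200, 800)]:
        mean, var = beta.stats(a, b, moments="mv")
        print(f"alpha={a:<4} beta={b:<4} mean={float(mean):.2f} sd={float(var) ** 0.5:.4f}")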

Is there any reason not to set something very aggressive like alpha=500,
beta=500? I just tested this as a feature in my binary classifier with 5000
sampling iterations, and it performed much worse than the conditional
probability setup (with only 100 sampling iterations). I expected this
marginal probability setup to perform comparably to the conditional
probability setup. Thoughts?

Thanks for your help, as always

On Thu, Apr 9, 2020 at 9:29 AM Tiago de Paula Peixoto <[email protected]>
wrote:

> Am 09.04.20 um 02:01 schrieb Deklan Webster:
> >> That is exactly what this does, but in a numerically stable manner.
> > Not seeing how the function you linked is incremental. It seems to
> > require collecting every conditional probability in memory for every
> > vertex pair. I'm collecting for ~6m pairs on my laptop, that would blow
> > my RAM. (If I'm understanding correctly)
>
> It's a sum in log-space, so it's incremental.
>
> Doing
>
>    x = logsumexp([x, y])
>
> is the same as
>
>    x = log(exp(x) + exp(y))
>
> but is more accurate and does not overflow.
>
> >> If you don't supply anything, a flat prior will be used, which means the
> >> inference will be guided by the network structure alone.
> >
> > When I use the conditional probability method it too only uses the
> > network structure, correct?
>
> If you use a flat prior, the number of removed edges is inferred from
> the network structure. If you supply it via the prior, then this will be
> used.
>
> > This smaller graph I'm testing on has 225,150 possible edges. There are
> > 20,911 actual edges. So, there are 204,239 possible additional edges.
> > Finding 13 potential edges out of 204,239 seems extremely off (even
> > using network structure alone; indeed all the other link prediction
> > methods I'm using only use network structure). I could test this by
> > removing 2000 and then seeing how many more it predicts, but I'm fairly
> > certain already it will be similarly low.
>
> Most other link prediction methods only provide a ranking or scores of
> the edges. In order to actually predict edges, all these methods require
> you to determine how many edges should be actually placed.
>
> The method in MeasuredBlockState() actually attempts to perform
> reconstruction, with a full posterior distribution of networks
> conditioned on the observed data. This includes the actual number of
> missing/spurious edges.
>
> In order for the reconstruction to be successful, there needs to be
> sufficient information in the data. For example, if the original network
> is completely random, then removing random edges leaves no trace of the
> original structure, and then reconstruction fails. In some cases (as I
> have shown in https://dx.doi.org/10.1103/PhysRevX.8.041011) the
> reconstruction works very well, but it may fail if the distortion is not
> noticeable with a SBM.
>
> You should also try to consider using a different framework, where you
> leave the missing edges "unobserved", instead of transforming them into
> nonedges. The distinction is important, and changes the reconstruction
> problem. You can achieve this with MeasuredBlockState by setting the
> corresponding n value to zero.
>
> > In the context of exploratory link prediction, recommender systems for
> > social networks, etc I don't think there is a principled way to give a
> > "deleted fraction". I don't know how many links are "missing" but it's
> > certainly not 13. I assume I could just assert something more aggressive
> > arbitrarily and get a better result. Would it not be more principled to
> > have a single 'non-edge aggressiveness parameter' and then make a
> > uniform prior (hyperprior?) for it over some reasonable range, perhaps
> > that f ratio you mention? Say, uniformly between [0.01, 0.2]. Is that
> > doable with graph-tool?
>
> A bound prior like this is not possible, but you can get something close
> with the beta prior, where it is centered on a specific value, say 0.1
> with an arbitrary variance.
>
> Best,
> Tiago
>
> --
> Tiago de Paula Peixoto <[email protected]>
>
_______________________________________________
graph-tool mailing list
[email protected]
https://lists.skewed.de/mailman/listinfo/graph-tool
