Err, I meant, of course:

```
Directed:   ~13/1000 correctly identified
Undirected: 488/1000 correctly identified
```
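
For anyone hitting the same thing, here is a minimal sketch of the undirected workaround, following the measured-network reconstruction recipe from the graph-tool inference HOWTO; the polblogs graph, the uniform n = x = 2 measurement counts, and the sweep counts are placeholder assumptions, not the exact settings from my tests:

```python
import graph_tool.all as gt

# Placeholder example graph; polblogs is directed in the collection.
g = gt.collection.data["polblogs"].copy()
gt.remove_self_loops(g)
gt.remove_parallel_edges(g)

# The workaround: treat the graph as undirected before reconstruction.
g.set_directed(False)

# Uniform measurement counts, as in the HOWTO example (assumed values).
n = g.new_ep("int", 2)   # number of measurements per observed edge
x = g.new_ep("int", 2)   # number of positive observations per observed edge
state = gt.MeasuredBlockState(g, n=n, x=x, n_default=1, x_default=0)

# Equilibrate, then collect marginal posterior edge probabilities.
gt.mcmc_equilibrate(state, wait=1000, mcmc_args=dict(niter=10))

u = None
def collect_marginals(s):
    global u
    u = s.collect_marginal(u)

gt.mcmc_equilibrate(state, force_niter=10000, mcmc_args=dict(niter=10),
                    callback=collect_marginals)

eprob = u.ep.eprob   # marginal probability of each candidate edge in u
```

The only change between the two runs compared above is whether `set_directed(False)` is applied before building the state.
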
On Wed, Apr 15, 2020 at 8:54 PM Deklan Webster <[email protected]> wrote:

> I, totally by accident, think I discovered the issue! In the process of setting up the testing between the reconstructive method, the `get_edges_prob` method, and the simple stacked topological similarity classifier, I accidentally set the graph to undirected without setting it back to directed. The results finally appeared to be reasonable. I've gone back and tested on the graph I've been testing on, and indeed:
>
> Undirected: ~13/1000 correctly identified
> Directed: 488/1000 correctly identified
>
> A giant improvement! Of course, I'm relieved, since this solves my problem (I can just, at least, set the graph to undirected to compute this). But now I'm wondering what's going on. Is there some predictable reason why the directed case is performing so much worse? Something to do with directed graphs and equilibration? But if it were to do with equilibration, why did this issue never affect `get_edges_prob`'s performance? Could there be a bug in how graph-tool is handling the directed case? In the paper it seems you've worked with the directed case before, so I assume you would have noticed if there were.
>
> To be sure, I tested this on the "polblogs" graph as well. Undirected: reconstructive performance is great; directed: terrible.
>
> > You didn't even have to read any paper, just the docstring would have been enough.
> >
> > Beta is the inverse temperature parameter, and setting it to infinity means turning the MCMC into a greedy optimization heuristic.
> >
> > And I don't "recommend it". It is not applicable to your context.
>
> I've read the documentation. The docstring says "Inverse temperature". The relevant section of the docs says:
>
> > An alternative is to use a greedy algorithm based on a merge-split MCMC with zero temperature [peixoto-merge-split-2020] <https://graph-tool.skewed.de/static/doc/demos/inference/inference.html#peixoto-merge-split-2020>, which requires a single global state, and thus can reduce memory usage.
>
> Just from this I don't understand. Not setting beta to infinity has already reduced memory usage for me. I don't understand the significance of this being greedy, or why it doesn't apply to my situation. If the explanation of all this is in one of the papers, please point to it and I'll read it.
>
> > To be honest, I think the pattern of saying "I plan to read your documentation/paper at some point, but could you please just explain this to me before I do so" is a bit disrespectful. Why is my time worth less than yours?
>
> I'm sorry it seems that way to you. I have read all of the documentation at least once, and some parts of the papers. I don't think my time is more valuable. I would happily volunteer to contribute to the docs or to the project in general in any way I can. I've found some more typos in the docs in addition to the one I already told you about; I can at least contribute these. Also, I'll email you those paper typos after this.
>
> > This kind of comparison has been done already in https://arxiv.org/abs/1802.10582 and https://arxiv.org/abs/1909.07578. The SBM approach is the single best classifier among the over a hundred they consider, which is marginally beaten only by a stacking of around 40 other predictors.
>
> I've read both these papers and referred to the latter twice in this thread. We've already discussed that these use neither `get_edges_prob` nor the full reconstructive marginal edge probability.
>
> On page 8 of the former, they're using this as their similarity for the SBM: "s_ij = θ_i θ_j * l_{g_i, g_j}"
>
> As I've already told you, the "score" I'm getting with `get_edges_prob` is also the *most impactful feature of all the features I've tested* (including many newer network embedding methods not considered in the aforementioned papers). This is impressive. Those papers didn't test this, as far as I'm aware. The reconstructive approach, however, was giving me worse results. This discrepancy is what I was trying to account for (and did, I now think; see above).
>
> > In any case that was not the point of the PRX paper, which was to develop an actual Bayesian reconstruction algorithm, not a binary classifier. AFAIK there is no other algorithm that does this, so there was nothing to compare to. If you're worried about comparing with binary classifiers, you can just convert this approach into one by using the marginal posterior probabilities as "scores" and go from there, as the papers above do. Then you are comparing apples to apples.
>
> Yes, this is my intention, as I've stated.
>
> Thanks for your help, as always
>
> On Wed, Apr 15, 2020 at 2:04 AM Tiago de Paula Peixoto <[email protected]> wrote:
>
>> On 15.04.20 at 05:40, Deklan Webster wrote:
>> > Most typically, edge prediction is formulated as a binary classification task [7], in which each missing (or spurious) edge is attributed a "score" (which may or may not be a probability), so that those that reach a prespecified discrimination threshold are classified as true edges (or true nonedges)
>> >
>> > I think this is misleading. This is not typical. As far as I can tell, very few people use unsupervised binary classification for link prediction. Most typically, edge prediction is formulated as a *supervised* binary classification task. From that setup, you can calibrate probabilities based on ability to predict.
>>
>> It's not misleading at all, this is exactly how a binary classifier works, supervised or otherwise. How you find the threshold is beside the point.
>>
>> > Indeed selecting the optimal threshold can be done by cross validation in a *supervised* setting, but even then this will depend in general on the fraction of removed edges, size of test set, etc.
>> >
>> > I agree, this is a valid concern. If this is your objection to the *supervised* binary classification formulation then I think this should be the statement in the paper.
>>
>> This is just another difference. And I wasn't "objecting", just explaining what you had misunderstood.
>>
>> > Well, I have run it multiple times with different numbers. To be sure, I just now ran it with the epsilon removed, 2000 wait, multiflip on, and then 100k(!) sampling iterations. Results were pretty much the same.
>>
>> It's a shame.
>>
>> > I noticed that in the docs you now recommend setting beta=np.inf when using multiflip. What exactly is this doing? (I plan to soon read that other paper you mention, which probably includes this...)
>>
>> You didn't even have to read any paper, just the docstring would have been enough.
>>
>> Beta is the inverse temperature parameter, and setting it to infinity means turning the MCMC into a greedy optimization heuristic.
>>
>> And I don't "recommend it". It is not applicable to your context.
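
(For context, this is how I understand the two modes: a minimal sketch adapted from the merge-split example in the inference HOWTO; the example graph and the sweep counts are arbitrary placeholders.)

```python
import numpy as np
import graph_tool.all as gt

g = gt.collection.data["football"]   # placeholder graph
state = gt.NestedBlockState(g)

# beta = 1 (the default): the merge-split chain samples from the posterior,
# which is what posterior averages (marginals, get_edges_prob) require.
for _ in range(100):
    state.multiflip_mcmc_sweep(beta=1, niter=10)

# beta = np.inf: the zero-temperature limit, i.e. a greedy heuristic that
# essentially only accepts moves improving the description length; it
# converges to a single partition and is meant for optimization, not sampling.
for _ in range(100):
    state.multiflip_mcmc_sweep(beta=np.inf, niter=10)
```
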
>>
>> To be honest, I think the pattern of saying "I plan to read your documentation/paper at some point, but could you please just explain this to me before I do so" is a bit disrespectful. Why is my time worth less than yours?
>>
>> > I noticed that in your paper you didn't compare your reconstruction method's performance against any baseline. How do you know how well it's performing if you don't have a baseline? I'm currently pessimistic given its performance on the graph I'm testing. Some of those graphs you were testing on in the paper are loaded into graph-tool, right? It would be fairly easy to train a RandomForest (no fancy boosted trees necessary) with the stacked similarity measures from graph-tool (and maybe a few other simple features I have in mind...) and test the performance against your reconstruction approach (at least for just link prediction). Interested in this? Conjectures? I would be willing to do it for some of the moderately-sized graphs.
>>
>> This kind of comparison has been done already in https://arxiv.org/abs/1802.10582 and https://arxiv.org/abs/1909.07578. The SBM approach is the single best classifier among the over a hundred they consider, which is marginally beaten only by a stacking of around 40 other predictors.
>>
>> In any case that was not the point of the PRX paper, which was to develop an actual Bayesian reconstruction algorithm, not a binary classifier. AFAIK there is no other algorithm that does this, so there was nothing to compare to. If you're worried about comparing with binary classifiers, you can just convert this approach into one by using the marginal posterior probabilities as "scores" and go from there, as the papers above do. Then you are comparing apples to apples.
>>
>> If you have further questions about how to use the library, please go ahead and ask. But if you want to discuss how to compare supervised vs unsupervised edge prediction, etc., please take this off this list since it's off-topic.
>>
>> Best,
>> Tiago
>>
>> --
>> Tiago de Paula Peixoto <[email protected]>
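
For the record, here is the rough shape of the apples-to-apples comparison I have in mind, under stated assumptions: `u` is the marginal graph collected during reconstruction (as in the sketch above), `candidates` is a hypothetical list of held-out vertex-index pairs with labels `y_true`, and the similarity choices and forest size are only illustrative:

```python
import numpy as np
import graph_tool.all as gt
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score

def marginal_scores(u, candidates):
    """Marginal posterior probability for each candidate pair (0 if never sampled)."""
    eprob = u.ep.eprob
    out = []
    for s, t in candidates:
        e = u.edge(s, t)
        out.append(eprob[e] if e is not None else 0.0)
    return np.array(out)

def similarity_features(g, candidates):
    """Stacked topological similarities as a feature matrix (illustrative choice)."""
    sims = ["jaccard", "dice", "inv-log-weight"]
    return np.column_stack([gt.vertex_similarity(g, s, vertex_pairs=candidates)
                            for s in sims])

# Hypothetical usage, assuming candidates/y_true come from a held-out edge split
# and (train_pairs, y_train) from a separate training split:
#   auc_sbm = roc_auc_score(y_true, marginal_scores(u, candidates))
#   clf = RandomForestClassifier(n_estimators=500)
#   clf.fit(similarity_features(g, train_pairs), y_train)
#   auc_rf = roc_auc_score(y_true,
#                          clf.predict_proba(similarity_features(g, candidates))[:, 1])
```

Both approaches then yield one score per candidate pair, so they can be compared with the same threshold-free metric.
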
_______________________________________________
graph-tool mailing list
[email protected]
https://lists.skewed.de/mailman/listinfo/graph-tool
