#19031: New Algorithm for Top-K Closeness Centralities
-------------------------------------+-------------------------------------
Reporter: borassi | Owner:
Type: enhancement | Status: needs_review
Priority: major | Milestone: sage-6.9
Component: graph theory | Resolution:
Keywords: Closeness | Merged in:
Centrality, top-k | Reviewers:
Authors: Michele Borassi | Work issues:
Report Upstream: N/A | Commit:
Branch: | 06d2c00fbfdf6f0166834b68e5e9b0929c5abb89
u/borassi/new_algorithm_for_top_k_closeness_centralities| Stopgaps:
Dependencies: #19007,#19014 |
-------------------------------------+-------------------------------------
Comment (by borassi):
> > > - What's the need for using type `uint64_t` instead of `int`? the
corresponding variables are later cast to `int`.
> > >
> > >
> > >
> > >
> >
> > Well, it's a bit tricky. The point is that in the recursion for
`reachU_scc` I might find values that are much bigger than `g.n`. Since I
did not want to test at each step if this occurs, I used `uint64_t` in
order to avoid overflow (these values are smaller than `g.n^2`). Then,
only at the end, I use min (so that now the values do not overflow), and I
set `reachU[i] = <int> min(reachU_scc[scc[i]], g.n)`. However, you are
right with `reachL`, and I modified it.
> >
> >
> >
> >
> Is it mentioned somewhere?
>
I added two inline comments to explain it.
> > > - the tests for the validity of the results are excessive. There is
no need for the loop `range(1000)`. Testing for *one* randomly chosen
graph with say `n=20` nodes and `random.randint(1, n*(n-1) / 2)` edges is
sufficient.
> > >
> > >
> > >
> > >
> >
> > Done! I left it because I had to test it 100,000 times to find the
last bug...
> >
> >
> >
> >
> That's a lot!
> However, I don't like the `n = random.randint(2,20)`. You could safely
let `n=20`, no?
Done!
> Can we always assume that farness[v]>0? This is for the return
sorted([(1.0/farness[v],....
Yep.
The only instructions that modify the farness are `farness[x] = n` (if a
visit is stopped) and `farness[x] = ((<double> f) * (n-1)) /
(<double>(nd-1) * (nd-1))`, where `f` is a (positive) sum of distances,
`nd-1` is positive because `nd` is the number of reachable vertices from a
vertex `x`. If `nd==1`, it means that the out-degree of `x` is 0, and we
skip the analysis of `x`, so this case never occurs. Then, `n>=nd`, and
hence also `n-1` is positive.
> Instead of `G.vertices()[v]`, you should add a line like `cdef list V =
G.vertices()` before the return statement and then use `V[v]`.
Done!
> About the behavior when `k>=n`. What you propose is in fact to return
the centrality closeness (as a list of couples (centrality,vertex) ).
> - Is your method faster than `centrality_closeness` in this case?
No, it should be the same (actually, a bit slower because we make some
computations during the BFSes).
> - This behavior should be documented for the parameter `k`
Done!
--
Ticket URL: <http://trac.sagemath.org/ticket/19031#comment:11>
Sage <http://www.sagemath.org>
Sage: Creating a Viable Open Source Alternative to Magma, Maple, Mathematica,
and MATLAB
--
You received this message because you are subscribed to the Google Groups
"sage-trac" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/sage-trac.
For more options, visit https://groups.google.com/d/optout.