[igraph] status of the Viger-Latapy sampling method for graphs with a given degree sequence

2018-03-06 Thread Szabolcs Horvát
Hello everyone,

I am looking at the various methods available in igraph to *uniformly*
sample random graphs with a given degree sequence.  The obvious candidate
function is  igraph_degree_sequence_game().

http://igraph.org/c/doc/igraph-Generators.html#igraph_degree_sequence_game

This function provides three generation methods:

"SIMPLE" is explained, and it's clear that the sampling is not uniform.
Also, this method allows multigraphs and self-loops.

"SIMPLE_NO_MULTIPLE" is explicitly mentioned as not uniform.

What remains is the Viger-Latapy method. The link here is broken, but it's
easy to google up the original paper, https://arxiv.org/pdf/cs/0502085.pdf,
the abstract of which says:

"We address here the problem of generating random graphs uniformly from the
set of simple connected graphs having a prescribed degree sequence."

While I didn't read the entire paper, the abstract suggests that this
method should sample uniformly from the set of *connected* simple graphs.
However, this does not appear to be the case in a simple test.

Consider the degree sequence (1, 2, 1, 2).  The only two simple graphs with
this degree sequence are:


But the Viger-Latapy method, as implemented in igraph, will generate only
the second one.

Let's look at a more complicated example, the sequence (1, 2, 2, 2, 1).
Here's the list of such graphs (one of which is not connected):


The Viger-Latapy method generates only these:



Within this set, the sampling is indeed uniform, but there are three
connected graphs which are never generated.

*Question:*

Is the Viger-Latapy method known to be flawed, or is there something I'm
missing here?  There doesn't seem to be a peer-reviewed publication about
this.

Szabolcs

*P.S. *A method that does work in practice is generating a single
realization of the degree sequence, then using igraph_rewire() on it. What
is unclear in such situations is how many rewiring steps are necessary to
approximate uniform sampling.
___
igraph-help mailing list
igraph-help@nongnu.org
https://lists.nongnu.org/mailman/listinfo/igraph-help


[igraph] FW: R multilevel.community only shows 2 levels

2018-03-06 Thread Heim D .
Just to let you know that I found a solution to my problem. The function I was 
looking for is 'cut_at()'.
So the simple line
> membership <- cut_at(com, 5)
gives me the desired membership for level 5.

Hope this will help someone else in the future.

Cheers,
Thea


From: igraph-help 
[mailto:igraph-help-bounces+dh=geodata.soton.ac...@nongnu.org] On Behalf Of 
Heim D.
Sent: 04 March 2018 13:08
To: igraph-help@nongnu.org
Subject: [igraph] R multilevel.community only shows 2 levels


Hi,

I am new to Community Detection and to make things harder I am also a beginner 
in R - please keep this in mind when you kindly answer.

I have a undirected graph with 2198 nodes and 1380771 edges with weights, I use 
the walktrap algorithm to detect communities which will give me 10 communities, 
which is the partitioning with the best modularity value. However, I want to 
extract the community membership and the modularity for each level/partition. I 
believe I need to use the multilevel.communities function,but it does not seem 
to give me the expected result.

The data I use is here: 
https://www.dropbox.com/s/833784qseu7ybyc/IntLargerZeroFlows.csv?dl=0



Current code

### load data and create undirected graph with 2198 nodes and 1380771 edges 
with weights

> df <- 
> read.csv("C:\\Users\\dh2r15\\Desktop\\Malaria\\CommunityDetection\\input\\IntLargerZeroFlows.csv",
>  header= TRUE,  stringsAsFactors = FALSE)

> mylinks <- df[,-1]

> colnames(mylinks)[3] <- "weight" #rename pred_seed1 to weight



> install.packages("igraph")

> library(igraph)



# create directed graph from dataframe

> directed_graph <- graph.data.frame(mylinks, directed = TRUE)

# collapse to undirected graph, summing weights

> net <- as.undirected(directed_graph, mode = "collapse", edge.attr.comb = 
> "sum")



### community detection ###

com <- walktrap.community(net, weights = E(net)$weight, steps = 4, merges 
=mTRUE, modularity = TRUE)

# analyse communities

length(com) #  number of communities

modularity(com) # how modular the graph partitioning is

# communities for every level

> mlc <- multilevel.community(net, weights=E(net)$weight)

> write.csv(mlc$memberships, "MultilevelMembership.csv")



The resulting MultilevelMembership.csv has 2198 columns, I guess each column 
represents one node. First question is how can I link these back to my original 
node names (e.g. MMR_1)?

Second problem is that this table has only 2 rows, each representing one 
level/partition (I assume). Why are there only 2? I was expecting a table with 
one row for each level, and as there are 2198 nodes surely there must be 2198 
levels (ranging from just one superlarge community including all nodes, down to 
2198 'communities' where each node is by itself in a community)?



Maybe I need to use a totally different function, something along the lines of 
createing a dendrogramm and cutting that?



Any help on this matter is greatly appreciated, I am trying to solve this for 
weeks now and do not get anywhere.

Many thanks for reading,

Thea











length(com) #  number of communities: 10

modularity(com) # how modular the graph partitioning is





library(igraph)

net <- graph_from_data_frame(d=mylinks, vertices=mynodes, directed=T)

I then use the walktrap algorithm to create communities:

com <- walktrap.community(net, weights = E(net)$weight, steps = 4, merges =

TRUE, modularity = TRUE)

Walktrap splits the graph into 10 communities and gives me a modularity score 
of 0.4819893 (which I think is quite good):

length(com) #  number of communities: 10

modularity(com) # how modular the graph partitioning is

I can plot a dendrogram with the 10 communities on top:

dendrogram <- dendPlot(com, mode="hclust")

Now my question: I need access to the modularity scores and memberships for all 
levels, not just the best cut level with 10 communities. So basically I want to 
be able to cut my dendrogram at each level and calculate the modularity as well 
as get the membership. How do I do this best please? I am after something like 
this:
* Level Modularity
* 1 ?
* 2 ?
* 3 ?
* ... ...
* 10 0.4819893
* ... ...

This list should include as many levels as the graph has nodes. Then in 
addition there should be something which shows me the membership for each level.

Second Problem

The second problem I have is to visualize this graph, as it contains so many 
nodes and edges that one can't really see anything:

com_plot <- plot(com, net)

plotted graph with communities

I would be glad for any ideas and solutions on this front.

Thank you for reading all this!

--
GeoData
University of Southampton
Southampton
SO17 1BJ

Tel: 023 8059 2719
Email: d...@geodata.soton.ac.uk

www.geodata.soton.ac.uk

Please note office hours Monday to Thursday
___