[graph-tool] Assortative constraint?

2021-08-15 Thread dadakinda
Hello,

Is there any way to bias the nested DC-SBM model towards assortative partitions?

I've tested PPBlockState, but as far as I understand it distinguishes groups 
only by internal vs. external density via the planted partition model, which is 
far more restrictive than the full SBM. I'm wondering if there is a middle 
ground where you have an SBM but the groups are biased (or required) to be 
assortative.

I searched for something like this and found a couple of papers: "A Regularized 
Stochastic Block Model for the robust community detection in complex networks" 
by Lu et al. and "Assortative-Constrained Stochastic Block Models" by Gribel et 
al. Unfortunately, neither uses priors to prevent overfitting, a nested 
formulation to address the resolution limit, etc. The latter paper restricts 
the intra-block connection probabilities to be greater than the inter-block 
connection probabilities. I'm not sure if there is an equivalent idea for the 
microcanonical model. The only idea I can think of would be to modify the 
likelihood in some way to make intra-block edges more likely.

Any guidance is appreciated,
Thank you
___
graph-tool mailing list -- graph-tool@skewed.de
To unsubscribe send an email to graph-tool-le...@skewed.de


[graph-tool] Re: Assortative constraint?

2021-08-15 Thread Tiago de Paula Peixoto

On 15.08.21 at 21:51, dadakinda wrote:

Hello,

Is there any way to bias the nested DC-SBM model towards assortative 
partitions?


I've tested PPBlockState, but as far as I understand it distinguishes 
groups only by internal vs. external density via the planted partition 
model, which is far more restrictive than the full SBM. I'm wondering if 
there is a middle ground where you have an SBM but the groups are biased 
(or required) to be assortative.


I searched for something like this and found a couple of papers: "A 
Regularized Stochastic Block Model for the robust community detection in 
complex networks" by Lu et al. and "Assortative-Constrained Stochastic 
Block Models" by Gribel et al. Unfortunately, neither uses priors to 
prevent overfitting, a nested formulation to address the resolution limit, 
etc. The latter paper restricts the intra-block connection probabilities 
to be greater than the inter-block connection probabilities. I'm not sure 
if there is an equivalent idea for the microcanonical model. The only idea 
I can think of would be to modify the likelihood in some way to make 
intra-block edges more likely.


Such model variations are not implemented in the library.

Furthermore, I don't think that the microcanonical priors / integrated 
likelihoods will be easy to write down in closed form for this kind of 
constraint.
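
Although the constraint itself is not available, the condition quoted in the question (intra-block connection probabilities greater than inter-block ones) can at least be checked after the fact on a partition obtained from any model. The following is a minimal sketch in plain NumPy, not part of graph-tool: `block_densities` and `is_assortative` are hypothetical helper names, the graph is assumed to be undirected and simple, and reading the condition as "smallest diagonal density exceeds largest off-diagonal density" is an assumption about how strictly it should hold.

```python
import numpy as np


def block_densities(edges, b):
    """Empirical edge density between every pair of groups.

    edges: iterable of (u, v) pairs of an undirected simple graph.
    b:     group label of each vertex, as integers 0..B-1.
    """
    b = np.asarray(b)
    B = int(b.max()) + 1
    n = np.bincount(b, minlength=B).astype(float)  # group sizes

    # Count edges between groups (symmetric matrix).
    e = np.zeros((B, B))
    for u, v in edges:
        r, s = b[u], b[v]
        e[r, s] += 1
        if r != s:
            e[s, r] += 1

    # Number of possible vertex pairs for each pair of groups.
    pairs = np.outer(n, n)
    np.fill_diagonal(pairs, n * (n - 1) / 2)

    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(pairs > 0, e / pairs, 0.0)


def is_assortative(omega):
    """True if every within-group density exceeds every between-group one."""
    omega = np.asarray(omega, dtype=float)
    B = omega.shape[0]
    if B < 2:
        return True
    off_diagonal = omega[~np.eye(B, dtype=bool)]
    return bool(omega.diagonal().min() > off_diagonal.max())


# Two triangles joined by a single edge, split into their natural groups.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
omega = block_densities(edges, [0, 0, 0, 1, 1, 1])
print(is_assortative(omega))
```

This only diagnoses a fitted partition; it does not bias the inference, and it does not replace the priors or the nested formulation discussed above.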


--
Tiago de Paula Peixoto 


[graph-tool] Re: Creating a layered graph (aka messing with property maps)

2021-08-15 Thread Tiago de Paula Peixoto

On 06.08.21 at 16:38, Davide Cittaro wrote:

Hi all,
I'm trying to analyze a graph for which I have multiple layers and I'm having 
difficulties creating the graph. I have two graphs with the same vertices, for 
which I define an edge property (the layer) like this:

(g_atac, dist_atac) = gt.generate_knn(A_data, k=n_neighbors)
(g_rna, dist_rna) = gt.generate_knn(R_data, k=n_neighbors)

layer_atac = g_atac.new_edge_property('int')
for e in g_atac.edges():
    layer_atac[e] = 1
layer_rna = g_rna.new_edge_property('int')
for e in g_rna.edges():
    layer_rna[e] = -1
g_atac.edge_properties['layer'] = layer_atac
g_rna.edge_properties['layer'] = layer_rna

I then merge the graphs by graph union, first defining a node mapping:

rna_mappings = g_rna.new_vertex_property("int")
for x in range(g_atac.num_vertices()):
    rna_mappings[x] = x

and then performing the actual union:

gu = gt.graph_union(g_atac, g_rna, intersection=rna_mappings,
                    internal_props=True)

This indeed creates the final graph, but I noticed that instead of two layers I 
have three:

np.unique(gu.edge_properties['layer'].a)
PropertyArray([-1,  0,  1], dtype=int32)

Looking back at the starting graphs, I have zeros in the edge properties as well:

np.unique(g_rna.edge_properties['layer'].a)
PropertyArray([-1,  0], dtype=int32)

np.unique(g_atac.edge_properties['layer'].a)
PropertyArray([0, 1], dtype=int32)

and, similarly, the number of elements in the edge property array is much 
larger than the number of edges:

print((len(g_rna.ep['layer'].a), g_rna.num_edges()))
(156672, 14317)

which is different from what I can get from this:

g_rna.get_edges([g_rna.ep['layer']]).shape
(14317, 3)

I'm evidently messing with (internal) properties and I'm clueless, any advice?


As explained in the documentation, edge indexes need not be contiguous 
(unlike vertex indexes, which are always contiguous).


When accessing the property map values via the ".a" property, this 
returns an array indexed by the edge index, which will contain values 
for non-existing edges if the indexes are not contiguous. This is what 
you are seeing.


If you want to see only the property values for existing edges, you 
should get a filtered array with the ".fa" property.
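
The effect can be reproduced without graph-tool. The sketch below is only a NumPy model of an index-addressed property array, with made-up edge indexes; `a` and `fa` here are ordinary arrays standing in for the ".a" and ".fa" views, not graph-tool objects.

```python
import numpy as np

# Hypothetical toy case: 3 existing edges whose indexes are 0, 2 and 5,
# so the index range is 6 even though only 3 edges exist.
edge_index = np.array([0, 2, 5])
index_range = int(edge_index.max()) + 1

# Analogue of ".a": one slot per possible edge index, zero-initialised.
a = np.zeros(index_range, dtype=np.int32)

# Analogue of the loop over g_rna.edges(): only existing edges are written.
a[edge_index] = -1

# Analogue of ".fa": keep only the slots that belong to existing edges.
fa = a[edge_index]

print(len(a), len(fa))   # more slots than edges vs. one value per edge
print(np.unique(a))      # includes the 0 left in the unused slots
print(np.unique(fa))     # only the value that was actually assigned
```

In the original example, the corresponding check would be np.unique(g_rna.ep['layer'].fa) instead of np.unique(g_rna.ep['layer'].a).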


Best,
Tiago

--
Tiago de Paula Peixoto 