[graph-tool] Re: random graph

2021-09-01 Thread Davide Cittaro
Tiago de Paula Peixoto wrote:
> Am 01.09.21 um 19:24 schrieb Davide Cittaro:
> >   
> If we assume that the big graph is sampled from an SBM, then the 
> sub-sampled graph would also be sampled from an SBM, but not from the 
> same one, if we are dealing with sparse networks. The sub-sampled SBM 
> would be sparser (smaller average degree), and have a deformed degree 
> distribution in the case of the DC-SBM.

Since my graphs are kNN graphs, would you suggest recomputing them on the 
subsampled data? To be fair, I’ve tried, and the results do not change 
dramatically.

> 
> The intuition here is that the evidence for the underlying structure 
> will become weaker after sub-sampling, in proportion to how much sparser 
> the network becomes. With the MDL/Bayesian approach in graph-tool, you 
> should see fewer groups in the sub-sampled network, but they should 
> otherwise be similar to the full network.

This is indeed what I observe. I’m thinking of using this to “sniff” 
the data; if needed, one can (and should) then use the full network. Also, 
I’m aware that subsampling will generally wipe out small communities, which 
then won’t be identified.
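
A rough way to quantify the point about small communities, as a sketch only: if nodes are sampled uniformly without replacement, the number of members of a given community that survive is hypergeometric. The function name and the `min_nodes` cutoff below are hypothetical; the SBM inference has no hard size threshold, the cutoff only stands in for "too few nodes left to be detectable".

```python
from math import comb

def prob_fewer_than(n_total, community_size, n_sampled, min_nodes):
    """Probability that fewer than `min_nodes` members of a community of
    `community_size` nodes end up in a uniform sample of `n_sampled` nodes
    drawn without replacement from `n_total` nodes (hypergeometric tail)."""
    upper = min(min_nodes, community_size + 1, n_sampled + 1)
    favourable = sum(
        comb(community_size, k) * comb(n_total - community_size, n_sampled - k)
        for k in range(upper)
    )
    return favourable / comb(n_total, n_sampled)

# Hypothetical numbers: 10,000 nodes, a 10% sample, and an arbitrary
# cutoff of 5 surviving nodes.
p_small = prob_fewer_than(10_000, 20, 1_000, 5)   # community of 20 nodes
p_large = prob_fewer_than(10_000, 200, 1_000, 5)  # community of 200 nodes
print(f"small community falls below cutoff: {p_small:.3f}")
print(f"large community falls below cutoff: {p_large:.2e}")
```

With these made-up numbers the small community almost always falls below the cutoff, while the large one almost never does.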

Thanks again

d
___
graph-tool mailing list -- graph-tool@skewed.de
To unsubscribe send an email to graph-tool-le...@skewed.de


[graph-tool] Re: random graph

2021-09-01 Thread Tiago de Paula Peixoto

Am 01.09.21 um 19:24 schrieb Davide Cittaro:

> Following up on this post, I have large datasets for which I already have 
> an NSBM. I wanted to speed things up by fitting an approximate model, so I 
> subsampled a fraction of nodes (chosen at random) and fitted an NSBM, then 
> performed a sort of label transfer to the original graph. Except for the 
> fact that the partitions at level 0 are now larger than the original ones 
> (as expected), I noticed a general concordance between the communities 
> found using the subsampled and full graphs.
> Do you have any literature, ideas or hints about the analysis of subsamples?


If we assume that the big graph is sampled from an SBM, then the 
sub-sampled graph would also be sampled from an SBM, but not from the 
same one, if we are dealing with sparse networks. The sub-sampled SBM 
would be sparser (smaller average degree), and have a deformed degree 
distribution in the case of the DC-SBM.


The intuition here is that the evidence for the underlying structure 
will become weaker after sub-sampling, in proportion to how much sparser 
the network becomes. With the MDL/Bayesian approach in graph-tool, you 
should see fewer groups in the sub-sampled network, but they should 
otherwise be similar to the full network.
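
The sparsification can be checked numerically with a small sketch in plain Python (not graph-tool). It samples a planted-partition SBM, keeps a random fraction of the nodes, and compares average degrees. All names and parameter values here are illustrative assumptions, not taken from the thread.

```python
import random

def sample_sbm(sizes, p_in, p_out, rng):
    """Sample an undirected planted-partition SBM.
    Returns (edge list, list of block labels per node)."""
    labels = [b for b, n in enumerate(sizes) for _ in range(n)]
    n = len(labels)
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            p = p_in if labels[u] == labels[v] else p_out
            if rng.random() < p:
                edges.append((u, v))
    return edges, labels

def induced_subgraph(edges, keep):
    """Keep only the edges whose two endpoints both survive the sampling."""
    keep = set(keep)
    return [(u, v) for u, v in edges if u in keep and v in keep]

def mean_degree(edges, n_nodes):
    return 2 * len(edges) / n_nodes

rng = random.Random(0)
edges, labels = sample_sbm([200] * 5, p_in=0.05, p_out=0.002, rng=rng)
n = len(labels)

fraction = 0.3  # fraction of nodes kept (arbitrary choice)
keep = rng.sample(range(n), int(fraction * n))
sub_edges = induced_subgraph(edges, keep)

k_full = mean_degree(edges, n)
k_sub = mean_degree(sub_edges, len(keep))
print(f"mean degree, full graph:      {k_full:.2f}")
print(f"mean degree, sub-sampled:     {k_sub:.2f}")
print(f"ratio (sub / full):           {k_sub / k_full:.2f}")
```

Each edge survives only if both endpoints are kept, so the average degree of the induced subgraph shrinks by roughly the sampling fraction, while the block structure among the surviving nodes is unchanged.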


--
Tiago de Paula Peixoto 


[graph-tool] Re: random graph

2021-09-01 Thread Davide Cittaro
Following up on this post, I have large datasets for which I already have 
an NSBM. I wanted to speed things up by fitting an approximate model, so I 
subsampled a fraction of nodes (chosen at random) and fitted an NSBM, then 
performed a sort of label transfer to the original graph. Except for the 
fact that the partitions at level 0 are now larger than the original ones 
(as expected), I noticed a general concordance between the communities 
found using the subsampled and full graphs.
Do you have any literature, ideas or hints about the analysis of subsamples?
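
The message does not say how the label transfer was done. One possible scheme, shown purely as an assumption, is a neighbour majority vote on the full graph: each unlabelled node takes the most common label among its already-labelled neighbours, repeated until nothing changes. The function `transfer_labels` is a hypothetical helper, not part of graph-tool.

```python
from collections import Counter, defaultdict

def transfer_labels(edges, known, max_rounds=10):
    """Propagate labels from the sub-sampled nodes (`known`: node -> label)
    to the remaining nodes of the full graph by neighbour majority vote.
    Nodes with no labelled neighbour after `max_rounds` stay unlabelled."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    labels = dict(known)
    for _ in range(max_rounds):
        updates = {}
        for node in adj:
            if node in labels:
                continue
            votes = Counter(labels[w] for w in adj[node] if w in labels)
            if votes:
                updates[node] = votes.most_common(1)[0][0]
        if not updates:
            break
        labels.update(updates)  # synchronous update after each round
    return labels

# Toy full graph: two triangles joined by the edge (2, 3).
full_edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
# Labels inferred on the sub-sample; nodes 2 and 3 were left out of it.
sub_labels = {0: "a", 1: "a", 4: "b", 5: "b"}

all_labels = transfer_labels(full_edges, sub_labels)
print(all_labels)
```

In the toy graph, node 2 has two neighbours labelled "a" and node 3 has two labelled "b", so each joins its own triangle's group.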