Dear Sander,

> 1. The igraph documentation suggests that the bfgs function is used to
> estimate the power law alpha, but I think the C implementation relies on
> the Broyden-Fletcher-Goldfarb-Shanno optimization function of the
> lbfgs library instead. Is that correct?

This is the exact implementation of the BFGS optimization that we use in power law fitting:
https://github.com/ntamas/plfit/blob/master/src/lbfgs.c

As far as I know, this is the C port of the limited-memory variant of the Broyden-Fletcher-Goldfarb-Shanno method, originally written in FORTRAN. The license notes in the source code might give you more clues.

> 1. The fit_power_law function relies on the MLE function of the stat4
> package. I am curious why this was deprecated, given the availability of
> plfit and MLE parameters. Is this simply a memory issue?

I don't know; this is purely in the domain of the R interface of igraph. The C core uses the L-BFGS method and my "plfit" library:

https://github.com/ntamas/plfit

The plfit library is an efficient implementation of the method published by Clauset, Shalizi and Newman:

Clauset A, Shalizi CR and Newman MEJ: Power-law distributions in empirical data. SIAM Review 51, 661-703 (2009).

> 1. How to interpret the p-value of the Kolmogorov-Smirnov test?

See the paper cited above for more details.

> 1. The igraph help file states: "Small p-values (less than 0.05)
> indicate that the test rejected the hypothesis that the original data could
> have been drawn from the fitted power-law distribution". The C
> implementation of the KS test in igraph uses the Hurwitz Zeta function.
> Shouldn't this mean that *high* p-values indicate a good model fit, as
> suggested by Clauset et al (2009:678)?

Well, tests based on p-values are not really about whether a model is a "good fit" or a "bad fit"; a low p-value _roughly_ says that "it is very unlikely that the data could have been generated from the hypothesized distribution" (in our case, a power-law). A high p-value _roughly_ means that "the data may have come from the hypothesized distribution"; however, there could be alternative distributions that can describe the data just as well.
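To make the low/high p-value distinction concrete, here is a minimal Python sketch of a bootstrap goodness-of-fit test in the spirit of Clauset et al. It is not the code used by igraph or plfit: it assumes continuous data, keeps xmin fixed instead of estimating it, and all function names are made up for illustration.

```python
import math
import random

def fit_alpha(data, xmin):
    """Continuous power-law MLE for the exponent, with xmin held fixed."""
    tail = [x for x in data if x >= xmin]
    return 1.0 + len(tail) / sum(math.log(x / xmin) for x in tail)

def ks_distance(data, xmin, alpha):
    """Largest gap between the empirical CDF and the fitted power-law CDF."""
    tail = sorted(x for x in data if x >= xmin)
    n = len(tail)
    d = 0.0
    for i, x in enumerate(tail):
        cdf = 1.0 - (x / xmin) ** (1.0 - alpha)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, abs(cdf - i / n), abs(cdf - (i + 1) / n))
    return d

def sample_power_law(n, xmin, alpha, rng):
    """Inverse-transform sampling from a continuous power law."""
    return [xmin * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0))
            for _ in range(n)]

def bootstrap_p_value(data, xmin, trials=200, seed=0):
    """Fraction of synthetic power-law samples that fit worse than the data."""
    rng = random.Random(seed)
    alpha = fit_alpha(data, xmin)
    d_obs = ks_distance(data, xmin, alpha)
    n = sum(1 for x in data if x >= xmin)
    worse = 0
    for _ in range(trials):
        synth = sample_power_law(n, xmin, alpha, rng)
        if ks_distance(synth, xmin, fit_alpha(synth, xmin)) >= d_obs:
            worse += 1
    return worse / trials

rng = random.Random(42)
power_law_data = sample_power_law(1000, 1.0, 2.5, rng)
exponential_data = [1.0 + rng.expovariate(1.0) for _ in range(1000)]

p_pl = bootstrap_p_value(power_law_data, 1.0)
p_exp = bootstrap_p_value(exponential_data, 1.0)
print("power-law sample:   p =", p_pl)
print("exponential sample: p =", p_exp)
```

The shifted exponential sample should get a p-value close to zero, so the power-law hypothesis is rejected. The genuine power-law sample will usually get a larger p-value, but that only means the power-law could not be excluded.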
So, in a nutshell:

low p-value --> null hypothesis (power-law) rejected --> data is likely not a power-law

high p-value --> null hypothesis (power-law) _not_ rejected --> data could come from a power-law, or maybe from something else, we don't know, we just could not _exclude_ the power-law

All the best,
T.
_______________________________________________
igraph-help mailing list
igraph-help@nongnu.org
https://lists.nongnu.org/mailman/listinfo/igraph-help