On Mon, Sep 14, 2009 at 3:29 PM, Evan Daniel <evanbd at gmail.com> wrote:
> There are several confounding factors.  First, the data aren't
> independent; there should be local clustering, and I seem to have
> double-counted the links to my node (18 out of 353 data points) (links
> between my peers would also be double-counted, but there don't appear
> to be any).  Second, there's the bandwidth issue: some of my peers are
> faster than others, as can be seen from their varied number of FOAF
> locations.  Peers with more FOAF locations will receive more traffic,
> in (rough) proportion to the number of FOAF locations they have.  I'm
> uncertain how the link length distribution should respond; perhaps a
> link to a peer should be counted n times, where n is the number of
> FOAF locations it advertises?  Or perhaps not; figuring that out would
> take some theoretical work I haven't done.  On average, across a large
> number of nodes / links, that effect should go away.  Third, we must
> be wary of observer bias: nodes that connect to other nodes are more
> likely to be observed by a random sample of nodes.  This will impact
> FOAF link length counting, but not local link length counting.

Sorry, there are some inaccuracies above:
My node had 16 peers in that dataset (not 18).  There are 33 duplicate
locations in the full listing.  15 of those represent duplicates of my
node's location.  The remaining 18 are repeated counts of nodes other
than mine: either my node connected to nodes A and B, which are
connected to each other (and therefore each counted twice), or my node
connected to nodes A and B, both of which are connected to a common
node C (which isn't connected to my node).
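
For anyone who wants to repeat the duplicate count on their own listing,
here is a minimal sketch in Python.  It assumes the listing is a flat list
of location values and that two entries with the same location are the
same node; the function name and the toy data are made up for illustration
and are not the real dataset.  A location that appears k times contributes
k - 1 duplicates, so a node listed by each of its 16 peers shows up as 15
duplicates.

```python
from collections import Counter

def count_duplicates(locations, own_location):
    """Return (duplicates of own node, duplicates of other nodes).

    Assumes identical location values refer to the same node.
    """
    counts = Counter(locations)
    # A location seen k times contributes k - 1 duplicate entries.
    total_dupes = sum(k - 1 for k in counts.values())
    own_dupes = max(counts.get(own_location, 0) - 1, 0)
    return own_dupes, total_dupes - own_dupes

# Toy data only: own node at 0.5 listed three times,
# one other node (0.25) listed twice, one node (0.75) listed once.
sample = [0.5, 0.5, 0.5, 0.25, 0.25, 0.75]
own, other = count_duplicates(sample, own_location=0.5)
print(own, other)
```

This only separates my node's repeats from everyone else's; telling the
"A and B connected to each other" case apart from the "common node C" case
would also need to know which peer reported each entry.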

Evan Daniel
