I can't speak to the policy issues, but I'll share a few things about
social graph backing stores.

To put it politely, the social graph grows quickly. Projecting the
growth out just 3 or 6 months causes most engineers to do a spit-take.

We have three online (user-visible) ways of storing the social graph.
One is considered canonical, but it is useless for online queries. The
second used to handle all queries. It began to suffer from
correctness and internal consistency problems as it was pushed well
beyond its capabilities. We recognized the problem long before it
became critical, allocated significant resources, and built a third
store. This store is correct (eventually consistent), internally
consistent, fast, efficient, and very scalable, and we're very happy
with it.

As the second system was slagged into uselessness, we had to cut over
the majority of the site to the third system when the third reached a
good, but not totally perfect, state. As we cut over, all sorts of
problems, bugs and issues were eliminated. Hope was restored, flowers
bloomed, etc. Yet, the third store has two minor user-visible flaws
that we are fixing. Note that working on a large critical production
data store with heavy read and write volume takes time, care and
resources. There is minor pagination jitter in one case, and a certain
class of row-count-based queries has to be deprecated (or limited)
and replaced with cursor-based queries to be practical. For now, we're
sending the row-count-based queries back to the second system, which
is otherwise idle, but isn't consistent with the first or third
system.
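
To make the row-count versus cursor distinction concrete, here is a
minimal sketch. It is not how any of the stores described here are
built: the in-memory list, the sequence-number IDs, and the function
names page_by_offset and page_by_cursor are all assumptions made up
for illustration. It only shows why skipping a fixed number of rows
can shift results while a graph is being written to, and why
resuming from a cursor does not.

```python
from bisect import bisect_left
from typing import List, Optional, Tuple

# Hypothetical stand-in for one user's follower edges: increasing
# sequence numbers, oldest first. Not any real store's layout.
Edges = List[int]


def page_by_offset(edges: Edges, offset: int, count: int) -> List[int]:
    """Row-count-based paging: skip `offset` rows, newest edge first."""
    newest_first = edges[::-1]
    return newest_first[offset:offset + count]


def page_by_cursor(
    edges: Edges, cursor: Optional[int], count: int
) -> Tuple[List[int], Optional[int]]:
    """Cursor-based paging: up to `count` edges strictly older than `cursor`.

    A cursor of None means "start from the newest edge". The returned
    cursor is None when there are no older edges left.
    """
    end = len(edges) if cursor is None else bisect_left(edges, cursor)
    start = max(0, end - count)
    page = edges[start:end][::-1]
    next_cursor = page[-1] if page and start > 0 else None
    return page, next_cursor


if __name__ == "__main__":
    followers = [1, 2, 3, 4, 5, 6]

    first_by_offset = page_by_offset(followers, 0, 3)            # [6, 5, 4]
    first_by_cursor, cursor = page_by_cursor(followers, None, 3)  # [6, 5, 4], 4

    followers.append(7)  # a new follower arrives between page fetches

    # The offset now counts from the new head, so edge 4 shows up again.
    print(page_by_offset(followers, 3, 3))        # [4, 3, 2]
    # The cursor resumes strictly below edge 4, so nothing repeats.
    print(page_by_cursor(followers, cursor, 3))   # ([3, 2, 1], None)
```

In this toy version the offset variant also has to walk past every
skipped row, which hints at why deep row-count queries tend to get
expensive on a large store, while the cursor variant can seek
directly to its starting point.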

We also have follower and following counts memoized in two ways that I
know about, and there's probably at least one more way that I don't
know about.

Experienced hands can intuit the trade-offs and well-agonized choices
that were made when we were well behind a steep growth curve on the
social graph.

These are the cards.

-John Kalucki
Services, Twitter Inc.
