John,

Thanks for the background info. "Row count queries" means to me the
summary friends and followers numbers displayed on the Twitter web
pages, and returned on the user profile via the API, correct? So, if I
am understanding you correctly, then the friends and followers that
we're getting back from the social graph methods are pulled from the
third store, and doing a count() on the returned JSON array gives one
the actual valid numbers of current friends and followers. (Not that
users would ever believe us. LOL. They believe what they see on the
Twitter web pages.)
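
To make sure we mean the same thing by "doing a count()", here is a minimal sketch of the counting, not a real client. The fetch_page callable, the page shape ({"ids": [...], "next_cursor": N}), the starting cursor of -1, and the terminating cursor of 0 are all assumptions for illustration. The fake pages stand in for the HTTP call so the sketch runs offline. IDs go into a set so that an ID repeated across pages is counted once.

```python
import json


def count_ids(fetch_page):
    """Count distinct IDs across cursor pages.

    fetch_page is a hypothetical callable: it takes a cursor and returns
    the JSON text of one page, assumed to look like
    {"ids": [...], "next_cursor": N}, with next_cursor == 0 on the last page.
    """
    seen = set()
    cursor = -1  # assumed starting cursor
    while cursor != 0:
        page = json.loads(fetch_page(cursor))
        seen.update(page["ids"])      # duplicates across pages collapse here
        cursor = page["next_cursor"]  # 0 is assumed to mean "no more pages"
    return len(seen)


# Stand-in for the real HTTP call (made-up data, ID 3 repeats on purpose).
FAKE_PAGES = {
    -1: '{"ids": [1, 2, 3], "next_cursor": 10}',
    10: '{"ids": [3, 4], "next_cursor": 0}',
}


def fake_fetch(cursor):
    return FAKE_PAGES[cursor]


print(count_ids(fake_fetch))  # prints 4
```

With the fake data above, ID 3 appears on both pages but is counted once, so the total is 4 rather than 5.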

Anyway, I cannot imagine the challenges you must face with your
explosive growth. It would be interesting if, one day, one of your
engineers could give an overview of your technical architecture.
Facebook has done that (I remember the one regarding their image
serving) and it was fascinating.

I would appreciate it if you could fix the 10+ second delay issue on
Tuesday or Wednesday. It's not a major "train smash" issue; it is just
slowing down my scripts to a great extent. They are battling to keep
up with the workload when they are slowed down like that.

Dewald

On Sep 6, 11:59 am, John Kalucki <jkalu...@gmail.com> wrote:
> I can't speak to the policy issues, but I'll share a few things about
> social graph backing stores.
>
> To put it politely, the social graph grows quickly. Projecting the
> growth out just 3 or 6 months causes most engineers to do a spit-
> take.
>
> We have three online (user-visible) ways of storing the social graph.
> One is considered canonical, but it is useless for online queries. The
> second used to handle all queries. This store began to suffer from
> correctness and internal inconsistency problems as it was
> pushed well beyond its capabilities. We recognized this issue long
> before the issues became critical, allocated significant resources,
> and built a third store. This store is correct (eventually
> consistent), internally consistent, fast, efficient, very scalable,
> and we're very happy with it.
>
> As the second system was slagged into uselessness, we had to cut over
> the majority of the site to the third system when the third reached a
> good, but not totally perfect, state. As we cut over, all sorts of
> problems, bugs and issues were eliminated. Hope was restored, flowers
> bloomed, etc. Yet, the third store has two minor user-visible flaws
> that we are fixing. Note that working on a large critical production
> data store with heavy read and write volume takes time, care and
> resources. There is minor pagination jitter in one case, and a certain
> class of row-count-based queries has to be deprecated (or limited)
> and replaced with cursor-based queries to be practical. For now, we're
> sending the row-count queries back to the second system, which
> is otherwise idle, but isn't consistent with the first or third
> system.
>
> We also have follower and following counts memoized in two ways that I
> know about, and there's probably at least one more way that I don't
> know about.
>
> Experienced hands can intuit the trade-offs and well-agonized choices
> that were made when we were well behind a steep growth curve on the
> social graph.
>
> These are the cards.
>
> -John Kalucki
> http://twitter.com/jkalucki
> Services, Twitter Inc.
