On Fri, Apr 5, 2013 at 12:04 PM, John Redford <[email protected]> wrote:
> Ben Tilly emitted:
>>
>> Pro tip.  I've seen both push based systems and pull based systems at
>> work.  The push based systems tend to break whenever the thing that
>> you're pushing to has problems.  Pull-based systems tend to be much more
>> reliable in my experience.
> [...]
>>
>> If you disregard this tip, then learn from experience and give thought in
>> advance to how you're going to monitor the things that you're pushing to,
>> notice their problems, and fix them when they break.
>> (Rather than 2 weeks later when someone wonders why their data stopped
>> updating.)
>
> Your writing is FUD.

Are you reading something into what I wrote that wasn't there?
Because I'm pretty sure that what I wrote isn't FUD.

A pull-based system relies on having the job that does the work ask
for work when it's ready.  A push-based system relies on pushing to a
worker.  If the worker in question is busy on a long job, or has
crashed for some reason, it is easy for work to get delayed or lost
with a push-based system while other workers sit idle.  A recent,
well-publicized example of the resulting sporadic problems is
http://rapgenius.com/James-somers-herokus-ugly-secret-lyrics.  A
pull-based system avoids that failure mode unless all workers crash at
once.

For an example of an interesting failure case, consider a request that
crashes whatever worker tries to do it.  With a push-based system, a
worker gets it, crashes, might be brought up automatically, tries the
same request, crashes again, and all requests sent to the unlucky
worker are permanently lost.  With a pull-based system, the bad
request will start crashing workers left and right, but progress
continues to be made on everything.

This is not to say that push-based systems are always inappropriate.
HTTP is a push-based system, and a push-based system is often simpler
to design and build.  But if you have an even choice, prefer the
pull-based system.  Yes, you will have to poll, but polling tends to
have better failure modes.

> Pro tip.  Learn to use a database.  I know that it can be fun to play with
> the latest piece of shiny technofrippery, like Redis, and to imagine that
> because it is new, it somehow is better than anything that came before and
> that it can solve problems that have never been solved before.  It's not.
> There's nothing specifically wrong with it, but it's not a silver bullet and
> parallelism is not a werewolf.

What makes you think that I don't know how to use a database?  (Here
is a hint: a separate table per downloader is not exactly a best
practice.)  If you'll note, my first suggestion was to implement
polling on the database.  That's because I've been there, done that.
It works, and the database gets better throughput than most people
realize it can.  In fact, it is probably more than sufficient for this
particular application.

If the queries are properly designed (which often means that someone
else did the heavy work of putting things into the queue), distributing
hundreds of jobs per second to workers is pretty easy.  (I don't know
the limit; 100/second was sufficient the last time I needed to do this,
and MySQL didn't break a sweat on that.)  I'll describe how to do that
in a second.

But this particular use case isn't a great fit for a database's
capabilities.  It is like using army tanks to pick up groceries from
the corner store.  If you've got the tanks, you might as well, but
there are more appropriate tools.  With Redis you can distribute tens
of thousands of jobs per second pretty easily.  Scaling farther than
that requires distributing work in a more sophisticated way, but it
sounds like they have a long way to go before running into that
barrier.

(NOTE FOR DAVID: here is a blueprint for something that might be easy
for you to build, to solve your current scaling problem.  It will also
allow you to trivially distribute downloading across multiple machines
for better throughput, without introducing new technologies into your
stack.)

Now if you're curious how to achieve that throughput with a database
and polling, here you go.  This is based on a system that I've built
variations of several times.  Have two tables, let's call them
job_order and job_pickup.  We insert into job_order when we want work
done.  A worker inserts into job_pickup when it's ready to do work.

When a worker wakes up, it checks whether the top id of job_order
exceeds the top id of job_pickup.  If not, it sleeps.  If it does, it
inserts a row into job_pickup; the id of that row is its job.  It then
polls for the job_order with that id.  When it finds it, it updates the
record with a new status, and once the work is finished, marks it done.
If the job_order was there right away, assume that there is another,
and insert into job_pickup again until the workers have caught up with
requests.  Then, after that job, sleep.
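Here is a minimal sketch of that two-table scheme.  I'm using sqlite3 so
it runs self-contained; the real system would be MySQL or PostgreSQL,
and the table and column names (job_order, job_pickup, payload, status)
are illustrative, not anything from an actual deployment:

```python
# Two-table polling queue: producers insert into job_order, workers
# claim work by inserting into job_pickup and using that row's id as
# their job id.
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE job_order (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        payload TEXT,
        status TEXT DEFAULT 'pending'
    );
    CREATE TABLE job_pickup (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        worker TEXT
    );
""")

def order_job(payload):
    # Producer side: insert into job_order when we want work done.
    db.execute("INSERT INTO job_order (payload) VALUES (?)", (payload,))
    db.commit()

def try_pickup(worker, handler):
    # Worker side: claim a job only if orders are ahead of pickups.
    top_order = db.execute(
        "SELECT IFNULL(MAX(id), 0) FROM job_order").fetchone()[0]
    top_pickup = db.execute(
        "SELECT IFNULL(MAX(id), 0) FROM job_pickup").fetchone()[0]
    if top_order <= top_pickup:
        return False              # nothing to claim; caller should sleep
    cur = db.execute("INSERT INTO job_pickup (worker) VALUES (?)", (worker,))
    job_id = cur.lastrowid        # the id of this pickup row is our job
    db.commit()
    # With a single process the matching job_order row is already there;
    # with concurrent workers this is where you'd poll for it to appear.
    row = db.execute(
        "SELECT payload FROM job_order WHERE id = ?", (job_id,)).fetchone()
    db.execute("UPDATE job_order SET status = 'working' WHERE id = ?",
               (job_id,))
    handler(row[0])
    db.execute("UPDATE job_order SET status = 'done' WHERE id = ?",
               (job_id,))
    db.commit()
    return True
```

A worker's main loop is just "call try_pickup; if it returned False,
sleep a random interval and try again."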

When I say sleep, I mean a short random sleep of up to something like
0.2 seconds (in Perl, Time::HiRes::sleep(rand(0.2))).  The rand avoids
a "thundering herd problem".  When you're polling, put a smaller random
sleep between poll requests to avoid overloading the system.  You can
play with those numbers depending on how many workers, requests, etc.
you have.  But the excess polling overhead can easily be limited.
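Sketched out, the two sleeps look like this; the 0.2s and 0.02s numbers
are just the kind of starting values I mean, not tuned constants:

```python
# Jittered sleeps for the polling workers.  Randomizing the idle sleep
# keeps workers from all waking at once (the thundering herd); the
# smaller jitter spaces out polls while waiting on a claimed job.
import random
import time

def idle_sleep():
    # Between wake-ups when there was no work to claim.
    time.sleep(random.uniform(0, 0.2))

def poll_sleep():
    # Between polls for a job we have already claimed.
    time.sleep(random.uniform(0.005, 0.02))
```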

If you want to have multiple types of jobs, and not every worker can
handle every kind of job, you won't be able to use the autoincrementing
ID for synchronization.  But you can use the same tables and a pair of
sequences per type.  See
http://www.postgresql.org/docs/8.1/static/sql-createsequence.html for
information on how to do that with PostgreSQL.
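To keep the sketch runnable I'll emulate the per-type sequence pair with
a small counters table in sqlite; with PostgreSQL you would create real
sequences and call nextval() instead.  The type names here are made up:

```python
# Emulating a pair of per-type sequences (order counter and pickup
# counter per job type) with a counters table.  In PostgreSQL this
# would be CREATE SEQUENCE plus nextval('...').
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE type_seq (name TEXT PRIMARY KEY, value INTEGER DEFAULT 0);
    INSERT INTO type_seq (name) VALUES
        ('download_order'), ('download_pickup'),
        ('parse_order'), ('parse_pickup');
""")

def nextval(name):
    # Each call hands out the next number for that counter, so each job
    # type gets its own independent order/pickup numbering.
    db.execute("UPDATE type_seq SET value = value + 1 WHERE name = ?",
               (name,))
    db.commit()
    return db.execute("SELECT value FROM type_seq WHERE name = ?",
                      (name,)).fetchone()[0]
```

A producer tags each job_order row with (type, nextval(type + '_order')),
and a worker claims by drawing nextval(type + '_pickup') for a type it
can handle, then polls for the row with that (type, number).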

As a sanity check you can have a monitor that will look for jobs that
do not seem to be processed, mark them as failed, and resubmit them.
(If you lock properly, the monitor is safe.  Most developers do not
understand MVCC well enough to avoid a small race condition, but the
odds of hitting that are very small.)  Put a concatenated index on
(status, create_datetime) and the queries that it needs to make will
be extremely efficient.  Also thanks to row-level locking, there is
almost no contention between processes.
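The monitor itself is simple; here is one way it might look, again as a
self-contained sqlite3 sketch with illustrative names, where "stale" is
any job still marked working past a cutoff you choose:

```python
# Watchdog monitor: find jobs stuck in 'working' past a cutoff, mark
# them failed, and resubmit a fresh copy as a new pending job.
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE job_order (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        payload TEXT,
        status TEXT DEFAULT 'pending',
        create_datetime TEXT DEFAULT CURRENT_TIMESTAMP
    );
    CREATE INDEX job_order_status_created
        ON job_order (status, create_datetime);
""")

def reap_stale(cutoff):
    # The concatenated (status, create_datetime) index makes this scan
    # cheap: it touches only 'working' rows older than the cutoff.
    stale = db.execute(
        "SELECT id, payload FROM job_order "
        "WHERE status = 'working' AND create_datetime < ?",
        (cutoff,)).fetchall()
    for job_id, payload in stale:
        db.execute("UPDATE job_order SET status = 'failed' WHERE id = ?",
                   (job_id,))
        db.execute("INSERT INTO job_order (payload) VALUES (?)", (payload,))
    db.commit()
    return len(stale)
```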

That's a design for a generic polling-based job control system using a
SQL database for a back end.  It works.  It isn't great, but it scales
much farther than most developers would expect.  And when it maxes
out, well, that's a perfect use case for Redis.  And when Redis maxes
out, come back and talk; I know how to do that as well.  I picked up a
lot of knowledge about how to build reliable and scalable systems
when I worked at Google.

(That said, Redis is a good piece of software.  Why are you resistant
to learning it?)

_______________________________________________
Boston-pm mailing list
[email protected]
http://mail.pm.org/mailman/listinfo/boston-pm
