On 29 September 2008 at 21:00:41, Mark Stosberg <[EMAIL PROTECTED]> wrote:

> This question isn't so much a mod_perl question as it is a question
> about building high-performance websites with Perl.
>
> We have a large, busy database application that relates to millions of
> photos, which we also need to store and display. We've been keeping the
> metadata about the photos in PostgreSQL, and the files on the file
> system. (Implemented much like CGI::Uploader.)

We have:

a) ~150,000 tiny avatar pictures (50x50);
b) ~300,000 user photos (320x240 originals),
   also available in 4 more sizes;
c) tens of millions of album pictures in original
   and thumbnail sizes;

We're using MySQL 5.0 with the MyISAM storage engine. Yes.
Until recently, a) & b) were stored in a MySQL blob field. Yes.
Did you hear me screaming? :-)

Problems I found when I started working here:

- our mod_perl backends were serving 20-40% of picture requests,
  which is completely insane;

- our picture-serving code was fetching the picture from the database,
  sometimes even scaling it on the fly (!), saving it in memcached
  and $r->print()ing it out down the wire.
  That's completely insane. The on-the-fly scaling even disabled caching.

- when you update a picture's metadata (this is MySQL), you _LOCK_
  the _ENTIRE_ table with its hundreds of thousands of images.

- fetching from a blob field in MySQL is expensive.

Now instead:

- avatars were our first experiment. They are stored as static resources
  in a hierarchical, balanced filesystem structure
  (using digests and splitting them up);

  We completely removed the caching layer from our mod_perl code,
  because caching happens directly in the browser for static
  resources;

  We managed to move away 500,000 req/day from mod_perl to static
  HTTP servers.
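
  In case it helps, here is the digest + splitting idea sketched in
  Perl. This is a simplified, hypothetical example: the digest function
  and the split widths are illustrative, not our exact scheme.

```perl
#!/usr/bin/perl
# Simplified, hypothetical sketch of a "digest + splitting" path
# scheme. The digest function (MD5) and the 2+3 character split
# widths are illustrative assumptions, not a production scheme.
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Map a resource id and size suffix to a balanced filesystem path,
# so that no single directory ends up holding millions of entries.
sub resource_path {
    my ($id, $size) = @_;
    my $digest = md5_hex($id);
    my $level1 = substr($digest, 0, 2);   # first directory level
    my $level2 = substr($digest, 2, 3);   # second directory level
    return "$level1/$level2/$digest/${id}_${size}.jpg";
}

print resource_path(123456, 'm'), "\n";
# -> e1/0ad/e10adc3949ba59abbe56e057f20f883e/123456_m.jpg
```

  Two hex characters on the first level cap that directory at 256
  children, three on the second at 4096, which keeps directory scans
  cheap even with tens of millions of files.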

- User photos are coming. We wrote a nice application layer that
  can upload a single resource to many pools of static servers,
  in different sizes, with automatic thumbnailing. The filesystem
  path scheme can be defined in the resource Perl class, but it is
  basically the same digest + splitting, as in:

    http://static.myapp.com/pool1/a1/b2c/d3f4g5h6.../123456_m.jpg

  We are mass-exporting pictures from the database blobs to our
  filesystems via DAV. Using DAV is not the most efficient way,
  but it allows you to attach arbitrary metadata to the filesystem.

  We now use that to resolve inconsistencies and "sync" the
  metadata in the database.
  This should go live with next release. I'll let you know :)
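
  As a sketch, attaching that metadata over DAV boils down to a
  PROPPATCH request like the one below. This is a hypothetical
  example: the property names and the urn:myapp:picture namespace
  are made up, and in practice you would send the request through
  LWP or any DAV-capable client rather than building it by hand.

```perl
#!/usr/bin/perl
# Hypothetical sketch (not our actual code): building a raw WebDAV
# PROPPATCH request that stores the database id and a checksum as
# dead properties on an exported picture, so they can later be read
# back and compared against the database rows.
use strict;
use warnings;

sub proppatch_request {
    my ($path, %props) = @_;
    # Render each property as an XML element in the made-up
    # urn:myapp:picture namespace declared below.
    my $props_xml = join '', map { "<$_>$props{$_}</$_>" } sort keys %props;
    my $body = qq{<?xml version="1.0" encoding="utf-8"?>}
             . qq{<D:propertyupdate xmlns:D="DAV:" xmlns="urn:myapp:picture">}
             . qq{<D:set><D:prop>$props_xml</D:prop></D:set>}
             . qq{</D:propertyupdate>};
    # Assemble the raw HTTP request text.
    return join "\r\n",
        "PROPPATCH $path HTTP/1.1",
        "Host: static.myapp.com",
        "Content-Type: application/xml",
        "Content-Length: " . length($body),
        "",
        $body;
}

print proppatch_request(
    '/pool1/e1/0ad/123456_m.jpg',
    'db-id' => 123456,
    'md5'   => 'e10adc3949ba59abbe56e057f20f883e',
), "\n";
```

  The sync side is then a PROPFIND for the same properties, comparing
  what the static servers report against the database.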

I'm not sure what to do for album pictures. They are already out
of the database, thank god, but the "design" guys now want 3-4
thumbnails even for those pics. Suggestions?

> This has worked great in terms of performance, but with so much data to
> manage, over time we have run into data inconsistency issues between the
> file system and the database.

Can you explain the issues you found?
I'd really like to know, so I'm prepared. :-)

> So, I'm asking if anyone has had experience successfully storing photos
> (or other files) directly in a database? That would solve the consistency
> issue, but may create a performance issue.

Yes, performance issues.

> Perhaps the right kind of caching layer could solve that.

I'm not sure. If you throw caching into the "pics-in-the-db" mess,
IMHO you only make the situation worse.

--
Cosimo
