Mark Stosberg wrote:
This question isn't so much a mod_perl question as it is a question
about building high-performance websites with Perl.
We have a large, busy database application that relates to millions of
photos, which we also need to store and display. We've been keeping the
metadata about the photos in PostgreSQL, and the files on the file
system. (Implemented much like CGI::Uploader.)
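The general flow is roughly this (a minimal sketch only; the table,
columns, and paths below are invented for illustration, and RETURNING
assumes PostgreSQL 8.2 or later):

    use strict;
    use warnings;
    use DBI;
    use File::Copy qw(copy);

    # Hypothetical schema: photos(id SERIAL PRIMARY KEY,
    #                             caption TEXT, path TEXT)
    my $dbh = DBI->connect('dbi:Pg:dbname=photos', 'user', 'secret',
                           { RaiseError => 1 });

    sub store_photo {
        my ($tmp_file, $caption) = @_;

        # 1. Insert the metadata row and get the generated id back.
        my ($id) = $dbh->selectrow_array(
            'INSERT INTO photos (caption) VALUES (?) RETURNING id',
            undef, $caption);

        # 2. Copy the uploaded file into place, named after the id.
        my $path = "/var/photos/$id.jpg";
        copy($tmp_file, $path) or die "copy failed: $!";

        # 3. Record where the file ended up.
        $dbh->do('UPDATE photos SET path = ? WHERE id = ?',
                 undef, $path, $id);
        return $id;
    }

Note the window between steps 1 and 2: if the copy fails after the row
is inserted, the two stores already disagree.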
This has worked great in terms of performance, but with so much data to
manage, over time we have run into data inconsistency issues between the
file system and the database.
So, I'm asking if anyone has had experience successfully storing photos
(or other files) directly in a database? That would solve the consistency
issue, but may create a performance issue. Perhaps the right kind of
caching layer could solve that.
I am curious about your application, because we have something similar,
at similar volumes, though with documents in general rather than photos.
The following is an opinion piece.
We have also looked at various data organisations over time.
Regarding storing large objects directly in a database, the one issue is
always that the object sizes make any operation on the rows of such a
table very heavy. Imagine having to dump or reload a table that contains
500,000 "blobs" of 2-3 MB each.
(I don't know about PostgreSQL, but many database systems require a dump
and a reload when you change a table's structure.)
Or simply taking a backup of that table. And you cannot "rsync" a
database as easily as you can a filesystem.
It also means that any intermediate buffers (which are often used to
speed up retrieval of "nearby" rows) fill up quickly while holding only
a few rows.
(On the other hand, if you just keep thumbnails of a couple of KB, I
guess it would not matter much.)
Another issue is that databases do get screwed up from time to time, and
the likelihood probably increases as you push them to their limits (for
example, with very large row sizes). Resolving inconsistencies between
database rows and files on disk may be no fun, but resolving
inconsistencies within a database may be even less so.
One point I am curious about is what kind of file structure you use to
store the millions of images on the filesystem. I can't imagine that
you really put them all into one flat directory?
And are you storing the real paths directly in the database?
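In case it's useful, here is one common scheme (a sketch only, with
invented paths; I'm not assuming this is what you do): derive the
directories from a hash of the id, so no single directory grows too
large.

    use strict;
    use warnings;
    use Digest::MD5 qw(md5_hex);
    use File::Path qw(mkpath);

    # Map an id like 12345 to a path like /var/photos/a3/f9/12345.jpg,
    # so each directory stays small even with millions of files overall.
    sub photo_path {
        my ($id) = @_;
        my $hash = md5_hex($id);
        my ($d1, $d2) = (substr($hash, 0, 2), substr($hash, 2, 2));
        my $dir  = "/var/photos/$d1/$d2";
        mkpath($dir) unless -d $dir;
        return "$dir/$id.jpg";
    }

Two levels of two hex digits give 65,536 leaf directories, so a few
million photos works out to only a few dozen files per directory; and
since the path is derived from the id, the database only needs to store
the id.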
To get back to your issue of inconsistency: maybe the best strategy is
just to check for such inconsistencies as early as possible? For
example, when you add or delete objects, write that information to a
daily "transactions" file, which is then analysed at night by a job
that checks that everything is really where it is supposed to be, and
lets you know when it isn't.
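Such a job could be as simple as this (a sketch; the log format and
paths are invented):

    use strict;
    use warnings;

    # Each log line records what should have happened during the day:
    #   ADD /var/photos/a3/f9/12345.jpg
    #   DEL /var/photos/a3/f9/12344.jpg
    my $logfile = '/var/log/photos/today.txn';   # invented path
    open my $log, '<', $logfile
        or die "cannot open $logfile: $!";

    while (my $line = <$log>) {
        chomp $line;
        my ($op, $path) = split ' ', $line, 2;
        if ($op eq 'ADD' and not -e $path) {
            warn "MISSING: $path was added but is not on disk\n";
        }
        elsif ($op eq 'DEL' and -e $path) {
            warn "STALE: $path was deleted but is still on disk\n";
        }
    }
    close $log;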
Regarding efficiency: when you think about it, a modern filesystem is
about the fastest, most space-efficient and most reliable database
system one can think of, with the lowest overhead, as long as you
know the exact path of an object, and as long as all the directories in
the path are kept to a reasonable size (important). It has its inherent
buffering at various levels, optimised for file access. It has a whole
bunch of utilities to manipulate it; it is shareable, but can be locked
when you need it. It is portable.
It does have one inconvenience: it has a single "key" for accessing an
object (the path). But for that, you have your database system.
Oh, and I've thought of another advantage, in an Apache/web context: to
send the content of a file to a browser, you can take advantage of the
sendfile() call, which is very efficient. Now if your file is a blob in
a row of a database, you have to read it into memory yourself and send
it, don't you?
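Under mod_perl 2, for example, a response handler can hand the file
straight to Apache (a sketch; the handler name and hard-coded path are
invented, and sendfile support must be enabled in the server):

    package My::PhotoHandler;   # hypothetical handler name
    use strict;
    use warnings;
    use Apache2::RequestRec ();
    use Apache2::RequestIO ();  # provides $r->sendfile
    use Apache2::Const -compile => qw(OK NOT_FOUND);

    sub handler {
        my $r = shift;
        # In real life, look the path up in the database; hard-coded here.
        my $path = '/var/photos/a3/f9/12345.jpg';
        return Apache2::Const::NOT_FOUND unless -e $path;
        $r->content_type('image/jpeg');
        $r->sendfile($path);    # kernel sendfile(2) where available
        return Apache2::Const::OK;
    }
    1;

Configured as a PerlResponseHandler, the copy then happens in kernel
space where the platform supports it, without the file's contents ever
passing through your Perl code.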
I've just re-convinced myself not to change our storage structure.