Tom Lane <t...@sss.pgh.pa.us> wrote:
 
> Now the argument against that is that it won't scale terribly well
> to situations with very large numbers of blobs.  However, I'm not
> convinced that the current approach of cramming them all into one
> TOC entry scales so well either.  If your large objects are
> actually large, there's not going to be an enormous number of
> them.  We've heard of people with many tens of thousands of
> tables, and pg_dump speed didn't seem to be a huge bottleneck for
> them (at least not in recent versions).  So I'm feeling we should
> not dismiss the idea of one TOC entry per blob.
> 
> Thoughts?
 
We've got a "DocImage" table with about 7 million rows storing PDF
documents in a bytea column, approaching 1 TB of data.  (We don't
want to give up ACID guarantees, replication, etc. by storing the
files on the file system with only their names in the database.)
This works pretty well, except that client software occasionally
runs out of RAM, since a bytea value has to be materialized in
full when it is retrieved.  The interface could arguably be
cleaner if we used BLOBs, but the security issues around large
objects have precluded that in PostgreSQL.
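
For what it's worth, here is a minimal sketch of the chunked-read
approach a client can use to keep its memory use bounded; the
"docId" and "image" column names are made up for illustration and
are not our actual schema:

-- How big is the document?  (substring() on bytea is 1-based)
select octet_length("image") from "DocImage" where "docId" = 42;

-- Fetch it 64 kB at a time; chunk n starts at byte (n-1)*65536 + 1.
select substring("image" from 1 for 65536)
  from "DocImage" where "docId" = 42;
select substring("image" from 65537 for 65536)
  from "DocImage" where "docId" = 42;
-- ...and so on, until octet_length() bytes have been read.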
 
I suspect that 7 million BLOBs (and growing fast) would be a
problem for the one-TOC-entry-per-blob approach.  Of course, if
we're atypical, we could simply stay with bytea even if pg_dump
changed in this way.  Just a data point.
 
-Kevin
 
cir=> select count(*) from "DocImage";
  count
---------
 6891626
(1 row)

cir=> select pg_size_pretty(pg_total_relation_size('"DocImage"'));
 pg_size_pretty
----------------
 956 GB
(1 row)
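
For scale, that works out to an average of roughly 145 kB per
document (956 GB / 6,891,626 rows).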

