* Jon Stewart: >> BYTEA is slower to load and a bit inconvenient to use from DBI, but >> occupies less space on disk than TEXT or VARCHAR in hex form (17 vs 33 >> bytes with PostgreSQL 8.3).
> Can you clarify the "slower to load" point? Where is that pain point > in the postgres architecture? COPY FROM needs to read 2.5 bytes on average, instead 2, and a complex form of double-decoding is necessary. > Storing the values in binary makes intuitive sense to me since the > data is twice as dense, thus getting you more bang for the buck on > comparisons, caching, and streaming reads. I'm not too concerned about > raw convenience, as there's not going to be a lot of code around my > application. The main issue is that you can't use the parameter-providing version of $sth->execute (or things like $sth->selectarray, $sth->do), you must use explicit binding by parameter index in order to specify the type information. > The idea is that you have named sets of hash values, and hash values > can be in multiple sets. The ID step is only going to help you if your sets are very large and you use certain types of joins, I think. So it's better to denormalize in this case (if that's what you were alluding to in your original post). > The big operations will be to calculate the unions, intersections, and > differences between sets. That is, I'll load a new set into the > database and then see whether it has anything in common with another > set (probably throw the results into a temp table and then dump it > out). In this case, PostgreSQL's in-memory bitmap indices should give you most of the effect of your hash <-> ID mapping anyway. > I will also periodically run queries to determine the size of > the intersection of two sets for all pairs of sets (in order to > generate some nice graphs). I think it's very difficult to compute that efficiently, but I haven't thought much about it. This type of query might benefit from your hash <-> ID mapping, however, because the working set is smaller. -- Florian Weimer <[EMAIL PROTECTED]> BFK edv-consulting GmbH http://www.bfk.de/ Kriegsstraße 100 tel: +49-721-96201-1 D-76133 Karlsruhe fax: +49-721-96201-99 -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance