[ Please continue all discussion regarding this thread off list;
continuing it on list serves no purpose, IMHO ]
I manage a production db (non-dbmail) of 30 million rows (largest
table has 19 mil rows) spanning 21 gigs.
When I worked at Cisco, we had a contract with MySQL AB and routinely
saw MySQL crash on several of our busy (but noncritical) hosts at
least once every two weeks. MySQL acknowledged the problem but
couldn't suggest a fix, only a workaround that required stopping the
database (kill -9 style), repairing the tables, and moving on with
life as if nothing had happened (never mind the cost of the outage
plus the lost data, which translated into repopulating the database,
all of which was quietly ignored even though it added up to roughly
4 hours of work for someone every two weeks... or ~$1200/mo). The
Oracle DBAs loved this and I don't blame them (we joked about adding
the slogan, "MySQL: corrupting data the fastest since 1998"). The
fact that you haven't run across this yet means you're lucky, not
that MySQL is immune or doesn't have these problems. :)
You need to modify avg_row_size to increase the default table row
limit.
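For what it's worth, that refers to MyISAM's MAX_ROWS and
AVG_ROW_LENGTH table options, which size the table's internal row
pointers and lift the default ~4GB table cap; the table name here is
just an example:

    ALTER TABLE messages MAX_ROWS=1000000000 AVG_ROW_LENGTH=250;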
Other than basic tuning to let PostgreSQL know the type of iron you're
running on, you don't need to adjust anything to grow your database to
this size. This, however, is the step that most people fail to do
correctly.
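By "basic tuning" I mean a handful of postgresql.conf settings; the
numbers below are only a rough sketch for a box with about 1GB of RAM,
not a recommendation for your particular hardware:

    shared_buffers = 8192          # in 8KB pages, ~64MB of shared cache
    sort_mem = 8192                # per-sort working memory, in KB
    effective_cache_size = 65536   # in 8KB pages, hint about the OS cache
    checkpoint_segments = 8        # more WAL segments for write-heavy loads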
If you want to prevent table locks on MySQL, InnoDB is available. The
problem that some do not point out is that transaction-safe tables
require 50%+ more hd space (the same applies if you move a MyISAM
table to pgsql).
This is flat-out false information and probably even wrong in the
InnoDB case. In PostgreSQL, maybe you're referring to the checkpoint
segments? Regardless, by default that number is set to 3, and each
checkpoint segment is 16MB in size. A small price to pay for ACID
compliance. As for the actual storage size of a column, PostgreSQL
stores TEXT data as 4 bytes + LENGTH(message). A row is ~28 bytes of
overhead plus the size of all columns. If a database is freshly
vacuumed, there is no extra space requirement. There is, however, a
hard drive space requirement while data is changing, in the form of
the WAL logs. The WAL only grows as large as the data changed in a
single transaction before a COMMIT; it never grows to some fixed
percentage of the size of your tables, and the segments are constantly
being recycled. On my database, I have 256MB in WAL logs, but that
figure never grows. Again, a small price to pay for having ACID
compliance 24/7.
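To put a number on that formula: a row whose only big column is a 1KB
TEXT message costs roughly 28 + (4 + 1024) = 1056 bytes on disk before
any other columns, so a million such rows is on the order of 1GB.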
Which means the scan operations will be slower.
*If* a table has dead tuples in it, yes. But that assumes the
database isn't being tended to (vacuumed). In any well designed
application, indexes are placed on the appropriate columns and
sequential scanning isn't necessary.
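Concretely, that just means indexing the columns your queries filter
on and vacuuming regularly; the table and column names below are only
illustrative:

    -- index the column the application actually searches on
    CREATE INDEX messages_mailbox_idx ON messages (mailbox_idnr);
    -- reclaim dead tuples and refresh planner statistics
    VACUUM ANALYZE messages;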
For me, I move the high-transaction, low-storage tables to InnoDB and
the large, low-transaction tables to the slim MyISAM format.
This, however, may be good advice for the folks using MySQL.
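For anyone doing that split, it's a per-table ALTER in MySQL; the
table names are just examples, and newer MySQL spells TYPE as ENGINE:

    ALTER TABLE sessions TYPE=InnoDB;   # busy, transactional table
    ALTER TABLE archive  TYPE=MyISAM;   # large, mostly-read table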
Don't dog MySQL, because if you google PgSQL it has just as many
problems. =)
It doesn't, as a matter of fact, and most of these "problems" are
inaccurate FUD from ten years ago, when PostgreSQL was Postgres95 or
earlier. If you have a specific problem, please let me know what it is
off list and I'll let you know whether it really is a problem. Most of
the time it's a lack of understanding on the part of someone coming
from MySQL to PostgreSQL (and there are many such misunderstandings):
something has a different name, or there's a limitation because
PostgreSQL is ACID compliant all the time and supports transactions
all of the time. mysql.com is full of inaccuracies that the MySQL
folks have refused to update because it aids their point of view that
MySQL is the only open source "database" that people should use, never
mind that the benchmarks used by MySQL and crew are done on untuned
PostgreSQL databases. I used MySQL, and used it hard, back in the day,
and I know full well what it can and can't do and what MySQL says
about PostgreSQL. Unfortunately I fell for their propaganda and regret
having done so.
-sc
--
Sean Chittenden