Jonah H. Harris wrote:
On 9/23/07, Carlos Moreno <[EMAIL PROTECTED]> wrote:
Wait a second --- am I correct in understanding then that the bloating
you guys are referring to occurs *in memory*??

No, bloating occurs on-disk; but this does affect memory.  Bloat means
that even though your table data may take up 1G after the initial
load, due to poor vacuuming, table layouts, etc. it to equal something
more... say 2G.

The thing is, even though the table only stores 1G of data, it is now
physically 2G.  So, anything that would need to read the entire table
(like COUNT(*)), or large sections of it sequentially, are performing
twice as many I/Os to do so.

OK --- that was my initial impression...  But again, then I'm still puzzled
about why *the second time* that I load the query, it still take a few seconds.

That is:  the first time I run the query, it has to go through the disk;
in the normal case it would have to read 100MB of data, but due to bloating,
it actually has to go through 2GB of data.   Ok, but then, it will load
only 100MB  (the ones that are not "uncollected disk garbage") to memory.
The next time that I run the query, the server would only need to read
100MB from memory --- the result should be instantaneous...

The behaviour I observed was:  first time I run the query took over one
minute; second time, a little above two seconds.  Tried four or five times
more;  in every instance it was around 2 seconds.   On the new server, *the
first time* I run the query, it takes *no time* (I repeat: *no time* --- as
in perhaps 10 to 100 msec;  in any case, my eyes could not resolve between
the moment I hit enter and the moment I see the result with the count of
rows --- that's between one and two orders of magnitude faster than with the old server --- and again, we're comparing *the first* time I execute the query
on the new machine, in which case it is expected that it would have to read
from disk, compared to the second and subsequent times that I execute it on
the old machine, in which case, since the bloating does not occur in memory,
the entire seq. scan should occur exclusively in memory ... )

That's what still puzzles me --- Alvaro's reply seemed to explain it if I
accept that the bloating affects memory (dead tuples loaded to memory reduce
the capacity to load the entire dataset into memory)...

Someone could shed some light and point out if there's still something I'm
missing or some other mistake in my analysis??   Hope I'm not sounding like
I'm being dense!!

Thanks,

Carlos
--



---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at

               http://www.postgresql.org/about/donate

Reply via email to