Hello Scott,

On 07/06/2004 4:47 PM, Scott Marlowe wrote in his message:

OK, you have to ask yourself a question here. Do I have enough memory
to let both postgresql and the kernel cache this data, or enough
memory for only one? Then you pick one and try it out. But there are
some issues here. PostgreSQL's shared buffers are not, and should not
generally be, thought of as "cache". A cache's job is to hold the whole
working set, or as much of it as possible, ready for access. A buffer's job
is to hold all the data we're tossing around right this second. Once
we're done with the data, the buffers can and do just drop whatever was
in them. PostgreSQL does not have caching in the classical sense. That may or may not change.


The kernel, on the other hand, has both cache and buffers. Ever notice
that a Linux top shows the cache usually being much bigger than the
buffers? My 512 Meg home box right now has 252968k for cache, and
43276k for buffers.

I noticed the buffers are lower than the cache, at least as top shows; correct me if I'm wrong:


8:28am up 1:00, 2 users, load average: 0.40, 0.97, 0.75
65 processes: 64 sleeping, 1 running, 0 zombie, 0 stopped
CPU0 states: 0.0% user, 0.0% system, 0.0% nice, 100.0% idle
CPU1 states: 0.0% user, 0.0% system, 0.0% nice, 100.0% idle
CPU2 states: 0.0% user, 0.1% system, 0.0% nice, 99.4% idle
CPU3 states: 0.0% user, 0.0% system, 0.0% nice, 100.0% idle
Mem: 2069596K av, 1882228K used, 187368K free, 0K shrd, 32924K buff
Swap: 2096440K av, 0K used, 2096440K free 1757220K cached


  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
 1508 root      13   0  1040 1040   856 R     0.1  0.0   0:00 top
    1 root       8   0   476  476   432 S     0.0  0.0   0:04 init



Now, you're tossing around enough data to actually maybe have a use for
a huge set of buffers, but this means you'll need to starve your cache
to get enough buffers.  Which means that if one process does this kind
of join, drops its connection, and two seconds later another process
connects and does nearly the same thing, it's likely to have to read it
all from the hard drives again, as it's not in the PostgreSQL buffers,
and not in the kernel cache.

Starting a separate connection, doing a simple "select * from table1;
select * from table2;", discarding the result sets returned, and staying
connected seems to be enough to get 7.4 to hold onto the data.
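
For example, a minimal sketch of that warm-up session (the table names are
just the placeholders from above):

    -- open a connection, touch both tables, and discard the results
    SELECT * FROM table1;
    SELECT * FROM table2;
    -- then stay connected; dropping the connection gives up the buffers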

PostgreSQL's current buffer management algo is dirt simple.  The
algorithms behind the kernel's cache are quite good.  So you can quickly
reach a point where PostgreSQL is chasing its tail where the kernel
would have done OK.

Your numbers show that you are tossing 659M and 314M against each other,
but I don't know if you're harvesting the whole set at once, or just a
couple of rows from each.  Does indexing help, or is this always gonna be
a big seq scan of 90% of both tables?

Generally only a small set is queried; the biggest record set expected is
about 24,000 rows and does not exceed 10MB in size. explain analyze shows
the planner is using the index as expected, but performance is still poor.


If you are getting the whole thing all the time, and want postgresql to
buffer the whole thing (I recommend against it, although a very few
circumstances seem to support it) you need to have 973M of buffers.  That
would be 124544 buffers, or we'll just call it 130000.  A number this high
means you will be giving more than 50% of the RAM to PostgreSQL.  At
that point, it seems you might as well go for broke and grab most of it,
~200000 or so.
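
To spell out the arithmetic: with the default 8k buffer pages, 973M works
out to 973 * 1024 / 8 = 124544 buffers. A sketch of the corresponding
postgresql.conf line (the exact value is illustrative, not a recommendation):

    # in 7.4 this is a count of 8k buffers, so 130000 is roughly 1 gig
    shared_buffers = 130000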

If you're not always mushing the two things against each other, and
you've got other datasets to interact with, index it.
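
A minimal sketch of that, with hypothetical table and column names:

    -- index the join column so the planner can skip the big seq scan
    CREATE INDEX table1_keycol_idx ON table1 (keycol);
    -- refresh the statistics so the planner knows about the data
    ANALYZE table1;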

Oh, in your reply you might want to include an explain analyze of the query,
and maybe an output of top while the query is running.


dbmund=# explain analyze select * from vkardex where kprocode='1013';
 Nested Loop  (cost=0.00..2248.19 rows=403 width=114) (actual time=846.318..16030.633 rows=3145 loops=1)
   ->  Index Scan using pkd_pcode_idx on pkardex  (cost=0.00..806.27 rows=403 width=72) (actual time=0.054..87.393 rows=3544 loops=1)
         Index Cond: ((pkd_pcode)::text = '1013'::text)
   ->  Index Scan using pdc_pk_idx on pmdoc  (cost=0.00..3.55 rows=1 width=50) (actual time=4.482..4.484 rows=1 loops=3544)
         Index Cond: (pmdoc.pdc_pk = "outer".doctofk)
 Total runtime: 16033.807 ms
(6 rows)


At the time the query was running, top returned:
5:11pm up 1:28, 3 users, load average: 0.19, 0.97, 1.41
69 processes: 66 sleeping, 1 running, 2 zombie, 0 stopped
CPU0 states: 0.0% user, 0.0% system, 0.0% nice, 100.0% idle
CPU1 states: 0.0% user, 0.0% system, 0.0% nice, 100.0% idle
CPU2 states: 0.1% user, 0.4% system, 0.0% nice, 98.4% idle
CPU3 states: 0.0% user, 0.0% system, 0.0% nice, 100.0% idle
Mem: 2069596K av, 1477784K used, 591812K free, 0K shrd, 2336K buff
Swap: 2096440K av, 9028K used, 2087412K free 1388372K cached


  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
 1225 postgres  17   0  257M 257M  255M S     0.6 12.7   7:14 postmaster
 1978 postgres  11   0  1044 1044   860 R     0.2  0.0   0:00 top
    1 root       9   0   472  444   428 S     0.0  0.0   0:04 init
    2 root       8   0     0    0     0 SW    0.0  0.0   0:00 keventd

and free returned:
/root: free
             total       used       free     shared    buffers     cached
Mem:       2069596    1477832     591764          0       2320    1388372
-/+ buffers/cache:      87140    1982456
Swap:      2096440       9028    2087412


Thanks,


--
Sincerely,
Josué Maldonado.
"True pleasure lies in the search rather than in the explanation." -- Isaac Asimov

