On Fri, 2005-08-19 at 12:18 -0500, John A Meinel wrote: > Jeremiah Jahn wrote: > > Sorry about the formatting. > > > > On Thu, 2005-08-18 at 12:55 -0500, John Arbash Meinel wrote: > > > >>Jeremiah Jahn wrote: > >> > >> > > ... > > >>The expensive parts are the 4915 lookups into the litigant_details (each > >>one takes approx 4ms for a total of ~20s). > >>And then you do it again on case_data (average 3ms each * 4906 loops = > >>~15s). > > > > Is there some way to avoid this? > > > > Well, in general, 3ms for a single lookup seems really long. Maybe your > index is bloated by not vacuuming often enough. Do you tend to get a lot > of updates to litigant_details? I have vacuumed this already. I get lots of updates, but this data is mostly unchanging.
> > There are a couple possibilities at this point. First, you can REINDEX > the appropriate index, and see if that helps. However, if this is a test > box, it sounds like you just did a dump and reload, which wouldn't have > bloat in an index. I loaded it using slony > > Another possibility. Is this the column that you usually use when > pulling information out of litigant_details? If so, you can CLUSTER > litigant_details on the appropriate index. This will help things be > close together that should be, which decreases the index lookup costs. clustering on this right now. Most of the other things are already clustered. name and case_data > > However, if this is not the common column, then you probably will slow > down whatever other accesses you may have on this table. > > After CLUSTER, the current data will stay clustered, but new data will > not, so you have to continually CLUSTER, the same way that you might > VACUUM. *However*, IIRC CLUSTER grabs an Exclusive lock, so it is as > expensive as a VACUUM FULL. Be aware of this, but it might vastly > improve your performance, so it would be worth it. I generally re-cluster once a week. > > > > > > >>So there is no need for preloading your indexes on the identity table. > >>It is definitely not the bottleneck. > >> > >>So a few design bits, which may help your database. > >>Why is "actor_id" a text field instead of a number? > > > > This is simply due to the nature of the data. > > > > I'm just wondering if changing into a number, and using a number->name > lookup would be faster for you. It may not be. In general, I prefer to > use numbers for references. I may be over paranoid, but I know that some > locales are bad with string -> string comparisons. And since the data in > your database is stored as UNICODE, I'm not sure if it has to do any > translating or not. Again, something to consider, it may not make any > difference. I don't believe so. I initialze the DB as 'lang=C'. I used to have the problem where things were being inited as en_US. this would prevent any text based index from working. This doesn't seem to be the case here, so I'm not worried about it. > > > > > >>You could try creating an index on "litigant_details (actor_id, > >>count_ori)" so that it can do just an index lookup, rather than an index > >>+ filter. > > > > I have one, but it doesn't seem to like to use it. Don't really need it > > though, I can just drop the court_id out of the query. It's redundant, > > since each actor_id is also unique in litigant details. I had run vac > > full and analyze but I ran them again anyway and the planning improved. > > However, my 14 disk raid 10 array is still slower than my 3 disk raid 5 > > on my production box. 46sec vs 30sec (with live traffic on the > > production) One of the strange things is that when I run the cat command > > on my index and tables that are "HOT" it has no effect on memory usage. > > Right now I'm running ext3 on LVM. I'm still in a position to redo the > > file system and everything. Is this a good way to do it or should I > > switch to something else? What about stripe and extent sizes...? kernel > > parameters to change? > > Well, the plans are virtually identical. There is one small difference > as to whether it joins against case_data or court first. But 'court' is > very tiny (small enough to use a seqscan instead of index scan) I'm a > little surprised with court being this small that it doesn't do > something like a hash aggregation, but court takes no time anyway. > > The real problem is that your nested loop index time is *much* slower. > > Devel: > -> Index Scan using lit_actor_speed on litigant_details > (cost=0.00..3.96 rows=1 width=81) > (actual time=4.788..4.812 rows=1 loops=5057) > > Production: > -> Index Scan using lit_actor_speed on litigant_details > (cost=0.00..5.63 rows=1 width=81) > (actual time=3.355..3.364 rows=1 loops=5057) > > Devel: > -> Index Scan using case_speed on case_data > (cost=0.00..3.46 rows=1 width=26) > (actual time=4.222..4.230 rows=1 loops=5052) > > Production: > -> Index Scan using case_data_pkey on case_data > (cost=0.00..5.31 rows=1 width=26) > (actual time=1.897..1.904 rows=1 loops=5052) > > Notice that the actual per-row cost is as much as 1/2 less than on your > devel box. > > As a test, can you do "time cat $index_file >/dev/null" a couple of > times. And then determine the MB/s. > Alternatively run vmstat in another shell. If the read/s doesn't change, > then you know the "cat" is being served from RAM, and thus it really is > cached. it's cached alright. I'm getting a read rate of about 150MB/sec. I would have thought is would be faster with my raid setup. I think I'm going to scrap the whole thing and get rid of LVM. I'll just do a straight ext3 system. Maybe that will help. Still trying to get suggestions for a stripe size. > > I can point you to REINDEX and CLUSTER, but if it is caching in ram, I > honestly can't say why the per loop would be that much slower. > Are both systems running the same postgres version? It sounds like it is > different (since you say something about switching to 8.0). These had little or no effect. The production machine is running 7.4 while the devel machine is running 8.0 > I doubt it, but you might try an 8.1devel version. > > ... > > >>Do you regularly vacuum analyze your tables? > >>Just as a test, try running: > >>set enable_nested_loop to off; > > > > not quite acceptable > > Total runtime: 221486.149 ms > > > > Well, the estimates are now at least closer (3k vs 5k instead of 1), and > it is still choosing nested loops. So they probably are faster. > I would still be interested in the actual EXPLAIN ANALYZE with nested > loops disabled. It is possible that *some* of the nested loops are > performing worse than they have to. this is a cached version. > copa=> explain analyze select > full_name,identity_id,identity.case_id,court.id,date_of_birth,assigned_case_role,litigant_details.impound_litigant_data > copa-> from identity > copa-> join litigant_details on identity.actor_id = litigant_details.actor_id > copa-> join case_data on litigant_details.case_id = case_data.case_id and > litigant_details.court_ori = case_data.court_ori > copa-> join court on identity.court_ori = court.id > copa-> where identity.court_ori = 'IL081025J' and full_name like 'SMITH%' > order by full_name; > > QUERY PLAN > ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- > Sort (cost=100502560.72..100502583.47 rows=9099 width=86) (actual > time=17843.876..17849.401 rows=8094 loops=1) > Sort Key: identity.full_name > -> Merge Join (cost=100311378.72..100501962.40 rows=9099 width=86) > (actual time=15195.816..17817.847 rows=8094 loops=1) > Merge Cond: ((("outer".court_ori)::text = "inner"."?column10?") AND > (("outer".case_id)::text = "inner"."?column11?")) > -> Index Scan using case_speed on case_data (cost=0.00..170424.73 > rows=3999943 width=26) (actual time=0.015..4540.525 rows=3018284 loops=1) > -> Sort (cost=100311378.72..100311400.82 rows=8839 width=112) > (actual time=9594.985..9601.174 rows=8094 loops=1) > Sort Key: (litigant_details.court_ori)::text, > (litigant_details.case_id)::text > -> Nested Loop (cost=100002491.43..100310799.34 rows=8839 > width=112) (actual time=6892.755..9555.828 rows=8094 loops=1) > -> Seq Scan on court (cost=0.00..3.29 rows=1 width=12) > (actual time=0.085..0.096 rows=1 loops=1) > Filter: ('IL081025J'::text = (id)::text) > -> Merge Join (cost=2491.43..310707.66 rows=8839 > width=113) (actual time=6892.656..9519.680 rows=8094 loops=1) > Merge Cond: (("outer".actor_id)::text = > "inner"."?column7?") > -> Index Scan using lit_actor_speed on > litigant_details (cost=0.00..295722.00 rows=4956820 width=81) (actual > time=0.027..5613.814 rows=3736703 loops=1) > -> Sort (cost=2491.43..2513.71 rows=8913 > width=82) (actual time=116.071..122.272 rows=8100 loops=1) > Sort Key: (identity.actor_id)::text > -> Index Scan using name_speed on identity > (cost=0.00..1906.66 rows=8913 width=82) (actual time=0.133..81.104 rows=8100 > loops=1) > Index Cond: (((full_name)::text >= > 'SMITH'::character varying) AND ((full_name)::text < 'SMITI'::character > varying)) > Filter: (((court_ori)::text = > 'IL081025J'::text) AND ((full_name)::text ~~ 'SMITH%'::text)) > Total runtime: 17859.917 ms > But really, you have worse index speed, and that needs to be figured out. > > John > =:-> -- Speak softly and carry a +6 two-handed sword. ---------------------------(end of broadcast)--------------------------- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match