On Fri, 2005-08-19 at 12:18 -0500, John A Meinel wrote:
> Jeremiah Jahn wrote:
> > Sorry about the formatting.
> > On Thu, 2005-08-18 at 12:55 -0500, John Arbash Meinel wrote:
> >>Jeremiah Jahn wrote:
> >>The expensive parts are the 4915 lookups into the litigant_details (each
> >>one takes approx 4ms for a total of ~20s).
> >>And then you do it again on case_data (average 3ms each * 4906 loops =
> > Is there some way to avoid this?
> Well, in general, 3ms for a single lookup seems really long. Maybe your
> index is bloated by not vacuuming often enough. Do you tend to get a lot
> of updates to litigant_details?
I have vacuumed this already. I get lots of updates, but this data is
> There are a couple possibilities at this point. First, you can REINDEX
> the appropriate index, and see if that helps. However, if this is a test
> box, it sounds like you just did a dump and reload, which wouldn't have
> bloat in an index.
I loaded it using slony
> Another possibility. Is this the column that you usually use when
> pulling information out of litigant_details? If so, you can CLUSTER
> litigant_details on the appropriate index. This will help things be
> close together that should be, which decreases the index lookup costs.
clustering on this right now. Most of the other things are already
clustered. name and case_data
> However, if this is not the common column, then you probably will slow
> down whatever other accesses you may have on this table.
> After CLUSTER, the current data will stay clustered, but new data will
> not, so you have to continually CLUSTER, the same way that you might
> VACUUM. *However*, IIRC CLUSTER grabs an Exclusive lock, so it is as
> expensive as a VACUUM FULL. Be aware of this, but it might vastly
> improve your performance, so it would be worth it.
I generally re-cluster once a week.
> >>So there is no need for preloading your indexes on the identity table.
> >>It is definitely not the bottleneck.
> >>So a few design bits, which may help your database.
> >>Why is "actor_id" a text field instead of a number?
> > This is simply due to the nature of the data.
> I'm just wondering if changing into a number, and using a number->name
> lookup would be faster for you. It may not be. In general, I prefer to
> use numbers for references. I may be over paranoid, but I know that some
> locales are bad with string -> string comparisons. And since the data in
> your database is stored as UNICODE, I'm not sure if it has to do any
> translating or not. Again, something to consider, it may not make any
I don't believe so. I initialze the DB as 'lang=C'. I used to have the
problem where things were being inited as en_US. this would prevent any
text based index from working. This doesn't seem to be the case here, so
I'm not worried about it.
> >>You could try creating an index on "litigant_details (actor_id,
> >>count_ori)" so that it can do just an index lookup, rather than an index
> >>+ filter.
> > I have one, but it doesn't seem to like to use it. Don't really need it
> > though, I can just drop the court_id out of the query. It's redundant,
> > since each actor_id is also unique in litigant details. I had run vac
> > full and analyze but I ran them again anyway and the planning improved.
> > However, my 14 disk raid 10 array is still slower than my 3 disk raid 5
> > on my production box. 46sec vs 30sec (with live traffic on the
> > production) One of the strange things is that when I run the cat command
> > on my index and tables that are "HOT" it has no effect on memory usage.
> > Right now I'm running ext3 on LVM. I'm still in a position to redo the
> > file system and everything. Is this a good way to do it or should I
> > switch to something else? What about stripe and extent sizes...? kernel
> > parameters to change?
> Well, the plans are virtually identical. There is one small difference
> as to whether it joins against case_data or court first. But 'court' is
> very tiny (small enough to use a seqscan instead of index scan) I'm a
> little surprised with court being this small that it doesn't do
> something like a hash aggregation, but court takes no time anyway.
> The real problem is that your nested loop index time is *much* slower.
> -> Index Scan using lit_actor_speed on litigant_details
> (cost=0.00..3.96 rows=1 width=81)
> (actual time=4.788..4.812 rows=1 loops=5057)
> -> Index Scan using lit_actor_speed on litigant_details
> (cost=0.00..5.63 rows=1 width=81)
> (actual time=3.355..3.364 rows=1 loops=5057)
> -> Index Scan using case_speed on case_data
> (cost=0.00..3.46 rows=1 width=26)
> (actual time=4.222..4.230 rows=1 loops=5052)
> -> Index Scan using case_data_pkey on case_data
> (cost=0.00..5.31 rows=1 width=26)
> (actual time=1.897..1.904 rows=1 loops=5052)
> Notice that the actual per-row cost is as much as 1/2 less than on your
> devel box.
> As a test, can you do "time cat $index_file >/dev/null" a couple of
> times. And then determine the MB/s.
> Alternatively run vmstat in another shell. If the read/s doesn't change,
> then you know the "cat" is being served from RAM, and thus it really is
it's cached alright. I'm getting a read rate of about 150MB/sec. I would
have thought is would be faster with my raid setup. I think I'm going to
scrap the whole thing and get rid of LVM. I'll just do a straight ext3
system. Maybe that will help. Still trying to get suggestions for a
> I can point you to REINDEX and CLUSTER, but if it is caching in ram, I
> honestly can't say why the per loop would be that much slower.
> Are both systems running the same postgres version? It sounds like it is
> different (since you say something about switching to 8.0).
These had little or no effect.
The production machine is running 7.4 while the devel machine is running
> I doubt it, but you might try an 8.1devel version.
> >>Do you regularly vacuum analyze your tables?
> >>Just as a test, try running:
> >>set enable_nested_loop to off;
> > not quite acceptable
> > Total runtime: 221486.149 ms
> Well, the estimates are now at least closer (3k vs 5k instead of 1), and
> it is still choosing nested loops. So they probably are faster.
> I would still be interested in the actual EXPLAIN ANALYZE with nested
> loops disabled. It is possible that *some* of the nested loops are
> performing worse than they have to.
this is a cached version.
> copa=> explain analyze select
> copa-> from identity
> copa-> join litigant_details on identity.actor_id = litigant_details.actor_id
> copa-> join case_data on litigant_details.case_id = case_data.case_id and
> litigant_details.court_ori = case_data.court_ori
> copa-> join court on identity.court_ori = court.id
> copa-> where identity.court_ori = 'IL081025J' and full_name like 'SMITH%'
> order by full_name;
> QUERY PLAN
> Sort (cost=100502560.72..100502583.47 rows=9099 width=86) (actual
> time=17843.876..17849.401 rows=8094 loops=1)
> Sort Key: identity.full_name
> -> Merge Join (cost=100311378.72..100501962.40 rows=9099 width=86)
> (actual time=15195.816..17817.847 rows=8094 loops=1)
> Merge Cond: ((("outer".court_ori)::text = "inner"."?column10?") AND
> (("outer".case_id)::text = "inner"."?column11?"))
> -> Index Scan using case_speed on case_data (cost=0.00..170424.73
> rows=3999943 width=26) (actual time=0.015..4540.525 rows=3018284 loops=1)
> -> Sort (cost=100311378.72..100311400.82 rows=8839 width=112)
> (actual time=9594.985..9601.174 rows=8094 loops=1)
> Sort Key: (litigant_details.court_ori)::text,
> -> Nested Loop (cost=100002491.43..100310799.34 rows=8839
> width=112) (actual time=6892.755..9555.828 rows=8094 loops=1)
> -> Seq Scan on court (cost=0.00..3.29 rows=1 width=12)
> (actual time=0.085..0.096 rows=1 loops=1)
> Filter: ('IL081025J'::text = (id)::text)
> -> Merge Join (cost=2491.43..310707.66 rows=8839
> width=113) (actual time=6892.656..9519.680 rows=8094 loops=1)
> Merge Cond: (("outer".actor_id)::text =
> -> Index Scan using lit_actor_speed on
> litigant_details (cost=0.00..295722.00 rows=4956820 width=81) (actual
> time=0.027..5613.814 rows=3736703 loops=1)
> -> Sort (cost=2491.43..2513.71 rows=8913
> width=82) (actual time=116.071..122.272 rows=8100 loops=1)
> Sort Key: (identity.actor_id)::text
> -> Index Scan using name_speed on identity
> (cost=0.00..1906.66 rows=8913 width=82) (actual time=0.133..81.104 rows=8100
> Index Cond: (((full_name)::text >=
> 'SMITH'::character varying) AND ((full_name)::text < 'SMITI'::character
> Filter: (((court_ori)::text =
> 'IL081025J'::text) AND ((full_name)::text ~~ 'SMITH%'::text))
> Total runtime: 17859.917 ms
> But really, you have worse index speed, and that needs to be figured out.
Speak softly and carry a +6 two-handed sword.
---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not