On Cum, 2005-08-26 at 19:31 -0400, Tom Lane wrote:
> I think the problem is that the planner is underestimating the cost of
> evaluating this complicated filter condition:
> 
> >                 ->  Seq Scan on scf_irsaliye irs  (cost=0.00..30.00 rows=1 
> > width=20) (actual time=0.290..0.290 rows=0 loops=10000)
> >                       Filter: (((karsifirma)::text = 'KENDI'::text) AND 
> > (((turu)::text = 'MAI'::text) OR ((turu)::text = 'KGI'::text) OR 
> > ((turu)::text = 'PS'::text) OR ((turu)::text = 'TS'::text) OR ((turu)::text 
> > = 'KC'::text) OR ((turu)::text = 'KCO'::text)) AND 
> > (((_key_sis_depo_dest)::text = '$$$$0000003l$1$$'::text) OR 
> > ((_key_sis_depo_dest)::text = '$$$$00000048$1$$'::text) OR 
> > ((_key_sis_depo_dest)::text = '$$$$0000004b$1$$'::text) OR 
> > ((_key_sis_depo_dest)::text = '$$$$0000004d$1$$'::text)) AND (tarih <= 
> > '2005-08-26'::date))
> 
> While you could attack that by raising the cpu_operator_cost parameter,
> it would also be worth inquiring *why* the condition is so expensive to
> evaluate.  I am suspicious that you are running the database in a locale
> in which strcoll() is really slow.  Can you run it in C locale instead,
> or do you really need locale-aware behavior?  Can you switch to a
> different database encoding?  (A single-byte encoding such as Latin1
> might be faster than UTF8, for example.)

Yes, you are perfectly right. We are using UTF8 and tr_TR.UTF8 locale.
However, I tried the same tests with latin1 and C locale, it is surely
faster, but not dramatically. i.e.:

                      Before Vacuum    After Vacuum
UTF8 and tr_TR.UTF8:       ~8 s            ~110 s
latin1 and C:              ~7 s             ~65 s

I also played with cpu_operator_cost parameter and it dramatically
reduced query times, but not to the level before vacuum:

                      Before Vacuum    After Vacuum
UTF8 and tr_TR.UTF8:       ~8 s            ~11 s
latin1 and C:              ~7 s             ~9 s

These values are much better but I really wonder if I can reach the
performance levels before vacuum. I am also worried about the
side-effects that may be caused by the non-default cpu_operator_cost
parameter.

> Another possibility is to take a hard look at whether you can't simplify
> the filter condition, but that'd require more knowledge of your
> application than I have.

Yes that is another option, we are even considering schema changes to
use less character types, but these are really costly and error-prone
operations at the moment.

> Or you could just play with the order of the filter conditions ... for
> example, the date condition at the end is probably far cheaper to test
> than the text comparisons, so if that's fairly selective it'd be worth
> putting it first.

We are experimenting on this.

Thanks your help!

Best Regards,
Umit Oztosun


---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

               http://archives.postgresql.org

Reply via email to