Re: [HACKERS] Very ineffective plan with merge join
On Thu, 15 Apr 2010, Kevin Grittner wrote: Tom Lane t...@sss.pgh.pa.us wrote: I'm not sure how much it would help to increase the statistics targets, but that would be worth trying. Setting statistics to 1000 helps for that particular reduced query, but full query (attached) is out of luck. I notice that the scan rowcount estimates are very accurate, there's that one hash join result that's way off, though. What's up with the sort of _accrged7200 (in the slower plan) taking in 3.5 million rows and putting out 1 row? There's something there I'm not understanding. -Kevin Regards, Oleg _ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83--set enable_mergejoin to off; explain analyze SELECT _V8TblAli1_Q_000_T_001._AccountRRef AS f_3, _V8TblAli1_Q_000_T_001._Value1_TYPE AS f_4, _V8TblAli1_Q_000_T_001._Value1_RTRef AS f_5, _V8TblAli1_Q_000_T_001._Value1_RRRef AS f_6, _V8TblAli1_Q_000_T_001._Value2_TYPE AS f_7, _V8TblAli1_Q_000_T_001._Value2_RTRef AS f_8, _V8TblAli1_Q_000_T_001._Value2_RRRef AS f_9, _V8TblAli1_Q_000_T_001._Value3_TYPE AS f_10, _V8TblAli1_Q_000_T_001._Value3_RTRef AS f_11, _V8TblAli1_Q_000_T_001._Value3_RRRef AS f_12, 0 AS f_13, 0 AS f_14, 0 AS f_15, 0 AS f_16, SUM(_V8TblAli1_Q_000_T_001._Fld7178TurnoverCt) AS f_17, SUM(_V8TblAli1_Q_000_T_001._Fld7180TurnoverCt) AS f_18, SUM(_V8TblAli1_Q_000_T_001._Fld7178TurnoverCt) AS f_19, SUM(_V8TblAli1_Q_000_T_001._Fld7180TurnoverCt) AS f_20, SUM(_V8TblAli1_Q_000_T_001._Fld7178TurnoverCt) AS f_21, SUM(_V8TblAli1_Q_000_T_001._Fld7180TurnoverCt) AS f_22 FROM ( SELECT _V8TblAli1_R._Period AS _Period, _V8TblAli1_R._RecorderTRef AS _RecorderTRef, _V8TblAli1_R._RecorderRRef AS _RecorderRRef, _V8TblAli1_R._AccountRRef AS _AccountRRef, _V8TblAli1_R._Value1_TYPE AS _Value1_TYPE, _V8TblAli1_R._Value1_RTRef AS _Value1_RTRef, _V8TblAli1_R._Value1_RRRef AS _Value1_RRRef, _V8TblAli1_R._Value2_TYPE AS _Value2_TYPE, _V8TblAli1_R._Value2_RTRef AS _Value2_RTRef, _V8TblAli1_R._Value2_RRRef AS _Value2_RRRef, _V8TblAli1_R._Value3_TYPE AS _Value3_TYPE, _V8TblAli1_R._Value3_RTRef AS _Value3_RTRef, _V8TblAli1_R._Value3_RRRef AS _Value3_RRRef, CASE WHEN SUM(_V8TblAli1_R._Fld7178TurnoverCt) IS NULL THEN CAST(0 AS NUMERIC(22,2)) ELSE SUM(_V8TblAli1_R._Fld7178TurnoverCt) END AS _Fld7178TurnoverCt, CASE WHEN SUM(_V8TblAli1_R._Fld7180TurnoverCt) IS NULL THEN CAST(0 AS NUMERIC(22,3)) ELSE SUM(_V8TblAli1_R._Fld7180TurnoverCt) END AS _Fld7180TurnoverCt FROM ( SELECT _AccRg7175_R._Period AS _Period, _AccRg7175_R._RecorderTRef AS _RecorderTRef, _AccRg7175_R._RecorderRRef AS _RecorderRRef, _AccRg7175_R._AccountDtRRef AS _AccountRRef, CASE WHEN RD3722586b56744409b0f43cff4e9ed7a9_RD._ED1 = TRUE THEN CASE WHEN _AccRgED7200_TED1._Value_TYPE = '\\001'::bytea OR _AccRgED7200_TED1._Value_TYPE = '\\010'::bytea AND _AccRgED7200_TED1._Value_RTRef IS NOT NULL AND _AccRgED7200_TED1._Value_RRRef IS NOT NULL THEN _AccRgED7200_TED1._Value_TYPE ELSE NULL END ELSE CASE WHEN CAST(NULL AS BYTEA) = '\\001'::bytea OR CAST(NULL AS BYTEA) = '\\010'::bytea AND CAST(NULL AS BYTEA) IS NOT NULL THEN CAST(NULL AS BYTEA) ELSE NULL END END AS _Value1_TYPE, CASE WHEN RD3722586b56744409b0f43cff4e9ed7a9_RD._ED1 = TRUE THEN CASE WHEN _AccRgED7200_TED1._Value_TYPE = '\\010'::bytea THEN _AccRgED7200_TED1._Value_RTRef WHEN _AccRgED7200_TED1._Value_TYPE IS NULL OR _AccRgED7200_TED1._Value_TYPE NOT IN ('\\001'::bytea,'\\010'::bytea) THEN NULL ELSE '\\000\\000\\000\\000'::bytea END ELSE CASE WHEN CAST(NULL AS BYTEA) = '\\010'::bytea THEN CAST(NULL AS BYTEA) WHEN CAST(NULL AS BYTEA) IS NULL OR CAST(NULL AS BYTEA) NOT IN ('\\001'::bytea,'\\010'::bytea) THEN NULL ELSE '\\000\\000\\000\\000'::bytea END END AS _Value1_RTRef, CASE WHEN RD3722586b56744409b0f43cff4e9ed7a9_RD._ED1 = TRUE THEN CASE WHEN _AccRgED7200_TED1._Value_TYPE = '\\010'::bytea THEN _AccRgED7200_TED1._Value_RRRef WHEN _AccRgED7200_TED1._Value_TYPE IS NULL OR _AccRgED7200_TED1._Value_TYPE NOT IN ('\\001'::bytea,'\\010'::bytea) THEN NULL ELSE '\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000'::bytea END ELSE CASE WHEN CAST(NULL AS BYTEA) = '\\010'::bytea THEN CAST(NULL AS BYTEA) WHEN CAST(NULL AS BYTEA) IS NULL OR CAST(NULL AS BYTEA) NOT IN ('\\001'::bytea,'\\010'::bytea) THEN NULL ELSE '\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000'::bytea END END AS _Value1_RRRef, CASE WHEN RD3722586b56744409b0f43cff4e9ed7a9_RD._ED2 = TRUE THEN CASE WHEN _AccRgED7200_TED2._Value_TYPE = '\\001'::bytea OR _AccRgED7200_TED2._Value_TYPE = '\\010'::bytea AND _AccRgED7200_TED2._Value_RTRef IS NOT NULL AND _AccRgED7200_TED2._Value_RRRef IS NOT NULL THEN _AccRgED7200_TED2._Value_TYPE ELSE NULL END ELSE CASE WHEN CAST(NULL AS BYTEA) =
Re: [HACKERS] Very ineffective plan with merge join
On Thu, 15 Apr 2010, Tom Lane wrote: Oleg Bartunov o...@sai.msu.su writes: below is an example of interesting query and two plans - the bad plan, which uses merge join and big sorting, took 216 sec, and good plan with merge join disabled took 8 sec. The good plan seems to be fast mainly because of heavily cached inner indexscans. If that's the normal operating state for this database, you should try reducing random_page_cost. Hmm, reducing random_page_cost to 3 helps, now all plans are the same. Also, as Pavel noted, the sub-join size estimates aren't very good, and those overestimates are discouraging it from using inner-indexscan nestloops. I'm not sure how much it would help to increase the statistics targets, but that would be worth trying. Yes, setting statistics to 1000 helped for that paticular query (reduced by me), but full query still chooses wrong plan with merge join. As you say before, random_page_cost=3 helped. I'm wondering if postgres could recognize such case (heavily cached inner indexscans). Regards, Oleg _ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Very ineffective plan with merge join
Sorry, I used random_page_cost=2, while random_page_cost=3 didn't help. Oleg On Fri, 16 Apr 2010, Oleg Bartunov wrote: On Thu, 15 Apr 2010, Tom Lane wrote: Oleg Bartunov o...@sai.msu.su writes: below is an example of interesting query and two plans - the bad plan, which uses merge join and big sorting, took 216 sec, and good plan with merge join disabled took 8 sec. The good plan seems to be fast mainly because of heavily cached inner indexscans. If that's the normal operating state for this database, you should try reducing random_page_cost. Hmm, reducing random_page_cost to 3 helps, now all plans are the same. Also, as Pavel noted, the sub-join size estimates aren't very good, and those overestimates are discouraging it from using inner-indexscan nestloops. I'm not sure how much it would help to increase the statistics targets, but that would be worth trying. Yes, setting statistics to 1000 helped for that paticular query (reduced by me), but full query still chooses wrong plan with merge join. As you say before, random_page_cost=3 helped. I'm wondering if postgres could recognize such case (heavily cached inner indexscans). Regards, Oleg _ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83 Regards, Oleg _ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Very ineffective plan with merge join
Hi there, below is an example of interesting query and two plans - the bad plan, which uses merge join and big sorting, took 216 sec, and good plan with merge join disabled took 8 sec. Sorry for odd names, they were generated by popular accounting engine in Russia. 8.4.3 and HEAD show the same behaviour. The query: --set enable_mergejoin to off; explain analyze SELECT _V8TblAli1_Q_000_T_001._AccountRRef AS f_3, _V8TblAli1_Q_000_T_001._Value1_TYPE AS f_4, _V8TblAli1_Q_000_T_001._Value1_RTRef AS f_5, _V8TblAli1_Q_000_T_001._Value1_RRRef AS f_6, _V8TblAli1_Q_000_T_001._Value2_TYPE AS f_7, _V8TblAli1_Q_000_T_001._Value2_RTRef AS f_8, _V8TblAli1_Q_000_T_001._Value2_RRRef AS f_9, _V8TblAli1_Q_000_T_001._Value3_TYPE AS f_10, _V8TblAli1_Q_000_T_001._Value3_RTRef AS f_11, _V8TblAli1_Q_000_T_001._Value3_RRRef AS f_12, 0 AS f_13, 0 AS f_14, 0 AS f_15, 0 AS f_16, SUM(_V8TblAli1_Q_000_T_001._Fld7178TurnoverCt) AS f_17, SUM(_V8TblAli1_Q_000_T_001._Fld7180TurnoverCt) AS f_18, SUM(_V8TblAli1_Q_000_T_001._Fld7178TurnoverCt) AS f_19, SUM(_V8TblAli1_Q_000_T_001._Fld7180TurnoverCt) AS f_20, SUM(_V8TblAli1_Q_000_T_001._Fld7178TurnoverCt) AS f_21, SUM(_V8TblAli1_Q_000_T_001._Fld7180TurnoverCt) AS f_22 FROM ( SELECT _V8TblAli1_R._Period AS _Period, _V8TblAli1_R._RecorderTRef AS _RecorderTRef, _V8TblAli1_R._RecorderRRef AS _RecorderRRef, _V8TblAli1_R._AccountRRef AS _AccountRRef, _V8TblAli1_R._Value1_TYPE AS _Value1_TYPE, _V8TblAli1_R._Value1_RTRef AS _Value1_RTRef, _V8TblAli1_R._Value1_RRRef AS _Value1_RRRef, _V8TblAli1_R._Value2_TYPE AS _Value2_TYPE, _V8TblAli1_R._Value2_RTRef AS _Value2_RTRef, _V8TblAli1_R._Value2_RRRef AS _Value2_RRRef, _V8TblAli1_R._Value3_TYPE AS _Value3_TYPE, _V8TblAli1_R._Value3_RTRef AS _Value3_RTRef, _V8TblAli1_R._Value3_RRRef AS _Value3_RRRef, CASE WHEN SUM(_V8TblAli1_R._Fld7178TurnoverCt) IS NULL THEN CAST(0 AS NUMERIC(22,2)) ELSE SUM(_V8TblAli1_R._Fld7178TurnoverCt) END AS _Fld7178TurnoverCt, CASE WHEN SUM(_V8TblAli1_R._Fld7180TurnoverCt) IS NULL THEN CAST(0 AS NUMERIC(22,3)) ELSE SUM(_V8TblAli1_R._Fld7180TurnoverCt) END AS _Fld7180TurnoverCt FROM ( SELECT _AccRg7175_R._Period AS _Period, _AccRg7175_R._RecorderTRef AS _RecorderTRef, _AccRg7175_R._RecorderRRef AS _RecorderRRef, _AccRg7175_R._AccountDtRRef AS _AccountRRef, CASE WHEN RD3722586b56744409b0f43cff4e9ed7a9_RD._ED1 = TRUE THEN CASE WHEN _AccRgED7200_TED1._Value_TYPE = '\\001'::bytea OR _AccRgED7200_TED1._Value_TYPE = '\\010'::bytea AND _AccRgED7200_TED1._Value_RTRef IS NOT NULL AND _AccRgED7200_TED1._Value_RRRef IS NOT NULL THEN _AccRgED7200_TED1._Value_TYPE ELSE NULL END ELSE CASE WHEN CAST(NULL AS BYTEA) = '\\001'::bytea OR CAST(NULL AS BYTEA) = '\\010'::bytea AND CAST(NULL AS BYTEA) IS NOT NULL THEN CAST(NULL AS BYTEA) ELSE NULL END END AS _Value1_TYPE, CASE WHEN RD3722586b56744409b0f43cff4e9ed7a9_RD._ED1 = TRUE THEN CASE WHEN _AccRgED7200_TED1._Value_TYPE = '\\010'::bytea THEN _AccRgED7200_TED1._Value_RTRef WHEN _AccRgED7200_TED1._Value_TYPE IS NULL OR _AccRgED7200_TED1._Value_TYPE NOT IN ('\\001'::bytea,'\\010'::bytea) THEN NULL ELSE '\\000\\000\\000\\000'::bytea END ELSE CASE WHEN CAST(NULL AS BYTEA) = '\\010'::bytea THEN CAST(NULL AS BYTEA) WHEN CAST(NULL AS BYTEA) IS NULL OR CAST(NULL AS BYTEA) NOT IN ('\\001'::bytea,'\\010'::bytea) THEN NULL ELSE '\\000\\000\\000\\000'::bytea END END AS _Value1_RTRef, CASE WHEN RD3722586b56744409b0f43cff4e9ed7a9_RD._ED1 = TRUE THEN CASE WHEN _AccRgED7200_TED1._Value_TYPE = '\\010'::bytea THEN _AccRgED7200_TED1._Value_RRRef WHEN _AccRgED7200_TED1._Value_TYPE IS NULL OR _AccRgED7200_TED1._Value_TYPE NOT IN ('\\001'::bytea,'\\010'::bytea) THEN NULL ELSE '\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000'::bytea END ELSE CASE WHEN CAST(NULL AS BYTEA) = '\\010'::bytea THEN CAST(NULL AS BYTEA) WHEN CAST(NULL AS BYTEA) IS NULL OR CAST(NULL AS BYTEA) NOT IN ('\\001'::bytea,'\\010'::bytea) THEN NULL ELSE '\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000'::bytea END END AS _Value1_RRRef, CASE WHEN RD3722586b56744409b0f43cff4e9ed7a9_RD._ED2 = TRUE THEN CASE WHEN _AccRgED7200_TED2._Value_TYPE = '\\001'::bytea OR _AccRgED7200_TED2._Value_TYPE = '\\010'::bytea AND _AccRgED7200_TED2._Value_RTRef IS NOT NULL AND _AccRgED7200_TED2._Value_RRRef IS NOT NULL THEN _AccRgED7200_TED2._Value_TYPE ELSE NULL END ELSE CASE WHEN CAST(NULL AS BYTEA) = '\\001'::bytea OR CAST(NULL AS BYTEA) = '\\010'::bytea AND CAST(NULL AS BYTEA) IS NOT NULL THEN CAST(NULL AS BYTEA) ELSE NULL END END AS _Value2_TYPE, CASE WHEN RD3722586b56744409b0f43cff4e9ed7a9_RD._ED2 = TRUE THEN CASE WHEN _AccRgED7200_TED2._Value_TYPE = '\\010'::bytea THEN _AccRgED7200_TED2._Value_RTRef WHEN _AccRgED7200_TED2._Value_TYPE IS NULL OR _AccRgED7200_TED2._Value_TYPE NOT IN ('\\001'::bytea,'\\010'::bytea) THEN NULL ELSE '\\000\\000\\000\\000'::bytea END ELSE CASE WHEN CAST(NULL AS BYTEA) = '\\010'::bytea THEN CAST(NULL AS BYTEA) WHEN CAST(NULL AS BYTEA) IS NULL OR CAST(NULL AS BYTEA) NOT
Re: [HACKERS] Very ineffective plan with merge join
Oleg Bartunov o...@sai.msu.su wrote: Sorry for odd names, they were generated by popular accounting engine in Russia. How much of that can you trim out and still see the problem? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Very ineffective plan with merge join
On Thu, 15 Apr 2010, Kevin Grittner wrote: Oleg Bartunov o...@sai.msu.su wrote: Sorry for odd names, they were generated by popular accounting engine in Russia. How much of that can you trim out and still see the problem? It's difficult, since I don't know semantics of data. I reduced query, though. query: explain analyze SELECT _AccRg7175_R._Period AS _Period, _AccRg7175_R._RecorderTRef AS _RecorderTRef, _AccRg7175_R._RecorderRRef AS _RecorderRRef, _AccRg7175_R._AccountDtRRef AS _AccountRRef FROM _AccRg7175 _AccRg7175_R LEFT OUTER JOIN _Acc7_ExtDim7144 _Acc7_ExtDim7144_TEDAcc1 ON _Acc7_ExtDim7144_TEDAcc1._Acc7_IDRRef = _AccRg7175_R._AccountDtRRef AND _Acc7_ExtDim7144_TEDAcc1._LineNo = 1 LEFT OUTER JOIN _Acc7_ExtDim7144 _Acc7_ExtDim7144_TEDAcc2 ON _Acc7_ExtDim7144_TEDAcc2._Acc7_IDRRef = _AccRg7175_R._AccountDtRRef AND _Acc7_ExtDim7144_TEDAcc2._LineNo = 2 LEFT OUTER JOIN _Acc7_ExtDim7144 _Acc7_ExtDim7144_TEDAcc3 ON _Acc7_ExtDim7144_TEDAcc3._Acc7_IDRRef = _AccRg7175_R._AccountDtRRef AND _Acc7_ExtDim7144_TEDAcc3._LineNo = 3 LEFT OUTER JOIN _AccRgED7200 _AccRgED7200_TED3 ON _AccRgED7200_TED3._RecorderTRef = _AccRg7175_R._RecorderTRef AND _AccRgED7200_TED3._RecorderRRef = _AccRg7175_R._RecorderRRef AND _AccRgED7200_TED3._LineNo = _AccRg7175_R._LineNo AND _AccRgED7200_TED3._Period = _AccRg7175_R._Period AND _AccRgED7200_TED3._Correspond = 0 AND _AccRgED7200_TED3._KindRRef = _Acc7_ExtDim7144_TEDAcc3._DimKindRRef WHERE _AccRg7175_R._Active = TRUE AND _AccRg7175_R._AccountDtRRef IN (SELECT tt2._REFFIELDRRef AS f_1 FROM tt2) AND _AccRg7175_R._Period = '2009-10-01 00:00:00'::timestamp AND _AccRg7175_R._Period = '2009-10-31 23:59:59'::timestamp ; default plan: --- Hash Left Join (cost=762035.09..822357.84 rows=224988 width=56) (actual time=25007.488..25022.338 rows=9189 loops=1) Hash Cond: (_accrg7175_r._accountdtrref = _acc7_extdim7144_tedacc1._acc7_idrref) - Hash Left Join (cost=762017.69..819134.13 rows=153030 width=56) (actual time=25007.173..25017.249 rows=9189 loops=1) Hash Cond: (_accrg7175_r._accountdtrref = _acc7_extdim7144_tedacc2._acc7_idrref) - Merge Right Join (cost=762001.76..816793.89 rows=153030 width=56) (actual time=25006.895..25012.218 rows=9189 loops=1) Merge Cond: ((_accrged7200_ted3._lineno = _accrg7175_r._lineno) AND (_accrged7200_ted3._recordertref = _accrg7175_r._recordertref) AND (_accrged7200_ted3._recorderrref = _accrg7175_r._recorderrref) AND (_accrged7200_ted3._period = _accrg7175_r._period) AND (_accrged7200_ted3._kindrref = _acc7_extdim7144_tedacc3._dimkindrref)) - Sort (cost=694652.60..703399.93 rows=3498930 width=63) (actual time=24794.738..24794.738 rows=1 loops=1) Sort Key: _accrged7200_ted3._lineno, _accrged7200_ted3._recordertref, _accrged7200_ted3._recorderrref, _accrged7200_ted3._period, _accrged7200_ted3._kindrref Sort Method: external merge Disk: 230896kB - Seq Scan on _accrged7200 _accrged7200_ted3 (cost=0.00..182790.96 rows=3498930 width=63) (actual time=0.153..3018.474 rows=3526745 loops=1) Filter: (_correspond = 0::numeric) - Sort (cost=67344.64..67727.22 rows=153030 width=83) (actual time=212.145..213.289 rows=9189 loops=1) Sort Key: _accrg7175_r._lineno, _accrg7175_r._recordertref, _accrg7175_r._recorderrref, _accrg7175_r._period, _acc7_extdim7144_tedacc3._dimkindrref Sort Method: quicksort Memory: 1677kB - Hash Left Join (cost=10322.30..54166.12 rows=153030 width=83) (actual time=39.489..184.046 rows=9189 loops=1) Hash Cond: (_accrg7175_r._accountdtrref = _acc7_extdim7144_tedacc3._acc7_idrref) - Hash Join (cost=10308.08..52844.15 rows=153030 width=63) (actual time=39.256..180.388 rows=9189 loops=1) Hash Cond: (_accrg7175_r._accountdtrref = tt2._reffieldrref) - Bitmap Heap Scan on _accrg7175 _accrg7175_r (cost=10305.89..51535.65 rows=237384 width=63) (actual time=38.913..130.715 rows=235636 loops=1) Recheck Cond: ((_period = '2009-10-01 00:00:00'::timestamp without time zone) AND (_period = '2009-10-31 23:59:59'::timestamp without time zone)) Filter: _active - Bitmap Index Scan on _accntr7175_byperiod_trn (cost=0.00..10246.54 rows=237384 width=0) (actual time=37.281..37.281 rows=235636 loops=1)
Re: [HACKERS] Very ineffective plan with merge join
Hello there is significant problem in statistics I think, Regards Pavel Stehule 2010/4/15 Oleg Bartunov o...@sai.msu.su: On Thu, 15 Apr 2010, Kevin Grittner wrote: Oleg Bartunov o...@sai.msu.su wrote: Sorry for odd names, they were generated by popular accounting engine in Russia. How much of that can you trim out and still see the problem? It's difficult, since I don't know semantics of data. I reduced query, though. query: explain analyze SELECT _AccRg7175_R._Period AS _Period, _AccRg7175_R._RecorderTRef AS _RecorderTRef, _AccRg7175_R._RecorderRRef AS _RecorderRRef, _AccRg7175_R._AccountDtRRef AS _AccountRRef FROM _AccRg7175 _AccRg7175_R LEFT OUTER JOIN _Acc7_ExtDim7144 _Acc7_ExtDim7144_TEDAcc1 ON _Acc7_ExtDim7144_TEDAcc1._Acc7_IDRRef = _AccRg7175_R._AccountDtRRef AND _Acc7_ExtDim7144_TEDAcc1._LineNo = 1 LEFT OUTER JOIN _Acc7_ExtDim7144 _Acc7_ExtDim7144_TEDAcc2 ON _Acc7_ExtDim7144_TEDAcc2._Acc7_IDRRef = _AccRg7175_R._AccountDtRRef AND _Acc7_ExtDim7144_TEDAcc2._LineNo = 2 LEFT OUTER JOIN _Acc7_ExtDim7144 _Acc7_ExtDim7144_TEDAcc3 ON _Acc7_ExtDim7144_TEDAcc3._Acc7_IDRRef = _AccRg7175_R._AccountDtRRef AND _Acc7_ExtDim7144_TEDAcc3._LineNo = 3 LEFT OUTER JOIN _AccRgED7200 _AccRgED7200_TED3 ON _AccRgED7200_TED3._RecorderTRef = _AccRg7175_R._RecorderTRef AND _AccRgED7200_TED3._RecorderRRef = _AccRg7175_R._RecorderRRef AND _AccRgED7200_TED3._LineNo = _AccRg7175_R._LineNo AND _AccRgED7200_TED3._Period = _AccRg7175_R._Period AND _AccRgED7200_TED3._Correspond = 0 AND _AccRgED7200_TED3._KindRRef = _Acc7_ExtDim7144_TEDAcc3._DimKindRRef WHERE _AccRg7175_R._Active = TRUE AND _AccRg7175_R._AccountDtRRef IN (SELECT tt2._REFFIELDRRef AS f_1 FROM tt2) AND _AccRg7175_R._Period = '2009-10-01 00:00:00'::timestamp AND _AccRg7175_R._Period = '2009-10-31 23:59:59'::timestamp ; default plan: --- Hash Left Join (cost=762035.09..822357.84 rows=224988 width=56) (actual time=25007.488..25022.338 rows=9189 loops=1) Hash Cond: (_accrg7175_r._accountdtrref = _acc7_extdim7144_tedacc1._acc7_idrref) - Hash Left Join (cost=762017.69..819134.13 rows=153030 width=56) (actual time=25007.173..25017.249 rows=9189 loops=1) Hash Cond: (_accrg7175_r._accountdtrref = _acc7_extdim7144_tedacc2._acc7_idrref) - Merge Right Join (cost=762001.76..816793.89 rows=153030 width=56) (actual time=25006.895..25012.218 rows=9189 loops=1) Merge Cond: ((_accrged7200_ted3._lineno = _accrg7175_r._lineno) AND (_accrged7200_ted3._recordertref = _accrg7175_r._recordertref) AND (_accrged7200_ted3._recorderrref = _accrg7175_r._recorderrref) AND (_accrged7200_ted3._period = _accrg7175_r._period) AND (_accrged7200_ted3._kindrref = _acc7_extdim7144_tedacc3._dimkindrref)) - Sort (cost=694652.60..703399.93 rows=3498930 width=63) (actual time=24794.738..24794.738 rows=1 loops=1) Sort Key: _accrged7200_ted3._lineno, _accrged7200_ted3._recordertref, _accrged7200_ted3._recorderrref, _accrged7200_ted3._period, _accrged7200_ted3._kindrref Sort Method: external merge Disk: 230896kB - Seq Scan on _accrged7200 _accrged7200_ted3 (cost=0.00..182790.96 rows=3498930 width=63) (actual time=0.153..3018.474 rows=3526745 loops=1) Filter: (_correspond = 0::numeric) - Sort (cost=67344.64..67727.22 rows=153030 width=83) (actual time=212.145..213.289 rows=9189 loops=1) Sort Key: _accrg7175_r._lineno, _accrg7175_r._recordertref, _accrg7175_r._recorderrref, _accrg7175_r._period, _acc7_extdim7144_tedacc3._dimkindrref Sort Method: quicksort Memory: 1677kB - Hash Left Join (cost=10322.30..54166.12 rows=153030 width=83) (actual time=39.489..184.046 rows=9189 loops=1) Hash Cond: (_accrg7175_r._accountdtrref = _acc7_extdim7144_tedacc3._acc7_idrref) - Hash Join (cost=10308.08..52844.15 rows=153030 width=63) (actual time=39.256..180.388 rows=9189 loops=1) Hash Cond: (_accrg7175_r._accountdtrref = tt2._reffieldrref) - Bitmap Heap Scan on _accrg7175 _accrg7175_r (cost=10305.89..51535.65 rows=237384 width=63) (actual time=38.913..130.715 rows=235636 loops=1) Recheck Cond: ((_period = '2009-10-01 00:00:00'::timestamp without time zone) AND (_period = '2009-10-31 23:59:59'::timestamp without time zone)) Filter: _active - Bitmap
Re: [HACKERS] Very ineffective plan with merge join
On Thu, 15 Apr 2010, Pavel Stehule wrote: Hello there is significant problem in statistics I think, Ah, you're right ! Regards Pavel Stehule 2010/4/15 Oleg Bartunov o...@sai.msu.su: On Thu, 15 Apr 2010, Kevin Grittner wrote: Oleg Bartunov o...@sai.msu.su wrote: Sorry for odd names, they were generated by popular accounting engine in Russia. How much of that can you trim out and still see the problem? It's difficult, since I don't know semantics of data. I reduced query, though. query: explain analyze SELECT _AccRg7175_R._Period AS _Period, _AccRg7175_R._RecorderTRef AS _RecorderTRef, _AccRg7175_R._RecorderRRef AS _RecorderRRef, _AccRg7175_R._AccountDtRRef AS _AccountRRef FROM _AccRg7175 _AccRg7175_R LEFT OUTER JOIN _Acc7_ExtDim7144 _Acc7_ExtDim7144_TEDAcc1 ON _Acc7_ExtDim7144_TEDAcc1._Acc7_IDRRef = _AccRg7175_R._AccountDtRRef AND _Acc7_ExtDim7144_TEDAcc1._LineNo = 1 LEFT OUTER JOIN _Acc7_ExtDim7144 _Acc7_ExtDim7144_TEDAcc2 ON _Acc7_ExtDim7144_TEDAcc2._Acc7_IDRRef = _AccRg7175_R._AccountDtRRef AND _Acc7_ExtDim7144_TEDAcc2._LineNo = 2 LEFT OUTER JOIN _Acc7_ExtDim7144 _Acc7_ExtDim7144_TEDAcc3 ON _Acc7_ExtDim7144_TEDAcc3._Acc7_IDRRef = _AccRg7175_R._AccountDtRRef AND _Acc7_ExtDim7144_TEDAcc3._LineNo = 3 LEFT OUTER JOIN _AccRgED7200 _AccRgED7200_TED3 ON _AccRgED7200_TED3._RecorderTRef = _AccRg7175_R._RecorderTRef AND _AccRgED7200_TED3._RecorderRRef = _AccRg7175_R._RecorderRRef AND _AccRgED7200_TED3._LineNo = _AccRg7175_R._LineNo AND _AccRgED7200_TED3._Period = _AccRg7175_R._Period AND _AccRgED7200_TED3._Correspond = 0 AND _AccRgED7200_TED3._KindRRef = _Acc7_ExtDim7144_TEDAcc3._DimKindRRef WHERE _AccRg7175_R._Active = TRUE AND _AccRg7175_R._AccountDtRRef IN (SELECT tt2._REFFIELDRRef AS f_1 FROM tt2) AND _AccRg7175_R._Period = '2009-10-01 00:00:00'::timestamp AND _AccRg7175_R._Period = '2009-10-31 23:59:59'::timestamp ; default plan: --- Hash Left Join (cost=762035.09..822357.84 rows=224988 width=56) (actual time=25007.488..25022.338 rows=9189 loops=1) Hash Cond: (_accrg7175_r._accountdtrref = _acc7_extdim7144_tedacc1._acc7_idrref) - Hash Left Join (cost=762017.69..819134.13 rows=153030 width=56) (actual time=25007.173..25017.249 rows=9189 loops=1) Hash Cond: (_accrg7175_r._accountdtrref = _acc7_extdim7144_tedacc2._acc7_idrref) - Merge Right Join (cost=762001.76..816793.89 rows=153030 width=56) (actual time=25006.895..25012.218 rows=9189 loops=1) Merge Cond: ((_accrged7200_ted3._lineno = _accrg7175_r._lineno) AND (_accrged7200_ted3._recordertref = _accrg7175_r._recordertref) AND (_accrged7200_ted3._recorderrref = _accrg7175_r._recorderrref) AND (_accrged7200_ted3._period = _accrg7175_r._period) AND (_accrged7200_ted3._kindrref = _acc7_extdim7144_tedacc3._dimkindrref)) - Sort (cost=694652.60..703399.93 rows=3498930 width=63) (actual time=24794.738..24794.738 rows=1 loops=1) Sort Key: _accrged7200_ted3._lineno, _accrged7200_ted3._recordertref, _accrged7200_ted3._recorderrref, _accrged7200_ted3._period, _accrged7200_ted3._kindrref Sort Method: external merge Disk: 230896kB - Seq Scan on _accrged7200 _accrged7200_ted3 (cost=0.00..182790.96 rows=3498930 width=63) (actual time=0.153..3018.474 rows=3526745 loops=1) Filter: (_correspond = 0::numeric) - Sort (cost=67344.64..67727.22 rows=153030 width=83) (actual time=212.145..213.289 rows=9189 loops=1) Sort Key: _accrg7175_r._lineno, _accrg7175_r._recordertref, _accrg7175_r._recorderrref, _accrg7175_r._period, _acc7_extdim7144_tedacc3._dimkindrref Sort Method: quicksort Memory: 1677kB - Hash Left Join (cost=10322.30..54166.12 rows=153030 width=83) (actual time=39.489..184.046 rows=9189 loops=1) Hash Cond: (_accrg7175_r._accountdtrref = _acc7_extdim7144_tedacc3._acc7_idrref) - Hash Join (cost=10308.08..52844.15 rows=153030 width=63) (actual time=39.256..180.388 rows=9189 loops=1) Hash Cond: (_accrg7175_r._accountdtrref = tt2._reffieldrref) - Bitmap Heap Scan on _accrg7175 _accrg7175_r (cost=10305.89..51535.65 rows=237384 width=63) (actual time=38.913..130.715 rows=235636 loops=1) Recheck Cond: ((_period = '2009-10-01 00:00:00'::timestamp without time zone) AND (_period = '2009-10-31 23:59:59'::timestamp without time zone)) Filter: _active - Bitmap Index Scan on
Re: [HACKERS] Very ineffective plan with merge join
Oleg Bartunov o...@sai.msu.su writes: below is an example of interesting query and two plans - the bad plan, which uses merge join and big sorting, took 216 sec, and good plan with merge join disabled took 8 sec. The good plan seems to be fast mainly because of heavily cached inner indexscans. If that's the normal operating state for this database, you should try reducing random_page_cost. Also, as Pavel noted, the sub-join size estimates aren't very good, and those overestimates are discouraging it from using inner-indexscan nestloops. I'm not sure how much it would help to increase the statistics targets, but that would be worth trying. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Very ineffective plan with merge join
Tom Lane t...@sss.pgh.pa.us wrote: I'm not sure how much it would help to increase the statistics targets, but that would be worth trying. I notice that the scan rowcount estimates are very accurate, there's that one hash join result that's way off, though. What's up with the sort of _accrged7200 (in the slower plan) taking in 3.5 million rows and putting out 1 row? There's something there I'm not understanding. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Very ineffective plan with merge join
Kevin Grittner kevin.gritt...@wicourts.gov writes: What's up with the sort of _accrged7200 (in the slower plan) taking in 3.5 million rows and putting out 1 row? There's something there I'm not understanding. It's under a merge join, so what probably happened is that the first row from that side had a larger key than any row from the other side. A mergejoin will never bother to look at the remaining rows in such a case. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers