Hi 2015-12-18 16:21 GMT+01:00 Mathieu VINCENT <mathieu.vinc...@pmsipilot.com>:
> Hello, > > No one to help me to understand this bad estimation rows ? > It's *NOT* caused by : > > - correlation between columns (cross-correlation) > - bad statistics (i tried with default_statistics_target to 10 000) > - bad number of distinct values > - complexe join conditions > > I have no more ideas. > PostgreSQL has not cross tables statistics - so expect uniform distribution of foreign keys. This expectation is broken in your example. You can find some prototype solutions by Tomas Vondra in hackars mailing list. Regards Pavel > > thank you for your help. > Mathieu VINCENT > > 2015-12-17 11:58 GMT+01:00 Mathieu VINCENT <mathieu.vinc...@pmsipilot.com> > : > >> Adding foreign key between on t2 and t3, does not change the plan. >> >> drop table if exists t1; >> drop table if exists t2; >> drop table if exists t3; >> >> create table t1 as select generate_Series(1,200000) as c1; >> create table t2 as select generate_Series(1,200000)%100+1 as c1; >> create table t3 as select generate_Series(1,1500)%750+1 as c1; >> >> alter table t1 add PRIMARY KEY (c1); >> create index on t2 (c1); >> create index on t3 (c1); >> ALTER TABLE t2 ADD CONSTRAINT t2_fk FOREIGN KEY (c1) REFERENCES t1(c1); >> ALTER TABLE t3 ADD CONSTRAINT t3_fk FOREIGN KEY (c1) REFERENCES t1(c1); >> >> analyze verbose t1; >> analyze verbose t2; >> analyze verbose t3; >> >> EXPLAIN (analyze on, buffers on, verbose on) >> select >> * >> from >> t1 t1 >> inner join t2 on t1.c1=t2.c1 >> inner join t3 on t1.c1=t3.c1 >> >> Cordialement, >> <http://www.psih.fr/>PSIH Décisionnel en santé >> Mathieu VINCENT >> Data Analyst >> PMSIpilot - 61 rue Sully - 69006 Lyon - France >> >> 2015-12-17 11:37 GMT+01:00 Mathieu VINCENT <mathieu.vinc...@pmsipilot.com >> >: >> >>> Here, another issue with row estimate. >>> And, in this example, there is not correlation beetween columns in a >>> same table. >>> >>> drop table if exists t1; >>> drop table if exists t2; >>> drop table if exists t3; >>> >>> create table t1 as select generate_Series(1,200000) as c1; >>> create table t2 as select generate_Series(1,200000)%100 as c1; >>> create table t3 as select generate_Series(1,1500)%750 as c1; >>> >>> alter table t1 add PRIMARY KEY (c1); >>> create index on t2 (c1); >>> create index on t3 (c1); >>> >>> analyze verbose t1; >>> analyze verbose t2; >>> analyze verbose t3; >>> >>> EXPLAIN (analyze on, buffers on, verbose on) >>> select >>> * >>> from >>> t1 t1 >>> inner join t2 on t1.c1=t2.c1 >>> inner join t3 on t2.c1=t3.c1 >>> the explain plan : http://explain.depesz.com/s/YVw >>> Do you understand how postgresql calculate the row estimate ? >>> >>> BR >>> Mathieu VINCENT >>> >>> 2015-12-17 10:14 GMT+01:00 Matteo Grolla <matteo.gro...@gmail.com>: >>> >>>> Thank you both for the help! >>>> happy holidays >>>> >>>> 2015-12-17 10:10 GMT+01:00 Mathieu VINCENT < >>>> mathieu.vinc...@pmsipilot.com>: >>>> >>>>> thks Gunnar, >>>>> >>>>> I removed the correlation between t3.c1 and t3.c2 in this sql script : >>>>> >>>>> drop table if exists t1; >>>>> drop table if exists t2; >>>>> drop table if exists t3; >>>>> drop table if exists t4; >>>>> >>>>> create table t1 as select generate_Series(1,300000) as c1; >>>>> create table t2 as select generate_Series(1,400) as c1; >>>>> create table t3 as select floor(random()*100+1) as c1, c2 from >>>>> generate_Series(1,200000) c2; >>>>> create table t4 as select generate_Series(1,200000) as c1; >>>>> >>>>> alter table t1 add PRIMARY KEY (c1); >>>>> alter table t2 add PRIMARY KEY (c1); >>>>> alter table t3 add PRIMARY KEY (c1,c2); >>>>> create index on t3 (c1); >>>>> create index on t3 (c2); >>>>> alter table t4 add PRIMARY KEY (c1); >>>>> >>>>> analyze verbose t1; >>>>> analyze verbose t2; >>>>> analyze verbose t3; >>>>> analyze verbose t4; >>>>> >>>>> EXPLAIN (analyze on, buffers on, verbose on) >>>>> select >>>>> * >>>>> from >>>>> t1 t1 >>>>> inner join t2 on t1.c1=t2.c1 >>>>> inner join t3 on t2.c1=t3.c1 >>>>> inner join t4 on t3.c2=t4.c1 >>>>> >>>>> Now, the estimate is good : http://explain.depesz.com/s/gCX >>>>> >>>>> Have a good day >>>>> >>>>> Mathieu VINCENT >>>>> >>>>> 2015-12-15 11:21 GMT+01:00 Gunnar "Nick" Bluth < >>>>> gunnar.bluth.ext...@elster.de>: >>>>> >>>>>> Am 15.12.2015 um 10:49 schrieb Andreas Kretschmer: >>>>>> > Gunnar Nick Bluth <gunnar.bluth.ext...@elster.de> wrote: >>>>>> > >>>>>> >> Am 15.12.2015 um 09:05 schrieb Mathieu VINCENT: >>>>>> >>> Hello, >>>>>> >>> >>>>>> >>> No one to help me to understand this bad estimation rows ? >>>>>> >> >>>>>> >> Well, >>>>>> >> >>>>>> >> on a rather beefy machine, I'm getting quite a different plan: >>>>>> >> http://explain.depesz.com/s/3y5r >>>>>> > >>>>>> > you are using 9.5, right? Got the same plan with 9.5. >>>>>> >>>>>> Nope...: >>>>>> version >>>>>> >>>>>> >>>>>> ------------------------------------------------------------------------------------------------------------ >>>>>> PostgreSQL 9.3.5 on x86_64-unknown-linux-gnu, compiled by gcc >>>>>> (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3, 64-bit >>>>>> >>>>>> So much for those correlation improvements then ;-/ >>>>>> >>>>>> >>>>>> > Btw.: Hi Gunnar ;-) >>>>>> >>>>>> Hi :) >>>>>> >>>>>> -- >>>>>> Gunnar "Nick" Bluth >>>>>> DBA ELSTER >>>>>> >>>>>> Tel: +49 911/991-4665 >>>>>> Mobil: +49 172/8853339 >>>>>> >>>>>> >>>>>> -- >>>>>> Sent via pgsql-performance mailing list ( >>>>>> pgsql-performance@postgresql.org) >>>>>> To make changes to your subscription: >>>>>> http://www.postgresql.org/mailpref/pgsql-performance >>>>>> >>>>> >>>>> >>>> >>> >> >