Re: [HACKERS] <> join selectivity estimate question

2017-09-13 Thread Ashutosh Bapat
On Thu, Sep 14, 2017 at 4:30 AM, Tom Lane wrote: > Thomas Munro writes: >> On Wed, Sep 6, 2017 at 11:14 PM, Ashutosh Bapat >> wrote: >>> I added some "stable" tests to your patch taking inspiration from the >>>

Re: [HACKERS] <> join selectivity estimate question

2017-09-13 Thread Ashutosh Bapat
On Thu, Sep 14, 2017 at 4:19 AM, Thomas Munro wrote: > On Wed, Sep 6, 2017 at 11:14 PM, Ashutosh Bapat > wrote: >> On Fri, Jul 21, 2017 at 4:10 AM, Thomas Munro >> wrote: >>> That just leaves the

Re: [HACKERS] <> join selectivity estimate question

2017-09-13 Thread Tom Lane
Thomas Munro writes: > On Wed, Sep 6, 2017 at 11:14 PM, Ashutosh Bapat > wrote: >> I added some "stable" tests to your patch taking inspiration from the >> test SQL file. I think those will be stable across machines and runs. >>

Re: [HACKERS] <> join selectivity estimate question

2017-09-13 Thread Thomas Munro
On Wed, Sep 6, 2017 at 11:14 PM, Ashutosh Bapat wrote: > On Fri, Jul 21, 2017 at 4:10 AM, Thomas Munro > wrote: >> That just leaves the question of whether we should try to handle the >> empty RHS and single-value RHS cases using

Re: [HACKERS] <> join selectivity estimate question

2017-09-06 Thread Tom Lane
Simon Riggs writes: > Why isn't this an open item for PG10? Why should it be? This behavior has existed for a long time. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your

Re: [HACKERS] <> join selectivity estimate question

2017-09-06 Thread Simon Riggs
On 6 September 2017 at 04:14, Ashutosh Bapat wrote: > On Fri, Jul 21, 2017 at 4:10 AM, Thomas Munro > wrote: >> >> Thanks. Bearing all that in mind, I ran through a series of test >> scenarios and discovered that my handling for

Re: [HACKERS] <> join selectivity estimate question

2017-09-06 Thread Ashutosh Bapat
On Fri, Jul 21, 2017 at 4:10 AM, Thomas Munro wrote: > > Thanks. Bearing all that in mind, I ran through a series of test > scenarios and discovered that my handling for JOIN_ANTI was wrong: I > thought that I had to deal with inverting the result, but I now see >

Re: [HACKERS] <> join selectivity estimate question

2017-07-20 Thread Thomas Munro
On Fri, Jul 21, 2017 at 8:21 AM, Tom Lane wrote: > Ashutosh Bapat writes: >> On Thu, Jul 20, 2017 at 5:30 PM, Thomas Munro >> wrote: >>> Does anyone know how to test a situation where the join is reversed >>>

Re: [HACKERS] <> join selectivity estimate question

2017-07-20 Thread Tom Lane
Ashutosh Bapat writes: > On Thu, Jul 20, 2017 at 5:30 PM, Thomas Munro > wrote: >> Does anyone know how to test a situation where the join is reversed >> according to >> get_join_variables, or "complicated cases where we can't tell

Re: [HACKERS] <> join selectivity estimate question

2017-07-20 Thread Ashutosh Bapat
On Thu, Jul 20, 2017 at 5:30 PM, Thomas Munro wrote: > On Thu, Jul 20, 2017 at 11:47 PM, Ashutosh Bapat > wrote: >> On Thu, Jul 20, 2017 at 11:04 AM, Thomas Munro >> wrote: >>> On Fri, Jun 2, 2017 at

Re: [HACKERS] <> join selectivity estimate question

2017-07-20 Thread Thomas Munro
On Thu, Jul 20, 2017 at 11:47 PM, Ashutosh Bapat wrote: > On Thu, Jul 20, 2017 at 11:04 AM, Thomas Munro > wrote: >> On Fri, Jun 2, 2017 at 4:16 AM, Tom Lane wrote: >>> I don't think it does really. The thing

Re: [HACKERS] <> join selectivity estimate question

2017-07-20 Thread Ashutosh Bapat
On Thu, Jul 20, 2017 at 11:04 AM, Thomas Munro wrote: > On Fri, Jun 2, 2017 at 4:16 AM, Tom Lane wrote: >> I don't think it does really. The thing about a <> semijoin is that it >> will succeed unless *every* join key value from the inner query

Re: [HACKERS] <> join selectivity estimate question

2017-07-19 Thread Thomas Munro
On Fri, Jun 2, 2017 at 4:16 AM, Tom Lane wrote: > I don't think it does really. The thing about a <> semijoin is that it > will succeed unless *every* join key value from the inner query is equal > to the outer key value (or is null). That's something we should consider > to

Re: [HACKERS] <> join selectivity estimate question

2017-06-01 Thread Tom Lane
Dilip Kumar writes: > Actually, I was not proposing this patch instead I wanted to discuss > the approach. I was claiming that for > non-equal JOIN_SEMI selectivity estimation instead of calculating > selectivity in an existing way i.e > = 1- (selectivity of equal

Re: [HACKERS] <> join selectivity estimate question

2017-06-01 Thread Dilip Kumar
On Thu, Jun 1, 2017 at 8:24 PM, Robert Haas wrote: > On Wed, May 31, 2017 at 1:18 PM, Dilip Kumar wrote: >> + if (jointype = JOIN_SEMI) >> + { >> + sjinfo->jointype = JOIN_INNER; >> + } > > That is pretty obviously

Re: [HACKERS] <> join selectivity estimate question

2017-06-01 Thread Robert Haas
On Wed, May 31, 2017 at 1:18 PM, Dilip Kumar wrote: > + if (jointype = JOIN_SEMI) > + { > + sjinfo->jointype = JOIN_INNER; > + } That is pretty obviously half-baked and completely untested. -- Robert Haas EnterpriseDB:

Re: [HACKERS] <> join selectivity estimate question

2017-05-31 Thread Dilip Kumar
On Fri, Mar 17, 2017 at 6:49 PM, Thomas Munro wrote: > Right. If I temporarily hack neqjoinsel() thus: > > result = 1.0 - result; > + > + if (jointype == JOIN_SEMI) > + result = 1.0; > + > PG_RETURN_FLOAT8(result); > } I was

Re: [HACKERS] <> join selectivity estimate question

2017-03-17 Thread Thomas Munro
On Sat, Mar 18, 2017 at 11:49 AM, Thomas Munro wrote: > On Sat, Mar 18, 2017 at 6:14 AM, Tom Lane wrote: >> After a bit more thought, it seems like the bug here is that "the >> fraction of the LHS that has a non-matching row" is not one minus >>

Re: [HACKERS] <> join selectivity estimate question

2017-03-17 Thread Thomas Munro
On Sat, Mar 18, 2017 at 6:14 AM, Tom Lane wrote: > After a bit more thought, it seems like the bug here is that "the > fraction of the LHS that has a non-matching row" is not one minus > "the fraction of the LHS that has a matching row". In fact, in > this example, *all* LHS

Re: [HACKERS] <> join selectivity estimate question

2017-03-17 Thread Tom Lane
Robert Haas writes: > On Fri, Mar 17, 2017 at 1:14 PM, Tom Lane wrote: >> It would not be too hard to convince me that neqjoinsel should >> simply return 1.0 for any semijoin/antijoin case, perhaps with >> some kind of discount for nullfrac. Whether or

Re: [HACKERS] <> join selectivity estimate question

2017-03-17 Thread Robert Haas
On Fri, Mar 17, 2017 at 1:14 PM, Tom Lane wrote: > After a bit more thought, it seems like the bug here is that "the > fraction of the LHS that has a non-matching row" is not one minus > "the fraction of the LHS that has a matching row". In fact, in > this example, *all* LHS

Re: [HACKERS] <> join selectivity estimate question

2017-03-17 Thread Tom Lane
I wrote: > The problem here appears to be that we don't have any MCV list for > the "twothousand" column (because it has a perfectly flat distribution), > and the heuristic that eqjoinsel_semi is using for the no-MCVs case > is falling down badly. Oh ... wait. eqjoinsel_semi's charter is to

Re: [HACKERS] <> join selectivity estimate question

2017-03-17 Thread Tom Lane
Robert Haas writes: > The relevant code is in neqsel(). It estimates the fraction of rows > that will be equal, and then does 1 - that number. Evidently, the > query planner thinks that l1.l_suppkey = l2.l_suppkey would almost > always be true, and therefore l1.l_suppkey

Re: [HACKERS] <> join selectivity estimate question

2017-03-17 Thread Robert Haas
On Fri, Mar 17, 2017 at 1:54 AM, Thomas Munro wrote: > SELECT * >FROM lineitem l1 > WHERE EXISTS (SELECT * > FROM lineitem l2 > WHERE l1.l_orderkey = l2.l_orderkey); > > -> estimates 59986012 rows, actual rows 59,986,052

[HACKERS] <> join selectivity estimate question

2017-03-16 Thread Thomas Munro
Hi hackers, While studying a regression reported[1] against my parallel hash join patch, I noticed that we can also reach a good and a bad plan in unpatched master. One of the causes seems to be the estimated selectivity of a semi-join with an extra <> filter qual. Here are some times I

Re: [HACKERS] join selectivity

2004-12-24 Thread strk
On Thu, Dec 16, 2004 at 01:56:29PM -0500, Tom Lane wrote: Mark Cave-Ayland [EMAIL PROTECTED] writes: ... But in the case of column op unknown constant, if we're estimating the number of rows to return then that becomes harder I didn't say it was easy ;-). The existing selectivity

Re: [HACKERS] join selectivity

2004-12-24 Thread strk
On Thu, Dec 23, 2004 at 10:13:03AM -0500, Tom Lane wrote: [EMAIL PROTECTED] writes: On Thu, Dec 23, 2004 at 10:01:33AM -0500, Tom Lane wrote: Right. This amounts to assuming that the join conditions and the restriction conditions are independent, which of course is bogus, but we really

Re: [HACKERS] join selectivity

2004-12-24 Thread strk
On Thu, Dec 23, 2004 at 10:01:33AM -0500, Tom Lane wrote: [EMAIL PROTECTED] writes: Doing some tests I've found out that the returned value from the JOINSEL is applied to REL1.rows X REL2.rows, but REL1 and REL2 are not 'base' table, rather relations with a number of rows once again

Re: [HACKERS] join selectivity

2004-12-23 Thread strk
On Thu, Dec 16, 2004 at 03:12:21PM -0500, Greg Stark wrote: Mark Cave-Ayland [EMAIL PROTECTED] writes: Well at the moment PostGIS has a RESTRICT function that takes an expression of the form column op constant where column is a column consisting of geometries and constant is a bounding

Re: [HACKERS] join selectivity

2004-12-23 Thread Tom Lane
[EMAIL PROTECTED] writes: Doing some tests I've found out that the returned value from the JOINSEL is applied to REL1.rows X REL2.rows, but REL1 and REL2 are not 'base' table, rather relations with a number of rows once again estimated by other selectivity functions. Right. This amounts to

Re: [HACKERS] join selectivity

2004-12-23 Thread Tom Lane
[EMAIL PROTECTED] writes: On Thu, Dec 23, 2004 at 10:01:33AM -0500, Tom Lane wrote: Right. This amounts to assuming that the join conditions and the restriction conditions are independent, which of course is bogus, but we really don't have enough information to do better. Doesn't JOINSEL

Re: [HACKERS] join selectivity

2004-12-23 Thread Tom Lane
[EMAIL PROTECTED] writes: So it should NOT depend on full number of rows either, is this right ? No, it's supposed to return a fraction. regards, tom lane ---(end of broadcast)--- TIP 8: explain analyze is your friend

Re: [HACKERS] join selectivity

2004-12-16 Thread Mark Cave-Ayland
Hi Tom, -Original Message- From: Tom Lane [mailto:[EMAIL PROTECTED] Sent: 13 December 2004 17:16 To: Mark Cave-Ayland Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED] Subject: Re: [HACKERS] join selectivity Mark Cave-Ayland [EMAIL PROTECTED] writes: For a query

Re: [HACKERS] join selectivity

2004-12-16 Thread Tom Lane
Mark Cave-Ayland [EMAIL PROTECTED] writes: ...and with two indices RESTRICT is called four times. The part I find confusing is why with one index that RESTRICT is called twice. [ shrug... ] clause_selectivity doesn't try to cache the result. I was also thinking whether calling RESTRICT when

Re: [HACKERS] join selectivity

2004-12-16 Thread Mark Cave-Ayland
Hi Tom, -Original Message- From: Tom Lane [mailto:[EMAIL PROTECTED] Sent: 16 December 2004 17:56 To: Mark Cave-Ayland Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED] Subject: Re: [HACKERS] join selectivity Mark Cave-Ayland [EMAIL PROTECTED] writes: OK I think

Re: [HACKERS] join selectivity

2004-12-16 Thread Tom Lane
Mark Cave-Ayland [EMAIL PROTECTED] writes: ... But in the case of column op unknown constant, if we're estimating the number of rows to return then that becomes harder I didn't say it was easy ;-). The existing selectivity functions can't do better than a rough guess in such cases, and I

Re: [HACKERS] join selectivity

2004-12-16 Thread Greg Stark
Mark Cave-Ayland [EMAIL PROTECTED] writes: Well at the moment PostGIS has a RESTRICT function that takes an expression of the form column op constant where column is a column consisting of geometries and constant is a bounding box. This is based upon histogram statistics and works well. Are

Re: [HACKERS] join selectivity

2004-12-16 Thread Mark Cave-Ayland
-Original Message- From: Tom Lane [mailto:[EMAIL PROTECTED] Sent: 16 December 2004 15:55 To: Mark Cave-Ayland Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED] Subject: Re: [HACKERS] join selectivity Mark Cave-Ayland [EMAIL PROTECTED] writes: ...and with two

Re: [HACKERS] join selectivity

2004-12-16 Thread Tom Lane
Mark Cave-Ayland [EMAIL PROTECTED] writes: OK I think I've misunderstood something more fundamental than that; I understood from what you said that the RESTRICT clause is used to evaluate the cost of table1.geom table2.geom against table2.geom table1.geom (i.e. it is used to help decide

Re: [HACKERS] join selectivity

2004-12-13 Thread Mark Cave-Ayland
Hi strk, (cut) Taking a look at join selectivity... For a query like this: SELECT id FROM table1, table2 WHERE table1.geom table2.geom; RESTRICT selectivity is invoked twice and JOIN selectivity is invoked once. The RESTRICT code is not able to find a costant part

Re: [HACKERS] join selectivity

2004-12-13 Thread Tom Lane
Mark Cave-Ayland [EMAIL PROTECTED] writes: For a query like this: SELECT id FROM table1, table2 WHERE table1.geom table2.geom; RESTRICT selectivity is invoked twice and JOIN selectivity is invoked once. Hm, are you testing in a context where both tables have indexes that are relevant

Re: [HACKERS] join selectivity

2004-12-13 Thread strk
On Mon, Dec 13, 2004 at 12:16:05PM -0500, Tom Lane wrote: Mark Cave-Ayland [EMAIL PROTECTED] writes: For a query like this: SELECT id FROM table1, table2 WHERE table1.geom table2.geom; RESTRICT selectivity is invoked twice and JOIN selectivity is invoked once. Hm, are you