Dilip Kumar <dilipbal...@gmail.com> writes:
> Actually, I was not proposing this patch; instead I wanted to discuss
> the approach. I was claiming that for
> non-equal JOIN_SEMI selectivity estimation, instead of calculating
> selectivity in the existing way, i.e.
> = 1 - (selectivity of equal JOIN_SEMI), the better way would be = 1 -
> (selectivity of equal). I have tested only a standalone scenario,
> where it solves the problem, but not the TPCH cases. But I was more
> interested in discussing whether the way I am thinking it should
> calculate the non-equal SEMI join selectivity makes any sense.
I don't think it does really. The thing about a <> semijoin is that it
will succeed unless *every* join key value from the inner query is equal
to the outer key value (or is null). That's something we should consider
to be of very low probability typically, so that the <> selectivity should
be estimated as nearly 1.0. If the regular equality selectivity
approaches 1.0, or when there are expected to be very few rows out of the
inner query, then maybe the <> estimate should start to drop off from 1.0,
but it surely doesn't move linearly with the equality selectivity.
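
To make the shape of that argument concrete, here is a toy model in Python. It is only a sketch, not PostgreSQL's neqjoinsel(): it assumes inner rows are independent draws, and the names neq_semijoin_selectivity, eq_sel, inner_rows and null_frac are made up for illustration. An outer row fails the <> semijoin only if every inner row is either null or equal to the outer key, so the failure probability is a per-row probability raised to the number of inner rows.

```python
def neq_semijoin_selectivity(eq_sel, inner_rows, null_frac=0.0):
    """Toy estimate of the fraction of outer rows that pass a <> semijoin.

    eq_sel     -- assumed probability that a non-null inner value equals the outer key
    inner_rows -- assumed number of rows produced by the inner query
    null_frac  -- assumed fraction of inner join keys that are NULL
    """
    # One inner row fails the <> test if it is NULL or equal to the outer key.
    p_row_fails = null_frac + (1.0 - null_frac) * eq_sel
    p_row_fails = min(1.0, max(0.0, p_row_fails))
    # The semijoin fails only if *every* inner row fails (independence assumed).
    return 1.0 - p_row_fails ** inner_rows


if __name__ == "__main__":
    for eq_sel, rows in [(0.01, 100), (0.5, 10), (0.9, 2), (0.5, 0)]:
        est = neq_semijoin_selectivity(eq_sel, rows)
        print(f"eq_sel={eq_sel} inner_rows={rows} -> toy={est:.6f} linear={1 - eq_sel:.6f}")
```

Under these assumptions the estimate stays close to 1.0 for most inputs and only drops when eq_sel approaches 1.0 or the inner query returns very few rows, which is quite different from the linear 1 - eq_sel.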
BTW, I'd momentarily confused this thread with the one about bug #14676,
which points out that neqsel() isn't correctly accounting for nulls.
neqjoinsel() isn't either. Not sure that we want to solve both things
in one patch though.
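
For the null-accounting point, a minimal sketch of the arithmetic, again in Python and again not the actual neqsel() code. It assumes eq_sel is the fraction of all rows equal to the constant and null_frac is the fraction of rows that are NULL; the function name is hypothetical. Since x <> const is not true when x is NULL, the null fraction has to be subtracted as well as the equality fraction.

```python
def neq_selectivity(eq_sel, null_frac):
    """Toy selectivity for `x <> const` that does not count NULL rows as matches.

    eq_sel    -- assumed fraction of all rows where x = const
    null_frac -- assumed fraction of all rows where x IS NULL
    """
    # Rows that pass are those that are neither equal to the constant nor NULL.
    result = 1.0 - eq_sel - null_frac
    # Clamp, since the inputs are estimates and may not be mutually consistent.
    return max(0.0, min(1.0, result))
```

With eq_sel = 0.2 and null_frac = 0.3 this gives about 0.5, whereas a plain 1 - eq_sel would give 0.8.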
regards, tom lane
Sent via pgsql-hackers mailing list (email@example.com)