Hi,

On 11/25/25 6:01 AM, Tom Lane wrote:
> I wrote:
>> Wouldn't it be better to handle it more like the is_target_rel logic
>> a few lines further up?
>
> Actually, after thinking a bit longer, it'd be better to do something
> like the attached so that we don't keep redundant quals unless they'd
> *all* be excluded.
>
> There's definitely something fishy about the costing though.
> I experimented with this variant of Sergei's example:
>
> regression=# CREATE TABLE hash_partial(x) AS SELECT x % 100 as y from generate_series(1, 1000) as x;
> SELECT 1000
> regression=# ANALYZE hash_partial;
> ANALYZE
> regression=# CREATE INDEX partial_idx ON hash_partial USING hash(x) WHERE x = 1;
> CREATE INDEX
> regression=# set enable_seqscan TO 0;  -- else we'll go for a seqscan
> SET
> regression=# EXPLAIN SELECT x FROM hash_partial WHERE x = 1;
>                                   QUERY PLAN
> ----------------------------------------------------------------------------
>   Bitmap Heap Scan on hash_partial  (cost=24.08..32.56 rows=10 width=4)
>     Recheck Cond: (x = 1)
>     ->  Bitmap Index Scan on partial_idx  (cost=0.00..24.07 rows=10 width=0)
>           Index Cond: (x = 1)
> (4 rows)
>
> regression=# drop index partial_idx;
> DROP INDEX
> regression=# CREATE INDEX ON hash_partial USING hash(x);
> CREATE INDEX
> regression=# EXPLAIN SELECT x FROM hash_partial WHERE x = 1;
>                                      QUERY PLAN
> ----------------------------------------------------------------------------------
>   Bitmap Heap Scan on hash_partial  (cost=4.08..12.56 rows=10 width=4)
>     Recheck Cond: (x = 1)
>     ->  Bitmap Index Scan on hash_partial_x_idx  (cost=0.00..4.08 rows=10 width=0)
>           Index Cond: (x = 1)
> (4 rows)
>
> Why are we thinking that a non-partial index would be substantially
> cheaper to scan?  That seems surely wrong, and it runs counter to my
> intuition about why this fix is incomplete.  (I expected an unfair
> bias towards the partial index, not against it.)
>
>                         regards, tom lane


Thanks for the fix. It seems there is another case worth investigating:

DROP TABLE hash_partial;
CREATE TABLE hash_partial(x, y) AS
SELECT x, x + x as y from generate_series(1, 1000) as x;
ANALYZE hash_partial;
CREATE INDEX partial_idx  ON hash_partial USING hash(x) WHERE x = 1;
SET enable_seqscan TO 0;
EXPLAIN SELECT x FROM hash_partial WHERE x = 1 and y < 0;
--------------------------------------------------------------------------------
Seq Scan on hash_partial  (cost=0.00..23.00 rows=1 width=4)
   Disabled: true
   Filter: ((y < 0) AND (x = 1))
(3 rows)


Regarding the strangeness of the cost: the cost depends on numIndexPages,
and in genericcostestimate() we calculate numIndexPages as:

numIndexPages = ceil(numIndexTuples * index->pages / index->tuples);

For the non-partial index, index->pages = 6 and index->tuples = 1000,
while for the partial index, index->pages = 6 and index->tuples = 10.
The number of pages depends on the size of the index relation, and the
initial size is 6 * BLCKSZ for both partial and non-partial hash indexes.
The initial size of the hash index relation, in turn, depends on the
total number of tuples in the table.
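Plugging the numbers above into that formula makes the oddity concrete (a
quick sketch; num_index_pages here just mirrors the genericcostestimate()
formula, and the variable names are illustrative):

```python
from math import ceil

# Mirror of genericcostestimate()'s page-fetch estimate:
#   numIndexPages = ceil(numIndexTuples * index->pages / index->tuples);
def num_index_pages(num_index_tuples, index_pages, index_tuples):
    return ceil(num_index_tuples * index_pages / index_tuples)

# Non-partial index: 6 pages, 1000 tuples, 10 tuples estimated to match
print(num_index_pages(10, 6, 1000))  # -> 1 page

# Partial index: the same initial 6 pages, but only 10 tuples in it
print(num_index_pages(10, 6, 10))    # -> 6 pages
```

So the partial index gets charged six page fetches against one for the
non-partial index, which, if I read the costing right, lines up with the
~24 vs ~4 bitmap index scan costs above at the default random_page_cost
of 4.0.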

Regards,
Sergei Glukhov
