Hello, I noticed that ExecChooseHashTableSize() in nodeHash.c fails on Assert(nbuckets > 0) when an extremely large number of rows is expected.
BACKTRACE:
#0 0x0000003f79432625 in raise () from /lib64/libc.so.6
#1 0x0000003f79433e05 in abort () from /lib64/libc.so.6
#2 0x000000000092600a in ExceptionalCondition (conditionName=0xac1ea0
"!(nbuckets > 0)",
errorType=0xac1d88 "FailedAssertion", fileName=0xac1d40 "nodeHash.c",
lineNumber=545) at assert.c:54
#3 0x00000000006851ff in ExecChooseHashTableSize (ntuples=60521928028,
tupwidth=8, useskew=1 '\001',
numbuckets=0x7fff146bff04, numbatches=0x7fff146bff00,
num_skew_mcvs=0x7fff146bfefc) at nodeHash.c:545
#4 0x0000000000701735 in initial_cost_hashjoin (root=0x253a318,
workspace=0x7fff146bffc0, jointype=JOIN_SEMI,
hashclauses=0x257e4f0, outer_path=0x2569a40, inner_path=0x2569908,
sjinfo=0x2566f40, semifactors=0x7fff146c0168)
at costsize.c:2592
#5 0x000000000070e02a in try_hashjoin_path (root=0x253a318, joinrel=0x257d940,
outer_path=0x2569a40, inner_path=0x2569908,
hashclauses=0x257e4f0, jointype=JOIN_SEMI, extra=0x7fff146c0150) at
joinpath.c:543
See the following EXPLAIN output, taken from a build configured without --enable-cassert.
The planner expects 60.5B rows for the self join of a relation with 72M rows.
(This estimation is probably far too high.)
[kaigai@ayu ~]$ (echo EXPLAIN; cat ~/tpcds/query95.sql) | psql tpcds100
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=9168667273.07..9168667273.08 rows=1 width=20)
CTE ws_wh
-> Custom Scan (GpuJoin) (cost=3342534.49..654642911.88 rows=60521928028
width=24)
Bulkload: On (density: 100.00%)
Depth 1: Logic: GpuHashJoin, HashKeys: (ws_order_number), JoinQual:
((ws_warehouse_sk <> ws_warehouse_sk) AND (ws_order_number = ws_order_number)),
nrows (ratio: 84056.77%)
-> Custom Scan (BulkScan) on web_sales ws1_1
(cost=0.00..3290612.48 rows=72001248 width=16)
-> Seq Scan on web_sales ws2 (cost=0.00..3290612.48 rows=72001248
width=16)
-> Sort (cost=8514024361.19..8514024361.20 rows=1 width=20)
Sort Key: (count(DISTINCT ws1.ws_order_number))
:
This crash was triggered by Assert(nbuckets > 0), and nbuckets is calculated
as follows.
    /*
     * If there's not enough space to store the projected number of tuples and
     * the required bucket headers, we will need multiple batches.
     */
    if (inner_rel_bytes + bucket_bytes > hash_table_bytes)
    {
        /* We'll need multiple batches */
        long        lbuckets;
        double      dbatch;
        int         minbatch;
        long        bucket_size;

        /*
         * Estimate the number of buckets we'll want to have when work_mem is
         * entirely full.  Each bucket will contain a bucket pointer plus
         * NTUP_PER_BUCKET tuples, whose projected size already includes
         * overhead for the hash code, pointer to the next tuple, etc.
         */
        bucket_size = (tupsize * NTUP_PER_BUCKET + sizeof(HashJoinTuple));
        lbuckets = 1 << my_log2(hash_table_bytes / bucket_size);
        lbuckets = Min(lbuckets, max_pointers);
        nbuckets = (int) lbuckets;
        bucket_bytes = nbuckets * sizeof(HashJoinTuple);
        :
        :
    }

    Assert(nbuckets > 0);
    Assert(nbatch > 0);
In my case, hash_table_bytes was 101017630802 and bucket_size was 48, so
my_log2(hash_table_bytes / bucket_size) = 31. lbuckets then becomes negative,
because both the constant "1" and the result of my_log2() are int32, so the
shift "1 << 31" overflows. Min(lbuckets, max_pointers) therefore picks
0x80000000, which is assigned to nbuckets and triggers the Assert().
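Just to illustrate the arithmetic outside the planner: the following toy
program is not PostgreSQL code, it only mimics my_log2() and assumes 32-bit
int and 64-bit long, as on my machine. It reproduces the wrap-around and
shows that doing the shift with a long constant stays positive.

#include <stdio.h>

/* rough stand-in for PostgreSQL's my_log2(): smallest p with 2^p >= num */
static int
fake_log2(long num)
{
    int     p = 0;

    while ((1L << p) < num)
        p++;
    return p;
}

int
main(void)
{
    long    hash_table_bytes = 101017630802L;   /* values from this report */
    long    bucket_size = 48;
    int     p = fake_log2(hash_table_bytes / bucket_size);
    long    lbuckets;

    /* "1" and p are both int, so the shift is evaluated in 32 bits and wraps */
    lbuckets = 1 << p;
    printf("p = %d, int shift:  %ld\n", p, lbuckets);   /* p = 31, -2147483648 */

    /* doing the shift with a long constant keeps the result positive */
    lbuckets = 1L << p;
    printf("p = %d, long shift: %ld\n", p, lbuckets);   /* 2147483648 */

    return 0;
}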
The attached patch fixes the problem.
Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <[email protected]>
pgsql-fix-hash-nbuckets.patch
