Hi Hackers,

I noticed an int32 overflow problem in intarray’s compare_val_int4():
```
/*
 * Comparison function for binary search in mcelem array.
 */
static int
compare_val_int4(const void *a, const void *b)
{
    int32       key = *(int32 *) a;
    const Datum *t = (const Datum *) b;

    return key - DatumGetInt32(*t);
}
```

As this function is a bsearch comparator, it is supposed to return >0, =0
or <0. However this function uses subtraction with two int32 and returns an
int, which may result in an overflow. Say, key is INT32_MAX and *t is -1,
the return value will be negative due to overflow.

A simple C program shows the overflow:
```
#include <stdio.h>
#include <stdint.h>
#include <limits.h>

int main(void)
{
    int32_t a = INT32_MAX;
    int32_t b = -1;

    int32_t diff = a - b;  /* overflow */

    printf("a = %d\n", a);
    printf("b = %d\n", b);
    printf("a - b = %d\n", diff);

    return 0;
}
```

On my MacBook, the output is:
```
a = 2147483647
b = -1
a - b = -2147483648 <== INT32_MIN, but as a>b, we expect a positive value
for bsearch comparator
```

The fix is straightforward: compare the key and *t without doing a
potentially overflowing subtraction. Because no test covered this edge
case, I added one. The test loads arrays [-2], [-1], and [INT32_MAX],
ensuring -1 is the middle MCE value so the buggy comparator overflows on
the first bsearch comparison and fail to find INT32_MAX. It then checks the
EXPLAIN row estimation for a query on INT32_MAX. Before the fix, the lookup
falls back to DEFAULT_EQ_SEL and EXPLAIN
reports rows=1; after the fix, it finds the MCE and reports the correct row
estimation (the count of INT32_MAX rows).

Since it’s an edge case and doesn’t affect query results, I guess that’s
why it went unnoticed for years.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

Attachment: v1-0001-intarray-fix-compare-overflow-add-test.patch
Description: Binary data

Reply via email to