15.02.2022 13:28, Dmitry Yemanov wrote:
All,
Historically, we store most numerics as double precision inside index keys. It makes stored keys independent of the declared scale
and allows both prefix and suffix compression. However, int64 keys (BIGINT and largish NUMERICs/DECIMALs) do not fit the double
format without risk of rounding/truncation, so they're stored differently -- normalized to the maximum value that fits into double
and then re-scaled. The problem is that this format is not well-compressible.
...
Maybe we couldn't do any better before FB4, but now we have INT128 and DECFLOAT which use a more efficient index key format, so I
asked myself whether it makes sense to utilize it for int64 values too.
Below are the stats if BIGINT were stored as idx_decimal in the index:
T_BIGINT2 (134)
Index I_BIGINT2_ID (0)
Root page: 37193, depth: 3, leaf buckets: 739, nodes: 1000000
Average node length: 5.96, total dup: 0, max dup: 0
Average key length: 3.10, compression ratio: 1.59
Average prefix length: 3.87, average data length: 1.05
Index I_BIGINT2_COL (1)
Root page: 37215, depth: 2, leaf buckets: 697, nodes: 1000000
Average node length: 5.61, total dup: 367845, max dup: 8
Average key length: 2.76, compression ratio: 1.79
Average prefix length: 4.23, average data length: 0.70
Storage surely looks good, but internal conversions to Decimal128 are not free. I've done a few artificial tests (index scan in a
loop with many iterations, CPU-bound as the tables fit the page cache), here are the results:
(1) Wide range scan:
T_INT = 3.53s
T_BIGINT = 3.82s
T_INT128 = 4.36s
T_BIGINT2 = 3.72s
More or less what's expected: different numbers of pages are accessed, so T_BIGINT2 wins compared to T_BIGINT. Interesting that
T_INT128 has the same index size but is noticeably slower. Due to a bigger computational overhead?
(2) Narrow range scan:
T_INT = 2.74s
T_BIGINT = 2.79s
T_INT128 = 2.83s
T_BIGINT2 = 2.76s
Almost the same results for both BIGINT compressions.
(3) Unique scan:
T_INT = 1.83s
T_BIGINT = 1.89s
T_INT128 = 2.09s
T_BIGINT2 = 2.05s
This is the worst case for T_BIGINT2, it loses ~10% to T_BIGINT.
In cases (2) and (3) index size does not matter, only index depth matters. So the stats reflect the CPU overhead of btr.cpp::compress()
and data type conversions.
It's also likely that if BIGINT is used as a PK and its index is currently 4 levels deep, the new storage would shrink it down to 3
levels, and this could noticeably improve both lookup and (especially) join performance.
Does anyone think we should investigate this possibility more closely? Any other comments?
I'd vote to remove idx_numeric2 in favour of idx_decimal and look at how to improve
Decimal128::makeIndexKey() performance. For example, it stores every 3 decimal
digits in 10 bits using a computationally complex algorithm with
shifts and multiplications. Storing every digit in 4 bits would use 2 more bits per 3
digits but should save some CPU cycles, I believe.
Regards,
Vlad
Firebird-Devel mailing list, web interface at
https://lists.sourceforge.net/lists/listinfo/firebird-devel