All,

Historically, we store most numerics as double precision inside index keys. This makes stored keys independent of the declared scale and allows both prefix and suffix compression. However, int64 keys (BIGINT and high-precision NUMERICs/DECIMALs) do not fit the double format without a risk of rounding/truncation, so they're stored differently -- normalized to the maximum value that fits into a double and then re-scaled. The problem is that this format does not compress well.
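
For reference, the precision limit is easy to demonstrate: a double has a 53-bit significand, so not every int64 value survives a round trip through it. A minimal standalone sketch (illustration only, unrelated to the Firebird code itself):

    #include <cstdint>
    #include <cstdio>

    int main()
    {
        // 2^53 + 1 is the first positive integer a double cannot represent exactly
        const int64_t big = (INT64_C(1) << 53) + 1;
        const double d = static_cast<double>(big);

        // prints 9007199254740993 vs 9007199254740992 -- one unit is lost
        std::printf("int64 : %lld\n", static_cast<long long>(big));
        std::printf("double: %.0f\n", d);
        return 0;
    }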

Below are the storage/compression statistics for identical data (10^6 values) stored as INT, BIGINT and INT128, all indexed. Column ID is unique, column COL contains some duplicates.

T_INT (130)
    Index I_INT_ID (0)
        Root page: 10201, depth: 2, leaf buckets: 726, nodes: 1000000
        Average node length: 5.85, total dup: 0, max dup: 0
        Average key length: 3.00, compression ratio: 1.32
        Average prefix length: 2.96, average data length: 1.00
    Index I_INT_COL (1)
        Root page: 23911, depth: 2, leaf buckets: 683, nodes: 1000000
        Average node length: 5.51, total dup: 367845, max dup: 8
        Average key length: 2.65, compression ratio: 1.49
        Average prefix length: 3.32, average data length: 0.64

T_BIGINT (131)
    Index I_BIGINT_ID (0)
        Root page: 25109, depth: 3, leaf buckets: 1475, nodes: 1000000
        Average node length: 11.90, total dup: 0, max dup: 0
        Average key length: 9.04, compression ratio: 1.00
        Average prefix length: 2.96, average data length: 6.04
    Index I_BIGINT_COL (1)
        Root page: 26580, depth: 3, leaf buckets: 1157, nodes: 1000000
        Average node length: 9.32, total dup: 367845, max dup: 8
        Average key length: 6.46, compression ratio: 1.39
        Average prefix length: 5.17, average data length: 3.83

T_INT128 (132)
    Index I_INT128_ID (0)
        Root page: 27957, depth: 3, leaf buckets: 739, nodes: 1000000
        Average node length: 5.96, total dup: 0, max dup: 0
        Average key length: 3.10, compression ratio: 1.59
        Average prefix length: 3.87, average data length: 1.05
    Index I_INT128_COL (1)
        Root page: 27979, depth: 2, leaf buckets: 697, nodes: 1000000
        Average node length: 5.61, total dup: 367845, max dup: 8
        Average key length: 2.76, compression ratio: 1.79
        Average prefix length: 4.23, average data length: 0.70

As we can see, both BIGINT indices take almost twice as much storage as either INT (which is stored as DOUBLE in the index) or INT128 (which is stored as DECFLOAT in the index), due to the longer keys. Note that this also forces I_BIGINT_COL to have depth 3 rather than 2.
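
To make the "average prefix length" / "average data length" split above concrete: prefix compression stores, per node, only the bytes that differ from the previous key, so longer keys with shorter shared prefixes directly inflate the index. A simplified conceptual sketch (common_prefix is just an illustration, not the actual Firebird code):

    #include <algorithm>
    #include <cstddef>
    #include <cstdio>

    // Simplified sketch of btree prefix compression (conceptual only, not
    // the actual Firebird code): a node records how many leading bytes it
    // shares with the previous key and stores only the differing suffix.
    std::size_t common_prefix(const unsigned char* a, std::size_t alen,
                              const unsigned char* b, std::size_t blen)
    {
        const std::size_t limit = std::min(alen, blen);
        std::size_t i = 0;
        while (i < limit && a[i] == b[i])
            ++i;
        return i;
    }

    int main()
    {
        const unsigned char prev[] = {0x01, 0x02, 0x03, 0x04};
        const unsigned char next[] = {0x01, 0x02, 0x03, 0x09};

        const std::size_t prefix = common_prefix(prev, sizeof prev, next, sizeof next);
        // only the 1-byte suffix is stored; shorter keys tend to share
        // longer prefixes, so less data is kept per node
        std::printf("prefix: %zu, stored suffix: %zu byte(s)\n",
                    prefix, sizeof next - prefix);
        return 0;
    }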

Maybe we couldn't do any better before FB4, but now we have INT128 and DECFLOAT, which use a more efficient index key format, so I asked myself whether it makes sense to utilize it for int64 values too.
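
To illustrate the intuition (a toy model only, not the real idx_decimal encoding, and toy_decimal_key below is just an illustrative name): a digit-oriented key grows with the number of significant digits instead of always occupying 8 bytes, which is what makes it both shorter and friendlier to the prefix compression:

    #include <cstdint>
    #include <cstdio>
    #include <vector>

    // Toy model of a digit-oriented key (NOT the actual idx_decimal layout):
    // keep only the significant decimal digits, so small values yield short
    // keys and equal leading digits can prefix-compress across nodes.
    std::vector<std::uint8_t> toy_decimal_key(std::uint64_t v)
    {
        std::vector<std::uint8_t> digits;
        do {
            digits.insert(digits.begin(), static_cast<std::uint8_t>(v % 10));
            v /= 10;
        } while (v);
        return digits;  // a real format also encodes sign/exponent for ordering
    }

    int main()
    {
        // a raw int64 key always costs 8 bytes; the digit form costs one
        // byte per significant digit (left unpacked here for clarity)
        for (std::uint64_t v : {7u, 12345u, 1000000u})
            std::printf("%llu -> %zu byte(s) vs 8 fixed\n",
                        static_cast<unsigned long long>(v),
                        toy_decimal_key(v).size());
        return 0;
    }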

Below are the stats if BIGINT were stored as idx_decimal in the index:

T_BIGINT2 (134)
    Index I_BIGINT2_ID (0)
        Root page: 37193, depth: 3, leaf buckets: 739, nodes: 1000000
        Average node length: 5.96, total dup: 0, max dup: 0
        Average key length: 3.10, compression ratio: 1.59
        Average prefix length: 3.87, average data length: 1.05
    Index I_BIGINT2_COL (1)
        Root page: 37215, depth: 2, leaf buckets: 697, nodes: 1000000
        Average node length: 5.61, total dup: 367845, max dup: 8
        Average key length: 2.76, compression ratio: 1.79
        Average prefix length: 4.23, average data length: 0.70

Storage surely looks good, but the internal conversions to Decimal128 are not free. I've done a few artificial tests (an index scan in a loop with many iterations, CPU-bound as the tables fit into the page cache); here are the results:

(1) Wide range scan:
  T_INT = 3.53s
  T_BIGINT = 3.82s
  T_INT128 = 4.36s
  T_BIGINT2 = 3.72s

More or less what's expected: a different number of pages is accessed, so T_BIGINT2 wins compared to T_BIGINT. Interestingly, T_INT128 has the same index size but is noticeably slower. Due to higher computational overhead?

(2) Narrow range scan:
  T_INT = 2.74s
  T_BIGINT = 2.79s
  T_INT128 = 2.83s
  T_BIGINT2 = 2.76s

Almost the same results for both BIGINT key formats.

(3) Unique scan:
  T_INT = 1.83s
  T_BIGINT = 1.89s
  T_INT128 = 2.09s
  T_BIGINT2 = 2.05s

This is the worst case for T_BIGINT2: it loses ~10% to T_BIGINT.

In cases (2) and (3) the index size does not matter, only the index depth does. So the stats reflect the CPU overhead of btr.cpp::compress() and the data type conversions.

It's also likely that if BIGINT is used as a PK and its index is currently 4 levels deep, the new storage would shrink it down to 3 levels, and this could noticeably improve both lookup and (especially) join performance.

Does anyone think we should investigate this possibility more closely? Any other comments?


Dmitry


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel