Re: PROPOSAL: Moving away from DB-enforced auto-increment primary keys

Mihaly Dallos Fri, 17 Oct 2025 04:27:27 -0700

Hi Kapil,
I would say the 10/90% ratio depends on the usage? Some places most of the
calls are based on resource id, such as client, loan etc.
Nevertheless, a perf improve of 10% is > then no improve of 100%
M.


On Fri, Oct 17, 2025 at 5:28 AM Kapil Panchal <
[email protected]> wrote:

> Databases store rows as key-value pairs keyed by the primary key. While
> UUIDs or Snowflake IDs can marginally improve primary-key lookup times via
> O(log n) algorithms, this benefits at most 10% of queries. The remaining
> 90% of searches—those performed on columns or fields—still require scanning
> O(n) rows, where such an approach offers no improvement. In practical
> terms, adopting UUIDs or Snowflake IDs does not yield measurable
> performance gains for the majority of operations and risks creating the
> impression of optimization without substantive effect.
>
> On Thu, Sep 4, 2025 at 2:26 PM Ádám Sághy <[email protected]> wrote:
>
>> Dear Fineract community,
>>
>> I’d like to open a discussion on our current use of auto-increment
>> database primary keys (e.g. BIGINT IDENTITY) and whether it’s time to
>> evolve this design.
>>
>> *Current Issues with Auto-Increment PKs*
>>
>>    - Always requires flush/commit: The database generates the ID only
>>    once the transaction is flushed or committed, which makes it difficult to
>>    work with new entities before persistence. This complicates batching,
>>    caching, and testability.
>>    - Performance implications: Because IDs are only known after flush,
>>    code often performs additional flushes just to obtain generated IDs. This
>>    introduces unnecessary round-trips and slows down persistence at scale.
>>    Also this cause various issues around having multiple flushes per
>>    transactions.
>>    - Predictability: Sequential IDs are trivially guessable. When
>>    exposed via APIs, this can pose a security and privacy concern.
>>    - Operational flexibility: While a single database removes the need
>>    for distributed coordination, relying solely on DB-generated sequences 
>> ties
>>    identity generation to persistence, which reduces flexibility when
>>    considering service decomposition, replication, or multi-tenant setups.
>>
>>
>> *Recommendation: Snowflake-style BIGINT IDs*
>>
>> Application-level ID generation using a Snowflake-style algorithm (64-bit
>> IDs with timestamp + machine ID + sequence).
>>
>> Pros:
>>
>>    - IDs are available immediately in memory, no flush required.
>>    - Globally unique across instances.
>>    - Roughly time-ordered, which helps with debugging and querying.
>>    - Removes guessability of plain sequential IDs.
>>    - Still compact BIGINT for indexing and joins.
>>    - *No need for migration!*
>>
>> Cons:
>>
>>    - Requires a generator implementation and clock synchronization
>>    safeguards.
>>    - IDs are less human-friendly than sequential integers.
>>
>>
>>
>> Looking forward to hear your thoughts on this proposal!
>>
>>
>> Regards,
>>
>> Adam
>>
>>
>>
>

-- 
Mihaly

Re: PROPOSAL: Moving away from DB-enforced auto-increment primary keys

Reply via email to