Hi Kapil, I would say the 10/90% ratio depends on the usage? Some places most of the calls are based on resource id, such as client, loan etc. Nevertheless, a perf improve of 10% is > then no improve of 100% M.
On Fri, Oct 17, 2025 at 5:28 AM Kapil Panchal < [email protected]> wrote: > Databases store rows as key-value pairs keyed by the primary key. While > UUIDs or Snowflake IDs can marginally improve primary-key lookup times via > O(log n) algorithms, this benefits at most 10% of queries. The remaining > 90% of searches—those performed on columns or fields—still require scanning > O(n) rows, where such an approach offers no improvement. In practical > terms, adopting UUIDs or Snowflake IDs does not yield measurable > performance gains for the majority of operations and risks creating the > impression of optimization without substantive effect. > > On Thu, Sep 4, 2025 at 2:26 PM Ádám Sághy <[email protected]> wrote: > >> Dear Fineract community, >> >> I’d like to open a discussion on our current use of auto-increment >> database primary keys (e.g. BIGINT IDENTITY) and whether it’s time to >> evolve this design. >> >> *Current Issues with Auto-Increment PKs* >> >> - Always requires flush/commit: The database generates the ID only >> once the transaction is flushed or committed, which makes it difficult to >> work with new entities before persistence. This complicates batching, >> caching, and testability. >> - Performance implications: Because IDs are only known after flush, >> code often performs additional flushes just to obtain generated IDs. This >> introduces unnecessary round-trips and slows down persistence at scale. >> Also this cause various issues around having multiple flushes per >> transactions. >> - Predictability: Sequential IDs are trivially guessable. When >> exposed via APIs, this can pose a security and privacy concern. >> - Operational flexibility: While a single database removes the need >> for distributed coordination, relying solely on DB-generated sequences >> ties >> identity generation to persistence, which reduces flexibility when >> considering service decomposition, replication, or multi-tenant setups. >> >> >> *Recommendation: Snowflake-style BIGINT IDs* >> >> Application-level ID generation using a Snowflake-style algorithm (64-bit >> IDs with timestamp + machine ID + sequence). >> >> Pros: >> >> - IDs are available immediately in memory, no flush required. >> - Globally unique across instances. >> - Roughly time-ordered, which helps with debugging and querying. >> - Removes guessability of plain sequential IDs. >> - Still compact BIGINT for indexing and joins. >> - *No need for migration!* >> >> Cons: >> >> - Requires a generator implementation and clock synchronization >> safeguards. >> - IDs are less human-friendly than sequential integers. >> >> >> >> Looking forward to hear your thoughts on this proposal! >> >> >> Regards, >> >> Adam >> >> >> > -- Mihaly
