Hi Victor, Adam et al,

There seems to be some general acknowledgment of the risks of handling ID 
generation at the application level rather than the database level, 
particularly for clusters, but I am also hearing that it is necessary. I know 
little about the deployment cases and operational challenges for real 
customers, so I have to defer to you, Victor, and others who say this is 
necessary.

Still, on to my second question, which relates to "risk": how is any new ID 
generation scheme going to be tested under load and made bulletproof? What 
load-generation tools exist for testing this, in both clustered and 
non-clustered scenarios, to make sure it works? Because, to say the least, 
this is not trivial.
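To make the question concrete, here is a hypothetical sketch (Python for brevity; Fineract itself is Java, and this is not Fineract code) of the kind of property test I have in mind: a toy Snowflake-style generator, with every bit width an assumption borrowed from the classic Twitter design (41-bit millisecond timestamp, 10-bit machine ID, 12-bit sequence), hammered by several threads and checked for uniqueness:

```python
# Hypothetical sketch, NOT Fineract code: a toy Snowflake-style generator
# (41-bit timestamp / 10-bit machine ID / 12-bit sequence split assumed)
# hammered by several threads, asserting uniqueness at the end.
import threading
import time

EPOCH_MS = 1_600_000_000_000  # arbitrary custom epoch (assumption)

class SnowflakeGenerator:
    def __init__(self, machine_id):
        assert 0 <= machine_id < 1024       # must fit in 10 bits
        self.machine_id = machine_id
        self.lock = threading.Lock()
        self.last_ms = -1
        self.seq = 0

    def next_id(self):
        with self.lock:
            now = int(time.time() * 1000) - EPOCH_MS
            if now == self.last_ms:
                self.seq = (self.seq + 1) & 0xFFF  # 12-bit sequence
                if self.seq == 0:                  # exhausted this millisecond:
                    while now <= self.last_ms:     # spin until the clock advances
                        now = int(time.time() * 1000) - EPOCH_MS
            else:
                # NOTE: a backwards clock jump falls through here unhandled,
                # which is exactly the failure mode this thread is worried about.
                self.seq = 0
            self.last_ms = now
            return (now << 22) | (self.machine_id << 12) | self.seq

def worker(gen, sink, n):
    sink.extend(gen.next_id() for _ in range(n))

gen = SnowflakeGenerator(machine_id=1)
sinks = [[] for _ in range(8)]
threads = [threading.Thread(target=worker, args=(gen, s, 5000)) for s in sinks]
for t in threads:
    t.start()
for t in threads:
    t.join()

ids = [i for s in sinks for i in s]
assert len(ids) == len(set(ids)), "duplicate IDs under concurrent load"
print(len(ids), "IDs generated, all unique")
```

A real harness would of course go further: multiple JVMs/nodes, deliberate NTP skew and clock rollback, and sustained throughput measurement. General load tools like JMeter or Gatling cover the traffic side, but the uniqueness property itself needs an explicit check like the one above.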

regards
Tom




> On 5 Sep 2025, at 9:25 am, VICTOR MANUEL ROMERO RODRIGUEZ 
> <victor.rom...@fintecheando.mx> wrote:
> 
> Hi Adam,
> 
> I prefer to go for this approach.
> 
> Snowflake BIGINT → no migration, performance-friendly, but requires care 
> around machine ID/clock handling.
> 
> The machine ID and clock handling are items that, I consider, are already 
> managed in production environments. And if the system is already connected to 
> a payment system, the IT admins are already aware of, or at least familiar 
> with, maintaining clock synchronization while working with distributed 
> systems connected to Apache Fineract.
> 
> Regards
> 
> Victor
> 
> 
> On Fri, 5 Sep 2025 at 2:03 a.m., Ádám Sághy 
> <adamsa...@gmail.com <mailto:adamsa...@gmail.com>> wrote:
> Hi Paul, Tom, James, Victor,
> 
> Thanks again for sharing your experience! 
> 
> I see why UUIDv7 can be attractive: time-ordered, globally unique, and avoids 
> the pitfalls of machine IDs and clock sync. 
> 
> That said, one major consideration for us is migration effort.
> 
> UUIDv7 would require a full schema and data migration, touching every table, 
> index, and foreign key. For a system like Fineract this is a massive 
> undertaking, realistically close to a year of work before things are stable, 
> with high risk and an ongoing maintenance cost for projects that have been 
> using Fineract for a while or are running it with massive amounts of data.
> 
> Snowflake-style BIGINTs, on the other hand, can be introduced without 
> changing the existing PK data type. We can keep the schema as-is, avoid 
> re-indexing terabytes of data, and stay compatible with all current 
> integrations.
> 
> Performance is the other angle: UUIDv7 doubles index size (128-bit vs 64-bit) 
> and tends to fragment B-tree indexes more than sequential or roughly 
> sequential IDs. Snowflake IDs remain compact BIGINTs, time-ordered enough for 
> efficient indexing, and guarantee immediate availability in memory without 
> requiring a flush.
> 
> There’s another very important angle I’d like to add to this discussion: our 
> current PK generation strategy. Today Fineract relies on IDENTITY columns, 
> which means the database only provides an ID during flush/commit. The backend 
> cannot obtain an ID before that point. Unfortunately, the platform depends 
> heavily on having an ID earlier in the lifecycle for things like:
> 
> Creating and sending business events with the new entity ID
> 
> Populating result objects with the resource ID immediately after creation
> 
> Enabling true batch write operations (currently blocked)
> 
> Avoiding unnecessary flushes purely to retrieve generated IDs
> 
> This limitation has significant impact on performance, testability, and 
> developer experience.
> 
> That’s why I’ve been leaning toward application-level ID generation 
> (Snowflake-style BIGINTs). They solve the “flush dependency” problem while 
> retaining our current schema and data type, so no massive migration is 
> required. UUIDv7 would address uniqueness and ordering, but it would mean 
> replacing all PK columns, indexes, and FKs — realistically a year-long, 
> high-risk migration for a project of this size.
> 
> Of course, Snowflake IDs aren’t perfect — after ~50 years we’d eventually hit 
> their limits, but realistically no system of this kind goes half a century 
> without a redesign anyway.
> 
> So to me the trade-off looks like:
> 
> UUIDv7 → simpler conceptually, but very heavy migration + larger 
> index/storage footprint.
> 
> Snowflake BIGINT → no migration, performance-friendly, but requires care 
> around machine ID/clock handling.
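> For reference, the bit layout assumed above (the classic Twitter Snowflake 
> split; the exact widths for Fineract would still need to be decided) packs 
> everything into a signed 64-bit integer:
> 
> ```
> 1 sign bit | 41 bits: ms since custom epoch | 10 bits: machine ID | 12 bits: per-ms sequence
> ```
> 
> The width of the millisecond timestamp field is what bounds the scheme's 
> multi-decade lifespan mentioned earlier.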
> 
> 
> 
> In the end, the question is whether we want to fix one problem on a short 
> implementation timeline, or fix two problems on a much longer timeline, with 
> the added risk of a very complex data migration!
> 
> 
> 
> As an ending note, I really love this conversation and all the ideas shared — 
> that was exactly my intention in starting the thread. Please keep it going; 
> we should consider all angles and hear all voices in order to select the best 
> option for the community and the project!
> 
> Best,
> Adam
> 
> 
>> On 2025. Sep 4., at 13:36, Paul <pchristi...@gmail.com 
>> <mailto:pchristi...@gmail.com>> wrote:
>> 
>> Hi Adam, 
>> I'm not super technical, but have faced similar issues in the past and will 
>> try to regurgitate what we discovered as a "better way" which might be 
>> useful. 
>> 
>> First, let me share that using the machine ID in the sequence ended up 
>> being an ugly pain point and a near-catastrophic issue for us... so while 
>> your solution would be an improvement, it creates new risks that are just 
>> as onerous. That was on bare-metal servers. With cloud services, scaling 
>> and churn would seem to multiply the risk. While I'm sure there are ways 
>> around each challenge, each "way around" complicates the solution and, 
>> worse, increases time, cost, and cognitive load.
>> 
>> I'm going to suggest (PLEASE double-check my sanity) 128-bit IDs using 
>> time-ordered UUIDv7. This provides global uniqueness, with the only 
>> negative being a slight increase in index size (16 vs. 8 bytes). It should 
>> avoid several "what ifs" around clock issues (jumps and fall-backs) and the 
>> risks of using a machine ID; you will discover that a LOT of extra effort 
>> is required to make those work, and THEN, after ALL that, you will find 
>> that uniqueness is still NOT certain. I found this out the hard way. In 
>> short, we tried what sounds like your method and failed to get a truly 
>> scalable solution. In fact, we made 3 major efforts to get it to "play 
>> nice" and failed.
>> 
>> This simplified method provides immediate ops benefit and is a low-risk 
>> implementation. (Our lead developer suggested it was zero risk, but I 
>> don't believe in such a thing.) We were working with Postgres and a Java 
>> app, NOT Fineract, but I assume it's generally applicable.
>> 
>> Assumptions: 
>> 1) You're not trying to remove or replace primary keys, and will keep them 
>> for internal use 
>> 2) Postgres handles indexing of uuid columns auto-magically, as long as 
>> the column is marked unique
>> 3) Use the uuid as the public ID in APIs
>> 
>> If you think this is helpful or a better option, feel free to quiz me.  
>> 
>> Regards
>> Paul 
> 
