+1 to the approach that Arnold has explained.

On Fri, Sep 5, 2025 at 5:56 AM Arnold Galovics <arnold_galov...@docktape.com> wrote:
> Hey guys,
>
> If you remember, Adam, I talked to you about this ~2 years ago :))
>
> As I did then, I'm still suggesting the usage of UUIDv7 even though it
> has serious implications in terms of migration. In my opinion the
> long-term gains it would bring are much more valuable to the project
> than the required timeline. We can easily split this work if someone
> spends some time on figuring it out architecturally and execution-wise.
>
> Just a quick thought on how this could be done without manual
> migrations (the example is for Loan):
> - expand the loan table with a new uuid column (not a PK yet)
> - expand the loan repayment table (just one of the relationships) to
>   include a new uuid-based FK column (not yet an FK)
> - generate uuids for the loans
> - use the original loan ID to find the corresponding repayment rows and
>   fill the new FK column with the respective new uuids
> - enforce the foreign key on the repayment uuid FK column, then rename
>   the original FK column and drop its FK constraint (or maybe leave it
>   as a legacy FK; both can make sense)
> - switch the uuid column in loan to be the PK and rename the original
>   one to legacy_id or something
>
> Something like this could work, but it needs to be done step by step.
>
> Best,
> Arnold
>
> Arnold Gálovics
> *CEO, Co-Founder*
> *+36 30 904 1885*
> https://docktape.com
> Docktape Technologies
>
> On Fri, Sep 5, 2025 at 10:26 AM VICTOR MANUEL ROMERO RODRIGUEZ <victor.rom...@fintecheando.mx> wrote:
>
>> Hi Adam,
>>
>> I prefer to go for this approach:
>>
>> - *Snowflake BIGINT* → no migration, performance-friendly, but
>>   requires care around machine ID/clock handling.
>>
>> The machine ID and clock handling are items that, I consider, are
>> already managed in production environments.
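Arnold's expand-and-backfill sequence could be sketched in SQL along these lines. All table, column, and constraint names here are illustrative, PostgreSQL is assumed, and `uuid_generate_v7()` is hypothetical (supplied by an extension or generated application-side; it is not a PostgreSQL built-in):

```sql
-- Sketch of the expand-and-backfill steps for Loan; names are illustrative.
-- uuid_generate_v7() is assumed to exist (extension or application-side).

-- 1. Expand: new columns, no constraints yet.
ALTER TABLE m_loan ADD COLUMN uuid_id uuid;
ALTER TABLE m_loan_repayment ADD COLUMN loan_uuid_id uuid;

-- 2. Backfill uuids for the loans.
UPDATE m_loan SET uuid_id = uuid_generate_v7() WHERE uuid_id IS NULL;

-- 3. Use the original loan ID to fill the new FK column on repayments.
UPDATE m_loan_repayment r
SET    loan_uuid_id = l.uuid_id
FROM   m_loan l
WHERE  r.loan_id = l.id;

-- 4. Enforce: uniqueness on the referenced column, then the new FK.
ALTER TABLE m_loan ADD CONSTRAINT uq_m_loan_uuid UNIQUE (uuid_id);
ALTER TABLE m_loan_repayment
  ALTER COLUMN loan_uuid_id SET NOT NULL,
  ADD CONSTRAINT fk_repayment_loan_uuid
      FOREIGN KEY (loan_uuid_id) REFERENCES m_loan (uuid_id);

-- 5. Retire the legacy FK and switch the PK (the legacy FK could also
--    be kept, as Arnold notes). Constraint names are assumed defaults.
ALTER TABLE m_loan_repayment DROP CONSTRAINT fk_repayment_loan;
ALTER TABLE m_loan DROP CONSTRAINT m_loan_pkey;
ALTER TABLE m_loan ADD PRIMARY KEY (uuid_id);
ALTER TABLE m_loan RENAME COLUMN id TO legacy_id;
```

Each step can ship as its own migration, so the table stays readable and writable between steps.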
>> And if the system is already connected to a payment system, the IT
>> admins are already aware of, or more familiar with, clock
>> synchronization when working with distributed systems connected to
>> Apache Fineract.
>>
>> Regards
>>
>> Victor
>>
>> On Fri, Sep 5, 2025 at 2:03 AM, Ádám Sághy <adamsa...@gmail.com> wrote:
>>
>>> Hi Paul, Tom, James, Victor,
>>>
>>> Thanks again for sharing your experience!
>>>
>>> I see why UUIDv7 can be attractive: time-ordered, globally unique,
>>> and it avoids the pitfalls of machine IDs and clock sync.
>>>
>>> That said, one major consideration for us is migration effort.
>>>
>>> - *UUIDv7* would require a *full schema and data migration*, touching
>>>   every table, index, and foreign key. For a system like Fineract
>>>   this is a massive undertaking, realistically close to a year of
>>>   work before things are stable, with high risk and an ongoing
>>>   maintenance cost for projects that have been running Fineract for a
>>>   while or that hold a massive amount of data.
>>> - *Snowflake-style BIGINTs*, on the other hand, can be introduced
>>>   without changing the existing PK data type. We can keep the schema
>>>   as-is, avoid re-indexing terabytes of data, and stay compatible
>>>   with all current integrations.
>>>
>>> Performance is the other angle: UUIDv7 doubles index size (128-bit vs
>>> 64-bit keys) and tends to fragment B-tree indexes more than
>>> sequential or roughly sequential IDs. Snowflake IDs remain compact
>>> BIGINTs, are time-ordered enough for efficient indexing, and are
>>> available in memory immediately, without requiring a flush.
>>>
>>> There is another very important angle I would like to add to this
>>> discussion: our current *PK generation strategy*. Today Fineract
>>> relies on IDENTITY columns, which means the database only provides an
>>> ID during flush/commit. The backend cannot obtain an ID before that
>>> point.
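The Snowflake-style generation discussed in this thread fits in a few lines. A minimal sketch (not Fineract code) using the classic 41-bit timestamp / 10-bit machine ID / 12-bit sequence split; the custom epoch and the refuse-on-clock-rollback policy are illustrative assumptions:

```java
// Minimal Snowflake-style 64-bit ID generator sketch (illustrative, not
// Fineract code). Layout: 41-bit ms timestamp | 10-bit machine ID | 12-bit
// per-millisecond sequence.
public class SnowflakeIdGenerator {
    private static final long EPOCH_MS = 1704067200000L; // 2024-01-01 UTC, assumed custom epoch
    private static final int MACHINE_ID_BITS = 10;
    private static final int SEQUENCE_BITS = 12;
    private static final long MAX_SEQUENCE = (1L << SEQUENCE_BITS) - 1;

    private final long machineId;
    private long lastTimestamp = -1L;
    private long sequence = 0L;

    public SnowflakeIdGenerator(long machineId) {
        if (machineId < 0 || machineId >= (1L << MACHINE_ID_BITS)) {
            throw new IllegalArgumentException("machineId out of range");
        }
        this.machineId = machineId;
    }

    public synchronized long nextId() {
        long now = System.currentTimeMillis();
        if (now < lastTimestamp) {
            // Clock moved backwards: refuse to issue IDs rather than risk duplicates.
            throw new IllegalStateException("clock moved backwards");
        }
        if (now == lastTimestamp) {
            sequence = (sequence + 1) & MAX_SEQUENCE;
            if (sequence == 0) {
                // Sequence exhausted for this millisecond; wait for the next one.
                while (now <= lastTimestamp) {
                    now = System.currentTimeMillis();
                }
            }
        } else {
            sequence = 0;
        }
        lastTimestamp = now;
        return ((now - EPOCH_MS) << (MACHINE_ID_BITS + SEQUENCE_BITS))
                | (machineId << SEQUENCE_BITS)
                | sequence;
    }
}
```

Because `nextId()` needs no database round-trip, the entity can carry its ID before any flush, which is exactly the "ID earlier in the lifecycle" property discussed in this thread.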
>>> Unfortunately, the platform depends heavily on having an ID *earlier*
>>> in the lifecycle for things like:
>>>
>>> - creating and sending business events with the new entity ID
>>> - populating result objects with the resource ID immediately after
>>>   creation
>>> - enabling true batch write operations (currently blocked)
>>> - avoiding unnecessary flushes purely to retrieve generated IDs
>>>
>>> This limitation has a significant impact on performance, testability,
>>> and developer experience.
>>>
>>> That is why I have been leaning toward application-level ID
>>> generation (Snowflake-style BIGINTs). It solves the "flush
>>> dependency" problem while retaining our current schema and data
>>> types, so *no massive migration* is required. UUIDv7 would address
>>> uniqueness and ordering, but it would mean replacing all PK columns,
>>> indexes, and FKs: realistically a year-long, high-risk migration for
>>> a project of this size.
>>>
>>> Of course, Snowflake IDs are not perfect: after ~50 years we would
>>> eventually hit their limits, but realistically no system of this kind
>>> goes half a century without a redesign anyway.
>>>
>>> So to me the trade-off looks like:
>>>
>>> - *UUIDv7* → conceptually simpler, but a very heavy migration plus a
>>>   larger index/storage footprint.
>>> - *Snowflake BIGINT* → no migration, performance-friendly, but
>>>   requires care around machine ID/clock handling.
>>>
>>> *In the end the question is whether we want to fix one problem with a
>>> short implementation timeline, or fix two problems with a pretty
>>> lengthy implementation timeline and the risk of a very complex data
>>> migration!*
>>>
>>> *As an ending note, I really love this conversation and all the ideas
>>> shared; that was exactly my intention in starting the thread.
>>> Please keep it going; we should consider all angles and hear all
>>> voices in order to select the best option for the community and the
>>> project!*
>>>
>>> Best,
>>> Adam
>>>
>>> On 2025. Sep 4., at 13:36, Paul <pchristi...@gmail.com> wrote:
>>>
>>> Hi Adam,
>>>
>>> I'm not super technical, but I have faced similar issues in the past
>>> and will try to regurgitate what we discovered as a "better way",
>>> which might be useful.
>>>
>>> First, let me share that using the machine ID in the sequence ended
>>> up being an ugly pain in the ass and was a near-catastrophic issue
>>> for us, so while your solution would be an improvement, it creates
>>> new risks which are just as onerous. That was on metal servers. With
>>> cloud services, scaling and churn would seem to multiply the risk.
>>> While I'm sure there are ways around each challenge, each "way
>>> around" complicates the solution and, worse, increases time, cost,
>>> and cognitive load.
>>>
>>> I'm going to suggest (PLEASE double-check my sanity) 128-bit IDs
>>> using time-ordered UUIDs (v7). This provides global uniqueness, with
>>> the only negative being a slight increase in index size (16 vs 8
>>> bytes). It should avoid several "what ifs" around clock issues (jumps
>>> and fall-backs) and the risks of using a machine ID; you will
>>> discover that a LOT of extra effort is required to make those work,
>>> and THEN, after ALL that, you will find that uniqueness is NOT
>>> certain. I found this out the hard way. In short, we tried what
>>> sounds like your method and failed to end up with a truly scalable
>>> solution. In fact, we made 3 major efforts to get it to "play nice"
>>> and failed.
>>>
>>> This simplified method provides immediate ops benefit and is a
>>> low-risk implementation. (Our lead developer suggested it was zero
>>> risk, but I don't believe in such a thing.) We were working with
>>> Postgres and a Java app, NOT Fineract, but I assume it is generally
>>> applicable.
>>> Assumptions:
>>> 1) You're not trying to remove or replace primary keys and will keep
>>>    them for internal use
>>> 2) Postgres handles indexing of the uuid column auto-magically, as
>>>    long as the column is marked unique
>>> 3) Use the uuid as the public ID in APIs
>>>
>>> If you think this is helpful or a better option, feel free to quiz
>>> me.
>>>
>>> Regards,
>>> Paul
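For completeness: UUIDv7 values like the ones Paul suggests can be built with the JDK alone, since `java.util.UUID` has no v7 factory as of Java 21. A minimal sketch following the RFC 9562 layout (48-bit Unix-ms timestamp, version and variant bits, random remainder):

```java
import java.security.SecureRandom;
import java.util.UUID;

// Minimal UUIDv7 sketch per RFC 9562: 48-bit unix-ms timestamp, 4-bit
// version (7), 12 random bits (rand_a), 2-bit variant (10), 62 random
// bits (rand_b). Illustrative only, not a hardened implementation.
public final class UuidV7 {
    private static final SecureRandom RANDOM = new SecureRandom();

    public static UUID generate() {
        long now = System.currentTimeMillis();
        // High 64 bits: timestamp(48) | version 0b0111 (4) | rand_a (12).
        long msb = (now << 16) | 0x7000L | (RANDOM.nextLong() & 0x0FFFL);
        // Low 64 bits: variant 0b10 (2) | rand_b (62).
        long lsb = (RANDOM.nextLong() & 0x3FFFFFFFFFFFFFFFL)
                 | 0x8000000000000000L;
        return new UUID(msb, lsb);
    }
}
```

Because the most significant bits are the millisecond timestamp, freshly generated values sort roughly by creation time, which is what keeps B-tree inserts close to the right-hand edge of the index.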