kaxil commented on PR #43243:
URL: https://github.com/apache/airflow/pull/43243#issuecomment-2441487051

@potiuk Yeah, I considered hashing, but hash-based IDs are bad for database indexing and, as a result, for querying, since they have no temporal ordering. That is on top of the collision-handling complexity. UUID v7 is explicitly designed to support distributed databases with high insert rates, thanks to its temporal ordering.

> BTW. One of the ways it could be helped - we **could** potentially generate the unique id by hashing the remaining fields - with an appropriate hashing algorithm, we could have a very low probability of collision (and maybe we could implement a mechanism to detect collisions and handle them in a similar way to how they are handled in hash maps) - that would be another way we could approach it (and there the merge code could be simpler and still use session.merge()).
>
> But this one also comes with its own set of difficulties, and collision handling is likely going to be complex. But I thought it's worth mentioning it here as an option.
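
To illustrate the trade-off discussed above, here is a minimal Python sketch. It is not code from the PR. `uuid7_sketch` hand-builds an ID following the UUID v7 bit layout (48-bit millisecond timestamp, then version, variant, and random bits), and `hashed_id` derives an ID by hashing a set of fields. Both function names, the field values, and the choice of SHA-256 truncated to 128 bits are assumptions made only for this example.

```python
import hashlib
import os
import time
import uuid
from typing import Optional


def uuid7_sketch(unix_ms: Optional[int] = None) -> uuid.UUID:
    """Hypothetical helper: build an ID using the UUID v7 bit layout."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits
    # The top 48 bits carry the timestamp, so later IDs compare greater.
    value = ((unix_ms & ((1 << 48) - 1)) << 80) | rand
    value = (value & ~(0xF << 76)) | (0x7 << 76)  # version field = 7
    value = (value & ~(0x3 << 62)) | (0x2 << 62)  # variant field = 0b10
    return uuid.UUID(int=value)


def hashed_id(*fields: str) -> uuid.UUID:
    """Hypothetical helper: derive an ID by hashing the given fields."""
    # A separator keeps ("ab", "c") distinct from ("a", "bc").
    digest = hashlib.sha256("\x1f".join(fields).encode()).digest()
    return uuid.UUID(bytes=digest[:16])  # truncated to 128 bits


# IDs created at increasing timestamps are already in index order.
time_ordered = [uuid7_sketch(1_700_000_000_000 + i) for i in range(5)]
print(time_ordered == sorted(time_ordered))

# Hashed IDs are deterministic, but their sort order is unrelated to
# creation time, so inserts land at arbitrary positions in an index.
hashed = [hashed_id("example_dag", "example_task", f"run_{i}") for i in range(5)]
print(hashed == sorted(hashed))
```

The hashed variant yields the same ID for the same fields, which is what would make a `session.merge()`-style upsert simple, but it gives up the insert locality of time-ordered IDs, and it would still need a collision-handling strategy.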
