I am not sure why do you want to change on-disk storage format? If we are talking about indexes, it's more about comparison function (opclass) that is used in an index. Am I wrong?
сб, 25 трав. 2019 о 11:21 Ancoron Luciferis < ancoron.lucife...@googlemail.com> пише: > On 25/05/2019 16:57, Tom Lane wrote: > > Ancoron Luciferis <ancoron.lucife...@googlemail.com> writes: > >> So I investigated the PostgreSQL code to see how it is handling UUID's > >> with respect to storage, sorting, aso. but all I could find was that it > >> basically falls back to the 16-byte. > > > > Yup, they're just blobs to us. > > > >> After struggling to find a way to optimize things inside the database, I > >> reverted to introduce a hack into the application by not shuffling the > >> timestamp bytes for the UUID's, which makes it look quite serial in > >> terms of byte order. > > > >> So, my question now is: Would it make sense for you to handle these > >> time-based UUID's differently internally? Specifically un-shuffling the > >> timestamp before they are going to storage? > > > > No, because > > > > (1) UUID layout is standardized; > > You mean the presentation at the byte-level is. ;) > > > > > (2) such a change would break on-disk compatibility for existing > > databases; > > Yes, that certainly is a show-stopper. > > > > > (3) particularly for the case of application-generated UUIDs, we do > > not have enough information to know that this would actually do anything > > useful; > > Well, not only the layout is standardized, but also there is a semantic > to it depending on the version. Specifically for version 1, it has: > 1. a timestamp > 2. a clock sequence > 3. a node id > > Ans as PostgreSQL already provides this pretty concrete data type, it > could be a natural consequence to also support the semantic of it. > > E.g. the network types also come with a lot of additional operators and > functions. So I don't see a reason not to respect the additional > capabilities of a UUID. > > For other versions of UUID's, functions like timestamp would certainly > not be available (return NULL?), respecting the concrete semantic. > > > > > (4) it in fact *wouldn't* do anything useful, because we'd still have > > to sort UUIDs in the same order as today, meaning that btree index > behavior > > would remain the same as before. Plus UUID comparison would get a lot > > more complicated and slower than it is now. > > I get your first sentence, but not your second. I know that when > changing the internal byte order we'd have to completed re-compute > everything on-disk (from table to index data), but why would the sorting > in the index have to be the same? > > And actually, comparison logic wouldn't need to be changed at all if the > byte order is "changed" when the UUID is read in when reading the > representation into the internal UUID's byte array. > > Function: > string_to_uuid(const char *source, pg_uuid_t *uuid); > > ^^ here I would apply the change. And of course, reverse it for > generating the textual representation. > > That would slow down writes a bit, but that shouldn't be the case > because index insertions are speed up even more. > > But still, on-disk change is still a show-stopper, I guess. > > > > > (5) even if we ignored all that and did it anyway, it would only help > > for version-1 UUIDs. The index performance issue would still remain for > > version-4 (random) UUIDs, which are if anything more common than v1. > > Yes, I am aware that the changes might be of very limited gain. V4 > UUID's are usually used for external identifiers. For internal ones, > they don't make sense to me (too long, issues with randomness/enthropie > under high load, ...). ;) > > I just recently used these UUID's also together with a function for > TimescaleDB auto-partitioning to provide the timestamp for the > partitioning logic instead of the need for a separate timestamp column. > > This is also one of the reasons why I was also asking for native support > functions to extract the timestamp. I am apparently not very good at C > so I am currently using Python and/or PgPLSQL for it, which is pretty slow. > > > > > > > FWIW, I don't know what tool you're using to get those "bloat" numbers, > > but just because somebody calls it bloat doesn't mean that it is. > > The normal, steady-state load factor for a btree index is generally > > understood to be about 2/3rds, and that looks to be about what > > you're getting for the regular-UUID-format index. The fact that the > > serially-loaded index has nearly no dead space is because we hack the > > page split logic to make that happen --- but that is a hack, and it's > > not without downsides. It should *not* be taken to be an indication > > of what you can expect for any other insertion pattern. > > OK, understood. I was actually a bit surprised by those numbers myself > as these "serial" UUID's still only have the timestamp bytes in > ascending order, the clock sequence and node is still pretty random (but > not inside a single transaction, which might help the hack). > > > > > The insertion-speed aspect is a real problem, but the core of that > problem > > is that use of any sort of standard-format UUID turns applications that > > might have had considerable locality of reference into applications that > > have none. If you can manage to keep your whole index in RAM that would > > not hurt too much, but as soon as it doesn't fit you have a problem. > > When your app has more or less predictable reference patterns it's best > > to design a unique key that matches that, instead of expecting that > > essentially-random keys will work well. > > The system was configured to have more than enough space for the index > and table data to fit into memory, but I am not sure. How can I verify > that? An EXPLAIN on the INSERT apparently doesn't include index insertion. > > > > > Or in short, hacking up the way your app generates UUIDs is exactly > > the right way to proceed here. > > OK. Glad to hear that. > > One last question, though: > Would it make sense to create a specialized UUID v1 type (e.g. with an > extension) that does the transformation and delegates for all other > things to the existing UUID type support? > > > > > regards, tom lane > > > > >