Re: Отв.: Re: UUID v7

2024-11-29 Thread Masahiko Sawada
On Fri, Nov 29, 2024 at 5:59 AM Sergey Prokhorenko wrote:
>
> Sergey Prokhorenko sergeyprokhore...@yahoo.com.au
>
> On Friday 29 November 2024 at 09:19:33 am GMT+3, Masahiko Sawada wrote:
>
> On Thu, Nov 28, 2024 at 8:13 PM Sergey Prokhorenko wrote:
> >
> > I mean to add not benchmark results to the patch, but functions so that 
> > everyone can compare themselves on their equipment. The comparison with 
> > UUIDv4 is not very interesting, as the choice is usually between UUIDv7 and 
> > an integer key. And I have described many use cases, and in your benchmark 
> > there is only one, the simplest.
>
>
> I don't think we should add such benchmark functions at least to this
> patch. If there already is a well-established workload using UUIDv7
> and UUIDv4 etc, users can use pgbench with custom scripts, or it might
> make sense to add it to pgbench as a built-in workload. Which however
> should be a separate patch. Having said that, I think users should use
> benchmarks that fit their workloads, and it would not be easy to
> establish workloads that are reasonable for most systems.
>
> Regards,
>
> --
> Masahiko Sawada
> Amazon Web Services: https://aws.amazon.com
>
> Workloads can and must be added with parameters. Typically, companies use 
> test tables of 10,000 and 1,000,000 records, etc. Different companies have 
> mostly similar usage scenarios (for example, incremental loading). Each 
> company has to duplicate the work of others, creating the same benchmarks. 
> The worst thing is that this is entrusted to incompetent employees who are 
> not very good at understanding typical key usage scenarios. As a rule, these 
> are programmers, not system analysts. Accordingly, the solution in 99% of 
> cases will be in favor of integer keys, as they take up less space and are 
> generated faster. If we leave this problem until the next patch, it will take 
> us a year and a half. This is completely wrong.

There are still 4 months left until the feature freeze. We can discuss
this topic and may find solutions. I don't think it's a blocker for
this patch (the UUIDv7 implementation patch).

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Отв.: Re: UUID v7

2024-11-29 Thread Andrey M. Borodin



> On 29 Nov 2024, at 18:57, Sergey Prokhorenko wrote:
> 
> Workloads can and must be added with parameters. Typically, companies use 
> test tables of 10,000 and 1,000,000 records, etc. Different companies have 
> mostly similar usage scenarios (for example, incremental loading). Each 
> company has to duplicate the work of others, creating the same benchmarks. 
> The worst thing is that this is entrusted to incompetent employees who are 
> not very good at understanding typical key usage scenarios. As a rule, these 
> are programmers, not system analysts. Accordingly, the solution in 99% of 
> cases will be in favor of integer keys, as they take up less space and are 
> generated faster. If we leave this problem until the next patch, it will take 
> us a year and a half. This is completely wrong.

I think we have pretty decent documentation in the patch. It only points to the
RFC, and that's it.
There were patch versions with opinionated novels in the docs: giving advice,
comparing possibilities, and all that stuff. I'm so happy we got through that
stage and moved forward :)


Best regards, Andrey Borodin.



Re: Отв.: Re: UUID v7

2024-11-29 Thread Sergey Prokhorenko


Sergey Prokhorenko sergeyprokhore...@yahoo.com.au 

On Friday 29 November 2024 at 09:19:33 am GMT+3, Masahiko Sawada wrote:

> On Thu, Nov 28, 2024 at 8:13 PM Sergey Prokhorenko wrote:
> >
> > I mean to add not benchmark results to the patch, but functions so that
> > everyone can compare themselves on their equipment. The comparison with
> > UUIDv4 is not very interesting, as the choice is usually between UUIDv7 and
> > an integer key. And I have described many use cases, and in your benchmark
> > there is only one, the simplest.
>
> I don't think we should add such benchmark functions at least to this
> patch. If there already is a well-established workload using UUIDv7
> and UUIDv4 etc, users can use pgbench with custom scripts, or it might
> make sense to add it to pgbench as a built-in workload. Which however
> should be a separate patch. Having said that, I think users should use
> benchmarks that fit their workloads, and it would not be easy to
> establish workloads that are reasonable for most systems.
>
> Regards,
>
> --
> Masahiko Sawada
> Amazon Web Services: https://aws.amazon.com

Workloads can and must be added with parameters. Typically, companies use test 
tables of 10,000 and 1,000,000 records, etc. Different companies have mostly 
similar usage scenarios (for example, incremental loading). Each company has to 
duplicate the work of others, creating the same benchmarks. The worst thing is 
that this is entrusted to incompetent employees who are not very good at 
understanding typical key usage scenarios. As a rule, these are programmers, 
not system analysts. Accordingly, the solution in 99% of cases will be in favor 
of integer keys, as they take up less space and are generated faster. If we 
leave this problem until the next patch, it will take us a year and a half. 
This is completely wrong.
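
A minimal sketch of what a parameterized bulk-load workload could look like,
written in Python only to generate the psql script text. The table naming
scheme (bench_<kind>_<rows>) and the bigint variant are assumptions made for
illustration; uuidv7() and uuidv4() are assumed to be available as in the
benchmark quoted in this thread. It is not part of the patch.

```python
# Key expressions to compare. uuidv7() and uuidv4() are the functions used in
# the benchmark quoted in this thread; the bigint row is a plain series value.
KEY_TYPES = {
    "uuidv7": ("uuid", "uuidv7()"),
    "uuidv4": ("uuid", "uuidv4()"),
    "bigint": ("bigint", "g"),
}


def bulk_load_sql(kind: str, rows: int) -> str:
    """Return a psql script that times a bulk insert of `rows` keys."""
    column_type, expr = KEY_TYPES[kind]
    table = f"bench_{kind}_{rows}"  # hypothetical naming scheme
    return "\n".join([
        r"\timing on",  # psql meta-command: print elapsed time per statement
        f"DROP TABLE IF EXISTS {table};",
        f"CREATE TABLE {table} (id {column_type} PRIMARY KEY);",
        f"INSERT INTO {table} SELECT {expr} "
        f"FROM generate_series(1, {rows}) AS g;",
    ])


if __name__ == "__main__":
    # The two table sizes mentioned above; any other sizes can be passed in.
    for kind in KEY_TYPES:
        for rows in (10_000, 1_000_000):
            print(bulk_load_sql(kind, rows))
            print()
```

The printed text can be saved to a file and run with psql -f. Join, update
and delete scenarios would need further scripts of the same shape.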

  

Re: Отв.: Re: UUID v7

2024-11-28 Thread Masahiko Sawada
On Thu, Nov 28, 2024 at 8:13 PM Sergey Prokhorenko wrote:
>
> I mean to add not benchmark results to the patch, but functions so that 
> everyone can compare themselves on their equipment. The comparison with 
> UUIDv4 is not very interesting, as the choice is usually between UUIDv7 and 
> an integer key. And I have described many use cases, and in your benchmark 
> there is only one, the simplest.

I don't think we should add such benchmark functions at least to this
patch. If there already is a well-established workload using UUIDv7
and UUIDv4 etc, users can use pgbench with custom scripts, or it might
make sense to add it to pgbench as a built-in workload. Which however
should be a separate patch. Having said that, I think users should use
benchmarks that fit their workloads, and it would not be easy to
establish workloads that are reasonable for most systems.
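
As a rough illustration of the custom-script route, the Python sketch below
holds one-statement pgbench scripts and builds the matching command line. The
table names (bench_uuidv7, bench_uuidv4), the script file names, and the
client and duration values are assumptions made for the example, and the
tables are assumed to exist already with a uuid primary key column named id.
uuidv7() and uuidv4() are assumed to be available as in the patch under
discussion.

```python
import shlex

# One-statement custom scripts, one per key generator (hypothetical tables).
INSERT_SCRIPTS = {
    "uuidv7": "INSERT INTO bench_uuidv7 (id) VALUES (uuidv7());",
    "uuidv4": "INSERT INTO bench_uuidv4 (id) VALUES (uuidv4());",
}


def pgbench_command(script_path: str, clients: int, seconds: int,
                    dbname: str = "postgres") -> str:
    """Build a pgbench command line for a custom script.

    -n skips vacuuming of the standard pgbench tables, -f names the custom
    script, -c and -j set client and thread counts, -T sets the duration.
    """
    args = [
        "pgbench", "-n",
        "-f", script_path,
        "-c", str(clients),
        "-j", str(clients),
        "-T", str(seconds),
        dbname,
    ]
    return shlex.join(args)


if __name__ == "__main__":
    for kind, statement in INSERT_SCRIPTS.items():
        path = f"insert_{kind}.sql"  # hypothetical file name
        print(f"# contents of {path}:")
        print(statement)
        print(pgbench_command(path, clients=8, seconds=60))
        print()
```

Running the same command against each script, with the same client count and
duration, gives a concurrent-insert comparison on the user's own hardware.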

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Отв.: Re: UUID v7

2024-11-28 Thread Kirill Reshke
On Fri, 29 Nov 2024, 09:14 Sergey Prokhorenko, <
sergeyprokhore...@yahoo.com.au> wrote:

> I mean to add not benchmark results to the patch, but functions so that
> everyone can compare themselves on their equipment. The comparison with
> UUIDv4 is not very interesting, as the choice is usually between UUIDv7 and
> an integer key. And I have described many use cases, and in your benchmark
> there is only one, the simplest.
>
>
> Sent from Yahoo Mail on iPhone
>
>
> On Thursday, November 28, 2024, 11:09 AM, Andrey M. Borodin <
> x4...@yandex-team.ru> wrote:
>
>
>
> > On 28 Nov 2024, at 04:07, Sergey Prokhorenko <
> > sergeyprokhore...@yahoo.com.au> wrote:
> >
> > It would be useful to add a standard comparative benchmark with several
> > parameters and use cases to the patch, so that IT departments can compare
> > UUIDv7, ULID, UUIDv4, Snowflake ID and BIGSERIAL for their hardware and
> > conditions.
> >
> > I know for a fact that IT departments make such benchmarks of low
> > quality. They usually measure the generation rate, which is meaningless
> > because it is usually excessive. It makes sense to measure the rate of
> > single-threaded and multi-threaded insertion of a large number of records
> > (with and without partitioning), as well as the rate of execution of
> > queries to join big tables, to update or delete a large number of records.
> > It is important to measure memory usage, processor load, etc.
>
>
> Publishing benchmarks seems to be far beyond what our documentation goes
> for, mostly because benchmarks are tricky. You can prove anything with
> benchmarks.
>
> Everyone is welcome to publish benchmark results in their blogs, but IMO
> docs have a very different job to do.
>
> I’ll just publish one benchmark in this mailing list. With patch v39
> applied on my MB Air M2 I get:
>
> postgres=# create table table_for_uuidv4(id uuid primary key);
> CREATE TABLE
> Time: 9.479 ms
> postgres=# insert into table_for_uuidv4 select uuidv4() from
> generate_series(1,3e7);
> INSERT 0 30000000
> Time: 2003918.770 ms (33:23.919)
> postgres=# create table table_for_uuidv7(id uuid primary key);
> CREATE TABLE
> Time: 3.930 ms
> postgres=# insert into table_for_uuidv7 select uuidv7() from
> generate_series(1,3e7);
> INSERT 0 30000000
> Time: 337001.315 ms (05:37.001)
>
> Almost an order of magnitude better :)
>
>
> Best regards, Andrey Borodin.
>

Hi!
Do not top-post on this list.