from:"Tomas Vondra"

Re: Most-common value docs in PG 12

2019-08-28 Thread Tomas Vondra


On Wed, Aug 28, 2019 at 12:22:38PM -0400, Bruce Momjian wrote:

Our docs for most-common values in PG 12 has:

-->  CREATE STATISTICS stts3 (mcv) ON state, city FROM zipcodes;

ANALYZE zipcodes;

SELECT m.* FROM pg_statistic_ext join pg_statistic_ext_data on (oid = 
stxoid),
pg_mcv_list_items(stxdmcv) m WHERE stxname = 'stts3';

 index | values | nulls | frequency | base_frequency
---++---+---+
-->   0 | {Washington, DC}   | {f,f} |  0.003467 |2.7e-05
 1 | {Apo, AE}  | {f,f} |  0.003067 |1.9e-05
 2 | {Houston, TX}  | {f,f} |  0.002167 |   0.000133
 3 | {El Paso, TX}  | {f,f} | 0.002 |   0.000113

It seems pg_mcv_list_items() reports the column names in the order they
appear in the table, not in the order they appear in the CREATE
STATISTICS statement.  Same for psql \d:

\d zipcodes
 Table "public.zipcodes"
 Column  | Type | Collation | Nullable | Default
-+--+---+--+-
 city| text |   |  |
 state   | text |   |  |
 zipcode | text |   |  |
Statistics objects:
-->  "public"."stts3" (mcv) ON city, state FROM zipcodes


If this is so, why don't we show the CREATE STATISTICS example as
city/state, and not state/city?



Yes, we deduplicate the attributes and store them sorted by attnum. I'm
not sure it makes sense to change the example to match this order, which
is mostly an implementation detail, though. It might be better to point
out the order may not exactly match CREATE STATISTICS, and point users to
what e.g. "\d" shows (because that will show the order as stored in the
system catalog).


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Most-common value docs in PG 12

2019-09-26 Thread Tomas Vondra

On Thu, Sep 26, 2019 at 04:20:59PM -0400, Bruce Momjian wrote:

On Thu, Sep 26, 2019 at 05:17:55PM -0300, Alvaro Herrera wrote:

On 2019-Sep-26, Bruce Momjian wrote:

> On Wed, Sep 11, 2019 at 06:15:22PM -0300, Alvaro Herrera wrote:
> > On 2019-Aug-30, Bruce Momjian wrote:
> >
> > > OK, how is this patch?  I didn't mention psql since I think everyone
> > > expects psql to show all information about tables and indexes.
> >
> > Why would you change perform.sgml?  It seems unnecessary; the commands
> > shown work fine.
>
> I realize they work fine, but the ordering in the examples not matching
> the defined order suggests that ordering matters, but it does not.

Well, I mean exactly the other way around: the fact that the orders
don't match illustrates that the order is not important.  And that is
reinforced by the explanation indicating explicitly that it does not
matter:

Uh, people normally list things in defined order, so you would usually
not list them in non-defined order unless there is a purpose.  Doing
that just to illustrate the order doesn't matter seems odd.

Well, that assumes there is a definition, and I don't think the zipcodes
table is defined anywhere. So how do you know in what order are those
columns defined?

Now, maybe the table should be defined somewhere in perform.sgml - I
don't recall why exactly I chose not to do that, maybe because there is
no universal definition (one country uses text, another number, ...).

I do however agree that had there been such definition, it's probably
natural to list columns in the same order. We know the order is not
important, the proposed patch states that explicitly, but this just
feels natural.

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Most-common value docs in PG 12

2019-09-27 Thread Tomas Vondra

On Thu, Sep 26, 2019 at 05:31:07PM -0400, Bruce Momjian wrote:

On Thu, Sep 26, 2019 at 11:03:54PM +0200, Tomas Vondra wrote:

On Thu, Sep 26, 2019 at 04:20:59PM -0400, Bruce Momjian wrote:
> Uh, people normally list things in defined order, so you would usually
> not list them in non-defined order unless there is a purpose.  Doing
> that just to illustrate the order doesn't matter seems odd.
>

Well, that assumes there is a definition, and I don't think the zipcodes
table is defined anywhere. So how do you know in what order are those
columns defined?

In the USA, it is usually specific to general, i.e., city, state.

I'd probably define it the same way, but for example the zipcode data
sets I usually use for my talks [1] defines it like this:

 postal code   : varchar(20)
 place name: varchar(180)
 admin name1   : 1. order subdivision (state) varchar(100)
 admin code1   : 1. order subdivision (state) varchar(20)
 admin name2   : 2. order subdivision (county/province) varchar(100)
 admin code2   : 2. order subdivision (county/province) varchar(20)
 admin name3   : 3. order subdivision (community) varchar(100)
 admin code3   : 3. order subdivision (community) varchar(20)
 latitude  : estimated latitude (wgs84)
 longitude : estimated longitude (wgs84)
 accuracy  : accuracy of lat/lng

so in this case it's a bit of a mix of specific vs. general first. 

[1] http://download.geonames.org/export/zip/

Now, maybe the table should be defined somewhere in perform.sgml - I
don't recall why exactly I chose not to do that, maybe because there is
no universal definition (one country uses text, another number, ...)

Yeah, doesn't seem worth adding.

OK.

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: DISTINCT term in aggregate function

2021-03-18 Thread Tomas Vondra

Sorry, I'm not reading pgsql-docs very often, so I missed the post.
Yeah, we should probably add an indexterm to the other places too.

regards

On 3/18/21 7:03 PM, Pantelis Theodosiou wrote:
> Hi, I didn't think of including you in this suggestion.
> Or the pdsql-docs was not the right list to post? I didn't want to mix
> it with the GROUP BY DISTINCT patch.
> 
> Please check my suggestion.
> 
> Best regards
> Pantelis Theodosiou
> 
> 
> 
> 
> -- Forwarded message -
> From: *Pantelis Theodosiou* mailto:yperc...@gmail.com>>
> Date: Sat, Mar 13, 2021 at 1:03 AM
> Subject: Fwd: GROUP BY DISTINCT
> To:  <mailto:pgsql-docs@lists.postgresql.org>>
> 
> 
> 
> -- Forwarded message -
> From: *Tomas Vondra*  <mailto:tomas.von...@enterprisedb.com>>
> Date: Fri, Mar 12, 2021 at 11:33 PM
> Subject: Re: GROUP BY DISTINCT
> To: Vik Fearing  <mailto:v...@postgresfriends.org>>, Georgios Kokolatos
> mailto:gkokola...@protonmail.com>>,
>  <mailto:pgsql-hack...@lists.postgresql.org>>
> Cc: Erik Rijkers mailto:e...@xs4all.nl>>
> 
> 
> Hi Vik,
> 
> The patch seems quite ready, I have just two comments.
> 
> 1) Shouldn't this add another  for DISTINCT, somewhere in the
> documentation? Now the index points just to the SELECT DISTINCT part.
> 
> .
> 
> regards
> 
> -- 
> Tomas Vondra
> EnterpriseDB: http://www.enterprisedb.com <http://www.enterprisedb.com>
> The Enterprise PostgreSQL Company
> 
> 
> 
> After reading the above thread in hackers, I noticed that the index does
> not point to aggrgeate functions either and DISTINCT is not mentioned in
> the aggregate functions page
> either: https://www.postgresql.org/docs/current/functions-aggregate.html
> <https://www.postgresql.org/docs/current/functions-aggregate.html>
> Shouldn't it be mentioned with an example of COUNT(DISTINCT ...)  or
> aggregate_function(DISTINCT ...) in general ?
> 
> Best regards
> 
> Pantelis Theodosiou

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Parallel index build for BRIN

2024-12-09 Thread Tomas Vondra

On 12/8/24 16:00, Egor Rogov wrote:
> Hi,
> 
> 
> On 17.11.2024 11:28, Egor Rogov wrote:
>> Hi everyone,
>>
>> This thread doesn't seem to have attracted attention, so let me try
>> again. Two documentation pages claim that B-tree is the only access
>> method that supports parallel building, which is no longer true. I
>> propose to fix it in a way like this:
>>
>> diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
>> index d54f9049569..b5b1580dee7 100644
>> --- a/doc/src/sgml/config.sgml
>> +++ b/doc/src/sgml/config.sgml
>> @@ -2835,7 +2835,7 @@ include_dir 'conf.d'
>>   Sets the maximum number of parallel workers that can be
>>   started by a single utility command.  Currently, the parallel
>>   utility commands that support the use of parallel workers are
>> - CREATE INDEX only when building a B-tree
>> index,
>> + CREATE INDEX when building a B-tree or
>> BRIN index,
>>   and VACUUM without FULL
>>   option.  Parallel workers are taken from the pool of processes
>>   established by ,
>> limited
>> diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/
>> create_index.sgml
>> index 621bc0e253c..208389e8006 100644
>> --- a/doc/src/sgml/ref/create_index.sgml
>> +++ b/doc/src/sgml/ref/create_index.sgml
>> @@ -808,7 +808,7 @@ Indexes:
>>     leveraging multiple CPUs in order to process the table rows faster.
>>     This feature is known as parallel index
>>     build.  For index methods that support building indexes
>> -   in parallel (currently, only B-tree),
>> +   in parallel (currently, B-tree and BRIN),
>>     maintenance_work_mem specifies the maximum
>>     amount of memory that can be used by each index build operation as
>>     a whole, regardless of how many worker processes were started.
> 
> 
> I've spotted another mention of B-tree being the only AM that supports
> parallel builds: comment in src/backend/catalog/index.c. As this mention
> is not visible to the users, I'd propose removing it altogether rather
> than fixing it. Updated patch is attached.
> 

Thanks for noticing this and the patches. You're right, this should have
been updated with the BRIN parallel builds. I'll get this committed
sometime the week.


regards

-- 
Tomas Vondra

Re: Parallel index build for BRIN

2024-12-16 Thread Tomas Vondra

On 12/9/24 19:54, Tomas Vondra wrote:
> On 12/8/24 16:00, Egor Rogov wrote:
>> Hi,
>>
>> ...
>>
>> I've spotted another mention of B-tree being the only AM that supports
>> parallel builds: comment in src/backend/catalog/index.c. As this mention
>> is not visible to the users, I'd propose removing it altogether rather
>> than fixing it. Updated patch is attached.
>>
> 
> Thanks for noticing this and the patches. You're right, this should have
> been updated with the BRIN parallel builds. I'll get this committed
> sometime the week.
> 

I've pushed the doc fix, and backpatched it to PG 17.

Thanks for the patience!

-- 
Tomas Vondra

Re: Parallel index build for BRIN

2024-12-17 Thread Tomas Vondra

On 12/17/24 07:40, Egor Rogov wrote:
> On 16.12.2024 21:24, Tomas Vondra wrote:
> 
>> On 12/9/24 19:54, Tomas Vondra wrote:
>>> On 12/8/24 16:00, Egor Rogov wrote:
>>>> Hi,
>>>>
>>>> ...
>>>>
>>>> I've spotted another mention of B-tree being the only AM that supports
>>>> parallel builds: comment in src/backend/catalog/index.c. As this
>>>> mention
>>>> is not visible to the users, I'd propose removing it altogether rather
>>>> than fixing it. Updated patch is attached.
>>>>
>>> Thanks for noticing this and the patches. You're right, this should have
>>> been updated with the BRIN parallel builds. I'll get this committed
>>> sometime the week.
>>>
>> I've pushed the doc fix, and backpatched it to PG 17.
> 
> 
> Thanks so much, Tomas!
> 
> Please note that the comment in src/backend/catalog/index.c remains
> unchanged:
> 
> 
> --- a/src/backend/catalog/index.c
> +++ b/src/backend/catalog/index.c
> @@ -2988,8 +2988,7 @@ index_build(Relation heapRelation,
> Assert(PointerIsValid(indexRelation->rd_indam->ambuildempty));
> 
>     /*
> -    * Determine worker process details for parallel CREATE INDEX. 
> Currently,
> -    * only btree has support for parallel builds.
> +    * Determine worker process details for parallel CREATE INDEX.
>  *
>  * Note that planner considers parallel safety for us.
>  */
> 

Oh, thanks for noticing that. I'm not sure why I missed that, I must
have used the first patch by mistake.

Anyway, I found another obsolete comment in planner.c, so I fixed that
too and pushed.


Thanks

-- 
Tomas Vondra

Re: Most-common value docs in PG 12

Re: Most-common value docs in PG 12

Re: Most-common value docs in PG 12

Re: DISTINCT term in aggregate function

Re: Parallel index build for BRIN

Re: Parallel index build for BRIN

Re: Parallel index build for BRIN

7 matches

Site Navigation

Mail list logo

Footer information