Re: Re: Seeking the correct term of art for the (unique) role that is usually called "postgres"—and the mental model that underlies it all

2022-10-27 Thread David G. Johnston
On Thu, Oct 27, 2022 at 4:02 PM Tom Lane  wrote:

> "David G. Johnston"  writes:
> > Yes, the description for --username probably should be modified to read:
>
> > "Selects the user name of the cluster's bootstrap superuser."
>
> Yeah, perhaps.  The term "bootstrap superuser" is reasonably well
> established by now --- I count half a dozen uses in our SGML docs
> and another dozen or so in the code --- and it's certainly more
> specific than "database superuser".  We should probably create
> a glossary entry for it and then change all the uses of "database
> superuser" as appropriate.
>

+1

>
> However ... it looks to me like some of those uses just mean to
> distinguish between Postgres-specific superuser-dom as opposed
> to whatever the term might mean out in the operating system.
> But I'm not sure that anybody really uses that term for an OS-level
> concept on any popular OS, so it feels a bit pedantic as well
> as confusing.  Should we leave those usages alone, or reduce them
> to just "superuser"?
>
>
Upon a third reflection I decided that leaving "database superuser" in
place is preferred; it is fairly pervasive in the code, docs, and
translations.  I would suggest documenting both "bootstrap superuser" and
"database superuser", making it clear that "database superuser" means any
role in the cluster that has the superuser attribute while "bootstrap
superuser" is specifically that superuser which was created by initdb and
thus owns all initialized objects including the catalogs in all databases
in the cluster.

I'm not sure what you are referring to with respect to OS-level references
but those ideally will not refer to superuser at all - reserving the
concept for the product.  admin/root/sudo or even just os-user suffice for
the few places where the two worlds intersect.

David J.


Re: Seeking the correct term of art for the (unique) role that is usually called "postgres"—and the mental model that underlies it all

2022-10-27 Thread Adrian Klaver

On 10/27/22 17:20, Bryn Llewellyn wrote:

david.g.johns...@gmail.com  wrote:


b...@yugabyte.com  wrote:

The fact that the "bootstrap superuser" term of art denotes a 
matching pair of two principals (an O/S user and a within-cluster role)


No, it does not.  It denotes only the PostgreSQL role.  "service user" 
is probably a better term for the O/S side of things.  Though, 
frankly, aside from trying to distinguish things when talking about 
logging in, the necessity to even care about the O/S user is fairly 
minimal.


[about your "usr" example] just create a database named "usr" and you 
won't get the "database usr not found" error message anymore and 
the login will succeed.


Thank you very much David. The scales have now finally fallen from my 
eyes. I know now that in order to be able to start a client session from 
the O/S of the machine where the PG software and cluster live, without 
needing to supply a password even when "pg_hba.conf" asks for password 
authentication, it's sufficient to do this (using my "usr" example):


(0) Simply leave the regime in place where the catalog-owning role is 
called "postgres" and the cluster's data files and other config files 
are owned by postgres.


(1) create a new database role thus (where "password null" is just so 
that I can prove a point here):


create role usr with login password null;

(2) Add this line under the existing final comment in the shipped copy 
of "pg_ident.conf" thus:


# MAPNAME       SYSTEM-USERNAME         PG-USERNAME
usr             usr                     usr          # Added by Bryn

(It seems that I could set the first field of this line to "dog"—but I 
won't test that.


The above is not contributing to the below(pg_hba.conf) and would be 
redundant any way as it just says OS user usr = Pg user usr and peer 
means that anyway. The purpose of mapping would be to do something like 
map OS user foo to PG user usr.


References:

https://www.postgresql.org/docs/current/auth-peer.html

"map

Allows for mapping between system and database user names. See 
Section 21.2 for details.

"

Section 21.2

"The map-name is an arbitrary name that will be used to refer to this 
mapping in pg_hba.conf."


This example below id for the ident auth method but the same syntax 
applies to peer.


https://www.postgresql.org/docs/current/auth-pg-hba-conf.html

# TYPE  DATABASEUSERADDRESS METHOD
hostall all 192.168.0.0/16  ident 
map=omicron





(3) Add this line between the existing two in the shipped copy of 
"pg_hba.conf" thus:


local   all             postgres                                peer # 
See the essay at the start.
local   all             usr                                     peer # 
Added by Bryn

local   all             all                                     peer



As noted above your pg_ident.conf will not do anything for the above. It 
will work though if you are logged in as OS user usr as it will connect 
as PG user usr.




(My copy of this file specifies "md5" and not "trust".)

I'd've thought that "all" would mean any O/S user existing, or 
yet-to-be-created. But the comment in the shipped "pg_hba.conf" says this:


# If you change this first entry you will need to make sure that the
# database superuser can access the database using some other method.
...
# Database administrative login by Unix domain socket
local   all             postgres                                peer

(So two terms for the one notion just a couple of lines apart!) I'll do 
the empirical test presently. Anyway, with these conditions met, I can 
"su usr" and then start a session like this:


psql -d postgres

Yes, your point about what artifacts exist the moment after "initdb" 
finishes is taken. So I finished my test by (after authorizing as 
"postgres") creating a database "usr" and granting "connect" on it to 
"usr".) Then I could create a new session from the O/S prompt when 
"whoami" shows "user" with the bare "psql"—just as I could the moment 
after the PG install finished from the O/S prompt when "whoami" shows 
"postgres".


I did think that I'd tried all this at the outset. But clearly I must've 
missed one of those steps or done a typo.




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Seeking the correct term of art for the (unique) role that is usually called "postgres"—and the mental model that underlies it all

2022-10-27 Thread Bryn Llewellyn
> t...@sss.pgh.pa.us> wrote:
> 
>> david.g.johns...@gmail.com wrote:
>> 
>> Yes, the description for --username probably should be modified to read:
>> 
>> "Selects the user name of the cluster's bootstrap superuser."
> 
> Yeah, perhaps. The term "bootstrap superuser" is reasonably well established 
> by now --- I count half a dozen uses in our SGML docs and another dozen or so 
> in the code --- and it's certainly more specific than "database superuser". 
> We should probably create a glossary entry for it and then change all the 
> uses of "database superuser" as appropriate.
> 
> However ... it looks to me like some of those uses just mean to distinguish 
> between Postgres-specific superuser-dom as opposed to whatever the term might 
> mean out in the operating system. But I'm not sure that anybody really uses 
> that term for an OS-level
> concept on any popular OS, so it feels a bit pedantic as well as confusing.  
> Should we leave those usages alone, or reduce them to just "superuser"?

Thanks, Tom. I'd certainly appreciate an entry in "Appendix M. Glossary" for 
the term of art that I've been struggling to name. Until I hear what the 
experts decide, I'll use "catalog-owning role". Then I'll switch to the newly 
blessed term.

I'm afraid that I didn't get your point in your last paragraph. The terms 
"within-cluster role" and "O/S user" seem to capture the distinction when the 
context doesn't make it clear. I aim never to use the term "user" for the 
within-cluster phenomenon. After all, an existing within-cluster role can flip 
between "with nologin" and "with login" at the drop of a hat of the guy "with 
createrole".



Re: Seeking the correct term of art for the (unique) role that is usually called "postgres"—and the mental model that underlies it all

2022-10-27 Thread Bryn Llewellyn
> david.g.johns...@gmail.com> wrote:
> 
>> b...@yugabyte.com wrote:
>> 
>> This invariant must hold if an "ordinary" within-cluster  superuser is to 
>> qualify as the cluster's "bootstrap superuser":
>> 
>> the name of the bootstrap superuser's within-cluster role
>> 
>> AND
>> 
>> the name of the O/S user that owns lots of (but not all*) the software files 
>> that define the PostgreSQL RDBMS, together with the various files that 
>> represent what users create
>> 
>> are identical.
> 
> Nope, the name of the bootstrap user is the one supplied to initdb via the 
> --username argument.  Period. It need not match any name on the host 
> operating system and it will still be the bootstrap superuser's role name.
> 
> Yes, the description for --username probably should be modified to read:
> 
> "Selects the user name of the cluster's bootstrap superuser." Or just 
> consider a "cluster superuser" the term d'art...since most people would just 
> refer to any old role having superuser authorization as being plain ole 
> "superuser".  The fact that is says "database superuser" is the same holdover 
> effect as the fact that "init db" means "init database" even though it 
> actually initializes a cluster.

Thanks again, David. And once again, all is clear now.



Re: Seeking the correct term of art for the (unique) role that is usually called "postgres"—and the mental model that underlies it all

2022-10-27 Thread Bryn Llewellyn
> david.g.johns...@gmail.com wrote:
> 
>> b...@yugabyte.com wrote:
>> 
>> «
>> You can start a session without specifying the name of the cluster role as 
>> which to authorize, its password, and the name of the database to which to 
>> connect, ONLY when these things are true:
>> 
>> 1. The within-cluster catalog-owning role has a certain name, say 
>> "pg_system" (or "bob").
>> 
>> 2.  The O/S user that owns (most of) the O/S presence of the cluster and the 
>> software that accesses it has the identical name "pg_system" (or "bob").
>> 
>> 3. The current O/S user when you make the attempt to connect is "pg_system" 
>> (or "bob").
>> »
>> 
>> I want to know if my hypothesis is correct.
> 
> It is not.  "By default" probably, but not "only". Peer authentication means:
> 
> If the local O/S user (bob) running "psql" requests to login to the database 
> using that same* role name (bob) and the role exists in the cluster, accept 
> the authentication attempt.
> 
> * You can implement aliases by using an identity mapping.
> 
> Nothing more, nothing less.
>  
>> And, more importantly, I want to know where I can read a nicely written 
>> linear account of what *is* correct tha defines and then uses the official 
>> terms of art.
> 
> It doesn't exist, deal with it. Most people just call the "bootstrap" role 
> "postgres" when not talking about a specific installed cluster that happens 
> to use something different. In any case, no matter what terminology is used 
> everyone seems to figure out what is being referred to from context at least 
> and largely don't make a big deal about it.  Unless you actually want to 
> write the documentation just pick something you like and go with it.

Yes, all is clear now. Thanks again. And thanks for the dispensation to choose 
my term of art. I'll use "catalog-owning role" from now on.




Re: Seeking the correct term of art for the (unique) role that is usually called "postgres"—and the mental model that underlies it all

2022-10-27 Thread Bryn Llewellyn
> david.g.johns...@gmail.com wrote:
> 
>> b...@yugabyte.com wrote:
>> 
>> The fact that the "bootstrap superuser" term of art denotes a matching pair 
>> of two principals (an O/S user and a within-cluster role)
> 
> No, it does not.  It denotes only the PostgreSQL role.  "service user" is 
> probably a better term for the O/S side of things.  Though, frankly, aside 
> from trying to distinguish things when talking about logging in, the 
> necessity to even care about the O/S user is fairly minimal.
> 
> [about your "usr" example] just create a database named "usr" and you won't 
> get the "database usr not found" error message anymore and the login will 
> succeed.

Thank you very much David. The scales have now finally fallen from my eyes. I 
know now that in order to be able to start a client session from the O/S of the 
machine where the PG software and cluster live, without needing to supply a 
password even when "pg_hba.conf" asks for password authentication, it's 
sufficient to do this (using my "usr" example):

(0) Simply leave the regime in place where the catalog-owning role is called 
"postgres" and the cluster's data files and other config files are owned by 
postgres.

(1) create a new database role thus (where "password null" is just so that I 
can prove a point here):

create role usr with login password null;

(2) Add this line under the existing final comment in the shipped copy of 
"pg_ident.conf" thus:

# MAPNAME   SYSTEM-USERNAME PG-USERNAME
usr usr usr  # Added by Bryn

(It seems that I could set the first field of this line to "dog"—but I won't 
test that.

(3) Add this line between the existing two in the shipped copy of "pg_hba.conf" 
thus:

local   all postgrespeer # See the 
essay at the start.
local   all usr peer # Added by 
Bryn
local   all all peer

(My copy of this file specifies "md5" and not "trust".)

I'd've thought that "all" would mean any O/S user existing, or 
yet-to-be-created. But the comment in the shipped "pg_hba.conf" says this:

# If you change this first entry you will need to make sure that the
# database superuser can access the database using some other method.
...
# Database administrative login by Unix domain socket
local   all postgrespeer

(So two terms for the one notion just a couple of lines apart!) I'll do the 
empirical test presently. Anyway, with these conditions met, I can "su usr" and 
then start a session like this:

psql -d postgres

Yes, your point about what artifacts exist the moment after "initdb" finishes 
is taken. So I finished my test by (after authorizing as "postgres") creating a 
database "usr" and granting "connect" on it to "usr".) Then I could create a 
new session from the O/S prompt when "whoami" shows "user" with the bare 
"psql"—just as I could the moment after the PG install finished from the O/S 
prompt when "whoami" shows "postgres".

I did think that I'd tried all this at the outset. But clearly I must've missed 
one of those steps or done a typo.



Re: Index corruption revealed after upgrade to 11.17, could date back to at least 11.12

2022-10-27 Thread Peter J. Holzer
On 2022-10-27 15:07:06 +0300, Kristjan Mustkivi wrote:
> On Thu, Oct 27, 2022 at 12:18 PM Peter J. Holzer  wrote:
> > On 2022-10-27 10:55:31 +0300, Kristjan Mustkivi wrote:
> > > We use dockerized postgres.
> >
> > So that means you aren't just replacing PostgreSQL, but your complete OS
> > (except the kernel). What is the source of your docker images? Do they
> > all use the same base OS distribution? Are the locale definitions the
> > same?
> >
> > (Just trying to rule other other possible error sources.)
> 
> Up until 11.17, the source of the docker images was tag "postgres:11"
> (from https://hub.docker.com/_/postgres), for 11.17 the tag became
> "postgres:11-bullseye" but as far as i could tell it was just a
> difference of tagging policy there. Everything else is kept the same
> when building our custom docker image (with pg_cron, wal2json and
> oracle_fdw). But.. I can see for example, that the PG 11.12 docker
> image used Debian 9.13 (PG 11.17 uses Debian 11.5 according to
> /etc/debian_version).

Ok, So that's an ugrade from Debian 9 to Debian 11. That's definitely
not just a "difference of tagging policy", That's two major versions of
the OS!

I don't remember exactly when the big incompatible libc upgrade was, but
it was very likely somewhere between Debian 9 and Debian 11, so you have
to rebuild all you indexes. Since you didn't do that immediately after
the upgrade you now have data corruption which you have to fix manually.

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"


signature.asc
Description: PGP signature


Re: Re: Seeking the correct term of art for the (unique) role that is usually called "postgres"—and the mental model that underlies it all

2022-10-27 Thread Tom Lane
"David G. Johnston"  writes:
> Yes, the description for --username probably should be modified to read:

> "Selects the user name of the cluster's bootstrap superuser."

Yeah, perhaps.  The term "bootstrap superuser" is reasonably well
established by now --- I count half a dozen uses in our SGML docs
and another dozen or so in the code --- and it's certainly more
specific than "database superuser".  We should probably create
a glossary entry for it and then change all the uses of "database
superuser" as appropriate.

However ... it looks to me like some of those uses just mean to
distinguish between Postgres-specific superuser-dom as opposed
to whatever the term might mean out in the operating system.
But I'm not sure that anybody really uses that term for an OS-level
concept on any popular OS, so it feels a bit pedantic as well
as confusing.  Should we leave those usages alone, or reduce them
to just "superuser"?

regards, tom lane




Re: Seeking the correct term of art for the (unique) role that is usually called "postgres"—and the mental model that underlies it all

2022-10-27 Thread David G. Johnston
On Thu, Oct 27, 2022 at 12:09 PM Bryn Llewellyn  wrote:

>
> This invariant must hold if an "ordinary" within-cluster  superuser is to
> qualify as the cluster's "bootstrap superuser":
>
> the name of the bootstrap superuser's within-cluster role
>
>
> AND
>
> the name of the O/S user that owns lots of (but not all*) the software
> files that define the PostgreSQL RDBMS, together with the various files
> that represent what users create
>
>
> are identical.
>
>
>
Nope, the name of the bootstrap user is the one supplied to initdb via the
--username argument.  Period.  It need not match any name on the host
operating system and it will still be the bootstrap superuser's role name.

Yes, the description for --username probably should be modified to read:

"Selects the user name of the cluster's bootstrap superuser."  Or just
consider a "cluster superuser" the term d'art...since most people would
just refer to any old role having superuser authorization as being plain
ole "superuser".  The fact that is says "database superuser" is the same
holdover effect as the fact that "init db" means "init database" even
though it actually initializes a cluster.

David J.


Re: Seeking the correct term of art for the (unique) role that is usually called "postgres"—and the mental model that underlies it all

2022-10-27 Thread David G. Johnston
On Thu, Oct 27, 2022 at 3:24 PM Bryn Llewellyn  wrote:

> *«*
> *You can start a session without specifying the name of the cluster role
> as which to authorize, its password, and the name of the database to which
> to connect, ONLY when these things are true:*
>
>
>
> *1. The within-cluster catalog-owning role has a certain name, say
> "pg_system" (or "bob").2.  The O/S user that owns (most of) the O/S
> presence of the cluster and the software that accesses it has the identical
> name "pg_system"** (or "bob")**.*
>
> *3. The current O/S user when you make the attempt to connect is
> "pg_system"** (or "bob")**.*
> *»*
>
> I want to know if my hypothesis is correct.
>

It is not.  "By default" probably, but not "only".

Peer authentication means:

If the local O/S user (bob) running "psql" requests to login to the
database using that same* role name (bob) and the role exists in the
cluster, accept the authentication attempt.

* You can implement aliases by using an identity mapping.

Nothing more, nothing less.


> And, more importantly, I want to know where I can read a nicely written
> linear account of what *is* correct tha defines and then uses the official
> terms of art.
>

It doesn't exist, deal with it.  Most people just call the "bootstrap" role
"postgres" when not talking about a specific installed cluster that happens
to use something different.  In any case, no matter what terminology is
used everyone seems to figure out what is being referred to from context at
least and largely don't make a big deal about it.  Unless you actually want
to write the documentation just pick something you like and go with it.


Re: Seeking the correct term of art for the (unique) role that is usually called "postgres"—and the mental model that underlies it all

2022-10-27 Thread David G. Johnston
On Thu, Oct 27, 2022 at 12:09 PM Bryn Llewellyn  wrote:

> [*] I see that, in my Ubuntu installation, critical programs like
> "postgres" itself, "initdb", "pg_ctl", "pg_dump" and so on are owned by
> "root".
>

And they exist in a "bin" directory so that any user on the system can
actually execute them.  It doesn't really matter who owns the binaries so
far as the database is concerned, it matters who executes them.


> The fact that the "bootstrap superuser" term of art denotes a matching
> pair of two principals (an O/S user and a within-cluster role)
>

No, it does not.  It denotes only the PostgreSQL role.  "service user" is
probably a better term for the O/S side of things.  Though, frankly, aside
from trying to distinguish things when talking about logging in, the
necessity to even care about the O/S user is fairly minimal.


> means that some sentences will require extra verbiage to identify which
> half of the pair the sentence treats. I'm open to suggestions. But I'll
> start with these these I'm corrected: the "bootstrap (regular) OS-user" and
> the "bootstrap within cluster superuser role". Sadly, the fact that "super"
> is baked into the term of art makes it difficult to name the O/S half of
> the phenomenon.
>
> I can now characterize what I'd observed more clearly, thus: only a
> bootstrap super user (as defined above) can start a session without
> mentioning the name of the database to which to connect and the name of the
> within-cluster role to connect as—and without supplying a password. And it
> can do this only from as O/S session where the effective O/S user is the
> bootstrap superuser.
>

That is generally wrong - it is only correct when considering a newly
initialized cluster - since then the only database that exists is the
postgres database and you cannot connect to a cluster without specifying an
existing database (so if you don't want to explicitly specify one you
better arrange things so the default you end up using is postgres, which
means your O/S user has to be postgres).  Otherwise, as your "usr" example
demonstrates, just create a database named "usr" and you won't get the
"database usr not found" error message anymore and the login will succeed.

David J.


Re: Seeking the correct term of art for the (unique) role that is usually called "postgres"—and the mental model that underlies it all

2022-10-27 Thread Bryn Llewellyn
> jer...@musicsmith.net wrote:
> 
>> b...@yugabyte.com  wrote:
>> 
>> I can now characterize what I'd observed more clearly, thus: only a 
>> bootstrap super user (as defined above) can start a session without 
>> mentioning the name of the database to which to connect and the name of the 
>> within-cluster role to connect as—and without supplying a password. And it 
>> can do this only from as O/S session where the effective O/S user is the 
>> bootstrap superuser.
> 
> I don't believe this is correct.  psql is using libpq.  A hint to that is 
> given in the section under the \c command:
> 
> When the command neither specifies nor reuses a particular parameter, the 
> libpq default is used.
> 
> The libpq docs show more info on the connection parameters 
> (https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-PARAMKEYWORDS)
> 
> If not specified, both the user and the dbname default to the name of your 
> operating system user.  So, this could work if your operating system user, 
> the database name, and the database username are all postgres, but they could 
> also all be bob.
> 
> As to the password requirement - this depends on the settings in pg_hba.conf. 
>  You could set all connections to trust and then no one would need a 
> password, but I wouldn't recommend that.

Thank you for this extra information, Jeremy. I'm afraid that what I wrote was 
insufficiently precise. I should have added these riders:

(1) My O/S env is in a brand new VM (it happens to be Parallels on my Big Sur 
Mac Book) that was created by a single button press that selected Ubuntu 20.04. 
(I followed that with what I hope is only benign customization for terminal 
colors and the like.)

(2) My PG regime is what I end up with in this VM immediately following the use 
of "apt install postgresql-11". (There's a good reason why I want that old 
version. I hope that its age isn't a distraction here.) Notably, the 
installation flow offers no opportunity to express choices.

(3) This gives me the PG software (largely owned by the O/S user "postgres", 
but with some programs owned by "root") and an already started cluster.

(4) I am able to start a plsql session, when my O/S user is "postgres" simply 
by typing the bare command "psql". Yes, I'm implicitly selecting various libpq 
default values—just as you described. But those defaults don't include a 
default for the password.

(5) When I start a session in this way, I see that I have this regime: a single 
role with the name "postgres" and the status "superuser"; and a single database 
non-template database also with the name "postgres", together with the usual 
"template1" and "template2". Further, this query (when connected to the 
"postgres" database):

select nspname
from pg_namespace n inner join pg_roles r on n.nspowner = r.oid
where r.rolname = 'postgres'
order by 1;

shows me this:

pg_toast
pg_temp_1
pg_toast_temp_1
pg_catalog
public
information_schema

When I asked how to refer to this clearly special cluster-role, David said:

> Don't think it's documented but I like "bootstrap user"


and Ian said:

> "bootstrap superuser" is also mentioned.


Meanwhile, in a separate thread, Adrian pointed me to the "initdb" doc (and 
command line help) where the term "database superuser" is used.

This means that I'm so far denied the possibility to use a single term that 
everybody agrees on. I may as well call it the "catalog owning role" here 
because at least that term is unambiguously descriptive.

I stated in my reply to Adrian that  I had formed this hypothesis (reworded 
slightly here).

When he environment is what I described at the start (which env. brings a 
"pg_hba.conf" file that requires password authentication by NOT specifying 
"trust"),

«
You can start a session without specifying the name of the cluster role as 
which to authorize, its password, and the name of the database to which to 
connect, ONLY when these things are true:

1. The within-cluster catalog-owning role has a certain name, say "pg_system" 
(or "bob").

2.  The O/S user that owns (most of) the O/S presence of the cluster and the 
software that accesses it has the identical name "pg_system" (or "bob").

3. The current O/S user when you make the attempt to connect is "pg_system" (or 
"bob").
»

I want to know if my hypothesis is correct. And, more importantly, I want to 
know where I can read a nicely written linear account of what *is* correct tha 
defines and then uses the official terms of art.



Re: Seeking the correct term of art for the (unique) role that is usually called "postgres"—and the mental model that underlies it all

2022-10-27 Thread Jeremy Smith
>
> I can now characterize what I'd observed more clearly, thus: only a
> bootstrap super user (as defined above) can start a session without
> mentioning the name of the database to which to connect and the name of the
> within-cluster role to connect as—and without supplying a password. And it
> can do this only from as O/S session where the effective O/S user is the
> bootstrap superuser.
>
>
I don't believe this is correct.  psql is using libpq.  A hint to that is
given in the section under the \c command:

When the command neither specifies nor reuses a particular parameter, the
> libpq default is used.
>

The libpq docs show more info on the connection parameters (
https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-PARAMKEYWORDS
)

If not specified, both the user and the dbname default to the name of your
operating system user.  So, this could work if your operating system user,
the database name, and the database username are all postgres, but they
could also all be bob.

As to the password requirement - this depends on the settings in
pg_hba.conf.  You could set all connections to trust and then no one would
need a password, but I wouldn't recommend that.

  -Jeremy


Re: Seeking the correct term of art for the (unique) role that is usually called "postgres"—and the mental model that underlies it all

2022-10-27 Thread Bryn Llewellyn
> adrian.klaver@aklaver.comwrote:
> 
>> b...@yugabyte.com wrote
>> 
>> The descriptive designation "the role that owns the SQL part of the 
>> implementation of PostgreSQL" is too much of a mouthful for daily use. And 
>> anyway, this notion captures only part of the story that makes "postgres" 
>> uniquely what it is—at least on Ubuntu.
> 
> How much time would it have taken to go to the docs:
> 
> «
> https://www.postgresql.org/docs/current/app-initdb.html
> 
> Selects the user name of the database superuser. This defaults to the name of 
> the effective user running initdb. It is really not important what the 
> superuser's name is, but one might choose to keep the customary name 
> postgres, even if the operating system user's name is different.
> »

I HAD read that. The phrase occurs three times on that page. But the account 
doesn't define the term. Rather, it's used as if everybody knows what it means. 
Yet there's no x-ref to where the definition is. I did refer, albeit 
implicitly, to this doc by citing the text that "initdb --help" gives. The doc 
and the help say the same thing. You, Adrian, often accuse me of being too 
wordy. So I  catered to you by not using words to spell out what I just spelled 
out here.

A PG cluster has lots of databases. Lots of things have names whose uniqueness 
scope is (maximally) a single database. Just a couple of things, and roles in 
particular, need names that are unique in the cluster as a whole. Therefore, 
the term "database superuser" is tautologically wrong. It should, at least, be 
"cluster superuser". But then, like I said, you can have as many superusers as 
you please in a single cluster. So the idea that one is singled out as *THE* 
[cluster] superuser didn't make sense to me. Moreover, the "initdb" doc says 
that it doesn't really matter if what it calls the "database superuser" has the 
same name as the O/S ussr that owns (most of) the PG installation and cluster 
content. But David pointed out here:

https://www.postgresql.org/message-id/CAKFQuwYHLCMpLgPbBC7idoomoKwrgrjGxOnd%2BD1CXAQtf3DHcg%40mail.gmail.com

that you lose a lot if these two sides of the same coin don't have the same 
name. (So the "initdb" doc would be improved by an x-ref to the discussion of 
the consequences of the name choices here.)

This implies that there's still a missing term of art that denotes the nicely 
matched *pair* of within-cluster role and O/S user.

I said all this in my reply to David and Ian Barwick. I did send it about half 
an hour before you wrote this. But I see now that I'd managed to omit 
"pgsql-general@lists.postgresql.org 
" from the addressee list. (Yes, 
another of my notorious typos.) I resent it moments ago. It's here:

https://www.postgresql.org/message-id/5C47A318-F265-4A64-B306-C4019F499DB8%40yugabyte.com

>> ...listed as the owner of the pg_catalog schema, the objects in it, other 
>> related schemas...
> 
> 
> You need to define 'other related schemas'.

select nspname
from pg_namespace n inner join pg_roles r on n.nspowner = r.oid
where r.rolname = 'postgres';

produces this:

pg_toast
pg_catalog
information_schema
...

Same point as before. You and others in the cohort of "the pgsql-general list 
lawmakers" have made me nervous about spelling things out 'cos doing so uses 
words and code—and often I've been told off for being too wordy. This is a pity 
because accuracy and precision inevitably compete with brevity.

> ...For it to  work you have to be operating as the OS user postgres. I'm 
> guessing that is why your attempt as usr failed, you where not running as the 
> OS user usr.

No, I've been super-aware of the current identity of the O/S user in all tests, 
I've typed "whoami" more times in the last few days than before in my whole 
life to date.

>> I tried to set up "peer" authentication for a brand new O/S user that I 
>> called "usr" to match a brand new cluster role that I also called "usr". I 
>> added a new line in "pg_hba.conf" thus:
>> local   all usr peer
> 
> 
> Read:
> https://www.postgresql.org/docs/current/auth-peer.html 
> 

I had. And I'd followed the link to Section 21.2:
https://www.postgresql.org/docs/current/auth-username-maps.html

> «
> "The pg_ident.conf file is read on start-up and when the main server process 
> receives a SIGHUP signal. If you edit the file on an active system, you will 
> need to signal the postmaster (using pg_ctl reload, calling the SQL function 
> pg_reload_conf(), or using kill -HUP) to make it re-read the file.
> »

To be sure, I did the whole thing again now. (And, yes, my O/S user is 
"postgres", at the start of this account.) Here's the relevant part of the 
output from "cat /etc/passwd": 

postgres:x:1001:1001:,,,:/home/postgres:/bin/bash
usr:x:1002:1001:,,,:/home/usr:/bin/bash

I did "sudo systemctl stop postgresql". Then I made sure 

Re: Seeking the correct term of art for the (unique) role that is usually called "postgres"—and the mental model that underlies it all

2022-10-27 Thread Bryn Llewellyn
(David and Ian, I'm resending this because, I see that I managed to omit 
"pgsql-general@lists.postgresql.org 
" from the addressee list. So, of 
course, it didn't show up in the "pgsql-general" archive.)

> barw...@gmail.com  wrote:
> 
>> david.g.johns...@gmail.com :
>> 
>>> b...@yugabyte.com  wrote:
>>> 
>>> The descriptive designation "the role that owns the SQL part of the 
>>> implementation of PostgreSQL" is too much of a mouthful for daily use.
>> 
>> Don't think it's documented but I like "bootstrap user" which I've seen 
>> bandied about here a bit. It isn't that special but if the bootstrap user 
>> name and o/s user name are not the same name then you've broken an almost 
>> universal convention that exists to make stuff like logging in with peer 
>> authentication work better.
> 
> "bootstrap superuser" is also mentioned a few times in the [upcoming] docs, 
> see e.g.
> 
> https://www.postgresql.org/docs/devel/sql-grant.html 
> 
> 
> This recent commit: e530be2c5ce77475d56ccf8f4e0c4872b666ad5f [1] might also 
> be of interest to anyone considering the "special-ness" of this role.
> 
> [1] 
> https://git.postgresql.org/gitweb/?p%3Dpostgresql.git;a%3Dcommit;h%3De530be2c5ce77475d56ccf8f4e0c4872b666ad5f

Thanks. David and Ian. I'll take this:

> If GRANTED BY is specified, the grant is recorded as having been done by the 
> specified role. A user can only attribute a grant to another role if they 
> possess the privileges of that role. The role recorded as the grantor must 
> have ADMIN OPTION on the target role, unless it is the bootstrap superuser. 
> When a grant is recorded as having a grantor other than the bootstrap 
> superuser, it depends on the grantor continuing to possess ADMIN OPTION on 
> the role; so, if ADMIN OPTION is revoked, dependent grants must be revoked as 
> well.

from an upcoming version of the "grant" statement doc, to be a sufficient 
establishment of the canonical status of the term of art that I sought. I'll 
adopt the term "bootstrap superuser" (and not plain "bootstrap user") and I'll 
assume that everybody on this list (at least anybody who might answer my 
questions) shares the same, and immediate, understanding of the term—which 
implies this:

This invariant must hold if an "ordinary" within-cluster  superuser is to 
qualify as the cluster's "bootstrap superuser":

the name of the bootstrap superuser's within-cluster role

AND

the name of the O/S user that owns lots of (but not all*) the software files 
that define the PostgreSQL RDBMS, together with the various files that 
represent what users create

are identical.


[*] I see that, in my Ubuntu installation, critical programs like "postgres" 
itself, "initdb", "pg_ctl", "pg_dump" and so on are owned by "root".

The fact that the "bootstrap superuser" term of art denotes a matching pair of 
two principals (an O/S user and a within-cluster role) means that some 
sentences will require extra verbiage to identify which half of the pair the 
sentence treats. I'm open to suggestions. But I'll start with these these I'm 
corrected: the "bootstrap (regular) OS-user" and the "bootstrap within cluster 
superuser role". Sadly, the fact that "super" is baked into the term of art 
makes it difficult to name the O/S half of the phenomenon.

I can now characterize what I'd observed more clearly, thus: only a bootstrap 
super user (as defined above) can start a session without mentioning the name 
of the database to which to connect and the name of the within-cluster role to 
connect as—and without supplying a password. And it can do this only from as 
O/S session where the effective O/S user is the bootstrap superuser.

It seems, too, that one would be stupid to call the bootstrap superuser 
anything other than "postgres". Notice that this implies that the typical macOS 
regime (where my bootstrap super user is called "Bllewell" and has to be 
double-quoted in SQL, and yours is called "sagrawal") is, indeed, stupidly 
unconventional.

All this implies a little test. Here, I'll save typing by saying that my 
bootstrap superuser is called "postgres". I created a second database in a 
freshly created cluster called "x". And then, from the O/S, I tried this:

psql -d x

That worked fine. But, having said this, it would seem that it would be so very 
unconventional (given that you've already agreed to call your bootstrap 
superuser "postgres", not to make a database called "postgres" available too. 
(I tested that by dropping my "postgres" database. (This is the freedom that a 
VM with a nice snapshot together with the ability to start afresh with "initdb" 
brings.) Now, the bare "psql" causes the error "FATAL:  database "postgres" 
does not exist".



Re: Index corruption revealed after upgrade to 11.17, could date back to at least 11.12

2022-10-27 Thread Ron

On 10/27/22 07:07, Kristjan Mustkivi wrote:

On Thu, Oct 27, 2022 at 12:18 PM Peter J. Holzer  wrote:

On 2022-10-27 10:55:31 +0300, Kristjan Mustkivi wrote:

We use dockerized postgres.

So that means you aren't just replacing PostgreSQL, but your complete OS
(except the kernel). What is the source of your docker images? Do they
all use the same base OS distribution? Are the locale definitions the
same?

(Just trying to rule other other possible error sources.)

Hello!

Up until 11.17, the source of the docker images was tag "postgres:11"
(from https://hub.docker.com/_/postgres), for 11.17 the tag became
"postgres:11-bullseye" but as far as i could tell it was just a
difference of tagging policy there. Everything else is kept the same
when building our custom docker image (with pg_cron, wal2json and
oracle_fdw). But.. I can see for example, that the PG 11.12 docker
image used Debian 9.13 (PG 11.17 uses Debian 11.5 according to
/etc/debian_version). So the official docker images also upgrade the
OS (which is expected I suppose).


Differing locales is a strong possibility.

--
Angular momentum makes the world go 'round.




Lock: Speculative token

2022-10-27 Thread Ravi Krishna
Aurora PG based on PG 13.4
Our create concurrent index on a very large partitioned table (5 billion rows) 
waits in Lock: Speculative token.Never seen this error in PG. Google search 
also shows nothing.
Is this Aurora thingy ? If this is PG related I can provide full details.

Re: EXPLAIN ANALYZE does not return accurate execution times

2022-10-27 Thread Tom Lane
Mark Mizzi  writes:
> So to confirm, EXPLAIN ANALYZE does not detoast rows?

Not values that would have been transmitted to the client, no.

regards, tom lane




Re: EXPLAIN ANALYZE does not return accurate execution times

2022-10-27 Thread Mark Mizzi
Hi, thanks for your reply.
So to confirm, EXPLAIN ANALYZE does not detoast rows? The original goal of
these queries was to see the effect of fetching from toast tables on query
performance.

On Thu, 27 Oct 2022 at 15:43, Tom Lane  wrote:

> Mark Mizzi  writes:
> > When I run
>
> > EXPLAIN ANALYZE SELECT * FROM unary;
>
> > I get the following result:
>
> >  Seq Scan on unary  (cost=0.00..1637.01 rows=11 width=18) (actual
> > time=0.009..6.667 rows=11 loops=1)
> >  Planning Time: 0.105 ms
> >  Execution Time: 8.565 ms
>
> > On the other hand, the following command
>
> > time sudo -u postgres psql -c "SELECT * FROM unary" -o /dev/null
>
> > returns after 17s with:
> > sudo -u postgres psql -c "SELECT * FROM unary" -o /dev/null  0.01s user
> > 0.01s system 0% cpu 16.912 total
>
> The main thing actual execution does that EXPLAIN does not is
> format the data and send it off to the client.  There are a
> number of possible bottlenecks involved there -- TOAST fetching,
> data formatting, network traffic, or client processing.  Watching
> this example in "top", I see psql consuming near 100% CPU, meaning
> that the problem is with psql's code to make a nicely-aligned
> ASCII table out of the result.  This isn't too surprising: that
> code was never meant to operate on resultsets that are too large
> for human consumption.  You could use a different formatting rule,
> or switch to COPY.
>
> As an example, using
>
> psql -c '\pset format unaligned' -c "SELECT * FROM unary" -o /dev/null
>
> this example drops from ~16s to ~1.7s on my machine.
>
> regards, tom lane
>


Re: EXPLAIN ANALYZE does not return accurate execution times

2022-10-27 Thread Tom Lane
Mark Mizzi  writes:
> When I run

> EXPLAIN ANALYZE SELECT * FROM unary;

> I get the following result:

>  Seq Scan on unary  (cost=0.00..1637.01 rows=11 width=18) (actual
> time=0.009..6.667 rows=11 loops=1)
>  Planning Time: 0.105 ms
>  Execution Time: 8.565 ms

> On the other hand, the following command

> time sudo -u postgres psql -c "SELECT * FROM unary" -o /dev/null

> returns after 17s with:
> sudo -u postgres psql -c "SELECT * FROM unary" -o /dev/null  0.01s user
> 0.01s system 0% cpu 16.912 total

The main thing actual execution does that EXPLAIN does not is
format the data and send it off to the client.  There are a
number of possible bottlenecks involved there -- TOAST fetching,
data formatting, network traffic, or client processing.  Watching
this example in "top", I see psql consuming near 100% CPU, meaning
that the problem is with psql's code to make a nicely-aligned
ASCII table out of the result.  This isn't too surprising: that
code was never meant to operate on resultsets that are too large
for human consumption.  You could use a different formatting rule,
or switch to COPY.

As an example, using

psql -c '\pset format unaligned' -c "SELECT * FROM unary" -o /dev/null

this example drops from ~16s to ~1.7s on my machine.

regards, tom lane




Re: EXPLAIN ANALYZE does not return accurate execution times

2022-10-27 Thread Julien Rouhaud
Hi,

On Thu, Oct 27, 2022 at 03:28:14PM +0200, Mark Mizzi wrote:
>
> EXPLAIN ANALYZE SELECT * FROM unary;
>
> I get the following result:
>
>  Seq Scan on unary  (cost=0.00..1637.01 rows=11 width=18) (actual
> time=0.009..6.667 rows=11 loops=1)
>  Planning Time: 0.105 ms
>  Execution Time: 8.565 ms
>
> [...]
>
> sudo -u postgres psql -c "SELECT * FROM unary" -o /dev/null  0.01s user
> 0.01s system 0% cpu 16.912 total
>
> I am running Postgres 14 (installed via apt) on Ubuntu 22.04. All settings
> are default.
> The machine is a Dell Vostro 7500.
>
> All commands are being run locally, so I don't think this is a network
> bandwidth issue. What's going on?

EXPLAIN ANALYZE doesn't output the tuples, so it hides that part of the query
processing.  It's usually not a problem, at least if you want to identify non
optimal queries, but here you probably have the perfect scenario to notice the
difference.

You could try to use something like "SELECT COUNT(*) FROM unary", the timings
should be closer and the query would still scan all the lines.




EXPLAIN ANALYZE does not return accurate execution times

2022-10-27 Thread Mark Mizzi
As an example, let's take the following simple table:

CREATE TABLE unary(a VARCHAR);
-- simple way to make table large
ALTER TABLE unary
ALTER COLUMN a SET STORAGE EXTERNAL;

-- insert one million large rows
INSERT INTO unary
SELECT repeat('a', 8000)
FROM generate_series(0, 10);

-- update planner statistics on the unary table.
ANALYZE unary;

When I run

EXPLAIN ANALYZE SELECT * FROM unary;

I get the following result:

 Seq Scan on unary  (cost=0.00..1637.01 rows=11 width=18) (actual
time=0.009..6.667 rows=11 loops=1)
 Planning Time: 0.105 ms
 Execution Time: 8.565 ms

On the other hand, the following command

time sudo -u postgres psql -c "SELECT * FROM unary" -o /dev/null

returns after 17s with:

sudo -u postgres psql -c "SELECT * FROM unary" -o /dev/null  0.01s user
0.01s system 0% cpu 16.912 total

I am running Postgres 14 (installed via apt) on Ubuntu 22.04. All settings
are default.
The machine is a Dell Vostro 7500.

All commands are being run locally, so I don't think this is a network
bandwidth issue. What's going on?


Re: Value Too long varchar(100)

2022-10-27 Thread Stefan Knecht
You should be able to do something with this if you require it to be done
with COPY:

https://paquier.xyz/postgresql-2/postgres-9-3-feature-highlight-copy-tofrom-program/

But, as David suggested, I'd also recommend to do an intermediate step, and
load the data into a table, or pre-process the file to be how you want it.



On Thu, Oct 27, 2022 at 7:19 PM David G. Johnston <
david.g.johns...@gmail.com> wrote:

>
>
> On Thu, Oct 27, 2022 at 5:02 AM Rama Krishnan 
> wrote:
>
>> Hi team,
>>
>> We are getting csv file from client to upload data in my db table , one
>> particular column I. E clinet description column contains more than 100
>> character  hence I am getting value too long varchar (100) so we decided to
>> upload db only first 100 characters. How to use this thing in copy command
>>
>
> You cannot.  Either fix the content of the file or remove the arbitrary
> length limitation on the field (i.e., change the type to "text").  I
> suggest the later.  You may also,copy into a temporary staging table that
> lacks the limit, then use insert to move the transformed data (via a select
> query) into the production table.
>
> David J.
>


Re: Value Too long varchar(100)

2022-10-27 Thread David G. Johnston
On Thu, Oct 27, 2022 at 5:02 AM Rama Krishnan  wrote:

> Hi team,
>
> We are getting csv file from client to upload data in my db table , one
> particular column I. E clinet description column contains more than 100
> character  hence I am getting value too long varchar (100) so we decided to
> upload db only first 100 characters. How to use this thing in copy command
>

You cannot.  Either fix the content of the file or remove the arbitrary
length limitation on the field (i.e., change the type to "text").  I
suggest the later.  You may also,copy into a temporary staging table that
lacks the limit, then use insert to move the transformed data (via a select
query) into the production table.

David J.


Re: Index corruption revealed after upgrade to 11.17, could date back to at least 11.12

2022-10-27 Thread Kristjan Mustkivi
On Thu, Oct 27, 2022 at 12:18 PM Peter J. Holzer  wrote:
>
> On 2022-10-27 10:55:31 +0300, Kristjan Mustkivi wrote:
> > We use dockerized postgres.
>
> So that means you aren't just replacing PostgreSQL, but your complete OS
> (except the kernel). What is the source of your docker images? Do they
> all use the same base OS distribution? Are the locale definitions the
> same?
>
> (Just trying to rule other other possible error sources.)

Hello!

Up until 11.17, the source of the docker images was tag "postgres:11"
(from https://hub.docker.com/_/postgres), for 11.17 the tag became
"postgres:11-bullseye" but as far as i could tell it was just a
difference of tagging policy there. Everything else is kept the same
when building our custom docker image (with pg_cron, wal2json and
oracle_fdw). But.. I can see for example, that the PG 11.12 docker
image used Debian 9.13 (PG 11.17 uses Debian 11.5 according to
/etc/debian_version). So the official docker images also upgrade the
OS (which is expected I suppose).

Best regards,
-- 
Kristjan Mustkivi

Email: kristjan.mustk...@gmail.com




Re: PostgreSql Service different path

2022-10-27 Thread chris navarroza
I fixed it by editing the postgresql-14.service PGDATA path to the new
directory

Thanks,

butching

On Mon, Oct 24, 2022 at 7:25 PM chris navarroza 
wrote:

> But I'm using a different path when I initdb /usr/pgsql-14/bin/initdb -D 
> */home/dmartuser/pgsql/14/data
> *so  "/var/lib/pgsql/14/data/" is really empty. Is there a way to point
> the startup script to the new path  */home/dmartuser/pgsql/14/data*  ?
>
> Thanks,
>
> Chris Albert Navarroza
> Information Technology Officer I
> CTCO - ITDS - RDMD
>
>
> On Mon, Oct 24, 2022 at 6:57 PM Jeffrey Walton  wrote:
>
>> On Mon, Oct 24, 2022 at 6:38 AM chris navarroza
>>  wrote:
>> >
>> > I install postgresql14.5 with the following commands
>> >
>> > sudo yum install postgresql14-server postgresql14-contrib
>> >
>> >
>> > sudo su postgres
>> >
>> > cd /tmp
>> >
>> > /usr/pgsql-14/bin/initdb -D /home/dmartuser/pgsql/14/data
>> >
>> > and update the postgresql.conf to the new path, now when I start the
>> service, it has an error and when I check it says
>> >
>> > -- Unit postgresql-14.service has begun starting up.
>> > Oct 24 15:16:45 datamartds postgresql-14-check-db-dir[132338]:
>> "/var/lib/pgsql/14/data/" is missing or empty.
>> > Oct 24 15:16:45 datamartds postgresql-14-check-db-dir[132338]: Use
>> "/usr/pgsql-14/bin/postgresql-14-setup initdb" to initialize the database
>> cluster.
>> > Oct 24 15:16:45 datamartds postgresql-14-check-db-dir[132338]: See
>> /usr/share/doc/postgresql14/README.rpm-dist for more information.
>> > Oct 24 15:16:45 datamartds systemd[1]: postgresql-14.service: Control
>> process exited, code=exited status=1
>> > Oct 24 15:16:45 datamartds systemd[1]: postgresql-14.service: Failed
>> with result 'exit-code'.
>> >
>> > How can I point the service to read the new path (
>> /home/dmartuser/pgsql/14/data )?
>>
>> The startup script that is calling systemd should perform a `mkdir -p
>> /var/lib/pgsql/14/data` before calling the PostgreSQL binary.
>>
>> Jeff
>>
>


GIN Index Partial Match?

2022-10-27 Thread 黄宁
I create a gin index for a custom type. I want to use partial match, but I
find extract query method called twice in a query, when in the first
called, I return the minimal value of the type, and I want to set the
maximal value of the type in extra data will be used in compare partial
function ,But in the second extrace query function called,the query value
is the first my return.It's weird. ,how i can set the extra data correctly?

Thank You!


Value Too long varchar(100)

2022-10-27 Thread Rama Krishnan
Hi team,



We are getting csv file from client to upload data in my db table , one
particular column I. E clinet description column contains more than 100
character  hence I am getting value too long varchar (100) so we decided to
upload db only first 100 characters. How to use this thing in copy command


Re: Index corruption revealed after upgrade to 11.17, could date back to at least 11.12

2022-10-27 Thread Allan Kamau
On Thu, Oct 27, 2022 at 12:18 PM Peter J. Holzer  wrote:

> On 2022-10-27 10:55:31 +0300, Kristjan Mustkivi wrote:
> > We use dockerized postgres.
>
> So that means you aren't just replacing PostgreSQL, but your complete OS
> (except the kernel). What is the source of your docker images? Do they
> all use the same base OS distribution? Are the locale definitions the
> same?
>
> (Just trying to rule other other possible error sources.)
>
>
> > I do apologize, but I do not understand the value of doing that select
> > juggling.
>
> I think Allan may have misread your mail.
>
> hp
>
> --
>_  | Peter J. Holzer| Story must make more sense than reality.
> |_|_) ||
> | |   | h...@hjp.at |-- Charles Stross, "Creative writing
> __/   | http://www.hjp.at/ |   challenge!"
>



Hi Peter,

It was a long shot. I was trying to suggest determining if indeed the data
does actually contain duplicating values in the mentioned columns.
Then test to see if the index construction on the new table can be
successful on both versions and see if the error is reproducible.

-Allan.


Re: Index corruption revealed after upgrade to 11.17, could date back to at least 11.12

2022-10-27 Thread Peter J. Holzer
On 2022-10-27 10:55:31 +0300, Kristjan Mustkivi wrote:
> We use dockerized postgres.

So that means you aren't just replacing PostgreSQL, but your complete OS
(except the kernel). What is the source of your docker images? Do they
all use the same base OS distribution? Are the locale definitions the
same?

(Just trying to rule other other possible error sources.)


> I do apologize, but I do not understand the value of doing that select
> juggling.

I think Allan may have misread your mail.

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"


signature.asc
Description: PGP signature


Re: Index corruption revealed after upgrade to 11.17, could date back to at least 11.12

2022-10-27 Thread Kristjan Mustkivi
Hi Allan,

We use dockerized postgres. So the upgrade is just replacing the old
minor version pg docker image with new minor version docker image and
that's it. Of course, I am checking the release notes to see if there
is anything to pay attention to particularly.

I do apologize, but I do not understand the value of doing that select
juggling. I have been searching for and fixing the problematic rows
with ctid (and xmin, xmax  to help establish the ones to remove) and
this has been effective in indicating the discrepancy between actual
data in the table and the corruption in the e.g Primary Key index.
Also, playing around with enable_indexscan, enable_bitmapscan,
enable_seqscan prove the same problem with an index. After deleting
the offending rows by ctid, REINDEX-ing is possible. (As these are
production systems, some of the relations had to be fixed right away.)

This case is most likely to do with some bug and the release notes for
11.14, .15 and .16 seem to explain the current situation. What would
be good to know is that this particular bug has been fixed and is not
one yet to be uncovered.

Best regards,

Kristjan

On Thu, Oct 27, 2022 at 10:41 AM Allan Kamau  wrote:
>
>
>
> On Thu, Oct 27, 2022 at 10:26 AM Allan Kamau  wrote:
>>
>>
>>
>> On Thu, Oct 27, 2022 at 10:20 AM Kristjan Mustkivi  
>> wrote:
>>>
>>> Dear community,
>>>
>>> Right after upgrading our postgres servers from 11.15 to 11.17 we
>>> started to encounter problems with data. Namely, when the query hit
>>> the index, it returned a single row; when the query hit a relation
>>> directly, it returned more than one row. Attempt to REINDEX revealed
>>> the underlying data had duplicates (unique index reindexing failed).
>>>
>>> Version facts:
>>> we started out with 11.12
>>> jan 2022 upgraded to 11.14
>>> mar 2022 to 11.15
>>> oct 2022 to 11.17
>>>
>>> We are not sure when this corruption actually happened. Could it be
>>> related to the indexing bugs reported in
>>> https://www.postgresql.org/docs/release/11.14/? And the condition only
>>> became known to us after 11.17 rollout which can perhaps be explained
>>> by the following: while 11.17 does not have any outstanding index
>>> related fixes, then https://www.postgresql.org/docs/release/11.15/
>>> mentions fix for index-only scans and so does
>>> https://www.postgresql.org/docs/release/11.16/.
>>>
>>> The bottom line is we would like to understand if the index corruption
>>> and its manifestation is explained by the above release fixes or is
>>> there something else that should be investigated further here with the
>>> help from the community.
>>>
>>> With best regards,
>>> --
>>> Kristjan Mustkivi
>>>
>>> Email: kristjan.mustk...@gmail.com
>>>
>>>
>> Hi Kristjan,
>> What if you construct a select statement containing the row id and the 
>> column which has the problematic index into a new table. Then perform 
>> queries on this table to test for uniqueness of the column on which the 
>> problematic index was reported.
>>
>> Allan.
>
>
> How was the data "transfer" between upgrades done? Was it by dump and restore?
> If you have the 11.15 instance running having the data, you may do the 
> selection of the row id and the specific column which the index is based into 
> a new table and perform queries on this too to determine uniqueness of the 
> values therein. Likewise do the same for the 11.17 version.
>
> Is it possible to build and install PG 15 from source on a different 
> directory (using --prefix ) then perform pg_dump using the binaries of this 
> installation into a directory. Then configure PG 15 installation to listen on 
> a different TCP/IP port to the one you are currently using with 11.17 
> instance. Once started, test to see if the index anomaly is present in the PG 
> 15 instance. Alternatively you may use the PG 15 docker image and docker to 
> start a PG 15 docker container for your tests instead of having to build and 
> install PG 15 for this test.
>
> -Allan
>
>
>


-- 
Kristjan Mustkivi

Email: kristjan.mustk...@gmail.com




Re: Index corruption revealed after upgrade to 11.17, could date back to at least 11.12

2022-10-27 Thread Allan Kamau
On Thu, Oct 27, 2022 at 10:26 AM Allan Kamau  wrote:

>
>
> On Thu, Oct 27, 2022 at 10:20 AM Kristjan Mustkivi 
> wrote:
>
>> Dear community,
>>
>> Right after upgrading our postgres servers from 11.15 to 11.17 we
>> started to encounter problems with data. Namely, when the query hit
>> the index, it returned a single row; when the query hit a relation
>> directly, it returned more than one row. Attempt to REINDEX revealed
>> the underlying data had duplicates (unique index reindexing failed).
>>
>> Version facts:
>> we started out with 11.12
>> jan 2022 upgraded to 11.14
>> mar 2022 to 11.15
>> oct 2022 to 11.17
>>
>> We are not sure when this corruption actually happened. Could it be
>> related to the indexing bugs reported in
>> https://www.postgresql.org/docs/release/11.14/? And the condition only
>> became known to us after 11.17 rollout which can perhaps be explained
>> by the following: while 11.17 does not have any outstanding index
>> related fixes, then https://www.postgresql.org/docs/release/11.15/
>> mentions fix for index-only scans and so does
>> https://www.postgresql.org/docs/release/11.16/.
>>
>> The bottom line is we would like to understand if the index corruption
>> and its manifestation is explained by the above release fixes or is
>> there something else that should be investigated further here with the
>> help from the community.
>>
>> With best regards,
>> --
>> Kristjan Mustkivi
>>
>> Email: kristjan.mustk...@gmail.com
>>
>>
>> Hi Kristjan,
> What if you construct a select statement containing the row id and the
> column which has the problematic index into a new table. Then perform
> queries on this table to test for uniqueness of the column on which the
> problematic index was reported.
>
> Allan.
>

How was the data "transfer" between upgrades done? Was it by dump and
restore?
If you have the 11.15 instance running having the data, you may do the
selection of the row id and the specific column which the index is based
into a new table and perform queries on this too to determine uniqueness of
the values therein. Likewise do the same for the 11.17 version.

Is it possible to build and install PG 15 from source on a different
directory (using --prefix ) then perform pg_dump using the binaries of this
installation into a directory. Then configure PG 15 installation to listen
on a different TCP/IP port to the one you are currently using with 11.17
instance. Once started, test to see if the index anomaly is present in the
PG 15 instance. Alternatively you may use the PG 15 docker image and docker
to start a PG 15 docker container for your tests instead of having to build
and install PG 15 for this test.

-Allan


Re: Index corruption revealed after upgrade to 11.17, could date back to at least 11.12

2022-10-27 Thread Allan Kamau
On Thu, Oct 27, 2022 at 10:20 AM Kristjan Mustkivi 
wrote:

> Dear community,
>
> Right after upgrading our postgres servers from 11.15 to 11.17 we
> started to encounter problems with data. Namely, when the query hit
> the index, it returned a single row; when the query hit a relation
> directly, it returned more than one row. Attempt to REINDEX revealed
> the underlying data had duplicates (unique index reindexing failed).
>
> Version facts:
> we started out with 11.12
> jan 2022 upgraded to 11.14
> mar 2022 to 11.15
> oct 2022 to 11.17
>
> We are not sure when this corruption actually happened. Could it be
> related to the indexing bugs reported in
> https://www.postgresql.org/docs/release/11.14/? And the condition only
> became known to us after 11.17 rollout which can perhaps be explained
> by the following: while 11.17 does not have any outstanding index
> related fixes, then https://www.postgresql.org/docs/release/11.15/
> mentions fix for index-only scans and so does
> https://www.postgresql.org/docs/release/11.16/.
>
> The bottom line is we would like to understand if the index corruption
> and its manifestation is explained by the above release fixes or is
> there something else that should be investigated further here with the
> help from the community.
>
> With best regards,
> --
> Kristjan Mustkivi
>
> Email: kristjan.mustk...@gmail.com
>
>
> Hi Kristjan,
What if you construct a select statement containing the row id and the
column which has the problematic index into a new table. Then perform
queries on this table to test for uniqueness of the column on which the
problematic index was reported.

Allan.


Index corruption revealed after upgrade to 11.17, could date back to at least 11.12

2022-10-27 Thread Kristjan Mustkivi
Dear community,

Right after upgrading our postgres servers from 11.15 to 11.17 we
started to encounter problems with data. Namely, when the query hit
the index, it returned a single row; when the query hit a relation
directly, it returned more than one row. Attempt to REINDEX revealed
the underlying data had duplicates (unique index reindexing failed).

Version facts:
we started out with 11.12
jan 2022 upgraded to 11.14
mar 2022 to 11.15
oct 2022 to 11.17

We are not sure when this corruption actually happened. Could it be
related to the indexing bugs reported in
https://www.postgresql.org/docs/release/11.14/? And the condition only
became known to us after 11.17 rollout which can perhaps be explained
by the following: while 11.17 does not have any outstanding index
related fixes, then https://www.postgresql.org/docs/release/11.15/
mentions fix for index-only scans and so does
https://www.postgresql.org/docs/release/11.16/.

The bottom line is we would like to understand if the index corruption
and its manifestation is explained by the above release fixes or is
there something else that should be investigated further here with the
help from the community.

With best regards,
-- 
Kristjan Mustkivi

Email: kristjan.mustk...@gmail.com