Re: Idea: PostgreSQL equivalent to Oracle's KEEP clause

2023-03-13 Thread Ben Clements
Thanks David.

Similar to your "TOP() and BOTTOM() aggregate" idea, you might find Erwin
Brandstetter's solution using the LAST() aggregate function interesting: (
https://dba.stackexchange.com/a/324646/100880)

If the FIRST_LAST_AGG extension is installed, then we can do something like
this:

SELECT country
     , count(*)        AS ct_cities
     , max(population) AS highest_population
     , last(city ORDER BY population, city) AS biggest_city  -- !
FROM   cities
GROUP  BY country
HAVING count(*) > 1;
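
(For comparison, a rough built-in-only equivalent, without the extension; a
sketch, untested, assuming the same cities table:

SELECT country
     , count(*)        AS ct_cities
     , max(population) AS highest_population
     , (array_agg(city ORDER BY population DESC, city DESC))[1] AS biggest_city
FROM   cities
GROUP  BY country
HAVING count(*) > 1;
)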


-Ben

On Mon, Mar 6, 2023 at 9:51 PM David Rowley  wrote:

> On Tue, 7 Mar 2023 at 12:40, Tom Lane  wrote:
> >
> > Ben Clements  writes:
> > > As shown above, the following calculated column can bring in the city
> name,
> > > even though the city name isn't in the GROUP BY:
> > >max(city) keep (dense_rank first order by population desc)
> >
> > You haven't really explained what this does, let alone why it can't
> > be implemented with existing features such as FILTER and ORDER BY.
>
> (It wasn't clear to me until I watched the youtube video.)
>
> Likely KEEP is more flexible than just the given example but I think
> that something similar to the example given could be done by inventing
> a TOP() and BOTTOM() aggregate. Then you could write something like:
>
> select
>country,
>count(*),
>max(population),
>bottom(city, population)
> from
>cities
> group by
>country
> having
>count(*) > 1
>
> the transfn for bottom() would need to remember the city and the
> population for the highest yet seen value of the 2nd arg.  The
> combinefn would need to find the aggregate state with the highest 2nd
> arg value, the finalfn would just spit out the column that's stored in
> the state.  Where this wouldn't work would be if multiple columns were
> required to tiebreak the sort.
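>
> (A rough sketch of just the serial part of that idea, untested, and
> with concrete types instead of polymorphic ones:
>
> CREATE FUNCTION bottom_tr(state text[], val text, key numeric)
> RETURNS text[] LANGUAGE sql IMMUTABLE AS $$
>   SELECT CASE WHEN state IS NULL OR key > state[2]::numeric
>               THEN ARRAY[val, key::text] ELSE state END
> $$;
>
> CREATE FUNCTION bottom_fin(state text[]) RETURNS text
> LANGUAGE sql IMMUTABLE AS $$ SELECT state[1] $$;
>
> CREATE AGGREGATE bottom(text, numeric) (
>   SFUNC = bottom_tr, STYPE = text[], FINALFUNC = bottom_fin
> );
> )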
>
> You could at least parallelize the aggregation this way. If there were
> to be some form of ORDER BY in the aggregate then no parallelization
> would be possible.  I'd assume since the whole thing can be done with
> a subquery that the entire point of having special syntax for this
> would be because we don't want to pay the price of looking at the
> table twice, i.e. performance must matter, so the ability to have
> parallel aggregates here seems good.
>
> I can't quite think of a way to have parallel query and an arbitrarily
> long list of columns to sort on...
>
> For Ben, we do tend to shy away from copying other RDBMS's extensions
> to the SQL language.  The problem is that copying these can cause
> problems in the future if/when the standard adopts that syntax with
> variations or invents something else that conflicts with the grammar
> that we've added.  One example of something we didn't do was Oracle's
> CONNECT BY.  Eventually, the SQL standard got WITH RECURSIVE to allow
> queries on hierarchical data. Of course, we do have many of our own
> extensions to the standard, so we certainly do make exceptions
> sometimes. So, don't be too surprised that there's some discussion of
> other methods which might make this work which don't involve copying
> what someone else has done.
>
> David
>


Re: Uppercase version of ß desired

2023-03-13 Thread Tom Lane
"Peter J. Holzer"  writes:
> On 2023-03-13 17:38:51 -0400, Celia McInnis wrote:
>> I would be really happy if postgresql had an upper case version of the ß
>> german character.

> But the 'ß' is a bit special as it is usually uppercased to 'SS'
> (although 'ẞ' is now officially allowed, too).
> Apparently your (and my) locale doesn't uppercase ß at all, which isn't
> correct according to German spelling rules but was very common in the
> last decades.

Our code for libc locales doesn't support upcasing 'ß' to 'SS',
because it uses towupper(), which can only manage
one-character-to-one-character transformations.  It should work for
upcasing to 'ẞ', but as you say, you need to find a locale that thinks
that should happen.

You might have better luck if you have a version of Postgres that
supports ICU and you can use an ICU locale.  That code path doesn't
appear to have any hard-wired assumption about how many characters
in convert to how many out.

regards, tom lane




Re: Uppercase version of ß desired

2023-03-13 Thread Peter J. Holzer
On 2023-03-13 17:38:51 -0400, Celia McInnis wrote:
> I would be really happy if postgresql had an upper case version of the ß
> German character. The wiki page
> https://en.wikipedia.org/wiki/%C3%9F
> indicates that the capital (U+1E9E ẞ LATIN CAPITAL LETTER SHARP S) was encoded
> by ISO 10646 in 2008.

The character is there, of course, and lower-casing it works as
expected:

hjp=> select 'ẞ', lower('ẞ');
╔══════════╤═══════╗
║ ?column? │ lower ║
╟──────────┼───────╢
║ ẞ        │ ß     ║
╚══════════╧═══════╝
(1 row)

But the 'ß' is a bit special as it is usually uppercased to 'SS'
(although 'ẞ' is now officially allowed, too).

Apparently your (and my) locale doesn't uppercase ß at all, which isn't
correct according to German spelling rules but was very common in the
last decades.

You can specify an alternate locale:

hjp=> select upper('ß');
╔═══════╗
║ upper ║
╟───────╢
║ ß     ║
╚═══════╝
(1 row)


hjp=> select upper('ß' collate "de-AT-x-icu");
╔═══╗
║ upper ║
╟───╢
║ SS║
╚═══╝
(1 row)


The challenge now is to find a locale which uppercases ß to ẞ.

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"


signature.asc
Description: PGP signature


Using GSSAPI/Kerberos/Active Directory: want the database user name to be the full name including the realm name without specifying the user

2023-03-13 Thread Michael Downey
Good afternoon,

We can successfully connect via GSSAPI/Kerberos using our Active Directory 
credentials when we map the domain user to a database user that matches the 
name when the domain information is removed.

When we have a user called mich:

C:\Users\mich2581>"c:\Program Files\PostgreSQL\14\bin"\psql -h dev00*.esri.com snoopynet**1

Pg_hba:
host    all    all    0.0.0.0/0    gss include_realm=1 map=map

Pg_ident:
# MAPNAME   SYSTEM-USERNAME     PG-USERNAME
map         /^(.*)@ESRI\.COM$   \1
map         /^(.*)@esri\.com$   \1

If we create a database role called mich9...@esri.com and run psql or other
clients without providing a name with the command, we fail like this:

2023-03-09 16:58:51.732 PST [3654539] LOG:  provided user name (mich) and 
authenticated user name (mich@ESRI.COM) do not match
2023-03-09 16:58:51.732 PST [3654539] FATAL:  GSSAPI authentication failed for 
user "mich2581"

If we specify the full name with the domain at the command line, it connects
as the role whose name includes the domain. We tested that without the
mapping set above.

From what we can ascertain, this appears to be default behavior. The thing is,
many notes on setting this up on the interweb have examples creating the
database user with the full name including domain.

Is it possible to do that? We are not certain where the domain name is being 
stripped off.
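
(For example, we wondered whether a regex mapping that re-appends the realm
would let the full name through; an untested sketch on our part:

# MAPNAME   SYSTEM-USERNAME     PG-USERNAME
map         /^(.*)@ESRI\.COM$   \1@esri.com
)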

Thanks,
Michael


Uppercase version of ß desired

2023-03-13 Thread Celia McInnis
Hi:

I would be really happy if postgresql had an upper case version of the ß
German character. The wiki page
https://en.wikipedia.org/wiki/%C3%9F
indicates that the capital (U+1E9E ẞ LATIN CAPITAL LETTER SHARP S) was
encoded by ISO 10646 in 2008.

BTW the reason that I'd like upper('ß') to give something different from
'ß' is because I have written a simple substitution puzzle for a large
number of languages where I show the encrypted lower case words in upper
case and the successful letter substitution submissions in lower case - so
I need the upper and lower case versions of each letter to be different!

Thanks for any assistance! Maybe I can hack what I want in python (which is
what I am using for the puzzle).
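
(For example, a small Python sketch that forces the one-to-one mapping
before upper-casing, since upper() leaves 'ẞ' unchanged:

def puzzle_upper(s):
    return s.replace('ß', '\u1E9E').upper()
)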

Celia McInnis


Re: Properly handle OOM death?

2023-03-13 Thread Israel Brewster
> On Mar 13, 2023, at 12:25 PM, Joe Conway  wrote:
> 
> On 3/13/23 16:18, Israel Brewster wrote:
>>> Did you try setting "vm.overcommit_memory=2"?
> 
>> root@novarupta:~# sysctl -w vm.overcommit_memory=2
>> sysctl: setting key "vm.overcommit_memory", ignoring: Read-only file system
> 
>> I’m thinking I wound up with a container rather than a full VM after
>> all - and as such, the best solution may be to migrate to a full VM
>> with some swap space available to avoid the issue in the first place.
>> I’ll have to get in touch with the sys admin for that though.
> 
> Hmm, well big +1 for having swap turned on, but I recommend setting 
> "vm.overcommit_memory=2" even so.

Makes sense. Presumably with a full VM I won’t get the “Read-only file system” 
error when trying to do so.

Thanks!

---
Israel Brewster
Software Engineer
Alaska Volcano Observatory 
Geophysical Institute - UAF 
2156 Koyukuk Drive 
Fairbanks AK 99775-7320
Work: 907-474-5172
cell:  907-328-9145
> 
> -- 
> Joe Conway
> PostgreSQL Contributors Team
> RDS Open Source Databases
> Amazon Web Services: https://aws.amazon.com
> 





Re: Properly handle OOM death?

2023-03-13 Thread Israel Brewster


> On Mar 13, 2023, at 12:16 PM, Peter J. Holzer  wrote:
> 
> On 2023-03-13 09:55:50 -0800, Israel Brewster wrote:
>> On Mar 13, 2023, at 9:43 AM, Peter J. Holzer  wrote:
> The syslog should contain a list of all tasks prior to the kill. For
> example, I just provoked an OOM kill on my laptop and the syslog
> contains (among lots of others) these lines:
> 
> Mar 13 21:00:36 trintignant kernel: [112024.084117] [   2721]   126  2721 54563 2042   163840  555  -900 postgres
> Mar 13 21:00:36 trintignant kernel: [112024.084123] [   2873]   126  2873 18211   85   114688  594 0 postgres
> Mar 13 21:00:36 trintignant kernel: [112024.084128] [   2941]   126  2941 54592 1231   147456  565 0 postgres
> Mar 13 21:00:36 trintignant kernel: [112024.084134] [   2942]   126  2942 54563  535   143360  550 0 postgres
> Mar 13 21:00:36 trintignant kernel: [112024.084139] [   2943]   126  2943 54563 1243   139264  548 0 postgres
> Mar 13 21:00:36 trintignant kernel: [112024.084145] [   2944]   126  2944 54798  561   147456  545 0 postgres
> Mar 13 21:00:36 trintignant kernel: [112024.084150] [   2945]   126  2945 54563  215   131072  551 0 postgres
> Mar 13 21:00:36 trintignant kernel: [112024.084156] [   2956]   126  2956 18718  506   122880  553 0 postgres
> Mar 13 21:00:36 trintignant kernel: [112024.084161] [   2957]   126  2957 54672  269   139264  546 0 postgres
> 
> That's less helpful than it could be since all the postgres processes
> are just listed as "postgres" without arguments. However, it is very
> likely that the first one is actually the postmaster, because it has the
> lowest pid (and the other pids follow closely) and it has an OOM score
> of -900 as set in the systemd service file.
> 
> So I could compare the PID of the killed process with this list (in my
> case the killed process wasn't one of them but a test program which just
> allocates lots of memory).

Oh, interesting. I had just grepped for ‘Killed process’, so I didn’t see those
preceding lines. Looking at that, I see two things:
1) The entries in my syslog all refer to an R process, not a postgresql process
at all.
2) The ‘Killed process’ entry *does* actually have the process name in it -
it’s just that since the process name was “R”, I wasn’t making the connection.
> 
>hp
> 
> -- 
>   _  | Peter J. Holzer| Story must make more sense than reality.
> |_|_) ||
> | |   | h...@hjp.at  |-- Charles Stross, 
> "Creative writing
> __/   | http://www.hjp.at/ |   challenge!"



Re: Properly handle OOM death?

2023-03-13 Thread Joe Conway

On 3/13/23 16:18, Israel Brewster wrote:
>> On Mar 13, 2023, at 11:42 AM, Joe Conway  wrote:
>> I am not entirely sure, but without actually testing it I suspect
>> that since memory.max = high (that is, the limit is whatever the
>> host has available) the OOM kill is technically a cgroup OOM kill
>> even though it is effectively a host level memory pressure event.

Sorry, actually meant "memory.max = max" here

>> Did you try setting "vm.overcommit_memory=2"?
>
> root@novarupta:~# sysctl -w vm.overcommit_memory=2
> sysctl: setting key "vm.overcommit_memory", ignoring: Read-only file system
>
> I’m thinking I wound up with a container rather than a full VM after
> all - and as such, the best solution may be to migrate to a full VM
> with some swap space available to avoid the issue in the first place.
> I’ll have to get in touch with the sys admin for that though.

Hmm, well big +1 for having swap turned on, but I recommend setting
"vm.overcommit_memory=2" even so.


--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: Properly handle OOM death?

2023-03-13 Thread Israel Brewster
> On Mar 13, 2023, at 11:42 AM, Joe Conway  wrote:
> 
> On 3/13/23 15:18, Israel Brewster wrote:
>> The syslog specifically says "Memory cgroup out of memory”, if that means
>> something (this is my first exposure to cgroups, if you couldn’t
>> tell).
> 
> I am not entirely sure, but without actually testing it I suspect that since 
> memory.max = high (that is, the limit is whatever the host has available) the 
> OOM kill is technically a cgroup OOM kill even though it is effectively a 
> host level memory pressure event.

That would make sense.

> 
> Did you try setting "vm.overcommit_memory=2"?

Yeah:

root@novarupta:~# sysctl -w vm.overcommit_memory=2
sysctl: setting key "vm.overcommit_memory", ignoring: Read-only file system

I’m thinking I wound up with a container rather than a full VM after all - and 
as such, the best solution may be to migrate to a full VM with some swap space 
available to avoid the issue in the first place. I’ll have to get in touch with 
the sys admin for that though.
---
Israel Brewster
Software Engineer
Alaska Volcano Observatory 
Geophysical Institute - UAF 
2156 Koyukuk Drive 
Fairbanks AK 99775-7320
Work: 907-474-5172
cell:  907-328-9145

> 
> -- 
> Joe Conway
> PostgreSQL Contributors Team
> RDS Open Source Databases
> Amazon Web Services: https://aws.amazon.com
> 





Re: Properly handle OOM death?

2023-03-13 Thread Peter J. Holzer
On 2023-03-13 09:55:50 -0800, Israel Brewster wrote:
> On Mar 13, 2023, at 9:43 AM, Peter J. Holzer  wrote:
> > On 2023-03-13 09:21:18 -0800, Israel Brewster wrote:
> >> I’m running a postgresql 13 database on an Ubuntu 20.04 VM that is a bit 
> >> more
> >> memory constrained than I would like, such that every week or so the 
> >> various
> >> processes running on the machine will align badly and the OOM killer will 
> >> kick
> >> in, killing off postgresql, as per the following journalctl output:
> >> 
> >> Mar 12 04:04:23 novarupta systemd[1]: postgresql@13-main.service: A 
> >> process of
> >> this unit has been killed by the OOM killer.
> >> Mar 12 04:04:32 novarupta systemd[1]: postgresql@13-main.service: Failed 
> >> with
> >> result 'oom-kill'.
> >> Mar 12 04:04:32 novarupta systemd[1]: postgresql@13-main.service: Consumed 
> >> 5d
> >> 17h 48min 24.509s CPU time.
> >> 
> >> And the service is no longer running.
> > 
> > I might be misreading this, but it looks to me that systemd detects that
> > *some* process in the group was killed by the oom killer and stops the
> > service.
> > 
> > Can you check which process was actually killed? If it's not the
> > postmaster, setting OOMScoreAdjust is probably useless.
> > 
> > (I tried searching the web for the error messages and didn't find
> > anything useful)
> 
> Your guess is as good as (if not better than) mine. I can find the PID
> of the killed process in the system log, but without knowing what the
> PID of postmaster and the child processes were prior to the kill, I’m
> not sure that helps much.

The syslog should contain a list of all tasks prior to the kill. For
example, I just provoked an OOM kill on my laptop and the syslog
contains (among lots of others) these lines:

Mar 13 21:00:36 trintignant kernel: [112024.084117] [   2721]   126  2721 54563 2042   163840  555  -900 postgres
Mar 13 21:00:36 trintignant kernel: [112024.084123] [   2873]   126  2873 18211   85   114688  594 0 postgres
Mar 13 21:00:36 trintignant kernel: [112024.084128] [   2941]   126  2941 54592 1231   147456  565 0 postgres
Mar 13 21:00:36 trintignant kernel: [112024.084134] [   2942]   126  2942 54563  535   143360  550 0 postgres
Mar 13 21:00:36 trintignant kernel: [112024.084139] [   2943]   126  2943 54563 1243   139264  548 0 postgres
Mar 13 21:00:36 trintignant kernel: [112024.084145] [   2944]   126  2944 54798  561   147456  545 0 postgres
Mar 13 21:00:36 trintignant kernel: [112024.084150] [   2945]   126  2945 54563  215   131072  551 0 postgres
Mar 13 21:00:36 trintignant kernel: [112024.084156] [   2956]   126  2956 18718  506   122880  553 0 postgres
Mar 13 21:00:36 trintignant kernel: [112024.084161] [   2957]   126  2957 54672  269   139264  546 0 postgres

That's less helpful than it could be since all the postgres processes
are just listed as "postgres" without arguments. However, it is very
likely that the first one is actually the postmaster, because it has the
lowest pid (and the other pids follow closely) and it has an OOM score
of -900 as set in the systemd service file.

So I could compare the PID of the killed process with this list (in my
case the killed process wasn't one of them but a test program which just
allocates lots of memory).
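
(To pull those lines out of the journal quickly, something like this should
work; a sketch:

journalctl -k | grep -E 'oom-kill|Killed process|postgres'
)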

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"


signature.asc
Description: PGP signature


Re: Properly handle OOM death?

2023-03-13 Thread Joe Conway

On 3/13/23 15:18, Israel Brewster wrote:
> root@novarupta:~# cat /sys/fs/cgroup/system.slice/system-postgresql.slice/postgresql@13-main.service/memory.max
> max
> root@novarupta:~# cat /sys/fs/cgroup/system.slice/system-postgresql.slice/postgresql@13-main.service/memory.high
> max
> root@novarupta:~#
>
> which would presumably indicate that it’s a system level limit being
> exceeded, rather than a postgresql specific one?

Yep

> The syslog specifically says "Memory cgroup out of memory”, if that means
> something (this is my first exposure to cgroups, if you couldn’t
> tell).

I am not entirely sure, but without actually testing it I suspect that
since memory.max = high (that is, the limit is whatever the host has
available) the OOM kill is technically a cgroup OOM kill even though it
is effectively a host level memory pressure event.

Did you try setting "vm.overcommit_memory=2"?

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: Properly handle OOM death?

2023-03-13 Thread Jeffrey Walton
On Mon, Mar 13, 2023 at 1:21 PM Israel Brewster  wrote:
>
> I’m running a postgresql 13 database on an Ubuntu 20.04 VM that is a bit more 
> memory constrained than I would like, such that every week or so the various 
> processes running on the machine will align badly and the OOM killer will 
> kick in, killing off postgresql, as per the following journalctl output:
>
> Mar 12 04:04:23 novarupta systemd[1]: postgresql@13-main.service: A process 
> of this unit has been killed by the OOM killer.
> Mar 12 04:04:32 novarupta systemd[1]: postgresql@13-main.service: Failed with 
> result 'oom-kill'.
> Mar 12 04:04:32 novarupta systemd[1]: postgresql@13-main.service: Consumed 5d 
> 17h 48min 24.509s CPU time.
>
> And the service is no longer running.
>
> When this happens, I go in and restart the postgresql service, and everything 
> is happy again for the next week or two.
>
> Obviously this is not a good situation. Which leads to two questions:
>
> 1) is there some tweaking I can do in the postgresql config itself to prevent 
> the situation from occurring in the first place?
> 2) My first thought was to simply have systemd restart postgresql whenever it 
> is killed like this, which is easy enough. Then I looked at the default unit 
> file, and found these lines:
>
> # prevent OOM killer from choosing the postmaster (individual backends will
> # reset the score to 0)
> OOMScoreAdjust=-900
> # restarting automatically will prevent "pg_ctlcluster ... stop" from working,
> # so we disable it here. Also, the postmaster will restart by itself on most
> # problems anyway, so it is questionable if one wants to enable external
> # automatic restarts.
> #Restart=on-failure
>
> Which seems to imply that the OOM killer should only be killing off 
> individual backends, not the entire cluster to begin with - which should be 
> fine. And also that adding the restart=on-failure option is probably not the 
> greatest idea. Which makes me wonder what is really going on?
>

Related, we (a FOSS project) used to have a Linux server with a LAMP
stack on GoDaddy. The machine provided a website and wiki. It was very
low-end. I think it had 512MB or 1 GB RAM and no swap file. And no way
to enable a swap file (part of an upsell). We paid about $2 a month
for it.

MySQL was killed several times a week. It corrupted the database on a
regular basis. We had to run the database repair tools daily. We
eventually switched to Ionos for hosting. We got a VM with more memory
and a swap file for about $5 a month. No more OOM kills.

If possible, you might want to add more memory (or a swap file) to the
machine. It will help sidestep the OOM problem.
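
(For reference, a quick swap-file sketch on Ubuntu; the 2G size is just an
example:

sudo fallocate -l 2G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab
)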

You can also add vm.overcommit_memory = 2 to stop Linux from
oversubscribing memory. The machine will act like a Solaris box rather
than a Linux box (which takes some getting used to). Also see
https://serverfault.com/questions/606185/how-does-vm-overcommit-memory-work
.

Jeff




Re: Properly handle OOM death?

2023-03-13 Thread Israel Brewster
On Mar 13, 2023, at 11:10 AM, Joe Conway  wrote:
> 
> On 3/13/23 14:50, Israel Brewster wrote:
>> Looks like V2:
>> root@novarupta:~# stat -fc %T /sys/fs/cgroup/
>> cgroup2fs
> 
> Interesting -- it does indeed look like you are using cgroup v2
> 
> So the file you want to look at in that case is:
> 8<---
> cat /sys/fs/cgroup/system.slice/system-postgresql.slice/postgresql@14.service/memory.max
> 4294967296
>
> cat /sys/fs/cgroup/system.slice/system-postgresql.slice/postgresql@14.service/memory.high
> 3221225472
> 8<---
> If the value comes back as "max" it means no limit is set.

This does, in fact, appear to be the case here:

root@novarupta:~# cat /sys/fs/cgroup/system.slice/system-postgresql.slice/postgresql@13-main.service/memory.max
max
root@novarupta:~# cat /sys/fs/cgroup/system.slice/system-postgresql.slice/postgresql@13-main.service/memory.high
max
root@novarupta:~#

which would presumably indicate that it’s a system level limit being exceeded, 
rather than a postgresql specific one? The syslog specifically says "Memory 
cgroup out of memory”, if that means something (this is my first exposure to 
cgroups, if you couldn’t tell).
---
Israel Brewster
Software Engineer
Alaska Volcano Observatory 
Geophysical Institute - UAF 
2156 Koyukuk Drive 
Fairbanks AK 99775-7320
Work: 907-474-5172
cell:  907-328-9145


> 
> In this example (on my Linux Mint machine with a custom systemd unit file) I 
> have memory.max set to 4G and memory.high set to 3G.
> 
> The value of memory.max determines when the OOM killer will strike. The value 
> of memory.high will determine when the kernel goes into aggressive memory 
> reclaim (trying to avoid memory.max and thus an OOM kill).
> 
> The corresponding/relevant systemd unit file parameters are:
> 8<---
> MemoryAccounting=yes
> MemoryHigh=3G
> MemoryMax=4G
> 8<---
> 
> There are other ways that memory.max may get set, but it seems most likely 
> that the systemd unit file is doing it (if it is in fact set).
> 
> -- 
> Joe Conway
> PostgreSQL Contributors Team
> RDS Open Source Databases
> Amazon Web Services: https://aws.amazon.com
> 





Re: Binary large object processing problems

2023-03-13 Thread Raivo Rebane
OK, I will do that. And it works now both ways.

Regards
Raivo

On Mon, Mar 13, 2023 at 6:18 PM Tom Lane  wrote:

> Raivo Rebane  writes:
> > Thanks for the answer, but I had also read this manual and found there:
> > 35.3.2. Importing a Large Object
>
> You need to read the start of that section, which says
>
> All large object manipulation using these functions
> must take place within an SQL transaction block,
>
> As Christoph said, you're responsible for supplying a BEGIN and COMMIT
> around the call.
>
> regards, tom lane
>


Re: Properly handle OOM death?

2023-03-13 Thread Joe Conway

On 3/13/23 14:50, Israel Brewster wrote:
> Looks like V2:
>
> root@novarupta:~# stat -fc %T /sys/fs/cgroup/
> cgroup2fs

Interesting -- it does indeed look like you are using cgroup v2

So the file you want to look at in that case is:
8<---
cat /sys/fs/cgroup/system.slice/system-postgresql.slice/postgresql@14.service/memory.max
4294967296

cat /sys/fs/cgroup/system.slice/system-postgresql.slice/postgresql@14.service/memory.high
3221225472
8<---
If the value comes back as "max" it means no limit is set.

In this example (on my Linux Mint machine with a custom systemd unit
file) I have memory.max set to 4G and memory.high set to 3G.

The value of memory.max determines when the OOM killer will strike. The
value of memory.high will determine when the kernel goes into aggressive
memory reclaim (trying to avoid memory.max and thus an OOM kill).

The corresponding/relevant systemd unit file parameters are:
8<---
MemoryAccounting=yes
MemoryHigh=3G
MemoryMax=4G
8<---

There are other ways that memory.max may get set, but it seems most
likely that the systemd unit file is doing it (if it is in fact set).
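
(One such way, sketched here and untested on your setup: systemd can apply
the limits at runtime with

systemctl set-property postgresql@13-main.service MemoryMax=4G MemoryHigh=3G

which also persists them as a drop-in under /etc/systemd/system.control/.)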


--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: Properly handle OOM death?

2023-03-13 Thread Israel Brewster
> On Mar 13, 2023, at 10:37 AM, Joe Conway  wrote:
> 
> On 3/13/23 13:55, Israel Brewster wrote:
>> 1) They reference a “Memory cgroup out of memory”, which refers back
>> to the opening comment on Joe Conway’s message - this would imply to
>> me that I *AM* running with a cgroup memory.limit set. Not sure how
>> that changes things?
> 
> cgroup memory limit is enforced regardless of the actual host level memory 
> pressure. As an example, if your host VM has 128 GB of memory, but your 
> cgroup memory limit is 512MB, you will get an OOM kill when the sum memory 
> usage of all of your postgres processes (and anything else sharing the same 
> cgroup) exceeds 512 MB, even if the host VM has nothing else going on 
> consuming memory.
> 
> You can check if a memory is set by reading the corresponding virtual file, 
> e.g:
> 
> 8<---
> # cat /sys/fs/cgroup/memory/system.slice/postgresql.service/memory.limit_in_bytes
> 9223372036854710272
> 8<---
> 
> A few notes:
> 1/ The specific path to memory.limit_in_bytes might vary, but this example is 
> the default for the RHEL 8 postgresql 10 RPM.

Not finding that file specifically (this is probably too much info, but…):

root@novarupta:~# ls /sys/fs/cgroup/system.slice/   

 -.mount   cgroup.threads  dev-hugepages.mount  
memory.events.local   memory.swap.eventsproc-diskstats.mount
   ssh.service   system-postgresql.slice
   systemd-resolved.service
 accounts-daemon.service   cgroup.type dev-lxc-console.mount
memory.high   memory.swap.high  proc-loadavg.mount  
   sys-devices-system-cpu-online.mount   systemd-initctl.socket 
   systemd-sysctl.service
 cgroup.controllersconsole-getty.service   dev-lxc-tty1.mount   
memory.lowmemory.swap.max   proc-meminfo.mount  
   sys-devices-virtual-net.mount systemd-journal-flush.service  
   systemd-sysusers.service
 cgroup.events console-setup.service   dev-lxc-tty2.mount   
memory.maxnetworkd-dispatcher.service   proc-stat.mount 
   sys-fs-fuse-connections.mount systemd-journald-audit.socket  
   systemd-tmpfiles-setup-dev.service
 cgroup.freeze cpu.pressuredev-mqueue.mount 
memory.minpids.current  proc-swaps.mount
   sys-kernel-debug.mount
systemd-journald-dev-log.socket   systemd-tmpfiles-setup.service
 cgroup.max.depth  cpu.statdev-ptmx.mount   
memory.numa_stat  pids.events   
proc-sys-kernel-random-boot_id.mount   syslog.socket 
systemd-journald.service  systemd-update-utmp.service
 cgroup.max.descendantscron.serviceio.pressure  
memory.oom.group  pids.max  proc-sys-net.mount  
   sysstat.service   systemd-journald.socket
   systemd-user-sessions.service
 cgroup.procs  data.mount  keyboard-setup.service   
memory.pressure   pool.mount   
'proc-sysrq\x2dtrigger.mount'  'system-container\x2dgetty.slice'  
systemd-logind.serviceufw.service
 cgroup.stat   dbus.servicememory.current   
memory.stat   postfix.service   proc-uptime.mount   
   system-modprobe.slice systemd-networkd.service   
   uuidd.socket
 cgroup.subtree_controldbus.socket memory.events
memory.swap.current   proc-cpuinfo.mountrsyslog.service 
   system-postfix.slice  systemd-remount-fs.service

root@novarupta:~# ls /sys/fs/cgroup/system.slice/system-postgresql.slice/
cgroup.controllers  cgroup.max.depthcgroup.stat cgroup.type 
  io.pressure memory.events.local  memory.maxmemory.oom.group  
memory.swap.current  memory.swap.max  pids.max
cgroup.events   cgroup.max.descendants  cgroup.subtree_control  
cpu.pressure  memory.current  memory.high  memory.min
memory.pressure   memory.swap.events   pids.current 
postgresql@13-main.service
cgroup.freeze   cgroup.procscgroup.threads  cpu.stat
  memory.events   memory.low   memory.numa_stat  memory.stat   
memory.swap.high pids.events

root@novarupta:~# ls 
/sys/fs/cgroup/system.slice/system-postgresql.slice/postgresql@13-main.service/
cgroup.controllers  cgroup.max.depthcgroup.stat cgroup.type 
  io.pressure memory.events.local  memory.maxmemory.oom.group  
memory.swap.current  memory.swap.max  pids.max
cgroup.events   cgroup.max.descendants  

Re: Properly handle OOM death?

2023-03-13 Thread Joe Conway

On 3/13/23 13:55, Israel Brewster wrote:
> 1) They reference a “Memory cgroup out of memory”, which refers back
> to the opening comment on Joe Conway’s message - this would imply to
> me that I *AM* running with a cgroup memory.limit set. Not sure how
> that changes things?

cgroup memory limit is enforced regardless of the actual host level
memory pressure. As an example, if your host VM has 128 GB of memory,
but your cgroup memory limit is 512MB, you will get an OOM kill when the
sum memory usage of all of your postgres processes (and anything else
sharing the same cgroup) exceeds 512 MB, even if the host VM has nothing
else going on consuming memory.

You can check if a memory limit is set by reading the corresponding
virtual file, e.g.:

8<---
# cat /sys/fs/cgroup/memory/system.slice/postgresql.service/memory.limit_in_bytes
9223372036854710272
8<---

A few notes:
1/ The specific path to memory.limit_in_bytes might vary, but this
example is the default for the RHEL 8 postgresql 10 RPM.

2/ The value above, 9223372036854710272, basically means "no limit" has
been set.

3/ The example assumes cgroup v1. There are very few distros that
enable cgroup v2 by default, and generally I have not seen much cgroup
v2 usage in the wild (although I strongly recommend it), but if you are
using cgroup v2 the names have changed. You can check by doing:

8<--cgroupv2 enabled-
# stat -fc %T /sys/fs/cgroup/
cgroup2fs
8<--cgroupv1 enabled-
# stat -fc %T /sys/fs/cgroup/
tmpfs
8<---

> 2) All the entries contain the line “oom_score_adj:0”, which would
> seem to imply that the postmaster, with its -900 score, is not being
> directly targeted by the OOM killer.

Sounds correct

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: Properly handle OOM death?

2023-03-13 Thread Israel Brewster
> On Mar 13, 2023, at 9:43 AM, Peter J. Holzer  wrote:
> 
> On 2023-03-13 09:21:18 -0800, Israel Brewster wrote:
>> I’m running a postgresql 13 database on an Ubuntu 20.04 VM that is a bit more
>> memory constrained than I would like, such that every week or so the various
>> processes running on the machine will align badly and the OOM killer will 
>> kick
>> in, killing off postgresql, as per the following journalctl output:
>> 
>> Mar 12 04:04:23 novarupta systemd[1]: postgresql@13-main.service: A process 
>> of
>> this unit has been killed by the OOM killer.
>> Mar 12 04:04:32 novarupta systemd[1]: postgresql@13-main.service: Failed with
>> result 'oom-kill'.
>> Mar 12 04:04:32 novarupta systemd[1]: postgresql@13-main.service: Consumed 5d
>> 17h 48min 24.509s CPU time.
>> 
>> And the service is no longer running.
> 
> I might be misreading this, but it looks to me that systemd detects that
> *some* process in the group was killed by the oom killer and stops the
> service.
> 
> Can you check which process was actually killed? If it's not the
> postmaster, setting OOMScoreAdjust is probably useless.
> 
> (I tried searching the web for the error messages and didn't find
> anything useful)

Your guess is as good as (if not better than) mine. I can find the PID of the 
killed process in the system log, but without knowing what the PID of 
postmaster and the child processes were prior to the kill, I’m not sure that 
helps much. Though for what it’s worth, I do note the following about all the 
kill logs:

1) They reference a “Memory cgroup out of memory”, which refers back to the 
opening comment on Joe Conway’s message - this would imply to me that I *AM* 
running with a cgroup memory.limit set. Not sure how that changes things?
2) All the entries contain the line "oom_score_adj:0”, which would seem to 
imply that the postmaster, with its -900 score is not being directly targeted 
by the OOM killer.

> 
>> 2) My first thought was to simply have systemd restart postgresql whenever it
>> is killed like this, which is easy enough. Then I looked at the default unit
>> file, and found these lines:
>> 
>> # prevent OOM killer from choosing the postmaster (individual backends will
>> # reset the score to 0)
>> OOMScoreAdjust=-900
>> # restarting automatically will prevent "pg_ctlcluster ... stop" from 
>> working,
>> # so we disable it here.
> 
> I never call pg_ctlcluster directly, so that probably wouldn't be a good
> reason for me.

Valid point, unless something under-the-hood needs to call it?

---
Israel Brewster
Software Engineer
Alaska Volcano Observatory 
Geophysical Institute - UAF 
2156 Koyukuk Drive 
Fairbanks AK 99775-7320
Work: 907-474-5172
cell:  907-328-9145

> 
>> Also, the postmaster will restart by itself on most
>> # problems anyway, so it is questionable if one wants to enable external
>> # automatic restarts.
>> #Restart=on-failure
> 
> So I'd try this despite the comment.
> 
>hp
> 
> -- 
>   _  | Peter J. Holzer| Story must make more sense than reality.
> |_|_) ||
> | |   | h...@hjp.at |-- Charles Stross, "Creative writing
> __/   | http://www.hjp.at/ |   challenge!"





Re: Properly handle OOM death?

2023-03-13 Thread Peter J. Holzer
On 2023-03-13 09:21:18 -0800, Israel Brewster wrote:
> I’m running a postgresql 13 database on an Ubuntu 20.04 VM that is a bit more
> memory constrained than I would like, such that every week or so the various
> processes running on the machine will align badly and the OOM killer will kick
> in, killing off postgresql, as per the following journalctl output:
> 
> Mar 12 04:04:23 novarupta systemd[1]: postgresql@13-main.service: A process of
> this unit has been killed by the OOM killer.
> Mar 12 04:04:32 novarupta systemd[1]: postgresql@13-main.service: Failed with
> result 'oom-kill'.
> Mar 12 04:04:32 novarupta systemd[1]: postgresql@13-main.service: Consumed 5d
> 17h 48min 24.509s CPU time.
> 
> And the service is no longer running.

I might be misreading this, but it looks to me that systemd detects that
*some* process in the group was killed by the oom killer and stops the
service.

Can you check which process was actually killed? If it's not the
postmaster, setting OOMScoreAdjust is probably useless.

(I tried searching the web for the error messages and didn't find
anything useful)


> 2) My first thought was to simply have systemd restart postgresql whenever it
> is killed like this, which is easy enough. Then I looked at the default unit
> file, and found these lines:
> 
> # prevent OOM killer from choosing the postmaster (individual backends will
> # reset the score to 0)
> OOMScoreAdjust=-900
> # restarting automatically will prevent "pg_ctlcluster ... stop" from working,
> # so we disable it here.

I never call pg_ctlcluster directly, so that probably wouldn't be a good
reason for me.

> Also, the postmaster will restart by itself on most
> # problems anyway, so it is questionable if one wants to enable external
> # automatic restarts.
> #Restart=on-failure

So I'd try this despite the comment.
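
E.g. via a drop-in (an untested sketch; unit name assumed from your logs):

# systemctl edit postgresql@13-main.service
[Service]
Restart=on-failure
RestartSec=5s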

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"


signature.asc
Description: PGP signature


Re: Properly handle OOM death?

2023-03-13 Thread Israel Brewster
> On Mar 13, 2023, at 9:36 AM, Joe Conway  wrote:
> 
> On 3/13/23 13:21, Israel Brewster wrote:
>> I’m running a postgresql 13 database on an Ubuntu 20.04 VM that is a bit 
>> more memory constrained than I would like, such that every week or so the 
>> various processes running on the machine will align badly and the OOM killer 
>> will kick in, killing off postgresql, as per the following journalctl output:
>> Mar 12 04:04:23 novarupta systemd[1]: postgresql@13-main.service: A process 
>> of this unit has been killed by the OOM killer.
>> Mar 12 04:04:32 novarupta systemd[1]: postgresql@13-main.service: Failed 
>> with result 'oom-kill'.
>> Mar 12 04:04:32 novarupta systemd[1]: postgresql@13-main.service: Consumed 
>> 5d 17h 48min 24.509s CPU time.
>> And the service is no longer running.
>> When this happens, I go in and restart the postgresql service, and 
>> everything is happy again for the next week or two.
>> Obviously this is not a good situation. Which leads to two questions:
>> 1) is there some tweaking I can do in the postgresql config itself to 
>> prevent the situation from occurring in the first place?
>> 2) My first thought was to simply have systemd restart postgresql whenever 
>> it is killed like this, which is easy enough. Then I looked at the default 
>> unit file, and found these lines:
>> # prevent OOM killer from choosing the postmaster (individual backends will
>> # reset the score to 0)
>> OOMScoreAdjust=-900
>> # restarting automatically will prevent "pg_ctlcluster ... stop" from 
>> working,
>> # so we disable it here. Also, the postmaster will restart by itself on most
>> # problems anyway, so it is questionable if one wants to enable external
>> # automatic restarts.
>> #Restart=on-failure
>> Which seems to imply that the OOM killer should only be killing off 
>> individual backends, not the entire cluster to begin with - which should be 
>> fine. And also that adding the restart=on-failure option is probably not the 
>> greatest idea. Which makes me wonder what is really going on?
> 
> First, are you running with a cgroup memory.limit set (e.g. in a container)?

Not sure, actually. I *think* I had set it up as a full VM though, not a
container. I’ll have to double-check that.

> Assuming no, see:
> 
> https://www.postgresql.org/docs/current/kernel-resources.html#LINUX-MEMORY-OVERCOMMIT
> 
> That will tell you:
> 1/ Turn off memory overcommit: "Although this setting will not prevent the 
> OOM killer from being invoked altogether, it will lower the chances 
> significantly and will therefore lead to more robust system behavior."
> 
> 2/ set /proc/self/oom_score_adj to -1000 rather than -900 
> (OOMScoreAdjust=-1000): the value -1000 is important as it is a "magic" value 
> which prevents the process from being selected by the OOM killer (see: 
> https://elixir.bootlin.com/linux/latest/source/include/uapi/linux/oom.h#L6) 
> whereas -900 just makes it less likely.

...and that answers the question I just sent about the above linked page.
Thanks!

> 
> All that said, even if the individual backend gets killed, the postmaster 
> will still go into crash recovery. So while technically postgres does not 
> restart, the effect is much the same. So see #1 above as your best protection.

Interesting. Makes sense though. Thanks!


---
Israel Brewster
Software Engineer
Alaska Volcano Observatory 
Geophysical Institute - UAF 
2156 Koyukuk Drive 
Fairbanks AK 99775-7320
Work: 907-474-5172
cell:  907-328-9145

> 
> HTH,
> 
> Joe
> 
> -- 
> Joe Conway
> PostgreSQL Contributors Team
> RDS Open Source Databases
> Amazon Web Services: https://aws.amazon.com 


Re: Properly handle OOM death?

2023-03-13 Thread Israel Brewster
> On Mar 13, 2023, at 9:28 AM, Adrian Klaver  wrote:
> 
> On 3/13/23 10:21 AM, Israel Brewster wrote:
>> I’m running a postgresql 13 database on an Ubuntu 20.04 VM that is a bit 
>> more memory constrained than I would like, such that every week or so the 
>> various processes running on the machine will align badly and the OOM killer 
>> will kick in, killing off postgresql, as per the following journalctl output:
>> Mar 12 04:04:23 novarupta systemd[1]: postgresql@13-main.service: A process 
>> of this unit has been killed by the OOM killer.
>> Mar 12 04:04:32 novarupta systemd[1]: postgresql@13-main.service: Failed 
>> with result 'oom-kill'.
>> Mar 12 04:04:32 novarupta systemd[1]: postgresql@13-main.service: Consumed 
>> 5d 17h 48min 24.509s CPU time.
>> And the service is no longer running.
>> When this happens, I go in and restart the postgresql service, and 
>> everything is happy again for the next week or two.
>> Obviously this is not a good situation. Which leads to two questions:
>> 1) is there some tweaking I can do in the postgresql config itself to 
>> prevent the situation from occurring in the first place?
>> 2) My first thought was to simply have systemd restart postgresql whenever 
>> it is killed like this, which is easy enough. Then I looked at the default 
>> unit file, and found these lines:
>> # prevent OOM killer from choosing the postmaster (individual backends will
>> # reset the score to 0)
>> OOMScoreAdjust=-900
>> # restarting automatically will prevent "pg_ctlcluster ... stop" from 
>> working,
>> # so we disable it here. Also, the postmaster will restart by itself on most
>> # problems anyway, so it is questionable if one wants to enable external
>> # automatic restarts.
>> #Restart=on-failure
>> Which seems to imply that the OOM killer should only be killing off 
>> individual backends, not the entire cluster to begin with - which should be 
>> fine. And also that adding the restart=on-failure option is probably not the 
>> greatest idea. Which makes me wonder what is really going on?
> 
> You might want to read:
> 
> https://www.postgresql.org/docs/current/kernel-resources.html#LINUX-MEMORY-OVERCOMMIT

Good information, thanks. One thing there confuses me though. It says:

Another approach, which can be used with or without altering 
vm.overcommit_memory, is to set the process-specific OOM score adjustment value 
for the postmaster process to -1000, thereby guaranteeing it will not be 
targeted by the OOM killer

Isn’t that exactly what the "OOMScoreAdjust=-900” line in the Unit file does 
though (except with a score of -900 rather than -1000)?

---
Israel Brewster
Software Engineer
Alaska Volcano Observatory 
Geophysical Institute - UAF 
2156 Koyukuk Drive 
Fairbanks AK 99775-7320
Work: 907-474-5172
cell:  907-328-9145
> 
>> Thanks.
>> ---
>> Israel Brewster
>> Software Engineer
>> Alaska Volcano Observatory
>> Geophysical Institute - UAF
>> 2156 Koyukuk Drive
>> Fairbanks AK 99775-7320
>> Work: 907-474-5172
>> cell:  907-328-9145
> 
> 
> -- 
> Adrian Klaver
> adrian.kla...@aklaver.com 


Re: Properly handle OOM death?

2023-03-13 Thread Joe Conway

On 3/13/23 13:21, Israel Brewster wrote:
> I’m running a postgresql 13 database on an Ubuntu 20.04 VM that is a bit
> more memory constrained than I would like, such that every week or so
> the various processes running on the machine will align badly and the
> OOM killer will kick in, killing off postgresql, as per the following
> journalctl output:
>
> Mar 12 04:04:23 novarupta systemd[1]: postgresql@13-main.service: A
> process of this unit has been killed by the OOM killer.
> Mar 12 04:04:32 novarupta systemd[1]: postgresql@13-main.service: Failed
> with result 'oom-kill'.
> Mar 12 04:04:32 novarupta systemd[1]: postgresql@13-main.service:
> Consumed 5d 17h 48min 24.509s CPU time.
>
> And the service is no longer running.
>
> When this happens, I go in and restart the postgresql service, and
> everything is happy again for the next week or two.
>
> Obviously this is not a good situation. Which leads to two questions:
>
> 1) is there some tweaking I can do in the postgresql config itself to
> prevent the situation from occurring in the first place?
> 2) My first thought was to simply have systemd restart postgresql
> whenever it is killed like this, which is easy enough. Then I looked at
> the default unit file, and found these lines:
>
> # prevent OOM killer from choosing the postmaster (individual backends will
> # reset the score to 0)
> OOMScoreAdjust=-900
> # restarting automatically will prevent "pg_ctlcluster ... stop" from working,
> # so we disable it here. Also, the postmaster will restart by itself on most
> # problems anyway, so it is questionable if one wants to enable external
> # automatic restarts.
> #Restart=on-failure
>
> Which seems to imply that the OOM killer should only be killing off
> individual backends, not the entire cluster to begin with - which should
> be fine. And also that adding the restart=on-failure option is probably
> not the greatest idea. Which makes me wonder what is really going on?


First, are you running with a cgroup memory.limit set (e.g. in a container)?

Assuming no, see:

https://www.postgresql.org/docs/current/kernel-resources.html#LINUX-MEMORY-OVERCOMMIT

That will tell you:
1/ Turn off memory overcommit: "Although this setting will not prevent 
the OOM killer from being invoked altogether, it will lower the chances 
significantly and will therefore lead to more robust system behavior."
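
(Concretely, that would be something like this; a sketch, and remember to
make it persistent via /etc/sysctl.d:

sysctl -w vm.overcommit_memory=2
echo 'vm.overcommit_memory = 2' > /etc/sysctl.d/90-overcommit.conf
)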


2/ set /proc/self/oom_score_adj to -1000 rather than -900 
(OOMScoreAdjust=-1000): the value -1000 is important as it is a "magic" 
value which prevents the process from being selected by the OOM killer 
(see: 
https://elixir.bootlin.com/linux/latest/source/include/uapi/linux/oom.h#L6) 
whereas -900 just makes it less likely.


All that said, even if the individual backend gets killed, the 
postmaster will still go into crash recovery. So while technically 
postgres does not restart, the effect is much the same. So see #1 above 
as your best protection.


HTH,

Joe

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: Properly handle OOM death?

2023-03-13 Thread Adrian Klaver

On 3/13/23 10:21 AM, Israel Brewster wrote:
> I’m running a postgresql 13 database on an Ubuntu 20.04 VM that is a bit
> more memory constrained than I would like, such that every week or so
> the various processes running on the machine will align badly and the
> OOM killer will kick in, killing off postgresql, as per the following
> journalctl output:
>
> Mar 12 04:04:23 novarupta systemd[1]: postgresql@13-main.service: A
> process of this unit has been killed by the OOM killer.
> Mar 12 04:04:32 novarupta systemd[1]: postgresql@13-main.service: Failed
> with result 'oom-kill'.
> Mar 12 04:04:32 novarupta systemd[1]: postgresql@13-main.service:
> Consumed 5d 17h 48min 24.509s CPU time.
>
> And the service is no longer running.
>
> When this happens, I go in and restart the postgresql service, and
> everything is happy again for the next week or two.
>
> Obviously this is not a good situation. Which leads to two questions:
>
> 1) is there some tweaking I can do in the postgresql config itself to
> prevent the situation from occurring in the first place?
> 2) My first thought was to simply have systemd restart postgresql
> whenever it is killed like this, which is easy enough. Then I looked at
> the default unit file, and found these lines:
>
> # prevent OOM killer from choosing the postmaster (individual backends will
> # reset the score to 0)
> OOMScoreAdjust=-900
> # restarting automatically will prevent "pg_ctlcluster ... stop" from working,
> # so we disable it here. Also, the postmaster will restart by itself on most
> # problems anyway, so it is questionable if one wants to enable external
> # automatic restarts.
> #Restart=on-failure
>
> Which seems to imply that the OOM killer should only be killing off
> individual backends, not the entire cluster to begin with - which should
> be fine. And also that adding the restart=on-failure option is probably
> not the greatest idea. Which makes me wonder what is really going on?

You might want to read:

https://www.postgresql.org/docs/current/kernel-resources.html#LINUX-MEMORY-OVERCOMMIT

> Thanks.
>
> ---
> Israel Brewster
> Software Engineer
> Alaska Volcano Observatory
> Geophysical Institute - UAF
> 2156 Koyukuk Drive
> Fairbanks AK 99775-7320
> Work: 907-474-5172
> cell:  907-328-9145




--
Adrian Klaver
adrian.kla...@aklaver.com




Properly handle OOM death?

2023-03-13 Thread Israel Brewster
I’m running a postgresql 13 database on an Ubuntu 20.04 VM that is a bit more 
memory constrained than I would like, such that every week or so the various 
processes running on the machine will align badly and the OOM killer will kick 
in, killing off postgresql, as per the following journalctl output:

Mar 12 04:04:23 novarupta systemd[1]: postgresql@13-main.service: A process of 
this unit has been killed by the OOM killer.
Mar 12 04:04:32 novarupta systemd[1]: postgresql@13-main.service: Failed with 
result 'oom-kill'.
Mar 12 04:04:32 novarupta systemd[1]: postgresql@13-main.service: Consumed 5d 
17h 48min 24.509s CPU time.

And the service is no longer running.

When this happens, I go in and restart the postgresql service, and everything 
is happy again for the next week or two.

Obviously this is not a good situation. Which leads to two questions:

1) is there some tweaking I can do in the postgresql config itself to prevent 
the situation from occurring in the first place?
2) My first thought was to simply have systemd restart postgresql whenever it 
is killed like this, which is easy enough. Then I looked at the default unit 
file, and found these lines:

# prevent OOM killer from choosing the postmaster (individual backends will
# reset the score to 0)
OOMScoreAdjust=-900
# restarting automatically will prevent "pg_ctlcluster ... stop" from working,
# so we disable it here. Also, the postmaster will restart by itself on most
# problems anyway, so it is questionable if one wants to enable external
# automatic restarts.
#Restart=on-failure

Which seems to imply that the OOM killer should only be killing off individual 
backends, not the entire cluster to begin with - which should be fine. And also 
that adding the restart=on-failure option is probably not the greatest idea. 
Which makes me wonder what is really going on?

Thanks.

---
Israel Brewster
Software Engineer
Alaska Volcano Observatory 
Geophysical Institute - UAF 
2156 Koyukuk Drive 
Fairbanks AK 99775-7320
Work: 907-474-5172
cell:  907-328-9145



Re: Binary large object processing problems

2023-03-13 Thread Tom Lane
Raivo Rebane  writes:
> Thanks for the answer, but I had also read this manual and found there:
> 35.3.2. Importing a Large Object

You need to read the start of that section, which says

All large object manipulation using these functions
must take place within an SQL transaction block,

As Christoph said, you're responsible for supplying a BEGIN and COMMIT
around the call.
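
Roughly like this (a sketch, error handling elided):

    PGresult *res = PQexec(conn, "BEGIN");
    PQclear(res);
    Oid loid = lo_import(conn, filename);
    /* ... any lo_open/lo_read/lo_write also go inside the transaction ... */
    res = PQexec(conn, "COMMIT");
    PQclear(res);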

regards, tom lane




Re: Binary large object processing problems

2023-03-13 Thread Raivo Rebane
Thanks for the answer, but I had also read this manual and found there:
35.3.2. Importing a Large Object

To import an operating system file as a large object, call

Oid lo_import(PGconn *conn, const char *filename);

*filename* specifies the operating system name of the file to be imported
as a large object. The return value is the OID that was assigned to the new
large object, or InvalidOid (zero) on failure. Note that the file is read
by the client interface library, not by the server; so it must exist in the
client file system and be readable by the client application.
This file that I want to import is readable; I just tested it. So I
suspect that lo_import isn't working properly. It answers with: invalid
large-object descriptor: 0. Why?

I will now try a create, open, write, and close sequence, or are there
alternatives?

Regards
Raivo



On Mon, Mar 13, 2023 at 5:26 PM Christoph Moench-Tegeder 
wrote:

> ## Raivo Rebane (raivor...@gmail.com):
>
> > Can anybody help me find where is my mistake an what is working solution
> ?
>
> The documentation clearly states "All large object manipulation using
> these functions must take place within an SQL transaction block"
> https://www.postgresql.org/docs/current/lo-interfaces.html
> but I couldn't see anything of that kind in your code.
>
> Regards,
> Christoph
>
> --
> Spare Space.
>


Re: Binary large object processing problems

2023-03-13 Thread Christoph Moench-Tegeder
## Raivo Rebane (raivor...@gmail.com):

> Can anybody help me find where my mistake is and what a working solution would be?

The documentation clearly states "All large object manipulation using
these functions must take place within an SQL transaction block"
https://www.postgresql.org/docs/current/lo-interfaces.html
but I couldn't see anything of that kind in your code.

Regards,
Christoph

-- 
Spare Space.




Binary large object processing problems

2023-03-13 Thread Raivo Rebane
Hi

I am a novice Postgres C API user and I want to store a Binary Large Object
into the database from a C language program.

I am using the following source:

Oid nloid = lo_import(conn, FileFUllName.c_str());

if (nloid == 0) {
    std::cerr << "Error importing large object: " << PQerrorMessage(conn) << std::endl;
    return -1;
} else {
    std::cout << "Large object imported successfully with fd " << nloid << std::endl;
}

And I steadily get 0 from lo_import.
I tried to create it previously and send the oid to lo_import_with_oid, but
got an error.

Can anybody help me find where my mistake is and what a working solution
would be?

Regards,
Raivo


Re: Seq Scan because of stats or because of cast?

2023-03-13 Thread Tom Lane
Dominique Devienne  writes:
> I figured that query would use the "pg_auth_members_role_member_index"
> index,
> but instead it's using a sequential scan.
> And I'm wondering is this is because the cardinality of that catalog is
> small (172),
> which is just an artifact of my dev-testing, or whether that's because I
> cast roleid
> to an int4, preventing the use of the index?

Both.

> Is there a way to know why the index is not used, in any of my attempts?

For testing purposes, you could set enable_seqscan = off and then see
whether the plan changes.  When I try this example I get

regression=# explain select * from pg_auth_members WHERE roleid = ANY(array[1::oid,2::oid,3::oid]);
                           QUERY PLAN
-----------------------------------------------------------------
 Seq Scan on pg_auth_members  (cost=0.00..1.04 rows=3 width=19)
   Filter: (roleid = ANY ('{1,2,3}'::oid[]))
(2 rows)

regression=# set enable_seqscan to 0;
SET
regression=# explain select * from pg_auth_members WHERE roleid = ANY(array[1::oid,2::oid,3::oid]);
                                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------
 Index Scan using pg_auth_members_role_member_index on pg_auth_members  (cost=0.13..12.44 rows=3 width=19)
   Index Cond: (roleid = ANY ('{1,2,3}'::oid[]))
(2 rows)

So it's clearly not going to use the index until pg_auth_members gets
a great deal larger than it is on my test installation --- but it's
capable of doing so once it looks cost-attractive.

> PS: libpq uses unsigned int for Oid. So do OIDs go above 2GB in practice?

Yes, eventually.

> I'm asking, since I'm casting to ::int4, thus if they do, then that
> case might overflow.

I'd use int8.
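
E.g. (a sketch of the same query shape):

    SELECT roleid::int8, member::int8
      FROM pg_auth_members
     WHERE roleid = ANY('{1,2,3}'::oid[]);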

> PPS: Are OIDs recycled / reused? Or are they monotonically increasing?
>   What happens when the Cluster runs out of OIDs?

The counter wraps around.  But it does so at 2^32 not 2^31.

>   Are they Cluster-wide unique or it depends on the OID type?

They're unique per catalog.  We don't attempt to guarantee
more than that.  In practice, they'll be unique across the
installation until after the first OID wraparound, and then
not so much.

regards, tom lane




Seq Scan because of stats or because of cast?

2023-03-13 Thread Dominique Devienne
I'm interested in the members of specific roles, providing the roles of
interest to the query via an array of integers (binary bind in code, not
textual array literal like I had to use to have the EXPLAIN work, see
below).

I figured that query would use the "pg_auth_members_role_member_index"
index,
but instead it's using a sequential scan.

And I'm wondering if this is because the cardinality of that catalog is
small (172), which is just an artifact of my dev-testing, or whether that's
because I cast roleid to an int4, preventing the use of the index?

In production, the cardinality will be much greater, which is why I worry a
bit.
Also, I don't really need the grantor and admin_option columns for now, thus
it could even be an index-only scan, IF the index was used by the plan.

I tried changing the cast around, or allowing an index-only scan,
but it's still a Seq Scan on the table (see below).

Is there a way to know why the index is not used, in any of my attempts?

My code currently does not support (binary) binding of Oids, thus the
::int4 casts.
Would supporting binding actual Oid arrays instead of Int4 arrays help in
this case?

I'd appreciate some insights here. Thanks, --DD

PS: libpq uses unsigned int for Oid. So do OIDs go above 2GB in practice?
I'm asking, since I'm casting to ::int4, thus if they do, then that
case might overflow.

PPS: Are OIDs recycled / reused? Or are they monotonically increasing?
  What happens when the Cluster runs out of OIDs?
  Are they Cluster-wide unique or it depends on the OID type?

dd_pns2=> explain SELECT roleid::int4, member::int4, grantor::int4, admin_option
dd_pns2->   FROM pg_auth_members
dd_pns2->  WHERE roleid::int4 = ANY($1);
ERROR:  there is no parameter $1
LINE 3:  WHERE roleid::int4 = ANY($1);
  ^
dd_pns2=> explain SELECT roleid::int4, member::int4, grantor::int4, admin_option
dd_pns2->   FROM pg_auth_members
dd_pns2->  WHERE roleid::int4 = ANY(array[1,2,3]);
                           QUERY PLAN
-----------------------------------------------------------------
 Seq Scan on pg_auth_members  (cost=0.00..5.33 rows=3 width=13)
   Filter: ((roleid)::integer = ANY ('{1,2,3}'::integer[]))
(2 rows)

dd_pns2=> explain SELECT roleid::int4, member::int4, grantor::int4, admin_option
dd_pns2->   FROM pg_auth_members
dd_pns2->  WHERE roleid = ANY(array[1::oid,2::oid,3::oid]);
                           QUERY PLAN
-----------------------------------------------------------------
 Seq Scan on pg_auth_members  (cost=0.00..5.33 rows=3 width=13)
   Filter: (roleid = ANY ('{1,2,3}'::oid[]))
(2 rows)

dd_pns2=> explain SELECT roleid::int4, member::int4
dd_pns2->   FROM pg_auth_members
dd_pns2->  WHERE roleid = ANY(array[1::oid,2::oid,3::oid]);
                          QUERY PLAN
----------------------------------------------------------------
 Seq Scan on pg_auth_members  (cost=0.00..5.33 rows=3 width=8)
   Filter: (roleid = ANY ('{1,2,3}'::oid[]))
(2 rows)

dd_pns2=> \d pg_auth_members
              Table "pg_catalog.pg_auth_members"
    Column    |  Type   | Collation | Nullable | Default
--------------+---------+-----------+----------+---------
 roleid       | oid     |           | not null |
 member       | oid     |           | not null |
 grantor      | oid     |           | not null |
 admin_option | boolean |           | not null |
Indexes:
    "pg_auth_members_member_role_index" UNIQUE, btree (member, roleid), tablespace "pg_global"
    "pg_auth_members_role_member_index" UNIQUE, btree (roleid, member), tablespace "pg_global"
Tablespace: "pg_global"


dd_pns2=> select count(*) from pg_auth_members;
 count
-------
   172
(1 row)