Re: Question on Aurora postgres

2023-10-31 Thread veem v
Thank you so much. Basically this application is an enterprise OLTP app with
24/7 users (doing both DML and queries), and it currently runs on an
on-premises Oracle database. It has data in the range of ~20-30 TB, and
queries run against it around the clock. And of course the lower
environments (Dev/QA/UAT) don't have anything like prod's usage or data
volume.

In the document below, I see a lot of instance classes: the memory-optimized
X family, the R family, and the burstable T family. So I'm a bit confused
about how to get started, and whether we can move seamlessly from one class
to another in the future if required.

Will the CPU, memory, and storage listed for each of these instance classes
need to be directly proportional to the CPU, memory, and storage footprint
of our current on-premises Oracle database?

https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/Concepts.DBInstanceClass.html
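
(For what it's worth, moving between instance classes later is an in-place
modification rather than a migration; a sketch with the standard AWS CLI,
assuming a hypothetical instance named `mydb-instance-1`:)

```shell
# Change an Aurora instance's class (e.g. from db.t3.medium to
# db.r6g.xlarge); this triggers a brief restart/failover of that instance.
aws rds modify-db-instance \
    --db-instance-identifier mydb-instance-1 \
    --db-instance-class db.r6g.xlarge \
    --apply-immediately
```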

On Wed, 1 Nov 2023 at 04:55, Ben Chobot  wrote:

> veem v wrote on 10/31/23 2:49 PM:
>
> Hello all,
> We are planning to use Aurora Postgres for a few applications, but wanted
> some guidance on which instance class type we should use for the lower
> environments/non-prod (Dev/QA/UAT/perf) and what we should use for the
> production database. Is there some recommendation based on usage etc. for
> this?
>
>
> Regardless of whether you are going to be making databases in a cloud provider
> or on physical hardware, the first step is to figure out what your load
> will likely be and what variables will be influencing it. Will you have a
> lot of data? Will it churn very much? Will you have a lot of concurrent
> reads? Will they be simple queries or memory intensive ones?
>
> Only you can answer these questions, and once you have, the many (many)
> options available to you will start to separate out and there will be some
> obvious things to try. One of the great things about making databases in
> the cloud is that you aren't locked into any sizing choice like you can be
> if you go build a physical server.
>


Re: Inefficient query plan for SELECT ... EXCEPT ...

2023-10-31 Thread Tom Lane
David Rowley  writes:
> It would be possible to have some sort of MergeExcept operator and
> have the planner consider that. Unfortunately, since the upper planner
> was changed a few years ago to have it consider paths the same as the
> join planner does, nobody has yet come back to the union planner to
> properly pathify that.  I do have a WIP patch to do this work, but I
> wasn't planning on improving EXCEPT, only UNION. Making it work for
> EXCEPT and INTERSECT would require a new executor operator.

Yeah.  We're moderately good about UNION ALL, but UNION/INTERSECT/EXCEPT
are an area that nobody has ever gotten around to optimizing: the two
sub-queries will be planned independently and then fed to a simplistic
set-operation node.  Maybe that'll get better someday but don't hold
your breath.  In the meantime, try to recast an EXCEPT query as an
antijoin.
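
For illustration, the EXCEPT query from the original report could be recast
as an anti-join along these lines (a sketch using the reporter's table names):

```sql
-- EXCEPT form:
--   SELECT DISTINCT d1 FROM tmp_table WHERE d1 IS NOT NULL
--   EXCEPT
--   SELECT test_datatag FROM test_datatags;
-- Anti-join form, which the planner can drive from the small left table:
SELECT DISTINCT d1
FROM tmp_table t
WHERE d1 IS NOT NULL
  AND NOT EXISTS (
        SELECT 1 FROM test_datatags d
        WHERE d.test_datatag = t.d1
  );
```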

regards, tom lane




Feature/Suggestion: libpq/psql support for libsecret (formerly gnome-keyring)

2023-10-31 Thread Alan Evans
I was working with a user and was looking for a convenient and secure
method of password storage for `psql`.  I was reminded of the various
options, `-p password`, PGPASSWORD, .pgpass and so on.  I was thinking it
would be cool if psql/libpq could interact with the system keyring on Linux
systems, libsecret (formerly gnome-keyring, seahorse, etc.).

It seems like a match made in heaven actually.  A `.pgpass` entry is
`hostname:port:database:username:password`, and libsecret allows clients to
store arbitrary attributes alongside passwords; hostname, port, database,
and username, for example...

It seems like libpq could possibly just use libsecret as an alternate
backend to the .pgpass file.

I'm envisioning something like an environment variable PGPASSBACKEND where
the values are something like:
auto - (default) try libsecret if available, fall back to current behavior
if libsecret is unavailable or a suitable secret isn't found
auto-save - like auto, but additionally save passwords to libsecret when
they are found through other means
libsecret - only use libsecret; fail if unavailable, fail if password not
found
pgpass - current behavior.
(Discussion needed, just brainstorming...)

Clearly there'd have to be some thought around interaction with things like
the -p, -w and -W switches, other environment variables, and default
behaviors for things like saving secrets in libsecret or not, but I thought
I'd start the discussion.
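
Purely as a sketch of the proposal (PGPASSBACKEND does not exist in libpq
today), seeding the keyring with libsecret's existing secret-tool utility
might look like:

```shell
# Hypothetical workflow: store a password in the keyring with
# pgpass-like attributes, using libsecret's real secret-tool CLI.
secret-tool store --label="PostgreSQL db.example.com" \
    host db.example.com port 5432 dbname mydb user alice
# Then, under the proposal, psql would look it up automatically:
PGPASSBACKEND=libsecret psql -h db.example.com -p 5432 -U alice mydb
```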

Regards,
-Alan


Re: Inefficient query plan for SELECT ... EXCEPT ...

2023-10-31 Thread David Rowley
On Wed, 1 Nov 2023 at 11:41, Dimitrios Apostolou  wrote:
> I'm wondering why the planner doesn't see that the left table is very small 
> and follow a different path.
> From an abstract computer science POV, I would
>
> 1. sort the left table  (the right one is already indexed)
> 2. "merge" the two tables, by walking them in-order in parallel and excluding 
> the matches
> 3. stop when the left table is exhausted, which would happen very early.

It would be possible to have some sort of MergeExcept operator and
have the planner consider that. Unfortunately, since the upper planner
was changed a few years ago to have it consider paths the same as the
join planner does, nobody has yet come back to the union planner to
properly pathify that.  I do have a WIP patch to do this work, but I
wasn't planning on improving EXCEPT, only UNION. Making it work for
EXCEPT and INTERSECT would require a new executor operator.

> Is this worth a bug report? I can file one if the issue is not known.

No. It's just a missing optimisation. We know about it.

> In the meantime I have replaced the query with a LEFT OUTER JOIN which
> performs much better, and I believe is equivalent. I find it less readable
> than the query in question though. Plus, I have a bunch of SELECT-EXCEPT
> queries (with smaller right-side tables) in my application, and I would
> hate to change them all to the uglier equivalent. Under what conditions
> would the above query plan perform well?

It'll be best if you just use NOT EXISTS, or form the LEFT JOINs so that
they can make use of an Anti-Join.  If you don't want to
rewrite your queries then you'll just be at the mercy of the current
planner's ability to plan EXCEPT queries, unfortunately.  There won't
be any bug fixes to improve this. It may, however, be improved in some
future version of PostgreSQL.

David




Re: Inefficient query plan for SELECT ... EXCEPT ...

2023-10-31 Thread David G. Johnston
On Tue, Oct 31, 2023 at 3:41 PM Dimitrios Apostolou  wrote:

>
> Is this worth a bug report? I can file one if the issue is not known.
> Or am I misunderstanding the implications of the SELECT-EXCEPT query?
>
> In the meantime I have replaced the query with a LEFT OUTER JOIN which
> performs much better,


I'd expect a NOT EXISTS query would also perform well and more accurately
reflect what you seem to be wanting here.

I could see maybe optimizing away the second independent query if the first
produces zero rows, but it doesn't seem like that productive an optimization
to attempt.

David J.


Inefficient query plan for SELECT ... EXCEPT ...

2023-10-31 Thread Dimitrios Apostolou

Hello list,

I'm getting an inefficient query plan for a SELECT ... EXCEPT ... query,
where the left side is a very short table (often zero-length, though rarely
it can be as long as 200K rows), and the right side is a table with 10M
UNIQUE NOT NULL rows:


\d test_datatags

                    Table "public.test_datatags"
     Column     |  Type   | Collation | Nullable |             Default
----------------+---------+-----------+----------+----------------------------------
 test_datatag_n | integer |           | not null | generated by default as identity
 test_datatag   | text    |           | not null |
Indexes:
    "test_datatags_pkey" PRIMARY KEY, btree (test_datatag_n)
    "test_datatags_test_datatag_key" UNIQUE CONSTRAINT, btree (test_datatag)


Below is a simplified, pathological case that takes 1.7 s even though
the left side is empty:



BEGIN;
CREATE TEMPORARY TABLE tmp_table(d1 TEXT) ON COMMIT DROP;
ANALYZE VERBOSE tmp_table ;

INFO:  analyzing "pg_temp_9.tmp_table"
INFO:  "tmp_table": scanned 0 of 0 pages, containing 0 live rows and 0 dead rows; 0 rows in sample, 0 estimated total rows

EXPLAIN (ANALYSE,VERBOSE,BUFFERS,SETTINGS)

SELECT DISTINCT  d1
FROM tmp_table
WHERE d1 IS NOT NULL
EXCEPT
SELECT test_datatag FROM test_datatags;

                                         QUERY PLAN
--------------------------------------------------------------------------------------------
 HashSetOp Except  (cost=0.00..299854.08 rows=1 width=36) (actual time=1726.470..1726.472 rows=0 loops=1)
   Output: "*SELECT* 1".d1, (0)
   Buffers: shared hit=1259 read=58800
   I/O Timings: shared/local read=77.713
   ->  Append  (cost=0.00..278054.53 rows=8719821 width=36) (actual time=3.754..1287.901 rows=8702840 loops=1)
         Buffers: shared hit=1259 read=58800
         I/O Timings: shared/local read=77.713
         ->  Subquery Scan on "*SELECT* 1"  (cost=0.00..0.02 rows=1 width=36) (actual time=0.003..0.003 rows=0 loops=1)
               Output: "*SELECT* 1".d1, 0
               ->  HashAggregate  (cost=0.00..0.01 rows=1 width=32) (actual time=0.002..0.003 rows=0 loops=1)
                     Output: tmp_table.d1
                     Group Key: tmp_table.d1
                     Batches: 1  Memory Usage: 24kB
                     ->  Seq Scan on pg_temp.tmp_table  (cost=0.00..0.00 rows=1 width=32) (actual time=0.001..0.001 rows=0 loops=1)
                           Output: tmp_table.d1
                           Filter: (tmp_table.d1 IS NOT NULL)
         ->  Subquery Scan on "*SELECT* 2"  (cost=0.00..234455.40 rows=8719820 width=26) (actual time=3.751..943.850 rows=8702840 loops=1)
               Output: "*SELECT* 2".test_datatag, 1
               Buffers: shared hit=1259 read=58800
               I/O Timings: shared/local read=77.713
               ->  Seq Scan on public.test_datatags  (cost=0.00..147257.20 rows=8719820 width=22) (actual time=3.747..515.420 rows=8702840 loops=1)
                     Output: test_datatags.test_datatag
                     Buffers: shared hit=1259 read=58800
                     I/O Timings: shared/local read=77.713
 Settings: effective_io_concurrency = '0', enable_partitionwise_aggregate = 'on', enable_partitionwise_join = 'on', hash_mem_multiplier = '1', maintenance_io_concurrency = '0', max_parallel_workers = '12', max_parallel_workers_per_gather = '0', temp_buffers = '64MB'
 Planning:
   Buffers: shared hit=2
 Planning Time: 0.055 ms
 JIT:
   Functions: 15
   Options: Inlining false, Optimization false, Expressions true, Deforming true
   Timing: Generation 0.317 ms, Inlining 0.000 ms, Optimization 0.179 ms, Emission 3.542 ms, Total 4.037 ms
 Execution Time: 1726.835 ms
(33 rows)


I'm wondering why the planner doesn't see that the left table is very small and 
follow a different path.
From an abstract computer science POV, I would

1. sort the left table  (the right one is already indexed)
2. "merge" the two tables, by walking them in-order in parallel and excluding 
the matches
3. stop when the left table is exhausted, which would happen very early.
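
The three steps above can be sketched as follows (a toy Python sketch of the
proposed merge-based EXCEPT, not PostgreSQL executor code; both inputs are
plain lists, with the right one assumed already sorted, as an index scan
would deliver it):

```python
def merge_except(left, right_sorted):
    """Merge-based EXCEPT: sort and de-duplicate the small left input,
    then walk both inputs in order, emitting left values with no match
    on the right. Stops as soon as the left side is exhausted."""
    result = []
    left_sorted = sorted(set(left))          # EXCEPT also de-duplicates
    li = ri = 0
    while li < len(left_sorted):             # step 3: stop when left runs out
        if ri >= len(right_sorted) or left_sorted[li] < right_sorted[ri]:
            result.append(left_sorted[li])   # no match on the right
            li += 1
        elif left_sorted[li] == right_sorted[ri]:
            li += 1                          # excluded by the right side
        else:
            ri += 1                          # advance the right pointer
    return result
```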

Is this worth a bug report? I can file one if the issue is not known.
Or am I misunderstanding the implications of the SELECT-EXCEPT query?

In the meantime I have replaced the query with a LEFT OUTER JOIN which
performs much better, and I believe is equivalent. I find it less readable
than the query in question though. Plus, I have a bunch of SELECT-EXCEPT
queries (with smaller right-side tables) in my application, and I would
hate to change them all to the uglier equivalent. Under what conditions
would the above query plan perform well?

Thanks in advance,
Dimitris





Question on Aurora postgres

2023-10-31 Thread veem v
Hello all,
We are planning to use Aurora Postgres for a few applications, but wanted
some guidance on which instance class type we should use for the lower
environments/non-prod (Dev/QA/UAT/perf) and what we should use for the
production database. Is there some recommendation based on usage etc. for
this?

Regards
Veem


Re: pg_checksums?

2023-10-31 Thread Paul Förster
Hi Alexander,

> On Oct 30, 2023, at 19:49, Alexander Kukushkin  wrote:
> That's not what I said.

That's why I asked. Because you used the word orthogonal.

> Patroni only manages Postgres. It is exactly the same Postgres as you would 
> run it without Patroni.
> Everything will work.

Now that is what I suspected, based on what I have learned so far about how
Patroni (and PostgreSQL replication) work.

Thanks very much.

Cheers
Paul



Re: xmax not zero?

2023-10-31 Thread Laurenz Albe
On Tue, 2023-10-31 at 08:23 +0100, Luca Ferrari wrote:
> What puzzled me was that such xmax was non-zero for a
> while (days). Inspecting pg_class.reloptions I found that I had
> accidentally (?) disabled autovacuum on that table, my fault!
> I should have checked pg_stat_user_tables at first!

VACUUM won't modify "xmax", nor will it remove the row.
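
(A quick way to watch this is to select the system columns directly;
`mytable` here is a placeholder:)

```sql
-- xmax = 0: no transaction has deleted, updated, or locked the row.
-- Non-zero xmax: the transaction that did (possibly aborted or still open).
SELECT ctid, xmin, xmax, * FROM mytable LIMIT 10;
```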

Yours,
Laurenz Albe




Re: xmax not zero?

2023-10-31 Thread Luca Ferrari
On Mon, Oct 30, 2023 at 1:54 PM Guillaume Lelarge
 wrote:
> There are many reasons for a non-zero value: row updated or deleted in a
> rolled-back transaction, row updated or deleted in a current transaction, row
> locked by a SELECT FOR UPDATE, and perhaps others I don't remember right now.
>

Yes, clearly. What puzzled me was that such xmax was non-zero for a
while (days). Inspecting pg_class.reloptions I found that I had accidentally
(?) disabled autovacuum on that table, my fault!
I should have checked pg_stat_user_tables at first!

Thanks,
Luca