Re: Disk wait problem... may not be hardware...

2023-10-27 Thread Adrian Klaver

On 10/27/23 16:46, p...@pfortin.com wrote:

Peter,

Thanks for your patience; I've been feeling pressure to get this
resolved, so I have been lax in providing info here. Hope the following
helps...





Something I hadn't noticed before: SQL Workbench/J (build 129.6) displays
an execution timer at the bottom of the window.  I see all the queries
reporting 0-47 ms, which is plenty fast. It's the results that often take a
long time to display.


Which is not surprising as building GUI elements is an expensive 
operation. If the time to display is your wait issue then this is not 
really a disk issue. What happens if you use psql as the client?
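One quick way to separate server execution time from client rendering is psql's \timing meta-command; a sketch, using a table name from this thread:

```sql
-- In psql, \timing on reports the full server round trip per statement,
-- with none of a GUI's result-grid rendering overhead.
\timing on
SELECT count(*) FROM ncvhis_2016_12_03;
-- If psql's reported times stay close to the GUI's 0-47 ms,
-- the long waits are client-side rendering, not disk or server time.
```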



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Disk wait problem... may not be hardware...

2023-10-27 Thread pf
Peter, 

Thanks for your patience; I've been feeling pressure to get this
resolved, so I have been lax in providing info here. Hope the following
helps...

On Fri, 27 Oct 2023 19:07:11 +0200 Peter J. Holzer wrote:

>On 2023-10-26 22:03:25 -0400, p...@pfortin.com wrote:
>> Are there any extra PG low level logs that can be turned on?  

$ uname -a
Linux pf.pfortin.com 6.5.3-server-1.mga10 #1 SMP PREEMPT_DYNAMIC
   Sat Sep 16 00:04:28 UTC 2023 x86_64 GNU/Linux

PostgreSQL 15.4 on x86_64-mageia-linux-gnu, compiled by gcc (Mageia
12.3.0-3.mga9) 12.3.0, 64-bit

Operating System: Mageia 10 KDE Plasma
Version: 5.27.8 KDE Frameworks Version: 5.111.0
Qt Version: 5.15.7
Kernel Version: 6.5.3-server-1.mga10 (64-bit)
Graphics Platform: X11
Processors: 20 × 12th Gen Intel® Core™ i7-12700K
Memory: 125.5 GiB of RAM
Graphics Processor: AMD Radeon RX 6600 XT
Manufacturer: Dell Inc.
Product Name: XPS 8950

>Have you looked at the query plans as I recommended? (You might also
>want to enable track_io_timing to get extra information, but comparing
>just the query plans of fast and slow queries would be a first step)

I didn't see how that would help since, other than the table name, the
queries are identical.  Curious: are you implying PG stores tables
differently?  The tables are probably more than 90-95% identical in their rows.
We're still investigating how best to store this type of data, which
changes a small number of rows each week...

The query is the built-in row-count query in SQL-workbench/J which is
visible in the attached files.

>explain (analyze, buffers) select count(*) from ncvhis_2016_12_03;
Finalize Aggregate  (cost=404669.65..404669.66 rows=1 width=8) (actual time=844.158..847.309 rows=1 loops=1)
  Buffers: shared hit=248 read=25022
  ->  Gather  (cost=404669.43..404669.65 rows=2 width=8) (actual time=844.133..847.301 rows=3 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        Buffers: shared hit=248 read=25022
        ->  Partial Aggregate  (cost=403669.43..403669.45 rows=1 width=8) (actual time=838.772..838.772 rows=1 loops=3)
              Buffers: shared hit=248 read=25022
              ->  Parallel Index Only Scan using ncvhis_2016_12_03_voted_party_cd_idx on ncvhis_2016_12_03  (cost=0.44..372735.05 rows=12373755 width=0) (actual time=18.277..592.473 rows=9900389 loops=3)
                    Heap Fetches: 0
                    Buffers: shared hit=248 read=25022
Planning Time: 0.069 ms
JIT:
  Functions: 8
  Options: Inlining false, Optimization false, Expressions true, Deforming true
  Timing: Generation 0.284 ms, Inlining 0.000 ms, Optimization 0.268 ms, Emission 3.590 ms, Total 4.143 ms
Execution Time: 847.498 ms

>explain (analyze, buffers) select count(*) from ncvhis_2020_08_01;
Finalize Aggregate  (cost=438377.94..438377.95 rows=1 width=8) (actual time=624.700..628.024 rows=1 loops=1)
  Buffers: shared hit=286 read=27084
  ->  Gather  (cost=438377.73..438377.94 rows=2 width=8) (actual time=624.669..628.016 rows=3 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        Buffers: shared hit=286 read=27084
        ->  Partial Aggregate  (cost=437377.73..437377.74 rows=1 width=8) (actual time=619.297..619.297 rows=1 loops=3)
              Buffers: shared hit=286 read=27084
              ->  Parallel Index Only Scan using ncvhis_2020_08_01_voted_party_cd_idx on ncvhis_2020_08_01  (cost=0.44..403856.78 rows=13408379 width=0) (actual time=0.029..357.697 rows=10724282 loops=3)
                    Heap Fetches: 0
                    Buffers: shared hit=286 read=27084
Planning Time: 0.072 ms
JIT:
  Functions: 8
  Options: Inlining false, Optimization false, Expressions true, Deforming true
  Timing: Generation 0.285 ms, Inlining 0.000 ms, Optimization 0.218 ms, Emission 3.933 ms, Total 4.435 ms
Execution Time: 628.216 ms

>explain (analyze, buffers) select count(*) from ncvhis_2020_10_31;
Finalize Aggregate  (cost=438179.73..438179.74 rows=1 width=8) (actual time=1090.209..1092.976 rows=1 loops=1)
  Buffers: shared hit=276 read=27095
  ->  Gather  (cost=438179.52..438179.73 rows=2 width=8) (actual time=1090.141..1092.967 rows=3 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        Buffers: shared hit=276 read=27095
        ->  Partial Aggregate  (cost=437179.52..437179.53 rows=1 width=8) (actual time=1084.766..1084.767 rows=1 loops=3)
              Buffers: shared hit=276 read=27095
              ->  Parallel Index Only Scan using ncvhis_2020_10_31_voted_party_cd_idx on ncvhis_2020_10_31  (cost=0.44..403675.53 rows=13401595 width=0) (actual time=222.686..816.447 rows=10720916 loops=3)
                    Heap Fetches: 0
                    Buffers: shared hit=276 read=27095
Planning Time: 0.131 ms
JIT:
  Functions: 8
  Options: Inlining false, Optimization false, Expressions true, Deforming true
  Timing: Generation 0.322 ms, Inlining 0.000 ms, Optimization 0.261 ms, Emission 3.648 ms, Total 4.231 ms
Execution Time: 1093.209 ms



Those responses are reasonably quick...  A suggestion from the Linux list
was to run "fstrim -v /mnt/work" which is the NVMe mount point for the DB:
$ fstrim -nv /mnt/work
/mnt/work: 0 B (dry run) trimmed

$ fstrim -v /mnt/work
/mnt/work: 2 TiB (2248993087488 bytes) trimmed


Re: pgAdmin view

2023-10-27 Thread Merlin Moncure
On Thu, Oct 26, 2023 at 5:56 AM Shaozhong SHI 
wrote:

> If a pgAdmin view is created and the tables it queries are missing, what will
> happen?
>
> If a pgAdmin view is created, and it may take 20 hours to complete, what
> will happen?
>
> Would views automatically run when you start pgAdmin?
>

Views are rewritten into the query when the query is planned.
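A minimal illustration of this behavior, assuming a scratch database (names here are hypothetical):

```sql
CREATE TABLE t (id int);
CREATE VIEW v AS SELECT id FROM t;

-- The view's body is merged into the referencing query at plan time:
EXPLAIN SELECT * FROM v;   -- the plan shows a scan of t, not a separate step for v

-- Views never run on their own at startup; they execute only when referenced.
-- PostgreSQL also tracks dependencies, so the underlying table cannot simply go missing:
DROP TABLE t;              -- fails: other objects (view v) depend on it; DROP ... CASCADE removes both
```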

merlin


Re: pg_checksums?

2023-10-27 Thread Paul Förster
Hi Bruce, hi Daniel,

> On Oct 27, 2023, at 23:21, Bruce Momjian  wrote:
> 
> On Fri, Oct 27, 2023 at 10:45:16PM +0200, Daniel Gustafsson wrote:
>>> On 27 Oct 2023, at 20:34, Paul Förster  wrote:
>> 
>>> a) why isn't it possible to enable checksumming while a database cluster is 
>>> up?
>> 
>> It is surprisingly complicated to enable checksums on a live cluster, a patch
>> was submitted a while back but ultimately never made it into postgres.  The
>> below threads may shine some light on the problem:
>> 
>> https://www.postgresql.org/message-id/flat/CABUevEx8KWhZE_XkZQpzEkZypZmBp3GbM9W90JLp%3D-7OJWBbcg%40mail.gmail.com
>> https://www.postgresql.org/message-id/flat/560A2239-5DE2-4B9C-92BC-878C6822F47C%40yesql.se#10b665b2193445a3d7f24be7917a952c
> 
> Yeah, it was a big surprise that this feature was so hard to implement
> because we have _no_ infrastructure for having multiple data layouts
> active in a live system.  The discussion eventually made that clear.
> 
> If we have more features that need this kind of dynamic ability, we
> might revisit this feature too.

Ok, I see.

But unfortunately, my questions c) and d) are still unanswered. I'd especially 
be interested in an answer to c), i.e. is it *safe* to run "pg_checksums -e" on the 
replica instance in a Patroni cluster, switch over, and then do the other one?
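(For context, the offline step being weighed here is roughly the following; a sketch assuming PostgreSQL 12 or later, where pg_checksums gained --enable:)

```shell
# pg_checksums refuses to run against a live cluster, so stop the instance first.
pg_ctl stop -D "$PGDATA" -m fast

# --enable rewrites every data block, so runtime scales with cluster size.
pg_checksums --enable --progress -D "$PGDATA"

pg_ctl start -D "$PGDATA"
```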

Cheers
Paul





Re: pg_checksums?

2023-10-27 Thread Bruce Momjian
On Fri, Oct 27, 2023 at 10:45:16PM +0200, Daniel Gustafsson wrote:
> > On 27 Oct 2023, at 20:34, Paul Förster  wrote:
> 
> > a) why isn't it possible to enable checksumming while a database cluster is 
> > up?
> 
> It is surprisingly complicated to enable checksums on a live cluster, a patch
> was submitted a while back but ultimately never made it into postgres.  The
> below threads may shine some light on the problem:
> 
> https://www.postgresql.org/message-id/flat/CABUevEx8KWhZE_XkZQpzEkZypZmBp3GbM9W90JLp%3D-7OJWBbcg%40mail.gmail.com
> https://www.postgresql.org/message-id/flat/560A2239-5DE2-4B9C-92BC-878C6822F47C%40yesql.se#10b665b2193445a3d7f24be7917a952c

Yeah, it was a big surprise that this feature was so hard to implement
because we have _no_ infrastructure for having multiple data layouts
active in a live system.  The discussion eventually made that clear.

If we have more features that need this kind of dynamic ability, we
might revisit this feature too.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: pg_checksums?

2023-10-27 Thread Daniel Gustafsson
> On 27 Oct 2023, at 20:34, Paul Förster  wrote:

> a) why isn't it possible to enable checksumming while a database cluster is 
> up?

It is surprisingly complicated to enable checksums on a live cluster, a patch
was submitted a while back but ultimately never made it into postgres.  The
below threads may shine some light on the problem:

https://www.postgresql.org/message-id/flat/CABUevEx8KWhZE_XkZQpzEkZypZmBp3GbM9W90JLp%3D-7OJWBbcg%40mail.gmail.com
https://www.postgresql.org/message-id/flat/560A2239-5DE2-4B9C-92BC-878C6822F47C%40yesql.se#10b665b2193445a3d7f24be7917a952c

--
Daniel Gustafsson





Re: pg_checksums?

2023-10-27 Thread Paul Förster
Hi Ron,

> On Oct 27, 2023, at 21:02, Ron  wrote:
>> b) why isn't it possible to check whether checksums are enabled or not?
> 
> (This is my tiny test instance.)
> 
> $ pg_controldata | grep checksum
> Data page checksum version:   0
> 
> postgres=# show data_checksums;
>  data_checksums
> 
>  off
> (1 row)

this helps a lot. Thanks very much.

That leaves only my other two questions, c) and d).

Cheers
Paul




Re: pg_checksums?

2023-10-27 Thread Ron

On 10/27/23 13:34, Paul Förster wrote:

Hi,

I have a few questions about pg_checksums.

Long story short, checksums were disabled in our environment but are getting enabled now 
(again!) because of block corruptions which destroyed a few databases in a database 
cluster. And before you say "told you so", the decision to disable checksums 
was not mine. Some people just have to learn the hard way.

Anyway, re-enabling data checksums creates a few questions:

a) why isn't it possible to enable checksumming while a database cluster is up?


Data might be changing.


b) why isn't it possible to check whether checksums are enabled or not?


(This is my tiny test instance.)

$ pg_controldata | grep checksum
Data page checksum version:   0

postgres=# show data_checksums;
 data_checksums
----------------
 off
(1 row)



c) in a Patroni cluster consisting of a primary and a sync standby, is it safe 
to enable checksumming in the replica, then switch over and enable it in the 
ex-primary (i.e. the new replica) without any penalty? Or do I have to perform 
a reinit to really get them in sync again, even though patronictl happily reports 
them to be in sync?

d) how long does it take to enable checksums in general? Minimizing down-time 
is crucial. Does it depend on the database cluster size, or the number of files 
it uses? What can be taken as a criterion to estimate the necessary 
down-time?

Thanks in advance for your insights.

Cheers
Paul



--
Born in Arizona, moved to Babylonia.




pg_checksums?

2023-10-27 Thread Paul Förster
Hi,

I have a few questions about pg_checksums.

Long story short, checksums were disabled in our environment but are getting 
enabled now (again!) because of block corruptions which destroyed a few 
databases in a database cluster. And before you say "told you so", the decision 
to disable checksums was not mine. Some people just have to learn the hard way.

Anyway, re-enabling data checksums creates a few questions:

a) why isn't it possible to enable checksumming while a database cluster is up?

b) why isn't it possible to check whether checksums are enabled or not?

c) in a Patroni cluster consisting of a primary and a sync standby, is it safe 
to enable checksumming in the replica, then switch over and enable it in the 
ex-primary (i.e. the new replica) without any penalty? Or do I have to perform 
a reinit to really get them in sync again, even though patronictl happily reports 
them to be in sync?

d) how long does it take to enable checksums in general? Minimizing down-time 
is crucial. Does it depend on the database cluster size, or the number of files 
it uses? What can be taken as a criterion to estimate the necessary 
down-time?

Thanks in advance for your insights.

Cheers
Paul



Re: Disk wait problem... may not be hardware...

2023-10-27 Thread Peter J. Holzer
On 2023-10-26 22:03:25 -0400, p...@pfortin.com wrote:
> Are there any extra PG low level logs that can be turned on?

Have you looked at the query plans as I recommended? (You might also
want to enable track_io_timing to get extra information, but comparing
just the query plans of fast and slow queries would be a first step)

You haven't shown any postgresql logs or your settings, so it's hard to
know what you have already turned on.

There are a ton of logging related parameters.
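As a starting point, a hedged sketch of postgresql.conf settings commonly used to capture slow statements and I/O timing (the threshold value is illustrative):

```ini
logging_collector = on            # write server logs to files under log_directory
log_directory = 'log'             # relative to the data directory
log_min_duration_statement = 250  # log any statement slower than 250 ms
track_io_timing = on              # adds I/O timing to EXPLAIN (ANALYZE, BUFFERS) and stats views
log_lock_waits = on               # log sessions waiting longer than deadlock_timeout
```

After editing, most of these take effect with a reload (SELECT pg_reload_conf()); logging_collector requires a restart.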

> I've only found logging in the journal;

By "the journal" you mean the one maintained by journald? (Did you
mention what OS you are using? From several outputs I'm guessing it is
Linux). On Linux systems postgresql is usually set up to log into files
somewhere under /var/log/postgresql (see the log_directory entry in
postgresql.conf) but your setup might be different, of course.

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"




BRIN index maintenance on table without primary key

2023-10-27 Thread Dimitrios Apostolou

Hello list.

Key characteristics of my case:
+ HUGE table with 20G rows, ca 2TB
+ May be partitioned (have both versions on two test clusters ATM)
+ Plenty of inserts, no updates
+ No primary key - we had one IDENTITY bigint column until recently, but
  it proved useless and inefficient (index too big for mem) so we deleted it.
+ All queries are filtering on a not-unique not-null integer column.
  + On this column we have a BRIN INDEX since insert order is
/mostly/ incremental.

So the question is: how to maintain the physical order of the tuples?

Even though the insertions populate the index column /mostly/
incrementally, there are outliers and it's certain that over time the
BRIN index will deteriorate.

I'm monitoring the "correlation" and I want to run a command to "defrag"
it when it drops below 0.9.
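The correlation check described above can be sketched as follows (run after ANALYZE so pg_stats is current; 'big_table' and 'filter_col' are placeholder names):

```sql
SELECT correlation
FROM pg_stats
WHERE tablename = 'big_table'
  AND attname   = 'filter_col';
-- 1.0 means physical order matches logical order;
-- rewrite/reorder the table when it falls below ~0.9.
```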

+ Can't run CLUSTER:
ERROR:  cannot cluster on index "tst_brin_idx" because access method does 
not support clustering

+ Can't run pg_repack, from [1]:
> Target table must have a PRIMARY KEY, or at least a UNIQUE total index on 
a NOT NULL column

[1] https://reorg.github.io/pg_repack/


Any ideas?

Thank you in advance,
Dimitris





Re: need SQL logic to find ways to implement a checksum to validate table data is migrated correctly

2023-10-27 Thread Ron

On 10/27/23 03:56, Y_esteembsv-forum wrote:

Hi
Need SQL logic / different approaches to find ways
a) to implement a checksum to validate that table data is migrated correctly


When migrating from Oracle to PostgreSQL, on both the Oracle side and the 
Postgresql side, I dumped each table (for the big ones, I split them into 
views) as a tab-separated value file ordered by the primary key and piped 
that into md5sum.


I had to play a few games with capitalization in blob/bytea columns.

That way, I could verify that each table's data was identical.
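The per-table comparison described above can be sketched as follows. Here the two ordered dumps are simulated with printf; in practice each would come from something like `psql -Atc "COPY (SELECT * FROM t ORDER BY pk) TO STDOUT"` run against the respective server (table and key names are placeholders):

```shell
# Hash a deterministic, ordered dump of each side and compare the digests.
src=$(printf '1\talice\n2\tbob\n' | md5sum | cut -d' ' -f1)
dst=$(printf '1\talice\n2\tbob\n' | md5sum | cut -d' ' -f1)
if [ "$src" = "$dst" ]; then
  echo "table data identical"
else
  echo "table data differs"
fi
# prints "table data identical"
```

Ordering by the primary key is essential: without a stable ORDER BY the dump order, and therefore the digest, is not deterministic.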

--
Born in Arizona, moved to Babylonia.

Re: need SQL logic to find ways to implement a checksum to validate table data is migrated correctly

2023-10-27 Thread Ron


OP might be referring to migrating from, say, Oracle to Postgresql when 
oracle_fdw is not available.


On 10/27/23 04:34, Thiemo Kellner wrote:


Hi

Why do you need checksums? Can you not employ a full outer join? My thought 
behind this is that checksumming is quite costly CPU-wise and you have to 
fiddle with each and every relevant column, whether as part of the checksum 
string or as part of the join condition. The join would have the advantage 
that you could, if you wanted, also implement logic to tell you precisely 
in which columns the difference is encountered.


Kind regards

Thiemo

Am 27.10.2023 um 10:56 schrieb Y_esteembsv-forum:

Hi
Need SQL logic / different approaches to find ways
a) to implement a checksum to validate that table data is migrated correctly

Any guidance

Regards
Bharani SV


--
Born in Arizona, moved to Babylonia.

Re: need SQL logic to find ways to implement a checksum to validate table data is migrated correctly

2023-10-27 Thread Thiemo Kellner

Hi

Why do you need checksums? Can you not employ a full outer join? My 
thought behind this is that checksumming is quite costly CPU-wise and you 
have to fiddle with each and every relevant column, whether as part of the 
checksum string or as part of the join condition. The join would have the 
advantage that you could, if you wanted, also implement logic to tell 
you precisely in which columns the difference is encountered.
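Sketched as SQL (table and column names are placeholders), the join-based comparison could look like:

```sql
SELECT s.id AS src_id, t.id AS tgt_id,
       s.col1 AS src_col1, t.col1 AS tgt_col1
FROM   source_table s
FULL OUTER JOIN target_table t ON t.id = s.id
WHERE  s.id IS NULL                         -- row exists only in the target
   OR  t.id IS NULL                         -- row exists only in the source
   OR  s.col1 IS DISTINCT FROM t.col1;      -- per-column, NULL-safe difference
-- An empty result means the tables agree on the compared columns.
```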


Kind regards

Thiemo

Am 27.10.2023 um 10:56 schrieb Y_esteembsv-forum:

Hi
Need SQL logic / different approaches to find ways
a) to implement a checksum to validate that table data is migrated correctly

Any guidance

Regards
Bharani SV

Re: Correct way of using complex expressions as partitioning key

2023-10-27 Thread Thomas Kellerer



Alexander Rumyantsev schrieb am 27.10.2023 um 06:27:
> Hello!
>
> Is there some correct way to use complex expressions as a key for a partitioned 
> table?
> Inserting works as expected, but SELECT scans all partitions unless the 
> complete partition key expression is used as a predicate.
>
> test=# create table test (
> id text,
> v1 bigint,
> v2 bigint
> )
> partition by range (((v1 + v2) % 10));
> CREATE TABLE
>
> test=# create table test_1 partition of test for values from (0) to (1);
> CREATE TABLE
>
> test=# create table test_2 partition of test for values from (1) to (2);
> CREATE TABLE
>
> test=# insert into test values (1, 100, 101);
> INSERT 0 1
>
> test=# insert into test values (1, 100, 100);
> INSERT 0 1
>
> test=# select * from test_1;
>  id | v1  | v2
> ----+-----+-----
>  1  | 100 | 100
> (1 row)
>
> test=# select * from test_2;
>  id | v1  | v2
> ----+-----+-----
>  1  | 100 | 101
> (1 row)

It seems you are trying to simulate hash partitioning using that expression.
Why not use hash partitioning directly then?

create table test (
  id text,
  v1 bigint,
  v2 bigint
)
partition by hash (v1, v2);

create table test_1
  partition of test
  for values with (modulus 2, remainder 0);
create table test_2
  partition of test
  for values with (modulus 2, remainder 1);

Then the select will result in:

                                         QUERY PLAN
---------------------------------------------------------------------------------------------
Seq Scan on test_1 test  (cost=0.00..17.49 rows=1 width=48) (actual 
time=0.015..0.016 rows=1 loops=1)
  Filter: ((v1 = 100) AND (v2 = 100))
Planning Time: 0.159 ms
Execution Time: 0.037 ms


Demo: https://dbfiddle.uk/wuopLYeQ






need SQL logic to find ways to implement a checksum to validate table data is migrated correctly

2023-10-27 Thread Y_esteembsv-forum
Hi,

I need SQL logic / a different approach to find ways
a) to implement a checksum to validate that table data is migrated correctly.

Any guidance?

Regards,
Bharani SV

Re: Correct way of using complex expressions as partitioning key

2023-10-27 Thread Laurenz Albe
On Fri, 2023-10-27 at 07:27 +0300, Alexander Rumyantsev wrote:
> Is there some correct way to use complex expressions as a key for a partitioned 
> table?
> Inserting works as expected, but SELECT scans all partitions unless the 
> complete partition key expression is used as a predicate.
> 
> test=# create table test (
> id text,
> v1 bigint,
> v2 bigint
> )
> partition by range (((v1 + v2) % 10));
> CREATE TABLE
> 
> test=# explain analyze select * from test where v1 = 100 and v2 = 100;
> [no partition pruning]
> 
> test=# explain analyze select * from test where ((v1 + v2) % 10) = 0 and v1 = 
> 100 and v2 = 100;
> [partition pruning]

Yes, you only get partition pruning if the WHERE clause contains a comparison 
with
the partitioning key.  There is no way around that.

Yours,
Laurenz Albe




Correct way of using complex expressions as partitioning key

2023-10-27 Thread Alexander Rumyantsev
Hello!

Is there some correct way to use complex expressions as a key for a partitioned 
table?
Inserting works as expected, but SELECT scans all partitions unless the 
complete partition key expression is used as a predicate.

test=# create table test (
id text,
v1 bigint,
v2 bigint
)
partition by range (((v1 + v2) % 10));
CREATE TABLE

test=# create table test_1 partition of test for values from (0) to (1);
CREATE TABLE

test=# create table test_2 partition of test for values from (1) to (2);
CREATE TABLE

test=# insert into test values (1, 100, 101);
INSERT 0 1

test=# insert into test values (1, 100, 100);
INSERT 0 1

test=# select * from test_1;
 id | v1  | v2
----+-----+-----
 1  | 100 | 100
(1 row)

test=# select * from test_2;
 id | v1  | v2
----+-----+-----
 1  | 100 | 101
(1 row)

test=# explain analyze select * from test where v1 = 100 and v2 = 100;
                                         QUERY PLAN
---------------------------------------------------------------------------------------------
 Append  (cost=0.00..52.11 rows=2 width=48) (actual time=0.011..0.017 rows=1 
loops=1)
   ->  Seq Scan on test_1  (cost=0.00..26.05 rows=1 width=48) (actual 
time=0.011..0.011 rows=1 loops=1)
 Filter: ((v1 = 100) AND (v2 = 100))
   ->  Seq Scan on test_2  (cost=0.00..26.05 rows=1 width=48) (actual 
time=0.004..0.004 rows=0 loops=1)
 Filter: ((v1 = 100) AND (v2 = 100))
 Rows Removed by Filter: 1
 Planning Time: 0.457 ms
 Execution Time: 0.036 ms
(8 rows)

test=# explain analyze select * from test where ((v1 + v2) % 10) = 0 and v1 = 
100 and v2 = 100;
                                         QUERY PLAN
---------------------------------------------------------------------------------------------
 Seq Scan on test_1 test  (cost=0.00..34.08 rows=1 width=48) (actual 
time=0.010..0.011 rows=1 loops=1)
   Filter: ((v1 = 100) AND (v2 = 100) AND (((v1 + v2) % '10'::bigint) = 0))
 Planning Time: 0.131 ms
 Execution Time: 0.031 ms
(4 rows)





Re: Disk wait problem... may not be hardware...

2023-10-27 Thread Justin Clift

On 2023-10-27 12:03, p...@pfortin.com wrote:


I can't think of a common hardware bus/other that would only affect
PostgreSQL disk accesses.


Which file system is PostgreSQL being run on?

Asking because I remember seeing weirdness reported with *some* SSD
drives when used with ZFS:

  https://github.com/openzfs/zfs/discussions/14793

Note - that's not PostgreSQL specific or anything, but more of a
"weird stuff showing up with NVMe drives" thing.

Regards and best wishes,

Justin Clift