Re: Logical replication and AFTER UPDATE triggers [PG 16]

2024-02-02 Thread Chris Angelico
On Fri, 2 Feb 2024 at 13:20, Chris Angelico  wrote:
> create or replace function send_settings_notification() returns
> trigger language plpgsql as $$begin perform
> pg_notify('stillebot.settings', ''); return null; end$$;
> create trigger settings_update_notify after update on
> stillebot.settings execute function send_settings_notification();
> alter table stillebot.settings enable always trigger settings_update_notify;
>

Ah ha! A discovery. It may be that a FOR EACH STATEMENT trigger (which
is the default) does not fire on the subscriber. Converting it to FOR
EACH ROW seems to make this work. Does this seem reasonable? I
can't find anything in the docs that confirms it.
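
For reference, a minimal sketch of the row-level variant described above
(function and table names as in the quoted message):

drop trigger if exists settings_update_notify on stillebot.settings;
create trigger settings_update_notify
    after update on stillebot.settings
    for each row execute function send_settings_notification();
alter table stillebot.settings enable always trigger settings_update_notify;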

ChrisA




Logical replication and AFTER UPDATE triggers [PG 16]

2024-02-01 Thread Chris Angelico
After various iterations of logical replication on PG 15, I bit the bullet
and installed PG 16. (Using the bookworm-pgdg repository.) Turns out, that
basically solved all the problems I'd been having previously - yay!

Got a bit of a curveball thrown at me though. I have a singleton
settings table (see other thread; I added a primary key to it to make
replication reliable) and it has a trigger on it:

create or replace function send_settings_notification() returns trigger
language plpgsql as
$$begin perform pg_notify('stillebot.settings', ''); return null; end$$;

create trigger settings_update_notify after update on stillebot.settings
execute function send_settings_notification();

alter table stillebot.settings enable always trigger settings_update_notify;

Updating the table (even to the same value as it currently has)
correctly notifies any listening clients on the *same* database
instance. However, the replicated node does not fire off a
notification. For testing purposes I have four clients - running on
Sikorsky connected to Sikorsky, on Sikorsky connected to Gideon, on
Gideon connected to Sikorsky, on Gideon connected to Gideon - and any
edit made on Sikorsky sends notifications to the two that are
connected to Sikorsky, and any edit made on Gideon sends notifications
to the two connected to Gideon. According to
https://www.postgresql.org/docs/current/sql-altertable.html this would
be the default behaviour, but with "enable always trigger", it should
fire on both ends, right?

The changes ARE getting replicated out - querying the table on either
end shows that data is flowing correctly - so it's just the trigger.

What's the correct way to use NOTIFY triggers in a replicated system?

ChrisA




Re: Logical replication claims to work, not working - new tables

2024-01-21 Thread Chris Angelico
On Mon, 22 Jan 2024 at 05:50, Chris Angelico  wrote:
>
> On Mon, 22 Jan 2024 at 05:25, Justin  wrote:
> > Adding a primary key will fix this issue.  Note PG 16 can use indexes to 
> > find qualifying rows when a table's replica is set to full.
>
> I'll try dropping the table, creating it again with a PK, and seeing
> how it goes. Thanks.
>

Okay, it seems to be working now. Thanks! Hopefully that was the only issue.

I'm a bit surprised by this, the docs do seem to imply that it ought
to work (albeit inefficiently). Maybe I made it worse by initially
creating the table without specifying the replica identity, and had to
alter it in afterwards?

ChrisA




Re: Logical replication claims to work, not working - new tables

2024-01-21 Thread Chris Angelico
On Mon, 22 Jan 2024 at 05:25, Justin  wrote:
>
> When using replica set to full this kicks off  a  full table scan for each 
> update or delete this is very expensive.  If there are no errors being 
> reported you will find it is working but hung doing full scans.  Inserts are 
> just appended to end of heap.
>

That shouldn't be too costly, since this table only has a single row
in it. But it never finishes the replication at all.

>
> You can copy the replication slot on the primary to peak at which 
> transactions LR is hung on to confirm.
>
> Adding a primary key will fix this issue.  Note PG 16 can use indexes to find 
> qualifying rows when a table's replica is set to full.

Ah, sadly PG 16 isn't an option as yet. It would definitely make
dual-master replication somewhat easier.

I'll try dropping the table, creating it again with a PK, and seeing
how it goes. Thanks.
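
For the record, a rough sketch of that plan (the real table has more
columns, and the subscription name here is made up):

-- on both nodes, recreate the table with a primary key:
drop table stillebot.settings;
create table stillebot.settings (
    id integer primary key default 1
    -- ... plus the existing settings columns ...
);

-- then, on the subscribing side, pick up the recreated table:
alter subscription my_sub refresh publication;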

ChrisA




Logical replication claims to work, not working - new tables

2024-01-20 Thread Chris Angelico
PostgreSQL 15 on Debian, both ends of replication.

I'm doing logical replication in a bit of a complex setup. Not sure
how much of this is relevant so I'll give you a lot of detail; sorry
if a lot of this is just noise.

* Bidirectional alternating-master replication. Since I'm still on PG
15, the replication is up in one direction, down in the other, rather
than actually being simultaneous.
* Replication is defined as "for all tables". All relevant tables are
in a single schema, "stillebot".
* Replication was working fine on initial deployment, including a swap
of master/slave.
* One table was created without a primary key, and subsequently
altered to have "replica identity full".
* Replication is not working for this table (stillebot.settings) but
is working for other tables.

I tried restarting Postgres on the slave and monitoring
/var/log/postgresql/postgresql-15-main.log and it all seemed happy.
According to pg_subscription_rel, all tables are in their
"replication" phase:

select srsubstate,srsublsn,relname from pg_subscription_rel join
pg_class on srrelid=oid;

 srsubstate | srsublsn |         relname
------------+----------+--------------------------
 r          |          | user_followed_categories
 r          |          | config
 r          |          | config_exportable
 r          |          | commands
 r          |          | settings
(5 rows)

If I make a change to user_followed_categories, it works fine. If I
make a change to settings, it does not replicate.

The stillebot.config table seems somewhat desynchronized - rows are
missing - but if I INSERT a new row into it, it gets properly
replicated.

How can I dig into this to determine what's going on? Have I broken
things by creating a table without a primary key? (It's a singleton
table, will only ever have one row in it; could add a meaningless PK
if it helps the replication.) Are there other logs to inspect?

Thanks in advance!

ChrisA




Re: Question on overall design

2023-12-11 Thread Chris Travers
On Tue, Dec 12, 2023 at 2:11 AM Ron Johnson  wrote:

> On Mon, Dec 11, 2023 at 4:41 AM Dominique Devienne 
> wrote:
>
>> On Sun, Dec 10, 2023 at 5:56 PM Ron Johnson 
>> wrote:
>>
>>> * We departitioned because SELECT statements were *slow*.  All
>>> partitions were scanned, even when the partition key was specified in the
>>> WHERE clause.
>>>
>>
>> Surely that's not the case on newer PostgreSQL, is it? Otherwise what's
>> the point of partitioning?
>> Also, I remember reading something about recent improvements with a large
>> number of partitions, no?
>>
>> As someone who's interested in partitioning, I'd appreciate details.
>> Thanks, --DD
>>
>
> This was on 12.5.  v13 was just released, and we weren't confident about
> running a mission-critical system on a .1 version.
>

Something's wrong if all partitions are scanned even when the partition
clause is explicit in the where clause.

There are however some things which can cause problems here, such as type
casts of the partition key, or when the partition key is being brought in
from a join.
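
To make that concrete, a small sketch (table and column names are invented):

-- range-partitioned on a timestamp column
create table tx (id bigint, created_at timestamp not null)
    partition by range (created_at);
create table tx_2023_12 partition of tx
    for values from ('2023-12-01') to ('2024-01-01');

-- prunes at plan time: the comparison is against the key's own type
explain select * from tx where created_at >= timestamp '2023-12-05';

-- typically defeats plan-time pruning: the key is wrapped in a cast
explain select * from tx where created_at::date = date '2023-12-05';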

>
> All "transaction" tables were partitioned by month on partion_date, while
> the PK was table_name_id, partition_date.
>
> Queries were _slow_, even when the application knew the partion_date range
> (since queries might span months).  PG just wouldn't prune.
>

Was there a datatype issue here?  Like having a partition key of type
timestamp, but the query casting from date?

>
> When I departitioned the tables, performance became acceptable.
>
>
>

-- 
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor
lock-in.
http://www.efficito.com/learn_more


Re: running \copy through perl dbi ?

2023-12-11 Thread Chris Travers
\copy in psql just wraps PostgreSQL's COPY FROM STDIN.

If you are trying to do it from your own client program, it is trivial to
change to that call instead.

On Mon, Dec 11, 2023 at 4:09 PM Vincent Veyron 
wrote:

> On Fri, 8 Dec 2023 10:45:28 -0500
> David Gauthier  wrote:
> >
> > I'm trying to run a PG client side "\copy" command from a perl script.  I
> > tried using $dbh->do("\\copy ...") but it barffed when it saw the '\'...
> > ERROR:  syntax error at or near "\"
> >
> > I can do this with a command line approach, attaching to the DB  then run
> > using...
>
> Duh! I just realized that what I proposed with system() is a command line
> approach.
>
> As David Johnston mentioned, you can use the SQL COPY command.
>
> However, you need then to deal with permissions so that the server may
> write the file, so I wonder what approach is cleaner?
>

I wouldn't do COPY FROM FILE in that case.  I would do COPY FROM STDIN and
then write the data.

Here's the general docs in the DBD::Pg module:
https://metacpan.org/pod/DBD::Pg#COPY-support

The general approach is to COPY FROM STDIN and then use pg_putcopydata for
each row, and finally pg_putcopyend to close it out.  It's not too
different from what psql does in the background.
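
A minimal sketch of that flow with DBD::Pg (connection details, table and
column names are placeholders; see the COPY-support link above):

#!/usr/bin/perl
use strict;
use warnings;
use DBI;

# placeholder connection parameters
my $dbh = DBI->connect('dbi:Pg:dbname=mydb', 'me', 'secret',
                       { RaiseError => 1, AutoCommit => 0 });

# start a server-side COPY ... FROM STDIN
$dbh->do('COPY my_table (id, name) FROM STDIN');

# feed rows: tab-separated columns, newline-terminated
$dbh->pg_putcopydata("1\talice\n");
$dbh->pg_putcopydata("2\tbob\n");

# finish the COPY, then commit
$dbh->pg_putcopyend();
$dbh->commit;
$dbh->disconnect;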

>
>
> --
>
> Bien à vous, Vincent Veyron
>
> https://marica.fr
> Logiciel de gestion des contentieux juridiques, des contrats et des
> sinistres d'assurance
>
>
>
>

-- 
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor
lock-in.
http://www.efficito.com/learn_more


Re: Fixing or Mitigating this ERROR: invalid page in block 35217 of relation base/16421/3192429

2023-11-29 Thread Chris Travers
On Thu, Nov 30, 2023 at 9:03 AM Abdul Qoyyuum 
wrote:

> Hi Chris,
>
> On Wed, Nov 29, 2023 at 7:38 PM Chris Travers 
> wrote:
>
>>
>>
>> On Wed, Nov 29, 2023 at 4:36 PM Abdul Qoyyuum 
>> wrote:
>>
>>> Hi all,
>>>
>>> Knowing that it's a data corruption issue, the only way to fix this is
>>> to vacuum and reindex the database. What was suggested was the following:
>>>
>>> SET zero_damaged_pages = 0; # This is so that we can have the
>>> application to continue to run
>>> VACUUM FULL VERBOSE ANALYSE; # Do a full vacuum and analyse the problem
>>> if possible.
>>> REINDEX DATABASE "core"; # Then do a reindex and clean it up.
>>>
>>
>> So first, to clear up some confusion on my part here:
>>
>> This procedure doesn't make a lot of sense to me.  But did it clear up
>> the issue?
>>
> Yeah it did fix the issue before (same issue as last year) and it has
> fixed the problem that just happened a few days ago (almost exactly a year
> ago).
>
>>
>> In any of these cases, it is extremely important to diagnose the system
>> properly.  If you have a fault in your storage device or RAID controller,
>> for example, you are asking for more corruption and data loss later.
>>
>
>> At first I thought maybe you mistyped something and then realized there
>> were a few issues with the process so it actually didn't make sense.
>>
>> First, zero_damaged_pages defaults to 0, and I can think of no reason to
>> set  it explicitly.
>> Secondly, a vacuum full has to reindex, so there is no reason to do a
>> reindex following.  Your whole procedure is limited to a vacuum full, when
>> a reindex is the only part that could affect this.   If it did work,
>> reindexing is the only part that would have been helpful.
>>
> Oh that makes sense actually. Thanks.
>

So for a temporary workaround, it sounds like reindexing helps for now, but
yeah this really needs deeper investigation.

>
>> On to the question of what to do next
>>
>>>
>>> We're on Postgresql 12. This has worked before it happened (almost
>>> exactly a year ago) and I think this needs a more permanent solution. I've
>>> looked at routine vacuuming and checked the autovacuum is set to on and the
>>> following configurations:
>>>
>>> core=> select name, setting from pg_settings where name like
>>> 'autovacuum%';
>>> name |  setting
>>> -+---
>>>  autovacuum  | on
>>>  autovacuum_analyze_scale_factor | 0.1
>>>  autovacuum_analyze_threshold| 50
>>>  autovacuum_freeze_max_age   | 2
>>>  autovacuum_max_workers  | 3
>>>  autovacuum_multixact_freeze_max_age | 4
>>>  autovacuum_naptime  | 60
>>>  autovacuum_vacuum_cost_delay| 2
>>>  autovacuum_vacuum_cost_limit| -1
>>>  autovacuum_vacuum_scale_factor  | 0.2
>>>  autovacuum_vacuum_threshold | 50
>>>  autovacuum_work_mem | -1
>>> (12 rows)
>>>
>>> Can anyone advise if there's anything else we can do? We have no clue
>>> what causes the invalid page block and we are running a High Availability
>>> cluster set up but we are hoping that there may be a way to mitigate it.
>>>
>>>
>> You need to figure out why the corruption is happening.  This is most
>> likely, in my experience, not a PostgreSQL bug, but usually something that
>> happens on the hardware layer or an environmental factor.  It could be
>> failing storage or CPU.  Or it could be something like bad electrical input
>> or insufficient cooling (I have seen index and even table corruption issues
>> from both of these).
>>
>> If this is a server you run, the first things I would check are:
>> 1.  Is there a good-quality UPS that the server is plugged into?  Are the
>> batteries in good working order?
>>
> The servers are dual powered and hooked up to both supplied electricity,
> with a backup generator and if that fails, it will switch over to the UPS.
> All of these are supplied and maintained by the data centre that the
> servers are at. There have been no electrical problems so far.
>
>> 2.  Is the server somewhere that may be sitting in a pocket of hot air?
>>
> As you can imagine, the data centre has air-conditioning and floored fans
> blowing hot air up and out, keeping all servers cooled. Checking on

Re: Fixing or Mitigating this ERROR: invalid page in block 35217 of relation base/16421/3192429

2023-11-29 Thread Chris Travers
On Wed, Nov 29, 2023 at 4:36 PM Abdul Qoyyuum 
wrote:

> Hi all,
>
> Knowing that it's a data corruption issue, the only way to fix this is to
> vacuum and reindex the database. What was suggested was the following:
>
> SET zero_damaged_pages = 0; # This is so that we can have the application
> to continue to run
> VACUUM FULL VERBOSE ANALYSE; # Do a full vacuum and analyse the problem if
> possible.
> REINDEX DATABASE "core"; # Then do a reindex and clean it up.
>

So first, to clear up some confusion on my part here:

This procedure doesn't make a lot of sense to me.  But did it clear up the
issue?

In any of these cases, it is extremely important to diagnose the system
properly.  If you have a fault in your storage device or RAID controller,
for example, you are asking for more corruption and data loss later.

At first I thought maybe you mistyped something and then realized there
were a few issues with the process so it actually didn't make sense.

First, zero_damaged_pages defaults to 0, and I can think of no reason to
set  it explicitly.
Secondly, a vacuum full has to reindex, so there is no reason to do a
reindex following.  Your whole procedure is limited to a vacuum full, when
a reindex is the only part that could affect this.   If it did work,
reindexing is the only part that would have been helpful.
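
As an aside, if it helps the investigation: the relation named in the error
can be looked up from its filenode, so a targeted REINDEX can be tried
instead of the whole database. A sketch, run inside the database whose OID
is 16421 (the "base/16421" part of the error); 0 means the default
tablespace:

SELECT pg_filenode_relation(0, 3192429)::regclass;

-- if that turns out to be an index, reindexing just that index may suffice:
-- REINDEX INDEX the_index_name_returned_above;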

On to the question of what to do next

>
> We're on Postgresql 12. This has worked before it happened (almost exactly
> a year ago) and I think this needs a more permanent solution. I've looked
> at routine vacuuming and checked the autovacuum is set to on and the
> following configurations:
>
> core=> select name, setting from pg_settings where name like 'autovacuum%';
> name |  setting
> -+---
>  autovacuum  | on
>  autovacuum_analyze_scale_factor | 0.1
>  autovacuum_analyze_threshold| 50
>  autovacuum_freeze_max_age   | 2
>  autovacuum_max_workers  | 3
>  autovacuum_multixact_freeze_max_age | 4
>  autovacuum_naptime  | 60
>  autovacuum_vacuum_cost_delay| 2
>  autovacuum_vacuum_cost_limit| -1
>  autovacuum_vacuum_scale_factor  | 0.2
>  autovacuum_vacuum_threshold | 50
>  autovacuum_work_mem | -1
> (12 rows)
>
> Can anyone advise if there's anything else we can do? We have no clue what
> causes the invalid page block and we are running a High Availability
> cluster set up but we are hoping that there may be a way to mitigate it.
>
>
You need to figure out why the corruption is happening.  This is most
likely, in my experience, not a PostgreSQL bug, but usually something that
happens on the hardware layer or an environmental factor.  It could be
failing storage or CPU.  Or it could be something like bad electrical input
or insufficient cooling (I have seen index and even table corruption issues
from both of these).

If this is a server you run, the first things I would check are:
1.  Is there a good-quality UPS that the server is plugged into?  Are the
batteries in good working order?
2.  Is the server somewhere that may be sitting in a pocket of hot air?

Once you have ruled these out, the next things to check are CPU, memory,
and storage health.  Unfortunately checking these is harder but you can
check SMART indications, and other diagnostic indicators.

However, once these errors start happening, you are in danger territory and
need to find out why (and correct the underlying problem) before you get
data loss.

-- 
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor
lock-in.
http://www.efficito.com/learn_more


NUMA, PostgreSQL and docker images

2023-11-08 Thread Chris Travers
Hi everyone,

Does anyone here know if the default PostgreSQL images set NUMA policies?
I am assuming not?  Is there an easy way to make them do this?

-- 
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor
lock-in.
http://www.efficito.com/learn_more


Re: Presentation tools used ?

2023-10-23 Thread Chris Travers
On Mon, Oct 23, 2023 at 8:30 AM Steve Litt 
wrote:

> Achilleas Mantzios said on Sun, 22 Oct 2023 08:50:10 +0300
>
> >Hello All
> >
> >I am going to give a talk about PostgerSQL, so I'd like to ask you
> >people what do you use for your presentations, also I have no idea how
> >the remote control works to navigate through slides. I have seen it,
> >but never came close to using one.
> >
> >I have access to google slides and libreoffice Impress.  What tools
> >would you suggest ? What's your setup ?
>
> I use presentations in my work, both given by myself and given by
> trainers. I can tell you Libreoffice Impress is absolute garbage. It
> intermittently loses style definitions. As far as google slides, I know
> nothing about them except I don't trust Google. Also, I'm not fan of
> Software as a Service (SaaS) for non-big-enterprise usage. I prefer to
> keep it all on my hard disk. That's where my PostgreSQL software
> resides.
>

One big problem on most of these also is that you have presentation and
content tied together.  So retheming a presentation is difficult or
impossible.

This is one area where Beamer (which I see you mentioned) really shines.  I
can change my presentations if someone wants them themed differently
separate from my content.

>
> Beamer (a LaTeX package) is the Cadillac of the industry, but only if
> you're willing to put in the work. I've done presentations in
> VimOutliner, but it's not "pretty" and so is only appropriate for
> certain audiences. I created Free Software called HTMLSlides, but it's
> not easy to use. I don't recommend it.
>

Also I have noticed a lot of folks in the community (myself included) use
Beamer mostly.

I love it.  It makes my life a LOT easier.

>
> If you don't want to use Beamer, my advice would be to research tools
> that convert Markdown to slides. Markdown is lightning quick to author
> in, very much unlike Beamer.
>
> Two other suggestions:
>
> 1) Please have mercy on your audience members with poor vision, and use
>black type on white background. Yeah, it's not "pretty" and it's not
>"hip", but you won't lose people who can't read purple on blue.
>Likewise, use large fonts so everyone can read. If you need small
>fonts to reveal all your info, you need to split the slide in two.
>

+1

I do sometimes make an exception for this when something is sponsored and
the sponsoring company requests it.

>
> 2) Don't read from your slides. If it's necessary to read the slide,
>what I do is tell the audience to read the slide, and then after
>they've read it I ask for questions and give them answers. But
>typically, my slides are an overview, and my verbal presentation is
>a dialog between myself and the audience.


Slides should be a mnemonic device for you as a speaker and for the
audience later, not a source of direct information except when you need a
visual exploration and then the images are helpful.



>
>
> HTH,
>
> SteveT
>
> Steve Litt
>
> Autumn 2023 featured book: Rapid Learning for the 21st Century
> http://www.troubleshooters.com/rl21
>
>
>

-- 
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor
lock-in.
http://www.efficito.com/learn_more


Re: Question About PostgreSQL Extensibility

2023-10-17 Thread Chris Travers
On Mon, Oct 16, 2023 at 10:59 PM Laurenz Albe 
wrote:

> On Fri, 2023-10-13 at 13:55 +, felix.quin...@yahoo.com wrote:
> > For the same reason that you can use python or perl in postgresql. It's
> just another language.
> > I have .net code running on several sql servers and to change the
> database to postgresql I have to reprogram them.
>
> Yes, you'll have to rewrite them.
>

Or pick up the pldotnet handler and patch it to work on the new version.
My experience forward porting such things is that it is usually trivial.

Of course there may be other reasons to rewrite but it really depends on a
lot of factors.


>
> Yours,
> Laurenz Albe
>
>
>

-- 
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor
lock-in.
http://www.efficito.com/learn_more


Re: Question About PostgreSQL Extensibility

2023-10-12 Thread Chris Travers
On Thu, Oct 12, 2023 at 10:09 PM Ilya Kosmodemiansky 
wrote:

> Hi Sepideh,
>
> > From: Sepideh Eidi 
> > Date: Thu, Oct 12, 2023 at 2:35 PM
> > Subject: Question About PostgreSQL Extensibility
> > To: pgsql-general@lists.postgresql.org <
> pgsql-general@lists.postgresql.org>
>
> > We have some .net assemblies and in your documents, I didn’t find any
> support for this type of files that is executable in DB or not.
>
> PostgreSQL doesn't support .net assemblies like SQL Server does. In
> Postgres you can certainly write stored procedures in different
> languages, for example in C, but I am not sure if it would solve your
> problem
>

Also if you are self-hosting, you can write and add your own language
handlers.  So if you need to support .Net assemblies, this is at least in
theory possible.

Note that there are significant reasons to consider rewriting in other
languages, however.  The large one is that .Net prefers a very different
threading model than Postgres and so that's something that has to be solved
(though the Pl/Java folks have solved that problem for Java).


>
> v. G.,
> Ilya
>
>
>
>
>
> best regards,
> Ilya Kosmodemiansky,
> CEO, Data Egret GmbH
> Herrenstr. 1 A 2,
> Spiesen-Elversberg, Germany
>
>
>

-- 
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor
lock-in.
http://www.efficito.com/learn_more


Re: Ubuntu 18 + PHP 8.2 + PDO: can't find drivers

2023-09-24 Thread Chris Kelly
 Suddenly everything becomes clear: I'm running an OS that reached EOL 3 months 
ago. I'll upgrade to Ubuntu 20 if I need to get it working on this computer. 
Thanks.

 On Saturday, September 23, 2023 at 02:15:19 PM PDT, 
 wrote:  
 
 Am 23.09.23 um 20:56 schrieb Chris Kelly:
>  On Ubuntu "18.04.6 LTS (Bionic Beaver)":
>
> # apt-cache search pgsql | grep 8
> pike7.8-pg - PostgreSQL modules for Pike
> pike8.0-pg - PostgreSQL modules for Pike
>
>
> # apt-cache search pgsql | grep php
> php-pgsql - PostgreSQL module for PHP [default]
> php7.2-pgsql - PostgreSQL module for PHP
> php-db - Database Abstraction Layer
> php-mdb2-driver-pgsql - pgsql MDB2 driver
>
>
> This is for an existing site with an existing db dump and an existing 
> codebase. I can't use PHP7 due to composer requirements.
>
> I tried to add the ondrej repo but got "The following signatures couldn't be 
> verified because the public key is not available: NO_PUBKEY 467B942D3A79BD29".
>
>
>      On Friday, September 22, 2023 at 02:38:24 PM PDT, Ray O'Donnell 
> wrote:
>  
>  
> On 22 September 2023 21:40:38 Chris Kelly  wrote:
>
> The need has mostly passed (I used another computer with Ubuntu 20) but are 
> there clear, *working* instructions for how to connect to Postgres on Ubuntu 
> 18 via PDO? I don't see drivers that would work. This is for a Drupal site.
>
>  From memory (not at my laptop now) you need to install the php8.2-pgsql 
>package (or maybe it's php8.2-pdo-pgsql). It should just work then... Drupal 
>will notice the driver and offer Postgresql as an option on installation. 
>Don't forget to restart Apache after installing that package
> Ray.
read this
https://stackoverflow.com/questions/76864353/php8-2-in-ubuntu-18-04-e-unable-to-locate-package-php8-2
i would say it's time to update/upgrade or maybe build php from sources


  

Re: Ubuntu 18 + PHP 8.2 + PDO: can't find drivers

2023-09-23 Thread Chris Kelly
 On Ubuntu "18.04.6 LTS (Bionic Beaver)":

# apt-cache search pgsql | grep 8
pike7.8-pg - PostgreSQL modules for Pike
pike8.0-pg - PostgreSQL modules for Pike


# apt-cache search pgsql | grep php
php-pgsql - PostgreSQL module for PHP [default]
php7.2-pgsql - PostgreSQL module for PHP
php-db - Database Abstraction Layer
php-mdb2-driver-pgsql - pgsql MDB2 driver


This is for an existing site with an existing db dump and an existing codebase. 
I can't use PHP7 due to composer requirements.

I tried to add the ondrej repo but got "The following signatures couldn't be 
verified because the public key is not available: NO_PUBKEY 467B942D3A79BD29".


 On Friday, September 22, 2023 at 02:38:24 PM PDT, Ray O'Donnell 
 wrote:  
 
 
On 22 September 2023 21:40:38 Chris Kelly  wrote:

The need has mostly passed (I used another computer with Ubuntu 20) but are 
there clear, *working* instructions for how to connect to Postgres on Ubuntu 
18 via PDO? I don't see drivers that would work. This is for a Drupal site.

>From memory (not at my laptop now) you need to install the php8.2-pgsql 
>package (or maybe it's php8.2-pdo-pgsql). It should just work then... Drupal 
>will notice the driver and offer Postgresql as an option on installation. 
>Don't forget to restart Apache after installing that package 
Ray.

  

Ubuntu 18 + PHP 8.2 + PDO: can't find drivers

2023-09-22 Thread Chris Kelly
The need has mostly passed (I used another computer with Ubuntu 20) but are 
there clear, *working* instructions for how to connect to Postgres on Ubuntu 
18 via PDO? I don't see drivers that would work. This is for a Drupal site.




Re: Postgres partition max limit

2023-09-06 Thread Chris Travers
On Wed, Sep 6, 2023 at 5:27 PM Daulat  wrote:

> Hi Team,
>
> Do we have a max limit of partitions for a table in postgres?
>

As far as I can tell, there is no real hard limit to that per se.  After
all, you will eventually run out of OIDs for pg_attribute but that would
affect the number of columns in the database.

However I can say that based on past testing you probably don't want
thousands of partitions in your database.  At least last time I tried,
planning time would eventually become unreasonably high though I assume
things have improved somewhat since then (and to be fair, that system was
also in a pretty unusual configuration that may have made things worse).

Since that is a gradual process the specific limits for your application
may depend on your latency requirements for the queries in your
application.   I would recommend testing your application by creating all
the partitions you expect even if they are empty, and seeing how long
EXPLAIN takes to run.  If that's good enough, then go for it.  If it is too
long then you could pre-calculate what partition to hit or you could use
fewer partitions.
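
As a sketch of that kind of test (names and partition count are made up),
something like this gives a feel for planning time against many empty
partitions:

-- a throwaway hash-partitioned table with 1000 empty partitions
create table part_test (id bigint, payload text) partition by hash (id);

do $$
begin
    for i in 0..999 loop
        execute format(
            'create table part_test_%s partition of part_test
                 for values with (modulus 1000, remainder %s)', i, i);
    end loop;
end
$$;

-- then time planning for a representative query (e.g. with \timing in psql)
explain select * from part_test where id = 42;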

>
> Thanks.
>
>


-- 
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor
lock-in.
http://www.efficito.com/learn_more


Re: Backup Copy of a Production server.

2023-08-07 Thread Chris Travers
On Mon, Aug 7, 2023 at 7:00 PM KK CHN  wrote:

>
>
> On Mon, Aug 7, 2023 at 10:49 AM Ron  wrote:
>
>> On 8/7/23 00:02, KK CHN wrote:
>>
>> List,
>>
>> I am in need to copy a production PostgreSQL server  data( 1 TB)  to  an
>> external storage( Say USB Hard Drive) and need to set up a backup server
>> with this data dir.
>>
>> What is the trivial method to achieve this ??
>>
>>
pg_basebackup backs up the data dir.

Alternatively you could use another tool like pgbackrest which also has
some nice features regarding incremental backups, wal management etc and
parallel backup and restore.
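
For the pg_basebackup route, a rough sketch (host and user are placeholders;
this writes compressed tar archives plus streamed WAL to the USB mount
mentioned above):

pg_basebackup -h <primary-host> -U <replication-user> \
    -D /media/mydb_backup -Ft -z -X stream -P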

>
>>
>> 1. Is Sqldump an option at a production server ?? (  Will this affect the
>> server performance  and possible slowdown of the production server ? This
>> server has a high IOPS). This much size 1.2 TB will the Sqldump support ?
>> Any bottlenecks ?
>>
>>
A sql dump can be had with pg_dumpall, but this is a different backup
strategy.  I would recommend in that case pg_dumpall -g to dump only
globals (roles and tablespaces) and then pg_dump on the databases
individually, with the format set to custom (if parallelism is not
required) or directory (if it is; parallel pg_dump requires the directory
format).  See the pg_dump man page for details.
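
A sketch of that combination (database name, job count and output paths are
placeholders):

pg_dumpall -g > /media/mydb_backup/globals.sql
pg_dump -Fd -j 4 -f /media/mydb_backup/mydb.dump mydb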

The file size will depend on file format selected etc.  My naive guess for
custom format would be maybe 200-400GB.  For tar format probably more
(double or more) but total size depends on many factors and cannot be
reliably estimated.  In rare cases, it could even be larger than your data
directory.

>
>> Whether or not there will be bottlenecks depends on how busy (CPU and
>> disk load) the current server is.
>>
>
pg_basebackup is limited  by the fact that it is single threaded on both
sides (aside from wal), and this also limits disk I/O as well as network
throughput (if you have a long fat pipe).

pg_dump is also limited by having to interpret and serialize the output,
and also, if you have large text or binary fields, having to retrieve these
one at a time.  Additionally you could have lock contention.

>
>>
>> 2. Is copying the data directory from the production server to an
>> external storage and replace the data dir  at a  backup server with same
>> postgres version and replace it's data directory with this data dir copy is
>> a viable option ?
>>
>>
There is a lot of complexity to doing that right.  If you want to do that,
look at using pgbackrest.

>
>>
>>
>> # cp  -r   ./data  /media/mydb_backup  ( Does this affect the Production
>> database server performance ??)   due to the copy command overhead ?
>>
>>
>> OR  doing a WAL Replication Configuration to a standby is the right
>> method to achieve this ??
>>
>>
That is often also used, but you need to define what you want out of a
backup.  A standby will protect you from hardware failure for the most
part.  It will not, without a lot of other thought and configuration,
protect you from an administrator accidentally dropping an important table or
database.   WAL archiving and backups can help there though (and hence my
recommendation for pgbackrest, which can also restore the data directory
and/or wals to a standby).

>
>>
>> But you say you can't establish a network connection outside the DC.  ( I
>> can't do for a remote machine .. But I can do  a WAL replication to another
>> host in the same network inside the DC. So that If I  do a sqldump  or Copy
>> of Data dir of the standby server it won't affect the production server, is
>> this sounds good  ?  )
>>
>
With a good backup archive accessible from only the places it needs to be
accessed, this problem goes away.

>
>>
>>
>>  This is to take out the database backup outside the Datacenter and our
>> DC policy won't allow us to establish a network connection outside the DC
>> to a remote location for WAL replication .
>>
>>
>> If you're unsure of what Linux distro & version and Postgresql version
>> that you'll be restoring the database to, then the solution is:
>> DB=the_database_you_want_to_backup
>> THREADS=
>> cd $PGDATA
>> cp -v pg_hba.conf postgresql.conf /media/mydb_backup
>> cd /media/mydb_backup
>> pg_dumpall --globals-only > globals.sql
>>
>
> What is the relevance of  globals-only and  what this will do  ${DB}.log
> // or is it  ${DB}.sql  ?
>
> pg_dump --format=d --verbose --jobs=$THREADS $DB &> ${DB}.log  // .log
>> couldn't get an idea what it mean
>>
>> If you're 100% positive that the system you might someday restore to is
>> *exactly* the same distro & version, and Postgresql major version, then
>> I'd use PgBackRest.
>>
>> --
>> Born in Arizona, moved to Babylonia.
>>
>

-- 
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor
lock-in.
http://www.efficito.com/learn_more


Re: Reset Postgresql users password

2023-07-17 Thread Chris Travers
You can use a DO block or write a function to do this.

It takes some practice (and you need to use EXECUTE FORMAT())
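
For the original question (the same password for every login role at once),
a minimal sketch of the DO-block approach; the password literal is a
placeholder, and the WHERE clause should be adjusted to the roles you
actually want to touch:

DO $$
DECLARE
    r record;
BEGIN
    FOR r IN SELECT rolname FROM pg_roles WHERE rolcanlogin LOOP
        EXECUTE format('ALTER ROLE %I WITH PASSWORD %L',
                       r.rolname, 'the-new-password');
    END LOOP;
END
$$;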

If users need to be able to change their own passwords, something like this
works:

CREATE FUNCTION change_my_password(in_password text)
returns void language plpgsql as
$$
begin
   EXECUTE FORMAT($F$ALTER USER %I WITH PASSWORD %L$F$, session_user,
in_password);
end;
$$ SECURITY DEFINER;

On Mon, Jul 17, 2023 at 9:28 AM Ron  wrote:

> On 7/12/23 14:28, Johnathan Tiamoh wrote:
> > Hello,
> >
> > I wish to find out if there is a way to reset all users in Postgresql
> > password to the same password at once.
>
> To the same value??
>
> --
> Born in Arizona, moved to Babylonia.
>
>
>

-- 
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor
lock-in.
http://www.efficito.com/learn_more


Re: Interconnected views

2023-06-02 Thread Chris Travers
On Fri, Jun 2, 2023, 09:36 Oliver Kohll  wrote:

> Hi,
>
> Just wondering, does anyone else create apps which might not have 'big'
> data, but quite complex arrangements of views joining to each other?
>
> If so, do you have scripts to aid refactoring them e.g. drop/recreate/test
> them in the right order etc.?
>

In LedgerSMB, we did this, and the same with functions.  We used versioned
scripts, database schema change management tools, and reload utilities as
well as pgTAP for testing.

>
> I'm really impressed with the way Postgres handles these multi-layered
> views. Some of our explain analyze outputs could fill a book each! Tools
> like Depesz' can be very useful. Sometimes a little tweaking or a judicious
> index is necessary, but the performance is very reliable and scalable.
>
> Blog post about it here:
> https://blog.agilebase.co.uk/2023/05/21/refactoring-sql-views/
>

Yeah, I have always been impressed by Postgres here too.

>
> Oliver
>
> --
> See us at the Bath Digital Festival , 12th July
>


Blog post series on commitfests and patches

2023-03-08 Thread Chris Travers
Hi all;

I have been writing a few blog posts trying to shed some light on the
development process of PostgreSQL , what's coming and what I hope we see
more of.

I would be very much interested in feedback as to whether people
(particularly non-Postgres contributors) find this useful or not.

The latest entry is at:
https://www.timescale.com/blog/a-postgresql-developers-perspective-five-interesting-patches-from-januarys-commitfest/


-- 
Best Wishes,
Chris Travers


pg_trgm vs. Solr ngram

2023-02-09 Thread Chris

Hello list

I'm pondering migrating an FTS application from Solr to Postgres, just 
because we use Postgres for everything else.


The application is basically fgrep with a web frontend. However the
indexed documents are very computer-network specific and contain a lot
of hyphenated hostnames with dot-separated domains, as well as IPv4 and
IPv6 addresses. In Solr I was using ngrams and customized the
TokenizerFactories until more or less only whitespace acted as a separator,
while [.:-_\d] remained part of the ngrams. This allows searching for
".12.255/32" or "xzy-eth5.example.org" without any false positives.


It looks like a straight conversion of this method is not possible since 
the tokenization in pg_trgm is not configurable afaict. Is there some 
other good method to search for a random substring including all the 
punctuation using an index? Or a pg_trgm-style module that is more 
flexible like the Solr/Lucene variant?


Or maybe hacking my own pg_trgm wouldn't be so hard and could be fun, do 
I pretty much just need to change the emitted tokens or will this lead 
to significant complications in the operators, indexes etc.?


thanks for any hints & cheers
Christian




system variable can be edited by all user?

2022-11-22 Thread chris navarroza
Hi,

I've created a read-only user (SELECT privilege only), but it turns out that
this user can run these queries: SHOW work_mem; SET work_mem='40MB'; How do
I limit him?

Thanks,

Butching


Re: PostgreSql Service different path

2022-10-27 Thread chris navarroza
I fixed it by editing the postgresql-14.service PGDATA path to the new
directory
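
For anyone else hitting this, a sketch of the drop-in override approach (so
the packaged unit file itself stays untouched); this assumes the stock PGDG
unit, which takes the data directory from its PGDATA environment variable:

# create an override for the unit:
sudo systemctl edit postgresql-14.service

# with these contents:
[Service]
Environment=PGDATA=/home/dmartuser/pgsql/14/data

# then reload and restart:
sudo systemctl daemon-reload
sudo systemctl restart postgresql-14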

Thanks,

butching

On Mon, Oct 24, 2022 at 7:25 PM chris navarroza 
wrote:

> But I'm using a different path when I initdb /usr/pgsql-14/bin/initdb -D 
> */home/dmartuser/pgsql/14/data
> *so  "/var/lib/pgsql/14/data/" is really empty. Is there a way to point
> the startup script to the new path  */home/dmartuser/pgsql/14/data*  ?
>
> Thanks,
>
> Chris Albert Navarroza
> Information Technology Officer I
> CTCO - ITDS - RDMD
>
>
> On Mon, Oct 24, 2022 at 6:57 PM Jeffrey Walton  wrote:
>
>> On Mon, Oct 24, 2022 at 6:38 AM chris navarroza
>>  wrote:
>> >
>> > I install postgresql14.5 with the following commands
>> >
>> > sudo yum install postgresql14-server postgresql14-contrib
>> >
>> >
>> > sudo su postgres
>> >
>> > cd /tmp
>> >
>> > /usr/pgsql-14/bin/initdb -D /home/dmartuser/pgsql/14/data
>> >
>> > and update the postgresql.conf to the new path, now when I start the
>> service, it has an error and when I check it says
>> >
>> > -- Unit postgresql-14.service has begun starting up.
>> > Oct 24 15:16:45 datamartds postgresql-14-check-db-dir[132338]:
>> "/var/lib/pgsql/14/data/" is missing or empty.
>> > Oct 24 15:16:45 datamartds postgresql-14-check-db-dir[132338]: Use
>> "/usr/pgsql-14/bin/postgresql-14-setup initdb" to initialize the database
>> cluster.
>> > Oct 24 15:16:45 datamartds postgresql-14-check-db-dir[132338]: See
>> /usr/share/doc/postgresql14/README.rpm-dist for more information.
>> > Oct 24 15:16:45 datamartds systemd[1]: postgresql-14.service: Control
>> process exited, code=exited status=1
>> > Oct 24 15:16:45 datamartds systemd[1]: postgresql-14.service: Failed
>> with result 'exit-code'.
>> >
>> > How can I point the service to read the new path (
>> /home/dmartuser/pgsql/14/data )?
>>
>> The startup script that is calling systemd should perform a `mkdir -p
>> /var/lib/pgsql/14/data` before calling the PostgreSQL binary.
>>
>> Jeff
>>
>


Re: PostgreSql Service different path

2022-10-25 Thread chris navarroza
But I'm using a different path when I initdb /usr/pgsql-14/bin/initdb
-D */home/dmartuser/pgsql/14/data
*so  "/var/lib/pgsql/14/data/" is really empty. Is there a way to point the
startup script to the new path  */home/dmartuser/pgsql/14/data*  ?

Thanks,

Chris Albert Navarroza
Information Technology Officer I
CTCO - ITDS - RDMD


On Mon, Oct 24, 2022 at 6:57 PM Jeffrey Walton  wrote:

> On Mon, Oct 24, 2022 at 6:38 AM chris navarroza
>  wrote:
> >
> > I install postgresql14.5 with the following commands
> >
> > sudo yum install postgresql14-server postgresql14-contrib
> >
> >
> > sudo su postgres
> >
> > cd /tmp
> >
> > /usr/pgsql-14/bin/initdb -D /home/dmartuser/pgsql/14/data
> >
> > and update the postgresql.conf to the new path, now when I start the
> service, it has an error and when I check it says
> >
> > -- Unit postgresql-14.service has begun starting up.
> > Oct 24 15:16:45 datamartds postgresql-14-check-db-dir[132338]:
> "/var/lib/pgsql/14/data/" is missing or empty.
> > Oct 24 15:16:45 datamartds postgresql-14-check-db-dir[132338]: Use
> "/usr/pgsql-14/bin/postgresql-14-setup initdb" to initialize the database
> cluster.
> > Oct 24 15:16:45 datamartds postgresql-14-check-db-dir[132338]: See
> /usr/share/doc/postgresql14/README.rpm-dist for more information.
> > Oct 24 15:16:45 datamartds systemd[1]: postgresql-14.service: Control
> process exited, code=exited status=1
> > Oct 24 15:16:45 datamartds systemd[1]: postgresql-14.service: Failed
> with result 'exit-code'.
> >
> > How can I point the service to read the new path (
> /home/dmartuser/pgsql/14/data )?
>
> The startup script that is calling systemd should perform a `mkdir -p
> /var/lib/pgsql/14/data` before calling the PostgreSQL binary.
>
> Jeff
>


PostgreSql Service different path

2022-10-24 Thread chris navarroza
Hi,

I install postgresql14.5 with the following commands


sudo yum install postgresql14-server postgresql14-contrib


sudo su postgres

cd /tmp

/usr/pgsql-14/bin/initdb -D /home/dmartuser/pgsql/14/data
and update the postgresql.conf to the new path, now when I start the
service, it has an error and when I check it says

-- Unit postgresql-14.service has begun starting up.
Oct 24 15:16:45 datamartds postgresql-14-check-db-dir[132338]:
*"/var/lib/pgsql/14/data/"
is missing or empty.*
Oct 24 15:16:45 datamartds postgresql-14-check-db-dir[132338]: Use
"/usr/pgsql-14/bin/postgresql-14-setup initdb" to initialize the database
cluster.
Oct 24 15:16:45 datamartds postgresql-14-check-db-dir[132338]: See
/usr/share/doc/postgresql14/README.rpm-dist for more information.
Oct 24 15:16:45 datamartds systemd[1]: postgresql-14.service: Control
process exited, code=exited status=1
Oct 24 15:16:45 datamartds systemd[1]: postgresql-14.service: Failed with
result 'exit-code'.

How can I point the service to read the new path (
/home/dmartuser/pgsql/14/data )?

Thanks,

Chris Albert Navarroza
Information Technology Officer I
CTCO - ITDS - RDMD


Re: Upgrading error

2022-10-24 Thread chris navarroza
This is the log I got.
-
  pg_upgrade run on Mon Oct 24 07:41:13 2022
-

command: "/usr/pgsql-12/bin/pg_ctl" -w -l "pg_upgrade_server.log" -D
"/home/dmartuser/pgdata/data" -o "-p 50432 -b -c
config_file=/var/lib/pgsql/12/data/postgresql.conf -c listen_addresses=''
-c unix_socket_permissions=0700 -c unix_socket_directories='/tmp'" start >>
"pg_upgrade_server.log" 2>&1
waiting for server to start2022-10-23 23:41:13.458 GMT [107444] LOG:
 skipping missing configuration file
"/home/dmartuser/pgdata/data/postgresql.auto.conf"
2022-10-24 07:41:13.459 PST [107444] LOG:  starting PostgreSQL 12.9 on
x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat
8.5.0-4), 64-bit
2022-10-24 07:41:13.460 PST [107444] LOG:  listening on Unix socket
"/tmp/.s.PGSQL.50432"
2022-10-24 07:41:13.486 PST [107444] LOG:  redirecting log output to
logging collector process
2022-10-24 07:41:13.486 PST [107444] HINT:  Future log output will appear
in directory "log".
 stopped waiting
pg_ctl: could not start server
Examine the log output.

Thanks,

Chris Albert Navarroza
Information Technology Officer I
CTCO - ITDS - RDMD


On Mon, Oct 24, 2022 at 8:12 AM Tom Lane  wrote:

> chris navarroza  writes:
> > could not connect to source postmaster started with the command:
> > "/usr/pgsql-12/bin/pg_ctl" -w -l "pg_upgrade_server.log" -D
> > "/home/dmartuser/pgdata/data" -o "-p 50432 -b -c
> > config_file=/var/lib/pgsql/12/data/postgresql.conf -c listen_addresses=''
> > -c unix_socket_permissions=0700 -c unix_socket_directories='/tmp'" start
> > Failure, exiting
>
> Did you look into the postmaster log (pg_upgrade_server.log, here)
> to see what it thought about the problem?  I'm guessing there's
> something that kept it from starting, but there's no evidence
> offered here about what.
>
> regards, tom lane
>


Re: Upgrading error

2022-10-24 Thread chris navarroza
Hi,

Thanks for the reply!

Sorry, I missed that line, but now I'm having a new error.

[root@datamartds tmp]# sudo su postgres
bash-4.4$ cd /tmp
bash-4.4$ /usr/pgsql-14/bin/pg_upgrade
--old-datadir=/home/dmartuser/pgdata/data/
--new-datadir=/home/dmartuser/pgsql/14/data/
--old-bindir=/usr/pgsql-12/bin/ --new-bindir=/usr/pgsql-14/bin/
--old-options '-c config_file=/var/lib/pgsql/12/data/postgresql.conf'
--new-options '-c config_file=/var/lib/pgsql/14/data/postgresql.conf'
--check
Performing Consistency Checks
-
Checking cluster versions   ok

*failure*
Consult the last few lines of "pg_upgrade_server.log" for
the probable cause of the failure.

connection to server on socket "/tmp/.s.PGSQL.50432" failed: No such file
or directory
Is the server running locally and accepting connections on that
socket?

could not connect to source postmaster started with the command:
"/usr/pgsql-12/bin/pg_ctl" -w -l "pg_upgrade_server.log" -D
"/home/dmartuser/pgdata/data" -o "-p 50432 -b -c
config_file=/var/lib/pgsql/12/data/postgresql.conf -c listen_addresses=''
-c unix_socket_permissions=0700 -c unix_socket_directories='/tmp'" start
Failure, exiting



Thanks,

Chris Albert Navarroza
Information Technology Officer I
CTCO - ITDS - RDMD


On Fri, Oct 21, 2022 at 10:08 PM Tom Lane  wrote:

> chris navarroza  writes:
> > bash-4.4$ /usr/pgsql-14/bin/pg_upgrade
> > --old-datadir=/home/dmartuser/pgdata/data/
> > --new-datadir=/home/dmartuser/pgdata/data/
> --old-bindir=/usr/pgsql-12/bin/
> > --new-bindir=/usr/pgsql-14/bin/ --check --verbose
> > Running in verbose mode
> > Performing Consistency Checks
> > -
> > Checking cluster versions
> > This utility can only upgrade to PostgreSQL version 14.
>
> You cannot use the same directory as --old-datadir and --new-datadir.
> I think pg_upgrade is unhappy because what it sees in PG_VERSION in
> the --new-datadir isn't v14 ... but I wonder if we shouldn't be
> checking for this mistake more directly.
>
> Anyway, the right way to proceed is to use v14's initdb to create
> a new directory at, say, /home/dmartuser/pgdata/data-14
> and then pg_upgrade to that.  Afterwards you can rename the old
> data directory out of the way and move the new one to
> /home/dmartuser/pgdata/data, if you like.
>
> regards, tom lane
>


Upgrading error

2022-10-21 Thread chris navarroza
Hi,

I have a postgresql server version 12.9 and trying to upgrade it to 14.5.
My OS is Centos 8 Stream and I already managed to install postgresql14.5
but when I'm trying to upgrade, I am encountering this error

bash-4.4$ /usr/pgsql-14/bin/pg_upgrade
--old-datadir=/home/dmartuser/pgdata/data/
--new-datadir=/home/dmartuser/pgdata/data/ --old-bindir=/usr/pgsql-12/bin/
--new-bindir=/usr/pgsql-14/bin/ --check --verbose
Running in verbose mode
Performing Consistency Checks
-
Checking cluster versions
This utility can only upgrade to PostgreSQL version 14.
Failure, exiting

I can't seem to find any solution in the internet. Please help!

Sincerely,
butching


Re: LwLocks contention

2022-04-21 Thread Chris Bisnett
> We are occasionally seeing heavy CPU contention with hundreds of processes 
> active but waiting on a lightweight lock - usually lock manager or buffer 
> mapping it seems. This is happening with VMs configured with about 64 CPUs, 
> 350GBs ram, and while we would typically only have 30-100 concurrent 
> processes, there will suddenly be ~300 and many show active with LwLock and 
> they take much longer than usual. Any suggested options to monitor for such 
> issues or logging to setup so the next issue can be debugged properly?
>
> It has seemed to me that this occurs when there are more than the usual 
> number of a particular process type and also something that is a bit heavy in 
> usage of memory/disk. It has happened on various tenant instances and 
> different application processes as well.
>
> Would/how might the use of huge pages (or transparent huge pages, or OFF) 
> play into this scenario?

I've also been contending with a good bit of lightweight lock
contention that causes performance issues. Most often we see this with
the WAL write lock, but when we get too many parallel queries running
we end up in a "thundering herd" type of issue were the contention for
the lock manager lock consumes significant CPU resources causing the
number of parallel queries to increase as more clients back up behind
the lock contention leading to even more lock contention. When this
happens we have to pause our background workers long enough to allow
the lock contention to reduce and then we can resume our background
workers. When we hit the lock contention it's not a gradual
degradation; it goes immediately from nothing to more than 100% CPU
usage. The same is true when reducing the lock contention - it goes
from 100% to nothing.

I've been working under the assumption that this has to do with our
native partitioning scheme and the fact that some queries cannot take
advantage of partition pruning because they don't contain the
partition column. My understanding is that when this happens ACCESS
SHARE locks have to be taken on all partitions as well as all associated
resources (indexes, sequences, etc.) and the act of taking and
releasing all of those locks will increase the lock contention
significantly. We're working to update our application so that we can
take advantage of the pruning. Are you also using native partitioning?
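
On the monitoring question above, a simple sketch that helps spot these
episodes is to sample wait events periodically (the columns have been in
pg_stat_activity since 9.6):

SELECT wait_event_type, wait_event, state, count(*)
  FROM pg_stat_activity
 WHERE wait_event IS NOT NULL
 GROUP BY 1, 2, 3
 ORDER BY count(*) DESC;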

- Chris




Re: Per-Table vacuum_freeze_min_age

2022-04-06 Thread Chris Bisnett
On Wed, Apr 6, 2022 at 6:31 PM Adrian Klaver 
wrote:

> On 4/6/22 3:28 PM, Chris Bisnett wrote:
> > On Wed, Apr 6, 2022 at 6:24 PM Adrian Klaver  wrote:
>
> >
> > It can:
> >
> >
> https://www.postgresql.org/docs/current/sql-createtable.html#SQL-CREATETABLE-STORAGE-PARAMETERS
> >
> > Per-table value for vacuum_freeze_min_age parameter.
> >  >
> >  > - chris
> >  >
> >
> >
> > --
> > Adrian Klaver
> > adrian.kla...@aklaver.com
> >
> >
> > Oh I must have missed this. Is this missing from the documentation here
> > (https://www.postgresql.org/docs/current/runtime-config-autovacuum.html)?
>
> That describes autovacuum in general. The per table options are
> attributes of a table.
>
> >
> > I can try again, but I’m pretty sure this option was rejected when I
> > attempted to set it via an alter table command.
>
> What was the command you used?
>
> What was the error you got?
>
> >
> >
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com


Either I was trying to use vacuum_freeze_min_age on the table or I forgot
the parentheses around the options in the alter table command. Either way,
I just tried it and it worked. Thanks for the help!
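
For the archives, the form that ended up working is the per-table storage
parameter (the table name here is illustrative):

ALTER TABLE big_append_only_table
    SET (autovacuum_freeze_min_age = 500000);

-- verify the setting afterwards:
SELECT relname, reloptions
  FROM pg_class
 WHERE relname = 'big_append_only_table';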

- chris


Re: Per-Table vacuum_freeze_min_age

2022-04-06 Thread Chris Bisnett
On Wed, Apr 6, 2022 at 6:24 PM Adrian Klaver 
wrote:

> On 4/6/22 3:13 PM, Chris Bisnett wrote:
> > Hi all!
> >
> > I have several large tables (1-2Tb) that are 99.9% writes (small number
> > of updates) with a decent commit rate (20K/sec). The basic idea is that
> > it’s generating a lot of data continuously. When the table would reach
> > the thresholds for autovacuum a vacuum would start and would start
> > generating wal write lock wait events. Once I set the freeze age to
> > 500,000 (default is 50,000,000) the vacuums have to touch many fewer
> > pages and is significantly faster without causing any write lock wait
> > events.
> >
> > The only downside I’ve seen is that this is a global setting and my
> > understanding is that this would cause decreased performance when used
> > with tables with a lot of writes and deletes. Is there a technical
> > reason this setting cannot be applied at the database or table context
> > like other autovacuum settings?
>
> It can:
>
>
> https://www.postgresql.org/docs/current/sql-createtable.html#SQL-CREATETABLE-STORAGE-PARAMETERS
>
> Per-table value for vacuum_freeze_min_age parameter.
> >
> > - chris
> >
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com


> Oh I must have missed this. Is this missing from the documentation here (
https://www.postgresql.org/docs/current/runtime-config-autovacuum.html)?

I can try again, but I’m pretty sure this option was rejected when I
attempted to set it via an alter table command.


Per-Table vacuum_freeze_min_age

2022-04-06 Thread Chris Bisnett
Hi all!

I have several large tables (1-2Tb) that are 99.9% writes (small number of
updates) with a decent commit rate (20K/sec). The basic idea is that it’s
generating a lot of data continuously. When the table would reach the
thresholds for autovacuum a vacuum would start and would start generating
wal write lock wait events. Once I set the freeze age to 500,000 (default
is 50,000,000) the vacuums have to touch many fewer pages and is
significantly faster without causing any write lock wait events.

The only downside I’ve seen is that this is a global setting and my
understanding is that this would cause decreased performance when used with
tables with a lot of writes and deletes. Is there a technical reason this
setting cannot be applied at the database or table context like other
autovacuum settings?

- chris


Re: WAL Archiving and base backup

2022-01-19 Thread Chris Travers
On Fri, Jan 14, 2022 at 8:27 PM Ron  wrote:

> On 1/14/22 12:31 PM, Stephen Frost wrote:
> > Greetings,
> >
> > * Issa Gorissen (issa-goris...@usa.net) wrote:
> >> Thx a lot. I thought about it but was not so sure about having a complex
> >> script (compared to the very simple version when using the exclusive
> backup
> >> - but this this is deprecated...).
> >>
> >> I will test your option with the simpler version and post it back to it
> can
> >> maybe land in PostgreSQL documentation.
> > The PG docs show how the command works and that's it.  The commands
> > in the docs aren't intended to be actually used in production
> > environments.  Writing a full solution involves having a good
> > understanding of the PG code and how WAL archiving, backups, et al, are
> > done.  If you're not familiar with this portion of the PG code base, I'd
> > strongly suggest you look at using solutions written and maintained by
> > folks who are.
>
> Needing to read the PG source code to write a workable PITR recovery
> solution is a serious flaw in PG documentation (and why I use PgBackRest).
>
> The documentation of two other RDBMSs that I've worked with (Rdb/VMS and
> SQL
> Server) are perfectly clear on how to do such backups and restores with
> relatively small amounts of scripting.
>

So when I was writing my own backup solutions many years ago, I didn't
generally read the code to do that.  I think the problem is that there is a
lot of stuff that goes on around the backup and recovery process, and to
make it safe you need to understand all the other things going on.

I can remember at least one case from those years ago when a needed backup
suddenly wasn't PITR-restorable when I needed it to be and that took some
urgent troubleshooting.  I got it resolved but I also understand why those
building such tools read the code and, more importantly, understand
implications of design choices in that context.

Backups are critical pieces of infrastructure and one wants to make sure
that weird corner cases don't suddenly render your backup useless when your
production system dies.  And while I do think the docs could be improved, I
agree they will probably never be good enough for people to just roll their
own solutions.


>
> > Trying to write documentation on how to develop a complete solution
> > would be quite an effort and would certainly go beyond bash scripting
> > and likely wouldn't end up getting used anyway- those who are developing
> > such solutions are already reading through the actual code.

>
> > Thanks,
> >
> > Stephen
>
> --
> Angular momentum makes the world go 'round.
>
>
>

-- 
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor
lock-in.
http://www.efficito.com/learn_more


Re: surprisingly slow creation of gist index used in exclude constraint

2021-12-22 Thread Chris Withers

  
  
Hi,

A year and a half later, now that version 14 is the latest available, I
wonder if anything has changed with respect to gist index creation?

Probably also worth asking: are there now different index types this
application should be using?

cheers,
Chris

On 14/05/2020 21:11, Chris Withers wrote:

Hi,

I'm upgrading a database from 9.4 to 11.5 by dumping from the old cluster
and loading into the new cluster.

The database is tiny: around 2.3G, but importing this table is proving
problematic:
  Column  |   Type|Modifiers
+---+--
 period | tsrange   | not null
 col1   | character varying | not null
 col2   | character varying | not null
 col3   | integer   | not null
 col4   | character varying | not null default ''::character varying
 id | integer   | not null default nextval('mkt_profile_id_seq'::regclass)
 deleted| boolean   | not null default false
 managed| boolean   | not null default false
 col5   | character varying |
Indexes:
"mkt_profile_pkey" PRIMARY KEY, btree (id)
"mkt_profile_period_col1_col4_col2_chan_excl" EXCLUDE USING gist (period WITH &&, col1 WITH =, col4 WITH =, col2 WITH =, col3 WITH =)
Check constraints:
"mkt_profile_period_check" CHECK (period <> 'empty'::tsrange)
Foreign-key constraints:
"mkt_profile_col1_fkey" FOREIGN KEY (col1) REFERENCES host(name)
  It has 4.1 million rows in it and while importing the data only
takes a couple of minutes, when I did a test load into the new
cluster, building the
mkt_profile_period_col1_col4_col2_chan_excl index for the
exclude constraint took 15 hours.
  I feel like asking what I'm doing wrong here? The new server is
pretty decent hardware...
  Concrete questions:
  - what, if anything, am I getting badly wrong here?
  - what can I do to speed up creation of this index?
  - failing that, what can I do to import and then create the
index in the background? 
  
  As you can imagine, a 15hr outage for an upgrade has not met
with large amounts of happiness from the people whose
application it is ;-)
  Chris
  

  





Re: Why in pgAdmin an active session is marked/highlighted in Red

2021-11-11 Thread chris
Which version? Active session should be green.


Regards,
Chris
On 11/11/2021 21:24,Shaozhong SHI wrote:
I never came across this before and wonder why?


Regards,


David

Re: ZFS filesystem - supported ?

2021-10-25 Thread Chris Travers
On Mon, Oct 25, 2021 at 10:18 AM Laurenz Albe 
wrote:

> On Sat, 2021-10-23 at 11:29 +, Laura Smith wrote:
> > Given an upcoming server upgrade, I'm contemplating moving away from XFS
> to ZFS
> > (specifically the ZoL flavour via Debian 11).
> > BTRFS seems to be falling away (e.g. with Redhat deprecating it etc.),
> hence my preference for ZFS.
> >
> > However, somewhere in the back of my mind I seem to have a recollection
> of reading
> > about what could be described as a "strong encouragement" to stick with
> more traditional options such as ext4 or xfs.
>
> ZFS is probably reliable, so you can use it with PostgreSQL.
>
> However, I have seen reports of performance tests that were not favorable
> for ZFS.
> So you should test if the performance is good enough for your use case.
>

It very much depends on lots of factors.

On the whole ZFS on spinning disks is going to have some performance...
rough corners.  And it is a lot harder to reason about a lot of things
including capacity and performance when you are doing copy on write on both
the db and FS level, and have compression in the picture.  And there are
other areas of complexity, such as how you handle partial page writes.

On the whole I think for small dbs it might perform well enough.  On large
or high velocity dbs I think you will have more problems than expected.

Having worked with PostgreSQL on ZFS I wouldn't generally recommend it as a
general tool.

Best Wishes,
Chris Travers

>
> Yours,
> Laurenz Albe
> --
> Cybertec | https://www.cybertec-postgresql.com
>
>
>
>

-- 
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor
lock-in.
http://www.efficito.com/learn_more


Re: Unsynchronized parallel dumps from 13.3 replica produced by pg_dump

2021-10-18 Thread Chris Williams
Thanks Tom.  It's a strange one for sure.  Hopefully AWS support will shed
some light on it. I will clarify too that this is the regular RDS Postgres
version and not their other Aurora Postgres service.  I suspect the Aurora
Postgres probably differs from the community version by quite a bit, but
I'm unsure how much their regular Postgres offering differs, if at all.

Thanks,
Chris

On Mon, Oct 18, 2021 at 8:05 PM Tom Lane  wrote:

> Chris Williams  writes:
> > We have a script that runs a pg_dump off of an RDS PG13.3 replica several
> > times per day. We then load this dump using pg_restore into another
> > postgres RDS db in another AWS account, scrub some of the data, and then
> > take a snapshot of it.
>
> Hmm ... I'm fairly sure that RDS Postgres is not Postgres at this level
> of detail.  The info I've been able to find about their replication
> mechanism talks about things like "eventually consistent reads", which
> is not something community Postgres deals in.
>
> In particular, what I'd expect from the community code is that a replica
> could see a sequence as being *ahead* of the value that you might expect
> from looking at related tables; but never behind.  (Also, that statement
> is true regardless of whether you are doing parallel dump.)  And
> non-sequence tables should always be consistent, period.
>
> So I'm suspicious that this is an RDS-specific effect, and thus that
> you should consult Amazon support first.  If they say "no, it's Postgres
> all the way down", then we need to look closer.
>
> regards, tom lane
>


Unsynchronized parallel dumps from 13.3 replica produced by pg_dump

2021-10-18 Thread Chris Williams
Hi,

We have a script that runs a pg_dump off of an RDS PG13.3 replica several
times per day. We then load this dump using pg_restore into another
postgres RDS db in another AWS account, scrub some of the data, and then
take a snapshot of it. We then use these snapshots for testing and staging
servers. This script has been running fine for years without any issues.
However, a little less than a month ago, we changed the pg_dump command in
the script to use the "-j" flag to speed up our dumps. It was my
understanding that starting with postgres v10, this option should now be
supported on replicas. Since we recently upgraded to Postgres 13.3 from
9.6, we thought it would be a good idea to start using this feature to
speed up our pg_dumps.

In order to use parallel dumps, we made two changes to our script:

1) We swapped to this command:

   pg_dump -Fd -j9 -Z0 -x -hreplica -Uuser -f dump_dir

   For reference, the previous command (which never had any problems) was:

   pg_dump -Fc -Z0 -x -hreplica -Uuser > dump

2) We added code to resize our replica db to a larger hardware size before
   starting the pg_dump so we can utilize more parallelism. After resizing the
   db, the script waits for our replica to go back to the "available" state
   before running the pg_dump command. When the pg_dump finishes, we then
   resize the db back down to a t3.medium for cost reasons.

When we first rolled this out, the speed improvement was fantastic: it brought
our pg_dump down from 30 minutes to 5. However, within a week or so of rolling
this out, we've noticed that sometimes the dumps produced by our pg_dump
command were not consistent/synchronized. Sometimes when we would restore
the backup using pg_restore to our intermediate db it would throw duplicate
key errors or other types of constraint errors. Sometimes, even when the
pg_restore fully restored without errors, we noticed that some of the
sequences were wrong (we saw some sequences with a value less than the max
id in the column, so new inserts would start failing on the table). The
errors were never consistent either. It appears to me that the dump
produced by our pg_dump is not actually synchronized.

I checked the postgres logs on the server and don't see anything noteworthy.
We also are not getting any errors from our pg_dump command. Everything we are
using is version 13.3 (all of the dbs involved and the server running the
script is using v13.3 of the pg_dump/pg_restore commands). I was curious if
anyone had any thoughts on what could be causing this. From my reading of the
pg_dump manual, it seems like it should always be producing a consistent dump,
but please correct me if I'm misunderstanding. I've also opened a ticket with
AWS's support to see if they have any ideas. For now, we've had to go back
to using the single threaded pg_dump (which is disappointing because it's
25 minutes slower). We were really hoping to take advantage of the
parallelism.

Any ideas would be much appreciated.

Thanks,
Chris


Re: JOB | DBA (Canada)

2021-10-14 Thread Chris Travers
On Thu, Oct 14, 2021 at 5:37 PM Gavin Flower 
wrote:

> On 15/10/21 02:00, James Tobin wrote:
> > Hello, we are working with an employer that is looking to hire someone
> > capable of managing Mongo and Sybase databases at their office in
> > Canada.  Consequently, I had hoped that some members of this list may
> > like to discuss further.  Kind regards, James
> >
> >
> If they wanted people from this list the they should upgrade to
> PostgreSQL, they'll find PostgreSQL faster and more reliable than
> Mongo.  I suspect  the same might true for Sybase, but I've never done a
> detailed comparison.
>

Indeed.  MongoDB has some interesting features like capped collections
which make it useful as a cache, but my experience always leaves me feeling
like performance and scalability are lacking.

>
>
> Cheers,
> Gavin
>
>
>
>

-- 
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor
lock-in.
http://www.efficito.com/learn_more


Re: JOB | DBA (Canada)

2021-10-14 Thread Chris Travers
On Thu, Oct 14, 2021 at 3:00 PM James Tobin  wrote:

> Hello, we are working with an employer that is looking to hire someone
> capable of managing Mongo and Sybase databases at their office in
> Canada.  Consequently, I had hoped that some members of this list may
> like to discuss further.  Kind regards, James
>

If you are posting here, is it because they want to move all these to
PostgreSQL?


-- 
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor
lock-in.
http://www.efficito.com/learn_more


Re: pgbouncer configuration

2021-04-28 Thread Chris Stephens
I had to install pam-devel before reinstalling pgbouncer. appears to be
working now. thanks for pointing me in the right direction Laurenz!

On Wed, Apr 28, 2021 at 7:09 AM Chris Stephens 
wrote:

> huh. you are right. i originally installed pgbouncer with yum but removed
> with just "yum remove pgbouncer" before following "Building from Git"
> section at http://www.pgbouncer.org/install.html.
>
> i just ran "make uninstall" -> reinstalled with
>
>   403  21-04-28 06:58:32 git submodule init
>   404  21-04-28 06:58:36 git submodule update
>   405  21-04-28 06:58:45 ./autogen.sh
>   406  21-04-28 06:59:23 ./configure --with-pam --with-systemd
>   407  21-04-28 07:02:11 make
>   408  21-04-28 07:02:16 make install
>
> but it still looks like no ldap or pam and same error when starting up.
> any suggestions on how to completely remove and reinstall with support for
> pam?
>
>
>
> On Wed, Apr 28, 2021 at 1:51 AM Laurenz Albe 
> wrote:
>
>> On Tue, 2021-04-27 at 19:46 -0500, Chris Stephens wrote:
>> > I'm trying to run pgbouncer but am having trouble with what looks like
>> a very simple configuration.
>> >
>> > centos 7
>> > postgres 12
>> > pgbouncer 1.15
>> >
>> > we are already using pam for database auth. pgbouncer was compiled with
>> --with-pam. there is a /etc/pam.d/pgbouncer config file copied from the one
>> currently being used for postgres auth.
>> >
>> > i can list the whole config file if needed but i get the following when
>> trying to start pgbouncer up:
>> >
>> > [postgres@lsst-pgsql02 ~]$ pgbouncer  -d /etc/pgbouncer/pgbouncer.ini
>> > 2021-04-27 19:37:34.256 CDT [10653] ERROR invalid value "pam" for
>> parameter auth_type in configuration (/etc/pgbouncer/pgbouncer.ini:118)
>> > 2021-04-27 19:37:34.256 CDT [10653] FATAL cannot load config file
>> >
>> > [postgres@lsst-pgsql02 pgbouncer]$ egrep "auth_type" pgbouncer.ini
>> > auth_type = pam
>> >
>> > any ideas?
>>
>> I'd suspect that pgBouncer is not built with PAM support after all.
>>
>> Run "ldd" on the executable and see if it links with OpenLDAP.
>>
>> Yours,
>> Laurenz Albe
>> --
>> Cybertec | https://www.cybertec-postgresql.com
>>
>>


Re: pgbouncer configuration

2021-04-28 Thread Chris Stephens
huh. you are right. i originally installed pgbouncer with yum but removed
with just "yum remove pgbouncer" before following "Building from Git"
section at http://www.pgbouncer.org/install.html.

i just ran "make uninstall" -> reinstalled with

  403  21-04-28 06:58:32 git submodule init
  404  21-04-28 06:58:36 git submodule update
  405  21-04-28 06:58:45 ./autogen.sh
  406  21-04-28 06:59:23 ./configure --with-pam --with-systemd
  407  21-04-28 07:02:11 make
  408  21-04-28 07:02:16 make install

but it still looks like no ldap or pam and same error when starting up. any
suggestions on how to completely remove and reinstall with support for pam?



On Wed, Apr 28, 2021 at 1:51 AM Laurenz Albe 
wrote:

> On Tue, 2021-04-27 at 19:46 -0500, Chris Stephens wrote:
> > I'm trying to run pgbouncer but am having trouble with what looks like a
> very simple configuration.
> >
> > centos 7
> > postgres 12
> > pgbouncer 1.15
> >
> > we are already using pam for database auth. pgbouncer was compiled with
> --with-pam. there is a /etc/pam.d/pgbouncer config file copied from the one
> currently being used for postgres auth.
> >
> > i can list the whole config file if needed but i get the following when
> trying to start pgbouncer up:
> >
> > [postgres@lsst-pgsql02 ~]$ pgbouncer  -d /etc/pgbouncer/pgbouncer.ini
> > 2021-04-27 19:37:34.256 CDT [10653] ERROR invalid value "pam" for
> parameter auth_type in configuration (/etc/pgbouncer/pgbouncer.ini:118)
> > 2021-04-27 19:37:34.256 CDT [10653] FATAL cannot load config file
> >
> > [postgres@lsst-pgsql02 pgbouncer]$ egrep "auth_type" pgbouncer.ini
> > auth_type = pam
> >
> > any ideas?
>
> I'd suspect that pgBouncer is not built with PAM support after all.
>
> Run "ldd" on the executable and see if it links with OpenLDAP.
>
> Yours,
> Laurenz Albe
> --
> Cybertec | https://www.cybertec-postgresql.com
>
>


pgbouncer configuration

2021-04-27 Thread Chris Stephens
I'm trying to run pgbouncer but am having trouble with what looks like a
very simple configuration.

centos 7
postgres 12
pgbouncer 1.15

we are already using pam for database auth. pgbouncer was compiled with
--with-pam. there is a /etc/pam.d/pgbouncer config file copied from the one
currently being used for postgres auth.

i can list the whole config file if needed but i get the following when
trying to start pgbouncer up:

[postgres@lsst-pgsql02 ~]$ pgbouncer  -d /etc/pgbouncer/pgbouncer.ini
2021-04-27 19:37:34.256 CDT [10653] ERROR invalid value "pam" for parameter
auth_type in configuration (/etc/pgbouncer/pgbouncer.ini:118)
2021-04-27 19:37:34.256 CDT [10653] FATAL cannot load config file

[postgres@lsst-pgsql02 pgbouncer]$ egrep "auth_type" pgbouncer.ini
auth_type = pam

any ideas?


SQL group by help

2020-12-11 Thread Chris Stephens
I'm trying to create a visual representation of a 6x8 grid of samples on a
rack using the following SQL format:

with rack_display as (
select sr.ts rack_ts
, sr.rack_id
, r.rack_barcode
, 1 as row_pos
, max(case when rack_well = 0 then 'A1: '||sample_barcode end) as
col1
, max(case when rack_well = 1 then 'A2: '||sample_barcode end) as
col2
, max(case when rack_well = 2 then 'A3: '||sample_barcode end) as
col3
, max(case when rack_well = 3 then 'A4: '||sample_barcode end) as
col4
, max(case when rack_well = 4 then 'A5: '||sample_barcode end) as
col5
, max(case when rack_well = 5 then 'A6: '||sample_barcode end) as
col6
from rack r
, sample_rack sr
, sample s
where r.rack_id = sr.rack_id
   and sr.sample_id = s.sample_id
   and sr.rack_well < 6
group by sr.ts, sr.rack_id, r.rack_barcode, row_pos
union all
select sr.ts rack_ts
, sr.rack_id
, r.rack_barcode
, 2 as row_pos
, max(case when rack_well = 6 then 'B1: '||sample_barcode end) as
col1
, max(case when rack_well = 7 then 'B2: '||sample_barcode end) as
col2
, max(case when rack_well = 8 then 'B3: '||sample_barcode end) as
col3
, max(case when rack_well = 9 then 'B4: '||sample_barcode end) as
col4
, max(case when rack_well = 10 then 'B5: '||sample_barcode end) as
col5
, max(case when rack_well = 11 then 'B6: '||sample_barcode end) as
col6
from rack r
, sample_rack sr
, sample s
where r.rack_id = sr.rack_id
   and sr.sample_id = s.sample_id
   and sr.rack_well >= 6
   and sr.rack_well < 12
group by sr.ts, sr.rack_id, r.rack_barcode, row_pos
union all
...
)
select * from rack_display order by rack_ts, rack_id, row_pos;

the "union all"s continue for another 6 blocks. reports would filter on
rack_id and timestamp.

if timestamps for each load of a rack were guaranteed to be the same, this
would work. however, the "sr.ts" values may vary by a few seconds so there
is potential for the "group by" to break. ts differences will be a minimum
of 5 minutes for each distinct load of a rack.

what i think i need is to manufacture a group by column based off rows in
"sample_rack" that have "ts" values that are < 1 minute from each other and
rack_id is the same.  i'm coming up blank on how to accomplish that though.
my first thought was to create an interval of +/- 1 min then find all rows
that overlap and assign a group number but i'm not sure how to accomplish
that.
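
For illustration, a minimal sketch of that "gaps and islands" idea (assuming
sample_rack exposes rack_id, ts, rack_well, and sample_id, as in the query
above): flag a row as the start of a new load whenever the gap to the previous
ts for the same rack exceeds one minute, then take a running sum of those
flags to get a synthetic group column.

with flagged as (
    select sr.rack_id, sr.ts, sr.rack_well, sr.sample_id,
           case
               when lag(sr.ts) over (partition by sr.rack_id order by sr.ts) is null
                 or sr.ts - lag(sr.ts) over (partition by sr.rack_id order by sr.ts)
                    > interval '1 minute'
               then 1 else 0
           end as new_load
    from sample_rack sr
)
select rack_id, ts, rack_well, sample_id,
       sum(new_load) over (partition by rack_id order by ts) as load_group
from flagged;

grouping by (rack_id, load_group) instead of sr.ts should then survive the
few-seconds jitter between rows of the same load.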

there's also no guarantee an entire rack is full of samples so some "cells"
of display might be null. i think that makes the use of tablefunc crosstab
a little harder. if i remember correctly, it does not handle missing values
well. i'm open to any pivoting strategy.

anyways, i thought i'd reach out for ideas while i do my own digging.

thanks for any input! let me know if i'm not being clear on the problem and
desired outcome.


simple reporting tools for postgres in aws

2020-12-07 Thread Chris Stephens
Hello,

We have a need to display some very simple reports generated from views in
an rds postgres database. We need little more functionality beyond simple
filtering, column hiding, sorting, grouping. Anything much beyond that
would be a strike against the tool.

i'm looking for something similar to oracle application
express's interactive reports (i come from an oracle background).

i took a look at what's available from amazon's aws marketplace but quickly
got overwhelmed. every application i saw was overkill.

users will not be tech savvy. we have a tentative fallback plan to use
excel with pages linked to views in the database but i'm looking for
something a little more targeted.

does anyone have any suggestions that fit the description above?

thanks for any input,
chris


Re: Check constraints do not seem to be working!!!

2020-11-11 Thread Chris Sterritt


On 11/11/2020 06:44, Jitendra Loyal wrote:

Consider this table definition:
 create table t ( i serial, b bool, c bool,
                  constraint b_c check ( (b = true and c is not null )
                                      or (b is distinct from true and c is null) )
                  constraint b_c check ( (b = true and c is not null )
                                      or (b = false and c is null)
                                      or (b is null and c is null) )
                );
Despite the above two constraints, the following rows get into the table:
 insert into t (b , c) values (null, true), (null, false);



(b =TRUE AND c IS NOT NULL) evaluates to null when b is null
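
A quick way to see this (a sketch run by hand, mirroring the first constraint
with b = NULL and c = TRUE):

select ((null::boolean = true and true is not null)
     or (null::boolean is distinct from true and true is null)) as check_result;
-- returns NULL, and a NULL check result is treated as "not violated"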

Cheers,
Chris Sterritt



Re: Drop column constraint

2020-10-30 Thread Chris Sterritt



On 30/10/2020 15:30, Rich Shepard wrote:
A table has a unique constraint on a column that needs removing. 
Reading the

postgres-12.x docs for alter table it appears the correct syntax is:

alter table locations drop constraint unique;

but this is wrong.

Trying 'alter table locations alter column loc_nbr drop constraint 
unique;' also

failed.

What's the proper syntax to drop the unique constraint on a table column?

TIA,

Rich




You need

alter table locations drop constraint <constraint_name>;

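If the generated name isn't known, it can be looked up first (a sketch,
assuming the unique constraint is the only one of its kind on locations;
the name in the final statement is only the typical auto-generated form):

select conname
from pg_constraint
where conrelid = 'locations'::regclass
  and contype = 'u';

alter table locations drop constraint locations_loc_nbr_key;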

Cheers, Chris Sterritt





Re: PostgreSQL processes use large amount of private memory on Windows

2020-10-12 Thread Chris Sterritt



On 17/09/2020 15:06, Tom Lane wrote:

=?UTF-8?Q?=C3=98ystein_Kolsrud?=  writes:

So my question is: When does a postgres process forked for a connection use
private memory instead of shared, and what can I do to avoid this?

The only significant long-term consumption of private memory is for
caches.  There are catalog caches, which can get large if the session
accesses a whole lot of database objects (e.g., thousands of different
tables).  Some of the PLs maintain caches with parsed versions of any
function that's been executed.  (An ex-employer of mine had a lot of
trouble in that regard, because they had hundreds of thousands of lines
worth of plpgsql functions.)  There isn't any user-accessible knob for
limiting the size of those caches.  If you have a problem of that sort,
about the only way to mitigate it is to use fewer backends so that the
total memory consumption stays in bounds, or redesign your application.
In some cases it might help to restart your sessions when they get too
big, but that seems like at best a band-aid.

regards, tom lane



Would executing DISCARD ALL release the PL cache?

Regards, Chris Sterritt




help flattening json

2020-09-27 Thread Chris Stephens
posgresql verion: 12

i can accomplish this procedurally fairly easily but would like to do this
strictly in SQL.

jsondb=# select jsonb_typeof(jsonb_path_query(vdl_json,'$.tables[*]')) from
vdl_json2;
 jsonb_typeof
--
 object
 object
 object
 object
 object
 object
(6 rows)

jsondb=# select jsonb_pretty(jsonb_path_query(vdl_json,'$.tables[1]')) from
vdl_json2;
jsonb_pretty

 { +
 "name": "AMP_DATA",   +
 "quoted": true,   +
 "columns": [  +
 { +
 "field": "WELL",  +
 "selected": true, +
 "displayName": "Well" +
 },+
 { +
 "field": "WELL_POSITION", +
 "selected": true, +
 "displayName": "Well Position"+
 },+
 { +
 "field": "CYCLE_NUMBER",  +
 "selected": true, +
 "displayName": "Cycle Number" +
 },+
 { +
 "field": "TARGET",+
 "selected": true, +
 "displayName": "Target"   +
 },+
 { +
 "field": "RN",+
 "selected": true, +
 "displayName": "Rn"   +
 },+
 { +
 "field": "DRN",   +
 "selected": true, +
 "displayName": "dRn"  +
 },+
 { +
 "field": "SAMPLE",+
 "selected": true, +
 "displayName": "Sample"   +
 },+
 { +
 "field": "OMIT",  +
 "selected": true, +
 "displayName": "Omit" +
 } +
 ],+
 "labeled": false, +
 "options": {  +
 },+
 "displayName": "Amplification Data",  +
 "sortedColumns": [+
 ],+
 "analysisModule": "primary"   +
 }
(1 row)

i would like to end up with (name text, field text[]).

I can get the field array when i manually filter on name:

jsondb=# select jsonb_path_query_array(vdl_json,'$.tables ? (@.name ==
"RAW_DATA").columns.field') fields
  from vdl_json2;
  fields
---
 ["WELL", "WELL_POSITION", "CYCLE_NUMBER"]
(1 row)

I can get the text of names:

jsondb=# \e
   name
--
 "RESULT"
 "AMP_DATA"
 "MULTICOMPONENT"
 "RAW_DATA"
 "REPLICATE_GROUP_RESULT"
 "WELL_CALL"
(6 rows)

I think i should be able to do this with a lateral join but i can't seem to
get it right. something like:

jsondb=# select tnames.tname, tfields_arr.* from
(select jsonb_path_query(vdl_json,'$.tables[*].name') as tname
  from vdl_json2) tnames,
lateral
(select jsonb_path_query_array(vdl_json,'$.tables ? (@.name ==
tnames.tname).columns.field') as tfields
  from vdl_json2) tfields_arr;
ERROR:  syntax error at or near " "
LINE 6:   from vdl_json2) tfields_arr;

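For reference, a minimal sketch that sidesteps the correlated jsonpath string
entirely (assuming the same vdl_json2.vdl_json column as above): expand each
table object once in a lateral, then read the name and the field array out of
that object directly.

select t ->> 'name' as name,
       array(select jsonb_array_elements_text(
                        jsonb_path_query_array(t, '$.columns[*].field'))) as fields
from vdl_json2,
     lateral jsonb_path_query(vdl_json, '$.tables[*]') as t;

This should yield one (name text, fields text[]) row per entry in "tables".
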
Any json folks out there willing to help out?

i'd also like to get records of "(name text, field text, selected text,
displayName text)" but i haven't started on that one yet.

any help is greatly appreciated!


Re: Need explanation on index size

2020-09-24 Thread Chris Sterritt


On 24/09/2020 15:08, Guillaume Lelarge wrote:

Hi,

On Thu, 24 Sept 2020 at 15:55, Guillaume Luchet <g.luc...@bilendi.com> wrote:


Hi,

I’m facing a behaviour I don’t understand with indexes; here is a
quick example to reproduce my problem


test=# select version();
         version

--
 PostgreSQL 12.4 (Debian 12.4-1.pgdg90+1) on x86_64-pc-linux-gnu,
compiled by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit
(1 row)

test=# create table plop (col_a int, col_b int, col_c int);
CREATE TABLE

test=# create unique index on plop (col_a);
CREATE INDEX

test=# create index on plop(col_b);
CREATE INDEX

test=# insert into plop (col_a, col_b) select generate_series(1,
1), generate_series(1, 1);
INSERT 0 1

test=# SELECT schemaname, tablename,
         pg_size_pretty(SIZE) AS size_pretty,
         pg_size_pretty(total_size) AS total_size_pretty
    FROM (SELECT *, pg_relation_size(quote_ident(schemaname) ||
'.' || quote_ident(tablename)) AS SIZE,
pg_total_relation_size(quote_ident(schemaname) || '.' ||
quote_ident(tablename)) AS total_size
          FROM pg_tables) AS TABLES where tablename = 'plop';
 schemaname | tablename | size_pretty | total_size_pretty
+---+-+---
 public     | plop      | 360 kB      | 864 kB
(1 row)

test=# update plop set col_c = floor(random() * 10 + 1)::int;
UPDATE 1

test=# SELECT schemaname, tablename,
         pg_size_pretty(SIZE) AS size_pretty,
         pg_size_pretty(total_size) AS total_size_pretty
    FROM (SELECT *, pg_relation_size(quote_ident(schemaname) ||
'.' || quote_ident(tablename)) AS SIZE,
pg_total_relation_size(quote_ident(schemaname) || '.' ||
quote_ident(tablename)) AS total_size
          FROM pg_tables) AS TABLES where tablename = 'plop';
 schemaname | tablename | size_pretty | total_size_pretty
+---+-+---
 public     | plop      | 792 kB      | 2160 kB
(1 row)

test=# reindex table plop;
REINDEX

test=# SELECT schemaname, tablename,
         pg_size_pretty(SIZE) AS size_pretty,
         pg_size_pretty(total_size) AS total_size_pretty
    FROM (SELECT *, pg_relation_size(quote_ident(schemaname) ||
'.' || quote_ident(tablename)) AS SIZE,
pg_total_relation_size(quote_ident(schemaname) || '.' ||
quote_ident(tablename)) AS total_size
          FROM pg_tables) AS TABLES where tablename = 'plop';
 schemaname | tablename | size_pretty | total_size_pretty
+---+-+---
 public     | plop      | 792 kB      | 1304 kB
(1 row)

I don’t understand why after the update where I only update a non
indexed column the indexes size is growing. Is it something
someone can explain ?


Every tuple is now on a different location on the table (remember that 
update in PostgreSQL is more something like delete+insert). So even if 
the value of the column doesn't change, its tuple location changes, so 
the index needs to be updated to reflect that change.



--
Guillaume.



If you execute
  vacuum full plop;
you will see the size shrink back as the dead tuples will have been removed.

Chris



Re: Check for duplicates before inserting new rows

2020-09-04 Thread Chris Sterritt


On 04/09/2020 14:21, Rich Shepard wrote:
This is a new issue for me: I've received data from a different source 
and

need to add non-duplicates to two tables in the database. Each row in the
new data has a station ID and associated measurements.

The existing database includes a table for station information and 
another

for measurements made there.

I want to learn which stations and which measurements in the new data set
are not included in the existing tables. And, I don't want to try 
inserting

the new data and have postgres tell me when it's found duplicates,
especially since there are two tables involved.

My research into how to do this has not found a solution so I ask for
pointers to resources that will teach me how to add these new data to 
both

tables.

Regards,

Rich




To insert data into both tables:


drop table if exists station;
drop table if exists measurement;

create table station (station_id serial,station_data text);

create table measurement (measurement_id bigserial, station_id integer, ameasurement text);

insert into station(station_data)
values ('station1'),
 ('station2'),
 ('station3') ;

with src_data as (
    select station_data, ameasurement
    from (VALUES ('station1','meas1'),
                 ('station2','meas2'),
                 ('station3','meas3'),
                 ('station4','meas4')) as m(station_data, ameasurement)),
ins_station as (
    insert into station (station_data)
    select station_data from src_data
    except
    select station_data from station
    returning station_id, station_data)
insert into measurement (station_id, ameasurement)
select s.station_id, sd.ameasurement
from src_data sd
join (SELECT station_id, station_data FROM ins_station
      UNION
      SELECT station_id, station_data FROM station) s using (station_data)
except
select station_id, ameasurement from measurement;

select * from station;
select * from measurement;


Regards,
Chris Sterritt



Re: Check for duplicates before inserting new rows

2020-09-04 Thread Chris Sterritt


On 04/09/2020 15:46, Rich Shepard wrote:

On Fri, 4 Sep 2020, Olivier Gautherot wrote:


First of all, what version of PostgreSQL are you using?


Olivier,

12.2.

One way would be to add a UNIQUE constraint and perform for each row 
of the

source table a INSERT ... ON CONFLICT DO NOTHING (see
https://www.postgresql.org/docs/current/sql-insert.html#SQL-ON-CONFLICT)

If it is a 2-way merge, I would encapsulate the process in a function
(which will create a transaction to protect your process) and add a 
column
to trace the rows that have been merged. For this purpose, you can 
use the

ON CONFLICT DO UPDATE ...


Lots for me to ponder.

The new data has attributes stored in two database tables: location and
measurements. I'll need to think about how both could be handled in a 
single

function. For example, the new data contains a row with a location not
already in the existing location table. That location needs to be 
added to
the location table and its associated measurement attributes then 
added to

the existing measurements table.

Thanks for the suggestion,

Rich




Assuming some simple table structures (I've not included PK or FK 
definitions for simplicity):


create table station (station_id integer,station_data text);

create table measurement (measurement_id bigserial, station_id integer, ameasurement text);

-- Add some test stations
insert into station(station_id,station_data)
values (1,'station1'),
 (2,'station2'),
 (3,'station3') ;


*Query to identify new measurements:
select station_id, ameasurement
from (VALUES (1,'1meas1'),   -- this represents your new test data set
             (1,'1meas2'),
             (2,'2meas1'),
             (3,'3meas1')) as m(station_id, ameasurement)
except
select station_id, ameasurement from measurement;

The query above will give you a list of the new values which are not currently 
stored in table measurement.
Run it and we get all 4 rows returned.

Add a measurement row:
insert into measurement(station_id,ameasurement)
values (2,'2meas1');

Now if you repeat the check for new measurements with the same query as at *, 
you only get 3 rows.

Cheers, Chris Sterritt



Re: export to parquet

2020-08-26 Thread Chris Travers
On Wed, Aug 26, 2020 at 9:00 PM Scott Ribe 
wrote:

> I have no Hadoop, no HDFS. Just looking for the easiest way to export some
> PG tables into Parquet format for testing--need to determine what kind of
> space reduction we can get before deciding whether to look into it more.
>
> Any suggestions on particular tools? (PG 12, Linux)
>

For simple exporting, the simplest thing is a single-node instance of Spark.

You can read parquet files in Postgres using
https://github.com/adjust/parquet_fdw if you so desire but it does not
support writing as parquet files are basically immutable.


>
> --
> Scott Ribe
> scott_r...@elevated-dev.com
> https://www.linkedin.com/in/scottribe/
>
>
>
>
>
>

-- 
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor
lock-in.
http://www.efficito.com/learn_more


Re: Implement a new data type

2020-08-11 Thread Chris Travers
On Wed, Aug 12, 2020 at 2:01 AM raf  wrote:

> On Tue, Aug 11, 2020 at 06:38:39AM -0700, Miles Elam <
> miles.e...@productops.com> wrote:
>
> > Also of note: PostgreSQL already has a money type (
> > https://www.postgresql.org/docs/current/datatype-money.html)
> > But you shouldn't use it (
> > https://wiki.postgresql.org/wiki/Don%27t_Do_This#Don.27t_use_money).
> >
> > I only bring it up so that you can know to make your money type a
> slightly
> > different name to avoid a conflict. Money is deceptively hard to
> implement
> > correctly. I'd recommend reading the second link if you have not already
> to
> > avoid previously known issues.
>
> I use decimal(10,2) for whole cents, and decimal(12,6)
> for sub-cents. Single currency only. I didn't know
> there was a money type originally, but it wouldn't be
> usable for me anyway without the ability to specify the
> scale and precision.
>

It is worth noting that decimal is an alias for numeric in Postgres.   For
that reason you will have less confusion if you use numeric instead.
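
A quick check (sketch):

select pg_typeof(1.23::decimal);   -- reports numeric
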

>
> I recommend considering passing values to the database
> as "decimal '1.23'" rather than bare numeric literals,
> just so there's no chance of the value being
> interpreted as a float at any stage by postgres. Maybe
> that's being too paranoid but that's a good idea when
> it comes to money. :-)
>

I don't think the type designation buys you anything, unless it is part
of an arithmetic expression. The single quotes do, and cannot be omitted
here.

So I think there is a difference between ('1.23' + 1)::numeric and
'1.23'::numeric + 1 but there is also a difference between 1.23::numeric +
1 and '1.23'::numeric + 1

But there is no reason to add the cast when doing something like an insert
of a single value.

>
> Perhaps the incorporation of currency would make a new
> money type interesting. Currencies probably don't
> change as often as timezones but there would probably
> still be some ongoing need for updates.
>

The existing money type has another problem in that the currency it is
attached to is taken from the current locale.  So if you change your locale
settings you can change a value from, say, 100 IDR to 100 EUR at least for
display purposes.

I have some thoughts about how to do a multi-currency type but I am not
actually sure you get anything by tying the data together instead of having
it in separate columns.

>
> cheers,
> raf
>
>
>
>

-- 
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor
lock-in.
http://www.efficito.com/learn_more


Re: Unexplained disk usage in AWS Aurora Postgres

2020-08-07 Thread Chris Borckholder
Thank you Adam and Christoph,

You are totally right, that AWS support is the one to help me with this
problem.
I am in contact with them for quite some time on this problem and as there
was no progress on resolving this,
I tried to find some insight or trick that I missed here. It's a long shot
(:

Best Regards
Chris

On Fri, Aug 7, 2020 at 4:22 PM Christoph Moench-Tegeder 
wrote:

> ## Chris Borckholder (chris.borckhol...@bitpanda.com):
>
> > We are experiencing a strange situation with an AWS Aurora postgres
> > instance.
>
> The main problem here is that "Amazon Aurora" is not PostgreSQL.
> If I understand Amazon's documentation, what you are using is
> officially named "Amazon Aurora with PostgreSQL Compatibility",
> and that sums is up quite nicely: Aurora is a database engine
> developed at Amazon - and it's inner workings are not publically
> documented.
> Whatever is using up that disk space - only AWS Support can know.
>
> Regards,
> Christoph
>
> --
> Spare Space
>


Re: Unexplained disk usage in AWS Aurora Postgres

2020-08-07 Thread Chris Borckholder
Thanks for your insight!

I cannot find any errors related to archiving in the logs that are
accessible to me.
It's definitely something that I will forward to the support team of the
managed database.

Best Regards
Chris

On Thu, Aug 6, 2020 at 3:18 AM Mohamed Wael Khobalatte <
mkhobala...@grubhub.com> wrote:

>
>
> On Tue, Aug 4, 2020 at 4:39 AM Chris Borckholder <
> chris.borckhol...@bitpanda.com> wrote:
>
>> Hi!
>>
>> We are experiencing a strange situation with an AWS Aurora postgres
>> instance.
>> The database steadily grows in size, which is expected and normal.
>> After enabling logical replication, the disk usage reported by AWS
>> metrics increases much faster then the database size (as seen by \l+ in
>> psql). The current state is that database size is ~290GB, while AWS reports
>> >640GB disk usage.
>> We reached out to AWS support of course, which is ultimately responsible.
>> Unfortunately they were not able to diagnose this until now.
>>
>> I checked with the queries from wiki
>> https://wiki.postgresql.org/wiki/Disk_Usage , which essentially give the
>> same result.
>> I tried to check on wal segment file size, but we have no permission to
>> execute select pg_ls_waldir().
>> The replication slot is active and it also progresses
>> (pg_replication_slots.confirmed_flush_lsn increases and is close to
>> pg_current_wal_flush_lsn).
>>
>> Can you imagine other things that I could check from within postgres with
>> limited permissions to diagnose this?
>>
>> Best Regards
>> Chris
>>
>>
> If you do archive wal files, maybe the archive_command is failing?
>


Re: Unexplained disk usage in AWS Aurora Postgres

2020-08-07 Thread Chris Borckholder
Thank you for your insight Seenu!

That is a good point, unfortunately we do not have access to the
server/file system as the database is a managed service.
Access to the file system from postgres like pg_ls_dir is also blocked.

Are you aware of another, creative way to infer the wal file size from
within postgres?
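
One sketch that sometimes works on managed services (assuming the usual
pg_replication_slots columns are visible) is to estimate how much WAL the
slot is holding back, without any file-system access:

select slot_name,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) as retained_wal
from pg_replication_slots;

Whether Aurora reports meaningful values here is another question, of course.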

Best Regards
Chris

On Tue, Aug 4, 2020 at 11:39 AM Srinivasa T N  wrote:

> There may be lot of wal files or the size of log files in pg_log might be
> huge.  "du -sh *" of data directory holding the database might help.
>
> Regards,
> Seenu.
>
>
> On Tue, Aug 4, 2020 at 2:09 PM Chris Borckholder <
> chris.borckhol...@bitpanda.com> wrote:
>
>> Hi!
>>
>> We are experiencing a strange situation with an AWS Aurora postgres
>> instance.
>> The database steadily grows in size, which is expected and normal.
>> After enabling logical replication, the disk usage reported by AWS
>> metrics increases much faster then the database size (as seen by \l+ in
>> psql). The current state is that database size is ~290GB, while AWS reports
>> >640GB disk usage.
>> We reached out to AWS support of course, which is ultimately responsible.
>> Unfortunately they were not able to diagnose this until now.
>>
>> I checked with the queries from wiki
>> https://wiki.postgresql.org/wiki/Disk_Usage , which essentially give the
>> same result.
>> I tried to check on wal segment file size, but we have no permission to
>> execute select pg_ls_waldir().
>> The replication slot is active and it also progresses
>> (pg_replication_slots.confirmed_flush_lsn increases and is close to
>> pg_current_wal_flush_lsn).
>>
>> Can you imagine other things that I could check from within postgres with
>> limited permissions to diagnose this?
>>
>> Best Regards
>> Chris
>>
>>
>>


Unexplained disk usage in AWS Aurora Postgres

2020-08-04 Thread Chris Borckholder
Hi!

We are experiencing a strange situation with an AWS Aurora postgres
instance.
The database steadily grows in size, which is expected and normal.
After enabling logical replication, the disk usage reported by AWS metrics
increases much faster than the database size (as seen by \l+ in psql). The
current state is that database size is ~290GB, while AWS reports >640GB
disk usage.
We reached out to AWS support of course, which is ultimately responsible.
Unfortunately they were not able to diagnose this until now.

I checked with the queries from wiki
https://wiki.postgresql.org/wiki/Disk_Usage , which essentially give the
same result.
I tried to check on wal segment file size, but we have no permission to
execute select pg_ls_waldir().
The replication slot is active and it also progresses
(pg_replication_slots.confirmed_flush_lsn increases and is close to
pg_current_wal_flush_lsn).

Can you imagine other things that I could check from within postgres with
limited permissions to diagnose this?

Best Regards
Chris


Transaction control in SECURITY DEFINER procedures

2020-07-07 Thread Chris Sterritt
The documentation for CREATE PROCEDURE informs us "A SECURITY DEFINER 
procedure cannot execute transaction control statements (for example, 
COMMIT and ROLLBACK, depending on the language)."


Can anyone let me know why this is so and are there any plans to remove 
this restriction in future releases?


Regards, Chris Sterritt



ansible modules for postgresql installation/config

2020-06-11 Thread Chris Stephens
it looks like there are a number of roles available through ansible galaxy
that support initial postgresql setup and configuration.

i'm wondering if there are any that are more popular/capable that i should
consider vs just picking a few and evaluating those.

does anyone have any recommendations?


Re: troubleshooting postgresql ldap authentication

2020-06-09 Thread Chris Stephens
yes, shortly after i sent this out to the list, one of our security
administrators suggested ldapscheme.  I just tested and ldapurl works as
well.

the security admin explained it like this:

"since we are using port 636 I know that it needs the TLS connection in
place before LDAP commands. starttls does the opposite.  allows an LDAP
connection to "upgrade" to TLS.  so the previous errors were simply it
unable to connect to server."
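
For the record, the pg_hba.conf entry that works ends up looking roughly like
this (same placeholders as before; sketched from memory):

hostssl all all 0.0.0.0/0 ldap ldapscheme="ldaps" ldapserver="xxx" ldapbasedn="yyy"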

i'm guessing information like that doesn't belong in postgresql
documentation but it would have been useful yesterday. :)

thanks for the response! i just recently made the switch to postgresql
after 20 years of mainly Oracle. during that time, the oracle-l mailing
list was invaluable as a learning tool and as a way to get help
when needed. it's great to know there's a similar mailing list in the
postgresql community!

On Mon, Jun 8, 2020 at 7:41 PM Thomas Munro  wrote:

> On Tue, Jun 9, 2020 at 9:05 AM Chris Stephens 
> wrote:
> > hostsslall all 0.0.0.0/0  ldap
> ldapserver="ldaps://xxx" ldapbasedn="yyy" ldaptls=1
>
> > does anyone know what might be causing "LDAP: Bad parameter to an ldap
> routine"
>
> You probably want ldapurl="ldaps://xxx" (note: ldapurl, not
> ldapserver).  Or you could use ldapscheme="ldaps" and
> ldapserver="xxx".
>


troubleshooting postgresql ldap authentication

2020-06-08 Thread Chris Stephens
posgresql 12
centos 7

i am trying to configure ldap authentication.

i have the following pg_hba.conf entry (server and basedn are correct but
not shown):

hostsslall all 0.0.0.0/0  ldap
ldapserver="ldaps://xxx" ldapbasedn="yyy" ldaptls=1

when i attempt to authenticate, i see the following in pg logs:

2020-06-08 15:58:43 CDT [20235]: [1-1]
user=[unknown],db=[unknown],app=[unknown],client=zzz LOG:  connection
received: host=zzz port=12378
2020-06-08 15:58:43 CDT [20235]: [2-1]
user=uuu,db=postgres,app=[unknown],client=zzz LOG:  could not initialize
LDAP: Bad parameter to an ldap routine
2020-06-08 15:58:43 CDT [20235]: [3-1]
user=uuu,db=postgres,app=[unknown],client=zzz FATAL:  LDAP authentication
failed for user "uuu"
2020-06-08 15:58:43 CDT [20235]: [4-1]
user=uuu,db=postgres,app=[unknown],client=zzz DETAIL:  Connection matched
pg_hba.conf line 90: "hostsslall all 0.0.0.0/0
ldap ldapserver="ldaps://xxx" ldapbasedn="yyy" ldaptls=1"

does anyone know what might be causing "LDAP: Bad parameter to an ldap
routine"

ldapsearch works fine w/ the supplied ldapserver and base dn.


Re: Oracle vs. PostgreSQL - a comment

2020-06-03 Thread Chris Travers
On Wed, Jun 3, 2020 at 7:45 PM Andreas Joseph Krogh 
wrote:

> På onsdag 03. juni 2020 kl. 18:50:12, skrev Jeremy Schneider <
> schnj...@amazon.com>:
>
> > On 6/2/20 1:30 PM, Stephen Frost wrote:
> >> No, nothing does as PG doesn't support it as we have one WAL stream for
> >> the entire cluster.
>
> On 6/2/20 11:38, Ron wrote:
> > Right.  Making WAL files specific to a database should be high on the
> > list of priorities.
>
> Did Oracle change this?  Last time I looked, I don't think Oracle
> supported local redo in their multitenant architecture either.
>
>
>
> Regardless of what Oracle does, I agree this would be a huge step in the
> right direction for pg-DBAs.
> I have absolutely no clue about how much work is required etc., but I
> think it's kind of strange that no companies have invested in making this
> happen.
>

I manage database clusters where the number of databases is a reason not to
do logical replication based upgrades, where pg_upgrade is far preferred
instead.

If this were to be the case, I would be very concerned that a bunch of
things would have to change:
1.  Shared catalogs would have txid problems unless you stay with global
txids and then how do local wal streams work there?
2.  Possibility that suddenly streaming replication has the possibility of
different databases having different amounts of lag
3.  Problems with io management on WAL on high throughput systems (I have
systems where a db cluster generates 10-20TB of WAL per day)

So I am not at all sure this would be a step in the right direction or
worth the work.

>
> --
> Andreas Joseph Krogh
>


-- 
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor
lock-in.
http://www.efficito.com/learn_more


Re: Audit Role Connections

2020-05-29 Thread Chris Morris
Ah, I do appear to have that enabled (inside Heroku's config), but I can't
find anything like that in the logs, so I've opened a ticket with them. Thx
a lot!

On Fri, May 29, 2020 at 2:25 PM Peter J. Holzer  wrote:

> On 2020-05-29 12:42:47 -0500, Chris Morris wrote:
> > We're using Heroku's PG,
> [...]
> > Other than polling pg_stat_activity (which isn't 100% accurate depending
> on
> > timing), is there a good way to audit connections? To detect which roles
> are
> > being used for connections?
>
> Do you have access to the log files?
>
> If you log_connections is on, you get messages like these:
>
>
> 2020-05-29 21:00:02 CEST [27995]: [2-1] user=w*,db=wds,pid=27995 LOG:
> connection authorized: user=w* database=wds
> 2020-05-29 21:00:18 CEST [27995]: [9-1] user=w*,db=wds,pid=27995 LOG:
> disconnection: session time: 0:00:15.979 user=w* database=wds
> host=[local]
> 2020-05-29 21:07:14 CEST [7481]: [2-1] user=u*,db=wds,pid=7481 LOG:
> connection authorized: user=u* database=wds
> 2020-05-29 21:07:14 CEST [7481]: [7-1] user=u*,db=wds,pid=7481 LOG:
> disconnection: session time: 0:00:00.016 user=u* database=wds
> host=[local]
> 2020-05-29 21:10:56 CEST [13918]: [2-1] user=m***,db=wds,pid=13918
> LOG:  connection authorized: user=m*** database=wds SSL enabled
> (protocol=TLSv1.2, cipher=ECDHE-RSA-AES256-GCM-SHA384, bits=256,
> compression=off)
> 2020-05-29 21:10:56 CEST [13918]: [11-1] user=m***,db=wds,pid=13918
> LOG:  disconnection: session time: 0:00:00.117 user=m*** database=wds
> host=143.130.**.** port=54037
>
> (user names and IP addresses censored for privacy reasons)
>
> hp
>
> --
>_  | Peter J. Holzer| Story must make more sense than reality.
> |_|_) ||
> | |   | h...@hjp.at |-- Charles Stross, "Creative writing
> __/   | http://www.hjp.at/ |   challenge!"
>


Audit Role Connections

2020-05-29 Thread Chris Morris
We're using Heroku's PG, and it comes with a default, almost super-user.
I'm wanting to restrict that account for only our production app servers,
and have anyone connecting for personal queries to go through an assigned
role by employee (that's all setup and working).

Other than polling pg_stat_activity (which isn't 100% accurate depending on
timing), is there a good way to audit connections? To detect which roles
are being used for connections?


Re: surprisingly slow creation of gist index used in exclude constraint

2020-05-15 Thread Chris Withers

On 14/05/2020 21:31, Tom Lane wrote:

Chris Withers  writes:

   It has 4.1 million rows in it and while importing the data only
   takes a couple of minutes, when I did a test load into the new
   cluster, building the mkt_profile_period_col1_col4_col2_chan_excl
   index for the exclude constraint took 15 hours.


Don't recall for sure, but I think GIST index build is sensitive
to the maintenance_work_mem setting; did you have that cranked up?


postgres=# show maintenance_work_mem;
 maintenance_work_mem
--
 2GB
(1 row)

Would it be worth turning that up more? The server has ~130GB memory.
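
(If so, presumably something like the following, set just for the session
doing the restore/index build, with the value being no more than a guess:)

set maintenance_work_mem = '16GB';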


 - failing that, what can I do to import and then create the index
   in the background?


CREATE INDEX CONCURRENTLY, perhaps.


How would I bring this into play with respect to the dump and load 
cycle? Is there an option to pg_dump or something else I should use or 
is this a case of awk'ing the create index in the dump?
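
(One sketch, assuming a custom- or directory-format dump: the --section
switch of pg_dump/pg_restore would let the schema and data go in first and the
index and constraint builds run as a separate, later step. The dump file name
and target database below are made up.)

pg_restore -d newdb --section=pre-data --section=data mkt_profile.dump
pg_restore -d newdb --section=post-data mkt_profile.dump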


Chris




Re: surprisingly slow creation of gist index used in exclude constraint

2020-05-15 Thread Chris Withers

On 14/05/2020 21:16, k...@rice.edu wrote:


Hi Chris,

This sounds like a candidate for pg_logical replicating from the old to 
new system.


Can you point me to a good guide as to how to easily set this up for one 
database and would work between pg 9.4 and pg 11.5?


cheers,

Chris




surprisingly slow creation of gist index used in exclude constraint

2020-05-14 Thread Chris Withers

  
  
Hi,
I'm upgrading a database from 9.4 to 11.5 by dumping from the old
  cluster and loading into the new cluster.
The database is tiny: around 2.3G, but importing this table is
  proving problematic:
Column  |   Type|Modifiers
+---+--
 period | tsrange   | not null
 col1   | character varying | not null
 col2   | character varying | not null
 col3   | integer   | not null
 col4   | character varying | not null default ''::character varying
 id | integer   | not null default nextval('mkt_profile_id_seq'::regclass)
 deleted| boolean   | not null default false
 managed| boolean   | not null default false
 col5   | character varying |
Indexes:
"mkt_profile_pkey" PRIMARY KEY, btree (id)
"mkt_profile_period_col1_col4_col2_chan_excl" EXCLUDE USING gist (period WITH &&, col1 WITH =, col4 WITH =, col2 WITH =, col3 WITH =)
Check constraints:
"mkt_profile_period_check" CHECK (period <> 'empty'::tsrange)
Foreign-key constraints:
"mkt_profile_col1_fkey" FOREIGN KEY (col1) REFERENCES host(name)
It has 4.1 million rows in it and while importing the data only
  takes a couple of minutes, when I did a test load into the new
  cluster, building the mkt_profile_period_col1_col4_col2_chan_excl
  index for the exclude constraint took 15 hours.
I feel like asking what I'm doing wrong here? The new server is
  pretty decent hardware...
Concrete questions:
- what, if anything, am I getting badly wrong here?
- what can I do to speed up creation of this index?
- failing that, what can I do to import and then create the index
  in the background? 

As you can imagine, a 15hr outage for an upgrade has not met with
  large amounts of happiness from the people whose application it is
  ;-)
Chris

  



script libraries?

2020-04-30 Thread Chris Stephens
as another Oracle DBA trying to pick up Postgresql one thing i haven't come
across are script libraries such as there are for Oracle (
https://github.com/tanelpoder/tpt-oracle and
https://oracle-base.com/dba/scripts as examples).

Does anything like that exist for PG? Would be nice to see how people
navigate through PG on the command line.

Thanks!


Re: possibilities for SQL optimization

2020-04-16 Thread Chris Stephens
On Thu, Apr 16, 2020 at 10:47 AM Michael Lewis  wrote:

> My other thought was to range partition by pixelID + brin index.
>

> I would expect brin index to be INSTEAD of partitioning. You didn't share
> buffer hits, which I expect were 100% on the subsequent explain analyze
> runs, but the index scan may still be faster if the planner knows it only
> needs to scan a few small indexes on one, or a few, partitions.
>

agreed but i wondered if partition elimination might be a faster way to
eliminate significant portions of the table up front, then possibly parallelize
remaining partitioned brin index scans. not even sure it's worth trying
though. this is a data volume vs cache size and predicate count (w/ each
predicate requiring a very efficient but not instantaneous index lookup)
issue.


>
What sort of growth do you see on this table? Is future scalability a
> significant concern, or is the problem just that 40-300ms for this select
> is unacceptable?
>

that's not really clear to me at this point but data will grow linearly for
a year and then remain constant. i think current volume represents ~ 3
months of data but i'm not sure. it is the 40-300ms response time that is
the issue. this system has ridiculous time constraints and has other
processing separate from database queries to account for. query response
times must fit into those requirements but since other pieces are still
being looked at, specific requirements aren't available as far as i
understand. "as fast as possible" is what we have right now. :(


>
>
> https://www.postgresql.org/docs/current/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-ASYNC-BEHAVIOR
> Have you tuned effective_io_concurrency? The documentation says "this
> setting only affects bitmap heap scans" and nearly all the time is there.
> If it is still set to 1 as default, then increasing to 200 or perhaps more
> might be prudent when on SSD or other memory backed storage. You don't even
> need to change the server config defaults for testing the impact-
>
> set effective_io_concurrency = 200;
> /* select query */
> reset effective_io_concurrency; /* if doing other things in the same
> session and wanting to revert to default behavior, else just disconnect */
>

I just tried that. results were same as without. thanks for the suggestion!


Re: possibilities for SQL optimization

2020-04-16 Thread Chris Stephens
disastrous :)

Planning Time: 7.569 ms
Execution Time: 316969.474 ms

On Thu, Apr 16, 2020 at 9:23 AM Pavel Stehule 
wrote:

>
>
> čt 16. 4. 2020 v 16:08 odesílatel Chris Stephens 
> napsal:
>
>> PG12
>> RHEL 8
>>
>> I suspect there's little I can do to get the following query to run
>> faster/more efficiently but thought I'd post to list and confirm.
>>
>> Caveat: I'm coming from an Oracle background and am extremely wet behind
>> ears w/ respect to postgresql (please be kind :)).
>>
>> Right now, we can't change the software generating the SQL though that is
>> very possible/likely in the future. For now, the query is what it is. I can
>> alter indexing, add table partitioning, or anything else that doesn't
>> change logical structure of table.
>>
>> This is a very wide table holding astronomical data which will be used to
>> track changes in the sky to generate alerts for astronomers to evaluate.
>> The query has a variable number of "pixelId BETWEEN" predicates. As data
>> volume in DiaSource increases, performance of query decreases. I need to
>> confirm this but I suspect the primary reason for the decreased performance
>> is increased number of "pixelId BETWEEN" predicates generated by
>> application. Predicate count is the only thing that changes. I don't think
>> performance would otherwise degrade given execution plan.
>>
>> [local] @z=# \dS+ "DiaSource"
>> Table
>> "public.DiaSource"
>>Column|Type | Collation |
>> Nullable |Default| Storage | Stats target | Description
>>
>> -+-+---+--+---+-+--+-
>>  diaSourceId | bigint  |   |
>> not null | '0'::bigint   | plain   |  |
>>  ccdVisitId  | bigint  |   |
>> not null | '0'::bigint   | plain   |  |
>>  diaObjectId | bigint  |   |
>>  | '0'::bigint   | plain   |  |
>>  ssObjectId  | bigint  |   |
>>  | '0'::bigint   | plain   |  |
>>  parentDiaSourceId   | bigint  |   |
>>  | '0'::bigint   | plain   |  |
>>  prv_procOrder   | integer |   |
>> not null | 0 | plain   |  |
>>  ssObjectReassocTime | timestamp without time zone |   |
>>  |   | plain   |  |
>>  midPointTai | double precision|   |
>> not null | '0'::double precision | plain   |  |
>>  ra  | double precision|   |
>> not null | '0'::double precision | plain   |  |
>>  raSigma | double precision|   |
>> not null | '0'::double precision | plain   |  |
>>  decl| double precision|   |
>> not null | '0'::double precision | plain   |  |
>>  declSigma   | double precision|   |
>> not null | '0'::double precision | plain   |  |
>>  ra_decl_Cov | double precision|   |
>> not null | '0'::double precision | plain   |  |
>>  x   | double precision|   |
>> not null | '0'::double precision | plain   |  |
>>  xSigma  | double precision|   |
>> not null | '0'::double precision | plain   |  |
>>  y   | double precision|   |
>> not null | '0'::double precision | plain   |  |
>>  ySigma  | double precision|   |
>> not null | '0'::double precision | plain   |  |
>>  x_y_Cov | double precision|   |
>> not null | '0'::double precision | plain   |  |
>>  apFlux  | double precision|   |
>> not null | '0'::double precision | plain   |  |
>>  apFluxErr   | double precision|   |
>> not null |

possibilities for SQL optimization

2020-04-16 Thread Chris Stephens
0..4.67
rows=10 width=0) (actual time=0.009..0.009 rows=15 loops=1)
   Index Cond: (("pixelId" >= '10729625747456'::bigint) AND
("pixelId" <= '10729625812991'::bigint))
 ->  Bitmap Index Scan on "IDX_DiaSource_htmId20"  (cost=0.00..4.72
rows=15 width=0) (actual time=0.003..0.003 rows=17 loops=1)
   Index Cond: (("pixelId" >= '10729625878528'::bigint) AND
("pixelId" <= '10729626009599'::bigint))
 ->  Bitmap Index Scan on "IDX_DiaSource_htmId20"  (cost=0.00..4.98
rows=41 width=0) (actual time=0.004..0.004 rows=32 loops=1)
   Index Cond: (("pixelId" >= '10729626337280'::bigint) AND
("pixelId" <= '10729626796031'::bigint))
 ->  Bitmap Index Scan on "IDX_DiaSource_htmId20"  (cost=0.00..6.22
rows=165 width=0) (actual time=0.029..0.029 rows=144 loops=1)
   Index Cond: (("pixelId" >= '10729626861568'::bigint) AND
("pixelId" <= '10729628893183'::bigint))
 ->  Bitmap Index Scan on "IDX_DiaSource_htmId20"  (cost=0.00..5.80
rows=123 width=0) (actual time=0.016..0.016 rows=100 loops=1)
   Index Cond: (("pixelId" >= '10729628958720'::bigint) AND
("pixelId" <= '10729630466047'::bigint))
 ->  Bitmap Index Scan on "IDX_DiaSource_htmId20"  (cost=0.00..4.67
rows=10 width=0) (actual time=0.005..0.005 rows=4 loops=1)
   Index Cond: (("pixelId" >= '10729630597120'::bigint) AND
("pixelId" <= '10729630662655'::bigint))
 ->  Bitmap Index Scan on "IDX_DiaSource_htmId20"  (cost=0.00..4.67
rows=10 width=0) (actual time=0.003..0.003 rows=2 loops=1)
   Index Cond: (("pixelId" >= '10729630793728'::bigint) AND
("pixelId" <= '10729630859263'::bigint))
 ->  Bitmap Index Scan on "IDX_DiaSource_htmId20"  (cost=0.00..4.92
rows=35 width=0) (actual time=0.006..0.006 rows=22 loops=1)
   Index Cond: (("pixelId" >= '10729630990336'::bigint) AND
("pixelId" <= '10729631383551'::bigint))
 ->  Bitmap Index Scan on "IDX_DiaSource_htmId20"  (cost=0.00..6.32
rows=175 width=0) (actual time=0.031..0.031 rows=155 loops=1)
   Index Cond: (("pixelId" >= '10729631449088'::bigint) AND
("pixelId" <= '10729633611775'::bigint))
 ->  Bitmap Index Scan on "IDX_DiaSource_htmId20"  (cost=0.00..4.82
rows=25 width=0) (actual time=0.022..0.022 rows=58 loops=1)
   Index Cond: (("pixelId" >= '10729650388992'::bigint) AND
("pixelId" <= '10729650651135'::bigint))
 ->  Bitmap Index Scan on "IDX_DiaSource_htmId20"  (cost=0.00..5.18
rows=61 width=0) (actual time=0.009..0.009 rows=63 loops=1)
   Index Cond: (("pixelId" >= '10729650716672'::bigint) AND
("pixelId" <= '10729651437567'::bigint))
 ->  Bitmap Index Scan on "IDX_DiaSource_htmId20"  (cost=0.00..4.82
rows=25 width=0) (actual time=0.019..0.019 rows=30 loops=1)
   Index Cond: (("pixelId" >= '10729652748288'::bigint) AND
("pixelId" <= '10729653010431'::bigint))
 ->  Bitmap Index Scan on "IDX_DiaSource_htmId20"  (cost=0.00..4.67
rows=10 width=0) (actual time=0.006..0.006 rows=10 loops=1)
   Index Cond: (("pixelId" >= '10729653141504'::bigint) AND
("pixelId" <= '10729653207039'::bigint))
 ->  Bitmap Index Scan on "IDX_DiaSource_htmId20"  (cost=0.00..4.67
rows=10 width=0) (actual time=0.004..0.004 rows=1 loops=1)
   Index Cond: (("pixelId" >= '10729653272576'::bigint) AND
("pixelId" <= '10729653338111'::bigint))
 ->  Bitmap Index Scan on "IDX_DiaSource_htmId20"  (cost=0.00..4.67
rows=10 width=0) (actual time=0.003..0.003 rows=0 loops=1)
   Index Cond: (("pixelId" >= '10729653600256'::bigint) AND
("pixelId" <= '10729653665791'::bigint))
 ->  Bitmap Index Scan on "IDX_DiaSource_htmId20"  (cost=0.00..4.82
rows=25 width=0) (actual time=0.006..0.006 rows=20 loops=1)
   Index Cond: (("pixelId" >= '10729653796864'::bigint) AND
("pixelId" <= '10729654059007'::bigint))
 ->  Bitmap Index Scan on "IDX_DiaSource_htmId20"  (cost=0.00..4.67
rows=10 width=0) (actual time=0.003..0.003 rows=1 loops=1)
   Index Cond: (("pixelId" >= '10729654190080'::bigint) AND
("pixelId" <= '10729654255615'::bigint))
 ->  Bitmap Index Scan on "IDX_DiaSource_htmId20"  (cost=0.00..4.67
rows=10 width=0) (actual time=0.003..0.003 rows=2 loops=1)
   Index Cond: (("pixelId" >= '10729654321152'::bigint) AND
("pixelId" <= '10729654386687'::bigint))
 ->  Bitmap Index Scan on "IDX_DiaSource_htmId20"  (cost=0.00..4.72
rows=15 width=0) (actual time=0.003..0.003 rows=2 loops=1)
   Index Cond: (("pixelId" >= '10729654452224'::bigint) AND
("pixelId" <= '10729654583295'::bigint))
 ->  Bitmap Index Scan on "IDX_DiaSource_htmId20"  (cost=0.00..5.03
rows=46 width=0) (actual time=0.026..0.026 rows=47 loops=1)
   Index Cond: (("pixelId" >= '10729734275072'::bigint) AND
("pixelId" <= '10729734799359'::bigint))
 ->  Bitmap Index Scan on "IDX_DiaSource_htmId20"  (cost=0.00..5.03
rows=46 width=0) (actual time=0.009..0.009 rows=45 loops=1)
   Index Cond: (("pixelId" >= '10729735061504'::bigint) AND
("pixelId" <= '10729735585791'::bigint))
 ->  Bitmap Index Scan on "IDX_DiaSource_htmId20"  (cost=0.00..5.03
rows=46 width=0) (actual time=0.018..0.019 rows=54 loops=1)
   Index Cond: (("pixelId" >= '10729735847936'::bigint) AND
("pixelId" <= '10729736372223'::bigint))
 ->  Bitmap Index Scan on "IDX_DiaSource_htmId20"  (cost=0.00..4.82
rows=25 width=0) (actual time=0.039..0.039 rows=42 loops=1)
   Index Cond: (("pixelId" >= '10729737945088'::bigint) AND
("pixelId" <= '10729738207231'::bigint))
 ->  Bitmap Index Scan on "IDX_DiaSource_htmId20"  (cost=0.00..4.72
rows=15 width=0) (actual time=0.006..0.006 rows=8 loops=1)
   Index Cond: (("pixelId" >= '10729738993664'::bigint) AND
("pixelId" <= '10729739124735'::bigint))
 ->  Bitmap Index Scan on "IDX_DiaSource_htmId20"  (cost=0.00..4.67
rows=10 width=0) (actual time=0.006..0.006 rows=20 loops=1)
   Index Cond: (("pixelId" >= '10729739190272'::bigint) AND
("pixelId" <= '10729739255807'::bigint))
 ->  Bitmap Index Scan on "IDX_DiaSource_htmId20"  (cost=0.00..4.72
rows=15 width=0) (actual time=0.004..0.004 rows=4 loops=1)
   Index Cond: (("pixelId" >= '10729740566528'::bigint) AND
("pixelId" <= '10729740697599'::bigint))
 ->  Bitmap Index Scan on "IDX_DiaSource_htmId20"  (cost=0.00..4.72
rows=15 width=0) (actual time=0.005..0.005 rows=16 loops=1)
   Index Cond: (("pixelId" >= '10729740763136'::bigint) AND
("pixelId" <= '10729740894207'::bigint))
 ->  Bitmap Index Scan on "IDX_DiaSource_htmId20"  (cost=0.00..4.72
rows=15 width=0) (actual time=0.005..0.005 rows=13 loops=1)
   Index Cond: (("pixelId" >= '10729740959744'::bigint) AND
("pixelId" <= '10729741090815'::bigint))
 ->  Bitmap Index Scan on "IDX_DiaSource_htmId20"  (cost=0.00..4.67
rows=10 width=0) (actual time=0.020..0.020 rows=23 loops=1)
   Index Cond: (("pixelId" >= '10729741484032'::bigint) AND
("pixelId" <= '10729741549567'::bigint))
 ->  Bitmap Index Scan on "IDX_DiaSource_htmId20"  (cost=0.00..4.67
rows=10 width=0) (actual time=0.005..0.005 rows=3 loops=1)
   Index Cond: (("pixelId" >= '10729741877248'::bigint) AND
("pixelId" <= '10729741942783'::bigint))
 ->  Bitmap Index Scan on "IDX_DiaSource_htmId20"  (cost=0.00..4.67
rows=10 width=0) (actual time=0.012..0.012 rows=11 loops=1)
   Index Cond: (("pixelId" >= '10729748955136'::bigint) AND
("pixelId" <= '10729749020671'::bigint))
 ->  Bitmap Index Scan on "IDX_DiaSource_htmId20"  (cost=0.00..4.82
rows=25 width=0) (actual time=0.013..0.013 rows=15 loops=1)
   Index Cond: (("pixelId" >= '10729868492800'::bigint) AND
("pixelId" <= '10729868754943'::bigint))
 ->  Bitmap Index Scan on "IDX_DiaSource_htmId20"  (cost=0.00..4.67
rows=10 width=0) (actual time=0.006..0.006 rows=11 loops=1)
   Index Cond: (("pixelId" >= '10729869082624'::bigint) AND
("pixelId" <= '10729869148159'::bigint))
 ->  Bitmap Index Scan on "IDX_DiaSource_htmId20"  (cost=0.00..4.67
rows=10 width=0) (actual time=0.004..0.004 rows=1 loops=1)
   Index Cond: (("pixelId" >= '10729869344768'::bigint) AND
("pixelId" <= '10729869410303'::bigint))
 ->  Bitmap Index Scan on "IDX_DiaSource_htmId20"  (cost=0.00..4.67
rows=10 width=0) (actual time=0.040..0.040 rows=12 loops=1)
   Index Cond: (("pixelId" >= '10776494538752'::bigint) AND
("pixelId" <= '10776494604287'::bigint))
First execution:
 Planning Time: 4.742 ms
 Execution Time: 272.052 ms
Subsequent executions:
  Planning Time: 1.070 ms
  Execution Time: 40.963 ms

select count(*) from "DiaSource"
[more] - > ;
   count
---
 302971183

So PG is taking each predicate -> searching through IDX_DiaSource_htmId20
-> building a bitmap of ctids (I think that's the counterpart of Oracle's rowid)
-> visiting the table blocks containing any of those pixelId ranges only once for
the additional columns. That seems extremely efficient to me.
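For reference, a stripped-down sketch of the query shape the plan implies (the
range values are taken from the index conditions above; the real statement
carries many more BETWEEN predicates and possibly other filters):

    SELECT *
    FROM "DiaSource"
    WHERE ("pixelId" BETWEEN 10729625747456 AND 10729625812991)
       OR ("pixelId" BETWEEN 10729625878528 AND 10729626009599)
       OR ("pixelId" BETWEEN 10729626337280 AND 10729626796031);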

I did try a "create table x as select * from diasource" sorting the table
by pixelID and creating a brin index to see how that went. Not well (~2.5
second response time). My other thought was to range partition by pixelID +
brin index. I might also play w/ parallel query but process communication
overhead and scalability implications (100's of concurrent processes
running similar SQL) probably make that a dead end.
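For concreteness, a hedged sketch of that re-clustering experiment (table and
index names here are illustrative, not the ones actually used):

    -- copy of the table physically ordered by pixelId
    CREATE TABLE diasource_by_pixel AS
      SELECT * FROM "DiaSource" ORDER BY "pixelId";

    -- BRIN works best when the column correlates with physical order
    CREATE INDEX diasource_by_pixel_brin
      ON diasource_by_pixel USING brin ("pixelId");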

Anyways, I figured I'd submit this to the list first to see if there might
be better possibilities.

Thanks for any input!

Chris


Re: Log Unique Queries without Params?

2020-04-12 Thread Chris Morris
Thx!

On Sat, Apr 11, 2020 at 11:55 PM Julien Rouhaud  wrote:

> On Sun, Apr 12, 2020 at 6:51 AM Chris Morris 
> wrote:
> >
> > I have a local script I've written that will scan a log of PG queries to
> extract out unique queries without any specific parameter data. For
> example, if these 2 queries are actually run:
> >
> > SELECT * FROM foo where bar = 1;
> > SELECT * FROM foo where bar = 2;
> >
> > It will capture only:
> >
> > SELECT * FROM foo where bar = :id;
> >
> > Are there any existing tools that do this already for me? I'm
> considering setting up a server that can have logs forwarded to it and only
> logging unique queries like this, but I don't want to build anything that
> may already exist out there.
>
> pgbadger (http://pgbadger.darold.net/#about) will do that and much
> more.  Depending on what you want to achieve maybe pg_stat_statements
> (https://www.postgresql.org/docs/current/pgstatstatements.html) is
> also an alternative.
>
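For anyone finding this later, a minimal sketch of the pg_stat_statements route
mentioned above (assumes the extension is installed and already listed in
shared_preload_libraries, which requires a restart):

    CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

    -- most frequently executed normalized statements
    SELECT query, calls
    FROM pg_stat_statements
    ORDER BY calls DESC
    LIMIT 20;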


Log Unique Queries without Params?

2020-04-11 Thread Chris Morris
I have a local script I've written that will scan a log of PG queries to
extract out unique queries without any specific parameter data. For
example, if these 2 queries are actually run:

SELECT * FROM foo where bar = 1;
SELECT * FROM foo where bar = 2;

It will capture only:

SELECT * FROM foo where bar = :id;

Are there any existing tools that do this already for me? I'm considering
setting up a server that can have logs forwarded to it and only logging
unique queries like this, but I don't want to build anything that may
already exist out there.


Re: PG 12: Partitioning across a FDW?

2020-03-25 Thread Chris Morris
Right now my dbs are hosted by Heroku, so I doubt I have any control over
the dbs at that level.

Thanks for the idea though! :)

On Wed, Mar 25, 2020 at 12:04 PM Michael Lewis  wrote:

> Chris,
> Does it actually need to be a different server and database, or would it
> be possible to have another storage device added to your existing database
> and make use of tablespaces to accomplish pseudo-archive of older
> partitions? Just a thought.
>
>>
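For completeness, a minimal sketch of the tablespace idea (not applicable on
Heroku; the path and table names are purely illustrative, and the directory
must already exist and be owned by the postgres OS user):

    CREATE TABLESPACE archive_space LOCATION '/mnt/archive_disk/pgdata';

    -- move an older partition onto the slower/cheaper storage
    ALTER TABLE orders_2018 SET TABLESPACE archive_space;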


Re: PG 12: Partitioning across a FDW?

2020-03-25 Thread Chris Morris
>
> Not yet..  There is ongoing work to make that happen though.


Glad to hear it. :) Thx.


Re: PG 12: Partitioning across a FDW?

2020-03-25 Thread Chris Morris
>
> > Is it even possible to use PG partitioning across a Foreign Server?
> I am not certain what you mean, but you can have foreign tables as
> partitions
> of a partitioned table.  The partitions won't be processed in parallel
> though.


I have a large, growing table that I'd like to start partitioning, but
also would like "older" partitions to be stored in a separate database,
connected via FDW. I haven't played around with partitioning at all yet, so
clearly I'm not sure how to ask the question :)

The hope is to still have one "seamless" table users can query, but to
spread the storage across different databases. I realize that may be asking
for too much.
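A hedged sketch of what Laurenz describes, with a foreign table attached as a
partition of a local partitioned table (server name, table names, and bounds
are all illustrative):

    CREATE TABLE events (
        id         bigint NOT NULL,
        created_at date   NOT NULL
    ) PARTITION BY RANGE (created_at);

    -- current data stays local
    CREATE TABLE events_2020 PARTITION OF events
        FOR VALUES FROM ('2020-01-01') TO ('2021-01-01');

    -- older data lives in the other database, reached through postgres_fdw
    CREATE FOREIGN TABLE events_2019 PARTITION OF events
        FOR VALUES FROM ('2019-01-01') TO ('2020-01-01')
        SERVER archive_server
        OPTIONS (schema_name 'public', table_name 'events_2019');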


PG 12: Partitioning across a FDW?

2020-03-24 Thread Chris Morris
Is it even possible to use PG partitioning across a Foreign Server?


Re: Fwd: sensible configuration of max_connections

2020-02-07 Thread Chris Withers

On 07/02/2020 12:49, Chris Ellis wrote:

What's "too much" for max_connections? What happens when you set it to

high? What factors affect that number?


When sizing max_connections you need to trade off how many connections 
your application will use at peak vs how much RAM and CPU you have.


Each connection is capable of allocating work_mem and has a stack etc.

As such you don't want max_connections to be able to run your system 
out of RAM.
Sure, but that's where I'm trying to find out what's sensible. The box 
has 196GB memory, most of that in hugepages, 18 core Intel Skylake with 
HT on giving 36 cores and tonnes of SSD for storage. How would I turn 
that spec into a sensible number for max_connections? As that number 
grows, what contention points in postgres will start creaking (shared 
memory where the IPC happens?)


In case I forgot to say, this is PostgreSQL 11...

Chris

PS: definitely thinking of pg_bouncer, but still trying to figure out 
what to sensibly set for max_connections.


Fwd: sensible configuration of max_connections

2020-02-07 Thread Chris Ellis
Hi Chris

On Fri, 7 Feb 2020, 08:36 Chris Withers,  wrote:

> Hi All,
>
> What's a sensible way to pick the number to use for max_connections?
>

Sensible in this context is some what variable.  Each connection in
PostgreSQL will be allocated a backend process.  These are not the lightest
weight of things.

Each connection takes up space in shared memory, as mentioned in the manual.


> I'm looking after a reasonable size multi-tenant cluster, where the
> master handles all the load and there's a slave in case of hardware
> failure in the master.
> The machine is used to host what I suspect are mainly django
> applications, so lots of short web requests, not sure how much, if any,
> django's orm does connection pooling.
>
> I arbitrarily picked 1000 for max_connections and haven't had any
> problems, but onboarding another app that handles a few million requests
> per day  on Monday and thinking about potential problems related to the
> number of available connections.
>
> What's "too much" for max_connections? What happens when you set it to
> high? What factors affect that number?
>

When sizing max_connections you need to trade off how many connections your
application will use at peak vs how much RAM and CPU you have.

Each connection is capable of allocating work_mem and has a stack etc.

As such you don't want max_connections to be able to run your system out of
RAM.
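A hedged way to see how close you actually get to the limit at peak, before
settling on a number (plain SQL, run on the master):

    SELECT count(*)                           AS connections_in_use,
           current_setting('max_connections') AS max_connections
    FROM pg_stat_activity;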

Given your situation I'd very seriously look at connection pooling using
PgBouncer or similar.  That way you can run with a far smaller
max_connections and still cope with applications configured with large
usually idle connection pools.


> cheers,
>
> Chris
>

Regards,
Chris Ellis

>


sensible configuration of max_connections

2020-02-07 Thread Chris Withers

Hi All,

What's a sensible way to pick the number to use for max_connections?

I'm looking after a reasonable size multi-tenant cluster, where the 
master handles all the load and there's a slave in case of hardware 
failure in the master.
The machine is used to host what I suspect are mainly django 
applications, so lots of short web requests; I'm not sure how much 
connection pooling, if any, django's ORM does.


I arbitrarily picked 1000 for max_connections and haven't had any 
problems, but I'm onboarding another app that handles a few million requests 
per day on Monday and thinking about potential problems related to the 
number of available connections.


What's "too much" for max_connections? What happens when you set it to 
high? What factors affect that number?


cheers,

Chris




Re: Should I reinstall over current installation?

2020-02-05 Thread Chris Charley
Moreno, thank you for all your help.

Following your instructions, I was able to recover my databases. All is
good.

Chris


On Wed, Feb 5, 2020 at 6:45 AM Moreno Andreo 
wrote:

> On 04/02/2020 21:18, Chris Charley wrote:
> > Hello Moreno
> >
> > Thanks for the reply!
> >
> > I ran Services and it reported PostgreSQL as Disabled.
>
> A disabled service will never run nor leave error messages anywhere
>
> Right click on the Postgresql service, select properties.
> In the next window, choose startup type (the combo under the executable
> path) and select Automatic, then click on the Start button below
> Click OK and close the window.
>
> Now something should happen. If everything goes well, You'll see
> "Started" and "Automatic" in the two columns next to the service
> description, and you can start your application
> If there is an error message, then it should log something. See my last
> mail to extract log informations and post them here
>
> > I am considering reinstall because I couldn't find any helpful info
> > from these services.
>
> You got no helpful info because the service does not start, and so does not
> log :-)
>
> > Thanks again for your help.
> >  I suspect when Windows was reinstalled, the firewall was reconfigured
> > and PostgreSQL wasn't allowed to access the port?
>
> ATM there's a service start issue to be resolved, if there's also
> something about the firewall, we'll think about it later.
>
> Moreno.-
>
>
>
>


Re: Should I reinstall over current installation?

2020-02-03 Thread Chris Charley
I tried items you suggested (1-5), but could find no helpful info.

Thanks for your help and going the extra mile!

On Mon, Feb 3, 2020 at 4:23 PM Adrian Klaver 
wrote:

> On 2/3/20 1:11 PM, Chris Charley wrote:
> > Hi
> >
> > The link you provided
> > (
> https://www.howtogeek.com/123646/htg-explains-what-the-windows-event-viewer-is-and-how-you-can-use-it/
> )
> >
> > Points me to Event Viewer but I'm kinda lost there trying to find
> > anything pertaining to postgreql.
>
> I cranked up a Windows 7 instance, which should be close enough. What I
> found:
>
> 1) Open Event Viewer
>
> 2) Click on Windows Logs
>
> 3) Click on Application
>
> 4) The center section will show logs. The application name is under Source.
>
> 5) If you want to search, then Ctrl+f and enter Postgres
>
> >
> > On Mon, Feb 3, 2020 at 4:02 PM Adrian Klaver  > <mailto:adrian.kla...@aklaver.com>> wrote:
> >
> > On 2/3/20 12:54 PM, Chris Charley wrote:
> >  > /What does the Windows system log show when you do this?/
> >  >
> >  > I don't know where to find this log (for Windows 10)
> >  >
> >
> > I'm not a Windows user, what I can do is point you at:
> >
> >
> https://www.howtogeek.com/123646/htg-explains-what-the-windows-event-viewer-is-and-how-you-can-use-it/
> >
> >
> > --
> > Adrian Klaver
> > adrian.kla...@aklaver.com <mailto:adrian.kla...@aklaver.com>
> >
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>


Re: Should I reinstall over current installation?

2020-02-03 Thread Chris Charley
Hi

The link you provided (
https://www.howtogeek.com/123646/htg-explains-what-the-windows-event-viewer-is-and-how-you-can-use-it/
)

Points me to Event Viewer but I'm kinda lost there trying to find anything
pertaining to postgresql.

On Mon, Feb 3, 2020 at 4:02 PM Adrian Klaver 
wrote:

> On 2/3/20 12:54 PM, Chris Charley wrote:
> > /What does the Windows system log show when you do this?/
> >
> > I don't know where to find this log (for Windows 10)
> >
>
> I'm not a Windows user, what I can do is point you at:
>
>
> https://www.howtogeek.com/123646/htg-explains-what-the-windows-event-viewer-is-and-how-you-can-use-it/
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>


Re: Should I reinstall over current installation?

2020-02-03 Thread Chris Charley
> What does the Windows system log show when you do this?

I don't know where to find this log (for Windows 10)

On Mon, Feb 3, 2020 at 3:47 PM Adrian Klaver 
wrote:

> On 2/3/20 12:44 PM, Chris Charley wrote:
> Please reply to list also.
> Ccing list.
> > Adrian
> >
> > /Does Postgres still exist on the system?
> >
> > If it does have you started the service?/
> >
> > PostgreSQL-x64-12 is listed as 'stopped' in the Task Manager services tab.
> > When I try to 'start', it doesn't change to 'running'.
>
> What does the Windows system log show when you do this?
>
>
> >
> >
> > On Mon, Feb 3, 2020 at 3:26 PM Adrian Klaver  > <mailto:adrian.kla...@aklaver.com>> wrote:
> >
> > On 2/3/20 12:20 PM, Chris Charley wrote:
> >  > I just got my Windows 10 computer in Nov 2019 and installed
> > PostgreSQL
> >  > Jan this year.
> >  >
> >  > I had a problem with Windows update and when I contacted
> > Microsoft, they
> >  > installed a fresh operating system. After that, I was unable to
> run
> >  > postgres from pgAdmin 4 or from a perl program that accessed the
> > database.
> >  >
> >  > The message I got was:
> >  >
> >  > could not connect to server: Connection refused
> > (0x274D/10061) Is
> >  > the server running on host "localhost" (127.0.0.1) and accepting
> > TCP/IP
> >  > connections on port 5432? at pg_01.pl <http://pg_01.pl>
> > <http://pg_01.pl> line 8.
> >  > **//___^
> >
> > Does Postgres still exist on the system?
> >
> > If it does have you started the service?
> >
> >  > I  was considering 2 options:
> >  >
> >  > !) Reinstall postgres over the current without removing the
> current
> >  > postgres first.
> >  > 2) uninstall the current postgres and then install postgres again.
> >  >
> >  > The tables I created in my database are not many and if I lose
> them
> >  > doing a reinstall, I could recreate my database and the tables. I
> > was
> >  > considering option (1) as I was wondering if that choice would
> > keep the
> >  > database and tables I created this January.
> >  >
> >  > (The database and contained tables are from 'Practical SQL' by
> > Anthony
> >  > DeBarros.)
> >  >
> >  > A second question is: When I try to edit anything in pgAdmin,
> > sometimes
> >  > it won't let me edit the code. I then refresh the database in the
> > left
> >  > pane and then it allows me to edit the SQL in the query pane.
> >  >
> >  > Thank you for any help you may provide.
> >  >
> >  > Chris
> >
> >
> > --
> > Adrian Klaver
> > adrian.kla...@aklaver.com <mailto:adrian.kla...@aklaver.com>
> >
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>


Should I reinstall over current installation?

2020-02-03 Thread Chris Charley
I just got my Windows 10 computer in Nov 2019 and installed PostgreSQL Jan
this year.

I had a problem with Windows update and when I contacted Microsoft, they
installed a fresh operating system. After that, I was unable to run
postgres from pgAdmin 4 or from a perl program that accessed the database.

The message I got was:

could not connect to server: Connection refused (0x274D/10061) Is the
server running on host "localhost" (127.0.0.1) and accepting TCP/IP
connections on port 5432? at pg_01.pl line 8.

I  was considering 2 options:

1) Reinstall postgres over the current without removing the current
postgres first.
2) uninstall the current postgres and then install postgres again.

The tables I created in my database are not many and if I lose them doing a
reinstall, I could recreate my database and the tables. I was considering
option (1) as I was wondering if that choice would keep the database and
tables I created this January.

(The database and contained tables are from 'Practical SQL' by Anthony
DeBarros.)

A second question is: When I try to edit anything in pgAdmin, sometimes it
won't let me edit the code. I then refresh the database in the left pane
and then it allows me to edit the SQL in the query pane.

Thank you for any help you may provide.

Chris


Re: Where does postgres take its timezone information from?

2019-11-05 Thread Chris Withers

On 05/11/2019 22:54, Adrian Klaver wrote:

> On 11/5/19 2:46 PM, Chris Withers wrote:
> > Hi All,
> >
> > Brazil recently abolished daylight savings time, resulting in updates
> > to system timezone information packages.
> > Does postgres use these? If so, does it need a reload or restart to
> > see the updated zone info?
> >
> > If not, how does postgres store/obtain its timezone information
> > and how would this be updated?
>
> https://www.postgresql.org/about/news/1960/
> PostgreSQL 11.5, 10.10, 9.6.15, 9.5.19, 9.4.24, and 12 Beta 3 Released!
>
> "This update also contains tzdata release 2019b for DST law changes in
> Brazil, plus historical corrections for Hong Kong, Italy, and Palestine.
> This update also adds support for zic's new -b slim option to reduce the
> size of the installed zone files, though it is not currently being used
> by PostgreSQL."


Hmm. Is there any option to use the system timezone packages?

If not, why not?
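In the meantime, a hedged way to check what the running server currently
believes about the affected zone (pg_timezone_names is a standard view):

    SELECT name, abbrev, utc_offset, is_dst
    FROM pg_timezone_names
    WHERE name = 'America/Sao_Paulo';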

Chris




Where does postgres take its timezone information from?

2019-11-05 Thread Chris Withers

Hi All,

Brazil recently abolished daylight savings time, resulting in updates to 
system timezone information packages.
Does postgres use these? If so, does it need a reload or restart to see 
the updated zone info?


If not, how does postgres store/obtain its timezone information and 
how would this be updated?


cheers,

Chris




Re: Lookup Primary Key of Foreign Server's Table

2019-10-24 Thread Chris Morris
Yeah. Part of my confusion is due to Heroku providing a Data Links service
that handles a lot of the internal details around establishing a
foreign server connection, and not knowing exactly what to expect.

I experimented with IMPORT FOREIGN SCHEMA in a couple of test databases and
noticed that there was no declaration of primary key brought over to the
local database. So, this is something ActiveRecord simply won't be able to
handle right now, and I have to take care of that myself, which is easy
enough to do.

Thx for the feedback.

On Wed, Oct 23, 2019 at 11:12 AM David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Wed, Oct 23, 2019 at 8:47 AM Chris Morris 
> wrote:
>
>> The foreign table has a primary key. Ruby on Rails uses a system query to
>> lookup what the primary key on the table is, but it's querying the local
>> database, and not finding anything. In a schema dump of the local database,
>> I don't see a primary key defined, so I'm presuming I need to issue an ADD
>> CONSTRAINT command against the foreign table to reflect what is actually
>> true in the foreign table? Is that correct?
>>
>
> The documentation says doing what you suggest won't work...
>
> David J.
>
>


Re: Lookup Primary Key of Foreign Server's Table

2019-10-23 Thread Chris Morris
The foreign table has a primary key. Ruby on Rails uses a system query to
lookup what the primary key on the table is, but it's querying the local
database, and not finding anything. In a schema dump of the local database,
I don't see a primary key defined, so I'm presuming I need to issue an ADD
CONSTRAINT command against the foreign table to reflect what is actually
true in the foreign table? Is that correct?
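For reference, a sketch of the usual catalog query for a table's primary key;
run against the local database it returns nothing for a foreign table (the
constraint only exists on the foreign side), which matches what Rails runs into:

    SELECT a.attname AS pk_column
    FROM pg_index i
    JOIN pg_attribute a
      ON a.attrelid = i.indrelid
     AND a.attnum = ANY (i.indkey)
    WHERE i.indrelid = 'sidecar_link.actions'::regclass
      AND i.indisprimary;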


Lookup Primary Key of Foreign Server's Table

2019-10-22 Thread Chris Morris
I'm looking for a system query that will lookup the primary key column on a
fdw table. It's possible we need to declare that part of the foreign
table's schema in the local (is that the right term?) database?

Here's the foreign table - I don't see anything showing a primary key, so
my hunch is we need to declare it in the local schema?

=> \d sidecar_link.actions
                                  Foreign table "sidecar_link.actions"
      Column      |            Type             | Collation | Nullable |                     Default                      | FDW options
------------------+-----------------------------+-----------+----------+--------------------------------------------------+-------------
 id               | bigint                      |           | not null | nextval('sidecar_link.actions_id_seq'::regclass) |
 user_session_id  | bigint                      |           | not null |                                                  |
 user_id          | bigint                      |           | not null |                                                  |
 created_at       | timestamp without time zone |           | not null | now()                                            |
 occurred_at      | timestamp without time zone |           | not null | now()                                            |
 thing_id         | integer                     |           |          |                                                  |
 parent_thing_id  | integer                     |           |          |                                                  |
 viewing_id       | integer                     |           |          |                                                  |
 origin           | origin                      |           |          | 'mysteryscience'::origin                         |
 scope            | text                        |           | not null |                                                  |
 name             | text                        |           | not null |                                                  |
 details          | text                        |           |          |                                                  |
 request_path     | text                        |           |          |                                                  |
Server: pg_mysterysci_sidecar
FDW options: (schema_name 'public', table_name 'actions')
Not really related question, but a curiosity: why does this table not show
in the list of foreign tables?

=> \det
 List of foreign tables
 Schema | Table | Server
--------+-------+--------
(0 rows)


Re: Using PostgreSQL for Machine Learning Data Pipelines

2019-10-17 Thread chris
Hi there,


There is a project named Apache MADlib that may help you.


http://madlib.apache.org


Thanks,
Chris
yuanzefuwa...@126.com


On 10/18/2019 02:04,Pankaj Jangid wrote:
Hi,

I am working on a machine-learning project. Because of the available
study material in the ML area, the team is inclined towards Apache
Kafka, Apache Spark for data-pipelines and analytics.

Our requirement is to store huge amounts of continuously increasing data
that cannot fit into a single machine. The algorithms require data in
batches so it is not necessary to keep full data ready for
consumption. Using Kafka, the data can be distributed and fetched in
varying batch sizes as and when required.

I am more comfortable with PostgreSQL. And wanted to know more about
case-studies where PostgreSQL is deployed for ML use. Any pointers
referring to study material will be helpful. Please share in this
thread.

--
Thanks & Regards,
Pankaj Jangid



Re: kind of a bag of attributes in a DB . . .

2019-09-15 Thread Chris Travers
On Sat, Sep 14, 2019 at 5:11 PM Albretch Mueller  wrote:

>  just download a bunch of json info files from youtube data Feeds
>
>  Actually, does postgresql has a json Driver of import feature?
>

Sort of. There are a bunch of features around the JSON and JSONB data
types which could be useful.

>
>  the metadata contained in json files would require more than one
> small databases, but such an import feature should be trivial
>

It is not at all trivial for a bunch of reasons inherent to the JSON
specification.  How to handle duplicate keys, for example.

However, writing an import for JSON objects into a particular database is
indeed trivial.
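A minimal sketch of such an import, assuming one JSON document per line and a
file/table name of my own invention (COPY's text format treats backslashes
specially, so escape them or switch to CSV mode if your data contains any):

    CREATE TABLE yt_feed (
        id  bigserial PRIMARY KEY,
        doc jsonb NOT NULL
    );

    -- load newline-delimited JSON (e.g. produced with jq -c) from psql:
    -- \copy yt_feed (doc) FROM 'feeds.ndjson'

    -- then index for containment searches
    CREATE INDEX yt_feed_doc_gin ON yt_feed USING gin (doc);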

>
>  C
>
>
>

-- 
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor
lock-in.
http://www.efficito.com/learn_more


Re: kind of a bag of attributes in a DB . . .

2019-09-08 Thread Chris Travers
On Sat, Sep 7, 2019 at 5:17 PM Albretch Mueller  wrote:

> Say, you get lots of data and their corresponding metadata, which in
> some cases may be undefined or undeclared (left as an empty string).
> Think of youtube json files or the result of the "file" command.
>
> I need to be able to "instantly" search that metadata and I think DBs
> are best for such jobs and get some metrics out of it.
>
> I know this is not exactly a kosher way to deal with data which can't
> be represented in a nice tabular form, but I don't find the idea that
> half way off either.
>
> What is the pattern, anti-pattern or whatever relating to such design?
>
> Do you know of such implementations with such data?
>

We store debug logs as JSONB with some indexing. It works in some
limited cases, but you need a good sense of the index possibilities and
how the indexes actually work.
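A hedged sketch of the kind of indexing meant here (jsonb_path_ops gives a
smaller index but only supports containment; the table and column names are
illustrative):

    CREATE INDEX debug_logs_payload_gin
        ON debug_logs USING gin (payload jsonb_path_ops);

    -- such an index supports containment queries like:
    SELECT count(*) FROM debug_logs WHERE payload @> '{"level": "error"}';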


> lbrtchx
>
>
>

-- 
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor
lock-in.
http://www.efficito.com/learn_more


Re: Featured Big Name Users of Postgres

2019-06-11 Thread Chris Travers
At Adjust GmbH we have 5-10 PB data in Postgres.


On Tue, Jun 11, 2019 at 9:28 PM Ireneusz Pluta/wp.pl  wrote:

> W dniu 2019-06-11 o 19:45, Igal Sapir pisze:
> > I'm doing a presentation about Postgres to SQL Server users this
> weekend, and I want to showcase
> > some of the big names that use Postgres, e.g. MasterCard, Government
> agencies, Banks, etc.
> >
> You might be interested in this:
> https://www.theguardian.com/info/2018/nov/30/bye-bye-mongo-hello-postgres
>
>
>

-- 
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor
lock-in.
http://www.efficito.com/learn_more


Re: granting right to create and delete just one database

2019-06-05 Thread Chris Withers

On 05/06/2019 09:52, Laurenz Albe wrote:

> Chris Withers wrote:
> > Is there any way to grant rights to a user such that they can drop and
> > re-create only a single database?
>
> No; what I'd do if I needed that is to create a SECURITY DEFINER function
> that is owned by a user with the CREATEDB privilege.
> This function can be called by a normal user that has the EXECUTE privilege
> on the function.
>
> Don't forget to "SET search_path" on such a function (as mentioned in the
> documentation).  It might also be a good idea to REVOKE EXECUTE on the
> function from PUBLIC.

Thanks, that's a great idea! Is this pattern documented anywhere as a 
complete finished thing?


cheers,

Chris





granting right to create and delete just one database

2019-06-05 Thread Chris Withers

Hi All,

Is there any way to grant rights to a user such that they can drop and 
re-create only a single database?


cheers,

Chris




RE: Question on binding VIP to Postgresql instance

2019-05-07 Thread Chris Coutinho
I think getting postgres to work with VIP (assuming virtual IP) is not a 
postgres issue, but rather a server/network issue.

Redhat seems to have some documentation on enabling nonlocal binding, which may 
be useful for you:

   
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/load_balancer_administration/s1-initial-setup-forwarding-vsa
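For what it's worth, a hedged sketch of the pieces usually involved on Linux
(the VIP address is illustrative; the sysctl only matters if Postgres must bind
the VIP before the address is actually assigned to the host, and binding to '*'
avoids the question entirely):

    # Linux: allow binding to an address that is not (yet) configured locally
    # sysctl -w net.ipv4.ip_nonlocal_bind=1

    # postgresql.conf: listen on the VIP as well as loopback, or simply on '*'
    # listen_addresses = '10.0.0.50, 127.0.0.1'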

Kind regards,
REDstack BV

Chris Coutinho
Researcher/Data Analyst

From: Lu, Dan  
Sent: Tuesday, 7 May 2019 14:12
To: David G. Johnston ; Alvaro Aguayo Garcia-Rada 
; pgsql-gene...@postgresql.org
Subject: Question on binding VIP to Postgresql instance

Good day,

Would you be able to help direct me to instruction or implementation step to 
bind a VIP to a Postgresql instance?

I did research and understand that Postgresql does support binding of VIP, but 
I can't find documentation to implement the solution.

Any feedback is greatly appreciated.

Dan







Re: Storage Inefficiency In PostgreSQL

2019-04-15 Thread Chris Travers
On Mon, Apr 15, 2019 at 10:43 AM Ray Cheung <
ray.che...@silverpowersystems.com> wrote:

> Hi ,
>
> We are currently contemplating switching from MySQL to PostgreSQL, the main
> attraction being the use of the TimescaleDB extension. Having done much of
> the ground investigation there is one area of significant concern - the
> storage requirement of PostgreSQL. Put simply, comparing like for like for
> a
> set of tables, PostgreSQL consumes far more storage space than MySQL:
>
> - MySQL (5.6): 156 MB
> - PostgreSQL (11.2): 246 MB
> - PostgreSQL + TimescaleDB (partitioned/chunked data): 324 MB
>
> I've also submitted this in stackoverflow:
>
> https://stackoverflow.com/questions/55655272/how-to-reduce-postgresql-databa
> se-size
> <https://stackoverflow.com/questions/55655272/how-to-reduce-postgresql-database-size>
> .
>
> I can rearrange the table/column-alignment to save 6 bytes per row of the
> main table, with a saving of a few mega-bytes. Not enough to make any real
> difference. Does anyone know:
>
> - Why PostgreSQL is so storage inefficient in comparison?
>

The storage strategies are different enough you can't really assume direct
comparisons.

Long story short, MySQL is optimized for two things: primary key lookups,
and reducing disk I/O from updates to heavily indexed tables.
PostgreSQL is optimized for a lot of things, including access through
secondary indexes and sequential scans.  This means that both tables and
indexes are structured differently.
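If you want to see where the extra space actually goes before drawing
conclusions, a hedged starting point using standard catalog functions (adjust
the schema name to your own):

    SELECT c.relname,
           pg_size_pretty(pg_table_size(c.oid))          AS table_size,
           pg_size_pretty(pg_indexes_size(c.oid))        AS index_size,
           pg_size_pretty(pg_total_relation_size(c.oid)) AS total_size
    FROM pg_class c
    WHERE c.relkind = 'r'
      AND c.relnamespace = 'public'::regnamespace
    ORDER BY pg_total_relation_size(c.oid) DESC
    LIMIT 10;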


> - What existing methods can be used to reduce the storage consumption (I've
> already tried realignment and vacuum full)?
>

You could take a look at extensions that give you foreign data wrappers for
columnar stores, but note this has a number of important tradeoffs in
performance and is not recommended for OLTP systems.  However if space is
your primary concern, I would assume you are trying to set up some sort of
OLAP system?


> - Are there any plans to address this storage consumption inefficiency (in
> comparison to MySQL) problem?
>

Long run  pluggable storage should give people a different set of options
and choices to make here.

>
> Many thanks,
>
> sps-ray
>
>
>
>
>
>

-- 
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor
lock-in.
http://www.efficito.com/learn_more

