Re: Postgres calendar?
Absolutely, it'd be much easier having this info integrated with my work/personal calendar, as that's how I try to organize things anyway. Thanks for the suggestion.

-Adam

On Tue, Oct 4, 2022 at 5:02 PM Bruce Momjian wrote:
> Would people be interested in subscribing to a Postgres calendar that
> includes dates for minor releases, final minor release dates for major
> versions, commit fests, and even Postgres events? For example, it could
> include information from:
>
> https://www.postgresql.org/developer/roadmap/
> https://www.postgresql.org/support/versioning/
> https://commitfest.postgresql.org/
> https://www.postgresql.org/about/events/
>
> We could even add information about beta, release candidate, and final
> major releases, though the final release dates are usually not public.
>
> This could be done in Google Calendar, with an exported ICS file, or via
> a dedicated ICS file. I could even automate it by scraping our website.
>
> --
> Bruce Momjian  https://momjian.us
> EDB            https://enterprisedb.com
>
> Indecision is a decision. Inaction is an action. Mark Batterson
Re: Ways to "serialize" result set for later use?
> Checking data (DML), if functions are doing the right things, is something we do in our code unit tests.

This is exactly what I am writing: unit tests for my code (which is PL/pgSQL). This is an ELT pipeline that lets my customers bulk update their data in my system, with detailed per-row/column error reporting for any issues. The code is all PL/pgSQL, as are the few tests I've written so far, and pgTAP is my unit testing framework for this process.

Unit testing my company's (vast) database code is something I am just now trying to figure out and work into my workflow, and I haven't had to fight with it too hard at this point, other than $subject$. Even that isn't really an issue now that my hacky function is in place; it just feels a little... dirty, I guess? I was just wanting a gut check on whether there is an obviously better way to get the same results.
Ways to "serialize" result set for later use?
Hey there everyone,

I am going through the process of writing my first pgTAP tests for my database, and I wanted some feedback on whether my solution seems fine, is just dumb, or could be accomplished much more easily another way.

The main problem I was trying to work around: my tests are written as functions and called using runtests(), so using psql and \copy to save test data to the filesystem isn't really part of the workflow. I still needed a way to pass my "expected" query result set into results_eq [https://pgtap.org/documentation.html#results_eq] easily within a function body.

I originally dumped some "known good" data from a query to CSV and built SELECT ... FROM VALUES (...) statements by hand. That obviously sucks. Then I started looking for a way to get Postgres to "serialize" a query result set to a VALUES list, similar to how pg_dump can be configured to dump data as INSERTs for the plain format. I couldn't find anything at all, though; no mention of anything similar on the mailing list, Stack Overflow, etc. I feel like I must be searching incorrectly.

Anyways, since I didn't find anything, or any discussion online of something like I wanted, I just tried building it; see attached for source. It's a function that takes in the text of a query that produces a result set, and returns a SELECT ... FROM VALUES (...) statement that will produce the exact same output. It works by running the query and creating a temporary table from the results, querying the system catalogs for the data types and column names of that temporary table, scanning the temp table to build a VALUES from-clause containing each row, and dynamically building the SELECT column list to cast everything back to the correct data type. It then puts all of that together and returns the query text to the caller.
It's not fully tested or anything, and not guaranteed to work well; please let me know if you see any issues. I got this together in an hour of hacking, but it did solve my immediate problem, and I have been able to generate easy "expected" result sets for my tests. So I really just wanted to see if there is a better way to go about what I'm trying to do. Does Postgres already support something similar I can harness instead of this hack? Or is this really an alright way to go?

Attachment: query_to_values.sql
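For readers without the attachment, here is a rough sketch of the flow described above. The function name, temp table name, and the JSON-based row rendering are my own choices for illustration, not necessarily what the attached query_to_values.sql actually does, and this is lightly tested at best:

```sql
CREATE OR REPLACE FUNCTION query_to_values(p_query text)
RETURNS text
LANGUAGE plpgsql
AS $func$
DECLARE
    v_select text;   -- casted output column list
    v_names  text;   -- bare column name list for the VALUES alias
    v_rows   text;   -- the (...), (...) tuples
BEGIN
    -- 1. Materialize the query so its shape shows up in the catalogs.
    EXECUTE 'CREATE TEMP TABLE _q2v AS ' || p_query;

    -- 2. Pull column names and types from pg_attribute.
    SELECT string_agg(format('v.%I::%s AS %I', attname,
                             format_type(atttypid, atttypmod), attname),
                      ', ' ORDER BY attnum),
           string_agg(quote_ident(attname), ', ' ORDER BY attnum)
      INTO v_select, v_names
      FROM pg_attribute
     WHERE attrelid = '_q2v'::regclass
       AND attnum > 0 AND NOT attisdropped;

    -- 3. Render each row as a quoted tuple (relies on row_to_json
    --    preserving column order). EXECUTE avoids cached plans against
    --    a temp table that is dropped and recreated on every call.
    EXECUTE $q$
        SELECT string_agg(format('(%s)',
                   (SELECT string_agg(quote_nullable(j.value), ', ')
                      FROM json_each_text(row_to_json(t)) AS j)), E',\n')
          FROM _q2v AS t
    $q$ INTO v_rows;

    DROP TABLE _q2v;
    RETURN format(E'SELECT %s\nFROM (VALUES\n%s\n) AS v(%s)',
                  v_select, v_rows, v_names);
END;
$func$;
```

Usage would then look something like SELECT query_to_values('SELECT * FROM my_table ORDER BY 1'), with the returned text pasted in as the "expected" query for results_eq().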
Re: pgAgent for multiple databases in same instance
My experience isn't with pgAgent directly, because I have been using my rewritten version of it for ~5 years (though at least at one point I had a pretty darn good understanding of it from doing that rewrite), so please take this with a grain of salt if I am incorrect on anything.

The agent is only able to connect to a single database at a time, so you are correct: you'd need to run multiple agents (connected to the different databases) to have things work as you intended.

What is the reason for creating the pgAgent extension in multiple databases? The extension itself holds the storage / logs / job definitions, so each extra one is totally detached from the rest. You can still use a single pgAgent install to connect to many databases to do work, so I am just not sure why you'd need multiple.
Re: How to keep format of views source code as entered?
> Admittedly, the system probably should be made to save the text, should someone wish to write such a patch.

Just wanted to throw $0.02 behind this idea if anyone does want to take it up later. Using a source control system is better, obviously. But even if you use source control, it is still incredibly annoying that you cannot compare the view you have in source control to the view definition in PG and tell whether it's the same or changed. It has been a major annoyance for views with complex subqueries or WHERE clauses; the PG representation is absolutely unreadable.

-Adam
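For anyone hitting the same problem: Postgres only stores the parse tree, which pg_get_viewdef() pretty-prints back out, so a textual diff against your checked-in file will essentially never match. A workaround sketch (view names here are illustrative):

```sql
-- What PG will show you (never the text you originally typed):
SELECT pg_get_viewdef('my_view'::regclass, true);

-- One diffable workaround: recreate the checked-in definition under a
-- scratch name and compare the two re-rendered forms instead.
CREATE VIEW my_view_from_scm AS
    SELECT id, name FROM customers WHERE active;

SELECT pg_get_viewdef('my_view'::regclass, true)
     = pg_get_viewdef('my_view_from_scm'::regclass, true) AS unchanged;
```

This normalizes both sides through the same renderer, so the comparison reflects semantic changes rather than formatting.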
Re: Hot backup in PostgreSQL
> how to do "hot backup" (copying files) while database running?

As others have shown, there are ways to do this with PG's internal tooling (pg_basebackup). However, I would highly recommend you use an external backup tool like pgBackRest [1] to save yourself the pain of implementing things incorrectly and ending up with non-viable backups when you need them most. I'm not affiliated with them at all; I have just used pgBackRest in production for years now with great results. It takes care of PITR and manages backup retention (and the associated WAL retention), which can otherwise be a bit of a pain to do manually.

Just my $0.02, hope it helps!

1. https://pgbackrest.org/
Re: serial + db key, or guid?
I mentioned this in another email thread yesterday about a similar topic, but if you do go the UUID route, I'd highly suggest you not use the standard UUID generation functions; they all suck for database use (v1 included). I use https://pgxn.org/dist/sequential_uuids/ by Tomas Vondra (writeup here: https://2ndquadrant.com/en/blog/sequential-uuid-generators/).

I don't mind having a time component correlated with my UUIDs, because it's simply not a threat model that matters for my use case, so I use the time-based variant. Switching from v4 UUIDs helped me immensely with full-page writes and write amplification. It is still not as fast as an int, but it is much, much better than random UUIDs.
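For reference, a minimal usage sketch of the time-based variant, assuming the extension is installed; uuid_time_nextval(interval_length, block_count) is the generator from that extension, and the arguments shown are the defaults described in the writeup:

```sql
CREATE EXTENSION sequential_uuids;

-- Time-based variant: the UUID prefix advances every interval_length
-- seconds, cycling over block_count blocks; the remaining bits are random.
CREATE TABLE account (
    id   uuid PRIMARY KEY DEFAULT uuid_time_nextval(60, 65536),
    name text NOT NULL
);
```

Because consecutive inserts land in nearby index pages, this keeps btree locality close to a sequence while retaining most of the unguessability of a UUID.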
Re: UUID or auto-increment
> I would point out, however, that using a V1 UUID rather than a V4 can help with this as it is sequential, not random (based on MAC address and timestamp + random)

I wanted to second this point: using sequential UUIDs helped me reduce write amplification quite a bit in my application. I didn't use V1, though; instead I used https://pgxn.org/dist/sequential_uuids/. It reduces the pain caused by UUIDs a ton, IMO.

-Adam
Re: Unexplained disk usage in AWS Aurora Postgres
I would highly suggest you reach out to AWS support for Aurora questions; that's part of what you're paying for. For the reasons you mentioned and more, it's pretty hard to debug these issues from the outside, because Aurora isn't actually Postgres.
Re: Does TOAST really compress the complete row?
>>> https://www.postgresql.org/docs/12/logical-replication-restrictions.html
>>>
>>> "Large objects (see Chapter 34) are not replicated. There is no
>>> workaround for that, other than storing data in normal tables."
>>
>> Of course that does not apply to bytea:
>> https://www.postgresql.org/docs/12/datatype-binary.html

That makes sense now; I was reading that section as if it were talking about bytea, not LO. Thanks for pointing that out!

- Adam
Re: Does TOAST really compress the complete row?
Another thing was said that I wasn't aware of, and I have not been able to find any evidence to support it:

> 10. Blobs don’t participate in Logical replication.
Re: CPU Configuration - postgres
> For parallelism, there are these options

Those only matter if you want to use the extra cores to make individual queries / commands faster. If all the OP cares about is "will PG use my extra cores", the answer is yes, it will, without doing anything special.
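If you do want individual queries to use the extra cores, a quick sketch (table name illustrative):

```sql
SHOW max_parallel_workers_per_gather;  -- per-query worker cap (default 2)
SHOW max_parallel_workers;             -- instance-wide worker cap

-- Let this session's queries use up to 4 workers each:
SET max_parallel_workers_per_gather = 4;

-- Look for "Gather" and "Workers Planned" in the plan:
EXPLAIN SELECT count(*) FROM big_table;
```

Note the planner still decides per query whether parallelism is worthwhile; raising the cap only permits it.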
Re: Oracle vs. PostgreSQL - a comment
> How good will that be in performance.

In my experience, not great. It's definitely better than not having it at all, but it does not make for quick queries, and it caused serious connection overhead whenever a query referenced the foreign table. I've since moved to logical replication to improve the situation.

In SQL Server I never noticed a difference when I had to do cross-db queries. There may or may not be one, but it was never something I had to think about or look up, so I am not sure. It's something I do still miss, though, as it sure was convenient.

Just my $0.02. -Adam
Re: Table partitioning for cloud service?
> An interesting answer, if there needs to be shared data, is for the shared data to go in its own database, and use a Foreign Data Wrapper to have each tenants' database access it <https://www.postgresql.org/docs/12/postgres-fdw.html>

For my application I went the schema-per-tenant route, but I need a single login to work for all tenants a user has been given access to. Not all tenants are required to be on the same database host, so I broke that piece out into its own database and used postgres_fdw to make it seem local to each tenant. I've had first-hand experience with this for the past ~5 years, and the approach has serious tradeoffs. Queries that need to access the remote table can just fall on their face sometimes. You also need to deal with practically every connection spawning one or more new connections, which stay open consuming resources, the first time a query accesses foreign data.

As an optimization I worked on earlier this week, I decided to logically replicate that table from my main authentication database into each cluster, and I replaced the FDW references in read-only queries with the logically replicated table. All write queries still hit the FDW as before. This was acceptable for my use case, and it drastically improved performance for some queries where I had previously had to use a CTE to force materialization to get acceptable performance over the FDW.

It's a very cool tool; just be careful about how it can impact performance, and measure for your specific use case.
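A sketch of that read-mostly replication setup; all names (databases, roles, tables) are illustrative, not my actual schema:

```sql
-- On the shared authentication database:
CREATE PUBLICATION auth_users_pub FOR TABLE app_user;

-- On each tenant cluster: create a matching local table, then subscribe.
CREATE TABLE app_user (
    id    bigint PRIMARY KEY,
    email text NOT NULL
);

CREATE SUBSCRIPTION auth_users_sub
    CONNECTION 'host=auth-db dbname=authentication user=replicator'
    PUBLICATION auth_users_pub;
```

Read-only queries then use the local app_user copy, while writes continue to go through the FDW so there is a single source of truth.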
Re: SET ROLE and search_path
I have this exact setup, and I use role / schema names that match, so the $user variable works with the search path when I SET ROLE as my application user.

> When search_path contains "$user", does it refer to session_user or current_user?

It uses current_user, not session_user. Works perfectly with SET ROLE for me.
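A minimal sketch of that behavior, assuming a role and a schema that share the name tenant_a:

```sql
SET search_path = "$user", public;

-- current_user becomes tenant_a, session_user stays the login role:
SET ROLE tenant_a;

-- "$user" now resolves to the tenant_a schema:
SELECT current_schemas(false);
```

Since "$user" is resolved at lookup time against current_user, unqualified table names follow the role you have SET ROLE'd to.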
Re: easy way to acquire height / width from images (PNG, JPEG) stored as bytea?
Why not extract that metadata and store it with the image, rather than extracting it to filter on at query time? That way you can index your height and width columns to speed up the filtering if necessary. If you do want to stick with extracting at query time, you may be able to write a wrapper for a command line tool like ImageMagick so you can call it from a function to determine the size.
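A sketch of the store-at-write-time approach, with illustrative table and column names:

```sql
-- Record dimensions alongside the bytea at insert time:
ALTER TABLE images
    ADD COLUMN width  integer,
    ADD COLUMN height integer;

CREATE INDEX images_dims_idx ON images (width, height);

-- Filtering then becomes a plain indexable predicate:
SELECT id FROM images WHERE width >= 1920 AND height >= 1080;
```

The application fills width/height when it inserts the image, so no image parsing happens inside the database at query time.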
Re: Feature request (or at least discussion): enable autovaccum on temp tables
Temp tables are not visible outside of a single connection, so the autovacuum worker's connection isn't able to see them. Are you sure it's actually an issue with accumulating dead tuples, and not an issue with bad statistics? In my processes that are heavy on temp tables, I have to manually run ANALYZE after populating any temp table.

Just a thought. -Adam
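In practice that pattern looks like this (names illustrative):

```sql
-- Populate the temp table...
CREATE TEMP TABLE staging AS
    SELECT * FROM big_source WHERE batch_id = 42;

-- ...then give the planner real statistics before querying it:
ANALYZE staging;
```

Without the explicit ANALYZE, joins against the temp table are planned with default estimates, which is often the real cause of "temp tables are slow" complaints.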
Re: Postgres for SQL Server users
I think the main "gotcha" when I moved from SQL Server to Postgres was that I didn't realize the amount of inline T-SQL I would use just to get stuff done in ad-hoc analysis. Postgres doesn't have a good way to emulate this: DO blocks cannot return result sets, so short of creating a function and dropping it, it's not possible to get the same workflow. The lack of GUI tooling was also a huge "whoa" moment for me, which I still grapple with.
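One partial workaround I know of (not a full replacement for inline T-SQL): create the throwaway function in the session-local pg_temp schema, so it disappears when you disconnect instead of needing an explicit DROP:

```sql
CREATE FUNCTION pg_temp.adhoc()
RETURNS TABLE (n int, squared int)
LANGUAGE sql
AS $$ SELECT i, i * i FROM generate_series(1, 5) AS i $$;

-- Temp-schema functions must be called schema-qualified:
SELECT * FROM pg_temp.adhoc();
```

It's still a create-then-call dance, but at least there is no cleanup and no clutter in shared schemas.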
Re: Using CTE vs temporary tables
One thing to note: if this is a query you would like to run on a replica, temp tables are a non-starter. I really wish that weren't the case. I have quite a few analytical queries I had to optimize with temp tables and indexes that I really wish I could run on my hot standby, and in most cases I can't refactor them to use a CTE for performance reasons.

Anyways, my 2¢. - Adam
Re: Can PostgreSQL create new WAL files instead of reusing old ones?
> Is there some way to make it auto-detect when it should be enabled? If not, please document that it should be used on ZFS and any other file system with CoW properties on files.

In addition to this, I'm wondering what kind of performance regression this would show on something like ext4 (if any).
Re: How can I stop a long run pgAgent job?
As said, terminating the backend is the current way to kill a job. An alternative, if this is something you do often: https://github.com/GoSimpleLLC/jpgAgent

jpgAgent supports terminating a job by issuing a NOTIFY command on the correct channel, like this:

NOTIFY jpgagent_kill_job, 'job_id_here';

It works well on Windows, and all the existing pgAgent admin tools work to manage it; it's a drop-in replacement for the agent portion.
Re: pgcrypto.gen_random_uuid() or uuid-ossp.uuid_generate_v4()?
In my testing, gen_random_uuid() is quite a bit faster than uuid_generate_v4().
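A quick way to reproduce that comparison yourself in psql (rough numbers only; the pgcrypto and uuid-ossp extensions are assumed to be installed):

```sql
\timing on

SELECT count(gen_random_uuid())  FROM generate_series(1, 1000000);  -- pgcrypto
SELECT count(uuid_generate_v4()) FROM generate_series(1, 1000000);  -- uuid-ossp
```

The count() wrapper forces every UUID to actually be generated while keeping the result set tiny, so the timing mostly reflects generation cost.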
Re: [GENERAL] postgres_fdw & could not receive data from client: Connection reset by peer
Just bumping this because I posted it right before Thanksgiving and it was very easy to overlook. Sorry if this is bad etiquette for the list... Just let me know if it is and I won't do it in the future.
[GENERAL] postgres_fdw & could not receive data from client: Connection reset by peer
Hey all, first off...

Postgres version: 10.1
OS: Debian 9.0

So I have a database called "authentication". It stores the user table for my application. I keep it separate from my application's main database so the same account can be used by multiple instances of the application.

From a connection to my main database, I need to be able to query this table as if it existed in the main database, so I set up a postgres_fdw connection to it. This works well, and I have no issues with the setup functionally. I can have multiple instances of my main application database (test, dev, prod) all pointed at the same "authentication" database, and everything works; my users can keep logging into the application with the same credentials no matter which environment they use.

The only thing I have seen is that pretty much any query which references the foreign user table (and therefore connects to the auth db's user table) will put a line like this in my log file:

2017-11-22 14:27:09.846 UTC [17388] authentication@authentication LOG: could not receive data from client: Connection reset by peer

I was wondering if this is normal behavior, or if I'm doing something wrong to cause it? It happens whether my queries come from pgAdmin, psql, or JDBC. Any info would be greatly appreciated.

Thanks,
-Adam