Re: Postgres calendar?

2022-10-04 Thread Adam Brusselback
Absolutely, it'd be much easier having this info integrated with my
work/personal calendar, as that's how I try to organize things anyway.

Thanks for the suggestion.
-Adam

On Tue, Oct 4, 2022 at 5:02 PM Bruce Momjian  wrote:

> Would people be interested in subscribing to a Postgres calendar that
> includes dates for minor releases, final minor release dates for major
> versions, commit fests, and even Postgres events?  For example, it could
> include information from:
>
> https://www.postgresql.org/developer/roadmap/
> https://www.postgresql.org/support/versioning/
> https://commitfest.postgresql.org/
> https://www.postgresql.org/about/events/
>
> We could even add information about beta, release candidate, and final
> major releases, though the final release dates are usually not public.
>
> This could be done in Google Calendar, with an exported ICS file, or via
> a dedicated ICS file.  I could even automate it by scraping our website.
>
> --
>   Bruce Momjian  https://momjian.us
>   EDB  https://enterprisedb.com
>
>   Indecision is a decision.  Inaction is an action.  Mark Batterson


Re: Ways to "serialize" result set for later use?

2021-04-12 Thread Adam Brusselback
>  Checking data (DML), if functions are doing the right things, is
> something we do in our code unit tests.

This is exactly what I am writing: unit tests for my code (which is
PL/pgSQL). This is an ELT pipeline for my customers to bulk update their
data in my system, with detailed error reporting for any issues
per-row/column. The code is all PL/pgSQL, as are the few tests I've written
so far. pgTAP is my unit testing framework for this process.

So unit testing my company's (vast) database code is something I am just
trying to figure out and work into my workflow, and I haven't had to fight
too hard with it at this point, other than $subject$. Even that isn't an
issue with my hacky function in place; it just feels a little... dirty, I
guess? I was just wanting a gut check on whether there is an obviously
better way to get the same results.
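
In case it helps anyone reading along, a minimal sketch of the shape of
these tests (the table, columns, and test name are all made up):

    CREATE OR REPLACE FUNCTION test_bulk_load_reports_errors()
    RETURNS SETOF text AS $$
    BEGIN
        -- compare the actual error rows against a hand-written VALUES list
        RETURN NEXT results_eq(
            'SELECT row_num, column_name FROM load_error ORDER BY row_num',
            $q$VALUES (3, 'unit_price'), (7, 'sku')$q$,
            'errors are reported per-row/column');
    END;
    $$ LANGUAGE plpgsql;

    SELECT * FROM runtests('^test_');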



Ways to "serialize" result set for later use?

2021-04-10 Thread Adam Brusselback
Hey there everyone,

I am going through the process of writing my first pgTAP tests for my
database, and I wanted some feedback on whether my solution seems fine, is
just dumb, or could be accomplished much more easily another way.

The main problem I was trying to work around: my tests are written as
functions and called using runtests(), so using psql and \copy to save test
data to the filesystem isn't really part of the workflow. I still needed a
way to pass my "expected" query result set into results_eq [
https://pgtap.org/documentation.html#results_eq] easily within a function
body.

I originally manually dumped some "known good" data from a query to CSV,
and built some SELECT ... FROM VALUES (...) statements by hand to do this.
That obviously sucks.

Then I started looking for a way to get Postgres to "serialize" a query
result set to a VALUES list, similar to how pg_dump can be configured to
dump data as INSERTs in the plain format. I couldn't find anything at all
though. No mention of anything similar on the mailing list, Stack Overflow,
etc... I feel like I must be searching incorrectly.

Anyways, since I didn't find anything, or any discussion online of
something like what I wanted, I just tried building it; see attached for
the source. It's a function that takes in the text of a query that produces
a result set, and returns a SELECT ... FROM VALUES (...) statement that
will produce the exact same output.
It works by running that query and creating a temporary table with the
results, querying the system catalogs for the column names and data types
of that temporary table, building a VALUES from-clause containing each row
by scanning the temp table, and dynamically building the SELECT ... column
list to cast everything to the correct data type. It then puts all of that
together and returns the query text to the caller.
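
Since the attachment isn't inlined here, a rough sketch of that approach
for the archives (this is illustrative, not the attached
query_to_values.sql; the function and temp table names are mine, and edge
cases like empty result sets aren't handled):

    CREATE OR REPLACE FUNCTION query_to_values_sketch(p_query text)
    RETURNS text
    LANGUAGE plpgsql AS $func$
    DECLARE
        v_names   text;  -- quoted column names for the VALUES alias
        v_select  text;  -- casted column list for the outer SELECT
        v_literal text;  -- expression producing one quoted tuple per row
        v_rows    text;  -- all tuples, comma separated
    BEGIN
        -- 1. materialize the query so its column names/types are inspectable
        EXECUTE 'DROP TABLE IF EXISTS pg_temp._qtv';
        EXECUTE 'CREATE TEMP TABLE _qtv AS ' || p_query;

        -- 2. read the column names and data types from the catalogs
        SELECT string_agg(quote_ident(attname), ', ' ORDER BY attnum),
               string_agg(format('v.%I::%s', attname,
                                 format_type(atttypid, atttypmod)),
                          ', ' ORDER BY attnum),
               string_agg(format('quote_nullable(%I::text)', attname),
                          ' || '', '' || ' ORDER BY attnum)
          INTO v_names, v_select, v_literal
          FROM pg_attribute
         WHERE attrelid = 'pg_temp._qtv'::regclass
           AND attnum > 0
           AND NOT attisdropped;

        -- 3. scan the temp table, serializing each row as a quoted tuple
        EXECUTE format(
            'SELECT string_agg(''('' || %s || '')'', '', '') FROM pg_temp._qtv',
            v_literal)
          INTO v_rows;

        -- 4. stitch the runnable statement back together
        RETURN format('SELECT %s FROM (VALUES %s) v(%s)',
                      v_select, v_rows, v_names);
    END;
    $func$;

    -- usage:
    -- SELECT query_to_values_sketch('SELECT id, name FROM my_table');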

Not fully tested or anything, and not guaranteed to work well. Please, if
you see any issues let me know. I got this together in an hour of
hacking... but it did solve my immediate problem and I have been able to
generate easy "expected" result sets for my tests.

So I really just wanted to see if there is a better way to go about what
I'm trying to do. Does Postgres already support something similar that I
can harness instead of this hack? Or is this really an alright way to go?


query_to_values.sql


Re: pgAgent for multiple databases in same instance

2021-03-10 Thread Adam Brusselback
My experience isn't with pgAgent directly, because I have been using my
rewritten version of it for ~5 years (but at least at one point I had a
pretty darn good understanding from doing that rewrite)... please take this
with a grain of salt if I am incorrect on anything.

The agent is only able to connect to a single database at a time, so you
are correct: you'd need to run multiple agents (each connected to a
different database) to have things work as you intended.

What is the reason for creating the pgAgent extension in multiple
databases? The extension itself holds the storage / logs / job definitions,
so each extra install is totally detached from the rest. You can still use
a single pgAgent install to connect to many databases to do work, so I am
just not sure why you'd need multiple.


Re: How to keep format of views source code as entered?

2021-01-13 Thread Adam Brusselback
> Admittedly, the system probably should be made to save the text, should
> someone wish to write such a patch.

Just wanted to throw my $0.02 behind this idea if anyone does want to take
it up later. Using a source control system is obviously better, but even if
you use source control it is still incredibly annoying that you cannot
compare the view you have in source control to the view definition in PG
and tell whether it's the same or changed. It has been a major annoyance
for views with complex subqueries or WHERE clauses; the PG representation
is absolutely unreadable.
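
To make that concrete, a small sketch (the view is made up; the exact
deparsed text varies by version):

    CREATE VIEW v_demo AS
        SELECT c.relname        -- carefully formatted by hand
          FROM pg_class AS c
         WHERE c.relkind = 'v';

    -- PG stores the parsed tree, not the text you typed, so the definition
    -- comes back re-deparsed: the formatting and the comment are gone.
    SELECT pg_get_viewdef('v_demo'::regclass, true);
    -- e.g.:  SELECT c.relname FROM pg_class c WHERE c.relkind = 'v'::"char";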

-Adam


Re: Hot backup in PostgreSQL

2020-10-22 Thread Adam Brusselback
> how to do "hot backup" (copying files) while database running?
As others have shown, there are ways to do this with PG's internal tooling
(pg_basebackup).

However, I would highly recommend you use an external backup tool like
pgbackrest [1] to save yourself the pain of implementing things incorrectly
and ending up with non-viable backups when you need them most. I'm not
affiliated with them at all, but have just used pgbackrest in production
for years now with great results.  It takes care of PITR, and manages
backup retention (and associated WAL retention). Those can be a bit of a
pain to do manually otherwise.

Just my $0.02, hope it helps!

1. https://pgbackrest.org/


Re: serial + db key, or guid?

2020-08-11 Thread Adam Brusselback
I mentioned this in another email thread yesterday about a similar topic,
but I'd highly suggest that if you do go the UUID route, you do not use the
standard UUID generation functions; they all suck for database use (v1 also
sucks).

I use: https://pgxn.org/dist/sequential_uuids/ written by Tomas Vondra
(writeup here: https://2ndquadrant.com/en/blog/sequential-uuid-generators/ )

I don't mind having a time component correlated with my UUIDs because it's
simply not a threat model that matters for my use case, so I use the
time-based variant. It helped me immensely with full-page writes (FPW) and
write amplification when I switched from V4 UUIDs. It is still not as fast
as an int, but it is much, much better than random UUIDs.
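
For illustration, a minimal sketch of the time-based variant (the table and
column names are made up, and this assumes uuid_time_nextval()'s default
arguments as described in the extension's docs):

    CREATE EXTENSION sequential_uuids;

    CREATE TABLE order_event (
        event_id uuid PRIMARY KEY DEFAULT uuid_time_nextval(),
        payload  jsonb
    );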


Re: UUID or auto-increment

2020-08-10 Thread Adam Brusselback
>  I would point out, however, that using a V1 UUID rather than a V4 can
> help with this as it is sequential, not random (based on MAC address and
> timestamp + random)

I wanted to make this point: using sequential UUIDs helped me reduce write
amplification quite a bit in my application. I didn't use V1; instead I
used: https://pgxn.org/dist/sequential_uuids/

Reduces the pain caused by UUIDs a ton IMO.
-Adam


Re: Unexplained disk usage in AWS Aurora Postgres

2020-08-07 Thread Adam Brusselback
I would highly suggest you reach out to AWS support for Aurora questions;
that's part of what you're paying for: support.
For the reasons you mentioned and more, it's pretty hard to debug issues
because it isn't actually Postgres.



Re: Does TOAST really compress the complete row?

2020-07-02 Thread Adam Brusselback
> > https://www.postgresql.org/docs/12/logical-replication-restrictions.html
> >
> > "Large objects (see Chapter 34) are not replicated. There is no
> > workaround for that, other than storing data in normal tables."
> >
> > Of course that does not apply to bytea:
> > https://www.postgresql.org/docs/12/datatype-binary.html

That makes sense now; I was reading that section as if it were talking
about bytea, not LO (large objects).

Thanks for pointing that out!
- Adam


Re: Does TOAST really compress the complete row?

2020-07-02 Thread Adam Brusselback
Another thing was said that I wasn't aware of and have not been able to
find any evidence to support:

> 10. Blobs don’t participate in Logical replication.


Re: CPU Configuration - postgres

2020-06-11 Thread Adam Brusselback
>  For parallelism, there are these options

Those only matter if you want to use the extra cores to make individual
queries / commands faster.

If all the OP cares about is "will PG use my extra cores?", the answer is
yes, it will, without doing anything special.
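
To make that distinction concrete (these are the stock settings involved;
the value shown is arbitrary):

    -- Each client connection is its own backend process, so concurrent
    -- sessions already spread across cores with no configuration at all.
    -- These settings only govern how many workers a single query may use:
    SHOW max_worker_processes;                -- cluster-wide worker pool
    SHOW max_parallel_workers;                -- slice of it for parallel query
    SET max_parallel_workers_per_gather = 4;  -- per-query parallel degree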


Re: Oracle vs. PostgreSQL - a comment

2020-06-02 Thread Adam Brusselback
>  How good will that be in performance.

In my experience, not great. It's definitely better than not having it at
all, but it does not make for quick queries and caused serious
connection overhead when a query referenced that foreign table. I've since
moved to logical replication to improve the situation there.

In SQL Server I never noticed a difference when I had to do cross-db
queries. There may or may not be one, but it was never something I had to
think about or look up, so I am not sure.

It's something I do still miss though, as it sure was convenient.

Just my $0.02.
-Adam


Re: Table partitioning for cloud service?

2020-05-21 Thread Adam Brusselback
>  An interesting answer, if there needs to be shared data, is for the
> shared data to go in its own database, and use a Foreign Data Wrapper to
> have each tenant's database access it <
> https://www.postgresql.org/docs/12/postgres-fdw.html>

For my application I went the schema-per-tenant route, but I have a need
for a single login which will work for all tenants you've been given access
to. Not all tenants are required to be on the same database host, so I
broke that piece out into its own database and used postgres_fdw to make it
seem local to each tenant.

So I've had first-hand experience with this for the past ~5 years, and this
approach has serious tradeoffs. Queries that need to access the remote
table can just fall on their face sometimes. You will also need to deal
with practically every connection spawning one or more new connections,
which will stay open and consume resources from the first time a query is
issued that accesses foreign data.

As an optimization I just worked on for my database earlier this week, I
decided to logically replicate that table from my main authentication
database into each cluster, and I replaced all references to the FDW in
read-only queries with the logically replicated table. All write queries
still hit the FDW as before.

This was acceptable for my use case, and it drastically improved
performance for some queries where I had previously had to use a CTE to
force materialization just to get reasonable performance, due to the cost
of going over the FDW for that data.
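
The shape of that setup, as a minimal sketch (the publication, subscription,
table, and connection details are all illustrative):

    -- on the shared authentication database (publisher):
    CREATE PUBLICATION auth_pub FOR TABLE app_user;

    -- on each tenant cluster (subscriber):
    CREATE SUBSCRIPTION auth_sub
        CONNECTION 'host=auth-db dbname=authentication user=replicator'
        PUBLICATION auth_pub;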

It's a very cool tool, just be careful about how it can impact performance
if you don't measure for your specific use case.


Re: SET ROLE and search_path

2020-05-20 Thread Adam Brusselback
I have this exact setup, and I use role / schema names that match, so the
$user variable works with the search_path when I SET ROLE as my application
user.

> When search_path contains “$user”, does it refer to session_user or
> current_user?
It uses current_user, not session_user. Works perfectly with SET ROLE for
me.
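
A tiny sketch of the behavior (the role and schema names are made up):

    CREATE ROLE tenant_a LOGIN;
    CREATE SCHEMA tenant_a AUTHORIZATION tenant_a;

    SET search_path = "$user", public;
    SET ROLE tenant_a;
    SELECT current_schema();  -- tenant_a: "$user" resolves via current_user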



Re: easy way to acquire height / width from images (PNG, JPEG) stored as bytea?

2020-04-17 Thread Adam Brusselback
Why not extract and store that metadata with the image rather than trying
to extract it to filter on at query time? That way you can index your
height and width columns to speed up that filtering if necessary.

You may be able to write a wrapper around a command line tool like
ImageMagick or something, so you can call that from a function to determine
the size, if you did want to stick with extracting it at query time.


Re: Feature request (or at least discussion): enable autovaccum on temp tables

2019-05-31 Thread Adam Brusselback
Temp tables are not visible outside of a single connection, so the
autovacuum worker connection isn't able to see them.

Are you sure that it's actually an issue with accumulating dead tuples, and
not an issue with bad statistics?

In my processes that are heavy on temp tables, I have to manually run
ANALYZE after populating any temp tables.
Just a thought.
-Adam


Re: Postgres for SQL Server users

2019-05-06 Thread Adam Brusselback
I think the main "gotcha" when I moved from SQL Server to Postgres was that
I didn't realize how much in-line T-SQL I used to just get stuff done for
ad-hoc analysis. Postgres doesn't have a good way to emulate this: DO
blocks cannot return result sets, so short of creating a function and
dropping it, it's not possible to get the same workflow (see the sketch
below).
The lack of GUI tooling was also a huge "whoa" moment for me, which I still
grapple with.
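
The create-then-drop workaround, as a minimal sketch (putting the throwaway
function in pg_temp keeps it session-local, so it also vanishes when the
session ends):

    CREATE FUNCTION pg_temp.adhoc()
    RETURNS TABLE (rel name, est_rows bigint)
    LANGUAGE sql AS $$
        SELECT relname, reltuples::bigint
          FROM pg_class
         ORDER BY reltuples DESC
         LIMIT 5
    $$;

    SELECT * FROM pg_temp.adhoc();
    DROP FUNCTION pg_temp.adhoc();  -- or just close the session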


Re: Using CTE vs temporary tables

2018-07-11 Thread Adam Brusselback
One thing to note: if this is a query you would like to run on a replica,
temp tables are a non-starter.

I really wish that wasn't the case. I have quite a few analytical queries I
had to optimize with temp tables and indexes, and I really wish I could run
them on my hot standby.

In most cases I can't refactor them to use a CTE for performance reasons.

Anyways, my 2¢.
- Adam


Re: Can PostgreSQL create new WAL files instead of reusing old ones?

2018-06-22 Thread Adam Brusselback
>  Is there some way to make it auto-detect when it should be enabled? If
> not, please document that it should be used on ZFS and any other file
> system with CoW properties on files.
In addition to this, I'm wondering what kind of performance regression this
would show on something like ext4 (if any).


Re: How can I stop a long run pgAgent job?

2018-06-21 Thread Adam Brusselback
As said, terminating a backend is the current way to kill a job.

An alternative, if this is something you do often:
https://github.com/GoSimpleLLC/jpgAgent
jpgAgent supports terminating a job by issuing a NOTIFY command on the
correct channel, like this: NOTIFY jpgagent_kill_job, 'job_id_here';
It works well on Windows, and all the existing pgAgent admin tools work to
manage it; it's just a drop-in replacement for the agent portion.


Re: pgcrypto.gen_random_uuid() or uuid-ossp.uuid_generate_v4()?

2018-02-06 Thread Adam Brusselback
In my testing, gen_random_uuid() is quite a bit faster than uuid_generate_v4().
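
A quick, rough way to reproduce that comparison (both extensions must be
installed; the row count is arbitrary):

    CREATE EXTENSION IF NOT EXISTS pgcrypto;
    CREATE EXTENSION IF NOT EXISTS "uuid-ossp";

    EXPLAIN ANALYZE SELECT gen_random_uuid()  FROM generate_series(1, 1000000);
    EXPLAIN ANALYZE SELECT uuid_generate_v4() FROM generate_series(1, 1000000);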



Re: [GENERAL] postgres_fdw & could not receive data from client: Connection reset by peer

2017-11-28 Thread Adam Brusselback
Just bumping this because I posted it right before Thanksgiving and it was
very easy to overlook.

Sorry if this is bad etiquette for the list... Just let me know if it is
and I won't do it in the future.


[GENERAL] postgres_fdw & could not receive data from client: Connection reset by peer

2017-11-22 Thread Adam Brusselback
Hey all, first off...
Postgres version: 10.1
OS: Debian 9.0

So I have a database called "authentication".
It stores the user table for my application. I have it separated from the
main database to allow the same account to be used by multiple instances of
my application.

From a connection to my main database, I need to be able to query this
table as if it existed in my main database, so I set up a postgres_fdw
connection to it.  This works well, and I have no issues with the
setup functionally.  I can have multiple instances of my main
application database (test, dev, prod), all pointed to the same
"authentication" database, and everything works. My users can keep
logging into the application with the same credentials no matter which
environment they go into.
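
For context, the setup is essentially this shape (a sketch; the server,
mapping, and table names plus connection options are illustrative):

    CREATE EXTENSION postgres_fdw;

    CREATE SERVER auth_server
        FOREIGN DATA WRAPPER postgres_fdw
        OPTIONS (host 'auth-host', dbname 'authentication');

    CREATE USER MAPPING FOR CURRENT_USER
        SERVER auth_server
        OPTIONS (user 'app', password 'secret');

    IMPORT FOREIGN SCHEMA public LIMIT TO (app_user)
        FROM SERVER auth_server INTO public;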

The only thing I have seen is that pretty much any query which references
the foreign user table (and so connects to the auth db's user table) will
put a line in my log file like so:

2017-11-22 14:27:09.846 UTC [17388] authentication@authentication LOG:
 could not receive data from client: Connection reset by peer

I was wondering if this is normal behavior, or if I'm doing something
wrong to cause it? It happens regardless of whether my queries come from
pgAdmin, psql, or JDBC.

Any info would be greatly appreciated.
Thanks,
-Adam