Re: PROPOSAL: Support global and local disabling of indexes

2022-03-18 Thread Adam Brusselback
Just wanted to mention that this would be a useful feature for me. Had
previously been bitten by this:
https://www.postgresql.org/message-id/flat/CAMjNa7c4pKLZe%2BZ0V49isKycnXQ6Y%3D3BO-4Gsj3QAwsd2r7Wrw%40mail.gmail.com

Ended up "solving" by putting a where clause on all my exclusion
constraints I didn't want used for most queries (WHERE 1=1). That allowed
me "disable" that index for all queries unless they explicitly have a 1=1
constant in the where clause.
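
For anyone curious, a minimal sketch of that workaround (hypothetical
table and column names; the linked thread has the real details):

    -- btree_gist is needed to mix scalar equality into a gist exclusion
    CREATE EXTENSION IF NOT EXISTS btree_gist;

    CREATE TABLE booking (
        room_id int NOT NULL,
        stay    tstzrange NOT NULL,
        -- WHERE (1=1) makes the backing index partial, so the planner
        -- only considers it when a query repeats the predicate
        EXCLUDE USING gist (room_id WITH =, stay WITH &&) WHERE (1=1)
    );

    -- explicitly opts back in to the index:
    SELECT * FROM booking WHERE room_id = 42 AND 1=1;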


Re: [Proposal] Global temporary tables

2022-03-02 Thread Adam Brusselback
> In my observation, very few users require an accurate query plan for
> temporary tables to perform manual analyze.

Absolutely not true in my observations or personal experience. It's one of
the main reasons I have needed to use (local) temporary tables rather than
just materializing a CTE when decomposing queries that are too complex for
Postgres to handle.

I wish I could use GTT to avoid the catalog bloat in those instances, but
that will only be possible if the query plans are accurate.
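
To make that concrete, the decomposition pattern looks roughly like this
(a simplified, hypothetical example):

    -- materialize an intermediate result instead of inlining a CTE
    CREATE TEMP TABLE recent_orders AS
    SELECT customer_id, sum(total) AS total_spend
    FROM orders
    WHERE ordered_at > now() - interval '30 days'
    GROUP BY customer_id;

    -- without this, the planner guesses at the temp table's statistics
    ANALYZE recent_orders;

    SELECT c.name, r.total_spend
    FROM recent_orders r
    JOIN customer c USING (customer_id);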


Re: Foreign key joins revisited

2021-12-28 Thread Adam Brusselback
> How about JOIN WITH?
I'm -1 on this; reusing WITH is just likely to cause confusion because WITH
can appear in other places in a query with an entirely different meaning.
I'd avoid that from the start.

>> Can we think of some other suitable reserved keyword?
>FOREIGN?  Or even spell out "JOIN FOREIGN KEY".
I like the conciseness of just FOREIGN.


Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints

2021-06-15 Thread Adam Brusselback
Am I mistaken in thinking that this would allow CREATE DATABASE to run
inside a transaction block now, further reducing the DDL commands that are
non-transactional?


Re: What to call an executor node which lazily caches tuples in a hash table?

2021-03-31 Thread Adam Brusselback
> Does anyone else like the name "Tuple Cache"?
I personally like that name best.

It makes sense to me when looking at an EXPLAIN and trying to understand
why this node may be there. The way we look up the value stored in the
cache doesn't really matter to me as a user; I'm more thinking about the
reason the node is there in the first place, which Tuple Cache conveys...
at least for me.


Re: PostgreSQL <-> Babelfish integration

2021-02-12 Thread Adam Brusselback
Just wanted to link to the discussion on this from HN for anyone interested:
https://news.ycombinator.com/item?id=26114281


Re: Implementing Incremental View Maintenance

2020-10-27 Thread Adam Brusselback
That was a good bit more work to get ready than I expected. It's broken
into two scripts: one to create the schema, the other to load data and run
a couple of check queries to ensure things are working properly (checking
the materialized tables against a regular view for accuracy).

The first test case is to give us a definitive result on what "agreed
pricing" is in effect at a point in time based on a product hierarchy
our customers set up, and to allow pricing to be set on nodes in that
hierarchy, as well as specific products (with an order of precedence).
The second test case maintains some aggregated amounts / counts / boolean
logic at an "invoice" level for all the detail lines which make up that
invoice.
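
To give a flavor of the second case, it boils down to something like the
following (a simplified sketch using the patch's proposed syntax; the
actual tables and columns in the attached scripts differ):

    -- invoice-level aggregates maintained incrementally from detail lines
    CREATE INCREMENTAL MATERIALIZED VIEW invoice_summary AS
    SELECT invoice_id,
           count(*)        AS line_count,
           sum(line_total) AS invoice_total
    FROM invoice_line
    GROUP BY invoice_id;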

Both of these are real-world use cases which were simplified a bit to make
them easier to understand. We have other use cases as well, but given how
much time this took to prepare, I'll keep it at this for now.
If you need anything clarified or have any issues, just let me know.

On Fri, Oct 23, 2020 at 3:58 AM Yugo NAGATA  wrote:

> Hi Adam,
>
> On Thu, 22 Oct 2020 10:07:29 -0400
> Adam Brusselback  wrote:
>
> > Hey there Yugo,
> > I've asked a coworker to prepare a self-contained example that
> > encapsulates our multiple use cases.
>
> Thank you very much!
>
> > The immediate/eager approach is exactly what we need, as within the same
> > transaction we have statements that can cause one of those "materialized
> > tables" to be updated, and then sometimes have the need to query that
> > "materialized table" in a subsequent statement and need to see the
> > changes reflected.
>
> The proposed patch provides exactly this feature and I think this will
> meet your needs.
>
> > As soon as my coworker gets that example built up I'll send a follow-up
> > with it attached.
>
> Great! We are looking forward to it.
>
> Regards,
> Yugo Nagata
>
> --
> Yugo NAGATA 
>


02_materialized_test_data.sql
Description: Binary data


01_materialized_test_schema.sql
Description: Binary data


Re: Implementing Incremental View Maintenance

2020-10-22 Thread Adam Brusselback
Hey there Yugo,
I've asked a coworker to prepare a self-contained example that encapsulates
our multiple use cases.

The immediate/eager approach is exactly what we need, as within the same
transaction we have statements that can cause one of those "materialized
tables" to be updated, and then sometimes have the need to query that
"materialized table" in a subsequent statement and need to see the changes
reflected.

As soon as my coworker gets that example built up I'll send a follow-up with
it attached.
Thank you,
Adam Brusselback


Re: Feature improvement for pg_stat_statements

2020-09-18 Thread Adam Brusselback
That'd be useful in my book. My scripts just have a hard-coded timestamp I
replace whenever I call reset so those calculations work, but it would be
much better to have that data available from a built-in function.
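
For context, this is roughly what those scripts do today (a simplified
sketch; the literal timestamp is the bit I hand-edit after each reset):

    -- throughput since the last reset; '2020-09-01 00:00:00' is the
    -- hand-maintained reset time this feature would make unnecessary
    SELECT query,
           calls,
           calls / EXTRACT(epoch FROM now() - '2020-09-01 00:00:00'::timestamptz)
               AS calls_per_sec
    FROM pg_stat_statements
    ORDER BY calls_per_sec DESC;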

-Adam


Re: Proposal of new PostgreSQL Extension - PGSpiderExt

2020-09-16 Thread Adam Brusselback
> From my point of view it will be very helpful if such a "PgExt Store"
> will be available. Maybe such resources already exist, but I do not
> know about them.

https://pgxn.org/


Re: Persist MVCC forever - retain history

2020-07-03 Thread Adam Brusselback
> But I didn't feel big interest in it from the community.
Just FYI, it is something that I use in my database design now (just
hacked together using ranges / exclusion constraints), and I would love a
well-supported solution.
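
The hand-rolled version looks something like this (a minimal sketch with
made-up names; the real thing also has triggers keeping the history rows
up to date):

    -- btree_gist lets us mix equality on the key into a gist exclusion
    CREATE EXTENSION IF NOT EXISTS btree_gist;

    CREATE TABLE product_price_history (
        product_id   uuid NOT NULL,
        price        numeric NOT NULL,
        valid_during tstzrange NOT NULL,
        -- at most one price version per product at any instant
        EXCLUDE USING gist (product_id WITH =, valid_during WITH &&)
    );

    -- an "as of" query against the retained history:
    SELECT price
    FROM product_price_history
    WHERE product_id = $1
      AND valid_during @> timestamptz '2020-01-01';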

I've chimed in a couple of times as this feature has popped up in
discussion over the years, as I have seen others with similar needs do as
well. It sometimes just feels like spam to chime in saying "I'd find this
feature useful," so I try not to do that too much. I'd rather not step on
the community's toes.

-Adam


Re: Terminate the idle sessions

2020-06-10 Thread Adam Brusselback
> Why not implement it in the core of Postgres? Are there any disadvantages
> of implementing it in the core of Postgres?

I was surprised this wasn't a feature when I looked into it a couple of
years ago. I'd use it if it were built in, but I am not installing
something extra just for this.

> I’m curious as to the use case because I cannot imagine using this.

My use case is this: I have a primary application that connects to the DB,
and most users work through that (the setting is useless for this
scenario; the app manages its connections well enough). I also have a
number of internal users who deal with data ingestion and connect to the
DB directly to work, and those users sometimes leave query windows open
for days accidentally. Generally not an issue, but it would be nice to be
able to time those connections out.
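
If this lands as a GUC (the patch proposes idle_session_timeout, if I'm
reading it right), I could scope it to just those direct-login roles:

    -- hypothetical usage: time out only the humans, not the application
    ALTER ROLE data_ingest_user SET idle_session_timeout = '8h';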

Just my $0.02, but I am +1.
-Adam


Re: Proposal: is_castable

2020-04-03 Thread Adam Brusselback
>  What would you actually do with it?

I am one of the users of these do-it-yourself functions, and I use them in
my ETL pipelines heavily.

For me, data gets loaded into a staging table, all columns text, and I run
a whole bunch of validation queries on the data prior to it moving to the
next stage in the pipeline, a strongly typed staging table, where more
validations are performed. So I currently check each column type with my
custom can_convert_sometype(text) functions, and if a row has any columns
that cannot convert, a boolean is set to skip moving that row to the next
strongly typed table (thus avoiding the cast for those rows).

For this ETL process, I need to give users feedback about why specific
rows failed to be processed, so each of those validations also logs an
error message for the user for each row failing a specific validation.

So it's a two-step process for me currently because of this. I would love
it if there were a better way to handle this type of work though, because
my plpgsql functions using exception blocks are not exactly great for
performance.
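
For reference, those do-it-yourself functions look roughly like this (a
minimal sketch; table and column names are made up, and the real functions
exist once per target type):

    -- slow in bulk: every failed cast pays for an exception block
    CREATE OR REPLACE FUNCTION can_convert_numeric(txt text)
    RETURNS boolean LANGUAGE plpgsql IMMUTABLE AS $$
    BEGIN
        PERFORM txt::numeric;
        RETURN true;
    EXCEPTION WHEN others THEN
        RETURN false;
    END;
    $$;

    -- flag rows that can't move on to the strongly typed staging table
    UPDATE staging_raw
       SET skip_row = true
     WHERE NOT can_convert_numeric(amount_col);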

>> Similar features are implemented in:
>> - SQL Server (as TRY_CONVERT)
>> - Oracle (as CONVERT([val] DEFAULT [expr] ON CONVERSION ERROR))
>
> Somehow, I don't think those have the semantics of what you suggest
here.

Agreed that they aren't the exact same feature, but I would very much love
the ability both to know "will this cast fail?" and to "try the cast, but
if it fails just use this value and don't error".

They both have uses IMO, and while having is_castable() functions built in
would be great, I just want to express my desire for something like the
above feature from SQL Server or Oracle as well.


Re: ALTER TABLE ... SET STORAGE does not propagate to indexes

2020-03-30 Thread Adam Brusselback
> ALTER TABLE ... SET STORAGE does not propagate to indexes, even though
> indexes created afterwards get the new storage setting.  So depending on
> the order of commands, you can get inconsistent storage settings between
> indexes and tables.

I've absolutely noticed this behavior; I just thought it was intentional
for some reason.

Having this behavior change as stated above would be very welcome in my
opinion. It's always something I've had to think about manually in my
migration scripts.
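
A quick illustration of the inconsistency described above (made-up names):

    CREATE TABLE t (payload text);
    CREATE INDEX t_payload_old ON t (payload);

    -- changes the table column, but not the existing index's copy of it
    ALTER TABLE t ALTER COLUMN payload SET STORAGE EXTERNAL;

    -- a new index picks up EXTERNAL, so the two indexes now disagree
    CREATE INDEX t_payload_new ON t (payload);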

-Adam


Re: [HACKERS] make async slave to wait for lsn to be replayed

2020-03-06 Thread Adam Brusselback
I just wanted to express my excitement that this is being picked up again.
I was very much looking forward to this years ago, and the use case for me
is still there, so I am excited to see this moving again.


Re: Optimizer items in the release notes

2019-04-24 Thread Adam Brusselback
As a user, I am interested in the optimizer changes for sure, and I
actually had wished they were highlighted more in previous releases.

> I think planner smarts are arguably one of our weakest areas when
> compared to the big commercial databases. The more we can throw in
> there about this sort of thing the better.

Completely agree on both fronts. I have run into numerous optimizations
that I had taken for granted when I worked primarily with SQL Server but
that are not present in Postgres.
Work being done to make the Postgres optimizer smarter is great, as is
highlighting that work in the release notes IMO.


Re: block-level incremental backup

2019-04-23 Thread Adam Brusselback
I hope it's alright to throw in my $0.02 as a user. I've been following
this (and the other thread on reading WAL to find modified blocks,
prefaulting, whatever else) since the start with great excitement and would
love to see the built-in backup capabilities in Postgres greatly improved.
I know this is not completely on-topic for just incremental backups, so I
apologize in advance. It just seemed like the most apt place to chime in.


Just to preface where I am coming from: I have been using pgBackRest for
the past couple of years and used wal-e prior to that. I am not a big *nix
user outside of my servers; I do all my development on Windows and
primarily use Java. The command line is not where I feel most comfortable,
despite my best efforts over the last 5-6 years. Prior to Postgres, I used
SQL Server for quite a few years at previous companies, but with more of a
junior / intermediate skill set back then. I just wanted to put that out
there so you can see where my biases are.

With all that said, I would not be comfortable using pg_basebackup as my
main backup tool, simply because I'd have to cobble together numerous
tools to get backups stored in a safe (not on the same server) location,
and I'd have to manage expiring backups and the WAL which is no longer
needed, along with the rest of the stuff that makes these backup
management tools useful.


The command line scares me, and even if I were able to get all that
working, I would not feel warm and fuzzy that I hadn't messed something up
horribly, or that I wouldn't hit an edge case which destroys backups,
silently corrupts data, etc.

I love that there are tools that manage all of it: backups, WAL archiving,
remote storage, integration with cloud storage (S3 and the like),
retention of these backups with all their dependencies, and all the
restore options necessary built in as well.


Block-level incremental backup would be amazing for my use case. I have
small updates / deletes that happen to data all over some of my largest
tables. With pgBackRest, since the diff/incremental backups are at the file
level, a single update / delete which touches a random spot in a table
requires that whole 1GB segment file to be backed up again. That said, even
if pg_basebackup were the only tool that did incremental block-level backup
tomorrow, I still wouldn't start using it directly. I went into the issues
I'd have to deal with if I used pg_basebackup above, and incremental
backups without a management tool make me think using it correctly would
be much harder.


I know this thread is just about incremental backup, and that pretty much
everything in core is built up from small features into larger, more
complex ones. I understand that and am not trying to dump on any efforts;
I am super excited to see work being done in this area! I just wanted to
share my perspective on how crucial good backup management is to me (and
I'm sure a few others share my sentiment, considering how popular all the
external tools are).

I would never put a system in production unless I have some backup
management in place. If core builds a backup management tool which uses
pg_basebackup as a building block for its solution… awesome! That may be
something I'd use. If pg_basebackup can be improved into a basis that most
external backup management tools can build on top of, that's also great.
All the external tools which practically every Postgres company has built
show that this is obviously a need for a lot of users. Core will never
solve every single problem for all users, I know that. It would just be
great to see some of the fundamental features of backup management baked
into core in an extensible way.

With that, there could be a recommended way to set up backups
(full/incremental, parallel, compressed), point-in-time recovery, and
backup retention, and to perform restores (to a point in time, on a
replica server, etc.) with just the tooling within core, with a nice and
simple user interface and great performance.

If those features core supports in the internal tooling are built in an
extensible way (as has been discussed), there could be much less
duplication of work implementing the same base features over and over for
each external tool. Those companies can focus on more value-added features
to their own products that core would never support, or on improving the
tooling/performance/features core provides.


Well, this is way longer and a lot less coherent than I was hoping, so I
apologize for that. Hopefully my stream of thoughts made a little bit of
sense to someone.


-Adam


Re: Implementing Incremental View Maintenance

2018-12-31 Thread Adam Brusselback
Hi all, just wanted to say I am very happy to see progress made on this.
My codebase has multiple "materialized tables" which are maintained with
statement triggers (transition tables) and custom functions. They are ugly
and a pain to maintain, but they work because I have no other solution...
for now at least.
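
For the curious, each of those is wired up roughly like this (a simplified
sketch with made-up names; the real functions also handle UPDATE and
DELETE):

    CREATE FUNCTION maintain_invoice_summary() RETURNS trigger
    LANGUAGE plpgsql AS $$
    BEGIN
        -- fold the statement's inserted rows into the summary table
        INSERT INTO invoice_summary AS s (invoice_id, line_count, total)
        SELECT invoice_id, count(*), sum(line_total)
        FROM new_rows
        GROUP BY invoice_id
        ON CONFLICT (invoice_id) DO UPDATE
            SET line_count = s.line_count + EXCLUDED.line_count,
                total      = s.total + EXCLUDED.total;
        RETURN NULL;
    END;
    $$;

    CREATE TRIGGER invoice_line_summary_ins
        AFTER INSERT ON invoice_line
        REFERENCING NEW TABLE AS new_rows
        FOR EACH STATEMENT EXECUTE FUNCTION maintain_invoice_summary();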

> I am concerned that the eager approach only addresses a subset of the MV
> use case space, though. For example, if we presume that an MV is present
> because the underlying direct query would be non-performant, then we have
> to at least question whether applying the delta-update would also be
> detrimental to some use cases.

I will say that in my case, as long as my reads of the materialized view
are always consistent with the underlying data, that's what's important. I
don't mind if it's eager or lazy (as long as lazy still means it will
refresh prior to reading).


Re: Referential Integrity Checks with Statement-level Triggers

2018-12-17 Thread Adam Brusselback
It's something I know I am interested in. For me, I don't really care if my
statement doesn't get cancelled until the very end if there is an RI
violation. The benefit of not having deletes be slow on tables which are
referenced by foreign keys that don't have their own index is huge IMO. I
have a good number of those types of logging tables where an index is not
useful 99% of the time, but every once in a while a bulk delete needs to
happen.
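
To illustrate the shape of the problem (made-up names):

    CREATE TABLE batch (batch_id bigint PRIMARY KEY);

    CREATE TABLE event_log (
        event_id bigserial PRIMARY KEY,
        batch_id bigint NOT NULL REFERENCES batch (batch_id), -- unindexed
        payload  text
    );

    -- Today this fires a row-level RI trigger that scans event_log once
    -- per deleted batch row; a statement-level trigger with transition
    -- tables could check all the deleted keys in a single scan.
    DELETE FROM batch WHERE batch_id < 1000;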

It is far from a premature optimization IMO; it is super useful and
something I was hoping would happen ever since I heard about transition
tables being worked on.

Just my $0.02.
-Adam


Re: zheap: a new storage format for PostgreSQL

2018-11-16 Thread Adam Brusselback
> I don't know how much what I write on this thread is read by others or
> how useful this is for others who are following this work

I've been following this thread and many others like it, silently soaking
it up, because I don't feel like I'd have anything useful to add in most
cases. It is very interesting seeing the development take place though, so
just know it's appreciated, at least from my perspective.


Re: Sequential UUID Generation

2018-10-30 Thread Adam Brusselback
I would be very interested in an extension which generated sequential
UUIDs. My entire DB is keyed with UUIDs, and I have measured some index
bloat related specifically to random UUID generation.
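
For illustration, a rough approximation of the idea in plain SQL (purely
hypothetical; an extension would presumably do this in C, and this assumes
pgcrypto's gen_random_uuid()):

    CREATE EXTENSION IF NOT EXISTS pgcrypto;
    CREATE SEQUENCE uuid_block_seq;

    -- prefix a counter block onto otherwise-random bits so consecutive
    -- keys land near each other in the index
    CREATE FUNCTION seq_uuid() RETURNS uuid LANGUAGE sql AS $$
        SELECT (lpad(to_hex(nextval('uuid_block_seq') % 65536), 4, '0') ||
                substr(replace(gen_random_uuid()::text, '-', ''), 5))::uuid;
    $$;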

Thanks for bringing this up.


Re: Implementation of Flashback Query

2018-10-17 Thread Adam Brusselback
This is something I'm very interested in. It would be very helpful for
fixing mistakes you didn't catch in time.

One question, would it be possible to allow this to be able to be
configured on a hot standby and not the master?

That would be very helpful by being able to keep some arbitrary length of
extra data on a replica which does not have to be kept on the primary
server, thus not impacting performance.

Just a thought I had and was not sure if it was even feasible.

Happy to see work happening on this though.

-Adam


Re: Built-in connection pooling

2018-04-24 Thread Adam Brusselback
On Tue, Apr 24, 2018 at 9:52 AM, Merlin Moncure  wrote:
>
> Why does it have to be completely transparent?  As long as the feature
> is optional (say, a .conf setting) the tradeoffs can be managed.  It's
> a reasonable to expect to exchange some functionality for pooling;
> pgbouncer provides a 'release' query (say, DISCARD ALL)  to be called
> upon release back to the pool.  Having session state objects (not all
> of which we are talking about; advisory locks and notifications
> deserve consideration) 'just work' would be wonderful but ought not to
> hold up other usages of the feature.
>
> merlin

Just my $0.02: I wouldn't take advantage of this feature as a user without
it being transparent. I use too many of the features which would be
affected by not maintaining session state. That's one of the reasons I
only use an external JDBC pooler for my primary application, and plain ole
connections for all of my secondary services which need to just work with
temp tables, session variables, etc. I'd love it if I could use one of
those poolers (or a built-in one) which just magically increased
performance for starting up connections, lowered the overhead of idle
sessions, and didn't mess with session state.
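
A concrete example of the session state I mean (names are made up):

    -- assumes one backend per client for the whole session
    CREATE TEMP TABLE staging_batch (id bigint);
    SET my_app.current_tenant = '42';

    -- under transaction-level pooling, the next statement may run on a
    -- different backend where neither of the above exists
    SELECT current_setting('my_app.current_tenant');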

Short of that, I'll take the hit in performance, and use more memory than
I should, with direct connections for now.

Not sure how other users feel, but that's where I'm sitting for my use case.



Re: Synchronous replay take III

2018-03-02 Thread Adam Brusselback
Thanks Thomas, appreciate the rebase and the work you've done on this.
I should have some time to test this out over the weekend.

-Adam



Re: Postgres with pthread

2017-12-06 Thread Adam Brusselback
> "barely a 50% speedup" - Hah. I don't believe the numbers, but that'd be
> huge.
They are numbers derived from a benchmark that any sane person would be
running behind a connection pool in a production environment, but
impressive if true nonetheless.



Re: Postgres with pthread

2017-12-06 Thread Adam Brusselback
Here it is formatted a little better.

[The reformatted results were an inline image in the original message; the
figures are the ones shown in the quoted table below.]

So a little over 50% performance improvement for a couple of the test cases.



On Wed, Dec 6, 2017 at 11:53 AM, Tom Lane  wrote:

> Konstantin Knizhnik  writes:
> > Below are some results (1000xTPS) of select-only (-S) pgbench with scale
> > 100 at my desktop with quad-core i7-4770 3.40GHz and 16Gb of RAM:
>
> > Connections   Vanilla/default   Vanilla/prepared   pthreads/default   pthreads/prepared
> >          10               100                191                106                 207
> >         100                67                131                105                 168
> >        1000                41                 65                 55                 102
>
> This table is so mangled that I'm not very sure what it's saying.
> Maybe you should have made it an attachment?
>
> However, if I guess at which numbers are supposed to be what,
> it looks like even the best case is barely a 50% speedup.
> That would be worth pursuing if it were reasonably low-hanging
> fruit, but converting PG to threads seems very far from being that.
>
> I think you've done us a very substantial service by pursuing
> this far enough to get some quantifiable performance results.
> But now that we have some results in hand, I think we're best
> off sticking with the architecture we've got.
>
> regards, tom lane
>
>