> If there's no data compressed using the compression method you
> dropped, everything is cool - otherwise everything is broken and
> there's no way to recover. The only obvious alternative is to
> disallow DROP altogether (or make it not really DROP).

Wouldn't whatever was using the compression method have something marking which method was used? If so, couldn't we scan for any data still using it and, if any exists, disallow the drop -- or optionally allow the drop and rewrite the table either uncompressed or with the default compression method?

--
Sent via pgsql-hackers mailing list (email@example.com)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
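A rough sketch of the check being proposed. The catalog details for a custom-compression-method patch are speculative here; for reference, PostgreSQL 14 later added a per-column `pg_attribute.attcompression` field and a `pg_column_compression()` function for the built-in methods:

```sql
-- Hypothetical pre-DROP check: find columns configured to use the
-- method being dropped (attcompression values shown are the ones
-- PostgreSQL 14 uses for its built-in methods).
SELECT c.relname, a.attname
FROM pg_attribute a
JOIN pg_class c ON c.oid = a.attrelid
WHERE a.attcompression = 'l';   -- 'l' = lz4

-- Per-value inspection is also possible for built-in methods:
-- SELECT DISTINCT pg_column_compression(payload) FROM my_table;
```

If any rows come back, the DROP could be refused, or the affected tables rewritten first, which is the "rewrite the table" option suggested above.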
So, from following this discussion and others focused on making the planner "smarter", there is always an argument to be had over wasting planner cycles, and it's always a hard-fought battle to get any changes made. I'm speaking without any knowledge of the Postgres internals, so please bear with me.

It seems like the underlying "feature" that would allow more edge-case optimizations into the planner is a framework for knowing when to consider those optimizations, and when to skip expensive planning steps rather than waste time on them. With a system like that, Postgres could maintain (or even improve) its hard-fought quick planning time for OLTP queries, while spending more time planning OLAP queries, where an extra 500ms of planning may save minutes of execution time. In my database, I have analytical queries on which I wouldn't mind spending multiple seconds of planning if necessary to produce an optimal plan, because execution time dominates everything.

What that system looks like is something I have no real opinion or authority on. There seem to be three options: automatic, manual, and automatic with manual override. Two of those -- manual, and automatic with manual override -- seem a little too close to query hints to be considered, judging from all previous discussions I've heard (correct me if I'm wrong). So that leaves automatic as the only option I can see being viable. Then comes the question of whether it's possible to classify queries cheaply enough to know which optimizations we should attempt, and which shouldn't even be considered because the planning time would dominate.

I'll leave my thoughts at that, and await comments from the smarter people in the room who know exactly why this idea wouldn't work.
The general thought behind it is to make it easier for more and more query optimizations to make it into the planner, without hurting queries that shouldn't spend their time on them. If there are better ways of going about it, great!

Thanks for listening to me ramble,
-Adam
> I can't get very excited about this one either, though I do believe it
> can arise as the author says, "when you build complex views and JOIN
> them to each other". Maybe I'm not excited about it because I've not
> needed it :)

This is one that I know would help with my database. There is a ton of logic stored in views, which get joined back to the original table to filter the set, rather than imposing that set of conditions in every separate query. It would be really nice if the optimizer could simplify those to eliminate the self-join. The join almost always happens on the primary key of the table, and if not, it's on a non-nullable column for sure.

On another note:

> turning ORs into UNIONs

This is another one that would be incredibly useful for me. I've had to do this rewrite manually for performance reasons far too often.

> Partially agreed. A comment to the article also mentions that some other
> database performs more optimizations depending on the cost of the
> plan. That's not easy to do in our current plan structure, but I think
> it's quite a worthwhile concept.

I would love to see this in Postgres. It would allow the planner to avoid wasting cycles on queries where extra optimization just isn't needed, and to spend a few more cycles planning very costly queries to save far more while executing.
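For the record, the OR-to-UNION rewrite mentioned above looks something like this (table and column names are illustrative). Note that for a simple single-table OR like this, PostgreSQL's bitmap-OR scans often already cope; the manual rewrite pays off mainly when the OR spans join conditions:

```sql
-- Original form; with an OR across two differently-indexed columns,
-- the planner may fall back to a less efficient scan:
SELECT * FROM orders WHERE customer_id = 42 OR salesperson_id = 7;

-- Rewritten so each branch can use its own index. Plain UNION (not
-- UNION ALL) deduplicates rows that match both conditions:
SELECT * FROM orders WHERE customer_id = 42
UNION
SELECT * FROM orders WHERE salesperson_id = 7;
```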
Hopefully it's alright for me to post this here; please let me know if not. I ran across an article on blog.jooq.org comparing the major RDBMSs on their ability to optimize away unnecessary work in less-than-optimal queries, and saw some discussion on Hacker News and Reddit, but I hadn't seen any discussion here. The article in question is: https://blog.jooq.org/2017/09/28/10-cool-sql-optimisations-that-do-not-depend-on-the-cost-model/

Commercial databases seem to have a serious leg up in this area, and there are even some optimizations MySQL has that Postgres doesn't. I was wondering which of these are planned, which have been discussed before and rejected, and which just haven't been thought of? I thought I'd bring it up so that hopefully others who are more knowledgeable can chime in.
I know that for my use cases, having both options available would be very appreciated. The vast majority of the computed columns I would use in my database would be fine to compute on read, but there are certainly some for which computing on read would be performance-prohibitive, so I'd rather have those stored. So for me, I'd default to compute-on-read, as long as storing the pre-computed value is an option when necessary.

Just my $0.02

Thanks,
-Adam
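For context, the SQL standard's generated columns cover both variants being discussed: VIRTUAL (computed on read) and STORED (computed on write). PostgreSQL ended up shipping the STORED variant in version 12, with syntax like this (table and column names are illustrative):

```sql
-- A stored generated column: "total" is computed from the other
-- columns when a row is written, and read back like an ordinary column.
CREATE TABLE items (
    price numeric NOT NULL,
    qty   integer NOT NULL,
    total numeric GENERATED ALWAYS AS (price * qty) STORED
);
```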
My understanding is that global indexes allow foreign keys to work naturally with partitioned tables, or with tables in an inheritance hierarchy. That is pretty big IMO, as it lets you partition a table without trading away your database integrity.
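A small illustration of the limitation a global index would lift (table and column names are illustrative):

```sql
-- A unique index on a partitioned table must currently include the
-- partition key, because each partition only enforces uniqueness
-- locally:
CREATE TABLE accounts (
    id     bigint,
    region text,
    PRIMARY KEY (id, region)   -- forced to include the partition key
) PARTITION BY LIST (region);
```

A global index spanning all partitions would allow `PRIMARY KEY (id)` on its own, and with it a natural `FOREIGN KEY (account_id) REFERENCES accounts (id)` from other tables.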
>>> I'll give it a few days for objections before reverting.
>>
>> I can only say that the lack of this feature comes up on a weekly
>> basis on IRC, and a lot of people would be disappointed to see it
>> reverted.

Not that my opinion matters, but I was very much looking forward to this feature in Postgres 10, though I understand if it just won't be stable enough to stay in. I have my fingers crossed that these issues can be resolved.
> From a user's perspective, I think most people prefer #2 because:
>
> * most users writing queries prefer #2
>
> * most users assume full optimization, and it seems natural to turn
>   _off_ an optimization via a keyword
>
> * while some queries can be inlined, all queries can be materialized,
>   so doing #1 means INLINE would be only a preference, which could be
>   confusing

I completely agree with this reasoning. I have a few queries I would have to touch to add "MATERIALIZED", but the vast majority of CTEs in my codebase would get a speedup. It would allow using CTEs more freely than now; I currently avoid them unless one really simplifies a query, because of the optimization fence. Not that my opinion holds any weight, but the extra keyword for enabling the optimization fence is my preference -- trying to optimize more by default is a good thing IMO.

> Anyway, I am very glad we are considering addressing this in PG 11.

Seconded. This is a sore spot for me when using Postgres, and I'd love for it to no longer be an issue.

Thanks,
-Adam
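This is the option #2 design PostgreSQL ultimately adopted (in version 12): CTEs are inlined by default, and the fence is opt-in via a keyword. The query below is illustrative:

```sql
-- Explicit optimization fence: the CTE is computed once, and outer
-- predicates are not pushed down into it.
WITH recent AS MATERIALIZED (
    SELECT * FROM events WHERE ts > now() - interval '1 day'
)
SELECT * FROM recent WHERE user_id = 42;
```

Without the MATERIALIZED keyword, the planner is free to inline the CTE into the outer query, so the `user_id = 42` filter can reach the `events` scan directly.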
> BTW, it just occurred to me that having this separation would make it
> relatively easy to support re-directing DML queries from a replica to
> the master; if the backend throws the error indicating you tried to
> write data, the connection layer could re-route that.

This sounds like it would also allow re-routing in the other direction: where you know the replica contains up-to-date data, couldn't you re-direct read-only queries to your replicas?
> There may be some situations where crawling the indexes a row at a
> time will perform better than this by enough to want to retain that
> option.

If an index existed, wouldn't the set-based implementation still be able to use it? Is there something that would inherently make the set-based check ever worse than the row-based one?
> The delete of about one million rows from a "parent" table with no
> matching rows in the "child" table, and no index on the referencing
> column in the child table, took 24:17.969 using current triggers
> and 00:03.262 using the set-based triggers. Yes, that reduces
> current run time for that case by 99.78%.

That is really incredible. It gets rid of the need for an index on the referencing columns for a ton of use cases.
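A sketch of where that speedup comes from (names are illustrative, not the patch's actual implementation). Today's RI triggers effectively run one check per deleted parent row, roughly `SELECT 1 FROM child WHERE child.parent_id = $1` -- a million sequential scans when the referencing column has no index. A set-based trigger can check all deleted rows in one statement against the statement's transition table:

```sql
-- One join over the whole set of deleted parent rows; the planner can
-- pick a single hash or merge join instead of a million table scans.
SELECT 1
FROM old_rows o                  -- OLD TABLE transition relation
JOIN child c ON c.parent_id = o.id
LIMIT 1;                         -- any match is a violating reference
```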
I apologize in advance if I didn't follow the process exactly. I was going to attempt to review this to move it along, but ran into issues applying the patch cleanly to master. I fixed the issues I was having and created a new patch (attached). I managed to test it out after I got it applied, and it is working as expected for exclusion constraints as well as normal indexes. The test was performed on Windows 10 (64-bit), with Postgres compiled using MSYS2.

Attachment: btree_gist_uuid_7.patch (binary data)
Just throwing an anecdote out there, but my company uses UUIDs for primary keys on every table in the DB. While int4 is for sure more popular, it would be nice if there weren't even more reasons to "force" people in that direction. I started regretting the decision to go with UUID primary keys slightly once I realized that we'd need exclusion constraints, and you have to jump through hoops to use the two together.

My main point is that maybe the reason most users use int4 pkeys (besides conventional wisdom) is that they get the most support from features like this, and the skew might not be quite as strong if the same support were given to other types.

Just my $0.02
-Adam

On Fri, Oct 21, 2016 at 4:46 PM, Jim Nasby
wrote:
> On 10/19/16 7:52 AM, Robert Haas wrote:
>> So, I think that this is a really promising direction, but also that
>> you should try very hard to get out from under this 6-byte PK
>> limitation. That seems really ugly, and in practice it probably means
>> your PK is going to be limited to int4, which is kind of sad since it
>> leaves people using int8 or text PKs out in the cold.
>
> My impression is that int4 is by far the most popular PK type. Even if
> the initial implementation is limited to that, I think it'd have a lot
> of potential.
> --
> Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
> Experts in Analytics, Data Architecture and PostgreSQL
> Data in Trouble? Get it in Treble! http://BlueTreble.com
> 855-TREBLE2 (855-873-2532) mobile: 512-569-9461
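For anyone following along, the "hoops" the btree_gist patch removes look like this: GiST exclusion constraints need an equality operator class for every column, and without uuid support in btree_gist you had to cast the key to text or bytea. With the patch, the natural form works (table and column names are illustrative):

```sql
CREATE EXTENSION IF NOT EXISTS btree_gist;

-- A uuid column participating directly in a GiST exclusion
-- constraint: no two reservations for the same room may overlap.
CREATE TABLE reservations (
    room   uuid,
    during tstzrange,
    EXCLUDE USING gist (room WITH =, during WITH &&)
);
```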