Re: [HACKERS] GSoC - proposal - Materialized Views in PostgreSQL
On Sat, Apr 10, 2010 at 11:40 PM, Greg Smith wrote: > To be frank, that makes for a materalized view implementation of little > value over what you can currently do as far as I'm concerned. It might be > interesting as a prototype, but that's not necessarily going to look like > what's needed to do this for real at all. I'm not a big fan of dumping work > into projects when you can see exactly how it's going to fail before you > even get started. As I see if, if you know where it's going to fall down, > you don't need to build a prototype as an exercise to show you how to build > it--you should work on that part first instead. Hopefully, you're already aware that I have enormous respect for your opinions on a wide variety of topics; if not, let me publicly say that I absolutely do. Having said that, I disagree with your conclusions in this instance. I see nothing but upside from this work. It is vastly easier to write a patch that builds on existing functionality than it is to write something new from scratch. If there's any value in having manually refreshed materialized views, then having the simplest possible implementation of what those can look like committed will make it far easier to plan out next steps. While the proposed implementation may not solve a huge number of real-world problems, I think there's a good argument that some people will get good use of it. Not everyone has 1TB tables with continuous access patterns. And, provided that it doesn't conflict with anything we want to do in the future, being useful to some people is a good enough reason to put it in. I also think that you're underestimating the number of problems that will have to be solved to get this done. It's going to take some significant work - both design work and coding work - to figure out how this should integrate into the rest of the system. (What should be the value of pg_class.relkind? Where should the node representation of the snapshot query be stored? And did we handle all of those OID dependencies correctly?) Where I can see this possibly falling down (other than being just too much work for a relative PostgreSQL novice to get it done in one summer) is if there are concerns about it being incompatible with incrementally-updated views. I imagine that we're going to want to eventually support both, so we need to make sure that this implementation doesn't box us into a corner. But as far as snapshot views go, complaining that the proposed locking is too strong doesn't seem quite fair. Fixing that, AFAICS, is a very hard project, possibly involving significant planner support and an implementation of MERGE, and I would much rather try to land a fundamentals patch like this first and then deal with the gyrations that will be involved in making this work than try to land the whole thing all at once. Of course, if I'm missing something, and there's a SIMPLE way to get materialized views that can be refreshed without a full-table lock, that's another story altogether - maybe you have an idea? Finally, even if we decided NOT to merge this patch because of the limitations you mention (and right now that doesn't seem to be the consensus), having this part of it completed as a starting point for future work might be reason enough by itself. In short: I think you may be letting the perfect be the enemy of the good. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GSoC - proposal - Materialized Views in PostgreSQL
Robert Haas wrote: It's not obvious to me that a brief full-table lock wouldn't be acceptable for an initial implementation. Obviously it wouldn't be suitable for every use case but since we're talking about manually refreshed views that was bound to be true anyway. There already is an initial implementation of sorts. There are a couple of ways you can build these right now, so any new development has to look like it will end with good odds of being an improvement over what's already available before it's worth putting development resources into. As a rough idea of what people want these for in the field, based on what I've seen requests for, imagine that someone has a 1TB table they're materializing a view on in order to get at least a 10:1, and hopefully close to a 100:1, speedup on viewing summary data. Now, picture what happens if you have someone doing a sequential scan on the MV, which is still quite big, the updater process lines up to grab an exclusive lock when it's done, and now a second user wanting to read a single row quickly comes along behind it. Given a full-table lock implementation, that scenario is unlikely to play out with the second user getting a fast response. They'll likely sit in a lock queue for some potentially long period of time instead, waiting for the active seq scan to finish then the update to happen. You have to build it that way or a steady stream of people reading could block out updates forever. To be frank, that makes for a materalized view implementation of little value over what you can currently do as far as I'm concerned. It might be interesting as a prototype, but that's not necessarily going to look like what's needed to do this for real at all. I'm not a big fan of dumping work into projects when you can see exactly how it's going to fail before you even get started. As I see if, if you know where it's going to fall down, you don't need to build a prototype as an exercise to show you how to build it--you should work on that part first instead. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GSoC - proposal - Materialized Views in PostgreSQL
Heikki Linnakangas wrote: Your proposal basically describes doing 1, in a limited fashion where the view is not updated automatically, but only when the DBA runs a command to refresh it. I'm not sure if that's useful enough on its own, writing "CREATE MATERIALIZED VIEW ... SELECT ..." doesn't seem any easier than just writing "CREATE TABLA AS ...". But if you can do something about 2, or even a very limited part of 1, keeping the view up-to-date automatically, it becomes much more useful. You've hit upon the core issue here. You can build materialized views right now using "CREATE TABLE AS". You can even update them by creating a new table the same way, with a new name, and doing the LOCK/RENAME/DROP shuffle--what people used to do for rewriting bloated tables before there was CLUSTER. The first step in the proposal here is essentially syntax to give an easier UI for that. That's an interesting step, but recognize that it doesn't actually provide anything you can't do already. If you then note that doing any sort of incremental update to the view is a hard problem, and that a lot of the useful cases for materialized views involve tables where it's impractical to recreate the whole thing anyway, you'll inevitably find yourself deeply lost in the minutia of how to handle the updates. It's really the core problem in building what people expect from a materialized view implementation in a serious database. Chipping away at the other pieces around it doesn't move the feature that far forward, even if you get every single one of them except incremental updates finished, because everything else combined is still not that much work in comparison to the issues around updates. There certainly are a fair number of subproblems you can break out of here. I just think it's important to recognize that the path that leads to a useful GSoC project and the one that gives a production quality materialized view implementation may not have that much in common, and to manage expectations on both sides accordingly. If Pavel thinks he's going to end up being able to say "I added materialized views to PostgreSQL" at the end of the summer, that's going to end in disappointment. And if people think this project plan will lead to being able to claim PostgreSQL now has this feature, that's also not going to go well. If the scope is "add initial grammar and rewriting moving toward a future materialized view feature", which the underlying implementation noted as a stub prototype, that might work out OK. This is why I likened it to the work on "Syntax for partitioning", which has a similarly focused subgoal structure. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Virtual Private Database
On Sat, Apr 10, 2010 at 10:00 AM, Jean-Gérard Pailloncy wrote: > Hello, > > 1) VPD: Virtual Private Database > I would appreciate to have a new feature in PostgreSQL. > This is an oracle-like feature that implement "Row Level Security". > This feature may be emulated by using VIEW/RULE but this is very time > consuming and error prone. > > I would appreciated to have an estimated of the faisability and the cost > to implement it. > > 2) Description > The feature may be implemented with a simple expression associated to the > table. > > ALTER TABLE table_name ADD FILTER filter_name CHECK(expression); > ALTER TABLE table_name DROP FILTER filter_name; > > Usage/example: > ALTER TABLE filtered_table ADD FILTER tf_username > CHECK(filtered_table.creator=user) > SELECT * FROM filtered_table; > will really do > SELECT * FROM filtered_table WHERE filtered_table.creator=user; > > Same thing for INSERT, UDPATE, and DELETE > > UPDATE filtered_table SET b_column=1 WHERE a_column='a'; > wille really do > UPDATE filtered_table SET b_column=1 WHERE a_column='a' and > filtered_table.creator=user; > > In practice, the devs will create few function: my_login, my_logout, > my_filter > and the simple "filtered_table.creator=user" will be replace by ACL > encapsulated in the function my_filter and add a triger to check data on > INSERT, UDPATE. > We could use veil to build a very efficient filter. > > 3) Question > - Is it doable ? > - Is it the sound way of doing it ? > - Is it possible to have it in core ? > - Is there a pgsql dev interested to implemented it ? > - Is there other people interested in such feature ? > - How much this will cost ? > - With which delay ? > > > Cordialement, > Jean-Gérard Pailloncy > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers > This is quite similar to an idea I posted about not long ago called access control jails ( http://archives.postgresql.org/pgsql-hackers/2010-03/msg00832.php ). I encountered this very problem writing a simple web application involving access control. There are a variety of ways to make implementing access control easier, and I think copying off of Oracle would be among the best ;-) Disclaimer: I am not a PostgreSQL hacker, but a newbie with some experience in other communities, absorbing what he can. Access control jailing, as I conceived it, would not simply filter per table, but would make it so all queries would be filtered. If used correctly, it would even be safe to execute untrusted SQL (though it might not be advisable). I looked at the Veil demo application a tiny bit, and the only thing I drew from it was the impression that it makes things more complicated, not less :( Then again, it may just be the example demonstrating a lot of different features at once. One problem that ought to be addressed for any of these ideas is how to do connection-local variables. For instance: > UPDATE filtered_table SET b_column=1 WHERE a_column='a'; > wille really do > UPDATE filtered_table SET b_column=1 WHERE a_column='a' and > filtered_table.creator=user; Here, what is "=user" referring to? I suppose it is a variable that is set not long after the session starts and only applies to that session? PostgreSQL has temporary tables and such, but you can't reference them until they're already created. Hence, I don't think PostgreSQL elegantly supports free variables that are bound temporarily per connection. There are GUCs and such, but using them for this purpose is far from elegant, if I understand correctly. Another problem is that session-local context doesn't go well with connection pooling, so you might need some workaround like passing context IDs back and forth. That's my own summary of the discussion about access control jails linked above. By the way, here's a hack to bind a free variable to a session: CREATE FUNCTION get_user_id() RETURNS INT AS $$ DECLARE ret INT; BEGIN SELECT INTO ret id FROM user_id_tbl; RETURN ret; END $$ LANGUAGE 'plpgsql'; Then, per-session: CREATE TEMPORARY TABLE user_id_tbl (id INT); INSERT INTO user_id_tbl VALUES (5); SELECT get_user_id(); It relies on plpgsql not complaining about user_id_tbl not existing at creation time. What this trick allows one to do is set the user ID once (e.g. after connecting), then views and such that call get_user_id() will have the appropriate user ID without needing to specify it per-query. I'm curious: is this trick a good idea? Does connection pooling play well with temporary tables (and thus this trick)? Could it result in substantial slowdowns (I don't see why it should, since get_user_id() needs to be called once per query that uses it)? I guess creating a temporary table every connection has the potential to be slow. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To
Re: [HACKERS] Virtual Private Database
On Sat, Apr 10, 2010 at 10:00 AM, Jean-Gérard Pailloncy wrote: > 1) VPD: Virtual Private Database > I would appreciate to have a new feature in PostgreSQL. > This is an oracle-like feature that implement "Row Level Security". > This feature may be emulated by using VIEW/RULE but this is very time > consuming and error prone. > > I would appreciated to have an estimated of the faisability and the cost > to implement it. > > 2) Description > The feature may be implemented with a simple expression associated to the > table. > > ALTER TABLE table_name ADD FILTER filter_name CHECK(expression); > ALTER TABLE table_name DROP FILTER filter_name; > > Usage/example: > ALTER TABLE filtered_table ADD FILTER tf_username > CHECK(filtered_table.creator=user) > SELECT * FROM filtered_table; > will really do > SELECT * FROM filtered_table WHERE filtered_table.creator=user; > > Same thing for INSERT, UDPATE, and DELETE > > UPDATE filtered_table SET b_column=1 WHERE a_column='a'; > wille really do > UPDATE filtered_table SET b_column=1 WHERE a_column='a' and > filtered_table.creator=user; > > In practice, the devs will create few function: my_login, my_logout, > my_filter > and the simple "filtered_table.creator=user" will be replace by ACL > encapsulated in the function my_filter and add a triger to check data on > INSERT, UDPATE. > We could use veil to build a very efficient filter. > > 3) Question > - Is it doable ? > - Is it the sound way of doing it ? > - Is it possible to have it in core ? > - Is there a pgsql dev interested to implemented it ? > - Is there other people interested in such feature ? > - How much this will cost ? > - With which delay ? This is very similar to the design I've been thinking about for row-level security. Here is a pointer to a previous email thread on the topic of row-level security. http://archives.postgresql.org/pgsql-hackers/2009-12/msg01095.php Before row-level security can be implemented, we'd need to fix the problem described here: http://archives.postgresql.org/pgsql-hackers/2009-10/msg01346.php With respect to sponsoring development of new features, it can certainly be done. Any such feature could not at this point be added any sooner than PostgreSQL 9.1, and I'd recommend that if you want to see it in 9.1 you should try to get a contract with someone in place in the next few months. To get a price, you'd need to contact a PostgreSQL support/development company or an individual developer. The following web page might give you some ideas where to start looking. http://www.postgresql.org/support/professional_support There's sort of an understanding that we don't talk about contracts or pricing on this list, so that the content remains technical rather than commercial. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_ctl stop -m immediate on the primary server inflates sequences
On Sat, Apr 10, 2010 at 02:36:41PM +0200, Boszormenyi Zoltan wrote: > >> The above is quite reproducable, "pg_ctl stop -m immediate" > >> "usually" inflated my serial sequence, but I had two occasions > >> when not. The 69 -> 70 was one. The inflated increase is always 33: > > > > AFAIKS sequences are pre-logged with 32 values to WAL to avoid > > overhead. I suspect this is why you are seeing those gaps. > > Then it should happen all the time, even with "-m fast" or "-m smart", no? Nope, because on a normal shutdown it writes out the actual value. When you say "immediate" you mean "right now, don't bother with anything not important", like for example gaps in sequences. You're essentially crashing the DB. Have a ncie day, -- Martijn van Oosterhout http://svana.org/kleptog/ > Patriotism is when love of your own people comes first; nationalism, > when hate for people other than your own comes first. > - Charles de Gaulle signature.asc Description: Digital signature
Re: [HACKERS] extended operator classes vs. type interfaces
On Sat, Apr 10, 2010 at 2:30 PM, Jeff Davis wrote: > On Sat, 2010-04-10 at 20:25 +0200, Yeb Havinga wrote: >> I was thinking of a case for instance for ranges a,b,c in relations >> A,B,C respectively, where a && b and b && c, but not a && c. Would the >> planner consider a join path of table A and C first, then that result >> with B. After looking in doxygen, it looks like having && defined >> without MERGES is what prevents this unwanted behaviour, since that >> prevents a,b and c to become members of the same equivalence class. > > Interesting, I would have to make sure that didn't happen. Most likely > there would be a new property like "RANGEMERGES", it wouldn't reuse the > existing MERGES property. > >> Sorry for the spam on the list. > > Not at all, it's an interesting point. Yeah... I agree. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] extended operator classes vs. type interfaces
On Fri, Apr 9, 2010 at 5:49 PM, Jeff Davis wrote: > On Thu, 2010-04-08 at 22:29 -0400, Robert Haas wrote: >> 1. knngist wants to use index scans to speed up queries of the form >> SELECT ... ORDER BY(as opposed to the >> existing machinery which only knows how to use an index for SELECT ... >> ORDER BY ). >> 2. Window functions want to define windows over a range of values >> defined by the underlying data type. To do this, we need to define >> what addition and subtraction mean for a particular data type. >> 3. Jeff Davis is interested in implementing range types. When the >> underlying base type is discrete, e.g. integers, you can say that >> [1,3] = [1,4), but only if you know that 3 and 4 are consecutive (in >> that order). > > To give some context, I started a thread a while ago: > > http://archives.postgresql.org/pgsql-hackers/2009-10/msg01403.php > > Tom provided some interesting suggestions in that thread, but I'm not > sure they would work for #1 or #2. The "map && to <<" case is interesting. It doesn't seem like it's really a good candidate for type interfaces, because you you're not really looking for "the" strictly-left-of operator; you're looking for the strictly-left-of operator associated with the overlaps operator actually specified. And ideally there might be an index strategy number available for <<, too, so that you could consider doing an index scan instead of a sort, but not necessarily. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] extended operator classes vs. type interfaces
On Sat, 2010-04-10 at 20:25 +0200, Yeb Havinga wrote: > I was thinking of a case for instance for ranges a,b,c in relations > A,B,C respectively, where a && b and b && c, but not a && c. Would the > planner consider a join path of table A and C first, then that result > with B. After looking in doxygen, it looks like having && defined > without MERGES is what prevents this unwanted behaviour, since that > prevents a,b and c to become members of the same equivalence class. Interesting, I would have to make sure that didn't happen. Most likely there would be a new property like "RANGEMERGES", it wouldn't reuse the existing MERGES property. > Sorry for the spam on the list. Not at all, it's an interesting point. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] extended operator classes vs. type interfaces
Robert Haas wrote: On Sat, Apr 10, 2010 at 12:05 PM, Yeb Havinga wrote: Jeff Davis wrote: To give some context, I started a thread a while ago: http://archives.postgresql.org/pgsql-hackers/2009-10/msg01403.php Interesting, a join type for overlaps, which makes me think a bit of the staircase join for pre-post coordinates. However, does a join operator type need certain kinds of properties of the operator involved, e.g. being commutative, transitive etc? Else the join reordering fails. The latter fails for the overlap operator. I don't think I follow this. As far as I know, the join order constraints don't depend on the choice of operator. I was thinking of a case for instance for ranges a,b,c in relations A,B,C respectively, where a && b and b && c, but not a && c. Would the planner consider a join path of table A and C first, then that result with B. After looking in doxygen, it looks like having && defined without MERGES is what prevents this unwanted behaviour, since that prevents a,b and c to become members of the same equivalence class. Sorry for the spam on the list. regards, Yeb Havinga -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GSoC - proposal - Materialized Views in PostgreSQL
Greg Smith wrote: > And work on MERGE support is itself blocked behind the fact that > PostgreSQL doesn't have a good way to lock access to a key value > that doesn't exist yet--what other databases call key range > locking. The bulk of the serializable implementation WIP is work to implement just this sort of locking. There are already a couple possible spin-off uses on the horizon based on the ability of these locks to survive their initiating transactions and detect conflicting writes. Both spinoffs involve somehow flagging a transaction as being one for which the locks should be kept until further notice, and issuing a notification when a conflicting write occurs. That seems consistent with the needs of materialized views, too. It probably won't be solid in time to be useful for GSoC, but if someone's looking to map out a plan for materialized views, I thought this information might be germane. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] extended operator classes vs. type interfaces
On Sat, Apr 10, 2010 at 12:05 PM, Yeb Havinga wrote: > Jeff Davis wrote: >> >> To give some context, I started a thread a while ago: >> >> http://archives.postgresql.org/pgsql-hackers/2009-10/msg01403.php >> > > Interesting, a join type for overlaps, which makes me think a bit of the > staircase join for pre-post coordinates. However, does a join operator type > need certain kinds of properties of the operator involved, e.g. being > commutative, transitive etc? Else the join reordering fails. The latter > fails for the overlap operator. I don't think I follow this. As far as I know, the join order constraints don't depend on the choice of operator. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] extended operator classes vs. type interfaces
Jeff Davis wrote: To give some context, I started a thread a while ago: http://archives.postgresql.org/pgsql-hackers/2009-10/msg01403.php Interesting, a join type for overlaps, which makes me think a bit of the staircase join for pre-post coordinates. However, does a join operator type need certain kinds of properties of the operator involved, e.g. being commutative, transitive etc? Else the join reordering fails. The latter fails for the overlap operator. regards, Yeb Havinga -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Virtual Private Database
Hello, 1) VPD: Virtual Private Database I would appreciate to have a new feature in PostgreSQL. This is an oracle-like feature that implement "Row Level Security". This feature may be emulated by using VIEW/RULE but this is very time consuming and error prone. I would appreciated to have an estimated of the faisability and the cost to implement it. 2) Description The feature may be implemented with a simple expression associated to the table. ALTER TABLE table_name ADD FILTER filter_name CHECK(expression); ALTER TABLE table_name DROP FILTER filter_name; Usage/example: ALTER TABLE filtered_table ADD FILTER tf_username CHECK(filtered_table.creator=user) SELECT * FROM filtered_table; will really do SELECT * FROM filtered_table WHERE filtered_table.creator=user; Same thing for INSERT, UDPATE, and DELETE UPDATE filtered_table SET b_column=1 WHERE a_column='a'; wille really do UPDATE filtered_table SET b_column=1 WHERE a_column='a' and filtered_table.creator=user; In practice, the devs will create few function: my_login, my_logout, my_filter and the simple "filtered_table.creator=user" will be replace by ACL encapsulated in the function my_filter and add a triger to check data on INSERT, UDPATE. We could use veil to build a very efficient filter. 3) Question - Is it doable ? - Is it the sound way of doing it ? - Is it possible to have it in core ? - Is there a pgsql dev interested to implemented it ? - Is there other people interested in such feature ? - How much this will cost ? - With which delay ? Cordialement, Jean-Gérard Pailloncy -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GSoC - proposal - Materialized Views in PostgreSQL
2010/4/10 Andrew Dunstan : > Heikki Linnakangas wrote: >> >> 1. Keep the materialized view up-to-date when the base tables change. >> This can be further divided into many steps, you can begin by supporting >> automatic updates only on very simple views with e.g a single table and >> a where clause. Then extend that to support joins, aggregates, >> subqueries etc. Keeping it really limited, you could even require the >> user to write the required triggers himself. > > That last bit doesn't strike me as much of an advance. Isn't the whole point > of this to automate it? Creating greedy materialized views is usually not > terribly difficult now, but you do have to write the triggers. Yeah, I agree. > The other thing that could be interesting about this would be some scheme > for lazy refresh that didn't involve re-extracting the whole data set. One way to do this would be to infer a primary key for the result set based on the input query. But I think we don't really have the infrastructure to do this right now, so not really a project for a beginner. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GSoC - proposal - Materialized Views in PostgreSQL
Heikki Linnakangas wrote: 1. Keep the materialized view up-to-date when the base tables change. This can be further divided into many steps, you can begin by supporting automatic updates only on very simple views with e.g a single table and a where clause. Then extend that to support joins, aggregates, subqueries etc. Keeping it really limited, you could even require the user to write the required triggers himself. That last bit doesn't strike me as much of an advance. Isn't the whole point of this to automate it? Creating greedy materialized views is usually not terribly difficult now, but you do have to write the triggers. The other thing that could be interesting about this would be some scheme for lazy refresh that didn't involve re-extracting the whole data set. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] master in standby mode croaks
On Fri, Apr 2, 2010 at 5:36 AM, Simon Riggs wrote: > I can't duplicate this error based upon what you have said. I fooled around with this some more and I think I know what's going on. The error message I received was: recovery connections cannot start because the recovery_connections parameter is disabled on the WAL source server This is generated when !checkPoint.XLogStandbyInfoMode. That, in turn, is set on the master to the results of XLogStandbyInfoActive(), which is defined as XLogRequestRecoveryConnections && XLogIsNeeded(). XLogIsNeeded() is defined as XLogArchivingActive() || (max_wal_senders > 0), and XLogArchivingActive() is defined as XLogArchiveMode. So when you expand it all out, this error message gets triggered when the following condition does not hold on the master: XLogRequestRecoveryConnections && (XLogArchiveMode || (max_wal_senders > 0)) So this can fail in either of two ways: (1) XLogRequestRecoveryConnections (aka recovery_connections) might be false, which is the situation described in the error message, or (2) XLogArchiveMode (archive_mode) might be false and at the same time max_wal_senders might be zero. As it happens, the default configuration of the system is recovery_connections = true, archive_mode = false, max_wal_senders = 0, so with an out-of-the-box config it fails for the reason that isn't the one described in the error message. One possible approach here is to improve the error message, but it seems to me that having the ability of Hot Standby to run on the slave partially controlled by three different GUCs is awfully complicated. I think the root of the problem here is that recovery_connections controls one behavior on the primary (whether or not we WAL-log certain information needed for HS) and a completely unrelated behavior on the standby (whether or not we try to allow read-only backends into the system). In 8.4 and prior, it was always the job of archive_mode to decide whether WAL-logging was needed. Maybe we should go back to that and make it an enum: wal_mode = {standby | archive | off} ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_ctl stop -m immediate on the primary server inflates sequences
Bernd Helmle írta: > > > --On 10. April 2010 09:26:41 +0200 Boszormenyi Zoltan > wrote: > >> The above is quite reproducable, "pg_ctl stop -m immediate" >> "usually" inflated my serial sequence, but I had two occasions >> when not. The 69 -> 70 was one. The inflated increase is always 33: > > AFAIKS sequences are pre-logged with 32 values to WAL to avoid > overhead. I suspect this is why you are seeing those gaps. Then it should happen all the time, even with "-m fast" or "-m smart", no? It seemed like my sequences have a CACHE 32 setting, which would apply to every client that connects, runs nextval() once and disconnects. But it didn't happen all the time, so it's not deterministic. -- Bible has answers for everything. Proof: "But let your communication be, Yea, yea; Nay, nay: for whatsoever is more than these cometh of evil." (Matthew 5:37) - basics of digital technology. "May your kingdom come" - superficial description of plate tectonics -- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH http://www.postgresql.at/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] a faster compression algorithm for pg_dump
On Fri, Apr 9, 2010 at 5:51 AM, Greg Stark wrote: > Linking against as an option isn't nearly as bad since the user > compiling it can choose whether to include the restricted feature or > not. That's what we do with readline. However it's not nearly as > attractive when it restricts what file formats Postgres supports -- it > means someone might generate backup dump files that they later > discover they don't have a legal right to read and restore :( If we only linked against it, we'd leave it up to the user to weigh the risk as long as we are not aware of any such violation. Our top priority is to make sure that the project would not be harmed if one day such a patent showed up. If I understood you correctly, this is not an issue, even if we included lzf and less again if we only link against it. The rest is about user education and using lzf only in pg_dump and not for toasting, we could show a message in pg_dump if lzf is chosen to make the user aware of the possible issues. If we still cannot do this, then what I am asking is: What does the project need to be able to at least link against such a compression algorithm? Is it a list of 10, 20, 50 or more other projects using it or is it a lawyer saying: "There is no patent."? But then, how can we be sure that the lawyer is right? Or couldn't we include it even if we had both, because again, we couldn't be sure... ? Joachim -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_ctl stop -m immediate on the primary server inflates sequences
--On 10. April 2010 09:26:41 +0200 Boszormenyi Zoltan wrote: The above is quite reproducable, "pg_ctl stop -m immediate" "usually" inflated my serial sequence, but I had two occasions when not. The 69 -> 70 was one. The inflated increase is always 33: AFAIKS sequences are pre-logged with 32 values to WAL to avoid overhead. I suspect this is why you are seeing those gaps. -- Thanks Bernd -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_ctl stop -m immediate on the primary server inflates sequences
Boszormenyi Zoltan írta: > Hi, > > I wanted to test HS/SR and have setup two instances, one primary > and one secondary, the secondary is obviously a copy of the primary > while pg_start_backup() was in effect. > > I started up the secondary server after "SELECT pg_stop_backup()" on > the primary. I stopped and started the primary with "-m fast" and > "-m immediate" and I noticed that the sequence that was created for > my serial field was inflated if I used "-m immediate". > > Here's the scenario: > > - primary and secondary are running, then: > > zozo=# create table t1 (id serial primary key, t text); > NOTICE: CREATE TABLE will create implicit sequence "t1_id_seq" for > serial column "t1.id" > NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "t1_pkey" > for table "t1" > CREATE TABLE > zozo=# insert into t1 (t) values ('a'); > INSERT 0 1 > zozo=# > > - stop the primary with "-m fast" (the connection was still alive to it) > and start it again, then: > > zozo=# \q > [z...@db00 pgd90dev-xlog-slave]$ psql -p 65432 zozo > psql (9.0devel) > Type "help" for help. > > zozo=# select * from t1; > id | t > +--- > 1 | a > (1 row) > > zozo=# insert into t1 (t) values ('b'); > INSERT 0 1 > zozo=# select * from t1; > id | t > +--- > 1 | a > 2 | b > (2 rows) > > - stop the primary with "-m immediate" (connection was alive on it) > and start it again, then: > > zozo=# \q > [z...@db00 pgd90dev-xlog-slave]$ psql -p 65432 zozo > psql (9.0devel) > Type "help" for help. > > zozo=# select * from t1; > id | t > +--- > 1 | a > 2 | b > (2 rows) > > zozo=# insert into t1 (t) values ('b'); > INSERT 0 1 > zozo=# select * from t1; > id | t > +--- > 1 | a > 2 | b > 35 | b > (3 rows) > > The above is quite reproducable, "pg_ctl stop -m immediate" > "usually" inflated my serial sequence, but I had two occasions > when not. The 69 -> 70 was one. The inflated increase is always 33: > > [z...@db00 pgd90dev-xlog-slave]$ psql -p 65432 zozo > psql (9.0devel) > Type "help" for help. > > zozo=# insert into t1 (t) values ('f'); > INSERT 0 1 > zozo=# select * from t1; > id | t > -+--- >1 | a >2 | b > 35 | b > 36 | c > 69 | d > 70 | e > 103 | f > (7 rows) > > Let's try with a plain sequence: > > zozo=# create sequence s1; > CREATE SEQUENCE > zozo=# select nextval('s1'); > nextval > - >1 > (1 row) > > I stopped the primary at this point with "-m immediate", > and from this first result I thought that a plain sequence is > not bothered by this: > > zozo=# \q > [z...@db00 pgd90dev-xlog-slave]$ psql -p 65432 zozo > psql (9.0devel) > Type "help" for help. > > zozo=# select nextval('s1'); > nextval > - >2 > (1 row) > > zozo=# insert into t1 (t) values ('g'); > INSERT 0 1 > zozo=# select * from t1; > id | t > -+--- >1 | a >2 | b > 35 | b > 36 | c > 69 | d > 70 | e > 103 | f > 136 | g > (8 rows) > > But another restart and: > > zozo=# \q > [z...@db00 pgd90dev-xlog-slave]$ psql -p 65432 zozo > psql (9.0devel) > Type "help" for help. > > zozo=# select nextval('s1'); > nextval > - > 35 > (1 row) > > zozo=# select * from t1; > id | t > -+--- >1 | a >2 | b > 35 | b > 36 | c > 69 | d > 70 | e > 103 | f > 136 | g > (8 rows) > > zozo=# insert into t1 (t) values ('h'); > INSERT 0 1 > zozo=# select * from t1; > id | t > -+--- >1 | a >2 | b > 35 | b > 36 | c > 69 | d > 70 | e > 103 | f > 136 | g > 169 | h > (9 rows) > > It happened with a CVS version of about 2 weeks ago and the > yesterday's version, as well. I think it's not intentional, it must be > a race somewhere, as it doesn't happen all the time. > > Best regards, > Zoltán Böszörményi > And to show that it doesn't happen with "-m fast", I tried it three times. Before quitting from psql, I stopped the primary with "-m fast" each time: [z...@db00 pgd90dev-xlog-slave]$ psql -p 65432 zozo psql (9.0devel) Type "help" for help. zozo=# insert into t1 (t) values ('i'); INSERT 0 1 zozo=# \q [z...@db00 pgd90dev-xlog-slave]$ psql -p 65432 zozo psql (9.0devel) Type "help" for help. zozo=# insert into t1 (t) values ('i'); INSERT 0 1 zozo=# \q [z...@db00 pgd90dev-xlog-slave]$ psql -p 65432 zozo psql (9.0devel) Type "help" for help. zozo=# insert into t1 (t) values ('i'); INSERT 0 1 zozo=# select * from t1; id | t -+--- 1 | a 2 | b 35 | b 36 | c 69 | d 70 | e 103 | f 136 | g 169 | h 170 | i 171 | i 172 | i (12 rows) Best regards, Zoltán Böszörményi -- Bible has answers for everything. Proof: "But let your communication be, Yea, yea; Nay, nay: for whatsoever is more than these cometh of evil." (Matthew 5:37) - basics of digital technology. "May your kingdom come" - superficial description of plate tectonics -- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH http://www.postgresql.at/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) T
[HACKERS] pg_ctl stop -m immediate on the primary server inflates sequences
Hi, I wanted to test HS/SR and have setup two instances, one primary and one secondary, the secondary is obviously a copy of the primary while pg_start_backup() was in effect. I started up the secondary server after "SELECT pg_stop_backup()" on the primary. I stopped and started the primary with "-m fast" and "-m immediate" and I noticed that the sequence that was created for my serial field was inflated if I used "-m immediate". Here's the scenario: - primary and secondary are running, then: zozo=# create table t1 (id serial primary key, t text); NOTICE: CREATE TABLE will create implicit sequence "t1_id_seq" for serial column "t1.id" NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "t1_pkey" for table "t1" CREATE TABLE zozo=# insert into t1 (t) values ('a'); INSERT 0 1 zozo=# - stop the primary with "-m fast" (the connection was still alive to it) and start it again, then: zozo=# \q [z...@db00 pgd90dev-xlog-slave]$ psql -p 65432 zozo psql (9.0devel) Type "help" for help. zozo=# select * from t1; id | t +--- 1 | a (1 row) zozo=# insert into t1 (t) values ('b'); INSERT 0 1 zozo=# select * from t1; id | t +--- 1 | a 2 | b (2 rows) - stop the primary with "-m immediate" (connection was alive on it) and start it again, then: zozo=# \q [z...@db00 pgd90dev-xlog-slave]$ psql -p 65432 zozo psql (9.0devel) Type "help" for help. zozo=# select * from t1; id | t +--- 1 | a 2 | b (2 rows) zozo=# insert into t1 (t) values ('b'); INSERT 0 1 zozo=# select * from t1; id | t +--- 1 | a 2 | b 35 | b (3 rows) The above is quite reproducable, "pg_ctl stop -m immediate" "usually" inflated my serial sequence, but I had two occasions when not. The 69 -> 70 was one. The inflated increase is always 33: [z...@db00 pgd90dev-xlog-slave]$ psql -p 65432 zozo psql (9.0devel) Type "help" for help. zozo=# insert into t1 (t) values ('f'); INSERT 0 1 zozo=# select * from t1; id | t -+--- 1 | a 2 | b 35 | b 36 | c 69 | d 70 | e 103 | f (7 rows) Let's try with a plain sequence: zozo=# create sequence s1; CREATE SEQUENCE zozo=# select nextval('s1'); nextval - 1 (1 row) I stopped the primary at this point with "-m immediate", and from this first result I thought that a plain sequence is not bothered by this: zozo=# \q [z...@db00 pgd90dev-xlog-slave]$ psql -p 65432 zozo psql (9.0devel) Type "help" for help. zozo=# select nextval('s1'); nextval - 2 (1 row) zozo=# insert into t1 (t) values ('g'); INSERT 0 1 zozo=# select * from t1; id | t -+--- 1 | a 2 | b 35 | b 36 | c 69 | d 70 | e 103 | f 136 | g (8 rows) But another restart and: zozo=# \q [z...@db00 pgd90dev-xlog-slave]$ psql -p 65432 zozo psql (9.0devel) Type "help" for help. zozo=# select nextval('s1'); nextval - 35 (1 row) zozo=# select * from t1; id | t -+--- 1 | a 2 | b 35 | b 36 | c 69 | d 70 | e 103 | f 136 | g (8 rows) zozo=# insert into t1 (t) values ('h'); INSERT 0 1 zozo=# select * from t1; id | t -+--- 1 | a 2 | b 35 | b 36 | c 69 | d 70 | e 103 | f 136 | g 169 | h (9 rows) It happened with a CVS version of about 2 weeks ago and the yesterday's version, as well. I think it's not intentional, it must be a race somewhere, as it doesn't happen all the time. Best regards, Zoltán Böszörményi -- Bible has answers for everything. Proof: "But let your communication be, Yea, yea; Nay, nay: for whatsoever is more than these cometh of evil." (Matthew 5:37) - basics of digital technology. "May your kingdom come" - superficial description of plate tectonics -- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH http://www.postgresql.at/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Set LC_COLLATE to de_DE_phoneb
On Thu, Apr 08, 2010 at 06:59:44PM +0200, Frank Jagusch wrote: > I thought PostgreSQL is using OS specific functions for sorting (means > Win32 functions?). This sounds not good for my request. So I ask the > whole audience: Are others out there asking for a support for the > alternate sort orders? Is it worth to discuss further in this direction? > > Here an other Idea: Is there a way to define a custom collation for a > database? Over the years there have been various options suggested and various patches posted but they never make it, for various reasons. The only concerted effort I know of is the PostgreSQL ICU patch which is in FreeBSD ports, to work around the fact that it's C library doesn't handle UTF-8 collation at all. http://people.freebsd.org/~girgen/postgresql-icu/README.html I doubt it's been tested for Windows, but if it works it will give you alternate sort orders and even custom sort orders. Have a nice day, -- Martijn van Oosterhout http://svana.org/kleptog/ > Patriotism is when love of your own people comes first; nationalism, > when hate for people other than your own comes first. > - Charles de Gaulle signature.asc Description: Digital signature
Re: [HACKERS] GSoC - proposal - Materialized Views in PostgreSQL
Greg Smith wrote: > The main hidden complexity in this particular project relates to > handling view refreshes. The non-obvious problem is that when the view > updates, you need something like a SQL MERGE to really handle that in a > robust way that doesn't conflict with concurrent access to queries > against the materialized view. And work on MERGE support is itself > blocked behind the fact that PostgreSQL doesn't have a good way to lock > access to a key value that doesn't exist yet--what other databases call > key range locking. See the notes for "Add SQL-standard > MERGE/REPLACE/UPSERT command" at http://wiki.postgresql.org/wiki/Todo > for more information. > > You can work around that to build a prototype by grabbing a full table > lock on the materialized view when updating it, but that's not a > production quality solution. It would still be useful for many applications. And it would provide a basis to extend later. You don't need to solve all problems at once, as long as what you implement is a useful subset. > Now, with all that said, that doesn't mean there's not a useful project > for you buried in this mess. The first two steps in your plan: > > 1) create materialized view > 2) change rewriter > > Include building a prototype grammer, doing an initial executor > implementation, and getting some sort of rewriter working. That is > potentially good groundwork to lay here. I would suggest that you > completely drop your step 3: > > 3) create command that takes snapshot (refresh MV) > > Because you cannot built that in a way that will be useful (and by that > I mean committable quality) until there's a better way to handle updates > than writing a whole new table and grabbing a full relation lock to > switch to it. To do a good job just on the first two steps should take > at least a whole summer anyway--there's a whole stack of background > research needed I haven't seen anyone do yet, and that isn't on your > plan yet. There is a precedent for taking this approach. After getting > stalled trying to add the entirety of easy partitioning to PostgreSQL, > the current scope has been scaled back to just trying to get the syntax > and on-disk structure right, then finish off the implementation. See > http://wiki.postgresql.org/wiki/Table_partitioning to get an idea how > that's been broken into those two major chunks. The good thing about this subject for GSoC is that it can be divided into many small steps. There's two largely independent main parts: 1. Keep the materialized view up-to-date when the base tables change. This can be further divided into many steps, you can begin by supporting automatic updates only on very simple views with e.g a single table and a where clause. Then extend that to support joins, aggregates, subqueries etc. Keeping it really limited, you could even require the user to write the required triggers himself. 2. Teach the planner to use materialized views automatically when a query references the base tables. So if you issue the query "SELECT * FROM table WHERE foo > 10 AND bar = 10", and there's a materialized view on "SELECT * FROM table WHERE bar = 10", the planner can transform the original query into "SELECT * FROM materializedview WHERE foo > 10". This largely depends on 1, although some DBMSs offer the option to use manually refreshed materialized views too, knowing that they might not be completely up-to-date. There's a lot room to choose which problems you want to tackle, which is good for a summer-of-code project. Your proposal basically describes doing 1, in a limited fashion where the view is not updated automatically, but only when the DBA runs a command to refresh it. I'm not sure if that's useful enough on its own, writing "CREATE MATERIALIZED VIEW ... SELECT ..." doesn't seem any easier than just writing "CREATE TABLA AS ...". But if you can do something about 2, or even a very limited part of 1, keeping the view up-to-date automatically, it becomes much more useful. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers