Re: [HACKERS] Substituting Checksum Algorithm (was: Enabling Checksums)
On 28 April 2013 13:22, Ants Aasma wrote:
> I have updated the base patch. This is supposed to go under the
> cflags-vector patch that Jeff posted yesterday.

I've committed your patch, with two changes:
* one comment extended
* adding the .h file from Jeff's last main patch

Ants and Jeff, please review the commit and confirm that it is what we wanted. Thanks.

I'm about to light up the build farm with a trial commit of the compiler instructions stuff.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Functional dependencies and GROUP BY - for subqueries
On Mon, Apr 29, 2013 at 7:31 PM, Tom Lane wrote:
> Ashutosh Bapat writes:
> > Is there any reason why we want to check the functional dependencies at
> > the time of parsing and not after rewrite? Obviously, by doing so, we will
> > allow creation of certain views which will start throwing errors after the
> > underlying table changes the primary key. Is it mandatory that we throw
> > "functional dependency" related errors at the time of creation of views?
>
> From a usability standpoint, I would think so. And really the only
> excuse for the functional-dependency feature to exist at all is
> usability; it adds nothing you couldn't do without it.
>
> If we wanted to do something like this, I think the clean way to do it
> would be to invent a notion of unique/not-null/pkey constraints on
> views, so that the creator of a view could declaratively say that he
> wants such a property exposed. That is, the example would become
> something like
>
> create table t1 (id int primary key, ... other stuff ...);
> create view v1 as select * from t1;
> alter view v1 add primary key(id);
> create view v2 as select * from v1 group by id;
>
> The pkey constraint on v1 is just a catalog entry with a dependency on
> t1's pkey constraint; there's no actual index there. But now, v2 can
> be built with a dependency on v1's pkey, not t1's, and the
> action-at-a-distance problem goes away. For example, a "CREATE OR REPLACE v1"
> command could check that the new view definition still provides
> something for v1's pkey to depend on, and throw error or not without any
> need to examine the contents of other views. Dropping various elements
> of this schema would work unsurprisingly, too.
>
> This would, of course, require a significant chunk of new code, and
> personally I do not think the feature would be worth it. But it
> would be a clean and usable design.

Yes, this looks like a better design, but I do not see any interest in it.
So, if I have to spend time here, there is a higher chance it would go to waste.

Would it be useful to have the primary-key grouping functionality extended to subqueries? For example:

CREATE TEMP TABLE products (product_id int, name text, price numeric);
CREATE TEMP TABLE sales (product_id int, units int);
ALTER TABLE products ADD PRIMARY KEY (product_id);
SELECT product_id, p.name, (sum(s.units) * p.price) AS sales
FROM (select * from products) p
LEFT JOIN (select * from sales) s USING (product_id)
GROUP BY product_id;

This query gives an error (p.name should be part of the GROUP BY clause), but functionally it is grouping based on the primary key. Is there a better way to use the functional dependency for grouping?

> regards, tom lane

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
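For reference, here is a sketch (using the example tables above) of how the same result can be obtained under the current rules; the functional-dependency check does not look through subqueries, so the dependent columns have to be listed explicitly, or the base table has to be grouped directly:

```sql
-- Workaround 1: list the functionally dependent columns explicitly.
-- Grouping by all selected non-aggregate columns is always legal.
SELECT p.product_id, p.name, (sum(s.units) * p.price) AS sales
FROM (SELECT * FROM products) p
LEFT JOIN (SELECT * FROM sales) s USING (product_id)
GROUP BY p.product_id, p.name, p.price;

-- Workaround 2: reference the base table directly, so the planner can
-- see that product_id is its primary key and apply the functional
-- dependency to p.name and p.price.
SELECT p.product_id, p.name, (sum(s.units) * p.price) AS sales
FROM products p
LEFT JOIN sales s USING (product_id)
GROUP BY p.product_id;
```

Neither form extends the feature to subqueries; they only sidestep the limitation discussed here.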
Re: [HACKERS] Graph datatype addition
On Mon, Apr 29, 2013 at 10:51 PM, Peter Eisentraut wrote:
> On Sun, 2013-04-28 at 22:55 -0700, Atri Sharma wrote:
>> If we add support for weighted graphs, we can probably add support for
>> some common graph algorithms, such as Dijkstra's algorithm, the
>> Bellman-Ford algorithm, an MST-building algorithm, and network flow
>> algorithms.
>
> You might find that pgRouting already does much of this.

Actually, I was going to suggest that Atri take a look at that. pgRouting is currently developing v2.0, which will rewrite some parts (including some of the algorithms). Maybe this datatype would fit better as part of pgRouting.

--
Jaime Casanova www.2ndQuadrant.com
Professional PostgreSQL: 24x7 support and training
Phone: +593 4 5107566 Cell: +593 987171157
Re: [HACKERS] The missing pg_get_*def functions
On Tue, Apr 30, 2013 at 12:58 AM, Tom Lane wrote:
> Noah Misch writes:
>> Note also that minor releases can readily fix bugs in C-language functions,
>> but we have no infrastructure to update sql-language functions after initdb.
>> That flexibility is unfortunate to lose, particularly for something that
>> pg_dump depends on.
>
> That alone would probably be sufficient reason why we would never allow
> pg_dump to depend on any such thing (not that I see a compelling
> argument for it to do so anyway...).

It would be better to find a way to update sql-language functions in minor upgrades, instead of shutting that door entirely for all future implementation ideas involving sql-language functions in pg_catalog.

> The long and the short of it here is that there isn't any very good
> reason to migrate any of the existing pg_dump-side functionality into
> server-side functions, and especially not server-side functions that
> aren't in C. One of the things that we frequently recommend when doing
> upgrades is that you do the dump with the newer version's pg_dump, so
> as to get the benefits of any bug fixes that are in it. The more
> dump functionality is on the server side, the less opportunity we have
> to repair things that way.

There are two very good reasons for doing exactly that:

a) It would make it possible to develop tools which need to be able to create (and/or drop) schema objects, such as a schema migration tool capable of migrating between different revisions of a schema. I started working on such a project two years ago, but realised that most of the code I wrote was code to define how to create/drop objects, and in what order based on pg_depend. I thought such code would be better placed in core, so I put the project on hold awaiting complete coverage of the pg_get_*def functions. But that never happened, so here we are two years later. This is the project I'm talking about: https://github.com/gluefinance/pov

b) It would reduce the amount of hard-coded SQL in pg_dump.c. This won't help for previous versions, but perhaps the SQL queries that support older versions could at least be hidden in a separately included file, so you don't have to see them. That would make sense, since the SQL for old versions won't change and won't need the same amount of active development as the code for the latest version. It makes no sense to keep the SQL for all versions in the same place.

> It may be that the functions Joel proposes are worth having for other
> tools to use, but I'm not in favor of making pg_dump use them.

I don't agree with you, as I don't think it's a very hard problem to deal with updates of sql-language functions in the pg_catalog schema. If we fix that first, then this and a whole range of problems, where the solution takes input from pg_catalog and produces output based on it, become easy to solve by writing simple sql-language functions with direct access to pg_catalog, instead of writing C functions with inline hard-coded SQL.
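To make point (a) concrete, the ordering problem such a tool has to solve amounts to walking pg_depend. A rough sketch of the traversal (the real logic must also consider deptype and break cycles; 'some_table' is a placeholder name):

```sql
-- Find everything a given table depends on, transitively, and a crude
-- creation order: objects reached at greater depth must be created first.
WITH RECURSIVE deps AS (
    SELECT d.refclassid, d.refobjid, 1 AS depth
    FROM pg_depend d
    WHERE d.classid = 'pg_class'::regclass
      AND d.objid = 'some_table'::regclass
    UNION                          -- UNION (not ALL) to stop on cycles
    SELECT d.refclassid, d.refobjid, deps.depth + 1
    FROM pg_depend d
    JOIN deps ON d.classid = deps.refclassid
             AND d.objid = deps.refobjid
)
SELECT refclassid::regclass AS catalog, refobjid, max(depth) AS create_order
FROM deps
GROUP BY 1, 2
ORDER BY create_order DESC;
```

This is exactly the kind of query that is pleasant to write as an sql-language function against pg_catalog and painful to maintain as inline strings in C.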
Re: [HACKERS] Graph datatype addition
On Sun, 2013-04-28 at 22:55 -0700, Atri Sharma wrote:
> If we add support for weighted graphs, we can probably add support for
> some common graph algorithms, such as Dijkstra's algorithm, the
> Bellman-Ford algorithm, an MST-building algorithm, and network flow
> algorithms.

You might find that pgRouting already does much of this.
Re: [HACKERS] The missing pg_get_*def functions
On Tue, Apr 30, 2013 at 12:46 AM, Noah Misch wrote:
> Note that while the sql procedural language is fair game, plpgsql currently is
> not. We install it by default, but the DBA is free to drop it.

Right. The "sql" procedural language was what I had in mind.

> Those existing functions give a mostly-SnapshotNow picture of their objects,
> but an sql-language implementation would give a normally-snapshotted picture.

I assume "normally" is better than "mostly"?

> That status quo is perhaps more an implementation accident than a designed
> behavior. Before proliferating functions like this, we should pick a snapshot
> policy and stick to it. See the block comment at the top of pg_dump.c.

I didn't think there would be any reason to migrate the existing functions from C to SQL, but this snapshot problem seems like a good motive to do so. If they were all written in SQL, the snapshot problem would be solved, right?

> Note also that minor releases can readily fix bugs in C-language functions,
> but we have no infrastructure to update sql-language functions after initdb.
> That flexibility is unfortunate to lose, particularly for something that
> pg_dump depends on. Now, the right thing is probably to design a mechanism
> for applying simple catalog updates in concert with a minor release. In the
> mean time, its absence puts the sql PL at a nontrivial disadvantage here.

What do you mean by "infrastructure"? Isn't it as simple as CREATE OR REPLACE FUNCTION? As long as the interfaces of the pg_get_*def functions don't change, I cannot see how simply replacing the existing functions in a minor release upgrade could do any harm.
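The mechanism being alluded to is ordinary DDL. A minor release could, in principle, ship a script that swaps in a corrected body. Purely as an illustration (pg_get_sequencedef does not exist today; the name and trivial body are made up):

```sql
-- Hypothetical: a fixed version of a catalog sql-language function,
-- applied by a minor-release upgrade script.
CREATE OR REPLACE FUNCTION pg_catalog.pg_get_sequencedef(seq regclass)
RETURNS text
LANGUAGE sql STABLE AS $$
    SELECT 'CREATE SEQUENCE ' || seq::text || ';'
$$;
```

The open question in this thread is not whether CREATE OR REPLACE FUNCTION works, but that there is currently no infrastructure to run such a script against every existing database as part of a minor-release upgrade.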
Re: [HACKERS] Back branches vs. gcc 4.8.0
On Sat, 2013-04-06 at 12:59 -0400, Tom Lane wrote:
> The reason I'm thinking it's a good idea is that it would expose any
> remaining places where we have nominally var-length arrays embedded in
> larger structs. Now that I've seen the failures with gcc 4.8.0, I'm
> quite worried that there might be some more declarations like that
> which we've not identified yet, but that by chance aren't causing
> obvious failures today.

Here is a rough patch that replaces almost all occurrences of something[1] in a struct with FLEXIBLE_ARRAY_MEMBER. It crashes left and right (because of sizeof issues, probably), but at least so far the compiler hasn't complained about any flexible-array members not at the end of the struct, which is what it did last time. So the answer to your concern so far is negative.

Completing this patch will be quite a bit more debugging work. Some kind of electric fence for palloc would be helpful.

diff --git a/contrib/cube/cubedata.h b/contrib/cube/cubedata.h
index fd0c26a..01e0818 100644
--- a/contrib/cube/cubedata.h
+++ b/contrib/cube/cubedata.h
@@ -6,7 +6,7 @@
 typedef struct NDBOX
 {
 	int32		vl_len_;		/* varlena header (do not touch directly!) */
 	unsigned int dim;
-	double		x[1];
+	double		x[FLEXIBLE_ARRAY_MEMBER];
 } NDBOX;
 
 #define DatumGetNDBOX(x)	((NDBOX*)DatumGetPointer(x))
diff --git a/contrib/intarray/_int.h b/contrib/intarray/_int.h
index 56aa23c..c35e862 100644
--- a/contrib/intarray/_int.h
+++ b/contrib/intarray/_int.h
@@ -72,7 +72,7 @@ typedef struct
 {
 	int32		vl_len_;		/* varlena header (do not touch directly!) */
 	int32		flag;
-	char		data[1];
+	char		data[FLEXIBLE_ARRAY_MEMBER];
 } GISTTYPE;
 
 #define ALLISTRUE		0x04
@@ -132,7 +132,7 @@ typedef struct QUERYTYPE
 {
 	int32		vl_len_;		/* varlena header (do not touch directly!) */
 	int32		size;			/* number of ITEMs */
-	ITEM		items[1];		/* variable length array */
+	ITEM		items[FLEXIBLE_ARRAY_MEMBER];
 } QUERYTYPE;
 
 #define HDRSIZEQT	offsetof(QUERYTYPE, items)
diff --git a/contrib/ltree/ltree.h b/contrib/ltree/ltree.h
index 563970a..bf25f4c 100644
--- a/contrib/ltree/ltree.h
+++ b/contrib/ltree/ltree.h
@@ -9,7 +9,7 @@
 typedef struct
 {
 	uint16		len;
-	char		name[1];
+	char		name[FLEXIBLE_ARRAY_MEMBER];
 } ltree_level;
 
 #define LEVEL_HDRSIZE	(offsetof(ltree_level,name))
@@ -19,7 +19,7 @@ typedef struct
 {
 	int32		vl_len_;		/* varlena header (do not touch directly!) */
 	uint16		numlevel;
-	char		data[1];
+	char		data[FLEXIBLE_ARRAY_MEMBER];
 } ltree;
 
 #define LTREE_HDRSIZE	MAXALIGN( offsetof(ltree, data) )
@@ -33,7 +33,7 @@ typedef struct
 	int32		val;
 	uint16		len;
 	uint8		flag;
-	char		name[1];
+	char		name[FLEXIBLE_ARRAY_MEMBER];
 } lquery_variant;
 
 #define LVAR_HDRSIZE	MAXALIGN(offsetof(lquery_variant, name))
@@ -50,7 +50,7 @@ typedef struct
 	uint16		numvar;
 	uint16		low;
 	uint16		high;
-	char		variants[1];
+	char		variants[FLEXIBLE_ARRAY_MEMBER];
 } lquery_level;
 
 #define LQL_HDRSIZE		MAXALIGN( offsetof(lquery_level,variants) )
@@ -71,7 +71,7 @@ typedef struct
 	uint16		numlevel;
 	uint16		firstgood;
 	uint16		flag;
-	char		data[1];
+	char		data[FLEXIBLE_ARRAY_MEMBER];
 } lquery;
 
 #define LQUERY_HDRSIZE	MAXALIGN( offsetof(lquery, data) )
@@ -106,7 +106,7 @@ typedef struct
 {
 	int32		vl_len_;		/* varlena header (do not touch directly!) */
 	int32		size;
-	char		data[1];
+	char		data[FLEXIBLE_ARRAY_MEMBER];
 } ltxtquery;
 
 #define HDRSIZEQT		MAXALIGN(VARHDRSZ + sizeof(int32))
@@ -205,7 +205,7 @@ typedef struct
 {
 	int32		vl_len_;		/* varlena header (do not touch directly!) */
 	uint32		flag;
-	char		data[1];
+	char		data[FLEXIBLE_ARRAY_MEMBER];
 } ltree_gist;
 
 #define LTG_ONENODE	0x01
diff --git a/contrib/pg_trgm/trgm.h b/contrib/pg_trgm/trgm.h
index ed649b8..f030558 100644
--- a/contrib/pg_trgm/trgm.h
+++ b/contrib/pg_trgm/trgm.h
@@ -63,7 +63,7 @@ typedef struct
 {
 	int32		vl_len_;		/* varlena header (do not touch directly!) */
 	uint8		flag;
-	char		data[1];
+	char		data[FLEXIBLE_ARRAY_MEMBER];
 } TRGM;
 
 #define TRGMHDRSIZE		(VARHDRSZ + sizeof(uint8))
diff --git a/src/backend/catalog/namespace.c b/src/backend/catalog/namespace.c
index f48c0bc..4323184 100644
--- a/src/backend/catalog/namespace.c
+++ b/src/backend/catalog/namespace.c
@@ -1066,7 +1066,7 @@ static bool MatchNamedCall(HeapTuple proctup, int nargs, List *argnames,
 		 */
 		effective_nargs = Max(pronargs, nargs);
 		newResult = (FuncCandidateList)
-			palloc(sizeof(struct _FuncCandidateList) - sizeof(Oid)
+			palloc(offsetof(struct _FuncCandidateList, args)
 				   + effective_nargs * sizeof(Oid));
 		newResult->pathpos = pathpos;
 		newResult->oid = HeapTupleGetOid(proctup);
@@ -1580,7 +1580,7 @@ static bool MatchNamedCall(HeapTuple proctup, int nargs, List *argnames,
  * separate palloc for each operator, but profiling revealed that the
  * pallocs used an unreasonably large fraction of parsing time.
  */
-#define SPACE_PER_OP MAXALIGN(sizeof(struct _FuncCandidateList) + sizeof(Oid))
+#define SPACE_PER_OP MAXAL
Re: [HACKERS] Remaining beta blockers
On Mon, Apr 29, 2013 at 9:44 PM, Stephen Frost wrote:
> * Kevin Grittner (kgri...@ymail.com) wrote:
>> If they modified the heap files that way while the server was
>> running, the results would be somewhat unpredictable. If they did
>> it while the server was stopped, starting the server and attempting
>> to access the matview would generate:
>
> Right, the point being that they could (ab)use it as a flag to trigger
> something to happen. I'd also be worried about failure cases where
> files appear to be zero-length.

If you assume that people are going to modify files while the backend is running, nothing we do anywhere is safe.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] pg_ctl non-idempotent behavior change
On Sat, 2013-04-27 at 14:24 -0400, Tom Lane wrote:
> Yeah, this is not tolerable. We could think about improving the logic
> to have a stronger check on whether the old server is really there or
> not (ie it should be doing something more like pg_ping and less like
> just checking if the pidfile is there). But given how close we are to
> beta, maybe the best thing is to revert that change for now and put it
> back on the to-think-about-for-9.4 list. Peter?

Reverted.
Re: [HACKERS] Remaining beta blockers
* Kevin Grittner (kgri...@ymail.com) wrote:
> If they modified the heap files that way while the server was
> running, the results would be somewhat unpredictable. If they did
> it while the server was stopped, starting the server and attempting
> to access the matview would generate:

Right, the point being that they could (ab)use it as a flag to trigger something to happen. I'd also be worried about failure cases where files appear to be zero-length.

>> Or we end up wanting to have that file be non-zero and considered
>> 'empty' later, but we don't want pg_upgrade running around
>> touching all of the existing files out there?
>
> I didn't follow this one; could you restate it, please?

Down the road we decide that we shouldn't have any zero-length files (perhaps due to checksums..?), yet we have to special-case around these matviews and figure out a way to deal with them during pg_upgrade.

Thanks,
Stephen
Re: [HACKERS] The missing pg_get_*def functions
On Mon, Apr 29, 2013 at 7:58 PM, Tom Lane wrote:
> The long and the short of it here is that there isn't any very good
> reason to migrate any of the existing pg_dump-side functionality into
> server-side functions,

There are a number of good reasons to do just that, which have been previously discussed.

> and especially not server-side functions that
> aren't in C.

+1.

> One of the things that we frequently recommend when doing
> upgrades is that you do the dump with the newer version's pg_dump, so
> as to get the benefits of any bug fixes that are in it. The more
> dump functionality is on the server side, the less opportunity we have
> to repair things that way.

But why wouldn't we be able to fix the version in the server, if it turns out to be buggy? I suppose we wouldn't fix bugs discovered after EOL, but I'm not sure that's a sufficient objection.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Remaining beta blockers
On Mon, Apr 29, 2013 at 3:34 PM, Kevin Grittner wrote:
> The hysteria and FUD about using the currently-supported technique
> for unlogged tables to implement unlogged matviews are very
> discouraging. And the notion that we would release a matview
> feature which allowed false results (in the form of treating a
> never-populated matview as a legal empty matview) is truly scary to
> me.

I think that labeling other people's concerns as hysteria and FUD is not something which will advance the debate.

I might be misunderstanding the nature of Tom's concern, but I believe the concern is not so much that whatever bugs exist now can't be fixed, but that the use of this mechanism might leave us vulnerable to a steady stream of future bugs or make it hard to do things which, in the future, we may want to do. To take one not-particularly-hypothetical example of that, I was recently discussing with someone, probably Noah, the idea of adding a command to pre-extend a relation to a certain number of blocks (and prevent vacuum from truncating those extra blocks away again). If you're counting on the number of blocks in the relation to mean something other than the number of blocks in the relation, there's a potential conflict there. Now maybe the fact that you've defined 0 to mean non-scannable and 1+ to mean scannable means there's no real problem with that particular idea... but suppose you'd defined it the other way around. I don't think it's unreasonable to think that this could conflict with future developments in other areas.

The basic problem here is that the way unlogged tables work is kind of a crock. I didn't fully realize that at the time I implemented it, but it's boxed us in a little in terms of trying to extend that feature; for example, we still don't have a way to convert logged relations to unlogged relations, or vice versa, and we're not going to grow a way without some kind of reworking of that mechanism. This is another aspect of that problem. To solve these problems cleanly, we need either a relation metapage, or some more state in pg_class. Short of that, this isn't a scalable solution. Even if you think it's workable to coax one more bit of state out of the file length, what will you do when you need a second bit (or say another 32 bits)?

> If we can't tell the difference between those two things, I
> don't think we should be releasing the feature.

So, speaking of hysteria...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Graph datatype addition
On 30/04/13 12:33, Любен Каравелов wrote:
> ----- Quote from Christopher Browne (cbbro...@gmail.com), 29.04.2013 at 23:18 -----
>> The one place where I *could* see a special type having a contribution
>> is for there to be a data type that can contain an arbitrary number of
>> links. That means you have one tuple per node, and, instead of
>> needing a tuple for each link between nodes, you have one attribute
>> indicating *all* the links. (And "interesting" is for that one
>> attribute to be usable for foreign key purposes.) That has a hard
>> time scaling in cases where nodes are over-connected, which is,
>> broadly speaking, an acceptable sort of scenario.
>> ...
>
> Hello,
>
> From the start of the discussion I was trying to work out what this
> graph data type should be... I could not grasp it. With the current
> postgres, in the most simple case we could do something like:
>
> create table node (
>     node_id serial primary key,
>     ...
> );
>
> create table edge (
>     from integer references node,
>     to integer[] -- each element references node
> );
>
> With the addition of foreign key constraints on array elements (which I
> understand is work in progress), we could guarantee the referential
> integrity of the graph - I really hope that it will be ready for 9.4.
> Without foreign key constraints on array elements, if we have to
> guarantee integrity we could do:
>
> create table edge (
>     from integer references node,
>     to integer references node,
>     weight real,
>     ...
> );
>
> What is missing are some algorithms. I have personally implemented some
> algorithms using recursive queries and it is doable (most of my
> experience was with Oracle, but lately I have put into production some
> postgresql schemas with recursive queries that are working just fine).
>
> For large-scale eigenvalue factorisation (think PageRank) this sparse
> matrix form is a reasonable data organisation (though I have doubts
> that the best place to run such a job is in the database).
>
> Best regards
> luben

For directed graphs, where each node has a set of lines each with their own weight, I would suggest:

create table node (
    id int primary key,
    to_node_id integer[], -- references node(id)
    weight float[],       -- weight of line
    ...
);

So if nodeA has a line going to nodeB, then you can't go from nodeB back to nodeA, unless nodeB has nodeA as a destination for one of its lines. It would be best to ensure that the arrays have the same length! This way lines can also have asymmetric weights.

Though it would be better to have something like:

create table node (
    id int primary key,
    edges directed_line[],
    ...
);

where a 'directed_line' instance would be an object in its own right, with functions to extract the start/end nodes along with the weight, and to add/remove (destination_node, weight) pairs. Yet the system could internally store the start node just once for an array of directed_lines (though something other than an array would be more appropriate).

Cheers,
Gavin
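The recursive queries mentioned in this thread can be sketched against the two-column edge schema above. Note that "from" and "to" are reserved words in SQL, so they are renamed from_node/to_node here; the start node and depth limit are example values:

```sql
-- Cheapest known path cost from node 1 to every reachable node,
-- using a depth bound as a naive guard against cycles.
WITH RECURSIVE paths (to_node, total_weight, hops) AS (
    SELECT e.to_node, e.weight, 1
    FROM edge e
    WHERE e.from_node = 1
    UNION ALL
    SELECT e.to_node, p.total_weight + e.weight, p.hops + 1
    FROM edge e
    JOIN paths p ON e.from_node = p.to_node
    WHERE p.hops < 10
)
SELECT to_node, min(total_weight) AS cheapest
FROM paths
GROUP BY to_node
ORDER BY to_node;
```

This enumerates all bounded paths rather than pruning like Dijkstra's algorithm does, which is why purpose-built implementations such as pgRouting's remain attractive for large graphs.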
Re: [HACKERS] Graph datatype addition
----- Quote from Christopher Browne (cbbro...@gmail.com), 29.04.2013 at 23:18 -----
>
> The one place where I *could* see a special type having a contribution
> is for there to be a data type that can contain an arbitrary number of
> links. That means you have one tuple per node, and, instead of
> needing a tuple for each link between nodes, you have one attribute
> indicating *all* the links. (And "interesting" is for that one
> attribute to be usable for foreign key purposes.) That has a hard
> time scaling in cases where nodes are over-connected, which is,
> broadly speaking, an acceptable sort of scenario.
> ...

Hello,

From the start of the discussion I was trying to work out what this graph data type should be... I could not grasp it. With the current postgres, in the most simple case we could do something like:

create table node (
    node_id serial primary key,
    ...
);

create table edge (
    from integer references node,
    to integer[] -- each element references node
);

With the addition of foreign key constraints on array elements (which I understand is work in progress), we could guarantee the referential integrity of the graph - I really hope that it will be ready for 9.4. Without foreign key constraints on array elements, if we have to guarantee integrity we could do:

create table edge (
    from integer references node,
    to integer references node,
    weight real,
    ...
);

What is missing are some algorithms. I have personally implemented some algorithms using recursive queries and it is doable (most of my experience was with Oracle, but lately I have put into production some postgresql schemas with recursive queries that are working just fine).

For large-scale eigenvalue factorisation (think PageRank) this sparse matrix form is a reasonable data organisation (though I have doubts that the best place to run such a job is in the database).

Best regards
luben
Re: [HACKERS] ALTER DEFAULT PRIVILEGES FOR ROLE is broken
> Hm. I defended that restriction earlier, but it now occurs to me to
> wonder if it doesn't create a dump/reload sequencing hazard. I don't
> recall that pg_dump is aware of any particular constraints on the order
> in which it dumps privilege-grant commands. If it gets this right,
> that's mostly luck, I suspect.

For that matter, it raises a serious practical obstacle to implementing schema-specific default privs by script, if you have to first check whether the user in question has create privs ... something we don't make it at all easy to do.

For 9.4, I'm going to argue that the ALTER DEFAULT PRIVILEGES feature has completely failed in its goal to make database permissions easier to manage. Expect more detail on that after beta.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
Re: [HACKERS] The missing pg_get_*def functions
Noah Misch writes:
> Note also that minor releases can readily fix bugs in C-language functions,
> but we have no infrastructure to update sql-language functions after initdb.
> That flexibility is unfortunate to lose, particularly for something that
> pg_dump depends on.

That alone would probably be sufficient reason why we would never allow pg_dump to depend on any such thing (not that I see a compelling argument for it to do so anyway...).

The long and the short of it here is that there isn't any very good reason to migrate any of the existing pg_dump-side functionality into server-side functions, and especially not server-side functions that aren't in C. One of the things that we frequently recommend when doing upgrades is that you do the dump with the newer version's pg_dump, so as to get the benefits of any bug fixes that are in it. The more dump functionality is on the server side, the less opportunity we have to repair things that way.

It may be that the functions Joel proposes are worth having for other tools to use, but I'm not in favor of making pg_dump use them.

			regards, tom lane
Re: [HACKERS] The missing pg_get_*def functions
On Mon, Apr 29, 2013 at 10:16:22PM +0100, Joel Jacobson wrote:
> Existing:
>
> pg_get_constraintdef
> pg_get_indexdef
> pg_get_viewdef
> pg_get_triggerdef
> pg_get_functiondef
> pg_get_ruledef
>
> Missing:
>
> pg_get_ts_templatedef
> pg_get_ts_parserdef
> pg_get_ts_configdef
> pg_get_ts_dictdef
> pg_get_databasedef
> pg_get_namespacedef
> pg_get_languagedef
> pg_get_conversiondef
> pg_get_castdef
> pg_get_amprocdef
> pg_get_operatordef
> pg_get_amopdef
> pg_get_opfamilydef
> pg_get_opclassdef
> pg_get_tabledef
> pg_get_sequencedef
> pg_get_typedef
> pg_get_attrdef
>
> I'm planning to implement these. Is there any reason why the remaining
> functions cannot be written as SQL functions instead of C? The input data
> to the C-functions in pg_dump.c comes from pg_catalog anyway, so I thought
> it would be a lot easier just to construct the definitions in SQL, where you
> have convenient access to the pg_catalog.

Note that while the sql procedural language is fair game, plpgsql currently is not. We install it by default, but the DBA is free to drop it.

Those existing functions give a mostly-SnapshotNow picture of their objects, but an sql-language implementation would give a normally-snapshotted picture. That status quo is perhaps more an implementation accident than a designed behavior. Before proliferating functions like this, we should pick a snapshot policy and stick to it. See the block comment at the top of pg_dump.c.

Note also that minor releases can readily fix bugs in C-language functions, but we have no infrastructure to update sql-language functions after initdb. That flexibility is unfortunate to lose, particularly for something that pg_dump depends on. Now, the right thing is probably to design a mechanism for applying simple catalog updates in concert with a minor release. In the mean time, its absence puts the sql PL at a nontrivial disadvantage here.

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] ALTER DEFAULT PRIVILEGES FOR ROLE is broken
Noah Misch writes:
> The particular restriction at hand, namely that a role have CREATE rights on a
> schema before assigning role-specific default privileges, seems like needless
> paternalism. It would be akin to forbidding ALTER ROLE ... PASSWORD on a
> NOLOGIN role. I'd support removing it when such a proposal arrives.

Hm. I defended that restriction earlier, but it now occurs to me to wonder if it doesn't create a dump/reload sequencing hazard. I don't recall that pg_dump is aware of any particular constraints on the order in which it dumps privilege-grant commands. If it gets this right, that's mostly luck, I suspect.

> If
> anything, require that the user executing the ALTER DEFAULT PRIVILEGES, not
> the subject of the command, has CREATE rights on the schema.

That would be just as dangerous from this angle.

			regards, tom lane
Re: [HACKERS] ALTER DEFAULT PRIVILEGES FOR ROLE is broken
On Mon, Apr 29, 2013 at 01:25:47PM -0400, Tom Lane wrote:
> Josh Berkus writes:
> > On 04/29/2013 09:59 AM, Tom Lane wrote:
> >> As I pointed out to you last night, it does already say that.
> >> I think the problem here is that we're just throwing a generic
> >> permissions failure rather than identifying the particular permission
> >> needed.
>
> > Yeah, a better error message would help a lot. My first thought was
> > "WTF? I'm the superuser, whaddya mean, 'permission denied'"?
>
> Right. I wonder if there's any good reason why we shouldn't extend
> aclerror() to, in all cases, add a DETAIL line along the lines of
>
> ERROR: permission denied for schema web
> DETAIL: This operation requires role X to have privilege Y.
>
> Is there any scenario where this'd be exposing too much info?

Can't think of one. Seems safe and helpful.

The particular restriction at hand, namely that a role have CREATE rights on a schema before assigning role-specific default privileges, seems like needless paternalism. It would be akin to forbidding ALTER ROLE ... PASSWORD on a NOLOGIN role. I'd support removing it when such a proposal arrives. If anything, require that the user executing the ALTER DEFAULT PRIVILEGES, not the subject of the command, has CREATE rights on the schema.

-- Noah Misch EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] [PATCH] add --throttle option to pgbench [patch 2]
> It does seem to me that we should Poissonize the throttle time, then
> subtract the average overhead, rather than Poissonizing the difference.

After thinking again about Jeff's point and failing to sleep, I think that doing exactly that is better because:
- it is "right"
- the code is simpler and shorter
- my transaction stuck sequence issue is not that big an issue anyway

Here is a patch to schedule transactions along Poisson-distributed events. This patch replaces my previous proposal. Note that there is no reference to the current time after the stochastic process is initiated. This is necessary, and means that if transactions lag behind the throttle at some point, they will try to catch up later. Neither a good nor a bad thing, mostly a feature.

-- Fabien

diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index bc01f07..0142ed0 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -137,6 +137,12 @@
 int		unlogged_tables = 0;
 double	sample_rate = 0.0;
 
 /*
+ * whether clients are throttled to a given rate, expressed as a delay in us.
+ * 0, the default, means no throttling.
+ */
+int64	throttle = 0;
+
+/*
  * tablespace selection
  */
 char	*tablespace = NULL;
@@ -204,6 +210,8 @@ typedef struct
 	int			nvariables;
 	instr_time	txn_begin;		/* used for measuring transaction latencies */
 	instr_time	stmt_begin;		/* used for measuring statement latencies */
+	int64		trigger;		/* previous/next throttling (us) */
+	bool		throttled;		/* whether current transaction was throttled */
 	int			use_file;		/* index in sql_files for this client */
 	bool		prepared[MAX_FILES];
 } CState;
@@ -361,6 +369,9 @@ usage(void)
 	" -S           perform SELECT-only transactions\n"
 	" -t NUM       number of transactions each client runs (default: 10)\n"
 	" -T NUM       duration of benchmark test in seconds\n"
+	" -H SPEC, --throttle SPEC\n"
+	"              delay in second to throttle each client\n"
+	"              sample specs: 0.025 40tps 25ms 25000us\n"
 	" -v           vacuum all four standard tables before tests\n"
 	"\nCommon options:\n"
 	" -d           print debugging output\n"
@@ -1027,7 +1038,7 @@ top:
 		}
 	}
 
-	if (commands[st->state]->type == SQL_COMMAND)
+	if (!st->throttled && commands[st->state]->type == SQL_COMMAND)
 	{
 		/*
 		 * Read and discard the query result; note this is not included in
@@ -1049,26 +1060,54 @@ top:
 		discard_response(st);
 	}
 
+	/* some stuff done at the end */
 	if (commands[st->state + 1] == NULL)
 	{
-		if (is_connect)
+		/* disconnect if required and needed */
+		if (is_connect && st->con)
 		{
 			PQfinish(st->con);
 			st->con = NULL;
 		}
 
-		++st->cnt;
-		if ((st->cnt >= nxacts && duration <= 0) || timer_exceeded)
-			return clientDone(st, true);	/* exit success */
+		/* update transaction counter once, and possibly end */
+		if (!st->throttled)
+		{
+			++st->cnt;
+			if ((st->cnt >= nxacts && duration <= 0) || timer_exceeded)
+				return clientDone(st, true);	/* exit success */
+		}
+
+		/* handle throttling once, as the last post-transaction stuff */
+		if (throttle && !st->throttled)
+		{
+			/* compute delay to approximate a Poisson distribution
+			 * 100 => 13.8 .. 0 multiplier
+			 * if transactions are too slow or a given wait shorter than
+			 * a transaction, the next transaction will start right away.
+			 */
+			int64 wait = (int64)
+				throttle * -log(getrand(thread, 1, 100)/100.0);
+			st->trigger += wait;
+			st->sleeping = 1;
+			st->until = st->trigger;
+			st->throttled = true;
+			if (debug)
+				fprintf(stderr, "client %d throttling %d us\n",
						st->id, (int) wait);
+			return true;
+		}
 	}
 
 	/* increment state counter */
 	st->state++;
 	if (commands[st->state] == NULL)
 	{
+		/* reset */
 		st->state = 0;
 		st->use_file = (int) getrand(thread, 0, num_files - 1);
 		commands = sql_files[st->use_file];
+		st->throttled = false;
 	}
 }
@@ -2086,6 +2125,7 @@ main(int argc, char **argv)
 		{"unlogged-tables", no_argument, &unlogged_tables, 1},
 		{"sampling-rate", required_argument, NULL, 4},
 		{"aggregate-interval", required_argument, NULL, 5},
+		{"throttle", required_argument, NULL, 'H'},
 		{NULL, 0, NULL, 0}
 	};
@@ -2152,7 +2192,7 @@ main(int argc, char **argv)
 	state = (CState *) pg_malloc(sizeof(CState));
 	memset(state, 0, sizeof(CState));
 
-	while ((c = getopt_long(argc, argv, "ih:nvp:dqSNc:j:Crs:t:T:U:lf:D:F:M:", long_options, &optindex)) != -1)
+	while ((c = getopt_long(argc, argv, "ih:nvp:dqSNc:j:Crs:t:T:U:lf:D:F:M:H:", long_options, &optindex)) != -1)
 	{
 		switch (c)
 		{
@@ -2307,6 +2347,26 @@ main(int argc, char **argv)
 				exit(1);
 			}
 			break;
+		case 'H':
+		{
+			/* get a double from the beginning of option value */
+			double throttle_value = atof(optarg);
+			if (throttle_value <= 0.0)
+			{
+				fprintf(stderr, "invalid throttle
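To see what the patch's inverse-transform draw, `throttle * -log(getrand(thread, 1, 100)/100.0)`, produces in practice, here is a small Python sketch (illustrative only, not pgbench's C; the 25 ms target and the function name are mine):

```python
import math
import random

def poisson_delay(throttle_us, rng):
    """One inter-transaction wait in microseconds, mimicking the patch's
    throttle * -log(getrand(thread, 1, 100)/100.0).
    Exponential waits make transaction starts a Poisson process."""
    u = rng.randint(1, 100) / 100.0      # uniform over {0.01, ..., 1.00}
    return int(throttle_us * -math.log(u))

rng = random.Random(42)
target = 25000                           # hypothetical 25 ms between starts
waits = [poisson_delay(target, rng) for _ in range(200000)]
mean = sum(waits) / len(waits)

# Discretizing u to 100 values caps the multiplier at -log(0.01) ~= 4.6
# and pulls the mean slightly under the target (about 0.97 * target).
```

The patch then accumulates each wait into an absolute trigger time (`st->trigger += wait`), which is what lets lagging clients catch up instead of shifting the whole schedule.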
Re: [HACKERS] ALTER DEFAULT PRIVILEGES FOR ROLE is broken
Tom, I'm also seeing that the DEFAULT PRIVs I get seem to be dependent on my login role, not my current role. That is, if role "dba" has default privs set, and role "josh", which is a member of "dba", does not, and "josh" does "set role dba" before creating some tables, those tables do NOT get dba's default privs. At least, that's what happens in my testing. Is that expected? If so, that should certainly be documented.

-- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] Remaining beta blockers
>> what happens when an admin figures out that they can 'hack' the
>> system to do what they want by truncating that file?

That can't possibly be a serious objection. DBAs who mess with DB files are on their own, and should expect unpredictable behavior.

-- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
[HACKERS] The missing pg_get_*def functions
Existing:

pg_get_constraintdef
pg_get_indexdef
pg_get_viewdef
pg_get_triggerdef
pg_get_functiondef
pg_get_ruledef

Missing:

pg_get_ts_templatedef
pg_get_ts_parserdef
pg_get_ts_configdef
pg_get_ts_dictdef
pg_get_databasedef
pg_get_namespacedef
pg_get_languagedef
pg_get_conversiondef
pg_get_castdef
pg_get_amprocdef
pg_get_operatordef
pg_get_amopdef
pg_get_opfamilydef
pg_get_opclassdef
pg_get_tabledef
pg_get_sequencedef
pg_get_typedef
pg_get_attrdef

I'm planning to implement these. Is there any reason why the remaining functions cannot be written as SQL functions instead of C? The input data to the C-functions in pg_dump.c comes from pg_catalog anyway, so I thought it would be a lot easier just to construct the definitions in SQL where you have convenient access to the pg_catalog.
Re: [HACKERS] Proposal to add --single-row to psql
On 4/28/13 7:50 AM, Craig Ringer wrote:
> I find it frustrating that I've never seen an @paraccel email address here
> and that few of the other vendors of highly customised Pg offshoots are
> contributing back. It's almost enough to make me like the GPL.

FWIW, I think there's a pretty large barrier to these folks contributing back. Would the community really want to add a bunch of hooks to support something like Redshift? Or Greenplum? Or etc., etc.? Most of these guys have to change significant amounts of PG code, so much so that it's actually hard for them to stay current (which is why most of them just fork). I do think this is a shame, but I'm not sure of any good way to fix it.

-- Jim C. Nasby, Data Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net
Re: [HACKERS] Remaining beta blockers
Stephen Frost wrote:
> what happens when an admin figures out that they can 'hack' the
> system to do what they want by truncating that file?

If they modified the heap files that way while the server was running, the results would be somewhat unpredictable. If they did it while the server was stopped, starting the server and attempting to access the matview would generate:

ERROR: materialized view "matview_name" has not been populated
HINT: Use the REFRESH MATERIALIZED VIEW command.

> Or we end up wanting to have that file be non-zero and considered
> 'empty' later, but we don't want pg_upgrade running around
> touching all of the existing files out there?

I didn't follow this one; could you restate it, please?

-- Kevin Grittner EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Analyzing bug 8049
On 4/28/13 7:00 PM, Tom Lane wrote:
> Thoughts? Anybody know of a counterexample to the idea that no plug-ins
> call query_planner()?

I would assume that anyone writing anything that calls such a low-level function reads -hackers regularly and would easily be able to handle whatever changes to their code the new callback parameter required. We would obviously want to send notice about this ahead of time in addition to sticking something in the release notes. That said, I'd be rather surprised if anything actually calls planner functions directly.

-- Jim C. Nasby, Data Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net
Re: [HACKERS] Graph datatype addition
On Mon, Apr 29, 2013 at 10:50 AM, Merlin Moncure wrote:
> On Mon, Apr 29, 2013 at 9:25 AM, Atri Sharma wrote:
> >> This is an interesting idea. Historically I've always decomposed
> >> graphs into relational structures because that's the only practical
> >> way to query them. Graphs are not currently able to be transported
> >> out of the database via JSON, so one of the areas to focus your
> >> research will be how the client will consume the data. libpqtypes is
> >> one way to do it, but that will really restrict your audience, so
> >> you'll probably need a rich set of functions to present the internal
> >> data (just like hstore).
>
> > I completely agree. Initially, I was thinking of exposing the data to
> > the user via HStore. But now, after Robert's suggestions, I think it
> > will be better to have an alternate representation. JSON seems to be
> > an excellent idea for that.
>
> I don't agree with this; JSON is not really designed to store graphs.
> You will probably need a customized internal representation, just like
> hstore, that expresses a graph-like structure.
>
> This is not a trivial project.

Not trivial, indeed. I see there being two directions where a data type goes.

1. We created JSON and XML types as ways of storing data that has a robust validation system. They're still, in a sense, just "plain old text", but it's "plain old text" that the user can be certain satisfies the respective rules for representations.

2. Some types support special operations to allow the data to be queried in novel ways. That's NOT the case, at this point, for JSON or XML. But it certainly IS the case for Jeff Davis' "range types", which expose access to some new sorts of data validation and indexing. It is true for the inet type, which behaves rather differently from our other types. It is true for the tsearch indexes, that enable interesting random access within some large "blobs" of stored data.
I'm not sure quite what we *would* want as the merits of graph-related types. I suspect that the best answer is NOT one where a graph is represented as a value in a table; that has the implication that modifying The Graph requires altering a single tuple, and that seems likely to become a horrible bottleneck. I'm suspicious that using HSTORE leads down that path, where you'll have a database that has a table with just one tuple in it, and where it's nearly impossible to alter that tuple.

I'm having a hard time thinking about what it looks like to have a graph as a table except to effectively compose the graph as a set of nodes, one tuple per node, and I'm not sure that a new data type has anything to contribute to that.

The one place where I *could* see a special type having a contribution is for there to be a data type that can contain an arbitrary number of links. That means you have one tuple per node, and, instead of needing a tuple for each link between nodes, you have one attribute indicating *all* the links. (And "interesting" is for that one attribute to be usable for foreign key purposes.) That has a hard time scaling in cases where nodes are over-connected, which is, broadly speaking, an acceptable sort of scenario.

-- When confronted by a difficult problem, solve it by reducing it to the question, "How would the Lone Ranger handle this?"
Re: [HACKERS] Remaining beta blockers
* Kevin Grittner (kgri...@ymail.com) wrote:
> Many people weighed in on the need to differentiate between an
> empty matview and one which has not been populated. Many have also
> weighed in on the benefits of unlogged matviews.

Sure, I think I did that - we should be able to distinguish between those two cases, and unlogged matviews are great.

> > I see absolutely no reason to change my opinion on this. Keeping
> > relisscannable state in the form of is-the-file-of-nonzero-size
> > is something we WILL regret
>
> ... or change in a subsequent major release. I see no reason why
> using this technique now makes it harder to do something else later.
> Could you elaborate on the technical challenges you see to doing
> so?

I've not followed this all that closely, to be honest, but I tend to side with Tom on this: making PG depend on the file size for an attribute of a relation is a bad idea. For one thing, what happens when an admin figures out that they can 'hack' the system to do what they want by truncating that file? Or we end up wanting to have that file be non-zero and considered 'empty' later, but we don't want pg_upgrade running around touching all of the existing files out there?

> I would guess that about half the use-cases for materialized views
> will stay with tables in spite of the added hassle, if they have to
> degrade performance by adding logging where they currently have
> none.

Life's tough. There's quite a few things that I wish had made it into 9.3 which didn't. One might also point out that a lot of individuals may be, understandably, cautious about this new feature and not use it in the first major rev it's released in anyway (I'm talking about matviews entirely here..). Basically, I don't believe we should be acting like we've promised unlogged matviews will be in 9.3 just because it's been committed.
> > The way forward to unlogged matviews that behave the way Kevin
> > wants is to improve crash recovery so that we can update catalog
> > state after completing recovery, which is something there are
> > other reasons to want anyway.
>
> That would certainly be for the best.

Then let's forgo having this specific feature in 9.3 and implement it correctly for 9.4.

> The hysteria and FUD about using the currently-supported technique
> for unlogged tables to implement unlogged matviews are very
> discouraging.

Perhaps I'm all wet here, but it's not obvious to me that what's done for unlogged tables really equates to what's being done here.

> And the notion that we would release a matview
> feature which allowed false results (in the form of treating a
> never-populated matview as a legal empty matview) is truly scary to
> me. If we can't tell the difference between those two things, I
> don't think we should be releasing the feature.

I agree that it's important to distinguish the two. I'm not sure that I've heard anyone explicitly say otherwise..

Thanks, Stephen
Re: [HACKERS] Graph datatype addition
On 4/29/13 2:20 PM, Florian Pflug wrote:
> On Apr29, 2013, at 21:00 , Atri Sharma wrote:
> > I think we find workarounds or makeshifts at the moment if we need to
> > use graphs in our database in postgres. If we have a datatype itself,
> > with support for commonly used operations built inside the type
> > itself, that will greatly simplify the user's tasks, and open up a
> > whole new avenue of applications for us, such as recommender systems,
> > social network analysis, or anything that can be done with graphs.
>
> Usually, though, you'd be interested in large graphs which include
> information for lots of records (e.g., nodes are individual users, or
> products, or whatever). A graph datatype is not well suited for that,
> because it'd store each graph as a single value, and updating the graph
> would mean rewriting that whole value. If you're e.g. doing social
> network analysis, and each new edge between two users requires you to
> pull the whole graph from disk, update it, and write it back, you'll
> probably hit problems once you reach a few hundred users or so… which
> really isn't a lot for that kind of application.
>
> I'd love to see more support for those kinds of queries in postgres
> (although WITH RECURSIVE already was a *huge* improvement in this
> area!). But storing each graph as a single value, as a graph type would
> do, isn't the way forward, IMHO.

My $0.02: I believe it would be best to largely separate the questions of storage and access. Partly because of Florian's concern that you'd frequently want only one representation of the whole graph, but also because the actual storage interface does NOT have to be user-friendly if we have a good access layer.

In particular, if rows had a low overhead, we'd probably just store graphs that way. That's obviously not the case in PG, so is there some kind of hybrid approach we could use? Perhaps sections of a graph could be stored with one piece of MVCC overhead per section?
That's why I think separating access from storage is going to be very important; if we do that up front, we can change the storage later as we get real experience with this.

Second, we should consider how much the access layer should build on WITH RECURSIVE and the like. Being able to detect specific use patterns of CTE/WITH RECURSIVE seems like it could add a lot of value; but I also worry that it's way too magical to be practical.

-- Jim C. Nasby, Data Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net
Re: [HACKERS] Graph datatype addition
On Monday, April 29, 2013, Atri Sharma wrote:
> On Mon, Apr 29, 2013 at 10:12 PM, Misa Simic wrote:
> > Hi Atri,
> >
> > What is an example of custom internal representation and its JSON
> > representation (though both JSON and HStore represent their values as
> > text)?
> >
> > I also think that the key question is: "what operations would you
> > support on this data type?"
> >
> > Or what kind of problems will it solve? (what can't be solved now - or
> > can now - but the new type will allow a better way...)
> >
> > Thanks,
> >
> > Misa
>
> Hi Misa,
>
> Thanks for thinking it through.
>
> I have not thought about it yet (I was going with the HStore
> representation till the moment, which I showed in my first mail in this
> thread). I believe that something on these lines could be done:
>
> Entity 1:
> Node: Node1
> Adjacency list: node2, node3, node4
>
> Entity 2:
> Node: Node2
> Adjacency list: node1, node5
>
> Entity 3:
> Node: Node3
> Adjacency list: node1, node4
>
> Adjacency list sets:
> "Node1"=>"Entity1","Node2"=>"Entity2","Node3"=>"Entity3"
>
> I mentioned the potential operations we could have in a previous mail.
> Specifically, I can think of the standard tasks, i.e. searching whether
> two nodes are connected or not, adding new nodes and edges, and
> traversing the adjacency lists of nodes.
>
> If we add support for weighted graphs, we can probably add support for
> some common graph algorithms, such as Dijkstra's algorithm, the
> Bellman-Ford algorithm, an MST-building algorithm, and network flow
> algorithms.
>
> The main idea is to allow the user to work with graphs pretty easily,
> and allow the user to use the data present in his database to make
> graphs and then process them.
>
> I think we find workarounds or makeshifts at the moment if we need to
> use graphs in our database in postgres.
> If we have a datatype itself, with support for commonly used operations
> built inside the type itself, that will greatly simplify the user's
> tasks, and open up a whole new avenue of applications for us, such as
> recommender systems, social network analysis, or anything that can be
> done with graphs.

Hm... Have you considered maybe the ltree datatype? To me, all of what you described sounds solvable in a pure SQL way (+ the ltree datatype to help with indexes and performance, as a materialised path, to avoid recursive queries all the time...). Though it would be nice to see something new that would simplify these tasks...

Cheers, Misa
Re: [HACKERS] [PATCH] add --throttle option to pgbench
On 4/29/13 1:08 PM, Fabien COELHO wrote:
> > While I don't understand the part about his laptop battery, I think
> > that there is a good use case for this. If you are looking at latency
> > distributions or spikes, you probably want to see what they are like
> > with a load which is like the one you expect having, not the load
> > which is the highest possible. Although for this use case you would
> > almost surely be using custom transaction files, not default ones, so
> > I think you could just use \sleep. However, I don't know if there is
> > an easy way to dynamically adjust the sleep value by subtracting off
> > the overhead time and randomizing it a bit, like is done here.
>
> Indeed, my thoughts :-) Having regular (\sleep n) or uniformly
> distributed (\sleep :random_value) delays is not very realistic, and I
> would have to do some measures to find the right value for a target
> load.

+1 to being able to throttle to make latency measurements. I'm also wondering if it would be useful to be able to set a latency target and have something adjust concurrency to see how well you can hit it. Certainly feature creep for the proposed patch; I only bring it up because there may be enough similarity to consider that use case at this time, even if we don't implement it yet.

-- Jim C. Nasby, Data Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net
Re: [HACKERS] Remaining beta blockers
Tom Lane wrote:
> [ shrug... ] You and Kevin essentially repeated your claims that
> the current implementation is OK; nobody else weighed in.

Many people weighed in on the need to differentiate between an empty matview and one which has not been populated. Many have also weighed in on the benefits of unlogged matviews.

> I see absolutely no reason to change my opinion on this. Keeping
> relisscannable state in the form of is-the-file-of-nonzero-size
> is something we WILL regret

... or change in a subsequent major release. I see no reason why using this technique now makes it harder to do something else later. Could you elaborate on the technical challenges you see to doing so?

> Pollyanna-ish refusal to believe that is not an adequate reason
> for painting ourselves into a corner, especially not for a
> second-order feature like unlogged matviews.

I would guess that about half the use-cases for materialized views will stay with tables in spite of the added hassle, if they have to degrade performance by adding logging where they currently have none.

> The way forward to unlogged matviews that behave the way Kevin
> wants is to improve crash recovery so that we can update catalog
> state after completing recovery, which is something there are
> other reasons to want anyway.

That would certainly be for the best.

> But it's far too late to do that in this cycle.

Yes it is.

> In the meantime I remain convinced that we're better off dropping
> the feature until we have an implementation that's not going to
> bite us in the rear in the future.

I haven't caught up with you on how it will do that.

> I also note that there are acknowledged-even-by-you bugs in the
> current implementation, which no patch has emerged for, so
> *something's* got to be done. I'm done waiting for something to
> happen, and am going to go fix it in the way I think best.
Are you referring to the case where, if you refresh a matview with over 8GB and it winds up empty, vacuum can make it look invalid until the next REFRESH? This is one of many ways that could be fixed:

diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
index 8a1ffcf..b950b16 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -230,7 +230,13 @@ lazy_vacuum_rel(Relation onerel, VacuumStmt *vacstmt,
 	 *
 	 * Don't even think about it unless we have a shot at releasing a goodly
 	 * number of pages. Otherwise, the time taken isn't worth it.
+	 *
+	 * Leave a populated materialized view with at least one page.
 	 */
+	if (onerel->rd_rel->relkind == RELKIND_MATVIEW &&
+		vacrelstats->nonempty_pages == 0)
+		vacrelstats->nonempty_pages = 1;
+
 	possibly_freeable = vacrelstats->rel_pages - vacrelstats->nonempty_pages;
 	if (possibly_freeable > 0 &&
 		(possibly_freeable >= REL_TRUNCATE_MINIMUM ||

The hysteria and FUD about using the currently-supported technique for unlogged tables to implement unlogged matviews are very discouraging. And the notion that we would release a matview feature which allowed false results (in the form of treating a never-populated matview as a legal empty matview) is truly scary to me. If we can't tell the difference between those two things, I don't think we should be releasing the feature.

-- Kevin Grittner EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Graph datatype addition
On Apr29, 2013, at 21:00 , Atri Sharma wrote:
> I think we find workarounds or makeshifts at the moment if we need
> to use graphs in our database in postgres. If we have a datatype
> itself, with support for commonly used operations built inside the
> type itself, that will greatly simplify the user's tasks, and open up a
> whole new avenue of applications for us, such as recommender systems,
> social network analysis, or anything that can be done with graphs.

Usually, though, you'd be interested in large graphs which include information for lots of records (e.g., nodes are individual users, or products, or whatever). A graph datatype is not well suited for that, because it'd store each graph as a single value, and updating the graph would mean rewriting that whole value. If you're e.g. doing social network analysis, and each new edge between two users requires you to pull the whole graph from disk, update it, and write it back, you'll probably hit problems once you reach a few hundred users or so… which really isn't a lot for that kind of application.

I'd love to see more support for those kinds of queries in postgres (although WITH RECURSIVE already was a *huge* improvement in this area!). But storing each graph as a single value, as a graph type would do, isn't the way forward, IMHO.

best regards, Florian Pflug
Re: [HACKERS] [PATCH] add --throttle option to pgbench
> I'm having a hard time understanding the use-case for this feature.

Here is an example functional use case I had in mind. Let us say I'm teaching a practice session about administrating replication. Students have a desktop computer on which they can install several instances of postgresql, or possibly use virtual machines. I'd like them to set up one server, put it under a continuous load, then create a first slave, then a second, and things like that.

The thing I do not want is the poor desktop and its hard drive to be at maximum speed for the whole afternoon while doing the session, making it hard to do anything else on the host. So I want something both realistic (the database is under a load, the WAL is advancing, let us dump it, base backup it, replicate it, monitor it, update it, whatever...), but gentle all the same. Using pgbench with --throttle basically provides the adjustable continuous load I need.

I understand that this is not at all the intent for which it was developed. Note that I will probably propose another patch to provide a heartbeat while things are going on, but I thought that one patch at a time was enough.

-- Fabien.
Re: [HACKERS] Graph datatype addition
On Mon, Apr 29, 2013 at 10:12 PM, Misa Simic wrote:
> Hi Atri,
>
> What is an example of custom internal representation and its JSON
> representation (though both JSON and HStore represent their values as
> text)?
>
> I also think that the key question is: "what operations would you
> support on this data type?"
>
> Or what kind of problems will it solve? (what can't be solved now - or
> can now - but the new type will allow a better way...)
>
> Thanks,
>
> Misa

Hi Misa,

Thanks for thinking it through.

I have not thought about it yet (I was going with the HStore representation till the moment, which I showed in my first mail in this thread). I believe that something on these lines could be done:

Entity 1:
Node: Node1
Adjacency list: node2, node3, node4

Entity 2:
Node: Node2
Adjacency list: node1, node5

Entity 3:
Node: Node3
Adjacency list: node1, node4

Adjacency list sets:

"Node1"=>"Entity1","Node2"=>"Entity2","Node3"=>"Entity3"

I mentioned the potential operations we could have in a previous mail. Specifically, I can think of the standard tasks, i.e. searching whether two nodes are connected or not, adding new nodes and edges, and traversing the adjacency lists of nodes.

If we add support for weighted graphs, we can probably add support for some common graph algorithms, such as Dijkstra's algorithm, the Bellman-Ford algorithm, an MST-building algorithm, and network flow algorithms.

The main idea is to allow the user to work with graphs pretty easily, and allow the user to use the data present in his database to make graphs and then process them.

I think we find workarounds or makeshifts at the moment if we need to use graphs in our database in postgres. If we have a datatype itself, with support for commonly used operations built inside the type itself, that will greatly simplify the user's tasks, and open up a whole new avenue of applications for us, such as recommender systems, social network analysis, or anything that can be done with graphs.
Regards, Atri -- Regards, Atri l'apprenant -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
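To make the adjacency-list representation sketched above concrete, here is a minimal Python sketch of the node/adjacency-list entities and the "are two nodes connected" operation mentioned in the mail. All names and the dict layout are illustrative only; nothing here is a proposed on-disk format for the datatype.

```python
from collections import deque

# Adjacency-list graph, mirroring the Node / Adjacency-list entities above.
graph = {
    "node1": ["node2", "node3", "node4"],
    "node2": ["node1", "node5"],
    "node3": ["node1", "node4"],
    "node4": ["node1", "node3"],
    "node5": ["node2"],
}

def connected(g, src, dst):
    """Breadth-first search over the adjacency lists: are src and dst connected?"""
    seen, frontier = {src}, deque([src])
    while frontier:
        node = frontier.popleft()
        if node == dst:
            return True
        for neighbor in g.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append(neighbor)
    return False

print(connected(graph, "node5", "node4"))  # True: node5 -> node2 -> node1 -> node4
```

Adding nodes and edges then amounts to updating the adjacency lists, which is also roughly what an HStore-backed representation would have to support.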
Re: [HACKERS] Fixing statistics problem related to vacuum truncation termination
Kevin Grittner wrote: > After reviewing the threads and thinking about the various > historical behaviors, I'm suggesting that we apply the attached, > back-patched to 9.0, to fix the unintended changes from historical > behavior related to lack of statistics generation in some cases > where statistics were generated before > b19e4250b45e91c9cbdd18d35ea6391ab5961c8d, and to generate them in > some cases where they were historically suppressed. Pushed. -- Kevin Grittner EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] add --throttle option to pgbench
Hello Jeff, While I don't understand the part about his laptop battery, I think that there is a good use case for this. If you are looking at latency distributions or spikes, you probably want to see what they are like with a load like the one you expect to have, not the highest load possible. Although for this use case you would almost surely be using custom transaction files, not default ones, so I think you could just use \sleep. However, I don't know if there is an easy way to dynamically adjust the sleep value by subtracting off the overhead time and randomizing it a bit, like is done here. Indeed, my thoughts:-) Having regular (\sleep n) or uniformly distributed (\sleep :random_value) delays is not very realistic, and I would have to take some measurements to find the right value for a target load. It does seem to me that we should Poissonize the throttle time, then subtract the average overhead, rather than Poissonizing the difference. I actually thought about doing it the way you suggested, because it was "right". However I did not do it, because if the Poisson distribution yields, possibly quite frequently, a time below the transaction time, one ends up with an artificial sequence of stuck transactions, as a client cannot start the second transaction before the previous one has finished, and this does not seem realistic. To really do that more cleanly, it would require distributing the events between clients, and thus some kind of coordination between clients, which would really be another test application. Having an approximation of that seemed good enough for my purpose. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
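The delay computation being debated here can be sketched in a few lines of Python. This is an illustration of the approach taken in the proposed patch (multiply the remaining per-transaction time budget by -log(U)), not the exact pgbench code; the 1..100 uniform draw mimics the patch's getrand()-based approximation, and all names are made up for the sketch.

```python
import math
import random

def throttle_delay(throttle_us, avg_txn_us):
    """Poisson-like inter-transaction delay in microseconds:
    remaining budget (throttle - average transaction time) times -log(U),
    with U uniform in (0, 1]. Returns 0 when transactions are already
    slower than the target rate (nothing left to throttle)."""
    if avg_txn_us >= throttle_us:
        return 0
    u = random.randint(1, 100) / 100.0  # mimics getrand(thread, 1, 100)/100.0
    return (throttle_us - avg_txn_us) * -math.log(u)

# With a 100 ms budget and a 20 ms average transaction time, the mean extra
# delay is close to 80 ms, since -log(U) has mean ~1 for uniform U.
delays = [throttle_delay(100_000, 20_000) for _ in range(10_000)]
print(sum(delays) / len(delays))
```

The "stuck transactions" concern above corresponds to draws where `-log(u)` is small: the computed delay can be near zero, but with this scheme it never goes negative, so a client never tries to schedule a transaction before the previous one finishes.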
Re: [HACKERS] ALTER DEFAULT PRIVILEGES FOR ROLE is broken
> Right. I wonder if there's any good reason why we shouldn't extend > aclerror() to, in all cases, add a DETAIL line along the lines of > > ERROR: permission denied for schema web > DETAIL: This operation requires role X to have privilege Y. > > Is there any scenario where this'd be exposing too much info? Not that I can think of. The fact that role X doesn't have create on schema Y isn't exactly privileged info. Further, to make any use of that information, you'd have to be able to SET ROLE X, in which case you can just test for yourself if X has CREATE permission. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] add --throttle option to pgbench
Hello Tom, I'm having a hard time understanding the use-case for this feature. Surely, if pgbench is throttling its transaction rate, you're going to just end up measuring the throttle rate. Indeed, I do not want to measure the tps if I throttle it. The point is to generate a continuous but not necessarily maximal load, and to test other things under such a load, such as possibly cascading replication, failover, various dump strategies, whatever. I wanted that to test the impact of various load levels, and for functional tests on my laptop which should not drain the battery. How does causing a test to take longer result in reduced battery drain? If I test a replication setup on my laptop at maximum load, I can see the battery draining in a few seconds by looking at the effect on the time-left widget. This remark is mostly for functional tests, not for performance tests. If I want to test the maximum load of a setup, obviously I will not do that on my laptop, and I will not use --throttle... -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER DEFAULT PRIVILEGES FOR ROLE is broken
Josh Berkus writes: > On 04/29/2013 09:59 AM, Tom Lane wrote: >> As I pointed out to you last night, it does already say that. >> I think the problem here is that we're just throwing a generic >> permissions failure rather than identifying the particular permission >> needed. > Yeah, a better error message would help a lot. My first thought was > "WTF? I'm the superuser, whaddya mean, 'permission denied'"? Right. I wonder if there's any good reason why we shouldn't extend aclerror() to, in all cases, add a DETAIL line along the lines of ERROR: permission denied for schema web DETAIL: This operation requires role X to have privilege Y. Is there any scenario where this'd be exposing too much info? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER DEFAULT PRIVILEGES FOR ROLE is broken
On 04/29/2013 09:59 AM, Tom Lane wrote: > Josh Berkus writes: >> This moves the general brokenness of this feature from a bug to (a) a >> documentation issue and (b) unusably fussy. For (a), I think we need >> the following line in the docs: > >> DEFAULT PRIVILEGES may only be granted to a ROLE which already has >> CREATE permission on the specified schema. > > As I pointed out to you last night, it does already say that. > I think the problem here is that we're just throwing a generic > permissions failure rather than identifying the particular permission > needed. Yeah, a better error message would help a lot. My first thought was "WTF? I'm the superuser, whaddya mean, 'permission denied'"? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER DEFAULT PRIVILEGES FOR ROLE is broken
Josh Berkus writes: > This moves the general brokenness of this feature from a bug to (a) a > documentation issue and (b) unusably fussy. For (a), I think we need > the following line in the docs: > DEFAULT PRIVILEGES may only be granted to a ROLE which already has > CREATE permission on the specified schema. As I pointed out to you last night, it does already say that. I think the problem here is that we're just throwing a generic permissions failure rather than identifying the particular permission needed. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] add --throttle option to pgbench
On Mon, Apr 29, 2013 at 8:27 AM, Tom Lane wrote: > Fabien COELHO writes: > > Please find attached a small patch to add a throttling capability to > > pgbench, that is pgbench aims at a given client transaction rate instead > > of maximizing the load. The throttling relies on Poisson-distributed > > delays inserted after each transaction. > > I'm having a hard time understanding the use-case for this feature. > Surely, if pgbench is throttling its transaction rate, you're going > to just end up measuring the throttle rate. > While I don't understand the part about his laptop battery, I think that there is a good use case for this. If you are looking at latency distributions or spikes, you probably want to see what they are like with a load like the one you expect to have, not the highest load possible. Although for this use case you would almost surely be using custom transaction files, not default ones, so I think you could just use \sleep. However, I don't know if there is an easy way to dynamically adjust the sleep value by subtracting off the overhead time and randomizing it a bit, like is done here. It does seem to me that we should Poissonize the throttle time, then subtract the average overhead, rather than Poissonizing the difference. Cheers, Jeff
Re: [HACKERS] ALTER DEFAULT PRIVILEGES FOR ROLE is broken
> rhaas=# create user bob; > CREATE ROLE > rhaas=# create schema we_like_bob; > CREATE SCHEMA > rhaas=# alter default privileges for role bob in schema we_like_bob > grant select on tables to bob; > ERROR: permission denied for schema we_like_bob > rhaas=# grant create on schema we_like_bob to bob; > GRANT > rhaas=# alter default privileges for role bob in schema we_like_bob > grant select on tables to bob; > ALTER DEFAULT PRIVILEGES H. Must have got something tangled up there; starting over with a clean database and new users I got it to work. I'll see if I can reproduce the issue I'm getting on my production schema. This moves the general brokenness of this feature from a bug to (a) a documentation issue and (b) unusably fussy. For (a), I think we need the following line in the docs: DEFAULT PRIVILEGES may only be granted to a ROLE which already has CREATE permission on the specified schema. For (b), I'll take it up in the 9.4 dev cycle. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Graph datatype addition
Hi Merlin, " Graphs are not currently able to be transported out of the database via JSON" What does that mean? (I probably don't understand graphs well - but from my point of view - any data can be transported out of DB via JSON) Thanks, Misa 2013/4/29 Merlin Moncure > On Mon, Apr 29, 2013 at 12:55 AM, Atri Sharma wrote: > >> It's probably pretty easy to add this, but I think the question is > >> what would make it better than storing the same representation in a > >> text field. > > > > I completely agree. The main point in making a new datatype would be > > to add support for operations that are normally done with graphs. > > > > > >>Obviously you get validation that the input is in the > >> correct format, but you could do that with a CHECK constraint, too, or > >> otherwise handle it in the application. So I think the really > >> interesting question is: what operations would you support on this > >> data type? > > > > I can think of the standard tasks, i.e. searching if two nodes are > > connected or not, adding new nodes and edges, traversing the adjacency > > lists of nodes. > > > > If we add support for weighted graphs, we can probably add support for > > some common graph algorithms, such as Dijkstra's algorithm, Bellman-Ford > > algorithm, an MST-building algorithm, network flow algorithms. > > > > The main idea is to allow the user to work with graphs pretty easily, and > > allow the user to use the data present in his database to make graphs > > and then process them. > > > >> One of the problems you're likely to run into if you store the whole > >> graph as a single object is that it may make many of the things you > >> want to do with it not very efficient. > > > > Yes, I agree. 
On further thought, I believe it would be more of a pain > > if we stick to representing the whole thing as one. Rather, making > > multiple components will be more flexible and modular, and allow us to > > modify different components of the same graph without modifying or > > interfering with other components of the graph. > > > > I will think of a new design. I am still thinking of using HStore to > > store adjacency lists. This should have good performance for access of > > lists and similar tasks, IMO. > > This is an interesting idea. Historically I've always decomposed > graphs into relational structures because that's the only practical > way to query them. Graphs are not currently able to be transported > out of the database via JSON, so one of the areas to focus > your research will be how the client will consume the data. > libpqtypes is one way to do it, but that will really restrict your > audience, so you'll probably need a rich set of functions to present the > internal data (just like hstore). > > Another area to focus research will be on searchability: how to use > GIST/GIN indexes to pull data out via an internal query string. An > overview of the current GIST-based type implementations (like ltree) > couldn't hurt. > > merlin > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers >
Re: [HACKERS] Graph datatype addition
Hi Atri, What is an example of custom internal representation and its JSON representation (though both JSON and HStore represent their values as text)? I also think that the key question is: "what operations would you support on this data type?" Or what kind of problems will it solve? (what can't be solved now - or can now - but new type will allow the better way...) Thanks, Misa 2013/4/29 Atri Sharma > > > > I don't agree with this; JSON is not really designed to store graphs. > > You will probably need a customized internal representation, just like > > hstore, that expresses a graph like structure. > > Yes, we will have a custom internal representation. I was thinking of > ways to export the graph into user parsable type, hence JSON. > > > > Regards, > > Atri > > > -- > Regards, > > Atri > l'apprenant > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers >
Re: [HACKERS] [PATCH] add --throttle option to pgbench
Fabien COELHO writes: > Please find attached a small patch to add a throttling capability to > pgbench, that is pgbench aims at a given client transaction rate instead > of maximizing the load. The throttling relies on Poisson-distributed > delays inserted after each transaction. I'm having a hard time understanding the use-case for this feature. Surely, if pgbench is throttling its transaction rate, you're going to just end up measuring the throttle rate. > I wanted that to test the impact of various load levels, and for > functionnal tests on my laptop which should not drain the battery. How does causing a test to take longer result in reduced battery drain? You still need the same number of transactions if you want an honest test, so it seems to me the machine would have to be on longer and thus you'd eat *more* battery to get an equivalently trustworthy result. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Graph datatype addition
> > I don't agree with this; JSON is not really designed to store graphs. > You will probably need a customized internal representation, just like > hstore, that expresses a graph like structure. Yes, we will have a custom internal representation. I was thinking of ways to export the graph into a user-parsable format, hence JSON. Regards, Atri -- Regards, Atri l'apprenant -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
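As a concrete illustration of the "custom internal representation, exported as JSON" idea above, here is a minimal Python sketch. The payload layout (a nodes list plus an edges map) is invented purely for illustration; no wire format was settled in this thread.

```python
import json

# Internal representation: adjacency lists keyed by node name.
adjacency = {
    "node1": ["node2", "node3", "node4"],
    "node2": ["node1", "node5"],
    "node3": ["node1", "node4"],
}

# One possible user-parsable export: enumerate the nodes, then ship the
# edge lists as-is. A real type would do this in its output function.
payload = json.dumps({"nodes": sorted(adjacency), "edges": adjacency})
print(payload)
```

The round trip back (`json.loads`) recovers the same adjacency structure, which is the property a JSON export of the type would need to guarantee.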
Re: [HACKERS] Graph datatype addition
On Mon, Apr 29, 2013 at 9:25 AM, Atri Sharma wrote: >> >> This is an interesting idea. Historically I've always decomposed >> graphs into relational structures because that's the only practical >> way to query them. Graphs are not currently able to be transported >> out of the database via JSON, so one of the areas to focus >> your research will be how the client will consume the data. >> libpqtypes is one way to do it, but that will really restrict your >> audience, so you'll probably need a rich set of functions to present the >> internal data (just like hstore). > > I completely agree. Initially, I was thinking of exposing the data to > the user via HStore. But now, after Robert's suggestions, I think it will > be better to have an alternate representation. JSON seems to be an > excellent idea for that. I don't agree with this; JSON is not really designed to store graphs. You will probably need a customized internal representation, just like hstore, that expresses a graph-like structure. This is not a trivial project. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Graph datatype addition
> > This is an interesting idea. Historically I've always decomposed > graphs into relational structures because that's the only practical > way to query them. Graphs are not currently able to be transported > out of the database via JSON, so one of the areas to focus > your research will be how the client will consume the data. > libpqtypes is one way to do it, but that will really restrict your > audience, so you'll probably need a rich set of functions to present the > internal data (just like hstore). I completely agree. Initially, I was thinking of exposing the data to the user via HStore. But now, after Robert's suggestions, I think it will be better to have an alternate representation. JSON seems to be an excellent idea for that. I am thinking of having the functions for working on the graph present inside the data type itself. > Another area to focus research will be on searchability: how to use > GIST/GIN indexes to pull data out via an internal query string. An > overview of the current GIST-based type implementations (like ltree) I will definitely research that. Actually, I have never looked at GIST/GIN code before, so I have no idea how they can be used here. Regards, Atri -- Regards, Atri l'apprenant -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Functional dependencies and GROUP BY - for subqueries
Ashutosh Bapat writes: > Is there any reason why do we want to check the functional dependencies at > the time of parsing and not after rewrite? Obviously, by doing so, we will > allow creation of certain views which will start throwing errors after the > underlying table changes the primary key. Is it mandatory that we throw > "functional dependency" related errors at the time of creation of views? From a usability standpoint, I would think so. And really the only excuse for the functional-dependency feature to exist at all is usability; it adds nothing you couldn't do without it. If we wanted to do something like this, I think the clean way to do it would be to invent a notion of unique/not-null/pkey constraints on views, so that the creator of a view could declaratively say that he wants such a property exposed. That is, the example would become something like create table t1 (id int primary key, ... other stuff ...); create view v1 as select * from t1; alter view v1 add primary key(id); create view v2 as select * from v1 group by id; The pkey constraint on v1 is just a catalog entry with a dependency on t1's pkey constraint; there's no actual index there. But now, v2 can be built with a dependency on v1's pkey, not t1's, and the action-at-a-distance problem goes away. For example, a "CREATE OR REPLACE v1" command could check that the new view definition still provides something for v1's pkey to depend on, and throw error or not without any need to examine the contents of other views. Dropping various elements of this schema would work unsurprisingly, too. This would, of course, require a significant chunk of new code, and personally I do not think the feature would be worth it. But it would be a clean and usable design. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Graph datatype addition
On Mon, Apr 29, 2013 at 12:55 AM, Atri Sharma wrote: >> It's probably pretty easy to add this, but I think the question is >> what would make it better than storing the same representation in a >> text field. > > I completely agree. The main point in making a new datatype would be > to add support for operations that are normally done with graphs. > > > >>Obviously you get validation that the input is in the >> correct format, but you could do that with a CHECK constraint, too, or >> otherwise handle it in the application. So I think the really >> interesting question is: what operations would you support on this >> data type? > > I can think of the standard tasks, i.e. searching if two nodes are > connected or not, adding new nodes and edges, traversing the adjacency > lists of nodes. > > If we add support for weighted graphs, we can probably add support for > some common graph algorithms, such as Dijkstra's algorithm, Bellman-Ford > algorithm, an MST-building algorithm, network flow algorithms. > > The main idea is to allow the user to work with graphs pretty easily, and > allow the user to use the data present in his database to make graphs > and then process them. > >> One of the problems you're likely to run into if you store the whole >> graph as a single object is that it may make many of the things you >> want to do with it not very efficient. > > Yes, I agree. On further thought, I believe it would be more of a pain > if we stick to representing the whole thing as one. Rather, making > multiple components will be more flexible and modular, and allow us to > modify different components of the same graph without modifying or > interfering with other components of the graph. > > I will think of a new design. I am still thinking of using HStore to > store adjacency lists. This should have good performance for access of > lists and similar tasks, IMO. This is an interesting idea. 
Historically I've always decomposed graphs into relational structures because that's the only practical way to query them. Graphs are not currently able to be transported out of the database via JSON, so one of the areas to focus your research will be how the client will consume the data. libpqtypes is one way to do it, but that will really restrict your audience, so you'll probably need a rich set of functions to present the internal data (just like hstore). Another area to focus research will be on searchability: how to use GIST/GIN indexes to pull data out via an internal query string. An overview of the current GIST-based type implementations (like ltree) couldn't hurt. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
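Since Dijkstra's algorithm keeps coming up in this thread as a candidate built-in operation for a weighted graph type, here is a compact Python sketch of what it looks like over a weighted adjacency list. This is purely illustrative of the operation being discussed, not a proposal for how the datatype would implement it.

```python
import heapq

def dijkstra(graph, source):
    """Shortest-path distances from source over a weighted adjacency list:
    graph maps node -> list of (neighbor, weight) pairs, weights non-negative."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry, already found a shorter path
        for neighbor, weight in graph.get(node, []):
            nd = d + weight
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return dist

g = {
    "a": [("b", 1), ("c", 4)],
    "b": [("c", 2), ("d", 6)],
    "c": [("d", 3)],
    "d": [],
}
print(dijkstra(g, "a"))  # shortest a->d path is a->b->c->d with cost 6
```

Whether such an algorithm runs server-side inside the type or client-side over an exported representation is exactly the transport question raised above.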
[HACKERS] [PATCH] add --throttle option to pgbench
Hello, Please find attached a small patch to add a throttling capability to pgbench, that is pgbench aims at a given client transaction rate instead of maximizing the load. The throttling relies on Poisson-distributed delays inserted after each transaction. I wanted that to test the impact of various load levels, and for functionnal tests on my laptop which should not drain the battery. sh> ./pgbench -T 10 -c 2 --throttle 10tps test starting vacuum...end. transaction type: TPC-B (sort of) scaling factor: 1 query mode: simple number of clients: 2 number of threads: 1 duration: 10 s number of transactions actually processed: 214 tps = 21.054216 (including connections establishing) tps = 21.071253 (excluding connections establishing) -- Fabien.diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c index bc01f07..c11aa26 100644 --- a/contrib/pgbench/pgbench.c +++ b/contrib/pgbench/pgbench.c @@ -137,6 +137,12 @@ int unlogged_tables = 0; double sample_rate = 0.0; /* + * whether clients are throttled to a given rate, expressed as a delay in us. + * 0, the default means no throttling. 
+ */ +int64 throttle = 0; + +/* * tablespace selection */ char *tablespace = NULL; @@ -204,6 +210,8 @@ typedef struct int nvariables; instr_time txn_begin; /* used for measuring transaction latencies */ instr_time stmt_begin; /* used for measuring statement latencies */ + instr_time txn_time; /* cumulated transaction time for throttling */ + bool throttled; /* whether current transaction was throttled */ int use_file; /* index in sql_files for this client */ bool prepared[MAX_FILES]; } CState; @@ -361,6 +369,9 @@ usage(void) " -S perform SELECT-only transactions\n" " -t NUM number of transactions each client runs (default: 10)\n" " -T NUM duration of benchmark test in seconds\n" + " -H SPEC, --throttle SPEC\n" + " delay in second to throttle each client\n" + " sample specs: 0.025 40tps 25ms 25000us\n" " -v vacuum all four standard tables before tests\n" "\nCommon options:\n" " -d print debugging output\n" @@ -1027,7 +1038,7 @@ top: } } - if (commands[st->state]->type == SQL_COMMAND) + if (!st->throttled && commands[st->state]->type == SQL_COMMAND) { /* * Read and discard the query result; note this is not included in @@ -1049,26 +1060,64 @@ top: discard_response(st); } + /* some stuff done at the end */ if (commands[st->state + 1] == NULL) { - if (is_connect) + /* disconnect if required and needed */ + if (is_connect && st->con) { PQfinish(st->con); st->con = NULL; } - ++st->cnt; - if ((st->cnt >= nxacts && duration <= 0) || timer_exceeded) -return clientDone(st, true); /* exit success */ + /* update transaction counter once, and possibly end */ + if (!st->throttled) + { +++st->cnt; +if ((st->cnt >= nxacts && duration <= 0) || timer_exceeded) + return clientDone(st, true); /* exit success */ + } + + /* handle throttling once, as the last post-transaction stuff */ + if (throttle && !st->throttled) + { +instr_time now, run; +int64 avg_txn_time; +st->throttled = true; +INSTR_TIME_SET_CURRENT(now); +run = now; +INSTR_TIME_SUBTRACT(run, st->txn_begin); 
+INSTR_TIME_ADD(st->txn_time, run); +avg_txn_time = INSTR_TIME_GET_MICROSEC(st->txn_time) / st->cnt; +if (avg_txn_time < throttle) +{ + /* compute delay to approximate a Poisson distribution + * based on the client's current transaction mean time + * 100 => 13.8 .. 0 multiplier + */ + int64 wait = (int64) + ((throttle - avg_txn_time) * + -log(getrand(thread, 1, 100)/100.0)); + if (debug) + fprintf(stderr, "client %d throttling %d us\n", +st->id, (int) wait); + st->sleeping = 1; + st->until = INSTR_TIME_GET_MICROSEC(now) + wait; + return true; +} +/* else WARNING transactions too long, cannot throttle... */ + } } /* increment state counter */ st->state++; if (commands[st->state] == NULL) { + /* reset */ st->state = 0; st->use_file = (int) getrand(thread, 0, num_files - 1); commands = sql_files[st->use_file]; + st->throttled = false; } } @@ -1087,8 +1136,8 @@ top: INSTR_TIME_ACCUM_DIFF(*conn_time, end, start); } - /* Record transaction start time if logging is enabled */ - if (logfile && st->state == 0) + /* Record transaction start time if logging or throttling is enabled */ + if ((logfile || throttle) && st->state == 0) INSTR_TIME_SET_CURRENT(st->txn_begin); /* Record statement start time if per-command latencies are requested */ @@ -2086,6 +2135,7 @@ main(int argc, char **argv) {"unlogged-tables", no_argument, &unlogged_tables, 1}, {"sampling-rate", required_argument, NULL, 4}, {"aggregate-interval", required_ar
Re: [HACKERS] ALTER DEFAULT PRIVILEGES FOR ROLE is broken
On Sun, Apr 28, 2013 at 9:39 PM, Josh Berkus wrote: >> The fine manual notes that the target role has to already have CREATE >> privileges on the target schema --- maybe that's what's biting you in >> this case? > > Nope. That was the first thing I thought of. It really is that the > target role must *own* the schema. So clearly a bug. wfm. rhaas=# create user bob; CREATE ROLE rhaas=# create schema we_like_bob; CREATE SCHEMA rhaas=# alter default privileges for role bob in schema we_like_bob grant select on tables to bob; ERROR: permission denied for schema we_like_bob rhaas=# grant create on schema we_like_bob to bob; GRANT rhaas=# alter default privileges for role bob in schema we_like_bob grant select on tables to bob; ALTER DEFAULT PRIVILEGES -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers