Re: [HACKERS] why do we need two snapshots per query?
On Fri, Nov 11, 2011 at 10:04 PM, Tom Lane wrote:
> Simon Riggs writes:
>> Tom, in that earlier thread you said you'd be doing something in this
>> release about that. Can you say more about what that was, and will
>> you be doing it still?
>
> http://git.postgresql.org/gitweb/?p=postgresql.git&a=commitdiff&h=e6faf910d75027bdce7cd0f2033db4e912592bcc
>
> I think that largely supersedes what I understood your notion of a
> one-shot plan to be about, though perhaps I missed something?

I was looking at other use cases, specifically partitioning/partial
indexes. If we could be certain that a query was being executed
immediately, then it would be possible to simplify expressions using
stable functions as if they were constants. My earlier patch did
exactly that.

-- 
Simon Riggs   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_restore --no-post-data and --post-data-only
On 08/26/2011 05:11 PM, Tom Lane wrote:
> Alvaro Herrera writes:
>> The "--section=data --section=indexes" proposal seems very reasonable
>> to me -- more so than "--sections='data indexes'".
>
> +1 ... not only easier to code and less squishily defined, but more
> like the existing precedent for other pg_dump switches, such as
> --table.

Here is a patch for that for pg_dump. The sections provided for are
pre-data, data and post-data, as discussed elsewhere. I still feel
that anything finer grained should be handled via pg_restore's
--use-list functionality. I'll provide a patch to do the same switch
for pg_restore shortly.

Adding to the commitfest.

cheers

andrew

*** a/doc/src/sgml/ref/pg_dump.sgml
--- b/doc/src/sgml/ref/pg_dump.sgml
***************
*** 116,124 ****
!        This option is only meaningful for the plain-text format. For
!        the archive formats, you can specify the option when you
!        call pg_restore.
--- 116,122 ----
!        This option is equivalent to specifying --section=data.
***************
*** 404,413 ****
--- 402,431 ----
         Dump only the object definitions (schema), not data.
+        This option is equivalent to specifying
+        --section=pre-data --section=post-data.
+ 
+   --section=sectionname
+ 
+        Only dump the named section. The name can be one of pre-data,
+        data and post-data. This option can be specified more than
+        once. The default is to dump all sections.
+ 
+        Post-data items consist of definitions of indexes, triggers,
+        rules and constraints other than check constraints. Pre-data
+        items consist of all other data definition items.
+ 
    -S username
    --superuser=username

*** a/src/bin/pg_dump/pg_dump.c
--- b/src/bin/pg_dump/pg_dump.c
***************
*** 82,87 ****
--- 82,96 ----
      int         objsubid;       /* subobject (table column #) */
  } SecLabelItem;
  
+ typedef enum
+ {
+     DUMP_PRE_DATA = 0x01,
+     DUMP_DATA = 0x02,
+     DUMP_POST_DATA = 0x04,
+     DUMP_UNSECTIONED = 0xff
+ } DumpSections;
+ 
  /* global decls */
  bool        g_verbose;          /* User wants verbose narration of our
                                   * activities. */
***************
*** 91,96 ****
  PGconn     *g_conn;             /* the database connection */
--- 100,106 ----
  /* various user-settable parameters */
  bool        schemaOnly;
  bool        dataOnly;
+ int         dumpSections;       /* bitmask of chosen sections */
  bool        aclsSkip;
  const char *lockWaitTimeout;
***************
*** 247,253 ****
  static const char *fmtCopyColumnList(const TableInfo *ti);
  static void do_sql_command(PGconn *conn, const char *query);
  static void check_sql_result(PGresult *res, PGconn *conn, const char *query,
                               ExecStatusType expected);
! 
  int
  main(int argc, char **argv)
--- 257,263 ----
  static void do_sql_command(PGconn *conn, const char *query);
  static void check_sql_result(PGresult *res, PGconn *conn, const char *query,
                               ExecStatusType expected);
! static void set_section(const char *arg);
  
  int
  main(int argc, char **argv)
***************
*** 330,335 **** main(int argc, char **argv)
--- 340,346 ----
      {"quote-all-identifiers", no_argument, &quote_all_identifiers, 1},
      {"role", required_argument, NULL, 3},
      {"serializable-deferrable", no_argument, &serializable_deferrable, 1},
+     {"section", required_argument, NULL, 5},
      {"use-set-session-authorization", no_argument, &use_setsessauth, 1},
      {"no-security-labels", no_argument, &no_security_labels, 1},
      {"no-unlogged-table-data", no_argument, &no_unlogged_table_data, 1},
***************
*** 346,351 **** main(int argc, char **argv)
--- 357,363 ----
      strcpy(g_opaque_type, "opaque");
  
      dataOnly = schemaOnly = false;
+     dumpSections = DUMP_UNSECTIONED;
      lockWaitTimeout = NULL;
  
      progname = get_progname(argv[0]);
***************
*** 487,492 **** main(int argc, char **argv)
--- 499,508 ----
                  use_role = optarg;
                  break;
  
+             case 5:             /* section */
+                 set_section(optarg);
+                 break;
+ 
              default:
                  fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
                  exit(1);
***************
*** 517,522 **** main(int argc, char **argv)
--- 533,554 ----
          exit(1);
      }
  
+     if ((dataOnly || schemaOnly) && dumpSections != DUMP_UNSECTIONED)
+     {
+         write_msg(NULL, "options -s/--schema-only and -a/--data-only cannot be used with --section\n");
+         exit(1);
+     }
+ 
+     if (dataOnly)
+         dumpSections = DUMP_DATA;
+     else if (schemaOnly)
+         dumpSections = DUMP_PRE_DATA | DUMP_POST_DATA;
+     else if (dumpSections != DUMP_UNSECTIONED)
+     {
+         dataOnly = dumpSections == DUMP_DATA;
+         schemaOnly = !(dumpSections & DUMP_DATA);
+     }
+ 
      if (dataOnly && outputC
Re: [HACKERS] why do we need two snapshots per query?
On Sat, Nov 12, 2011 at 5:09 PM, Florian Pflug wrote:
> On Nov11, 2011, at 19:17 , Tom Lane wrote:
>> But frankly I do not like any of these proposals. Making fundamental
>> changes in long-established semantics in the name of squeezing out a
>> few cycles is the wrong way to design software.
>
> Hm, then maybe this is one of the things to put onto the next protocol
> version todo list?

+1. I had the same thought. It seems clear that we could design this
in a way that would make it clear to the server whether we wanted to
execute immediately or only upon further instructions, but trying to
guess the user's intentions seems a little too rich.

Meanwhile, here's my attempt at fixing this for the simple query
protocol. I'm attaching two patches:

- refactor-portal-start.patch, which attempts to change the API for
PortalStart() without any behavioral change whatsoever. The idea here
is that instead of passing a snapshot to PortalStart() explicitly, we
just pass a flag saying whether or not it's OK to use the active
snapshot (versus taking a new one). This seems to fit nicely with
existing calling patterns for this function.

- just-one-snapshot.patch, which, applied atop
refactor-portal-start.patch, makes use of the new API to avoid the
need for PORTAL_ONE_SELECT queries to take two snapshots. It does so
by keeping the parse/analyze snapshot around just long enough to pass
it to PortalStart(). If PortalStart() chooses to register it, then it
(or a copy of it) will be around for a while longer; otherwise, it
will be dropped immediately after PortalStart() finishes.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

refactor-portal-start.patch
Description: Binary data

just-one-snapshot.patch
Description: Binary data
Re: [HACKERS] FDW system columns
On 13 November 2011 00:38, Tom Lane wrote:
> Thom Brown writes:
>> But xmin on the file_fdw result is odd. Why are these all over the
>> place?
>
> heap_form_tuple initializes the t_choice fields as though for a tuple
> Datum, and file_fdw doesn't change it.
>
> Just a couple hours ago I was wondering why we create system columns
> for foreign tables at all. Is there a reasonable prospect that they'll
> ever be useful? I can see potential value in tableoid, but the others
> seem pretty dubious --- even if you were fetching from a remote PG
> server, the XIDs would not be meaningful within our own environment.

Yes, that's what I was thinking when curiosity led me to have a look
at what they contain. As far as I see, they serve no useful function.
I didn't bother looking at tableoid as that's generally useful.

Is there a cost to having them there? Could there be tools that might
break if the columns were no longer available?

-- 
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] FDW system columns
Thom Brown writes:
> So the ctid is always 2^32-1. Bit weird, but probably explainable.

See ItemPointerSetInvalid.

> But xmin on the file_fdw result is odd. Why are these all over the
> place?

heap_form_tuple initializes the t_choice fields as though for a tuple
Datum, and file_fdw doesn't change it.

Just a couple hours ago I was wondering why we create system columns
for foreign tables at all. Is there a reasonable prospect that they'll
ever be useful? I can see potential value in tableoid, but the others
seem pretty dubious --- even if you were fetching from a remote PG
server, the XIDs would not be meaningful within our own environment.

			regards, tom lane
[HACKERS] FDW system columns
I notice that there's some weird info coming out of the system columns
on any FDW:

test=# select tableoid, ctid, xmin, xmax, cmin, cmax, * from dict limit 12;
 tableoid |      ctid      | xmin |    xmax    | cmin  | cmax  |   words
----------+----------------+------+------------+-------+-------+-----------
    16428 | (4294967295,0) |  104 | 4294967295 | 16430 | 16430 | A
    16428 | (4294967295,0) |  104 | 4294967295 | 16430 | 16430 | a
    16428 | (4294967295,0) |  108 | 4294967295 | 16430 | 16430 | aa
    16428 | (4294967295,0) |  112 | 4294967295 | 16430 | 16430 | aal
    16428 | (4294967295,0) |  120 | 4294967295 | 16430 | 16430 | aalii
    16428 | (4294967295,0) |  112 | 4294967295 | 16430 | 16430 | aam
    16428 | (4294967295,0) |  116 | 4294967295 | 16430 | 16430 | Aani
    16428 | (4294967295,0) |  132 | 4294967295 | 16430 | 16430 | aardvark
    16428 | (4294967295,0) |  132 | 4294967295 | 16430 | 16430 | aardwolf
    16428 | (4294967295,0) |  120 | 4294967295 | 16430 | 16430 | Aaron
    16428 | (4294967295,0) |  128 | 4294967295 | 16430 | 16430 | Aaronic
    16428 | (4294967295,0) |  136 | 4294967295 | 16430 | 16430 | Aaronical
(12 rows)

That's file_fdw. On the not-yet-ready pgsql_fdw:

test=# select tableoid, ctid, xmin, xmax, cmin, cmax from cows limit 5;
 tableoid |      ctid      | xmin | xmax | cmin | cmax
----------+----------------+------+------+------+------
    16406 | (4294967295,0) |    0 |    0 |    0 |    0
    16406 | (4294967295,0) |    0 |    0 |    0 |    0
    16406 | (4294967295,0) |    0 |    0 |    0 |    0
    16406 | (4294967295,0) |    0 |    0 |    0 |    0
    16406 | (4294967295,0) |    0 |    0 |    0 |    0
(5 rows)

So the ctid is always 2^32-1. Bit weird, but probably explainable.

But xmin on the file_fdw result is odd. Why are these all over the
place?

-- 
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] trivial patch: foreign table either defines row type
Alexander Soudakov writes:
> Foreign table either defines row type. I simply added this type to
> hardcoded check.

Yeah, I think this was just an oversight. (You missed a spot in
plpgsql_parse_cwordtype, though.) Patch applied.

			regards, tom lane
Re: [HACKERS] 9.1.2 ?
On Wed, Nov 9, 2011 at 6:22 PM, Florian Pflug wrote:
> On Nov9, 2011, at 23:53 , Daniel Farina wrote:
>> I think a novice user would be scared half to death: I know I was the
>> first time. That's not a great impression for the project to leave
>> for what is not, at its root, a vast defect, and the fact it's
>> occurring for people when they use rsync rather than my very
>> sensitive backup routines is indication that it's not very corner-ey.
>
> Just to emphasize the non-cornerish-ness of this problem, it should be
> mentioned that the HS issue was observed even with backups taken with
> pg_basebackup, if memory serves correctly.

Yes, I personally can reliably reproduce both the clog+subtrans
problems using pg_basebackup, and can confirm that the
"oldestActiveXid_fixed.v2.patch" does resolve both issues.
Re: [HACKERS] why do we need two snapshots per query?
On Nov11, 2011, at 19:17 , Tom Lane wrote:
> But frankly I do not like any of these proposals. Making fundamental
> changes in long-established semantics in the name of squeezing out a
> few cycles is the wrong way to design software.

Hm, then maybe this is one of the things to put onto the next protocol
version todo list?

best regards,
Florian Pflug
Re: [HACKERS] Allow substitute allocators for PGresult.
* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Well, loading data in a form whereby the application can access it
> without going through the PGresult accessor functions would be an
> entirely different (and vastly larger) project.

Looking through the thread, I agree that it's a different thing than
what's being discussed here.

> I'm not sure I want to open that can of worms --- it seems like you
> could write a huge amount of code trying to provide every format
> someone might want, and still find that there were impedance
> mismatches for many applications.

The OCI approach is actually very similar to how we handle our
catalogs internally. Imagine you define a C struct which matches your
table structure, then you allocate 5000 (or however many) of those,
and give the base pointer to the 'getResult' call along with an
integer array of offsets into that structure for each of the columns.
There might have been a few other minor things (like some notion of
how to handle NULLs), but it was pretty straight-forward from the C
perspective, imv.

Trying to provide alternative formats (I'm guessing you were referring
to something like XML..? Or some complex structure?) would certainly
be a whole different ballgame.

Thanks,

Stephen

> AIUI Kyotaro-san is just suggesting that the app should be able to
> provide a substitute malloc function for use in allocating PGresult
> space (and not, I think, anything else that libpq allocates
> internally). Basically this would allow PGresults to be cleaned up
> with methods other than calling PQclear on each one. It wouldn't
> affect how you'd interact with one while you had it. That seems like
> pretty much exactly what we want for preventing memory leaks in the
> backend; but is it going to be useful for other apps?
>
> 			regards, tom lane
Re: [HACKERS] Allow substitute allocators for PGresult.
On 12/11/2011 07:36, Robert Haas wrote:
> On Sat, Nov 12, 2011 at 12:48 AM, Tom Lane wrote:
>> AIUI Kyotaro-san is just suggesting that the app should be able to
>> provide a substitute malloc function for use in allocating PGresult
>> space (and not, I think, anything else that libpq allocates
>> internally). Basically this would allow PGresults to be cleaned up
>> with methods other than calling PQclear on each one. It wouldn't
>> affect how you'd interact with one while you had it. That seems like
>> pretty much exactly what we want for preventing memory leaks in the
>> backend; but is it going to be useful for other apps?
>
> I think it will.

Maybe I'm just talking nonsense -- I only have a little experience
hacking the pgsql and pdo-pgsql extensions -- but to me it seems like
something that could easily avoid an extra duplication of the data
returned by PQgetvalue. That looks like a pretty nice win.

Cheers
-- 
Matteo Beccati

Development & Consulting - http://www.beccati.com/
Re: [HACKERS] Allow substitute allocators for PGresult.
Heikki Linnakangas writes:
> On 11.11.2011 11:18, Kyotaro HORIGUCHI wrote:
>> For these reasons, I propose to make allocators for PGresult
>> replaceable.
> You could use the resource owner mechanism to track them.

BTW, I just thought of a potentially fatal objection to making
PGresult allocation depend on palloc: libpq is absolutely not prepared
to handle losing control on out-of-memory. While I'm not certain that
its behavior with malloc is entirely desirable either (it tends to
loop in hopes of getting the memory next time), we cannot just plop in
palloc in place of malloc and imagine that we're not breaking it.

This makes me think that Heikki's approach is by far the more tenable
one, so far as dblink is concerned. Perhaps the substitute-malloc idea
is still useful for some other application, but I'm inclined to put
that idea on the back burner until we have a concrete use case for it.

			regards, tom lane
Re: [HACKERS] Allow substitute allocators for PGresult.
Kyotaro HORIGUCHI writes:
> Hello. This message is a proposal of a pair of patches that enables
> the memory allocator for PGresult in libpq to be replaced.

Since there seems to be rough consensus that something like this would
be a good idea, I looked more closely at the details of the patch.
I think the design could use some adjustment.

To start with, the patch proposes exposing some global variables that
affect the behavior of libpq process-wide. This seems like a pretty
bad design, because a single process could contain multiple usages of
libpq with different requirements. As an example, if dblink.c were to
set these variables inside a backend process, it would break usage of
libpq from PL/Perl via DBI. Global variables also tend to be a bad
idea whenever you think about multi-threaded applications --- they
require locking facilities, which are not in this patch.

I think it'd be better to consider the PGresult alloc/free functions
to be a property of a PGconn, which you'd set with a function call
along the lines of

	PQsetResultAllocator(conn, alloc_func, realloc_func, free_func)

after having successfully opened a connection. Then we just have some
more PGconn fields (and I guess PGresult will need a copy of the
free_func pointer) and no new global variables.

I am also feeling dubious about whether it's a good idea to expect the
functions to have exactly the signature of malloc/free. They are
essentially callbacks, and in most places where a library provides for
callbacks, it's customary to include a "void *" passthrough argument
in case the callback needs some context information. I am not sure
that dblink.c would need such a thing, but if we're trying to design a
general-purpose feature, then we probably should have it. The cost
would be having shim functions inside libpq for the default case, but
it doesn't seem likely that they'd cost enough to notice.

The patch lacks any user documentation, which it surely must have if
we are claiming this is a user-visible feature. And I think it could
use some attention to updating code comments, notably the large block
about PGresult space management near the top of fe-exec.c.

Usually, when writing a feature of this sort, it's a good idea to
implement a prototype use-case to make sure you've not overlooked
anything. So I'd feel happier about the patch if it came along with a
patch to make dblink.c use it to prevent memory leaks.

			regards, tom lane
Re: [HACKERS] Syntax for partitioning
Martijn van Oosterhout writes:
> While I agree that explicit partitioning is somewhat of a hack, it's a
> really useful hack. But for me the most important use of partitioning
> is "dropping a billion rows efficiently and getting the disk space
> back".

Right. The only way to make that speedy is for the partition
boundaries to match the desired granularity of data removal. I don't
really see any way that the database can be expected to know what that
is, unless it's told in advance. So AFAICS you really have to have a
declarative way of telling it how to do the partitioning --- it's not
going to be able to infer that automatically.

			regards, tom lane
[HACKERS] trivial patch: foreign table either defines row type
Hello Hackers.

I attached a trivial patch. A foreign table also defines a row type;
I simply added this relkind to the hardcoded check.

I faced this limitation while developing the www fdw feature
(http://wiki.postgresql.org/wiki/WWW_FDW).

-- 
Alexander Soudakov
Software Developer
email: cyga...@gmail.com
skype: asudakov

diff --git a/src/pl/plpgsql/src/pl_comp.c b/src/pl/plpgsql/src/pl_comp.c
index 578cae5..f579d98 100644
--- a/src/pl/plpgsql/src/pl_comp.c
+++ b/src/pl/plpgsql/src/pl_comp.c
@@ -1940,11 +1940,12 @@ build_row_from_class(Oid classOid)
 	classStruct = RelationGetForm(rel);
 	relname = RelationGetRelationName(rel);
 
-	/* accept relation, sequence, view, or composite type entries */
+	/* accept relation, sequence, view, composite type or foreign table entries */
 	if (classStruct->relkind != RELKIND_RELATION &&
 		classStruct->relkind != RELKIND_SEQUENCE &&
 		classStruct->relkind != RELKIND_VIEW &&
-		classStruct->relkind != RELKIND_COMPOSITE_TYPE)
+		classStruct->relkind != RELKIND_COMPOSITE_TYPE &&
+		classStruct->relkind != RELKIND_FOREIGN_TABLE)
 		ereport(ERROR,
 				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 				 errmsg("relation \"%s\" is not a table", relname)));
Re: [HACKERS] Syntax for partitioning
On Thu, Nov 10, 2011 at 10:19:02PM +0100, Dimitri Fontaine wrote:
> Now the aim would be to be able to implement the operation you
> describe by using the new segment map, which is an index pointing to
> sequential ranges of on-disk blocks where the data is known to share
> a common key range over the columns you're segmenting on. I would
> imagine this SQL:
>
>   TRUNCATE foo WHERE year < 2009;
>
> As the on-disk location of the data that qualify this WHERE clause is
> known, it could be possible to (predicate) lock it and bulk remove
> it, unlinking whole segments (1GB) at a time when relevant.

While I agree that explicit partitioning is somewhat of a hack, it's a
really useful hack. But for me the most important use of partitioning
is "dropping a billion rows efficiently and getting the disk space
back". And the biggest problem is always that dropping blocks of a
table requires fixing all the indexes.

Fixing the index on the partition key is a simpler problem: you could
probably prune the btree relatively efficiently. But for all other
indexes there's no better solution than walking the entire index.
However, in the very special case where the drop boundaries explicitly
match the dataset, you can simply drop all the indexes.

Now, if someone comes up with an efficient way to drop a huge number
of rows quickly, then I admit one of the major issues is fixed. But
recovering the disk space is much harder. Yes, recent versions of
Linux come with ways to punch holes in existing files, but that
doesn't make it quick or efficient.

>> While automatic clustering would be nice, it isn't the same thing as
>> partitioning.
>
> That has been my initial reaction to that kind of ideas too. After
> some more time brewing the ideas, I'm not convinced that the use
> cases that usually drives you to the latter can't be solved with the
> former.

I hope so, but I'm not sure I'd like partitioning support to wait on
someone hitting on the right idea.

Have a nice day,
-- 
Martijn van Oosterhout   http://svana.org/kleptog/
> He who writes carelessly confesses thereby at the very outset that he
> does not attach much importance to his own thoughts.
>   -- Arthur Schopenhauer
[HACKERS] Re: [GENERAL] Strange problem with create table as select * from table;
On Sun, Nov 06, 2011 at 09:34:24AM -0500, Tom Lane wrote:
> hubert depesz lubaczewski writes:
>> Any chance of getting the fix in patch format so we could test it on
>> this system?
>
> http://git.postgresql.org/gitweb/?p=postgresql.git;a=patch;h=23998fe99c1220ba3a9eefee194e37ec1f14ae07

Hi,

Just to close the loop: finally today I could restart the database
with the patched 8.4. Bug fixed, all works fine. Thanks a lot.

Best regards,

depesz

-- 
The best thing about modern society is how easy it is to avoid contact
with it.
http://depesz.com/