Re: [HACKERS] Processing long AND/OR lists
My last email was written before reading this. A few episodes of 24 occurred between writing and sending that email. Added slony1-hackers, but didn't remove pgsql-hackers. Feel free to exclude the pgsql lists, as this branch of the conversation seems to be more Slony related than Postgres. On Sun, May 26, 2013 at 10:59 PM, Christopher Browne wrote: > > In Slony 2.1, the issue re-emerged because the ordering of the "action id" > values was lost; the query had previously been implicitly forcing them into > order; we had to add an "ORDER BY" clause, to make the "compressor" work > again. > > http://git.postgresql.org/gitweb/?p=slony1-engine.git;a=blobdiff;f=src/slon/remote_worker.c;h=b1f48043f8e25b4a74a392b0dbceeae8f3e18c27;hp=7fbf67c16f97cb7c3f209cf3be903ea52c4490a9;hb=c4ac435308a78a2db63bf267d401d842c169e87d;hpb=d4612aab78bac5a9836e3e2425c403878f7091c8 > > Commit log says it was fixed between 2.1.2, but from the Slony logs at the time, the version in use was 2.1.2. So > Joking about "640K" aside, it doesn't seem reasonable to expect a truly > enormous query as is generated by the broken forms of this logic to turn > out happily. I'd rather fix Slony (as done in the above patch). > Yes, by all means, fix the application, but that doesn't preclude the argument that the database should be a bit smarter and more efficient, especially if it is easy to do. Best regards, -- Gurjeet Singh http://gurjeet.singh.im/ EnterpriseDB Inc.
Re: [HACKERS] shmem startup and shutdown hooks
Amit Kapila wrote: > > On Saturday, May 25, 2013 12:50 AM Alvaro Herrera wrote: > > It seems that the right place to do this is checkpointer shutdown, i.e. > > when checkpointer is told to close shop it should also invoke various > > modules' shutdown callbacks. There's no hook point there though, so > > we'd need to introduce something new for this specific purpose. > > Do you mean to say that during shutdown of checkpointer, it should call an > interface which does the same as shmem_exit(), which means call shutdown > callbacks of other modules? I mean we should have some other hooking point for modules, one which is guaranteed to be called only during shutdown. The current hook point in use is shared memory reset, which does happen during shutdown; but also happens when a process crashes and postmaster reinits the whole thing. > Why would checkpointer shutdown be a better place to handle this? Because it's the process whose termination defines when a system shutdown takes place. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Planning incompatibilities for Postgres 10.0
On 05/25/2013 05:39 PM, Simon Riggs wrote: > 2. Name the next release after that 10.0 (would have been 9.5). We > declare now that > a) 10.0 will support on-line upgrade from 9.4 (only) > b) various major incompatibilities will be introduced in 10.0 - the > change in release number will indicate to everybody that is the case > c) agree that there will be no pg_upgrade patch from 9.4 to 10.0, so > that we will not be constrained by that While we're talking about changing things, what about: - Switching to single-major-version release numbering. The number of people who say "PostgreSQL 9.x" is amazing; even *packagers* get this wrong and produce "postgresql-9" packages. Witness Amazon Linux's awful PostgreSQL packages for example. Going to PostgreSQL 10.0, 11.0, 12.0, etc with a typical major/minor scheme might be worth considering. - s/cluster/server/g . Just because "cluster" is historical usage doesn't make it any less confusing for users. *dives for asbestos fire suit* -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Processing long AND/OR lists
On Sun, May 26, 2013 at 11:46 AM, Tom Lane wrote: > Josh Berkus writes: > > ***15,000***? I'd say that someone has an application design issue. > > Fixing the stack overflow is a good thing, but that query is never going > > to return ... > Just for the record, it does finish in 5 sec on my laptop. But yes, point taken. > > Yeah, the parser's stack consumption seems like only the tip of the > iceberg here. > True. After getting past the parser, the bigger lists consume exponentially longer times around process_equivalence(). And BTW, a backend stuck in that stack does not respond to pg_cancel/terminate_backend()! Should there be a CHECK_FOR_INTERRUPTS() in process_equivalence(), perhaps invoked only every N calls of that function? > > I find it hard to visualize a use-case for so many AND'ed conditions > anyway. I could believe a lot of OR'd conditions, but very likely it > would be a list that could and should be converted to an IN condition. > As noted earlier, this was seen in the real world, at a customer setup, generated by Slony, a highly regarded community project. Also, the patch addresses the stack limit for long OR clauses as well. Anytime such conditions are generated programmatically, there's always a possibility that the programmer underestimated the amount of data they are generating the query for; just like it happened in Slony's case I quoted. Slony has a compress_actionseq() function that scans a list of numbers and generates the smallest possible WHERE clause. If it sees consecutive numbers that form a range, it turns that range into a BETWEEN clause, and the numbers that don't seem to be in a range are added to the where clause as 'AND col <> n1 AND col <> n2 ...'. It's quite likely that it generates such long lists because it wrongly assumes that the input list is ordered, or at least mostly ordered. Agreed that that function can/should be fixed to do a better job of sorting these numbers before scanning, and also to use the NOT IN construct.
But the point remains that Postgres should be capable of handling this simple construct efficiently. Best regards, -- Gurjeet Singh http://gurjeet.singh.im/ EnterpriseDB Inc.
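For illustration, here is a sketch in Python (not Slony's actual C code; the column name and values are invented) of the two ways such a predicate can be generated: the chained form that stresses the parser's recursive descent, and the flat NOT IN form suggested above:

```python
def chained_and(col, values):
    # One binary AND node per value; PostgreSQL parses this recursively,
    # which is what trips check_stack_depth() around 15,000 elements.
    return " AND ".join(f"{col} <> {v}" for v in values)

def flat_not_in(col, values):
    # Equivalent predicate for a list of non-NULL constants, but parsed
    # as a single flat list node, so parser stack depth stays constant.
    return f"{col} NOT IN ({', '.join(str(v) for v in values)})"

print(chained_and("actionseq", [4, 7, 9]))
# actionseq <> 4 AND actionseq <> 7 AND actionseq <> 9
print(flat_not_in("actionseq", [4, 7, 9]))
# actionseq NOT IN (4, 7, 9)
```

Both render the same logical filter; only the shape of the parse tree differs.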
Re: [HACKERS] Planning incompatibilities for Postgres 10.0
The assumption that we ought to plan expressly for an incompatibility that essentially discards pg_upgrade seems premature, particularly in advance of would-be solutions that, in some cases, mightn't actually work. If pg_upgrade doesn't work, then, at present, the plausible solutions are to either dump and restore, which might take way too long, or use one of the logical replication systems (e.g. - Slony, Londiste, or similar, in the absence of the would-be built-in logical replication). Unfortunately, there are significant scenarios where none of these work, particularly for data warehouse-like systems where the database size is so large that the users cannot afford the disk space to construct a replica. It sure seems premature to intentionally leave that set of users out in the cold.
Re: [HACKERS] Processing long AND/OR lists
This situation falls from a problem that we noticed a mighty long time ago in Slony, where the set of XIDs outstanding gets very large, and, attendant to that, the set of "action id" values by which tuples are being filtered, gets correspondingly large. It happens when there is a long pause in application of replication data, and is commonly the consequence of setting up replication on a very large data table that takes a long time for the initial data copy. At the time, Neil Conway observed this query breakage with a query that was roughly 640K in size, from whence fell jokes to the effect, "who would ever need a query larger than 640K?" The resolution that I introduced at the time was to write a little parser that would recognize sequences of adjacent values and merge them into "BETWEEN A and B" clauses, which would bring the query size back to a reasonable size. In Slony 2.1, the issue re-emerged because the ordering of the "action id" values was lost; the query had previously been implicitly forcing them into order; we had to add an "ORDER BY" clause, to make the "compressor" work again. http://git.postgresql.org/gitweb/?p=slony1-engine.git;a=blobdiff;f=src/slon/remote_worker.c;h=b1f48043f8e25b4a74a392b0dbceeae8f3e18c27;hp=7fbf67c16f97cb7c3f209cf3be903ea52c4490a9;hb=c4ac435308a78a2db63bf267d401d842c169e87d;hpb=d4612aab78bac5a9836e3e2425c403878f7091c8 Joking about "640K" aside, it doesn't seem reasonable to expect a truly enormous query as is generated by the broken forms of this logic to turn out happily. I'd rather fix Slony (as done in the above patch).
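To make the "compressor" idea concrete, here is a hedged Python sketch of the approach described above. The real compress_actionseq() is C code in src/slon/remote_worker.c; the column name log_actionseq and the exact clause shapes here are assumptions for illustration only. Note the explicit sort, which is the step whose absence caused the Slony 2.1 regression:

```python
def compress_actionseq(ids):
    """Collapse a list of action ids into exclusion predicates.

    Runs of consecutive ids become NOT BETWEEN clauses; isolated ids
    become <> comparisons.  Illustrative sketch, not Slony's code.
    """
    if not ids:
        return "TRUE"
    ids = sorted(ids)  # the 2.1 bug: input was wrongly assumed ordered
    runs, start, prev = [], ids[0], ids[0]
    for n in ids[1:]:
        if n == prev + 1:
            prev = n
            continue
        runs.append((start, prev))
        start = prev = n
    runs.append((start, prev))
    parts = []
    for lo, hi in runs:
        if hi - lo >= 2:  # a run of 3+ ids is worth a range clause
            parts.append(f"log_actionseq NOT BETWEEN {lo} AND {hi}")
        else:
            parts.extend(f"log_actionseq <> {n}" for n in range(lo, hi + 1))
    return " AND ".join(parts)

print(compress_actionseq([1, 2, 3, 5, 9, 10, 11, 12]))
# log_actionseq NOT BETWEEN 1 AND 3 AND log_actionseq <> 5 AND log_actionseq NOT BETWEEN 9 AND 12
```

An unsorted input such as [5, 3, 2, 1] still compresses correctly here; without the sort, no run would be detected and every id would emit its own clause, which is how the enormous queries arose.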
Re: [HACKERS] getting rid of freezing
Andres, I was talking this over with Jeff on the plane, and we wanted to be clear on your goals here: are you looking to eliminate the *write* cost of freezing, or just the *read* cost of re-reading already frozen pages? If just the latter, what about just adding a bit to the visibility map to indicate that the page is frozen? That seems simpler than what you're proposing. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
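As a toy model of that suggestion (assuming nothing about PostgreSQL's actual visibility-map layout, which packs bits far more tightly), a per-page map carrying an extra "all-frozen" flag next to "all-visible" could look like:

```python
class VisibilityMap:
    """Toy model: one byte per heap page holding two flag bits."""
    ALL_VISIBLE = 0x1
    ALL_FROZEN = 0x2

    def __init__(self, npages):
        self.bits = bytearray(npages)

    def set_flags(self, page, flags):
        self.bits[page] |= flags

    def clear(self, page):
        # Any modification to the page clears both flags.
        self.bits[page] = 0

    def skip_for_freezing(self, page):
        # Vacuum could skip re-reading pages already marked frozen.
        return bool(self.bits[page] & self.ALL_FROZEN)

vm = VisibilityMap(npages=4)
vm.set_flags(2, VisibilityMap.ALL_VISIBLE | VisibilityMap.ALL_FROZEN)
print(vm.skip_for_freezing(2))  # True
```

The point of the sketch is only that the frozen state can piggyback on the same structure whose bits are already cleared whenever a page changes, which is what makes the read-cost savings cheap to track.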
Re: [HACKERS] Planning incompatibilities for Postgres 10.0
Stephen Frost writes: > btw, has anyone posted the SM API proposal..? Unfortunately, I think I > had to leave before that was hashed out.. There isn't one yet. We think we understand where the pain points are, but there's still a long way to go to have a proposal. regards, tom lane
Re: [HACKERS] Planning incompatibilities for Postgres 10.0
* Josh Berkus (j...@agliodbs.com) wrote: > and it's entirely possible that we'll be able to implement SMs without > breaking pgupgrade. I'd certainly hope so.. It's certainly not obvious, to me at least, why a new SM or supporting any of those features would require breaking pg_upgrade. Perhaps there's something I'm not seeing there, but it had better be a *really* good reason.. btw, has anyone posted the SM API proposal..? Unfortunately, I think I had to leave before that was hashed out.. > First, let's have a few features for which breaking binary compatibility > is a necessity or a clear benefit. Then we'll schedule when to break them. Agreed. Thanks, Stephen
Re: [HACKERS] MVCC catalog access
Perhaps we see little difference in performance because PGPROC has been separated into PGPROC and PGXACT, reducing lock contention when getting snapshot data? By the way, I grabbed a 32-core machine and did some more performance tests with some open connections with XIDs assigned, using pg_cxn v2 given by Robert in his previous mail, to make sure that the snapshots get pretty large. First I ran the simple read test:

$ time for s in `seq 1 1000`
> do
> rm -f ~/bin/pgsql/master/global/pg_internal.init && psql -c 'SELECT 2+2' >/dev/null;
> done

And then the create/drop test. I have done those tests with 250, 500, 1000 and 2000 open connections:

1) 250 open connections
1-1) read test
Round 1: mvcc_catalog_access off: real 0m9.124s, user 0m0.200s, sys 0m0.392s; mvcc_catalog_access on: real 0m9.297s, user 0m0.148s, sys 0m0.444s
Round 2: mvcc_catalog_access off: real 0m8.985s, user 0m0.160s, sys 0m0.372s; mvcc_catalog_access on: real 0m9.244s, user 0m0.240s, sys 0m0.400s
1-2) DDL test (drop and creation of 100,000 objects)
mvcc off: Create: 24554.849, Drop: 29755.146; mvcc on: Create: 26904.755, Drop: 32891.556
mvcc off: Create: 23337.342, Drop: 29921.990; mvcc on: Create: 24533.708, Drop: 31670.840

2) 500 open connections
2-1) read test
Round 1: mvcc_catalog_access off: real 0m9.123s, user 0m0.200s, sys 0m0.396s; mvcc_catalog_access on: real 0m9.627s, user 0m0.156s, sys 0m0.460s
Round 2: mvcc_catalog_access off: real 0m9.221s, user 0m0.316s, sys 0m0.392s; mvcc_catalog_access on: real 0m9.592s, user 0m0.160s, sys 0m0.484s
2-2) DDL test (drop and creation of 100,000 objects)
mvcc off: Create: 25872.886, Drop: 31723.921; mvcc on: Create: 27076.429, Drop: 33674.336
mvcc off: Create: 24039.456, Drop: 30434.019; mvcc on: Create: 29105.713, Drop: 33821.170

3) 1000 open connections
3-1) read test
Round 1: mvcc_catalog_access off: real 0m9.240s, user 0m0.192s, sys 0m0.396s; mvcc_catalog_access on: real 0m9.674s, user 0m0.236s, sys 0m0.440s
Round 2: mvcc_catalog_access off: real 0m9.302s, user 0m0.308s, sys 0m0.392s; mvcc_catalog_access on: real 0m9.746s, user 0m0.204s, sys 0m0.436s
3-2) DDL test (drop and creation of 100,000 objects)
mvcc off: Create: 25563.705, Drop: 31747.451; mvcc on: Create: 33281.246, Drop: 36618.166
mvcc off: Create: 28178.210, Drop: 30550.166; mvcc on: Create: 31849.825, Drop: 36831.245

4) 2000 open connections
4-1) read test
Round 1: mvcc_catalog_access off: real 0m9.066s, user 0m0.128s, sys 0m0.420s; mvcc_catalog_access on: real 0m9.978s, user 0m0.168s, sys 0m0.412s
Round 2: mvcc_catalog_access off: real 0m9.113s, user 0m0.152s, sys 0m0.444s; mvcc_catalog_access on: real 0m9.974s, user 0m0.176s, sys 0m0.436s
More or less the same results as previously, with a 10% performance drop.
4-2) DDL test (drop and creation of 100,000 objects)
mvcc off: Create: 28708.095 ms, Drop: 32510.057 ms; mvcc on: Create: 39987.815 ms, Drop: 43157.006 ms
mvcc off: Create: 28409.853 ms, Drop: 31248.163 ms; mvcc on: Create: 41669.829 ms, Drop: 44645.109 ms

For the read tests, we can see a performance drop of up to 10% for 2000 connections. For the write tests, we can see a performance drop of 9~10% for 250 connections, up to a 30% performance drop with 2000 connections. We rarely see users drop that many objects at the same time with so many open transactions; they'll switch to a connection pooler before opening that many connections to the server. I am not sure that such a performance drop is acceptable as-is, but perhaps it is if we consider the functionality gain we can have thanks to MVCC catalogs. -- Michael
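The quoted read-test slowdown can be checked from the reported figures; a small helper, applied here to the 2000-connection round-1 numbers:

```python
def pct_slowdown(t_off, t_on):
    # Relative increase in elapsed time with MVCC catalog access on.
    return 100.0 * (t_on - t_off) / t_off

# 2000 connections, round 1: off 9.066 s, on 9.978 s
print(round(pct_slowdown(9.066, 9.978), 1))  # 10.1
```

This matches the roughly 10% drop stated for the read test; the DDL numbers show a much larger relative increase at 2000 connections.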
Re: [HACKERS] Parallel Sort
* Noah Misch (n...@leadboat.com) wrote: > In particular, implementing a GPU-based strcoll() for bttextcmp > sounds like quite a project in its own right. It also wouldn't likely be helpful... To be able to use a GPU effectively, last I looked, you need to be able to move a large chunk of data to the GPU's memory, operate on it, and then bulk move the results back because the cost of moving data between main memory and GPU memory is very high. > Those are matters we would eventually need to address as we parallelize more > things, so I regard confronting them as an advantage. Among other benefits, > this project is a vehicle for emplacing some infrastructure without inviting > the full complexity entailed by loftier goals. Agreed. Thanks, Stephen
Re: [HACKERS] background worker and normal exit
On Mon, May 27, 2013 at 2:04 AM, Bernd Helmle wrote: > > > --On 26. Mai 2013 11:38:55 +0900 Michael Paquier < > michael.paqu...@gmail.com> wrote: > > >> This flag only prevents a worker from restarting after a crash. To solve >> your problem, you could as well allow your process to restart and put it >> in indefinite sleep if the server is not in recovery, such that it will do >> nothing in your case. >> > > Hmm so you can't have workers just "doing something once" and exit? I have > to admit, i didn't follow bgworkers closely in the past, but could you give > a short insight on why this is currently not possible? > Bgworkers are expected to run all the time, and will be restarted each time they exit cleanly with status code 0. Note that they are *still* restarted immediately even if bgw_restart_time is set to BGW_NEVER_RESTART or to a certain value. There are actually two ways you can use to have them perform a one-time task: - put it in indefinite sleep after the task is accomplished - set bgw_restart_time to BGW_NEVER_RESTART and have the bgworker exit with a non-zero status code. Regards, -- Michael
Re: [HACKERS] Patch proposal: query result history in psql
Polished version of the patch. * The feature is disabled by default, enabled by the backslash command \ans. Additionally, \ansclean cleans the result history. * Escaping is applied when building the COPY IN string. This patch is a diff between master:230e92c and https://github.com/maciekgajewski/psql-ans.git:2997f9c Maciek On 16 May 2013 19:18, David E. Wheeler wrote: > On May 16, 2013, at 7:02 AM, Dimitri Fontaine wrote: > >>> I find this feature quite useful, but I understand that my use case >>> may be quite unique. >> >> Just to say that I too find what you've done quite useful. Please add >> your patch to the next commit fest for consideration in 9.4! > > FYI, you can add it here: > > https://commitfest.postgresql.org/action/commitfest_view?id=18 > > Best, > > David > psql-ans.2.diff Description: Binary data
Re: [HACKERS] Parallel Sort
On Fri, May 24, 2013 at 01:13:21PM -0500, Jim Nasby wrote: > On 5/13/13 9:28 AM, Noah Misch wrote: >> It would be great if one client session could take advantage of multiple CPU >> cores. EnterpriseDB wishes to start the trek into this problem space for 9.4 >> by implementing parallel internal (i.e. not spilling to disk) sort. This >> touches on a notable subset of the infrastructure components we'll need for >> parallel general query. My intent is to map out the key design topics, hear >> about critical topics I hadn't considered, and solicit feedback on the >> quality >> of the high-level plan. Full designs for key pieces will come later. > > Have you considered GPU-based sorting? I know there's been discussion in the > past. I had considered it briefly. Parallel sort is mainly valuable for expensive comparison operators. Sorting int4, for example, is too cheap for parallelism to be compelling. (In my test build of a 16 GiB int4 index, sorting took 11s of the 391s build time.) However, expensive operators are also liable to be difficult to reimplement for the GPU. In particular, implementing a GPU-based strcoll() for bttextcmp sounds like quite a project in its own right. > To me, the biggest advantage of GPU sorting is that most of the concerns > you've laid out go away; a backend that needs to sort just throws data at the > GPU to do the actual sorting; all the MVCC issues and what not remain within > the scope of a single backend. Those are matters we would eventually need to address as we parallelize more things, so I regard confronting them as an advantage. Among other benefits, this project is a vehicle for emplacing some infrastructure without inviting the full complexity entailed by loftier goals. Thanks, nm -- Noah Misch EnterpriseDB http://www.enterprisedb.com
[HACKERS] Add regression tests for SET xxx
Hi, Please find attached a patch to take code-coverage of SET (SESSION / SEED / TRANSACTION / DATESTYLE / TIME ZONE) (src/backend/commands/variable.c) from 65% to 82%. Any and all feedback is welcome. -- Robins Tharakan regress_variable.patch Description: Binary data
Re: [HACKERS] background worker and normal exit
--On 26. Mai 2013 11:38:55 +0900 Michael Paquier wrote: This flag only prevents a worker from restarting after a crash. To solve your problem, you could as well allow your process to restart and put it in indefinite sleep if the server is not in recovery, such that it will do nothing in your case. Hmm so you can't have workers just "doing something once" and exit? I have to admit, i didn't follow bgworkers closely in the past, but could you give a short insight on why this is currently not possible? -- Thanks Bernd
Re: [HACKERS] [BUGS] COPY .... (FORMAT binary) syntax doesn't work
Heikki Linnakangas writes: > On 26.05.2013 04:31, Simon Riggs wrote: >> This new format does not [work:] >> COPY pgbench_accounts FROM '/tmp/acc' (FORMAT BINARY); >> ERROR: syntax error at or near "BINARY" at character 47 > This seems to work: > --- a/src/backend/parser/gram.y > +++ b/src/backend/parser/gram.y > @@ -2528,3 +2528,7 @@ copy_generic_opt_elem: > { > $$ = makeDefElem($1, $2); > } > + | ColLabel BINARY > + { > + $$ = makeDefElem($1, (Node *) > makeString("binary")); > + } That only fixes things for the specific case of FORMAT BINARY. I think it would be better to generalize ColId_or_Sconst. This one-liner works, but it's pretty ugly: *** gram.y~ Sun May 26 11:58:42 2013 --- gram.y Sun May 26 12:00:03 2013 *** opt_encoding: *** 1548,1553 --- 1548,1554 ColId_or_Sconst: ColId { $$ = $1; } + | type_func_name_keyword { $$ = pstrdup($1); } | Sconst { $$ = $1; } ; More readable would be to invent an intermediate nonterminal falling between ColId and ColLabel, whose expansion would be "IDENT | unreserved_keyword | col_name_keyword | type_func_name_keyword", and then replace ColId_or_Sconst with whatever-we-call-that_or_Sconst. Any thoughts about a name for that new nonterminal? BTW, I tried just replacing ColId with ColLabel here, and that *almost* works, but there are some places where we can see either ColId_or_Sconst or DEFAULT. I don't know of any sane way to express "all reserved keywords except DEFAULT" to bison, so the best we can realistically do is to accept all not-fully-reserved keywords in these places. regards, tom lane
Re: [HACKERS] Using indexes for partial index builds
On 3/13/13 7:10 PM, Ants Aasma wrote: On Thu, Mar 14, 2013 at 1:51 AM, Jim Nasby wrote: On 3/12/13 9:10 AM, Ants Aasma wrote: I have a feeling this is an increasingly widespread pattern with a proliferation of mobile devices that need syncing. If you're doing that with timestamps you're asking for a slew of problems, not all of which can be solved by just adding some random amount of fluff to your criteria. A queue-based solution is often a more robust solution, even if it is harder to implement. Do you know of anything else besides the obvious issues with having to use one clocksource and ensure that it produces monotonic timestamps? Those issues aren't enough? :) My first reaction was also that this is what queues are meant for, but the proposed solution seems to work surprisingly well. Unless you can point at some glaring hole that I'm missing I would say that it is good enough for a rather wide range of syncing problems. It depends on how critical it is not to miss events. Timestamps in tables are always taken before transaction commit, so you can sometimes have a significant delay. You have to make certain the timestamp can't be changed, and that rows can't be deleted. It's also tricky to make certain you don't see any events twice. Given all that, and how easy PgQ is to use, I don't understand why anyone would go with timestamps... -- Jim C. Nasby, Data Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net
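The commit-delay hazard Jim describes can be shown with a tiny simulation; the timestamp is taken inside the transaction, but the row only becomes visible at commit, so a poller that advances its watermark can skip the row permanently (all names and numbers here are invented for illustration):

```python
def poll(rows, watermark, now):
    # A poller sees only committed rows, filtered by their
    # in-transaction timestamp against the last watermark.
    visible = [r for r in rows if r["commit"] is not None and r["commit"] <= now]
    return [r for r in visible if r["stamp"] > watermark]

rows = [
    {"id": "A", "stamp": 10, "commit": 12},  # long transaction: stamped at 10, committed at 12
    {"id": "B", "stamp": 11, "commit": 11},
]

seen = poll(rows, watermark=0, now=11)     # only B is committed yet
watermark = max(r["stamp"] for r in seen)  # watermark advances to 11
later = poll(rows, watermark, now=13)      # A is now committed, but stamp 10 <= 11
print([r["id"] for r in later])            # []  -- row A is silently lost
```

The usual workarounds (subtracting a fudge interval from the watermark) only shrink the window; a queue keyed on commit order, as with PgQ, closes it.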
Re: [HACKERS] Processing long AND/OR lists
Josh Berkus writes: > ***15,000***? I'd say that someone has an application design issue. > Fixing the stack overflow is a good thing, but that query is never going > to return ... Yeah, the parser's stack consumption seems like only the tip of the iceberg here. I find it hard to visualize a use-case for so many AND'ed conditions anyway. I could believe a lot of OR'd conditions, but very likely it would be a list that could and should be converted to an IN condition. regards, tom lane
Re: [HACKERS] [BUGS] COPY .... (FORMAT binary) syntax doesn't work
On 26.05.2013 04:31, Simon Riggs wrote: This works fine... COPY pgbench_accounts TO '/tmp/acc' BINARY; This new format does not: COPY pgbench_accounts FROM '/tmp/acc' (FORMAT BINARY); ERROR: syntax error at or near "BINARY" at character 47 which looks like I've mistyped something. Until you realise that this statement gives a completely different error message. COPY pgbench_accounts FROM '/tmp/acc' (FORMAT anyname); ERROR: COPY format "anyname" not recognized and we also note that there are no examples in the docs, nor regression tests to cover this situation. So I conclude that this hasn't ever worked since it was introduced in 9.0. The cause is that there is an overlap between the old and the new COPY syntax, relating to the word BINARY. It's the grammar generating the error, not post parse analysis. Hmm, the problem is that BINARY is a type_func_name_keyword, so it doesn't match the ColId rule used to capture the format argument. My attempts to fix that look pretty ugly, so I'm not even going to post them. I can stop the error on binary by causing errors on csv and text, obviously not a fix. Any grammar based fix looks like it would restrict the list of formats, which breaks the original intention of the syntax change. This seems to work: --- a/src/backend/parser/gram.y +++ b/src/backend/parser/gram.y @@ -2528,3 +2528,7 @@ copy_generic_opt_elem: { $$ = makeDefElem($1, $2); } + | ColLabel BINARY + { + $$ = makeDefElem($1, (Node *) makeString("binary")); + } Am I missing something? - Heikki
Re: [HACKERS] Processing long AND/OR lists
On 05/25/2013 09:56 AM, Gurjeet Singh wrote: > When Postgres encounters a long list of AND/OR chains, it errors out at > check_stack_depth() after a limit of a few thousand. At around 10,000 > elements, the recursion at assign_expr_collations() causes the error. But > at a little higher element count, around 15,000, the recursion check errors > out a little earlier, in the stack around transformAExprAnd(). The test > queries were generated using the attached test.sh script. ***15,000***? I'd say that someone has an application design issue. Fixing the stack overflow is a good thing, but that query is never going to return ... -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] Planning incompatibilities for Postgres 10.0
> Not sure which ones Simon meant, but at least any new/better > storage manager would seem to me to be requiring > a non-pg_upgrade upgrade path unless we require the storage manager > to also include a parallel implementation of pg_upgrade. Isn't this a bit of horse-cart inversion here? We just hashed out a tentative, incomplete pseudo-spec for storage managers *yesterday*. We don't have a complete spec at this point, let alone a development plan, and it's entirely possible that we'll be able to implement SMs without breaking pgupgrade. It's also not at all clear that we can develop SMs in less than 2 years. I tend to think it unlikely. First, let's have a few features for which breaking binary compatibility is a necessity or a clear benefit. Then we'll schedule when to break them. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] View Index and UNION
Stefan Keller writes: > Given following schema: > 1. TABLE a and TABLE b, each with INDEX on attribute geom. > 2. A VIEW with union: > CREATE VIEW myview AS > SELECT * FROM a > UNION > SELECT * FROM b; > 3. And a simple query with KNN index and a coordinate "mypos" : > SELECT * FROM myview > ORDER BY ST_Geomfromtext(mypos) <-> myview.geom I think this would work out-of-the-box in 9.1 or later, if you made the view use UNION ALL instead of UNION. regards, tom lane
Re: [HACKERS] Planning incompatibilities for Postgres 10.0
On 05/26/2013 04:22 PM, Bruce Momjian wrote: > On Sun, May 26, 2013 at 09:18:11AM -0400, Bruce Momjian wrote: >> On Sun, May 26, 2013 at 10:53:37AM +0100, Simon Riggs wrote: I consider this thread to be not thought-through, obviously. >>> My proposal has had lots of serious consideration, but that is not the >>> topic of this thread. >>> >>> The title of the thread is a general one, with a clear objective. >>> >>> I'm looking for a way forwards that allows us to introduce the changes >>> that many have proposed and which regrettably result in >>> incompatibilities. If we have no plan I think its likely it will never >>> happen and it is currently blocking useful change. >>> >>> Please explain what you consider to be a better plan, so we can judge >>> all proposals together. >> I agree with the idea of using logical replication as a way to do >> pg_upgrade version-breaking releases. What I don't know is what >> incompatible changes are pending that would require this. > Sorry I was unclear. When I said "not thought-through", I meant, you > need to start with the _reason_ we need to break pg_upgrade in an > upcoming version, then we can start to plan how to do it. The logical > replication idea is a good one for getting us through pg_upgrade > version-breaking releases. > > I am fine with breaking pg_upgrade, but I just don't see the pending > reason at this point. Not sure which ones Simon meant, but at least any new/better storage manager would seem to me to be requiring a non-pg_upgrade upgrade path unless we require the storage manager to also include a parallel implementation of pg_upgrade. The family of possible storage managers here would include column stores, distributed / partitioned / replicated memory-only / index-structured / ... storages, which all could have advantages in certain situations and which all need an upgrade path.
While you could do this using a sequence of first pg_upgrading and then doing some internal data migration to the new storage manager, doing it in one go would be much smoother. -- Hannu Krosing PostgreSQL Consultant Performance, Scalability and High Availability 2ndQuadrant Nordic OÜ
Re: [HACKERS] Planning incompatibilities for Postgres 10.0
On Sun, May 26, 2013 at 09:18:11AM -0400, Bruce Momjian wrote: > On Sun, May 26, 2013 at 10:53:37AM +0100, Simon Riggs wrote: > > > I consider this thread to be not thought-through, obviously. > > > > My proposal has had lots of serious consideration, but that is not the > > topic of this thread. > > > > The title of the thread is a general one, with a clear objective. > > > > I'm looking for a way forwards that allows us to introduce the changes > > that many have proposed and which regrettably result in > > incompatibilities. If we have no plan I think its likely it will never > > happen and it is currently blocking useful change. > > > > Please explain what you consider to be a better plan, so we can judge > > all proposals together. > > I agree with the idea of using logical replication as a way to do > pg_upgrade version-breaking releases. What I don't know is what > incompatible changes are pending that would require this. Sorry I was unclear. When I said "not thought-through", I meant, you need to start with the _reason_ we need to break pg_upgrade in an upcoming version, then we can start to plan how to do it. The logical replication idea is a good one for getting us through pg_upgrade version-breaking releases. I am fine with breaking pg_upgrade, but I just don't see the pending reason at this point. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Re: [HACKERS] Planning incompatibilities for Postgres 10.0
On Sun, May 26, 2013 at 10:53:37AM +0100, Simon Riggs wrote: > > I consider this thread to be not thought-through, obviously. > > My proposal has had lots of serious consideration, but that is not the > topic of this thread. > > The title of the thread is a general one, with a clear objective. > > I'm looking for a way forwards that allows us to introduce the changes > that many have proposed and which regrettably result in > incompatibilities. If we have no plan I think its likely it will never > happen and it is currently blocking useful change. > > Please explain what you consider to be a better plan, so we can judge > all proposals together. I agree with the idea of using logical replication as a way to do pg_upgrade version-breaking releases. What I don't know is what incompatible changes are pending that would require this. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] getting rid of freezing
On 05/25/2013 01:14 PM, Simon Riggs wrote: > On 24 May 2013 17:00, Robert Haas wrote: >> On Fri, May 24, 2013 at 11:29 AM, Robert Haas wrote: >>> On Fri, May 24, 2013 at 10:53 AM, Andres Freund >>> wrote: > [all-visible cannot restore hint bits without FPI because of torn pages] I haven't thought about this sufficiently yet. I think we might have a chance of working around this, let me ponder a bit. >>> Yeah. I too feel like there might be a solution. But I don't >>> have something specific in mind, yet anyway. >> One thought I had is that it might be beneficial to freeze when a page >> ceases to be all-visible, rather than when it becomes all-visible. >> Any operation that makes the page not-all-visible is going to emit an >> FPI anyway, so we don't have to worry about torn pages in that case. >> Under such a scheme, we'd have to enforce the rule that xmin and xmax >> are ignored for any page that is all-visible; and when a page ceases >> to be all-visible, we have to go back and really freeze the >> pre-existing tuples. I think we might be able to use the existing >> all_visible_cleared/new_all_visible_cleared flags to trigger this >> behavior, without adding anything new to WAL at all. > I like the idea but it would mean we'd have to freeze in the > foreground path rather than in a background path. > > Have we given up on the double buffering idea to remove FPIs > completely? If we did that, then this wouldn't work. > > Anyway, I take it the direction of this idea is that "we don't need a > separate freezemap, just use the vismap". That seems to be forcing > ideas down a particular route we may regret. I'd rather just keep > those things separate, even if we manage to merge the WAL actions for > most of the time. > > > Some other related thoughts: > > ISTM that if we really care about keeping xids for debug purposes that > it could be a parameter. For the mainline, we just freeze blocks at > the same time we do page pruning. 
> > I think the right way is actually to rethink and simplify all this > complexity of Freezing/Pruning/Hinting/Visibility I think that this xmin, xmax business is mainly leftovers from the time when PostgreSQL was a full history database. If we are happy to decide that we do not want to resurrect this feature, at least not the same way, then freezing at the earliest or most convenient possibility seems the way to go. The "forensic" part has always been just a nice side effect of this design and not the main design consideration. -- Hannu Krosing PostgreSQL Consultant Performance, Scalability and High Availability 2ndQuadrant Nordic OÜ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
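Hannu's point about losing the "forensic" xids can be illustrated with a short psql session. This is only a sketch (the table name is mine, and it needs a live server to run); on the releases current at the time of this thread (pre-9.4), an aggressive freeze rewrites xmin to the permanent FrozenTransactionId, which is exactly the information loss under discussion:

```sql
-- Hypothetical demo table.
CREATE TABLE freeze_demo (id int);
INSERT INTO freeze_demo VALUES (1);

-- xmin is still the inserting transaction's xid:
SELECT xmin, id FROM freeze_demo;

VACUUM FREEZE freeze_demo;

-- After the freeze (pre-9.4 behaviour), xmin reads as 2, the
-- permanent FrozenTransactionId; the original inserting xid,
-- i.e. the forensic information, is gone.
SELECT xmin, id FROM freeze_demo;
```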
Re: [HACKERS] View Index and UNION
I appear to have been able to replicate what you are talking about, but it required explicitly binding the order by in different ways. See attached files. William King Senior Engineer Quentus Technologies, INC 1037 NE 65th St Suite 273 Seattle, WA 98115 Main: (877) 211-9337 Office: (206) 388-4772 Cell: (253) 686-5518 william.k...@quentustech.com On 05/26/2013 02:22 AM, Stefan Keller wrote: > Yes, it actually does, but the planner chooses a seq scan to prepare for that. > > -S. > > 2013/5/26 William King : >> Could this scenario not be handled by a step that orders the two tables >> independently, then for the view interleaves the presorted results? >> Merging two sorted sets into a single sorted set is usually a trivial >> task, and it could still take advantage of the existing indexes. >> >> William King >> Senior Engineer >> Quentus Technologies, INC >> 1037 NE 65th St Suite 273 >> Seattle, WA 98115 >> Main: (877) 211-9337 >> Office: (206) 388-4772 >> Cell: (253) 686-5518 >> william.k...@quentustech.com >> >> On 05/25/2013 05:35 PM, Stefan Keller wrote: >>> Hi >>> >>> I've encountered a fundamental problem which - to me - can only be >>> solved with a (future/possible) real index on views in PostgreSQL >>> (like those that already exist in MS SQL Server and Oracle): >>> >>> Given the following schema: >>> >>> 1. TABLE a and TABLE b, each with INDEX on attribute geom. >>> >>> 2. A VIEW with union: >>> >>> CREATE VIEW myview AS >>> SELECT * FROM a >>> UNION >>> SELECT * FROM b; >>> >>> 3. And a simple query with KNN index and a coordinate "mypos" : >>> >>> SELECT * FROM myview >>> ORDER BY ST_Geomfromtext(mypos) <-> myview.geom >>> >>> Now, the problem is that for the "order by" it is not enough that >>> each of the two tables calculates the ordering separately: We want a >>> total ordering over all involved tables! >>> >>> In fact, the planner realizes that and chooses a seq scan over all >>> tuples of table a and b - which is slow and suboptimal! 
>>> >>> To me, that's a use case where we would wish to have a distinct index on >>> views. >>> >>> Any opinions on this? >>> >>> Yours, Stefan >>> >>> >> >> >> -- >> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) >> To make changes to your subscription: >> http://www.postgresql.org/mailpref/pgsql-hackers

quentusrex=# \i sql/view_index_union.sql
CREATE TABLE
CREATE INDEX
CREATE TABLE
CREATE INDEX
INSERT 0 5
INSERT 0 5
CREATE VIEW

                                               QUERY PLAN
--------------------------------------------------------------------------------------------------------
 Index Scan using named_a_idx on a  (cost=0.41..4329.78 rows=51291 width=36) (actual time=0.078..33.854 rows=5 loops=1)
 Total runtime: 36.226 ms
(2 rows)

                                               QUERY PLAN
--------------------------------------------------------------------------------------------------------
 Index Scan using named_b_idx on b  (cost=0.41..4385.78 rows=51291 width=36) (actual time=0.036..27.166 rows=5 loops=1)
 Total runtime: 29.418 ms
(2 rows)

                                               QUERY PLAN
--------------------------------------------------------------------------------------------------------
 Sort  (cost=27371.05..27627.51 rows=102582 width=36) (actual time=435.566..535.213 rows=10 loops=1)
   Sort Key: a.named
   Sort Method: external merge  Disk: 4576kB
   ->  Unique  (cost=14230.75..15000.12 rows=102582 width=36) (actual time=75.540..131.131 rows=10 loops=1)
         ->  Sort  (cost=14230.75..14487.21 rows=102582 width=36) (actual time=75.539..102.016 rows=10 loops=1)
               Sort Key: a.id, a.named
               Sort Method: external merge  Disk: 4584kB
               ->  Append  (cost=0.00..2885.64 rows=102582 width=36) (actual time=0.005..22.103 rows=10 loops=1)
                     ->  Seq Scan on a  (cost=0.00..929.91 rows=51291 width=36) (actual time=0.005..6.389 rows=5 loops=1)
                     ->  Seq Scan on b  (cost=0.00..929.91 rows=51291 width=36) (actual time=0.003..5.811 rows=5 loops=1)
 Total runtime: 541.763 ms
(11 rows)

                                               QUERY PLAN
--------------------------------------------------------------------------------------------------------
 Sort  (cost=26345.23..26601.69 rows=102582 width=36) (actual time=435.116..534.755 rows=10 loops=1)
   Sort Key: a.named
   Sort Method: external merge  Disk: 4576kB
   ->  Unique  (cost=14230.75..15000.12 rows=102582 width=36) (actual time=75.025..130.706 rows=10 loops=1)
         ->  Sort (c
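For what it's worth, the merge step William describes (ordering each table independently and interleaving the presorted results) can be approximated by hand today. The sketch below is mine, not from the attached files; it assumes the thread's schema (tables a and b with KNN-capable GiST indexes on geom, PostGIS installed, and a placeholder point literal) and pushes the KNN ORDER BY plus a LIMIT into each branch so both indexes can be used, with UNION ALL avoiding the Unique/Sort steps visible in the plans:

```sql
-- Hand-merged variant (sketch only; 'POINT(0 0)' stands in for mypos):
SELECT * FROM (
    (SELECT * FROM a
     ORDER BY ST_GeomFromText('POINT(0 0)') <-> a.geom
     LIMIT 10)
    UNION ALL
    (SELECT * FROM b
     ORDER BY ST_GeomFromText('POINT(0 0)') <-> b.geom
     LIMIT 10)
) u
ORDER BY ST_GeomFromText('POINT(0 0)') <-> u.geom
LIMIT 10;
```

Each parenthesised branch can become an index scan on its own KNN index, so the outer sort only has to merge two small pre-ordered sets. Note this is only equivalent to the view's query if the duplicate elimination implied by plain UNION is not actually needed.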
Re: [HACKERS] Planning incompatibilities for Postgres 10.0
On 25 May 2013 21:44, Bruce Momjian wrote: > On Sat, May 25, 2013 at 10:39:30AM +0100, Simon Riggs wrote: >> There are a number of changes we'd probably like to make to the way >> things work in Postgres. This thread is not about discussing what >> those are, just to say that requirements exist and have been discussed >> in various threads over time. >> >> The constraint on such changes is that we've decided that we must have >> an upgrade path from release to release. >> >> So I'd like to make a formal suggestion of a plan for how we cope with this: >> >> 1. Implement online upgrade in 9.4 via the various facilities we have >> in-progress. That looks completely possible. >> >> 2. Name the next release after that 10.0 (would have been 9.5). We >> declare now that >> a) 10.0 will support on-line upgrade from 9.4 (only) >> b) various major incompatibilities will be introduced in 10.0 - the >> change in release number will indicate to everybody that is the case >> c) agree that there will be no pg_upgrade patch from 9.4 to 10.0, so >> that we will not be constrained by that > > Assuming online upgrade is going to require logical replication, you are > also assuming 2x storage as you need to have a second cluster to perform > the upgrade. The people that want online upgrade already have 1+ other systems to do this with. > pg_upgrade would still be needed to upgrade a cluster > in-place. > This sounds like, "I created a new tool which does some of what the old > tool does. Let's break the old tool to allow some unspecified changes I > might want to make." I haven't argued against pg_upgrade in general, nor said anything about breaking it. I proposed that we don't support a pg_upgrade path between two near-future releases, as a way of introducing incompatibilities. After that, we would continue to use pg_upgrade for later releases. Logical replication is being developed, which gives us a complete code path for doing what we'd need to do. 
The most important thing is we wouldn't need to develop any other code that exists just for upgrade. Writing special code just for pg_upgrade will take a lot of work. Running that code would mean pg_upgrade would touch the actual database, which would be down for a long time while it runs. And if it hits a bug during or after, you're hosed. So you'd need to take a full backup before you started the process, probably storing it on disk somewhere and so you would need x2 disk space with this route also. Specialised code is less well tested, which means bugs are more likely to occur and tends to perform more poorly. Not only that, but the first person to want an incompatibility gets to write all the code needed and take responsibility for the bugs. I can't comment for others, but I can say I would not personally choose that route - it looks both expensive and risky. > I consider this thread to be not thought-through, obviously. My proposal has had lots of serious consideration, but that is not the topic of this thread. The title of the thread is a general one, with a clear objective. I'm looking for a way forwards that allows us to introduce the changes that many have proposed and which regrettably result in incompatibilities. If we have no plan I think its likely it will never happen and it is currently blocking useful change. Please explain what you consider to be a better plan, so we can judge all proposals together. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] View Index and UNION
Yes, it actually does, but the planner chooses a seq scan to prepare for that. -S. 2013/5/26 William King : > Could this scenario not be handled by a step that orders the two tables > independently, then for the view interleaves the presorted results? > Merging two sorted sets into a single sorted set is usually a trivial > task, and it could still take advantage of the existing indexes. > > William King > Senior Engineer > Quentus Technologies, INC > 1037 NE 65th St Suite 273 > Seattle, WA 98115 > Main: (877) 211-9337 > Office: (206) 388-4772 > Cell: (253) 686-5518 > william.k...@quentustech.com > > On 05/25/2013 05:35 PM, Stefan Keller wrote: >> Hi >> >> I've encountered a fundamental problem which - to me - can only be >> solved with a (future/possible) real index on views in PostgreSQL >> (like those that already exist in MS SQL Server and Oracle): >> >> Given the following schema: >> >> 1. TABLE a and TABLE b, each with INDEX on attribute geom. >> >> 2. A VIEW with union: >> >> CREATE VIEW myview AS >> SELECT * FROM a >> UNION >> SELECT * FROM b; >> >> 3. And a simple query with KNN index and a coordinate "mypos" : >> >> SELECT * FROM myview >> ORDER BY ST_Geomfromtext(mypos) <-> myview.geom >> >> Now, the problem is that for the "order by" it is not enough that >> each of the two tables calculates the ordering separately: We want a >> total ordering over all involved tables! >> >> In fact, the planner realizes that and chooses a seq scan over all >> tuples of table a and b - which is slow and suboptimal! >> >> To me, that's a use case where we would wish to have a distinct index on >> views. >> >> Any opinions on this? >> >> Yours, Stefan >> >> > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] COPY .... (FORMAT binary) syntax doesn't work
This works fine... COPY pgbench_accounts TO '/tmp/acc' BINARY; This new format does not: COPY pgbench_accounts FROM '/tmp/acc' (FORMAT BINARY); ERROR: syntax error at or near "BINARY" at character 47 which looks like I've mistyped something. Until you realise that this statement gives a completely different error message: COPY pgbench_accounts FROM '/tmp/acc' (FORMAT anyname); ERROR: COPY format "anyname" not recognized and we also note that there are no examples in the docs, nor regression tests to cover this situation. So I conclude that this hasn't ever worked since it was introduced in 9.0. The cause is that there is an overlap between the old and the new COPY syntax, relating to the word BINARY. It's the grammar generating the error, not post-parse analysis. My attempts to fix that look pretty ugly, so I'm not even going to post them. I can stop the error on binary by causing errors on csv and text, obviously not a fix. Any grammar-based fix looks like it would restrict the list of formats, which breaks the original intention of the syntax change. I'm advised that COPY pgbench_accounts FROM '/tmp/acc' (FORMAT 'BINARY'); works fine. Though that is not documented and I doubt anyone much uses that. My conclusion is that we should *not* fix this bug, but just alter the manual slightly to explain what the correct usage is (use quotes around 'binary'). The reason for that suggestion is that nobody ever reported this bug, so either few people use binary mode or they use the old syntax. Of course, that is not a normal suggestion, so feel free to walk up and down my spine with boots on for suggesting it. Thoughts? -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
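To put the three variants from the message side by side (nothing assumed here beyond what Simon reports; running them needs a cluster with the pgbench tables loaded):

```sql
-- Old pre-9.0 syntax: works.
COPY pgbench_accounts TO '/tmp/acc' BINARY;

-- New option-list syntax with a bare keyword: rejected by the grammar.
-- COPY pgbench_accounts FROM '/tmp/acc' (FORMAT BINARY);
-- ERROR:  syntax error at or near "BINARY" at character 47

-- Quoting the format name sidesteps the keyword clash: works.
COPY pgbench_accounts FROM '/tmp/acc' (FORMAT 'binary');
```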