Re: [HACKERS] Parallel safety tagging of extension functions
On Fri, May 20, 2016 at 10:30 PM, Peter Eisentraut wrote:
> On 5/20/16 7:37 PM, Robert Haas wrote:
>> I guess my first question is whether we have consensus on the release
>> into which we should put this. Some people (Noah, among others)
>> thought it should wait because we're after feature freeze, while
>> others thought we should do it now. If we're going to try to get this
>> into 9.6, I'll work on reviewing this sooner rather than later, but if
>> we're not going to do that I'm going to postpone dealing with it until
>> after we branch.
>
> Sounds to me that this is part of the cleanup of a 9.6 feature and should be
> in that release.

Yes, I agree.

By the way, the patch completely ignores the fact that some of the modules
already had a version bump in the 9.6 development cycle, like pageinspect.
You don't need to create a new version script in such cases.
--
Michael
Re: [HACKERS] Parallel safety tagging of extension functions
On 5/20/16 7:37 PM, Robert Haas wrote:
> I guess my first question is whether we have consensus on the release
> into which we should put this. Some people (Noah, among others) thought
> it should wait because we're after feature freeze, while others thought
> we should do it now. If we're going to try to get this into 9.6, I'll
> work on reviewing this sooner rather than later, but if we're not going
> to do that I'm going to postpone dealing with it until after we branch.

Sounds to me that this is part of the cleanup of a 9.6 feature and should be
in that release.

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] what to revert
On Sun, May 15, 2016 at 4:06 PM, Tomas Vondra wrote:
> Overall, I think this shows that there seems to be no performance penalty
> with "disabled" vs. "reverted" - i.e. even with the least favorable (100%
> read-only) workload.

OK, that's good, as far as it goes.

> Whatever the metric is, I think it's fairly clear the patch makes the
> results way more volatile - particularly with the "immediate" case and
> higher client counts.

I think you mean only when it is enabled. Right?

Mithun and others at EnterpriseDB have been struggling with the problem of
getting reproducible results on benchmarks, even read-only benchmarks that
seem like they ought to be fairly stable, and they've been complaining to me
about that problem - not specifically with respect to snapshot too old, but
in general. I don't have a clear understanding at this point of why a
particular piece of code might cause increased volatility, but I wish I did.
In the past, I haven't paid much attention to this, but lately it's clear
that it is becoming a significant problem when trying to benchmark,
especially on many-socket machines. I suspect it might be a problem for
actual users as well, but I don't know of any definite evidence that this is
the case. It's probably an area that needs a bunch of work, but I don't
understand the problem well enough to know what we should be doing.

> What I plan to do next, over the next week:
>
> 1) Wait for the second run of "immediate" to complete (should happen in a
> few hours)
>
> 2) Do tests with other workloads (mostly read-only, read-write).

Sounds good.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Lets (not) break all the things. Was: [pgsql-advocacy] 9.6 -> 10.0
On Mon, May 16, 2016 at 3:36 AM, Bruce Momjian wrote:
> On Sun, May 15, 2016 at 03:23:52PM -0500, Jim Nasby wrote:
>> 2) There's no ability at all to revert, other than restore a backup. That
>> means if you pull the trigger and discover some major performance problem,
>> you have no choice but to deal with it (you can't switch back to the old
>> version without losing data).
>
> In --link mode only

No, not really. Once you let write transactions into the new cluster,
there's no way to get back to the old server version no matter which option
you used.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Parallel safety tagging of extension functions
On Thu, May 19, 2016 at 5:50 PM, Andreas Karlsson wrote:
> I have gone through all our extensions and tried to tag all functions
> correctly according to their parallel safety.
>
> I also did the same for the aggregate functions in a second patch, and for
> min(citext)/max(citext) set a COMBINEFUNC.
>
> The changes for the functions are attached as one huge patch. Feel free to
> suggest a way to split it up or change it in any way if that would make it
> easier to review/apply.
>
> Some open questions:
>
> - How should we modify the aggregate functions when upgrading extensions?
> ALTER AGGREGATE cannot change COMBINEFUNC or PARALLEL. My current patch
> updates the system catalogs directly, which should be safe in this case,
> but is this an acceptable solution?
>
> - Do you think we should add PARALLEL UNSAFE to the functions which we know
> are unsafe to make it obvious that it is intentional?
>
> - I have not added the parallel tags to the functions used by our procedural
> languages. Should we do so?
>
> - I have marked uuid-ossp, chkpass_in() and pgcrypto functions which
> generate random data as safe, despite that they depend on state in the
> backend. My reasoning is that, especially for the pgcrypto functions,
> nobody should rely on the PRNG state. For uuid-ossp I am on the fence.
>
> - I have touched a lot of legacy libraries, like tsearch2 and the spi/*
> stuff. Is that a good idea?
>
> - I decided to ignore that isn_weak() exists (and would make all input
> functions PARALLEL RESTRICTED) since it is only there if ISN_WEAK_MODE is
> defined. Is that ok?

I guess my first question is whether we have consensus on the release into
which we should put this. Some people (Noah, among others) thought it should
wait because we're after feature freeze, while others thought we should do
it now. If we're going to try to get this into 9.6, I'll work on reviewing
this sooner rather than later, but if we're not going to do that I'm going
to postpone dealing with it until after we branch.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] [PATCH] Add EXPLAIN (ALL) shorthand
On Thu, May 19, 2016 at 6:24 PM, Евгений Шишкин wrote:
>> On 20 May 2016, at 01:12, Tom Lane wrote:
>> I'm a bit inclined to think that what this is really about is that we
>> made the wrong call on the BUFFERS option, and that it should default
>> to ON just like COSTS and TIMING do. Yeah, that would be an incompatible
>> change, but that's what major releases are for, no?
>
> After thinking about it, i think this is a better idea.

Hmm, my experience is different. I use EXPLAIN (ANALYZE, VERBOSE) a lot, but
EXPLAIN (ANALYZE, BUFFERS) only rarely. I wonder if a GUC is the way to go.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Re: Request for Patch Feedback: Lag & Lead Window Functions Can Ignore Nulls
Just doing a drive-by...

On Fri, May 20, 2016 at 3:50 PM, Jeff Davis wrote:
> Old thread link:
>
> http://www.postgresql.org/message-id/CA+=vxna5_n1q5q5okxc0aqnndbo2ru6gvw+86wk+onsunjd...@mail.gmail.com
>
> On Thu, Apr 14, 2016 at 1:29 PM, Stephen Frost wrote:
> > Jeff
> >
> > (Reviving an old thread from 2014...)
> >
> > Would you have time to work on this for 9.7..? I came across a
> > real-world use case for exactly this capability and was sorely
> > disappointed to discover we didn't support it even though there had
> > been discussion for years on it, with quite a few interested parties.
>
> First, I think the syntax is still implemented in a bad way. Right now
> it's part of the OVER clause, and the IGNORE NULLS gets put into the
> frame options. It doesn't match the way the spec defines the grammar,
> and I don't see how it really makes sense that it's a part of the
> frame options or the window object at all.

How does the relatively new FILTER clause play into this, if at all?

> I think we need catalog support to say
> whether a function honors IGNORE|RESPECT NULLS or not, which means we
> also need support in CREATE FUNCTION.

We already have "STRICT" for deciding whether a function processes nulls.
Wouldn't this need to exist on "CREATE AGGREGATE" as well?

Rhetorical question: I presume we are going to punt on the issue, since it
is non-standard, but what is supposed to happen with a window invocation
that ignores nulls but has a non-strict function that returns a non-null on
null input?

David J.
[HACKERS] Re: Request for Patch Feedback: Lag & Lead Window Functions Can Ignore Nulls
Old thread link:

http://www.postgresql.org/message-id/CA+=vxna5_n1q5q5okxc0aqnndbo2ru6gvw+86wk+onsunjd...@mail.gmail.com

On Thu, Apr 14, 2016 at 1:29 PM, Stephen Frost wrote:
> Jeff
>
> (Reviving an old thread from 2014...)
>
> Would you have time to work on this for 9.7..? I came across a
> real-world use case for exactly this capability and was sorely
> disappointed to discover we didn't support it even though there had been
> discussion for years on it, with quite a few interested parties.
>
> I'll take a look at reviewing your last version of the patch but wanted
> to get an idea of if you still had interest and might be able to help.
>
> Thanks!
>
> Stephen

There are actually quite a few issues remaining here.

First, I think the syntax is still implemented in a bad way. Right now it's
part of the OVER clause, and the IGNORE NULLS gets put into the frame
options. It doesn't match the way the spec defines the grammar, and I don't
see how it really makes sense that it's a part of the frame options or the
window object at all. I believe that it should be a part of the FuncCall,
and end up in the FuncExpr, so the flag can be read when the function is
called without exposing frame options (which we don't want to do). I think
the reason it was done as a part of the window object is so that it could
save state between calls for the purpose of optimization, but I think that
can be done using fn_extra. This change should remove a lot of the
complexity about trying to share window definitions, etc.

Second, we need a way to tell which functions support IGNORE NULLS and which
do not. The grammar as implemented in the patch seems to allow it for any
function with an OVER clause (which can include ordinary aggregates). Then
the parse analysis hard-codes that only LEAD and LAG accept IGNORE NULLS,
and throws an error otherwise. Neither of these things seem right. I think
we need catalog support to say whether a function honors IGNORE|RESPECT
NULLS or not, which means we also need support in CREATE FUNCTION.

I think the execution is pretty good, except that (a) we need to keep the
state in fn_extra rather than the winstate; and (b) we should get rid of
the bitmaps and just do a naive scan unless we really think non-constant
offsets will be important. We can always optimize more later.

Regards,
     Jeff Davis
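To make the fn_extra idea above concrete, here is a minimal sketch of the
usual idiom: private state is allocated in fn_mcxt on the first call and
remembered in fcinfo->flinfo->fn_extra, so it survives across calls without
living in the window object. The function name and the IgnoreNullsState
struct are purely hypothetical illustrations, not part of the patch under
review; only fn_extra, fn_mcxt and MemoryContextAllocZero() are existing
PostgreSQL facilities.

#include "postgres.h"
#include "fmgr.h"

/* Hypothetical per-invocation cache, illustrating the fn_extra idiom only. */
typedef struct IgnoreNullsState
{
    int64       last_nonnull_pos;   /* position of last non-null row seen */
    bool        valid;              /* has last_nonnull_pos been set yet? */
} IgnoreNullsState;

Datum
window_lag_ignore_nulls(PG_FUNCTION_ARGS)
{
    IgnoreNullsState *state = (IgnoreNullsState *) fcinfo->flinfo->fn_extra;

    if (state == NULL)
    {
        /*
         * First call for this window function expression: allocate the
         * cache in fn_mcxt so it lives as long as the FmgrInfo itself,
         * then remember it in fn_extra for later calls.
         */
        state = (IgnoreNullsState *)
            MemoryContextAllocZero(fcinfo->flinfo->fn_mcxt,
                                   sizeof(IgnoreNullsState));
        fcinfo->flinfo->fn_extra = state;
    }

    /*
     * ... scan rows via the WindowObject API here, consulting and updating
     * 'state' instead of anything hung off the window object/winstate ...
     */

    PG_RETURN_NULL();
}

Because fn_mcxt lives as long as the FmgrInfo, the cached struct does not
have to be rebuilt on every row, which appears to be the optimization Jeff
refers to, without tying the state to the shared window object.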
Re: [HACKERS] Parallel safety tagging of extension functions
On Thu, May 19, 2016 at 05:50:01PM -0400, Andreas Karlsson wrote:
> Hi,
>
> I have gone through all our extensions and tried to tag all functions
> correctly according to their parallel safety.
>
> I also did the same for the aggregate functions in a second patch, and for
> min(citext)/max(citext) set a COMBINEFUNC.
>
> The changes for the functions are attached as one huge patch. Feel free to
> suggest a way to split it up or change it in any way if that would make it
> easier to review/apply.
>
> Some open questions:
>
> - How should we modify the aggregate functions when upgrading extensions?
> ALTER AGGREGATE cannot change COMBINEFUNC or PARALLEL.

At the moment, it basically changes namespace and ownership but doesn't
touch the actual implementation. Maybe it should be able to tweak those. A
brief check implies that it's not a gigantic amount of work.

Is it worth making ALTER AGGREGATE support the things CREATE AGGREGATE does?

Cheers,
David.
--
David Fetter http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: [HACKERS] Speedup twophase transactions
On Fri, May 20, 2016 at 12:46 PM, Alvaro Herrera wrote:
> Jesper Pedersen wrote:
>> Discussed with Noah off-list I think we should revisit this for 9.6 due
>> to the async replica lag as shown in [1]. The performance improvement
>> for the master node is shown in [2].
>
> I gave a very quick look and it seems to me far more invasive than we
> would normally consider in the beta period. I would just put it to
> sleep till release 10 opens up.

I share the same opinion. Improving 2PC is definitely a huge win thanks to
the first patch that got committed, which covers the case where WAL is
generated, but how invasive the second patch is really concerns me, and I
looked at the patch in an extended way a couple of weeks back. As we care
about stability now regarding 9.6, let's bump the second portion to 10.0 as
well, and keep the improvement for WAL generation in 9.6.

--
Michael
Re: [HACKERS] Speedup twophase transactions
Jesper Pedersen wrote:
> Discussed with Noah off-list I think we should revisit this for 9.6 due
> to the async replica lag as shown in [1]. The performance improvement for
> the master node is shown in [2].

I gave a very quick look and it seems to me far more invasive than we would
normally consider in the beta period. I would just put it to sleep till
release 10 opens up.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Declarative partitioning
Hi Amit,

On 20.05.2016 11:37, Amit Langote wrote:
> Perhaps you're already aware but may I also suggest looking at how clauses
> are matched to indexes? For example, consider how match_clauses_to_index()
> in src/backend/optimizer/path/indxpath.c works.

Thanks, I'll take a closer look at it.

> Moreover, instead of pruning partitions in planner prep phase, might it
> not be better to do that when considering paths for the (partitioned) rel?
> IOW, instead of looking at parse->jointree, we should rather be working
> with rel->baserestrictinfo. Although, that would require some revisions to
> how append_rel_list, simple_rel_list, etc. are constructed and manipulated
> in a given planner invocation. Maybe it's time for that... Again, you may
> have already considered these things.

Yes, you're right, this is how we did it in the pg_pathman extension. But
for this patch it requires further consideration and I'll do it in the
future!

> Could you try with the attached updated set of patches? I changed
> partition descriptor relcache code to eliminate excessive copying in
> previous versions.
>
> Thanks,
> Amit

I tried your new patch and got the following results, which are quite close
to the ones using a pointer to the PartitionDesc structure (TPS):

 # of partitions | single row | single partition
-----------------+------------+------------------
             100 |       3014 |             1024
            1000 |       2964 |             1001
            2000 |       2874 |             1000

However I've encountered a problem, which is that postgres crashes
occasionally while creating partitions. Here is a function that reproduces
this behaviour:

CREATE OR REPLACE FUNCTION fail() RETURNS void LANGUAGE plpgsql AS $$
BEGIN
    DROP TABLE IF EXISTS abc CASCADE;
    CREATE TABLE abc (id SERIAL NOT NULL, a INT, dt TIMESTAMP) PARTITION BY RANGE (a);
    CREATE INDEX ON abc (a);
    CREATE TABLE abc_0 PARTITION OF abc FOR VALUES START (0) END (1000);
    CREATE TABLE abc_1 PARTITION OF abc FOR VALUES START (1000) END (2000);
END
$$;

SELECT fail();

It happens not every time but quite often. It doesn't happen if I execute
these commands one by one in psql.
Backtrace:

#0  range_overlaps_existing_partition (key=0x7f1097504410, range_spec=0x1d0f400, pdesc=0x1d32200, with=0x7ffe437ead00) at partition.c:747
#1  0x0054c2a5 in StorePartitionBound (relid=245775, parentId=245770, bound=0x1d0f400) at partition.c:578
#2  0x0061bfc4 in DefineRelation (stmt=0x1d0dfe0, relkind=114 'r', ownerId=10, typaddress=0x0) at tablecmds.c:739
#3  0x007f4473 in ProcessUtilitySlow (parsetree=0x1d1a150, queryString=0x1d1d940 "CREATE TABLE abc_0 PARTITION OF abc FOR VALUES START (0) END (1000)", context=PROCESS_UTILITY_QUERY, params=0x0, dest=0xdb5ca0 , completionTag=0x7ffe437eb500 "") at utility.c:983
#4  0x007f425e in standard_ProcessUtility (parsetree=0x1d1a150, queryString=0x1d1d940 "CREATE TABLE abc_0 PARTITION OF abc FOR VALUES START (0) END (1000)", context=PROCESS_UTILITY_QUERY, params=0x0, dest=0xdb5ca0 , completionTag=0x7ffe437eb500 "") at utility.c:907
#5  0x007f3354 in ProcessUtility (parsetree=0x1d1a150, queryString=0x1d1d940 "CREATE TABLE abc_0 PARTITION OF abc FOR VALUES START (0) END (1000)", context=PROCESS_UTILITY_QUERY, params=0x0, dest=0xdb5ca0 , completionTag=0x7ffe437eb500 "") at utility.c:336
#6  0x0069f8b2 in _SPI_execute_plan (plan=0x1d19cf0, paramLI=0x0, snapshot=0x0, crosscheck_snapshot=0x0, read_only=0 '\000', fire_triggers=1 '\001', tcount=0) at spi.c:2200
#7  0x0069c735 in SPI_execute_plan_with_paramlist (plan=0x1d19cf0, params=0x0, read_only=0 '\000', tcount=0) at spi.c:450
#8  0x7f108cc6266f in exec_stmt_execsql (estate=0x7ffe437eb8e0, stmt=0x1d05318) at pl_exec.c:3517
#9  0x7f108cc5e5fc in exec_stmt (estate=0x7ffe437eb8e0, stmt=0x1d05318) at pl_exec.c:1503
#10 0x7f108cc5e318 in exec_stmts (estate=0x7ffe437eb8e0, stmts=0x1d04c98) at pl_exec.c:1398
#11 0x7f108cc5e1af in exec_stmt_block (estate=0x7ffe437eb8e0, block=0x1d055e0) at pl_exec.c:1336
#12 0x7f108cc5c35d in plpgsql_exec_function (func=0x1cc2a90, fcinfo=0x1cf7f50, simple_eval_estate=0x0) at pl_exec.c:434
...

Thanks

--
Ildar Musin
i.mu...@postgrespro.ru
Re: [HACKERS] foreign table batch inserts
On 20 May 2016 at 15:35, Craig Ringer wrote:
>
> You can, however, omit Sync from between messages and send a series of
> protocol messages, like
>
> Parse/Bind/Execute/Bind/Execute/Bind/Execute/Sync
>
> to avoid round-trip overheads.

I implemented what I think is a pretty solid proof of concept of this for
kicks this evening. Attached, including basic test program.

The performance difference over higher latency links is huge, see below.

Demo/test program in src/test/examples/testlibpqbatch.c.

github: https://github.com/2ndQuadrant/postgres/tree/dev/libpq-async-batch

I still need to add the logic for handling an error during a batch by
discarding all input until the next Sync, but otherwise I think it's pretty
reasonable.

The time difference for 10k inserts on the local host over a unix socket
shows a solid improvement:

  batch insert elapsed:      0.244293s
  sequential insert elapsed: 0.375402s

... but over, say, a connection to a random AWS RDS instance fired up for
the purpose that lives about 320ms away the difference is huge:

  batch insert elapsed:      9.029995s
  sequential insert elapsed: (I got bored after 10 minutes; it should take
  a bit less than an hour based on the latency numbers)

With 500 rows on the remote AWS RDS instance, once the I/O quota is already
saturated:

  batch insert elapsed:      1.229024s
  sequential insert elapsed: 156.962180s

which is an improvement by a factor of over 120.

I didn't compare vs COPY. I'm sure COPY will be faster, but COPY doesn't
let you do INSERT ... ON CONFLICT, do UPDATE, do DELETE, etc. Not without
temp tables and a bunch of hoop jumping anyway. If COPY solved everything
there'd be no point having pipelining.

No docs yet, but if folks think the interface is reasonable I can add them
easily since the comments on each of the new functions should be easy to
adapt into the SGML docs.

With a bit of polishing I think this can probably go in the next CF, though
I only wrote it as an experiment.

Can I get opinions on the API?

The TL;DR API, using the usual async libpq routines, is:

PQbeginBatchMode(conn);

PQsendQueryParams(conn, "BEGIN", 0, NULL, NULL, NULL, NULL, 0);

PQsendPrepare(conn, "my_update", "UPDATE ...");

PQsetnonblocking(conn, 1);

while (!all_responses_received)
{
    select(...)

    if (can-write)
    {
        if (app-has-more-data-to-send)
        {
            PQsendQueryPrepared(conn, "my_update", params-go-here);
        }
        else if (havent-sent-commit-yet)
        {
            PQsendQueryParams(conn, "COMMIT", ...);
        }
        else if (havent-sent-endbatch-yet)
        {
            PqEndBatch(conn);
        }
        PQflush(conn);
    }

    if (can-read)
    {
        PQconsumeInput(conn);
        if (PQisBusy(conn))
            continue;
        res = PQgetResult(conn);
        if (res == NULL)
        {
            PQgetNextQuery(conn);
            continue;
        }
        /* process results in the same order we sent the commands */
        /* client keeps track of that, libpq just supplies the results */
        ...
    }
}

PQendBatch(conn);

Note that:

* PQsendQuery cannot be used as it uses simple query protocol, use
  PQsendQueryParams instead;
* Batch supports PQsendQueryParams, PQsendPrepare, PQsendQueryPrepared,
  PQsendDescribePrepared, PQsendDescribePortal;
* You don't call PQgetResult after dispatching each query
* Multiple batches may be pipelined, you don't have to wait for one to end
  to start another (an advantage over JDBC's API)
* non-blocking mode isn't required, but is strongly advised

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

From f0ca25bdc2bacf65530e4f180fdfc7c219866541 Mon Sep 17 00:00:00 2001
From: Craig Ringer
Date: Fri, 20 May 2016 12:45:18 +0800
Subject: [PATCH] Draft of libpq async pipelining support

Now with test in src/test/examples/testlibpqbatch.c
---
 src/interfaces/libpq/exports.txt    |   6 +
 src/interfaces/libpq/fe-connect.c   |  16 +
 src/interfaces/libpq/fe-exec.c      | 540 ++--
 src/interfaces/libpq/fe-protocol2.c |   6 +
 src/interfaces/libpq/fe-protocol3.c |  13 +-
 src/interfaces/libpq/libpq-fe.h     |  12 +-
 src/interfaces/libpq/libpq-int.h    |  36 +-
 src/test/examples/Makefile          |   2 +-
 src/test/examples/testlibpqbatch.c  | 690
 9 files changed, 1278 insertions(+), 43 deletions(-)
 create mode 100644 src/test/examples/testlibpqbatch.c

diff --git a/src/interfaces/libpq/exports.txt b/src/interfaces/libpq/exports.txt
index 21dd772..e297c4b 100644
--- a/src/interfaces/libpq/exports.txt
+++ b/src/interfaces/libpq/exports.txt
@@ -171,3 +171,9 @@ PQsslAttributeNames 168
 PQsslAttribute              169
 PQsetErrorContextVisibility 170
 PQresultVerboseErrorMessage 171
+PQisInBatchMode             172
+PQqueriesInBatch            173
+PQbeginBatchMode            174
+PQendBatchMode              175
+PQendBatch                  176
+PQgetNextQuery              177
diff --git a/src/interfaces/libpq/fe
Re: [HACKERS] Refactor pg_dump as a library?
Hello, Jakob.

You wrote:

JE> Would anybody else be interested in a pg_dump library? I've found
JE> a thread from 2013 related to the idea, but the discussion came to nothing.
JE> Thread started here:
JE> http://www.postgresql.org/message-id/71e01949.2e16b.13df4707405.coremail.shuai900...@126.com

JE> My Motivation:
JE> I'm the developer of a PostgreSQL GUI client, and I am looking
JE> for ways to integrate pg_dump into my application. The main use
JE> case would be to get the dump of individual tables (for example,
JE> when you want to create a table similar to an existing one).
JE> Bundling pg_dump with my application and calling it doesn't allow
JE> the fine-grained control and integration I would like to have.
JE> Also, pg_dump always opens a new connection; I would prefer to use
JE> an existing database connection instead.

JE> In case anybody else is interested in this, I can offer to
JE> sponsor some developer time towards this effort.

JE> Best regards,
JE> Jakob

I proposed this several times, but nobody cared. Then we did it.

Our PostgresDAC component set
(http://microolap.com/products/connectivity/postgresdac/) has TPSQLDump and
TPSQLRestore classes. Details:
http://microolap.com/products/connectivity/postgresdac/help/tpsqldump_tpsqldump.htm

Also we've implemented PaGoDump and PaGoRestore utilities compatible with
native pg_dump/pg_restore: http://microolap.com/products/database/pagodump/

--
With best wishes,
 Pavel                          mailto:pa...@gf.microolap.com
[HACKERS] Possible regression regarding estimating relation width in FDWs
Hello,

While working on adapting the Multicorn FDW for 9.6, I noticed that there is
a regression with regards to estimating the remote relation width. This
behavior can be exposed using postgres_fdw with "use_remote_estimate".

Test case:

CREATE EXTENSION postgres_fdw;
CREATE SERVER localhost FOREIGN DATA WRAPPER postgres_fdw;
CREATE USER MAPPING FOR CURRENT_USER SERVER localhost;
CREATE TABLE local_table (c1 text);
INSERT INTO local_table (c1) (SELECT repeat('test', 1));
ANALYZE local_table;
CREATE FOREIGN TABLE foreign_table (c1 text) SERVER localhost OPTIONS (table_name 'local_table', use_remote_estimate 'true');
EXPLAIN SELECT * FROM foreign_table;

Output, on current HEAD:

                              QUERY PLAN
-----------------------------------------------------------------------
 Foreign Scan on foreign_table  (cost=100.00..101.03 rows=1 width=32)

On 9.5:

                              QUERY PLAN
-----------------------------------------------------------------------
 Foreign Scan on foreign_table  (cost=100.00..101.03 rows=1 width=472)

While the FDW correctly sets the pathtarget width, it is then overridden at
a later point. I'm not sure what happens exactly, but it seems that the
relation path target is ignored completely, in planner.c:1695:

    /*
     * Convert the query's result tlist into PathTarget format.
     *
     * Note: it's desirable to not do this till after query_planner(),
     * because the target width estimates can use per-Var width numbers
     * that were obtained within query_planner().
     */
    final_target = create_pathtarget(root, tlist);

It says explicitly that it will be computed using per-Var width numbers. I
think the current_rel->cheapest_total_path->pathtarget should be taken into
account, at least in the FDW case.

I'm not sure if the ability to estimate the whole relation width should be
deprecated in favor of per-Var widths, or if it should still be supported
(after all, the GetForeignRelSize callback is called AFTER having set a
value computed from the individual attr_widths, in set_foreign_size). But in
any case, at least postgres_fdw should be updated to support that.

Sorry if that was not clear; I'm at PGCon at the moment, so if anyone wants
to discuss this in person I'm available.

--
Ronan Dunklau
http://dalibo.com - http://dalibo.org
Re: [HACKERS] Speedup twophase transactions
Hi,

On 04/13/2016 10:31 AM, Stas Kelvich wrote:
> On 13 Apr 2016, at 01:04, Michael Paquier wrote:
>> That's too late for 9.6 unfortunately, don't forget to add that in the
>> next CF!
>
> Fixed patch attached. There already was infrastructure to skip currently
> held locks during replay of standby_redo() and I've extended that with a
> check for prepared xids.
>
> The reason why I'm still active on this thread is because I see real
> problems in deploying 9.6 in its current state. Let me stress my concern:
> the current state of things _WILL_BREAK_ async replication in case of a
> substantial load of two-phase transactions on the master. And a lot of
> J2EE apps fall into that category, as they wrap most of their transactions
> in prepare/commit. The slave server will just always increase its lag and
> will never catch up. It is possible to deal with that by switching to
> synchronous replication or inserting triggers with pg_sleep on the master,
> but that doesn't look like normal behaviour of the system.

Discussed with Noah off-list I think we should revisit this for 9.6 due to
the async replica lag as shown in [1]. The performance improvement for the
master node is shown in [2].

As I see it there are 3 options to resolve this (in one way or another):

* Leave as is, document the behaviour in release notes/documentation
* Apply the patch for slaves
* Revert the changes done to twophase.c during the 9.6 cycle

All have pros/cons for the release.

The latest slave patch by Stas is at [3].

Thoughts from others on the matter would be appreciated.

[1] http://www.postgresql.org/message-id/e7497864-de11-4099-83f5-89fb97af5...@postgrespro.ru
[2] http://www.postgresql.org/message-id/5693f703.3000...@redhat.com
[3] https://commitfest.postgresql.org/10/523/

Best regards,
 Jesper
Re: [HACKERS] It's seems that the function "do_text_output_multiline" does not suit for format "line1\nline2\n...lineN".
On 20 May 2016 at 19:13, Hao Lee wrote:
>
> Hi all,
>
> Today I am doing some work on adding some customized features to PostgreSQL
> 9.6 beta1. But when I do some output to psql using the function
> "do_text_output_multiline" with a string like the one mentioned in the mail
> title, such as "this is a\ntest for\nnew blank.", PostgreSQL may end up with
> corruption in this function. I debugged it and found that this function
> cannot deal with the boundaries properly. The original function code is:
>
> do_text_output_multiline(TupOutputState *tstate, char *text)
> {
>     Datum       values[1];
>     bool        isnull[1] = {false};
>
>     while (*text)
>     {
>         char       *eol;
>         int         len;
>
>         eol = strchr(text, '\n');
>         if (eol)
>         {
>             len = eol - text;
>             eol++;
>         }
>         else
>         {
>             len = strlen(text);
>             eol += len;
>         }
>
>         values[0] = PointerGetDatum(cstring_to_text_with_len(text, len));
>         do_tup_output(tstate, values, isnull);
>         pfree(DatumGetPointer(values[0]));
>         text = eol;
>     }
> }

Thanks for reporting this. It does seem pretty broken. I guess we've only
gotten away with this due to EXPLAIN output lines always having a \n at the
end of them, but we should fix this.

Your proposed fix looks a little bit confused. You could have just removed
the eol += len; as testing if (eol) in the else will never be true, as that
else is only being hit because eol is NULL. I shuffled things around in
there a bit and came up with the attached fix.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

do_text_output_multiline_fix.patch
Description: Binary data
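The attached patch itself is not reproduced in this archive. As a hedged
illustration only, one way the loop could be rearranged along the lines
described above (compute the chunk length from the newline position, or from
the remaining string when there is none; emit the chunk; then advance past
the newline only if one was found) might look like the following. This is an
assumption about the shape of the fix, not the actual attached patch:

void
do_text_output_multiline(TupOutputState *tstate, char *text)
{
    Datum       values[1];
    bool        isnull[1] = {false};

    while (*text)
    {
        char       *eol;
        int         len;

        eol = strchr(text, '\n');
        if (eol)
            len = eol - text;       /* emit everything before the newline */
        else
            len = strlen(text);     /* no newline left: emit the remainder */

        values[0] = PointerGetDatum(cstring_to_text_with_len(text, len));
        do_tup_output(tstate, values, isnull);
        pfree(DatumGetPointer(values[0]));

        if (eol == NULL)
            break;                  /* final line had no trailing newline */
        text = eol + 1;             /* step past the newline and continue */
    }
}

With this shape there is no arithmetic on a NULL pointer, and a final line
without a trailing newline is still emitted before the loop exits.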
Re: [HACKERS] foreign table batch inserts
On 20 May 2016 at 08:47, Tsunakawa, Takayuki wrote:
> From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Craig Ringer
>> Well, there's FE/BE level batching/pipelining already. Just no access to
>> it from libpq.
>
> Oh, really. The Bind ('B') appears to take one set of parameter values,
> not multiple sets (array). Anyway, I had to say "I want batch update API
> in libpq" to use it in ODBC and ECPG.

Right, and there's no protocol level support for array-valued batches.

You can, however, omit Sync from between messages and send a series of
protocol messages, like

Parse/Bind/Execute/Bind/Execute/Bind/Execute/Sync

to avoid round-trip overheads.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
[HACKERS] It's seems that the function "do_text_output_multiline" does not suit for format "line1\nline2\n...lineN".
Hi all,

Today I am doing some work on adding some customized features to PostgreSQL
9.6 beta1. But when I do some output to psql using the function
"do_text_output_multiline" with a string like the one mentioned in the mail
title, such as "this is a\ntest for\nnew blank.", PostgreSQL may end up with
corruption in this function. I debugged it and found that this function
cannot deal with the boundaries properly. The original function code is:

do_text_output_multiline(TupOutputState *tstate, char *text)
{
    Datum       values[1];
    bool        isnull[1] = {false};

    while (*text)
    {
        char       *eol;
        int         len;

        eol = strchr(text, '\n');
        if (eol)
        {
            len = eol - text;
            eol++;
        }
        else
        {
            len = strlen(text);
            eol += len;
        }

        values[0] = PointerGetDatum(cstring_to_text_with_len(text, len));
        do_tup_output(tstate, values, isnull);
        pfree(DatumGetPointer(values[0]));
        text = eol;
    }
}

Using the string "this is a\ntest for\nnew blank." as the second input
parameter, when dealing with the last part of the string, "new blank.", the
code "eol = strchr(text, '\n');" will return NULL and will go into the
"else" part, and the variable "eol" will have an integer added to it (the
length of "new blank."), so eol becomes 0xa, and text is then set to 0xa
too. This address is a system-reserved address, hence the corruption. I
think some boundary checks should be added, as follows:

do_text_output_multiline(TupOutputState *tstate, char *text)
{
    Datum       values[1];
    bool        isnull[1] = {false};

    while (text && *text)    // if the text is NULL then return, do nothing
    {
        char       *eol;
        int         len;

        eol = strchr(text, '\n');
        if (eol)
        {
            len = eol - text;
            eol++;
        }
        else
        {
            len = strlen(text);
            if (eol)         // to check whether the eol is NULL or not
                eol += len;
        }

        values[0] = PointerGetDatum(cstring_to_text_with_len(text, len));
        do_tup_output(tstate, values, isnull);
        pfree(DatumGetPointer(values[0]));
        text = eol;
    }
}

Best Regards.

RingsC.