Re: [HACKERS] POC: Cache data in GetSnapshotData()
Cache the SnapshotData for reuse:
===

In one of our perf analyses using the perf tool, GetSnapshotData showed a very high percentage of CPU cycles on a read-only workload with a very high number of concurrent connections (>= 64).

Machine: cthulhu, an 8-node machine.
--
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                128
On-line CPU(s) list:   0-127
Thread(s) per core:    2
Core(s) per socket:    8
Socket(s):             8
NUMA node(s):          8
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 47
Model name:            Intel(R) Xeon(R) CPU E7-8830 @ 2.13GHz
Stepping:              2
CPU MHz:               2128.000
BogoMIPS:              4266.63
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              24576K
NUMA node0 CPU(s):     0,65-71,96-103
NUMA node1 CPU(s):     72-79,104-111
NUMA node2 CPU(s):     80-87,112-119
NUMA node3 CPU(s):     88-95,120-127
NUMA node4 CPU(s):     1-8,33-40
NUMA node5 CPU(s):     9-16,41-48
NUMA node6 CPU(s):     17-24,49-56
NUMA node7 CPU(s):     25-32,57-64

Looking at perf CPU cycles for 128 clients: on further analysis, this appears to be mainly due to cache misses while iterating through the large proc array to compute the SnapshotData. Also, on a read-only (read-heavy) workload the SnapshotData mostly remains the same, since no (or rarely a) transaction commits or rolls back. Taking advantage of this, we can save the previously computed SnapshotData in shared memory, and in GetSnapshotData we can use the saved SnapshotData instead of recomputing it while it is still valid. A saved SnapshotData is valid until a transaction ends.

[Perf analysis Base]
Samples: 421K of event 'cache-misses', Event count (approx.): 160069274
Overhead  Command   Shared Object  Symbol
18.63%    postgres  postgres       [.] GetSnapshotData
11.98%    postgres  postgres       [.] _bt_compare
10.20%    postgres  postgres       [.] PinBuffer
 8.63%    postgres  postgres       [.] LWLockAttemptLock
 6.50%    postgres  postgres       [.]
LWLockRelease

[Perf analysis with Patch]

Server configuration:
./postgres -c shared_buffers=8GB -N 300 -c min_wal_size=15GB -c max_wal_size=20GB -c checkpoint_timeout=900 -c maintenance_work_mem=1GB -c checkpoint_completion_target=0.9 -c wal_buffers=256MB &

pgbench configuration:
scale_factor = 300
./pgbench -c $threads -j $threads -T $time_for_reading -M prepared -S postgres

The machine has 64 cores; with this patch I can see the server start to improve after 64 clients. I have tested up to 256 clients, which shows a performance improvement of nearly 39% at best [1]. This is the best case for the patch, where the once-computed SnapshotData is reused again and again. The worst case seems to be small, quick write transactions, which frequently invalidate the cached SnapshotData before it is reused by any subsequent GetSnapshotData call. As of now, I have tested simple update tests (pgbench -M prepared -N) on the same machine with the above server configuration. I do not see much change in TPS numbers. All TPS are the median of 3 runs.

Clients   TPS-With Patch   TPS-Base       %Diff
1         752.461117       755.186777     -0.3%
64        32171.296537     31202.153576   +3.1%
128       41059.660769     40061.929658   +2.49%

I will do some profiling and find out why this case is not costing us some performance due to caching overhead.

On Thu, Aug 3, 2017 at 2:16 AM, Andres Freund wrote:
> Hi,
>
> I think this patch should have a "cover letter" explaining what it's
> trying to achieve, how it does so and why it's safe/correct. I think
> it'd also be good to try to show some of the worst cases of this patch
> (i.e. where the caching only adds overhead, but never gets used), and
> some of the best cases (where the cache gets used quite often, and
> reduces contention significantly).
>
> I wonder if there's a way to avoid copying the snapshot into the cache
> in situations where we're barely ever going to need it.
> But right now
> the only way I can think of is to look at the length of the
> ProcArrayLock wait queue and count the exclusive locks therein - which'd
> be expensive and intrusive...
>
>
> On 2017-08-02 15:56:21 +0530, Mithun Cy wrote:
>> diff --git a/src/backend/storage/ipc/procarray.c
>> b/src/backend/storage/ipc/procarray.c
>> index a7e8cf2..4d7debe 100644
>> --- a/src/backend/storage/ipc/procarray.c
>> +++ b/src/backend/storage/ipc/procarray.c
>> @@ -366,11 +366,13 @@ ProcArrayRemove(PGPROC *proc, TransactionId latestXid)
>> 			(arrayP->numProcs - index - 1) * sizeof(int));
>> 	arrayP->pgprocnos[arrayP->numProcs - 1] = -1;	/* for debugging */
>>
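The caching idea described above can be sketched as follows. This is a minimal illustration with hypothetical names, not the actual patch: a shared transaction-completion counter is bumped at every commit or abort, and a cached snapshot is reused only while the counter still matches the value recorded when the snapshot was built.

```c
/* Sketch of the cached-snapshot idea (hypothetical names, not the patch).
 * A completion counter is bumped whenever any transaction ends; a cached
 * snapshot is valid only while the counter matches the value recorded
 * when the snapshot was computed. */
#include <assert.h>
#include <stdint.h>

typedef struct Snapshot
{
    uint32_t xmin;
    uint32_t xmax;
} Snapshot;

static uint64_t xact_completion_count; /* bumped at every transaction end */
static Snapshot cached_snapshot;
static uint64_t cached_at;             /* counter value when cache was filled */
static int      cache_valid;

/* Expensive path: would scan the whole proc array (stubbed out here). */
static Snapshot
compute_snapshot_slow(void)
{
    Snapshot s = {100, 200};
    return s;
}

static Snapshot
get_snapshot_data(int *used_cache)
{
    if (cache_valid && cached_at == xact_completion_count)
    {
        *used_cache = 1;
        return cached_snapshot;        /* reuse: no proc array scan */
    }
    *used_cache = 0;
    cached_snapshot = compute_snapshot_slow();
    cached_at = xact_completion_count;
    cache_valid = 1;
    return cached_snapshot;
}

static void
transaction_end(void)
{
    xact_completion_count++;           /* invalidates the cached snapshot */
}
```

On a read-only workload the counter rarely moves, so almost every call hits the cheap path; each write transaction invalidates the cache, which matches the worst case described above.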
Re: [HACKERS] Adding hook in BufferSync for backup purposes
Hi hackers!

> On 8 Aug 2017, at 11:27, Tom Lane wrote:
>
> My point is not to claim that we mustn't put a hook there. It's that what
> such a hook could safely do is tightly constrained, and you've not offered
> clear evidence that there's something useful to be done within those
> constraints. Alvaro seems to think that the result might be better
> reached by hooking in at other places, and my gut reaction is similar.

I was studying the work done by Marco and Gabriel on the matter of tracking pages for incremental backup, and I found the PTRACK patch by Yury Zhuravlev and PostgresPro [0]. This work seems to be solid and thoughtful.

I have drafted a new prototype for hooks enabling incremental backup as an extension, based on PTRACK calls. If you look at the original patch you can see that the attached patch differs slightly: the functionality to track the end of a critical section is omitted. I have not included it in this draft because it changes a very sensitive place, even for those who do not need it. A subscriber of the proposed hook must remember that it is invoked under a critical section. But there cannot be more than XLR_MAX_BLOCK_ID blocks for one transaction.

The proposed draft does not add any functionality to track finished transactions or any atomic unit of work; it just provides a flow of changed block numbers. Neither does this draft make any assumption about where to store this information. I suppose subscribers could trigger asynchronous writes somewhere once the info for a given segment is accumulated (do we need a hook on segment end, then?). During incremental backup you can then skip scanning those WAL segments for which you have accumulated a changeset of block numbers.

The thing which is not clear to me: if we are accumulating block numbers under a critical section, then we must place them in a preallocated array. When is the best time to take away these block numbers, to empty that array and avoid overflow?
PTRACK has an array of XLR_MAX_BLOCK_ID length and saves this array at the end of each critical section. But I want to avoid intervention into critical sections.

Thank you for your attention, any thoughts will be appreciated.

Best regards, Andrey Borodin.

[0] https://gist.github.com/stalkerg/ab833d94e2f64df241f1835651e06e4b

0001-hooks-to-watch-for-changed-pages.patch
Description: Binary data

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
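The data flow sketched in the message above — append-only tracking inside the critical section, draining outside it — could look roughly like this. All names here are hypothetical; this is not the PTRACK code, just an illustration of the constraint that the in-critical-section path must only write into a preallocated, fixed-size array.

```c
/* Sketch of the proposed changed-block tracking (hypothetical names).
 * Inside the critical section we may only append block numbers to a
 * preallocated fixed-size array; a subscriber drains it later, outside
 * the critical section.  XLR_MAX_BLOCK_ID bounds how many blocks one
 * WAL record can touch. */
#include <assert.h>
#include <stdint.h>

#define XLR_MAX_BLOCK_ID 32

static uint32_t changed_blocks[XLR_MAX_BLOCK_ID];
static int      n_changed;

/* Called under the critical section: must not allocate or fail hard. */
static int
track_changed_block(uint32_t blkno)
{
    if (n_changed >= XLR_MAX_BLOCK_ID)
        return 0;                      /* full: must be drained first */
    changed_blocks[n_changed++] = blkno;
    return 1;
}

/* Called outside the critical section (e.g. by the hook subscriber):
 * copies the accumulated block numbers out and empties the array. */
static int
drain_changed_blocks(uint32_t *out, int outlen)
{
    int n = (n_changed < outlen) ? n_changed : outlen;

    for (int i = 0; i < n; i++)
        out[i] = changed_blocks[i];
    n_changed = 0;
    return n;
}
```

The open question in the message maps directly onto when `drain_changed_blocks` runs: drain too rarely and the fixed array overflows; drain at the end of every critical section (as PTRACK does) and you intervene in a very sensitive code path.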
Re: [HACKERS] Proposal: Improve bitmap costing for lossy pages
On Wed, Aug 23, 2017 at 9:45 AM, Dilip Kumar wrote:
>>
>> ...
>> if (tbm->nentries <= tbm->maxentries / 2)
>> {
>> /*
>> * We have made enough room.
>> ...
>> I think we could try a higher fill factor, say, 0.9. tbm_lossify basically
>> just continues iterating over the hashtable with not so much overhead per
>> call, so calling it more frequently should not be a problem. On the other
>> hand, it would have to process fewer pages, and the bitmap would be less
>> lossy.
>>
>> I didn't benchmark the index scan per se with the 0.9 fill factor, but the
>> reduction of lossy pages was significant.
>
> I will try this and produce some performance numbers.

I have done some performance testing as suggested by Alexander (patch attached).

Performance results: I can see a significant reduction in the lossy_pages count in all the queries, and also a noticeable reduction in execution time in some of the queries. I have tested with 2 different work_mem settings. Below are the test results (lossy pages count and execution time).

TPCH benchmark: 20 scale factor
Machine: Power, 4 socket
Tested with max_parallel_worker_per_gather=0

Work_mem: 20 MB

(Lossy pages count)
Query   head      patch
4       166551    35478
5       330679    35765
6       1160339   211357
14      666897    103275
15      1160518   211544
20      1982981   405903

(Time in ms)
Query   head      patch
4       14849     14093
5       76790     74486
6       25816     14327
14      16011     11093
15      51381     35326
20      25 195501

Work_mem: 40 MB

(Lossy pages count)
Query   head      patch
6       995223    195681
14      337894    75744
15      995417    195873
20      1654016   199113

(Time in ms)
Query   head      patch
6       23819     14571
14      13514     11183
15      49980     32400
20      204441    188978

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

lossify_slow.patch
Description: Binary data
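The fill-factor change being benchmarked above can be illustrated with a small sketch. This is not the actual `tbm_lossify` code, just the threshold logic under discussion: today lossification continues until `nentries <= maxentries / 2`; the proposal is to stop at a higher fill factor such as 0.9, so each pass converts fewer exact pages to lossy ones.

```c
/* Sketch of the tbm_lossify stopping rule under discussion (illustrative,
 * not the PostgreSQL code).  A fill factor of 0.5 corresponds to the
 * current "maxentries / 2" behavior; 0.9 is the proposed alternative. */
#include <assert.h>

static int
lossify_stop_threshold(int maxentries, double fillfactor)
{
    return (int) (maxentries * fillfactor);
}

/* Returns 1 if lossification may stop, 0 if it must keep converting
 * exact page entries into lossy chunk entries. */
static int
made_enough_room(int nentries, int maxentries, double fillfactor)
{
    return nentries <= lossify_stop_threshold(maxentries, fillfactor);
}
```

With a 0.9 fill factor the bitmap stops shrinking much earlier (900 of 1000 entries instead of 500), which is why the lossy-page counts drop so sharply in the tables above, at the cost of calling the lossify pass more often.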
Re: [HACKERS] [Proposal] Allow users to specify multiple tables in VACUUM commands
On Sat, Aug 26, 2017 at 8:00 AM, Robert Haas wrote:
> What happens if you say VACUUM partitioned_table (a), some_partition (b)?

Using v9, if you do that:

=# CREATE TABLE parent (id int) PARTITION BY RANGE (id);
CREATE TABLE
=# CREATE TABLE child_1_10001 partition of parent for values from (1) to (10001);
CREATE TABLE
=# CREATE TABLE child_10001_20001 partition of parent for values from (10001) to (20001);
CREATE TABLE
=# insert into parent values (generate_series(1,2));
INSERT 0 2

Vacuuming the parent vacuums all the children, so any child listed would get vacuumed twice; still, this does not cause an error:

=# vacuum parent, child_10001_2;
VACUUM

And with the de-duplication patch on top of it, things are vacuumed only once.

On Tue, Aug 29, 2017 at 7:56 AM, Bossart, Nathan wrote:
> On 8/23/17, 11:59 PM, "Michael Paquier" wrote:
>> + * relations is a list of VacuumRelations to process. If it is NIL, all
>> + * relations in the database are processed.
>> Typo here, VacuumRelation is singular.
>
> This should be fixed in v9.
>
> On 8/24/17, 5:45 PM, "Michael Paquier" wrote:
>> This makes me think that it could be a good idea to revisit this bit
>> in a separate patch. ANALYZE fails as well now when the same column is
>> defined multiple times with an incomprehensible error message.
>
> The de-duplication code is now in a separate patch,
> dedupe_vacuum_relations_v1.patch. I believe it fixes the incomprehensible
> error message you were experiencing, but please let me know if you are
> still hitting it.

It looks like the problems in this area are fixed using the second patch.

> On 8/25/17, 6:00 PM, "Robert Haas" wrote:
>> +oldcontext = MemoryContextSwitchTo(vac_context);
>> +foreach(lc, relations)
>> +temp_relations = lappend(temp_relations, copyObject(lfirst(lc)));
>> +MemoryContextSwitchTo(oldcontext);
>> +relations = temp_relations;
>>
>> Can't we just copyObject() on the whole list?
>
> I've made this change.
>
>> -ListCell *cur;
>> -
>>
>> Why change this?
>> Generally, declaring a separate variable in an inner
>> scope seems like better style than reusing one that happens to be
>> lying around in the outer scope.
>
> I've removed this change.
>
>> +VacuumRelation *relinfo = (VacuumRelation *) lfirst(lc);
>>
>> Could use lfirst_node.
>
> Done.
>
> On 8/28/17, 5:28 PM, "Michael Paquier" wrote:
>> Yes, if I understand that correctly. That's the point I am exactly
>> coming at. My suggestion is to have one VacuumRelation entry per
>> relation vacuumed, even for partitioned tables, and copy the list of
>> columns to each one.
>
> I've made this change in v9. It does clean up the patch quite a bit.

Here is some input for vacuum_multiple_tables_v9, which I think is getting close to something committable. Here are some minor comments.

-   With no parameter, VACUUM processes every table in the
+   With no parameters, VACUUM processes every table in the
    current database that the current user has permission to vacuum.
-   With a parameter, VACUUM processes only that table.
+   With parameters, VACUUM processes only the tables
+   specified.

The part about parameters looks fine to me if left unchanged.

+   foreach(lc, relations)
+   {
+       VacuumRelation *relation = lfirst_node(VacuumRelation, lc);
+       if (relation->va_cols != NIL && (options & VACOPT_ANALYZE) == 0)
+           ereport(ERROR,
+                   (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+                    errmsg("ANALYZE option must be specified when a column list is provided")));
+   }

Could you add a hint with the relation name involved here? When many relations are defined in the VACUUM query this would be useful for the user.
+   tuple = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
+   if (!HeapTupleIsValid(tuple))
+       elog(ERROR, "cache lookup failed for relation %u", relid);
+   classForm = (Form_pg_class) GETSTRUCT(tuple);
+   include_parts = (classForm->relkind == RELKIND_PARTITIONED_TABLE);
+   ReleaseSysCache(tuple);

It pains me to see that get_rel_relkind does not return an error if the relation is missing; we could use it here. I would welcome a refactoring with a missing_ok argument a lot! Now, this patch for VACUUM does not justify potentially breaking many extensions...

+   relinfo = makeNode(VacuumRelation);
+   rel_oid = HeapTupleGetOid(tuple);
+   relinfo->oid = rel_oid;

There are 4 patterns like that in the patch. We could make use of a makeVacuumRelation.

About the de-duplication patch, I have to admit that I am still not a fan of doing such a thing. Another road that we could take is to simply complain with a proper error message if:
- the same column name is specified twice for a relation.
- the same relation
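The makeVacuumRelation constructor suggested in the review above could look roughly like this. This is a sketch using simplified stand-ins for PostgreSQL's node types; the real helper would use makeNode(VacuumRelation) and the List API, and the field layout shown is only what the quoted patch hunks imply.

```c
/* Sketch of the suggested makeVacuumRelation() helper, with simplified
 * stand-ins for PostgreSQL's node infrastructure (RangeVar and List are
 * left opaque; the real code would use makeNode and pg_list.h). */
#include <assert.h>
#include <stdlib.h>

typedef unsigned int Oid;
typedef struct RangeVar RangeVar;   /* opaque stand-in */
typedef struct List List;           /* opaque stand-in; NULL plays NIL */

typedef struct VacuumRelation
{
    RangeVar *relation;  /* table name expression, or NULL if OID given */
    Oid       oid;       /* table's OID, or InvalidOid (0) */
    List     *va_cols;   /* column names to analyze, or NIL */
} VacuumRelation;

/* Replaces the repeated makeNode-then-assign pattern with one call. */
static VacuumRelation *
makeVacuumRelation(RangeVar *relation, Oid oid, List *va_cols)
{
    VacuumRelation *vrel = calloc(1, sizeof(VacuumRelation));

    vrel->relation = relation;
    vrel->oid = oid;
    vrel->va_cols = va_cols;
    return vrel;
}
```

Each of the 4 call sites in the patch would then collapse to a single `makeVacuumRelation(rv, relid, cols)` call.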
Re: [HACKERS] NoMovementScanDirection in heapgettup() and heapgettup_pagemode()
On Mon, Aug 28, 2017 at 5:50 PM, Tom Lane wrote:
> I think that's probably dead code given that ExecutorRun short-circuits
> everything for NoMovementScanDirection. There is some use of
> NoMovementScanDirection for indexscans, to denote an unordered index,
> but likely that could be got rid of too. That would leave us with
> NoMovementScanDirection having no other use except as a do-nothing
> flag for ExecutorRun.
>
> regards, tom lane

Thanks, Tom; yes, it seems to be dead code.

--
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] show "aggressive" or not in autovacuum logs
On Mon, Aug 28, 2017 at 2:26 AM, Kyotaro HORIGUCHI <horiguchi.kyot...@lab.ntt.co.jp> wrote:
> https://www.postgresql.org/docs/devel/static/runtime-config-client.html
>
> VACUUM performs an aggressive scan

Maybe this should get its own thread/patch, but I'll tack this on here since it all seems related. That paragraph you linked has a couple of typos:

"Although users can set this value anywhere from zero to two billions, VACUUM" ... should be 'two billion' (i.e., drop the "s")

"...so that a periodical manual VACUUM has..." should be 'periodic' - though the description in the linked 24.1.5 is somewhat clearer (and longer) - the gap exists for the benefit of routine vacuum invocations, so they can detect the need for an aggressive vacuum as part of a normal operating cycle, rather than the last routine vacuum being non-aggressive and an anti-wraparound autovacuum run being performed shortly thereafter.

Current:

VACUUM will silently limit the effective value to 95% of autovacuum_freeze_max_age, so that a periodical manual VACUUM has a chance to run before an anti-wraparound autovacuum is launched for the table.

My interpretation:

VACUUM will silently limit the effective value to 95% of autovacuum_freeze_max_age so that a normal scan has a window within which to detect the need to convert itself to an aggressive scan and preempt the need for an untimely autovacuum-initiated anti-wraparound scan.

As noted in 24.1.5, that "normal scan" can be time-scheduled or an update-driven auto-vacuum one. It could be manual (though not really periodic) if one is monitoring the aging and is notified to run one manually when the age falls within the gap.

"Aggressive" sounds right throughout all of this.
Up-thread:

INFO: vacuuming "public.it" in aggressive mode

Maybe:

INFO: aggressively vacuuming "public.it"
INFO: vacuuming "public.it"

Likewise:

LOG: automatic vacuum of table "postgres.public.pgbench_branches": mode: aggressive, index scans: 0

could be:

LOG: automatic aggressive vacuum of table "postgres.public.pgbench_branches", index scans: 0

Having read the docs and come to understand what "aggressive" means, both wordings work for me (i.e., leaving non-aggressive unadorned).

David J.
Re: [HACKERS] [BUGS] [postgresql 10 beta3] unrecognized node type: 90
On Mon, Aug 28, 2017 at 10:17 PM, Tom Lane wrote:
> In the meantime, I think what we should do is commit the bug fix more or
> less as I have it, and then work on Amit's concern about losing parallel
> efficiency by separating the resetting of shared parallel-scan state
> into a new plan tree traversal that's done before launching new worker
> processes. The only real alternative is to lobotomize the existing rescan
> optimizations, and that seems like a really poor choice from here.

There's already ExecParallelReinitialize, which could be made to walk the nodes in addition to what it does already, but I don't understand exactly what else needs fixing.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] hash partitioning based on v10Beta2
On Mon, Aug 28, 2017 at 10:44 PM, yangjie wrote:
> When the number of partitions and the data are more, adding new partitions,
> there will be some efficiency problems.
> I don't know how the solution you're talking about is how to implement a
> hash partition? I am having difficulty understanding this.

There was discussion on the other thread of how splitting partitions could be done reasonably efficiently with the proposed design; of course, it's never going to be super-cheap.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Re: [COMMITTERS] pgsql: pg_rewind: Fix some problems when copying files >2GB.
Michael Paquier writes:
> I don't like breaking the abstraction of pg_log() with the existing
> flags with some kind of pg_debug() layer. The set of APIs present now
> in pg_rewind for logging is nice to have, and I think that those debug
> messages should be translated. So what about the attached?

Your point about INT64_FORMAT not necessarily working with fprintf is an outstanding reason not to keep it like it is. I've not reviewed this patch in detail, but I think this is basically the way to fix it.

regards, tom lane
Re: [HACKERS] Re: [COMMITTERS] pgsql: pg_rewind: Fix some problems when copying files >2GB.
On Mon, Aug 28, 2017 at 11:24 PM, Robert Haas wrote:
> On Mon, Aug 28, 2017 at 10:16 AM, Alvaro Herrera wrote:
>>> I am fine with however you want to handle it, but it seems odd to me
>>> that we don't have a way of embedding INT64_FORMAT in a translatable
>>> string. Surely that's going to be a problem in some case, sometime,
>>> isn't it?
>>
>> The way we do that elsewhere is to print out the value to a string
>> variable and then use %s in the translatable message.
>
> Hmm. OK. That doesn't sound great, but if there's no better option...

I don't like breaking the abstraction of pg_log() with the existing flags with some kind of pg_debug() layer. The set of APIs present now in pg_rewind for logging is nice to have, and I think that those debug messages should be translated. So what about the attached?

--
Michael

rewind-int64-log.patch
Description: Binary data
Re: [HACKERS] [BUGS] [postgresql 10 beta3] unrecognized node type: 90
Robert Haas writes:
> On Mon, Aug 28, 2017 at 6:35 PM, Tom Lane wrote:
>> Maybe parallel_aware should have more than two values, depending
>> on whether the result of the node is context-dependent or not.

> It seems likely the whole concept of parallel_aware is only a
> zero-order approximation to what we really want.

Yeah, I agree --- but it's also clear that we don't yet know what it should be. We'll have to work that out as we accrete more functionality. In the meantime, I think what we should do is commit the bug fix more or less as I have it, and then work on Amit's concern about losing parallel efficiency by separating the resetting of shared parallel-scan state into a new plan tree traversal that's done before launching new worker processes. The only real alternative is to lobotomize the existing rescan optimizations, and that seems like a really poor choice from here.

regards, tom lane
Re: [HACKERS] Protect syscache from bloating with negative cache entries
On Mon, Aug 28, 2017 at 5:24 AM, Kyotaro HORIGUCHI wrote:
> This patch have had interferences from several commits after the
> last submission. I amended this patch to follow them (up to
> f97c55c), removed an unnecessary branch and edited some comments.

I think the core problem for this patch is that there's no consensus on what approach to take. Until that somehow gets sorted out, I think this isn't going to make any progress. Unfortunately, I don't have a clear idea what sort of solution everybody could tolerate.

I still think that some kind of slow-expire behavior -- like a clock hand that hits each backend every 10 minutes and expires entries not used since the last hit -- is actually pretty sensible. It ensures that idle or long-running backends don't accumulate infinite bloat while still allowing the cache to grow large enough for good performance when all entries are being regularly used. But Tom doesn't like it. Other approaches were also discussed; none of them seem like an obvious slam-dunk.

Turning to the patch itself, I don't know how we decide whether the patch is worth it. Scanning the whole (potentially large) cache to remove negative entries has a cost, mostly in CPU cycles; keeping those negative entries around for a long time also has a cost, mostly in memory. I don't know how to decide whether these patches will help more people than they hurt, or the other way around -- and it's not clear that anyone else has a good idea about that either.

Typos: funciton, paritial.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
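The slow-expire ("clock hand") behavior described above can be sketched with a simple epoch scheme. This is an illustration of the idea, not proposed code: each periodic sweep starts a new epoch and evicts entries that were not touched since the previous sweep.

```c
/* Sketch of the clock-hand expiry idea (illustrative, hypothetical names).
 * Accessing an entry stamps it with the current epoch; a periodic sweep
 * (e.g. every 10 minutes) evicts entries not touched since the previous
 * sweep, then advances the epoch. */
#include <assert.h>

#define NENTRIES 4

typedef struct CacheEntry
{
    int      in_use;
    unsigned last_used_epoch;   /* epoch of most recent access */
} CacheEntry;

static CacheEntry cache[NENTRIES];
static unsigned   epoch = 1;

static void
touch_entry(int i)
{
    cache[i].in_use = 1;
    cache[i].last_used_epoch = epoch;
}

/* The "clock hand": evicts entries idle for a whole sweep interval.
 * Returns the number of entries evicted. */
static int
sweep_cache(void)
{
    int evicted = 0;

    for (int i = 0; i < NENTRIES; i++)
    {
        if (cache[i].in_use && cache[i].last_used_epoch < epoch)
        {
            cache[i].in_use = 0;
            evicted++;
        }
    }
    epoch++;
    return evicted;
}
```

Hot entries survive every sweep regardless of cache size, while an idle backend's stale (including negative) entries age out after one interval — which is the trade-off the paragraph above weighs against a full-cache scan's CPU cost.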
[HACKERS] Improving overflow checks when adding tuple to PGresult Re: [GENERAL] Retrieving query results
On Mon, Aug 28, 2017 at 3:05 PM, Michael Paquier wrote:
> Attached are two patches:
> 1) 0001 refactors the code around pqAddTuple to be able to handle
> error messages and assign them in PQsetvalue particularly.
> 2) 0002 adds sanity checks in pqAddTuple for overflows, maximizing the
> size of what is allocated to INT_MAX but not more.
>
> pqRowProcessor() still has errmsgp, but it is never used on HEAD. At
> least with this set of patches it comes to be useful. We could rework
> check_field_number() to use an error message string as well, but I
> have left that out to keep things simple. Not sure if any complication
> is worth it compared to just copying the error message in case of an
> unmatching column number.

As this change requires I think an extra lookup, I am moving the discussion to -hackers with a proper subject and the set of patches attached (and the test program). This patch is registered in the next commit fest.

--
Michael

0001-Refactor-error-message-handling-in-pqAddTuple.patch
Description: Binary data

0002-Improve-overflow-checks-of-pqAddTuple-in-libpq.patch
Description: Binary data

/*
 * Script to test PQcopyResult and subsequently PQsetvalue.
 * Compile with for example:
 * gcc -lpq -g -o pg_copy_res pg_copy_res.c
 */
#include <stdio.h>
#include <stdlib.h>
#include "libpq-fe.h"

#define DEFAULT_PORT "5432"
#define DEFAULT_HOST "/tmp"
#define DEFAULT_DB "postgres"

int
main()
{
	char	   *port = getenv("PGPORT");
	char	   *host = getenv("PGHOST");
	char	   *dbname = getenv("PGDATABASE");
	char		connstr[512];
	PGconn	   *conn;
	PGresult   *res,
			   *res_copy;

	if (port == NULL)
		port = DEFAULT_PORT;
	if (host == NULL)
		host = DEFAULT_HOST;
	if (dbname == NULL)
		dbname = DEFAULT_DB;

	snprintf(connstr, sizeof(connstr),
			 "port=%s host=%s dbname=%s", port, host, dbname);
	conn = PQconnectdb(connstr);
	if (PQstatus(conn) != CONNECTION_OK)
	{
		fprintf(stderr, "Connection to database failed: %s",
				PQerrorMessage(conn));
		return 1;
	}

	res = PQexec(conn, "SELECT 1");

	/* Copy the result; we don't care what it contains... */
	res_copy = PQcopyResult(res, PG_COPYRES_TUPLES | PG_COPYRES_ATTRS);

	PQclear(res);
	PQclear(res_copy);
	PQfinish(conn);
	return 0;
}
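The kind of overflow check 0002 adds to pqAddTuple can be illustrated in isolation. This is a sketch of the growth rule, not the patch itself (the function name and initial size are made up): doubling a tuple-array capacity must neither overflow `int` nor exceed INT_MAX elements.

```c
/* Sketch of an overflow-safe doubling policy for a tuple array, in the
 * spirit of the pqAddTuple checks (illustrative; names and the initial
 * size of 128 are hypothetical, not libpq's). */
#include <assert.h>
#include <limits.h>

/* Returns the capacity to use for one more tuple, or -1 if the array
 * cannot grow any further. */
static int
grow_tuple_capacity(int ntuples, int tupArrSize)
{
    if (ntuples < tupArrSize)
        return tupArrSize;          /* still room, no growth needed */

    if (tupArrSize >= INT_MAX / 2)
    {
        /* doubling would overflow int: clamp to INT_MAX if possible */
        if (tupArrSize == INT_MAX)
            return -1;              /* already at the hard limit */
        return INT_MAX;
    }

    return tupArrSize > 0 ? tupArrSize * 2 : 128;
}
```

The key point is that the overflow test (`tupArrSize >= INT_MAX / 2`) happens before the multiplication, so the check itself can never overflow.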
Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization
On Mon, Aug 28, 2017 at 4:59 PM, Fabien COELHO wrote:
>
> Hello Masahiko-san,
>
> Patch applies cleanly, compiles, works for me.

Thank you for reviewing!

>>> At least: "Required to invoke" -> "Require to invoke".
>>
>> Fixed.
>
> I meant the added one about -I, not the pre-existing one about -i.

Fixed.

>>> About the code:
>>>
>>> is_no_vacuum should be a bool?
>>
>> We can change it but I think there is no difference actually. So
>> keeping it would be better.
>
> I would like to insist a little bit: the name was declared as an int but
> passed to init and used as a bool there before the patch. Conceptually what
> is meant is really a bool, and I see no added value bar saving a few strokes
> to have an int. ISTM that recent pgbench changes have started turning old
> int-for-bool habits into using bool when bool is meant.

Since is_no_vacuum is an existing variable, if we follow that habit we should change other similar variables as well: is_init_mode, do_vacuum_accounts and debug. And I think we should change them in a separate patch.

> initialize_cmds initialization: rather use pg_strdup instead of
> pg_malloc/strcpy?

Fixed.

> -I: pg_free before pg_strdup to avoid a small memory leak?

Fixed.

> I'm not sure I would have bothered with sizeof(char), but why not!
>
> I'm still a little bit annoyed by checkCustomCmds: when unhappy, it shows an
> error message and returns false, which immediately results in exit(1). However
> the pattern elsewhere in pgbench is to show the error and exit immediately. I
> would suggest to simplify by void-ing the function and exiting instead of
> returning false.

Agreed, fixed.

After more thought, I'm a bit inclined not to have a short option for --custom-initialize, because this option will be used in non-primary cases. It would be better to save short options for future enhancements of pgbench. Thoughts?

Attached latest patch.
Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

pgbench_custom_initialization_v6.patch
Description: Binary data
Re: [HACKERS] Custom allocators in libpq
On 29 August 2017 at 05:15, Tom Lane wrote:
> Peter Eisentraut writes:
>> On 8/28/17 15:11, Tom Lane wrote:
>>> ... but it seems like you're giving up a lot of the possible uses if
>>> you don't make it apply uniformly. I admit I'm not sure how we'd handle
>>> the initial creation of a connection object with a custom malloc. The
>>> obvious solution of requiring the functions to be specified at PQconnect
>>> time seems to require Yet Another PQconnect Variant, which is not very
>>> appetizing.
>
>> I would have expected a separate function just to register the callback
>> functions, before doing anything else with libpq. Similar to libxml:
>> http://xmlsoft.org/xmlmem.html
>
> I really don't much care for libxml's solution, because it implies
> global variables, with the attendant thread-safety issues. That's
> especially bad if you want a passthrough such as a memory context
> pointer, since it's quite likely that different call sites would
> need different passthrough values, even assuming that a single set
> of callback functions would suffice for an entire application.
> That latter assumption isn't so pleasant either. One could expect
> that by using such a solution, postgres_fdw could be expected to
> break, say, a libpq-based DBI library inside plperl.

Yeah, the 'register a malloc() function pointer in a global via a registration function call' approach seems fine and dandy until you find yourself with an app that, via shared library loads, has more than one different user of libpq with its own ideas about memory allocation. RTLD_LOCAL can help, but may introduce other issues.

So there doesn't seem much way around another PQconnect variant. Yay?

We could switch to a struct-passing argument model, but by the time you add the necessary "nfields" argument to allow you to know how much of the struct you can safely access, etc, just adding new connect functions starts to look good in comparison.
Which reminds me, it kind of stinks that PQconnectdbParams and PQpingParams accept key and value char* arrays, but PQconninfoParse produces a PQconninfoOption* . This makes it seriously annoying when you want to parse a connstring, make some transformations and pass it to a connect function. I pretty much always just put the user's original connstring in 'dbname' and set expand_dbname = true instead. It might make sense to have any new function accept PQconninfoOption*. Or a variant of PQconninfoParse that populates k/v arrays with 'n' extra fields allocated and zeroed on return, I guess. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] show "aggressive" or not in autovacuum logs
On Mon, Aug 28, 2017 at 5:26 AM, Kyotaro HORIGUCHI wrote:
> Currently the message shows the '%d skipped-frozen' message but
> it is insufficient to verify the true effect. This is a patch to
> show mode as 'aggressive' or 'normal' in the closing message of
> vacuum. %d frozen-skipped when 'aggressive mode' shows the true
> effect of ALL_FROZEN.
>
> I will add this patch to CF2017-09.

I would be a bit inclined to somehow show aggressive if it's aggressive and not insert anything at all otherwise. That'd probably require two separate translatable strings in each case, but maybe that's OK. What do other people think?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Write operations in parallel mode
On Mon, Aug 28, 2017 at 12:23 PM, Antonin Houska wrote:
> Now that the dynamic shared memory hash table has been committed
> (8c0d7bafad36434cb08ac2c78e69ae72c194ca20), I wonder if it's still a big deal
> to remove restrictions like this (e.g. in heap_update()):
>
> /*
>  * Forbid this during a parallel operation, lest it allocate a combocid.
>  * Other workers might need that combocid for visibility checks, and we
>  * have no provision for broadcasting it to them.
>  */
> if (IsInParallelMode())
>     ereport(ERROR,
>             (errcode(ERRCODE_INVALID_TRANSACTION_STATE),
>              errmsg("cannot update tuples during a parallel operation")));

Well, it certainly doesn't solve that problem directly, but I think it is useful infrastructure that lends itself to some kind of a solution to that problem. There are other problems, too, like group locking vs. the relation extension lock, though Masahiko-san posted a patch for that IIRC, and also group locking vs. GIN page locks, and also proper interlocking of updates.

I think that the order in which we should tackle things is probably something like:

1. Make inserts parallel-restricted rather than parallel-unsafe - i.e. allow inserts but only in the leader. This allows plans like Insert -> Gather -> whatever. I'm not sure there's a lot to do here other than allow such plans to be generated.

2. Make updates and deletes parallel-restricted rather than parallel-unsafe - i.e. allow updates and deletes but only in the leader. This similarly would allow Update -> Gather -> whatever and Delete -> Gather -> whatever. For this, you'd need a shared combo CID hash so that workers can learn about new combo CIDs created by the leader.

3. Make inserts parallel-safe rather than parallel-restricted. This allows plans like Gather -> Insert -> whatever, which is way better than Insert -> Gather -> whatever. If there's no RETURNING, the Gather isn't actually gathering anything. This requires sorting out some group locking issues, but not tuple locking.

4.
Make updates and deletes parallel-safe rather than parallel-restricted. Now tuple locking has to be sorted out. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [BUGS] [postgresql 10 beta3] unrecognized node type: 90
On Mon, Aug 28, 2017 at 6:35 PM, Tom Lane wrote:
> but probably we should think of a more arm's-length way to do it.
> Maybe parallel_aware should have more than two values, depending
> on whether the result of the node is context-dependent or not.

My original intent for the parallel_aware flag was for it to signal whether the plan node was going to do something functionally different when in parallel mode. For scans, that's come to mean "partition the input among the workers", and there doesn't seem to be any other sensible meaning. I don't have a good idea what it's going to mean for non-scan nodes yet. Parallel Hash will be the first parallel-aware non-scan node, and it uses the flag to mean that the hash table is in dynamic shared memory, so that the inner side can be partial (which is otherwise not possible). I'm guessing that is going to be a common meaning for nodes that store stuff - it's easy to imagine Parallel Materialize, Parallel Sort, Parallel HashAggregate with similar semantics. There's also a proposed patch for Parallel Append where it signals that DSM is being used to coordinate task scheduling and load balancing. It seems likely the whole concept of parallel_aware is only a zero-order approximation to what we really want. This bug is, IMHO, the first really tangible evidence of the concept creaking around the edges, but I've kind of had a feeling for a while that it wasn't likely to be problem-free. I'm still not sure exactly what the right answer will turn out to be.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Challenges preventing us moving to 64 bit transaction id (XID)?
On Thu, Jul 6, 2017 at 07:29:07AM -0700, Jim Finnerty wrote:
> re: "The problem is if you want to delete from such a page. Then you need to
> update the tuple's xmax and stick the new xid epoch somewhere."

I am coming to this very late, but wouldn't such a row be marked using our frozen-committed fixed xid so it doesn't matter what the xid epoch is? I realize with 64-bit xids we don't need to freeze tuples, but we could still use a frozen-committed fixed xid, see:

    #define FrozenTransactionId ((TransactionId) 2)

-- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
Re: [HACKERS] [BUGS] [postgresql 10 beta3] unrecognized node type: 90
Robert Haas writes:
> On Mon, Aug 28, 2017 at 3:00 PM, Tom Lane wrote:
>> Count the "that"s (you're failing to notice the next line).
> OK, true. But "Academic literature" -> "The academic literature" is
> just second-guessing, I think.

No, it was more to avoid reflowing the paragraph (or leaving a weirdly short line).

> There should never be a parallel_aware node that's not beneath a
> Gather or Gather Merge; I don't know what the meaning of such a plan
> would be, so I think we're safe against such a thing appearing in the
> future. What I'm unclear about is what happens with nodes that aren't
> directly in the chain between the Gather and the parallel-aware node.

Nothing. The parallel-aware node(s) get marked as dependent on the rescan parameter, and then that marking bubbles up to their ancestor nodes (up to the Gather). Nodes that are not ancestral to any parallel-aware node are unchanged.

> Now consider Parallel Hash (not yet committed), where we might get this:
>
> Something
> -> Gather
>    -> Merge Join
>       -> Sort
>          -> Parallel Seq Scan on a
>       -> Hash Join
>          -> Index Scan on b
>          -> Parallel Hash
>             -> Parallel Seq Scan on c
>
> Now what? We clearly still need to force a re-sort of a, but what
> about the shared hash table built on c? If we've still got that hash
> table and it was a single-batch join, there's probably no need to
> rescan it even though both the Parallel Hash node and the Parallel Seq
> Scan beneath it are parallel-aware. In this case all workers
> collaborated to build a shared hash table covering all rows from c; if
> we've still got that, it's still good. In fact, even the workers can
> reuse that hash table even though, for them, it'll not really be a
> re-scan at all.

Well, I'd say that's something for the parallel hash patch to resolve. Yes, if the contents of the hash table are expected to be the same regardless of how many workers were involved, then we shouldn't need to rebuild it after a rescan of the Gather. That would mean not marking either Parallel Hash or its descendant Parallel Seq Scan as dependent on the Gather's rescan param. That's not terribly hard to mechanize within the structure of this patch: just ignore the param at/below the Parallel Hash. Cowboy coding would be, perhaps,

    if (plan->parallel_aware)
    {
        if (gather_param < 0)
            elog(ERROR, "parallel-aware plan node is not below a Gather");
+       if (IsA(plan, Hash))
+           gather_param = -1;
+       else
        context.paramids = bms_add_member(context.paramids, gather_param);
    }

but probably we should think of a more arm's-length way to do it. Maybe parallel_aware should have more than two values, depending on whether the result of the node is context-dependent or not.

regards, tom lane
Re: [HACKERS] [Proposal] Allow users to specify multiple tables in VACUUM commands
On Sat, Aug 26, 2017 at 8:00 AM, Robert Haas wrote:
> On Thu, Aug 24, 2017 at 12:59 AM, Michael Paquier wrote:
>> Robert, Amit and other folks working on extending the existing
>> partitioning facility would be in better position to answer that, but
>> I would think that we should have something as flexible as possible,
>> and storing a list of relation OID in each VacuumRelation makes it
>> harder to track the uniqueness of relations vacuumed. I agree that the
>> concept of a partition with multiple parents induces a lot of
>> problems. But the patch as proposed worries me as it complicates
>> vacuum() with a double loop: one for each relation vacuumed, and one
>> inside it with the list of OIDs. Modules calling vacuum() could also
>> use flexibility, being able to analyze a custom list of columns for
>> each relation has value as well.
>
> So ... why have a double loop? I mean, you could just expand this out
> to one entry per relation actually being vacuumed, couldn't you?

Yes, if I understand that correctly. That's the point I am exactly coming at. My suggestion is to have one VacuumRelation entry per relation vacuumed, even for partitioned tables, and copy the list of columns to each one.

> +    oldcontext = MemoryContextSwitchTo(vac_context);
> +    foreach(lc, relations)
> +        temp_relations = lappend(temp_relations, copyObject(lfirst(lc)));
> +    MemoryContextSwitchTo(oldcontext);
> +    relations = temp_relations;
>
> Can't we just copyObject() on the whole list?

Yup.
-- Michael
Re: [HACKERS] [BUGS] [postgresql 10 beta3] unrecognized node type: 90
On Mon, Aug 28, 2017 at 3:00 PM, Tom Lane wrote:
>> That sentence isn't wrong as written.
>
> Count the "that"s (you're failing to notice the next line).

OK, true. But "Academic literature" -> "The academic literature" is just second-guessing, I think.

>> I don't really understand the part about depending on a parallel-aware
>> node. I mean, there should always be one, except in the
>> single-copy-Gather case, but why is it right to depend on that rather
>> than anything else? What happens when the Parallel Hash patch goes in
>> and we have multiple parallel-aware scan nodes (plus a parallel-aware
>> Hash node) under the same Gather?
>
> Well, that's what I'm asking. AFAICS we only really need the scan node(s)
> to be marked as depending on the Gather's rescan parameter. It would not,
> however, hurt anything for nodes above them to be so marked as well ---
> and even if we didn't explicitly mark them, those nodes would end up
> depending on the parameter anyway because of the way that parameter
> dependency propagation works. I think the question boils down to whether
> a "parallel_aware" node would ever not be underneath a related Gather.

There should never be a parallel_aware node that's not beneath a Gather or Gather Merge; I don't know what the meaning of such a plan would be, so I think we're safe against such a thing appearing in the future. What I'm unclear about is what happens with nodes that aren't directly in the chain between the Gather and the parallel-aware node. For instance:

Something
-> Gather
   -> Merge Join
      -> Sort
         -> Parallel Seq Scan on a
      -> Merge Join
         -> Sort
            -> Seq Scan on b
         -> Sort
            -> Seq Scan on c

If the Gather gets rescanned, is it OK to force a re-sort of a but not of b or c? Hmm, maybe so. The workers are going to have to do the sorts of b and c since any workers from a previous scan are GONE, but if the leader's done that work, it's still good. Similarly:

Something
-> Gather
   -> Merge Join
      -> Sort
         -> Parallel Seq Scan on a
      -> Hash Join
         -> Index Scan on b
         -> Hash
            -> Seq Scan on c

If the leader's got an existing hash table built on c, it can reuse it. The workers will need to build one. Now consider Parallel Hash (not yet committed), where we might get this:

Something
-> Gather
   -> Merge Join
      -> Sort
         -> Parallel Seq Scan on a
      -> Hash Join
         -> Index Scan on b
         -> Parallel Hash
            -> Parallel Seq Scan on c

Now what? We clearly still need to force a re-sort of a, but what about the shared hash table built on c? If we've still got that hash table and it was a single-batch join, there's probably no need to rescan it even though both the Parallel Hash node and the Parallel Seq Scan beneath it are parallel-aware. In this case all workers collaborated to build a shared hash table covering all rows from c; if we've still got that, it's still good. In fact, even the workers can reuse that hash table even though, for them, it'll not really be a re-scan at all.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Custom allocators in libpq
Peter Eisentraut writes:
> On 8/28/17 15:11, Tom Lane wrote:
>> ... but it seems like you're giving up a lot of the possible uses if
>> you don't make it apply uniformly. I admit I'm not sure how we'd handle
>> the initial creation of a connection object with a custom malloc. The
>> obvious solution of requiring the functions to be specified at PQconnect
>> time seems to require Yet Another PQconnect Variant, which is not very
>> appetizing.

> I would have expected a separate function just to register the callback
> functions, before doing anything else with libpq. Similar to libxml:
> http://xmlsoft.org/xmlmem.html

I really don't much care for libxml's solution, because it implies global variables, with the attendant thread-safety issues. That's especially bad if you want a passthrough such as a memory context pointer, since it's quite likely that different call sites would need different passthrough values, even assuming that a single set of callback functions would suffice for an entire application. That latter assumption isn't so pleasant either: one could easily imagine postgres_fdw's use of such a solution breaking, say, a libpq-based DBI library inside plperl.

regards, tom lane
Re: [HACKERS] Custom allocators in libpq
On 8/28/17 15:11, Tom Lane wrote:
> ... but it seems like you're giving up a lot of the possible uses if
> you don't make it apply uniformly. I admit I'm not sure how we'd handle
> the initial creation of a connection object with a custom malloc. The
> obvious solution of requiring the functions to be specified at PQconnect
> time seems to require Yet Another PQconnect Variant, which is not very
> appetizing.

I would have expected a separate function just to register the callback functions, before doing anything else with libpq. Similar to libxml: http://xmlsoft.org/xmlmem.html

-- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Custom allocators in libpq
On Mon, Aug 28, 2017 at 03:11:26PM -0400, Tom Lane wrote:
> Aaron Patterson writes:
> > I would like to be able to configure libpq with custom malloc functions.
>
> I can see the potential value of this ...
>
> > This patch doesn't replace all malloc calls to the configured ones, just
> > the mallocs related to creating result objects (which is what I'm
> > concerned with).
>
> ... but it seems like you're giving up a lot of the possible uses if
> you don't make it apply uniformly.

I'm happy to make the changes uniformly! I'll do that and update the patch.

> I admit I'm not sure how we'd handle
> the initial creation of a connection object with a custom malloc. The
> obvious solution of requiring the functions to be specified at PQconnect
> time seems to require Yet Another PQconnect Variant, which is not very
> appetizing.

Other libraries I've worked with allow me to malloc a struct, then pass it to an initialization function. This might take a bit of refactoring, like introducing a new `PQconnectStart`, but might be worthwhile.

> I also wonder whether you wouldn't want a passthrough argument.
> For instance, one of the use-cases that springs to mind immediately is
> teaching postgres_fdw and dblink to use this so that their result objects
> are palloc'd not malloc'd, allowing removal of lots of PG_TRY overhead.
> While I suppose we could have the hook functions always allocate in
> CurrentMemoryContext, it'd likely be useful to be able to specify
> "use context X" at creation time.

We don't need this for the Ruby wrapper, but I've seen other libraries do it. I'm happy to add it as well. Thanks!

-- Aaron Patterson http://tenderlovemaking.com/
Re: [HACKERS] expanding inheritance in partition bound order
On Mon, Aug 28, 2017 at 6:38 AM, Amit Langote wrote:
> I am worried about the open relcache reference in PartitionDispatch when
> we start using it in the planner. Whereas there is a ExecEndModifyTable()
> as a suitable place to close that reference, there doesn't seem to exist
> one within the planner, but I guess we will have to figure something out.

Yes, I think there's no real way to avoid having to figure that out.

> OK, done this way in the attached updated patch. Any suggestions about a
> better name for what the patch calls PartitionTupleRoutingInfo?

I think this patch could be further simplified by continuing to use the existing function signature for RelationGetPartitionDispatchInfo instead of changing it to return a List * rather than an array. I don't see any benefit to such a change. The current system is more efficient. I keep having the feeling that this is a big patch with a small patch struggling to get out. Is it really necessary to change RelationGetPartitionDispatchInfo so much or could you just do a really minimal surgery to remove the code that sets the stuff we don't need? Like this:

diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index 96a64ce6b2..4fabcf9f32 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -1089,29 +1089,7 @@ RelationGetPartitionDispatchInfo(Relation rel,
 		pd[i] = (PartitionDispatch) palloc(sizeof(PartitionDispatchData));
 		pd[i]->reldesc = partrel;
 		pd[i]->key = partkey;
-		pd[i]->keystate = NIL;
 		pd[i]->partdesc = partdesc;
-		if (parent != NULL)
-		{
-			/*
-			 * For every partitioned table other than root, we must store a
-			 * tuple table slot initialized with its tuple descriptor and a
-			 * tuple conversion map to convert a tuple from its parent's
-			 * rowtype to its own. That is to make sure that we are looking at
-			 * the correct row using the correct tuple descriptor when
-			 * computing its partition key for tuple routing.
-			 */
-			pd[i]->tupslot = MakeSingleTupleTableSlot(tupdesc);
-			pd[i]->tupmap = convert_tuples_by_name(RelationGetDescr(parent),
-												   tupdesc,
-								  gettext_noop("could not convert row type"));
-		}
-		else
-		{
-			/* Not required for the root partitioned table */
-			pd[i]->tupslot = NULL;
-			pd[i]->tupmap = NULL;
-		}
 		pd[i]->indexes = (int *) palloc(partdesc->nparts * sizeof(int));

 		/*

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Custom allocators in libpq
Aaron Patterson writes:
> I would like to be able to configure libpq with custom malloc functions.

I can see the potential value of this ...

> This patch doesn't replace all malloc calls to the configured ones, just
> the mallocs related to creating result objects (which is what I'm
> concerned with).

... but it seems like you're giving up a lot of the possible uses if you don't make it apply uniformly. I admit I'm not sure how we'd handle the initial creation of a connection object with a custom malloc. The obvious solution of requiring the functions to be specified at PQconnect time seems to require Yet Another PQconnect Variant, which is not very appetizing. I also wonder whether you wouldn't want a passthrough argument. For instance, one of the use-cases that springs to mind immediately is teaching postgres_fdw and dblink to use this so that their result objects are palloc'd not malloc'd, allowing removal of lots of PG_TRY overhead. While I suppose we could have the hook functions always allocate in CurrentMemoryContext, it'd likely be useful to be able to specify "use context X" at creation time.

regards, tom lane
Re: [HACKERS] [BUGS] [postgresql 10 beta3] unrecognized node type: 90
Robert Haas writes:
> On Mon, Aug 28, 2017 at 10:47 AM, Tom Lane wrote:

- fuller description. Academic literature on parallel query suggests that
+ fuller description. The academic literature on parallel query suggests

> That sentence isn't wrong as written.

Count the "that"s (you're failing to notice the next line).

> I don't really understand the part about depending on a parallel-aware
> node. I mean, there should always be one, except in the
> single-copy-Gather case, but why is it right to depend on that rather
> than anything else? What happens when the Parallel Hash patch goes in
> and we have multiple parallel-aware scan nodes (plus a parallel-aware
> Hash node) under the same Gather?

Well, that's what I'm asking. AFAICS we only really need the scan node(s) to be marked as depending on the Gather's rescan parameter. It would not, however, hurt anything for nodes above them to be so marked as well --- and even if we didn't explicitly mark them, those nodes would end up depending on the parameter anyway because of the way that parameter dependency propagation works. I think the question boils down to whether a "parallel_aware" node would ever not be underneath a related Gather.

regards, tom lane
Re: [HACKERS] [BUGS] [postgresql 10 beta3] unrecognized node type: 90
On Mon, Aug 28, 2017 at 10:47 AM, Tom Lane wrote:
> That seems like an unacceptably fragile assumption. Even if it happens to
> be true today, we would need to fix it sooner or later. (And I kinda
> suspect it's possible to break it today, anyway. Treating PARAM_EXEC
> Params as parallel-restricted seems to lock out the easiest cases, but we
> have param slots that don't correspond to any Param node, eg for recursive
> union worktables. replace_nestloop_params is also a source of PARAM_EXEC
> Params that won't be detected during is_parallel_safe() tests, because it
> happens later.)

Those particular cases are, I think, handled. The CTE case is handled by considering CTE scans as parallel-restricted, and the nestloop case is handled by insisting that all partial paths must be unparameterized. You can join a partial path to a parameterized non-partial path to make a new partial path, but neither the original partial path nor the resulting one can itself be parameterized.

- fuller description. Academic literature on parallel query suggests that
+ fuller description. The academic literature on parallel query suggests

That sentence isn't wrong as written. I don't really understand the part about depending on a parallel-aware node. I mean, there should always be one, except in the single-copy-Gather case, but why is it right to depend on that rather than anything else? What happens when the Parallel Hash patch goes in and we have multiple parallel-aware scan nodes (plus a parallel-aware Hash node) under the same Gather?

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
[HACKERS] Custom allocators in libpq
Hello! I would like to be able to configure libpq with custom malloc functions. The reason is that we have a Ruby wrapper that exposes libpq in Ruby. The problem is that Ruby's GC doesn't know how much memory has been allocated by libpq, so no pressure is applied to the GC when it should be. Ruby exports malloc functions that automatically apply GC pressure, and I'd like to be able to configure libpq to use those malloc functions. I've attached two patches that add this functionality. The first patch introduces a new function `PQunescapeByteaConn` which takes a connection (so we have a place to get the malloc functions). We already have `PQescapeBytea` and `PQescapeByteaConn`; this first patch gives us the analogous `PQunescapeBytea` and `PQunescapeByteaConn`. The second patch adds malloc function pointer fields to `PGEvent`, `pg_result`, and `pg_conn` structs, and changes libpq internals to use those allocators rather than directly calling `malloc`. This patch doesn't replace all malloc calls with the configured ones, just the mallocs related to creating result objects (which is what I'm concerned with). If there's something I'm missing, please let me know. This is my first patch to libpq, so I look forward to hearing feedback. Thanks for your time!

-- Aaron Patterson http://tenderlovemaking.com/

From be463a2da13734d5b104a5fc9c58e7bbb8908323 Mon Sep 17 00:00:00 2001
From: Aaron Patterson
Date: Thu, 19 Nov 2015 14:44:49 -0800
Subject: [PATCH 1/2] add PQunescapeByteaConn for unescaping bytes given a connection

This commit adds an unescape method that takes a connection as a parameter. The connection may have settings on it (like allocators) that are important to unescaping the string.
---
 src/interfaces/libpq/exports.txt |  1 +
 src/interfaces/libpq/fe-exec.c   | 21 +++--
 src/interfaces/libpq/libpq-fe.h  |  6 --
 3 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/src/interfaces/libpq/exports.txt b/src/interfaces/libpq/exports.txt
index d6a38d0df8..ba718f4071 100644
--- a/src/interfaces/libpq/exports.txt
+++ b/src/interfaces/libpq/exports.txt
@@ -172,3 +172,4 @@ PQsslAttribute 169
 PQsetErrorContextVisibility 170
 PQresultVerboseErrorMessage 171
 PQencryptPasswordConn 172
+PQunescapeByteaConn 173
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index e1e2d18e3a..464bdc635f 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -3651,8 +3651,9 @@ PQescapeBytea(const unsigned char *from, size_t from_length, size_t *to_length)
  * \ooo == a byte whose value = ooo (ooo is an octal number)
  * \x == x (x is any character not matched by the above transformations)
  */
-unsigned char *
-PQunescapeBytea(const unsigned char *strtext, size_t *retbuflen)
+static unsigned char *
+PQunescapeByteaInternal(PGconn *conn,
+						const unsigned char *strtext, size_t *retbuflen)
 {
 	size_t		strtextlen,
 				buflen;
@@ -3762,3 +3763,19 @@ PQunescapeBytea(const unsigned char *strtext, size_t *retbuflen)
 	*retbuflen = buflen;
 	return tmpbuf;
 }
+
+unsigned char *
+PQunescapeByteaConn(PGconn *conn,
+					const unsigned char *strtext, size_t *retbuflen)
+{
+	if (!conn)
+		return NULL;
+
+	return PQunescapeByteaInternal(conn, strtext, retbuflen);
+}
+
+unsigned char *
+PQunescapeBytea(const unsigned char *strtext, size_t *retbuflen)
+{
+	return PQunescapeByteaInternal(NULL, strtext, retbuflen);
+}
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 1d915e7915..7fce4b387e 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -527,13 +527,15 @@ extern char *PQescapeIdentifier(PGconn *conn, const char *str, size_t len);
 extern unsigned char *PQescapeByteaConn(PGconn *conn,
 				  const unsigned char *from, size_t from_length,
 				  size_t *to_length);
-extern unsigned char *PQunescapeBytea(const unsigned char *strtext,
-				  size_t *retbuflen);
+extern unsigned char *PQunescapeByteaConn(PGconn *conn,
+				  const unsigned char *strtext, size_t *retbuflen);
 
 /* These forms are deprecated! */
 extern size_t PQescapeString(char *to, const char *from, size_t length);
 extern unsigned char *PQescapeBytea(const unsigned char *from,
 				  size_t from_length, size_t *to_length);
+extern unsigned char *PQunescapeBytea(const unsigned char *strtext,
+				  size_t *retbuflen);
-- 
2.11.0

From b62fed4a7e38ef9e2057d5556312a5a76cdf Mon Sep 17 00:00:00 2001
From: Aaron Patterson
Date: Thu, 19 Nov 2015 15:28:27 -0800
Subject: [PATCH 2/2] use configured malloc /
[HACKERS] Write operations in parallel mode
Now that dynamic shared memory hash table has been committed (8c0d7bafad36434cb08ac2c78e69ae72c194ca20) I wonder if it's still a big deal to remove restrictions like this in (e.g. heap_update()):

    /*
     * Forbid this during a parallel operation, lest it allocate a combocid.
     * Other workers might need that combocid for visibility checks, and we
     * have no provision for broadcasting it to them.
     */
    if (IsInParallelMode())
        ereport(ERROR,
                (errcode(ERRCODE_INVALID_TRANSACTION_STATE),
                 errmsg("cannot update tuples during a parallel operation")));

-- Antonin Houska Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt Web: http://www.postgresql-support.de, http://www.cybertec.at
Re: [HACKERS] auto_explain : log queries with wrong estimation
On 08/24/2017 03:08 PM, Maksim Milyutin wrote: [...]
> AFAICS you want to introduce two additional per-node variables:
> - auto_explain_log_estimate_ratio that denotes minimum ratio (>= 1) between
> real value and planned one. I would add 'min' prefix before 'ratio'.
> - auto_explain_log_estimate_min_rows - minimum absolute difference between
> those two values. IMHO this name is somewhat poor, the suffix 'min_diff_rows'
> looks better.
> If real expressions (ratio and diff) exceed these threshold values both, you log
> this situation. I'm right?

Yes, you're totally right! I wonder if "ratio" is fine or if I should use "factor"? [...]

> Instrumentation is initialized only with analyze (log_analyze is true)[1]

Good, I didn't notice instrumentation can be enabled in auto_explain's hook. I added these lines and it works:

    if (auto_explain_log_estimate_ratio || auto_explain_log_estimate_min_rows)
    {
        queryDesc->instrument_options |= INSTRUMENT_ROWS;
    }

But I need to understand how instrumentation works. Thanks for your answer. I will continue my work; my patch is not yet functional.

-- Adrien NAYRAT
Re: [HACKERS] [BUGS] [postgresql 10 beta3] unrecognized node type: 90
Amit Kapila writes:
> On Mon, Aug 28, 2017 at 6:34 PM, Tom Lane wrote:
>> If what you're complaining about is that I put back the "if
>> (outerPlan->chgParam == NULL)" test to allow postponement of the
>> recursive ExecReScan call, I'm afraid that it's mere wishful
>> thinking that omitting that test in nodeGather did anything.

> Previously outerPlan->chgParam will be NULL, so I think rescan's won't
> be postponed.

That seems like an unacceptably fragile assumption. Even if it happens to be true today, we would need to fix it sooner or later. (And I kinda suspect it's possible to break it today, anyway. Treating PARAM_EXEC Params as parallel-restricted seems to lock out the easiest cases, but we have param slots that don't correspond to any Param node, eg for recursive union worktables. replace_nestloop_params is also a source of PARAM_EXEC Params that won't be detected during is_parallel_safe() tests, because it happens later.)

regards, tom lane
Re: [HACKERS] 1 test fails in make installcheck-world - database "regress_ecpg_user2" does not exist
> Thanks for reporting it! > My pleasure! So the initial issue didn't happen the 2nd time. So if misc_sanity was the only test failing then I guess my tests are working fine other than that. Sweet! When I get a break from work I'll review some patches! Best, Ryan
Re: [HACKERS] hash partitioning based on v10Beta2
On Sat, Aug 26, 2017 at 12:40 AM, yang...@highgo.com wrote:
> A partition table can be create as bellow:
>
> CREATE TABLE h1 PARTITION OF h;
> CREATE TABLE h2 PARTITION OF h;
> CREATE TABLE h3 PARTITION OF h;

This syntax is very problematic for reasons that have been discussed on the existing hash partitioning thread. Fortunately, a solution has already been implemented... you're the third person to try to write a patch for this feature.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
[HACKERS] Re: [COMMITTERS] pgsql: pg_rewind: Fix some problems when copying files >2GB.
On Mon, Aug 28, 2017 at 10:16 AM, Alvaro Herrera wrote:
>> I am fine with however you want to handle it, but it seems odd to me
>> that we don't have a way of embedding INT64_FORMAT in a translatable
>> string. Surely that's going to be a problem in some case, sometime,
>> isn't it?
>
> The way we do that elsewhere is to print out the value to a string
> variable and then use %s in the translatable message.

Hmm. OK. That doesn't sound great, but if there's no better option...

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
[HACKERS] Re: [COMMITTERS] pgsql: pg_rewind: Fix some problems when copying files >2GB.
Robert Haas wrote:
> On Mon, Aug 28, 2017 at 9:05 AM, Alvaro Herrera wrote:
> > Robert Haas wrote:
> >> pg_rewind: Fix some problems when copying files >2GB.
> >
> > I just noticed that this broke pg_rewind translation, because of the
> > INT64_FORMAT marker in the translatable string. The message catalog now
> > has this:
> >
> > msgid "received chunk for file \"%s\", offset "
> >
> > for this source line:
> >
> > pg_log(PG_DEBUG, "received chunk for file \"%s\", offset "
> >        INT64_FORMAT ", size %d\n",
> >        filename, chunkoff, chunksize);
> >
> > I don't think that this is terribly valuable as a translatable string,
> > so my preferred fix would be to have a new function pg_debug() here
> > where the messages are *not* marked for translations.
>
> I am fine with however you want to handle it, but it seems odd to me
> that we don't have a way of embedding INT64_FORMAT in a translatable
> string. Surely that's going to be a problem in some case, sometime,
> isn't it?

The way we do that elsewhere is to print out the value to a string variable and then use %s in the translatable message.

-- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
[HACKERS] Re: [COMMITTERS] pgsql: pg_rewind: Fix some problems when copying files >2GB.
On Mon, Aug 28, 2017 at 9:05 AM, Alvaro Herrera wrote:
> Robert Haas wrote:
>> pg_rewind: Fix some problems when copying files >2GB.
>
> I just noticed that this broke pg_rewind translation, because of the
> INT64_FORMAT marker in the translatable string. The message catalog now
> has this:
>
> msgid "received chunk for file \"%s\", offset "
>
> for this source line:
>
> pg_log(PG_DEBUG, "received chunk for file \"%s\", offset "
>        INT64_FORMAT ", size %d\n",
>        filename, chunkoff, chunksize);
>
> I don't think that this is terribly valuable as a translatable string,
> so my preferred fix would be to have a new function pg_debug() here
> where the messages are *not* marked for translations.

I am fine with however you want to handle it, but it seems odd to me that we don't have a way of embedding INT64_FORMAT in a translatable string. Surely that's going to be a problem in some case, sometime, isn't it?

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] [BUGS] [postgresql 10 beta3] unrecognized node type: 90
On Mon, Aug 28, 2017 at 6:34 PM, Tom Lane wrote:
> Amit Kapila writes:
>> On Mon, Aug 28, 2017 at 6:01 PM, Tom Lane wrote:
>>> Um, what's different about that than before?
>
>> Earlier, we perform the rescan of all the nodes before ExecProcNode,
>> so workers will always start (restart) after the scan descriptor is
>> initialized.
>
> If what you're complaining about is that I put back the "if
> (outerPlan->chgParam == NULL)" test to allow postponement of the
> recursive ExecReScan call, I'm afraid that it's mere wishful
> thinking that omitting that test in nodeGather did anything.
> The nodes underneath the Gather are likely to do the same thing,
> so that the parallel table scan node itself is going to get a
> postponed rescan call anyway. See e.g. ExecReScanNestLoop().

Previously outerPlan->chgParam will be NULL, so I think rescans won't be postponed. IIRC, I debugged this during the development of this code, and rescans were not postponed. I don't deny that for some cases it might get delayed, but for simple cases it was done immediately. I agree that in general the proposed fix is better than having nothing, but I am not sure if we need it for Gather as well, considering we are not able to demonstrate any such case.

> I see your point that there's inadequate interlocking between resetting
> the parallel scan's shared state and starting a fresh set of workers,
> but that's a pre-existing bug. In practice I doubt it makes any
> difference, because according to my testing the leader will generally
> reach the table scan long before any workers do. It'd be nice to do
> better though.

Agreed. BTW, I have mentioned above that we can avoid skipping the optimization in the rescan path if we are in parallel mode. I think that will not be as elegant a solution as your patch, but it won't have this problem.

-- With Regards, Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] 1 test fails in make installcheck-world - database "regress_ecpg_user2" does not exist
Ryan Murphy writes:
> I did notice that the test seems to create a ROLE called regress_ecpg_user2:

Right.

> Could that implicitly create a database too? I know that I somehow have a
> database named after my username / postgres role "murftown".

Maybe you've got some tool somewhere that automatically creates databases for users? PG itself doesn't.

> I've attached the new regression.diffs.

Oh, huh, that's actually a bug in the misc_sanity test --- it is legit for at least some rows in pg_shdepend to have zero dbid. That doesn't happen when working in an empty installation, but yours evidently isn't. Thanks for reporting it!

regards, tom lane
Re: [HACKERS] 1 test fails in make installcheck-world - database "regress_ecpg_user2" does not exist
> No, you're reading it backwards: the error is expected, but it's not
> appearing in your results. I can duplicate this if I manually create
> database "regress_ecpg_user2" before running ecpg's installcheck,
> so I guess that's what you did. I can find no evidence that any
> part of the PG regression tests creates such a database.

Thanks, that makes sense. However, when I go into my database with psql and type `drop database regress_ecpg_user2;`, the response is "ERROR: database "regress_ecpg_user2" does not exist". So it seems either the tests are creating it somehow and then cleaning it up (though I agree that I found no obvious evidence of that in the codebase), or could I be looking in the wrong postgres install? I think it's the same install though. I have my postgres installing into an install_dir/ directory. Here's how I run configure:

#!/bin/sh
CFLAGS='-O0' ./configure \
    --prefix=path/to/postgres/install_dir/ \
    --with-uuid=e2fs \
    --enable-debug \
    --with-perl

I did notice that the test seems to create a ROLE called regress_ecpg_user2:

$ git grep -n 'regress_ecpg_user2'
src/interfaces/ecpg/test/Makefile:78:REGRESS_OPTS = --dbname=ecpg1_regression,ecpg2_regression --create-role=regress_ecpg_user1,regress_ecpg_user2 $(EXTRA_REGRESS_OPTS)
...

Could that implicitly create a database too? I know that I somehow have a database named after my username / postgres role "murftown".

I ran "make installcheck-world" again, and the result is different this time - a test called "misc_sanity" failed early on in the tests:

$ make installcheck-world
make -C src/test installcheck
make -C perl installcheck
make[2]: Nothing to be done for `installcheck'.
make -C regress installcheck
...
== dropping database "regression" ==
DROP DATABASE
== creating database "regression" ==
CREATE DATABASE
ALTER DATABASE
== running regression test queries ==
test tablespace ... ok
test boolean ... ok
test char ... ok
...
test regex ... ok
test oidjoins ... ok
test type_sanity ...
ok
test opr_sanity ... ok
test misc_sanity ... FAILED
test comments ... ok
test expressions ... ok
test insert ... ok
test insert_conflict ... ok
test create_function_1 ... ok
...

The old failure with the missing error message is gone from regression.diffs this time. I've attached the new regression.diffs.

Best, Ryan
Re: [HACKERS] hash partitioning based on v10Beta2
On Sat, Aug 26, 2017 at 10:10 AM, yang...@highgo.com wrote:
> Hi all,
>
> Now we have had the range / list partition, but hash partitioning is not
> implemented yet. Attached is a POC patch based on the v10Beta2 to add the
> hash partitioning feature. Although we will need more discussions about
> the syntax and other specifications before going ahead with the project,
> I think this runnable code might help to discuss what and how we
> implement this.

FYI, there is already an existing commitfest entry for this project: https://commitfest.postgresql.org/14/1059/

> [remainder of quoted message trimmed; it repeats the original post, which appears in full later in this digest]
Re: [HACKERS] [POC] hash partitioning
Hello,

Looking at your hash partitioning syntax, I implemented hash partitioning in a more concise way, with no need to determine the number of sub-tables, and with the ability to dynamically add partitions.

Description

The hash partition implementation is based on the original range / list partitioning, and uses similar syntax.

To create a partitioned table, use:

CREATE TABLE h (id int) PARTITION BY HASH(id);

The partitioning key supports only one value, and I think the partition key could support multiple values, which may be difficult to implement when querying, but it is not impossible.

A partition table can be created as below:

CREATE TABLE h1 PARTITION OF h;
CREATE TABLE h2 PARTITION OF h;
CREATE TABLE h3 PARTITION OF h;

A FOR VALUES clause cannot be used, and the partition bound is calculated automatically as the partition index of a single integer value.

An inserted record is stored in the partition whose index equals DatumGetUInt32(OidFunctionCall1(lookup_type_cache(key->parttypid[0], TYPECACHE_HASH_PROC)->hash_proc, values[0])) % nparts /* Number of partitions */; In the above example, this is DatumGetUInt32(OidFunctionCall1(lookup_type_cache(key->parttypid[0], TYPECACHE_HASH_PROC)->hash_proc, id)) % 3;

postgres=# insert into h select generate_series(1,20);
INSERT 0 20
postgres=# select tableoid::regclass,* from h;
 tableoid | id
----------+----
 h1       |  3
 h1       |  5
 h1       | 17
 h1       | 19
 h2       |  2
 h2       |  6
 h2       |  7
 h2       | 11
 h2       | 12
 h2       | 14
 h2       | 15
 h2       | 18
 h2       | 20
 h3       |  1
 h3       |  4
 h3       |  8
 h3       |  9
 h3       | 10
 h3       | 13
 h3       | 16
(20 rows)

The number of partitions can be changed dynamically: if a new partition is created, the number of partitions changes, the calculated target partitions change, and the same data would land in different partitions. So you need to re-calculate the existing data and insert it into the target partitions when you create a new partition.
postgres=# create table h4 partition of h;
CREATE TABLE
postgres=# select tableoid::regclass,* from h;
 tableoid | id
----------+----
 h1       |  5
 h1       | 17
 h1       | 19
 h1       |  6
 h1       | 12
 h1       |  8
 h1       | 13
 h2       | 11
 h2       | 14
 h3       |  1
 h3       |  9
 h3       |  2
 h3       | 15
 h4       |  3
 h4       |  7
 h4       | 18
 h4       | 20
 h4       |  4
 h4       | 10
 h4       | 16
(20 rows)

When querying the data, the hash partition uses the same algorithm as the insertion, and filters out the tables that do not need to be scanned.

postgres=# explain analyze select * from h where id = 1;
                                     QUERY PLAN
-------------------------------------------------------------------------------------
 Append  (cost=0.00..41.88 rows=13 width=4) (actual time=0.020..0.023 rows=1 loops=1)
   ->  Seq Scan on h3  (cost=0.00..41.88 rows=13 width=4) (actual time=0.013..0.016 rows=1 loops=1)
         Filter: (id = 1)
         Rows Removed by Filter: 3
 Planning time: 0.346 ms
 Execution time: 0.061 ms
(6 rows)

postgres=# explain analyze select * from h where id in (1,5);
                                     QUERY PLAN
-------------------------------------------------------------------------------------
 Append  (cost=0.00..83.75 rows=52 width=4) (actual time=0.016..0.028 rows=2 loops=1)
   ->  Seq Scan on h1  (cost=0.00..41.88 rows=26 width=4) (actual time=0.015..0.018 rows=1 loops=1)
         Filter: (id = ANY ('{1,5}'::integer[]))
         Rows Removed by Filter: 6
   ->  Seq Scan on h3  (cost=0.00..41.88 rows=26 width=4) (actual time=0.005..0.007 rows=1 loops=1)
         Filter: (id = ANY ('{1,5}'::integer[]))
         Rows Removed by Filter: 3
 Planning time: 0.720 ms
 Execution time: 0.074 ms
(9 rows)

postgres=# explain analyze select * from h where id = 1 or id = 5;
                                     QUERY PLAN
-------------------------------------------------------------------------------------
 Append  (cost=0.00..96.50 rows=50 width=4) (actual time=0.017..0.078 rows=2 loops=1)
   ->  Seq Scan on h1  (cost=0.00..48.25 rows=25 width=4) (actual time=0.015..0.019 rows=1 loops=1)
         Filter: ((id = 1) OR (id = 5))
         Rows Removed by Filter: 6
   ->  Seq Scan on h3  (cost=0.00..48.25 rows=25 width=4) (actual time=0.005..0.010 rows=1 loops=1)
         Filter: ((id = 1) OR (id = 5))
         Rows Removed by Filter: 3
 Planning time: 0.396 ms
 Execution time: 0.139 ms
(9 rows)

Cannot detach / attach / drop a partition table.

Best regards,
young
yonj1e.github.io
yang...@highgo.com
[HACKERS] Re: [COMMITTERS] pgsql: pg_rewind: Fix some problems when copying files >2GB.
Robert Haas wrote:
> pg_rewind: Fix some problems when copying files >2GB.

I just noticed that this broke pg_rewind translation, because of the INT64_FORMAT marker in the translatable string. The message catalog now has this:

msgid "received chunk for file \"%s\", offset "

for this source line:

pg_log(PG_DEBUG, "received chunk for file \"%s\", offset "
       INT64_FORMAT ", size %d\n",
       filename, chunkoff, chunksize);

I don't think that this is terribly valuable as a translatable string, so my preferred fix would be to have a new function pg_debug() here where the messages are *not* marked for translation.

-- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] [BUGS] [postgresql 10 beta3] unrecognized node type: 90
Amit Kapila writes:
> On Mon, Aug 28, 2017 at 6:01 PM, Tom Lane wrote:
>> Um, what's different about that than before?

> Earlier, we perform the rescan of all the nodes before ExecProcNode,
> so workers will always start (restart) after the scan descriptor is
> initialized.

If what you're complaining about is that I put back the "if (outerPlan->chgParam == NULL)" test to allow postponement of the recursive ExecReScan call, I'm afraid that it's mere wishful thinking that omitting that test in nodeGather did anything. The nodes underneath the Gather are likely to do the same thing, so that the parallel table scan node itself is going to get a postponed rescan call anyway. See e.g. ExecReScanNestLoop().

I see your point that there's inadequate interlocking between resetting the parallel scan's shared state and starting a fresh set of workers, but that's a pre-existing bug. In practice I doubt it makes any difference, because according to my testing the leader will generally reach the table scan long before any workers do. It'd be nice to do better though.

I'm inclined to think that what's needed is to move the reset of the shared state into a new "ExecParallelReInitializeDSM" plan tree walk, which would have to occur before we start the new set of workers.

regards, tom lane
Re: [HACKERS] [BUGS] [postgresql 10 beta3] unrecognized node type: 90
On Mon, Aug 28, 2017 at 6:01 PM, Tom Lane wrote:
> Amit Kapila writes:
>> With this change, it is quite possible that during rescans workers
>> will not do any work.
>
> Um, what's different about that than before?

Earlier, we perform the rescan of all the nodes before ExecProcNode, so workers will always start (restart) after the scan descriptor is initialized. Now, as is evident in the discussion in this thread, that was not the right thing for gather merge, as some nodes like Sort do an optimization due to which rescan for the lower nodes will never be called. So, we need to ensure in some way that we don't skip rescanning in such nodes, and one way to achieve that is what you have done in the patch, but it seems to have some side effect.

-- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] [BUGS] [postgresql 10 beta3] unrecognized node type: 90
Amit Kapila writes:
> With this change, it is quite possible that during rescans workers
> will not do any work.

Um, what's different about that than before?

regards, tom lane
Re: [HACKERS] [BUGS] [postgresql 10 beta3] unrecognized node type: 90
On Mon, Aug 28, 2017 at 1:59 AM, Tom Lane wrote:
> I wrote:
>> I think that the correct fix probably involves marking each parallel scan
>> plan node as dependent on a pseudo executor parameter, which the parent
>> Gather or GatherMerge node would flag as being changed on each rescan.
>> This would cue the plan layers in between that they cannot optimize on the
>> assumption that the leader's instance of the parallel scan will produce
>> exactly the same rows as it did last time, even when "nothing else
>> changed". The "wtParam" pseudo parameter that's used for communication
>> between RecursiveUnion and its descendant WorkTableScan node is a good
>> model for what needs to happen.
>
> Here is a draft patch for this.

! /*
!  * Set child node's chgParam to tell it that the next scan might deliver a
!  * different set of rows within the leader process. (The overall rowset
!  * shouldn't change, but the leader process's subset might; hence nodes
!  * between here and the parallel table scan node mustn't optimize on the
!  * assumption of an unchanging rowset.)
!  */
! if (gm->rescan_param >= 0)
!     outerPlan->chgParam = bms_add_member(outerPlan->chgParam,
!                                          gm->rescan_param);
!
! /*
!  * if chgParam of subnode is not null then plan will be re-scanned by
!  * first ExecProcNode.
!  */
! if (outerPlan->chgParam == NULL)
!     ExecReScan(outerPlan);

With this change, it is quite possible that during rescans workers will not do any work. I think this will allow workers to launch before rescan (for sequence scan) can reset the scan descriptor in the leader, which means that workers will still see the old value and assume that the scan is finished and come out without doing any work. Now, this won't produce wrong results, because the leader will scan the whole relation by itself in such a case, but it might be inefficient.
> It's a bit different from wtParam in that the special parameter isn't
> allocated until createplan.c time, so that we don't eat a parameter slot
> if we end up choosing a non-parallel plan; but otherwise things are
> comparable.
>
> I could use some feedback on whether this is marking dependent child nodes
> sanely. As written, any plan node that's marked parallel_aware is assumed
> to need a dependency on the parent Gather or GatherMerge's rescan param
> --- and the planner will now bitch if a parallel_aware plan node is not
> under any such Gather. Is this reasonable?

I think so.

-- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] NoMovementScanDirection in heapgettup() and heapgettup_pagemode()
Mithun Cy writes:
> I was trying to study the NoMovementScanDirection part of heapgettup() and
> heapgettup_pagemode(). If I am right, there is no test in the test suite to
> hit this code. I did run make check-world and could not hit it. Also, the
> coverage report at
> https://coverage.postgresql.org/src/backend/access/heap/heapam.c.gcov.html
> shows this part of the code is not hit. Can somebody please help me to
> understand this part of the code and how to test same?

I think that's probably dead code given that ExecutorRun short-circuits everything for NoMovementScanDirection. There is some use of NoMovementScanDirection for indexscans, to denote an unordered index, but likely that could be got rid of too. That would leave us with NoMovementScanDirection having no other use except as a do-nothing flag for ExecutorRun.

regards, tom lane
Re: [HACKERS] standby server crashes hard on out-of-disk-space in HEAD
On Tue, Jun 13, 2017 at 4:21 AM, Andres Freund wrote:
> On 2017-06-12 15:12:23 -0400, Robert Haas wrote:
>> Commit 4b4b680c3d6d8485155d4d4bf0a92d3a874b7a65 (Make backend local
>> tracking of buffer pins memory efficient., vintage 2014) seems like a
>> likely culprit here, but I haven't tested.
>
> I'm not that sure. As written above, the Assert isn't new, and given
> this hasn't been reported before, I'm a bit doubtful that it's a general
> refcount tracking bug. The FPI code has been whacked around more
> heavily, so it could well be a bug in it somewhere.

Someone doing a bisect could just use a VM that puts the standby on a tiny partition. I remember seeing this assertion failure some time ago on a test deployment, and that was really surprising. I think that this may be hiding something, so we should really try to investigate more what's wrong here.

-- Michael
Re: [HACKERS] Make pg_regress print a connstring with sockdir
On 28 August 2017 at 19:45, Tom Lane wrote:
> Craig Ringer writes:
>> It's a pain having to find the postmaster command line to get the port
>> pg_regress started a server on. We print the port in the pg_regress
>> output, why not the socket directory / host?
>
> I'm not following the point here. The test postmaster isn't really
> going to be around long enough to connect to it manually. If you
> want to do that, you should be using "installcheck", and then the
> problem doesn't arise.
>
> The reason for printing the port number, if memory serves, is to
> aid in debugging port selection conflicts. That doesn't really
> apply for temporary socket directories; we're expecting libc to
> avoid any conflicts there.

I'm frequently debugging postmasters that are around long enough. Deadlocks, etc. It's also way easier to debug shmem-related issues with a live postmaster vs a core.

-- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Make pg_regress print a connstring with sockdir
Craig Ringer writes:
> It's a pain having to find the postmaster command line to get the port
> pg_regress started a server on. We print the port in the pg_regress
> output, why not the socket directory / host?

I'm not following the point here. The test postmaster isn't really going to be around long enough to connect to it manually. If you want to do that, you should be using "installcheck", and then the problem doesn't arise.

The reason for printing the port number, if memory serves, is to aid in debugging port selection conflicts. That doesn't really apply for temporary socket directories; we're expecting libc to avoid any conflicts there.

regards, tom lane
Re: [HACKERS] 1 test fails in make installcheck-world - database "regress_ecpg_user2" does not exist
Ryan Murphy writes:
> And when I look at that diffs file, this is what I see:
> - [NO_PID]: ECPGconnect: could not open database: FATAL: database
> "regress_ecpg_user2" does not exist
> Why would this database not be getting created?

No, you're reading it backwards: the error is expected, but it's not appearing in your results. I can duplicate this if I manually create database "regress_ecpg_user2" before running ecpg's installcheck, so I guess that's what you did. I can find no evidence that any part of the PG regression tests creates such a database.

regards, tom lane
Re: [HACKERS] psql --batch
> I don't doubt the sense of this configuration - but this specific
> combination depends on usage - so I don't think using a special option
> is a good idea.

Although I agree with you that the detailed settings are definitely debatable, I'd say that at least it would be a more reasonable starting point for scripting than the default configuration, which is oriented more toward interactive usage. So even if imperfect, one is free to update the defaults to suit their needs more closely, e.g. "psql -B -F ':' ...", and at least most of the scripting conveniences are already there. I think that such a scripting mode should also imply --no-readline.

-- Fabien.
Re: [HACKERS] [BUGS] Bug in Physical Replication Slots (at least 9.5)?
On Mon, Aug 28, 2017 at 8:02 PM, Kyotaro HORIGUCHI wrote:
> The first patch (0001-) fixes this problem, preventing the
> problematic state of WAL segments by retarding the restart LSN of a
> physical replication slot in a certain condition.

FWIW, I have this patch marked on my list of things to look at, so you can count me as a reviewer. There are also some approaches that I would like to test, because I rely on replication slots for some infrastructure. Still...

+if (oldFlushPtr != InvalidXLogRecPtr &&
+    (restartLSN == InvalidXLogRecPtr ?
+     oldFlushPtr / XLOG_SEG_SIZE != flushPtr / XLOG_SEG_SIZE :
+     restartLSN / XLOG_BLCKSZ != flushPtr / XLOG_BLCKSZ))

I find such code patterns not readable.

-- Michael
Re: [HACKERS] [BUGS] Bug in Physical Replication Slots (at least 9.5)?
Hello,

This problem still occurs on the master. I rebased this to the current master.

At Mon, 3 Apr 2017 08:38:47 +0900, Michael Paquier wrote in
> On Mon, Apr 3, 2017 at 7:19 AM, Venkata B Nagothi wrote:
> > As we are already past the commitfest, I am not sure what I should
> > change the patch status to?
>
> The commit fest finishes on the 7th of April. Even with the deadline
> passed, there is nothing preventing work on bug fixes. So this item
> ought to be moved to the next CF with the same category.

The steps to reproduce the problem follow.

- Apply the second patch (0002-) attached and recompile. It effectively reproduces the problematic state of the database.
- M(aster): initdb the master with wal_keep_segments = 0 (default), log_min_messages = debug2
- M: Create a physical repslot.
- S(tandby): Set up a standby database.
- S: Edit recovery.conf to use the replication slot above, then start it.
- S: touch /tmp/hoge
- M: Run pgbench ...
- S: After a while, the standby stops.
  > LOG: STOP THE SERVER
- M: Stop pgbench.
- M: Do 'checkpoint;' twice.
- S: rm /tmp/hoge
- S: Fails to catch up with the following error.
  > FATAL: could not receive data from WAL stream: ERROR: requested WAL segment 0001002B has already been removed

The first patch (0001-) fixes this problem, preventing the problematic state of WAL segments by retarding the restart LSN of a physical replication slot in a certain condition.

regards,

-- Kyotaro Horiguchi NTT Open Source Software Center

>From 3813599b74299f1da8d0567ed90542c5f35ed48b Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi
Date: Wed, 1 Feb 2017 16:07:22 +0900
Subject: [PATCH 1/2] Retard restart LSN of a slot when a segment starts with a contrecord.

A physical-replication standby can stop just at the boundary of WAL segments. restart_lsn of a slot on the master can be assumed to be the same location. The last segment on the master will be removed after some checkpoints for the case.
If the last record of the last replicated segment continues to the next segment, the continuation record is only on the master. The standby cannot start in the case because the split record is not available from only one source. This patch detains restart_lsn in the last sgement when the first page of the next segment is a continuation record. --- src/backend/replication/walsender.c | 105 +--- 1 file changed, 98 insertions(+), 7 deletions(-) diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c index 03e1cf4..30c80af 100644 --- a/src/backend/replication/walsender.c +++ b/src/backend/replication/walsender.c @@ -217,6 +217,13 @@ static struct WalTimeSample last_read[NUM_SYNC_REP_WAIT_MODE]; } LagTracker; +/* + * This variable corresponds to restart_lsn in pg_replication_slots for a + * physical slot. This has a valid value only when it differs from the current + * flush pointer. + */ +static XLogRecPtr restartLSN = InvalidXLogRecPtr; + /* Signal handlers */ static void WalSndLastCycleHandler(SIGNAL_ARGS); @@ -251,7 +258,7 @@ static void LagTrackerWrite(XLogRecPtr lsn, TimestampTz local_flush_time); static TimeOffset LagTrackerRead(int head, XLogRecPtr lsn, TimestampTz now); static bool TransactionIdInRecentPast(TransactionId xid, uint32 epoch); -static void XLogRead(char *buf, XLogRecPtr startptr, Size count); +static bool XLogRead(char *buf, XLogRecPtr startptr, Size count, bool noutfoundok); /* Initialize walsender process before entering the main command loop */ @@ -546,6 +553,9 @@ StartReplication(StartReplicationCmd *cmd) ereport(ERROR, (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), (errmsg("cannot use a logical replication slot for physical replication"; + + /* Restore restartLSN from replication slot */ + restartLSN = MyReplicationSlot->data.restart_lsn; } /* @@ -561,6 +571,10 @@ StartReplication(StartReplicationCmd *cmd) else FlushPtr = GetFlushRecPtr(); + /* Set InvalidXLogRecPtr if catching up */ + if (restartLSN == 
FlushPtr) + restartLSN = InvalidXLogRecPtr; + if (cmd->timeline != 0) { XLogRecPtr switchpoint; @@ -770,7 +784,7 @@ logical_read_xlog_page(XLogReaderState *state, XLogRecPtr targetPagePtr, int req count = flushptr - targetPagePtr; /* part of the page available */ /* now actually read the data, we know it's there */ - XLogRead(cur_page, targetPagePtr, XLOG_BLCKSZ); + XLogRead(cur_page, targetPagePtr, XLOG_BLCKSZ, false); return count; } @@ -1738,7 +1752,7 @@ static void ProcessStandbyReplyMessage(void) { XLogRecPtr writePtr, -flushPtr, +flushPtr, oldFlushPtr, applyPtr; bool replyRequested; TimeOffset writeLag, @@ -1798,6 +1812,7 @@ ProcessStandbyReplyMessage(void) WalSnd
Re: [HACKERS] expanding inheritance in partition bound order
On 2017/08/26 3:28, Robert Haas wrote: > On Mon, Aug 21, 2017 at 2:10 AM, Amit Langote >wrote: >> [ new patches ] > > I am failing to understand the point of separating PartitionDispatch > into PartitionDispatch and PartitionTableInfo. That seems like an > unnecessary multiplication of entities, as does the introduction of > PartitionKeyInfo. I also think that replacing reldesc with reloid is > not really an improvement; any places that gets the relid has to go > open the relation to get the reldesc, whereas without that it has a > direct pointer to the information it needs. I am worried about the open relcache reference in PartitionDispatch when we start using it in the planner. Whereas there is a ExecEndModifyTable() as a suitable place to close that reference, there doesn't seem to exist one within the planner, but I guess we will have to figure something out. For time being, the second patch closes the same in expand_inherited_rtentry() right after picking up the OID using RelationGetRelid(pd->reldesc). > I suggest that this patch just focus on removing the following things > from PartitionDispatchData: keystate, tupslot, tupmap. Those things > are clearly executor-specific stuff that makes sense to move to a > different structure, what you're calling PartitionTupleRoutingInfo > (not sure that's the best name). The other stuff all seems fine. > You're going to have to open the relation anyway, so keeping the > reldesc around seems like an optimization, if anything. The > PartitionKey and PartitionDesc pointers may not really be needed -- > they're just pointers into reldesc -- but they're trivial to compute, > so it doesn't hurt anything to have them either as a > micro-optimization for performance or even just for readability. OK, done this way in the attached updated patch. Any suggestions about a better name for what the patch calls PartitionTupleRoutingInfo? > That just leaves indexes. 
In a world where keystate, tupslot, and > tupmap are removed from the PartitionDispatchData, you must need > indexes or there would be no point in constructing a > PartitionDispatchData object in the first place; any application that > needs neither indexes nor the executor-specific stuff could just use > the Relation directly. Agreed. > Regarding your XXX comments, note that if you've got a lock on a > relation, the pointers to the PartitionKey and PartitionDesc are > stable. The PartitionKey can't change once it's established, and the > PartitionDesc can't change while we've got a lock on the relation > unless we change it ourselves (and any places that do should have > CheckTableNotInUse checks). The keep_partkey and keep_partdesc > handling in relcache.c exists exactly so that we can guarantee that > the pointer won't go stale under us. Now, if we *don't* have a lock > on the relation, then those pointers can easily be invalidated -- so > you can't hang onto a PartitionDispatch for longer than you hang onto > the lock on the Relation. But that shouldn't be a problem. I think > you only need to hang onto PartitionDispatch pointers for the lifetime > of a single query. One can imagine optimizations where we try to > avoid rebuilding that for subsequent queries but I'm not sure there's > any demonstrated need for such a system at present. Here too. Attached are the updated patches. Thanks, Amit From fb4bd4818c4faa08b3c4d37709f01dc55f256a46 Mon Sep 17 00:00:00 2001 From: amit Date: Mon, 24 Jul 2017 18:59:57 +0900 Subject: [PATCH 1/2] Decouple RelationGetPartitionDispatchInfo() from executor Currently it and the structure it generates viz. PartitionDispatch objects are too coupled with the executor's tuple-routing code. In particular, it's pretty undesirable that it makes it the responsibility of the caller to release some resources, such as executor tuple table slots, tuple-conversion maps, etc. 
That makes it harder to use in places other than where it's currently being used. After this refactoring, ExecSetupPartitionTupleRouting() now needs to do some of the work that was previously done in RelationGetPartitionDispatchInfo(). --- src/backend/catalog/partition.c| 278 +++-- src/backend/commands/copy.c| 37 +++-- src/backend/executor/execMain.c| 124 +-- src/backend/executor/nodeModifyTable.c | 37 +++-- src/include/catalog/partition.h| 34 ++-- src/include/executor/executor.h| 4 +- src/include/nodes/execnodes.h | 40 - 7 files changed, 326 insertions(+), 228 deletions(-) diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c index 96a64ce6b2..25fc4583de 100644 --- a/src/backend/catalog/partition.c +++ b/src/backend/catalog/partition.c @@ -981,181 +981,147 @@ get_partition_qual_relid(Oid relid) } /* - * Append OIDs of rel's partitions to the list 'partoids' and for each OID, - * append pointer rel to
Re: [HACKERS] Make pg_regress print a connstring with sockdir
Craig Ringer wrote: > On 28 August 2017 at 15:19, Michael Paquier wrote: > > > On Mon, Aug 28, 2017 at 4:07 PM, Craig Ringer > > wrote: > > > == starting postmaster== > > > running with PID 30235; connect with: > > > psql "host='/tmp/pg_regress-j74yFE' port=50848 dbname='regression'" > > > == creating database "regression" == > > > > Sorry if my words were confusing and have cost you three minutes of > > development. I like better the one-line version :) > > Now a socket path could be quite long. I can live with that personally. > > > > I'm not fussed, I just think we should show it one way or the other. > > One nice thing about the two line form is that you can > double-click/middle-click to open a new psql in the pg_regress session > pretty much instantly. So don't add gettext_noop() around it :-) -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Restricting maximum keep segments by repslots
Hello, I'll add this to CF2017-09. At Mon, 06 Mar 2017 18:20:06 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI wrote in <20170306.182006.172683338.horiguchi.kyot...@lab.ntt.co.jp> > Thank you for the comment. > > At Fri, 3 Mar 2017 14:47:20 -0500, Peter Eisentraut > wrote in > > > On 3/1/17 19:54, Kyotaro HORIGUCHI wrote: > > >> Please measure it in size, not in number of segments. > > > It was difficult to dicide which is reaaonable but I named it > > > after wal_keep_segments because it has the similar effect. > > > > > > In bytes(or LSN) > > > max_wal_size > > > min_wal_size > > > wal_write_flush_after > > > > > > In segments > > > wal_keep_segments > > > > We have been moving away from measuring in segments. For example, > > checkpoint_segments was replaced by max_wal_size. > > > > Also, with the proposed patch that allows changing the segment size more > > easily, this will become more important. (I wonder if that will require > > wal_keep_segments to change somehow.) > > Agreed. It is 'max_slot_wal_keep_size' in the new version. > > wal_keep_segments might should be removed someday. - Following min/max_wal_size, the variable was renamed to "max_slot_wal_keep_size_mb" and used as ConvertToXSegs(x). - Stopped warning when checkpoint doesn't flush segments required by slots even if max_slot_wal_keep_size has worked. - Avoided subtraction that may be negative.
regards, *** a/src/backend/access/transam/xlog.c --- b/src/backend/access/transam/xlog.c *** *** 105,110 int wal_level = WAL_LEVEL_MINIMAL; --- 105,111 int CommitDelay = 0; /* precommit delay in microseconds */ int CommitSiblings = 5; /* # concurrent xacts needed to sleep */ int wal_retrieve_retry_interval = 5000; + int max_slot_wal_keep_size_mb = 0; #ifdef WAL_DEBUG bool XLOG_DEBUG = false; *** *** 9353,9361 KeepLogSeg(XLogRecPtr recptr, XLogSegNo *logSegNo) --- 9354,9385 if (max_replication_slots > 0 && keep != InvalidXLogRecPtr) { XLogSegNo slotSegNo; + int slotlimitsegs = ConvertToXSegs(max_slot_wal_keep_size_mb); XLByteToSeg(keep, slotSegNo); + /* + * ignore slots if too many wal segments are kept. + * max_slot_wal_keep_size is just accumulated on wal_keep_segments. + */ + if (max_slot_wal_keep_size_mb > 0 && slotSegNo + slotlimitsegs < segno) + { + segno = segno - slotlimitsegs; /* must be positive */ + + /* + * warn only if the checkpoint flushes the required segment. + * we assume here that *logSegNo is calculated keep location. + */ + if (slotSegNo < *logSegNo) + ereport(WARNING, + (errmsg ("restart LSN of replication slots is ignored by checkpoint"), + errdetail("Some replication slots have lost required WAL segnents to continue by up to %ld segments.", + (segno < *logSegNo ? 
segno : *logSegNo) - slotSegNo))); + + /* emergency vent */ + slotSegNo = segno; + } + if (slotSegNo <= 0) segno = 1; else if (slotSegNo < segno) *** a/src/backend/utils/misc/guc.c --- b/src/backend/utils/misc/guc.c *** *** 2366,2371 static struct config_int ConfigureNamesInt[] = --- 2366,2382 }, { + {"max_slot_wal_keep_size", PGC_SIGHUP, REPLICATION_SENDING, + gettext_noop("Sets the maximum size of extra WALs kept by replication slots."), + NULL, + GUC_UNIT_MB + }, + _slot_wal_keep_size_mb, + 0, 0, INT_MAX, + NULL, NULL, NULL + }, + + { {"wal_sender_timeout", PGC_SIGHUP, REPLICATION_SENDING, gettext_noop("Sets the maximum time to wait for WAL replication."), NULL, *** a/src/backend/utils/misc/postgresql.conf.sample --- b/src/backend/utils/misc/postgresql.conf.sample *** *** 235,240 --- 235,241 #max_wal_senders = 10 # max number of walsender processes # (change requires restart) #wal_keep_segments = 0 # in logfile segments, 16MB each; 0 disables + #max_slot_wal_keep_size = 0 # measured in bytes; 0 disables #wal_sender_timeout = 60s # in milliseconds; 0 disables #max_replication_slots = 10 # max number of replication slots *** a/src/include/access/xlog.h --- b/src/include/access/xlog.h *** *** 97,102 extern bool reachedConsistency; --- 97,103 extern int min_wal_size_mb; extern int max_wal_size_mb; extern int wal_keep_segments; + extern int max_slot_wal_keep_size_mb; extern int XLOGbuffers; extern int XLogArchiveTimeout; extern int wal_retrieve_retry_interval; -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] psql --batch
2017-08-28 11:05 GMT+02:00 Craig Ringer: > On 28 August 2017 at 16:23, Fabien COELHO wrote: > >> >> This doesn't really address the original issue though, that it's far from >>> obvious how to easily and correctly script psql. >>> >> >> That is another interesting argument. I understood that the issue was >> having to type these options, but now it is also to remember which one are >> relevant and wanted, which is a little different and more justifiable as an >> option. >> >> On that account, ISTM that '|' as a field separator is debatable, that >> pager should be turned off... and maybe a few other things. >> > > > Good point re pager, though it's turned off automatically when stdout > isn't a terminal, so in practice that'll only matter if you're using > something like 'expect' that uses a pty. Still worth doing. > > I agree that | is a bit iffy, but so's anything really. null bytes aren't > usable for all scripts, and nothing else cannot also be output in the data > its self. No easy answers there. In cases where I expect that to be an > issue I sometimes use \COPY ... TO STDOUT WITH (FORMAT CSV) though. > I don't doubt about a sense of this configuration - but this specific combination depends on usage - so I don't think so using special option is good idea. Regards Pavel > > -- > Craig Ringer http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training & Services >
Re: [HACKERS] Race between SELECT and ALTER TABLE NO INHERIT
Hello, I'll add this to CF2017-09. At Tue, 27 Jun 2017 16:27:18 +0900, Amit Langotewrote in <75fe42df-b1d8-89ff-596d-d9da0749e...@lab.ntt.co.jp> > On 2017/06/26 18:44, Kyotaro HORIGUCHI wrote: > > At Mon, 26 Jun 2017 18:16:42 +0900, Amit Langote wrote: > >> > >> I recall I had proposed a fix for the same thing some time ago [1]. > > > > Wow. About 1.5 years ago and stuck? I have a concrete case where > > this harms so I'd like to fix that this time. How can we move on? > > Agreed that this should be fixed. > > Your proposed approach #1 to recheck the inheritance after obtaining the > lock on the child table seems to be a good way forward. > > Approach #2 of reordering locking is a simpler solution, but does not > completely prevent the problem, because DROP TABLE child can still cause > it to occur, as you mentioned. > > >>> The cause is that NO INHERIT doesn't take an exlusive lock on the > >>> parent. This allows expand_inherited_rtentry to add the child > >>> relation into appendrel after removal from the inheritance but > >>> still exists. > >> > >> Right. > >> > >>> I see two ways to fix this. > >>> > >>> The first patch adds a recheck of inheritance relationship if the > >>> corresponding attribute is missing in the child in > >>> make_inh_translation_list(). The recheck is a bit complex but it > >>> is not performed unless the sequence above is happen. It checks > >>> duplication of relid (or cycles in inheritance) following > >>> find_all_inheritors (but doing a bit different) but I'm not sure > >>> it is really useful. > >> > >> I had proposed adding a syscache on pg_inherits(inhrelid, inhparent), but > >> I guess your hash table based solution will do the job as far as > >> performance of this check is concerned, although I haven't checked the > >> code closely. > > > > The hash table is not crucial in the patch. Substantially it just > > recurs down looking up pg_inherits for the child. 
I considered > > the new index but abandoned because I thought that such case > > won't occur so frequently. > > Agreed. BTW, the hash table in patch #1 does not seem to be really > helpful. In the while loop in is_descendant_of_internal(), does > hash_search() ever return found = true? AFAICS, it does not. > > >>> The second patch lets ALTER TABLE NO INHERIT to acquire locks on > >>> the parent first. > >>> > >>> Since the latter has a larger impact on the current behavior and > >>> we already treat "DROP TABLE child" case in the similar way, I > >>> suppose that the first approach would be preferable. > >> > >> That makes sense. So, I attached only the first patch, rebased on the current master (It actually failed to apply on it.) and fixed a typo in a comment. This still runs a closed-loop test using temporary hash but it seems a bit paranoic. (this is the same check with what find_all_inheritors is doing) << the following is another topic >> > >> BTW, in the partitioned table case, the parent is always locked first > >> using an AccessExclusiveLock. There are other considerations in that case > >> such as needing to recreate the partition descriptor upon termination of > >> inheritance (both the DETACH PARTITION and also DROP TABLE child cases). > > > > Apart from the degree of concurrency, if we keep parent->children > > order of locking, such recreation does not seem to be > > needed. Maybe I'm missing something. > > Sorry to have introduced that topic in this thread, but I will try to > explain anyway why things are the way they are currently: > > Once a table is no longer a partition of the parent (detached or dropped), > we must make sure that the next commands in the transaction don't see it > as one. That information is currently present in the relcache > (rd_partdesc), which is used by a few callers, most notably the > tuple-routing code. Next commands must recreate the entry so that the > correct thing happens based on the updated information. 
More precisely, > we must invalidate the current entry. RelationClearRelation() will either > delete the entry or rebuild it. If it's being referenced somewhere, it > will be rebuilt. The place holding the reference may also be looking at > the content of rd_partdesc, which we don't require them to make a copy of, > so we must preserve its content while other fields of RelationData are > being read anew from the catalog. We don't have to preserve it if there > has been any change (partition added/dropped), but to make such a change > one would need to take a strong enough lock on the relation (parent). We > assume here that anyone who wants to reference rd_partdesc takes at least > AccessShareLock lock on the relation, and anyone who wants to change its > content must take a lock that will conflict with it, so > AccessExclusiveLock. Note that in all of this, we are only talking about > one relation, that is the parent, so parent -> child ordering of taking > locks may be irrelevant. I think I
Re: [HACKERS] show "aggressive" or not in autovacuum logs
Hello. Currently the closing message shows '%d skipped frozen', but that alone is not enough to verify the true effect. This patch makes vacuum report its mode, 'aggressive' or 'normal', in the closing message: '%d skipped frozen' reflects the true effect of ALL_FROZEN only when the vacuum is known to be aggressive. I will add this patch to CF2017-09. At Tue, 4 Apr 2017 20:29:38 +0900, Masahiko Sawada wrote in > On Tue, Apr 4, 2017 at 10:09 AM, Kyotaro HORIGUCHI > wrote: > > | =# vacuum freeze verbose it; > > | INFO: vacuuming "public.it" in aggressive mode > > | INFO: "it": found 0 removable, 0 nonremovable row versions in 0 out of 0 > > pages > > ... > > | Skipped 0 pages due to buffer pins, 0 frozen pages. > > > > I still feel a bit uneasy about the word "aggressive" here. > > I think we can use the word "aggressive" here since we already use the > word "aggressive vacuum" in docs[1], but it might be easily > misunderstood. > > [1] https://www.postgresql.org/docs/9.6/static/routine-vacuuming.html > > >Is it better to be "freezing" or something? > > An another idea can be something like "prevent wraparound". The > autovaucum process doing aggressive vacuum appears in pg_stat_activity > with the word " (to prevent wraparound)". This word might be more > user friendly IMO. Hmm. This wording appears in several forms. https://www.postgresql.org/docs/devel/static/sql-vacuum.html > aggressive “freezing” of tuples. ... Aggressive freezing https://www.postgresql.org/docs/devel/static/routine-vacuuming.html > VACUUM will perform an aggressive vacuum, > an anti-wraparound autovacuum https://www.postgresql.org/docs/devel/static/runtime-config-client.html > VACUUM performs an aggressive scan ps title > (to prevent wraparound) The nearest common wording seems to be just aggressive (vacuum) so I left it alone in the attached patch.
regards, -- Kyotaro Horiguchi NTT Open Source Software Center >From 009507d5ddb33229e4c866fef206962de39317cc Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi Date: Mon, 28 Aug 2017 13:12:25 +0900 Subject: [PATCH] Show "aggressive" or not in vacuum messages VACUUM VERBOSE or autovacuum emits log message with "n skipped-frozen" but we cannot tell whether the vacuum was non-freezing (or not aggressive) vacuum or freezing (or aggressive) vacuum having no tuple to freeze. This patch adds indication of "aggressive" or "normal" in autovacuum logs and VACUUM VERBOSE message --- src/backend/commands/vacuumlazy.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c index 45b1859..71ecd50 100644 --- a/src/backend/commands/vacuumlazy.c +++ b/src/backend/commands/vacuumlazy.c @@ -373,10 +373,11 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params, * emitting individual parts of the message when not applicable. */ initStringInfo(); - appendStringInfo(, _("automatic vacuum of table \"%s.%s.%s\": index scans: %d\n"), + appendStringInfo(, _("automatic vacuum of table \"%s.%s.%s\": mode: %s, index scans: %d\n"), get_database_name(MyDatabaseId), get_namespace_name(RelationGetNamespace(onerel)), RelationGetRelationName(onerel), + aggressive ? "aggressive" : "normal", vacrelstats->num_index_scans); appendStringInfo(, _("pages: %u removed, %u remain, %u skipped due to pins, %u skipped frozen\n"), vacrelstats->pages_removed, @@ -487,9 +488,9 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats, relname = RelationGetRelationName(onerel); ereport(elevel, - (errmsg("vacuuming \"%s.%s\"", + (errmsg("vacuuming \"%s.%s\" in %s mode", get_namespace_name(RelationGetNamespace(onerel)), - relname))); + relname, aggressive ? 
"aggressive" : "normal"))); empty_pages = vacuumed_pages = 0; num_tuples = tups_vacuumed = nkeep = nunused = 0; -- 2.9.2 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Protect syscache from bloating with negative cache entries
Thank you for your attention. At Mon, 14 Aug 2017 17:33:48 -0400, Peter Eisentraut wrote in <09fa011f-4536-b05d-0625-11f3625d8...@2ndquadrant.com> > On 1/24/17 02:58, Kyotaro HORIGUCHI wrote: > >> BTW, if you set a slightly larger > >> context size on the patch you might be able to avoid rebases; right > >> now the patch doesn't include enough context to uniquely identify the > >> chunks against cacheinfo[]. > > git format-patch -U5 fuses all hunks on cacheinfo[] together. I'm > > not sure that such a hunk can avoid rebases. Is this what you > > suggested? -U4 added an identifiable forward context line for > > some elements so the attached patch is made with four context > > lines. > > This patch needs another rebase for the upcoming commit fest. This patch has had interference from several commits since the last submission. I amended this patch to follow them (up to f97c55c), removed an unnecessary branch and edited some comments. regards, -- Kyotaro Horiguchi NTT Open Source Software Center >From 4887f7d178b41a1a4729931a12bd396b9a8e8ee0 Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi Date: Mon, 28 Aug 2017 11:36:21 +0900 Subject: [PATCH 1/2] Cleanup negative cache of pg_statistic when dropping a relation. Accessing columns that don't have statistics leaves negative entries in catcache for pg_statistic, but there's no chance to remove them. Especially when repeatedly creating then dropping temporary tables bloats catcache so much that memory pressure becomes significant. This patch removes negative entries in STATRELATTINH, ATTNAME and ATTNUM when corresponding relation is dropped.
--- src/backend/utils/cache/catcache.c | 57 ++- src/backend/utils/cache/syscache.c | 299 +++-- src/include/utils/catcache.h | 3 + src/include/utils/syscache.h | 2 + 4 files changed, 279 insertions(+), 82 deletions(-) diff --git a/src/backend/utils/cache/catcache.c b/src/backend/utils/cache/catcache.c index e092801..e50c997 100644 --- a/src/backend/utils/cache/catcache.c +++ b/src/backend/utils/cache/catcache.c @@ -304,10 +304,11 @@ CatCachePrintStats(int code, Datum arg) if (cache->cc_ntup == 0 && cache->cc_searches == 0) continue; /* don't print unused caches */ - elog(DEBUG2, "catcache %s/%u: %d tup, %ld srch, %ld+%ld=%ld hits, %ld+%ld=%ld loads, %ld invals, %ld lsrch, %ld lhits", + elog(DEBUG2, "catcache %s/%u: %d tup, %d negtup, %ld srch, %ld+%ld=%ld hits, %ld+%ld=%ld loads, %ld invals, %ld lsrch, %ld lhits", cache->cc_relname, cache->cc_indexoid, cache->cc_ntup, + cache->cc_nnegtup, cache->cc_searches, cache->cc_hits, cache->cc_neg_hits, @@ -374,6 +375,10 @@ CatCacheRemoveCTup(CatCache *cache, CatCTup *ct) /* free associated tuple data */ if (ct->tuple.t_data != NULL) pfree(ct->tuple.t_data); + + if (ct->negative) + --cache->cc_nnegtup; + pfree(ct); --cache->cc_ntup; @@ -572,6 +577,49 @@ ResetCatalogCache(CatCache *cache) } /* + * CleanupCatCacheNegEntries + * + * Remove negative cache tuples matching a partial key. 
+ * + */ +void +CleanupCatCacheNegEntries(CatCache *cache, ScanKeyData *skey) +{ + int i; + + /* If this cache has no negative entries, nothing to do */ + if (cache->cc_nnegtup == 0) + return; + + /* searching with a paritial key means scanning the whole cache */ + for (i = 0; i < cache->cc_nbuckets; i++) + { + dlist_head *bucket = >cc_bucket[i]; + dlist_mutable_iter iter; + + dlist_foreach_modify(iter, bucket) + { + CatCTup*ct = dlist_container(CatCTup, cache_elem, iter.cur); + bool res; + + if (!ct->negative) +continue; + + HeapKeyTest(>tuple, cache->cc_tupdesc, 1, skey, res); + if (!res) +continue; + + /* + * the negative cache entries can no longer be referenced, so we + * can remove it unconditionally + */ + CatCacheRemoveCTup(cache, ct); + } + } +} + + +/* * ResetCatalogCaches * * Reset all caches when a shared cache inval event forces it @@ -718,6 +766,7 @@ InitCatCache(int id, cp->cc_relisshared = false; /* temporary */ cp->cc_tupdesc = (TupleDesc) NULL; cp->cc_ntup = 0; + cp->cc_nnegtup = 0; cp->cc_nbuckets = nbuckets; cp->cc_nkeys = nkeys; for (i = 0; i < nkeys; ++i) @@ -1215,8 +1264,8 @@ SearchCatCache(CatCache *cache, CACHE4_elog(DEBUG2, "SearchCatCache(%s): Contains %d/%d tuples", cache->cc_relname, cache->cc_ntup, CacheHdr->ch_ntup); - CACHE3_elog(DEBUG2, "SearchCatCache(%s): put neg entry in bucket %d", - cache->cc_relname, hashIndex); + CACHE4_elog(DEBUG2, "SearchCatCache(%s): put neg entry in bucket %d, total %d", + cache->cc_relname, hashIndex, cache->cc_nnegtup); /* * We are not returning the negative entry to the caller, so leave its @@ -1667,6 +1716,8 @@ CatalogCacheCreateEntry(CatCache *cache, HeapTuple ntp, cache->cc_ntup++; CacheHdr->ch_ntup++; + if (negative) + cache->cc_nnegtup++; /*
Re: [HACKERS] WIP: Separate log file for extension
Craig Ringerwrote: > On 25 August 2017 at 15:12, Antonin Houska wrote: > > How will this play with syslog? csvlog? etc? There's nothing special about csvlog: the LogStream structure has a "destination" field, so if particular extension wants this kind of output, it simply sets the LOG_DESTINATION_CSVLOG bit here. LOG_DESTINATION_SYSLOG is a problem because (AFAIK) a single process can maintain no more than one connection to the syslog server. I don't like the idea of closing the current connection and opening a different one whenever the next ereport() is associated with a different stream than the previous was. As for LOG_DESTINATION_EVENTLOG, I haven't checked yet. > I wonder if a level of indirection is appropriate here, where extensions (or > postgres subsystems, even) provide a log stream label. Then the logging > backed takes care of using that appropriately for the logging mechanism in > use; for logging to file that'd generally be separate files. Same for > CSVlog. Other mechanisms could be left as stubs initially. > > So the outcome would be the same, just without the assumption of specific > file name and output mechanism baked in. The example I shown in my previous email was the simplest case I could think of. But it does not mean that the file name has to be hard-coded in the extension. Instead of setting the LogStream.filename field, you can pass a pointer to this field to DefineCustomStringVariable() function, so the specific GUC can control the value. -- Antonin Houska Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt Web: http://www.postgresql-support.de, http://www.cybertec.at -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] psql --batch
On 28 August 2017 at 16:23, Fabien COELHOwrote: > > This doesn't really address the original issue though, that it's far from >> obvious how to easily and correctly script psql. >> > > That is another interesting argument. I understood that the issue was > having to type these options, but now it is also to remember which one are > relevant and wanted, which is a little different and more justifiable as an > option. > > On that account, ISTM that '|' as a field separator is debatable, that > pager should be turned off... and maybe a few other things. > Good point re pager, though it's turned off automatically when stdout isn't a terminal, so in practice that'll only matter if you're using something like 'expect' that uses a pty. Still worth doing. I agree that | is a bit iffy, but so's anything really. null bytes aren't usable for all scripts, and nothing else cannot also be output in the data its self. No easy answers there. In cases where I expect that to be an issue I sometimes use \COPY ... TO STDOUT WITH (FORMAT CSV) though. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
[HACKERS] 1 test fails in make installcheck-world - database "regress_ecpg_user2" does not exist
Hello Postgres Hackers, I want to become more helpful to the project, reviewing more patches and starting to write more of my own - and one of the first steps is to get all the tests passing so I can confidently review patches. It almost worked... I typed "make check" and all the tests passed. Then I typed "make installcheck" and all the tests passed again. But when I typed "make installcheck-world", I got this: === 1 of 55 tests failed. === The differences that caused some tests to fail can be viewed in the file "path/to/postgres/src/interfaces/ecpg/test/regression.diffs". A copy of the test summary that you see above is saved in the file "path/to/postgres/src/interfaces/ecpg/test/regression.out". And when I look at that diffs file, this is what I see: - [NO_PID]: ECPGconnect: could not open database: FATAL: database "regress_ecpg_user2" does not exist Why would this database not be getting created? The full diffs file is attached. Thanks for your help! Ryan
Re: [HACKERS] [POC] hash partitioning
Hi young, On Mon, 28 Aug 2017 15:33:46 +0800 "yang...@highgo.com"wrote: > Hello > > Looking at your hash partitioning syntax, I implemented a hash partition in a > more concise way, with no need to determine the number of sub-tables, and > dynamically add partitions. I think it is great work, but the current consensus about hash-partitioning supports Amul's patch[1], in which the syntax is different from the my original proposal. So, you will have to read Amul's patch and make a discussion if you still want to propose your implementation. Regards, [1] https://www.postgresql.org/message-id/CAAJ_b965A2oog=6efuhelexl3rmgfssb3g7lwkva1bw0wuj...@mail.gmail.com > > Description > > The hash partition's implement is on the basis of the original range / list > partition,and using similar syntax. > > To create a partitioned table ,use: > > CREATE TABLE h (id int) PARTITION BY HASH(id); > > The partitioning key supports only one value, and I think the partition key > can support multiple values, > which may be difficult to implement when querying, but it is not impossible. > > A partition table can be create as bellow: > > CREATE TABLE h1 PARTITION OF h; > CREATE TABLE h2 PARTITION OF h; > CREATE TABLE h3 PARTITION OF h; > > FOR VALUES clause cannot be used, and the partition bound is calclulated > automatically as partition index of single integer value. 
> > An inserted record is stored in a partition whose index equals > DatumGetUInt32(OidFunctionCall1(lookup_type_cache(key->parttypid[0], > TYPECACHE_HASH_PROC)->hash_proc, values[0])) % nparts/* Number of partitions > */ > ; > In the above example, this is > DatumGetUInt32(OidFunctionCall1(lookup_type_cache(key->parttypid[0], > TYPECACHE_HASH_PROC)->hash_proc, id)) % 3; > > postgres=# insert into h select generate_series(1,20); > INSERT 0 20 > postgres=# select tableoid::regclass,* from h; > tableoid | id > --+ > h1 | 3 > h1 | 5 > h1 | 17 > h1 | 19 > h2 | 2 > h2 | 6 > h2 | 7 > h2 | 11 > h2 | 12 > h2 | 14 > h2 | 15 > h2 | 18 > h2 | 20 > h3 | 1 > h3 | 4 > h3 | 8 > h3 | 9 > h3 | 10 > h3 | 13 > h3 | 16 > (20 rows) > > The number of partitions here can be dynamically added, and if a new > partition is created, the number of partitions changes, the calculated target > partitions will change, and the same data is not reasonable in different > partitions,So you need to re-calculate the existing data and insert the > target partition when you create a new partition. > > postgres=# create table h4 partition of h; > CREATE TABLE > postgres=# select tableoid::regclass,* from h; > tableoid | id > --+ > h1 | 5 > h1 | 17 > h1 | 19 > h1 | 6 > h1 | 12 > h1 | 8 > h1 | 13 > h2 | 11 > h2 | 14 > h3 | 1 > h3 | 9 > h3 | 2 > h3 | 15 > h4 | 3 > h4 | 7 > h4 | 18 > h4 | 20 > h4 | 4 > h4 | 10 > h4 | 16 > (20 rows) > > When querying the data, the hash partition uses the same algorithm as the > insertion, and filters out the table that does not need to be scanned. 
> > postgres=# explain analyze select * from h where id = 1; > QUERY PLAN > > > Append (cost=0.00..41.88 rows=13 width=4) (actual time=0.020..0.023 rows=1 > loops=1) >-> Seq Scan on h3 (cost=0.00..41.88 rows=13 width=4) (actual > time=0.013..0.016 rows=1 loops=1) > Filter: (id = 1) > Rows Removed by Filter: 3 > Planning time: 0.346 ms > Execution time: 0.061 ms > (6 rows) > > postgres=# explain analyze select * from h where id in (1,5);; > QUERY PLAN > > > Append (cost=0.00..83.75 rows=52 width=4) (actual time=0.016..0.028 rows=2 > loops=1) >-> Seq Scan on h1 (cost=0.00..41.88 rows=26 width=4) (actual > time=0.015..0.018 rows=1 loops=1) > Filter: (id = ANY ('{1,5}'::integer[])) > Rows Removed by Filter: 6 >-> Seq Scan on h3 (cost=0.00..41.88 rows=26 width=4) (actual > time=0.005..0.007 rows=1 loops=1) > Filter: (id = ANY ('{1,5}'::integer[])) > Rows Removed by Filter: 3 > Planning time: 0.720 ms > Execution time: 0.074 ms > (9 rows) > > postgres=# explain analyze select * from h where id = 1 or id = 5;; > QUERY PLAN > >
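The routing rule described in the proposal above (hash of the key, modulo the current number of partitions) can be sketched in Python. This is an illustrative sketch only: hash_proc below is a stand-in for PostgreSQL's type-specific hash function looked up via lookup_type_cache(), not the server's actual code.

```python
def hash_proc(value):
    # Stand-in for the type-specific hash_proc; any stable 32-bit
    # integer hash works for the illustration (Knuth multiplicative).
    return value * 2654435761 % 2**32

def target_partition(value, nparts):
    # Mirrors DatumGetUInt32(OidFunctionCall1(hash_proc, value)) % nparts
    return hash_proc(value) % nparts

# With 3 partitions, every row has a fixed home...
before = {v: target_partition(v, 3) for v in range(1, 21)}
# ...but adding a fourth partition changes the modulus, so many rows
# map to a different partition -- hence the re-distribution of existing
# rows on CREATE TABLE h4 PARTITION OF h that the proposal describes.
after = {v: target_partition(v, 4) for v in range(1, 21)}
moved = [v for v in before if before[v] != after[v]]
print(len(moved), "of 20 rows change partition")
```

This also shows why the approach is debated: unlike modulus/remainder bounds that stay fixed per partition, a scheme keyed to the current partition count forces data movement on every added partition.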
Re: [HACKERS] psql --batch
This doesn't really address the original issue though, that it's far from obvious how to easily and correctly script psql. That is another interesting argument. I understood that the issue was having to type these options, but now it is also having to remember which ones are relevant and wanted, which is a little different and more justifiable as an option. On that account, ISTM that '|' as a field separator is debatable, that the pager should be turned off... and maybe a few other things. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization
Hello Masahiko-san, Patch applies cleanly, compiles, works for me. At least: "Required to invoke" -> "Require to invoke". Fixed. I meant the added one about -I, not the pre-existing one about -i. About the code: is_no_vacuum should be a bool? We can change it but I think there is no difference actually. So keeping it would be better. I would like to insist a little bit: the name was declared as an int but passed to init and used as a bool there before the patch. Conceptually what is meant is really a bool, and I see no added value bar saving a few strokes to have an int. ISTM that recent pgbench changes have started turning old int-for-bool habits into using bool when bool is meant. initialize_cmds initialization: rather use pg_strdup instead of pg_malloc/strcpy? -I: pg_free before pg_strdup to avoid a small memory leak? I'm not sure I would have bothered with sizeof(char), but why not! I'm still a little bit annoyed by checkCustomCmds: when unhappy, it shows an error message and returns false, which immediately results in exit(1). However, the pattern elsewhere in pgbench is to show the error and exit immediately. I would suggest to simplify by void-ing the function and exiting instead of returning false. -- Fabien.
Re: [HACKERS] psql --batch
On 28 August 2017 at 15:34, Pavel Stehule wrote: > > > 2017-08-28 9:33 GMT+02:00 Fabien COELHO : > >> >> ISTM that the real pain is the "-v ON_ERRORS_STOP=1" which I occasionally encountered as well, the other one letter ones are not too bad. Maybe it would be enough to have a shortcut for this one, say "-B"? >>> >>> I agree with last sentence. I don't think so -qAtX are expected always, >>> but >>> "-v ON_ERRORS_STOP=1" is pretty often. >>> >> >> Yep. I often "\set" that in the script. >> >> What do you think about long option "--on-errors-stop" ? >>> >> >> It does not really relieve the typing pain Craig is complaining about, >> but it would be ok as the long option version if -B is added, and it is >> auto-completion friendly. > > > This doesn't really address the original issue though, that it's far from obvious how to easily and correctly script psql. I guess there's always the option of a docs patch for that. *shrug* I'll see what others have to say. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Make pg_regress print a connstring with sockdir
On 28 August 2017 at 15:19, Michael Paquier wrote: > On Mon, Aug 28, 2017 at 4:07 PM, Craig Ringer > wrote: > > == starting postmaster== > > running with PID 30235; connect with: > > psql "host='/tmp/pg_regress-j74yFE' port=50848 dbname='regression'" > > == creating database "regression" == > > Sorry if my words were confusing and have cost you three minutes of > development. I like better the one-line version :) > Now a socket path could be quite long. I can live with that personally. > I'm not fussed, I just think we should show it one way or the other. One nice thing about the two line form is that you can double-click/middle-click to open a new psql in the pg_regress session pretty much instantly. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] psql --batch
2017-08-28 9:33 GMT+02:00 Fabien COELHO: > > ISTM that the real pain is the "-v ON_ERRORS_STOP=1" which I occasionally >>> encountered as well, the other one letter ones are not too bad. Maybe it >>> would be enough to have a shortcut for this one, say "-B"? >>> >> >> I agree with last sentence. I don't think so -qAtX are expected always, >> but >> "-v ON_ERRORS_STOP=1" is pretty often. >> > > Yep. I often "\set" that in the script. > > What do you think about long option "--on-errors-stop" ? >> > > It does not really relieve the typing pain Craig is complaining about, but > it would be ok as the long option version if -B is added, and it is > auto-completion friendly. ok Pavel > > > -- > Fabien. >
Re: [HACKERS] psql --batch
ISTM that the real pain is the "-v ON_ERRORS_STOP=1" which I occasionally encountered as well, the other one letter ones are not too bad. Maybe it would be enough to have a shortcut for this one, say "-B"? I agree with last sentence. I don't think so -qAtX are expected always, but "-v ON_ERRORS_STOP=1" is pretty often. Yep. I often "\set" that in the script. What do you think about long option "--on-errors-stop" ? It does not really relieve the typing pain Craig is complaining about, but it would be ok as the long option version if -B is added, and it is auto-completion friendly. -- Fabien.
Re: [HACKERS] psql --batch
2017-08-28 8:56 GMT+02:00 Fabien COELHO: > > I find myself regurgitating the incantation >> >> psql -qAtX -v ON_ERRORS_STOP=1 >> >> quite a bit. It's not ... super friendly. >> >> It strikes me that we could possibly benefit from a 'psql --batch' option. >> >> Thoughts? >> > > The link between -qAtX and "batch" is not that fully obvious, especially > the unaligned tuples-only part. If so, why not some -F as well? > > ISTM that the real pain is the "-v ON_ERRORS_STOP=1" which I occasionally > encountered as well, the other one letter ones are not too bad. Maybe it > would be enough to have a shortcut for this one, say "-B"? > I agree with last sentence. I don't think so -qAtX are expected always, but "-v ON_ERRORS_STOP=1" is pretty often. What do you think about long option "--on-errors-stop" ? Regards Pavel > alias? > > -- > Fabien. >
Re: [HACKERS] Make pg_regress print a connstring with sockdir
On Mon, Aug 28, 2017 at 4:07 PM, Craig Ringer wrote: > == starting postmaster== > running with PID 30235; connect with: > psql "host='/tmp/pg_regress-j74yFE' port=50848 dbname='regression'" > == creating database "regression" == Sorry if my words were confusing and have cost you three minutes of development. I like better the one-line version :) Now a socket path could be quite long. I can live with that personally. -- Michael
Re: [HACKERS] Make pg_regress print a connstring with sockdir
== starting postmaster == running with PID 30235; connect with: psql "host='/tmp/pg_regress-j74yFE' port=50848 dbname='regression'" == creating database "regression" == On 28 August 2017 at 14:08, Michael Paquier wrote: > On Mon, Aug 28, 2017 at 2:28 PM, Craig Ringer > wrote: > > It's a pain having to find the postmaster command line to get the port > > pg_regress started a server on. We print the port in the pg_regress output, > > why not the socket directory / host? > > > > How about > > running on 'port=50848 host=/tmp/pg_regress-UMrcT3' with PID 16409 > > > > per the attached? > > > > If you'd prefer nicer wording at the expense of two lines, maybe > > > > running with PID 16409 > > connection string: 'port=50848 host=/tmp/blah' > Yeah, I think that this is a good idea. > -- > Michael > -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services

From c3613d4526ba04fb18011e2af1923b9a167effec Mon Sep 17 00:00:00 2001
From: Craig Ringer
Date: Mon, 28 Aug 2017 13:28:05 +0800
Subject: [PATCH v2] Show sockdir/hostname in pg_regress startup output
---
 src/test/regress/pg_regress.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/test/regress/pg_regress.c b/src/test/regress/pg_regress.c
index abb742b..4cccefc 100644
--- a/src/test/regress/pg_regress.c
+++ b/src/test/regress/pg_regress.c
@@ -2438,8 +2438,10 @@ regression_main(int argc, char *argv[], init_function ifunc, test_function tfunc
 #else
 #define ULONGPID(x) (unsigned long) (x)
 #endif
-	printf(_("running on port %d with PID %lu\n"),
-	       port, ULONGPID(postmaster_pid));
+	printf(_("running with PID %lu; connect with:\n"),
+	       ULONGPID(postmaster_pid));
+	printf(gettext_noop("  psql \"host='%s' port=%d dbname='%s'\"\n"),
+	       hostname ? hostname : sockdir, port, dblist->str);
 	}
 	else
 	{
--
2.9.5
Re: [HACKERS] psql --batch
On 28 August 2017 at 14:56, Fabien COELHOwrote: > > I find myself regurgitating the incantation >> >> psql -qAtX -v ON_ERRORS_STOP=1 >> >> quite a bit. It's not ... super friendly. >> >> It strikes me that we could possibly benefit from a 'psql --batch' option. >> >> Thoughts? >> > > The link between -qAtX and "batch" is not that fully obvious, especially > the unaligned tuples-only part. If so, why not some -F as well? > > q: quiet Pretty much always wanted for a batch mode run of anything. A: unaligned tuples Because alignment is pretty much never useful when you're parsing result sets with scripting (splitting, cut, etc) and just makes everything harder. The alignment psql uses isn't fixed, after all. t: tuples-only Headers just make everything more annoying to parse, and force you to do extra work to skip them. If you're running batch code you know the headers because you used a column-list form SELECT, or should've. You're unlikely to be ingesting them and using them to split up the tuple anyway. I think this one is a bit more arguable than the first two, though, as I can at least think of some cases where you might want it. X: skip .psqlrc Reliable, portable scripted psql shouldn't be using the local .psqlrc IMO. It's likely to just break things in exciting ways. But I can see it being reasonable to require this option to be supplied separately and just document it as "recommended" with --batch. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
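A short sketch of why the -A and -t flags above matter when scripting: the rows() function below is canned data standing in for real psql -qAt output (no server involved; the names and values are hypothetical).

```shell
#!/bin/sh
# With -qAt, psql prints no headers and no alignment padding: one row
# per line, fields joined by '|' (the default unaligned separator).
# We simulate that output here with printf.
rows() {
    printf '1|alice\n2|bob\n'
}

# Unaligned tuples-only output splits trivially on the separator,
# with no header line to skip and no variable-width padding to trim.
rows | while IFS='|' read -r id name; do
    echo "id=$id name=$name"
done
# prints:
#   id=1 name=alice
#   id=2 name=bob
```

With aligned, headed output the same loop would have to skip the header and dashed separator lines and strip padding first, which is exactly the friction Craig describes.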
Re: [HACKERS] Re: Poor cost estimate with interaction between table correlation and partial indexes
Michael Malis wrote: > Hmm... It seems the command I used for obtaining a patch I got from > here https://wiki.postgresql.org/wiki/Working_with_Git truncated part > of the patch. I've attached the file generated from git diff > --patience master improve-partial-index-correlation-calculation > --no-color > improve-correlated-partial-index-cost-v2.patch to this > email. What is the correct command for generating a context diff? Yeah, I've had patches truncated by that too and I've never cared enough to see about getting it fixed. I think it's a bug in the filterdiff utility, but I got a stupid answer from the Debian maintainer when I reported it and didn't care to follow up any further. Eventually I gave up on using context diffs because of this problem. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] psql --batch
I find myself regurgitating the incantation psql -qAtX -v ON_ERRORS_STOP=1 quite a bit. It's not ... super friendly. It strikes me that we could possibly benefit from a 'psql --batch' option. Thoughts? The link between -qAtX and "batch" is not that fully obvious, especially the unaligned tuples-only part. If so, why not some -F as well? ISTM that the real pain is the "-v ON_ERRORS_STOP=1" which I occasionally encountered as well, the other one letter ones are not too bad. Maybe it would be enough to have a shortcut for this one, say "-B"? alias? -- Fabien.
[HACKERS] NoMovementScanDirection in heapgettup() and heapgettup_pagemode()
Hi all, I was trying to study the NoMovementScanDirection part of heapgettup() and heapgettup_pagemode(). If I am right, there is no test in the test suite to hit this code. I ran make check-world but could not hit it. Also, the coverage report at https://coverage.postgresql.org/src/backend/access/heap/heapam.c.gcov.html shows this part of the code is not hit. Can somebody please help me to understand this part of the code and how to test it? -- Thanks and Regards Mithun C Y EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Make pg_regress print a connstring with sockdir
On Mon, Aug 28, 2017 at 2:28 PM, Craig Ringer wrote: > It's a pain having to find the postmaster command line to get the port > pg_regress started a server on. We print the port in the pg_regress output, > why not the socket directory / host? > > How about > running on 'port=50848 host=/tmp/pg_regress-UMrcT3' with PID 16409 > > per the attached? > > If you'd prefer nicer wording at the expense of two lines, maybe > > running with PID 16409 > connection string: 'port=50848 host=/tmp/blah' Yeah, I think that this is a good idea. -- Michael