Re: Reducing WaitEventSet syscall churn
On Tue, Jul 14, 2020 at 6:51 PM Thomas Munro wrote: > In the meantime, here's a rebase of the more straightforward patches > in the stack. These are the ones that deal only with fixed sets of > file descriptors, and they survive check-world on Linux, > Linux+EXEC_BACKEND (with ASLR disabled) and FreeBSD, and at least > check on macOS and Windows (my CI recipes need more work to get > check-world working on those two). There's one user-visible change > that I'd appreciate feedback on: I propose to drop the FATAL error > when the postmaster goes away, to make things more consistent. See > below for more on that. Here's the effect of patches 0001-0003 on the number of relevant system calls generate by "make check" on Linux and FreeBSD, according to strace/truss -f -c: epoll_create1: 4,825 -> 865 epoll_ctl: 12,454 -> 2,721 epoll_wait:~45k -> ~45k close: ~81k -> ~77k kqueue:4,618 -> 866 kevent:~54k -> ~46k close: ~65k -> ~61k I pushed those three patches, but will wait for more discussion on the rest.
Re: Is it useful to record whether plans are generic or custom?
On 2020/07/22 16:49, torikoshia wrote: On 2020-07-20 13:57, torikoshia wrote: As I proposed earlier in this thread, I'm now trying to add information about generic/cudstom plan to pg_stat_statements. I'll share the idea and the poc patch soon. Attached a poc patch. Thanks for the POC patch! With the patch, when I ran "CREATE EXTENSION pg_stat_statements", I got the following error. ERROR: function pg_stat_statements_reset(oid, oid, bigint) does not exist Main purpose is to decide (1) the user interface and (2) the way to get the plan type from pg_stat_statements. (1) the user interface I added a new boolean column 'generic_plan' to both pg_stat_statements view and the member of the hash key of pg_stat_statements. This is because as Legrand pointed out the feature seems useful under the condition of differentiating all the counters for a queryid using a generic plan and the one using a custom one. I don't like this because this may double the number of entries in pgss. Which means that the number of entries can more easily reach pg_stat_statements.max and some entries will be discarded. I thought it might be preferable to make a GUC to enable or disable this feature, but changing the hash key makes it harder. What happens if the server was running with this option enabled and then restarted with the option disabled? Firstly two entries for the same query were stored in pgss because the option was enabled. But when it's disabled and the server is restarted, those two entries should be merged into one at the startup of server? If so, that's problematic because it may take a long time. Therefore I think that it's better and simple to just expose the number of times generic/custom plan was chosen for each query. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Re: [PATCH] Tab completion for VACUUM of partitioned tables
On Thu, 30 Jul 2020 at 12:24, Fujii Masao wrote: > > > > On 2020/07/30 10:46, Michael Paquier wrote: > > On Thu, Jul 30, 2020 at 08:44:26AM +0900, Fujii Masao wrote: > >> Isn't it better to add the comment explaining why toast tables are > >> excluded from the tab completion for vacuum while they are vacuumable? > > > > Sounds sensible, still it does not apply only to vacuum. I would go > > as far as just adding a comment at the beginning of the block for > > schema queries: > > Yes, that seems better. Agreed. > BTW, one thing I think a bit strange is that indexes for toast tables > are included in tab-completion for REINDEX, for example. That is, > "REINDEX INDEX" displays "pg_toast.", and "REINDEX INDEX pg_toast." > displays indexes for toast tables. Maybe it's better to exclude them, > too. But there seems no simple way to do that. > So I'm ok with this current situation. Yeah, that's the reason why I mentioned about toast tables. "VACUUM " displays "pg_toast." but, "VACUUM pg_to" doesn't complement anything. Regards, -- Masahiko Sawadahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: IDEA: pg_stat_statements tracking utility statements by tag?
On Thu, Jul 30, 2020 at 3:54 AM Michael Paquier wrote: > > On Wed, Jul 29, 2020 at 06:35:41PM +0200, Julien Rouhaud wrote: > > There's at least PREPARE TRANSACTION / COMMIT PREPARED / ROLLBACK > > PREPARED that should be normalized too. I also don't think that we > > really want to have different entries for begin / Begin / BEGIN / > > bEgin and similar for many other commands, as the hash is computed > > based on the query text. > > Hmm. Do we really want to those commands fully normalized all the > time? There may be applications that care about the stats of some > commands that are for example prefixed the same way and would prefer > group those things together. By fully normalizing those commands all > the time, we would lose this option. > > An example. The ODBC driver uses its own grammar for internal > savepoint names, aka _EXEC_SVP_%p. If you mix that with a second > application that has its own naming policy for savepoints it would not > be possible anymore to make the difference in the stats between what > one or the other do. But if you have an OLTP application that uses ODBC, won't you already have 80+% of pgss entries being savepoint orders, which is really not helpful at all? We'd technically lose the ability to group such commands together, but in most cases the current behavior is quite harmful.
Re: HashAgg's batching counter starts at 0, but Hash's starts at 1.
On Wed, Jul 29, 2020 at 9:05 PM Justin Pryzby wrote: > So my 2ndary suggestion is to conditionalize based on the method rather than > value of space used. What's the advantage of doing it that way? -- Peter Geoghegan
Re: HashAgg's batching counter starts at 0, but Hash's starts at 1.
On Wed, Jul 29, 2020 at 08:35:08PM -0700, Peter Geoghegan wrote: > AFAICT sort (and IncrSort) don't fail to show a field value if it is > zero. Rather, they consistently show "space used" (in non-text > format), which can be either memory or disk space. Technically they > don't violate the letter of the law that you cite. (Though admittedly > this is a legalistic loophole -- the "space" value presumably has to > be interpreted according to different rules for programs that consume > non-text EXPLAIN output.) Sort shows this: Sort Method: "external merge" + Sort Space Used: 19520 + Sort Space Type: "Disk" + Incremental sort shows this: Sort Methods Used:+ - "external merge" + Sort Space Disk: + Average Sort Space Used: 128+ Peak Sort Space Used: 128 + So my 2ndary suggestion is to conditionalize based on the method rather than value of space used. --- a/src/backend/commands/explain.c +++ b/src/backend/commands/explain.c @@ -2830 +2830 @@ show_incremental_sort_group_info(IncrementalSortGroupInfo *groupInfo, - if (groupInfo->maxMemorySpaceUsed > 0) + if (groupInfo->sortMethods & SORT_TYPE_QUICKSORT) @@ -2847 +2847 @@ show_incremental_sort_group_info(IncrementalSortGroupInfo *groupInfo, - if (groupInfo->maxDiskSpaceUsed > 0) + if (groupInfo->sortMethods & SORT_TYPE_EXTERNAL_SORT) -- Justin
Re: PG 13 release notes, first draft
On Wednesday, July 29, 2020, Peter Geoghegan wrote: > On Wed, Jul 29, 2020 at 6:30 PM Bruce Momjian wrote: > > > There should be a note about this in the Postgres 13 release notes, > > > for the usual reasons. More importantly, the "Allow hash aggregation > > > to use disk storage for large aggregation result sets" feature should > > > reference the new GUC directly. Users should be advised that the GUC > > > may be useful in cases where they upgrade and experience a performance > > > regression linked to slower hash aggregation. Just including a > > > documentation link for the GUC would be very helpful. > > > > I came up with the attached patch. > > I was thinking something along like the following (after the existing > sentence about avoiding hash aggs in the planner): > > If you find that hash aggregation is slower than in previous major > releases of PostgreSQL, it may be useful to increase the value of > hash_mem_multiplier. This allows hash aggregation to use more memory > without affecting competing query operations that are generally less > likely to put any additional memory to good use. > > How about adding wording for GROUP BY as well to cater to users who are more comfortable thinking in terms of SQL statements instead of execution plans? David J.
Re: HashAgg's batching counter starts at 0, but Hash's starts at 1.
On Mon, Jul 27, 2020 at 8:36 PM Justin Pryzby wrote: > I don't know of a guideline for text format, but the issues I've raised for > HashAgg and IncrSort are based on previous commits which change to "show field > even if its value is zero" for non-text format: But the non-text format for IncrSort shows about the same information as sort, broken out by group. What's missing if you assume that sort is the gold standard? The objection to your argument from James (which could just as easily apply to regular sort from earlier releases) is that accurate information just isn't available as a practical matter. This is due to tuplesort implementation limitations that cannot be fixed now. See the comment block in tuplesort_get_stats() for an explanation. The hard part is showing memory used by external sorts. It's true that "Disk" is specifically shown by sort nodes output in text explain format, but you're talking about non-text formats so that's not really relevant AFAICT sort (and IncrSort) don't fail to show a field value if it is zero. Rather, they consistently show "space used" (in non-text format), which can be either memory or disk space. Technically they don't violate the letter of the law that you cite. (Though admittedly this is a legalistic loophole -- the "space" value presumably has to be interpreted according to different rules for programs that consume non-text EXPLAIN output.) Have I missed something? -- Peter Geoghegan
Re: Any objections to implementing LogicalDecodeMessageCB for pgoutput?
On Wed, Jul 29, 2020 at 9:41 PM Dave Cramer wrote: > For logical replication there is no need to implement this, but others are > using the pgoutput plugin for Change Data Capture. The reason they are > using pgoutput is because it is guaranteed to be available as it is in core > postgres. > > Implementing LogicalDecodeMessageCB provides some synchronization facility > that is not easily replicated. > > Thoughts ? > Attached is a draft patch that adds this functionality into the pgoutput plugin. A slot consumer can pass 'messages' as an option to include logical messages from pg_logical_emit_message in the replication flow. FWIW, we have been using pg_logical_emit_message to send application-level events alongside our change-data-capture for about two years, and we would move this part of our stack to pgoutput if message support was available. Looking forward to discussion and feedback. Cheers, Dave 0001-Add-logical-decoding-messages-to-pgoutput.patch.gz Description: application/gzip
Re: [PATCH] Tab completion for VACUUM of partitioned tables
On 2020/07/30 10:46, Michael Paquier wrote: On Thu, Jul 30, 2020 at 08:44:26AM +0900, Fujii Masao wrote: Isn't it better to add the comment explaining why toast tables are excluded from the tab completion for vacuum while they are vacuumable? Sounds sensible, still it does not apply only to vacuum. I would go as far as just adding a comment at the beginning of the block for schema queries: Yes, that seems better. BTW, one thing I think a bit strange is that indexes for toast tables are included in tab-completion for REINDEX, for example. That is, "REINDEX INDEX" displays "pg_toast.", and "REINDEX INDEX pg_toast." displays indexes for toast tables. Maybe it's better to exclude them, too. But there seems no simple way to do that. So I'm ok with this current situation. "Never include toast tables in any of those queries to avoid unnecessary bloat in the completions." The patch looks good to me except that. Indeed. FWIW, I would also adjust the comment on top of Query_for_list_of_indexables to not say "index creation", but just "supporting indexing" instead. Fujii-san, perhaps you would prefer taking care of this patch? I am fine to do it if you wish. Of course I'm fine if you work on this patch. So please feel free to do that! Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Re: PG 13 release notes, first draft
On Wed, Jul 29, 2020 at 7:48 PM Bruce Momjian wrote: > Well, that seems to be repeating what is already in the docs for > hash_mem_multiplier, which I try to avoid. One other direction is to > put something in the incompatibilities section. Does that make sense? I would prefer to put it next to the hash agg item itself. It's more likely to be noticed there, and highlighting it a little seems warranted. OTOH, this may not be a problem at all for many individual users. Framing it as a tip rather than a compatibility item gets that across. -- Peter Geoghegan
Re: PG 13 release notes, first draft
On Wed, Jul 29, 2020 at 07:00:43PM -0700, Peter Geoghegan wrote: > On Wed, Jul 29, 2020 at 6:30 PM Bruce Momjian wrote: > > > There should be a note about this in the Postgres 13 release notes, > > > for the usual reasons. More importantly, the "Allow hash aggregation > > > to use disk storage for large aggregation result sets" feature should > > > reference the new GUC directly. Users should be advised that the GUC > > > may be useful in cases where they upgrade and experience a performance > > > regression linked to slower hash aggregation. Just including a > > > documentation link for the GUC would be very helpful. > > > > I came up with the attached patch. > > I was thinking something along like the following (after the existing > sentence about avoiding hash aggs in the planner): > > If you find that hash aggregation is slower than in previous major > releases of PostgreSQL, it may be useful to increase the value of > hash_mem_multiplier. This allows hash aggregation to use more memory > without affecting competing query operations that are generally less > likely to put any additional memory to good use. Well, that seems to be repeating what is already in the docs for hash_mem_multiplier, which I try to avoid. One other direction is to put something in the incompatibilities section. Does that make sense? -- Bruce Momjian https://momjian.us EnterpriseDB https://enterprisedb.com The usefulness of a cup is in its emptiness, Bruce Lee
Re: PG 13 release notes, first draft
On Wed, Jul 29, 2020 at 6:30 PM Bruce Momjian wrote: > > There should be a note about this in the Postgres 13 release notes, > > for the usual reasons. More importantly, the "Allow hash aggregation > > to use disk storage for large aggregation result sets" feature should > > reference the new GUC directly. Users should be advised that the GUC > > may be useful in cases where they upgrade and experience a performance > > regression linked to slower hash aggregation. Just including a > > documentation link for the GUC would be very helpful. > > I came up with the attached patch. I was thinking something along like the following (after the existing sentence about avoiding hash aggs in the planner): If you find that hash aggregation is slower than in previous major releases of PostgreSQL, it may be useful to increase the value of hash_mem_multiplier. This allows hash aggregation to use more memory without affecting competing query operations that are generally less likely to put any additional memory to good use. -- Peter Geoghegan
Re: IDEA: pg_stat_statements tracking utility statements by tag?
On Wed, Jul 29, 2020 at 06:35:41PM +0200, Julien Rouhaud wrote: > There's at least PREPARE TRANSACTION / COMMIT PREPARED / ROLLBACK > PREPARED that should be normalized too. I also don't think that we > really want to have different entries for begin / Begin / BEGIN / > bEgin and similar for many other commands, as the hash is computed > based on the query text. Hmm. Do we really want to those commands fully normalized all the time? There may be applications that care about the stats of some commands that are for example prefixed the same way and would prefer group those things together. By fully normalizing those commands all the time, we would lose this option. An example. The ODBC driver uses its own grammar for internal savepoint names, aka _EXEC_SVP_%p. If you mix that with a second application that has its own naming policy for savepoints it would not be possible anymore to make the difference in the stats between what one or the other do. -- Michael signature.asc Description: PGP signature
Re: [PATCH] Tab completion for VACUUM of partitioned tables
On Thu, Jul 30, 2020 at 08:44:26AM +0900, Fujii Masao wrote: > Isn't it better to add the comment explaining why toast tables are > excluded from the tab completion for vacuum while they are vacuumable? Sounds sensible, still it does not apply only to vacuum. I would go as far as just adding a comment at the beginning of the block for schema queries: "Never include toast tables in any of those queries to avoid unnecessary bloat in the completions." > The patch looks good to me except that. Indeed. FWIW, I would also adjust the comment on top of Query_for_list_of_indexables to not say "index creation", but just "supporting indexing" instead. Fujii-san, perhaps you would prefer taking care of this patch? I am fine to do it if you wish. -- Michael signature.asc Description: PGP signature
Re: PG 13 release notes, first draft
On Wed, Jul 29, 2020 at 03:34:22PM -0700, Peter Geoghegan wrote: > Hi Bruce, > > On Fri, Jun 26, 2020 at 3:24 PM Bruce Momjian wrote: > > Patch attached and applied to PG 13. > > I committed the hash_mem_multiplier GUC to Postgres 13 just now. > > There should be a note about this in the Postgres 13 release notes, > for the usual reasons. More importantly, the "Allow hash aggregation > to use disk storage for large aggregation result sets" feature should > reference the new GUC directly. Users should be advised that the GUC > may be useful in cases where they upgrade and experience a performance > regression linked to slower hash aggregation. Just including a > documentation link for the GUC would be very helpful. I came up with the attached patch. -- Bruce Momjian https://momjian.us EnterpriseDB https://enterprisedb.com The usefulness of a cup is in its emptiness, Bruce Lee diff --git a/doc/src/sgml/release-13.sgml b/doc/src/sgml/release-13.sgml new file mode 100644 index 18e6497..2586645 *** a/doc/src/sgml/release-13.sgml --- b/doc/src/sgml/release-13.sgml *** Author: Jeff Davis Previously, hash aggregation was avoided if it was expected to use ! more than memory. --- 649,657 Previously, hash aggregation was avoided if it was expected to use ! more than memory. To use more ! memory for hash query operations, increase .
Re: Ought to use heap_multi_insert() for pg_attribute/depend insertions?
On Wed, Jul 29, 2020 at 11:34:07PM +0200, Daniel Gustafsson wrote: > Extreme cases for sure, but more importantly, there should be no cases when > this would cause an increase wrt the status quo. Yep. > Maybe it'd be worth pre-computing by a first pass which tracks pinned objects > in a bitmap; with a second pass which then knows how many and which to insert > into slots? Or it could be possible to just rebuild a new list of dependencies before insertion into the catalog. No objections with a bitmap, any approach would be fine here as long as there is a first pass on the item list. > Fair enough, let's break out pg_depend and I'll have another go at that. Thanks. Attached is a polished version of the patch that I intend to commit for pg_attribute and pg_shdepend. Let's look again at pg_depend later, as there are also links with the handling of dependencies for ALTER TABLE mainly. -- Michael diff --git a/src/include/catalog/heap.h b/src/include/catalog/heap.h index cbfdfe2abe..d31141c1a2 100644 --- a/src/include/catalog/heap.h +++ b/src/include/catalog/heap.h @@ -93,10 +93,11 @@ extern void heap_truncate_check_FKs(List *relations, bool tempTables); extern List *heap_truncate_find_FKs(List *relationIds); -extern void InsertPgAttributeTuple(Relation pg_attribute_rel, - Form_pg_attribute new_attribute, - Datum attoptions, - CatalogIndexState indstate); +extern void InsertPgAttributeTuples(Relation pg_attribute_rel, + TupleDesc tupdesc, + Oid new_rel_oid, + Datum *attoptions, + CatalogIndexState indstate); extern void InsertPgClassTuple(Relation pg_class_desc, Relation new_rel_desc, diff --git a/src/include/catalog/indexing.h b/src/include/catalog/indexing.h index 8be303870f..a7e2a9b26b 100644 --- a/src/include/catalog/indexing.h +++ b/src/include/catalog/indexing.h @@ -19,6 +19,7 @@ #define INDEXING_H #include "access/htup.h" +#include "nodes/execnodes.h" #include "utils/relcache.h" /* @@ -36,6 +37,10 @@ extern void CatalogCloseIndexes(CatalogIndexState indstate); extern void CatalogTupleInsert(Relation heapRel, HeapTuple tup); extern void CatalogTupleInsertWithInfo(Relation heapRel, HeapTuple tup, CatalogIndexState indstate); +extern void CatalogTuplesMultiInsertWithInfo(Relation heapRel, + TupleTableSlot **slot, + int ntuples, + CatalogIndexState indstate); extern void CatalogTupleUpdate(Relation heapRel, ItemPointer otid, HeapTuple tup); extern void CatalogTupleUpdateWithInfo(Relation heapRel, diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c index 8df2716de4..5eef225f5c 100644 --- a/src/backend/access/heap/heapam.c +++ b/src/backend/access/heap/heapam.c @@ -2164,8 +2164,8 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples, RelationPutHeapTuple(relation, buffer, heaptuples[ndone], false); /* - * Note that heap_multi_insert is not used for catalog tuples yet, but - * this will cover the gap once that is the case. + * For logical decoding we need combocids to properly decode the + * catalog. */ if (needwal && need_cids) log_heap_new_cid(relation, heaptuples[ndone]); @@ -2180,8 +2180,8 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples, RelationPutHeapTuple(relation, buffer, heaptup, false); /* - * We don't use heap_multi_insert for catalog tuples yet, but - * better be prepared... + * For logical decoding we need combocids to properly decode the + * catalog. */ if (needwal && need_cids) log_heap_new_cid(relation, heaptup); diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c index 3985326df6..2a18dca34d 100644 --- a/src/backend/catalog/heap.c +++ b/src/backend/catalog/heap.c @@ -710,70 +710,122 @@ CheckAttributeType(const char *attname, } /* - * InsertPgAttributeTuple - * Construct and insert a new tuple in pg_attribute. + * Cap the maximum amount of bytes allocated for InsertPgAttributeTuples() + * slots. + */ +#define MAX_PGATTRIBUTE_INSERT_BYTES 65535 + +/* + * InsertPgAttributeTuples + * Construct and insert a set of tuples in pg_attribute. * - * Caller has already opened and locked pg_attribute. new_attribute is the - * attribute to insert. attcacheoff is always initialized to -1, attacl, - * attfdwoptions and attmissingval are always initialized to NULL. + * Caller has already opened and locked pg_attribute. tupdesc contains the + * attributes to insert. attcacheoff is always initialized to -1, attacl, + * attfdwoptions and attmissingval are always initialized to NULL. attoptions + * must contain the same number of elements as tupdesc, or be NULL. * * indstate is the index state for CatalogTupleInsertWithInfo. It can be * passed as NULL, in which case we'll fetch the necessary info. (Don't do * this when inserting multiple attributes, because it's a tad more *
Re: Yet another fast GiST build (typo)
On Fri, Jul 10, 2020 at 6:55 PM Andrey M. Borodin wrote: > Thanks! Fixed. It's not a bug, but I think those 64 bit constants should be wrapped in UINT64CONST(), following our convention. I'm confused about these two patches: 0001 introduces gist_point_fastcmp(), but then 0002 changes it to gist_bbox_fastcmp(). Maybe you intended to keep both of them? Also 0002 seems to have fixups for 0001 squashed into it.
Re: Default setting for enable_hashagg_disk
On Mon, Jul 27, 2020 at 5:55 PM Peter Geoghegan wrote: > Pushed the hashagg_avoid_disk_plan patch -- thanks! Pushed the hash_mem_multiplier patch as well -- thanks again! As I've said before, I am not totally opposed to adding a true escape hatch. That has not proven truly necessary just yet. For now, my working assumption is that the problem on the table has been addressed. -- Peter Geoghegan
Re: HyperLogLog.c and pg_leftmost_one_pos32()
On Wed, Jul 29, 2020 at 10:08 AM Jeff Davis wrote: > Is there a reason that HyperLogLog doesn't use pg_leftmost_one_pos32()? Yes: HyperLogLog predates pg_leftmost_one_pos32(). > I tried the following patch and some brief performance tests seem to > show an improvement. Makes sense. How did you test this? What kind of difference are we talking about? I ported this code from the upstream C++ as part of the original abbreviated keys commit. Note that the cardinality of abbreviated keys are displayed when you set "trace_sort = on". > This came up because my recent commit 9878b643 uses HLL for estimating > the cardinality of spill files, which solves a few annoyances with > overpartitioning[1]. I think that you should change back the rhs() variable names to match HyperLogLog upstream (as well as the existing rhs() comments). > I think it's overall an improvement, but > addHyperLogLog() itself seemed to show up as a cost, so it can hurt > spilled-but-still-in-memory cases. I'd also like to backpatch this to > 13 (as I already did for 9878b643), if that's acceptable. I still wonder if it was ever necessary to add HLL to abbreviated keys. It only served to avoid some pretty narrow worse cases, at the expense of typical cases. Given that the existing users of HLL are pretty narrow, and given the importance of preserving the favorable performance characteristics of hash aggregate, I'm inclined to agree that it's worth backpatching to 13 now. Assuming it is a really measurable cost in practice. -- Peter Geoghegan
Re: [PATCH] Initial progress reporting for COPY command
On 2020/07/29 3:30, Pavel Stehule wrote: út 28. 7. 2020 v 20:25 odesílatel Josef Šimánek mailto:josef.sima...@gmail.com>> napsal: Thanks for the info. I am waiting for review. Is there any summary of requested changes needed? Maybe it is just noise - you wrote so you will resend a patch to different thread I tried to reattach the thread there. I'll prepare a new patch soon, what should I do? Just attach it again? Yeah, since I read this message, I was thinking that new patch will be posted. But, Josef, if the situation was changed, could you correct me? Anyway the patch seems not to be applied cleanly, so at least it needs to be updated to address that. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Re: Missing CFI in hlCover()?
I wrote: > (I wonder if we need to try to make it faster. I'd supposed that the > loop was cheap enough to be a non-problem, but with large enough > documents maybe not? It seems like converting to a hash table could > be worthwhile for a large doc.) OK, I dug into Stephen's test case off-list. While some CFIs would be a good idea, that's just band-aid'ing the symptom. The actual problem is that hlCover() is taking way too much time. The test case boils down to "common_word & rare_word" matched to a very long document, wherein the rare_word appears only near the front. Once we're past that match, hlCover() tries all the remaining matches for common_word, of which there are plenty ... and for each one, it scans clear to the end of the document, looking vainly for a substring that will satisfy the "common_word & rare_word" query. So obviously, this is O(N^2) in the length of the document :-(. I have to suppose that I introduced this problem in c9b0c678d, since AFAIR we weren't getting ts_headline-takes-forever complaints before that. It's not immediately obvious why the preceding algorithm doesn't have a similar issue, but then there's not anything at all that was obvious about the preceding algorithm. The most obvious way to fix the O(N^2) hazard is to put a limit on the length of "cover" (matching substring) that we'll consider. For the mark_hl_words caller, we won't highlight more than max_words words anyway, so it would be reasonable to bound covers to that length or some small multiple of it. The other caller mark_hl_fragments is willing to highlight up to max_fragments of up to max_words each, and there can be some daylight between the fragments, so it's not quite clear what the longest reasonable match is. Still, I doubt it's useful to show a headline consisting of a few words from the start of the document and a few words from thousands of words later, so a limit of max_fragments times max_words times something would probably be reasonable. We could hard-code a rule like that, or we could introduce a new explicit parameter for the maximum cover length. The latter would be more flexible, but we need something back-patchable and I'm concerned about the compatibility hazards of adding a new parameter in minor releases. So on the whole I propose hard-wiring a multiplier of, say, 10 for both these cases. regards, tom lane
Re: [PATCH] Tab completion for VACUUM of partitioned tables
On 2020/07/30 3:33, Justin Pryzby wrote: On Wed, Jul 29, 2020 at 03:21:19PM +0900, Michael Paquier wrote: On Wed, Jul 29, 2020 at 01:27:07PM +0900, Masahiko Sawada wrote: Good catch. The patch looks good to me. While this patch is logically correct. I think that we should try to not increase more the number of queries used to scan pg_class depending on a list of relkinds. For example, did you notice that your new Query_for_list_of_vacuumables becomes the same query as Query_for_list_of_indexables? You can make your patch more simple. I didn't notice. There's an argument for keeping them separate, but as long as there's a #define in between, this is fine, too. On Wed, Jul 29, 2020 at 08:05:57PM +0900, Michael Paquier wrote: On Wed, Jul 29, 2020 at 06:41:16PM +0900, Masahiko Sawada wrote: whereas Query_for_list_of_vacuumables should search for: RELKIND_RELATION RELKIND_PARTITIONED_TABLE RELKIND_MATVIEW RELKIND_TOASTVALUE FWIW, I don't think that we should make toast relations suggested to the user at all for any command. This comes down to the same point that we don't have pg_toast in search_path, and going down to this level of operations is an expert-level mode, not something we should recommend to the average user in psql IMO. Right. Tom's response to that suggestion a couple years ago I thought was pretty funny (I picture Dr. Claw at his desk using psql tab completion being presented with a list of pg_toast.pg_toast_NN OIDs: "which TOAST table should I vacuum next..") https://www.postgresql.org/message-id/14255.1536781...@sss.pgh.pa.us |I don't actually think that's a good idea. It's more likely to clutter |peoples' completion lists than offer anything they want. Even if someone |actually does want to vacuum a toast table, they are not likely to be |entering its name via tab completion; they're going to have identified |which table they want via some query, and then they'll be doing something |like copy-and-paste out of a query result. Isn't it better to add the comment explaining why toast tables are excluded from the tab completion for vacuum while they are vacuumable? The patch looks good to me except that. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Re: PG 13 release notes, first draft
Hi Bruce, On Fri, Jun 26, 2020 at 3:24 PM Bruce Momjian wrote: > Patch attached and applied to PG 13. I committed the hash_mem_multiplier GUC to Postgres 13 just now. There should be a note about this in the Postgres 13 release notes, for the usual reasons. More importantly, the "Allow hash aggregation to use disk storage for large aggregation result sets" feature should reference the new GUC directly. Users should be advised that the GUC may be useful in cases where they upgrade and experience a performance regression linked to slower hash aggregation. Just including a documentation link for the GUC would be very helpful. Thanks -- Peter Geoghegan
Re: recovering from "found xmin ... from before relfrozenxid ..."
On Thu, Jul 30, 2020 at 1:36 AM Robert Haas wrote: > On Wed, Jul 29, 2020 at 3:23 AM Thomas Munro wrote: > > On Tue, Jul 14, 2020 at 9:12 AM Robert Haas wrote: > > > A number of EDB customers have had this error crop on their tables for > > > reasons that we have usually not been able to determine. In many > > > > Do you happen to know if they ever used the > > snapshot-too-old feature? > > I don't have any reason to believe that they did. Why? Nothing specific, I was just contemplating the problems with that feature and the patches[1] proposed so far to fix some of them, and what types of corruption might be possible due to that stuff, and it occurred to me to ask if you'd thought about that in connection to these reports. [1] https://www.postgresql.org/message-id/flat/CA%2BTgmoY%3Daqf0zjTD%2B3dUWYkgMiNDegDLFjo%2B6ze%3DWtpik%2B3XqA%40mail.gmail.com
Re: Ought to use heap_multi_insert() for pg_attribute/depend insertions?
> On 29 Jul 2020, at 10:32, Michael Paquier wrote: > > On Wed, Jul 01, 2020 at 06:24:18PM +0900, Michael Paquier wrote: >> I am not sure either, still it looks worth thinking about it. >> Attached is a rebased version of the last patch. What this currently >> holds is just the switch to heap_multi_insert() for the three catalogs >> pg_attribute, pg_depend and pg_shdepend. One point that looks worth >> debating about is to how much to cap the data inserted at once. This >> uses 64kB for all three, with a number of slots chosen based on the >> size of each record, similarly to what we do for COPY. > > I got an extra round of review done for this patch. Thanks! > While on it, I have done some measurements to see the difference in > WAL produced and get an idea of the gain. > On HEAD, with a table that has 1300 attributes, this leads to 563kB of > WAL produced. With the patch, we get down to 505kB. That's an > extreme case of course, but that's nice a nice gain. > > A second test, after creating a database from a template that has > roughly 10k entries in pg_shdepend (10k empty tables actually), showed > a reduction from 2158kB to 1762kB in WAL. Extreme cases for sure, but more importantly, there should be no cases when this would cause an increase wrt the status quo. > Finally comes the third catalog, pg_depend, and there is one thing > that makes me itching about this part. We do a lot of useless work > for the allocation and destruction of the slots when there are pinned > dependencies, and there can be a lot of them. Just by running the > main regression tests, it is easy to see that in 0003 we still do a > lot of calls of recordMultipleDependencies() for one single > dependency, and that most of these are actually pinned. So we finish > by doing a lot of slot manipulation to insert nothing at the end, > contrary to the counterparts with pg_shdepend and pg_attribute. Maybe it'd be worth pre-computing by a first pass which tracks pinned objects in a bitmap; with a second pass which then knows how many and which to insert into slots? > In > short, I think that for now it would be fine to commit a patch that > does the multi-INSERT optimization for pg_attribute and pg_shdepend, > but that we need more careful work for pg_depend. Fair enough, let's break out pg_depend and I'll have another go at that. cheers ./daniel
Re: PATCH: Add uri percent-encoding for binary data
> On 1 Jul 2020, at 16:58, Alvaro Herrera wrote: > > On 2020-Jul-01, Daniel Gustafsson wrote: > >>> On 19 Mar 2020, at 08:55, Daniel Gustafsson wrote: >> >>> With no response for 2 weeks during the commitfest, I propose to move this >>> to >>> the next CF to allow time for discussions. >> >> This patch no longer applies, the failing hunk being in the docs part. As >> stated in my review earlier in the thread I don't think this feature is >> complete enough in its current form; having hacked on it a bit, what are your >> thoughts Alvaro? > > If the author (or some other person interested in the feature) submits a > version addressing the feedback, by all means let's consider it further; > but if nothing happens during this commitfest, I'd say we close as RwF > at end of July. As per discussion, this entry is closed as "Returned with Feedback". cheers ./daniel
Re: new heapcheck contrib module
On Mon, Jul 20, 2020 at 5:02 PM Mark Dilger wrote: > I've made the options 'all-visible', 'all-frozen', and 'none'. It defaults > to 'none'. That looks nice. > > I guess that > > could still be expensive if there's a lot of them, but needing > > ShareUpdateExclusiveLock rather than only AccessShareLock is a little > > unfortunate. > > I welcome strategies that would allow for taking a lesser lock. I guess I'm not seeing why you need any particular strategy here. Say that at the beginning you note the starting relfrozenxid of the table -- I think I would lean toward just ignoring datfrozenxid and the cluster-wide value completely. You also note the current value of the transaction ID counter. Those are the two ends of the acceptable range. Let's first consider the oldest acceptable XID, bounded by relfrozenxid. If you see a value that is older than the relfrozenxid value that you noted at the start, it is definitely invalid. If you see a newer value, it could still be older than the table's current relfrozenxid, but that doesn't seem very worrisome. If the user vacuumed the table while they were running this tool, they can always run the tool again afterward if they wish. Forcing the vacuum to wait by taking ShareUpdateExclusiveLock doesn't actually solve anything anyway: you STILL won't notice any problems the vacuum introduces, and in fact you are now GUARANTEED not to notice them, plus now the vacuum happens later. Now let's consider the newest acceptable XID, bounded by the value of the transaction ID counter. Any time you see a newer XID than the last value of the transaction ID counter that you observed, you go observe it again. If the value from the table still looks invalid, then you complain about it. Either way, you remember the new observation and check future tuples against that value. I think the patch is already doing this anyway; if it weren't, you'd need an even stronger lock, one sufficient to prevent any insert/update/delete activity on the table altogether. Maybe I'm just being dense here -- exactly what problem are you worried about? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [PATCH] Tab completion for VACUUM of partitioned tables
On Wed, Jul 29, 2020 at 03:21:19PM +0900, Michael Paquier wrote: > On Wed, Jul 29, 2020 at 01:27:07PM +0900, Masahiko Sawada wrote: > > Good catch. The patch looks good to me. > > While this patch is logically correct. I think that we should try to > not increase more the number of queries used to scan pg_class > depending on a list of relkinds. For example, did you notice that > your new Query_for_list_of_vacuumables becomes the same query as > Query_for_list_of_indexables? You can make your patch more simple. I didn't notice. There's an argument for keeping them separate, but as long as there's a #define in between, this is fine, too. On Wed, Jul 29, 2020 at 08:05:57PM +0900, Michael Paquier wrote: > On Wed, Jul 29, 2020 at 06:41:16PM +0900, Masahiko Sawada wrote: > > whereas Query_for_list_of_vacuumables should search for: > > > > RELKIND_RELATION > > RELKIND_PARTITIONED_TABLE > > RELKIND_MATVIEW > > RELKIND_TOASTVALUE > > FWIW, I don't think that we should make toast relations suggested to > the user at all for any command. This comes down to the same point > that we don't have pg_toast in search_path, and going down to this > level of operations is an expert-level mode, not something we should > recommend to the average user in psql IMO. Right. Tom's response to that suggestion a couple years ago I thought was pretty funny (I picture Dr. Claw at his desk using psql tab completion being presented with a list of pg_toast.pg_toast_NN OIDs: "which TOAST table should I vacuum next..") https://www.postgresql.org/message-id/14255.1536781...@sss.pgh.pa.us |I don't actually think that's a good idea. It's more likely to clutter |peoples' completion lists than offer anything they want. Even if someone |actually does want to vacuum a toast table, they are not likely to be |entering its name via tab completion; they're going to have identified |which table they want via some query, and then they'll be doing something |like copy-and-paste out of a query result. -- Justin >From c5b356797fc092aec076cca9faa83ea086516350 Mon Sep 17 00:00:00 2001 From: Justin Pryzby Date: Tue, 28 Jul 2020 11:56:26 -0500 Subject: [PATCH v2] Tab completion for VACUUM of partitioned tables --- src/bin/psql/tab-complete.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c index 272f799c24..b464aceea4 100644 --- a/src/bin/psql/tab-complete.c +++ b/src/bin/psql/tab-complete.c @@ -573,8 +573,11 @@ static const SchemaQuery Query_for_list_of_indexables = { .result = "pg_catalog.quote_ident(c.relname)", }; -/* Relations supporting VACUUM */ -static const SchemaQuery Query_for_list_of_vacuumables = { +/* Relations supporting VACUUM are currently same as those supporting indexing */ +#define Query_for_list_of_vacuumables Query_for_list_of_indexables + +/* Relations supporting CLUSTER */ +static const SchemaQuery Query_for_list_of_clusterables = { .catname = "pg_catalog.pg_class c", .selcondition = "c.relkind IN (" CppAsString2(RELKIND_RELATION) ", " @@ -584,9 +587,6 @@ static const SchemaQuery Query_for_list_of_vacuumables = { .result = "pg_catalog.quote_ident(c.relname)", }; -/* Relations supporting CLUSTER are currently same as those supporting VACUUM */ -#define Query_for_list_of_clusterables Query_for_list_of_vacuumables - static const SchemaQuery Query_for_list_of_constraints_with_schema = { .catname = "pg_catalog.pg_constraint c", .selcondition = "c.conrelid <> 0", -- 2.17.0
Re: proposal: unescape_text function
The following review has been posted through the commitfest application: make installcheck-world: tested, passed Implements feature: tested, passed Spec compliant: tested, passed Documentation:tested, passed Hi, The patch looks good to me. The new status of this patch is: Ready for Committer
Re: Online checksums patch - once again
On Mon, Jun 22, 2020 at 8:27 AM Daniel Gustafsson wrote: > Attached is a new version of the online checksums patch which, I hope, address > most of the concerns raised in previous reviews. There has been a fair amount > of fiddling done, so below is a summary of what has been done. Here are a bunch of comments based on a partial read-through of this patch. The most serious concerns, around synchronization, are down toward at the bottom. Sorry this is a bit eclectic as a review, but I wrote things down as I read through the patch more or less in the order I ran across them. Regarding disable_data_checksums(), I disagree with ereport(LOG, ...) here. If you want to indicate to the caller whether or not a state change occurred, you could consider returning a Boolean instead of void. If you want to do it with log messages, I vote for NOTICE, not LOG. Maybe NOTICE is better, because enable_data_checksums() seems to want to convey more information that you can represent in a Boolean, but then it should use NOTICE consistently, not a mix of NOTICE and LOG. Formatting needs work for project style: typically no braces around single statements, "ereport(WHATEVER," should always have a line break at that point. + * cluster, which was not initialized with checksums, this worker will ensure "which was not initialized with checksums" => "that does not running with checksums enabled"? + * turned on. In the case of disabling checksums, the state transition is + * recorded in the catalog and controlfile, no changes are performed + * on the data pages or in the catalog. Comma splice. Either write "controlfile; no" or "controlfile, and no". My spell-checker complains that controfile, clusterwide, inprogress, and endstate are not words. I think you should think about inserting spaces or, in the case of cluster-wide, a dash, unless they are being used as literals, in which case perhaps those instances should be quoted. "havent" needs an apostrophe. + * DataChecksumsWorker will compile a list of databases which exists at the which exist + * For each database, all relations which have storage are read and every data + * page is marked dirty to force a write with the checksum, this will generate Comma splice. Split into two sentences. + * In case checksums have been enabled and later disabled, when re-enabling + * pg_class.relhaschecksums will be reset to false before entering inprogress + * mode to ensure that all relations are re-processed. "If checksums are enabled, then disabled, and then re-enabled, every relation's pg_class.relhaschecksums field will be reset to false before entering the in-progress mode." + * Disabling checksums is done as an immediate operation as it only updates s/done as // + * to pg_class.relhaschecksums is performed as it only tracks state during is performed -> are necessary + * Access to other members can be done without a lock, as while they are + * in shared memory, they are never concurrently accessed. When a worker + * is running, the launcher is only waiting for that worker to finish. The way this is written, it sounds like you're saying that concurrent access might be possible when this structure isn't in shared memory. But since it's called DatachecksumsWorkerShmemStruct that's not likely a correct conclusion, so I think it needs rephrasing. + if (DatachecksumsWorkerShmem->launcher_started && !DatachecksumsWorkerShmem->abort) + started = true; Why not started = a && b instead of started = false; if (a && b) started = true? + { + LWLockRelease(DatachecksumsWorkerLock); + ereport(ERROR, + (errmsg("data checksums worker has been aborted"))); + } Errors always release LWLocks, so this seems unnecessary. Also, the error message looks confusing from a user perspective. What does it mean if I ask you to make me a cheeseburger and you tell me the cheeseburger has been eaten? I'm asking for a *new* cheeseburger (or in this case, a new worker). I wonder why this thing is inventing a brand new way of aborting a worker, anyway. Why not just keep track of the PID and send it SIGINT and have it use CHECK_FOR_INTERRUPTS()? That's already sprinkled all over the code, so it's likely to work better than some brand-new mechanism that will probably have checks in a lot fewer places. + vacuum_delay_point(); Huh? Why? + elog(DEBUG2, + "background worker \"datachecksumsworker\" starting to process relation %u", + relationId); This and similar messages seem likely they refer needlessly to internals, e.g. this could be "adding checksums to relation with OID %u" without needing to reference background workers or datachecksumworker. It would be even better if we could find a way to report relation names. + * so when the cluster comes back up processing will habe to be resumed. habe -> have + ereport(FATAL, + (errmsg("cannot enable checksums without the postmaster process"), + errhint("Restart the database and restart the checksumming process by calling pg_enable_data_checksums().")));
Re: logical replication empty transactions
Hi, Please see below review of the 0001-Skip-empty-transactions-for-logical-replication.patch The make check passes. + /* output BEGIN if we haven't yet */ + if (!data->xact_wrote_changes) + pgoutput_begin(ctx, txn); + + data->xact_wrote_changes = true; + IMO, xact_wrote_changes flag is better set inside the if condition as it does not need to be set repeatedly in subsequent calls to the same function. * Stash BEGIN data in plugin's LogicalDecodingContext.output_plugin_private when plugin's begin callback called, don't write anything to the outstream * Write out BEGIN message lazily when any other callback generates a message that does need to be written out * If no BEGIN written by the time COMMIT callback called, discard the COMMIT too. Check if sync rep enabled. if it is, call LogicalDecodingContext.update_progress from within the output plugin commit handler, otherwise just ignore the commit totally. Probably by calling OutputPluginUpdateProgress(). I think the code in the patch is similar to what has been described by Craig in the above snippet, except instead of stashing the BEGIN message and sending the message lazily, it simply maintains a flag in LogicalDecodingContext.output_plugin_private which defers calling output plugin's begin callback, until any other callback actually generates a remote write. Also, the patch does not contain the last part where he describes having OutputPluginUpdateProgress() for synchronous replication enabled transactions. However, some basic testing suggests that the patch does not have any notable adverse effect on either the replication lag or the sync_rep performance. I performed tests by setting up publisher and subscriber on the same machine with synchronous_commit = on and ran pgbench -c 12 -j 6 -T 300 on unpublished pgbench tables. I see that confirmed_flush_lsn is catching up just fine without any notable delay as compared to the test results without the patch. Also, the TPS for synchronous replication of empty txns with and without the patch remains similar. Having said that, these are initial findings and I understand better performance tests are required to measure reduction in consumption of network bandwidth and impact on synchronous replication and replication lag. Thank you, Rahila Syed
Re: Default setting for enable_hashagg_disk
On Sat, 2020-07-25 at 17:52 -0700, Peter Geoghegan wrote: > BTW, your HLL patch ameliorates the problem with my extreme "sorted > vs > random input" test case from this morning [1] (the thing that I just > discussed with Tomas). Without the HLL patch the sorted case had 2424 > batches. With the HLL patch it has 20. That at least seems like a > notable improvement. Committed. Though I did notice some overhead for spilled-but-still-in-memory cases due to addHyperLogLog() itself. It seems that it can be mostly eliminated with [1], though I'll wait to see if there's an objection because that would affect other users of HLL. Regards, Jeff Davis [1] https://www.postgresql.org/message-id/17068336d300fab76dd6131cbe1996df450dde38.ca...@j-davis.com
HyperLogLog.c and pg_leftmost_one_pos32()
Is there a reason that HyperLogLog doesn't use pg_leftmost_one_pos32()? I tried the following patch and some brief performance tests seem to show an improvement. This came up because my recent commit 9878b643 uses HLL for estimating the cardinality of spill files, which solves a few annoyances with overpartitioning[1]. I think it's overall an improvement, but addHyperLogLog() itself seemed to show up as a cost, so it can hurt spilled-but-still-in-memory cases. I'd also like to backpatch this to 13 (as I already did for 9878b643), if that's acceptable. Regards, Jeff Davis [1] https://www.postgresql.org/message-id/cah2-wznidojad-zbobnfozda5rtcs0jlsqczkdnu+ou1ngy...@mail.gmail.com diff --git a/src/backend/lib/hyperloglog.c b/src/backend/lib/hyperloglog.c index a5cc1f8b83b..5ef0cc9c449 100644 --- a/src/backend/lib/hyperloglog.c +++ b/src/backend/lib/hyperloglog.c @@ -49,6 +49,7 @@ #include #include "lib/hyperloglog.h" +#include "port/pg_bitutils.h" #define POW_2_32 (4294967296.0) #define NEG_POW_2_32 (-4294967296.0) @@ -240,13 +241,15 @@ estimateHyperLogLog(hyperLogLogState *cState) static inline uint8 rho(uint32 x, uint8 b) { - uint8 j = 1; + int p; - while (j <= b && !(x & 0x8000)) - { - j++; - x <<= 1; - } + if (x == 0) + return b + 1; + + p = 32 - pg_leftmost_one_pos32(x); + + if (p > b) + return b + 1; - return j; + return p; }
Re: IDEA: pg_stat_statements tracking utility statements by tag?
On Wed, Jul 29, 2020 at 5:29 PM Martijn van Oosterhout wrote: > > > On Wed, 29 Jul 2020 at 15:40, Julien Rouhaud wrote: >> >> On Wed, Jul 29, 2020 at 2:42 PM Fujii Masao >> wrote: >> > >> > >> > Or, we should extend the existing query normalization to handle also DDL? >> >> +1, introducing DDL normalization seems like a better way to go in the >> long run. Defining what should and shouldn't be normalized can be >> tricky though. > > > In principle, the only thing that really needs to be normalised is > SAVEPOINT/CURSOR names which are essentially random strings which have no > effect on the result. Most other stuff is material to the query. > > That said, I think "aggregate by tag" has value all by itself. Being able to > collapse all CREATE TABLES into a single line can be useful in some > situations. There's at least PREPARE TRANSACTION / COMMIT PREPARED / ROLLBACK PREPARED that should be normalized too. I also don't think that we really want to have different entries for begin / Begin / BEGIN / bEgin and similar for many other commands, as the hash is computed based on the query text.
Re: [Patch] ALTER SYSTEM READ ONLY
On Fri, Jul 24, 2020 at 3:12 PM Soumyadeep Chakraborty wrote: > Ah yes. We should then have ALTER SYSTEM WAL {PERMIT|PROHIBIT}. I don't > think we should say "READ ONLY" if we still allow on-disk file changes > after the ALTER SYSTEM command returns (courtesy dirty buffer flushes) > because it does introduce confusion, especially to an audience not privy > to this thread. When people hear "read-only" they may think of static on-disk > files immediately. They might think of a variety of things that are not a correct interpretation of what the feature does, but I think the way to handle that is to document it properly. I don't think making WAL a grammar keyword just for this is a good idea. I'm not totally stuck on this particular syntax if there's consensus on something else, but I seriously doubt that there will be consensus around adding parser keywords for this. > I don't have enough context to enumerate use cases for the advantages or > opportunities that would come with an assurance that the cluster's files > are frozen (and not covered by any existing utilities), but surely there > are some? Like the possibility of pg_upgrade on a running server while > it can entertain read-only queries? Surely, that's a nice one! I think that this feature is plenty complicated enough already, and we shouldn't make it more complicated to cater to additional use cases, especially when those use cases are somewhat uncertain and would probably require additional work in other parts of the system. For instance, I think it would be great to have an option to start the postmaster in a strictly "don't write ANYTHING" mode where regardless of the cluster state it won't write any data files or any WAL or even the control file. It would be useful for poking around on damaged clusters without making things worse. And it's somewhat related to the topic of this thread, but it's not THAT closely related. It's better to add features one at a time; you can always add more later, but if you make the individual ones too big and hard they don't get done. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: IDEA: pg_stat_statements tracking utility statements by tag?
On Wed, 29 Jul 2020 at 15:40, Julien Rouhaud wrote: > On Wed, Jul 29, 2020 at 2:42 PM Fujii Masao > wrote: > > > > > > Or, we should extend the existing query normalization to handle also DDL? > > +1, introducing DDL normalization seems like a better way to go in the > long run. Defining what should and shouldn't be normalized can be > tricky though. > In principle, the only thing that really needs to be normalised is SAVEPOINT/CURSOR names which are essentially random strings which have no effect on the result. Most other stuff is material to the query. That said, I think "aggregate by tag" has value all by itself. Being able to collapse all CREATE TABLES into a single line can be useful in some situations. Ideally the results of FETCH "cursor" should be combined with the DECLARE, but I really don't know how to go about that. Have a nice day, -- Martijn van Oosterhout http://svana.org/kleptog/
Re: Nicer error when connecting to standby with hot_standby=off
On 2020/04/03 22:49, James Coleman wrote: On Thu, Apr 2, 2020 at 5:53 PM David Zhang wrote: The following review has been posted through the commitfest application: make installcheck-world: not tested Implements feature: tested, passed Spec compliant: not tested Documentation:not tested I applied the patch to the latest master branch and run a test below. The error messages have been separated. Below is the test steps. ### setup primary server initdb -D /tmp/primary/data mkdir /tmp/archive_dir echo "archive_mode='on'" >> /tmp/primary/data/postgresql.conf echo "archive_command='cp %p /tmp/archive_dir/%f'" >> /tmp/primary/data/postgresql.conf pg_ctl -D /tmp/primary/data -l /tmp/primary-logs start ### setup host standby server pg_basebackup -p 5432 -w -R -D /tmp/hotstandby echo "primary_conninfo='host=127.0.0.1 port=5432 user=pgdev'" >> /tmp/hotstandby/postgresql.conf echo "restore_command='cp /tmp/archive_dir/%f %p'" >> /tmp/hotstandby/postgresql.conf echo "hot_standby = off" >> /tmp/hotstandby/postgresql.conf pg_ctl -D /tmp/hotstandby -l /tmp/hotstandby-logs -o "-p 5433" start ### keep trying to connect to hot standby server in order to get the error messages in different stages. while true; do echo "`date`"; psql postgres -p 5433 -c "SELECT txid_current_snapshot();" sleep 0.2; done ### before the patch psql: error: could not connect to server: FATAL: the database system is starting up ... ### after the patch, got different messages, one message indicates hot_standby is off psql: error: could not connect to server: FATAL: the database system is starting up ... psql: error: could not connect to server: FATAL: the database system is up, but hot_standby is off ... Thanks for the review and testing! Thanks for the patch! Here is the comment from me. + else if (!FatalError && pmState == PM_RECOVERY) + return CAC_STANDBY; /* connection disallowed on non-hot standby */ Even if hot_standby is enabled, pmState seems to indicate PM_RECOVERY until recovery has reached a consistent state. No? So, if my understanding is right, "FATAL: the database system is up, but hot_standby is off" can be logged even when hot_standby is on. Which sounds very confusing. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Re: pg_stat_statements: rows not updated for CREATE TABLE AS SELECT statements
On 2020/05/06 22:49, Asif Rehman wrote: The following review has been posted through the commitfest application: make installcheck-world: tested, passed Implements feature: tested, passed Spec compliant: not tested Documentation:not tested The patch applies cleanly and works as expected. Thanks for the review and test! Since this patch was marked as Ready for Committer, I pushed it. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Re: Include access method in listTables output
‐‐‐ Original Message ‐‐‐ On Saturday, July 25, 2020 2:41 AM, vignesh C wrote: > On Thu, Jul 16, 2020 at 7:35 PM Georgios gkokola...@protonmail.com wrote: > > > ‐‐‐ Original Message ‐‐‐ > > On Saturday, July 11, 2020 3:16 PM, vignesh C vignes...@gmail.com wrote: > > > > > On Mon, Jul 6, 2020 at 1:24 PM Georgios gkokola...@protonmail.com wrote: > > > > > > > ‐‐‐ Original Message ‐‐‐ > > > > On Monday, July 6, 2020 3:12 AM, Michael Paquier mich...@paquier.xyz > > > > wrote: > > > > > > > > > On Sun, Jul 05, 2020 at 07:13:10AM +0530, vignesh C wrote: > > > > > > > > > > > I'm not sure if we should include showViews, I had seen that the > > > > > > access method was not getting selected for view. > > > > > > > > > > +1. These have no physical storage, so you are looking here for > > > > > relkinds that satisfy RELKIND_HAS_STORAGE(). > > > > > > > > Thank you for the review. > > > > Find attached v4 of the patch. > > > > > > Thanks for fixing the comments. > > > Patch applies cleanly, make check & make check-world passes. > > > I was not sure if any documentation change is required or not for this > > > in doc/src/sgml/ref/psql-ref.sgml. Thoughts? > > > > Thank you for the comment. I added a line in docs. Admittedly, might need > > tweaking. > > Please find version 5 of the patch attached. > > Most changes looks fine, minor comments: Thank you for your comments. Please find the individual answers inline for completeness. I'm having issues understanding where you are going with the reviews, can you fully describe what is it that you wish to be done? > > \pset tuples_only false > -- check conditional tableam display > --- Create a heap2 table am handler with heapam handler > +\pset expanded off > +CREATE SCHEMA conditional_tableam_display; > +CREATE ROLE conditional_tableam_display_role; > +ALTER SCHEMA conditional_tableam_display OWNER TO > conditional_tableam_display_role; > +SET search_path TO conditional_tableam_display; > CREATE ACCESS METHOD heap_psql TYPE TABLE HANDLER heap_tableam_handler; > > This comment might have been removed unintentionally, do you want to > add it back? It was intentional as heap2 table am handler is not getting created. With the code additions the comment does seem out of place and thus removed. > > +-- access method column should not be displayed for sequences > +\ds+ > > - List of relations > > > - Schema | Name | Type | Owner | Persistence | Size | Description > ++--+--+---+-+--+- > +(0 rows) > > We can include one test for view. The list of cases in the test for both including and excluding storage is by no means exhaustive. I thought that this is acceptable. Adding a test for the view, will still not make it the test exhaustive. Are you sure you only suggest the view to be included or you would suggest an exhaustive list of tests. > > - if (pset.sversion >= 12 && !pset.hide_tableam && > > - (showTables || showMatViews || showIndexes)) > > - appendPQExpBuffer(, > > - ",\n am.amname as \"%s\"", > > - gettext_noop("Access Method")); > > - /* > - As of PostgreSQL 9.0, use pg_table_size() to show a more accurate > - size of a table, including FSM, VM and TOAST tables. > @@ -3772,6 +3778,12 @@ listTables(const char *tabtypes, const char > *pattern, bool verbose, bool showSys > appendPQExpBufferStr(, > "\nFROM pg_catalog.pg_class c" > "\n LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace"); > > - > - if (pset.sversion >= 12 && !pset.hide_tableam && > > - (showTables || showMatViews || showIndexes)) > > If conditions in both the places are the same, do you want to make it a > macro? No, I would rather not if you may. I think that a macro would not add to the readability rather it would remove from it. Those are two simple conditionals used in the same function very close to each other. > > Regards, > Vignesh > EnterpriseDB: http://www.enterprisedb.com >
Re: Making CASE error handling less surprising
On Fri, Jul 24, 2020 at 1:18 PM Tom Lane wrote: > The real bottom line is: if you don't want to do this, how else do > you want to fix the problem? I'm no longer willing to deny that > there is a problem. That's the wrong question. The right question is whether we're sufficiently certain that a particular proposal is an improvement over the status quo to justify changing something. It's better to do nothing than to do something that makes some cases better and other cases worse, because then instead of users having a problem with this, they have a variety of different problems depending on which release they are running. IMHO, changing the semantics of something like this is really scary and should be approached with great caution. You don't have to deny that something is a problem in order to admit that you might not have a perfect solution. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Making CASE error handling less surprising
On Sun, Jul 26, 2020 at 1:27 PM Chris Travers wrote: > The first (probably the best) would be a solution along the lines of yours > along with a session-level GUC variable which could determine whether case > branches can fold constants. Bluntly, that seems like a terrible idea. It's great if you are an expert DBA, because then you can adjust the behavior on your own system according to what works best for you. But if you are trying to write portable code that will work on any PostgreSQL instance, you now have to remember to test it with every possible value of the GUC and make sure it behaves the same way under all of them. That is a major burden on authors of tools and extensions, and if we add even three or four such GUCs with three or four possible values each, there are suddenly dozens or even hundreds of possible combinations to test. I think that adding GUCs for this kind of thing is a complete non-starter for that reason. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: INSERT INTO SELECT, Why Parallelism is not selected?
On Sun, Jul 26, 2020 at 7:24 AM Amit Kapila wrote: > No, "git diff --check" doesn't help. I have tried pgindent but that > also doesn't help neither was I expecting it to help. I am still not > able to figure out how I goofed up this but will spend some more time > on this. In the meantime, I have updated the patch to improve the > comments as suggested by Robert. Do let me know if you want to > edit/add something more? I still don't agree with this as proposed. + * For now, we don't allow parallel inserts of any form not even where the + * leader can perform the insert. This restriction can be uplifted once + * we allow the planner to generate parallel plans for inserts. We can If I'm understanding this correctly, this logic is completely backwards. We don't prohibit inserts here because we know the planner can't generate them. We prohibit inserts here because, if the planner somehow did generate them, it wouldn't be safe. You're saying that it's not allowed because we don't try to do it yet, but actually it's not allowed because we want to make sure that we don't accidentally try to do it. That's very different. + * parallelize inserts unless they generate a new commandid (ex. inserts + * into a table having foreign key column) or lock tuples (ex. statements + * like Insert .. Select For Update). I understand the part about generating new command IDs, but not the part about locking tuples. Why would that be a problem? Can it better explained here? Examples in comments are typically introduced with e.g., not ex. + * We should be able to parallelize + * the later case if we can ensure that no two parallel processes can ever + * operate on the same page. I don't know whether this is talking about two processes operating on the same page at the same time, or ever within a single query execution. If it's the former, perhaps we need to explain why that's a concern for parallel query but not otherwise; if it's the latter, that seems impossible to guarantee and imagining that we'll ever be able to do so seems like wishful thinking. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: IDEA: pg_stat_statements tracking utility statements by tag?
On Wed, Jul 29, 2020 at 2:42 PM Fujii Masao wrote: > > On 2020/07/29 18:24, Martijn van Oosterhout wrote: > > Hoi hackers, > > > > We've been using the pg_stat_statements extension to get an idea of the > > queries used in the database, but the table is being filled with entries > > like: > > > > SAVEPOINT sa_savepoint_NNN; > > RELEASE SAVEPOINT sa_savepoint_NNN; > > DECLARE "c_7f9451c4dcc0_5" CURSOR WITHOUT HOLD ... > > FETCH FORWARD 250 FROM "c_7f9451b03908_5" > > > > Since the unique id is different for each query, the aggregation does > > nothing and there are quite a lot of these drowning out the normal queries > > (yes, I'm aware this is an issue of itself). The only way to deal with this > > is "pg_stat_statements.track_utility=off". However, it occurs to me that if > > you just tracked the tags rather than the full query text you could at > > least track the number of such queries and how much time they take. So the > > above queries would be tracked under SAVEPOINT, RELEASE, DECLARE CURSOR and > > (I guess) FETCH respectively. But it would also catch DDL. > > > > Does this sound like something for which a patch would be accepted? > > Or, we should extend the existing query normalization to handle also DDL? +1, introducing DDL normalization seems like a better way to go in the long run. Defining what should and shouldn't be normalized can be tricky though.
Re: recovering from "found xmin ... from before relfrozenxid ..."
On Wed, Jul 29, 2020 at 3:23 AM Thomas Munro wrote: > On Tue, Jul 14, 2020 at 9:12 AM Robert Haas wrote: > > A number of EDB customers have had this error crop on their tables for > > reasons that we have usually not been able to determine. In many > > Do you happen to know if they ever used the > snapshot-too-old feature? I don't have any reason to believe that they did. Why? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: IDEA: pg_stat_statements tracking utility statements by tag?
On 2020/07/29 18:24, Martijn van Oosterhout wrote: Hoi hackers, We've been using the pg_stat_statements extension to get an idea of the queries used in the database, but the table is being filled with entries like: SAVEPOINT sa_savepoint_NNN; RELEASE SAVEPOINT sa_savepoint_NNN; DECLARE "c_7f9451c4dcc0_5" CURSOR WITHOUT HOLD ... FETCH FORWARD 250 FROM "c_7f9451b03908_5" Since the unique id is different for each query, the aggregation does nothing and there are quite a lot of these drowning out the normal queries (yes, I'm aware this is an issue of itself). The only way to deal with this is "pg_stat_statements.track_utility=off". However, it occurs to me that if you just tracked the tags rather than the full query text you could at least track the number of such queries and how much time they take. So the above queries would be tracked under SAVEPOINT, RELEASE, DECLARE CURSOR and (I guess) FETCH respectively. But it would also catch DDL. Does this sound like something for which a patch would be accepted? Or, we should extend the existing query normalization to handle also DDL? Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Re: Should we remove a fallback promotion? take 2
On 2020/07/28 20:35, Hamid Akhtar wrote: There have been no real objections on this patch and it seems to work. Thanks! So I pushed the patch. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Re: factorial function/phase out postfix operators?
On Wed, Jul 22, 2020 at 8:47 AM Mark Dilger wrote: > > > > > On Jul 18, 2020, at 1:00 AM, John Naylor > > wrote: > > > > pg_get_keywords() should probably have a column to display ability to > > act as a bare col label. Perhaps a boolean? If so, what do you think > > of using true/false for the new field in kwlist.h as well? Hi Mark, sorry for the delay. > I have broken this into its own patch. I like using a BARE_LABEL / > EXPLICIT_LABEL in kwlist.h because it is self-documenting. I don't care > about the *exact* strings that we choose for that, but using TRUE/FALSE in > kwlist.h makes it harder for a person adding a new keyword to know what to > place there. If they guess "FALSE", and also don't know about adding the new > keyword to the bare_label_keyword rule in gram.y, then those two mistakes > will agree with each other and the person adding the keyword won't likely > know they did it wrong. It is simple enough for gen_keywordlist.pl to > convert between what we use in kwlist.h and a boolean value for kwlist_d.h, > so I did it that way. Sounds fine to me. > > In the bikeshedding department, it seems "implicit" was chosen because > > it was distinct from "bare". I think "bare" as a descriptor should be > > kept throughout for readability's sake. Maybe BareColLabel could be > > "IDENT or bare_label_keyword" for example. Same for the $status var. > > The category "bare_label_keyword" is used in v3. As for the $status var, I > don't want to name that $bare, as I didn't go with your idea about using a > boolean. $status = "BARE_LABEL" vs "EXPLICIT_LABEL" makes sense to me, more > than $bare = "BARE_LABEL" vs "EXPLICIT_LABEL" does. "status" is still a bit > vague, so more bikeshedding is welcome. Yeah, it's very generic, but it's hard to find a short word for "can-be-used-as-a-bare-column-label-ness". > This patch does not attempt to remove pre-existing postfix operators from > existing databases, so users upgrading to the new major version who have > custom postfix operators will find that pg_upgrade chokes trying to recreate > the postfix operator. That's not great, but perhaps there is nothing > automated that we could do for them that would be any better. I'm thinking it would be good to have something like select oid from pg_operator where oprright = 0 and oid >= FirstNormalObjectId; in the pre-upgrade check. Other comments: 0001: + errhint("postfix operator support has been discontinued"))); This language seems more appropriate for release notes -- I would word the hint in the present, as in "postfix operators are not supported". Ditto the words "discontinuation", "has been removed", and "no longer works" elsewhere in the patch. +SELECT -5!; +SELECT -0!; +SELECT 0!; +SELECT 100!; I think one negative and one non-negative case is enough to confirm the syntax error. - gram.y still contains "POSTFIXOP" and "postfix-operator". - parse_expr.c looks like it has some now-unreachable code. 0002: + * All keywords can be used explicitly as a column label in expressions + * like 'SELECT 1234 AS keyword', but only some keywords can be used + * implicitly as column labels in expressions like 'SELECT 1234 keyword'. + * Those that can be used implicitly should be listed here. In my mind, "AS" is the thing that's implied when not present, so we should reword this to use the "bare" designation when talking about the labels. I think there are contexts elsewhere where the implicit column label is "col1, col2, col3...". I can't remember offhand where that is though. - * kwlist.h's table from one source of truth.) + * kwlist.h's table from a common master list.) Off topic. 0003: First off, I get a crash when trying select * from pg_get_keywords(); and haven't investigated further. I don't think the returned types match, though. Continuing on, I think 2 and 3 can be squashed together. If anything, it should make revisiting cosmetic decisions easier. + TupleDescInitEntry(tupdesc, (AttrNumber) 4, "bare", +BOOLOID, -1, 0); Perhaps something a bit meatier for the user-visible field name. I don't have a great suggestion. - proname => 'pg_get_keywords', procost => '10', prorows => '400', + proname => 'pg_get_keywords', procost => '10', prorows => '450', Off topic for this patch. Not sure it matters much, either. "EXPLICIT_LABEL" -- continuing my line of thought above, all labels are explicit, that's why they're called labels. Brainstorm: EXPLICIT_AS_LABEL EXPLICIT_AS NON_BARE_LABEL *shrug* + # parser/kwlist.h lists each keyword as either bare or + # explicit, but ecpg neither needs nor has any such PL/pgSQL also uses this script, so maybe just phrase it to exclude the core keyword list. -- John Naylorhttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [Patch] ALTER SYSTEM READ ONLY
On Fri, Jul 24, 2020 at 10:40 PM Soumyadeep Chakraborty < soumyadeep2...@gmail.com> wrote: > On Thu, Jul 23, 2020 at 10:14 PM Amul Sul wrote: > > > > On Fri, Jul 24, 2020 at 6:28 AM Soumyadeep Chakraborty < > soumyadeep2...@gmail.com> wrote: > > > In case it is necessary, the patch set does not wait for the > checkpoint to > > > complete before marking the system as read-write. Refer: > > > > > > /* Set final state by clearing in-progress flag bit */ > > > if (SetWALProhibitState(wal_state & > > ~(WALPROHIBIT_TRANSITION_IN_PROGRESS))) > > > { > > > if ((wal_state & WALPROHIBIT_STATE_READ_ONLY) != 0) > > > ereport(LOG, (errmsg("system is now read only"))); > > > else > > > { > > > /* Request checkpoint */ > > > RequestCheckpoint(CHECKPOINT_IMMEDIATE); > > > ereport(LOG, (errmsg("system is now read write"))); > > > } > > > } > > > > > > We should RequestCheckpoint(CHECKPOINT_IMMEDIATE | CHECKPOINT_WAIT) > before > > > we SetWALProhibitState() and do the ereport(), if we have a read-write > > > state change request. > > > > > +1, I too have the same question. > > > > > > > > FWIW, I don't we can request CHECKPOINT_WAIT for this place, otherwise, > it > > think > > it will be deadlock case -- checkpointer process waiting for itself. > > We should really just call CreateCheckPoint() here instead of > RequestCheckpoint(). > > The only setting flag would have been enough for now, the next loop of CheckpointerMain() will anyway be going to call CreateCheckPoint() without waiting. I used RequestCheckpoint() to avoid duplicate flag setting code. Also, I think RequestCheckpoint() will be better so that we don't need to deal will the standalone backend, the only imperfection is it will unnecessary signal itself, that would be fine I guess. > > 3. Some of the functions that were added such as GetWALProhibitState(), > > > IsWALProhibited() etc could be declared static inline. > > > > > IsWALProhibited() can be static but not GetWALProhibitState() since it > > needed to > > be accessible from other files. > > If you place a static inline function in a header file, it will be > accessible from other files. E.g. pg_atomic_* functions. > Well, the current patch set also has few inline functions in the header file. But, I don't think we can do the same for GetWALProhibitState() without changing the XLogCtl structure scope which is local to xlog.c file and the changing XLogCtl scope would be a bad idea. Regards, Amul
Re: [PATCH] Tab completion for VACUUM of partitioned tables
On Wed, Jul 29, 2020 at 06:41:16PM +0900, Masahiko Sawada wrote: > whereas Query_for_list_of_vacuumables should search for: > > RELKIND_RELATION > RELKIND_PARTITIONED_TABLE > RELKIND_MATVIEW > RELKIND_TOASTVALUE FWIW, I don't think that we should make toast relations suggested to the user at all for any command. This comes down to the same point that we don't have pg_toast in search_path, and going down to this level of operations is an expert-level mode, not something we should recommend to the average user in psql IMO. -- Michael signature.asc Description: PGP signature
Re: [PATCH] Tab completion for VACUUM of partitioned tables
On Wed, 29 Jul 2020 at 15:21, Michael Paquier wrote: > > On Wed, Jul 29, 2020 at 01:27:07PM +0900, Masahiko Sawada wrote: > > Good catch. The patch looks good to me. > > While this patch is logically correct. I think that we should try to > not increase more the number of queries used to scan pg_class > depending on a list of relkinds. For example, did you notice that > your new Query_for_list_of_vacuumables becomes the same query as > Query_for_list_of_indexables? Oh, I didn't realize that. Looking at target relation kinds for operations in-depth, I think that the relation list for index creation and the relation list for vacuum is different. Query_for_list_of_indexables should search for: RELKIND_RELATION RELKIND_PARTITIONED_TABLE RELKIND_MATVIEW whereas Query_for_list_of_vacuumables should search for: RELKIND_RELATION RELKIND_PARTITIONED_TABLE RELKIND_MATVIEW RELKIND_TOASTVALUE Also, Query_for_list_of_clusterables is further different from the above two lists. It should search for: RELKIND_RELATION RELKIND_MATVIEW RELKIND_TOASTVALUE Regards, -- Masahiko Sawadahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [PATCH]Fix ja.po error
On Wed, Jul 29, 2020 at 08:42:27AM +, Lu, Chenyang wrote: > When I was using PostgreSQL, I noticed that the output of the > Japanese messages was inconsistent with the English messages. > The Japanese message needs to be modified,so I made the patch. Indeed, good catch. This needs to be applied to the translation repository first though: https://git.postgresql.org/gitweb/?p=pgtranslation/messages.git;a=summary I am adding Alvaro and Peter in CC as they take care of that usually (I don't think I have an access to this repo). -- Michael signature.asc Description: PGP signature
IDEA: pg_stat_statements tracking utility statements by tag?
Hoi hackers, We've been using the pg_stat_statements extension to get an idea of the queries used in the database, but the table is being filled with entries like: SAVEPOINT sa_savepoint_NNN; RELEASE SAVEPOINT sa_savepoint_NNN; DECLARE "c_7f9451c4dcc0_5" CURSOR WITHOUT HOLD ... FETCH FORWARD 250 FROM "c_7f9451b03908_5" Since the unique id is different for each query, the aggregation does nothing and there are quite a lot of these drowning out the normal queries (yes, I'm aware this is an issue of itself). The only way to deal with this is "pg_stat_statements.track_utility=off". However, it occurs to me that if you just tracked the tags rather than the full query text you could at least track the number of such queries and how much time they take. So the above queries would be tracked under SAVEPOINT, RELEASE, DECLARE CURSOR and (I guess) FETCH respectively. But it would also catch DDL. Does this sound like something for which a patch would be accepted? Have a nice day, -- Martijn van Oosterhout http://svana.org/kleptog/
[PATCH]Fix ja.po error
Hi,hackers When I was using PostgresQL, I noticed that the output of the Japanese messages was inconsistent with the English messages. The Japanese message needs to be modified,so I made the patch. See the attachment for the patch. Best regards ja_po.patch Description: ja_po.patch
Re: [POC] Fast COPY FROM command for the table with foreign partitions
On 7/29/20 1:03 PM, Amit Langote wrote: Hi Andrey, Thanks for updating the patch. I will try to take a look later. On Wed, Jul 22, 2020 at 6:09 PM Andrey V. Lepikhov wrote: On 7/16/20 2:14 PM, Amit Langote wrote: * Why the "In" in these API names? + /* COPY a bulk of tuples into a foreign relation */ + BeginForeignCopyIn_function BeginForeignCopyIn; + EndForeignCopyIn_function EndForeignCopyIn; + ExecForeignCopyIn_function ExecForeignCopyIn; I used an analogy from copy.c. Hmm, if we were going to also need *ForeignCopyOut APIs, maybe it makes sense to have "In" here, but maybe we don't, so how about leaving out the "In" for clarity? Ok, sounds good. * I see that the remote copy is performed from scratch on every call of postgresExecForeignCopyIn(), but wouldn't it be more efficient to send the `COPY remote_table FROM STDIN` in postgresBeginForeignCopyIn() and end it in postgresEndForeignCopyIn() when there are no errors during the copy? It is not possible. FDW share one connection between all foreign relations from a server. If two or more partitions will be placed at one foreign server you will have problems with concurrent COPY command. Ah, you're right. I didn't consider multiple foreign partitions pointing to the same server. Indeed, we would need separate connections to a given server to COPY to multiple remote relations on that server in parallel. May be we can create new connection for each partition? Yeah, perhaps, although it sounds like something that might be more generally useful and so we should work on that separately if at all. I will try to prepare a separate patch. I tried implementing these two changes -- pgfdw_copy_data_dest_cb() and sending `COPY remote_table FROM STDIN` only once instead of on every flush -- and I see significant speedup. Please check the attached patch that applies on top of yours. I integrated first change and rejected the second by the reason as above. Thanks. Will send more comments after reading the v5 patch. Ok. I'll be waiting for the end of your review. -- regards, Andrey Lepikhov Postgres Professional
Re: Ought to use heap_multi_insert() for pg_attribute/depend insertions?
On Wed, Jul 01, 2020 at 06:24:18PM +0900, Michael Paquier wrote: > I am not sure either, still it looks worth thinking about it. > Attached is a rebased version of the last patch. What this currently > holds is just the switch to heap_multi_insert() for the three catalogs > pg_attribute, pg_depend and pg_shdepend. One point that looks worth > debating about is to how much to cap the data inserted at once. This > uses 64kB for all three, with a number of slots chosen based on the > size of each record, similarly to what we do for COPY. I got an extra round of review done for this patch. One spot was missed in heap_multi_insert() for a comment telling catalogs not using multi inserts. After some consideration, I think that using 64kB as a base number to calculate the number of slots should be fine, similarly to COPY. While on it, I have done some measurements to see the difference in WAL produced and get an idea of the gain. For example, this function would create one table with a wanted number of attributes: CREATE OR REPLACE FUNCTION create_cols(tabname text, num_cols int) RETURNS VOID AS $func$ DECLARE query text; BEGIN query := 'CREATE TABLE ' || tabname || ' ('; FOR i IN 1..num_cols LOOP query := query || 'a_' || i::text || ' int'; IF i != num_cols THEN query := query || ', '; END IF; END LOOP; query := query || ')'; EXECUTE format(query); END $func$ LANGUAGE plpgsql; On HEAD, with a table that has 1300 attributes, this leads to 563kB of WAL produced. With the patch, we get down to 505kB. That's an extreme case of course, but that's nice a nice gain. A second test, after creating a database from a template that has roughly 10k entries in pg_shdepend (10k empty tables actually), showed a reduction from 2158kB to 1762kB in WAL. Finally comes the third catalog, pg_depend, and there is one thing that makes me itching about this part. We do a lot of useless work for the allocation and destruction of the slots when there are pinned dependencies, and there can be a lot of them. Just by running the main regression tests, it is easy to see that in 0003 we still do a lot of calls of recordMultipleDependencies() for one single dependency, and that most of these are actually pinned. So we finish by doing a lot of slot manipulation to insert nothing at the end, contrary to the counterparts with pg_shdepend and pg_attribute. In short, I think that for now it would be fine to commit a patch that does the multi-INSERT optimization for pg_attribute and pg_shdepend, but that we need more careful work for pg_depend. For example we could go through all the dependencies first and recalculate the number of slots to use depending on what is pinned or not, but this would make sense actually when more dependencies are inserted at once in more code paths, mostly for ALTER TABLE. So this needs more consideration IMO. Thoughts? -- Michael signature.asc Description: PGP signature
Re: [POC] Fast COPY FROM command for the table with foreign partitions
Hi Andrey, Thanks for updating the patch. I will try to take a look later. On Wed, Jul 22, 2020 at 6:09 PM Andrey V. Lepikhov wrote: > On 7/16/20 2:14 PM, Amit Langote wrote: > > * Why the "In" in these API names? > > > > + /* COPY a bulk of tuples into a foreign relation */ > > + BeginForeignCopyIn_function BeginForeignCopyIn; > > + EndForeignCopyIn_function EndForeignCopyIn; > > + ExecForeignCopyIn_function ExecForeignCopyIn; > > I used an analogy from copy.c. Hmm, if we were going to also need *ForeignCopyOut APIs, maybe it makes sense to have "In" here, but maybe we don't, so how about leaving out the "In" for clarity? > > * I see that the remote copy is performed from scratch on every call > > of postgresExecForeignCopyIn(), but wouldn't it be more efficient to > > send the `COPY remote_table FROM STDIN` in > > postgresBeginForeignCopyIn() and end it in postgresEndForeignCopyIn() > > when there are no errors during the copy? > > It is not possible. FDW share one connection between all foreign > relations from a server. If two or more partitions will be placed at one > foreign server you will have problems with concurrent COPY command. Ah, you're right. I didn't consider multiple foreign partitions pointing to the same server. Indeed, we would need separate connections to a given server to COPY to multiple remote relations on that server in parallel. > May be we can create new connection for each partition? Yeah, perhaps, although it sounds like something that might be more generally useful and so we should work on that separately if at all. > > I tried implementing these two changes -- pgfdw_copy_data_dest_cb() > > and sending `COPY remote_table FROM STDIN` only once instead of on > > every flush -- and I see significant speedup. Please check the > > attached patch that applies on top of yours. > > I integrated first change and rejected the second by the reason as above. Thanks. Will send more comments after reading the v5 patch. -- Amit Langote EnterpriseDB: http://www.enterprisedb.com
Re: [Patch] Optimize dropping of relation buffers using dlist
On 17.06.2020 09:14, k.jami...@fujitsu.com wrote: Hi, Since the last posted version of the patch fails, attached is a rebased version. Written upthread were performance results and some benefits and challenges. I'd appreciate your feedback/comments. Regards, Kirk Jamison As far as i understand this patch can provide significant improvement of performance only in case of recovery of truncates of large number of tables. You have added shared hash of relation buffers and certainly if adds some extra overhead. According to your latest results this overhead is quite small. But it will be hard to prove that there will be no noticeable regression at some workloads. I wonder if you have considered case of local hash (maintained only during recovery)? If there is after-crash recovery, then there will be no concurrent access to shared buffers and this hash will be up-to-date. in case of hot-standby replica we can use some simple invalidation (just one flag or counter which indicates that buffer cache was updated). This hash also can be constructed on demand when DropRelFileNodeBuffers is called first time (so w have to scan all buffers once, but subsequent drop operation will be fast. i have not thought much about it, but it seems to me that as far as this problem only affects recovery, we do not need shared hash for it.
Re: recovering from "found xmin ... from before relfrozenxid ..."
On Tue, Jul 14, 2020 at 9:12 AM Robert Haas wrote: > A number of EDB customers have had this error crop on their tables for > reasons that we have usually not been able to determine. In many Do you happen to know if they ever used the snapshot-too-old feature?
Re: Improving connection scalability: GetSnapshotData()
On Wed, Jul 29, 2020 at 6:15 PM Thomas Munro wrote: > +static inline FullTransactionId > +FullXidViaRelative(FullTransactionId rel, TransactionId xid) > > I'm struggling to find a better word for this than "relative". The best I've got is "anchor" xid. It is an xid that is known to limit nextFullXid's range while the receiving function runs.
Re: Rethinking opclass member checks and dependency strength
The following review has been posted through the commitfest application: make installcheck-world: tested, passed Implements feature: tested, passed Spec compliant: not tested Documentation:not tested Looks good to me. CORRECTION: In my previous review I had mistakenly mentioned that it was causing a server crash. Tests run perfectly fine without any errors. The new status of this patch is: Ready for Committer
Re: [PATCH] Tab completion for VACUUM of partitioned tables
On Wed, Jul 29, 2020 at 01:27:07PM +0900, Masahiko Sawada wrote: > Good catch. The patch looks good to me. While this patch is logically correct. I think that we should try to not increase more the number of queries used to scan pg_class depending on a list of relkinds. For example, did you notice that your new Query_for_list_of_vacuumables becomes the same query as Query_for_list_of_indexables? You can make your patch more simple. -- Michael signature.asc Description: PGP signature
Re: proposal: unescape_text function
Hi > > Hi Pavel, > > Since the idea originated from unescaping unicode string literals i.e. >select unescape('Odpov\u011Bdn\u00E1 osoba'); > > Shouldn't the built-in function support the above syntax as well? > good idea. The prefixes u (4 digits) and U (8 digits) are supported Regards Pavel > -- > Asif Rehman > Highgo Software (Canada/China/Pakistan) > URL : www.highgo.ca > > diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index 959f6a1c2f..126d3483e6 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -3539,6 +3539,38 @@ repeat('Pg', 4) PgPgPgPg + + + + unicode_unescape + +unicode_unescape ( string text +, escape_char text ) +text + + +Evaluate escaped unicode chars (4 or 6 digits), with prefix +u (4 digits) or with prefix +U (8 digits) to chars. + + +unicode_unescape('\0441\043B\043E\043D') +слон + + +unicode_unescape('d\0061t\+61') +data + + +unicode_unescape('d!0061t!+61', '!') +data + + +unicode_unescape('d\u0061t\U0061') +data + + + diff --git a/src/backend/parser/parser.c b/src/backend/parser/parser.c index be86eb37fe..c7f94298c1 100644 --- a/src/backend/parser/parser.c +++ b/src/backend/parser/parser.c @@ -26,7 +26,6 @@ #include "parser/parser.h" #include "parser/scansup.h" -static bool check_uescapechar(unsigned char escape); static char *str_udeescape(const char *str, char escape, int position, core_yyscan_t yyscanner); @@ -278,44 +277,6 @@ base_yylex(YYSTYPE *lvalp, YYLTYPE *llocp, core_yyscan_t yyscanner) return cur_token; } -/* convert hex digit (caller should have verified that) to value */ -static unsigned int -hexval(unsigned char c) -{ - if (c >= '0' && c <= '9') - return c - '0'; - if (c >= 'a' && c <= 'f') - return c - 'a' + 0xA; - if (c >= 'A' && c <= 'F') - return c - 'A' + 0xA; - elog(ERROR, "invalid hexadecimal digit"); - return 0; /* not reached */ -} - -/* is Unicode code point acceptable? */ -static void -check_unicode_value(pg_wchar c) -{ - if (!is_valid_unicode_codepoint(c)) - ereport(ERROR, -(errcode(ERRCODE_SYNTAX_ERROR), - errmsg("invalid Unicode escape value"))); -} - -/* is 'escape' acceptable as Unicode escape character (UESCAPE syntax) ? */ -static bool -check_uescapechar(unsigned char escape) -{ - if (isxdigit(escape) - || escape == '+' - || escape == '\'' - || escape == '"' - || scanner_isspace(escape)) - return false; - else - return true; -} - /* * Process Unicode escapes in "str", producing a palloc'd plain string * diff --git a/src/backend/parser/scansup.c b/src/backend/parser/scansup.c index cac70d5df7..9d3173bc6d 100644 --- a/src/backend/parser/scansup.c +++ b/src/backend/parser/scansup.c @@ -218,3 +218,41 @@ scanner_isspace(char ch) return true; return false; } + +/* convert hex digit (caller should have verified that) to value */ +unsigned int +hexval(unsigned char c) +{ + if (c >= '0' && c <= '9') + return c - '0'; + if (c >= 'a' && c <= 'f') + return c - 'a' + 0xA; + if (c >= 'A' && c <= 'F') + return c - 'A' + 0xA; + elog(ERROR, "invalid hexadecimal digit"); + return 0; /* not reached */ +} + +/* is Unicode code point acceptable? */ +void +check_unicode_value(pg_wchar c) +{ + if (!is_valid_unicode_codepoint(c)) + ereport(ERROR, +(errcode(ERRCODE_SYNTAX_ERROR), + errmsg("invalid Unicode escape value"))); +} + +/* is 'escape' acceptable as Unicode escape character (UESCAPE syntax) ? */ +bool +check_uescapechar(unsigned char escape) +{ + if (isxdigit(escape) + || escape == '+' + || escape == '\'' + || escape == '"' + || scanner_isspace(escape)) + return false; + else + return true; +} diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c index df10bfb906..5ca9817708 100644 --- a/src/backend/utils/adt/varlena.c +++ b/src/backend/utils/adt/varlena.c @@ -6139,3 +6139,256 @@ unicode_is_normalized(PG_FUNCTION_ARGS) PG_RETURN_BOOL(result); } + +/* + * First four chars should be hexnum digits + */ +static bool +isxdigit_four(const char *instr) +{ + return isxdigit((unsigned char) instr[0]) && + isxdigit((unsigned char) instr[1]) && + isxdigit((unsigned char) instr[2]) && + isxdigit((unsigned char) instr[3]); +} + +/* + * Translate string with hexadecimal digits to number + */ +static long int +hexval_four(const char *instr) +{ + return (hexval(instr[0]) << 12) + + (hexval(instr[1]) << 8) + + (hexval(instr[2]) << 4) + + hexval(instr[3]); +} + +/* + * Process Unicode escapes in "str" + * + * escape: the escape character to use + */ +static void +udeescape(StringInfo str, const char *instr, size_t len, char escape) +{ + pg_wchar pair_first = 0; + char cbuf[MAX_UNICODE_EQUIVALENT_STRING + 1]; + + while (len > 0) + { + if (instr[0]
Re: Improving connection scalability: GetSnapshotData()
On Fri, Jul 24, 2020 at 1:11 PM Andres Freund wrote: > On 2020-07-15 21:33:06 -0400, Alvaro Herrera wrote: > > On 2020-Jul-15, Andres Freund wrote: > > > It could make sense to split the conversion of > > > VariableCacheData->latestCompletedXid to FullTransactionId out from 0001 > > > into is own commit. Not sure... > > > > +1, the commit is large enough and that change can be had in advance. > > I've done that in the attached. + * pair with the memory barrier below. We do however accept xid to be <= + * to next_xid, instead of just <, as xid could be from the procarray, + * before we see the updated nextFullXid value. Tricky. Right, that makes sense. I like the range assertion. +static inline FullTransactionId +FullXidViaRelative(FullTransactionId rel, TransactionId xid) I'm struggling to find a better word for this than "relative". +return FullTransactionIdFromU64(U64FromFullTransactionId(rel) ++ (int32) (xid - rel_xid)); I like your branch-free code for this. > I wonder if somebody has an opinion on renaming latestCompletedXid to > latestCompletedFullXid. That's the pattern we already had (cf > nextFullXid), but it also leads to pretty long lines and quite a few > comment etc changes. > > I'm somewhat inclined to remove the "Full" out of the variable, and to > also do that for nextFullXid. I feel like including it in the variable > name is basically a poor copy of the (also not great) C type system. If > we hadn't made FullTransactionId a struct I'd see it differently (and > thus incompatible with TransactionId), but we have ... Yeah, I'm OK with dropping the "Full". I've found it rather clumsy too.
Re: Doc patch: mention indexes in pg_inherits docs
On Tue, Jul 28, 2020 at 12:21:29PM +0100, Dagfinn Ilmari Mannsåker wrote: > When partitioned index support was added in veresion 11, the pg_inherits > docs missed the memo and still only say it describes table inheritance. > The attached patch adds mentions of indexes too, and notes that they can > not participate in multiple inheritance. What you have here looks fine to me. We could be more picky regarding the types or relations that can be added, as it can actually be possible to have a partitioned table or index, two relkinds of their own, but what you are proposing looks fine enough here. > I don't know what the policy is on backpatching doc fixes, but > personally I think it should be. This is actually a bug fix, because we include in the docs some incorrect information, so it should be backpatched. If there are no objections, I'll take care of that. -- Michael signature.asc Description: PGP signature