Re: [HACKERS] Write Ahead Logging for Hash Indexes
On Thu, Mar 9, 2017 at 3:11 AM, Robert Haas wrote:
> On Tue, Mar 7, 2017 at 6:41 PM, Robert Haas wrote: Great, thanks. 0001 looks good to me now, so committed.
>>>
>>> Committed 0002.
>>
>> Here are some initial review thoughts on 0003 based on a first read-through.
>
> More thoughts on the main patch:
>
> +mode anyway). It would seem natural to complete the split in VACUUM, but since
> +splitting a bucket might require allocating a new page, it might fail if you
> +run out of disk space. That would be bad during VACUUM - the reason for
> +running VACUUM in the first place might be that you run out of disk space,
> +and now VACUUM won't finish because you're out of disk space. In contrast,
> +an insertion can require enlarging the physical file anyway.
>
> But VACUUM actually does try to complete the splits, so this is wrong, I think.

Vacuum just tries to clean the remaining tuples from the old bucket. Only an insert or split operation tries to complete the split.

> +if (!xlrec.is_primary_bucket_page)
> +XLogRegisterBuffer(0, bucket_buf, REGBUF_STANDARD);
>
> So, in hashbucketcleanup, the page is registered and therefore an FPI
> will be taken if this is the first reference to this page since the
> checkpoint, but hash_xlog_delete won't restore the FPI. I think that
> exposes us to a torn-page hazard.
> _hash_freeovflpage/hash_xlog_squeeze_page seems to have the same
> disease, as does _hash_squeezebucket/hash_xlog_move_page_contents.

Right. If we use XLogReadBufferForRedoExtended() instead of XLogReadBufferExtended()/LockBufferForCleanup in the replay routine, then it will restore the FPI when required and can serve our purpose of acquiring a cleanup lock on the primary bucket page. Yet another way could be to store some information like block number, relfilenode, and forknum (maybe we can get away with somewhat less info) of the primary bucket in the fixed part of each of these records, and then use that to read the page and take a cleanup lock.
Now, since in most cases this buffer needs to be registered only when there are multiple overflow pages, I think carrying that fixed information might hurt in some cases.

> +/*
> + * As bitmap page doesn't have standard page layout, so this will
> + * allow us to log the data.
> + */
> +XLogRegisterBuffer(2, mapbuf, REGBUF_STANDARD);
> +XLogRegisterBufData(2, (char *) &bitmap_page_bit, sizeof(uint32));
>
> If the page doesn't have a standard page layout, why are we passing
> REGBUF_STANDARD? But I think you've hacked things so that bitmap
> pages DO have a standard page layout, per the change to
> _hash_initbitmapbuffer, in which case the comment seems wrong.

Yes, the comment is obsolete and I have removed it. We need a standard page layout for the bitmap page so that full page writes won't exclude the bitmap page data.

> +/*
> + * Note that we still update the page even if it was restored from a full
> + * page image, because the bucket flag is not included in the image.
> + */
>
> Really? Why not? (Because it's part of the special space?)

Yes, because it's part of the special space. Do you want that to be mentioned in the comments?

> Doesn't hash_xlog_split_allocpage need to acquire cleanup locks to
> guard against concurrent scans?

I don't think so, because the regular code takes it on the old bucket to protect the split from concurrent inserts, and the cleanup lock on the new bucket is just for the sake of consistency.

> And also hash_xlog_split_complete?

In the regular code, we are not taking a cleanup lock for this operation.

> And hash_xlog_delete? If the regular code path is getting a cleanup
> lock to protect against concurrent scans, recovery better do it, too.

We are already taking a cleanup lock for this; not sure what exactly is missed here, can you be more specific?

> +/*
> + * There is no harm in releasing the lock on old bucket before new bucket
> + * during replay as no other bucket splits can be happening concurrently.
> + */
> +if (BufferIsValid(oldbuf))
> +UnlockReleaseBuffer(oldbuf);
>
> And I'm not sure this is safe either. The issue isn't that there
> might be another bucket split happening concurrently, but that there
> might be scans that get confused by seeing the bucket split flag set
> before the new bucket is created or the metapage updated.

During a scan we check for the bucket being populated, so this should be safe.

> Seems like it would be safer to keep all the locks for this one.

If you feel better that way, I can change it and add a comment saying something like "we could release the lock on the bucket being split early as well, but we do it here to be consistent with normal operation".

> It seems like a good test to do with this patch would be to set up a
> pgbench test on the master with a hash index replacing the usual btree
> index. Then, set
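The alternative Amit sketches above — carrying enough information in the fixed part of the delete/squeeze records to re-find the primary bucket page at replay time — could look roughly like the following. This is only an illustration of the trade-off being discussed; the field names are hypothetical, not the actual `xl_hash_*` record layouts:

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t BlockNumber;

/* Hypothetical fixed part of a hash-delete record carrying a locator for
 * the primary bucket page, so redo can take a cleanup lock on it even when
 * the registered (FPI-backed) page is an overflow page. */
typedef struct xl_hash_delete_sketch
{
    uint32_t    node_spc;               /* tablespace OID */
    uint32_t    node_db;                /* database OID */
    uint32_t    node_rel;               /* relfilenode */
    BlockNumber bucket_blkno;           /* primary bucket page */
    uint8_t     is_primary_bucket_page; /* deletion happened on the bucket page itself? */
} xl_hash_delete_sketch;

/* At replay, a separate cleanup lock on the bucket page is only needed when
 * the deletion happened on an overflow page -- which is Amit's point that the
 * extra fixed fields are dead weight in the common single-page case. */
static int needs_separate_bucket_lock(const xl_hash_delete_sketch *rec)
{
    return !rec->is_primary_bucket_page;
}
```

The cost Amit mentions is visible here: every record pays for the locator fields, even though they only matter when overflow pages are involved.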
Re: [HACKERS] [bug fix] dblink leaks unnamed connections
On Wed, Mar 8, 2017 at 3:52 PM, Tsunakawa, Takayuki wrote:
> Hello,
>
> dblink fails to close the unnamed connection as follows when a new unnamed connection is requested. The attached patch fixes this.

This issue was reported about ten years ago and added as a TODO item. http://archives.postgresql.org/pgsql-hackers/2007-10/msg00895.php

I agree that this is a bug, and am tempted to back-patch to all the supported versions. But it had not been fixed in the many years since the first report of the issue, so I'm not sure if it's OK to just treat this as a bug right now and back-patch, or whether we should fix this only in HEAD. Anyway, I'd like to hear more opinions about this.

Regards,

--
Fujii Masao

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Declarative partitioning optimization for large amount of partitions
Hi Andres,

Thanks a lot for the review!

> Why are we keeping that list / the "batch" allocation pattern? It
> doesn't actually seem useful to me after these changes. Given that we
> don't shrink hash-tables, we could just use the hash-table memory for
> this, no?

I don't think we can do that. According to comments:

```
* NOTE: once allocated, TabStatusArray structures are never moved or deleted [...]
* This allows relcache pgstat_info pointers to be treated as long-lived data,
* avoiding repeated searches in pgstat_initstats() when a relation is
* repeatedly opened during a transaction.
```

It is my understanding that dynahash can't give the same guarantees.

> 'faster ... lookup' doesn't strike me as a useful comment, because it's
> only useful in relation of the current code to the new code.

Agree, fixed to 'O(1) ... lookup'.

> It's not really freed...

Right, it's actually zeroed. Fixed.

> Arguably it'd be better to HASH_ENTER directly here, instead of doing
> two lookups.

Good point. Fixed.

> Think it'd be better to move the hash creation into its own function.

Agree, fixed.

On Mon, Mar 06, 2017 at 12:01:50PM -0800, Andres Freund wrote:
> Hi,
>
> This issue has bothered me in non-partitioned use-cases recently, so
> thanks for taking it up.
>
> On 2017-03-06 18:22:17 +0300, Aleksander Alekseev wrote:
> > diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
> > index 2fb9a8bf58..fa906e7930 100644
> > --- a/src/backend/postmaster/pgstat.c
> > +++ b/src/backend/postmaster/pgstat.c
> > @@ -160,6 +160,16 @@ typedef struct TabStatusArray
> >
> > static TabStatusArray *pgStatTabList = NULL;
>
> Why are we keeping that list / the "batch" allocation pattern? It
> doesn't actually seem useful to me after these changes. Given that we
> don't shrink hash-tables, we could just use the hash-table memory for
> this, no?
> I think a separate list that only contains things that are "active" in
> the current transaction makes sense, because the current logic requires
> iterating over a potentially very large array at transaction commit.
>
> > +/* pgStatTabHash entry */
> > +typedef struct TabStatHashEntry
> > +{
> > + Oid t_id;
> > + PgStat_TableStatus* tsa_entry;
> > +} TabStatHashEntry;
> > +
> > +/* Hash table for faster t_id -> tsa_entry lookup */
> > +static HTAB *pgStatTabHash = NULL;
> > +
>
> 'faster ... lookup' doesn't strike me as a useful comment, because it's
> only useful in relation of the current code to the new code.
>
> > /*
> > * Backends store per-function info that's waiting to be sent to the collector
> > * in this hash table (indexed by function OID).
> > @@ -815,7 +825,13 @@ pgstat_report_stat(bool force)
> > pgstat_send_tabstat(this_msg);
> > this_msg->m_nentries = 0;
> > }
> > +
> > + /*
> > + * Entry will be freed soon so we need to remove it from the lookup table.
> > + */
> > + hash_search(pgStatTabHash, &entry->t_id, HASH_REMOVE, NULL);
> > }
>
> It's not really freed...
>
> > static PgStat_TableStatus *
> > get_tabstat_entry(Oid rel_id, bool isshared)
> > {
> > + TabStatHashEntry* hash_entry;
> > PgStat_TableStatus *entry;
> > TabStatusArray *tsa;
> > - TabStatusArray *prev_tsa;
> > - int i;
> > +
> > + /* Try to find an entry */
> > + entry = find_tabstat_entry(rel_id);
> > + if (entry != NULL)
> > + return entry;
>
> Arguably it'd be better to HASH_ENTER directly here, instead of doing
> two lookups.
>
> > /*
> > - * Search the already-used tabstat slots for this relation.
> > + * Entry doesn't exist - creating one.
> > + * First make sure there is a free space in a last element of pgStatTabList.
> > */
> > - prev_tsa = NULL;
> > - for (tsa = pgStatTabList; tsa != NULL; prev_tsa = tsa, tsa = tsa->tsa_next)
> > + if (!pgStatTabList)
> > {
> > - for (i = 0; i < tsa->tsa_used; i++)
> > - {
> > - entry = &tsa->tsa_entries[i];
> > - if (entry->t_id == rel_id)
> > - return entry;
> > - }
> > + HASHCTL ctl;
> >
> > - if (tsa->tsa_used < TABSTAT_QUANTUM)
> > + Assert(!pgStatTabHash);
> > +
> > + memset(&ctl, 0, sizeof(ctl));
> > + ctl.keysize = sizeof(Oid);
> > + ctl.entrysize = sizeof(TabStatHashEntry);
> > + ctl.hcxt = TopMemoryContext;
> > +
> > + pgStatTabHash = hash_create("pgstat t_id to tsa_entry lookup table",
> > + TABSTAT_QUANTUM, &ctl, HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
>
> Think it'd be better to move the hash creation into its own function.
>
> - Andres

--
Best regards,
Aleksander Alekseev
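The design being reviewed above pairs never-moved arrays (TabStatusArray) with a hash table (pgStatTabHash) that maps a key to a pointer into those arrays, which is why dynahash's lack of pointer-stability guarantees matters. A minimal standalone sketch of the pattern — all names are illustrative, not the real pgstat.c ones, and a toy fixed-size open-addressing table stands in for dynahash:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Oid;

#define QUANTUM    4     /* stand-in for TABSTAT_QUANTUM */
#define HASH_SLOTS 64    /* toy fixed-size hash; dynahash grows instead */

typedef struct Entry { Oid id; int counters; } Entry;

/* Stable, append-only storage: entries are never moved or deleted, so
 * pointers handed out (like relcache pgstat_info pointers) stay valid. */
static Entry arena[QUANTUM * 8];
static int   arena_used;

/* The lookup structure stores only the key -> pointer mapping. */
static struct { Oid id; Entry *e; } slots[HASH_SLOTS];

static Entry *lookup(Oid id)
{
    for (unsigned h = id % HASH_SLOTS;; h = (h + 1) % HASH_SLOTS)
    {
        if (slots[h].e == NULL) return NULL;   /* empty slot: not present */
        if (slots[h].id == id)  return slots[h].e;
    }
}

/* Find-or-create, in the spirit of get_tabstat_entry(). */
static Entry *get_entry(Oid id)
{
    Entry *e = lookup(id);
    if (e) return e;
    e = &arena[arena_used++];    /* append: existing pointers never move */
    e->id = id;
    e->counters = 0;
    for (unsigned h = id % HASH_SLOTS;; h = (h + 1) % HASH_SLOTS)
        if (slots[h].e == NULL) { slots[h].id = id; slots[h].e = e; break; }
    return e;
}
```

The point of the split is visible here: the hash table can be rebuilt or resized freely because it holds only pointers, while the arena provides the long-lived addresses that callers cache.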
Re: [HACKERS] Gather Merge
On Thu, Mar 9, 2017 at 8:21 AM, Rushabh Lathia wrote:
> Thanks Robert for committing this.
>
> My colleague Neha Sharma found one regression with the patch. I was about to send this mail and noticed that you committed the patch.

Oops. Bad timing.

> postgres=# explain select aid from pgbench_accounts where aid % 25 = 0 group by aid;
> ERROR: ORDER/GROUP BY expression not found in targetlist

I think your fix for this looks right, although I would write it this way:

-gm_plan->plan.targetlist = subplan->targetlist;
+gm_plan->plan.targetlist = build_path_tlist(root, &best_path->path);

The second part of your fix looks wrong. I think you want this:

create_gather_merge_path(root,
                         grouped_rel,
                         subpath,
-                        NULL,
+                        partial_grouping_target,
                         root->group_pathkeys,
                         NULL,
                         &total_groups);

That will match the create_gather_path case.

This test case is still failing for me even with those fixes:

rhaas=# select aid+1 from pgbench_accounts group by aid+1;
ERROR: could not find pathkey item to sort

So evidently there is at least one more bug here.

--
Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] GUC for cleanup indexes threshold.
On Wed, Mar 8, 2017 at 1:43 AM, Peter Geoghegan wrote:
> On Sat, Mar 4, 2017 at 1:30 AM, Amit Kapila wrote:
>>> While I can't see this explained anywhere, I'm
>>> pretty sure that that's supposed to be impossible, which this patch
>>> changes.
>>
>> What makes you think that patch will allow pg_class.relfrozenxid to be
>> advanced past opaque->btpo.xact which was previously not possible?
>
> By not reliably recycling pages in a timely manner, we won't change
> anything about the dead page. It just sticks around. This is mostly
> fine, but we still need VACUUM to be able to reason about it (to
> determine if it is in fact recyclable), which is what I'm concerned
> about breaking here. It still needs to be *possible* to recycle any
> recyclable page at some point (e.g., when we find it convenient).
>
> pg_class.relfrozenxid is InvalidTransactionId for indexes because
> indexes generally don't store XIDs. This is the one exception that I'm
> aware of, presumably justified by the fact that it's only for
> recyclable pages anyway, and those are currently *guaranteed* to get
> recycled quickly. In particular, they're guaranteed to get recycled by
> the next VACUUM. They may be recycled in the course of anti-wraparound
> VACUUM, even if VACUUM has no garbage tuples to kill (even if we only
> do lazy_cleanup_index() instead of lazy_vacuum_index()). This is the
> case that this patch proposes to have us skip touching indexes for.

To prevent this, I think we need to not skip lazy_cleanup_index during an anti-wraparound vacuum (aggressive = true), even if the number of scanned pages is less than the threshold. This can ensure that pg_class.relfrozenxid is not advanced past opaque->btpo.xact with minimal pain. Even if the btree dead page is left behind, subsequent modifications create garbage on the heap, and then autovacuum recycles that page while index vacuuming (lazy_vacuum_index).
Regards,

--
Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
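The safety question in this thread is whether a deleted page's stored XID (opaque->btpo.xact) is still "in the future" relative to the XIDs VACUUM may advance past it. PostgreSQL compares transaction IDs on a circle of 2^32 values; a sketch of that modular "precedes" test, mirroring the logic of TransactionIdPrecedes (and ignoring the special reserved XIDs for simplicity):

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Does XID a logically precede XID b on the 2^32 circle?
 * The signed difference works across wraparound as long as the two XIDs
 * are less than 2^31 apart -- exactly the window that anti-wraparound
 * vacuum exists to maintain, which is why skipping it for index cleanup
 * is the dangerous case being debated above. */
static int xid_precedes(TransactionId a, TransactionId b)
{
    return (int32_t) (a - b) < 0;
}
```

Once an XID like btpo.xact is allowed to fall more than 2^31 behind, this comparison silently flips sign, which is the kind of breakage Peter is warning about.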
Re: [HACKERS] Parallel Bitmap scans a bit broken
On Thu, Mar 9, 2017 at 10:02 PM, Dilip Kumar wrote:
> I slightly modified your query to reproduce this issue.
>
> explain analyze select * from r1 where value<555;
>
> Patch is attached to fix the problem.

I forgot to mention the cause of the problem.

if (istate->schunkptr < istate->nchunks)
{
    PagetableEntry *chunk = &ptbase[idxchunks[istate->schunkptr]];
    PagetableEntry *page = &ptbase[idxpages[istate->spageptr]];
    BlockNumber chunk_blockno;

In the above if condition we have only checked istate->schunkptr < istate->nchunks, which means we have some chunk left, so we are safe to access idxchunks. But just after that we are accessing ptbase[idxpages[istate->spageptr]] without checking whether accessing idxpages is safe or not. tbm_iterate already handles this case; I broke it in tbm_shared_iterate.

--
Regards,
Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
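The bug described above, in a nutshell: when merging the lossy-chunk stream with the exact-page stream, each stream's bound must be checked before dereferencing it — having chunks left says nothing about whether pages are left. A toy model of the corrected control flow (plain ints stand in for the PagetableEntry lookups; the names are illustrative, not the tidbitmap.c ones):

```c
#include <assert.h>
#include <stddef.h>

/* Two sorted streams, as in the shared TBM iterator: lossy chunk block
 * numbers and exact page block numbers, merged in ascending order. */
typedef struct Iter
{
    const int *chunks;  int nchunks, schunkptr;
    const int *pages;   int npages,  spageptr;
} Iter;

/* Returns the next block number in ascending order, or -1 when done. */
static int next_block(Iter *it)
{
    if (it->schunkptr < it->nchunks)
    {
        int chunk_blockno = it->chunks[it->schunkptr];

        /* Only consult the page stream if it actually has entries left;
         * reading pages[spageptr] unconditionally here is the crash. */
        if (it->spageptr >= it->npages ||
            chunk_blockno < it->pages[it->spageptr])
        {
            it->schunkptr++;
            return chunk_blockno;
        }
    }
    if (it->spageptr < it->npages)
        return it->pages[it->spageptr++];
    return -1;
}
```

The BRIN repro hits exactly the degenerate case: every entry is a lossy chunk, npages is 0, and the page array is never allocated, so the unguarded dereference reads unallocated memory.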
Re: [HACKERS] Write Ahead Logging for Hash Indexes
On Thu, Mar 9, 2017 at 12:25 AM, Robert Haas wrote:
> On Wed, Mar 8, 2017 at 7:45 AM, Amit Kapila wrote:
>> Okay, I can try, but note that currently there is no test related to
>> "snapshot too old" for any other indexes.
>
> Wow, that's surprising. It seems the snapshot_too_old test only
> checks that this works for a table that has no indexes. Have you,
> anyway, tested it manually?

Yes, I have tested it manually. I think we need to ensure that the modified tuple falls on the same page as the old tuple to make the test work. The slight difficulty with an index is ensuring the modified tuple is inserted into the same page as the old tuple; this is especially true with hash indexes. Also, for the heap, I think it relies on HOT pruning, whereas for an index we need to perform a manual vacuum. Basically, if we want we can write a test for an index, but I am not sure it is worth the pain to write one for hash indexes when the test for btree is not there.

--
With Regards,
Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Parallel Bitmap scans a bit broken
On Thu, Mar 9, 2017 at 9:17 PM, David Rowley wrote:
> patch with [1]
>
> =# create table r1(value int);
> CREATE TABLE
> =# insert into r1 select (random()*1000)::int from generate_Series(1,100);
> INSERT 0 100
> =# create index on r1 using brin(value);
> CREATE INDEX
> =# set enable_seqscan=0;
> SET
> =# explain select * from r1 where value=555;

I am looking into the issue; I have already reproduced it. I will update on this soon. Thanks for reporting.

--
Regards,
Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Declarative partitioning optimization for large amount of partitions
Hi Amit,

> Sorry, I didn't mean to dissuade you from trying those
> micro-optimizations. If general inheritance cases can benefit from it
> (which, until we have a different method, will be used by partitioned
> tables as well), I think we should try it.

OK, I'll see what could be done here as well then.

On Tue, Mar 07, 2017 at 10:55:12AM +0900, Amit Langote wrote:
> Hi Aleksander,
>
> On 2017/03/07 0:22, Aleksander Alekseev wrote:
> > Hello.
> >
> > OK, here is a patch.
> >
> > Benchmark, before:
> >
> > ```
> > number of transactions actually processed: 1823
> > latency average = 1153.495 ms
> > latency stddev = 154.366 ms
> > tps = 6.061104 (including connections establishing)
> > tps = 6.061211 (excluding connections establishing)
> > ```
> >
> > Benchmark, after:
> >
> > ```
> > number of transactions actually processed: 2462
> > latency average = 853.862 ms
> > latency stddev = 112.270 ms
> > tps = 8.191861 (including connections establishing)
> > tps = 8.192028 (excluding connections establishing)
> > ```
> >
> > +35% TPS, just as expected. Feel free to run your own benchmarks on
> > different datasets and workloads. `perf top` shows that the first bottleneck
> > is completely eliminated.
>
> That seems like a good gain.
>
> > I did nothing about the second bottleneck
> > since, as Amit mentioned, partition-pruning should solve this anyway and
> > apparently any micro-optimizations aren't worth the effort.
>
> Sorry, I didn't mean to dissuade you from trying those
> micro-optimizations. If general inheritance cases can benefit from it
> (which, until we have a different method, will be used by partitioned
> tables as well), I think we should try it.
>
> Thanks,
> Amit

--
Best regards,
Aleksander Alekseev
Re: [HACKERS] use SQL standard error code for nextval
> On Mar 9, 2017, at 7:59 AM, Peter Eisentraut wrote:
>
> On 2/28/17 22:15, Peter Eisentraut wrote:
>> The SQL standard defines a separate error code for nextval exhausting
>> the sequence space. I haven't found any discussion of this in the
>> archives, so it seems this was just not considered or not yet in
>> existence when the error codes were introduced. Here is a patch to
>> correct it.
>
> committed

Perhaps you should add something to the release notes. Somebody could be testing for the old error code.

Mark Dilger
Re: [HACKERS] Parallel Bitmap scans a bit broken
On Thu, Mar 9, 2017 at 11:50 AM, Dilip Kumar wrote:
> On Thu, Mar 9, 2017 at 10:02 PM, Dilip Kumar wrote:
>> I slightly modified your query to reproduce this issue.
>>
>> explain analyze select * from r1 where value<555;
>>
>> Patch is attached to fix the problem.
>
> I forgot to mention the cause of the problem.
>
> if (istate->schunkptr < istate->nchunks)
> {
>     PagetableEntry *chunk = &ptbase[idxchunks[istate->schunkptr]];
>     PagetableEntry *page = &ptbase[idxpages[istate->spageptr]];
>     BlockNumber chunk_blockno;
>
> In the above if condition we have only checked istate->schunkptr <
> istate->nchunks, which means we have some chunk left, so we are safe to
> access idxchunks. But just after that we are accessing
> ptbase[idxpages[istate->spageptr]] without checking whether accessing
> idxpages is safe or not.
>
> tbm_iterate already handles this case; I broke it in tbm_shared_iterate.

I don't know if this is the only problem -- it would be good if David could retest -- but it's certainly *a* problem, so committed.

--
Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] partial indexes and bitmap scans
Stephen Frost writes:
> We have already figured out that the user's predicate results in a
> subset of the index's or we wouldn't be able to use that index though,
> right? Do we really need to spend cycles re-discovering that? Are
> there cases where we actually need the index's predicate to ever be
> included for correctness..?

In a bitmap indexscan, yes, because the entire point of the recheck condition is that we're going to scan a whole page and possibly see tuples that don't satisfy the index predicate at all. Another point that's mentioned in the comments in createplan.c is that if you're considering the result of a BitmapOr or BitmapAnd, there's not necessarily a single index involved, so it's much harder to reason about which part of the qual is an index predicate.

I do notice that createplan.c makes some effort to get rid of filter conditions that are provably true when the index conditions are. It doesn't look like it considers the reverse direction. Not sure if that's missing a bet.

regards, tom lane
[HACKERS] Parallel Bitmap scans a bit broken
I was just doing some testing on [1] when I noticed that there's a problem with parallel bitmap index scans.

Test case: patch with [1]

=# create table r1(value int);
CREATE TABLE
=# insert into r1 select (random()*1000)::int from generate_Series(1,100);
INSERT 0 100
=# create index on r1 using brin(value);
CREATE INDEX
=# set enable_seqscan=0;
SET
=# explain select * from r1 where value=555;
QUERY PLAN
-
Gather (cost=3623.52..11267.45 rows=5000 width=4)
  Workers Planned: 2
  -> Parallel Bitmap Heap Scan on r1 (cost=2623.52..9767.45 rows=2083 width=4)
       Recheck Cond: (value = 555)
       -> Bitmap Index Scan on r1_value_idx (cost=0.00..2622.27 rows=522036 width=0)
            Index Cond: (value = 555)
(6 rows)

=# explain analyze select * from r1 where value=555;
server closed the connection unexpectedly
This probably means the server terminated abnormally before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.

The crash occurs in tbm_shared_iterate() at:

PagetableEntry *page = &ptbase[idxpages[istate->spageptr]];

I see in tbm_prepare_shared_iterate() that tbm->npages is zero. I'm unsure if bringetbitmap() does something different with npages than btgetbitmap() around setting npages? But anyway, due to npages being 0, tbm->ptpages is not allocated in tbm_prepare_shared_iterate():

if (tbm->npages)
{
    tbm->ptpages = dsa_allocate(tbm->dsa, sizeof(PTIterationArray) + tbm->npages * sizeof(int));

so when tbm_shared_iterate runs this code;

/*
 * If both chunk and per-page data remain, must output the numerically
 * earlier page.
 */
if (istate->schunkptr < istate->nchunks)
{
    PagetableEntry *chunk = &ptbase[idxchunks[istate->schunkptr]];
    PagetableEntry *page = &ptbase[idxpages[istate->spageptr]];
    BlockNumber chunk_blockno;

    chunk_blockno = chunk->blockno + istate->schunkbit;
    if (istate->spageptr >= istate->npages || chunk_blockno < page->blockno)
    {
        /* Return a lossy page indicator from the chunk */
        output->blockno = chunk_blockno;
        output->ntuples = -1;
        output->recheck = true;
        istate->schunkbit++;
        LWLockRelease(&istate->lock);
        return output;
    }
}

it fails, due to idxpages pointing to random memory.

Probably this is a simple fix for the authors, so passing it along. I'm a bit unable to see how the part above is meant to work.

[1] https://www.postgresql.org/message-id/attachment/50164/brin-correlation-v3.patch

--
David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Gather Merge
On Thu, Mar 9, 2017 at 9:42 PM, Robert Haas wrote:
> On Thu, Mar 9, 2017 at 8:21 AM, Rushabh Lathia wrote:
>> Thanks Robert for committing this.
>>
>> My colleague Neha Sharma found one regression with the patch. I was about to send this mail and noticed that you committed the patch.
>
> Oops. Bad timing.
>
>> postgres=# explain select aid from pgbench_accounts where aid % 25 = 0 group by aid;
>> ERROR: ORDER/GROUP BY expression not found in targetlist
>
> I think your fix for this looks right, although I would write it this way:
>
> -gm_plan->plan.targetlist = subplan->targetlist;
> +gm_plan->plan.targetlist = build_path_tlist(root, &best_path->path);
>
> The second part of your fix looks wrong. I think you want this:
>
> create_gather_merge_path(root,
>                          grouped_rel,
>                          subpath,
> -                        NULL,
> +                        partial_grouping_target,
>                          root->group_pathkeys,
>                          NULL,
>                          &total_groups);
>
> That will match the create_gather_path case.

Right, I made that change and performed the test with the fix, and I don't see any regression now.

> This test case is still failing for me even with those fixes:
>
> rhaas=# select aid+1 from pgbench_accounts group by aid+1;
> ERROR: could not find pathkey item to sort

I don't see this failure with the patch. Even when I forced the gather merge in the above query, it just worked fine.

Attaching patch, with the discussed changes.

Thanks,
Rushabh Lathia
Re: [HACKERS] Parallel Bitmap scans a bit broken
On Thu, Mar 9, 2017 at 9:37 PM, Dilip Kumar wrote:
>> =# create table r1(value int);
>> CREATE TABLE
>> =# insert into r1 select (random()*1000)::int from generate_Series(1,100);
>> INSERT 0 100
>> =# create index on r1 using brin(value);
>> CREATE INDEX
>> =# set enable_seqscan=0;
>> SET
>> =# explain select * from r1 where value=555;
>
> I am looking into the issue, I have already reproduced it. I will
> update on this soon.
>
> Thanks for reporting.

I slightly modified your query to reproduce this issue.

explain analyze select * from r1 where value<555;

Patch is attached to fix the problem.

--
Regards,
Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] [bug fix] dblink leaks unnamed connections
On 03/09/2017 07:54 AM, Tom Lane wrote:
> Fujii Masao writes:
>> On Wed, Mar 8, 2017 at 3:52 PM, Tsunakawa, Takayuki wrote:
>>> dblink fails to close the unnamed connection as follows when a new unnamed connection is requested. The attached patch fixes this.
>
>> This issue was reported about ten years ago and added as a TODO item.
>> http://archives.postgresql.org/pgsql-hackers/2007-10/msg00895.php
>
>> I agree that this is a bug, and am tempted to back-patch to all the supported
>> versions. But it had not been fixed in many years since the first report of
>> the issue. So I'm not sure if it's ok to just treat this as a bug right now and
>> back-patch. Or we should fix this only in HEAD? Anyway I'd like to hear
>> more opinions about this.
>
> It looks to me like the issue simply fell through the cracks because Joe
> wasn't excited about fixing it. Now that we have a second complaint,
> I think it's worth treating as a bug and back-patching.
>
> (I've not read this particular patch and am not expressing an opinion
> whether it's correct.)

Ok, will take another look.

Joe

--
Crunchy Data - http://crunchydata.com PostgreSQL Support for Secure Enterprises Consulting, Training, & Open Source Development
Re: [HACKERS] Performance issue after upgrading from 9.4 to 9.6
On Thu, Mar 9, 2017 at 7:47 AM, Naytro Naytro wrote:
> We are having some performance issues after we upgraded to the newest
> version of PostgreSQL; before that everything was fast and smooth.
>
> The upgrade was done by pg_upgrade from 9.4 directly to 9.6.1. We have now
> upgraded to 9.6.2 with no improvement.
>
> Some information about our setup: FreeBSD, Solaris (SmartOS), simple
> master-slave using streaming replication.
>
> Problem:
> Very high system CPU when the master is streaming replication data; CPU
> goes up to 77%. Only one process is generating this load, the
> postgresql startup process. When I attached truss to this process I
> saw a lot of read calls with almost the same number of errors (EAGAIN).
>
> read(6,0x7fffa0c7,1) ERR#35 'Resource temporarily unavailable'
>
> Descriptor 6 is a pipe.
>
> The read call tries to read one byte over and over. I looked at the source
> code and I think this file is responsible for this behavior:
> src/backend/storage/ipc/latch.c. There was no such file in 9.4.

Our latch implementation did get overhauled pretty thoroughly in 9.6; see primarily commit 98a64d0bd713cb89e61bef6432befc4b7b5da59e. But I can't guess what is going wrong here based on this information. It might help if you can pull some stack backtraces from the startup process.

--
Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
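What truss is showing is the classic self-pipe pattern: a non-blocking pipe is drained with one-byte reads until the kernel reports EAGAIN ("Resource temporarily unavailable", ERR#35 on FreeBSD), so the "error" itself is expected — the question in the thread is only why it happens so often. A minimal standalone demonstration of that read pattern (not the actual latch.c code):

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Drain a non-blocking pipe one byte at a time, the way a self-pipe
 * wakeup mechanism does.  Returns the number of bytes drained, or -1
 * on a real error or EOF. */
static int drain_pipe(int fd)
{
    char c;
    int  drained = 0;

    for (;;)
    {
        ssize_t n = read(fd, &c, 1);
        if (n == 1)
        {
            drained++;
            continue;           /* keep reading until the pipe is empty */
        }
        if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
            return drained;     /* pipe empty: this "failure" is the normal exit */
        if (n < 0 && errno == EINTR)
            continue;           /* interrupted by a signal: retry */
        return -1;              /* real error, or EOF (writer closed) */
    }
}
```

Every drain of an empty pipe costs at least one syscall that fails with EAGAIN, so a wakeup-heavy workload (like a busy WAL-replay loop) will show a large count of these in truss even when nothing is wrong.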
Re: [HACKERS] use SQL standard error code for nextval
On 2/28/17 22:15, Peter Eisentraut wrote:
> The SQL standard defines a separate error code for nextval exhausting
> the sequence space. I haven't found any discussion of this in the
> archives, so it seems this was just not considered or not yet in
> existence when the error codes were introduced. Here is a patch to
> correct it.

committed

--
Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
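For context on what changed: the SQL-standard SQLSTATE for a sequence running out of values is class 22, "2200H" (sequence generator limit exceeded). PostgreSQL packs a five-character SQLSTATE into a single int, six bits per character; a sketch of that packing scheme, mirroring the spirit of the MAKE_SQLSTATE/PGUNSIXBIT macros (names here are illustrative):

```c
#include <assert.h>
#include <string.h>

/* Pack one SQLSTATE character into six bits, offset from '0'.  This covers
 * '0'-'9' and 'A'-'Z', the characters SQLSTATEs actually use. */
#define SIXBIT(ch)  (((ch) - '0') & 0x3F)

/* Pack a five-character SQLSTATE into an int, low character first. */
#define MAKE_STATE(c1, c2, c3, c4, c5) \
    (SIXBIT(c1) + (SIXBIT(c2) << 6) + (SIXBIT(c3) << 12) + \
     (SIXBIT(c4) << 18) + (SIXBIT(c5) << 24))

/* Unpack the int back into the five-character SQLSTATE string. */
static void unpack_state(int code, char out[6])
{
    for (int i = 0; i < 5; i++)
        out[i] = (char) (((code >> (6 * i)) & 0x3F) + '0');
    out[5] = '\0';
}
```

This is why Mark's concern about clients testing for the old code is real: the packed integer (and hence the SQLSTATE string libpq reports) changed with the commit, and anything matching on the previous class-22 code would need updating.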
Re: [HACKERS] Gather Merge
On Thu, Mar 9, 2017 at 11:25 AM, Rushabh Lathia wrote:
> I don't see this failure with the patch. Even I forced the gather merge
> in the above query and that just working fine.
>
> Attaching patch, with the discussed changes.

Committed.

--
Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] New CORRESPONDING clause design
2017-03-09 13:18 GMT+01:00 Surafel Temesgen:
> Hi,
>
> Here is a patch corrected per your feedback, except for the missed test cases, because CORRESPONDING BY is implemented on top of set operations, and you can't do that with a set operation without a CORRESPONDING BY clause either.

I don't understand. The following statement should work:

postgres=# select 10 as a, 20 as b union corresponding select 20 as a, 30 as b, 40 as c;
ERROR: each UNION query must have the same number of columns
LINE 1: ...elect 10 as a, 20 as b union corresponding select 20 as a, 3...

A CORRESPONDING clause should work like a projection filter.

Regards

Pavel

> Eg
>
> postgres=# SELECT 1 a, 2 b, 3 c UNION SELECT 4 a, 5 b, 6 c, 8 d;
> ERROR: each UNION query must have the same number of columns
> LINE 1: SELECT 1 a, 2 b, 3 c UNION SELECT 4 a, 5 b, 6 c, 8 d;
>
> postgres=# create table t1(a int, b int, c int);
> CREATE TABLE
> postgres=# create table t2(a int, b int);
> CREATE TABLE
>
> postgres=# select * from t1 union select * from t2;
> ERROR: each UNION query must have the same number of columns
> LINE 1: select * from t1 union select * from t2;
>
> Thanks
> Surafel
>
> On Tue, Mar 7, 2017 at 10:26 PM, Pavel Stehule wrote:
>> Hi
>>
>> I am sending a review of this interesting feature.
>>
>> I found the following issues and questions:
>>
>> 1. unclosed tags in documentation
>> 2. bad name "changeTargetEntry" - should be makeTargetEntry?
>> 3. Why did you remove a lot of asserts in prepunion.c? These asserts should still be valid
>> 4. make_coresponding_target has wrong formatting
>> 5. the error "%s queries with a CORRESPONDING clause must have at least one column with the same name" has wrong formatting; you can show the position
>> 6. the previous issue is repeated - look at the formatting of the ereport function, please; you can use the DETAIL and HINT fields
>> 7.
corresponding clause should to contain column list (I am looking to >> ANSI/SQL 99) - you are using expr_list, what has not sense and probably it >> has impact on all implementation. >> 8. typo orderCorrespondingLsit(List *targetlist) >> 9. I miss more tests for CORRESPONDING BY >> 10. if I understand to this feature, this query should to work >> >> postgres=# SELECT 1 a, 2 b, 3 c UNION CORRESPONDING BY (c,b) SELECT 4 a, 5 >> b, 6 c, 8 d; >> ERROR: each UNION query must have the same number of columns >> LINE 1: ...1 a, 2 b, 3 c UNION CORRESPONDING BY (c,b) SELECT 4 a, 5 b, ... >> >> postgres=# create table t1(a int, b int, c int); >> CREATE TABLE >> Time: 63,260 ms >> postgres=# create table t2(a int, b int); >> CREATE TABLE >> Time: 57,120 ms >> postgres=# select * from t1 union corresponding select * from t2; >> ERROR: each UNION query must have the same number of columns >> LINE 1: select * from t1 union corresponding select * from t2; >> >> If it is your first patch to Postgres, then it is perfect work! >> >> The @7 is probably most significant - I dislike a expression list there. >> name_list should be better there. >> >> Regards >> >> Pavel >> >> >
Re: [HACKERS] partial indexes and bitmap scans
Stephen Frost writes:
> Perhaps I'm missing something obvious, but isn't it a bit redundant to
> have both a Recheck condition (which is the predicate of the index) and
> a Filter condition (which is the user's predicate) when we've already
> decided that the user's predicate must result in a subset of the
> index's, as, otherwise, we wouldn't be able to use the index in the
> first place?

Yeah, I think this is just something that the planner doesn't see fit to
expend cycles on detecting.

regards, tom lane
Re: [HACKERS] partial indexes and bitmap scans
Tom,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Stephen Frost writes:
> > Perhaps I'm missing something obvious, but isn't it a bit redundant to
> > have both a Recheck condition (which is the predicate of the index) and
> > a Filter condition (which is the user's predicate) when we've already
> > decided that the user's predicate must result in a subset of the
> > index's, as, otherwise, we wouldn't be able to use the index in the
> > first place?
>
> Yeah, I think this is just something that the planner doesn't see fit
> to expend cycles on detecting.

We have already figured out that the user's predicate results in a subset
of the index's, or we wouldn't be able to use that index, right? Do we
really need to spend cycles re-discovering that? Are there cases where we
actually need the index's predicate to ever be included for correctness?
This seems like we're going out of our way to add an additional check for
something that we've already determined must always be true, and that
strikes me as odd.

Thanks!

Stephen
Re: [HACKERS] [bug fix] dblink leaks unnamed connections
Fujii Masao writes:
> On Wed, Mar 8, 2017 at 3:52 PM, Tsunakawa, Takayuki wrote:
>> dblink fails to close the unnamed connection as follows when a new
>> unnamed connection is requested. The attached patch fixes this.

> This issue was reported about ten years ago and added as a TODO item.
> http://archives.postgresql.org/pgsql-hackers/2007-10/msg00895.php
> I agree that this is a bug, and am tempted to back-patch to all the
> supported versions. But it had not been fixed in the many years since the
> first report of the issue, so I'm not sure if it's OK to just treat this
> as a bug right now and back-patch, or whether we should fix this only in
> HEAD. Anyway, I'd like to hear more opinions about this.

It looks to me like the issue simply fell through the cracks because Joe
wasn't excited about fixing it. Now that we have a second complaint, I
think it's worth treating as a bug and back-patching. (I've not read this
particular patch and am not expressing an opinion on whether it's correct.)

regards, tom lane
[HACKERS] partial indexes and bitmap scans
Greetings,

Consider this:

create table t10 (c1 int, c2 int);
create index on t10 (c1) where c2 > 5;

\d t10
      Table "sfrost.t10"
 Column |  Type   | Modifiers
--------+---------+-----------
 c1     | integer |
 c2     | integer |
Indexes:
    "t10_c1_idx" btree (c1) WHERE c2 > 5

insert into t10 select * from generate_series(1,1) a, generate_series(1,10) b;
(repeat a bunch of times, if desired)

vacuum analyze t10;
set work_mem = '64kB';
set enable_indexscan = false;
set enable_seqscan = false;

=*> explain analyze select * from t10 where c2 > 6;
                                                        QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on t10  (cost=6496.49..15037.50 rows=318653 width=8) (actual time=34.682..116.236 rows=32 loops=1)
   Recheck Cond: (c2 > 5)
   Rows Removed by Index Recheck: 327502
   Filter: (c2 > 6)
   Rows Removed by Filter: 8
   Heap Blocks: exact=642 lossy=2898
   ->  Bitmap Index Scan on t10_c1_idx  (cost=0.00..6416.83 rows=400081 width=0) (actual time=34.601..34.601 rows=40 loops=1)
 Planning time: 0.087 ms
 Execution time: 124.229 ms
(9 rows)

Perhaps I'm missing something obvious, but isn't it a bit redundant to
have both a Recheck condition (which is the predicate of the index) and a
Filter condition (which is the user's predicate) when we've already
decided that the user's predicate must result in a subset of the index's,
as, otherwise, we wouldn't be able to use the index in the first place?

In other words, it seems like we shouldn't need a Filter in the above
Bitmap Heap Scan; instead we should just make the Recheck be (c2 > 6).

I've not looked into the code side of this at all and there may be reasons
why this is hard to do, but it seems like a worthwhile improvement to
consider doing, though perhaps I'm missing some reason why we need both
the Recheck and the Filter in such cases for correctness.

Thanks!

Stephen
Re: [HACKERS] REINDEX CONCURRENTLY 2.0
On 3/8/17 9:34 AM, Andreas Karlsson wrote:
>> Also, if by any chance you think (or use any software that thinks) that
>> OIDs for system objects are a stable identifier, this will be the first
>> case where that ceases to be true. If the system is shut down or crashes
>> or the session is killed, you'll be left with stray objects with names
>> that you've never typed into the system. I'm sure you're going to say
>> "don't worry, none of that is any big deal" and maybe you're right.
>
> Hm, I cannot think of any real life scenario where this will be an issue
> based on my personal experience with PostgreSQL, but if you can think of
> one please provide it. I will try to ponder some more on this myself.

The case I currently have is to allow tracking database objects similar
to (but not the same as) how we track the objects that belong to an
extension[1]. That currently depends on event triggers to keep names
updated if they're changed, as well as making use of the reg* types. If
an event trigger fired as part of the index rename (essentially treating
it like an ALTER INDEX) then I should be able to work around that.

The ultimate reason for doing this is to provide something similar to
extensions (create a bunch of database objects that are all bound
together), but also similar to classes in OO languages (so you can have
multiple instances).[2]

Admittedly, this is pretty far off the beaten path and I certainly
wouldn't hold up the patch because of it. I am hoping that it'd be fairly
easy to fire an event trigger as if someone had just renamed the index.

1: https://github.com/decibel/object_reference
2: https://github.com/decibel/pg_classy

--
Jim Nasby, Chief Data Architect, OpenSCG http://OpenSCG.com
Re: [HACKERS] [PATCH] Use $ parameters as replacement characters for pg_stat_statements
On Thu, Mar 9, 2017 at 8:17 PM, Bruce Momjian wrote:
>> In my opinion, we expose query id (and dbid, and userid) as the
>> canonical identifier for each pg_stat_statements entry, and have done
>> so for some time. That's the stable API -- not query text. I'm aware
>> of cases where query text was used as an identifier, but that ended up
>> being hashed anyway.
>
> Speaking of hash values for queries, someone once asked me if we could
> display a hash value for queries displayed in pg_stat_activity and
> pg_stat_statements so they could take a running query and look in
> pg_stat_statements to see how long it usually ran. It seemed like a
> useful idea to me.

I agree.

> I don't think they can hash the query manually because of the constants
> involved.

It would be a matter of having postgres expose Query.queryId (or the
similar field in PlannedStmt, I suppose). Morally, that field belongs to
pg_stat_statements, since it was written to make the query normalization
stuff work, and because everything would break if another extension
attempted to use it as pg_stat_statements does.

Whether or not that makes it okay to expose the hash value in
pg_stat_activity like that is above my pay grade, as Tom would say.

--
Peter Geoghegan
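For reference, the lookup being discussed would amount to a join between the two views on the query hash. A sketch of the usage, assuming pg_stat_activity grows a query_id column exposing Query.queryId (PostgreSQL 14 eventually added exactly this, enabled by the compute_query_id setting; column names here follow newer pg_stat_statements versions, where older releases use total_time instead of total_exec_time):

```sql
-- Match currently running queries to their accumulated statistics.
-- queryid alone is not unique, so join on dbid and userid as well.
SELECT a.pid,
       a.state,
       s.calls,
       s.total_exec_time / NULLIF(s.calls, 0) AS avg_ms
FROM pg_stat_activity a
JOIN pg_stat_statements s
  ON s.queryid = a.query_id
 AND s.dbid    = a.datid
 AND s.userid  = a.usesysid
WHERE a.state = 'active';
```

This works without hashing the query text manually, which, as noted above, is not possible from the outside because of the constant-stripping normalization.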
Re: [HACKERS] Foreign Join pushdowns not working properly for outer joins
>>
>> The new name merge_fdw_options() is shorter than the one I chose, but
>> we are not exactly merging options for an upper relation since there
>> isn't the other fpinfo to merge from. But I think we can live with
>> merge_fdw_options().
>
> Perhaps "combine" is a better word? I didn't really see a problem with
> that. After I posted this I wished I'd made the function even more
> generic, to accept either side as NULL and make up the new fpinfo from
> the non-NULL one, or merge if both were non-NULL. I liked that way much
> better than giving the function too much knowledge of what its purpose
> actually is. It's more likely to get used for something else in the
> future, which means there's less chance that someone else will make the
> same mistake.

It's more like copy for an upper rel and merge for a join rel. Your point
about making it side-agnostic is good, but I doubt we want to spend more
effort there. If somebody writes code with the input plan stuffed in the
innerrel instead of the outerrel, they will notice it immediately when
testing, as an assertion would fail or there would be a straight segfault.
We will decide what to do then.

>> Once I fixed that, the testcases started showing an assertion failure,
>> since fpinfo of a base relation can not have an outerrel. Fixed the
>> assertion as well. If we are passing fpinfo's of joining relations,
>> there's no need to have outerrel and innerrel in fpinfo of join.
>
> Looks like you forgot to update the comment on the Assert()

Yes, and I also forgot to update the function prologue to refer to
fpinfo_o/i instead of the inner and outer relations. The attached patch
corrects both.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

foreign_outerjoin_pushdown_fix_v4.patch
Re: [HACKERS] Parallel Append implementation
On Thu, Mar 9, 2017 at 6:28 PM, Robert Haas wrote:
> On Thu, Mar 9, 2017 at 7:42 AM, Ashutosh Bapat wrote:
>>>
>>> +if (rel->partial_pathlist != NIL &&
>>> +(Path *) linitial(rel->partial_pathlist) == subpath)
>>> +partial_subplans_set = bms_add_member(partial_subplans_set, i);
>>>
>>> This seems like a scary way to figure this out. What if we wanted to
>>> build a parallel append subpath with some path other than the
>>> cheapest, for some reason? I think you ought to record the decision
>>> that set_append_rel_pathlist makes about whether to use a partial path
>>> or a parallel-safe path, and then just copy it over here.
>>
>> I agree that assuming that a subpath is a non-partial path if it's not
>> the cheapest of the partial paths is risky. In fact, we can not assume
>> that even when it's not one of the partial_paths, since it could have
>> been kicked out or was never added to the partial path list, like a
>> reparameterized path. But if we have to save the information about
>> which of the subpaths are partial paths and which are not in
>> AppendPath, it would take some memory, noticeable for thousands of
>> partitions, which will leak if the path doesn't make it into
>> rel->pathlist.
>
> True, but that's no different from the situation for any other Path
> node that has substructure. For example, an IndexPath has no fewer
> than 5 list pointers in it. Generally we assume that the number of
> paths won't be large enough for the memory used to really matter, and
> I think that will also be true here. And an AppendPath has a list of
> subpaths, and if I'm not mistaken, those list nodes consume more
> memory than the tracking information we're thinking about here will.

What I have observed is that we try to keep memory usage to a minimum,
avoiding memory consumption as much as possible. Most of that
substructure gets absorbed by the planner or is shared across paths.
Append path lists are an exception to that, but we need something to hold
all the subpaths together, and a List is PostgreSQL's way of doing that.
So that's kind of unavoidable. And maybe we will find some reason for
almost every substructure in paths.

> I think you're thinking about this issue because you've been working
> on partitionwise join where memory consumption is a big issue, but
> there are a lot of cases where that isn't really a big deal.

:).

>> The purpose of that information is to make sure that we
>> allocate only one worker to that plan. I suggested that we use
>> path->parallel_workers for the same, but it seems that's not
>> guaranteed to be reliable. The reasons were discussed upthread. Is
>> there any way to infer whether we can allocate more than one worker
>> to a plan by looking at the corresponding path?
>
> I think it would be smarter to track it some other way. Either keep
> two lists of paths, one of which is the partial paths and the other of
> which is the parallel-safe paths, or keep a bitmapset indicating which
> paths fall into which category.

I like two lists: it consumes almost no memory (two list headers instead
of one) compared to non-parallel-append when there are non-partial paths,
and what's more, it consumes no extra memory when all paths are partial.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
Re: [HACKERS] Upgrading postmaster's log messages about bind/listen errors
Robert Haas writes:
> On Thu, Mar 9, 2017 at 4:01 PM, Joe Conway wrote:
>> On 03/09/2017 12:27 PM, Tom Lane wrote:
>>> For good measure I also added a DEBUG1 log message reporting successful
>>> binding to a port. I'm not sure if there's an argument for putting this
>>> out at LOG level (i.e. by default) --- any thoughts about that?

>> +1 for making it LOG instead of DEBUG1

> I would tend to vote against that, because startup is getting
> gradually chattier and chattier, and I think this isn't likely to be
> of interest to very many people most of the time.

Yeah, my thought was that if we've gotten along without this for 20 years,
it's probably not of interest to most people most of the time. However,
if we're measuring this on a scale of usefulness to the average DBA, I
would argue that it's of more interest than any of these messages that
currently appear by default:

2017-03-09 23:40:12.334 EST [19335] LOG:  MultiXact member wraparound protections are now enabled
2017-03-09 23:40:12.335 EST [19339] LOG:  autovacuum launcher started
2017-03-09 23:40:12.336 EST [19341] LOG:  logical replication launcher started

The first of those is surely past its sell-by date. As for the other two,
we should log *failure* to start, but not the normal case.

regards, tom lane
Re: [HACKERS] Logical replication existing data copy
On 09/03/17 18:46, Peter Eisentraut wrote:
> On 3/6/17 05:27, Petr Jelinek wrote:
>> updated and rebased version of the patch attached.
>
> Some comments on this patch:
>
> Can we have a better explanation of the different snapshot options in
> CREATE_REPLICATION_SLOT? We use all these variants in different
> places, but it's not fully documented why. Think about interested
> users reading this.

I am not quite sure what or where you want me to expand the docs; the
protocol doc explains what the options do. Are you missing the reasoning
for why we use them in the calling code?

> errmsg("subscription table %u in subscription %u does not exist",
>
> Use names instead of IDs if possible.

Well, this message is more or less equal to a cache lookup failure; the
problem is that neither the relation nor the subscription is guaranteed
to exist if this fails. I can write code that emits two different
messages depending on whether they do, though I'm not quite sure it's
worth it there.

> + libpqrcv_table_list,
> + libpqrcv_table_info,
> + libpqrcv_table_copy,
>
> I don't think these functions belong in the WAL receiver API, since
> they are specific to this particular logical replication
> implementation. I would just make an API function libpqrcv_exec_sql()
> that runs a query, and then put the table_*() functions as wrappers
> around that into tablesync.c.

Yeah, I was wondering about that as well. The reason why it's in
libpqwalreceiver is that if we create some libpqrcv_exec_sql as you say,
we'll need code that converts the PQresult to something consumable by a
backend that has no access to the libpq API, and that seemed like quite
a bit of boilerplate work. I really don't want to write another
libpq-like or SPI-like interface for this. Maybe it would be acceptable
to return the result as a tuplestore?

> Not sure what the worker init stuff in ApplyLauncherShmemInit() is
> doing. Could be commented more.

Eh, I copy-pasted the comment there from a different place; will fix.
> max_sync_workers_per_subscription could be PGC_SIGHUP, I think. And
> the minimum could be 0, to stop any syncing?
>
> pg_subscription_rel.h: I'm not sure under which circumstances we need
> to use BKI_ROWTYPE_OID.
>
> Does pg_subscription_rel need an OID column? The index
> SubscriptionRelOidIndexId is not used anywhere.

Ah, a leftover from the time it used dependencies for tracking.

> You removed this command from the tests:
>
> ALTER SUBSCRIPTION testsub SET PUBLICATION testpub, testpub1;
>
> I suppose because it causes a connection. We should have an option to
> prevent that (NOCONNECT/NOREFRESH?).

NOREFRESH makes more sense, I guess.

> Why was the test 'check replication origin was dropped on subscriber'
> removed?

I don't know what you mean?

> Attached also a small patch with some stylistic changes.

Thanks.

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Logical replication existing data copy
On 09/03/17 18:48, Peter Eisentraut wrote:
> On 3/6/17 05:27, Petr Jelinek wrote:
>> And lastly I changed the automagic around exporting, not exporting, and
>> using the snapshot produced by CREATE_REPLICATION_SLOT into an explicit
>> parameter for the CREATE_REPLICATION_SLOT command (and added a simple
>> framework for adding more of those if needed in the future).
>
> It might be useful to make this into a separate patch, for clarity.

Yes, already working on that (at least part of it), since it's useful for
other patches in the CF.

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] allow referring to functions without arguments when unique
On 3/7/17 00:32, Michael Paquier wrote:
>> OK. After a lookup, I am just seeing opfamily, opclass missing, so
>> this patch is doing it as you describe.

I'm not sure what you mean here.

>> @@ -7198,6 +7198,33 @@ function_with_argtypes:
>> n->objargs = extractArgTypes($2);
>> $$ = n;
>> }
>> This may not have arguments listed, so is function_with_argtypes really
>> adapted? Well, we could do something like function_with_opt_argtypes?

>> + /*
>> +  * If no arguments were specified, the name must yield a unique candidate.
>> +  */
>> + if (nargs == -1 && clist)
>> + {
>> +     if (clist->next)
>> +         ereport(ERROR,
>> I would have used list_length here for clarity.

This is actually not a "List" node.

>> The comment at the top of LookupFuncName() needs a refresh. The caller
>> can as well just use a function name without arguments.

Fixed.

> =# set search_path to 'public,popo';

I think you mean

    set search_path to public,popo;

> =# drop function dup2;
> ERROR: 42883: function dup2() does not exist
> LOCATION: LookupFuncName, parse_func.c:1944
> In this case I would have expected an error telling that the name is
> ambiguous. FuncnameGetCandidates() returns an empty list.

Your example works correctly if you set the schema path correctly.
However, the error message is misleading with the parentheses. I have
updated that to create a different error message for this case, and added
a test case.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

From ac83b6f71433d2101f67dd148a4af2aac78129ee Mon Sep 17 00:00:00 2001
From: Peter Eisentraut
Date: Thu, 9 Mar 2017 23:58:48 -0500
Subject: [PATCH v2] Allow referring to functions without arguments when unique

In DDL commands referring to an existing function, allow omitting the
argument list if the function name is unique in its schema, per SQL
standard.

This uses the same logic that the regproc type uses for finding
functions by name only.
--- doc/src/sgml/ref/alter_extension.sgml | 2 +- doc/src/sgml/ref/alter_function.sgml| 13 + doc/src/sgml/ref/alter_opfamily.sgml| 7 +++-- doc/src/sgml/ref/comment.sgml | 2 +- doc/src/sgml/ref/create_cast.sgml | 6 ++-- doc/src/sgml/ref/create_transform.sgml | 12 +--- doc/src/sgml/ref/drop_function.sgml | 35 --- doc/src/sgml/ref/grant.sgml | 2 +- doc/src/sgml/ref/revoke.sgml| 2 +- doc/src/sgml/ref/security_label.sgml| 2 +- src/backend/nodes/copyfuncs.c | 1 + src/backend/nodes/equalfuncs.c | 1 + src/backend/parser/gram.y | 27 ++ src/backend/parser/parse_func.c | 37 +++-- src/include/nodes/parsenodes.h | 3 ++ src/test/regress/expected/create_function_3.out | 11 +++- src/test/regress/sql/create_function_3.sql | 8 ++ 17 files changed, 143 insertions(+), 28 deletions(-) diff --git a/doc/src/sgml/ref/alter_extension.sgml b/doc/src/sgml/ref/alter_extension.sgml index de6d6dca16..a7c0927d1c 100644 --- a/doc/src/sgml/ref/alter_extension.sgml +++ b/doc/src/sgml/ref/alter_extension.sgml @@ -39,7 +39,7 @@ EVENT TRIGGER object_name | FOREIGN DATA WRAPPER object_name | FOREIGN TABLE object_name | - FUNCTION function_name ( [ [ argmode ] [ argname ] argtype [, ...] ] ) | + FUNCTION function_name [ ( [ [ argmode ] [ argname ] argtype [, ...] ] ) ] | MATERIALIZED VIEW object_name | OPERATOR operator_name (left_type, right_type) | OPERATOR CLASS object_name USING index_method | diff --git a/doc/src/sgml/ref/alter_function.sgml b/doc/src/sgml/ref/alter_function.sgml index 0388d06b95..168eeb7c52 100644 --- a/doc/src/sgml/ref/alter_function.sgml +++ b/doc/src/sgml/ref/alter_function.sgml @@ -21,15 +21,15 @@ -ALTER FUNCTION name ( [ [ argmode ] [ argname ] argtype [, ...] ] ) +ALTER FUNCTION name [ ( [ [ argmode ] [ argname ] argtype [, ...] ] ) ] action [ ... ] [ RESTRICT ] -ALTER FUNCTION name ( [ [ argmode ] [ argname ] argtype [, ...] ] ) +ALTER FUNCTION name [ ( [ [ argmode ] [ argname ] argtype [, ...] 
] ) ] RENAME TO new_name -ALTER FUNCTION name ( [ [ argmode ] [ argname ] argtype [, ...] ] ) +ALTER FUNCTION name [ ( [ [ argmode ] [ argname ] argtype [, ...] ] ) ] OWNER TO { new_owner | CURRENT_USER | SESSION_USER } -ALTER FUNCTION name ( [ [ argmode ] [ argname ] argtype [, ...] ] ) +ALTER FUNCTION name [ ( [ [ argmode ] [ argname ] argtype [, ...] ] ) ] SET SCHEMA new_schema -ALTER FUNCTION name ( [ [ argmode ] [ argname ] argtype [, ...] ] ) +ALTER FUNCTION name [ ( [ [ argmode ] [ argname ] argtype [, ...] ] ) ] DEPENDS ON EXTENSION extension_name where action is one
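As a usage sketch of what the patch permits, assuming a function whose name is unique in its schema (the object names below are invented for the example, not taken from the patch's own tests):

```sql
CREATE FUNCTION incr(i int) RETURNS int LANGUAGE sql AS 'SELECT i + 1';

-- With a unique name, the argument list may now be omitted:
ALTER FUNCTION incr RENAME TO increment;
DROP FUNCTION increment;

-- If two functions shared the name, e.g. incr(int) and incr(bigint),
-- the bare name would be ambiguous and the argument list would still
-- be required, per the new error message discussed above.
```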
Re: [HACKERS] exposing wait events for non-backends (was: Tracking wait event for latches)
On Fri, Mar 10, 2017 at 3:11 AM, Andres Freund wrote:
> On 2017-03-09 16:37:29 -0500, Tom Lane wrote:
>> Robert Haas writes:
>>> On Thu, Mar 9, 2017 at 2:30 PM, Peter Eisentraut wrote:
>>>> In practice, I think it's common to do a quick select * from
>>>> pg_stat_activity to determine whether a database instance is in use.
>>
>>> I thought of the same kind of thing, and it was discussed upthread.
>>> There seemed to be more votes for keeping it all in one view, but that
>>> could change if more people vote.
>>
>> I've not been paying much attention to this thread, but it seems like
>> something that would help Peter's use-case and have other uses as well
>> is a new column that distinguishes different process types --- user
>> session, background worker, autovacuum worker, etc.
>
> The patches upthread add precisely such a column.

The patch exposes auxiliary processes, the autovacuum launcher, and
bgworkers along with other backends in pg_stat_activity. It also adds an
extra column, named proc_type (suggested by Craig and Robert), to indicate
the type of process in the pg_stat_activity view. proc_type includes:

* client backend
* autovacuum launcher
* wal sender
* bgworker
* writer
* checkpointer
* wal writer
* wal receiver

Here is the present output with the relevant columns.

postgres=# SELECT wait_event_type, wait_event, state, proc_type FROM pg_stat_activity;
 wait_event_type |     wait_event      | state  |      proc_type
-----------------+---------------------+--------+---------------------
 Activity        | AutoVacuumMain      | idle   | autovacuum launcher
 Activity        | LogicalLauncherMain | idle   | bgworker
 Activity        | WalSenderMain       | idle   | wal sender
                 |                     | active | client backend
 Activity        | BgWriterMain        | idle   | writer
 Activity        | CheckpointerMain    | idle   | checkpointer
 Activity        | WalWriterMain       | idle   | wal writer
(7 rows)

--
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] [PATCH] Use $ parameters as replacement characters for pg_stat_statements
On Sat, Mar 4, 2017 at 10:52:44AM -0800, Peter Geoghegan wrote:
> On Sat, Mar 4, 2017 at 8:02 AM, Tom Lane wrote:
>> Meh ... we've generally regretted it when we "solved" a backwards
>> compatibility problem by introducing a GUC that changes query semantics.
>> I'm inclined to think we should either do it or not.
>
> In my opinion, we expose query id (and dbid, and userid) as the
> canonical identifier for each pg_stat_statements entry, and have done
> so for some time. That's the stable API -- not query text. I'm aware
> of cases where query text was used as an identifier, but that ended up
> being hashed anyway.

Speaking of hash values for queries, someone once asked me if we could
display a hash value for queries displayed in pg_stat_activity and
pg_stat_statements so they could take a running query and look in
pg_stat_statements to see how long it usually ran. It seemed like a
useful idea to me.

I don't think they can hash the query manually because of the constants
involved.

--
Bruce Momjian  http://momjian.us
EnterpriseDB   http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +
[HACKERS] Logical replication origin tracking fix
Hi,

While discussing with Craig issues around restarting the logical
replication stream related to the patch he posted [1], I realized that we
track the wrong origin LSN in the logical replication apply. We currently
track commit_lsn, which is the *start* of the commit record; what we need
to track is end_lsn, which is the *end* of the commit record. Otherwise
we might request a transaction that was already replayed if the
subscription instance has crashed right after a commit.

Attached patch fixes that.

[1] https://www.postgresql.org/message-id/CAMsr+YGFvikx-U_mHQ0mAzTarqvCpwzvsPKv=7mfp9scdrm...@mail.gmail.com

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

0001-Fix-remote-position-tracking-in-logical-replication.patch
Re: [HACKERS] [PATCH] Enabling atomics on ARM64
Andres Freund writes:
> FWIW, unless somebody seriously protests, I'm going to commit this soon-ish.

Sure, if you want to take responsibility for it, push away.

regards, tom lane
Re: [HACKERS] [PATCH] Generic type subscripting
On 2/28/17 13:02, Dmitry Dolgov wrote:
> +
> +-- Extract value by key
> +SELECT ('{"a": 1}'::jsonb)['a'];
> +
> +-- Extract nested value by key path
> +SELECT ('{"a": {"b": {"c": 1}}}'::jsonb)['a']['b']['c'];
> +
> +-- Extract element by index
> +SELECT ('[1, "2", null]'::jsonb)['1'];
> +
> +-- Update value by key
> +UPDATE table_name set jsonb_field['key'] = 1;
> +
> +-- Select records using where clause with subscripting
> +SELECT * from table_name where jsonb_field['key'] = '"value"';
> +

I see a possible problem here: This design only allows one subscripting
function. But what you'd really want in this case is at least two: one
taking an integer type for selecting by array index, and one taking text
for selecting by field name. I suppose that since a given value can only
be either an array or an object, there is no ambiguity, but I think this
might also lose some error checking. It might also not work the same way
for other types.

It looks like your jsonb subscripting function just returns null if it
can't find a field, which is also a bit dubious.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
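To make the lost-error-checking concern concrete, consider this sketch; the exact behavior of the patch may differ, and these statements are illustrative only:

```sql
-- With a single text-typed subscripting function, both of these take the
-- same code path, and a bad array index can only be discovered at run
-- time by re-parsing the subscript string:
SELECT ('[10, 20, 30]'::jsonb)['1'];    -- array: '1' reinterpreted as index 1
SELECT ('[10, 20, 30]'::jsonb)['one'];  -- not a number: error, or silent NULL?

-- With separate int and text subscripting functions, the first statement
-- could be type-checked as an integer subscript at parse time, and the
-- second rejected outright.
```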
Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers
Robert Haas writes:
> On Thu, Mar 9, 2017 at 9:17 PM, Tom Lane wrote:
>> Buildfarm thinks eight wasn't enough.
>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=clam=2017-03-10%2002%3A00%3A01

> At first I was confused how you knew that this was the fault of this
> patch, but this seems like a pretty good indicator:
> TRAP: FailedAssertion("!(curval == 0 || (curval == 0x03 && status != 0x00) || curval == status)", File: "clog.c", Line: 574)

Yeah, that's what led me to blame the clog-group-update patch.

> I'm not sure whether it's related to this problem or not, but now that
> I look at it, this (preexisting) comment looks like entirely wishful
> thinking:
> * If we update more than one xid on this page while it is being written
> * out, we might find that some of the bits go to disk and others don't.
> * If we are updating commits on the page with the top-level xid that
> * could break atomicity, so we subcommit the subxids first before we mark
> * the top-level commit.

Maybe, but that comment dates to 2008 according to git, and clam has been,
er, happy as a clam up to now. My money is on a newly-introduced
memory-access-ordering bug.

Also, I see clam reported in green just now, so it's not 100%
reproducible :-(

regards, tom lane
Re: [HACKERS] Parallel Append implementation
On 10 March 2017 at 10:13, Ashutosh Bapat wrote: > On Thu, Mar 9, 2017 at 6:28 PM, Robert Haas wrote: >> On Thu, Mar 9, 2017 at 7:42 AM, Ashutosh Bapat >> wrote: +if (rel->partial_pathlist != NIL && +(Path *) linitial(rel->partial_pathlist) == subpath) +partial_subplans_set = bms_add_member(partial_subplans_set, i); This seems like a scary way to figure this out. What if we wanted to build a parallel append subpath with some path other than the cheapest, for some reason? Yes, there was an assumption that an append subpath will be either the cheapest non-partial path, or the cheapest (i.e. first in the list) partial path, although in the patch there are no Asserts to make sure that a common rule has been followed at both these places. I think you ought to record the decision that set_append_rel_pathlist makes about whether to use a partial path or a parallel-safe path, and then just copy it over here. >>> >>> I agree that assuming that a subpath is a non-partial path if it's not >>> the cheapest of the partial paths is risky. In fact, we cannot assume >>> that even when it's not one of the partial_paths, since it could have >>> been kicked out or was never added to the partial path list, like a >>> reparameterized path. But if we have to save the information about >>> which of the subpaths are partial paths and which are not in >>> AppendPath, it would take some memory, noticeable for thousands of >>> partitions, which will leak if the path doesn't make it into the >>> rel->pathlist. >> >> True, but that's no different from the situation for any other Path >> node that has substructure. For example, an IndexPath has no fewer >> than 5 list pointers in it. Generally we assume that the number of >> paths won't be large enough for the memory used to really matter, and >> I think that will also be true here. And an AppendPath has a list of >> subpaths, and if I'm not mistaken, those list nodes consume more >> memory than the tracking information we're thinking about here will. 
>> > > What I have observed is that we try to keep the memory usage to a > minimum, trying to avoid memory consumption as much as possible. Most > of that substructure gets absorbed by the planner or is shared across > paths. Append path lists are an exception to that, but we need > something to hold all subpaths together and a list is PostgreSQL's way > of doing it. So, that's kind of unavoidable. And maybe we will find > some reason for almost every substructure in paths. > >> I think you're thinking about this issue because you've been working >> on partitionwise join where memory consumption is a big issue, but >> there are a lot of cases where that isn't really a big deal. > > :). > >> >>> The purpose of that information is to make sure that we >>> allocate only one worker to that plan. I suggested that we use >>> path->parallel_workers for the same, but it seems that's not >>> guaranteed to be reliable. The reasons were discussed upthread. Is >>> there any way to infer whether we can allocate more than one worker >>> to a plan by looking at the corresponding path? >> >> I think it would be smarter to track it some other way. Either keep >> two lists of paths, one of which is the partial paths and the other of >> which is the parallel-safe paths, or keep a bitmapset indicating which >> paths fall into which category. > > I like two lists: it consumes almost no memory (two list headers > instead of one) compared to non-parallel-append when there are > non-partial paths and, what's more, it consumes no extra memory when all > paths are partial. I agree that the two-lists approach will consume less memory than a bitmapset. Keeping two lists will effectively add an extra pointer field to the AppendPath size, but this size will not grow with the number of subpaths, whereas the Bitmapset will grow. 
But as far as code is concerned, I think the two-list approach will turn out to be less simple if we derive corresponding two different arrays in AppendState node. Handling two different arrays during execution does not look clean. Whereas, the bitmapset that I have used in Append has turned out to be very simple. I just had to do the below check (and that is the only location) to see if it's a partial or non-partial subplan. There is nowhere else any special handling for non-partial subpath. /* * Increment worker count for the chosen node, if at all we found one. * For non-partial plans, set it to -1 instead, so that no other workers * run it. */ if (min_whichplan != PA_INVALID_PLAN) { if (bms_is_member(min_whichplan, ((Append*)state->ps.plan)->partial_subplans_set)) padesc->pa_info[min_whichplan].pa_num_workers++; else padesc->pa_info[min_whichplan].pa_num_workers = -1; } Now, since Bitmapset field is used during execution
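[Editor's note: the worker-assignment check quoted above can be sketched as a tiny standalone model. A plain uint64_t bitmask and a fixed-size array stand in for PostgreSQL's Bitmapset (bms_is_member) and the shared parallel-append descriptor; all names below are illustrative, not the patch's actual structures.]

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of the bitmapset approach: one bit per Append subplan
 * records whether that subplan is a partial (parallel-aware) plan. */
typedef struct ParallelAppendState
{
    uint64_t partial_subplans_set;  /* bit i set => subplan i is partial */
    int      pa_num_workers[64];    /* workers assigned to each subplan */
} ParallelAppendState;

/* Mirror of the quoted check: a partial plan can take more workers;
 * a non-partial plan is marked -1 so that no other worker runs it. */
static void
assign_worker(ParallelAppendState *state, int whichplan)
{
    if (state->partial_subplans_set & (UINT64_C(1) << whichplan))
        state->pa_num_workers[whichplan]++;
    else
        state->pa_num_workers[whichplan] = -1;
}
```

The point being argued is that this single membership test is the only place the partial/non-partial distinction matters during execution, which is why the bitmapset keeps the executor code simple.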
Re: [HACKERS] PATCH: Configurable file mode mask
On 2/28/17 20:58, David Steele wrote: > This patch introduces a new initdb param, -u/--file-mode-mask, and a new > GUC, file_mode_mask, to allow the default mode of files and directories > in the $PGDATA directory to be modified. The postmaster.pid file appears not to observe the configured mask. There ought to be a test, perhaps under src/bin/initdb/, to check for that kind of thing. There is no documentation update for initdb. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] increasing the default WAL segment size
On Mon, Feb 27, 2017 at 11:15 PM, Jim Nasby wrote: >> * You're considering any WAL file with a power of 2 as valid. Suppose >> the correct WAL seg size is 64mb. For some reason, the server >> generated a 16mb invalid WAL file (maybe it crashed while creating the >> WAL file). Your code seems to treat this as a valid file which I think >> is incorrect. Do you agree with that? > > Detecting correct WAL size based on the size of a random WAL file seems like > a really bad idea to me. It's not the most elegant thing ever, but I'm not sure I really see a big problem with it. Today, if the WAL file were the wrong size, we'd just error out. With the patch, if the WAL file were the wrong size, but happened to be a size that we consider legal, pg_waldump would treat it as a legal file and try to display the WAL records contained therein. This doesn't seem like a huge problem from here; what are you worried about? I agree that it would be bad if, for example, pg_resetwal saw a broken WAL file in pg_wal and consequently did the reset incorrectly, because the whole point of pg_resetwal is to escape situations where the contents of pg_wal may be bogus. However, pg_resetwal can read the value from the control file, so the technique of believing the file size doesn't need to be used in that case anyway. The only tools that need to infer the WAL size from the sizes of the segments actually present are those that neither have a running cluster (where SHOW can be used) nor access to the control file. There aren't many of those, and pg_waldump, at least, is a debugging tool anyway. IIUC, the other case where this comes up is for pg_standby, but if the WAL segments aren't all the same size that tool is presumably going to croak with or without these changes, so I'm not really sure there's much of an issue here. I might be missing something. 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
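[Editor's note: the heuristic being debated above boils down to a power-of-two check inside a plausible range. A minimal standalone sketch follows; the 1MB and 1GB bounds are assumptions for the example, not values taken from the patch. Note that Jim's 16MB-file scenario would indeed pass this check, which is exactly the objection.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Accept a segment file size only if it is a power of two within an
 * assumed 1MB..1GB range.  This infers the WAL segment size from a file
 * on disk when neither SHOW nor the control file is available. */
static bool
is_plausible_wal_seg_size(uint64_t size)
{
    const uint64_t min_size = UINT64_C(1) << 20;   /* 1 MB, assumed floor */
    const uint64_t max_size = UINT64_C(1) << 30;   /* 1 GB, assumed ceiling */

    if (size < min_size || size > max_size)
        return false;
    return (size & (size - 1)) == 0;   /* exactly one bit set */
}
```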
Re: [HACKERS] CREATE/ALTER ROLE PASSWORD ('value' USING 'method')
On Thu, Mar 9, 2017 at 7:59 AM, Michael Paquier wrote: > Yes, I agree with that for MD5, and after looking around I can see > (like here http://prosody.im/doc/plain_or_hashed) as well that > SCRAM-hashed is used. Now, there are as well references to the salt, > like in protocol.sgml: > "The salt to use when encrypting the password." > Joe, do you think that in this case using the term "hashing" would be > more appropriate? I would think so as we use it to hash the password. I'm not Joe, but I think that would be more appropriate. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
[HACKERS] partitioned tables and contrib/sepgsql
Greetings, While going over the contrib modules, I noticed that sepgsql was not updated for partitioned tables. What that appears to mean is that it's not possible to define labels on partitioned tables. As I recall, accessing the parent of a table will, similar to the GRANT system, not perform checks against the child tables, meaning that there's no way to have SELinux checks properly enforced when partitioned tables are being used. This is an issue which should be resolved for PG10, so I'll add it to the open items list. Thanks! Stephen
Re: [HACKERS] [GSoC] Self introduction and question to mentors
Hi, Kirill! 2017-03-06 12:41 GMT+05:00 Kirill Borozdin: > My name is Kirill Borozdin. I am a student of Ural Federal University from > Yekaterinburg, Russia. That's fine. I'm an associate professor at Ural Federal University, though not at your institution. > I understand that this task will require a big number of experiments because > we want to achieve > a speed-up in real life cases. > > Can you recommend me some articles or specific source files in PostgreSQL > codebase (it is so huge!) > which I can explore to be better prepared (in case I will be lucky enough to > be selected for GSoC)? Well, this project is not that specific to PostgreSQL. Everything you need to compile and use Postgres you can find on the Wiki, or just ask me directly. For starters, I'd recommend looking up some sorting-algorithm survey papers on Google Scholar. There is a good paper by Goetz Graefe on sorting, but it reflects the current situation rather than some future algorithms :) Also look at something on cache-oblivious sorting, and on processor cache effects: http://igoro.com/archive/gallery-of-processor-cache-effects/ Finally, the papers referenced in the Wiki page about GSoC for this project are a good starting point. As for tests, TPC is kind of ubiquitous. We could just move the standard pg_bench script towards something more sorting-heavy. Best regards, Andrey Borodin.
Re: [HACKERS] partitioned tables and contrib/sepgsql
On Thu, Mar 9, 2017 at 9:47 AM, Stephen Frost wrote: > Greetings, > > While going over the contrib modules, I noticed that sepgsql was not > updated for partitioned tables. What that appears to mean is that it's > not possible to define labels on partitioned tables. As I recall, > accessing the parent of a table will, similar to the GRANT system, not > perform checks against the child tables, meaning that there's no way to > have SELinux checks properly enforced when partitioned tables are being > used. I'll start taking a look at this. Presumably we'd just extend existing object_access_hooks to cover partitioned tables? > > This is an issue which should be resolved for PG10, so I'll add it to > the open items list. I'll grab it. Thanks. --Mike
Re: [HACKERS] CREATE/ALTER ROLE PASSWORD ('value' USING 'method')
On 03/09/2017 06:34 AM, Robert Haas wrote: > On Thu, Mar 9, 2017 at 7:59 AM, Michael Paquier > wrote: >> Yes, I agree with that for MD5, and after looking around I can see >> (like here http://prosody.im/doc/plain_or_hashed) as well that >> SCRAM-hashed is used. Now, there are as well references to the salt, >> like in protocol.sgml: >> "The salt to use when encrypting the password." >> Joe, do you think that in this case using the term "hashing" would be >> more appropriate? I would think so as we use it to hash the password. > > I'm not Joe, but I think that would be more appropriate. I am Joe and I agree ;-) -- Crunchy Data - http://crunchydata.com PostgreSQL Support for Secure Enterprises Consulting, Training, & Open Source Development
Re: [HACKERS] Logical replication existing data copy
On 3/6/17 05:27, Petr Jelinek wrote: > And lastly I changed the automagic around exporting, not exporting and > using the snapshot produced by CREATE_REPLICATION_SLOT into explicit > parameter for the CREATE_REPLICATION_SLOT command (and added simple > framework for adding more of those if needed in the future). It might be useful to make this into a separate patch, for clarity. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] SQL/JSON in PostgreSQL
On 09.03.2017 18:58, Robert Haas wrote: Also, even if the superset thing were true on a theoretical plane, I'm not sure it would do us much good in practice. If we start using YAML-specific constructs, we won't have valid JSON any more. If we use only things that are legal in JSON, YAML's irrelevant. That's true. I just wanted to share my view of the "date guessing" part of pgpro's commits. I don't have a good solution for it either; I can only tell that where I work we have the same issues: either we guess by looking at the string value or we know that "this particular key" must be a date. Unsatisfied with either solution, we tend to use YAML for our APIs if possible. Regards, Sven
Re: [HACKERS] Documentation improvements for partitioning
I think you might have the titles for 0002 and 0003 backwards. On Fri, Mar 3, 2017 at 2:51 AM, Amit Langote wrote: > 0002: some cosmetic fixes to create_table.sgml I think this sentence may be unclear to some readers: + One might however want to set it for only some partitions, + which is possible by doing SET NOT NULL on individual + partitions. I think you could replace this with something like: Even if there is no NOT NULL constraint on the parent, such a constraint can still be added to individual partitions, if desired; that is, the children can disallow nulls even if the parent allows them, but not the other way around. > 0003: add clarification about NOT NULL constraint on partition columns in > alter_table.sgml This is about list-ifying a note, but I think we should try to de-note-ify it. It's a giant block of text that is not obviously more noteworthy than the surrounding text; I think <note> should be saved for things that particularly need to be called out. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] PATCH: psql show index with type info
On 3/8/17 04:11, Ashutosh Bapat wrote: > The header for this table is "list of relations", so the type gets > associated with relations, indicating the type of relation. "btree" or "gin" as a > type of relation doesn't sound really great. Instead we might want to > add another column "access method" and specify the access method used > for that relation. I like that. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] [PATCH] Add pg_disable_checksums() and supporting infrastructure
On Sun, Feb 19, 2017 at 12:02 PM, David Christensen wrote: > Hi Robert, this is part of a larger patch which *does* enable the checksums > online; I’ve been extracting the necessary pieces out with the understanding > that some people thought the disable code might be useful on its own merit. > I can add documentation for the various states. The CHECKSUM_REVALIDATE is > the only one I feel is a little wibbly-wobbly; the other states are required > because of the fact that enabling checksums requires distinguishing between > “in process of enabling” and “everything is enabled”. Ah, sorry, I had missed that patch. > My design notes for the patch were submitted to the list with little comment; > see: > https://www.postgresql.org/message-id/1E6E64E9-634B-43F4-8AA2-CD85AD92D2F8%40endpoint.com > > I have since added the WAL logging of checksum states, however I’d be glad to > take feedback on the other proposed approaches (particularly the system > catalog changes + the concept of a checksum cycle). I think the concept of a checksum cycle might be overkill. It would be useful if we ever wanted to change the checksum algorithm, but I don't see any particular reason why we need to build infrastructure for that. If we wanted to change the checksum algorithm, I think it likely that we'd do that in the course of, say, widening it to 4 bytes as part of a bigger change in the page format, and this infrastructure would be too narrow to help with that. While I'm glad to see work finally going on to allow enabling and disabling checksums, I think it's too late to put this in v10. We have a rough rule that large new patches shouldn't be appearing for the first time in the last CommitFest, and I think this falls into that category. I also think it would be unwise to commit just the bits of the infrastructure that let us disable checksums without having time for a thorough review of the whole thing; to me, the chances that we'll make decisions that we later regret seem fairly high. 
I would rather wait for v11 and have a little more time to try to get everything right. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Write Ahead Logging for Hash Indexes
On Thu, Mar 9, 2017 at 10:23 AM, Amit Kapila wrote: >> +mode anyway). It would seem natural to complete the split in VACUUM, but >> since >> +splitting a bucket might require allocating a new page, it might fail if you >> +run out of disk space. That would be bad during VACUUM - the reason for >> +running VACUUM in the first place might be that you run out of disk space, >> +and now VACUUM won't finish because you're out of disk space. In contrast, >> +an insertion can require enlarging the physical file anyway. >> >> But VACUUM actually does try to complete the splits, so this is wrong, I >> think. > > Vacuum just tries to clean the remaining tuples from old bucket. Only > insert or split operation tries to complete the split. Oh, right. Sorry. >> +if (!xlrec.is_primary_bucket_page) >> +XLogRegisterBuffer(0, bucket_buf, REGBUF_STANDARD); >> >> So, in hashbucketcleanup, the page is registered and therefore an FPI >> will be taken if this is the first reference to this page since the >> checkpoint, but hash_xlog_delete won't restore the FPI. I think that >> exposes us to a torn-page hazard. >> _hash_freeovflpage/hash_xlog_squeeze_page seems to have the same >> disease, as does _hash_squeezebucket/hash_xlog_move_page_contents. > > Right, if we use XLogReadBufferForRedoExtended() instead of > XLogReadBufferExtended()/LockBufferForCleanup during the replay routine, > then it will restore the FPI when required and can serve our purpose of > acquiring a cleanup lock on the primary bucket page. Yet another way could > be to store some information like block number, relfilenode, forknum > (maybe we can get away with some less info) of the primary bucket in the > fixed part of each of these records and then using that read the page > and then take a cleanup lock. Now, as in most cases, this buffer > needs to be registered only in cases when there are multiple overflow > pages, I think having fixed information might hurt in some cases. Just because the WAL record gets larger? 
I think you could arrange to record the data only in the cases where it is needed, but I'm also not sure it would matter that much anyway. Off-hand, it seems better than having to mark the primary bucket page dirty (because you have to set the LSN) every time any page in the chain is modified. >> +/* >> + * Note that we still update the page even if it was restored from a >> full >> + * page image, because the bucket flag is not included in the image. >> + */ >> >> Really? Why not? (Because it's part of the special space?) > > Yes, because it's part of special space. Do you want that to > mentioned in comments? Seems like a good idea. Maybe just change the end to "because the special space is not included in the image" to make it slightly more explicit. >> Doesn't hash_xlog_split_allocpage need to acquire cleanup locks to >> guard against concurrent scans? > > I don't think so, because regular code takes it on old bucket to > protect the split from concurrent inserts and cleanup lock on new > bucket is just for the sake of consistency. You might be right, but this definitely does create a state on the standby that can't occur on the master: the bucket being split is flagged as such, but the bucket being populated doesn't appear to exist yet. Maybe that's OK. It makes me slightly nervous to have the standby take a weaker lock than the master does, but perhaps it's harmless. >> And also hash_xlog_split_complete? > > In regular code, we are not taking cleanup lock for this operation. You're right, sorry. >> And hash_xlog_delete? If the regular code path is getting a cleanup >> lock to protect against concurrent scans, recovery better do it, too. >> > > We are already taking cleanup lock for this, not sure what exactly is > missed here, can you be more specific? Never mind, I'm wrong. >> +/* >> + * There is no harm in releasing the lock on old bucket before new >> bucket >> + * during replay as no other bucket splits can be happening >> concurrently. 
>> + */ >> +if (BufferIsValid(oldbuf)) >> +UnlockReleaseBuffer(oldbuf); >> >> And I'm not sure this is safe either. The issue isn't that there >> might be another bucket split happening concurrently, but that there >> might be scans that get confused by seeing the bucket split flag set >> before the new bucket is created or the metapage updated. > > During scan we check for bucket being populated, so this should be safe. Yeah, I guess so. This business of the standby taking weaker locks still seems scary to me, as in hash_xlog_split_allocpage above. >> Seems like >> it would be safer to keep all the locks for this one. > > If you feel better that way, I can change it and add a comment saying > something like "we could release lock on bucket being split early as > well, but doing here to be consistent with normal operation" I think that might not be a bad idea. -- Robert
Re: [HACKERS] SQL/JSON in PostgreSQL
On 08.03.2017 20:48, Peter van Hardenberg wrote: Small point of order: YAML is not strictly a super-set of JSON. I haven't read the whole standard, but from what I can see the standard considers JSON an official subset of itself: http://www.yaml.org/spec/1.2/spec.html Regards, Sven
[HACKERS] tzdata2017a breaks timestamptz regression test
I typically build with --with-system-tzdata, which means that any time Red Hat sees fit to push out a new copy of the tzdata package, that's what I'm testing against. This morning they updated to tzdata2017a, and I'm seeing the attached test failures. The reason for this is that the IANA crew have really moved forward aggressively on their project to remove invented zone abbreviations. As stated at http://mm.icann.org/pipermail/tz-announce/2017-February/45.html they've removed text abbreviations for pretty much all of South America, which is what's breaking these cases. For the cases involving the America/Santiago zone, I'm a bit inclined to just switch that to America/New_York, which seems much less likely to get fooled with by IANA. But I'm wondering if Alvaro had a specific reason for using the Santiago zone in those test cases. The diffs involving VET (America/Caracas) are more problematic: we're intentionally trying to test a time-varying zone abbreviation, so we can't just switch to a stable zone. As far as the lines 2395..2437 hunk goes, we could perhaps make that segment of the test print in ISO datestyle instead of Postgres datestyle, which would force the zone to be printed in numeric form not as an abbreviation. That's slightly annoying, but it doesn't really compromise what that part of the test is trying to test. However, the very last hunk indicates that VET no longer functions as a time-varying zone abbreviation at all, because it doesn't match any abbreviation recorded for America/Caracas in the IANA database, so we fall back to treating it as a simple synonym for America/Caracas (cf commit 39b691f25). So that's now failing altogether to test what it means to. What I propose we do about that is replace the America/Caracas test cases with Europe/Moscow tests, moving the dates as needed to match DST transitions from when Moscow was observing DST (pre 2011). 
The comments in the IANA files indicate that they believe the MSK/MSD abbreviations have or had real-world usage, so they probably won't replace them with numeric offsets. We can hope, anyway. regards, tom lane *** /home/postgres/pgsql/src/test/regress/expected/timestamptz.out Mon Jan 30 17:03:46 2017 --- /home/postgres/pgsql/src/test/regress/results/timestamptz.out Thu Mar 9 13:35:41 2017 *** *** 1778,1796 SELECT make_timestamptz(1973, 07, 15, 08, 15, 55.33); make_timestamptz - ! Sun Jul 15 08:15:55.33 1973 CLT (1 row) SELECT make_timestamptz(1973, 07, 15, 08, 15, 55.33, '+2'); make_timestamptz - ! Sun Jul 15 02:15:55.33 1973 CLT (1 row) SELECT make_timestamptz(1973, 07, 15, 08, 15, 55.33, '-2'); make_timestamptz - ! Sun Jul 15 06:15:55.33 1973 CLT (1 row) WITH tzs (tz) AS (VALUES --- 1778,1796 SELECT make_timestamptz(1973, 07, 15, 08, 15, 55.33); make_timestamptz - ! Sun Jul 15 08:15:55.33 1973 -04 (1 row) SELECT make_timestamptz(1973, 07, 15, 08, 15, 55.33, '+2'); make_timestamptz - ! Sun Jul 15 02:15:55.33 1973 -04 (1 row) SELECT make_timestamptz(1973, 07, 15, 08, 15, 55.33, '-2'); make_timestamptz - ! Sun Jul 15 06:15:55.33 1973 -04 (1 row) WITH tzs (tz) AS (VALUES *** *** 1799,1821 ('+10:00:1'), ('+10:00:01'), ('+10:00:10')) SELECT make_timestamptz(2010, 2, 27, 3, 45, 00, tz), tz FROM tzs; !make_timestamptz|tz ! ---+--- ! Fri Feb 26 23:45:00 2010 CLST | +1 ! Fri Feb 26 23:45:00 2010 CLST | +1: ! Fri Feb 26 23:45:00 2010 CLST | +1:0 ! Fri Feb 26 23:45:00 2010 CLST | +100 ! Fri Feb 26 23:45:00 2010 CLST | +1:00 ! Fri Feb 26 23:45:00 2010 CLST | +01:00 ! Fri Feb 26 14:45:00 2010 CLST | +10 ! Fri Feb 26 14:45:00 2010 CLST | +1000 ! Fri Feb 26 14:45:00 2010 CLST | +10: ! Fri Feb 26 14:45:00 2010 CLST | +10:0 ! Fri Feb 26 14:45:00 2010 CLST | +10:00 ! Fri Feb 26 14:45:00 2010 CLST | +10:00: ! Fri Feb 26 14:44:59 2010 CLST | +10:00:1 ! Fri Feb 26 14:44:59 2010 CLST | +10:00:01 ! 
Fri Feb 26 14:44:50 2010 CLST | +10:00:10 (15 rows) -- these should fail --- 1799,1821 ('+10:00:1'), ('+10:00:01'), ('+10:00:10')) SELECT make_timestamptz(2010, 2, 27, 3, 45, 00, tz), tz FROM tzs; !make_timestamptz |tz ! --+--- ! Fri Feb 26 23:45:00 2010 -03 | +1 ! Fri Feb 26 23:45:00 2010 -03 | +1: ! Fri Feb 26 23:45:00 2010 -03 | +1:0 ! Fri Feb 26 23:45:00 2010 -03 | +100 ! Fri Feb 26 23:45:00 2010 -03 | +1:00 ! Fri Feb 26 23:45:00 2010 -03 | +01:00 ! Fri Feb 26 14:45:00 2010 -03 | +10 ! Fri Feb 26 14:45:00 2010 -03 | +1000 ! Fri
Re: [HACKERS] exposing wait events for non-backends (was: Tracking wait event for latches)
Perhaps I'm confused by the title of this thread/CF entry, but background workers already do show up in pg_stat_activity. (You can verify that by testing the background sessions patch.) So which additional things are we aiming to see with this? In practice, I think it's common to do a quick select * from pg_stat_activity to determine whether a database instance is in use. (You always see your own session, but that's easy to eyeball.) If we add all the various background processes by default, that will make things harder, especially if there is no straightforward way to filter them out. Perhaps a pg_stat_user_* and pg_stat_system_* split like we have for some of the statistics tables would be useful? -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Documentation improvements for partitioning
Robert Haas writes: > This is about list-ifying a note, but I think we should try to > de-note-ify it. It's a giant block of text that is not obviously more > noteworthy than the surrounding text; I think <note> should be saved > for things that particularly need to be called out. Yeah. A big problem with the markup we use, imo, is that <note> segments are displayed in a way that makes them more prominent than the surrounding text, not less so. That doesn't really square with my intuitive view of what a <note> ought to be used for; it forces it to be considered as something only slightly less dangerous than a <warning> or <caution>, not as a parenthetical remark. But that's what we have to deal with so we should mark up our text accordingly. regards, tom lane
Re: [HACKERS] Query fails when SRFs are part of FROM clause (Commit id: 69f4b9c85f)
On Mon, Jan 30, 2017 at 6:54 PM, Tom Lane wrote: > Andres Freund writes: >> Wonder if there's an argument to be made for implementing this >> roughly similarly to split_pathtarget_at_srf - instead of injecting a >> ProjectSet node we'd add a FunctionScan node below a Result node. > > Yeah, possibly. That would have the advantage of avoiding an ExecProject > step when the SRFs aren't buried, which would certainly be the expected > case. > > If you don't want to make ExecInitExpr responsible, then the planner would > have to do something like split_pathtarget_at_srf anyway to decompose the > expressions, no matter which executor representation we use. Did we do anything about this? Are we going to? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] PATCH: psql show index with type info
On 3/8/17 08:30, Stephen Frost wrote: > Right, I don't think having an extra column which is going to be NULL a > large amount of the time is good. Note that \di already has a column "Table" that is null for something that is not an index. So I don't think this argument is very strong. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] SQL/JSON in PostgreSQL
Anecdotally, we just stored dates as strings and used a convention (key ends in "_at", I believe) to interpret them. The lack of support for dates in JSON is well-known, universally decried... and not a problem the PostgreSQL community can fix. On Thu, Mar 9, 2017 at 10:24 AM, Sven R. Kunze wrote: > On 09.03.2017 18:58, Robert Haas wrote: > >> Also, even if the superset thing were true on a theoretical plane, I'm >> not sure it would do us much good in practice. If we start using >> YAML-specific constructs, we won't have valid JSON any more. If we >> use only things that are legal in JSON, YAML's irrelevant. >> > > That's true. I just wanted to share my view of the "date guessing" part of > pgpro's commits. > I don't have a good solution for it either, I can only tell that where I > work we do have same issues: either we guess by looking at the string value > or we know that "this particular key" must be a date. > Unsatisfied with either solution, we tend to use YAML for our APIs if > possible. > > > Regards, > Sven > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers > -- Peter van Hardenberg San Francisco, California "Everything was beautiful, and nothing hurt."—Kurt Vonnegut
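A minimal sketch of the convention described above (the "_at" suffix and ISO-8601 string format are assumptions from this discussion; real applications pick their own rules):

```python
import json
from datetime import datetime

def decode_with_dates(text, suffix="_at"):
    """Decode JSON, interpreting string values whose keys end in a
    conventional suffix (here "_at") as ISO-8601 timestamps.
    Everything else is left untouched."""
    def convert(obj):
        if isinstance(obj, dict):
            return {
                k: (datetime.fromisoformat(v)
                    if k.endswith(suffix) and isinstance(v, str)
                    else convert(v))
                for k, v in obj.items()
            }
        if isinstance(obj, list):
            return [convert(v) for v in obj]
        return obj
    return convert(json.loads(text))

doc = decode_with_dates('{"name": "x", "created_at": "2017-03-09T10:24:00"}')
```

The convention lives entirely in the application, which is exactly the point: nothing in JSON itself marks a value as a date.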
Re: [HACKERS] exposing wait events for non-backends (was: Tracking wait event for latches)
Hello Amit, On Tue, Mar 7, 2017 at 4:24 PM, Amit Langotewrote: > Hi Kuntal, > > Patches apply and compile fine. Works as advertised. > > Some minor comments on the patches themselves. > Thanks for the review. > In 0001: > > - * pgstat_bestart() - > + * pgstat_procstart() - > + * > + * Initialize this process's entry in the PgBackendStatus array. > + * Called from InitPostgres and AuxiliaryProcessMain. > > Not being called from AuxiliaryProcessMain(). Maybe leftover comment from > a previous version. Actually I see that in patch 0002, Main() functions > of various auxiliary processes call pgstat_procstart, not > AuxiliaryProcessMain. > Fixed. > + * user-defined functions which expects ids of backends starting from > 1 to > > s/expects/expect/g > Fixed. > +/* > + * AuxiliaryPidGetProc -- get PGPROC for an auxiliary process > + * given its PID > + * > + * Returns NULL if not found. > + */ > +PGPROC * > +AuxiliaryPidGetProc(int pid) > +{ > +PGPROC *result; > > Initialize to NULL so that the comment above is true. :) > Fixed. > In 0002: > > @@ -248,6 +248,9 @@ BackgroundWriterMain(void) > */ > prev_hibernate = false; > > +/* report walwriter process in the PgBackendStatus array */ > +pgstat_procstart(); > + > > s/walwriter/writer/g Fixed. > Patch 0004 should update monitoring.sgml. Added. I've attached the updated patches. PFA. -- Thanks & Regards, Kuntal Ghosh EnterpriseDB: http://www.enterprisedb.com 0001-Infra-to-expose-non-backend-processes-in-pg_stat_get.patch Description: binary/octet-stream 0002-Expose-stats-for-auxiliary-processes-in-pg_stat_get_.patch Description: binary/octet-stream 0003-Expose-stats-for-autovacuum-launcher-and-bgworker.patch Description: binary/octet-stream 0004-Add-proc_type-column-in-pg_stat_get_activity.patch Description: binary/octet-stream -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Other formats in pset like markdown, rst, mediawiki
This is looking pretty neat. I played around with it a bit. There are a couple of edge cases that you need to address, I think. - Does not support \x - When \pset format is rst, then \pset linestyle also shows up as "rst". That is wrong. Same for markdown. - Broken output in tuples_only (\t) mode. (rst and markdown) - rst: Do something about \pset title; the way it currently shows up appears to be invalid; could use ".. table:: title" directive - markdown: Extra blank line between table and footer. - markdown: We should document or comment somewhere exactly which of the various markdown table formats this is supposed to produce. (Pandoc pipe_tables?) - markdown: Table title needs to be after the table, like Table: title - markdown: Needs to escape | characters in cell contents. (Not needed for rst.) More escaping might be needed. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
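On the last point, a sketch of the kind of escaping a pipe-table cell needs (illustrative only; backslash-escaping is what pandoc's pipe_tables accepts, and the helper names here are invented, not psql's):

```python
def md_escape_cell(value):
    # Markdown pipe tables use '|' as the column separator, so a literal
    # '|' inside cell contents must be escaped; rst tables do not need this.
    return value.replace("|", "\\|")

def md_row(cells):
    # Render one pipe-table row from a list of cell strings.
    return "| " + " | ".join(md_escape_cell(c) for c in cells) + " |"

row = md_row(["a|b", "c"])
```

As the review notes, further escaping (e.g. of leading/trailing whitespace or backslashes) may also be needed depending on which markdown dialect the output is meant to target.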
Re: [HACKERS] use SQL standard error code for nextval
On 3/9/17 12:27, Mark Dilger wrote: > Perhaps you should add something to the release notes. Somebody could be > testing for the old error code. The release notes will be written when the release is prepared. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] partial indexes and bitmap scans
Tom, * Tom Lane (t...@sss.pgh.pa.us) wrote: > Stephen Frost writes: > > We have already figured out that the user's predicate results in a > > subset of the index's or we wouldn't be able to use that index though, > > right? Do we really need to spend cycles re-discovering that? Are > > there cases where we actually need the index's predicate to ever be > > included for correctness..? > > In a bitmap indexscan, yes, because the entire point of the recheck > condition is that we're going to scan a whole page and possibly see > tuples that don't satisfy the index predicate at all. I understand the point of the recheck condition being that we might see tuples that don't satisfy either the index predicate or the user's predicate, but that isn't the point - to even consider using that index we must have already realized that the user's predicate will only be satisfied in a subset of cases when the index predicate will be satisfied, making any check of the index predicate necessarily always true. > Another point > that's mentioned in the comments in createplan.c is that if you're > considering the result of a BitmapOr or BitmapAnd, there's not necessarily > a single index involved so it's much harder to reason about which part > of the qual is an index predicate. We must be pulling the index's predicate to be able to put it into the Recheck condition in the first place. What I'm arguing is that once we've decided that we're able to use an index X, because the values which the user's predicate will satisfy is a subset of those which the index's predicate will satisfy, then we can entirely forget about the index's predicate as being redundant to the user's. I don't see anything about a BitmapAnd or BitmapOr being relevant to that. We will always need to check the user's predicate against the tuples being returned from the Bitmap Heap Scan, unless by chance it matches the index's predicate exactly *and* we have an entirely exact match bitmap without any lossy parts.
> I do notice that createplan.c makes some effort to get rid of filter > conditions that are provably true when the index conditions are. > Doesn't look like it considers the reverse direction. Not sure if > that's missing a bet. That actually strikes me as a less likely condition to have in the general case, isn't it? Wouldn't that only happen if the index predicate and the user predicate are identical, because otherwise we either can't use the index or we must keep the user's predicate because it will only be satisfied in a subset of cases? Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Logical replication existing data copy
On 3/6/17 05:27, Petr Jelinek wrote: > updated and rebased version of the patch attached. Some comments on this patch: Can we have a better explanation of different snapshot options in CREATE_REPLICATION_SLOT. We use all these variants in different places, but it's not fully documented why. Think about interested users reading this. errmsg("subscription table %u in subscription %u does not exist", Use names instead of IDs if possible. + libpqrcv_table_list, + libpqrcv_table_info, + libpqrcv_table_copy, I don't think these functions belong into the WAL receiver API, since they are specific to this particular logical replication implementation. I would just make an API function libpqrc_exec_sql() that runs a query, and then put the table_*() functions as wrappers around that into tablesync.c. Not sure what the worker init stuff in ApplyLauncherShmemInit() is doing. Could be commented more. There are a lot of places that do one of MyLogicalRepWorker->relid == InvalidOid OidIsValid(MyLogicalRepWorker->relid) To check whether the current worker is a sync worker. Wrap that into a function. Not a fan of adding CommandCounterIncrement() calls at the end of functions. In some cases, they are not necessary at all. In cases where they are, the CCI call should be at a higher level between two function calls with a comment for why the call below needs to see the changes from the call above. The index name pg_subscription_rel_map_index/SubscriptionRelMapIndexId doesn't seem to match existing style, since there is no "map" column. How about pg_subscription_rel_rel_sub_index? I see we have a similarly named index for publications. max_sync_workers_per_subscription could be PGC_SIGHUP, I think. And the minimum could be 0, to stop any syncing? pg_subscription_rel.h: I'm not sure under which circumstances we need to use BKI_ROWTYPE_OID. Does pg_subscription_rel need an OID column? The index SubscriptionRelOidIndexId is not used anywhere. 
You removed this command from the tests: ALTER SUBSCRIPTION testsub SET PUBLICATION testpub, testpub1; I suppose because it causes a connection. We should have an option to prevent that (NOCONNECT/NOREFRESH?). Why was the test 'check replication origin was dropped on subscriber' removed? Attached also a small patch with some stylistic changes. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services From c6ba1eaa3c5a44e4a9f6d072cb95fcf7e68ba3d6 Mon Sep 17 00:00:00 2001 From: Peter EisentrautDate: Thu, 9 Mar 2017 08:19:06 -0500 Subject: [PATCH] fixup! Logical replication support for initial data copy --- doc/src/sgml/catalogs.sgml | 9 + doc/src/sgml/logical-replication.sgml | 18 +- doc/src/sgml/monitoring.sgml| 4 ++-- doc/src/sgml/protocol.sgml | 14 +++--- doc/src/sgml/ref/alter_subscription.sgml| 8 doc/src/sgml/ref/create_subscription.sgml | 13 ++--- src/backend/catalog/pg_subscription.c | 13 - src/backend/replication/logical/launcher.c | 18 +++--- src/backend/replication/logical/snapbuild.c | 6 +++--- src/backend/replication/logical/worker.c| 4 ++-- src/backend/replication/repl_gram.y | 1 + src/backend/replication/walsender.c | 2 +- src/backend/tcop/postgres.c | 8 +--- 13 files changed, 56 insertions(+), 62 deletions(-) diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml index f587a08b6a..ab78585035 100644 --- a/doc/src/sgml/catalogs.sgml +++ b/doc/src/sgml/catalogs.sgml @@ -302,7 +302,7 @@ System Catalogs pg_subscription_rel - relation state mapping for subscriptions + relation state for subscriptions @@ -6429,14 +6429,14 @@ pg_subscription_rel The catalog pg_subscription_rel contains the - status for each replicated relation in each subscription. This is a + state for each replicated relation in each subscription. This is a many-to-many mapping. 
- This catalog only contains tables known to subscription after running + This catalog only contains tables known to the subscription after running either CREATE SUBSCRIPTION or - ALTER SUBSCRIPTION ... REFRESH commands. + ALTER SUBSCRIPTION ... REFRESH. @@ -6472,6 +6472,7 @@ pg_subscription_rel Columns char + State code: i = initialize, d = data is being copied, s = synchronized, diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml index f75304cd91..4ec6bb49b7 100644 --- a/doc/src/sgml/logical-replication.sgml +++ b/doc/src/sgml/logical-replication.sgml @@ -24,8 +24,8 @@ Logical Replication - Logical replication typically starts with a snapshot of the data on - the publisher
Re: [HACKERS] partial indexes and bitmap scans
Stephen Frost writes: > * Tom Lane (t...@sss.pgh.pa.us) wrote: >> I do notice that createplan.c makes some effort to get rid of filter >> conditions that are provably true when the index conditions are. >> Doesn't look like it considers the reverse direction. Not sure if >> that's missing a bet. > That actually strikes me as a less likely condition to have in the > general case, isn't it? Wouldn't that only happen if the index > predicate and the user predicate are identical, because otherwise we > either can't use the index or we must keep the user's predicate because > it will only be satisfied in a subset of cases? Well, remember that the planner's idea of an ideal situation is to not have any filter conditions, not to not have any index (a/k/a recheck) conditions. It's going to try to put as much load as it can on the index condition side of things, and that gives rise to the need for rechecks. It seems like there might be some mileage to be gained by reversing the proof direction here, and having it get rid of recheck conditions that are implied by filter conditions rather than vice versa. I'm not quite convinced though, and I'm also not sure how hard it would be to mechanize. A lot of that code is shared between the bitmap and plain-indexscan cases, which could make it tricky to not break the plain-indexscan case. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
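The implication argument this exchange turns on can be made concrete with a toy model (plain Python over closed intervals, not the planner's predtest.c machinery): if the user's predicate implies the partial index's predicate, rechecking the index predicate after applying the user's predicate can never reject an additional row.

```python
def implies(user_rng, index_rng):
    # Toy implication test for closed intervals (lo, hi): the user's
    # predicate implies the index predicate iff its interval is a subset.
    return index_rng[0] <= user_rng[0] and user_rng[1] <= index_rng[1]

def rows_passing(rows, rng):
    # Apply a BETWEEN-style predicate to a list of values.
    return [r for r in rows if rng[0] <= r <= rng[1]]

rows = [1, 5, 10, 15, 20]
index_pred = (0, 20)   # partial-index predicate, e.g. WHERE x BETWEEN 0 AND 20
user_pred = (5, 10)    # the query's predicate

redundant = implies(user_pred, index_pred)
with_recheck = rows_passing(rows_passing(rows, user_pred), index_pred)
without_recheck = rows_passing(rows, user_pred)
```

This is only the logical half of the question; as Tom notes, the planner also has to prove the implication mechanically, and doing so cheaply in the BitmapAnd/BitmapOr cases is where the difficulty lies.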
Re: [HACKERS] SQL/JSON in PostgreSQL
On 08.03.2017 20:52, Magnus Hagander wrote: On Wed, Mar 8, 2017 at 11:48 AM, Peter van Hardenberg wrote: Small point of order: YAML is not strictly a super-set of JSON. Editorializing slightly, I have not seen much interest in the world for YAML support though I'd be interested in evidence to the contrary. The world of configuration management seems to for some reason run off YAML, but that's the only places I've seen it recently (ansible, puppet etc). SaltStack uses YAML for their tools, too. I personally can empathize with them (as a user of configuration management) about this as writing JSON would be a nightmare with all the quoting, commas, curly braces etc. But that's my own preference maybe. (Btw. does "run off" mean like or avoid? At least my dictionaries tend to the latter.) That said if we're introducing something new, it's usually better to copy from another format than to invent your own. From my day-to-day work I can tell, the date(time) type is the only missing piece of JSON to make it perfect for business applications (besides, maybe, a "currency" type). Regards, Sven
Re: [HACKERS] PATCH: psql show index with type info
Peter, * Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote: > On 3/8/17 04:11, Ashutosh Bapat wrote: > > The header for this table is "list of relations", so type gets > > associated with relations indicated type of relation. btree: gin as a > > type of relation doesn't sound really great. Instead we might want to > > add another column "access method" and specify the access method used > > for that relation. > > I like that. When it will be NULL for all of the regular tables..? I'd rather have the one Type column that just included the access method when there is one to include. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] use SQL standard error code for nextval
Mark Dilger writes: >> On Mar 9, 2017, at 7:59 AM, Peter Eisentraut >> wrote: >> On 2/28/17 22:15, Peter Eisentraut wrote: >>> The SQL standard defines a separate error code for nextval exhausting >>> the sequence space. I haven't found any discussion of this in the >>> archives, so it seems this was just not considered or not yet in >>> existence when the error codes were introduced. Here is a patch to >>> correct it. > Perhaps you should add something to the release notes. Somebody could be > testing for the old error code. The release notes for v10 aren't going to be drafted for months yet. When they are, hopefully the writer will notice that this should be listed as an incompatible change. That's not the responsibility of this commit, although it would've been better if the commit log entry explicitly pointed out that it's an incompatible change. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] foreign partition DDL regression tests
On Thu, Mar 9, 2017 at 1:19 AM, Ashutosh Bapat wrote: >>> At least we need to update the documentation. >> >> Got a proposal? > > How about something like attached? Committed with some revisions. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Couple of issues with prepared FETCH commands
On Wed, Jan 11, 2017 at 11:28 PM, Tom Lane wrote: > Robert Haas writes: >> On Tue, Jan 10, 2017 at 5:38 PM, Andrew Gierth >> wrote: >>> But the problem that actually came up is this: if you do the PQprepare >>> before the named cursor has actually been opened, then everything works >>> _up until_ the first event, such as a change to search_path, that forces >>> a revalidation; and at that point it fails with the "must not change >>> result type" error _even if_ the cursor always has exactly the same >>> result type. This happens because the initial prepare actually stored >>> NULL for plansource->resultDesc, since the cursor name wasn't found, >>> while on the revalidate, when the cursor obviously does exist, it gets >>> the actual result type. >>> >>> It seems a bit of a "gotcha" to have it fail in this case when the >>> result type isn't actually being checked in other cases. > >> To me, that sounds like a bug. > > Yeah --- specifically, I wonder why we allow the reference to an > unrecognized cursor name to succeed. Or were you defining the bug > differently? I'm not sure whether that's a bug or not. What I was defining as a bug is calling a change from "we don't know what the result type will be" to "we know that the result type will be X" as a change in the result type. That's really totally inaccurate. I've never really understood errors about changing the result type. As a user, I assumed those were unavoidable implementation artifacts, on the theory that they were annoying and therefore the developers would have eliminated such messages had it been practical. As a developer, I've never gotten around to understanding whether that theory was correct. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SQL/JSON in PostgreSQL
On Thu, Mar 9, 2017 at 12:48 PM, Sven R. Kunze wrote: > On 08.03.2017 20:48, Peter van Hardenberg wrote: >> >> Small point of order: YAML is not strictly a super-set of JSON. > > I haven't read the whole standard, but from what I can see the standard > considers JSON an official subset of itself: > http://www.yaml.org/spec/1.2/spec.html But there's apparent sophistry, like this, in that spec: JSON's RFC4627 requires that mappings keys merely “SHOULD” be unique, while YAML insists they “MUST” be. Technically, YAML therefore complies with the JSON spec, choosing to treat duplicates as an error. In practice, since JSON is silent on the semantics of such duplicates, the only portable JSON files are those with unique keys, which are therefore valid YAML files. I don't see how YAML can impose a stronger requirement than JSON and yet claim to be a superset; a JSON document that doesn't meet that requirement will be legal (if stupid) as JSON but illegal as YAML. Also, even if the superset thing were true on a theoretical plane, I'm not sure it would do us much good in practice. If we start using YAML-specific constructs, we won't have valid JSON any more. If we use only things that are legal in JSON, YAML's irrelevant. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
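The duplicate-key point is easy to demonstrate with an ordinary JSON parser. Python's json module, for instance (like most parsers), silently keeps the last value for a repeated key, while a strict YAML 1.2 loader is entitled to reject the very same document as an error:

```python
import json

# RFC 4627 only says object keys SHOULD be unique, so this is legal
# (if stupid) JSON; most parsers quietly keep the last duplicate.
parsed = json.loads('{"a": 1, "a": 2}')
```

So a document can be accepted by every mainstream JSON parser yet be invalid YAML, which is the hole in the "superset" claim.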
Re: [HACKERS] updating parallel.sgml
On Wed, Mar 8, 2017 at 12:46 PM, Robert Haas wrote: > Here's a patch which updates parallel.sgml for the new parallel scan > types which have recently been committed, and also expands the > explanation of parallel joins slightly. I hope everyone will find > this useful in understanding what new capabilities we now have, and > what remains to be done in the future. > > Barring objections, I plan to commit this tomorrow. Committed with a fix for a typo pointed out to me off-list. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: psql show index with type info
* Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote: > On 3/8/17 08:30, Stephen Frost wrote: > > Right, I don't think having an extra column which is going to be NULL a > > large amount of the time is good. > > Note that \di already has a column "Table" that is null for something > that is not an index. So I don't think this argument is very strong. That's an interesting point. I think what I find most odd about all of this is that \dti and \dit work at all, and give a different set of columns from \dt. We don't document that combining those works in \?, as far as I can see, and other combinations don't work, just this. In any case, I won't push very hard on this, it's useful information to include and we should do so. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] [PATCH] Transaction traceability - txid_status(bigint)
On Wed, Jan 25, 2017 at 12:44 AM, Craig Ringer wrote: >> The way that SetTransactionIdLimit() now works looks a bit dangerous. >> xidWrapLimit, xidStopLimit, and xidWarnLimit are computed based on the >> passed-in oldestXid value and written straight into shared memory. >> But the shared memory copy of oldestXid could have a different value. >> I'm not sure if that breaks anything, but it certainly weakens any >> confidence callers might have had that all those values are consistent >> with each other. > > This was my main hesitation with the whole thing too. > > It's necessary to advance oldestXmin before we xlog the advance and > truncate clog, and necessary to advance the vacuum limits only > afterwards. Well, that's why I tried to advocate that there should be two separate XID limits in shared memory: leave what's there now the way it is, and then add a new limit for "don't try to look up XIDs before this point: splat". I still think that'd be less risky. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Performance issue after upgrading from 9.4 to 9.6
Hi, On 2017-03-09 13:47:35 +0100, Naytro Naytro wrote: > We are having some performance issues after we upgraded to newest > version of PostgreSQL, before it everything was fast and smooth. > > Upgrade was done by pg_upgrade from 9.4 directly do 9.6.1. Now we > upgraded to 9.6.2 with no improvement. > > Some information about our setup: Freebsd, Solaris (SmartOS), simple > master-slave using streaming replication. Which node is on which of those, and where is the high load? > Problem: > Very high system CPU when master is streaming replication data, CPU > goes up to 77%. Only one process is generating this load, it's a > postgresql startup process. When I attached a truss to this process I > saw a lot o read calls with almost the same number of errors (EAGAIN). Hm. Just to clarify: The load is on the *receiving* side, in the startup process? Because the load doesn't quite look that way... > read(6,0x7fffa0c7,1) ERR#35 'Resource temporarily unavailable' > > Descriptor 6 is a pipe That's presumably a latches internal pipe. Could you redo that truss/strace with timestamps attached? Does truss show signals received? The above profile would e.g. make a lot more sense if not. Is the wal receiver sending signals? > Read call try to read one byte over and over, I looked up to source > code and I think this file is responsible for this behavior > src/backend/storage/ipc/latch.c. There was no such file in 9.4. It was "just" moved (and expanded), used to be at src/backend/port/unix_latch.c. There normally shouldn't be that much "latch traffic" in the startup process, we'd expect to block from within WaitForWALToBecomeAvailable(). Hm. Any chance you've configured a recovery_min_apply_delay? Although I'd expect more timestamp calls in that case. Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
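For readers unfamiliar with latch.c's internals: repeated one-byte reads failing with EAGAIN are consistent with the self-pipe pattern it uses, where the wait loop drains a nonblocking pipe that wakeups are written into. A minimal Python sketch of that pattern (illustrative only, not PostgreSQL's code):

```python
import os

# Self-pipe sketch: a waiter drains a nonblocking pipe; once the pipe
# is empty, read() fails with EAGAIN instead of blocking -- the same
# error repeating in the truss output above.
r, w = os.pipe()
os.set_blocking(r, False)

os.write(w, b"\x00")          # "set the latch": wake any waiter
woke = os.read(r, 1)          # waiter drains one byte

try:
    os.read(r, 1)             # pipe now empty
    empty = False
except BlockingIOError:       # errno EAGAIN / EWOULDBLOCK
    empty = True
```

A healthy wait loop hits EAGAIN once per wakeup and then sleeps in poll()/select(); a tight loop of such reads, as reported here, suggests the process is being woken (or spinning) far more often than expected.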
Re: [HACKERS] Query fails when SRFs are part of FROM clause (Commit id: 69f4b9c85f)
Tom, all, * Tom Lane (t...@sss.pgh.pa.us) wrote: > Robert Haas writes: > > On Mon, Jan 30, 2017 at 6:54 PM, Tom Lane wrote: > >> If you don't want to make ExecInitExpr responsible, then the planner would > >> have to do something like split_pathtarget_at_srf anyway to decompose the > >> expressions, no matter which executor representation we use. > > > Did we do anything about this? Are we going to? > > No, and I think we should. Is it on the v10 open items list? Wasn't, I've added it now: https://wiki.postgresql.org/wiki/PostgreSQL_10_Open_Items Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] SQL/JSON in PostgreSQL
On 09.03.2017 19:50, Peter van Hardenberg wrote: Anecdotally, we just stored dates as strings and used a convention (key ends in "_at", I believe) to interpret them. The lack of support for dates in JSON is well-known, universally decried... and not a problem the PostgreSQL community can fix. I completely agree here. Regards, Sven -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] bytea_output vs make installcheck
On 2/14/17 16:50, Jeff Janes wrote: > make installcheck currently fails against a server running > with bytea_output = escape. > > Making it succeed is fairly easy, and I think it is worth doing. > > Attached are two options for doing that. One overrides bytea_output > locally where-ever needed, and the other overrides it for the entire > 'regression' database. I would use option 2 here (ALTER DATABASE) and be done with it. Some people didn't like using ALTER DATABASE, but it's consistent with existing use. If someone wants to change that, that can be independent of this issue. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Enabling atomics on ARM64
On Wed, Mar 8, 2017 at 10:18 PM, Roman Shaposhnik wrote: > I'd like to offer a forward port from a change I'm contributing > the Greenplum code base over here: > https://github.com/greenplum-db/gpdb/pull/1983 Thanks. Please add this to https://commitfest.postgresql.org/14/ so it doesn't get forgotten. I'm afraid your patch is probably too late for v10 (the deadline for patches to be considered for v10 was March 1st, though somebody might propose to make an exception), but v11 will be here soon enough. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] New CORRESPONDING clause design
hi 2017-03-09 17:19 GMT+01:00 Pavel Stehule: > > > 2017-03-09 13:18 GMT+01:00 Surafel Temesgen : > >> Hi , >> >> Here is a patch corrected as your feedback except missed tests case >> because corresponding by clause is implemented on the top of set operation >> and you can’t do that to set operation without corresponding by clause too >> > > I don't understand. > > The following statement should to work > > postgres=# select 10 as a, 20 as b union corresponding select 20 as a, 30 > as b, 40 as c; > > ERROR: each UNION query must have the same number of columns > LINE 1: ...elect 10 as a, 20 as b union corresponding select 20 as a, 3... > > > Corresponding clause should to work like projection filter. > I found a link to postgresql mailing list related to some older try to this feature implementation https://www.postgresql.org/message-id/cajzswkx7c6wmfo9py4baf8vhz_ofko3afsosjpsb17rgmgb...@mail.gmail.com > Regards > > Pavel > > >> >> Eg >> >> >> postgres=# SELECT 1 a, 2 b, 3 c UNION SELECT 4 a, 5 b, 6 c, 8 d; >> >> ERROR: each UNION query must have the same number of columns >> >> LINE 1: SELECT 1 a, 2 b, 3 c UNION SELECT 4 a, 5 b, 6 c, 8 d; >> >> ^ >> >> postgres=# create table t1(a int, b int, c int); >> >> CREATE TABLE >> >> postgres=# create table t2(a int, b int); >> >> CREATE TABLE >> >> >> >> postgres=# select * from t1 union select * from t2; >> >> ERROR: each UNION query must have the same number of columns >> >> LINE 1: select * from t1 union select * from t2; >> >> >> >> Thanks >> >> Surafel >> >> On Tue, Mar 7, 2017 at 10:26 PM, Pavel Stehule >> wrote: >> >>> Hi >>> >>> I am sending a review of this interesting feature. >>> >>> I found following issues, questions: >>> >>> 1. unclosed tags in documentation >>> 2. bad name "changeTargetEntry" - should be makeTargetEntry? >>> 3. Why you removed lot of asserts in prepunion.c? These asserts should >>> be valid still >>> 4. make_coresponding_target has wrong formatting >>> 5. 
error "%s queries with a CORRESPONDING clause must have at least one >>> column with the same name" has wrong formatting, you can show position >>> 6. previous issue is repeated - look on formatting ereport function, >>> please, you can use DETAIL and HINT fields >>> 7. corresponding clause should to contain column list (I am looking to >>> ANSI/SQL 99) - you are using expr_list, what has not sense and probably it >>> has impact on all implementation. >>> 8. typo orderCorrespondingLsit(List *targetlist) >>> 9. I miss more tests for CORRESPONDING BY >>> 10. if I understand to this feature, this query should to work >>> >>> postgres=# SELECT 1 a, 2 b, 3 c UNION CORRESPONDING BY (c,b) SELECT 4 a, 5 >>> b, 6 c, 8 d; >>> ERROR: each UNION query must have the same number of columns >>> LINE 1: ...1 a, 2 b, 3 c UNION CORRESPONDING BY (c,b) SELECT 4 a, 5 b, ... >>> >>> postgres=# create table t1(a int, b int, c int); >>> CREATE TABLE >>> Time: 63,260 ms >>> postgres=# create table t2(a int, b int); >>> CREATE TABLE >>> Time: 57,120 ms >>> postgres=# select * from t1 union corresponding select * from t2; >>> ERROR: each UNION query must have the same number of columns >>> LINE 1: select * from t1 union corresponding select * from t2; >>> >>> If it is your first patch to Postgres, then it is perfect work! >>> >>> The @7 is probably most significant - I dislike a expression list there. >>> name_list should be better there. >>> >>> Regards >>> >>> Pavel >>> >>> >> >
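For reference, the behavior Pavel is asking for can be modeled in a few lines: CORRESPONDING should act as a projection filter, restricting each side of the set operation to the common (or explicitly listed) column names before combining them. This toy model (plain Python; the function and its representation are invented for illustration, not the patch's implementation) shows the expected result for his failing example:

```python
def corresponding_union(rows1, rows2, by=None):
    # Each "query result" is a list of dicts (column name -> value).
    # CORRESPONDING projects both sides onto their common column names
    # (or the explicit BY list) before taking the UNION.
    cols1 = set(rows1[0]) if rows1 else set()
    cols2 = set(rows2[0]) if rows2 else set()
    common = sorted(cols1 & cols2) if by is None else list(by)
    if not common:
        raise ValueError(
            "queries with a CORRESPONDING clause must share a column name")
    project = lambda r: tuple(r[c] for c in common)
    seen = {project(r) for r in rows1} | {project(r) for r in rows2}
    return common, sorted(seen)

# SELECT 10 a, 20 b UNION CORRESPONDING SELECT 20 a, 30 b, 40 c
cols, rows = corresponding_union(
    [{"a": 10, "b": 20}],
    [{"a": 20, "b": 30, "c": 40}])
```

Under these semantics the differing column counts are no error at all; the extra column c is simply projected away, which is exactly what the failing test queries in the review expect.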
Re: [HACKERS] on_dsm_detach() callback and parallel tuplesort BufFile resource management
On Wed, Mar 8, 2017 at 5:23 PM, Robert Haas wrote:
> I think something like 0007-hj-shared-buf-file-v6.patch from
> https://www.postgresql.org/message-id/CAEepm=3g=dMG+84083fkFzLvgMJ7HdhbGB=aezabnukbzm3...@mail.gmail.com
> is probably a good approach to this problem. In essence, it dodges
> the problem of trying to transfer ownership by making ownership be
> common from the beginning. That's what I've been recommending, or
> trying to, anyway.

I think that I'm already doing that, to at least the same extent that 0007-hj-shared-buf-file-v6.patch is. That patch seems to be solving the problem by completely taking over management of temp files from fd.c. That is, these temp files are not marked as temp files in the way ordinary temp BufFiles are, with explicit buy-in from resowner.c about their temp-ness. It adds "#include "catalog/pg_tablespace.h"" to buffile.c, even though that kind of thing generally lives within fd.c. It does use PG_TEMP_FILE_PREFIX, but never manages to use temp_tablespaces if that's set by user. It also doesn't do anything about temp_file_limit enforcement. Quite a lot of thought seems to have gone into making the fd.c/resowner mechanism leak-proof in the face of errors. So, quite apart from what that approach misses out on, I really don't want to change resource management too much. As I went into in my "recap", it seems useful to change as little as possible about temp files.
Besides, because parallel hash join cannot know the size of the BufFiles from workers in advance, and thus cannot replicate fd.c fully by having state for each segment, it ends up actually *relying on* ENOENT-on-unlink() as a condition that terminates execution:

> +bool
> +BufFileDeleteShared(Oid tablespace, pid_t pid, int set, int partition,
> +                    int participant)
> +{
> +    char        tempdirpath[MAXPGPATH];
> +    char        tempfilepath[MAXPGPATH];
> +    int         segment = 0;
> +    bool        found = false;
> +
> +    /*
> +     * We don't know if the BufFile really exists, because SharedBufFile
> +     * tracks only the range of file numbers.  If it does exist, we don't
> +     * know how many 1GB segments it has, so we'll delete until we hit
> +     * ENOENT or an IO error.
> +     */
> +    for (;;)
> +    {
> +        make_shareable_path(tempdirpath, tempfilepath,
> +                            tablespace, pid, set, partition,
> +                            participant, segment);
> +        if (!PathNameDelete(tempfilepath))
> +            break;
> +        found = true;
> +        ++segment;
> +    }
> +
> +    return found;
> +}

However, the whole point of my fortifying the refcount mechanism for V9 of parallel tuplesort is to not have to ignore an ENOENT like this, on the grounds that that's ugly/weird (I pointed this out when I posted my V8, actually). Obviously I could very easily teach fd.c not to LOG a complaint about an ENOENT on unlink() for the relevant parallel case, on the assumption that it's only down to an error during the brief period of co-ownership of a BufFile at the start of leader unification. Is that acceptable?

Anyway, the only advantage I immediately see with the approach 0007-hj-shared-buf-file-v6.patch takes (over what I've provisionally written as my V9) is that by putting everything in shared memory all along, there is no weirdness with tying local memory clean-up to a shared memory on_dsm_detach() callback. As I said, stashing a local pointer for the leader in shared memory is weird, even if it accomplishes the stated goals for V9.
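The delete-until-ENOENT pattern in the quoted patch can be sketched standalone in plain C. The path scheme ("base.N") and the `delete_segments()` helper below are simplifications of the patch's `make_shareable_path()`/`PathNameDelete()`, not PostgreSQL code: since the caller does not track how many segments the file grew to, it unlinks by increasing segment number until unlink() fails.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Sketch of the "delete until ENOENT" loop discussed above: unlink
 * segment files "base.0", "base.1", ... until unlink() fails.  ENOENT
 * terminates the walk normally; any other errno is a real error.
 */
static bool
delete_segments(const char *base)
{
    bool    found = false;
    char    path[1024];
    int     segment;

    for (segment = 0;; segment++)
    {
        snprintf(path, sizeof(path), "%s.%d", base, segment);
        if (unlink(path) != 0)
        {
            if (errno != ENOENT)
                perror(path);   /* IO error, not just "no more segments" */
            break;
        }
        found = true;
    }
    return found;
}
```

The return value mirrors the patch's `found` flag: it tells the caller whether any segment actually existed, which is exactly the information that cannot be known up front.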
-- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Query fails when SRFs are part of FROM clause (Commit id: 69f4b9c85f)
Robert Haas writes:
> On Mon, Jan 30, 2017 at 6:54 PM, Tom Lane wrote:
>> If you don't want to make ExecInitExpr responsible, then the planner would
>> have to do something like split_pathtarget_at_srf anyway to decompose the
>> expressions, no matter which executor representation we use.

> Did we do anything about this? Are we going to?

No, and I think we should. Is it on the v10 open items list?

regards, tom lane
Re: [HACKERS] cast result of copyNode()
On 3/7/17 18:27, Mark Dilger wrote:
> You appear to be using a #define macro to wrap a function of the same name
> with the code:
>
> #define copyObject(obj) ((typeof(obj)) copyObject(obj))

Yeah, that's a bit silly. Here is an updated version that changes that.

--
Peter Eisentraut
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

From 696e605c358e7ca5786a13c82504e9d7dbed5eb4 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut
Date: Thu, 9 Mar 2017 15:18:59 -0500
Subject: [PATCH v2] Cast result of copyObject() to correct type

copyObject() is declared to return void *, which allows easily assigning the result independent of the input, but it loses all type checking. If the compiler supports typeof or something similar, cast the result to the input type. This creates a greater amount of type safety. In some cases, where the result is assigned to a generic type such as Node * or Expr *, new casts are now necessary, but in general casts are now unnecessary in the normal case and indicate that something unusual is happening.
--- config/c-compiler.m4 | 27 + configure | 40 +++ configure.in | 1 + src/backend/bootstrap/bootstrap.c | 4 ++-- src/backend/commands/copy.c | 2 +- src/backend/commands/createas.c | 2 +- src/backend/commands/event_trigger.c | 8 +++ src/backend/commands/prepare.c| 4 ++-- src/backend/commands/view.c | 2 +- src/backend/nodes/copyfuncs.c | 8 +++ src/backend/optimizer/path/indxpath.c | 4 ++-- src/backend/optimizer/plan/createplan.c | 6 ++--- src/backend/optimizer/plan/initsplan.c| 8 +++ src/backend/optimizer/plan/planagg.c | 4 ++-- src/backend/optimizer/plan/planner.c | 4 ++-- src/backend/optimizer/plan/setrefs.c | 26 ++-- src/backend/optimizer/plan/subselect.c| 14 +-- src/backend/optimizer/prep/prepjointree.c | 4 ++-- src/backend/optimizer/prep/preptlist.c| 2 +- src/backend/optimizer/prep/prepunion.c| 4 ++-- src/backend/optimizer/util/tlist.c| 12 +- src/backend/parser/analyze.c | 2 +- src/backend/parser/gram.y | 2 +- src/backend/parser/parse_clause.c | 2 +- src/backend/parser/parse_expr.c | 2 +- src/backend/parser/parse_relation.c | 2 +- src/backend/parser/parse_utilcmd.c| 6 ++--- src/backend/rewrite/rewriteHandler.c | 8 +++ src/backend/rewrite/rewriteManip.c| 8 +++ src/backend/tcop/postgres.c | 6 ++--- src/backend/utils/cache/plancache.c | 14 +-- src/backend/utils/cache/relcache.c| 8 +++ src/include/nodes/nodes.h | 8 ++- src/include/optimizer/tlist.h | 4 ++-- src/include/pg_config.h.in| 6 + src/include/pg_config.h.win32 | 6 + 36 files changed, 178 insertions(+), 92 deletions(-) diff --git a/config/c-compiler.m4 b/config/c-compiler.m4 index 7d901e1f1a..7afaec5f85 100644 --- a/config/c-compiler.m4 +++ b/config/c-compiler.m4 @@ -178,6 +178,33 @@ fi])# PGAC_C_STATIC_ASSERT +# PGAC_C_TYPEOF +# - +# Check if the C compiler understands typeof or a variant. Define +# HAVE_TYPEOF if so, and define 'typeof' to the actual key word. 
+# +AC_DEFUN([PGAC_C_TYPEOF], +[AC_CACHE_CHECK(for typeof, pgac_cv_c_typeof, +[pgac_cv_c_typeof=no +for pgac_kw in typeof __typeof__ decltype; do + AC_COMPILE_IFELSE([AC_LANG_PROGRAM([], +[int x = 0; +$pgac_kw(x) y; +y = x; +return y;])], +[pgac_cv_c_typeof=$pgac_kw]) + test "$pgac_cv_c_typeof" != no && break +done]) +if test "$pgac_cv_c_typeof" != no; then + AC_DEFINE(HAVE_TYPEOF, 1, +[Define to 1 if your compiler understands `typeof' or something similar.]) + if test "$pgac_cv_c_typeof" != typeof; then +AC_DEFINE(typeof, $pgac_cv_c_typeof, [Define to how the compiler spells `typeof'.]) + fi +fi])# PGAC_C_TYPEOF + + + # PGAC_C_TYPES_COMPATIBLE # --- # Check if the C compiler understands __builtin_types_compatible_p, diff --git a/configure b/configure index b5cdebb510..47a1a59e8e 100755 --- a/configure +++ b/configure @@ -11399,6 +11399,46 @@ if test x"$pgac_cv__static_assert" = xyes ; then $as_echo "#define HAVE__STATIC_ASSERT 1" >>confdefs.h fi +{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for typeof" >&5 +$as_echo_n "checking for typeof... " >&6; } +if ${pgac_cv_c_typeof+:} false; then : + $as_echo_n "(cached) " >&6 +else + pgac_cv_c_typeof=no +for pgac_kw in typeof __typeof__ decltype; do + cat confdefs.h - <<_ACEOF >conftest.$ac_ext +/* end confdefs.h. */ + +int +main () +{ +int x = 0; +$pgac_kw(x) y; +y = x; +return y; + ; + return 0; +} +_ACEOF +if ac_fn_c_try_compile
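The typeof-cast trick that this patch wires up can be illustrated in isolation. The sketch below is not PostgreSQL code: `IntNode` and `copy_node()` are stand-ins for Node and copyObject(), and it uses `__typeof__` directly (GCC/Clang spelling), where the patch's configure test picks whichever spelling the compiler understands.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in "node" type and a generic copy function returning void *. */
typedef struct IntNode { int ival; } IntNode;

static void *
copy_node(const void *src, size_t size)
{
    void   *dst = malloc(size);

    memcpy(dst, src, size);
    return dst;
}

/*
 * The macro pattern under discussion: cast the void * result back to the
 * argument's own type.  Without the cast, assigning the result to an
 * unrelated pointer type would compile silently; with it, the compiler
 * checks the assignment.
 */
#define copyIntNode(obj) ((__typeof__(obj)) copy_node((obj), sizeof(*(obj))))
```

This mirrors why the commit message says new casts are needed only when assigning to a generic type such as `Node *`: in the common case the result type now matches the input type exactly.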
Re: [HACKERS] Cost model for parallel CREATE INDEX
On Wed, Mar 8, 2017 at 5:55 PM, Robert Haas wrote:
> I like to err on the side of the approach that requires fewer changes.
> That is, if the question is "does pg_restore need to treat this issue
> specially?" and the answer is unclear, I like to assume it probably
> doesn't until some contrary evidence emerges.
>
> I mean, sometimes it is clear that you are going to need special
> handling someplace, and then you have to do it. But I don't see that
> this is one of those cases, necessarily.

That's what I'll do, then.

--
Peter Geoghegan
Re: [HACKERS] exposing wait events for non-backends (was: Tracking wait event for latches)
Robert Haas writes:
> On Thu, Mar 9, 2017 at 2:30 PM, Peter Eisentraut wrote:
>> In practice, I think it's common to do a quick select * from
>> pg_stat_activity to determine whether a database instance is in use.

> I thought of the same kind of thing, and it was discussed upthread.
> There seemed to be more votes for keeping it all in one view, but that
> could change if more people vote.

I've not been paying much attention to this thread, but it seems like something that would help Peter's use-case and have other uses as well is a new column that distinguishes different process types --- user session, background worker, autovacuum worker, etc.

regards, tom lane
Re: [HACKERS] PATCH: Batch/pipelining support for libpq
Vaishnavi Prabakaran wrote: >if (PQbatchStatus(st->con) != PQBATCH_MODE_ON) > >But, it is better to use if (PQbatchStatus(st->con) == > PQBATCH_MODE_OFF) for this verification. Reason is there is one more state > in batch mode - PQBATCH_MODE_ABORTED. So, if the batch mode is in aborted > status, this check will still assume that the connection is not in batch > mode. > In a same way, "if(PQbatchStatus(st->con) == PQBATCH_MODE_ON)" check is > better to be modified as "PQbatchStatus(st->con) != PQBATCH_MODE_OFF". Agreed, these two tests need to be changed to account for the PQBATCH_MODE_ABORTED case. Fixed in new patch. > 2. @@ -2207,7 +2227,47 @@ doCustom(TState *thread, CState *st, StatsData > *agg) > + if (PQbatchStatus(st->con) != PQBATCH_MODE_OFF) > + { > + commandFailed(st, "already in batch mode"); > + break; > + } > > This is not required as below PQbatchBegin() internally does this > verification. > > + PQbatchBegin(st->con); The point of this test is to specifically forbid a sequence like this \beginbatch ...(no \endbatch) \beginbatch ... and according to the doc for PQbatchBegin, it looks like the return code won't tell us: "Causes a connection to enter batch mode if it is currently idle or already in batch mode. int PQbatchBegin(PGconn *conn); Returns 1 for success" My understanding is that "already in batch mode" is not an error. "Returns 0 and has no effect if the connection is not currently idle, i.e. it has a result ready, is waiting for more input from the server, etc. This function does not actually send anything to the server, it just changes the libpq connection state" > 3. It is better to check the return value of PQbatchBegin() rather than > assuming success. E.g: PQbatchBegin() will return false if the connection > is in copy in/out/both states. 
Given point #2 above, I think both tests are needed: if (PQbatchStatus(st->con) != PQBATCH_MODE_OFF) { commandFailed(st, "already in batch mode"); break; } if (PQbatchBegin(st->con) == 0) { commandFailed(st, "failed to start a batch"); break; } > > 3. In relation to #2, PQsendQuery() is not forbidden in batch mode > > although I don't think it can work with it, as it's based on the old > > protocol. > > > It is actually forbidden and you can see the error message "cannot > PQsendQuery in batch mode, use PQsendQueryParams" when you use > PQsendQuery(). Sorry for blaming the innocent piece of code, looking closer it's pgbench that does not display the message. With the simple query protocol, sendCommand() does essentially: r = PQsendQuery(st->con, sql); ... other stuff... if (r == 0) { if (debug) fprintf(stderr, "client %d could not send %s\n", st->id, command->argv[0]); st->ecnt++; return false; } So only in debug mode does it inform the user that it failed, and even then, it does not display PQerrorMessage(st->con). In non-debug mode it's opaque to the user. If it always fail, it appears to loop forever. The caller has this comment that is relevant to the problem: if (!sendCommand(st, command)) { /* * Failed. Stay in CSTATE_START_COMMAND state, to * retry. ??? What the point or retrying? Should * rather abort? */ return; } I think it doesn't bother anyone up to now because the immediate failure modes of PQsendQuery() are resource allocation or protocol failures that tend to never happen. Anyway currently \beginbatch avoids the problem by checking whether querymode == QUERY_SIMPLE, to fail gracefully instead of letting the endless loop happen. 
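The state distinction behind Vaishnavi's first review point can be modeled without libpq at all. The enum names below mirror the proposed batch-mode API, but this is a standalone model, not libpq: with three states, "in batch mode" must be tested as "not OFF", because an aborted batch is still batch mode.

```c
#include <assert.h>
#include <stdbool.h>

/* The three batch states discussed in the review (modeled locally). */
typedef enum
{
    PQBATCH_MODE_OFF,
    PQBATCH_MODE_ON,
    PQBATCH_MODE_ABORTED
} BatchStatus;

/* Correct test: any state other than OFF means we are in batch mode. */
static bool
in_batch_mode(BatchStatus s)
{
    return s != PQBATCH_MODE_OFF;
}

/* The variant the review flags: it misses the ABORTED state. */
static bool
in_batch_mode_buggy(BatchStatus s)
{
    return s == PQBATCH_MODE_ON;
}
```

The buggy variant would let pgbench issue a second \beginbatch inside an aborted batch, which is exactly the sequence the `!= PQBATCH_MODE_OFF` guard is meant to forbid.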
Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c index f6cb5d4..f93673e 100644 --- a/src/bin/pgbench/pgbench.c +++ b/src/bin/pgbench/pgbench.c @@ -269,7 +269,8 @@ typedef enum * * CSTATE_START_COMMAND starts the execution of a command. On a SQL * command, the command is sent to the server, and we move to -* CSTATE_WAIT_RESULT state. On a \sleep meta-command, the timer is set, +* CSTATE_WAIT_RESULT state unless in batch mode. +* On a \sleep meta-command, the timer is set, * and we enter the CSTATE_SLEEP state to wait for it to expire. Other * meta-commands are executed immediately. * @@ -1882,11 +1883,24 @@ sendCommand(CState *st, Command *command) if (commands[j]->type != SQL_COMMAND) continue; preparedStatementName(name, st->use_file, j); - res = PQprepare(st->con, name, - commands[j]->argv[0], commands[j]->argc - 1, NULL); - if
Re: [HACKERS] on_dsm_detach() callback and parallel tuplesort BufFile resource management
On Thu, Mar 9, 2017 at 11:19 AM, Peter Geoghegan wrote:
> That patch seems to be solving the problem by completely taking over
> management of temp files from fd.c. That is, these temp files are not
> marked as temp files in the way ordinary temp BufFiles are, with
> explicit buy-in from resowner.c about their temp-ness. It adds
> "#include "catalog/pg_tablespace.h" to buffile.c, even though that
> kind of thing generally lives within fd.c. It does use
> PG_TEMP_FILE_PREFIX, but never manages to use temp_tablespaces if
> that's set by user. It also doesn't do anything about temp_file_limit
> enforcement.

Actually, it's a bigger departure than I suggest here, even. Currently, literally no BufFile caller ever gets anything other than fd.c-wise temp files (those marked FD_TEMPORARY). This is the closest that buffile.c comes to allowing this:

#ifdef NOT_USED
/*
 * Create a BufFile and attach it to an already-opened virtual File.
 *
 * This is comparable to fdopen() in stdio.  This is the only way at present
 * to attach a BufFile to a non-temporary file.  Note that BufFiles created
 * in this way CANNOT be expanded into multiple files.
 */
BufFile *
BufFileCreate(File file)
{
    return makeBufFile(file);
}
#endif

IOW, I cannot see why 0007-hj-shared-buf-file-v6.patch leaves parallel hash join with BufFiles that are even capable of being expanded past 1GiB (1 segment), which is surely not acceptable -- it's a bug. Have I just missed something? (This is not to be confused with BufFiles that are interXact -- those are allowed, but won't work here either. This is why Thomas changed PHJ to not do it that way some months back.) At first I thought that it was okay because BufFiles are immutable/"readonly" once handed over from workers, but of course the PHJ SharedBufFile mechanism is up-front, and does all resource management in shared memory.
Thus, it must even create BufFiles in workers that have this only-one-segment restriction as a consequence of not being temp files (in the conventional sense: FD_TEMPORARY fd.c segments). I suggest that the MAX_PHYSICAL_FILESIZE stress-test I sometimes perform [1] be incorporated in Thomas' testing.

(Looks more carefully...) I now see that PathNameCreateFile() is being called, a new function which is largely just a new interface to the existing OpenTemporaryFileInTablespace() temp files. So, if you look carefully, you notice that you do in fact end up with FD_TEMPORARY fd.c segments here... so maybe temp_file_limit isn't broken, because, it turns out, 0007-hj-shared-buf-file-v6.patch is still *halfway* buying into the conventional idea of file temp-ness. But buffile.c is left behind ("file->isTemp == false"), so AFAICT it must still be broken simply because new segments cannot be written, per the BufFileCreate() comments quoted above.

This has become more of a PHJ review, which is off-topic for this thread. But my observations illustrate the difficulty with tying resource manager resources in local memory (temp segments, and associated BufFile(s)) to clean-up at shared memory segment detachment. It's possible, but ticklish enough that you'll end up with some wart or other when you're done. The question, then, is which wart is the most acceptable for parallel tuplesort? I see two plausible paths forward right now:

1. Selectively suppress the spurious ENOENT-on-unlink() LOG message in the brief window where it might be spurious. This is the easiest option. (There is less code this way.)

2. Do more or less what I've already drafted as my V9, which is described in the opening mail to this thread, while accepting the ugliness that that approach brings with it.
[1] postgr.es/m/cam3swzrwdntkhig0gyix_1muaypik3dv6-6542pye2iel-f...@mail.gmail.com

--
Peter Geoghegan
Re: [HACKERS] CREATE/ALTER ROLE PASSWORD ('value' USING 'method')
On Fri, Mar 10, 2017 at 12:00 AM, Joe Conway wrote:
> On 03/09/2017 06:34 AM, Robert Haas wrote:
>> On Thu, Mar 9, 2017 at 7:59 AM, Michael Paquier wrote:
>>> Yes, I agree with that for MD5, and after looking around I can see
>>> (like here http://prosody.im/doc/plain_or_hashed) as well that
>>> SCRAM-hashed is used. Now, there are as well references to the salt,
>>> like in protocol.sgml:
>>> "The salt to use when encrypting the password."
>>> Joe, do you think that in this case using the term "hashing" would be
>>> more appropriate? I would think so as we use it to hash the password.
>>
>> I'm not Joe, but I think that would be more appropriate.
>
> I am Joe and I agree ;-)

OK, I'll spawn a new thread on the matter.

--
Michael
Re: [HACKERS] Enabling replication connections by default in pg_hba.conf
On Thu, Mar 9, 2017 at 10:43 PM, Peter Eisentraut wrote:
> On 3/8/17 02:34, Michael Paquier wrote:
>> This patch looks good to me.
>
> committed

Thanks.

--
Michael
Re: [HACKERS] exposing wait events for non-backends (was: Tracking wait event for latches)
On Thu, Mar 9, 2017 at 2:30 PM, Peter Eisentraut wrote:
> Perhaps I'm confused by the title of this thread/CF entry, but
> background workers already do show up in pg_stat_activity. (You can
> verify that by testing the background sessions patch.) So which
> additional things are we aiming to see with this?

All the processes that don't normally show up in pg_stat_activity, such as auxiliary processes.

> In practice, I think it's common to do a quick select * from
> pg_stat_activity to determine whether a database instance is in use.
> (You always see your own session, but that's easy to eyeball.) If we
> add all the various background processes by default, that will make
> things harder, especially if there is no straightforward way to filter
> them out.
>
> Perhaps a pg_stat_user_* and pg_stat_system_* split like we have for
> some of the statistics tables would be useful?

I thought of the same kind of thing, and it was discussed upthread. There seemed to be more votes for keeping it all in one view, but that could change if more people vote.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
[HACKERS] compiler warning in set_tablefunc_size_estimates
I tried a non-cassert compile on a machine that has a pickier compiler than my laptop, and got:

costsize.c: In function ‘set_tablefunc_size_estimates’:
costsize.c:4574:17: error: variable ‘rte’ set but not used [-Werror=unused-but-set-variable]

That appears to be a legitimate gripe. Perhaps:

diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index e78f3a8..c23f681 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -4571,12 +4571,9 @@ set_function_size_estimates(PlannerInfo *root, RelOptInfo *rel)
 void
 set_tablefunc_size_estimates(PlannerInfo *root, RelOptInfo *rel)
 {
-	RangeTblEntry *rte;
-
 	/* Should only be applied to base relations that are functions */
 	Assert(rel->relid > 0);
-	rte = planner_rt_fetch(rel->relid, root);
-	Assert(rte->rtekind == RTE_TABLEFUNC);
+	Assert(planner_rt_fetch(rel->relid, root)->rtekind == RTE_TABLEFUNC);
 
 	rel->tuples = 100;

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
[HACKERS] alter enum add value regression
It's just been brought to my attention that the following, which works in current releases, is broken in master:

BEGIN;
CREATE TYPE test_enum AS ENUM ('v1','v2');
ALTER TYPE test_enum OWNER TO postgres;
CREATE TABLE test_table (test_col test_enum DEFAULT 'v1'::test_enum);
COMMIT;

The reason is shown, I think, in this comment in enum.c:

 * We also insist that the type's pg_type row not be HEAP_UPDATED. If it
 * is, we can't tell whether the row was created or only modified in the
 * apparent originating xact, so it might be older than that xact. (We do
 * not worry whether the enum value is HEAP_UPDATED; if it is, we might
 * think it's too new and throw an unnecessary error, but we won't allow
 * an unsafe case.)

Clearly this isn't just academic, since someone has tripped over it. This example is based on code in an existing extension install script. Now, the extension is probably somewhat broken because it assumes the existence of a "postgres" user, but that's beside the point. It's not clear to me that we can do anything about this, and the changes that we made address numerous complaints that we have had about not being able to add values to an enum in a transaction, so I'm very far from keen to revert it. But this is sure a pity. At the very least it deserves a release note.

cheers

andrew
Re: [HACKERS] adding an immutable variant of to_date
Thanks!

On 08.03.2017 17:37, Andreas Karlsson wrote:
> The current code for to_char will on the first call to to_char build
> arrays with the localized names of the week days and the months. I
> suspect that you may need to build something similar but a set of
> arrays per locale. Not sure what the most C-like way of solving this is.

Coming from a Python background, I would tend to use a dict. The keys are the locales, the values are the arrays. I would use something which is described here: http://stackoverflow.com/questions/3269881/how-to-represent-a-python-like-dictionary-in-c . Preferably something libc-related, like hcreate and hsearch.

> See the DCH_to_char function and its call to cache_locale_time.

Alright. Found the cache here: src/backend/utils/adt/pg_locale.c

/* lc_time localization cache */
char	   *localized_abbrev_days[7];
char	   *localized_full_days[7];
char	   *localized_abbrev_months[12];
char	   *localized_full_months[12];

/* */
void
cache_locale_time(void)
{
	char	   *save_lc_time;
	time_t		timenow;
	struct tm  *timeinfo;
	int			i;

#ifdef WIN32
	char	   *save_lc_ctype;
#endif

	/* did we do this already? */
	if (CurrentLCTimeValid)
		return;

I would replace the invalidation check with a hsearch lookup.
Grepping for the cache variables, I found:

~/src/postgres$ grep -r localized_abbrev_days *
src/backend/utils/adt/pg_locale.c:char *localized_abbrev_days[7];
src/backend/utils/adt/pg_locale.c: cache_single_time(&localized_abbrev_days[i], "%a", timeinfo);
src/backend/utils/adt/formatting.c: char *str = str_toupper_z(localized_abbrev_days[tm->tm_wday], collid);
src/backend/utils/adt/formatting.c: char *str = str_initcap_z(localized_abbrev_days[tm->tm_wday], collid);
src/backend/utils/adt/formatting.c: char *str = str_tolower_z(localized_abbrev_days[tm->tm_wday], collid);
src/include/utils/pg_locale.h:extern char *localized_abbrev_days[];

~/src/postgres$ grep -r localized_full_days *
src/backend/utils/adt/pg_locale.c:char *localized_full_days[7];
src/backend/utils/adt/pg_locale.c: cache_single_time(&localized_full_days[i], "%A", timeinfo);
src/backend/utils/adt/formatting.c: char *str = str_toupper_z(localized_full_days[tm->tm_wday], collid);
src/backend/utils/adt/formatting.c: char *str = str_initcap_z(localized_full_days[tm->tm_wday], collid);
src/backend/utils/adt/formatting.c: char *str = str_tolower_z(localized_full_days[tm->tm_wday], collid);
src/include/utils/pg_locale.h:extern char *localized_full_days[];

~/src/postgres$ grep -r localized_abbrev_months *
src/backend/utils/adt/pg_locale.c:char *localized_abbrev_months[12];
src/backend/utils/adt/pg_locale.c: cache_single_time(&localized_abbrev_months[i], "%b", timeinfo);
src/backend/utils/adt/formatting.c: char *str = str_toupper_z(localized_abbrev_months[tm->tm_mon - 1], collid);
src/backend/utils/adt/formatting.c: char *str = str_initcap_z(localized_abbrev_months[tm->tm_mon - 1], collid);
src/backend/utils/adt/formatting.c: char *str = str_tolower_z(localized_abbrev_months[tm->tm_mon - 1], collid);
src/include/utils/pg_locale.h:extern char *localized_abbrev_months[];

srkunze@mexico:~/src/postgres$ grep -r localized_full_months *
src/backend/utils/adt/pg_locale.c:char *localized_full_months[12];
src/backend/utils/adt/pg_locale.c:
cache_single_time(&localized_full_months[i], "%B", timeinfo);
src/backend/utils/adt/formatting.c: char *str = str_toupper_z(localized_full_months[tm->tm_mon - 1], collid);
src/backend/utils/adt/formatting.c: char *str = str_initcap_z(localized_full_months[tm->tm_mon - 1], collid);
src/backend/utils/adt/formatting.c: char *str = str_tolower_z(localized_full_months[tm->tm_mon - 1], collid);
src/include/utils/pg_locale.h:extern char *localized_full_months[];

It seems to me that I would need to make them point to the correct array address at the beginning of those functions by doing an hsearch. That step is basically independent of the actual immutable to_date idea.

What do you think?

Best,
Sven
Re: [HACKERS] adding an immutable variant of to_date
On 08.03.2017 03:37, Robert Haas wrote:
> The commit message for 64353640e87b54250f1b8a1d2a708d270dff4bfd has
> some interesting perspective on this.

"""
Also, mark to_date() and to_char(interval) as stable; although these appear not to depend on any GUC variables as of CVS HEAD, that seems a property unlikely to survive future improvements. It seems best to mark all the formatting functions stable and be done with it.
"""

My takeaway from this commit is the following: historically, the immutability of "to_date(text, text)" was an emergent feature. Once it was shown to be possibly mutable, the parsing feature took priority, so the immutability marking had to be removed. The proposed variant, on the other hand, would be immutable first, before everything else, so future improvements could not violate that. This might restrict the possible parsing functionality, but that shouldn't be a problem in practice.

Regards,
Sven
Re: [HACKERS] compiler warning in set_tablefunc_size_estimates
Robert Haas writes:
> I tried a non-cassert compile on a machine that has a pickier compiler
> than my laptop, and got:

> costsize.c: In function ‘set_tablefunc_size_estimates’:
> costsize.c:4574:17: error: variable ‘rte’ set but not used
> [-Werror=unused-but-set-variable]

> That appears to be a legitimate gripe. Perhaps:

I think PG_USED_FOR_ASSERTS_ONLY would be a better solution. It's only happenstance that the function currently uses the RTE just for this; if it grows another use, your approach would be harder to clean up.

regards, tom lane
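The alternative Tom suggests can be sketched outside the PostgreSQL tree. PG_USED_FOR_ASSERTS_ONLY lives in c.h and, on GCC-like compilers, boils down to `__attribute__((unused))`; the sketch below defines a local equivalent, and `fetch_kind()`/`size_estimate()` are stand-ins for `planner_rt_fetch()` and the costsize.c function, not the real code.

```c
#include <assert.h>

/*
 * Local stand-in for PG_USED_FOR_ASSERTS_ONLY: tells GCC/Clang not to
 * warn about a variable that is set but only consumed by assertions,
 * which disappear in a non-assert (NDEBUG) build.
 */
#define USED_FOR_ASSERTS_ONLY __attribute__((unused))

static int
fetch_kind(int relid)
{
    return relid > 0 ? 1 : 0;   /* stand-in for planner_rt_fetch() */
}

static int
size_estimate(int relid)
{
    int     kind USED_FOR_ASSERTS_ONLY;

    kind = fetch_kind(relid);
    assert(kind == 1);          /* the variable's only, assert-time, use */
    return 100;
}
```

The trade-off matches Robert's follow-up in the thread: the attribute keeps the variable (and the fetch) around, so the non-assert build still emits the call, whereas folding the fetch into the Assert lets it vanish entirely.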
Re: [HACKERS] contrib modules and relkind check
Amit, Michael,

* Amit Langote (langote_amit...@lab.ntt.co.jp) wrote:
> On 2017/03/09 11:51, Michael Paquier wrote:
>> OK, I am marking that as ready for committer.
>
> Thanks.

Thanks for this, I've pushed this now. I do have a few notes about changes that I made from your patch:

- Generally speaking, the user-facing functions come first in these .c files, with a prototype at the top for the static functions defined later on in the file. I went ahead and did that for the functions you added too.

- I added more comments to the regression tests; in particular, we usually comment when tests are expected to fail.

- I added some additional regression tests to cover more cases, particularly ones for things that weren't being tested at all.

- Not the fault of your patch, but there were cases where elog() was being used when it really should have been ereport(), so I changed those cases to all be, hopefully, consistent throughout.

Thanks again!

Stephen
Re: [HACKERS] Automatic cleanup of oldest WAL segments with pg_receivexlog
On Thu, Mar 9, 2017 at 10:54 PM, Peter Eisentraut wrote:
> On 3/7/17 11:16, Robert Haas wrote:
>> Well, if the problem you're trying to solve is "retain WAL for as long
>> as possible without running out of disk space and having everything go
>> kablooey", then it would solve that problem, and I think that's a very
>> reasonable problem to want to solve.
>
> Could be. I'm not sure what that means for the presented patch, though.
> Or whether it addresses Michael's original use case at all.

The patch adding an end-segment command does address my problem, because I just want to be sure that there is enough space left on disk for one complete segment. And it's fine to check for that when the last segment is complete. This needs some additional effort, but that's no big deal either. Having something like --limit-retained-segments partially addresses it, as long as there is a way to define an automatic mode, based on statvfs() obviously.

--
Michael
Re: [HACKERS] compiler warning in set_tablefunc_size_estimates
On Thu, Mar 9, 2017 at 4:39 PM, Tom Lane wrote:
> Robert Haas writes:
>> I tried a non-cassert compile on a machine that has a pickier compiler
>> than my laptop, and got:
>>
>> costsize.c: In function ‘set_tablefunc_size_estimates’:
>> costsize.c:4574:17: error: variable ‘rte’ set but not used
>> [-Werror=unused-but-set-variable]
>>
>> That appears to be a legitimate gripe.  Perhaps:
>
> I think PG_USED_FOR_ASSERTS_ONLY would be a better solution.  It's
> only happenstance that the function currently uses the RTE just
> for this; if it grows another use, your approach would be harder
> to clean up.

Yeah, we might have to revert the entire -4/+1 line patch. Actually, the thing I don't like about that is that then we're still emitting code for the planner_rt_fetch. That probably doesn't cost much, but why do it?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
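For readers unfamiliar with the macro under discussion, the pattern looks roughly like this — a simplified, standalone stand-in for PostgreSQL's PG_USED_FOR_ASSERTS_ONLY (the real definition lives in c.h and uses pg_attribute_unused(); the function and variable here are invented for illustration). It keeps the variable and its assignment, while telling the compiler not to warn when assertions are compiled out:

```c
#include <assert.h>

/*
 * Simplified stand-in for PostgreSQL's PG_USED_FOR_ASSERTS_ONLY macro.
 * In a non-assert build (NDEBUG), assert() vanishes, so the variable
 * would trigger -Wunused-but-set-variable; the attribute silences that
 * without deleting the variable, which is Tom's point: if the function
 * later grows another use of the variable, nothing needs to be undone.
 */
#ifdef NDEBUG
#define PG_USED_FOR_ASSERTS_ONLY __attribute__((unused))
#else
#define PG_USED_FOR_ASSERTS_ONLY
#endif

static double
estimate_rows(int relid)
{
	int		rte_kind PG_USED_FOR_ASSERTS_ONLY;

	rte_kind = relid % 4;		/* stand-in for planner_rt_fetch() */
	assert(rte_kind >= 0);		/* the variable's only use */

	return 100.0;				/* dummy estimate */
}
```

Robert's counterpoint is visible here too: with the macro approach, the assignment (the planner_rt_fetch() stand-in) is still executed even in non-assert builds, whereas deleting the variable would remove that work entirely.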
Re: [HACKERS] Performance issue after upgrading from 9.4 to 9.6
2017-03-09 18:28 GMT+01:00 Robert Haas:
> On Thu, Mar 9, 2017 at 7:47 AM, Naytro Naytro wrote:
>> We are having some performance issues after we upgraded to the newest
>> version of PostgreSQL; before that, everything was fast and smooth.
>>
>> The upgrade was done by pg_upgrade from 9.4 directly to 9.6.1. We have
>> since upgraded to 9.6.2 with no improvement.
>>
>> Some information about our setup: FreeBSD, Solaris (SmartOS), simple
>> master-slave using streaming replication.
>>
>> Problem:
>> Very high system CPU when the master is streaming replication data; CPU
>> goes up to 77%. Only one process is generating this load: the
>> postgresql startup process. When I attached truss to this process I
>> saw a lot of read calls with almost the same number of errors (EAGAIN).
>>
>> read(6,0x7fffa0c7,1) ERR#35 'Resource temporarily unavailable'
>>
>> Descriptor 6 is a pipe.
>>
>> The read call tries to read one byte over and over. I looked at the
>> source code and I think this file is responsible for this behavior:
>> src/backend/storage/ipc/latch.c. There was no such file in 9.4.
>
> Our latch implementation did get overhauled pretty thoroughly in 9.6;
> see primarily commit 98a64d0bd713cb89e61bef6432befc4b7b5da59e.  But I
> can't guess what is going wrong here based on this information.  It
> might help if you can pull some stack backtraces from the startup
> process.

Dtrace from Solaris: http://pastebin.com/u03uVKbr
Re: [HACKERS] Parallel Bitmap scans a bit broken
On 10 March 2017 at 06:17, Robert Haas wrote:
> On Thu, Mar 9, 2017 at 11:50 AM, Dilip Kumar wrote:
> > On Thu, Mar 9, 2017 at 10:02 PM, Dilip Kumar wrote:
> >> I slightly modified your query to reproduce this issue.
> >>
> >> explain analyze select * from r1 where value<555;
> >>
> >> Patch is attached to fix the problem.
> >
> > I forgot to mention the cause of the problem.
> >
> > if (istate->schunkptr < istate->nchunks)
> > {
> >     PagetableEntry *chunk = &ptbase[idxchunks[istate->schunkptr]];
> >     PagetableEntry *page = &ptbase[idxpages[istate->spageptr]];
> >     BlockNumber chunk_blockno;
> >
> > In the above if condition we have only checked istate->schunkptr <
> > istate->nchunks, which means we have some chunk left, so we are safe to
> > access idxchunks.  But just after that we are accessing
> > &ptbase[idxpages[istate->spageptr]] without checking whether accessing
> > idxpages is safe or not.
> >
> > tbm_iterator already handles this case; I broke it in
> > tbm_shared_iterator.
>
> I don't know if this is the only problem -- it would be good if David
> could retest -- but it's certainly *a* problem, so committed.

Thanks for committing, and generally parallelising more stuff. I confirm that my test case is now working again. I'll be in this general area today, so will mention if I stumble over anything that looks broken.

--
David Rowley                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services