Re: [WIP PATCH] Index scan offset optimisation using visibility map

2018-05-22 Thread Michail Nikolaev
exScan instead. > 3. The results are obtained on actual "sort of TPC-B" script? You can check the testing script ( https://gist.github.com/michail-nikolaev/23e1520a1db1a09ff2b48d78f0cde91d) for the SQL queries. But briefly: * Vanilla pg_bench initialization * ALTER TABLE pgbench_accou

[WIP PATCH] Index scan offset optimisation using visibility map

2018-01-31 Thread Michail Nikolaev
ont need index data) to avoid fetching tuples while they are just thrown away by nodeLimit. The patch is also available on GitHub: https://github.com/michail-nikolaev/postgres/commit/a368c3483250e4c02046d418a27091678cb963f4?diff=split And some tests here: https://gist.github.com/michail-nikol

Re: Contention preventing locking

2018-02-16 Thread Michail Nikolaev
Hello. Just want to notice - this work also correlates with https://www.postgresql.org/message-id/CAEepm%3D18buPTwNWKZMrAXLqja1Tvezw6sgFJKPQ%2BsFFTuwM0bQ%40mail.gmail.com paper. It provides a more general way to address the issue compared to individual optimisations (but they could do the work too,

Re: [HACKERS] Can ICU be used for a database's default sort order?

2018-02-18 Thread Michail Nikolaev
Hello. Just want to inform: I have run the check, installcheck, plcheck, contribcheck, modulescheck, ecpgcheck, isolationcheck, and upgradecheck tests on Windows 10, VC2017 with the patch applied on top of 2a41507dab0f293ff241fe8ae326065998668af8, as Andrey asked me. Everything is passing with and without

Re: [WIP PATCH] Index scan offset optimisation using visibility map

2018-07-16 Thread Michail Nikolaev
-nikolaev/postgres/compare/e3eb8be77ef82ccc8f87c515f96d01bf7c726ca8...michail-nikolaev:index_only_fetch?ts=4 Sat, 14 Jul 2018 at 0:17, Heikki Linnakangas: > On 21/05/18 18:43, Michail Nikolaev wrote: > > Hello everyone. > > This letter is related to “Extended support for index-onl

Re: [WIP PATCH] Index scan offset optimisation using visibility map

2018-03-06 Thread Michail Nikolaev
Hello, Andrey. Thanks for the review. I have updated the comments according to your review and also renamed some fields for consistency. Additionally, some notes were added to the documentation. The updated patch is attached; GitHub is updated too. offset_index_only_v3.patch Description: Binary data

Re: [WIP PATCH] Index scan offset optimisation using visibility map

2018-03-12 Thread Michail Nikolaev
Hello. > Sorry, seems like I've incorrectly expressed what I wanted to say. > I mean this Assert() can be checked before the loop, not on every loop cycle. Yes, I understood it. The condition should be checked on every cycle - at least it is done this way for index only scan:

Re: [WIP PATCH] Index scan offset optimisation using visibility map

2018-03-13 Thread Michail Nikolaev
Hello. Tom, thanks a lot for your thorough review. > What you've done to > IndexNext() is a complete disaster from a modularity standpoint: it now > knows all about the interactions between index_getnext, index_getnext_tid, > and index_fetch_heap. I was looking into the current IndexOnlyNext

Re: [WIP PATCH] Index scan offset optimisation using visibility map

2018-03-10 Thread Michail Nikolaev
Hello. Andrey, Tels - thanks for review. > It could be named "SkipTuples" (e.g. this is the number of tuples we need > to skip, not the number we have skipped), and the other one then > "iss_SkipTuplesRemaining" so they are consistent with each other. Agreed, done. > Also, I think that this

Re: [WIP PATCH] Index scan offset optimisation using visibility map

2018-03-20 Thread Michail Nikolaev
Hello everyone. I need some advice. I was reworking the patch: added support for the planner, added support for queries with projection, added support for predicates which could be executed over index data. And.. I realized that my IndexScan is even more index-only than the original

Re: New gist vacuum.

2018-02-25 Thread Michail Nikolaev
> I'll attach patch file next message. Updated patch is attached. gist-vacuum-count.patch Description: Binary data

Re: [WIP PATCH] Index scan offset optimisation using visibility map

2018-10-02 Thread Michail Nikolaev
orking on it and was digging into it just last night ( https://github.com/michail-nikolaev/postgres/commits/index_only_fetch ) The main issue currently is cost estimation. In the right case (10m relation, 0.5 index correlation, 0.1 selectivity for filter) it works like a charm with a 200%-400% performance

Re: txid_status returns NULL for recently committed transactions

2018-09-26 Thread Michail Nikolaev
template1302786564 4157344464 project 695176837 3764954191 As far as I remember, the master and replicas were not rebooted after upgrading from 9.6 to 10. So, maybe the issue is upgrade-related. Tue, 25 Sep 2018 at 22:22, Michail Nikolaev: > Hi, thanks for the reply! > > >

Re: txid_status returns NULL for recently committed transactions

2018-09-25 Thread Michail Nikolaev
Hi, thanks for the reply! > What are you using it for? I want to use it to validate the status of related entities in another database (queue) within a short interval after a PG transaction commit/rollback. > I can't reproduce that... Yes, it happens only with one cluster. All others work as expected. >

txid_status returns NULL for recently committed transactions

2018-09-25 Thread Michail Nikolaev
Hello everyone. I see strange behaviour of txid_status with one of my production servers. SELECT txid_status(txid_current()) -> NULL SELECT txid_current() -> 4447342811 It also returns NULL for recently committed and aborted transactions. SELECT datname, age(datfrozenxid), datfrozenxid FROM

Re: txid_status returns NULL for recently committed transactions

2018-09-25 Thread Michail Nikolaev
Both txids are made up and not valid. The first 'committed' happens because of int32 overflow: TransactionIdPrecedes(59856482, 2207340131) == false Could anyone help me understand txid_status behaviour? Thanks, Michail. Tue, 25 Sep 2018 at 19:47, Michail Nikolaev: > Hello eve
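For context, the wraparound comparison mentioned above can be modelled in a few lines of Python. This is a simplified sketch of the modulo-2^32 comparison PostgreSQL uses; the real TransactionIdPrecedes also special-cases permanent xids, which is omitted here:

```python
def transaction_id_precedes(id1, id2):
    # PostgreSQL compares 32-bit xids on a circle: the signed 32-bit
    # difference (id1 - id2) decides which xid counts as "older".
    diff = (id1 - id2) & 0xFFFFFFFF
    if diff >= 0x80000000:
        diff -= 0x100000000  # reinterpret as signed int32
    return diff < 0

# An ordinary case: 100 precedes 200.
print(transaction_id_precedes(100, 200))              # True
# The surprising case from the message above: the signed 32-bit
# difference overflows, so the numerically smaller xid is treated
# as the *newer* one.
print(transaction_id_precedes(59856482, 2207340131))  # False
```

This is why two xids more than 2^31 apart compare "backwards", which can make txid_status report an unexpected result for stale or invalid xids.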

Re: Synchronous replay take III

2018-12-02 Thread Michail Nikolaev
Hello. It is a really nice feature. I am working on a project which reads heavily from replicas (6 of them). In our case we have implemented a kind of "replication barrier" functionality based on a table with counters (one counter per application backend in the simple case). Each application
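For illustration, the counter scheme described above could be modelled roughly like this (a toy in-memory simulation; the class and method names are invented for this sketch, not the actual implementation, which uses a real table and per-replica connections):

```python
class ReplicationBarrier:
    """Toy model: one counter per application backend. A backend bumps
    its counter together with its writes; a replica is known to be
    fresh enough once it has replayed that counter value."""
    def __init__(self):
        self.primary = {}   # backend -> last committed counter value
        self.replicas = {}  # replica -> {backend -> replayed value}

    def commit_write(self, backend):
        # Bump the backend's counter as part of the write transaction;
        # the returned token is what the replica must catch up to.
        self.primary[backend] = self.primary.get(backend, 0) + 1
        return self.primary[backend]

    def replay(self, replica, backend, value):
        # Simulates WAL replay of the counter update on a replica.
        self.replicas.setdefault(replica, {})[backend] = value

    def can_read(self, replica, backend, token):
        # Safe to read this backend's writes from this replica?
        return self.replicas.get(replica, {}).get(backend, 0) >= token

barrier = ReplicationBarrier()
token = barrier.commit_write("backend-1")
print(barrier.can_read("replica-a", "backend-1", token))  # False, not replayed yet
barrier.replay("replica-a", "backend-1", token)
print(barrier.can_read("replica-a", "backend-1", token))  # True
```

A per-backend counter avoids contention between backends while still giving each backend a read-your-writes guarantee on whichever replica has caught up.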

Re: Synchronous replay take III

2019-01-30 Thread Michail Nikolaev
Hello, Sorry, I missed the email. >> In our case we have implemented a kind of "replication barrier" functionality based on a table with counters (one counter per application backend in the simple case). >> Each application backend has a dedicated connection to each replica. And it selects its counter

Re: Optimize single tuple fetch from nbtree index

2019-08-04 Thread Michail Nikolaev
Hello everyone. I was also looking into the possibility of such an optimisation a few days ago (an attempt to reduce memcpy overhead on IndexOnlyScan). One thing I noticed here - the whole page is scanned only if the index quals are "opened" at some side. So, in the case of SELECT * FROM tbl WHERE k=:val AND

thoughts on "prevent wraparound" vacuum

2019-07-20 Thread Michail Nikolaev
Hello. Currently I am working a lot with a cluster consisting of a few big tables, about 2-3 TB. These tables are heavily updated, some rows are removed, new rows are inserted... a kind of typical OLTP workload. The physical table size stays mostly stable while regular VACUUM is working. It is fast enough

Re: thoughts on "prevent wraparound" vacuum

2019-07-20 Thread Michail Nikolaev
Hello. >- Which version of postgres is this? Newer versions avoid scanning > unchanged parts of the heap even for freezing (9.6+, with additional > smaller improvements in 11). Oh, I totally forgot about the version and settings... server_version 10.9 (Ubuntu 10.9-103) So, "don't vacuum all-frozen

Re: BUG #16108: Colorization to the output of command-line has unproperly behaviors at Windows platform

2020-02-26 Thread Michail Nikolaev
Hello. Looks totally fine to me now. So, I need to mark it as "ready for committer", right?

Re: Thoughts on "killed tuples" index hint bits support on standby

2020-01-24 Thread Michail Nikolaev
Hello again. Andres, Peter, thanks for your comments. Some of the issues you mentioned (reporting feedback to another cascade standby, processing queries after restart when a newer xid was already reported) could be fixed in the provided design, but your intention to have "independent correctness

Re: BUG #16108: Colorization to the output of command-line has unproperly behaviors at Windows platform

2020-02-18 Thread Michail Nikolaev
P.S. Also, should we enable vt100 mode in case of PG_COLOR=always? I think yes.

Re: BUG #16108: Colorization to the output of command-line has unproperly behaviors at Windows platform

2020-02-18 Thread Michail Nikolaev
Hello everyone. > Please find attached a version that supports older Mingw versions and SDKs. I have checked the patch source code and it seems to be working. But there are a few points I want to mention: I think it is not a good idea to mix the logic of detecting the fact of a TTY with enabling of the

Re: Disallow cancellation of waiting for synchronous replication

2020-02-20 Thread Michail Nikolaev
Hello. Just want to share some thoughts about how it looks from the perspective of a high-availability web-service application developer. Because sometimes things look different from other sides. And everything looks like a disaster, to be honest. But let's take it one at a time. First - the problem

[PATCH] Comments related to "Take fewer snapshots" and "Revert patch for taking fewer snapshots"

2020-02-10 Thread Michail Nikolaev
Hello, hackers. Yesterday I noticed that in simple protocol mode a snapshot is taken twice - first for parsing/analysis and later for execution. I thought it would be a great idea to reuse the same snapshot. After some (not short) time I was able to find this thread from 2011 with exactly

Re: BUG #16108: Colorization to the output of command-line has unproperly behaviors at Windows platform

2020-02-22 Thread Michail Nikolaev
Hello. > The patch about making color by default [1] introduces the function > terminal_supports_color(), that I think is relevant for this issue. Please > find attached a new version based on that idea. I am not sure it is a good idea to mix both patches because it adds some confusion and makes

Thoughts on "killed tuples" index hint bits support on standby

2020-01-16 Thread Michail Nikolaev
Hello, hackers. Currently, hint bits in the index pages (dead tuples) are set and taken into account only on the primary server. The standby just ignores them. It is done for reasons, of course (see RelationGetIndexScan and [1]): * We do this because the xmin on the primary node could easily be

Re: Thoughts on "killed tuples" index hint bits support on standby

2020-04-09 Thread Michail Nikolaev
Hello, Peter. > Let me make sure I understand your position: > You're particularly concerned about cases where there are relatively > few page splits, and the standby has to wait for VACUUM to run on the > primary before dead index tuples get cleaned up. The primary itself > probably has no

Re: Thoughts on "killed tuples" index hint bits support on standby

2020-04-08 Thread Michail Nikolaev
Hello, Peter. Thanks for your feedback. > Attached is a very rough POC patch of my own, which makes item > deletion occur "non-opportunistically" in unique indexes. The idea is > that we exploit the uniqueness property of unique indexes to identify > "version churn" from non-HOT updates. If any

[PATCH] Btree BackwardScan race condition on Standby during VACUUM

2020-03-16 Thread Michail Nikolaev
Hello, hackers. -- ABSTRACT -- There is a race condition between btree_xlog_unlink_page and _bt_walk_left. Many versions are affected, including 12 and the upcoming 13. It happens only on standby. It seems it could not cause invalid query results. -- REMARK -- While working on

Re: [PATCH] Btree BackwardScan race condition on Standby during VACUUM

2020-04-05 Thread Michail Nikolaev
Hello, Peter. > I added > something about this to the nbtree README in commit 9f83468b353. I have added some updates to your notes in the updated patch version. I was also trying to keep the original wrapping of the paragraph, so the patch looks too wordy. Thanks, Michail.

Re: [PATCH] Btree BackwardScan race condition on Standby during VACUUM

2020-03-27 Thread Michail Nikolaev
Hello. > Probably, patch in this thread should fix this in btree_xlog_split() too? I have spent some time trying to find any possible race condition between btree_xlog_split and _bt_walk_left… But I can’t find any. Also, I have tried to cause any issue by putting pg_sleep into

[PATCH] hs_standby_disallowed test fix

2020-05-11 Thread Michail Nikolaev
Hello. There is a recent commit about changes in the way read-only commands are prevented from being executed [1]. It seems like the hs_standby_disallowed test is broken now. So, a simple patch to fix the test is attached. Thanks, Michail. [1]

Re: [PATCH] Btree BackwardScan race condition on Standby during VACUUM

2020-08-02 Thread Michail Nikolaev
Hello, Peter. > Attached is a revised version of your patch Thanks for your work, the patch is looking better now. Michail.

Re: [PATCH] Btree BackwardScan race condition on Standby during VACUUM

2020-07-10 Thread Michail Nikolaev
Hello, Peter. Thanks for the update. Yes, it is the right decision. I only started to spot that bug while working on a faster scan using hint bits on replicas [1], so it is unlikely to be hit in production at the moment. Thanks, Michail. [1]:

Re: Improving connection scalability: GetSnapshotData()

2020-06-07 Thread Michail Nikolaev
Hello, hackers. Andres, nice work! Sorry for going off-topic. Some of my work [1] related to the support of index hint bits on standby interferes heavily with this patch. Is it safe to consider it committed and start rebasing on top of the patches? Thanks, Michail. [1]:

Re: Why latestRemovedXid|cutoff_xid are always sent?

2021-01-08 Thread Michail Nikolaev
Hello, Peter. Thanks for your explanation. One of the reasons I was asking is an idea to use the same technique in the "LP_DEAD index hint bits on standby" WIP patch to reduce the amount of additional WAL. Now I am sure such an optimization should work correctly. Thanks, Michail.

Re: [PATCH] Full support for index LP_DEAD hint bits on standby

2021-01-21 Thread Michail Nikolaev
Hello, everyone. Oh, I just realized that it seems like I was too naive in allowing the standby to set LP_DEAD bits this way. There is a possible consistency problem in the case of a low minRecoveryPoint value (because hint bits do not move PageLSN forward). Something like this: LSN=10 STANDBY INSERTS

[PATCH] Full support for index LP_DEAD hint bits on standby

2021-01-18 Thread Michail Nikolaev
Hello, hackers. [ABSTRACT] Execution of queries on hot standby is one of the most popular ways to scale application workload. Most modern Postgres installations have two standby nodes for high-availability support. So, utilization of the replicas' CPU seems to be a reasonable idea. At the

Why latestRemovedXid|cutoff_xid are always sent?

2021-01-02 Thread Michail Nikolaev
Hello, hackers. Working on some stuff, I realized I do not understand why latestRemovedXid|cutoff_xid (in different types of WAL records) are sent every time they appear on the primary side. latestRemovedXid|cutoff_xid is used to call ResolveRecoveryConflictWithSnapshot and cancel conflicting

Re: Thoughts on "killed tuples" index hint bits support on standby

2021-01-27 Thread Michail Nikolaev
Hello, hackers. Sorry for necroposting, but if someone is interested - I hope the patch is ready now and available in the other thread (1). Thanks, Michail. [1] https://www.postgresql.org/message-id/flat/CANtu0oiP18H31dSaEzn0B0rW6tA_q1G7%3D9Y92%2BUS_WHGOoQevg%40mail.gmail.com

Re: [PATCH] Full support for index LP_DEAD hint bits on standby

2021-01-27 Thread Michail Nikolaev
Hello, hackers. I think I was able to fix the issue related to minRecoveryPoint and crash recovery. To make sure the standby will be consistent after crash recovery, we need to take the current value of minRecoveryPoint into account while setting LP_DEAD hints (almost the same way as it is done for
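The core of the guard can be sketched very roughly as follows. This is an illustrative simplification with a hypothetical function name, not the patch's actual logic, which is considerably more involved:

```python
def lp_dead_safe_on_standby(source_record_lsn, min_recovery_point):
    # A hint bit written without WAL is only safe to set on a standby
    # if crash recovery, which replays WAL at least up to
    # minRecoveryPoint, is guaranteed to see again the record that
    # justified the hint. If that record lies beyond minRecoveryPoint,
    # the standby could come up in a state where the tuple is not yet
    # dead while the on-disk page already claims it is.
    return source_record_lsn <= min_recovery_point

print(lp_dead_safe_on_standby(10, 20))  # True: record is replayed on crash
print(lp_dead_safe_on_standby(30, 20))  # False: hint could outrun recovery
```

The key point is that page LSN does not advance for hint-only changes, so the comparison has to be made against the record that made the tuples dead rather than against the page itself.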

Re: Thoughts on "killed tuples" index hint bits support on standby

2021-01-28 Thread Michail Nikolaev
Hello, Peter. > I wonder if it would help to not actually use the LP_DEAD bit for > this. Instead, you could use the currently-unused-in-indexes > LP_REDIRECT bit. Hm… Sounds very promising - an additional bit is a lot in this situation. > Whether or not "recently dead" means "dead to my >

Re: Thoughts on "killed tuples" index hint bits support on standby

2021-01-30 Thread Michail Nikolaev
Hello, Peter. > Yeah, it would help a lot. But those bits are precious. So it makes > sense to think about what to do with both of them in index AMs at the > same time. Otherwise we risk missing some important opportunity. Hm. I was trying to "expand the scope" as you said and got an idea...

Re: Thoughts on "killed tuples" index hint bits support on standby

2021-02-02 Thread Michail Nikolaev
Hello, Peter. > AFAICT that's not true, at least not in any practical sense. See the > comment in the middle of MarkBufferDirtyHint() that begins with "If we > must not write WAL, due to a relfilenode-specific...", and see the > "Checksums" section at the end of src/backend/storage/page/README.

Re: Slow standby snapshot

2021-06-13 Thread Michail Nikolaev
Hello. > I recently ran into a problem in one of our production postgresql cluster. > I had noticed lock contention on procarray lock on standby, which causes WAL > replay lag growth. Yes, I saw the same issue on my production cluster. > 1) set max_connections to big number, like 10 I

Re: [PATCH] Full support for index LP_DEAD hint bits on standby

2021-05-13 Thread Michail Nikolaev
Hello. Added a check for standby promotion with a long transaction to the test (code and docs are unchanged). Thanks, Michail. From c5e1053805c537b50b0922151bcf127754500adb Mon Sep 17 00:00:00 2001 From: Michail Nikolaev Date: Fri, 14 May 2021 00:32:30 +0300 Subject: [PATCH v3 3/4] test

Re: [PATCH] Full support for index LP_DEAD hint bits on standby

2021-05-07 Thread Michail Nikolaev
Hello, Antonin. > I'm trying to review the patch, but not sure if I understand this problem, > please see my comment below. Thanks a lot for your attention. It is strongly recommended to look at version N3 (1) because it is a much more elegant, easy, and reliable solution :) But the

Re: [PATCH] Full support for index LP_DEAD hint bits on standby

2021-05-10 Thread Michail Nikolaev
Hello, Antonin. > Sorry, I missed the fact that your example can be executed inside BEGIN - END > block, in which case minRecoveryPoint won't advance after each command. No, the block is not executed as a single transaction; all commands are separate transactions (see below) > Actually I think

Re: [PATCH] Full support for index LP_DEAD hint bits on standby

2021-05-12 Thread Michail Nikolaev
e." Fixed. Updated version in attach. Thanks a lot, Michail. From 004b2dea9b700d890147b840573bb5b796c1f96a Mon Sep 17 00:00:00 2001 From: Michail Nikolaev Date: Wed, 12 May 2021 22:56:18 +0300 Subject: [PATCH v2 4/4] doc --- src/backend/access/nbtree/README | 35 ++-

Re: Slow standby snapshot

2021-07-11 Thread Michail Nikolaev
Hello, Kirill. > Also, maybe it is better to reduce the invasivity by using a more > simple approach. For example, use the first bit to mark xid as valid > and the last 7 bit (128 values) as an optimistic offset to the next > valid xid (jump by 127 steps in the worse scenario). > What do you

Re: Thoughts on "killed tuples" index hint bits support on standby

2021-02-10 Thread Michail Nikolaev
Hello, Peter. If you are interested, a possible patch (based on an FPI mask during replay) was sent with some additional explanation and graphics to (1). At the moment I am unable to find any "incorrectness" in it. Thanks again for your comments. Michail. [1]

Re: Thoughts on "killed tuples" index hint bits support on standby

2021-02-01 Thread Michail Nikolaev
Hello, Peter. Thanks a lot for your comments. Here are some of my thoughts related to the “masked bits” solution and your comments: > During recovery, we will probably always have to consider the > possibility that LP_DEAD bits that get set on the primary may be > received by a replica through

Re: Slow standby snapshot

2021-08-02 Thread Michail Nikolaev
Hello. > I have tried such an approach but looks like it is not effective, > probably because of CPU caching issues. That was my mistake. I have repeated the approach and got good results with a small and non-invasive patch. The main idea is a simple optimistic optimization - store the offset to
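The optimistic-offset idea can be illustrated with a toy model. This is illustrative Python only; the real code is the C of KnownAssignedXidsGetAndSetXmin, where hint maintenance is lock- and memory-ordering-sensitive, and the names below are invented:

```python
def snapshot_scan(xids, valid, next_offset):
    """Collect valid xids, skipping runs of invalidated entries via
    lazily-maintained next_offset hints; hints are tightened as a
    side effect, so later scans jump over known holes in one step."""
    result = []
    i = 0
    while i < len(xids):
        if valid[i]:
            result.append(xids[i])
            i += 1
        else:
            start = i
            i += max(1, next_offset[i])        # jump using the hint
            while i < len(xids) and not valid[i]:
                i += 1                          # walk the rest of the hole
            next_offset[start] = i - start      # remember for the next scan
    return result

xids = [10, 11, 12, 13, 14, 15]
valid = [True, False, False, False, True, True]
hints = [1] * len(xids)
print(snapshot_scan(xids, valid, hints))  # [10, 14, 15]
print(hints[1])  # 3: the next scan skips the whole hole at once
```

Because adding and invalidating xids stays cheap (hints start at 1 and are only improved by scans), the cost of maintaining the structure is paid by the snapshot builders that actually benefit from it.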

Re: Slow standby snapshot

2021-08-09 Thread Michail Nikolaev
Hello, Andres. Thanks for the feedback again. > The problem is that we don't want to add a lot of work > KnownAssignedXidsAdd/Remove, because very often nobody will build a snapshot > for that moment and building a sorted, gap-free, linear array of xids isn't > cheap. In my experience it's more

Re: Slow standby snapshot

2021-08-03 Thread Michail Nikolaev
Hello, Andres. Thanks for your feedback. >> Maybe use a hashtable of running transactions? It will be slightly faster >> when adding\removing single transactions. But much worse when doing >> KnownAssignedXidsRemove(). > Why would it be worse for KnownAssignedXidsRemove()? Were you intending to

Re: [PATCH] Full support for index LP_DEAD hint bits on standby

2021-09-29 Thread Michail Nikolaev
nt bits itself. It is tested on a different standby, see > is(hints_num($node_standby_2), qq(10), 'index hint bits already set on second standby 2'); Also, I added checks for BTP_LP_SAFE_ON_STANDBY to make sure everything in the test goes according to the scenario. Thanks a lot, Michail. From cfb45d1a9cbf30be6098b2

Re: Slow standby snapshot

2021-10-02 Thread Michail Nikolaev
Hello, Andres. Could you please clarify how best to deal with the situation? According to your previous letter, I think there was some misunderstanding regarding the latest patch version (but I am not sure), because as far as I understand the provided optimization (lazily calculated optional offset

Re: Slow standby snapshot

2021-11-21 Thread Michail Nikolaev
e and some additional investigation had been done. So, I think I’ll re-add the patch to the commitfest app. Thanks, Michail From 94cd2fbf37b5f0b824e0f9a9bc23f762a8bb19b5 Mon Sep 17 00:00:00 2001 From: Michail Nikolaev Date: Sun, 21 Nov 2021 21:23:02 +0300 Subject: [PATCH v3 1/2] memory barrier instead

Re: Slow standby snapshot

2021-11-22 Thread Michail Nikolaev
Hello, Andrey. > Write barrier must be issued after write, not before. > Don't we need to issue read barrier too? The write barrier is issued after the changes to KnownAssignedXidsNext and KnownAssignedXidsValid arrays and before the update of headKnownAssignedXids. So, it seems to be correct.

Re: [PATCH] Full support for index LP_DEAD hint bits on standby

2021-11-09 Thread Michail Nikolaev
I have changed approach, so it is better to start from this email: https://www.postgresql.org/message-id/flat/CANtu0ohHu1r1xQfTzEJuxeaOMYncG7xRxUQWdH%3DcMXZSf%2Bnzvg%40mail.gmail.com#4c81a4d623d8152f5e8889e97e750eec

Re: [PATCH] Full support for index LP_DEAD hint bits on standby

2021-11-09 Thread Michail Nikolaev
Woo-hoo :) > Attached is a proposal for a minor addition that would make sense to me, add > it if you think it's appropriate. Yes, I'll add to the patch. > I think I've said enough, changing the status to "ready for committer" :-) Thanks a lot for your help and attention! Best regards,

Re: [PATCH] Full support for index LP_DEAD hint bits on standby

2021-11-09 Thread Michail Nikolaev
s feature in the commitfest app works in a different way :) Best regards, Michail. From 02b0dd27944c37007d8a92905a14e6b3e8e50fa8 Mon Sep 17 00:00:00 2001 From: Michail Nikolaev Date: Tue, 9 Nov 2021 21:43:58 +0300 Subject: [PATCH v6 2/3] test --- src/test/recovery/Makefile| 1 + .../r

Re: Slow standby snapshot

2021-11-09 Thread Michail Nikolaev
ot lean on the compiler here because of `volatile` args. Also, I have added some comments. Best regards, Michail. From 1d55c6fae8cc160eadd705da0d70d9e7fb5bc00f Mon Sep 17 00:00:00 2001 From: Michail Nikolaev Date: Wed, 10 Nov 2021 00:02:18 +0300 Subject: [PATCH v2] known assignment xid next

Re: [PATCH] Full support for index LP_DEAD hint bits on standby

2021-11-05 Thread Michail Nikolaev
be” LSN-related logic to the test. Thanks a lot, Michail. From f8a87a2329e81b55b484547dd50edfd97a722ad2 Mon Sep 17 00:00:00 2001 From: Michail Nikolaev Date: Fri, 5 Nov 2021 19:28:12 +0300 Subject: [PATCH v5 3/3] doc --- src/backend/access/nbtree/README | 35 ++-- sr

Re: Logical replication error "no record found" /* shouldn't happen */

2021-07-24 Thread Michail Nikolaev
Hello. I saw this error multiple times trying to replicate the 2-3 TB server (version 11 to version 12). I was unable to find any explanation for this error. Thanks, Michail.

Re: Stream replication test fails of cfbot/windows server 2019

2022-01-12 Thread Michail Nikolaev
Hello. Looks like logical replication is also affected: [08:26:35.599] # poll_query_until timed out executing this query: [08:26:35.599] # SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's'); [08:26:35.599] # expecting this output: [08:26:35.599] # t [08:26:35.599] #

Re: Windows vs recovery tests

2022-01-12 Thread Michail Nikolaev
Hello. It could also be related - https://www.postgresql.org/message-id/flat/20220112112425.pgzymqcgdy62e7m3%40jrouhaud#097b54a539ac3091ca4e4ed8ce9ab89c (both Windows and Linux cases). Best regards, Michail.

Re: [PATCH] Full support for index LP_DEAD hint bits on standby

2022-01-15 Thread Michail Nikolaev
> I'm switching this patch on Waiting on Author. I have tested it multiple times in my GitHub repo; it seems to be stable now. Switching back to Ready for Committer. Best regards. Michail. From 9372bac9b56d27cf993e9d1fa66127c86b51f25c Mon Sep 17 00:00:00 2001 From: Michail Nikolaev Date: Sat, 15 Ja

Re: Slow standby snapshot

2022-02-20 Thread Michail Nikolaev
Hello, Andrey. Thanks for your efforts. > Patch on barrier seems too complicated to me right now. I’d propose to focus > on KnowAssignedXidsNext patch: it’s clean, simple and effective. I'll extract it into a separate patch later. > I’ve rebased the patch so that it does not depend on

Re: Patch proposal - parameter to limit amount of FPW because of hint bits per second

2022-03-24 Thread Michail Nikolaev
Hello, Peter. >> * Add code to _bt_killitems() that detects if it has generated an FPI, >> just to set some LP_DEAD bits. >> * Instead of avoiding the FPI when this happens, proactively call >> _bt_simpledel_pass() just before _bt_killitems() returns. Accept the >> immediate cost of setting an

Re: Slow standby snapshot

2022-03-31 Thread Michail Nikolaev
Hello. Just an updated commit message. Thanks, Michail. From 934d649a18c66f8b448463e57375865b33ce53e7 Mon Sep 17 00:00:00 2001 From: nkey Date: Fri, 1 Apr 2022 02:17:55 +0300 Subject: [PATCH v5] Optimize KnownAssignedXidsGetAndSetXmin by maintaining offset to next valid xid. MIME-Version: 1.0

Re: [PATCH] Full support for index LP_DEAD hint bits on standby

2022-03-31 Thread Michail Nikolaev
Hello, Peter. > The simple answer is: I don't know. I could probably come up with a > better answer than that, but it would take real effort, and time. I remember you had an idea about using the LP_REDIRECT bit in btree indexes as some kind of “recently dead” flag (1). Is this idea still in

Re: [PATCH] Full support for index LP_DEAD hint bits on standby

2022-03-31 Thread Michail Nikolaev
Hello, David. Thanks for your review! > As a specific recommendation here - submit patches with a complete commit > message. > Tweak it for each new version so that any prior discussion that informed the > general design of > the patch is reflected in the commit message. Yes, agreed. Applied

Re: Patch proposal - parameter to limit amount of FPW because of hint bits per second

2022-03-22 Thread Michail Nikolaev
Hello, Peter. Thanks for your comments. > There is one FPI per checkpoint for any leaf page that is modified > during that checkpoint. The difference between having that happen once > or twice per leaf page and having that happen many more times per leaf > page could be very large. Yes, I am

Re: Patch proposal - parameter to limit amount of FPW because of hint bits per second

2022-03-21 Thread Michail Nikolaev
Hello, Peter. > * Instead of avoiding the FPI when this happens, proactively call > _bt_simpledel_pass() just before _bt_killitems() returns. Accept the > immediate cost of setting an LP_DEAD bit, just like today, but avoid > repeated FPIs. Hm, I am not sure here. AFAIK the current implementation does not

Re: [PATCH] Full support for index LP_DEAD hint bits on standby

2022-03-22 Thread Michail Nikolaev
ci.com/build/5599876384817152), so, moving it back to "ready for committer" . Best regards, Michail. From 9ecb33a54971cfa1c766ed9d129c6abb44e39f98 Mon Sep 17 00:00:00 2001 From: Michail Nikolaev Date: Sat, 15 Jan 2022 16:21:51 +0300 Subject: [PATCH v10 1/3] code --- src/backend/access/

Re: [PATCH] Full support for index LP_DEAD hint bits on standby

2022-03-29 Thread Michail Nikolaev
Hello, Greg. > I'm seeing a recovery test failure. Not sure if this represents an > actual bug or just a test that needs to be adjusted for the new > behaviour. Thanks for notifying me. It is a failure of a test added in the patch. It is a little hard to make it stable (because it depends on

Re: [PATCH] Full support for index LP_DEAD hint bits on standby

2022-03-29 Thread Michail Nikolaev
Hello, Peter. Thanks for your review! > I doubt that the patch's use of pg_memory_barrier() in places like > _bt_killitems() is correct. There is no way to know for sure if this > novel new lockless algorithm is correct or not, since it isn't > explained anywhere. The memory barrier is used

Re: [PATCH] Full support for index LP_DEAD hint bits on standby

2022-03-29 Thread Michail Nikolaev
UPD: > I was thinking it is safe to have additional hint bits > on primary, but it seems like no. Oh, sorry for the mistake, it is about the standby of course. > BTW I am wondering if it is possible > to achieve the same situation by pg_rewind and standby promotion… It looks like it is impossible,

Re: BufferAlloc: don't take two simultaneous locks

2022-01-30 Thread Michail Nikolaev
Hello, Yura. The test results look promising. But it seems like the naming and the dynahash API change are a little confusing. 1) I think it is better to split the main part and the atomic nentries optimization into separate commits. 2) Also, it would be nice to fix the hash_update_hash_key bug :) 3) Do we

Re: BufferAlloc: don't take two simultaneous locks

2022-02-06 Thread Michail Nikolaev
Hello, Yura. One additional point: > 1332: Assert((oldFlags & (BM_PIN_COUNT_WAITER | BM_IO_IN_PROGRESS)) == 0); > 1333: CLEAR_BUFFERTAG(buf->tag); > 1334: buf_state &= ~(BUF_FLAG_MASK | BUF_USAGECOUNT_MASK); > 1335: UnlockBufHdr(buf, buf_state); I think there is no sense in unlocking the buffer

Re: [PATCH] Full support for index LP_DEAD hint bits on standby

2022-01-23 Thread Michail Nikolaev
back to "Ready for Committer" once it passes tests. Best regards, Michail. From a46315fd96b5432241ab6c67c37493ef41d7dc73 Mon Sep 17 00:00:00 2001 From: Michail Nikolaev Date: Sun, 23 Jan 2022 20:47:51 +0300 Subject: [PATCH v8 2/3] test --- src/test/recovery/Makefile

Re: [PATCH] Full support for index LP_DEAD hint bits on standby

2022-01-25 Thread Michail Nikolaev
Hello, Julien. > I rebased the pathset in attached v9. Please double check that I didn't miss > anything in the rebase. Thanks a lot for your help. > I will let you mark the patch as Ready for Committer once you validate that > the > rebase was ok. Yes, rebase looks good. Best regards,

Re: Replace known_assigned_xids_lck by memory barrier

2023-09-05 Thread Michail Nikolaev
Thanks everyone for help!

Re: Replace known_assigned_xids_lck by memory barrier

2023-08-15 Thread Michail Nikolaev
Hello, Nathan. > What sort of benefits do you see from this patch? It might be worthwhile > in itself to remove spinlocks when possible, but IME it's much easier to > justify such changes when there is a tangible benefit we can point to. Oh, it is not an easy question :) The answer, probably,

Re: Replace known_assigned_xids_lck by memory barrier

2023-08-16 Thread Michail Nikolaev
Hello, good question! Thanks for your edits. As an answer: we probably need to change "If we know that we're holding ProcArrayLock exclusively, we don't need the read barrier." to "If we're removing xid, we don't need the read barrier because only the startup process can remove and add xids to

Re: Replace known_assigned_xids_lck by memory barrier

2023-08-16 Thread Michail Nikolaev
Hello! An updated version (with read barriers) is attached. > One remaining question I have is whether it is okay if we see an updated value > for one of the head/tail variables but not the other. It looks like the > tail variable is only updated with ProcArrayLock held exclusively, which > IIUC

CPU time for pg_stat_statement

2022-05-20 Thread Michail Nikolaev
Hello, hackers. Today I was running some aggregates over pg_stat_statements in order to find the types of queries consuming most of the CPU. The aggregates were computed from two pg_stat_statements snapshots taken 30 seconds apart. The sum(total_time) had the biggest value for a very frequent query with about 10ms

Re: CPU time for pg_stat_statement

2022-05-20 Thread Michail Nikolaev
Hello, Thomas. > This might be interesting: > https://github.com/powa-team/pg_stat_kcache Oh, nice, looks like it could help me to reduce CPU and test my assumption (using exec_user_time and exec_system_time). BTW, do you know why the extension is not in standard contrib (it looks mature)? Best

Re: CPU time for pg_stat_statement

2022-05-20 Thread Michail Nikolaev
Hello, Tom. > This is a pretty broad claim to make on the basis of one undocumented > test case on one unmentioned platform. I'll try to use pg_stat_kcache to check the difference between Wall and CPU for my case. > On what grounds do you claim getrusage will be better? One thing we > can be

Re: Slow standby snapshot

2022-07-02 Thread Michail Nikolaev
Hello, Simon. Sorry for contacting you directly, but you know this subject better than anyone else. It relates to your work from 2010: replacing KnownAssignedXidsHash with the KnownAssignedXids array. I have added an additional optimization to the data structure you implemented. Initially, it was

Re: CPU time for pg_stat_statement

2022-06-08 Thread Michail Nikolaev
Hello, Tom. >> This is a pretty broad claim to make on the basis of one undocumented >> test case on one unmentioned platform. > I'll try to use pg_stat_kcache to check the difference between Wall > and CPU for my case. In my case I see pretty high correlation of pg_stat_kcache and

Any sense to get rid of known_assigned_xids_lck?

2022-06-13 Thread Michail Nikolaev
Hello, hackers. While working on (1), I noticed in commit 2871b4618af1acc85665eec0912c48f8341504c4 (2) from 2010 that Simon Riggs was thinking about using memory barriers for KnownAssignedXids access instead of spinlocks. > We could dispense with the spinlock if we were to > create suitable memory

Re: Slow standby snapshot

2022-07-19 Thread Michail Nikolaev
Hello, Andrey. > I've looked into v5. Thanks! The patch is updated according to your remarks. Best regards, Michail. From 1301a262dea7f541c11092851e82f10932150ee3 Mon Sep 17 00:00:00 2001 From: Michail Nikolaev Date: Tue, 19 Jul 2022 23:50:53 +0300 Subject: [PATCH v6] Curren

Re: Slow standby snapshot

2022-07-29 Thread Michail Nikolaev
Hello. Thanks to everyone for the review. > It seems to me storing the index itself is simpler and maybe faster by > the cycles to perform addition. Yes, the first version used a 1-byte offset with a maximum value of 255. Agreed, it looks like there is no sense in storing offsets now. > A simple patch

Data loss on logical replication, 12.12 to 14.5, ALTER SUBSCRIPTION

2022-12-26 Thread Michail Nikolaev
Hello. Just a short story about a small data loss on logical replication. We were logically replicating a 4 TB database from > PostgreSQL 12.12 (Ubuntu 12.12-201-yandex.49163.d86383ed5b) on > x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, > 64-bit to > PostgreSQL

Re: Data loss on logical replication, 12.12 to 14.5, ALTER SUBSCRIPTION

2022-12-26 Thread Michail Nikolaev
Hello again. Just a small fix for: > 2022-12-14 09:21:25.705 to > 2022-12-14 09:49:20.664 (after the synchronization started, but before it finished). The correct values are: 2022-12-14 09:49:31.340 and 2022-12-14 09:49:41.683. So it looks like we lost about 10 seconds of WAL for one of the tables.
