Re: [HACKERS] often PREPARE can generate high load (and sometimes minutes long unavailability)
Hello

2014-02-24 21:31 GMT+01:00 Jeff Janes jeff.ja...@gmail.com:

> On Mon, Feb 24, 2014 at 7:02 AM, Pavel Stehule pavel.steh...@gmail.com wrote:
>
>> 2014-02-23 21:32 GMT+01:00 Andres Freund and...@2ndquadrant.com:
>>
>>> Hi,
>>>
>>> On 2014-02-23 20:04:39 +0100, Pavel Stehule wrote:
>>>> There are relatively few, but very long, ProcArrayLock lwlocks.
>>>>
>>>> This issue is very pathological on fast computers with more than 8
>>>> CPUs. It was detected after a migration from 8.4 to 9.2 (but tested
>>>> with the same result on 9.0), and I see it on devel 9.4 updated today.
>>>>
>>>> When I moved the PREPARE out of the loop, the described issue is gone.
>>>> But when I use EXECUTE IMMEDIATE, the issue is back. So it looks like
>>>> it is related to the planner, ...
>>>
>>> In addition to the issue Jeff mentioned, I'd suggest trying the same
>>> workload with repeatable read. That can do *wonders* because of the
>>> reduced number of snapshots.
>>
>> I tested it, and it doesn't help. Is there some patch that I can test
>> related to this issue?
>
> This is the one that I was referring to:
>
> http://www.postgresql.org/message-id/11927.1384199...@sss.pgh.pa.us
>
> Cheers, Jeff

I tested this patch with great success. The waits on ProcArrayLock are
gone, and throughput is as expected. For the described use case it is
seriously interesting:

light weight locks
lockname         mode       count     avg
DynamicLocks     Exclusive   8055    5003
DynamicLocks     Shared      1666      50
LockMgrLocks     Exclusive    129      36
IndividualLock   Exclusive     34      48
IndividualLock   Shared        21       7
BufFreelistLock  Exclusive     12       8
WALWriteLock     Exclusive      1   38194
ProcArrayLock    Shared         1       8

Regards

Pavel
Re: [HACKERS] often PREPARE can generate high load (and sometimes minutes long unavailability)
2014-02-25 11:29 GMT+01:00 Pavel Stehule pavel.steh...@gmail.com:

> Hello
>
> 2014-02-24 21:31 GMT+01:00 Jeff Janes jeff.ja...@gmail.com:
>
>> This is the one that I was referring to:
>>
>> http://www.postgresql.org/message-id/11927.1384199...@sss.pgh.pa.us
>
> I tested this patch with great success. The waits on ProcArrayLock are
> gone, and throughput is as expected. For the described use case it is
> seriously interesting.

Here is an update of the patch for 9.4.

Regards

Pavel

diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
new file mode 100644
index d525ca4..261473d
*** a/src/backend/utils/adt/selfuncs.c
--- b/src/backend/utils/adt/selfuncs.c
*************** get_actual_variable_range(PlannerInfo *r
*** 4958,4963 ****
--- 4958,4964 ----
  	HeapTuple	tup;
  	Datum		values[INDEX_MAX_KEYS];
  	bool		isnull[INDEX_MAX_KEYS];
+ 	SnapshotData SnapshotDirty;

  	estate = CreateExecutorState();
  	econtext = GetPerTupleExprContext(estate);
*************** get_actual_variable_range(PlannerInfo *r
*** 4980,4985 ****
--- 4981,4987 ----
  	slot = MakeSingleTupleTableSlot(RelationGetDescr(heapRel));
  	econtext->ecxt_scantuple = slot;
  	get_typlenbyval(vardata->atttype, &typLen, &typByVal);
+ 	InitDirtySnapshot(SnapshotDirty);

  	/* set up an IS NOT NULL scan key so that we ignore nulls */
  	ScanKeyEntryInitialize(&scankeys[0],
*************** get_actual_variable_range(PlannerInfo *r
*** 4997,5003 ****
  	if (min)
  	{
  		index_scan = index_beginscan(heapRel, indexRel,
! 									 GetActiveSnapshot(), 1, 0);
  		index_rescan(index_scan, scankeys, 1, NULL, 0);

  		/* Fetch first tuple in sortop's direction */
--- 4999,5005 ----
  	if (min)
  	{
  		index_scan = index_beginscan(heapRel, indexRel,
! 									 &SnapshotDirty, 1, 0);
  		index_rescan(index_scan, scankeys, 1, NULL, 0);

  		/* Fetch first tuple in sortop's direction */
*************** get_actual_variable_range(PlannerInfo *r
*** 5029,5035 ****
  	if (max && have_data)
  	{
  		index_scan = index_beginscan(heapRel, indexRel,
! 									 GetActiveSnapshot(), 1, 0);
  		index_rescan(index_scan, scankeys, 1, NULL, 0);

  		/* Fetch first tuple in reverse direction */
--- 5031,5037 ----
  	if (max && have_data)
  	{
  		index_scan = index_beginscan(heapRel, indexRel,
! 									 &SnapshotDirty, 1, 0);
  		index_rescan(index_scan, scankeys, 1, NULL, 0);

  		/* Fetch first tuple in reverse direction */

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] often PREPARE can generate high load (and sometimes minutes long unavailability)
2014-02-23 21:32 GMT+01:00 Andres Freund and...@2ndquadrant.com:

> Hi,
>
> On 2014-02-23 20:04:39 +0100, Pavel Stehule wrote:
>> There are relatively few, but very long, ProcArrayLock lwlocks.
>>
>> This issue is very pathological on fast computers with more than 8
>> CPUs. It was detected after a migration from 8.4 to 9.2 (but tested
>> with the same result on 9.0), and I see it on devel 9.4 updated today.
>>
>> When I moved the PREPARE out of the loop, the described issue is gone.
>> But when I use EXECUTE IMMEDIATE, the issue is back. So it looks like
>> it is related to the planner, ...
>
> In addition to the issue Jeff mentioned, I'd suggest trying the same
> workload with repeatable read. That can do *wonders* because of the
> reduced number of snapshots.

I tested it, and it doesn't help. Is there some patch that I can test
related to this issue?

> Greetings,
>
> Andres Freund
>
> --
> Andres Freund                     http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] often PREPARE can generate high load (and sometimes minutes long unavailability)
On 2014-02-23 20:04:39 +0100, Pavel Stehule wrote:
>  354246.00 93.0% s_lock                  /usr/lib/postgresql/9.2/bin/postgres
>   10503.00  2.8% LWLockRelease           /usr/lib/postgresql/9.2/bin/postgres
>    8802.00  2.3% LWLockAcquire           /usr/lib/postgresql/9.2/bin/postgres
>     828.00  0.2% _raw_spin_lock          [kernel.kallsyms]
>     559.00  0.1% _raw_spin_lock_irqsave  [kernel.kallsyms]
>     340.00  0.1% switch_mm               [kernel.kallsyms]
>     305.00  0.1% poll_schedule_timeout   [kernel.kallsyms]
>     274.00  0.1% native_write_msr_safe   [kernel.kallsyms]
>     257.00  0.1% _raw_spin_lock_irq      [kernel.kallsyms]
>     238.00  0.1% apic_timer_interrupt    [kernel.kallsyms]
>     236.00  0.1% __schedule              [kernel.kallsyms]
>     213.00  0.1% HeapTupleSatisfiesMVCC
>
> With systemtap I got this list of spin locks:
>
> light weight locks
> lockname        mode       count   avg (time)
> DynamicLocks    Exclusive   2804     1025
> DynamicLocks    Shared      1061       30
> ProcArrayLock   Exclusive     63   963551
> ProcArrayLock   Shared        50     4160
> LockMgrLocks    Exclusive     18      159
> IndividualLock  Exclusive      2        7
>
> There are relatively few, but very long, ProcArrayLock lwlocks.

It's odd that there are so many exclusive ProcArrayLock acquisitions...

A hierarchical profile would be interesting. I'd suggest compiling
postgres with -fno-omit-frame-pointer and doing a profile with perf.

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] often PREPARE can generate high load (and sometimes minutes long unavailability)
2014-02-24 16:09 GMT+01:00 Andres Freund and...@2ndquadrant.com:

> It's odd that there are so many exclusive ProcArrayLock acquisitions...
>
> A hierarchical profile would be interesting. I'd suggest compiling
> postgres with -fno-omit-frame-pointer and doing a profile with perf.

I had no experience with perf, so maybe it is not what you want:

-  19.59%  postmaster  postgres
     - s_lock
        - 55.06% LWLockAcquire
           + 99.84% TransactionIdIsInProgress
        - 44.63% LWLockRelease
           + 99.91% TransactionIdIsInProgress
-  13.84%  postmaster  postgres
     - tas
        - 97.97% s_lock
           + 55.01% LWLockAcquire
           + 44.99% LWLockRelease
        - 1.10% LWLockAcquire
           + 99.89% TransactionIdIsInProgress
        - 0.93% LWLockRelease
           + 99.93% TransactionIdIsInProgress

and the first entry expanded:

-  19.59%  postmaster  postgres
     - s_lock
        - 55.06% LWLockAcquire
           - 99.84% TransactionIdIsInProgress
                HeapTupleSatisfiesMVCC
                heap_hot_search_buffer
                index_fetch_heap
                index_getnext
                get_actual_variable_range
                ineq_histogram_selectivity
                scalarineqsel
                mergejoinscansel
                cached_scansel
                initial_cost_mergejoin
                try_mergejoin_path
                sort_inner_and_outer
                add_paths_to_joinrel
                make_join_rel
                make_rels_by_clause_joins
                join_search_one_level
                standard_join_search
                make_rel_from_joinlist
                make_one_rel
                query_planner
                grouping_planner
                subquery_planner
                standard_planner
                planner
                pg_plan_query
                pg_plan_queries
                BuildCachedPlan
                GetCachedPlan
                exec_bind_message
                PostgresMain
                ExitPostmaster
                BackendStartup
                ServerLoop
                PostmasterMain
                startup_hacks

regards

Pavel

> Greetings,
>
> Andres Freund
Re: [HACKERS] often PREPARE can generate high load (and sometimes minutes long unavailability)
On Mon, Feb 24, 2014 at 7:02 AM, Pavel Stehule pavel.steh...@gmail.com wrote:

> I tested it, and it doesn't help. Is there some patch that I can test
> related to this issue?

This is the one that I was referring to:

http://www.postgresql.org/message-id/11927.1384199...@sss.pgh.pa.us

Cheers,

Jeff
Re: [HACKERS] often PREPARE can generate high load (and sometimes minutes long unavailability)
On Sun, Feb 23, 2014 at 11:04 AM, Pavel Stehule pavel.steh...@gmail.com wrote:

> Hello
>
> I have an example of code that generates relatively high load with a
> minimal number of connections. The code is more or less bad - it
> repeatedly generates a prepared statement - but some people use
> prepared statements as protection against SQL injection, and they can
> hit the same use case. Pseudocode (I can send a test case privately):
>
> Script a:  -- A, B are in RAM
>
>   for i in 1 .. N loop
>     insert into A values();
>     for j in 1 .. M loop
>       insert into B values();
>     end loop;
>   end loop;
>
> Script b:  -- the query is extremely fast - returns 0 or 1 rows usually;
>            -- 40 threads execute:
>
>   while true loop
>     pr = PREPARE SELECT * FROM A LEFT JOIN B ON ..
>     EXECUTE pr(...)
>     sleep(10 ms)
>   end loop

Digging through uncommitted tuples at the top or bottom of an index
(which happens during planning, especially the planning of merge joins)
is very contentious. Tom proposed changing the snapshot used for
planning to a dirty snapshot, but the proposal didn't go anywhere
because no one did the testing to confirm that it solved the problem in
the field. Perhaps you can help do that.

See: "[PERFORM] Performance bug in prepared statement binding in 9.2?"
and several related threads.

Cheers,

Jeff
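[For readers of the archive: a concrete version of the pseudocode above might look like the following sketch. The schema, join condition, and loop counts are invented here, since the real test case was only shared privately.]

```sql
-- Hypothetical schema: A and B are small enough to stay in RAM
CREATE UNLOGGED TABLE a (id serial PRIMARY KEY, payload text);
CREATE UNLOGGED TABLE b (id serial PRIMARY KEY, a_id int, payload text);

-- Script a: a single writer keeps creating fresh tuples at the
-- right-hand end of the indexes
DO $$
BEGIN
  FOR i IN 1 .. 1000 LOOP
    INSERT INTO a (payload) VALUES ('x');
    FOR j IN 1 .. 10 LOOP
      INSERT INTO b (a_id, payload) VALUES (currval('a_id_seq'), 'y');
    END LOOP;
  END LOOP;
END $$;

-- Script b: each of ~40 clients runs this in a tight loop with a 10 ms
-- sleep between iterations; the repeated PREPARE forces a fresh plan,
-- and planning probes the index endpoints
PREPARE q (int) AS
  SELECT * FROM a LEFT JOIN b ON b.a_id = a.id WHERE a.id = $1;
EXECUTE q (1);
DEALLOCATE q;
```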
Re: [HACKERS] often PREPARE can generate high load (and sometimes minutes long unavailability)
2014-02-23 20:35 GMT+01:00 Jeff Janes jeff.ja...@gmail.com:

> Tom proposed changing the snapshot used for planning to a dirty
> snapshot, but the proposal didn't go anywhere because no one did the
> testing to confirm that it solved the problem in the field. Perhaps
> you can help do that.

I am able to test some patches. Thank you for the info.

Regards

Pavel

> See: "[PERFORM] Performance bug in prepared statement binding in 9.2?"
> and several related threads.
>
> Cheers,
>
> Jeff
Re: [HACKERS] often PREPARE can generate high load (and sometimes minutes long unavailability)
2014-02-23 20:35 GMT+01:00 Jeff Janes jeff.ja...@gmail.com:

> See: "[PERFORM] Performance bug in prepared statement binding in 9.2?"
> and several related threads.

Yes, it is very similar. Only it is a little bit worse - on 16 CPUs it
can produce 20-60 minutes of unavailability.

regards

Pavel

> Cheers,
>
> Jeff
Re: [HACKERS] often PREPARE can generate high load (and sometimes minutes long unavailability)
Hi,

On 2014-02-23 20:04:39 +0100, Pavel Stehule wrote:
> There are relatively few, but very long, ProcArrayLock lwlocks.
>
> This issue is very pathological on fast computers with more than 8
> CPUs. It was detected after a migration from 8.4 to 9.2 (but tested
> with the same result on 9.0), and I see it on devel 9.4 updated today.
>
> When I moved the PREPARE out of the loop, the described issue is gone.
> But when I use EXECUTE IMMEDIATE, the issue is back. So it looks like
> it is related to the planner, ...

In addition to the issue Jeff mentioned, I'd suggest trying the same
workload with repeatable read. That can do *wonders* because of the
reduced number of snapshots.

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
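[For readers of the archive: the repeatable-read suggestion needs no code change. Under READ COMMITTED a new snapshot is taken for every statement; under REPEATABLE READ one snapshot is taken per transaction, so batching the EXECUTEs into a transaction reduces ProcArrayLock traffic. A minimal sketch, where q stands for any previously PREPAREd statement:]

```sql
-- Per transaction: all statements inside share a single snapshot
BEGIN ISOLATION LEVEL REPEATABLE READ;
EXECUTE q (1);
EXECUTE q (2);
COMMIT;

-- Or make it the session default
SET default_transaction_isolation TO 'repeatable read';
```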