Re: [HACKERS] Move unused buffers to freelist
On Friday, June 28, 2013 6:20 PM Robert Haas wrote: On Fri, Jun 28, 2013 at 12:52 AM, Amit Kapila amit.kap...@huawei.com wrote: Currently it wakes up based on bgwriterdelay config parameter which is by default 200ms, so you means we should think of waking up bgwriter based on allocations and number of elements left in freelist? I think that's what Andres and I are proposing, yes. As per my understanding Summarization of points raised by you and Andres which this patch should address to have a bigger win: 1. Bgwriter needs to be improved so that it can help in reducing usage count and finding next victim buffer (run the clock sweep and add buffers to the free list). Check. I think one way to handle it is that while moving buffers to freelist, if we find that there are not enough buffers (= high watermark) which have zero usage count, then move through buffer list and reduce usage count. Now here I think it is important how do we find that how many times we should circulate the buffer list to reduce usage count. Currently I have kept it proportional to number of times it failed to move enough buffers to freelist. 2. SetLatch for bgwriter (wakeup bgwriter) when elements in freelist are less. Check. The way to do this is to keep a variable in shared memory in the same cache line as the spinlock protecting the freelist, and update it when you update the free list. Added a new variable freelistLatch in BufferStrategyControl 3. Split the workdone globallock (Buffreelist) in StrategyGetBuffer (a spinlock for the freelist, and an lwlock for the clock sweep). Check. Added a new variable freelist_lck in BufferStrategyControl which will be used to protect freelist. Still Buffreelist will be used to protect clock sweep part of StrategyGetBuffer. 4. Separate processes for writing dirty buffers and moving buffers to freelist I think this part might be best pushed to a separate patch, although I agree we probably need it. 5. Bgwriter needs to be more aggressive, logic based on which it calculates how many buffers it needs to process needs to be improved. This is basically overlapping with points already made. I suspect we could just get rid of bgwriter_delay, bgwriter_lru_maxpages, and bgwriter_lru_multiplier altogether. The background writer would just have a high and a low watermark. When the number of buffers on the freelist drops below the low watermark, the allocating backend sets the latch and bgwriter wakes up and begins adding buffers to the freelist. When the number of buffers on the free list reaches the high watermark, the background writer goes back to sleep. Some experimentation might be needed to figure out what values are appropriate for those watermarks. In theory this could be a configuration knob, but I suspect it's better to just make the system tune it right automatically. Currently in Patch I have used low watermark as 1/6 and high watermark as 1/3 of NBuffers. Values are hardcoded for now, but I will change to guc's or hash defines. As far as I can think there is no way to find number of buffers on freelist, so I had added one more variable to maintain it. Initially I thought that I could use existing variables firstfreebuffer and lastfreebuffer to calculate it, but it may not be accurate as once the buffers are moved to freelist, these don't give exact count. The main doubt here is what if after traversing all buffers, it didn't find enough buffers to meet high watermark? Currently I just move out of loop to move buffers and just try to reduce usage count as explained in point-1 6. There can be contention around buffer mapping locks, but we can focus on it later 7. cacheline bouncing around the buffer header spinlocks, is there anything we can do to reduce this? I think these are points that we should leave for the future. This is just a WIP patch. I have kept older code in comments. I need to further refine it and collect performance data. I had prepared one script (perf_buff_mgmt.sh) to collect performance data for different shared buffers/scalefactor/number_of_clients Top level points which still needs to be taken care: 1. Choose Optimistically used buffer in StrategyGetBuffer(). Refer Simon's Patch: https://commitfest.postgresql.org/action/patch_view?id=743 2. Don't bump the usage count on every time buffer is pinned. This idea I got when reading archives about improvements in this area. With Regards, Amit Kapila. changed_freelist_mgmt.patch Description: Binary data perf_buff_mgmt.sh Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] System catalog vacuum issues
On 08/06/2013 01:56 PM, Vlad Arkhipov wrote: Hello, We are suffering from a long-standing issue with autovacuuming/vacuuming system catalogs on the production server. We are actively using temporary tables in the legacy application, so system catalogs grows unbounded in time. Autovacuum does not remove dead tuples and neither do the manual vacuum. We are running PostgreSQL 9.2.4 on Linux 2.6.18 x86_64. Nobody's holding an open transaction for long periods. Got any prepared transactions? SELECT * FROM pg_prepared_xacts; SHOW max_prepared_transactions; -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] mvcc catalo gsnapshots and TopTransactionContext
On 2013-08-05 13:09:31 -0400, Noah Misch wrote: On Fri, Jul 12, 2013 at 11:42:23AM +0200, Andres Freund wrote: On 2013-07-11 15:09:45 -0400, Tom Lane wrote: It never has been, and never will be, allowed to call the catcache code without being in a transaction. What do you think will happen if the requested row isn't in cache? A table access, that's what, and that absolutely requires being in a transaction. I'd like to add an Assert like in the attached patch making sure we're in a transaction. Otherwise it's far too easy not to hit an error during development because everything is cached - and syscache usage isn't always obvious from the outside to the naive or the tired. The following test case reliably hits this new assertion: create table t (c int); begin; create index on t (c); savepoint q; insert into t values (1); select 1/0; Stack trace: #2 0x00761159 in ExceptionalCondition (conditionName=value optimized out, errorType=value optimized out, fileName=value optimized out, lineNumber=value optimized out) at assert.c:54 #3 0x0074fc53 in SearchCatCache (cache=0xc53048, v1=139479, v2=0, v3=0, v4=0) at catcache.c:1072 #4 0x0075c011 in SearchSysCache (cacheId=0, key1=369539, key2=6, key3=18446744073709551615, key4=0) at syscache.c:909 #5 0x00757e57 in RelationReloadIndexInfo (relation=0x7f2dd5fffea0) at relcache.c:1770 #6 0x0075817b in RelationClearRelation (relation=0x7f2dd5fffea0, rebuild=1 '\001') at relcache.c:1926 #7 0x007588c6 in RelationFlushRelation (relationId=139479) at relcache.c:2076 #8 RelationCacheInvalidateEntry (relationId=139479) at relcache.c:2138 #9 0x0075185d in LocalExecuteInvalidationMessage (msg=0xcde778) at inval.c:546 #10 0x00751185 in ProcessInvalidationMessages (hdr=0xc0d390, func=0x7517d0 LocalExecuteInvalidationMessage) at inval.c:422 #11 0x00751a3f in AtEOSubXact_Inval (isCommit=0 '\000') at inval.c:973 #12 0x004cb986 in AbortSubTransaction () at xact.c:4250 #13 0x004cbf05 in AbortCurrentTransaction () at xact.c:2863 #14 0x006968f6 in PostgresMain (argc=1, argv=0x7fff7cf71610, dbname=0xc13b60 test, username=0xbedc88 nm) at postgres.c:3848 #15 0x0064c2b5 in BackendRun () at postmaster.c:4044 #16 BackendStartup () at postmaster.c:3733 #17 ServerLoop () at postmaster.c:1571 #18 0x0064fa8d in PostmasterMain (argc=1, argv=0xbe84a0) at postmaster.c:1227 #19 0x005e2dcd in main (argc=1, argv=0xbe84a0) at main.c:196 When we call AtEOSubXact_Inval() or AtEOXact_Inval() with a relation still open, we can potentially get a relcache rebuild and therefore a syscache lookup as shown above. CommitSubTransaction() is also potentially affected, though I don't have an SQL-level test case for that. It calls CommandCounterIncrement() after moving to TRANS_COMMIT. That CCI had better find no invalidations of open relations, or we'll make syscache lookups. (In simple tests, any necessary invalidations tend to happen at the CCI in CommitTransactionCommand(), so the later CCI does in fact find nothing to do. I have little confidence that should be counted upon, though.) How might we best rearrange things to avoid these hazards? Ok. After a good bit of code reading, I think this isn't an actual bug but an overzealous Assert(). I think it should be Assert(IsTransactionBlock()) not Assert(IsTransactionState()); The reason for that is that when we do the AtEO(Sub)?Xact_Inval(), we've already done a RecordTransactionAbort(true|false) and CurrentTransactionState-state = TRANS_ABORT. So the visibility routines have enough information to consider rows created by the aborted transaction as invisible. I am not really happy with the RelationReloadIndexInfo()s in RelationClearRelation() when we're in an aborted state, especially as the comment surrounding them are clearly out of date, but I don't see a bug there anymore. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] System catalog vacuum issues
On 08/06/2013 04:00 PM, Craig Ringer wrote: On 08/06/2013 01:56 PM, Vlad Arkhipov wrote: Hello, We are suffering from a long-standing issue with autovacuuming/vacuuming system catalogs on the production server. We are actively using temporary tables in the legacy application, so system catalogs grows unbounded in time. Autovacuum does not remove dead tuples and neither do the manual vacuum. We are running PostgreSQL 9.2.4 on Linux 2.6.18 x86_64. Nobody's holding an open transaction for long periods. Got any prepared transactions? SELECT * FROM pg_prepared_xacts; SHOW max_prepared_transactions; dcdb=# select * from pg_prepared_xacts; transaction | gid | prepared | owner | database -+-+--+---+-- (0 rows) dcdb=# show max_prepared_transactions; max_prepared_transactions --- 100 (1 row) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] System catalog vacuum issues
On Mon, Aug 5, 2013 at 10:56 PM, Vlad Arkhipov arhi...@dc.baikal.ru wrote: dcdb=# select relname, n_live_tup, n_dead_tup, last_vacuum, last_autovacuum from pg_stat_sys_tables where relname = 'pg_attribute'; relname| n_live_tup | n_dead_tup | last_vacuum | last_autovacuum --+++---+--- pg_attribute | 39318086 | 395478 | 2013-08-06 14:47:48.187259+09 | 2013-08-06 13:43:03.162286+09 What pgstattuple shows on this table? http://www.postgresql.org/docs/9.2/interactive/pgstattuple.html -- Kind regards, Sergey Konoplev PostgreSQL Consultant and DBA Profile: http://www.linkedin.com/in/grayhemp Phone: USA +1 (415) 867-9984, Russia +7 (901) 903-0499, +7 (988) 888-1979 Skype: gray-hemp Jabber: gray...@gmail.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] System catalog vacuum issues
On 08/06/2013 04:26 PM, Sergey Konoplev wrote: On Mon, Aug 5, 2013 at 10:56 PM, Vlad Arkhipov arhi...@dc.baikal.ru wrote: dcdb=# select relname, n_live_tup, n_dead_tup, last_vacuum, last_autovacuum from pg_stat_sys_tables where relname = 'pg_attribute'; relname| n_live_tup | n_dead_tup | last_vacuum | last_autovacuum --+++---+--- pg_attribute | 39318086 | 395478 | 2013-08-06 14:47:48.187259+09 | 2013-08-06 13:43:03.162286+09 What pgstattuple shows on this table? http://www.postgresql.org/docs/9.2/interactive/pgstattuple.html dcdb=# select * from pgstattuple('pg_catalog.pg_attribute'); table_len | tuple_count | tuple_len | tuple_percent | dead_tuple_count | dead_tuple_len | dead_tuple_percent | free_space | free_percent +-+---+---+--++++-- 6363938816 | 48786 | 6830040 | 0.11 | 1459439 | 204321460 | 3.21 | 5939017376 | 93.32 (1 row) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] System catalog vacuum issues
On Tue, Aug 6, 2013 at 12:37 AM, Vlad Arkhipov arhi...@dc.baikal.ru wrote: What pgstattuple shows on this table? http://www.postgresql.org/docs/9.2/interactive/pgstattuple.html dcdb=# select * from pgstattuple('pg_catalog.pg_attribute'); table_len | tuple_count | tuple_len | tuple_percent | dead_tuple_count | dead_tuple_len | dead_tuple_percent | free_space | free_percent +-+---+---+--++++-- 6363938816 | 48786 | 6830040 | 0.11 | 1459439 | 204321460 | 3.21 | 5939017376 | 93.32 (1 row) I guess you need to VACUUM FULL pg_attribute, if it is possible in your situation of course. If it is not, let me know, I have another one tricky way of solving this problem in my mind. -- Kind regards, Sergey Konoplev PostgreSQL Consultant and DBA Profile: http://www.linkedin.com/in/grayhemp Phone: USA +1 (415) 867-9984, Russia +7 (901) 903-0499, +7 (988) 888-1979 Skype: gray-hemp Jabber: gray...@gmail.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] pgbench progress report improvements
Here is a patch submission for reference to the next commitfest. Improve pgbench measurements progress report - Use progress option both under init bench. Activate progress report by default, every 5 seconds. When initializing, --quiet reverts to the old every 100,000 insertions behavior... - Measure transaction latency instead of computing it. The previous computed figure does not make sense under throttling, as sleep throttling time was included in the figures. The latency and its standard deviation are printed out under progress and in the final report. - Take thread start time at the beginning of the thread. Otherwise it includes pretty slow thread/fork system start times in the measurements. May help with bug #8326. - Reduce and compensate throttling underestimation bias. This is a consequence of relying on an integer random generator. It was about 0.5% with 1000 distinct values. Multiplier added to compensate the 0.05% bias with 1 distinct integer values. - Updated documentation help message. Sample output under initialization with --progress=1 creating tables... 223000 of 100 tuples (22%) done (elapsed 1.01 s, remaining 3.52 s). 430200 of 100 tuples (43%) done (elapsed 2.03 s, remaining 2.68 s). 507300 of 100 tuples (50%) done (elapsed 3.44 s, remaining 3.34 s). 571400 of 100 tuples (57%) done (elapsed 4.05 s, remaining 3.03 s). 786800 of 100 tuples (78%) done (elapsed 5.03 s, remaining 1.36 s). 893200 of 100 tuples (89%) done (elapsed 6.02 s, remaining 0.72 s). 100 of 100 tuples (100%) done (elapsed 6.93 s, remaining 0.00 s). ... Sample output under benchmarking with --progress=1 progress: 1.0 s, 116.7 tps, 8.528 +- 1.217 ms lat progress: 2.0 s, 117.7 tps, 8.493 +- 1.165 ms lat progress: 3.0 s, 115.5 tps, 8.654 +- 1.650 ms lat progress: 4.0 s, 116.8 tps, 8.559 +- 1.365 ms lat progress: 5.0 s, 111.7 tps, 8.947 +- 3.475 ms lat progress: 6.0 s, 116.7 tps, 8.563 +- 1.387 ms lat progress: 7.0 s, 116.6 tps, 8.572 +- 1.474 ms lat ... # oops, things happen: progress: 36.0 s, 10.9 tps, 91.864 +- 124.874 ms lat progress: 37.0 s, 115.2 tps, 8.678 +- 1.792 ms lat ... -- Fabien.diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c index ad8e272..a90b188 100644 --- a/contrib/pgbench/pgbench.c +++ b/contrib/pgbench/pgbench.c @@ -103,7 +103,6 @@ extern int optind; #define MAXCLIENTS 1024 #endif -#define LOG_STEP_SECONDS 5 /* seconds between log messages */ #define DEFAULT_NXACTS 10 /* default nxacts */ int nxacts = 0; /* number of transactions per client */ @@ -167,11 +166,11 @@ char *index_tablespace = NULL; #define SCALE_32BIT_THRESHOLD 2 bool use_log; /* log transaction latencies to a file */ -bool use_quiet; /* quiet logging onto stderr */ int agg_interval; /* log aggregates instead of individual * transactions */ -int progress = 0; /* thread progress report every this seconds */ +int progress = 5; /* report every this seconds, 0 is quiet */ int progress_nclients = 0; /* number of clients for progress report */ +int progress_nthreads = 0; /* number of threads for progress report */ bool is_connect; /* establish connection for each transaction */ bool is_latencies; /* report per-command latencies */ int main_pid; /* main process id used in log filename */ @@ -214,6 +213,8 @@ typedef struct int nvariables; instr_time txn_begin; /* used for measuring transaction latencies */ instr_time stmt_begin; /* used for measuring statement latencies */ + int64 txn_latencies; /* cumulated latencies */ + int64 txn_sqlats; /* cumulated square latencies */ bool is_throttled; /* whether transaction throttling is done */ int use_file; /* index in sql_files for this client */ bool prepared[MAX_FILES]; @@ -243,8 +244,10 @@ typedef struct { instr_time conn_time; int xacts; - int64 throttle_lag; - int64 throttle_lag_max; + int64 latencies; + int64 sqlats; + int64 throttle_lag; + int64 throttle_lag_max; } TResult; /* @@ -347,7 +350,6 @@ usage(void) -i, --initialize invokes initialization mode\n -F, --fillfactor=NUM set fill factor\n -n, --no-vacuum do not run VACUUM after initialization\n - -q, --quiet quiet logging (one message each 5 seconds)\n -s, --scale=NUM scaling factor\n --foreign-keys create foreign key constraints between tables\n --index-tablespace=TABLESPACE\n @@ -367,9 +369,8 @@ usage(void) (default: simple)\n -n, --no-vacuum do not run VACUUM before tests\n -N, --skip-some-updates skip updates of pgbench_tellers and pgbench_branches\n - -P, --progress=NUM show thread progress report every NUM seconds\n -r, --report-latencies
Re: [HACKERS] Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: Proposal for Allow postgresql.conf values to be changed via SQL [review])
Greg Stark st...@mit.edu writes: set it too large. Or if I set multiple settings together and you set one of them you'll undo my change and lose just part of my changes but not stop me from setting the others inconsistently. So we need to be able to change more than one setting in a single command and have a lock around the whole command, or maybe have ALTER SYSTEM SET run in a transaction. -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [GENERAL] Bottlenecks with large number of relation segment files
(2013/08/05 19:28), Andres Freund wrote: On 2013-08-05 18:40:10 +0900, KONDO Mitsumasa wrote: (2013/08/05 17:14), Amit Langote wrote: So, within the limits of max_files_per_process, the routines of file.c should not become a bottleneck? It may not become bottleneck. 1 FD consumes 160 byte in 64bit system. See linux manual at epoll. That limit is about max_user_watches, not the general cost of an fd. Afair they take up a a good more than that. OH! It's my mistake... I retry to read about FD in linux manual at proc. It seems that a process having FD can see in /proc/[pid]/fd/. And it seems symbolic link and consume 64byte memory per FD. Regards, -- Mitsumasa KONDO NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [GENERAL] Bottlenecks with large number of relation segment files
(2013/08/05 21:23), Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: ... Also, there are global limits to the amount of filehandles that can simultaneously opened on a system. Yeah. Raising max_files_per_process puts you at serious risk that everything else on the box will start falling over for lack of available FD slots. Is it Really? When I use hadoop like NOSQL storage, I set large number of FD. Actually, Hadoop Wiki is writing following. http://wiki.apache.org/hadoop/TooManyOpenFiles Too Many Open Files You can see this on Linux machines in client-side applications, server code or even in test runs. It is caused by per-process limits on the number of files that a single user/process can have open, which was introduced in the 2.6.27 kernel. The default value, 128, was chosen because that should be enough. In Hadoop, it isn't. ~ ulimit -n 8192 Regards, -- Mitsumasa KONDO NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [GENERAL] Bottlenecks with large number of relation segment files
On 2013-08-06 19:19:41 +0900, KONDO Mitsumasa wrote: (2013/08/05 21:23), Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: ... Also, there are global limits to the amount of filehandles that can simultaneously opened on a system. Yeah. Raising max_files_per_process puts you at serious risk that everything else on the box will start falling over for lack of available FD slots. Is it Really? When I use hadoop like NOSQL storage, I set large number of FD. Actually, Hadoop Wiki is writing following. http://wiki.apache.org/hadoop/TooManyOpenFiles Too Many Open Files You can see this on Linux machines in client-side applications, server code or even in test runs. It is caused by per-process limits on the number of files that a single user/process can have open, which was introduced in the 2.6.27 kernel. The default value, 128, was chosen because that should be enough. The first paragraph (which you're quoting with 128) is talking about epoll which we don't use. The second paragraph indeed talks about the max numbers of fds. Of *one* process. Postgres uses a *process* based model. So, max_files_per_process is about the the number of fds in a single backend. You need to multiply it by max_connections + a bunch to get to the overall number of FDs. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Disabling ALTER SYSTEM SET WAS: Re: ALTER SYSTEM SET command to change postgresql.conf parameters
Again, what are we trying to achieve?! no idea - wondering about that myself... It seems we are trying to add grammar for modifying postgresql.conf. Something we can already do easily in a standard extension, but without grammar changes. Maybe better to provide a contrib/ to modify config, then design what we can achieve more with an ALTER SYSTEM command. -- Cédric Villemain +33 (0)6 20 30 22 52 http://2ndQuadrant.fr/ PostgreSQL: Support 24x7 - Développement, Expertise et Formation signature.asc Description: This is a digitally signed message part.
Re: [HACKERS] Should we remove not fast promotion at all?
Hi, 2013/8/6 Tom Lane t...@sss.pgh.pa.us Fujii Masao masao.fu...@gmail.com writes: On Tue, Aug 6, 2013 at 11:40 AM, Andres Freund and...@2ndquadrant.com wrote: FWIW I'd rather keep plain promotion for a release or two. TBH, I have a bit of trust issues regarding the new method, and I'd like to be able to test potential issues against a stock postgres by doing a normal instead of a fast promotion. So we should add new option specifying the promotion mode, into pg_ctl? Currently pg_ctl cannot trigger the normal promotion. It would be silly to add such an option if we want to remove the old mode in a release or two. I think what Andres is suggesting is to leave it as-is for 9.4 and then remove the old code in 9.5 or 9.6. Which seems prudent to me. How about giving trigger_file an ability to chose fast promote and normal promote like the triggerfile of pg_standby. It means that if the contents of the trigger_file is empty or 'smart' then do normal promote, and it's 'fast' then do fast promote. I think this change would be smaller than change to pg_ctl. And this would allow us to treat ${PGDATA}/promote and trigger_file only. (because ${PGDATA}/fast_promote is not created automatically) regards, --- Tomonari Katsumata
Re: [HACKERS] [GENERAL] Bottlenecks with large number of relation segment files
(2013/08/06 19:33), Andres Freund wrote: On 2013-08-06 19:19:41 +0900, KONDO Mitsumasa wrote: (2013/08/05 21:23), Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: ... Also, there are global limits to the amount of filehandles that can simultaneously opened on a system. Yeah. Raising max_files_per_process puts you at serious risk that everything else on the box will start falling over for lack of available FD slots. Is it Really? When I use hadoop like NOSQL storage, I set large number of FD. Actually, Hadoop Wiki is writing following. http://wiki.apache.org/hadoop/TooManyOpenFiles Too Many Open Files You can see this on Linux machines in client-side applications, server code or even in test runs. It is caused by per-process limits on the number of files that a single user/process can have open, which was introduced in the 2.6.27 kernel. The default value, 128, was chosen because that should be enough. The first paragraph (which you're quoting with 128) is talking about epoll which we don't use. The second paragraph indeed talks about the max numbers of fds. Of *one* process. Yes. I have been already understood like that. Postgres uses a *process* based model. So, max_files_per_process is about the the number of fds in a single backend. You need to multiply it by max_connections + a bunch to get to the overall number of FDs. Yes, too. I think max_file_per_process seems too small. In NoSQL, it was recommended large number of FD. However, I do not know whether it is really enough in PostgreSQL. If we use PostgreSQL with big data, we might need to change max_file_per_process, I think. Regards, -- Mitsumasa KONDO NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Disabling ALTER SYSTEM SET WAS: Re: ALTER SYSTEM SET command to change postgresql.conf parameters
=?iso-8859-1?q?C=E9dric_Villemain?= ced...@2ndquadrant.com writes: Again, what are we trying to achieve?! no idea - wondering about that myself... It seems we are trying to add grammar for modifying postgresql.conf. Something we can already do easily in a standard extension, but without grammar changes. Maybe better to provide a contrib/ to modify config, then design what we can achieve more with an ALTER SYSTEM command. Hmm ... putting the UI into a contrib module would neatly solve the problem for people who don't want the functionality. They just don't install the module. Another thought here is that it seems like we've been discussing multiple scenarios that don't necessarily all have the same best solution. Maybe we should think in terms of multiple contrib modules. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] File-per-GUC WAS: Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: Proposal for Allow postgresql.conf values to be changed via SQL [review])
On Mon, Aug 5, 2013 at 05:29:48PM -0400, Bruce Momjian wrote: That is a killer point. So really the value of the global lock is to ensure serializability when transactions are updating multiple GUCs. Well, I think it is more than that. The global lock will allow multiple GUCs to be modified in a serializable fashion, but unless you have new backends wait until that lock is clear, new backends are going to see the directory in an inconsistent state if we keep a single-guc-per-file. So, unless we want new backends to hang during ALTER SYSTEM SET operations, I think we are going to need a global lock and all gucs in a single file. To clarify, I talked about new backends above, but on Unix and Windows, new backends inherit the postmaster GUC settings, so it would only be config file reload that would possibly see the single-guc-per-file in an inconsistent state. And we already have this possibility if someone edits postgresql.conf, and while the editor is writing the file, a backend is reading the file via a reload. Now, I assume that ALTER SYSTEM SET would automatically issue a pg_reload_conf(), so we would need to make sure that people modifying multiple parameters that are related do it in a single transaction, and that we issue a pg_reload_conf() only after all values have been changed. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Disabling ALTER SYSTEM SET WAS: Re: ALTER SYSTEM SET command to change postgresql.conf parameters
On Tue, Aug 6, 2013 at 12:40:12PM +0200, Cédric Villemain wrote: Again, what are we trying to achieve?! no idea - wondering about that myself... It seems we are trying to add grammar for modifying postgresql.conf. Something we can already do easily in a standard extension, but without grammar changes. Maybe better to provide a contrib/ to modify config, then design what we can achieve more with an ALTER SYSTEM command. Let's look at the problems: * remote users can lock themselves out of the server * interconnected GUC variables are complex to change * need a way to disable this feature Given the above, I am not sure I see a way forward for ALTER SYSTEM SET. One compromise that avoids the problems above would be to have a limited feature called ALTER LOG SET that only controls logging parameters, but I am not sure there is enough use-case there. This is not a zero-cost feature as we would be giving people _two_ ways to configure Postgres, and two ways to find a fix if it is broken, so we have to have a clear win to implement this. Also, if you have broken the system via ALTER SYSTEM SET, you might need to edit flat files to fix it, adding further complexity and limitations if someone only knows the SQL method of configuration. Given that, and the problems above, I reluctantly just don't see how this features moves us forward. There seemed to be agreement on having a config.d, though. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] System catalog vacuum issues
Vlad Arkhipov arhi...@dc.baikal.ru writes: On 08/06/2013 04:26 PM, Sergey Konoplev wrote: What pgstattuple shows on this table? dcdb=# select * from pgstattuple('pg_catalog.pg_attribute'); table_len | tuple_count | tuple_len | tuple_percent | dead_tuple_count | dead_tuple_len | dead_tuple_percent | free_space | free_percent +-+---+---+--++++-- 6363938816 | 48786 | 6830040 | 0.11 | 1459439 | 204321460 | 3.21 | 5939017376 | 93.32 (1 row) So the problem isn't so much that you have lots of dead tuples, it's that the file is full of free space. I suspect the key issue is that autovacuum is unable to truncate the file because of too many concurrent accesses. There was a fix in 9.2.3 that was meant to ameliorate that problem, but maybe that's not getting the job done for you. Or maybe the bloat we're looking at is left over from when you were running earlier 9.2.x releases; in which case a one-time VACUUM FULL should fix it. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Disabling ALTER SYSTEM SET WAS: Re: ALTER SYSTEM SET command to change postgresql.conf parameters
There seemed to be agreement on having a config.d, though. Yes. Also, the validate_conf_parameter() (or some similar name) Amit added in his patch sounds useful if an extension can use it (to check a GUC someone want to change, to check a configuration file, ...) -- Cédric Villemain +33 (0)6 20 30 22 52 http://2ndQuadrant.fr/ PostgreSQL: Support 24x7 - Développement, Expertise et Formation signature.asc Description: This is a digitally signed message part.
Re: [HACKERS] Add json_typeof() and json_is_*() functions.
On Fri, Aug 2, 2013 at 8:22 AM, Andrew Tipton and...@kiwidrew.com wrote: But without json_is_scalar(), the choice is one of these two forms: json_typeof() NOT IN ('object', 'array') json_typeof() IN ('string', 'number', 'boolean', 'null') The first of those is what seemed to make sense to me. The user can always define their own convenience function if they so desire. I don't think we need to bloat the default contents of pg_proc for that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] how to pass data (tuples) to worker processes?
On Sat, Aug 3, 2013 at 6:31 AM, Andrew Tipton and...@kiwidrew.com wrote: Robert: any chance you could share a few more details on the enhancements you're planning for bgworkers? I seem to recall reading that communicating with the dynamic bgworkers after they had been launched was next on your agenda... Yeah, it is. I'm working on a patch to allow additional shared memory segments to be created on the fly. The idea I'm working with is that a backend that plans to launch a worker will first create a dynamic shared memory segment, then pass the ID of that segment to the worker via bgw_main_arg. The worker will map the segment, and then the two processes can use that to communicate. My thought is to create a queue abstraction that sits on top of the dynamic shared memory infrastructure, so that you can set aside a portion of your dynamic shared memory segment to use as a ring buffer and send messages back and forth with using some kind of API along these lines: extern void dsm_queue_send(dsm_queue *, char *data, uint64 len); extern uint64 dsm_queue_receive(dsm_queue *, char **dataptr); It would also be possible to implement message sending and receiving using pipes, but I'm leaning away from that because it would require even more OS-dependent code than I'm already having to write, and writing OS-dependent shim layers is one of the world's less-rewarding coding tasks; and also because I think it will be easier to achieve zero-copy semantics using shared memory. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] File-per-GUC WAS: Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: Proposal for Allow postgresql.conf values to be changed via SQL [review])
On Tuesday, August 06, 2013 5:39 PM Bruce Momjian wrote: On Mon, Aug 5, 2013 at 05:29:48PM -0400, Bruce Momjian wrote: That is a killer point. So really the value of the global lock is to ensure serializability when transactions are updating multiple GUCs. Well, I think it is more than that. The global lock will allow multiple GUCs to be modified in a serializable fashion, but unless you have new backends wait until that lock is clear, new backends are going to see the directory in an inconsistent state if we keep a single-guc-per- file. So, unless we want new backends to hang during ALTER SYSTEM SET operations, I think we are going to need a global lock and all gucs in a single file. To clarify, I talked about new backends above, but on Unix and Windows, new backends inherit the postmaster GUC settings, so it would only be config file reload that would possibly see the single-guc-per-file in an inconsistent state. And we already have this possibility if someone edits postgresql.conf, and while the editor is writing the file, a backend is reading the file via a reload. Now, I assume that ALTER SYSTEM SET would automatically issue a pg_reload_conf(), so we would need to make sure that people modifying multiple parameters that are related do it in a single transaction, and that we issue a pg_reload_conf() only after all values have been changed. Can't we leave this onus of conflict of changing dependent parameters on user, such that he should do it carefully? Other databases SQL Server, Oracle also has similar feature, so there user's also must be handling by themselves. With Regards, Amit Kapila. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [GENERAL] Possible bug with row_to_json
Jack Christensen j...@jackchristensen.com writes: jack=# create table player( jack(# player_id serial primary key, jack(# name varchar not null unique jack(# ); CREATE TABLE jack=# insert into player(name) values('Jack'); INSERT 0 1 jack=# select row_to_json(t) jack-# from ( jack(# select player_id as renamed, name jack(# from player jack(# order by name jack(# ) t; row_to_json --- {player_id:1,name:Jack} (1 row) It ignored the rename. I looked into this and found that the culprit is the optimization that skips ExecProject() if a scan plan node is not doing any useful projection. In the plan for this query: Subquery Scan on t (cost=0.15..81.98 rows=1230 width=60) Output: row_to_json(t.*) - Index Scan using player_name_key on public.player (cost=0.15..66.60 rows=1230 width=36) Output: player.player_id, player.name the indexscan node decides it can just return its scan tuple slot directly to the caller, rather than projecting into the result tuple slot. The problem is that the scan tuple slot's tuple descriptor is that of the table being scanned, and in particular it has the actual column names of the underlying table, plus a tdtypeid matching the table's rowtype OID. This info is what's looked at by ExecEvalWholeRowVar to form a tuple datum, so the datum ends up looking like it has exactly the table's rowtype, and then row_to_json is going to be given the table's tuple descriptor when it asks lookup_rowtype_tupdesc for a tupdesc. The attached quick-hack patch fixes the reported case. I *think* it's safe as is, but I wonder whether we ought to install additional checks to verify physical compatibility of the two tuple descriptors before we decide it's okay to skip projection. Another potential issue is that we're losing whatever benefit there may be in having tuple datums marked with named rowtypes rather than RECORD --- I'm not sure there's any meaningful performance difference, but I'm not sure there isn't, either. We could be even more paranoid by choosing to only install the other descriptor if there's actually a difference in column names; but is it worth the extra complexity? Comments? regards, tom lane diff --git a/src/backend/executor/execScan.c b/src/backend/executor/execScan.c index 3ea6460..b91ae03 100644 *** a/src/backend/executor/execScan.c --- b/src/backend/executor/execScan.c *** ExecScan(ScanState *node, *** 234,246 * Set up projection info for a scan node, if necessary. * * We can avoid a projection step if the requested tlist exactly matches ! * the underlying tuple type. If so, we just set ps_ProjInfo to NULL. * Note that this case occurs not only for simple SELECT * FROM ..., but * also in most cases where there are joins or other processing nodes above * the scan node, because the planner will preferentially generate a matching * tlist. * ! * ExecAssignScanType must have been called already. */ void ExecAssignScanProjectionInfo(ScanState *node) --- 234,253 * Set up projection info for a scan node, if necessary. * * We can avoid a projection step if the requested tlist exactly matches ! * the underlying tuple type. If so, we set ps_ProjInfo to NULL to inform ! * later processing that the scan tuple can be returned as-is. We also ! * have to tweak the scan tuple slot to have the exact same tuple descriptor ! * the projection slot does --- they are physically compatible, if the tlist ! * matches, but the projection slot might contain different column names and ! * we want anything that looks at the slot tupdesc to see the right names. ! * * Note that this case occurs not only for simple SELECT * FROM ..., but * also in most cases where there are joins or other processing nodes above * the scan node, because the planner will preferentially generate a matching * tlist. * ! * ExecAssignScanType and ExecAssignResultTypeFromTL must have been called ! * already. */ void ExecAssignScanProjectionInfo(ScanState *node) *** ExecAssignScanProjectionInfo(ScanState * *** 258,264 --- 265,277 scan-plan.targetlist, varno, node-ss_ScanTupleSlot-tts_tupleDescriptor)) + { + /* no physical projection needed */ node-ps.ps_ProjInfo = NULL; + /* make sure scan tuple slot's tupdesc exposes the right names */ + ExecSetSlotDescriptor(node-ss_ScanTupleSlot, + node-ps.ps_ResultTupleSlot-tts_tupleDescriptor); + } else ExecAssignProjectionInfo(node-ps, node-ss_ScanTupleSlot-tts_tupleDescriptor); -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] File-per-GUC WAS: Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: Proposal for Allow postgresql.conf values to be changed via SQL [review])
On Tue, Aug 6, 2013 at 06:30:18PM +0530, Amit Kapila wrote: Now, I assume that ALTER SYSTEM SET would automatically issue a pg_reload_conf(), so we would need to make sure that people modifying multiple parameters that are related do it in a single transaction, and that we issue a pg_reload_conf() only after all values have been changed. Can't we leave this onus of conflict of changing dependent parameters on user, such that he should do it carefully? Other databases SQL Server, Oracle also has similar feature, so there user's also must be handling by themselves. Yes, we _can_ have the user deal with it, but we have to consider the final user experience. The initial experience will be, Oh, great, no more need to edit a configuration file,, but it might end with: * Oh, my server doesn't start and I don't have shell access * How do I undo an ALTER SYSTEM SET if the server can't be started * Why doesn't my server start? postgresql.conf looks fine (see ALTER SYSTEM SET) * What things did I change via ALTER SYSTEM SET * How do I disable ALTER SYSTEM SET so only postgresql.conf works Consider how long it took the hackers to realize all the interactions involved; we would have to effectively communicate that to users. I realize other database systems have this, but those systems are not known to be easy to use. ALTER SYSTEM SET has the promise of making Postgres easier to use, _and_ harder to use. The question is what percentage of users are going to have a positive experience with this feature, and what percentage are going to have a negative or disastrous experience with it, or disable it? -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] File-per-GUC WAS: Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: Proposal for Allow postgresql.conf values to be changed via SQL [review])
* Andres Freund (and...@2ndquadrant.com) wrote: I don't think we're designing a feature that's supposed to be used under heavy concurrency here. If you have users/tools doing conflicting actions as superusers you need to solve that by social means, not by technical ones. If this actually gets used by puppet or another CMS, the chances of the 'social means' being successful drop drastically. I agree that it doesn't need to work under heavy concurrency, but it should do something sensible if it happens- perhaps even just throwing an error if it can't acquire the lock immediately, warning the user that some other process is trying to modify the config concurrently. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] Disabling ALTER SYSTEM SET WAS: Re: ALTER SYSTEM SET command to change postgresql.conf parameters
* Tom Lane (t...@sss.pgh.pa.us) wrote: =?iso-8859-1?q?C=E9dric_Villemain?= ced...@2ndquadrant.com writes: Maybe better to provide a contrib/ to modify config, then design what we can achieve more with an ALTER SYSTEM command. Hmm ... putting the UI into a contrib module would neatly solve the problem for people who don't want the functionality. They just don't install the module. Another thought here is that it seems like we've been discussing multiple scenarios that don't necessarily all have the same best solution. Maybe we should think in terms of multiple contrib modules. I'd certainly be happier with that- but it already exists, to some extent, in the form of the pgAdmin adminpack contrib module. Perhaps we should be working on improving what we've already got there rather than adding something completely new.. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] File-per-GUC WAS: Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: Proposal for Allow postgresql.conf values to be changed via SQL [review])
On Tue, Aug 6, 2013 at 10:54:22AM -0400, Stephen Frost wrote: * Andres Freund (and...@2ndquadrant.com) wrote: I don't think we're designing a feature that's supposed to be used under heavy concurrency here. If you have users/tools doing conflicting actions as superusers you need to solve that by social means, not by technical ones. If this actually gets used by puppet or another CMS, the chances of the 'social means' being successful drop drastically. I agree that it doesn't need to work under heavy concurrency, but it should do something sensible if it happens- perhaps even just throwing an error if it can't acquire the lock immediately, warning the user that some other process is trying to modify the config concurrently. My guess is that most users are going to do: SHOW log_min_duration_statement; SET log_min_duration_statement = 3; i.e. there isn't going to be any way to _lock_ that value between viewing it and setting it, and hence no way to warn users. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Disabling ALTER SYSTEM SET WAS: Re: ALTER SYSTEM SET command to change postgresql.conf parameters
On Tue, Aug 6, 2013 at 10:58:52AM -0400, Stephen Frost wrote: * Tom Lane (t...@sss.pgh.pa.us) wrote: =?iso-8859-1?q?C=E9dric_Villemain?= ced...@2ndquadrant.com writes: Maybe better to provide a contrib/ to modify config, then design what we can achieve more with an ALTER SYSTEM command. Hmm ... putting the UI into a contrib module would neatly solve the problem for people who don't want the functionality. They just don't install the module. Another thought here is that it seems like we've been discussing multiple scenarios that don't necessarily all have the same best solution. Maybe we should think in terms of multiple contrib modules. I'd certainly be happier with that- but it already exists, to some extent, in the form of the pgAdmin adminpack contrib module. Perhaps we should be working on improving what we've already got there rather than adding something completely new.. Yes, adminpack has mostly read/write file I/O operations --- it would need a higher-level API. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] latest pgbench results
Here are the latest pgbench results from the IBM POWER7 machine. These results were gathered about two weeks ago. I ran each test configuration three times; below I report the median of the three results. For all runs, I used scale factor = 300, clients = jobs, and the following non-default configuration parameters: shared_buffers = 8GB, maintenance_work_mem = 4GB, synchronous_commit = off, checkpoint_segments = 300, checkpoint_timeout = 15min, checkpoint_completion_target = 0.9, log_line_prefix = '%t [%p] '. Commits in use were cce5d681be7abfd9f48c28151ebf2b242f8ba438, dd8ea2eb5e996f3f3dfd928e20aa2462c4bd9c63, cce5d681be7abfd9f48c28151ebf2b242f8ba438. Clients: 1 9.2 tps = 1355.953202 (including connections establishing) 9.3 tps = 1376.949633 (including connections establishing) 9.4 tps = 1283.168055 (including connections establishing) Clients: 8 9.2 tps = 9165.548715 (including connections establishing) 9.3 tps = 9168.803482 (including connections establishing) 9.4 tps = 9130.962750 (including connections establishing) Clients: 32 9.2 tps = 14456.061411 (including connections establishing) 9.3 tps = 14226.033279 (including connections establishing) 9.4 tps = 14932.841344 (including connections establishing) The 9.2 and 9.3 numbers are very close, but things do seem to have changed a bit in 9.4 - for worse at 1 client, and for better, perhaps, at 32. Make of these what you will; I'm not posting these as a way of advocating anything in particular. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] DATE type output does not follow datestyle parameter
On Wed, Jul 24, 2013 at 09:06:30PM +0900, MauMau wrote: Hello, The description of datestyle parameter does not seem to match the actual behavior. Is this a bug to be fixed? Which do you think should be corrected, the program or the manual? The manual says: DateStyle (string) Sets the display format for date and time values, as well as the rules for interpreting ambiguous date input values. For historical reasons, this variable contains two independent components: the output format specification (ISO, Postgres, SQL, or German) and the input/output specification for year/month/day ordering (DMY, MDY, or YMD). ... And says: http://www.postgresql.org/docs/current/static/datatype-datetime.html 8.5.2. Date/Time Output The output of the date and time types is of course only the date or time part in accordance with the given examples. After doing SET datestyle = 'Postgres, MDY' on the psql prompt, I did the following things on the same psql session: 1. SELECT current_timestamp; now -- Wed Jul 24 10:51:00.217 2013 GMT (1 行) This is exactly as I expected. 2. SELECT current_date; I expected the output Wed Jul 24 2013 or Jul 24 2013, but I got: date 07-24-2013 (1 行) This does not follow the above statement in 8.5.2. This output is created by EncodeDateOnly() in src/backend/utils/adt/datetime.c. Yes, you are correct, this is inconsistent. Let me look at writing a patch to fix this. Is this format so old that we can't fix this? -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] DATE type output does not follow datestyle parameter
On Tue, Aug 6, 2013 at 11:40 AM, Bruce Momjian br...@momjian.us wrote: Yes, you are correct, this is inconsistent. Let me look at writing a patch to fix this. Is this format so old that we can't fix this? I think I would be more inclined to change the documentation than the behavior. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] latest pgbench results
Hello, Here are the latest pgbench results from the IBM POWER7 machine. These results were gathered about two weeks ago. I ran each test configuration three times; below I report the median of the three results. For all runs, I used scale factor = 300, This means a database size of about 4.5 GB clients = jobs, I understand that it means one thread per client. and the following non-default configuration parameters: shared_buffers = 8GB, maintenance_work_mem = 4GB, synchronous_commit = off, checkpoint_segments = 300, checkpoint_timeout = 15min, checkpoint_completion_target = 0.9, log_line_prefix = '%t [%p] '. Commits in use were cce5d681be7abfd9f48c28151ebf2b242f8ba438, dd8ea2eb5e996f3f3dfd928e20aa2462c4bd9c63, cce5d681be7abfd9f48c28151ebf2b242f8ba438. How long did it run? As there has been some changes in pgbench, would it make sense to run the same pgbench version (whatever) against the different servers? To help appreciate wether the change is significant, it would help to have all three runs so as to have an idea of the performance spread. Maybe you can try my patch for better pgbench measures: http://archives.postgresql.org/message-id/alpine.DEB.2.02.1308061042030.2@localhost6.localdomain6 My experience is that performance can varry significantly during a run. Clients: 1 9.2 tps = 1355.953202 (including connections establishing) 9.3 tps = 1376.949633 (including connections establishing) 9.4 tps = 1283.168055 (including connections establishing) Clients: 8 9.2 tps = 9165.548715 (including connections establishing) 9.3 tps = 9168.803482 (including connections establishing) 9.4 tps = 9130.962750 (including connections establishing) Clients: 32 9.2 tps = 14456.061411 (including connections establishing) 9.3 tps = 14226.033279 (including connections establishing) 9.4 tps = 14932.841344 (including connections establishing) The 9.2 and 9.3 numbers are very close, but things do seem to have changed a bit in 9.4 - for worse at 1 client, and for better, perhaps, at 32. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] DATE type output does not follow datestyle parameter
Bruce Momjian br...@momjian.us writes: Is this format so old that we can't fix this? Yes. I don't see any reason to change it, either, as nobody has complained that it's actually bad. If you feel a compulsion to change the docs, do that. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: Proposal for Allow postgresql.conf values to be changed via SQL [review])
2013-08-05 16:01 keltezéssel, Stephen Frost írta: * Greg Stark (st...@mit.edu) wrote: On Fri, Aug 2, 2013 at 4:05 PM, Stephen Frost sfr...@snowman.net wrote: I'm not even clear we do want this in /etc since none of our GUC options are repeatable things like Apache virtual servers. It actually makes *more* sense for pg_hba than it does for gucs. I think we can assume that in the future we'll have something like it however. I tend to agree with this also, though I can imagine wanting to separate things in a conf.d directory ala exim's conf.d directories, to allow tools like puppet to manage certain things environment-wide (perhaps krb_server_keyfile) while other configuration options are managed locally. Extensions are actually a pretty good argument for why conf.d in /etc (or wherever the non-auto-config is) is pretty important useful. That's the kind of thing conf.d directories are meant for. A user can install a package containing an extension and the extension would automatically drop in the config entries needed in that directory. Agreed, though I think there should be a difference between shared library load being added-to for extensions, and random extension-specific GUC.. Now that you mention shared library load, it may be a good idea to add an append-to-this-GUC flag instead of overwriting the previous value. Two GUCs may make use of it: shared_preload_libraries and local_preload_libraries. It would make packagers' of extensions and DBA's lives easier. Best regards, Zoltán Böszörményi Thanks, Stephen -- -- Zoltán Böszörményi Cybertec Schönig Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Disabling ALTER SYSTEM SET WAS: Re: ALTER SYSTEM SET command to change postgresql.conf parameters
On 08/06/2013 05:29 AM, Bruce Momjian wrote: Let's look at the problems: * remote users can lock themselves out of the server * interconnected GUC variables are complex to change * need a way to disable this feature Given the above, I am not sure I see a way forward for ALTER SYSTEM SET. One compromise that avoids the problems above would be to have a limited feature called ALTER LOG SET that only controls logging parameters, but I am not sure there is enough use-case there. This is not a zero-cost feature as we would be giving people _two_ ways to configure Postgres, and two ways to find a fix if it is broken, so we have to have a clear win to implement this. Also, if you have broken the system via ALTER SYSTEM SET, you might need to edit flat files to fix it, adding further complexity and limitations if someone only knows the SQL method of configuration. Given that, and the problems above, I reluctantly just don't see how this features moves us forward. Well said. I'd like to look at use cases, and let's see how ALTER SYSTEM SET addresses or doesn't address these use cases. I'd really like it if some other folks also posted use cases they know of. (1) Making is easier for GUIs to manage PostgreSQL settings. (2) Enabling DBAAS services to give users limited control over settings. (3) Making the initial developer experience better by not requiring hand-editing of configuration files to play with settings. (4) Making it easier to write automated testing scripts which test changes to settings. (5) Enabling new ways of writing Puppet/Chef/etc. scripts, which can check a setting before changing it. For the current patch, it fullfills these use cases as follows: (1) works, but is ALSO one of the critical problems with the feature. (2) does not work for this purpose. (3) works for this purpose. (4) works for this purpose. (5) works for this purpose, although it's unclear if it's actually wanted. Others? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] latest pgbench results
On 08/06/2013 08:26 AM, Robert Haas wrote: Here are the latest pgbench results from the IBM POWER7 machine. These results were gathered about two weeks ago. I ran each test configuration three times; below I report the median of the three results. For all runs, I used scale factor = 300, clients = jobs, and the following non-default configuration parameters: shared_buffers = 8GB, maintenance_work_mem = 4GB, synchronous_commit = off, checkpoint_segments = 300, checkpoint_timeout = 15min, checkpoint_completion_target = 0.9, log_line_prefix = '%t [%p] '. Commits in use were cce5d681be7abfd9f48c28151ebf2b242f8ba438, dd8ea2eb5e996f3f3dfd928e20aa2462c4bd9c63, cce5d681be7abfd9f48c28151ebf2b242f8ba438. What's the length of the run each time? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: Proposal for Allow postgresql.conf values to be changed via SQL [review])
On Tue, Aug 6, 2013 at 06:34:35PM +0200, Boszormenyi Zoltan wrote: 2013-08-05 16:01 keltezéssel, Stephen Frost írta: * Greg Stark (st...@mit.edu) wrote: On Fri, Aug 2, 2013 at 4:05 PM, Stephen Frost sfr...@snowman.net wrote: I'm not even clear we do want this in /etc since none of our GUC options are repeatable things like Apache virtual servers. It actually makes *more* sense for pg_hba than it does for gucs. I think we can assume that in the future we'll have something like it however. I tend to agree with this also, though I can imagine wanting to separate things in a conf.d directory ala exim's conf.d directories, to allow tools like puppet to manage certain things environment-wide (perhaps krb_server_keyfile) while other configuration options are managed locally. Extensions are actually a pretty good argument for why conf.d in /etc (or wherever the non-auto-config is) is pretty important useful. That's the kind of thing conf.d directories are meant for. A user can install a package containing an extension and the extension would automatically drop in the config entries needed in that directory. Agreed, though I think there should be a difference between shared library load being added-to for extensions, and random extension-specific GUC.. Now that you mention shared library load, it may be a good idea to add an append-to-this-GUC flag instead of overwriting the previous value. Two GUCs may make use of it: shared_preload_libraries and local_preload_libraries. It would make packagers' of extensions and DBA's lives easier. 'search_path' might also use it, though we might need append and prepend. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Disabling ALTER SYSTEM SET WAS: Re: ALTER SYSTEM SET command to change postgresql.conf parameters
On Tue, Aug 6, 2013 at 10:33:20AM -0700, Josh Berkus wrote: On 08/06/2013 05:29 AM, Bruce Momjian wrote: Let's look at the problems: * remote users can lock themselves out of the server * interconnected GUC variables are complex to change * need a way to disable this feature Given the above, I am not sure I see a way forward for ALTER SYSTEM SET. One compromise that avoids the problems above would be to have a limited feature called ALTER LOG SET that only controls logging parameters, but I am not sure there is enough use-case there. This is not a zero-cost feature as we would be giving people _two_ ways to configure Postgres, and two ways to find a fix if it is broken, so we have to have a clear win to implement this. Also, if you have broken the system via ALTER SYSTEM SET, you might need to edit flat files to fix it, adding further complexity and limitations if someone only knows the SQL method of configuration. Given that, and the problems above, I reluctantly just don't see how this features moves us forward. Well said. I wish I had good news to say well. ;-) I'd like to look at use cases, and let's see how ALTER SYSTEM SET addresses or doesn't address these use cases. I'd really like it if some other folks also posted use cases they know of. (1) Making is easier for GUIs to manage PostgreSQL settings. One of the traps here is that while it makes it easier, it also could trap the user if they don't have the knowledge to fix a problem because would need to acquire the knowledge while they are trying to fix the problem, rather then while they are making the initial change. (2) Enabling DBAAS services to give users limited control over settings. Right. This could be accomplished by giving users a function API for certain features, and then marking the function as SECURITY DEFINER. However, I am unclear how to do this in a generic way. Once neat idea would be to have a lookup table define which setting the administrator wishes to allow for non-superusers. If adminpack has an API to change postgresql.conf, it would be easy to create a function with SECURITY DEFINER permissions that SET lookup-allowed values in postgresql.conf. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] latest pgbench results
On Tue, Aug 6, 2013 at 12:04 PM, Fabien COELHO coe...@cri.ensmp.fr wrote: How long did it run? Each run was 30 minutes. As there has been some changes in pgbench, would it make sense to run the same pgbench version (whatever) against the different servers? I used the master branch's version of pgbench for all tests. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Disabling ALTER SYSTEM SET WAS: Re: ALTER SYSTEM SET command to change postgresql.conf parameters
Josh Berkus escribió: (2) Enabling DBAAS services to give users limited control over settings. (5) Enabling new ways of writing Puppet/Chef/etc. scripts, which can check a setting before changing it. Surely these two cases are better covered by conf.d. For (2), DBaaS providers could offer pre-cooked snippets that are dropped in conf.d at the click of a button. For (5) it should be obvious that conf.d is better than ALTER SYSTEM. Not clear on (4); I think conf.d would also solve that one. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Disabling ALTER SYSTEM SET WAS: Re: ALTER SYSTEM SET command to change postgresql.conf parameters
On Tue, Aug 6, 2013 at 3:31 PM, Bruce Momjian br...@momjian.us wrote: I'd like to look at use cases, and let's see how ALTER SYSTEM SET addresses or doesn't address these use cases. I'd really like it if some other folks also posted use cases they know of. (1) Making is easier for GUIs to manage PostgreSQL settings. One of the traps here is that while it makes it easier, it also could trap the user if they don't have the knowledge to fix a problem because would need to acquire the knowledge while they are trying to fix the problem, rather then while they are making the initial change. I think the more serious problem here is not about knowledge acquisition, but access to problem-fixing means. As was said some posts ago, if a DBA has access to a superuser account but not to server configuration files, he can lock himself out for good. No amount of knowledge will avail him/her. That's bad. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: Proposal for Allow postgresql.conf values to be changed via SQL [review])
2013-08-06 19:41 keltezéssel, Bruce Momjian írta: On Tue, Aug 6, 2013 at 06:34:35PM +0200, Boszormenyi Zoltan wrote: 2013-08-05 16:01 keltezéssel, Stephen Frost írta: * Greg Stark (st...@mit.edu) wrote: On Fri, Aug 2, 2013 at 4:05 PM, Stephen Frost sfr...@snowman.net wrote: I'm not even clear we do want this in /etc since none of our GUC options are repeatable things like Apache virtual servers. It actually makes *more* sense for pg_hba than it does for gucs. I think we can assume that in the future we'll have something like it however. I tend to agree with this also, though I can imagine wanting to separate things in a conf.d directory ala exim's conf.d directories, to allow tools like puppet to manage certain things environment-wide (perhaps krb_server_keyfile) while other configuration options are managed locally. Extensions are actually a pretty good argument for why conf.d in /etc (or wherever the non-auto-config is) is pretty important useful. That's the kind of thing conf.d directories are meant for. A user can install a package containing an extension and the extension would automatically drop in the config entries needed in that directory. Agreed, though I think there should be a difference between shared library load being added-to for extensions, and random extension-specific GUC.. Now that you mention shared library load, it may be a good idea to add an append-to-this-GUC flag instead of overwriting the previous value. Two GUCs may make use of it: shared_preload_libraries and local_preload_libraries. It would make packagers' of extensions and DBA's lives easier. 'search_path' might also use it, though we might need append and prepend. Indeed. Although I was thinking along the lines of the GUC parser, so: shared_preload_library += 'some_new_lib' would append to the old value. Maybe += is intuitive enough for prepending, or someone may come up with a better idea. But the extra flag would still be needed to indicate the GUC is a list, so these new operators are usable on them and not on regular GUCs. Best regards, Zoltán Böszörményi -- -- Zoltán Böszörményi Cybertec Schönig Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] DATE type output does not follow datestyle parameter
On Tue, Aug 6, 2013 at 12:09:53PM -0400, Tom Lane wrote: Bruce Momjian br...@momjian.us writes: Is this format so old that we can't fix this? Yes. I don't see any reason to change it, either, as nobody has complained that it's actually bad. If you feel a compulsion to change the docs, do that. OK, seems 'Postgres' is a unique output format for 'date' too, even though it doesn't look like the 'Postgres' timestamp output: default SET datestyle = 'ISO, MDY'; SELECT current_timestamp, current_date; SET now |date ---+ 2013-08-06 16:18:48.218555-04 | 2013-08-06 SET datestyle = 'SQL, MDY'; SELECT current_timestamp, current_date; SET now |date + 08/06/2013 16:18:43.054488 EDT | 08/06/2013 SET datestyle = 'German, MDY'; SELECT current_timestamp, current_date; SET now |date + 06.08.2013 16:18:59.026553 EDT | 06.08.2013 MDY SET datestyle = 'Postgres, MDY'; SELECT current_timestamp, current_date; SET now |date -+ Tue Aug 06 16:18:53.590548 2013 EDT | 08-06-2013 DMY SET datestyle = 'Postgres, DMY'; SELECT current_timestamp, current_date; SET now |date -+ Tue 06 Aug 16:20:23.902549 2013 EDT | 06-08-2013 I don't think there is even a documentation change I can suggest. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Doc Patch: Subquery section to say that subqueries can't modify data
Hi, The attached documentation patch, doc-subqueries-v1.patch, applies against head. I wanted to document that subqueries can't modify data. This is mentioned in the documentation for SELECT and implied elsewhere but I was looking for something more than an 'in-passing' mention. (I wrote a bad query, modifying data in a subquery, couldn't recall where it was documented that you can't do this, and couldn't find the answer from the TOC or the index. Now that there's lots of statements with RETURNING clauses it's natural to want to use them in subqueries.) There seemed to be no good place to put this in the tutorial section of the documentation. I wound up adding a small, 2 paragraph, section describing subqueries to the chapter on queries. Although the first paragraph echos what's already documented the utility of subqueries is such that it's nice to have a place in the tutorial that serves as a single point of reference. The alternative seemed to be to put the 2nd paragraph in 9.22. Subquery Expressions, and this didn't seem to fit well. The last 2 sentences of the first paragraph are something in the way of helpful hints and may not be appropriate, or even accurate. I've left them in for review. Regards, Karl k...@meme.com Free Software: You don't pay back, you pay forward. -- Robert A. Heinlein diff --git a/doc/src/sgml/queries.sgml b/doc/src/sgml/queries.sgml index d7b0d73..acc6686 100644 --- a/doc/src/sgml/queries.sgml +++ b/doc/src/sgml/queries.sgml @@ -549,7 +549,7 @@ SELECT * FROM my_table AS m WHERE my_table.a gt; 5;-- wrong SELECT * FROM people AS mother JOIN people AS child ON mother.id = child.mother_id; /programlisting Additionally, an alias is required if the table reference is a - subquery (see xref linkend=queries-subqueries). + subquery (see xref linkend=queries-subquery-derived-tables). /para para @@ -590,10 +590,10 @@ SELECT a.* FROM (my_table AS a JOIN your_table AS b ON ...) AS c /para /sect3 - sect3 id=queries-subqueries -titleSubqueries/title + sect3 id=queries-subquery-derived-tables +titleSubquery Derived Tables/title -indexterm zone=queries-subqueries +indexterm zone=queries-subquery-derived-tables primarysubquery/primary /indexterm @@ -1315,6 +1315,46 @@ SELECT DISTINCT ON (replaceableexpression/replaceable optional, replaceab /sect1 + sect1 id=queries-subqueries + titleSubqueries/title + + indexterm zone=queries-subqueries + primarysubquery/primary + /indexterm + + indexterm zone=queries-subqueries + primarysub-select/primary + /indexterm + + para + Subqueries, also called sub-selects, are queries written within + parenthesis in the text of larger queries. The values produced by + subqueries may be scalar, used within expressions as described + in xref linkend=sql-syntax-scalar-subqueries, or tabular. When + tabular, subqueries may substitute for tables, as described + in xref linkend=queries-subquery-derived-tables, generate array + content, as described + in xref linkend=sql-syntax-array-constructors, have their + result content tested within expressions, as described + in xref linkend=functions-subquery, or be used in other + contexts. Often either joins or subqueries can be used to produce + different query plans yielding identical output. Which technique + is appropriate depends upon circumstance but it is worth noting + that more work has gone into query planner join optimization. + /para + + para + Subqueries may not modify database + content. link linkend=queries-withCommon Table + Expressions/link are one way to integrate data returned by data + modification statements, + i.e. commandINSERT/command/commandUPDATE/command/commandDELETE/command + statements with literalRETURNING/literal clauses, into larger + queries. + /para + /sect1 + + sect1 id=queries-union titleCombining Queries/title -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review: UNNEST (and other functions) WITH ORDINALITY
On Mon, Aug 5, 2013 at 1:31 AM, Robert Haas robertmh...@gmail.com wrote: This looks like really nice work. It does. It's functionally equivalent to my attempt but with much better comments and cleaner code. But it doesn't seem to cover the case I was stumped on, namely nulls first appearing in a place where two unreserved keywords can appear consecutively. This doesn't happen for WITH_* before with is a reserved keyword. Looking into it a bit I see that the case I was most worried about is actually a non-issue. We don't support column aliases without AS unless the alias is completely unknown to the parser. That seems a bit of a strange rule that must make the syntax with the missing AS pretty unreliable if people are looking for code that will continue to work in future versions. I never knew about this. The only other case I could come up with in my regression tests is pretty esoteric: CREATE COLLATION nulls (locale='C'); ALTER OPERATOR CLASS text_ops USING btree RENAME TO first; CREATE TABLE nulls_first(t text); CREATE INDEX nulls_first_i ON nulls_first(t COLLATE nulls first); I'm not 100% sure there aren't other cases where this can occur though. -- greg
[HACKERS] Re: Doc Patch: Subquery section to say that subqueries can't modify data
Instead of simply expanding the section on sub-queries, which may still be worthwhile, it seems that we have effectively introduced a new kind of query - namely one that mixes both query DDL and update DDL into a kind of hybrid query. An entire section describing the means to implement these queries and the limitations thereof would seem advisable as the current material is spread throughout the documentation. Some areas to address would: Select queries that cause/utilize: function-based modifications CTE-based modifications FDW/dblink-based modifications I guess the main question is if someone were to put this together would it likely be included in the queries section of the documentation. Also, are there any other thoughts to add; and is something like this documented in a ToDo somewhere already? The proposed patch; while warranting a technical review (namely that the presence of functions in a sub-select can cause the sub-query to update the database), seems to add one more place to go find this information without adding a central index or summary that someone learning the system could directly comprehend/learn as opposed to it being some allowed/disallowed side-effect to something else. David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/Doc-Patch-Subquery-section-to-say-that-subqueries-can-t-modify-data-tp5766574p5766580.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] refactor heap_deform_tuple guts
Hi, heap_deform_tuple and slot_deform_tuple contain duplicated code. This patch refactors them so that the guts are in a single place. I have checked the resulting assembly code for heap_deform_tuple, and with the inline declaration, the gcc version I have (4.7.2) generates almost identical output both after the patch than before, thus there shouldn't be any slowdown. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services diff --git a/src/backend/access/common/heaptuple.c b/src/backend/access/common/heaptuple.c index e39b977..4afd998 100644 --- a/src/backend/access/common/heaptuple.c +++ b/src/backend/access/common/heaptuple.c @@ -865,6 +865,109 @@ heap_modifytuple(HeapTuple tuple, return result; } + +/* + * heap_deconstruct_tuple + * Guts of attribute extraction from a heap tuple. See heap_deform_tuple + * and slot_deform_tuple for the user-friendly versions. + * + * Its arguments are: + * tupdata pointer to the tuple data area + * t_bits pointer to the tuple null bitmask + * hasnulls has nulls bit in tuple infomask + * start_attnum attnum at which to start decoding + * end_attnum attnum at which to stop decoding + * att the tuple's TupleDesc Form_pg_attribute pointer + * offset offset (into tp) at which start_attnum starts. + * slowly whether the cache can be used; updated same as offset. + * values output values + * isnull output nulls + * + * offset and slowly are updated to be the values corresponding to the next + * attribute to be decoded; thus this function can be used to deconstruct a + * tuple incrementally. For the first call, start_attnum and offset should + * both be zero; a subsequent call would pass a start_attnum equal to the + * previous end_attnum. + * + * If hasnulls is false, NULL can be passed for t_bits. + */ +static inline void +heap_deconstruct_tuple(char *tupdata, + bits8 *t_bits, + bool hasnulls, + int start_attnum, + int end_attnum, + Form_pg_attribute *att, + Datum *values, + bool *isnull, + long *offset, + bool *slowly) +{ + int attnum; + bool slow; + long off; + + off = *offset; + slow = *slowly; + + for (attnum = start_attnum; attnum end_attnum; attnum++) + { + Form_pg_attribute thisatt = att[attnum]; + + if (hasnulls att_isnull(attnum, t_bits)) + { + values[attnum] = (Datum) 0; + isnull[attnum] = true; + slow = true; /* can't use attcacheoff anymore */ + continue; + } + + isnull[attnum] = false; + + if (!slow thisatt-attcacheoff = 0) + off = thisatt-attcacheoff; + else if (thisatt-attlen == -1) + { + /* + * We can only cache the offset for a varlena attribute if the + * offset is already suitably aligned, so that there would be no + * pad bytes in any case: then the offset will be valid for either + * an aligned or unaligned value. + */ + if (!slow +off == att_align_nominal(off, thisatt-attalign)) +thisatt-attcacheoff = off; + else + { +off = att_align_pointer(off, thisatt-attalign, -1, + tp + off); +slow = true; + } + } + else + { + /* not varlena, so safe to use att_align_nominal */ + off = att_align_nominal(off, thisatt-attalign); + + if (!slow) +thisatt-attcacheoff = off; + } + + values[attnum] = fetchatt(thisatt, tp + off); + + off = att_addlength_pointer(off, thisatt-attlen, tp + off); + + if (thisatt-attlen = 0) + slow = true; /* can't use attcacheoff anymore */ + } + + *slowly = slow; + *offset = off; + + return attnum; +} + + /* * heap_deform_tuple * Given a tuple, extract data into values/isnull arrays; this is @@ -910,56 +1013,8 @@ heap_deform_tuple(HeapTuple tuple, TupleDesc tupleDesc, off = 0; - for (attnum = 0; attnum natts; attnum++) - { - Form_pg_attribute thisatt = att[attnum]; - - if (hasnulls att_isnull(attnum, bp)) - { - values[attnum] = (Datum) 0; - isnull[attnum] = true; - slow = true; /* can't use attcacheoff anymore */ - continue; - } - - isnull[attnum] = false; - - if (!slow thisatt-attcacheoff = 0) - off = thisatt-attcacheoff; - else if (thisatt-attlen == -1) - { - /* - * We can only cache the offset for a varlena attribute if the - * offset is already suitably aligned, so that there would be no - * pad bytes in any case: then the offset will be valid for either - * an aligned or unaligned value. - */ - if (!slow -off == att_align_nominal(off, thisatt-attalign)) -thisatt-attcacheoff = off; - else - { -off = att_align_pointer(off, thisatt-attalign, -1, - tp + off); -slow = true; - } - } - else - { - /* not varlena, so safe to use att_align_nominal */ - off = att_align_nominal(off, thisatt-attalign); - - if (!slow) -thisatt-attcacheoff = off; - } - - values[attnum] = fetchatt(thisatt, tp + off); - - off = att_addlength_pointer(off, thisatt-attlen, tp + off); - - if (thisatt-attlen = 0) - slow = true; /* can't
Re: [HACKERS] Re: Doc Patch: Subquery section to say that subqueries can't modify data
Good points. On 08/06/2013 05:15:28 PM, David Johnston wrote: Instead of simply expanding the section on sub-queries, which may still be worthwhile, it seems that we have effectively introduced a new kind of query - namely one that mixes both query DDL and update DDL into a kind of hybrid query. An entire section describing the means to implement these queries and the limitations thereof would seem advisable as the current material is spread throughout the documentation. The proposed patch is an attempt to provide a base upon which to build such a section. Some areas to address would: Select queries that cause/utilize: function-based modifications CTE-based modifications FDW/dblink-based modifications While it'd be nice to have a full set of examples and an exhaustive list of what constitutes modification and what does not it should be enough to state where, in this sort of hybrid query, modification is allowed and where not. Of course more detail is needed if the different kinds of modification above are restricted in different ways. I guess the main question is if someone were to put this together would it likely be included in the queries section of the documentation. If this section is not to be part of the Query chapter then it surely also does not belong in the Data Manipulation chapter (or the Data Definition chapter), if for no other reason than the information presented in the Query chapter is necessary to understand the subject. To me that means it needs it's own chapter, probably immediately following the Query chapter. I can't think what to call such a chapter. The proposed patch; while warranting a technical review (namely that the presence of functions in a sub-select can cause the sub-query to update the database), seems to add one more place to go find this information without adding a central index or summary that someone learning the system could directly comprehend/learn as opposed to it being some allowed/disallowed side-effect to something else. I'm less worried about data modifying functions than I am about the patch's language regards other sorts of modifications. Although unstated, it should be clear that data modifying functions that are executed with a SELECT statement do modify data. Where the patch is lacking is noting that schema alterations and FDW modifications also have restrictions. I don't feel particularly qualified regards where either are allowed, or not, although I could probably get it right after some research. Regards, Karl k...@meme.com Free Software: You don't pay back, you pay forward. -- Robert A. Heinlein -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review: UNNEST (and other functions) WITH ORDINALITY
On Tue, Aug 06, 2013 at 11:10:11PM +0100, Greg Stark wrote: On Mon, Aug 5, 2013 at 1:31 AM, Robert Haas robertmh...@gmail.com wrote: This looks like really nice work. It does. It's functionally equivalent to my attempt but with much better comments and cleaner code. But it doesn't seem to cover the case I was stumped on, namely nulls first appearing in a place where two unreserved keywords can appear consecutively. This doesn't happen for WITH_* before with is a reserved keyword. Looking into it a bit I see that the case I was most worried about is actually a non-issue. We don't support column aliases without AS unless the alias is completely unknown to the parser. That seems a bit of a strange rule that must make the syntax with the missing AS pretty unreliable if people are looking for code that will continue to work in future versions. I never knew about this. The only other case I could come up with in my regression tests is pretty esoteric: CREATE COLLATION nulls (locale='C'); ALTER OPERATOR CLASS text_ops USING btree RENAME TO first; CREATE TABLE nulls_first(t text); CREATE INDEX nulls_first_i ON nulls_first(t COLLATE nulls first); I am pretty sure we dismiss as pilot error foolishness at levels much lower than this. I'm not 100% sure there aren't other cases where this can occur though. If you don't find one considerably simpler, I'm inclined to say we should let it lie, possibly with docs--even user-visible ones if you think it's appropriate. Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] refactor heap_deform_tuple guts
On Tue, Aug 6, 2013 at 6:32 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: heap_deform_tuple and slot_deform_tuple contain duplicated code. This patch refactors them so that the guts are in a single place. I have checked the resulting assembly code for heap_deform_tuple, and with the inline declaration, the gcc version I have (4.7.2) generates almost identical output both after the patch than before, thus there shouldn't be any slowdown. Although I'm generally in favor of eliminating duplicated code, I have to admit that in this case I'm not sure I see the point. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unsafe GUCs and ALTER SYSTEM WAS: Re: ALTER SYSTEM SET
On 8/5/13 2:36 PM, Josh Berkus wrote: Most of our users not on Heroku are running with superuser as the app user now. Like, 95% of them based on my personal experience (because our object permissions management sucks). My percentage wouldn't be nearly that high. 95% of database installs done by developers? That I could believe. But setups done by ex-{Oracle,DB2,Sybase...} DBAs avoid superuser rights for the application, and that's around 1/3 of the systems I see. In general I agree that there certainly are a lot of superuser only installs out there, I just don't think it's quite as bad as you're painting it. The important thing to realize here is that ALTER SYSTEM SET is a convenience function. It’s useful to the extent that it makes people feel more comfortable with changing things in the database. We will never stop people from shooting themselves in the foot. And if the barriers added for that purpose are too high, like disabling changes to shared_buffers altogether, all you’ll do is make this so restricted that it doesn’t satisfy anyone. The original spec I outlined for how to implement this feature allowed disabling the whole thing. Then it was just commenting out the includedir directive from the postgresql.conf. Allowing people to disable use of this feature in a managed configuration environment is important. Beyond that, I don’t expect that we’ll ever make this foolproof. After arguing out this topic in person with Stephen last night, I’ve lumped ideas here into three major approaches that could be followed: 1) Don’t allow changing unsafe GUCs. 2) Provide a way for the server to start with bad setting and force the administrator to fix the problem they introduced. Some sort of maintenance mode that only allows superuser connections would force someone to clean up this class of problem at next restart, while still giving them enough access to run a new ALTER SYSTEM SET command. 3) Require extra syntax to modify an unsafe GUC. As far as fixing the bad settings goes, there’s already code in initdb to detect how high shared_buffers and max_connections can go. Any bogus shared_buffers/max_connections value above the kernel limits could be worked around with that approach. Here’s how I would try and communicate that a change is unsafe, only allowing it after proving user is paying some attention to the danger: # ALTER SYSTEM SET shared_buffers = ‘8GB’; NOTICE: Changing shared_buffers only takes effect after a server restart. ERROR: Changing shared_buffers incorrectly can prevent the server from starting normally. Use the FORCE option to modify the value anyway. # ALTER SYSTEM SET shared_buffers = ‘8GB’ FORCE; NOTICE: Changing shared_buffers only takes effect after a server restart. ALTER SYSTEM Will bad examples pop up in the Internet that just use FORCE all the time? Sure they will, and people will cut and paste them without paying attention. I don't see why that possibility has to block this feature from being adopted though. That line of thinking leads toward removing trust authentication, because that's similarly abused with cut and paste tutorials. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unsafe GUCs and ALTER SYSTEM WAS: Re: ALTER SYSTEM SET
On Tue, Aug 6, 2013 at 09:24:47PM -0400, Greg Smith wrote: # ALTER SYSTEM SET shared_buffers = ‘8GB’ FORCE; NOTICE: Changing shared_buffers only takes effect after a server restart. ALTER SYSTEM Will bad examples pop up in the Internet that just use FORCE all the time? Sure they will, and people will cut and paste them without paying attention. I don't see why that possibility has to block this feature from being adopted though. That line of thinking leads toward removing trust authentication, because that's similarly abused with cut and paste tutorials. We already have six levels of GUC settings: postgresql.conf user database session function subtransaction If we add ALTER SYSTEM SET and config.d, we would then have eight. ALTER SYSTEM SET seems to add an entirely new set of behaviors and complexity. Is that really what we want? If we do this, perhaps we should unconditionally just print the file name they have to delete to undo the operation in case the server doesn't start; I am unclear we can clearly identify all the GUC settings that could cause a server not to start. Also, I think we need a SHOW SYSTEM command so users can see their settings via SQL. FYI, ALTER SYSTEM SET is hitting the same problems we would have if pg_hba.conf were set in SQL and in flat files. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unsafe GUCs and ALTER SYSTEM WAS: Re: ALTER SYSTEM SET
From: Greg Smith [mailto:g...@2ndquadrant.com] Sent: Wednesday, August 07, 2013 6:55 AM To: Josh Berkus Cc: Stephen Frost; Bruce Momjian; Greg Stark; Andres Freund; Alvaro Herrera; Fujii Masao; Robert Haas; Amit Kapila; Dimitri Fontaine; pgsql-hackers@postgresql.org; Tom Lane Subject: Re: [HACKERS] Unsafe GUCs and ALTER SYSTEM WAS: Re: ALTER SYSTEM SET On 8/5/13 2:36 PM, Josh Berkus wrote: Most of our users not on Heroku are running with superuser as the app user now. Like, 95% of them based on my personal experience (because our object permissions management sucks). My percentage wouldn't be nearly that high. 95% of database installs done by developers? That I could believe. But setups done by ex-{Oracle,DB2,Sybase...} DBAs avoid superuser rights for the application, and that's around 1/3 of the systems I see. In general I agree that there certainly are a lot of superuser only installs out there, I just don't think it's quite as bad as you're painting it. The important thing to realize here is that ALTER SYSTEM SET is a convenience function. It’s useful to the extent that it makes people feel more comfortable with changing things in the database. We will never stop people from shooting themselves in the foot. And if the barriers added for that purpose are too high, like disabling changes to shared_buffers altogether, all you’ll do is make this so restricted that it doesn’t satisfy anyone. The original spec I outlined for how to implement this feature allowed disabling the whole thing. Then it was just commenting out the includedir directive from the postgresql.conf. Allowing people to disable use of this feature in a managed configuration environment is important. Beyond that, I don’t expect that we’ll ever make this foolproof. After arguing out this topic in person with Stephen last night, I’ve lumped ideas here into three major approaches that could be followed: 1) Don’t allow changing unsafe GUCs. 2) Provide a way for the server to start with bad setting and force the administrator to fix the problem they introduced. Some sort of maintenance mode that only allows superuser connections would force someone to clean up this class of problem at next restart, while still giving them enough access to run a new ALTER SYSTEM SET command. 3) Require extra syntax to modify an unsafe GUC. I also believe this is the right way to move forward and similar thoughts of mine are shared in mail below: http://www.postgresql.org/message-id/00a301ce91a3$09226970$1b673c50$@kap...@huawei.com I think for unsafe parameter's, for initial patch may be we can choose limited number of parameters. As far as fixing the bad settings goes, there’s already code in initdb to detect how high shared_buffers and max_connections can go. Any bogus shared_buffers/max_connections value above the kernel limits could be worked around with that approach. Here’s how I would try and communicate that a change is unsafe, only allowing it after proving user is paying some attention to the danger: # ALTER SYSTEM SET shared_buffers = ‘8GB’; NOTICE: Changing shared_buffers only takes effect after a server restart. ERROR: Changing shared_buffers incorrectly can prevent the server from starting normally. Use the FORCE option to modify the value anyway. # ALTER SYSTEM SET shared_buffers = ‘8GB’ FORCE; NOTICE: Changing shared_buffers only takes effect after a server restart. ALTER SYSTEM Will bad examples pop up in the Internet that just use FORCE all the time? Sure they will, and people will cut and paste them without paying attention. I don't see why that possibility has to block this feature from being adopted though. That line of thinking leads toward removing trust authentication, because that's similarly abused with cut and paste tutorials. With Regards, Amit Kapila. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unsafe GUCs and ALTER SYSTEM WAS: Re: ALTER SYSTEM SET
On Wednesday, August 07, 2013 8:01 AM Bruce Momjian wrote: On Tue, Aug 6, 2013 at 09:24:47PM -0400, Greg Smith wrote: # ALTER SYSTEM SET shared_buffers = ‘8GB’ FORCE; NOTICE: Changing shared_buffers only takes effect after a server restart. ALTER SYSTEM Will bad examples pop up in the Internet that just use FORCE all the time? Sure they will, and people will cut and paste them without paying attention. I don't see why that possibility has to block this feature from being adopted though. That line of thinking leads toward removing trust authentication, because that's similarly abused with cut and paste tutorials. We already have six levels of GUC settings: postgresql.conf user database session function subtransaction If we add ALTER SYSTEM SET and config.d, we would then have eight. ALTER SYSTEM SET seems to add an entirely new set of behaviors and complexity. Is that really what we want? If we do this, perhaps we should unconditionally just print the file name they have to delete to undo the operation in case the server doesn't start; I am unclear we can clearly identify all the GUC settings that could cause a server not to start. Also, I think we need a SHOW SYSTEM command so users can see their settings via SQL. Although users can see the settings in pg_settings as it has sourcefile, but such a command can be useful. FYI, ALTER SYSTEM SET is hitting the same problems we would have if pg_hba.conf were set in SQL and in flat files. With Regards, Amit Kapila. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] how to pass data (tuples) to worker processes?
On Tuesday, August 06, 2013 6:29 PM Robert Haas wrote: On Sat, Aug 3, 2013 at 6:31 AM, Andrew Tipton and...@kiwidrew.com wrote: Robert: any chance you could share a few more details on the enhancements you're planning for bgworkers? I seem to recall reading that communicating with the dynamic bgworkers after they had been launched was next on your agenda... Yeah, it is. I'm working on a patch to allow additional shared memory segments to be created on the fly. The idea I'm working with is that a backend that plans to launch a worker will first create a dynamic shared memory segment, then pass the ID of that segment to the worker via bgw_main_arg. The worker will map the segment, and then the two processes can use that to communicate. My thought is to create a queue abstraction that sits on top of the dynamic shared memory infrastructure, so that you can set aside a portion of your dynamic shared memory segment to use as a ring buffer and send messages back and forth with using some kind of API along these lines: extern void dsm_queue_send(dsm_queue *, char *data, uint64 len); extern uint64 dsm_queue_receive(dsm_queue *, char **dataptr); It would also be possible to implement message sending and receiving using pipes, but I'm leaning away from that because it would require even more OS-dependent code than I'm already having to write, and writing OS-dependent shim layers is one of the world's less-rewarding coding tasks; and also because I think it will be easier to achieve zero-copy semantics using shared memory. Another idea to get parallel tasks done by bgworkers is rather than dynamically invoking a new bgworker, we can have a set of pre-allocated bgworkers for a sever and then based on need allocate bgworker from pre-allocated array. Now we can allocate shared memory in the beginning based on bgworkers and the information needed to share between Backend and bgworkers (Plan, Tuple, snapshot, .. ). The basic idea can work as below: a. Backend who wishes to get parallel tasks done by bgworker will divide the tasks and check which bgworkers are free and share the plan in corresponding bgworker share memory. b. Bgworker who is polling on its slot of shared memory can retrieve the plan and execute it. c. Bgworker can share the tuples again in its shared memory slot d. Backend can retrieve tuples from shared memory slots of bgworkers where it has communicated the plan e. Backend can send the tuples back to client This idea has a drawback that queue of tuples to be shared has to be of fixed size as we need to allocate memory in beginning. With Regards, Amit Kapila. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers