Re: [HACKERS] pg_regress writes into source tree
On Fri, Dec 12, 2014 at 10:45 PM, Alvaro Herrera <alvhe...@2ndquadrant.com> wrote:
> Another thing in that patch was that I had to add the sql/ directory to the
> source tree, but other than that .gitignore file it was empty.  Maybe
> pg_regress should create the sql/ directory in the build dir if it doesn't
> exist.  This is only a problem if a pg_regress suite only runs stuff from
> input/, because otherwise the sql/ dir already exists in the source.

+1 for having pg_regress create the sql/ directory when it does not exist.
The current behavior is annoying for modules that have tests only in input/.
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes
On Thu, Dec 18, 2014 at 2:21 PM, Michael Paquier <michael.paqu...@gmail.com> wrote:
> On Wed, Dec 17, 2014 at 11:33 PM, Rahila Syed <rahilasye...@gmail.com> wrote:
>> I had a look at the code. I have a few minor points.
>
> Thanks!
>
>> +		bkpb.fork_flags |= BKPBLOCK_HAS_IMAGE;
>> +
>> +		if (is_compressed)
>>  		{
>> -			rdt_datas_last->data = page;
>> -			rdt_datas_last->len = BLCKSZ;
>> +			/* compressed block information */
>> +			bimg.length = compress_len;
>> +			bimg.extra_data = hole_offset;
>> +			bimg.extra_data |= XLR_BLCK_COMPRESSED_MASK;
>>
>> For consistency with the existing code, how about renaming the macro
>> XLR_BLCK_COMPRESSED_MASK to BKPBLOCK_HAS_COMPRESSED_IMAGE, on the lines
>> of BKPBLOCK_HAS_IMAGE?
>
> OK, why not...
>
>> +	blk->hole_offset = extra_data & ~XLR_BLCK_COMPRESSED_MASK;
>>
>> Here, I think that having the mask as BKPBLOCK_HOLE_OFFSET_MASK would be
>> more indicative of the fact that the lower 15 bits of the extra_data
>> field comprise the hole_offset value. This suggestion is also just to
>> achieve consistency with the existing BKPBLOCK_FORK_MASK for the
>> fork_flags field.
>
> Yeah, that seems clearer; let's define it as ~XLR_BLCK_COMPRESSED_MASK
> though.
>
>> And a comment typo:
>>
>> +	 * First try to compress block, filling in the page hole with zeros
>> +	 * to improve the compression of the whole. If the block is considered
>> +	 * as incompressible, complete the block header information as if
>> +	 * nothing happened.
>>
>> As the hole is no longer being compressed, this needs to be changed.
>
> Fixed, as well as an additional comment block further down.
>
> A couple of things noticed on the fly:
> - Fixed pg_xlogdump not being completely correct in reporting the FPW
>   information
> - A couple of typos and malformed sentences fixed
> - Added an assertion to check that the hole offset value does not use the
>   bit reserved for the compression status
> - Reworked docs, mentioning as well that wal_compression is off by default
> - Removed stuff in pg_controldata and XLOG_PARAMETER_CHANGE (mentioned by
>   Fujii-san)

Thanks!

+	else
+		memcpy(compression_scratch, page, page_len);

I don't think the block image needs to be copied to the scratch buffer
here. We can try to compress the page directly.

+#include "utils/pg_lzcompress.h"
 #include "utils/memutils.h"

pg_lzcompress.h should be after memutils.h.

+/* Scratch buffer used to store block image to-be-compressed */
+static char compression_scratch[PGLZ_MAX_BLCKSZ];

Isn't it better to allocate the memory for compression_scratch in
InitXLogInsert(), like hdr_scratch?

+	uncompressed_page = (char *) palloc(PGLZ_RAW_SIZE(header));

Why don't we allocate the buffer for the uncompressed page only once and
keep reusing it, like XLogReaderState->readBuf? The size of an
uncompressed page is at most BLCKSZ, so we can allocate the memory for it
even before knowing the real size of each block image.

-			printf(" (FPW); hole: offset: %u, length: %u\n",
-				   record->blocks[block_id].hole_offset,
-				   record->blocks[block_id].hole_length);
+			if (record->blocks[block_id].is_compressed)
+				printf(" (FPW); hole offset: %u, compressed length %u\n",
+					   record->blocks[block_id].hole_offset,
+					   record->blocks[block_id].bkp_len);
+			else
+				printf(" (FPW); hole offset: %u, length: %u\n",
+					   record->blocks[block_id].hole_offset,
+					   record->blocks[block_id].bkp_len);

We need to consider what info about FPWs we want pg_xlogdump to report.
I'd like to calculate how many bytes each FPW was compressed by from the
report of pg_xlogdump, so I'd like to see both the length of the
uncompressed FPW and that of the compressed one in the report.

In pg_config.h, the comment on BLCKSZ needs to be updated, because the
maximum size of BLCKSZ can be affected not only by the item IDs but also
by XLogRecordBlockImageHeader.

 	bool		has_image;
+	bool		is_compressed;

Doesn't ResetDecoder need to reset is_compressed?

+#wal_compression = off			# enable compression of full-page writes

Currently wal_compression compresses only FPWs, so isn't it better to
place it after full_page_writes in postgresql.conf.sample?

+	uint16		extra_data;		/* used to store offset of bytes in hole, with
+								 * last free bit used to check if block is
+								 * compressed */

At least to me, defining something like the following seems easier to read:

	uint16		hole_offset:15,
				is_compressed:1;

Regards,

--
Fujii Masao
Re: [HACKERS] Too strict check when starting from a basebackup taken off a standby
On 2014-12-16 18:37:48 +0200, Heikki Linnakangas wrote:
> On 12/11/2014 04:21 PM, Marco Nenciarini wrote:
>> On 11/12/14 12:38, Andres Freund wrote:
>>> On December 11, 2014 9:56:09 AM CET, Heikki Linnakangas
>>> <hlinnakan...@vmware.com> wrote:
>>>> On 12/11/2014 05:45 AM, Andres Freund wrote:
>>>> Yeah. I was not able to reproduce this, but I'm clearly missing
>>>> something, since both you and Sergey have seen this happening. Can you
>>>> write a script to reproduce?
>>>
>>> Not right now, I only have my mobile... It's quite easy though. Create
>>> a base backup from a standby. Create a recovery.conf with a broken
>>> primary_conninfo. Start. Shutdown. Fix the conninfo. Start.
>>
>> Just tested it. These steps are not sufficient to reproduce the issue on
>> a test installation. I suppose that is because, on a small test datadir,
>> the checkpoint location and the redo location in pg_control are the same
>> as those present in the backup_label. To trigger this bug you need to
>> have at least one restartpoint happen on the standby between the start
>> and the end of the backup. You could simulate it by issuing a checkpoint
>> on the master, then a checkpoint on the standby (to force a
>> restartpoint), then copying the pg_control from the standby. This way
>> I've been able to reproduce it.
>
> Ok, got it. I was able to reproduce this by using pg_basebackup
> --max-rate=1024, and issuing CHECKPOINT in the standby while the backup
> was running.

FWIW, I can reproduce it without any such hangups.

I've just tested it using my local scripts:

# create primary
$ reinit-pg-dev-master
$ run-pg-dev-master

# create first standby
$ reinit-pg-dev-master-standby
$ run-pg-dev-master-standby

# create 2nd standby
$ pg_basebackup -h /tmp/ -p 5441 -D /tmp/tree --write-recovery-conf
$ PGHOST=frakbar run-pg-dev-master-standby -D /tmp/tree
LOG:  creating missing WAL directory "pg_xlog/archive_status"
LOG:  entering standby mode
FATAL:  could not connect to the primary server: could not translate host name "frakbar" to address: Name or service not known
$ PGHOST=/tmp run-pg-dev-master-standby -D /tmp/tree
LOG:  started streaming WAL from primary at 0/200 on timeline 1
FATAL:  backup_label contains data inconsistent with control file
HINT:  This means that the backup is corrupted and you will have to use another backup for recovery.

After the fix I just pushed, that sequence works.

Greetings,

Andres Freund

--
Andres Freund	                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Minor improvement to explain.c
On Thu, Dec 18, 2014 at 12:52 PM, Etsuro Fujita
<fujita.ets...@lab.ntt.co.jp> wrote:
> Hi,
>
> The attached patch just removes one bad-looking blank line in the
> comments at the top of a function in explain.c.

Applied.

Regards,

--
Fujii Masao
Re: [HACKERS] Minor improvement to explain.c
(2014/12/18 17:34), Fujii Masao wrote:
> On Thu, Dec 18, 2014 at 12:52 PM, Etsuro Fujita
> <fujita.ets...@lab.ntt.co.jp> wrote:
>> The attached patch just removes one bad-looking blank line in the
>> comments at the top of a function in explain.c.
>
> Applied.

Thanks!

Best regards,
Etsuro Fujita
Re: [HACKERS] WALWriter active during recovery
On Tue, Dec 16, 2014 at 3:51 AM, Simon Riggs <si...@2ndquadrant.com> wrote:
> Currently, WALReceiver writes and fsyncs data it receives. Clearly, while
> we are waiting for an fsync we aren't doing any other useful work.
>
> Following patch starts WALWriter during recovery and makes it responsible
> for fsyncing data, allowing WALReceiver to progress other useful actions.

+1

> At present this is a WIP patch, for code comments only. Don't bother with
> anything other than code questions at this stage.
>
> Implementation questions are
>
> * How should we wake WALReceiver, since it waits on a poll(). Should we
>   use SIGUSR1, which is already used for latch waits, or another signal?

Probably we need to change libpqwalreceiver so that it uses the latch.
This is useful even for the startup process, to report the replay location
to the walreceiver in real time.

> * Should we introduce some pacing delays if the WALreceiver gets too far
>   ahead of apply?

I don't think so, for now. Instead, we can support synchronous_commit =
replay, and users can use that new mode if they are worried about the
delay of WAL replay.

> * Other questions you may have?

Who should wake the startup process so that it reads and replays the WAL
data? Currently the walreceiver does. But if the walwriter is responsible
for fsyncing WAL data, probably the walwriter should do that, because the
startup process should not replay WAL data which has not been fsync'd yet.

Regards,

--
Fujii Masao
Re: [HACKERS] Commitfest problems
On 16.12.2014 08:33, David Rowley wrote:
> On 16 December 2014 at 18:18, Josh Berkus <j...@agliodbs.com> wrote:
>>> Man. You're equating stuff that's not the same. You didn't get your way
>>> (and I'm tentatively on your side on that one) and take that to imply
>>> that we don't want more reviewers.
>>
>> During that thread a couple people said that novice reviewers added no
>> value to the review process, and nobody argued with them then. I've also
>> been told this to my face at pgCon, and when I've tried organizing patch
>> review events. I got the message, which is why I stopped trying to get
>> new reviewers. And frankly: if we're opposed to giving credit to patch
>> reviewers, we're opposed to having them.
>
> I'd just like to add something which might be flying below the radar of
> more senior people. There are people out there (like me) working on
> PostgreSQL more for the challenge and perhaps the love of the product,
> who make absolutely zero money out of it. For these people, getting
> credit where it's due is very important. I'm pretty happy with this at
> the moment, and I can't imagine any situation where not crediting
> reviewers would be beneficial to anyone.

I want to focus on this point a little more. I educate young people to
become developers, and I always try to encourage them to participate in
open source projects, for many reasons. In my experience people are very
different, and one of the key arguments for making someone become part of
a community is the credit they can earn. Finding their own name in the
release notes of a well-known project is very good for the ego and one of
their main motivations. This is especially important for *young* people.

It is also easy to argue that credit makes it easier to get a good job
later. If you can write "have a look at the release notes of $x, I made
the feature you're using" in the references of a job application, this is
a big plus for you. People are very different, but there are some groups
of motivations to address.

For me it is very unimportant whether you name me or not. 10 years ago I
would not have participated in anything without somebody pointing out
"this was my genius work" :(

Apart from this social problem, I believe there are some other problems in
the process of reviewing patches in the context of novice reviewers. I'm
trying to write those up in a separate note.

Greetings,
Torsten
Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes
>> Isn't it better to allocate the memory for compression_scratch in
>> InitXLogInsert() like hdr_scratch?

I think making compression_scratch a statically allocated global variable
is the result of the following earlier discussion:
http://www.postgresql.org/message-id/ca+tgmoaznbuwnls4bpwyqgqteeznoavy7kwdbm0a2-tbarn...@mail.gmail.com

Thank you,
Rahila Syed

On Thu, Dec 18, 2014 at 1:57 PM, Fujii Masao <masao.fu...@gmail.com> wrote:
> [...]
Re: [HACKERS] Streaming replication and WAL archive interactions
On Wed, Dec 17, 2014 at 4:11 AM, Heikki Linnakangas
<hlinnakan...@vmware.com> wrote:
> On 12/16/2014 10:24 AM, Borodin Vladimir wrote:
>> On 12 Dec 2014, at 16:46, Heikki Linnakangas <hlinnakan...@vmware.com>
>> wrote:
>>> There have been a few threads on the behavior of WAL archiving after a
>>> standby server is promoted [1] [2]. In short, it doesn't work as you
>>> might expect. The standby will start archiving after it's promoted, but
>>> it will not archive files that were replicated from the old master via
>>> streaming replication. If those files were not already archived on the
>>> master before the promotion, they are not archived at all. That's not
>>> good if you wanted to restore from a base backup + the WAL archive
>>> later.
>>>
>>> The basic setup is a master server, a standby, a WAL archive that's
>>> shared by both, and streaming replication between the master and
>>> standby. This should be a very common setup in the field, so how are
>>> people doing it in practice? Just live with the risk that you might
>>> miss some files in the archive if you promote? Don't even realize
>>> there's a problem? Something else?
>>
>> Yes, I do live like that (with streaming replication and a shared
>> archive between master and replicas) and don't even realize there's a
>> problem :( And I think I'm not the only one. Maybe at least a note
>> should be added to the documentation?
>
> Let's try to figure out a way to fix this in master, but yeah, a note in
> the documentation is in order.

+1

> And how would we like it to work? Here's a plan:
>
> Have a mechanism in the standby to track how far the master has archived
> its WAL, and don't throw away WAL in the standby that hasn't been
> archived in the master yet. This is similar to the physical replication
> slots, which prevent the master from recycling WAL that a standby hasn't
> received yet, but in reverse. I think we can use the .done and .ready
> files for this. Whenever a file is streamed (completely) from the master,
> create a .ready file for it. When we get an acknowledgement from the
> master that it has archived it, create a .done file for it. To get the
> information from the master, add the last archived WAL segment e.g. in
> the streaming replication keep-alive message, or invent a new message
> type for it.

Sounds OK to me.

How does this work in the cascade replication case? The cascading
walsender just relays the archive location to the downstream standby?

What happens when WAL streaming is terminated and the startup process
starts to read the WAL file from the archive? After reading the WAL file
from the archive, probably we would need to change the .ready files of
every older WAL file to .done.

Regards,

--
Fujii Masao
Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes
On Thu, Dec 18, 2014 at 7:31 PM, Rahila Syed <rahilasye...@gmail.com> wrote:
>>> Isn't it better to allocate the memory for compression_scratch in
>>> InitXLogInsert() like hdr_scratch?
>
> I think making compression_scratch a statically allocated global variable
> is the result of the following earlier discussion:
> http://www.postgresql.org/message-id/ca+tgmoaznbuwnls4bpwyqgqteeznoavy7kwdbm0a2-tbarn...@mail.gmail.com

	/*
	 * Permanently allocate readBuf.  We do it this way, rather than just
	 * making a static array, for two reasons: (1) no need to waste the
	 * storage in most instantiations of the backend; (2) a static char array
	 * isn't guaranteed to have any particular alignment, whereas palloc()
	 * will provide MAXALIGN'd storage.
	 */

The above source code comment in XLogReaderAllocate() makes me think that
it's better to avoid using a static array. The point (1) seems less
important in this case because most processes need the buffer for WAL
compression, though.

Regards,

--
Fujii Masao
Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes
On Thu, Dec 18, 2014 at 7:31 PM, Rahila Syed <rahilasye...@gmail.com> wrote:
>>> Isn't it better to allocate the memory for compression_scratch in
>>> InitXLogInsert() like hdr_scratch?
>
> I think making compression_scratch a statically allocated global variable
> is the result of the following earlier discussion:
> http://www.postgresql.org/message-id/ca+tgmoaznbuwnls4bpwyqgqteeznoavy7kwdbm0a2-tbarn...@mail.gmail.com

Yep, in this case the OS does not request this memory as long as it is not
touched, as when wal_compression is off all the time in the backend.
Robert mentioned that upthread.
--
Michael
Re: [HACKERS] speedup tidbitmap patch: cache page
On 18 December 2014 at 04:56, Teodor Sigaev <teo...@sigaev.ru> wrote:
>> You could well be right, but it would be good to compare the numbers
>> just so we know this for sure.
>
> I wasn't right :(
> # set work_mem='64kB';
> # set enable_seqscan = off;
> Patched: 1194.094 ms
> Master: 1765.338 ms
>
>> Are you seeing the same?
>
> Fixed too. The mistake was in the supposition that the current page
> becomes lossy in tbm_lossify. It's not always true, because tbm_lossify()
> could lossify only part of the pages and we don't know which.
> Thank you very much.

Oh, that makes sense. Though I wonder if you need to clear the caches at
all when calling tbm_lossify(). Surely a page never becomes un-lossified,
and plus, at least for lossy_page, it would never be set to the current
page anyway: it's either going to be set to InvalidBlockNumber, or some
other previous page that was lossy.

I also can't quite see the need to set page to NULL. Surely doing this
would just mean we'd have to look up the page again once tbm_lossify() is
called, if the next loop happened to be the same page? I think this would
only be needed if the hash lookup was going to return a new instance of
the page after we've lossified it, which from what I can tell won't
happen.

I've made these small changes, and just tweaked the comments a little
(attached).

I've also attached some benchmark results using your original table from
up-thread. It seems that caching whether the page was seen as lossy is
not much of a help in this test case. Did you find another one where you
saw some better gains?

Regards

David Rowley

Attachment: tbm_cache_benchmark.xlsx (MS-Excel 2007 spreadsheet)
Attachment: tbm_cachepage-2.2_dr.patch (binary data)
Re: [HACKERS] .gitignore config.cache and comment about regression.(out|diff)
On Tue, Dec 16, 2014 at 9:47 AM, Jim Nasby <jim.na...@bluetreble.com> wrote:
> config.cache is created when you pass -C to configure, which speeds it up
> considerably (3.5s vs 16.5s on my laptop). It would be nice to just
> ignore the cache file it generates.
>
> Originally this patch also ignored the regression output files, until I
> realized why that was a bad idea. Add a comment about that.

Applied.

Regards,

--
Fujii Masao
Re: [HACKERS] pgbench -f and vacuum
On Sun, Dec 14, 2014 at 11:43 AM, Tatsuo Ishii <is...@postgresql.org> wrote:
>> If we care enough about that case to attempt the vacuum anyway then we
>> need to do something about the error message; either squelch it or check
>> for the existence of the tables before attempting to vacuum. Since
>> there's no way to squelch it in the server logfile, I think checking for
>> the table is the right answer.
>
> Fair enough. I will come up with the check-for-table-before-vacuum
> approach.

+1 for this approach.

Regards,

--
Fujii Masao
Re: [HACKERS] TABLESAMPLE patch
Hi,

v2 of this patch is attached.

On 16/12/14 09:31, Petr Jelinek wrote:
> On 16/12/14 08:43, Jaime Casanova wrote:
>> Sadly when the jsonb functions patch was committed a few oids were used,
>> so you should update the ones you are using, at least to make the patch
>> easier for testing.
>
> Will do.

Done.

>> The test added for this failed, attached is the diff. i didn't look up
>> why it failed
>
> It isn't immediately obvious to me why, will look into it.

Fixed.

>> Finally, i created a view with a tablesample clause. i used the view and
>> the tablesample worked, then dumped and restored and the tablesample
>> clause went away... actually pg_get_viewdef() didn't see it at all.
>
> Yeah, as I mentioned in the submission the ruleutils support is not
> there yet, so that's expected.

Also fixed.

I also added proper costing/row estimation. I consider this patch feature
complete now; the docs could still use improvement though.

--
Petr Jelinek
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

diff --git a/doc/src/sgml/ref/select.sgml b/doc/src/sgml/ref/select.sgml
index 01d24a5..250ae29 100644
--- a/doc/src/sgml/ref/select.sgml
+++ b/doc/src/sgml/ref/select.sgml
@@ -49,7 +49,7 @@ SELECT [ ALL | DISTINCT [ ON ( <replaceable class="parameter">expression</replac
 <phrase>where <replaceable class="parameter">from_item</replaceable> can be one of:</phrase>
 
-    [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ [ AS ] <replaceable class="parameter">alias</replaceable> [ ( <replaceable class="parameter">column_alias</replaceable> [, ...] ) ] ]
+    [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ TABLESAMPLE <replaceable class="parameter">sampling_method</replaceable> ( <replaceable class="parameter">argument</replaceable> [, ...] ) [ REPEATABLE ( <replaceable class="parameter">seed</replaceable> ) ] ] [ [ AS ] <replaceable class="parameter">alias</replaceable> [ ( <replaceable class="parameter">column_alias</replaceable> [, ...] ) ] ]
     [ LATERAL ] ( <replaceable class="parameter">select</replaceable> ) [ AS ] <replaceable class="parameter">alias</replaceable> [ ( <replaceable class="parameter">column_alias</replaceable> [, ...] ) ]
     <replaceable class="parameter">with_query_name</replaceable> [ [ AS ] <replaceable class="parameter">alias</replaceable> [ ( <replaceable class="parameter">column_alias</replaceable> [, ...] ) ] ]
     [ LATERAL ] <replaceable class="parameter">function_name</replaceable> ( [ <replaceable class="parameter">argument</replaceable> [, ...] ] )
@@ -317,6 +317,38 @@ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ]
    </varlistentry>
 
    <varlistentry>
+    <term>TABLESAMPLE <replaceable class="parameter">sampling_method</replaceable> ( <replaceable class="parameter">argument</replaceable> [, ...] ) [ REPEATABLE ( <replaceable class="parameter">seed</replaceable> ) ]</term>
+    <listitem>
+     <para>
+      Table sample clause after
+      <replaceable class="parameter">table_name</replaceable> indicates that
+      a <replaceable class="parameter">sampling_method</replaceable> should
+      be used to retrieve subset of rows in the table.
+      The <replaceable class="parameter">sampling_method</replaceable> can be
+      one of:
+      <itemizedlist>
+       <listitem>
+        <para><literal>SYSTEM</literal></para>
+       </listitem>
+       <listitem>
+        <para><literal>BERNOULLI</literal></para>
+       </listitem>
+      </itemizedlist>
+      Both of those sampling methods currently accept only single argument
+      which is the percent (floating point from 0 to 100) of the rows to
+      be returned.
+      The <literal>SYSTEM</literal> sampling method does block level
+      sampling with each block having same chance of being selected and
+      returns all rows from each selected block.
+      The <literal>BERNOULLI</literal> scans whole table and returns
+      individual rows with equal probability.
+      The optional numeric parameter <literal>REPEATABLE</literal> is used
+      as random seed for sampling.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><replaceable class="parameter">alias</replaceable></term>
     <listitem>
      <para>
diff --git a/src/backend/access/Makefile b/src/backend/access/Makefile
index 21721b4..595737c 100644
--- a/src/backend/access/Makefile
+++ b/src/backend/access/Makefile
@@ -8,6 +8,7 @@ subdir = src/backend/access
 top_builddir = ../../..
 include $(top_builddir)/src/Makefile.global
 
-SUBDIRS = brin common gin gist hash heap index nbtree rmgrdesc spgist transam
+SUBDIRS = brin common gin gist hash heap index nbtree rmgrdesc spgist \
+	transam tsm
 
 include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/access/tsm/Makefile b/src/backend/access/tsm/Makefile
new file mode 100644
index 000..73bbbd7
--- /dev/null
+++ b/src/backend/access/tsm/Makefile
@@ -0,0 +1,17 @@
+#-------------------------------------------------------------------------
+#
+# Makefile--
+#    Makefile for access/tsm
Re: [HACKERS] Add generate_series(numeric, numeric)
On Mon, Dec 15, 2014 at 12:25 PM, Andrew Gierth
<and...@tao11.riddles.org.uk> wrote:
>>>>> "Fujii" == Fujii Masao <masao.fu...@gmail.com> writes:
>
>  Fujii> Pushed.
>
> Bug found:
>
> regression=# select count(*) from generate_series(1::numeric,10) v,
>                generate_series(1,v) w;
>  count
> -------
>      0
> (1 row)
>
> regression=# select count(*) from generate_series(1::numeric,10) v,
>                generate_series(1,v+0) w;
>  count
> -------
>     55
> (1 row)
>
> The error is in the use of PG_GETARG_NUMERIC and init_var_from_num when
> setting up the multi-call state; init_var_from_num points at the
> original num's digits rather than copying them, but start_num and
> stop_num have just been (potentially) detoasted in the per-call context,
> in which case the storage will have been freed by the next call.
>
> Obviously this could also be fixed by not detoasting the input until
> after switching to the multi-call context, but it looks to me like that
> would be unnecessarily complex. Suggested patch attached.

Pushed. Thanks!

Regards,

--
Fujii Masao
Re: [HACKERS] Add generate_series(numeric, numeric)
On Mon, Dec 15, 2014 at 2:38 PM, Andrew Gierth
<and...@tao11.riddles.org.uk> wrote:
>>>>> "Ali" == Ali Akbar <the.ap...@gmail.com> writes:
>
>  Ali> I think yes, it will be good. The alternative is restructuring
>  Ali> this paragraph in the SRF docs:
>
>   The memory context that is current when the SRF is called is a
>   transient context that will be cleared between calls. This means that
>   you do not need to call pfree on everything you allocated using
>   palloc; it will go away anyway. However, if you want to allocate any
>   data structures to live across calls, you need to put them somewhere
>   else. The memory context referenced by multi_call_memory_ctx is a
>   suitable location for any data that needs to survive until the SRF is
>   finished running. In most cases, this means that you should switch
>   into multi_call_memory_ctx while doing the first-call setup.
>
>  Ali> The important part "However, if you want to allocate any data
>  Ali> structures to live across calls, you need to put them somewhere
>  Ali> else." is buried in the docs.
>
>  Ali> But i think explicit warning is more noticeable for new
>  Ali> developer like me.
>
> I was thinking something like this, added just after that para:
>
>    <warning>
>     <para>
>      While the actual arguments to the function remain unchanged between
>      calls, if you detoast the argument values (which is normally done
>      transparently by the
>      <function>PG_GETARG_<replaceable>xxx</replaceable></function>
>      macro) in the transient context then the detoasted copies will be
>      freed on each cycle. Accordingly, if you keep references to such
>      values in your <structfield>user_fctx</structfield>, you must
>      either copy them into the
>      <structfield>multi_call_memory_ctx</structfield> after detoasting,
>      or ensure that you detoast the values only in that context.
>     </para>
>    </warning>

I'm OK with this.

Regards,

--
Fujii Masao
Re: [HACKERS] no test programs in contrib
Michael Paquier wrote: It would be good to be consistent on Windows with what is now done on other platforms: those modules should not be installed by default, but it would be good to make install.pm a bit smarter with, for example, a full option, aka install server + client + test modules. Building them is worth it in any case as they can be used with modulescheck. Sure, I agree with that. -- Álvaro Herrera    http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] Commitfest problems
On 17.12.2014 20:00, Stephen Frost wrote: * Jaime Casanova (ja...@2ndquadrant.com) wrote: On Tue, Dec 16, 2014 at 11:32 AM, Robert Haas robertmh...@gmail.com wrote: It has been proposed that we do a general list of people at the bottom of the release notes who helped review during that cycle. That would be less intrusive and possibly a good idea, but would we credit the people who did a TON of reviewing? Everyone who reviewed even one patch? Somewhere in between? Would committers be excluded because we just expect them to help or included because credit is important to established community members too? To what extent would this be duplicative of http://www.postgresql.org/community/contributors/ ? Not much really, I tried to get my name on that list a couple of years ago, when I reviewed more than I do now, and never got it. And while my name is in a couple of commit messages, that is a lot harder to show to people... Having your name in a list of other names at the bottom of the release notes page, without any indication of what you helped with, would work better? Perhaps it would but I tend to doubt it. Out of my personal experience in Germany: yes, it helps. It is not very logical, but many people need a simple way (a website rather than git log) to see something. (I've rarely seen something like that considered untrustworthy even when there are strong indications that it's faked.) But I think it is a good point that the release notes should not become too big. You know, it's kind of frustrating when some not-yet customers ask for certified engineers, and there isn't any official (as in from the community) certificate, so you need to prove you're a contributor with these random commit messages... Another thought I had was to suggest we consider *everyone* to be a contributor and implement a way to tie together the mailing list archives with the commit history and perhaps the commitfest app and make it searchable and indexed on some website.
eg: contributors.postgresql.org/sfrost - Recent commits - Recent commit mentions - Recent emails to any list - Recent commitfest app activity - Recent wiki page updates ... Ideally with a way for individuals to upload a photo, provide a company link, etc, similar to what the existing Major Contributors have today. Obviously, this is not a small task to develop and there is some risk of abuse (which I expect the other folks on the infra team will point out and likely tar and feather me for suggesting this at all..) but it might be along the same lines as Bruce's PgLife.. That's an interesting idea. I'm not convinced that this is the best solution, but it would help, if the community thinks this should be implemented. Greetings, Torsten
Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL
At 2014-12-16 13:28:07 -0500, sfr...@snowman.net wrote: The magic audit role has SELECT rights on a given table. When any user does a SELECT against that table, ExecCheckRTPerms is called and there's a hook there which the module can use to say ok, does the audit role have any permissions here? and, if the result is yes, then the command is audited. You're right, I did not understand that this is what you were proposing, and this is not what the code does. I went back and read your original description, and it seems I implemented only the subset I understood. I'll look into changing the code sometime next week. -- Abhijit
Re: [HACKERS] WIP patch for Oid formatting in printf/elog strings
Mark Dilger wrote: I've been going through a copy of the code and replacing int32 and uint32 with the appropriate type such as Oid, TransactionId, and such, and have created my own typedef for TypeModType, in order to help chase through the code and figure out what functions handle what kind of data. By defining TypeModType to int64, for example, I get lots of compiler errors when passing a TypeModType * into a function that takes int32 *, and that lets me know that the callers of that function think of it as a typemod argument, even if it does not have any clear documentation to that effect. The exercise is a bit tedious, as I have to change lots of strings like the value %d is out of bounds to something like the value TYPEMODTYPE_FORMAT is out of bounds, but the clarity I get from it helps enough that it is useful to me. If it weren't for the format string issue, which makes translations impossible, I'd say let's go ahead with these ideas. For all the drawbacks that a 64-bit TransactionId has, I'm sure there are people who would prefer that to the constant vacuuming the 32-bit system causes. But the int32/TypeModType thing looks like we could carry without causing any trouble at all. Not the format string part of it, though. How large is a patch that changes typmod from int32 to TypeModType? I can see value in having 64-bit OIDs, for databases with a large number of toasted items. Not in the default build, mind you. -- Álvaro Herrera    http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
[HACKERS] Table-level log_autovacuum_min_duration
Hi all, Today I spent a bit of time looking at the activity of autovacuum for one particularly bloated table. log_autovacuum_min_duration was enabled and set to a high value, but even with that I had to deal with some spam from the jobs of other tables. It would be cool to have the possibility to do some filtering IMO. So, what about having a relopt to control log_autovacuum_min_duration? RelationData->rd_options is largely accessible for a given relation in the code paths of vacuum and analyze. Thoughts? -- Michael
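If such a relopt existed, per-table usage would presumably look something like this (hypothetical syntax, since the feature is only being proposed in this thread; the table name is invented):

```sql
-- Hypothetical: verbose autovacuum logging for just the one bloated table
ALTER TABLE bloated_table SET (log_autovacuum_min_duration = 0);

-- ...while the global setting in postgresql.conf stays high enough
-- to silence autovacuum jobs on every other table:
--   log_autovacuum_min_duration = 10000
```

The relopt would override the GUC for that relation only, following the pattern of the existing per-table autovacuum_* storage parameters.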
Re: [HACKERS] exitArchiveRecovery woes
On Wed, Dec 17, 2014 at 8:40 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: At the end of archive recovery, we copy the last segment from the old timeline, to initialize the first segment on the new timeline. For example, if the timeline switch happens in the middle of WAL segment 00010005, the whole 00010005 segment is copied to become 00020005. The copying is necessary, so that the new segment contains valid data up to the switch point. However, we wouldn't really need to copy the whole segment, copying up to the switch point would be enough. In fact, copying the whole segment is a bad idea, because the copied WAL looks valid on the new timeline too. Your proposed change makes sense to me, but why do we need the segment to contain valid data up to the switch point? It seems like the switch between timelines should be crisper: replay WAL on the old timeline only from the old segment, and from the new timeline only on the new segment. Anything else seems like an invitation to unending corner-case bugs. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Proposal: Log inability to lock pages during vacuum
On Wed, Dec 17, 2014 at 11:20 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: LOG: automatic vacuum of table postgres.public.foo: index scans: 0 pages: 0 removed, 7256 remain, 0 pinned tuples: 79415 removed, 513156 remain, 0 are dead but not yet removable buffer usage: 14532 hits, 6 misses, 6241 dirtied avg read rate: 0.003 MB/s, avg write rate: 3.413 MB/s system usage: CPU 0.00s/0.30u sec elapsed 14.28 sec I.e. this just says how many pages were pinned, without saying what was done about them. That's not very meaningful to an average DBA, but that's true for many of the numbers printed in vacuum verbose. That message is extremely confusing, to my eyes. If you want to say pages: 0 removed, 7256 remain, 0 skipped due to pins, that would work for me, but just say 0 pinned is totally wrong, because vacuum pinned every page in the table. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL
Abhijit, * Abhijit Menon-Sen (a...@2ndquadrant.com) wrote: At 2014-12-16 13:28:07 -0500, sfr...@snowman.net wrote: The magic audit role has SELECT rights on a given table. When any user does a SELECT against that table, ExecCheckRTPerms is called and there's a hook there which the module can use to say ok, does the audit role have any permissions here? and, if the result is yes, then the command is audited. You're right, I did not understand that this is what you were proposing, and this is not what the code does. I went back and read your original description, and it seems I implemented only the subset I understood. I'll look into changing the code sometime next week. Ok, great, thanks! Stephen
Re: [HACKERS] Function to know last log write timestamp
On Fri, Nov 28, 2014 at 9:07 PM, Fujii Masao masao.fu...@gmail.com wrote: On Wed, Nov 26, 2014 at 4:05 PM, Michael Paquier michael.paqu...@gmail.com wrote: On Fri, Aug 15, 2014 at 8:17 PM, Fujii Masao masao.fu...@gmail.com wrote: On Fri, Aug 15, 2014 at 3:40 AM, Andres Freund and...@2ndquadrant.com wrote: On 2014-08-14 14:37:22 -0400, Robert Haas wrote: On Thu, Aug 14, 2014 at 2:21 PM, Andres Freund and...@2ndquadrant.com wrote: On 2014-08-14 14:19:13 -0400, Robert Haas wrote: That's about the idea. However, what you've got there is actually unsafe, because shmem->counter++ is not an atomic operation. It reads the counter (possibly even as two separate 4-byte loads if the counter is an 8-byte value), increments it inside the CPU, and then writes the resulting value back to memory. If two backends do this concurrently, one of the updates might be lost. All these are only written by one backend, so it should be safe. Note that that coding pattern, just without memory barriers, is all over pgstat.c Ah, OK. If there's a separate slot for each backend, I agree that it's safe. We should probably add barriers to pgstat.c, too. Yea, definitely. I think this is rather borked on weaker architectures. It's just that the consequences of an out of date/torn value are rather low, so it's unlikely to be noticed. Imo we should encapsulate the changecount modifications/checks somehow instead of repeating the barriers, Asserts, comments et al everywhere. So what about applying the attached patch first, which adds the macros to load and store the changecount with the memory barriers, and changes pgstat.c to use them. Maybe this patch needs to be back-patched to at least 9.4? After applying the patch, I will rebase the pg_last_xact_insert_timestamp patch and post it again. Hm, what's the status on this patch? The addition of those macros to control count increment with a memory barrier seems like a good thing at least. Thanks for reminding me of that! Barring any objection, I will commit it.
Applied. Regards, -- Fujii Masao
Re: [HACKERS] Table-level log_autovacuum_min_duration
Michael Paquier wrote: Hi all, Today I spent a bit of time looking at the activity of autovacuum for one particularly bloated table. log_autovacuum_min_duration was enabled and set to a high value, but even with that I had to deal with some spam from the jobs of other tables. It would be cool to have the possibility to do some filtering IMO. So, what about having a relopt to control log_autovacuum_min_duration? RelationData->rd_options is largely accessible for a given relation in the code paths of vacuum and analyze. +1 -- Álvaro Herrera    http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}
On 12/18/2014 01:02 AM, Peter Geoghegan wrote: On Wed, Dec 17, 2014 at 1:12 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: Now, let's imagine a table like this: CREATE TABLE persons ( username text unique, real_name text unique, data text ); Is there any way to specify both of those constraints, so that the insertion is IGNOREd if it violates either one of them? If you try to do: INSERT INTO persons(username, real_name, data) VALUES('foobar', 'foo bar') ON CONFLICT (username, real_name) IGNORE; It will fail because there is no unique index on (username, real_name). In this particular case, you could leave out the specification, but if there was a third constraint that you're not expecting to conflict with, you would want violations of that constraint to still throw an error. And you can't leave out the specification with ON CONFLICT UPDATE anyway. Good point. For the IGNORE case: I guess the syntax just isn't that flexible. I agree that that isn't ideal. It should be simple to allow multiple key specifications: INSERT INTO persons (username, real_name, data) VALUES('foobar', 'foo bar') ON CONFLICT (username), (real_name) IGNORE; It's a rather niche use case, but might as well support it for the sake of completeness. For the UPDATE case: Suppose your example was an UPDATE where we simply assigned the excluded.data value to the data column in the auxiliary UPDATE's targetlist. What would the user really be asking for with that command, at a really high level? It seems like they might actually want to run two UPSERT commands (one for username, the other for real_name), or rethink their indexing strategy - in particular, whether it's appropriate that there isn't a composite unique constraint on (username, real_name). Now, suppose that by accident or by convention it will always be possible for a composite unique index to be built on (username, real_name) - no dup violations would be raised if it was attempted, but it just hasn't been and won't be. 
In other words, it's generally safe to actually pretend that there is one. Then, surely it doesn't matter if the user picks one or the other unique index. It'll all work out when the user assigns to both in the UPDATE targetlist, because of the assumed convention that I think is implied by the example. If the convention is violated, at least you get a dup violation letting you know (iff you bothered to assign). But I wouldn't like to encourage that pattern. I think that the long and the short of it is that you really ought to have one unique index as an arbiter in mind when writing a DML statement for the UPDATE variant. Relying on this type of convention is possible, I suppose, but ill-advised. Another thought is that you might want to specify a different action depending on which constraint is violated: INSERT INTO persons (username, real_name, data) VALUES('foobar', 'foo bar') ON CONFLICT (username) IGNORE ON CONFLICT (real_name) UPDATE ...; Although that leaves the question of what to do if both are violated. Perhaps: INSERT INTO persons (username, real_name, data) VALUES('foobar', 'foo bar') ON CONFLICT (username, real_name) IGNORE ON CONFLICT (real_name) UPDATE username = excluded.username; ON CONFLICT (username) UPDATE real_name = excluded.real_name; 5. What if there are multiple unique indexes with the same columns, but different operator classes? I thought about that. I am reusing a little bit of the CREATE INDEX infrastructure for raw parsing, and for a small amount of parse analysis (conveniently, this makes the command reject things like aggregate functions with no additional code - the error messages only mention index expressions, so I believe that's fine). This could include an opclass specification, but right now non-default opclasses are rejected during extra steps in parse analysis, for no particular reason. 
I could easily have the unique index inference specification accept a named opclass, if you thought that was important, and you thought naming a non-default opclass by name was a good SQL interface. It would take only a little effort to support non-default opclasses. It's a little weird to mention an opclass by name. It's similar to naming an index by name, really. How about naming the operator? For an exclusion constraint, that would be natural, as the syntax to create an exclusion constraint in the first place is EXCLUDE USING gist (c WITH &&) Naming the index by columns makes sense in most cases, and I don't like specifying the index's name, but how about allowing naming a constraint? Indexes are just an implementation detail, but constraints are not. Unique and exclusion constraints are always backed by an index, so there is little difference in practice, but I would feel much more comfortable mentioning constraints by name than indexes. Most people would list the columns, but if there is a really bizarre constraint, with non-default opclasses, or an exclusion constraint, it's
Re: [HACKERS] Commitfest problems
On Wed, Dec 17, 2014 at 5:00 PM, Stephen Frost sfr...@snowman.net wrote: Another thought I had was to suggest we consider *everyone* to be a contributor and implement a way to tie together the mailing list archives with the commit history and perhaps the commitfest app and make it searchable and indexed on some website. eg: contributors.postgresql.org/sfrost - Recent commits - Recent commit mentions - Recent emails to any list - Recent commitfest app activity - Recent wiki page updates ... Ideally with a way for individuals to upload a photo, provide a company link, etc, similar to what the existing Major Contributors have today. Obviously, this is not a small task to develop and there is some risk of abuse (which I expect the other folks on the infra team will point out and likely tar and feather me for suggesting this at all..) but it might be along the same lines as Bruce's PgLife.. +1 Here in Brazil we need a better way to prove our contributions and involvement with the PostgreSQL Community. Regards, -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL Timbira: http://www.timbira.com.br Blog: http://fabriziomello.github.io Linkedin: http://br.linkedin.com/in/fabriziomello Twitter: http://twitter.com/fabriziomello Github: http://github.com/fabriziomello
Re: [HACKERS] Commitfest problems
On Thu, Dec 18, 2014 at 8:44 PM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: On Wed, Dec 17, 2014 at 5:00 PM, Stephen Frost sfr...@snowman.net wrote: Another thought I had was to suggest we consider *everyone* to be a contributor and implement a way to tie together the mailing list archives with the commit history and perhaps the commitfest app and make it searchable and indexed on some website. eg: contributors.postgresql.org/sfrost - Recent commits - Recent commit mentions - Recent emails to any list - Recent commitfest app activity - Recent wiki page updates ... Ideally with a way for individuals to upload a photo, provide a company link, etc, similar to what the existing Major Contributors have today. Obviously, this is not a small task to develop and there is some risk of abuse (which I expect the other folks on the infra team will point out and likely tar and feather me for suggesting this at all..) but it might be along the same lines as Bruce's PgLife.. +1 It does feel good to be acknowledged for our work especially when there is a policy to acknowledge this in our community. Regards, Atri
Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes
On Thu, Dec 18, 2014 at 5:27 PM, Fujii Masao masao.fu...@gmail.com wrote: Thanks! Thanks for your input. +else +memcpy(compression_scratch, page, page_len); I don't think the block image needs to be copied to the scratch buffer here. We can try to compress the page directly. Check. +#include "utils/pg_lzcompress.h" #include "utils/memutils.h" pg_lzcompress.h should be after memutils.h. Oops. +/* Scratch buffer used to store block image to-be-compressed */ +static char compression_scratch[PGLZ_MAX_BLCKSZ]; Isn't it better to allocate the memory for compression_scratch in InitXLogInsert() like hdr_scratch? Because the OS would not touch it if wal_compression is never used, but now that you mention it, it may be better to get that in the context of xlog_insert.. +uncompressed_page = (char *) palloc(PGLZ_RAW_SIZE(header)); Why don't we allocate the buffer for the uncompressed page only once and keep reusing it like XLogReaderState->readBuf? The size of the uncompressed page is at most BLCKSZ, so we can allocate the memory for it even before knowing the real size of each block image. OK, this would save some cycles. I was trying to make the process allocate a minimum of memory, only when necessary. -printf(" (FPW); hole: offset: %u, length: %u\n", - record->blocks[block_id].hole_offset, - record->blocks[block_id].hole_length); +if (record->blocks[block_id].is_compressed) +printf(" (FPW); hole offset: %u, compressed length %u\n", + record->blocks[block_id].hole_offset, + record->blocks[block_id].bkp_len); +else +printf(" (FPW); hole offset: %u, length: %u\n", + record->blocks[block_id].hole_offset, + record->blocks[block_id].bkp_len); We need to consider what info about FPW we want pg_xlogdump to report. I'd like to calculate how many bytes FPW was compressed by, from the report of pg_xlogdump. So I'd like to see both the length of the uncompressed FPW and that of the compressed one in the report. OK, so let's add a parameter in the decoder for the uncompressed length. Sounds fine?
In pg_config.h, the comment on BLCKSZ needs to be updated? Because the maximum size of BLCKSZ can be affected by not only itemid but also XLogRecordBlockImageHeader. Check. bool has_image; +bool is_compressed; Doesn't ResetDecoder need to reset is_compressed? Check. +#wal_compression = off    # enable compression of full-page writes Currently wal_compression compresses only FPW, so isn't it better to place it after full_page_writes in postgresql.conf.sample? Check. +uint16 extra_data;    /* used to store offset of bytes in hole, with + * last free bit used to check if block is + * compressed */ At least to me, defining something like the following seems easier to read. uint16 hole_offset:15, is_compressed:1 Check++. Updated patches addressing all those things are attached. Regards, -- Michael 20141219_fpw_compression_v9.tar.gz Description: GNU Zip compressed data
Re: [HACKERS] Minor binary-search int overflow in timezone code
Re: Tom Lane 2014-12-16 14615.1418694...@sss.pgh.pa.us Jim Nasby jim.na...@bluetreble.com writes: On 12/15/14, 1:39 PM, Christoph Berg wrote: Well, if it's not interesting, let's just forget it. Sorry. At the risk of sticking my head in the lion's mouth... this is the kind of response that deters people from contributing anything to the project, including reviewing patches. A simple thanks, but we feel it's already clear enough that there can't be anywhere close to INT_MAX timezones would have sufficed. Yeah, I need to apologize. I was a bit on edge today due to the release wrap (which you may have noticed wasn't going too smoothly), and should not have responded like that. Hi, maybe I should apologize as well for submitting this right at the time of the release... I also remain curious as to what sort of tool would complain about this particular code and not the N other nearly-identical binary-search loops in the PG sources, most of which deal with data structures potentially far larger than the timezone data ... He said he found it in manual code review, not using a tool. But anyway, I do agree this is a very minor issue and there are much more interesting things to spend time on. I promise to send in more severe security issues next time :) Christoph -- c...@df7cb.de | http://www.df7cb.de/
Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}
On Mon, Dec 15, 2014 at 11:06 PM, Peter Geoghegan p...@heroku.com wrote: On Mon, Dec 15, 2014 at 4:59 PM, Peter Geoghegan p...@heroku.com wrote: On Mon, Dec 15, 2014 at 4:22 PM, Jeff Janes jeff.ja...@gmail.com wrote: Also, in both Linux and MinGW under option 1 patch I get an OID conflict on OID 3261. I'll take a pass at fixing this bitrot soon. I'll follow Tom's advice about macro collisions on MinGW while I'm at it, since his explanation seems plausible. Attached pair of revised patch sets fix the OID collision, and presumably fix the MinGW issue (because IGNORE_P is now used as a token name). It also polishes approach #2 to value locking in a few places (e.g. better comments). Finally, both patches have a minor buglet around EXPLAIN ANALYZE output fixed -- the output now indicates if tuples are pulled up from auxiliary update nodes. I naively tried this in vallock1 patch: create table foo(index int, count int); create unique index on foo(index); insert into foo (index, count) values (0,1) on conflict (index) update set count=foo.count + 1 returning foo.count; insert into foo (index, count) values (0,1) on conflict (index) update set count=foo.count + 1 returning foo.count; insert into foo (index, count) values (0,1) on conflict (index) update set count=foo.count + 1 returning foo.count; insert into foo (index, count) values (0,1) on conflict (index) update set count=foo.count + 1 returning foo.count; After actually reading the documentation more closely, I decided this should be an error because foo is not a valid table alias in the update set expression. Instead of being a parsing/planning error, this executes and the foo.count on the RHS of the assignment always evaluates as zero (even on subsequent invocations when TARGET.count is 1). 
If I switch to a text type, then I get seg faults under the same condition: create table foo(index int, count text); create unique index on foo(index); insert into foo (index, count) values (0,'start ') on conflict (index) update set count=foo.count||' bar' returning count; insert into foo (index, count) values (0,'start ') on conflict (index) update set count=foo.count||' bar' returning count; boom #0 pg_detoast_datum_packed (datum=0x0) at fmgr.c:2270 #1 0x0074fb7a in textcat (fcinfo=0x1e67a78) at varlena.c:662 #2 0x005a63a5 in ExecMakeFunctionResultNoSets (fcache=0x1e67a08, econtext=0x1e67848, isNull=0x1e68b11 , isDone=value optimized out) at execQual.c:2026 #3 0x005a2353 in ExecTargetList (projInfo=value optimized out, isDone=0x7fffa7fa346c) at execQual.c:5358 #4 ExecProject (projInfo=value optimized out, isDone=0x7fffa7fa346c) at execQual.c:5573 #5 0x005a86c2 in ExecScan (node=0x1e67738, accessMtd=0x5baf00 SeqNext, recheckMtd=0x5bad60 SeqRecheck) at execScan.c:207 #6 0x005a1918 in ExecProcNode (node=0x1e67738) at execProcnode.c:406 #7 0x0059ef32 in EvalPlanQualNext (epqstate=value optimized out) at execMain.c:2380 #8 0x005b8fcd in ExecLockUpdateTuple (node=0x1e5f750) at nodeModifyTable.c:1098 #9 ExecInsert (node=0x1e5f750) at nodeModifyTable.c:372 #10 ExecModifyTable (node=0x1e5f750) at nodeModifyTable.c:1396 #11 0x005a1958 in ExecProcNode (node=0x1e5f750) at execProcnode.c:383 #12 0x005a0642 in ExecutePlan (queryDesc=0x1dd0908, direction=value optimized out, count=0) at execMain.c:1515 #13 standard_ExecutorRun (queryDesc=0x1dd0908, direction=value optimized out, count=0) at execMain.c:308 #14 0x7f601416b9cb in pgss_ExecutorRun (queryDesc=0x1dd0908, direction=ForwardScanDirection, count=0) at pg_stat_statements.c:874 #15 0x0069385f in ProcessQuery (plan=0x1e47df0, So I think there needs to be some kind of logic to de-recognize the table alias foo. 
Once I rewrote the query to use TARGET and EXCLUDED correctly, I've put this through an adaptation of my usual torture test, and it ran fine until wraparound shutdown. I'll poke at it more later. Cheers, Jeff
Re: [HACKERS] exitArchiveRecovery woes
On Thu, Dec 18, 2014 at 10:49 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 12/18/2014 03:53 PM, Robert Haas wrote: On Wed, Dec 17, 2014 at 8:40 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: At the end of archive recovery, we copy the last segment from the old timeline, to initialize the first segment on the new timeline. For example, if the timeline switch happens in the middle of WAL segment 00010005, the whole 00010005 segment is copied to become 00020005. The copying is necessary, so that the new segment contains valid data up to the switch point. However, we wouldn't really need to copy the whole segment, copying up to the switch point would be enough. In fact, copying the whole segment is a bad idea, because the copied WAL looks valid on the new timeline too. Your proposed change makes sense to me, but why do we need the segment to contain valid data up to the switch point? It seems like the switch between timelines should be crisper: replay WAL on the old timeline only from the old segment, and from the new timeline only on the new segment. Anything else seems like an invitation to unending corner-case bugs. True. That would require some changes to the way archive recovery works, though. Currently, when our recovery target timeline is, for example, 5, whose parents are 4 and 3, and we're currently on timeline 3, we will try to restore each segment first with timeline ID 5, then 4, then 3. It's a bit silly, because we know the timeline history and the exact points where the timelines changed, so we could just fetch the correct one. That would be a good idea, but I'm going to go ahead with just this smaller change now. Yeah, it's a separate issue. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}
On Thu, Dec 18, 2014 at 9:20 AM, Jeff Janes jeff.ja...@gmail.com wrote: After actually reading the documentation more closely, I decided this should be an error because foo is not a valid table alias in the update set expression. Instead of being a parsing/planning error, this executes and the foo.count on the RHS of the assignment always evaluates as zero (even on subsequent invocations when TARGET.count is 1). If I switch to a text type, then I get seg faults under the same condition: So I think there needs to be some kind of logic to de-recognize the table alias foo. Once I rewrote the query to use TARGET and EXCLUDED correctly, I've put this through an adaptation of my usual torture test, and it ran fine until wraparound shutdown. I'll poke at it more later. Oops. I agree with your diagnosis, and will circle around to fix that bug in the next revision by, as you say, simply rejecting the query if it doesn't use the two standard aliases. Thanks for testing! -- Peter Geoghegan
Re: [HACKERS] Proposal: Log inability to lock pages during vacuum
On 12/18/14, 7:56 AM, Robert Haas wrote: On Wed, Dec 17, 2014 at 11:20 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote:

LOG: automatic vacuum of table "postgres.public.foo": index scans: 0
pages: 0 removed, 7256 remain, 0 pinned
tuples: 79415 removed, 513156 remain, 0 are dead but not yet removable
buffer usage: 14532 hits, 6 misses, 6241 dirtied
avg read rate: 0.003 MB/s, avg write rate: 3.413 MB/s
system usage: CPU 0.00s/0.30u sec elapsed 14.28 sec

I.e. this just says how many pages were pinned, without saying what was done about them. That's not very meaningful to an average DBA, but that's true for many of the numbers printed in vacuum verbose. That message is extremely confusing, to my eyes. If you want to say "pages: 0 removed, 7256 remain, 0 skipped due to pins", that would work for me, but just "0 pinned" is totally wrong, because vacuum pinned every page in the table. We have to decide on a tradeoff here. Either we end up with two different log messages (depending on scan_all) that require two different translations, or we end up with a generic message that isn't as clear. The best option I can think of for the latter is something like "failed initial lock attempt". That's the only thing that will be true in all cases. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] NUMERIC private methods?
On 12/18/14, 9:21 AM, Tom Lane wrote: As it stands, no extension can use the numeric type in any non-trivial way without paying a large penalty for repeated pallocs and data copies. Given that the ability to write C extensions easily is one of pg's great strengths, this is a defect that should be corrected. If copying data/palloc is the root of numeric's performance problems then we need to address that, because it will provide benefit across the entire database. The pattern of (palloc; copy) is repeated throughout a large part of the codebase. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commitfest problems
On 12/18/2014 04:53 AM, Torsten Zuehlsdorff wrote: Having your name in a list of other names at the bottom of the release notes page, without any indication of what you helped with, would work better? Perhaps it would but I tend to doubt it. Out of my personal experience in Germany: yes, it helps. It is not very logical, but many people need a simple way (a website versus the git log) to see something. (I've rarely seen that something like that is considered not trustable even if there are strong indications that it's faked.) But I think it is a good point that the release notes should not become too big. I think a lot of hackers forget exactly how tender their egos are. Now I say this knowing that a lot of them will go, "Oh, give me a break," but as someone who employs hackers and deals with open source AND normal people :P every single day, I can tell you without a single ounce of sarcasm that petting egos is one of the ways you get things done in the open source (and really any male-dominated) community. The problem is, most of our long-term contributors don't need to be petted quite so often, because they have a long list of: "I don't need my ego stroked; do you see the length or value of the contributions I provide?" And simply, there are some that just don't care. However, doing crappy work (and let's not be shy about it, there is NOTHING fun about reviewing someone else's code) needs to have incentive. Just like when we were kids, we were much more likely to rake the leaves with at least a half smile if we got that extra 10 bucks or if we were able to go to that party on Friday. Finding a way to provide incentive and credit (and they may be the same) will increase the value of the non-self-valued work of reviewing patches. In the Pg world, the most obvious way is to have attribution in a public space. Perhaps an email that goes out to -announce (and planet) for each release that is a thank you to contributors? That way we don't touch the release notes at all. 
Sincerely, Joshua D. Drake -- Command Prompt, Inc. - http://www.commandprompt.com/ 503-667-4564 PostgreSQL Support, Training, Professional Services and Development High Availability, Oracle Conversion, @cmdpromptinc If we send our children to Caesar for their education, we should not be surprised when they come back as Romans. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commitfest problems
On 12/18/2014 07:31 AM, Andrew Dunstan wrote: +1 It does feel good to be acknowledged for our work, especially when there is a policy to acknowledge this in our community. I like this idea, but who is going to code our new social network? Frankly, this coin is going to become so debased as to be worthless. From a specific desire to acknowledge reviewers for their work we are now getting to a list that you can get on by posting to any mailing list. Sure you can, but that doesn't mean there is value in the fact that they are listed. Just like Facebook, it is all about how many friends you have, or LinkedIn, which has the endorse feature. Please, can we stick to what was the original point. This tendency we have to enlarge proposals way beyond their origin is why getting things done is often so difficult. Good point. JD -- Command Prompt, Inc. - http://www.commandprompt.com/ 503-667-4564 PostgreSQL Support, Training, Professional Services and Development High Availability, Oracle Conversion, @cmdpromptinc If we send our children to Caesar for their education, we should not be surprised when they come back as Romans. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commitfest problems
All, It's sounding like folks would prefer keeping the master contributors list up to date, to adding a bunch of names to the release notes. So, then, I have a proposal for criteria for getting on the contributors list via patch review: - substantial, deep review of at least one patch (including detailed code review and possible corrections) - functionality reviews of at least 3 patches, including full write-ups (not just "it compiled, seems to work"). Kibitz as you may, but please don't try to make these criteria more *complicated*, because there's no way we'll ever keep track. Note that updating the contributor list in the past has been slow due to lack of technology and process; if it's our way forward, then I'll apply myself to that problem. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commitfest problems
Josh Berkus wrote: So, then, I have a proposal for criteria for getting on the contributors list via patch review: - substantial, deep review of at least one patch (including detailed code review and possible corrections) - functionality reviews of at least 3 patches, including full write-ups (not just "it compiled, seems to work"). Kibitz as you may, but please don't try to make these criteria more *complicated*, because there's no way we'll ever keep track. The problem with complicated rules (which these, I think, already are) is how to keep track of which people helped, and at which level. I make a point of crediting reviewers and code contributors in my commit messages, but can you tell which of the following guys should make it to these lists? I yanked this text from my commit 73c986adde5d73a5e2555da9b5c8facedb146dcd: Reviewed to varying degrees by Michael Paquier, Andres Freund, Robert Haas, Amit Kapila, Fujii Masao, Jaime Casanova, Simon Riggs, Steven Singer, Peter Eisentraut. I do agree that we need to give credit in some form, though. I'm just saying: can we please not put the responsibility on committers. -- Álvaro Herrera  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commitfest problems
On 12/18/2014 10:37 AM, Alvaro Herrera wrote: The problem with complicated rules (which these, I think, already are) is how to keep track of people that helps to which level. I make a point of crediting reviewers and code contributors in my commit messages, but can you tell which ones of the following guys should make it to these lists? I yanked this text from my commit 73c986adde5d73a5e2555da9b5c8facedb146dcd: Reviewed to varying degrees by Michael Paquier, Andres Freund, Robert Haas, Amit Kapila, Fujii Masao, Jaime Casanova, Simon Riggs, Steven Singer, Peter Eisentraut I do agree that we need to give credit in some form, though. I'm just saying can we please not put the responsibility on committers. Truthfully, I think this is something that should be solved through crowdsourcing. Why is a relatively small group of people (-hackers) trying to solve a problem that each individual can solve on their own given the tools? Allow people to submit for approval their own contributor listing, allow people to edit for approval their own contributor listing. Just like news/events. If people want to be listed they can be, it just has to be approved. Then have a very basic (not unlike what JB just posted) rules for the moderators to review against. JD -- Command Prompt, Inc. - http://www.commandprompt.com/ 503-667-4564 PostgreSQL Support, Training, Professional Services and Development High Availability, Oracle Conversion, @cmdpromptinc If we send our children to Caesar for their education, we should not be surprised when they come back as Romans. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: hash_create API changes (was Re: [HACKERS] speedup tidbitmap patch: hash BlockNumber)
I wrote: Here's a proposed patch along this line. I left in oid_hash (in the form of a macro) so that this does not cause any API break for existing third-party modules. However, no callers in our own code directly refer to tag_hash or oid_hash anymore. Committed that version after some further comment wordsmithing. On Teodor's original test cases, I see about 8% speedup compared to the 4%-ish numbers he originally reported. This may be random variation or it might mean that we got a win someplace else besides tidbitmap.c. I've not tried to sleuth it down exactly. I am wondering though if this suggests that it'd be worth our while to add a similar fast path for 8-byte hash keys. That would be quite painless to add now (with the exception of actually coding the fast hash function, perhaps). regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
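Tom's suggested fast path for 8-byte keys would mainly need a cheap 64-bit-to-32-bit mixer. As a sketch only — this uses the MurmurHash3 64-bit finalizer and folds the result to the 32-bit width a dynahash bucket-hash returns; the function name `uint64_hash` is hypothetical, not a proposal for the actual API:

```c
#include <stdint.h>

/*
 * Hash a 64-bit key with the MurmurHash3 64-bit finalizer, then fold
 * the result down to 32 bits.  The finalizer is a bijection on 64-bit
 * values, so distinct keys are well spread before folding.
 */
static uint32_t
uint64_hash(uint64_t k)
{
	k ^= k >> 33;
	k *= UINT64_C(0xff51afd7ed558ccd);
	k ^= k >> 33;
	k *= UINT64_C(0xc4ceb9fe1a85ec53);
	k ^= k >> 33;
	return (uint32_t) (k ^ (k >> 32));
}
```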
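As context for why key size matters here: dynahash otherwise falls back to hashing the key bytes generically, while a fixed-width fast path can compare and hash a word directly. A hedged sketch of the dispatch idea (hypothetical names, not the actual hash_create() interface):

```c
#include <stdint.h>
#include <string.h>

/* Simplified stand-ins for the two strategies a hash table might pick. */
static uint32_t
hash_4byte(const void *key)
{
	uint32_t	k;

	memcpy(&k, key, sizeof(k));
	/* cheap word mix (Knuth multiplicative step), illustrative only */
	return k * 2654435761u;
}

static uint32_t
hash_bytes_generic(const void *key, int keysize)
{
	const unsigned char *p = key;
	uint32_t	h = 5381;
	int			i;

	for (i = 0; i < keysize; i++)
		h = (h << 5) + h + p[i];	/* djb2, illustrative only */
	return h;
}

/* Pick the fast path when the key is exactly one 32-bit word. */
static uint32_t
hash_any_key(const void *key, int keysize)
{
	if (keysize == (int) sizeof(uint32_t))
		return hash_4byte(key);
	return hash_bytes_generic(key, keysize);
}
```

The design choice being debated is simply whether an analogous `keysize == 8` branch is worth adding now that the dispatch point exists.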
Re: [HACKERS] Commitfest problems
On 19/12/14 07:02, Joshua D. Drake wrote: On 12/18/2014 04:53 AM, Torsten Zuehlsdorff wrote: Having your name in a list of other names at the bottom of the release notes page, without any indication of what you helped with, would work better? Perhaps it would but I tend to doubt it. Out of my personal experience in Germany: yes, it helps. It is not very logical, but many people need a simple way (Website against git log) to see something. (I've rarely seen that something like that is considered not trustable even if there are strong indications that its faked.) But i think it is a good point that the release notes should not become to big. I think a lot of hackers forget exactly how tender their egos are. Now I say this knowing that a lot of them will go, Oh give me a break but as someone who employs hackers, deals with open source AND normal people :P every single day, I can tell you without a single inch of sarcasm that petting egos is one of the ways you get things done in the open source (and really any male dominated) community. The problem is, most of our long term contributers don't need to be petted quite so often because they have a long list of: I don't need my ego stroked, do you see the length or value of the contributions I provide? And simply, there are some that just don't care. However, doing crappy work and let's not be shy about it, there is NOTHING fun about reviewing someone else's code needs to have incentive. Just like when we were kids, we were much more likely to rake the leaves with at least a half smile if we got that extra 10 bucks or if we were able to go to that party on Friday. Finding a way to provide incentive and credit (and they may be the same) will increase the value of the non-self value work of reviewing patches. In the the Pg world, the most obvious way is to have attribution in a public space. Perhaps an email that goes out to -announce (and planet) for each release that is a thank you to contributors? 
That way we don't touch the release notes at all. Sincerely, Joshua D. Drake Hey Joshua, what does a 'Normal person' look like??? :-) I did help review some code for a pg contributor once, but it was very minor. If I was 17, I would probably get a kick out of seeing my name mentioned - but now I would be embarrassed, because what I did was quite insignificant in the scale of things. I think a separate list of contributors would be good. How about some 'Browne points' mechanism which would give a rough measure of the value of the contribution. My contribution would probably rate '1', whereas Tom's would be at least '100,000,000' - more realistically: my contribution would not rate at all, but Tom's would still be the largest by far! Perhaps a log scale so Tom would not show us up so much, and the contributions of new people would look more significant? Probably any such scheme would be too difficult to administer in practice, or taken far too seriously, though it might be fun. Cheers, Gavin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}
On Thu, Dec 18, 2014 at 7:51 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 12/18/2014 05:46 PM, Kevin Grittner wrote: I don't think either point was ever really settled beyond Robert and I preferring ON DUPLICATE versus Peter preferring ON CONFLICT. I also prefer ON CONFLICT, because that makes more sense when you consider exclusion constraints, which I'm still hoping that this would support. If not immediately, at least in the future. This was why I changed the spelling to ON CONFLICT. It also doesn't hurt that that spelling is dissimilar to MySQL's syntax, IMV, because there are plenty of things to dislike about ON DUPLICATE KEY UPDATE, and I think a veneer of compatibility is inappropriate - this syntax is both considerably more flexible and considerably safer. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commitfest problems
On 12/18/2014 11:03 AM, Gavin Flower wrote: Hey Joshua, what does a 'Normal person' look like??? :-) Hahhhahahah, you have to get out of your basement to see them. Usually, they are at the latest and newest coffee hub, talking about hating hipsters while wearing skinny jeans and a new but used-looking plaid shirt. I think a separate list of contributors would be good. How about some 'Browne points' mechanism which would give a rough measure of the value of the contribution. My contribution would probably rate '1', whereas Tom's would be at least '100,000,000' - more realistically: my contribution would not rate at all, but Tom's would still be the largest by far! Perhaps a log scale so Tom would not show us up so much, and the contributions of new people would look more significant? Probably any such scheme would be too difficult to administer in practice, or taken far too seriously, though it might be fun. I don't think it would be difficult to come up with a basic system that gives weight to length of contribution as well as quantity and quality. But I don't think your specific example applies. Tom is core, and that in itself is the platinum badge of honor from an outsider's perspective. Sincerely, Joshua D. Drake -- Command Prompt, Inc. - http://www.commandprompt.com/ 503-667-4564 PostgreSQL Support, Training, Professional Services and Development High Availability, Oracle Conversion, @cmdpromptinc If we send our children to Caesar for their education, we should not be surprised when they come back as Romans. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal: Log inability to lock pages during vacuum
Jim Nasby wrote: We have to decide on a tradeoff here. Either we end up with two different log messages (depending on scan_all) that require two different translations, or we end up with a generic message that isn't as clear. The best option I can think of for the latter is something like "failed initial lock attempt". That's the only thing that will be true in all cases. Here's my proposal. Instead of punting, I split the message into separately translatable units, and emit only the ones that apply. The code is messier this way, but I think we can live with that. -- Álvaro Herrera  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services

diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 450c94f..19fd748 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -252,6 +252,7 @@
 DETAIL:  CPU 0.01s/0.06u sec elapsed 0.07 sec.
 INFO:  onek: found 3000 removable, 1000 nonremovable tuples in 143 pages
 DETAIL:  0 dead tuples cannot be removed yet.
 There were 0 unused item pointers.
+Skipped 0 pages due to buffer pins.
 0 pages are entirely empty.
 CPU 0.07s/0.39u sec elapsed 1.56 sec.
 INFO:  analyzing public.onek
diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
index 6db6c5c..592ef95 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -105,6 +105,7 @@ typedef struct LVRelStats
 	BlockNumber old_rel_pages;	/* previous value of pg_class.relpages */
 	BlockNumber rel_pages;		/* total number of pages */
 	BlockNumber scanned_pages;	/* number of pages we examined */
+	BlockNumber pinned_pages;	/* # of pages we could not initially lock */
 	double		scanned_tuples; /* counts only tuples on scanned pages */
 	double		old_rel_tuples; /* previous value of pg_class.reltuples */
 	double		new_rel_tuples; /* new estimated total # of tuples */
@@ -332,6 +333,7 @@ lazy_vacuum_rel(Relation onerel, VacuumStmt *vacstmt,
 			TimestampDifferenceExceeds(starttime, endtime,
 									   Log_autovacuum_min_duration))
 		{
+			StringInfoData buf;
 			TimestampDifference(starttime, endtime, &secs, &usecs);

 			read_rate = 0;
@@ -343,27 +345,44 @@ lazy_vacuum_rel(Relation onerel, VacuumStmt *vacstmt,
 				write_rate = (double) BLCKSZ * VacuumPageDirty / (1024 * 1024) /
 							(secs + usecs / 1000000.0);
 			}
+
+			/*
+			 * This is pretty messy, but we split it up so that we can skip
+			 * emitting individual parts of the message when not applicable.
+			 */
+			initStringInfo(&buf);
+			appendStringInfo(&buf, _("automatic vacuum of table \"%s.%s.%s\": index scans: %d\n"),
+							 get_database_name(MyDatabaseId),
+							 get_namespace_name(RelationGetNamespace(onerel)),
+							 RelationGetRelationName(onerel),
+							 vacrelstats->num_index_scans);
+			appendStringInfo(&buf, _("pages: %d removed, %d remain\n"),
+							 vacrelstats->pages_removed,
+							 vacrelstats->rel_pages);
+			if (scan_all)
+				appendStringInfo(&buf, _("waited for %d buffer pins\n"),
+								 vacrelstats->pinned_pages);
+			else
+				appendStringInfo(&buf,
+								 _("skipped %d pages due to buffer pins\n"),
+								 vacrelstats->pinned_pages);
+			appendStringInfo(&buf,
+							 _("tuples: %.0f removed, %.0f remain, %.0f are dead but not yet removable\n"),
+							 vacrelstats->tuples_deleted,
+							 vacrelstats->new_rel_tuples,
+							 vacrelstats->new_dead_tuples);
+			appendStringInfo(&buf,
+							 _("buffer usage: %d hits, %d misses, %d dirtied\n"),
+							 VacuumPageHit,
+							 VacuumPageMiss,
+							 VacuumPageDirty);
+			appendStringInfo(&buf, _("avg read rate: %.3f MB/s, avg write rate: %.3f MB/s\n"),
+							 read_rate, write_rate);
+			appendStringInfo(&buf, _("system usage: %s"), pg_rusage_show(&ru0));
+
 			ereport(LOG,
-					(errmsg("automatic vacuum of table \"%s.%s.%s\": index scans: %d\n"
-							"pages: %d removed, %d remain\n"
-							"tuples: %.0f removed, %.0f remain, %.0f are dead but not yet removable\n"
-							"buffer usage: %d hits, %d misses, %d dirtied\n"
-							"avg read rate: %.3f MB/s, avg write rate: %.3f MB/s\n"
-							"system usage: %s",
-							get_database_name(MyDatabaseId),
-							get_namespace_name(RelationGetNamespace(onerel)),
-							RelationGetRelationName(onerel),
-							vacrelstats->num_index_scans,
-							vacrelstats->pages_removed,
-							vacrelstats->rel_pages,
-							vacrelstats->tuples_deleted,
-							vacrelstats->new_rel_tuples,
-							vacrelstats->new_dead_tuples,
-							VacuumPageHit,
-							VacuumPageMiss,
-							VacuumPageDirty,
-							read_rate, write_rate,
-							pg_rusage_show(&ru0))));
+					(errmsg_internal("%s", buf.data)));
+			pfree(buf.data);
 		}
 	}
 }
@@ -438,6 +457,7 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
 	BlockNumber next_not_all_visible_block;
 	bool		skipping_all_visible_blocks;
 	xl_heap_freeze_tuple *frozen;
+	StringInfoData buf;

 	pg_rusage_init(&ru0);
@@ -611,6 +631,8 @@ lazy_scan_heap(Relation onerel,
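The structure of the patch above — append each clause to a growing buffer only when it applies, so every clause remains a separately translatable unit — can be shown in miniature without the backend's StringInfo machinery. A standalone sketch using plain snprintf (function name and message text illustrative only):

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Build a vacuum-style log message, emitting only the pin clause that
 * applies.  Each clause is a complete sentence, so each could be handed
 * to the translation machinery on its own.
 */
static void
build_vacuum_msg(char *buf, size_t len, bool scan_all, int pinned_pages)
{
	size_t		used = 0;

	used += snprintf(buf + used, len - used, "pages: 0 removed, 100 remain\n");
	if (scan_all)
		used += snprintf(buf + used, len - used,
						 "waited for %d buffer pins\n", pinned_pages);
	else
		used += snprintf(buf + used, len - used,
						 "skipped %d pages due to buffer pins\n", pinned_pages);
	snprintf(buf + used, len - used, "system usage: ...\n");
}
```

The design point is the same one Alvaro makes: the alternative — one giant format string per scan_all case — would force translators to maintain two near-identical messages.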
Re: [HACKERS] Proposal: Log inability to lock pages during vacuum
On 12/18/2014 09:41 PM, Alvaro Herrera wrote: Jim Nasby wrote: We have to decide on a tradeoff here. Either we end up with two different log messages (depending on scan_all) that require two different translations, or we end up with a generic message that isn't as clear. The best option I can think of for the latter is something like "failed initial lock attempt". That's the only thing that will be true in all cases. Here's my proposal. Instead of punting, I split the message into separately translatable units, and emit only the ones that apply. The code is messier this way, but I think we can live with that. Works for me. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal: Log inability to lock pages during vacuum
Heikki Linnakangas wrote: On 12/18/2014 09:41 PM, Alvaro Herrera wrote: Here's my proposal. Instead of punting, I split the message into separately translatable units, and emit only the ones that apply. The code is messier this way, but I think we can live with that. Works for me. Great, thanks, pushed. I tweaked it a bit more, so that it would say either "skipped N pages" or "waited for N pins" in both the autovacuum and vacuum verbose cases, but only if N > 0. -- Álvaro Herrera  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal: Log inability to lock pages during vacuum
Alvaro Herrera alvhe...@2ndquadrant.com writes: Great, thanks, pushed. I tweaked it a bit more, so that it would say either "skipped N pages" or "waited for N pins" in both the autovacuum and vacuum verbose cases, but only if N > 0. Not directly relevant but ... I think probably all those BlockNumber counters should be printed with %u not %d. A relation with between 2G and 4G pages is possible, and it wouldn't look right with %d. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
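Tom's point is easy to demonstrate: BlockNumber is an unsigned 32-bit type, so a count above 2^31 comes out negative under %d but correct under %u. A minimal illustration (the cast to int32_t just keeps the %d misbehavior well-defined for the demo; `format_pages` is a made-up helper):

```c
#include <stdint.h>
#include <stdio.h>

typedef uint32_t BlockNumber;

/* Format a page count both ways to show why %u is required. */
static void
format_pages(BlockNumber pages, char *as_d, char *as_u, size_t len)
{
	snprintf(as_d, len, "%d", (int32_t) pages);	/* wraps negative past 2^31 */
	snprintf(as_u, len, "%u", pages);			/* always correct */
}
```

For a relation of 3G pages, the %d rendering wraps around to a negative number while %u prints the true count.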
Re: [HACKERS] Proposal: Log inability to lock pages during vacuum
On 2014-12-18 16:41:04 -0300, Alvaro Herrera wrote:

+ if (scan_all)
+     appendStringInfo(&buf, _("waited for %d buffer pins\n"),
+                      vacrelstats->pinned_pages);
+ else
+     appendStringInfo(&buf,
+                      _("skipped %d pages due to buffer pins\n"),
+                      vacrelstats->pinned_pages);

Unless I miss something this is, as mentioned before, not correct. scan_all doesn't imply at all that we waited for buffer pins. We only do so if lazy_check_needs_freeze(buf). Which usually won't be true for a *significant* number of pages. Greetings, Andres Freund -- Andres Freund  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal: Log inability to lock pages during vacuum
Andres Freund wrote: On 2014-12-18 16:41:04 -0300, Alvaro Herrera wrote:

+ if (scan_all)
+     appendStringInfo(&buf, _("waited for %d buffer pins\n"),
+                      vacrelstats->pinned_pages);
+ else
+     appendStringInfo(&buf,
+                      _("skipped %d pages due to buffer pins\n"),
+                      vacrelstats->pinned_pages);

Unless I miss something this is, as mentioned before, not correct. scan_all doesn't imply at all that we waited for buffer pins. We only do so if lazy_check_needs_freeze(buf). Which usually won't be true for a *significant* number of pages. Ah, interesting, I didn't remember we had that. I guess one possible tweak is to discount the pages we skip from pinned_pages; or we could keep a separate count of pages waited for. Jim, up for a patch? -- Álvaro Herrera  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commitfest problems
On 12/18/14, 12:08 PM, Joshua D. Drake wrote: It does feel good to be acknowledged for our work especially when there is a policy to acknowledge this in our community. I like this idea but who is going to code our new social network? +1. I do like the idea; but I don't like it enough to do it myself. :) -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal: Log inability to lock pages during vacuum
On 12/18/14, 3:02 PM, Alvaro Herrera wrote: Andres Freund wrote: On 2014-12-18 16:41:04 -0300, Alvaro Herrera wrote:

+ if (scan_all)
+     appendStringInfo(&buf, _("waited for %d buffer pins\n"),
+                      vacrelstats->pinned_pages);
+ else
+     appendStringInfo(&buf,
+                      _("skipped %d pages due to buffer pins\n"),
+                      vacrelstats->pinned_pages);

Unless I miss something this is, as mentioned before, not correct. scan_all doesn't imply at all that we waited for buffer pins. We only do so if lazy_check_needs_freeze(buf). Which usually won't be true for a *significant* number of pages. Ah, interesting, I didn't remember we had that. I guess one possible tweak is to discount the pages we skip from pinned_pages; or we could keep a separate count of pages waited for. Jim, up for a patch? I would prefer that we at least count if we initially don't get the lock; presumably that number is always low anyway, and in that case I think we're done with this. If it turns out it is common to initially miss the pin, then we could do something fancier. So how about if in the scan_all case we say something like "unable to initially acquire pin on %d buffers\n"? (Happy to do the patch either way, but I'd like us to decide what we're doing first. ;) -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: hash_create API changes (was Re: [HACKERS] speedup tidbitmap patch: hash BlockNumber)
On 12/18/14, 12:48 PM, Tom Lane wrote: I wrote: Here's a proposed patch along this line. I left in oid_hash (in the form of a macro) so that this does not cause any API break for existing third-party modules. However, no callers in our own code directly refer to tag_hash or oid_hash anymore. Committed that version after some further comment wordsmithing. On Teodor's original test cases, I see about 8% speedup compared to the 4%-ish numbers he originally reported. This may be random variation or it might mean that we got a win someplace else besides tidbitmap.c. I've not tried to sleuth it down exactly. I am wondering though if this suggests that it'd be worth our while to add a similar fast path for 8-byte hash keys. That would be quite painless to add now (with the exception of actually coding the fast hash function, perhaps). I stuck an elog in to report when a hash wasn't found, and came up with these results (first number is how many times a hash wasn't found, second is keysize). I don't see a lot of 8's in there... Second set of numbers is the same thing, except counting every time we looked for the hash value, regardless of result. (Both cases generated by running make check.) BTW, as part of this I found some hash values were hit over 30k times in the first case. I don't know if that means a boatload of collisions or something else. 
Command to generate that data is: $egrep '^LOG: Hashtable ' src/test/regress/log/postmaster.log |cut -d\| -f2,4|sort|uniq -c|cut -d' ' -f1|sort -n|tail Before hitting the raw data, here is a summary of hash searches by key size (generated by patch 1): 43 48 327 256 1411 416 9321 136 12436 12 31270 64 107017 8 203127 16 628107 4 2201582 20 -- Mostly LOCALLOCK and Shared Buffer egrep '^LOG: Hashtable ' src/test/regress/log/postmaster.log |cut -d\| -f2,6|sort|uniq -c (patch 2) 581 Analyzed elements table|8 1141 Analyzed lexemes table|16 46 Array distinct element count table|4 167 Attopt cache|8 26 Btree proof lookup cache|8 95 CFuncHash|4 2387 Combo CIDs|8 99 Databases hash|4 22 Event Trigger Cache|4 85 JoinRelHashTable|8 460690 LOCALLOCK hash|20 1 LOCALLOCK hash|LOCALLOCK hash 49608 LOCK hash|16 951 Local Buffer Lookup Table|20 5 Local predicate lock|16 581 Operator class cache|4 1658 Operator lookup cache|136 301 PLpgSQL function cache|416 5 PREDICATELOCK hash|16 14 PREDICATELOCKTARGET hash|16 52278 PROCLOCK hash|16 1064 Pending Ops Table|12 14790 Per-database table|4 15297 Portal hash|64 21 Prepared Queries|64 126 PrivateRefCount|4 4 RI compare cache|8 55 RI constraint cache|4 90 RI query cache|8 72 Record information cache|64 24040 Relcache by OID|4 742 RelfilenodeMap cache|8 21 Rendezvous variable hash|64 4 Rewrite / Old to new tid map|12 49 Rewrite / Unresolved ctids|12 5 SERIALIZABLEXID hash|4 60 Sequence values|4 9603 Shared Buffer Lookup Table|20 43 ShmemIndex|48 5335 TIDBitmap|4 138 TableSpace cache|4 16516 Temporary table of OIDs|4 169 Timezones|256 6 Tsearch configuration cache|4 4 Tsearch dictionary cache|4 1 Tsearch parser cache|4 11550 TupleHashTable|8 327 Type information cache|4 59 json object hashtable|64 17430 smgr relation table|16 egrep '^LOG: Hashtable ' src/test/regress/log/postmaster.log |cut -d\| -f2,6|sort|uniq -c (patch 1) 2250 Analyzed elements table|8 28817 Analyzed lexemes table|16 428 Array distinct element count table|4 447 Attopt 
cache|8 224 Btree proof lookup cache|8 1363 CFuncHash|4 5219 Combo CIDs|8 2415 Databases hash|4 12063 Event Trigger Cache|4 178 JoinRelHashTable|8 990753 LOCALLOCK hash|20 72129 LOCK hash|16 19858 Local Buffer Lookup Table|20 2 Local Buffer Lookup Table|LOG: Hashtable 7 Local predicate lock|16 4557 Operator class cache|4 9321 Operator lookup cache|136 1 Operator lookup cache|136LOG: Hashtable 1411 PLpgSQL function cache|416 8 PREDICATELOCK hash|16 106 PREDICATELOCKTARGET hash|16 73102 PROCLOCK hash|16 10929 Pending Ops Table|12 23534 Per-database table|4 29502 Portal hash|64 233 Prepared Queries|64 689 PrivateRefCount|4 91 RI compare cache|8 310 RI constraint cache|4 289 RI query cache|8 1471 Record information cache|64 1 Record information cache|64LOG: Hashtable 1 Record information cache|6LOG: Hashtable 426050 Relcache by OID|4 1308 RelfilenodeMap cache|8 12 Rendezvous variable hash|64 24 Rewrite / Old to new tid map|12 1483 Rewrite / Unresolved ctids|12 12 SERIALIZABLEXID hash|4 263 Sequence values|4 1190971 Shared Buffer Lookup Table|20 30 Shared Buffer Lookup Table|20LOG: Hashtable 20 Shared Buffer Lookup Table|2LOG: Hashtable 1 Shared Buffer Lookup Table|Event Trigger Cache 27 Shared Buffer Lookup Table|LOCALLOCK hash 2 Shared Buffer Lookup Table|LOCK hash 14 Shared Buffer Lookup Table|LOG: Hashtable 9 Shared Buffer Lookup Table|PROCLOCK hash 10 Shared Buffer Lookup Table|Relcache by OID 20 Shared Buffer Lookup Table|Shared Buffer Lookup
Re: [HACKERS] pg_regress writes into source tree
On 12/18/2014 03:02 AM, Michael Paquier wrote: On Fri, Dec 12, 2014 at 10:45 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Another thing in that patch was that I had to add the sql/ directory to the source tree, but other than that .gitignore file it was empty. Maybe pg_regress should create the sql/ directory in the build dir if it doesn't exist. This is only a problem if a pg_regress suite only runs stuff from input/, because otherwise the sql/ dir already exists in the source. +1 for having pg_regress create the sql/ directory when it does not exist. Current behavior is annoying when modules having only tests in input/... That seems like a separate issue. I think Peter should commit his patch and backpatch it immediately, and we can deal with the missing sql directory when someone sends in a patch. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commitfest problems
* Andrew Dunstan (and...@dunslane.net) wrote: On Wed, Dec 17, 2014 at 5:00 PM, Stephen Frost sfr...@snowman.net wrote: http://contributors.postgresql.org/sfrost
- Recent commits
- Recent commit mentions
- Recent emails to any list
- Recent commitfest app activity
- Recent wiki page updates
Frankly, this coin is going to become so debased as to be worthless. We could manage that. I certainly feel like a lot *more* folks should be acknowledged than we do today, if the feeling is that the commit message acknowledgments are insufficient. Would it be possible to articulate the criteria which would be sufficient for inclusion, to avoid having it be debased? From a specific desire to acknowledge reviewers for their work we are now getting to a list that you can get on by posting to any mailing list. I was thinking it'd take a bit more than that- perhaps regular emails and you have to ask for it? Or you have to have been mentioned in the commit history somewhere? Perhaps not obvious but implied was a requirement to have a community account. Please, can we stick to what was the original point. This tendency we have to enlarge proposals way beyond their origin is why getting things done is often so difficult. I agree with your concern that this could turn into a large effort which would derail the main discussion, but the above question is a good one- how do we avoid debasing the value of inclusion in anything like this (or in the release notes..) while still providing the recognition and credit that the individuals who are helping to make PG the great system it is deserve? Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Commitfest problems
* Alvaro Herrera (alvhe...@2ndquadrant.com) wrote: Josh Berkus wrote: So, then, I have a proposal for criteria for getting on the contributors list via patch review: - substantial, deep review of at least one patch (including detailed code review and possible corrections) - functionality reviews of at least 3 patches, including full write-ups (not just it compiled, seems to work). Kibitz as you may, but please don't try to make these criteria more *complicated*, because there's no way we'll ever keep track. The problem with complicated rules (which these, I think, already are) I tend to agree that we want to avoid complicated rules. The corollary to that is the concern Andrew raised about my earlier off-the-cuff proposal- how do we avoid debasing the value of being recognized as a PG contributor? is how to keep track of people that helps to which level. I make a point of crediting reviewers and code contributors in my commit messages, but can you tell which ones of the following guys should make it to these lists? I yanked this text from my commit 73c986adde5d73a5e2555da9b5c8facedb146dcd: Reviewed to varying degrees by Michael Paquier, Andres Freund, Robert Haas, Amit Kapila, Fujii Masao, Jaime Casanova, Simon Riggs, Steven Singer, Peter Eisentraut I do agree that we need to give credit in some form, though. I'm just saying can we please not put the responsibility on committers. Ugh, yeah, I certainly wouldn't want to have to work out some complex set of rules to be applied before each commit to define who can be considered a reviewer. That said, most of the above list are committers and those who aren't should be recognized in some fashion, so I'm not sure that this is really a good example. I don't have a good example of someone mentioned as a reviewer in the git message but who doesn't deserve recognition though and I'm actually not sure that we even have such an example in our git history.. 
If so, well, I'd rather err on the side of being more inclusive than less inclusive anyway. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] Commitfest problems
* Jim Nasby (jim.na...@bluetreble.com) wrote: On 12/18/14, 12:08 PM, Joshua D. Drake wrote: It does feel good to be acknowledged for our work especially when there is a policy to acknowledge this in our community. I like this idea but who is going to code our new social network? +1. I do like the idea; but I don't like it enough to do it myself. :) I figured Magnus would be all over it.. ;D /me ducks runs Stephen signature.asc Description: Digital signature
Re: [HACKERS] [Bug] Inconsistent result for inheritance and FOR UPDATE.
On Tue, Dec 16, 2014 at 11:57:54AM -0500, Tom Lane wrote: We seem not to have had a new release of 9.2 since July, which is an awfully long time ago. So, hopefully soon? Nothing's likely to happen during the holidays, so probably mid-January is the earliest feasible target. I agree we're a bit overdue. In our defense, we had the need for too many 9.3.X releases too closely, so in some sense I am glad we didn't feel a dire need to release another 9.3.X recently. I agree with Tom that mid-January is likely. -- Bruce Momjian br...@momjian.us  http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] NUMERIC private methods?
On Thu, Dec 18, 2014 at 10:21 AM, Tom Lane t...@sss.pgh.pa.us wrote: Andrew Gierth and...@tao11.riddles.org.uk writes: Tom == Tom Lane t...@sss.pgh.pa.us writes: Tom If you're concerned about arithmetic performance, there is a Tom very obvious fix here: use double. Independently of this specific example, the obvious issue there is that there are applications for which double is simply not acceptable. As the guy who last fooled with the numeric calculation algorithms in any major way, I'm painfully aware that numeric is not necessarily more accurate than double for anything more complicated than addition/subtraction/multiplication. The example that was shown upthread is pretty nearly a textbook case of something where I'd not believe that numeric offers any accuracy improvement without *very* careful investigation. In general, if your application is sufficiently arithmetic-intensive that you need to care about the speed of calculations, then it's not obvious which one to pick, and if I hear a knee-jerk simply not acceptable I'm simply going to conclude that you haven't actually studied the subject. I think that's ridiculous. You're basically arguing that numeric doesn't offer meaningful advantages over float8, which flies in the face of the fact that essentially every database application I've ever seen uses numeric and I'm not sure I've ever seen one using float8. Nearly all database users prefer to store quantities like currency units in a type that is guaranteed not to lose precision. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] NUMERIC private methods?
Robert Haas wrote: I think that's ridiculous. You're basically arguing that numeric doesn't offer meaningful advantages over float8, which flies in the face of the fact that essentially every database application I've ever seen uses numeric and I'm not sure I've ever seen one using float8. Nearly all database users prefer to store quantities like currency units in a type that is guaranteed not to lose precision. I think it's reasonable to expose NumericVar and the supporting function prototypes in, say, numeric_internal.h; normal applications that just want to operate on numerics as today can just include numeric.h, and continue to be at arms-length of the implementation details, while code that wants to optimize operations further can use numeric_internal.h and be very aware that they are subject to heavy breakage if we ever feel a need to change the internal API. -- Álvaro Herrera  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commit fest 2014-12, let's begin!
On Mon, Dec 15, 2014 at 03:58:39PM +0200, Heikki Linnakangas wrote:

WITH closest_candidates AS (
  SELECT streets.gid, streets.name, streets.geom
  FROM nyc_streets streets
  ORDER BY streets.geom <-> 'SRID=26918;POINT(583571.905921312 4506714.34119218)'::geometry
  LIMIT 100
)
SELECT gid, name
FROM closest_candidates
ORDER BY ST_Distance(
  geom,
  'SRID=26918;POINT(583571.905921312 4506714.34119218)'::geometry
)
LIMIT 1;

See blog posts: http://blog.light42.com/wordpress/?p=102 http://workshops.boundlessgeo.com/postgis-intro/knn.html

Ugh. Ok, sold :-). I'll review... Just to summarize, the index can only be created on the <-> operator because that indexes on the center of the bounding box: http://postgis.net/docs/manual-2.1/geometry_distance_centroid.html You can't index on ST_Distance() because that computes the minimum distance between the two objects, which is different for different points: http://postgis.net/docs/manual-2.1/ST_Distance.html I am embarrassed by the LIMIT 100 hack they have to use above to find the closest polygon to a given point, and this recheck patch is going to fix that. I think this will remove the most significant PostGIS wart. -- Bruce Momjian br...@momjian.us  http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
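The reason the LIMIT 100 recheck is needed can be sketched outside PostGIS: ordering by the distance to the center of the bounding box (what the indexed operator computes here) can rank geometries differently than the true minimum distance that ST_Distance returns. A minimal Python sketch with two hypothetical "streets" (the coordinates are invented, not the nyc_streets data):

```python
import math

def bbox_center(points):
    """Center of the axis-aligned bounding box of a line geometry."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return ((min(xs) + max(xs)) / 2, (min(ys) + max(ys)) / 2)

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def min_dist_point_segment(p, a, b):
    """True minimum distance from point p to segment ab (ST_Distance-style)."""
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    t = ((p[0] - ax) * dx + (p[1] - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))  # clamp onto the segment
    return dist(p, (ax + t * dx, ay + t * dy))

query = (0.0, 0.0)
streets = {
    "A": [(10.0, 0.0), (12.0, 0.0)],    # short street: bbox center is close
    "B": [(1.0, 0.0), (101.0, 100.0)],  # long street: bbox center far, one end near
}

# Index-style ordering by bounding-box-center distance prefers street A...
center_order = sorted(streets, key=lambda s: dist(query, bbox_center(streets[s])))
# ...but the true minimum distance says street B is nearer.
true_order = sorted(streets, key=lambda s: min_dist_point_segment(query, *streets[s]))

print(center_order, true_order)  # ['A', 'B'] ['B', 'A']
```

That flip is why the query keeps 100 index-ordered candidates and re-sorts them by ST_Distance; a recheck in the index scan itself removes the need for the arbitrary 100.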
Re: [HACKERS] Role Attribute Bitmask Catalog Representation
Adam,

* Adam Brightwell (adam.brightw...@crunchydatasolutions.com) wrote: I have attached an updated patch with initial documentation changes for review.

Awesome, thanks. I'm going to continue looking at this in more detail, but wanted to mention a few things I noticed in the documentation changes:

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
*** a/doc/src/sgml/catalogs.sgml
--- b/doc/src/sgml/catalogs.sgml
--- 1391,1493
      </row>
      <row>
!      <entry><structfield>rolattr</structfield></entry>
!      <entry><type>bigint</type></entry>
!      <entry>
!       Role attributes; see <xref linkend="sql-createrole"> for details
!      </entry>
!     </row>

Shouldn't this refer to the table below (which I like..)? Or perhaps to both the table and sql-createrole? Have you had a chance to actually build the docs and look at the results to see how things look? I should have time later tomorrow to do that and look over the results also, but figured I'd ask.

!    <table id="catalog-rolattr-bitmap-table">
!     <title><structfield>rolattr</> bitmap positions</title>

I would call this 'Attributes in rolattr' or similar rather than 'bitmap positions'.

!     <tgroup cols="3">
!      <thead>
!       <row>
!        <entry>Position</entry>
!        <entry>Attribute</entry>
!        <entry>Description</entry>
!       </row>
!      </thead>

There should be a column for 'Option' or something- basically, a clear tie-back to what CREATE ROLE refers to these as. I'm thinking that reordering the columns would help here, consider:

Attribute (using the 'Superuser', 'Inherit', etc 'nice' names)
CREATE ROLE Clause (note: that's how CREATE ROLE describes the terms)
Description
Position

!      <tbody>
!       <row>
!        <entry><literal>0</literal></entry>
!        <entry>Superuser</entry>
         <entry>Role has superuser privileges</entry>
        </row>
        <row>
!        <entry><literal>1</literal></entry>
!        <entry>Inherit</entry>
!        <entry>Role automatically inherits privileges of roles it is a member of</entry>
        </row>

This doesn't follow our general convention to wrap lines in the SGML at 80 chars (same as the C code) and, really, if you fix that it looks like these lines shouldn't even be different at all (see above with the 'Role has superuser privileges' entry/entry line). Same is true for a few of the other cases.

        <row>
!        <entry><literal>4</literal></entry>
!        <entry>Catalog Update</entry>
         <entry>
!         Role can update system catalogs directly. (Even a superuser cannot do this unless this column is true)
         </entry>
        </row>

I'm really not sure what to do with this one. I don't like leaving it as-is, but I suppose it's probably the right thing for this patch to do. Perhaps another patch should be proposed which improves the documentation around rolcatupdate and its relationship to superuser.

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
new file mode 100644
index ef69b94..74bc702
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
+ <para>
+  <function>pg_has_role_attribute</function> checks the attribute permissions
+  given to a role. It will always return <literal>true</literal> for roles
+  with superuser privileges unless the attribute being checked is
+  <literal>CATUPDATE</literal> (superuser cannot bypass
+  <literal>CATUPDATE</literal> permissions). The role can be specified by name
+  and by OID. The attribute is specified by a text string which must evaluate
+  to one of the following role attributes:
+  <literal>SUPERUSER</literal>,
+  <literal>INHERIT</literal>,
+  <literal>CREATEROLE</literal>,
+  <literal>CREATEDB</literal>,
+  <literal>CATUPDATE</literal>,
+  <literal>CANLOGIN</literal>,
+  <literal>REPLICATION</literal>, or
+  <literal>BYPASSRLS</literal>.

This should probably refer to CREATE ROLE also as being where the meaning of these strings is defined.

Otherwise, the docs look pretty good to me.

Thanks!

Stephen signature.asc Description: Digital signature
Re: [HACKERS] Role Attribute Bitmask Catalog Representation
FWIW I've been giving this patch a look and adjusting some coding details here and there. Do you intend to commit it yourself? You're not listed as reviewer or committer for it in the commitfest app, FWIW. One thing I don't very much like is that check_role_attribute() receives a RoleAttr but nowhere checks that only one bit is set in 'attribute'. From the coding, this routine would return true if just one of those bits is set, which is probably not what is intended. Now I realize that current callers all pass a bitmask with a single bit set, but I think it'd be better to have some protection against that, for possible future coding mistakes. -- Álvaro Herrera  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Role Attribute Bitmask Catalog Representation
Alvaro, * Alvaro Herrera (alvhe...@2ndquadrant.com) wrote: FWIW I've been giving this patch a look and and adjusting some coding details here and there. Do you intend to commit it yourself? You're not listed as reviewer or committer for it in the commitfest app, FWIW. Oh, great, thanks! And, yeah, I'm not as good as I should be about updating the commitfest app. As for committing it, I was thinking I would but you're certainly welcome to if you're interested. I would like this to be committed soonish as it will then allow Adam to rebase the patch which adds the various role attributes discussed previously on top of the representation changes. I suspect he's done some work in that direction already, but I keep asking for changes to this patch which would then ripple down into the other. One thing I don't very much like is that check_role_attribute() receives a RoleAttr but nowhere it checks that only one bit is set in 'attribute'. From the coding, this routine would return true if just one of those bits is set, which is probably not what is intended. Now I realize that current callers all pass a bitmask with a single bit set, but I think it'd be better to have some protection against that, for possible future coding mistakes. That's certainly a good point. I'm inclined to suggest that there be an Assert() check in check_role_attribute(), or were you thinking of something else..? Thanks! Stephen signature.asc Description: Digital signature
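The guard being discussed is the classic single-bit (power-of-two) test: a mask has exactly one bit set iff it is non-zero and clearing its lowest set bit yields zero. A Python sketch of the idea (the flag names and the function shape are hypothetical illustrations, not the patch's actual code):

```python
# Hypothetical single-bit attribute flags, mirroring the bitmask style
# of the patch under review.
ROLE_ATTR_SUPERUSER  = 1 << 0
ROLE_ATTR_INHERIT    = 1 << 1
ROLE_ATTR_CREATEROLE = 1 << 2

def check_role_attribute(rolattr, attribute):
    # The protection requested here: reject masks with zero or multiple
    # bits set. (x & (x - 1)) clears the lowest set bit, so it is zero
    # exactly when x is a power of two.
    assert attribute != 0 and attribute & (attribute - 1) == 0, \
        "attribute must have exactly one bit set"
    return (rolattr & attribute) != 0

rolattr = ROLE_ATTR_SUPERUSER | ROLE_ATTR_INHERIT
print(check_role_attribute(rolattr, ROLE_ATTR_INHERIT))     # True
print(check_role_attribute(rolattr, ROLE_ATTR_CREATEROLE))  # False
```

In C this would naturally be an Assert() with the same `(attr & (attr - 1)) == 0` condition, which costs nothing in production builds.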
Re: [HACKERS] NUMERIC private methods?
Robert Haas robertmh...@gmail.com writes: On Thu, Dec 18, 2014 at 10:21 AM, Tom Lane t...@sss.pgh.pa.us wrote: As the guy who last fooled with the numeric calculation algorithms in any major way, I'm painfully aware that numeric is not necessarily more accurate than double for anything more complicated than addition/subtraction/multiplication. The example that was shown upthread is pretty nearly a textbook case of something where I'd not believe that numeric offers any accuracy improvement without *very* careful investigation. I think that's ridiculous. You're basically arguing that numeric doesn't offer meaningful advantages over float8, which flies in the face of the fact that essentially every database application I've ever seen uses numeric and I'm not sure I've ever seen one using float8. Nearly all database users prefer to store quantities like currency units in a type that is guaranteed not to lose precision.

If you're doing banking, you don't do anything except addition, subtraction, and multiplication. And that is what those users who want guaranteed precision are doing, and yeah numeric will make them happy. If you're doing any sort of higher math or statistics, I stand by my statement that you'd better think rather than just blindly assume that numeric is going to be better for you. A moment's fooling about finds this example, which is pretty relevant to the formula we started this thread with:

regression=# select (1234::numeric/1235) * 1235;
          ?column?
----------------------------
 1234.00000000000000000100
(1 row)

regression=# select (1234::float8/1235) * 1235;
 ?column?
----------
     1234
(1 row)

What it boils down to is that numeric is great for storing given decimal inputs exactly, and it can do exact addition/subtraction/multiplication on those too, but as soon as you get into territory where the result is fundamentally inexact it is *not* promised to be better than float8. In fact, it's designed to be more or less the same as float8; see the comments in select_div_scale.
We could probably improve on this if we were to redesign the algorithms around a concept of decimal floating-point, rather than decimal fixed-point as it is now. But I'm not sure how well that would comport with the SQL standard. And I'm very not sure that we could still do it once we'd tied one hand behind our backs by virtue of exporting a bunch of the internals as public API. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
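Tom's effect, a quotient rounded to a finite scale carrying its rounding error into the subsequent product, can be reproduced with Python's decimal module. This is a sketch of the phenomenon, not of numeric's actual select_div_scale logic; the precision values are chosen only to make the residue visible:

```python
from decimal import Decimal, getcontext

# Divide at a limited precision, the way a fixed-point decimal type must
# pick a finite result scale for an infinitely repeating quotient.
getcontext().prec = 16
q = Decimal(1234) / Decimal(1235)   # 0.9991902834008097, rounded

# Multiply back at higher precision so the earlier rounding shows through.
getcontext().prec = 34
print(q * 1235)                     # 1233.9999999999999795, not 1234

# Binary floating point makes the same kind of error, just in base 2;
# here the round-trip happens to land back on 1234 (within display limits).
print((1234 / 1235) * 1235)
```

Whether the residue lands above or below the exact answer depends on which way the quotient rounds, but the underlying point is the same: decimal arithmetic rounds too, just in base 10.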
Re: [HACKERS] tracking commit timestamps
On Tue, Dec 16, 2014 at 01:05:31AM -0300, Alvaro Herrera wrote: Noah Misch wrote: On Mon, Dec 15, 2014 at 12:12:10AM -0800, Michael Paquier wrote: FWIW, I just tried that with MinGW-32 and I can see the error on Win7. I also checked that changing > now() to >= now() fixed the problem, so your assumption was right, Petr. Committed, after fixing the alternate expected output. Thanks. I admit I don't understand what the issue is.

The test assumed that no two transactions of a given backend will get the same timestamp value from now(). That holds so long as ticks of the system time are small enough. Not so on at least some Windows configurations. Notice the repeated timestamp values:

Windows Server 2003 x64, 32-bit build w/ VS2010

localhost template1=# select clock_timestamp(), pg_sleep(.1 * (n % 2)) from generate_series(0,7) t(n);
        clock_timestamp        | pg_sleep
-------------------------------+----------
 2014-12-18 08:34:34.522126+00 |
 2014-12-18 08:34:34.522126+00 |
 2014-12-18 08:34:34.631508+00 |
 2014-12-18 08:34:34.631508+00 |
 2014-12-18 08:34:34.74089+00  |
 2014-12-18 08:34:34.74089+00  |
 2014-12-18 08:34:34.850272+00 |
 2014-12-18 08:34:34.850272+00 |
(8 rows)

GNU/Linux

[local] test=# select clock_timestamp(), pg_sleep(.1 * (n % 2)) from generate_series(0,7) t(n);
        clock_timestamp        | pg_sleep
-------------------------------+----------
 2014-12-19 06:49:47.590556+00 |
 2014-12-19 06:49:47.590611+00 |
 2014-12-19 06:49:47.691488+00 |
 2014-12-19 06:49:47.691508+00 |
 2014-12-19 06:49:47.801483+00 |
 2014-12-19 06:49:47.801502+00 |
 2014-12-19 06:49:47.921486+00 |
 2014-12-19 06:49:47.921505+00 |
(8 rows)

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
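The failure mode above is easy to model: if the system clock only advances in coarse ticks (Windows timers have often ticked at roughly 15.6 ms), two timestamp reads inside one tick return identical values, so consecutive fast transactions see the same now(). A small sketch in integer microseconds (the tick size is a hypothetical illustration, not a measured value):

```python
def coarse_now(real_us, tick_us=15625):
    """A clock that only advances in discrete ticks, like a coarse OS timer."""
    return (real_us // tick_us) * tick_us

# Two transactions commit 50 microseconds apart, inside one clock tick:
a = coarse_now(34_522_126)
b = coarse_now(34_522_176)
print(a == b)  # True: both transactions observe the same timestamp

# On a clock with 1-microsecond resolution they would differ:
print(coarse_now(34_522_126, 1) == coarse_now(34_522_176, 1))  # False
```

This is why the regression test's strict `> now()` comparison was only safe on fine-grained clocks, and relaxing it to `>= now()` fixed the Windows failure.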
Re: [HACKERS] Commitfest problems
On 2014-12-18 10:02:25 -0800, Joshua D. Drake wrote: I think a lot of hackers forget exactly how tender their egos are. Now I say this knowing that a lot of them will go, "Oh give me a break" but as someone who employs hackers, deals with open source AND normal people :P every single day, I can tell you without a single inch of sarcasm that petting egos is one of the ways you get things done in the open source (and really any male dominated) community. To me that's a bit over-the-top stereotyping. However, doing crappy work (and let's not be shy about it, there is NOTHING fun about reviewing someone else's code) needs to have incentive. FWIW, I don't agree with this at all. Reviewing code can be quite interesting - with the one constraint that the problem the patch solves needs to be somewhat interesting. The latter is what I think gets many of the more experienced reviewers - lots of the patches just solve stuff they don't care about. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal: Log inability to lock pages during vacuum
On 2014-12-18 16:05:23 -0600, Jim Nasby wrote: On 12/18/14, 3:02 PM, Alvaro Herrera wrote: Andres Freund wrote: On 2014-12-18 16:41:04 -0300, Alvaro Herrera wrote:

+    if (scan_all)
+        appendStringInfo(buf, _("waited for %d buffer pins\n"),
+                         vacrelstats->pinned_pages);
+    else
+        appendStringInfo(buf,
+                         _("skipped %d pages due to buffer pins\n"),
+                         vacrelstats->pinned_pages);

Unless I miss something this is, as mentioned before, not correct. scan_all doesn't imply at all that we waited for buffer pins. We only do so if lazy_check_needs_freeze(buf). Which usually won't be true for a *significant* number of pages. Ah, interesting, I didn't remember we had that. I guess one possible tweak is to discount the pages we skip from pinned_pages; or we could keep a separate count of pages waited for. Jim, up for a patch? I would prefer that we at least count if we initially don't get the lock; presumably that number is always low anyway and in that case I think we're done with this. If it turns out it is common to initially miss the pin then we could do something fancier. So how about if in the scan_all case we say something like "unable to initially acquire pin on %d buffers\n"? I'd just do away with the difference between scan_all/!scan_all and always say the above. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
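Andres' objection can be made concrete with a toy model of the pin-handling control flow: even with scan_all, VACUUM only waits for a pinned page when the page needs freezing; otherwise the page is skipped. A rough Python sketch of the counters (the page attributes and function are hypothetical illustrations, not the actual lazy_scan_heap code):

```python
def scan(pages, scan_all):
    """Toy model of VACUUM's pin handling: returns (skipped, waited)."""
    skipped = waited = 0
    for page in pages:
        if page["pin_available"]:
            continue  # conditional pin succeeded; page processed normally
        # Pin was not immediately available.
        if scan_all and page["needs_freeze"]:
            waited += 1   # must wait for the pin to freeze tuples
        else:
            skipped += 1  # skipped, even under scan_all
    return skipped, waited

pages = [
    {"pin_available": True,  "needs_freeze": False},
    {"pin_available": False, "needs_freeze": False},
    {"pin_available": False, "needs_freeze": True},
]
print(scan(pages, scan_all=True))   # (1, 1): scan_all still skips a page
print(scan(pages, scan_all=False))  # (2, 0)
```

The model shows why "waited for %d buffer pins" is wrong for the scan_all branch: one pinned page there was never waited for, which is what motivates counting initially-missed pins instead.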