Re: [HACKERS] pg_basebackup is taking backup of extra files inside a tablespace directory
On Wed, Nov 28, 2012 at 12:43 PM, Michael Paquier <michael(dot)paquier(at)gmail(dot)com> wrote:
> On Wed, Nov 28, 2012 at 3:55 PM, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com> wrote:
>> pg_basebackup is taking backup of extra files other than database-related files inside a TABLESPACE directory.
>
> And why do you have files not related to your PG server inside a folder dedicated to a PG server?

In case of a pg_upgrade from an old version to a new version, the new tablespace is created in the same tablespace path with a different tablespace version number. So while taking a backup, we need to skip the other version's data directory files:

$ ls opt/tbs1
PG_9.2_201204301  PG_9.3_201210071

Regards,
Hari Babu.
[HACKERS] Strange behaviour with incompatible psql/server
A friend reported this issue to me, and I find it a bit strange; even after spending some time on it, I couldn't really figure out what's going wrong.

See the two attached SQL files, bad.sql and good.sql. They look exactly the same in the editor. In fact, good.sql was created by copying lines from bad.sql in the editor. There is probably some control character that differentiates the two files, but :set list in vim does not show anything.

Now, if I use 8.4.9 psql to connect to the server running 9.0.10, I get the following results with bad.sql:

$ psql postgres
psql (8.4.9, server 9.0.10)
WARNING: psql version 8.4, server version 9.0.
         Some psql features might not work.
Type "help" for help.

postgres=# \i bad.sql
psql:bad.sql:4: ERROR:  syntax error at or near
LINE 1:
        ^
psql:bad.sql:12: NOTICE:  CREATE TABLE will create implicit sequence "history_id_seq" for serial column "history.id"
psql:bad.sql:12: NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "history_pkey" for table "history"
CREATE TABLE
postgres=#

Notice the syntax error above. Now, if I run good.sql, I don't see any errors:

$ psql postgres
psql (8.4.9, server 9.0.10)
WARNING: psql version 8.4, server version 9.0.
         Some psql features might not work.
Type "help" for help.

postgres=# \i good.sql
DROP TABLE
psql:good.sql:12: NOTICE:  CREATE TABLE will create implicit sequence "history_id_seq" for serial column "history.id"
psql:good.sql:12: NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "history_pkey" for table "history"
CREATE TABLE

Finally, if I use psql from the 9.0.10 release, both files run without any errors. See output from the offending SQL file below:

$ ./install/bin/psql postgres
psql (9.0.10)
Type "help" for help.
postgres=# \i bad.sql
DROP TABLE
psql:bad.sql:12: NOTICE:  CREATE TABLE will create implicit sequence "history_id_seq" for serial column "history.id"
psql:bad.sql:12: NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "history_pkey" for table "history"
CREATE TABLE

While I'm almost certain that this has something to do with special characters that my naked eyes cannot see, all my attempts to spot the difference have failed. So I really have two questions:

1. What's the difference between these files?
2. Why does 9.0 psql work fine with that difference, while 8.4 psql does not?

Any suggestions?

Thanks,
Pavan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Strange behaviour with incompatible psql/server
On 28.11.2012 10:46, Pavan Deolasee wrote:
> While I'm almost certain that this has something to do with special characters that my naked eyes cannot see, all my attempts to spot the difference have failed. So I really have two questions:
>
> 1. What's the difference between these files?

Compare "hexdump -C bad.sql" and "hexdump -C good.sql". There's a UTF-8 Byte-Order-Mark at the beginning of bad.sql; see https://en.wikipedia.org/wiki/Byte_Order_Mark#UTF-8

> 2. Why does 9.0 psql work fine with that difference, while 8.4 psql does not?

Dunno, I'll let you investigate that ;-)

- Heikki
Re: [HACKERS] Strange behaviour with incompatible psql/server
On Wed, Nov 28, 2012 at 2:28 PM, Heikki Linnakangas <hlinnakan...@vmware.com> wrote:
> Compare "hexdump -C bad.sql" and "hexdump -C good.sql". There's a UTF-8 Byte-Order-Mark at the beginning of bad.sql; see https://en.wikipedia.org/wiki/Byte_Order_Mark#UTF-8

Thanks.

>> 2. Why does 9.0 psql work fine with that difference, while 8.4 psql does not?
>
> Dunno, I'll let you investigate that ;-)

So it seems this commit did the trick for later releases:

commit 93d3bac5648bddfe195a9cecc45ef0a2da5e85a8
Author: Peter Eisentraut <pete...@gmx.net>
Date:   Sat Nov 21 23:59:12 2009 +

    Ignore UTF-8-encoded Unicode byte-order mark at the beginning of a file
    if the client encoding is UTF-8.

    a limited version of a patch proposed by Itagaki Takahiro

Should this not be back-patched? The error that comes from not having this fix is very strange and hard to debug for the average user. I'd almost concluded that one should NEVER use an old psql with a new server, even though the warning it prints is not too glaring.

Thanks,
Pavan
Re: [HACKERS] Strange behaviour with incompatible psql/server
On Wed, Nov 28, 2012 at 2:52 PM, Pavan Deolasee <pavan.deola...@gmail.com> wrote:
> Should this not be back-patched? The error that comes from not having this fix is very strange and hard to debug for the average user. I'd almost concluded that one should NEVER use an old psql with a new server, even though the warning it prints is not too glaring.

Never mind. I just read the email thread leading to this commit. This wasn't a bug fix per se, and that's probably the reason it was never back-patched. The discussion itself looked quite inconclusive as well.

Thanks,
Pavan
Re: [HACKERS] plpgsql_check_function - rebase for 9.3
Hello

Here is an updated version:

* possibility to raise (and filter) performance warnings - detects I/O casts
* detects assignment of a composite value to a scalar variable

Regards
Pavel Stehule

2012/11/27 Pavel Stehule <pavel.steh...@gmail.com>:
> Hello
>
> I am sending an updated version - now it is prepared for event triggers and it is a little more robust. I ran pgindent, but I have no experience with it, so I am not sure about the result.
>
> Regards
> Pavel Stehule
>
> 2012/10/7 Selena Deckelmann <sel...@chesnok.com>:
>> Hi!
>>
>> On Tue, Jun 26, 2012 at 1:19 AM, Pavel Stehule <pavel.steh...@gmail.com> wrote:
>>> I am sending a lightly refreshed patch for checking plpgsql functions. I checked different implementations, but without success: a) enhancing SPI with some fake mode can have a negative impact on applications, and the patch was not clear; b) a generic plpgsql walker doesn't save lines either. I invite any ideas on how to improve this patch.
>>
>> I reviewed this and cleaned up bitrot and a little whitespace. In particular, it needed to learn a little about event triggers.
>>
>> This patch is a follow-on from an earlier review thread I found:
>> http://archives.postgresql.org/message-id/d960cb61b694cf459dcfb4b0128514c2072df...@exadv11.host.magwien.gv.at
>>
>> I dug through that thread a bit, and I believe the issues raised by Laurenz, Petr and Alvaro were resolved by Pavel over time. All tests pass, and after a read-through, the code seems fine.
>>
>> This also represents my inaugural use of pg_bsd_indent. I ran it on pl_check.c, which made things mostly better. Happy to try and fix it up more if someone can explain to me what (if anything) I did incorrectly when using it.
>>
>> -selena
>> --
>> http://chesnok.com
Re: [HACKERS] Review: Patch to compute Max LSN of Data Pages
On Wednesday, November 28, 2012 11:49 AM Muhammad Usama wrote:
> On Tue, Nov 27, 2012 at 5:52 PM, Amit kapila <amit.kap...@huawei.com> wrote:
>> Friday, November 23, 2012 5:38 AM Muhammad Usama wrote:
>>> - For the -p {file | dir} option, the utility expects the file path relative to the specified data directory path, which makes things a little confusing. For example:
>>>
>>>   ./pg_computemaxlsn -p data/base/12286/12200 data
>>>   pg_computemaxlsn: file or directory data/base/12286/12200 does not exists
>>>
>>> Although data/base/12286/12200 is a valid file path, we get a "file does not exist" error.
>>
>> As the path of the file is relative to the data directory, that's why the error prints the path as data/base/12286/12200. Do you have any suggestion what should be done for this?
>
> I think we should expect the provided path to be relative to the current directory, or perhaps consider it relative to either one of the data directory or CWD, because normally we expect a path to be relative to CWD if a program asks for a path on the command line. Also, tab completion doesn't work on the terminal if the path is not absolute or relative to the current directory. To handle this, I think we should not change the current directory to the data directory, and should generate the file paths (like for postmaster.pid) where required by appending the data directory path; and before throwing an error, we should check whether the path is valid relative to either DATA or CWD.

Apart from that, I think it also needs to check that the path is under the data directory path in either case, else it can get the LSN of some running server, as pointed out in your other point.

Also, shouldn't we support paths relative to only one of CWD or the data directory, rather than both? Any opinions from others on what is best to support in such a use case:

a. file/folder path relative to CWD (current working directory)
b. file/folder path relative to the data directory
c. both a and b

With Regards,
Amit Kapila.
Re: [HACKERS] FDW for PostgreSQL
2012/11/28 Shigeru Hanada <shigeru.han...@gmail.com>:
> On Sun, Nov 25, 2012 at 5:24 AM, Kohei KaiGai <kai...@kaigai.gr.jp> wrote:
>> I checked the v4 patch, and I have nothing to comment on anymore. So, could you update the remaining EXPLAIN with VERBOSE option stuff?
>
> Thanks for the review. Here is the updated patch.

I checked the patch. The new VERBOSE option of the EXPLAIN statement seems to me to be working fine. I think it is time to hand over this patch to a committer.

It is not a matter to be solved, but just my preference:

postgres=# EXPLAIN (VERBOSE) SELECT * FROM ftbl WHERE a > 0 AND b like '%a%';
                             QUERY PLAN
--------------------------------------------------------------------------------
 Foreign Scan on public.ftbl  (cost=100.00..100.01 rows=1 width=36)
   Output: a, b
   Filter: (ftbl.b ~~ '%a%'::text)
   Remote SQL: SELECT a, b FROM public.tbl WHERE ((a OPERATOR(pg_catalog.>) 0))
(4 rows)

postgres=# EXPLAIN SELECT * FROM ftbl WHERE a > 0 AND b like '%a%';
                         QUERY PLAN
-------------------------------------------------------------
 Foreign Scan on ftbl  (cost=100.00..100.01 rows=1 width=36)
   Filter: (b ~~ '%a%'::text)
(2 rows)

Do you think the qualifier being pushed down should be explained even when the VERBOSE option is not given?

> BTW, we have one more issue around the naming of the new FDW, and it is discussed in another thread.
> http://archives.postgresql.org/message-id/9e59e6e7-39c9-4ae9-88d6-bb0098579...@gmail.com

I don't have any strong opinion about this naming discussion. As long as it does not conflict with an existing name and is not misleading, I think it is reasonable. So, postgre_fdw is OK for me; pgsql_fdw is also welcome; posugure_fdw may make sense only in Japan; pg_fdw is a bit misleading. postgresql_fdw might be the best, but do we have some clear advantage in this name that justifies the additional effort of resolving the conflict with the existing built-in postgresql_fdw_validator() function? I think postgres_fdw is a reasonable enough choice.

Thanks,
--
KaiGai Kohei <kai...@kaigai.gr.jp>
Re: [HACKERS] FDW for PostgreSQL
2012/11/28 Kohei KaiGai <kai...@kaigai.gr.jp>:
> it is reasonable. So, postgre_fdw is OK for me. pgsql_fdw is also welcome.

Sorry, s/postgre_fdw/postgres_fdw/g

Thanks,
--
KaiGai Kohei <kai...@kaigai.gr.jp>
Re: [HACKERS] Bugs in CREATE/DROP INDEX CONCURRENTLY
On 2012-11-27 23:46:58 -0500, Tom Lane wrote:
> Andres Freund <and...@2ndquadrant.com> writes:
>> On 2012-11-27 16:31:15 -0500, Tom Lane wrote:
>>> Andres Freund <and...@2ndquadrant.com> writes:
>>>> Isn't indisprimary updated when an ALTER TABLE ... ADD PRIMARY KEY ... USING someindex; is done? Also I think indoption might be written to as well.
>>>
>>> Ugh, you're right. Somebody wasn't paying attention when those ALTER commands were added.
>
> On closer look, indoption is never updated --- perhaps you were thinking about pg_class.reloptions. indisprimary, indimmediate, and indisclustered are all subject to post-creation updates though.

Yea, I haven't really checked what indoption actually does.

> We could probably alleviate the consequences of this by having those operations reset indcheckxmin if the tuple's old xmin is below the GlobalXmin horizon. That's something for later though --- it's not a data corruption issue, it just means that the index might unexpectedly not be used for queries for a little bit after an ALTER.

mark_index_clustered() does the same btw; it's not a problem in the CLUSTER ... USING ...; case because that creates a new pg_index entry anyway, but in the ALTER TABLE one that's not the case.

> After further study I think the situation is that
>
> (1) The indisvalid/indisready/indislive updates in CREATE/DROP INDEX CONCURRENTLY can, and must, be done in-place since we don't have exclusive lock on the parent table.
>
> (2) All the other updates can be transactional because we hold sufficient locks to ensure that nothing bad will happen. The proposed reductions in ALTER TABLE lock strength would break this in several cases, but that's a problem for another day.
>
> Attached is a very preliminary draft patch for this. I've not addressed the question of whether we can clear indcheckxmin during transactional updates of pg_index rows, but I think it covers everything else talked about in this thread.

Looks good on a quick lookthrough. Will play a bit more once the indcheckxmin stuff is sorted out.

Some comments:
- INDEX_DROP_CLEAR_READY clears indislive; perhaps INDEX_DROP_SET_DEAD or NOT_ALIVE would be more appropriate?
- I noticed while trying my old isolationtester test that heap_inplace_update disregards any locks on the tuple. I don't really see a scenario where this is problematic right now, but it seems a bit dangerous for the future.

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Enabling frontend-only xlog desc routines
On Wednesday, November 28, 2012 12:17 AM Alvaro Herrera wrote:
> I mentioned the remaining issues in a previous email (see message-id 20121025161751.ge6...@alvh.no-ip.org). Attached is a patch that enables xlogdump to #include xlog_internal.h by way of removing that file's inclusion of fmgr.h, which is problematic. I don't think this should be too contentious.

I have tried to go through the xlogdump patch provided in the mail chain of the message-id above. It seems there is no appropriate file/function header, which makes it a little difficult to understand the purpose. This is just a suggestion and not related to this mail of yours.

> The other interesting question remaining is what to do about the rm_desc function in rmgr.c. We're split between these two ideas:
>
> 1. Have this in rmgr.c:
>
> #ifdef FRONTEND
> #define RMGR_REDO_FUNC(func) NULL
> #else
> #define RMGR_REDO_FUNC(func) func
> #endif   /* FRONTEND */
>
> and then use RMGR_REDO_FUNC() in the table.
>
> 2. Have this in rmgr.c:
>
> #ifndef RMGR_REDO_FUNC
> #define RMGR_REDO_FUNC(func) func
> #endif
>
> And then have the xlogdump Makefile use -D to define a suitable RMGR_REDO_FUNC.

What I understood is that only the rm_desc function is needed in xlogdump, so we want to separate it out such that it can be easily used. In Approach 1, the defines for some of the functions (redo, startup, cleanup, ...) become NULL for frontends, so the corresponding functions are hidden from frontends. In Approach 2, the frontend (in this case xlogdump) needs to define a value for that particular macro by using -D in its Makefile.

If my above thinking is right, then I think Approach 2 might be better, as the frontend will have more flexibility if it wants to use some other functionality of rmgr.

With Regards,
Amit Kapila.
Re: [HACKERS] Enabling frontend-only xlog desc routines
On 2012-11-28 18:58:45 +0530, Amit Kapila wrote:
> On Wednesday, November 28, 2012 12:17 AM Alvaro Herrera wrote:
>> I mentioned the remaining issues in a previous email (see message-id 20121025161751.ge6...@alvh.no-ip.org). Attached is a patch that enables xlogdump to #include xlog_internal.h by way of removing that file's inclusion of fmgr.h, which is problematic. I don't think this should be too contentious.
>
> I have tried to go through the xlogdump patch provided in the mail chain of the message-id above. It seems there is no appropriate file/function header, which makes it a little difficult to understand the purpose. This is just a suggestion and not related to this mail of yours.

An updated version of xlogdump with some initial documentation, sensible building, and some more is available at
http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/xlogreader_v3

>> The other interesting question remaining is what to do about the rm_desc function in rmgr.c. We're split between these two ideas:
>>
>> 1. Have this in rmgr.c:
>>
>> #ifdef FRONTEND
>> #define RMGR_REDO_FUNC(func) NULL
>> #else
>> #define RMGR_REDO_FUNC(func) func
>> #endif   /* FRONTEND */
>>
>> and then use RMGR_REDO_FUNC() in the table.
>>
>> 2. Have this in rmgr.c:
>>
>> #ifndef RMGR_REDO_FUNC
>> #define RMGR_REDO_FUNC(func) func
>> #endif
>>
>> And then have the xlogdump Makefile use -D to define a suitable RMGR_REDO_FUNC.
>
> What I understood is that only the rm_desc function is needed in xlogdump, so we want to separate it out such that it can be easily used. In Approach 1, the defines for some of the functions (redo, startup, cleanup, ...) become NULL for frontends, so the corresponding functions are hidden from frontends. In Approach 2, the frontend (in this case xlogdump) needs to define a value for that particular macro by using -D in its Makefile. If my above thinking is right, then I think Approach 2 might be better, as the frontend will have more flexibility if it wants to use some other functionality of rmgr.
I personally favor Approach 1, because I cannot see any potential other use. You basically need to have the full backend code available just to successfully link the other functions. Running is even more complex, and there's no real point in doing that standalone anyway - so, what for? It's not like it's something that cannot be changed should an actual use case emerge.

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
[HACKERS] Review: Extra Daemons / bgworker
Hello Álvaro,

first of all, thank you for bringing this up again and providing a patch. My first attempt at that was more than two years ago [1]. As the author of a former bgworker patch, I'd like to provide an additional review - KaiGai was simply faster to sign up as a reviewer on the commitfest app...

I started my review based on rev 6d7e51.. from your bgworker branch on github, only to figure out later that it wasn't quite up to date. I upgraded to bgworker-7.patch. I hope that's the most recent version.

# Concept

I appreciate the ability to run daemons that are not connected to Postgres shared memory. It satisfies a user request that came up several times when I talked about the bgworkers in Postgres-R. Another good point is the flexible interface via extensions, even allowing different starting points for such background workers.

One thing I'd miss (for use of this framework in Postgres-R) is the ability to start a registered bgworker only upon request, way after the system started, so that the bgworker process doesn't even exist until it is really needed. I realize that RegisterBackgroundWorker() is still necessary in advance to reserve the slots in BackgroundWorkerList and shared memory.

As a use case independent of Postgres-R, think of something akin to worker_spi, but wanting to perform a task every 24h on hundreds of databases. You don't want to keep hundreds of processes occupying PGPROC slots just to perform a measly task every 24h. (We've discussed the ability to let bgworkers re-connect to another database back then. For one, you'd still have currently unneeded worker processes around all the time. And second, I think that's hard to get right - after all, a terminated process is guaranteed not to leak any stale data into a newly started one, no matter what.)

From my point of view, autovacuum is the very first example of a background worker process, and I'm a bit puzzled about it not being properly integrated into this framework.
Once you think of autovacuum as a background job which needs access to Postgres shared memory and a database, but no client connection, it looks like a bit of code duplication (and not using code we already have). I realize this kind of needs the above feature of being able to request the (re)start of bgworkers at arbitrary points in time. However, it would also be a nice test case for the bgworker infrastructure. I'd be happy to help with extending the current patch in that direction, if you agree it's generally useful, or to adapt my bgworker code accordingly.

# Minor technical issues or questions

In assign_maxconnections, et al, GetNumRegisteredBackgroundWorkers() is used in relation to MAX_BACKENDS or to calculate MaxBackends. That seems to leak the bgworkers that registered with neither BGWORKER_SHMEM_ACCESS nor BGWORKER_BACKEND_DATABASE_CONNECTION set. Or is there any reason to discount such extra daemon processes?

The additional contrib modules auth_counter and worker_spi are missing from the contrib/Makefile. If that's intentional, they should probably be listed under "Missing".

The auth_counter module leaves worker.bgw_restart_time uninitialized.

Being an example, it might make sense for auth_counter to provide a signal handler that just calls SetLatch() to interrupt WaitLatch() and make auth_counter emit its log line upon request, i.e. to show how to use SIGUSR1.

The bgw_restart_time doesn't always work (certainly not the way I'm expecting it to). For example, if you forget to pass the BGWORKER_SHMEM_ACCESS flag, trying to LWLockAcquire() leads to the worker being restarted immediately and repeatedly - independent of the bgw_restart_time setting. The same holds true if the bgworker exits with status 0, or in case it segfaults. Not when exiting with code 1, though. Why is that? Especially in case of a segfault, or equally hard errors that can be expected to occur repeatedly, I don't want the worker to be restarted that frequently.
# Documentation

There are two working examples in contrib. auth_counter could use a header comment similar to worker_spi's, quickly describing what it does. There's no example of a plain extra daemon, without shmem access.

Coding guidelines for bgworker / extra daemon writers are missing. I read these must not use sleep(), with an explanation in both examples. Other questions that come to mind: what about signal handlers? fork()? threads? Can/should they use PG_TRY/PG_CATCH or setjmp()/longjmp()? How to best load other 3rd-party libraries, e.g. for OpenGL processing?

Especially for extra daemons (i.e. bgw_flags = 0), the differences from running an external daemon should be documented. I myself am unclear about all of the implications that running as a child of the postmaster has (OOM and cgroups come to mind - there certainly are other aspects).

(I myself still have a hard time finding a proper use case for extra daemons. I don't want the postmaster to turn into a general purpose watchdog for everything and the kitchen sink.
Re: [HACKERS] Further pg_upgrade analysis for many tables
On Tue, Nov 27, 2012 at 09:35:10PM -0800, Jeff Janes wrote:
> On Tue, Nov 27, 2012 at 8:13 PM, Bruce Momjian <br...@momjian.us> wrote:
>> I have some new interesting results (in seconds, test script attached):
>>
>>           ------ -Fc ------   ------- dump | pg_restore/psql -------   --- pg_upgrade ---
>>             dump    restore     -Fc   -Fc|-1   -Fc|-j     -Fp  -Fp|-1       git     patch
>>       1     0.14       0.08    0.14     0.16     0.19    0.13    0.15     11.04     13.07
>>    1000     3.08       3.65    6.53     6.60     5.39    6.37    6.54     21.05     22.18
>>    2000     6.06       6.52   12.15    11.78    10.52   12.89   12.11     31.93     31.65
>>    4000    11.07      14.68   25.12    24.47    22.07   26.77   26.77     56.03     47.03
>>    8000    20.85      32.03   53.68    45.23    45.10   59.20   51.33    104.99     85.19
>>   16000    40.28      88.36  127.63    96.65   106.33  136.68  106.64    221.82    157.36
>>   32000    93.78     274.99  368.54   211.30   294.76  376.36  229.80    544.73    321.19
>>   64000   197.79    1109.22 1336.83   577.83  1117.55 1327.98  567.84   1766.12    763.02
>>
>> I tested custom format with pg_restore -j and -1, as well as text restore. The winner was pg_dump -Fc | pg_restore -1;
>
> I don't have the numbers at hand, but if my relcache patch is accepted, then -1 stops being faster. -1 gets rid of the AtEOXact relcache N^2 behavior, but at the cost of invoking a different N^2, that one in the stats system.

I was going to ask you that. :-)  Let me run a test with your patch now.

--
  Bruce Momjian  <br...@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +
[HACKERS] pg_upgrade test picks up unwanted environment values
I noticed yesterday while doing some buildfarm tinkering that the test script for pg_upgrade doesn't unset various important environment variables as pg_regress does. This can cause it to fail, e.g. if PGUSER is set, as it was with my testing. I propose to add this to the script unless there are objections:

unset PGDATABASE
unset PGUSER
unset PGSERVICE
unset PGSSLMODE
unset PGREQUIRESSL
unset PGCONNECT_TIMEOUT
unset PGHOST
unset PGHOSTADDR

cheers

andrew
Re: [HACKERS] Review: Extra Daemons / bgworker
Markus Wanner wrote:

Hi Markus,

Many thanks for your review.

> first of all, thank you for bringing this up again and providing a patch. My first attempt at that was more than two years ago [1]. As the author of a former bgworker patch, I'd like to provide an additional review - KaiGai was simply faster to sign up as a reviewer on the commitfest app...

I remember your patchset. I didn't look at it for this round, for no particular reason. I did look at KaiGai's submission from two commitfests ago, and also at a patch from Simon which AFAIK was never published openly. Simon's patch merged the autovacuum code to make autovac workers behave like bgworkers as handled by his patch, just like you suggest. To me it didn't look all that good; I didn't have the guts for that, hence the separation. If bgworkers are later found to work well enough, we can port autovac workers to use this framework, but for now it seems to me that the conservative thing is to leave autovac untouched.

(As an example, we introduced ilist some commits back and changed some uses to it; but there are still many places where we're using SHM_QUEUE, or List, or open-coded lists, which we could port to the new infrastructure, but there's no pressing need to do it.)

> I started my review based on rev 6d7e51.. from your bgworker branch on github, only to figure out later that it wasn't quite up to date. I upgraded to bgworker-7.patch. I hope that's the most recent version.

Sorry about that -- forgot to push to github. bgworker-7 corresponds to commit 0a49a540b, which I have just pushed to github.

> # Concept
>
> I appreciate the ability to run daemons that are not connected to Postgres shared memory. It satisfies a user request that came up several times when I talked about the bgworkers in Postgres-R. Another good point is the flexible interface via extensions, even allowing different starting points for such background workers.

Great.

> One thing I'd miss (for use of this framework in Postgres-R) is the ability to start a registered bgworker only upon request, way after the system started, so that the bgworker process doesn't even exist until it is really needed. I realize that RegisterBackgroundWorker() is still necessary in advance to reserve the slots in BackgroundWorkerList and shared memory. As a use case independent of Postgres-R, think of something akin to worker_spi, but wanting to perform a task every 24h on hundreds of databases. You don't want to keep hundreds of processes occupying PGPROC slots just to perform a measly task every 24h.

Yeah, this is something I specifically kept out initially to keep things simple. Maybe one thing to do in this area would be to ensure that there is a certain number of PGPROC elements reserved specifically for bgworkers, kind of like autovacuum workers have. That would avoid having regular clients exhaust slots for bgworkers, and vice versa. How are you envisioning that the requests would occur?

> # Minor technical issues or questions
>
> In assign_maxconnections, et al, GetNumRegisteredBackgroundWorkers() is used in relation to MAX_BACKENDS or to calculate MaxBackends. That seems to leak the bgworkers that registered with neither BGWORKER_SHMEM_ACCESS nor BGWORKER_BACKEND_DATABASE_CONNECTION set. Or is there any reason to discount such extra daemon processes?

No, I purposefully left those out, because they don't need a PGPROC. In fact, that seems to me to be the whole point of non-shmem-connected workers: you can have as many as you like and they won't cause a backend-side impact. You can use such a worker to connect via libpq to the server, for example.

> The additional contrib modules auth_counter and worker_spi are missing from the contrib/Makefile. If that's intentional, they should probably be listed under "Missing". The auth_counter module leaves worker.bgw_restart_time uninitialized. Being an example, it might make sense for auth_counter to provide a signal handler that just calls SetLatch() to interrupt WaitLatch() and make auth_counter emit its log line upon request, i.e. to show how to use SIGUSR1.

KaiGai proposed that we remove auth_counter. I don't see why not; I mean, worker_spi is already doing most of what auth_counter does, so why not? However, as you say, maybe we need more coding examples.

> The bgw_restart_time doesn't always work (certainly not the way I'm expecting it to). For example, if you forget to pass the BGWORKER_SHMEM_ACCESS flag, trying to LWLockAcquire() leads to the worker being restarted immediately and repeatedly - independent of the bgw_restart_time setting. The same holds true if the bgworker exits with status 0, or in case it segfaults. Not when exiting with code 1, though. Why is that? Especially in case of a segfault, or equally hard errors that can be expected to occur repeatedly, I don't want the worker to be restarted that frequently.

Ah, that's just a bug, of course.

--
Álvaro Herrera
Re: [HACKERS] Review: Extra Daemons / bgworker
On 11/28/2012 03:51 PM, Alvaro Herrera wrote:

I remember your patchset. I didn't look at it for this round, for no particular reason. I did look at KaiGai's submission from two commitfests ago, and also at a patch from Simon which AFAIK was never published openly. Simon's patch merged the autovacuum code to make autovac workers behave like bgworkers as handled by his patch, just like you suggest. To me it didn't look all that good; I didn't have the guts for that, hence the separation. If later bgworkers are found to work well enough, we can port autovac workers to use this framework, but for now it seems to me that the conservative thing is to leave autovac untouched.

Hm.. interesting to see how the same idea grows into different patches and gets refined until we end up with something considered committable. Do you remember anything in particular that didn't look good? Would you help review a patch on top of bgworker-7 that changed autovacuum into a bgworker?

(As an example, we introduced ilist some commits back and changed some uses to it; but there are still many places where we're using SHM_QUEUE, or List, or open-coded lists, which we could port to the new infrastructure, but there's no pressing need to do it.)

Well, I usually like cleaning things up earlier rather than later (my desk clearly being an exception to that rule, though). But yeah, new code usually brings new bugs with it.

Sorry about that -- forgot to push to github. bgworker-7 corresponds to commit 0a49a540b, which I have just pushed to github.

Thanks.

Maybe one thing to do in this area would be to ensure that there is a certain number of PGPROC elements reserved specifically for bgworkers, kind of like autovacuum workers have. That would avoid having regular clients exhausting slots for bgworkers, and vice versa.

Yeah, I think that's mandatory anyway; see below.

How are you envisioning that the requests would occur?
Just like av_launcher does it now: set a flag in shared memory and signal the postmaster (PMSIGNAL_START_AUTOVAC_WORKER). (That's why I'm so puzzled: it looks like it's pretty much all there already. I even remember a discussion about that mechanism probably not being fast enough to spawn bgworkers, and a proposal to add multiple such flags, so an avlauncher-like daemon could ask for multiple bgworkers to be started in parallel. I've even measured the serial bgworker fork rate back then; IIRC it was in the hundreds of forks per second...)

# Minor technical issues or questions

In assign_maxconnections et al., GetNumRegisteredBackgroundWorkers() is used in relation to MAX_BACKENDS or to calculate MaxBackends. That seems to leak the bgworkers that registered with neither BGWORKER_SHMEM_ACCESS nor BGWORKER_BACKEND_DATABASE_CONNECTION set. Or is there any reason to discount such extra daemon processes?

No, I purposefully left those out, because those don't need a PGPROC. In fact, that seems to me to be the whole point of non-shmem-connected workers: you can have as many as you like and they won't cause a backend-side impact. You can use such a worker to connect via libpq to the server, for example.

Hm.. well, in that case the shmem-connected ones are *not* counted. If I create and run an extension that registers 100 shmem-connected bgworkers, I cannot connect to a system with max_connections=100 anymore, because the bgworkers then already occupy all of the connections. Please add the registered shmem-connected bgworkers to the max_connections limit. I think it's counterintuitive to have autovacuum workers reserved separately, but extensions' bgworkers eat (client) connections. Or put another way: max_connections should always be the max number of *client* connections the DBA wants to allow. (Or, if that's in some way complicated, please give the DBA an additional GUC akin to max_background_workers.
That can be merged with the current max_autovacuum_workers, once/if we rebase autovacuum onto bgworkers.)

The additional contrib modules auth_counter and worker_spi are missing from the contrib/Makefile. If that's intentional, they should probably be listed under Missing. The auth_counter module leaves worker.bgw_restart_time uninitialized. Being an example, it might make sense for auth_counter to provide a signal that just calls SetLatch() to interrupt WaitLatch() and make auth_counter emit its log line upon request, i.e. show how to use SIGUSR1.

KaiGai proposed that we remove auth_counter. I don't see why not; I mean, worker_spi is already doing most of what auth_counter is doing, so why not?

Agreed.

However, as you say, maybe we need more coding examples.

Maybe a minimally usable extra daemon example? Showing how to avoid common pitfalls? Use cases, anybody? :-)

Ah, that's just a bug, of course.

I see. Glad my review found it.

Regards

Markus Wanner

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your
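The flag-plus-signal mechanism Markus describes can be sketched outside PostgreSQL. This is a minimal Python analogy, not the actual postmaster code: a plain dict stands in for the shared-memory request flags, SIGUSR1 for the PMSIGNAL-style wakeup, and the handler merely records the start instead of forking a worker.

```python
import os
import signal

# Hypothetical stand-ins: a dict for the shared-memory request flags,
# a list for the workers the "postmaster" has started.
start_requests = {"bgworker": False}
started = []

def postmaster_sigusr1(signum, frame):
    # Postmaster side: woken by the signal, inspect the shared flag and
    # "fork" the requested worker (recorded here instead of forked).
    if start_requests["bgworker"]:
        start_requests["bgworker"] = False
        started.append("bgworker")

signal.signal(signal.SIGUSR1, postmaster_sigusr1)

def request_bgworker_start():
    # Requesting process: set the flag first, then signal the postmaster,
    # mirroring how av_launcher uses PMSIGNAL_START_AUTOVAC_WORKER.
    start_requests["bgworker"] = True
    os.kill(os.getpid(), signal.SIGUSR1)

request_bgworker_start()
```

The ordering matters just as in the real mechanism: the flag must be visible before the signal arrives, or the wakeup is lost.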
[HACKERS] json accessors
This is a proposal to create some basic functions to extract values from json. The simple functions I envision would be:

* json_object_keys(json) => setof text: returns the set of dequoted, unescaped keys of the object, errors if it's not an object
* json_get(json, key text) => json: returns the json value corresponding to the key text in the json object, null if not found, error if it's not an object
* json_get(json, index int) => json: returns the json value of the indexth element in the json array, null if the index is outside the array bounds, errors if it's not an array
* json_get_as_text(json, key text or index int) => text: same as json_get() except that it returns dequoted, unescaped text for a quoted leaf field

I also propose to map the json_get functions to the operator '->' and json_get_as_text to '->>', so that given x has this json value:

{"a":[{"b":"c","d":"e"},{"f":true,"g":1}]}

the expression x->'a'->0->'d' will yield 'e', x->'a'->0->'f' will yield 'true' and x->'a'->0 will yield '{"b":"c","d":"e"}'. The operators would make using these a whole lot nicer :-)

Various people have suggested putting json_path or something similar into the core. I'm not sure we want to do that, partly because there are several competing entries in this field, and partly because I don't want to get into the business of evaluating json predicate tests, which I think any tolerably complete gadget would need to do.

Regarding implementation, the way I propose to do this is to modify the json parser a bit to turn it into a recursive descent parser, with hooks for various operations. NULL hooks would leave us with the validating parser we have now with no side effects. The hook functions themselves will be very small. This would also allow us to do other things very simply at a later stage; for example, a json-to-xml transformation function would be very easy to construct using this infrastructure, and without disturbing any existing functionality.
cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] InvokeObjectAccessHook versus DROP INDEX CONCURRENTLY
Just a side note that the above combination doesn't work, at least not if the object access hook tries to make any database state updates. I've put a hack into index_drop that should detect the case:

/*
 * We must commit our transaction in order to make the first pg_index
 * state update visible to other sessions.  If the DROP machinery
 * has already performed any other actions (removal of other objects,
 * pg_depend entries, etc), the commit would make those actions
 * permanent, which would leave us with inconsistent catalog state
 * if we fail partway through the following sequence.  Since DROP
 * INDEX CONCURRENTLY is restricted to dropping just one index that
 * has no dependencies, we should get here before anything's been
 * done --- but let's check that to be sure.  We can verify that the
 * current transaction has not executed any transactional updates by
 * checking that no XID has been assigned.
 */
if (GetTopTransactionIdIfAny() != InvalidTransactionId)
    ereport(ERROR,
            (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
             errmsg("DROP INDEX CONCURRENTLY must be first action in transaction")));

but I wonder exactly what people think they're going to be doing with object access hooks, and whether the hook call needs to be done somewhere else instead.

regards, tom lane

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] InvokeObjectAccessHook versus DROP INDEX CONCURRENTLY
Tom Lane wrote:

Just a side note that the above combination doesn't work, at least not if the object access hook tries to make any database state updates. I've put a hack into index_drop that should detect the case:

/*
 * We must commit our transaction in order to make the first pg_index
 * state update visible to other sessions.  If the DROP machinery
 * has already performed any other actions (removal of other objects,
 * pg_depend entries, etc), the commit would make those actions
 * permanent, which would leave us with inconsistent catalog state
 * if we fail partway through the following sequence.  Since DROP
 * INDEX CONCURRENTLY is restricted to dropping just one index that
 * has no dependencies, we should get here before anything's been
 * done --- but let's check that to be sure.  We can verify that the
 * current transaction has not executed any transactional updates by
 * checking that no XID has been assigned.
 */
if (GetTopTransactionIdIfAny() != InvalidTransactionId)
    ereport(ERROR,
            (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
             errmsg("DROP INDEX CONCURRENTLY must be first action in transaction")));

but I wonder exactly what people think they're going to be doing with object access hooks, and whether the hook call needs to be done somewhere else instead.

We don't have autonomous transactions currently, but there has been plenty of talk about adding them. Presumably an access hook could use them to, say, maintain an audit log. In that case, the autonomous transaction could very well make other updates before we get to this point; but I assume the state of the other transaction should not affect what's going on in the transaction that's running the DROP INDEX CONCURRENTLY.

-- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump --split patch
Marko Tiikkaja wrote: On 16/11/2012 15:52, Dimitri Fontaine wrote: What happens if you have a table foo and another table FoO?

They would go to the same file. If you think there are technical issues behind that decision (e.g. the dump would not restore), I would like to hear an example case.

create table foo (a int, b text);
create type bar as (stuff foo);
create table "FoO" (more bar);

-- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
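The collision Álvaro is pointing at is easy to see mechanically. This is a sketch of the objection only, under an assumed file-naming scheme (one downcased file name per object, which is my reading of "they would go to the same file"), not the patch's actual code:

```python
from collections import defaultdict

def split_dump(objects):
    """Group DDL by a hypothetical per-object file name; downcasing the
    identifier makes foo and "FoO" land in the same file."""
    files = defaultdict(list)
    for schema, name, ddl in objects:
        files[f"{schema}.{name.lower()}.sql"].append(ddl)
    return dict(files)

dump = split_dump([
    ("public", "foo", "CREATE TABLE foo (a int, b text);"),
    ("public", "FoO", 'CREATE TABLE "FoO" (more bar);'),
])
# Both tables end up in public.foo.sql, so their relative order -- which
# matters in the example above, since "FoO" depends on foo via type bar --
# has to be preserved within that one file for the dump to restore.
```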
Re: [HACKERS] json accessors
On Wed, Nov 28, 2012 at 11:04 AM, Andrew Dunstan and...@dunslane.net wrote:

This is a proposal to create some basic functions to extract values from json. The simple functions I envision would be:

* json_object_keys(json) => setof text: returns the set of dequoted, unescaped keys of the object, errors if it's not an object
* json_get(json, key text) => json: returns the json value corresponding to the key text in the json object, null if not found, error if it's not an object
* json_get(json, index int) => json: returns the json value of the indexth element in the json array, null if the index is outside the array bounds, errors if it's not an array
* json_get_as_text(json, key text or index int) => text: same as json_get() except that it returns dequoted, unescaped text for a quoted leaf field

Comments (this is awesome btw):

*) ISTM your key text operators are a reasonable replacement for a hypothetical json_path. That said, you're basically forcing json-sql mapping through a highly iterative API, which I don't like. At the very least, I think json_get should return setof json and return all matching constructions. I won't miss predicate tests: we can do all that in SQL.

Non-trivial json productions in postgres require the creation of special composite types that structure the data that we (I?) rig up in SQL before routing to json. What about having functions that work in the opposite direction:

*) can you access both arrays and records with numeric positional syntax (hopefully, yes?), for example: x->0->0

*) json_object_keys(json) seems too special-case to me. how about: json_each(json), which returns a set of key/value pairs and would work on arrays or objects (for arrays the key could be invented from the index).

*) json_get_as_text(json, key text or index int) => text: prefer json_to_text() naming. also json_to_hstore(), etc.
*) have you considered something like anyelement from_json(anyelement, json), or select json::some_type; (this may or may not be possible given our casting mechanics; i don't know). My reasoning here is that for non-trivial json productions we (I?) typically use composite types to rigidly control the structure of the output document. For 'restful' type protocols I might want to use the same trick: there would be a set of nested composite types/arrays (or even, in trivial cases, a table) that would cleanly map to the document. The parsing here can and should be automatic; this would give nice symmetry with your xxx_to_json functions. Obviously conversion here would be best effort, but when it works, it would be wonderful:

WITH json_data AS (
  SELECT from_json(null::foo[], input_doc)
),
i1 AS (INSERT INTO bar SELECT ... FROM json_data),
i2 AS (INSERT INTO baz SELECT ... FROM json_data)

where ... would be some combination of unnest() and composite type access syntax. Now, some documents in json won't translate cleanly to composite types because json allows for heterogeneous arrays. But if we're in control of both sides of the protocol, that shouldn't matter.

merlin

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Materialized views WIP patch
On Tue, Nov 27, 2012 at 10:58 AM, Pavel Stehule pavel.steh...@gmail.com wrote: I unlike using keywords DO for this purpose - when we use it for anonymous blocks Yeah, I don't much like that either. My original suggestion when Kevin and I discussed this over voice was ALTER MATERIALIZED VIEW .. REFRESH or ALTER MATERIALIZED VIEW .. UPDATE. I don't particularly like syntaxes involving DO or LOAD because those words already have strong associations with completely unrelated features. Now, if we don't want to do that and we don't want to use ALTER for a data-modifying command either, another option would be to invent a new toplevel command: REFRESH view_name; Of course, that does introduce another keyword, but the penalty for a new unreserved keyword is pretty small. It seems like a rough analogue of CLUSTER, which could be spelled ALTER TABLE table_name UPDATE TABLE ORDER TO if keyword minimization trumped both concision and clarity, but it doesn't. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bugs in CREATE/DROP INDEX CONCURRENTLY
On 2012-11-28 14:09:11 -0500, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: On 2012-11-27 23:46:58 -0500, Tom Lane wrote: Attached is a very preliminary draft patch for this. I've not addressed the question of whether we can clear indcheckxmin during transactional updates of pg_index rows, but I think it covers everything else talked about in this thread. - I noticed while trying my old isolationtester test that heap_update_inplace disregards any locks on the tuple. I don't really see a scenario where this is problematic right now, seems a bit dangerous for the future though. I think this should be all right --- we have at least ShareUpdateExclusiveLock on the table and the index before we do anything, so nobody else should be fooling with its pg_index entry. Attached is an updated patch for HEAD that I think is about ready to go. I'll start making a back-patchable version shortly. Looks good! One minor thing I haven't noticed earlier: Perhaps we should also skip over invalid indexes in transformTableLikeClause's CREATE_TABLE_LIKE_INDEXES case? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] binary heap implementation
On 2012-11-27 11:56:41 -0500, Robert Haas wrote: [ Sorry for the slow response on this, Thanksgiving interfered. ] On Wed, Nov 21, 2012 at 3:41 PM, Andres Freund and...@2ndquadrant.com wrote: One very minor nitpick I unfortunately just found now, not sure when that changed: binaryheap_replace_first() hardcodes the indices for the left/right node of the root node. I would rather have it use (left|right)_offset(0).

Hmm, yeah... but come to think of it, why do we need that special case at all? Why not just call sift_down on the root node and call it good? See the attached version, which simplifies the code considerably and also makes some comment adjustments per Abhijit's comments.

The simplification made me worry for a second, but after checking it out I realized that my fear was only based on my original version, where I did a

kv = simpleheap_remove_first(heap);
simpleheap_add(heap, kv->key, kv->value);

if there was work to be done. But Abhijit optimized that code to do less work, so the number of comparisons is exactly the same before/after your simplifications. With considerably less code. Looks ready to me.

Thanks, Andres

-- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
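The simplification under discussion is easy to see in miniature: replace-first can just overwrite the root and call sift_down on it, with no hand-coded special case for the root's two children. A Python sketch of that shape (names are illustrative, not the C API in the patch):

```python
def sift_down(heap, i, cmp=lambda a, b: a < b):
    """Restore the heap property below node i (min-heap under cmp)."""
    n = len(heap)
    while True:
        left, right = 2 * i + 1, 2 * i + 2   # left_offset(i), right_offset(i)
        smallest = i
        if left < n and cmp(heap[left], heap[smallest]):
            smallest = left
        if right < n and cmp(heap[right], heap[smallest]):
            smallest = right
        if smallest == i:
            return
        heap[i], heap[smallest] = heap[smallest], heap[i]
        i = smallest

def replace_first(heap, value):
    """binaryheap_replace_first without the special-cased root children:
    overwrite the root and just sift it down."""
    heap[0] = value
    sift_down(heap, 0)

h = [1, 3, 2, 7, 4]
replace_first(h, 9)
h[0]   # 2: the old second-smallest element becomes the new root
```

This does the same comparisons a remove-then-add pair would need when the new value sinks all the way down, but never more, which is the point Andres checked.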
Re: [HACKERS] Further pg_upgrade analysis for many tables
On Tue, Nov 27, 2012 at 09:35:10PM -0800, Jeff Janes wrote: I tested custom format with pg_restore -j and -1, as well as text restore. The winner was pg_dump -Fc | pg_restore -1; I don't have the numbers at hand, but if my relcache patch is accepted, then -1 stops being faster. -1 gets rid of the AtEOXact relcache N^2 behavior, but at the cost of invoking a different N^2, that one in the stats system.

OK, here are the testing results:

#tbls      git      -1  AtEOXact    both
    1    11.06   13.06    10.99    13.20
 1000    21.71   22.92    22.20    22.51
 2000    32.86   31.09    32.51    31.62
 4000    55.22   49.96    52.50    49.99
 8000   105.34   82.10    95.32    82.94
16000   223.67  164.27   187.40   159.53
32000   543.93  324.63   366.44   317.93
64000  1697.14  791.82   767.32   752.57

Up to 2k, they are all similar. At 4k and 8k the -1 patch is a win, and 16k+ really need both patches. I will continue working on the -1 patch, and hopefully we can get your AtEOXact patch in soon. Is someone reviewing that?

-- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Materialized views WIP patch
Robert Haas wrote: I don't particularly like syntaxes involving DO or LOAD because those words already have strong associations with completely unrelated features. Now, if we don't want to do that and we don't want to use ALTER for a data-modifying command either, another option would be to invent a new toplevel command: REFRESH view_name; Of course, that does introduce another keyword, but the penalty for a new unreserved keyword is pretty small.

Of the alternatives to LOAD MATERIALIZED VIEW, something involving REFRESH seems the best to me. The question is whether REFRESH MATERIALIZED VIEW (or just REFRESH) is more clear, and whether it is so by enough to merit another keyword. Of course, there is a chance that we may wind up needing that keyword for declaring incremental updates anyway, so it might be a matter of *when* we do it rather than *whether* we do it -- depending on the yet-to-be-determined syntax for specifying incremental updates. My personal preference is still for LOAD MATERIALIZED VIEW because it implies a complete regeneration rather than something more incremental, but I realize that is subjective.

-Kevin

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json accessors
On 11/28/2012 02:08 PM, Merlin Moncure wrote: On Wed, Nov 28, 2012 at 11:04 AM, Andrew Dunstan and...@dunslane.net wrote:

This is a proposal to create some basic functions to extract values from json. The simple functions I envision would be:

* json_object_keys(json) => setof text: returns the set of dequoted, unescaped keys of the object, errors if it's not an object
* json_get(json, key text) => json: returns the json value corresponding to the key text in the json object, null if not found, error if it's not an object
* json_get(json, index int) => json: returns the json value of the indexth element in the json array, null if the index is outside the array bounds, errors if it's not an array
* json_get_as_text(json, key text or index int) => text: same as json_get() except that it returns dequoted, unescaped text for a quoted leaf field

Comments (this is awesome btw):

Thanks for the input.

*) ISTM your key text operators are a reasonable replacement for a hypothetical json_path. That said, you're basically forcing json-sql mapping through a highly iterative API, which I don't like. At the very least, I think json_get should return setof json and return all matching constructions. I won't miss predicate tests: we can do all that in SQL.

Yes, it's iterative. And for deeply nested json it might be somewhat inefficient, although the parser is pretty fast AFAICT. But it's a start.

Non-trivial json productions in postgres require the creation of special composite types that structure the data that we (I?) rig up in SQL before routing to json. What about having functions that work in the opposite direction:

*) can you access both arrays and records with numeric positional syntax (hopefully, yes?), for example: x->0->0

You can't do that in JS, so I'm not clear why we should allow it.

*) json_object_keys(json) seems too special-case to me.
how about: json_each(json), which returns a set of key/value pairs and would work on arrays or objects (for arrays the key could be invented from the index).

Again, I don't think we should conflate the processing for arrays and objects. But I could see doing each(json) => setof (text, json) (and maybe a similar function returning setof (text, text), which would dequote leaf nodes as json_get_as_text() does). And similarly a couple of functions to unnest arrays.

*) json_get_as_text(json, key text or index int) => text: prefer json_to_text() naming. also json_to_hstore(), etc.

json_to_text seems rather misleading as a name here. Maybe we could remove the _as from the name if that's bothering you. As for json_to_hstore, as I mentioned, the design is intended to enable the easy construction of such transformations, although for hstore anything except a trivial json structure (i.e. an unnested object) might have unappealing results. But in any case, the important thing to do first is to get the infrastructure in place. Time is very short and I don't want to extend this very much.

*) have you considered something like anyelement from_json(anyelement, json), or select json::some_type; (this may or may not be possible given our casting mechanics; i don't know).

I have no idea what the semantics of this would be.

cheers andrew

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
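Andrew's counter-proposal -- each(json) => setof (text, json) for objects, with separate array unnesting rather than one conflated function -- can be modeled in a few lines of Python. The function bodies and the None-for-NULL mapping are my own sketch, not code from the patch:

```python
import json

def json_each(doc):
    """Sketch of the discussed each(json) -> setof (key text, value json);
    errors on anything but an object, per the keep-arrays-separate view."""
    val = json.loads(doc)
    if not isinstance(val, dict):
        raise TypeError("json_each expects an object")
    return [(k, json.dumps(v)) for k, v in val.items()]

def json_array_elements(doc):
    """Companion unnest for arrays (hypothetical name)."""
    val = json.loads(doc)
    if not isinstance(val, list):
        raise TypeError("json_array_elements expects an array")
    return [json.dumps(v) for v in val]

json_each('{"f": true, "g": 1}')    # [('f', 'true'), ('g', '1')]
```

The values stay json on the way out, so a (text, text) variant would only differ by dequoting string leaves, as json_get_as_text() does.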
Re: [HACKERS] WIP patch for hint bit i/o mitigation
Merlin Moncure escribió: Maybe abstracting 'last xid cache' along with hint bit management out of both transam.c and tqual.c into something like 'hints.c' is appropriate, but that's a more invasive change.

It would be good to have such a patch to measure/compare performance of both approaches. It does seem like the more invasive change might be more expensive for both the transam.c and the tqual.c uses, though; and it's not clear that there's anything to be gained by having them be together in a single module.

-- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json accessors
On 11/28/2012 03:44 PM, Andrew Dunstan wrote: As for json_to_hstore, as I mentioned, the design is intended to enable the easy construction of such transformations, although for hstore anything except a trivial json structure (i.e. an unnested object) might have unappealing results. But in any case, the important thing to do first is to get the infrastructure in place. Time is very short and I don't want to extend this very much.

The other thing about doing json_to_hstore() is that, since hstore is not itself a core type, we couldn't do that in the core json module, and therefore we'd either need to expose an API to the JSON parser or replicate it in the hstore module. Exposing it is probably the better way to go. Then people could write extensions that process json just by supplying the hooked functions.

cheers andrew

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] foreign key locks
Alvaro Herrera wrote: Here's version 24. This no longer applies after the rmgr rm_desc patch. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bugs in CREATE/DROP INDEX CONCURRENTLY
Andres Freund and...@2ndquadrant.com writes: One minor thing I haven't noticed earlier: Perhaps we should also skip over invalid indexes in transformTableLikeClause's CREATE_TABLE_LIKE_INDEXES case? I left that as-is intentionally: the fact that an index isn't valid doesn't prevent us from cloning it. A relevant data point is that pg_dump doesn't care whether indexes are valid or not --- it'll dump their definitions anyway. I agree it's a judgment call, though. Anybody want to argue for the other position? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bugs in CREATE/DROP INDEX CONCURRENTLY
On 2012-11-28 17:42:18 -0500, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: One minor thing I haven't noticed earlier: Perhaps we should also skip over invalid indexes in transformTableLikeClause's CREATE_TABLE_LIKE_INDEXES case? I left that as-is intentionally: the fact that an index isn't valid doesn't prevent us from cloning it. A relevant data point is that pg_dump doesn't care whether indexes are valid or not --- it'll dump their definitions anyway. I agree it's a judgment call, though. Anybody want to argue for the other position?

Hm. Seems odd to include indexes that are being dropped concurrently at that moment. But then, we can't really detect that situation and, as you say, it's consistent with pg_dump... Hm.

Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Refactor flex and bison make rules
Andrew Dunstan and...@dunslane.net writes: On 11/28/2012 02:14 PM, Alvaro Herrera wrote: Okapi has been failing sporadically on ecpg, and I wonder if it's related to this change. Well, it looks like the make is broken and missing a clear dependency requirement. I think we need to ask Jeremy to turn off parallel build for okapi. Yeah, we already know that unpatched make 3.82 has got serious parallelism bugs: http://archives.postgresql.org/pgsql-hackers/2012-09/msg00397.php I wonder whether adding another .NOTPARALLEL directive would be a better idea than insisting people get hold of patched versions. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Refactor flex and bison make rules
On 11/28/12 6:01 PM, Tom Lane wrote: Andrew Dunstan and...@dunslane.net writes: On 11/28/2012 02:14 PM, Alvaro Herrera wrote: Okapi has been failing sporadically on ecpg, and I wonder if it's related to this change. Well, it looks like the make is broken and missing a clear dependency requirement. I think we need to ask Jeremy to turn off parallel build for okapi. Yeah, we already know that unpatched make 3.82 has got serious parallelism bugs: http://archives.postgresql.org/pgsql-hackers/2012-09/msg00397.php I wonder whether adding another .NOTPARALLEL directive would be a better idea than insisting people get hold of patched versions.

We could put

ifeq ($(MAKE_VERSION),3.82)
.NOTPARALLEL:
endif

into Makefile.global.

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Refactor flex and bison make rules
On 11/28/2012 06:01 PM, Tom Lane wrote: Andrew Dunstan and...@dunslane.net writes: On 11/28/2012 02:14 PM, Alvaro Herrera wrote: Okapi has been failing sporadically on ecpg, and I wonder if it's related to this change. Well, it looks like the make is broken and missing a clear dependency requirement. I think we need to ask Jeremy to turn off parallel build for okapi. Yeah, we already know that unpatched make 3.82 has got serious parallelism bugs: http://archives.postgresql.org/pgsql-hackers/2012-09/msg00397.php I wonder whether adding another .NOTPARALLEL directive would be a better idea than insisting people get hold of patched versions. You mean in the preproc Makefile? Maybe. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Refactor flex and bison make rules
Peter Eisentraut pete...@gmx.net writes: On 11/28/12 6:01 PM, Tom Lane wrote: I wonder whether adding another .NOTPARALLEL directive would be a better idea than insisting people get hold of patched versions. We could put ifeq ($(MAKE_VERSION),3.82) .NOTPARALLEL: endif into Makefile.global. I don't wish to go *that* far. Parallel make works fine for most of the tree in 3.82, and shutting it off would penalize developers a lot. It appears to me that the case that okapi is hitting is specific to the ecpg preprocessor build rules, and indeed specific to the case where preproc.c needs to be rebuilt. A .NOTPARALLEL in ecpg/preproc/Makefile would probably be enough to fix it. (I'm a bit tempted to make the one already added to ecpg/Makefile conditional on the make version, as you suggest above, too.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Refactor flex and bison make rules
On 11/28/2012 06:19 PM, Tom Lane wrote: Peter Eisentraut pete...@gmx.net writes: On 11/28/12 6:01 PM, Tom Lane wrote: I wonder whether adding another .NOTPARALLEL directive would be a better idea than insisting people get hold of patched versions. We could put ifeq ($(MAKE_VERSION),3.82) .NOTPARALLEL: endif into Makefile.global. I don't wish to go *that* far. Parallel make works fine for most of the tree in 3.82, and shutting it off would penalize developers a lot. It appears to me that the case that okapi is hitting is specific to the ecpg preprocessor build rules, and indeed specific to the case where preproc.c needs to be rebuilt. A .NOTPARALLEL in ecpg/preproc/Makefile would probably be enough to fix it. (I'm a bit tempted to make the one already added to ecpg/Makefile conditional on the make version, as you suggest above, too.) There is something odd about okapi, because my linux/gcc buildfarm animal is using make 3.82 happily, with make_jobs = 4. cheers andrew
Re: [HACKERS] Bugs in CREATE/DROP INDEX CONCURRENTLY
Andres Freund and...@anarazel.de writes: On 2012-11-28 17:42:18 -0500, Tom Lane wrote: I agree it's a judgment call, though. Anybody want to argue for the other position? Hm. Seems odd to include indexes that are being dropped concurrently at that moment. But then, we can't really detect that situation and as you say it's consistent with pg_dump... [ thinks about that for a bit... ] We could have that, for about the same cost as the currently proposed patch: instead of defining the added flag column as index is live, define it as drop in progress, and set it immediately at the start of the DROP CONCURRENTLY sequence. Then the dead condition that RelationGetIndexList must check for is drop in progress and not indisvalid and not indisready. However, this is more complicated and harder to understand. So unless somebody is really excited about being able to tell the difference between create-in-progress and drop-in-progress, I'd rather leave the patch as-is. regards, tom lane
Re: [HACKERS] [COMMITTERS] pgsql: Refactor flex and bison make rules
Andrew Dunstan and...@dunslane.net writes: On 11/28/2012 06:19 PM, Tom Lane wrote: It appears to me that the case that okapi is hitting is specific to the ecpg preprocessor build rules, and indeed specific to the case where preproc.c needs to be rebuilt. A .NOTPARALLEL in ecpg/preproc/Makefile would probably be enough to fix it. (I'm a bit tempted to make the one already added to ecpg/Makefile conditional on the make version, as you suggest above, too.) There is something odd about okapi, because my linux/gcc buildfarm animal is using make 3.82 happily, with make_jobs = 4. Yeah, and nobody else has seen this either. It might just be that okapi has exactly the right number of processors with exactly the right speeds to make the failure a lot more probable. Or maybe there's something weird about Gentoo's version of make (wouldn't be the first time). Anyway, deparallelizing just the ecpg/preproc build would cost very little in build time, since it's totally dominated by the preproc.c and preproc.o build steps anyway. I'm inclined to just do it and see if the problem goes away. regards, tom lane
Re: [HACKERS] Bugs in CREATE/DROP INDEX CONCURRENTLY
On 2012-11-28 18:41:39 -0500, Tom Lane wrote: Andres Freund and...@anarazel.de writes: On 2012-11-28 17:42:18 -0500, Tom Lane wrote: I agree it's a judgment call, though. Anybody want to argue for the other position? Hm. Seems odd to include indexes that are being dropped concurrently at that moment. But then, we can't really detect that situation and as you say it's consistent with pg_dump... [ thinks about that for a bit... ] We could have that, for about the same cost as the currently proposed patch: instead of defining the added flag column as index is live, define it as drop in progress, and set it immediately at the start of the DROP CONCURRENTLY sequence. Then the dead condition that RelationGetIndexList must check for is drop in progress and not indisvalid and not indisready. You're right. However, this is more complicated and harder to understand. So unless somebody is really excited about being able to tell the difference between create-in-progress and drop-in-progress, I'd rather leave the patch as-is. The only real argument for doing this that I can see is a potential REINDEX CONCURRENTLY. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] json accessors
On Wed, Nov 28, 2012 at 2:44 PM, Andrew Dunstan and...@dunslane.net wrote: On 11/28/2012 02:08 PM, Merlin Moncure wrote: *) ISTM your keytext operators are a reasonable replacement for a hypothetical json_path. That said you're basically forcing json-sql mapping through a highly iterative API, which I don't like. At the very least, I think json_get should return setof json and return all matching constructions. I won't miss predicate tests: we can do all that in SQL. Yes, it's iterative. And for deeply nested json it might be somewhat inefficient, although the parser is pretty fast AFAICT. But it's a start. not completely buying that: see comments below. not supporting xpath style decompositions seems wrong to me. IOW, json_get should be set returning (perhaps via wild cards in the keytext) or we need json_each. Non-trivial json productions in postgres require the creation of special composite types that structure the data that we (I?) rig up in SQL before routing to json. What about having functions that work in the opposite direction: *) can you access both arrays and records with numeric positional syntax (hopefully, yes?), for example: x->0->0 You can't do that in JS, so I'm not clear why we should allow it. agreed -- withdrawn. *) json_object_keys(json) seems too special-case to me. how about: json_each(json) which returns a set of key/value pairs and would work on arrays or objects (for arrays the key could be invented from the index). Again, I don't think we should conflate the processing for arrays and objects. But I could see doing each(json) = setof (text, json) (and maybe a similar function returning setof (text, text), which would dequote leaf nodes as json_get_as_text() does). And similarly a couple of functions to unnest arrays. Yeah. Although, I *do* think you need 'json_each' (or a set returning json_get) and they should be conflated...exactly as jquery does: http://api.jquery.com/jQuery.each/. json objects are associative arrays, right? 
So if the *value* that gets returned by json_each is itself a collection, we can cast back to json and recurse. at the very least, we ought to decompose large documents into arbitrary smaller chunks (as xpath does) without iterating. In most of the code I'd write, I would decompose to a json object using your stuff then route to something like: insert into foo select (r).* from populate_record(null::foo, json_to_hstore(x)) r from json_each('path-to-record_containg_array', json_document'); assuming the json was deliberately constructed to marshal cleanly into the database, which is perfectly reasonable. *) json_get_as_text(json, keytext or indexint) = text prefer json_to_text() naming. also json_to_hstore(), etc. json_to_text seems rather misleading as a name here. Maybe we could remove the _as from the name if that's bothering you. hm, I think you're right here -- I see the distinction. As for json_to_hstore, as I mentioned, the design is intended to enable the easy construction of such transformations, although for hstores anything except trivial json structure (i.e. an unnested object) it might have unappealing results. But in any case, the important thing to do first is to get the infrastructure in place. Time is very short and I don't want to extend this very much. yeah, understood. *) have you considered something like anyelement from_json(anyelement, json) or select json::some_type; (this may or may not be possible given our casting mechanics; i don't know). I have no idea what the semantics of this would be. Yeah, there's a lot of nuance there. Don't have to tackle everything at once I suppose, but spiritually I'm hoping it would serve as replacement for textual record_in, array_in, etc. It's just wrong to have to specify each and every field during parsing when the receiving structure is well defined in the database. 
merlin
Re: [HACKERS] Bugs in CREATE/DROP INDEX CONCURRENTLY
Andres Freund and...@2ndquadrant.com writes: On 2012-11-28 18:41:39 -0500, Tom Lane wrote: However, this is more complicated and harder to understand. So unless somebody is really excited about being able to tell the difference between create-in-progress and drop-in-progress, I'd rather leave the patch as-is. The only real argument for doing this that I can see is a potential REINDEX CONCURRENTLY. While I was working on this patch, I came to the conclusion that the only way REINDEX CONCURRENTLY could possibly work is: 1. Do CREATE INDEX CONCURRENTLY with a temporary index name. 2. Swap index names and any dependencies (eg for unique/pkey constraints), in a transaction of its own. 3. Do DROP INDEX CONCURRENTLY on the now-obsolete index. If you try to do it with just one set of index catalog entries, you'll find the pg_class row has to be in two states at once, since there certainly have to be two underlying physical files while all this is going on. That being the case, there'll be two different pg_index rows as well, and thus my worries upthread about whether REINDEX CONCURRENTLY would need to do something special with the pg_index row seem unfounded. Of course, there's still plenty of magic required to make this happen --- I don't see how to do step 2 safely without taking exclusive lock for at least a short interval. But that's mostly about the SnapshotNow scan problem, which we at least have some ideas about how to solve. regards, tom lane
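[Editor's note: spelled out as SQL, the three-step sequence above looks roughly like this. The table, column, and index names are hypothetical, and step 2 is exactly the part Tom says still needs magic — there is no known way to do the swap safely without a brief exclusive lock.]

```sql
-- Step 1: build the replacement index under a temporary name,
-- without blocking concurrent reads and writes.
CREATE INDEX CONCURRENTLY idx_foo_tmp ON foo (bar);

-- Step 2: swap the names (and any unique/pkey constraint
-- dependencies) in a short transaction of its own.
BEGIN;
ALTER INDEX idx_foo RENAME TO idx_foo_old;
ALTER INDEX idx_foo_tmp RENAME TO idx_foo;
COMMIT;

-- Step 3: drop the obsolete index without blocking readers.
DROP INDEX CONCURRENTLY idx_foo_old;
```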
Re: [HACKERS] Bugs in CREATE/DROP INDEX CONCURRENTLY
On 2012-11-29 09:10:22 +0900, Michael Paquier wrote: On Thu, Nov 29, 2012 at 8:52 AM, Andres Freund and...@2ndquadrant.comwrote: On 2012-11-28 18:41:39 -0500, Tom Lane wrote: Andres Freund and...@anarazel.de writes: On 2012-11-28 17:42:18 -0500, Tom Lane wrote: I agree it's a judgment call, though. Anybody want to argue for the other position? Hm. Seems odd to include indexes that are being dropped concurrently at that moment. But then, we can't really detect that situation and as you say its consistent with pg_dump... [ thinks about that for a bit... ] We could have that, for about the same cost as the currently proposed patch: instead of defining the added flag column as index is live, define it as drop in progress, and set it immediately at the start of the DROP CONCURRENTLY sequence. Then the dead condition that RelationGetIndexList must check for is drop in progress and not indisvalid and not indisready. You're right. However, this is more complicated and harder to understand. So unless somebody is really excited about being able to tell the difference between create-in-progress and drop-in-progress, I'd rather leave the patch as-is. The only real argument for doing this that I can see is a potential REINDEX CONCURRENTLY. Patch that has been submitted to this commit fest Yea, I did a first look through it recently... Not really sure where to start with the necessary infrastructure yet. and is going to need a lot of rework as well as more infrastructure like a better MVCC-ish SnapshotNow. Which is a major project in itself. I wonder whether my crazy follow updates via t_ctid isn't the actually easier way to get there in the short term. On the other hand, a more MVCCish catalog access would be awesome. 
Greetings, Andres Freund
Re: [HACKERS] Bugs in CREATE/DROP INDEX CONCURRENTLY
On 2012-11-28 19:11:46 -0500, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: On 2012-11-28 18:41:39 -0500, Tom Lane wrote: However, this is more complicated and harder to understand. So unless somebody is really excited about being able to tell the difference between create-in-progress and drop-in-progress, I'd rather leave the patch as-is. The only real argument for doing this that I can see is a potential REINDEX CONCURRENTLY. While I was working on this patch, I came to the conclusion that the only way REINDEX CONCURRENTLY could possibly work is: 1. Do CREATE INDEX CONCURRENTLY with a temporary index name. 2. Swap index names and any dependencies (eg for unique/pkey constraints), in a transaction of its own. 3. Do DROP INDEX CONCURRENTLY on the now-obsolete index. If you try to do it with just one set of index catalog entries, you'll find the pg_class row has to be in two states at once, since there certainly have to be two underlying physical files while all this is going on. That being the case, there'll be two different pg_index rows as well, and thus my worries upthread about whether REINDEX CONCURRENTLY would need to do something special with the pg_index row seem unfounded. Of course, there's still plenty of magic required to make this happen --- I don't see how to do step 2 safely without taking exclusive lock for at least a short interval. But that's mostly about the SnapshotNow scan problem, which we at least have some ideas about how to solve. That's actually pretty similar to the way Michael has implemented it in his submitted patch and what has been discussed in a recent thread. His patch doesn't claim to solve the concurrency issues around 2) though... Greetings, Andres
Re: [HACKERS] Bugs in CREATE/DROP INDEX CONCURRENTLY
Andres Freund and...@2ndquadrant.com writes: On 2012-11-29 09:10:22 +0900, Michael Paquier wrote: and is going to need a lot of rework as well as more infrastructure like a better MVCC-ish SnapshotNow. Which is a major project in itself. I wonder whether my crazy follow updates via t_ctid isn't the actually easier way to get there in the short term. On the other hand, a more MVCCish catalog access would be awesome. Yeah, eliminating the race conditions for SnapshotNow scans would be valuable enough to justify a lot of work --- we could get rid of a bunch of kluges once we had that, not to mention that Simon's project of reducing ALTER TABLE lock strength might stand a chance of working. regards, tom lane
Re: [HACKERS] json accessors
On 11/29/2012 01:10 AM, Merlin Moncure wrote: On Wed, Nov 28, 2012 at 2:44 PM, Andrew Dunstan and...@dunslane.net wrote: ... *) have you considered something like anyelement from_json(anyelement, json) or select json::some_type; (this may or may not be possible given our casting mechanics; i don't know). I have no idea what the semantics of this would be. Yeah, there's a lot of nuance there. One way to tackle it would be to give the argument element as a template; the result would be the same template filled in from the json. create table tab1(id serial primary key, ts timestamp default now(), data text); insert into tab1 select from_json(row(null,null,null)::tab1, '{"data":"the data"}'); insert into tab1 select from_json(row(null,null,null)::tab1, '{"id":-1, "ts":null, "data":""}'); insert into tab1 select from_json(t.*,'{"data":"more data"}') from tab1 t where id = -1; hannu=# select row_to_json(t.*) from tab1 t; row_to_json --- {"id":1,"ts":"2012-11-29 02:01:48.379172","data":"the data"} {"id":-1,"ts":null,"data":""} {"id":2,"ts":"2012-11-29 02:02:34.600164","data":"more data"} (3 rows) if extracting the defaults from table def proves too tricky for the first iteration, then just set the missing fields to NULL or, even better, carry over the values from the template; -- Hannu PS: good work so far :) Hannu
Re: [HACKERS] json accessors
On 11/29/2012 02:07 AM, Hannu Krosing wrote: On 11/29/2012 01:10 AM, Merlin Moncure wrote: On Wed, Nov 28, 2012 at 2:44 PM, Andrew Dunstan and...@dunslane.net wrote: ... *) have you considered something like anyelement from_json(anyelement, json) or select json::some_type; (this may or may not be possible given our casting mechanics; i don't know). I have no idea what the semantics of this would be. Yeah, there's a lot of nuance there. One way to tackle it would be to give the argument element as a template; the result would be the same template filled in from the json. create table tab1(id serial primary key, ts timestamp default now(), data text); insert into tab1 select from_json(row(null,null,null)::tab1, '{"data":"the data"}'); insert into tab1 select from_json(row(null,null,null)::tab1, '{"id":-1, "ts":null, "data":""}'); insert into tab1 select from_json(t.*,'{"data":"more data"}') from tab1 t where id = -1; hannu=# select row_to_json(t.*) from tab1 t; row_to_json --- {"id":1,"ts":"2012-11-29 02:01:48.379172","data":"the data"} {"id":-1,"ts":null,"data":""} {"id":2,"ts":"2012-11-29 02:02:34.600164","data":"more data"} (3 rows) if extracting the defaults from table def proves too tricky for the first iteration, then just set the missing fields to NULL or, even better, carry over the values from the template; You could even do a template-less row_from_json which returns a record with all fields converted to the JSON-encodable types and hope that the next conversions will be done by postgreSQL as needed. insert into tab1 select row_from_json('{"id":100, "ts":"2012-12-21", "data":"End of Everything"}'); insert into tab1 select * from row_from_json( '[{"id":101, "ts":"2012-12-22", "data":"1st day after End of Everything"}, {"id":102, "ts":"2012-12-22", "data":"2nd day after End of Everything"} ]'); Hannu -- Hannu PS: good work so far :) Hannu
Re: [HACKERS] Bugs in CREATE/DROP INDEX CONCURRENTLY
On 2012/11/29, at 9:23, Andres Freund and...@2ndquadrant.com wrote: On 2012-11-28 19:11:46 -0500, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: On 2012-11-28 18:41:39 -0500, Tom Lane wrote: However, this is more complicated and harder to understand. So unless somebody is really excited about being able to tell the difference between create-in-progress and drop-in-progress, I'd rather leave the patch as-is. The only real argument for doing this that I can see is a potential REINDEX CONCURRENTLY. While I was working on this patch, I came to the conclusion that the only way REINDEX CONCURRENTLY could possibly work is: 1. Do CREATE INDEX CONCURRENTLY with a temporary index name. 2. Swap index names and any dependencies (eg for unique/pkey constraints), in a transaction of its own. 3. Do DROP INDEX CONCURRENTLY on the now-obsolete index. If you try to do it with just one set of index catalog entries, you'll find the pg_class row has to be in two states at once, since there certainly have to be two underlying physical files while all this is going on. That being the case, there'll be two different pg_index rows as well, and thus my worries upthread about whether REINDEX CONCURRENTLY would need to do something special with the pg_index row seem unfounded. Of course, there's still plenty of magic required to make this happen --- I don't see how to do step 2 safely without taking exclusive lock for at least a short interval. But that's mostly about the SnapshotNow scan problem, which we at least have some ideas about how to solve. That's actually pretty similar to the way Michael has implemented it in his submitted patch and what has been discussed in a recent thread. His patch doesn't claim to solve the concurrency issues around 2) though... Correct, that is the same approach. The patch took as approach to create a completely separate and new index entry which is a clone of the former index. 
This way all the entries in the catalogs are doubled, and the switch of the names is made while the two indexes are valid, but yes, I am myself wondering about the necessary lock that needs to be taken when switching the 2 index names. By the way, just by knowing that, I would agree to first rework the SnapshotNow mechanisms that would make a far better base for concurrent DDLs, and this is not limited to REINDEX only, but other things like CLUSTER, ALTER TABLE and perhaps others. Then once this is done PG will have better prospects with features using CONCURRENTLY, and we could envisage a clean implementation for REINDEX CONCURRENTLY. Regards, Michael Paquier
Re: [HACKERS] [COMMITTERS] pgsql: Refactor flex and bison make rules
Jeremy Drake pgbuildf...@jdrake.com writes: While we're talking about odd issues that only seem to happen on Okapi, does anyone know of anything I can do to diagnose the pg_upgrade failure on the 9.2 branch? There are no rogue (non-buildfarm-related) postmaster/postgres processes running on the machine. [ digs around ... ] It looks like the failure is coming from here: if (strlen(path) >= sizeof(unp->sun_path)) return EAI_FAIL; What's the size of the sun_path member of struct sockaddr_un on your machine? I count 115 characters in your socket path ... maybe you just need a less deeply nested test directory. (If that is the problem, seems like we need to return something more helpful than EAI_FAIL here.) regards, tom lane
Re: [HACKERS] Bugs in CREATE/DROP INDEX CONCURRENTLY
On Thu, Nov 29, 2012 at 10:10 AM, Tom Lane t...@sss.pgh.pa.us wrote: I wrote: Attached is an updated patch for HEAD that I think is about ready to go. I'll start making a back-patchable version shortly. Here is an only-lightly-tested version for 9.2. Looks good at a glance. I wonder though if it would have been better to have IndexSetValid/IndexClearValid family of macros instead of the index_set_state_flags kind of a machinery, purely from consistency perspective. I understand that index_set_state_flags also takes care of opening the catalogue etc, but there are only two callers to the function and we could do that outside that. May be too late since we already have the patch committed to HEAD. Thanks, Pavan
Re: [HACKERS] Materialized views WIP patch
2012/11/28 Kevin Grittner kgri...@mail.com: Robert Haas wrote: I don't particularly like syntaxes involving DO or LOAD because those words already have strong associations with completely unrelated features. Now, if we don't want to do that and we don't want to use ALTER for a data-modifying command either, another option would be to invent a new toplevel command: REFRESH view_name; Of course, that does introduce another keyword, but the penalty for a new unreserved keyword is pretty small. Of the alternatives to LOAD MATERIALIZED VIEW, something involving REFRESH seems the best to me. The question is whether REFRESH MATERIALIZED VIEW (or just REFRESH) is more clear, and whether it is so by enough to merit another keyword. Of course, there is a chance that we may wind up needing that keyword for declaring incremental updates anyway, so it might be a matter of *when* we do it rather than *whether* we do it -- depending on the yet-to-be-determined syntax for specifying incremental updates. My personal preference is still for LOAD MATERIALIZED VIEW because it implies a complete regeneration rather than something more incremental, but I realize that is subjective. In this context I prefer the REFRESH keyword - I associate LOAD with BULKLOAD, and this is different. Regards Pavel -Kevin
Re: [HACKERS] Enabling frontend-only xlog desc routines
On Wednesday, November 28, 2012 7:07 PM Andres Freund wrote: On 2012-11-28 18:58:45 +0530, Amit Kapila wrote: On Wednesday, November 28, 2012 12:17 AM Alvaro Herrera wrote: I mentioned the remaining issues in a previous email (see message-id 20121025161751.ge6...@alvh.no-ip.org). Attached is a patch that enables xlogdump to #include xlog_internal.h by way of removing that file's inclusion of fmgr.h, which is problematic. I don't think this should be too contentious. I have tried to go through the Xlogdump patch provided in the mail chain of the message-id provided. It seems there is no appropriate file/function header, which makes it a little difficult to understand the purpose. This is just a suggestion and not related to this mail. An updated version of xlogdump with some initial documentation, sensible building, and some more is available at http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/xlogreader_v3 Oops.. looked at wrong version. The other interesting question remaining is what to do about the rm_desc function in rmgr.c. We're split between these two ideas: 1. Have this in rmgr.c:

#ifdef FRONTEND
#define RMGR_REDO_FUNC(func) NULL
#else
#define RMGR_REDO_FUNC(func) func
#endif /* FRONTEND */

and then use RMGR_REDO_FUNC() in the table. 2. Have this in rmgr.c:

#ifndef RMGR_REDO_FUNC
#define RMGR_REDO_FUNC(func) func
#endif

And then have the xlogdump Makefile use -D to define a suitable RMGR_REDO_FUNC. What I understood is that as there is only a need of the rm_desc function in xlogdump, we want to separate it out such that it can be easily used. In Approach-1, the defines for some of the functions (redo, startup, cleanup, ..) become NULL for frontends, so that the corresponding functions are hidden from frontends. In Approach-2, the frontend (in this case Xlogdump) needs to define the value for that particular macro by using -D in its makefile. 
If my above thinking is right, then I think Approach-2 might be better, as with it the frontend will have more flexibility if it wants to use some other functionality of rmgr. I personally favor approach-1 because I cannot see any potential other use. You basically need to have the full backend code available just to successfully link the other functions. Running is even more complex, and there's no real point in doing that standalone anyway, so, what for? Such functionality might be used if somebody wants to write an independent test for a storage engine, but I am not sure if such a thing (Approach-2) can be helpful. As I can see, Approach-1 has the advantage that there will be no dependency in makefiles for exposing the rm_desc functionality. And for Approach-2, it is unnecessary for a makefile to define the value if there is actually no other relevant use case for it. Can you think of any other pros and cons of the two approaches, or do you think we already have sufficient reasoning to conclude on Approach-1? With Regards, Amit Kapila.