Re: [HACKERS] WITHIN GROUP patch
2013/10/11 Andrew Gierth and...@tao11.riddles.org.uk:

> Pavel == Pavel Stehule pavel.steh...@gmail.com writes:
>
>> I found that the following error message is not too friendly (mainly
>> because this functionality will be new):
>>
>>     postgres=# select dense_rank(3,3,2) within group (order by num desc, odd) from test4;
>>     ERROR: Incorrect number of arguments for hypothetical set function
>>     LINE 1: select dense_rank(3,3,2) within group (order by num desc, od...
>>                    ^
>>
>> Probably some hint should be there?
>
> Hm, yeah. Anyone have ideas for better wording for the original error
> message, or a DETAIL or HINT addition? The key point here is that the
> number of _hypothetical_ arguments and the number of ordered columns
> must match, but we don't currently exclude the possibility that a
> user-defined hypothetical set function might have additional
> non-hypothetical args.
>
> The first alternative that springs to mind is:
>
>     ERROR: Incorrect number of arguments for hypothetical set function
>     DETAIL: Number of hypothetical arguments (3) must equal number of ordered columns (2)

+1

Pavel

> --
> Andrew (irc:RhodiumToad)
Re: [HACKERS] Compression of full-page-writes
On 10 October 2013 23:06 Fujii Masao wrote:
> On Wed, Oct 9, 2013 at 1:35 PM, Haribabu kommi haribabu.ko...@huawei.com wrote:
>>
>>                       Thread-1                    Thread-2
>>                       Head code    FPW compress   Head code    FPW compress
>>  Pgbench-org  5min    138(0.24GB)  131(0.04GB)    160(0.28GB)  163(0.05GB)
>>  Pgbench-1000 5min    140(0.29GB)  128(0.03GB)    160(0.33GB)  162(0.02GB)
>>  Pgbench-org  15min   141(0.59GB)  136(0.12GB)    160(0.65GB)  162(0.14GB)
>>  Pgbench-1000 15min   138(0.81GB)  134(0.11GB)    159(0.92GB)  162(0.18GB)
>>
>> Pgbench-org  - original pgbench
>> Pgbench-1000 - changed pgbench with a record size of 1000.
>
> This means that you changed the data type of pgbench_accounts.filler to char(1000)?

Yes, I changed the filler column to char(1000).

Regards,
Hari babu.
Re: [HACKERS] [COMMITTERS] pgsql: Replace duplicate_oids with Perl implementation
Peter Eisentraut pete...@gmx.net writes:
> Replace duplicate_oids with Perl implementation
>
> It is more portable, more robust, and more readable.
>
> From: Andrew Dunstan and...@dunslane.net

What about unused_oids?

regards, tom lane
Re: [HACKERS] Bugfix and new feature for PGXS
On Thursday, October 10, 2013 21:37:24, Peter Eisentraut wrote:
> On Mon, 2013-10-07 at 22:00 -0400, Andrew Dunstan wrote:
>> The code has been sitting in HEAD for several months, and I committed
>> on the back branches because it was wanted.
>
> New features normally go through a full development cycle including
> extensive beta testing, especially when they contain changes to
> external interfaces. The fact that the patch author wanted his feature
> released immediately (of course) doesn't warrant a free pass in this
> case, IMO.

What new feature? PGXS broke when 19e231b was committed; the patch around this special submake is trying to fix that. See http://www.postgresql.org/message-id/201305281410.32535.ced...@2ndquadrant.com

There are also reports like this one (and its thread): http://www.postgresql.org/message-id/cabrt9rcj6rvgmxbyebcgymwbwiobko_w6zvwrzn75_f+usp...@mail.gmail.com where you can read that I am not the only one who wants to have this committed, and I believe it is worth a backpatch, as it stands for a bugfix.

Here also the problem is stated as something to fix: http://www.postgresql.org/message-id/20130516165318.gf15...@eldon.alvh.no-ip.org

That being said, you are correct about the problem you found, and it's good to be able to release a new version without a bug for OSX. I'm just sad you didn't find time to voice this a bit before Andrew spent too much time on that.

--
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation
[HACKERS] Re: custom hash-based COUNT(DISTINCT) aggregate - unexpectedly high memory consumption
> gettimeofday(&start, NULL);
> for (i = 0; i < VALUES; i++) {
>     state = XXH32_init(result);
>     XXH32_update(state, &i, 4);
>     XXH32_digest(state);
> }
> gettimeofday(&end, NULL);

This code is using the update variant, which is only useful when dealing with a very large amount of data which can't fit into a single block of memory. This is obviously overkill for a 4-bytes-only test: 3 function calls, a malloc, intermediate data bookkeeping, etc.

To hash a single block of data, it's better to use the simpler (and faster) variant XXH32():

    gettimeofday(&start, NULL);
    for (i = 0; i < VALUES; i++) {
        XXH32(&i, 4, result);
    }
    gettimeofday(&end, NULL);

You'll probably get better results by an order of magnitude. For even better results, you could inline it (yes, for such a short loop with almost no work to do, it makes a very noticeable difference).

That being said, it's true that these advanced hash algorithms only shine with a big enough amount of data to hash. Hashing a 4-byte value into a 4-byte hash is a limited exercise: there is no pigeonhole issue. A simple multiplication by a 32-bit prime would fare well enough and result in zero collisions.
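For illustration, a self-contained version of the suggested one-shot benchmark might look like this (a sketch: it assumes the classic one-shot signature unsigned int XXH32(const void *input, int len, unsigned int seed), which has varied slightly across xxHash releases):

    #include <stdio.h>
    #include <sys/time.h>
    #include "xxhash.h"

    #define VALUES 100000000

    int main(void)
    {
        struct timeval start, end;
        unsigned int h = 0;
        long usec;
        int i;

        gettimeofday(&start, NULL);
        for (i = 0; i < VALUES; i++)
            h ^= XXH32(&i, sizeof(i), 0);   /* one shot: no state, no malloc */
        gettimeofday(&end, NULL);

        usec = (end.tv_sec - start.tv_sec) * 1000000L
             + (end.tv_usec - start.tv_usec);
        printf("hashed %d values in %ld us (checksum %u)\n", VALUES, usec, h);
        return 0;
    }

Folding each hash into the checksum h keeps the compiler from optimizing the loop away, which matters once the per-call cost drops this low.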
Re: [HACKERS] WITHIN GROUP patch
On Thu, Oct 10, 2013 at 10:35 PM, Andrew Gierth and...@tao11.riddles.org.uk wrote:
> The first alternative that springs to mind is:
>
>     ERROR: Incorrect number of arguments for hypothetical set function
>     DETAIL: Number of hypothetical arguments (3) must equal number of ordered columns (2)

I'd suggest trying to collapse that down into a single message; the DETAIL largely recapitulates the original error message. Also, I'd suggest identifying the name of the function, since people may not be able to identify that easily based just on the fact that it's a hypothetical set function. Here's one idea, with two variants depending on the direction of the inequality:

    ERROR: function %s has %d arguments but only %d ordering columns
    ERROR: function %s has %d ordering columns but only %d arguments

Or else leave out the actual numbers and just state the principle, but identifying the exact function at fault, e.g.

    ERROR: number of arguments to function %s does not match number of ordering columns

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
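For context, such messages are raised in the backend via ereport(); a minimal sketch of how the collapsed single-message variant might look in C (the error code, variable names, and surrounding check are illustrative, not taken from the actual patch):

    /* nhargs: hypothetical arguments; nordercols: ORDER BY columns */
    if (nhargs != nordercols)
        ereport(ERROR,
                (errcode(ERRCODE_SYNTAX_ERROR),
                 errmsg("function %s has %d arguments but only %d ordering columns",
                        funcname, nhargs, nordercols)));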
Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
On Wed, Oct 9, 2013 at 9:30 PM, Peter Geoghegan p...@heroku.com wrote:
>>> * The syntax. I like the composability, and the way it's likely to become idiomatic to combine it with wCTEs. Others may not.
>>
>> I've actually lost track of what syntax you're proposing.
>
> I'm continuing to propose:
>
>     INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
>
> with a much less interesting variant that could be jettisoned:
>
>     INSERT...ON DUPLICATE KEY IGNORE
>
> I'm also proposing extended RETURNING to make it work with this. So the basic idea is that within Postgres, the idiomatic way to correctly do upsert becomes something like:
>
>     postgres=# with r as (
>         insert into foo(a,b)
>         values (5, '!'), (6, '@')
>         on duplicate key lock for update
>         returning rejects *
>     )
>     update foo set b = r.b from r where foo.a = r.a;

I can't claim to be enamored of this syntax.

>>> * The visibility hacks that V4 is likely to have. The fact that preserving the composable syntax may imply changes to HeapTupleSatisfiesMVCC() so that rows locked but with no currently visible version (under conventional rules) are visible to our snapshot by virtue of having been locked all the same (this only matters at read committed).
>>
>> I continue to think this is a bad idea.
>
> Fair enough. Is it just because of performance concerns? If so, that's probably not that hard to address. It either has a measurable impact on performance for a very unsympathetic benchmark or it doesn't. I guess that's the standard that I'll be held to, which is probably fair.

That's part of it; but I also think that HeapTupleSatisfiesMVCC() is a pretty fundamental bit of the system that I am loathe to tamper with. We can try to talk ourselves into believing that the definition change will only affect this case, but I'm wary that there will be unanticipated consequences, or simply that we'll find, after it's far too late to do anything about it, that we don't particularly care for the new semantics. It's probably an overstatement to say that I'll oppose any change whatsoever that touches the semantics of that function, but not by much.

> Do you see the appeal of the composable syntax?

To some extent. It seems to me that what we're designing is a giant grotty hack, albeit a convenient one. But if we're not really going to get MERGE, I'm not sure how much good it is to try to pretend we've got something general.

> I appreciate that it's odd that serializable transactions now have to worry about seeing something they shouldn't have seen (when they conclusively have to go lock a row version not current to their snapshot).

Surely that's never going to be acceptable. At read committed, locking a version not current to the snapshot might be acceptable if we hold our nose, but at any higher level I think we have to fail with a serialization complaint.

> But that's simpler than any of the alternatives that I see. Does there really need to be a new snapshot type with one tiny difference that apparently doesn't actually affect conventional clients of MVCC snapshots?

I think that's the wrong way of thinking about it. If you're introducing a new type of snapshot, or tinkering with the semantics of an existing one, I think that's a reason to reject the patch straight off. We should be looking for a design that doesn't require that. If we can't find one, I'm not sure we should do this at all.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
[HACKERS] Heavily modified big table bloat even in auto vacuum is running
Vacuum is not happening on a heavily modified big table even if the dead tuples exceed the configured threshold. This is because at the end of vacuum, the table's dead-tuple count is reset to zero, so the dead tuples that accumulated while the vacuum operation was running are lost. Thus, to trigger the next vacuum on the same table, the configured threshold number of dead tuples has to accumulate all over again. The next vacuum operation then takes more time because of the larger number of dead tuples; this repeats, and it leads to table and index bloat.

To handle the above case, instead of directly resetting the dead-tuple count to zero, how about subtracting the exact number of dead tuples removed from the table stats? With this approach vacuum gets triggered more frequently, which reduces the bloat. A patch for the same is attached to this mail. Please let me know if there is any problem with this approach.

Regards,
Hari babu.

vacuum_fix_v1.patch
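In code terms, the proposal amounts to roughly the following change in the statistics machinery (a sketch only; the field and message names here are illustrative, not necessarily those used in the attached patch):

    /* current behaviour: assume vacuum removed every dead tuple */
    tabentry->n_dead_tuples = 0;

    /*
     * proposed behaviour: subtract only what vacuum actually removed,
     * so deletes and updates that happened while vacuum was running
     * still count toward the next autovacuum threshold
     */
    tabentry->n_dead_tuples -= msg->m_tuples_removed;
    if (tabentry->n_dead_tuples < 0)
        tabentry->n_dead_tuples = 0;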
Re: [HACKERS] logical changeset generation v6.2
On Thu, Oct 10, 2013 at 7:11 PM, Andres Freund and...@2ndquadrant.com wrote:
> Hi Robert,
>
> On 2013-10-09 14:49:46 -0400, Robert Haas wrote:
>> I spent some time looking at the sample plugin (patch 9/12). Here are some review comments:
>>
>> - I think that the decoding plugin interface should work more like the foreign data wrapper interface. Instead of using pg_dlsym to look up fixed names, I think there should be a struct of function pointers that gets filled in and registered somehow.
>
> You mean something like CREATE OUTPUT PLUGIN registering a function with an INTERNAL return value returning a filled struct? I thought about that, but it seemed more complex. Happy to change it though if it's preferred.

I don't see any need for SQL syntax. I was just thinking that the _PG_init function could fill in a structure and then call RegisterLogicalReplicationOutputPlugin(mystruct).

>> - Still wondering how we'll use this from a bgworker.
>
> Simplified code to consume data:

Cool. As long as that use case is supported, I'm happy; I just want to make sure we're not presuming that there must be an external client.

>> - The output format doesn't look very machine-parseable. I really think we ought to provide something that is. Maybe a CSV-like format, or maybe something else, but I don't see why someone who wants to do change logging should be forced to write and install C code. If something like Bucardo can run on an unmodified system and extract change-sets this way without needing a .so file, that's going to be a huge win for usability.
>
> We can change the current format but I really see little to no chance of agreeing on a replication format that's serviceable to several solutions short term. Once we've gained some experience - maybe even this cycle - that might be different.

I don't see why you're so pessimistic about that. I know you haven't worked it out yet, but what makes this harder than sitting down and designing something?

>> More generally on this patch set, if I'm going to be committing any of this, I'd prefer to start with what is currently patches 3 and 4, once we reach agreement on those.
>
> Sounds like a reasonable start.

Perhaps you could reshuffle the order of the series, if it's not too much work.

>> Are we hoping to get any of this committed for this CF? If so, let's make a plan to get that done; time is short. If not, let's update the CF app accordingly.
>
> I'd really like to do so. I am travelling atm, but I will be back tomorrow evening and will push an updated patch this weekend. The issue I know of in the latest patches at http://www.postgresql.org/message-id/20131007133232.ga15...@awork2.anarazel.de is renaming from http://www.postgresql.org/message-id/20131008194758.gb3718...@alap2.anarazel.de

I'm a bit nervous about the way the combo CID logging works. I would have thought that you would emit one record per combo CID, but what you're apparently doing is emitting one record per heap tuple that uses a combo CID. For some reason that feels like an abuse (and maybe kinda inefficient, too). Either way, I also wonder what happens if a (logical?) checkpoint occurs between the combo CID record and the heap record to which it refers, or how you prevent that from happening. What if the combo CID record is written and the transaction aborts before writing the heap record (maybe without writing an abort record to WAL)?

What are the performance implications of this additional WAL logging? What's the worst case? What's the typical case? Does it have a noticeable overhead when wal_level < logical?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
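To make Robert's suggestion concrete, registration could look roughly like this (a sketch only: the callback struct, its members, and RegisterLogicalReplicationOutputPlugin() are names taken from this discussion or invented for illustration, not a committed API):

    typedef struct OutputPluginCallbacks
    {
        void    (*startup_cb) (void *ctx);
        void    (*begin_cb) (void *ctx, TransactionId xid);
        void    (*change_cb) (void *ctx, Relation rel, void *change);
        void    (*commit_cb) (void *ctx, TransactionId xid);
        void    (*shutdown_cb) (void *ctx);
    } OutputPluginCallbacks;

    void
    _PG_init(void)
    {
        static OutputPluginCallbacks callbacks;

        callbacks.startup_cb = my_startup;
        callbacks.begin_cb = my_begin;
        callbacks.change_cb = my_change;
        callbacks.commit_cb = my_commit;
        callbacks.shutdown_cb = my_shutdown;

        /* hypothetical registration entry point, per the discussion above */
        RegisterLogicalReplicationOutputPlugin(&callbacks);
    }

This mirrors how FDW handler functions return a struct of function pointers, with the compiler checking signatures instead of pg_dlsym() resolving fixed symbol names at runtime.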
Re: [HACKERS] Patch: FORCE_NULL option for copy COPY in CSV mode
On Fri, Oct 11, 2013 at 12:49 AM, Amit Kapila amit.kapil...@gmail.com wrote:
> On Thu, Oct 10, 2013 at 6:45 PM, Andrew Dunstan and...@dunslane.net wrote:
>> On 10/09/2013 11:47 PM, Amit Kapila wrote:
>>> One of the advantages I could see in the "NULL For .." syntax is that we already have one syntax with which the user can specify what strings can be replaced with NULL; now, just to handle a quoted empty string, why add a different syntax? FORCE_NULL has the advantage that it can be used for some columns rather than all columns, but I think the existing syntax could also be modified to support that.
>>
>> I think it's badly designed on its face. We don't need and shouldn't provide a facility for different NULL markers. A general facility for that would be an ugly and quite pointless addition to code complexity. What we need is simply a way of altering one specific behaviour, namely the treatment of quoted NULL markers. We should not do that by allowing munging the NULL marker per column,
>
> I was thinking of it as similar in some sense to the INSERT INTO tbl statement. For example:
>
>     Create table tbl (c1 int, c2 int, c3 int, c4 int);
>     insert into tbl (c2) values (1);
>
> Here, after the table name, the user can specify the column names for which he wants to provide specific values.
>
>> but by a syntactical mechanism that directly addresses the change in behaviour. If you don't like FORCE NULL then let's pick something else, but not this ugly and unnecessary NULL FOR gadget.
>
> If you don't like the idea of one generic syntax for NULLs in the COPY command, then FORCE_QUOTED_NULL or QUOTED_NULL would make more sense compared to FORCE_NULL.

Meh. As amused as I am with our bad naming, I don't think there's anything too terribly wrong with FORCE NULL. Symmetry has some value.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
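For concreteness, the behaviour being argued over would presumably be requested along these lines (hypothetical invocation; the option name and exact placement were still being debated at this point):

    COPY tbl (c1, c2, c3, c4) FROM '/tmp/data.csv'
        WITH (FORMAT csv, FORCE_NULL (c2, c3));

That is, a quoted empty string ("") in c2 or c3 would be read back as NULL, mirroring how FORCE_NOT_NULL already overrides null-string handling for selected columns in the opposite direction - the symmetry Robert alludes to.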
Re: [HACKERS] background workers, round three
Hi,

Finally I got the chance to put my hands on this code. Really sorry for the late reply.

On Fri, Sep 13, 2013 at 10:42 AM, Robert Haas robertmh...@gmail.com wrote:
> Last week, I attempted to write some code to perform a trivial operation in parallel by launching background workers. Despite my earlier convictions that I'd built enough infrastructure to make this possible, the experiment turned out badly - so, patches!
>
> It's been pretty obvious to me from the beginning that any sort of parallel computation would need a way to make sure that if any worker dies, everybody dies. Conversely, if the parent aborts, all the workers should die. My thought was that this could be implemented using PG_TRY()/PG_CATCH() blocks over the existing infrastructure, but this turned out to be naive. If the original backend encounters an error before the child manages to start up, there's no good recovery strategy. The parent can't kill the child because it doesn't exist yet, and the parent can't stop the postmaster from starting the child later. The parent could try waiting until the child starts up and THEN killing it, but that's probably unreliable and surely mind-numbingly lame.
>
> The first attached patch, terminate-worker-v1.patch, rectifies this problem by providing a TerminateBackgroundWorker() function. When this function is invoked, the postmaster will become unwilling to restart the worker and will send it a SIGTERM if it's already running.

And here are some comments after reading the first patch... The patch looks good, creates no warnings, and is not that invasive in the existing code. A few comments about the code:

1) In postmaster.c, what about adding a comment here telling that we can forget about this bgworker, as it has already been requested for termination?

    + if (rw->rw_worker.bgw_restart_time == BGW_NEVER_RESTART
    +     || rw->rw_terminate)

2) Thinking about this patch as an independent feature, I think that it would have been nice to demonstrate the capability of TerminateBackgroundWorker directly with an example in worker_spi, with for example the addition of a function like worker_spi_terminate. However this needs:

- either an additional API taking a PID as input, the PID being used to fetch a bgworker handle, terminating the worker afterwards. This is basically what I did in the patch attached when playing with your patch. You might find it useful... or not! It introduces as well worker_spi_terminate, a small API scanning the array of bgworker slots. Not sure that this is much compatible with the concept of generation though...
- or returning not only the PID of the worker started dynamically in worker_spi_launch, but also the generation number of the worker, to be able to build a bgworker handle that could afterwards be used with TerminateBackgroundWorker.

Note that this comment is based on my personal experience with bgworkers, and I think that it is important to provide examples of what is implemented so that users are not off the track, even if playing with bgworkers is high-level hacking. TerminateBackgroundWorker can offer users a more native way to kill a bgworker than pg_terminate_backend, especially if they implement a bgworker structure of the launcher/workers type like what autovacuum does.

3) Documentation is needed, following comment 2).
> It's important that the SIGTERM be sent by the postmaster, because if the original backend tries to send it, there's a race condition: the process might not be started at the time the original backend tries to send the signal, but the postmaster might start it before it sees the terminate request.)

That's interesting. Nothing more to say about that (perhaps because of my lack of knowledge of the code in this area).

> By itself, this is useful, but painful. The pain comes from the fact that all of the house-keeping is left to the programmer.

If this is necessary, so be it. But I think that newcomers to background workers would appreciate at least an example in worker_spi (using as a base what I provided in the patch attached) that they could refer to when beginning to implement a new feature. This is a thing that people could, and for sure would, refer to when implementing a complex thing using this infrastructure.

This is all I have about the 1st patch... It is already late here, so I'll have a look at the 2nd/3rd patch hopefully tomorrow or the day after.

Regards,
--
Michael

20131011_bgworker_terminate_pid.patch
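For readers following along, the register/terminate pairing under review would presumably be used from a launcher roughly like this (a sketch against the patch, so details may differ from what is eventually committed):

    BackgroundWorker worker;
    BackgroundWorkerHandle *handle;

    memset(&worker, 0, sizeof(worker));
    snprintf(worker.bgw_name, BGW_MAXLEN, "example worker");
    worker.bgw_flags = BGWORKER_SHMEM_ACCESS;
    worker.bgw_start_time = BgWorkerStart_RecoveryFinished;
    worker.bgw_restart_time = BGW_NEVER_RESTART;

    if (!RegisterDynamicBackgroundWorker(&worker, &handle))
        ereport(ERROR,
                (errmsg("could not register background worker")));

    /* ... on error, or when the parent no longer needs the worker ... */

    /*
     * Ask the postmaster to stop the worker and never restart it.  This
     * is safe even if the worker has not actually started yet, which is
     * exactly the race the patch is designed to close.
     */
    TerminateBackgroundWorker(handle);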
[HACKERS] #define ExclusiveLock in /usr/include/postgresql/9.1/server/storage/lock.h
Hi pg_Hackers,

I would like to express my wonder at seeing the following line

    #define ExclusiveLock 7  /* blocks ROW SHARE/SELECT...FOR

(line number 543) in /usr/include/postgresql/9.1/server/storage/lock.h, because ExclusiveLock is the name of a class in the libspatialindex library (see libspatialindex.github.io). Using postgres SPI and the spatial library together becomes quite a challenge in a larger project, since the order of the includes starts making a big difference. Suddenly the C++ pre-processor starts generating code like

    class 7
    {
    }

I suppose changing the define to be all capital letters would be a huge problem, wouldn't it?

Cheers,
arturas

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 12.04.2 LTS
Release:        12.04
Codename:       precise

$ psql -c 'select version()'
                                                  version
------------------------------------------------------------------------------------------------------------
 PostgreSQL 9.1.9 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3, 64-bit
(1 row)
Re: [HACKERS] Cube extension point support // GSoC'13
On 09.10.2013 21:07, Robert Haas wrote:
> On Tue, Sep 24, 2013 at 9:07 AM, Stas Kelvich stas.kelv...@gmail.com wrote:
>> Hello
>>
>> There is a new version of the patch. I have separated the ordering operators into a different patch (https://commitfest.postgresql.org/action/patch_view?id=1243), fixed formatting issues, and implemented backward compatibility with old-style points in cube_is_point() and cube_out().
>>
>> Also, comparing output files I've discovered that these four files are combinations of two types of different behavior:
>>
>> 1) SELECT '-1e-700'::cube AS cube; can be (0) or (-0)
>> 2) The number of zeros in the exponent of a floating-point value, i.e. SELECT '1e27'::cube AS cube; can be (1e+027) or (1e+27)
>>
>> On my system (OSX) it is the second option in both situations. I've also tested it on FreeBSD 9.0 and Ubuntu 12.04 with the same results. So are there any ideas how I can reproduce such results?
>
> Heikki, are you going to review this further for this CommitFest?

Sorry, I didn't realize the ball was in my court. I went through the patch now, kibitzing over some minor style issues. Attached is a new version. This seems good for commit except for two things:

1. The alternative expected output files still need to be updated. Stas couldn't find a system where some of those files were used. One option is to simply commit the patch as is, and see if the buildfarm goes red. If it doesn't, we can simply remove the alternative files - they are not used on any supported platform. If some animals go red, we'll get the required diff from the buildfarm output and apply it. So this isn't a show-stopper.

2. I didn't understand this change:

@@ -422,24 +439,14 @@ g_cube_union(PG_FUNCTION_ARGS)
 Datum
 g_cube_compress(PG_FUNCTION_ARGS)
 {
-	PG_RETURN_DATUM(PG_GETARG_DATUM(0));
+	GISTENTRY  *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
+	PG_RETURN_POINTER(entry);
 }

 Datum
 g_cube_decompress(PG_FUNCTION_ARGS)
 {
 	GISTENTRY  *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
-	NDBOX	   *key = DatumGetNDBOX(PG_DETOAST_DATUM(entry->key));
-
-	if (key != DatumGetNDBOX(entry->key))
-	{
-		GISTENTRY  *retval = (GISTENTRY *) palloc(sizeof(GISTENTRY));
-
-		gistentryinit(*retval, PointerGetDatum(key),
-					  entry->rel, entry->page,
-					  entry->offset, FALSE);
-		PG_RETURN_POINTER(retval);
-	}
 	PG_RETURN_POINTER(entry);
 }

What did the removed code do, and why isn't it needed anymore? Is there a prerequisite patch that hasn't been committed yet? No.

- Heikki

diff --git a/contrib/cube/cube.c b/contrib/cube/cube.c
index dab0e6e..853acbe 100644
--- a/contrib/cube/cube.c
+++ b/contrib/cube/cube.c
@@ -144,6 +144,7 @@ bool g_cube_internal_consistent(NDBOX *key, NDBOX *query, StrategyNumber strate
 ** Auxiliary funxtions
 */
 static double distance_1D(double a1, double a2, double b1, double b2);
+static bool cube_is_point_internal(NDBOX *cube);

 /*
@@ -181,6 +182,7 @@ cube_a_f8_f8(PG_FUNCTION_ARGS)
 	int			i;
 	int			dim;
 	int			size;
+	bool		point;
 	double	   *dur,
 			   *dll;

@@ -198,16 +200,32 @@ cube_a_f8_f8(PG_FUNCTION_ARGS)
 	dur = ARRPTR(ur);
 	dll = ARRPTR(ll);

-	size = offsetof(NDBOX, x[0]) +sizeof(double) * 2 * dim;
+	/* Check if it's a point */
+	point = true;
+	for (i = 0; i < dim; i++)
+	{
+		if (dur[i] != dll[i])
+		{
+			point = false;
+			break;
+		}
+	}
+
+	size = point ? POINT_SIZE(dim) : CUBE_SIZE(dim);
 	result = (NDBOX *) palloc0(size);
 	SET_VARSIZE(result, size);
-	result->dim = dim;
+	SET_DIM(result, dim);

 	for (i = 0; i < dim; i++)
-	{
 		result->x[i] = dur[i];
-		result->x[i + dim] = dll[i];
+
+	if (!point)
+	{
+		for (i = 0; i < dim; i++)
+			result->x[i + dim] = dll[i];
 	}
+	else
+		SET_POINT_BIT(result);

 	PG_RETURN_NDBOX(result);
 }

@@ -234,16 +252,14 @@ cube_a_f8(PG_FUNCTION_ARGS)
 	dur = ARRPTR(ur);

-	size = offsetof(NDBOX, x[0]) +sizeof(double) * 2 * dim;
+	size = POINT_SIZE(dim);
 	result = (NDBOX *) palloc0(size);
 	SET_VARSIZE(result, size);
-	result->dim = dim;
+	SET_DIM(result, dim);
+	SET_POINT_BIT(result);

 	for (i = 0; i < dim; i++)
-	{
 		result->x[i] = dur[i];
-		result->x[i + dim] = dur[i];
-	}

 	PG_RETURN_NDBOX(result);
 }

@@ -267,14 +283,17 @@ cube_subset(PG_FUNCTION_ARGS)
 	dx = (int32 *) ARR_DATA_PTR(idx);
 	dim = ARRNELEMS(idx);

-	size = offsetof(NDBOX, x[0]) +sizeof(double) * 2 * dim;
+	size = IS_POINT(c) ? POINT_SIZE(dim) : CUBE_SIZE(dim);
 	result = (NDBOX *) palloc0(size);
 	SET_VARSIZE(result, size);
-	result->dim = dim;
+	SET_DIM(result, dim);
+
+	if (IS_POINT(c))
+		SET_POINT_BIT(result);

 	for (i = 0; i < dim; i++)
 	{
-		if ((dx[i] <= 0) || (dx[i] > c->dim))
+		if ((dx[i] <= 0) || (dx[i] > DIM(c)))
 		{
 			pfree(result);
 			ereport(ERROR,
@@ -282,7 +301,8 @@
 					 errmsg("Index out of bounds")));
 		}
 		result->x[i] =
Re: [HACKERS] [PATCH] Add use of asprintf()
Peter Eisentraut escribió:
> On Tue, 2013-09-17 at 15:13 +0500, Asif Naeem wrote:
>> I did put some time into reviewing the patch; please see my findings below, i.e. ...
>
> Updated patch for this.

Looks good to me.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Cube extension point support // GSoC'13
On Fri, Oct 11, 2013 at 5:56 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote:
> 2. I didn't understand this change:
>
> @@ -422,24 +439,14 @@ g_cube_union(PG_FUNCTION_ARGS)
>  Datum
>  g_cube_compress(PG_FUNCTION_ARGS)
>  {
> -	PG_RETURN_DATUM(PG_GETARG_DATUM(0));
> +	GISTENTRY  *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
> +	PG_RETURN_POINTER(entry);
>  }
>
>  Datum
>  g_cube_decompress(PG_FUNCTION_ARGS)
>  {
>  	GISTENTRY  *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
> -	NDBOX	   *key = DatumGetNDBOX(PG_DETOAST_DATUM(entry->key));
> -
> -	if (key != DatumGetNDBOX(entry->key))
> -	{
> -		GISTENTRY  *retval = (GISTENTRY *) palloc(sizeof(GISTENTRY));
> -
> -		gistentryinit(*retval, PointerGetDatum(key),
> -					  entry->rel, entry->page,
> -					  entry->offset, FALSE);
> -		PG_RETURN_POINTER(retval);
> -	}
>  	PG_RETURN_POINTER(entry);
>  }
>
> What did the removed code do, and why isn't it needed anymore?

I don't understand this place even more generally. What the decompress function does is detoast the key and return a new GiST entry with it if needed. But can a GiST key ever be toasted? My experiments show that the answer is no, it can't. When a too-long key is passed, GiST falls over while inserting the key into a page. And I didn't find any code in git that toasts GiST keys. However, taking care of key toasting is repeated throughout GiST-related code. For example, in contrib/intarray/_int.h:

    #define SIGLENINT 63  /* 122 => key will toast, so very slow!!! */

Any ideas?

--
With best regards,
Alexander Korotkov.
Re: [HACKERS] background workers, round three
On Fri, Oct 11, 2013 at 9:27 AM, Michael Paquier michael.paqu...@gmail.com wrote:
> Finally I got the chance to put my hands on this code. Really sorry for the late reply.

Thanks for the review. I'll respond to this in more detail later, but to make a long story short, I'm looking to apply terminate-worker-v1.patch (possibly with modifications after going over your review comments), but I'm not feeling too good any more about ephemeral-precious-v1.patch, because my experience with those facilities has so far proved unsatisfying. I think I'd like to withdraw the latter patch pending further study.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] proposal: simple date constructor from numeric values
Jeevan Chalke escribió:
> On Wed, Sep 18, 2013 at 9:54 PM, Pavel Stehule pavel.steh...@gmail.com wrote:
>> thank you, I have no comments
>
> Assigned it to committer.

Hm, these functions are marked as STABLE, right? Why aren't they immutable?

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] pg_stat_statements: calls under-estimation propagation
On 10/10/13 6:20 AM, Sameer Thakur wrote:
> Please find attached a patch which adds documentation for session_start and the introduced fields, and corrects the documentation for queryid to be query_id. session_start remains in the view, as agreed.

Please fix the tabs in the SGML files.
Re: [HACKERS] #define ExclusiveLock in /usr/include/postgresql/9.1/server/storage/lock.h
Arturas Mazeika maze...@gmail.com writes:
> I would like to express my wonder at seeing the following line
>
>     #define ExclusiveLock 7  /* blocks ROW SHARE/SELECT...FOR
>
> (line number 543) in /usr/include/postgresql/9.1/server/storage/lock.h, because ExclusiveLock is the name of a class in the libspatialindex library (see libspatialindex.github.io).

That seems like an awfully strange name for a class ...

> I suppose changing the define to be all capital letters would be a huge problem, wouldn't it?

ExclusiveLock has been defined, with that name, in the Postgres code for close to 20 years, and there are many references to it both in the core code and add-ons. Yes, changing it would be painful. More generally, we can never promise not to conflict with any identifiers defined in third-party code. I don't see any strong reason why we should do something about this particular case.

I'd suggest arranging your code so that the stuff that deals with libspatialindex can be in a file separate from code that needs to know about Postgres internals. Then you could avoid #include'ing the conflicting headers in the same source file.

regards, tom lane
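A minimal sketch of the arrangement Tom suggests (all file and function names here are illustrative): a thin wrapper header that includes neither postgres.h nor any libspatialindex header, implemented in the one C++ file that includes libspatialindex, and included from the Postgres-facing code in place of the library's own headers.

    /* spatial_shim.h - the only header shared by both worlds */
    #ifndef SPATIAL_SHIM_H
    #define SPATIAL_SHIM_H

    #ifdef __cplusplus
    extern "C" {
    #endif

    typedef struct SpatialIdx SpatialIdx;   /* opaque to both sides */

    SpatialIdx *spatial_idx_create(void);
    int  spatial_idx_insert(SpatialIdx *idx, double x, double y, long id);
    void spatial_idx_destroy(SpatialIdx *idx);

    #ifdef __cplusplus
    }
    #endif

    #endif   /* SPATIAL_SHIM_H */

spatial_shim.cpp then includes the libspatialindex headers and implements these functions, while the extension's .c files include postgres.h and spatial_shim.h only, so the ExclusiveLock macro and the ExclusiveLock class never meet in the same translation unit.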
Re: [HACKERS] [COMMITTERS] pgsql: Replace duplicate_oids with Perl implementation
On 10/11/2013 03:57 AM, Tom Lane wrote:
> Peter Eisentraut pete...@gmx.net writes:
>> Replace duplicate_oids with Perl implementation
>> It is more portable, more robust, and more readable.
>> From: Andrew Dunstan and...@dunslane.net
>
> What about unused_oids?

Here's a quick version I whipped up along the same lines that you can play with. There's probably a good case for combining them.

cheers

andrew

#!/usr/bin/perl

use strict;
use warnings;

BEGIN
{
	@ARGV = (glob("pg_*.h"), qw(indexing.h toasting.h));
}

my %oidcounts;

while (<>)
{
	next if /^CATALOG\(.*BKI_BOOTSTRAP/;
	next
	  unless /^DATA\(insert *OID *= *(\d+)/
		  || /^CATALOG\([^,]*, *(\d+).*BKI_ROWTYPE_OID\((\d+)\)/
		  || /^CATALOG\([^,]*, *(\d+)/
		  || /^DECLARE_INDEX\([^,]*, *(\d+)/
		  || /^DECLARE_UNIQUE_INDEX\([^,]*, *(\d+)/
		  || /^DECLARE_TOAST\([^,]*, *(\d+), *(\d+)/;
	$oidcounts{$1}++;
	$oidcounts{$2}++ if $2;
}

my $firstobjectid;
my $handle;
open($handle, "../access/transam.h") || die "cannot access transam.h: $!";
while (<$handle>)
{
	if (/^#define\s+FirstBootstrapObjectId\s+(\d+)/)
	{
		$firstobjectid = $1;
		last;
	}
}
close($handle);
die "no first object found" unless $firstobjectid;

my $last = 0;
foreach my $oid (sort { $a <=> $b } keys %oidcounts)
{
	if ($oid > $last + 1)
	{
		if ($oid > $last + 2)
		{
			print $last + 1, "-", $oid - 1, "\n";
		}
		else
		{
			print $last + 1, "\n";
		}
	}
	$last = $oid;
}
print $last + 1, "-", $firstobjectid - 1, "\n";

exit 0;
Re: [HACKERS] Patch for reserved connections for replication users
On Fri, 11 Oct 2013 10:00:51 +0530, Amit Kapila amit.kapil...@gmail.com wrote:

On Thu, Oct 10, 2013 at 10:06 PM, Gibheer gibh...@zero-knowledge.org wrote:

On Thu, 10 Oct 2013 09:55:24 +0530, Amit Kapila amit.kapil...@gmail.com wrote:

On Thu, Oct 10, 2013 at 3:17 AM, Gibheer gibh...@zero-knowledge.org wrote:

On Mon, 7 Oct 2013 11:39:55 +0530, Amit Kapila amit.kapil...@gmail.com wrote:

Robert Haas wrote:

On Mon, Aug 5, 2013 at 2:04 AM, Andres Freund andres(at)2ndquadrant(dot)com wrote:

Hmm. It seems like this patch is making MaxConnections no longer mean the maximum number of connections, but rather the maximum number of non-replication connections. I don't think I support that definitional change, and I'm kinda surprised if this is sufficient to implement it anyway (e.g. see InitProcGlobal()).

I don't think the implementation is correct, but why don't you like the definitional change? The set of things you can do from replication connections is completely different from a normal connection, so using separate pools for them seems to make sense. That they end up allocating similar internal data seems to be an implementation detail to me.

Because replication connections are still connections. If I tell the system I want to allow 100 connections to the server, it should allow 100 connections, not 110 or 95 or any other number.

I think that to reserve connections for replication, a mechanism similar to superuser_reserved_connections should be used, rather than the one for autovacuum workers or background workers. This won't change the definition of MaxConnections. Another thing is that rather than introducing a new parameter for replication reserved connections, it is better to use max_wal_senders, as it can serve the purpose.

Review for replication_reserved_connections-v2.patch, considering we are going to use a mechanism similar to superuser_reserved_connections and won't allow the definition of MaxConnections to change:

1.

     /* the extra unit accounts for the autovacuum launcher */
     MaxBackends = MaxConnections + autovacuum_max_workers + 1 +
    -     max_worker_processes;
    +     max_worker_processes + max_wal_senders;

The above changes are not required.

2.

    + if ((!am_superuser && !am_walsender) &&
    +     ReservedBackends > 0 &&
    +     !HaveNFreeProcs(ReservedBackends))

Change the check as you have in patch-1 for ReserveReplicationConnections:

    if (!am_superuser &&
        (max_wal_senders > 0 || ReservedBackends > 0) &&
        !HaveNFreeProcs(max_wal_senders + ReservedBackends))
        ereport(FATAL,
                (errcode(ERRCODE_TOO_MANY_CONNECTIONS),
                 errmsg("remaining connection slots are reserved for replication and superuser connections")));

3. In guc.c, change the description of max_wal_senders to be similar to superuser_reserved_connections at the below place:

    {"max_wal_senders", PGC_POSTMASTER, REPLICATION_SENDING,
        gettext_noop("Sets the maximum number of simultaneously running WAL sender processes."),

4. With this approach, there is no need to change InitProcGlobal(), as it is used to keep track of bgworkerFreeProcs and autovacFreeProcs; for others it uses freeProcs.

5. The below description in config.sgml needs to be changed for superuser_reserved_connections to include the effect of max_wal_senders on reserved connections:

    "Whenever the number of active concurrent connections is at least max_connections minus superuser_reserved_connections, new connections will be accepted only for superusers, and no new replication connections will be accepted."

6. A similar description should be added to the max_wal_senders configuration parameter.

7. The below comment needs to be updated, as it assumes only superuser reserved connections as part of the MaxConnections limit:

    /*
     * ReservedBackends is the number of backends reserved for superuser use.
     * This number is taken out of the pool size given by MaxBackends so
     * the number of backend slots available to non-superusers is
     * (MaxBackends - ReservedBackends).  Note what this really means is
     * "if there are <= ReservedBackends connections available, only
     * superusers can make new connections" --- pre-existing superuser
     * connections don't count against the limit.
     */
    int ReservedBackends;

8. Also, we can add a comment on top of the definition of max_wal_senders similar to ReservedBackends.

With Regards, Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Hi,

I took the time and reworked the patch with the feedback till now. Thank you very much Amit!

Is there any specific reason why you moved this patch to the next CommitFest?

With Regards, Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Mike asked me about the status of the patch a couple days back and I didn't think I would be able to work on the patch so soon
Re: [HACKERS] logical changeset generation v6.2
Hi,

On 2013-10-11 09:08:43 -0400, Robert Haas wrote:
> On Thu, Oct 10, 2013 at 7:11 PM, Andres Freund and...@2ndquadrant.com wrote:
>> On 2013-10-09 14:49:46 -0400, Robert Haas wrote:
>>> I spent some time looking at the sample plugin (patch 9/12). Here are some review comments:
>>>
>>> - I think that the decoding plugin interface should work more like the foreign data wrapper interface. Instead of using pg_dlsym to look up fixed names, I think there should be a struct of function pointers that gets filled in and registered somehow.
>>
>> You mean something like CREATE OUTPUT PLUGIN registering a function with an INTERNAL return value returning a filled struct? I thought about that, but it seemed more complex. Happy to change it though if it's preferred.
>
> I don't see any need for SQL syntax. I was just thinking that the _PG_init function could fill in a structure and then call RegisterLogicalReplicationOutputPlugin(mystruct).

Hm. We can do that, but what'd be the advantage of that? The current model will correctly handle things like a 'shared_preload_libraries'-ed output plugin, because its _PG_init() will not register it. With the handling done in _PG_init() there would be two. Being able to use the same .so for output plugin handling and some other replication-solution-specific stuff is imo useful.

>>> - Still wondering how we'll use this from a bgworker.
>>
>> Simplified code to consume data:
>
> Cool. As long as that use case is supported, I'm happy; I just want to make sure we're not presuming that there must be an external client.

The included testcases are written using the SQL SRF interface, which in turn is a usecase that doesn't use walsenders and such, so I hope we won't break it accidentally ;)

>>> - The output format doesn't look very machine-parseable. I really think we ought to provide something that is. Maybe a CSV-like format, or maybe something else, but I don't see why someone who wants to do change logging should be forced to write and install C code. If something like Bucardo can run on an unmodified system and extract change-sets this way without needing a .so file, that's going to be a huge win for usability.
>>
>> We can change the current format but I really see little to no chance of agreeing on a replication format that's serviceable to several solutions short term. Once we've gained some experience - maybe even this cycle - that might be different.
>
> I don't see why you're so pessimistic about that. I know you haven't worked it out yet, but what makes this harder than sitting down and designing something?

Because every replication solution has different requirements for the format and they will want to filter the output stream with regard to their own configuration. E.g. Bucardo will want to include the transaction timestamp for conflict resolution and such.

> Sounds like a reasonable start. Perhaps you could reshuffle the order of the series, if it's not too much work.

Sure, that's no problem. Do I understand correctly that you'd like

    wal_decoding: Add information about a tables primary key to struct RelationData
    wal_decoding: Add wal_level = logical and log data required for logical decoding

earlier?

>> I'd really like to do so. I am travelling atm, but I will be back tomorrow evening and will push an updated patch this weekend. The issue I know of in the latest patches at http://www.postgresql.org/message-id/20131007133232.ga15...@awork2.anarazel.de is renaming from http://www.postgresql.org/message-id/20131008194758.gb3718...@alap2.anarazel.de
>
> I'm a bit nervous about the way the combo CID logging works. I would have thought that you would emit one record per combo CID, but what you're apparently doing is emitting one record per heap tuple that uses a combo CID.

I thought about and implemented that in the beginning. Unfortunately it's not enough :(. That's probably the issue that took me longest to understand in this patch series...

Combo CIDs can only fix the case where a transaction has actually created a combo CID:

1) TX1: INSERT id = 1 at 0/1: (xmin = 1, xmax = Invalid, cmin = 55, cmax = Invalid)
2) TX2: DELETE id = 1 at 0/1: (xmin = 1, xmax = 2, cmin = Invalid, cmax = 1)

So, if we're decoding data that needs to look up those rows in TX1 or TX2, we both times need access to cmin and cmax, but neither transaction will have created a multixact. That can only be an issue in a transaction with catalog modifications.

A slightly more complex variant also requires this if combo CIDs are involved:

1) TX1: INSERT id = 1 at 0/1: (xmin = 1, xmax = Invalid, cmin = 55, cmax = Invalid)
2) TX1: SAVEPOINT foo;
3) TX1-2: UPDATE id = 1 at 0/1: (xmin = 1, xmax = 2, cmin = 55, cmax = 56, combo = 123)
          new at 0/1: (xmin = 2, xmax = Invalid, cmin = 57, cmax =
Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
On 2013-10-11 08:43:43 -0400, Robert Haas wrote:
>> I appreciate that it's odd that serializable transactions now have to worry about seeing something they shouldn't have seen (when they conclusively have to go lock a row version not current to their snapshot).
>
> Surely that's never going to be acceptable. At read committed, locking a version not current to the snapshot might be acceptable if we hold our nose, but at any higher level I think we have to fail with a serialization complaint.

I think an UPSERTish action in RR/SERIALIZABLE that notices a concurrent update should and has to *ALWAYS* raise a serialization failure. Anything else will cause violations of the given guarantees.

Greetings,

Andres Freund

--
Andres Freund                    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Compression of full-page-writes
On 2013-10-11 09:22:50 +0530, Amit Kapila wrote:
> I think it will be difficult to prove, for any compression algorithm, that it compresses in most scenarios. In many cases it can so happen that the WAL will not be reduced and tps can also come down if the data is non-compressible, because any compression algorithm will have to try to compress the data and will burn some cpu for that, which in turn will reduce tps.

Then those concepts maybe aren't such a good idea after all. Storing lots of compressible data in an uncompressed fashion isn't all that common a usecase. I most certainly don't want postgres to optimize for blank-padded data, especially if it can hurt other scenarios. Just not enough benefit.

That said, I actually have relatively high hopes for compressing full-page writes. There often enough is a lot of repetitiveness between rows on the same page that it should be useful outside of such strange scenarios. But maybe pglz is just not a good fit for this; it really isn't a very good algorithm in this day and age.

Greetings,

Andres Freund

--
Andres Freund                    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Auto-tuning work_mem and maintenance_work_mem
On Thu, Oct 10, 2013 at 9:41 PM, Christopher Browne cbbro...@gmail.com wrote:
> On Thu, Oct 10, 2013 at 12:28 PM, Bruce Momjian br...@momjian.us wrote:
>> How do we handle the Python dependency, or is this all to be done in some other language? I certainly am not ready to take on that job.
>
> I should think it possible to reimplement it in C. It was considerably useful to start by implementing in Python, as that evades various sorts of efforts needed in C (e.g. memory allocation, picking a hash table implementation), and allows someone to hack on it without needing to run through a recompile every time something is touched.

I think in the context of this problem, reimplementing that from Python to C is the easiest part. Actually figuring out what the tool should *do* and how it should do it is the hard part.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
On Fri, Oct 11, 2013 at 10:02 AM, Andres Freund and...@2ndquadrant.com wrote:
> I think an UPSERTish action in RR/SERIALIZABLE that notices a concurrent update should and has to *ALWAYS* raise a serialization failure. Anything else will cause violations of the given guarantees.

Sorry, this was just a poor choice of words on my part. I totally agree with you here. Although I wasn't even talking about noticing a concurrent update - I was talking about noticing that a tuple that it's necessary to lock isn't visible to a serializable snapshot in the first place (which should also fail).

What I actually meant was that it's odd that that one case (reason for returning) added to HeapTupleSatisfiesMVCC() will always obligate serializable transactions to throw a serialization failure. Though that isn't strictly true; the modifications to HeapTupleSatisfiesMVCC() that I'm likely to propose also redundantly work for other cases where, if I'm not mistaken, that's okay (today, if you've exclusively locked a tuple and it hasn't been updated/deleted, why shouldn't it be visible to your snapshot?). The onus is on the executor-level code to notice this should-be-invisibility for non-read-committed, probably immediately after returning from value locking.

--
Peter Geoghegan
Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
On Fri, Oct 11, 2013 at 5:43 AM, Robert Haas robertmh...@gmail.com wrote:
>>>> * The visibility hacks that V4 is likely to have. The fact that preserving the composable syntax may imply changes to HeapTupleSatisfiesMVCC() so that rows locked but with no currently visible version (under conventional rules) are visible to our snapshot by virtue of having been locked all the same (this only matters at read committed).
>>>
>>> I continue to think this is a bad idea.
>>
>> Is it just because of performance concerns?
>
> That's part of it; but I also think that HeapTupleSatisfiesMVCC() is a pretty fundamental bit of the system that I am loathe to tamper with. We can try to talk ourselves into believing that the definition change will only affect this case, but I'm wary that there will be unanticipated consequences, or simply that we'll find, after it's far too late to do anything about it, that we don't particularly care for the new semantics. It's probably an overstatement to say that I'll oppose any change whatsoever that touches the semantics of that function, but not by much.

A tuple that is exclusively locked by our transaction and not updated or deleted being visible on that basis alone isn't *that* hard to reason about. Granted, we need to be very careful here, but we're talking about 3 lines of code.

>> Do you see the appeal of the composable syntax?
>
> To some extent. It seems to me that what we're designing is a giant grotty hack, albeit a convenient one. But if we're not really going to get MERGE, I'm not sure how much good it is to try to pretend we've got something general.

Well, to be fair perhaps all of the things that you consider grotty hacks seem like inherent requirements to me, for any half-way reasonable upsert implementation on any system, that has the essential property of upsert: an atomic insert-or-update (or maybe serialization failure).

>> But that's simpler than any of the alternatives that I see. Does there really need to be a new snapshot type with one tiny difference that apparently doesn't actually affect conventional clients of MVCC snapshots?
>
> I think that's the wrong way of thinking about it. If you're introducing a new type of snapshot, or tinkering with the semantics of an existing one, I think that's a reason to reject the patch straight off. We should be looking for a design that doesn't require that. If we can't find one, I'm not sure we should do this at all.

I'm confused by this. We need to lock a row not visible to our snapshot under conventional rules. I think we can rule out serialization failures at read committed. That just leaves changing something about the visibility rules of an existing snapshot type, or creating a new snapshot type, no? It would also be unacceptable to update a tuple and not have the new row version (which of course will still have information from the future) visible to our snapshot - what would regular RETURNING return? So what do you have in mind? I don't think that locking a row and updating it are really that distinct anyway. The benefit of locking is that we don't have to update. We can delete, for example.

Perhaps I've totally missed your point here, but to me it sounds like you're saying that certain properties must always be preserved that are fundamentally in tension with upsert working in the way people expect, and the way it is bound to actually work in numerous other systems.

--
Peter Geoghegan
Re: [HACKERS] proposal: simple date constructor from numeric values
Hello

2013/10/11 Alvaro Herrera alvhe...@2ndquadrant.com:
> Jeevan Chalke escribió:
>> On Wed, Sep 18, 2013 at 9:54 PM, Pavel Stehule pavel.steh...@gmail.com wrote:
>>> thank you, I have no comments
>>
>> Assigned it to committer.
>
> Hm, these functions are marked as STABLE, right? Why aren't they immutable?

It was my mistake - I was confused by the timestamp with time zone type, which has no relevance to date and time. Fixed to immutable; also fixed a duplicate OID.

Regards

Pavel

make_date_v5.patch
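For reference, once marked immutable these constructors depend on nothing but their arguments; a call such as the following (illustrative, assuming the make_date(year, month, day) signature from the patch) consults no session state such as TimeZone, which is what justifies IMMUTABLE over STABLE:

    SELECT make_date(2013, 10, 11);   -- yields the date 2013-10-11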
[HACKERS] buildfarm failures on smew and anole
The build is continuing to fail on smew and anole. The reason it's failing is that those machines are choosing max_connections = 10, which is not enough to run the regression tests.

I think this is probably because of System V semaphore exhaustion. The machines are not choosing a small value for shared_buffers - they're still picking 128MB - so the problem is not the operating system's shared memory limit. But it might be that the operating system is short on some other resource that prevents starting up with a more normal value for max_connections. My best guess is System V semaphores; I think that one of the failed runs caused by the dynamic shared memory patch probably left a bunch of semaphores allocated, so the build will keep failing until those are manually cleaned up.

Can the owners of these buildfarm machines please check whether there are extra semaphores allocated and, if so, free them? Or at least reboot, to see if that unbreaks the build?

Thanks,

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
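For anyone checking by hand, the stock System V IPC tools are enough (commands shown for illustration; the semaphore IDs and owner will differ per machine):

    $ ipcs -s            # list semaphore arrays with key, id and owner
    $ ipcrm -s <semid>   # remove one leaked array by id

Removing arrays owned by the buildfarm user while no postmaster is running should be safe; a reboot achieves the same thing less surgically.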
Re: [HACKERS] [SQL] Comparison semantics of CHAR data type
On Sun, Sep 22, 2013 at 08:51:26PM -0400, Thomas Fanghaenel wrote: I was wondering about the proper semantics of CHAR comparisons in some corner cases that involve control characters with values that are less than 0x20 (space). Consider the following testcase: === create table t (a int, b char(10)); insert into t values (1, 'foo'); insert into t values (2, 'foo '); insert into t values (3, E'foo\t'); insert into t values (4, E'foo\n'); insert into t values (5, E'foo \n'); insert into t values (6, 'foobar'); select * from t order by b; === What's the proper order of these string values in the CHAR domain? The way I interpret the SQL Standard (and assuming that \t and \n collate lower than a space), it's supposed to be this: (3) (4) (5) (1) = (2) (6) Postgres comes up with this: (1) = (2) (3) (4) (5) (6) The reason is that the bpchar functions that implement the relative comparison operators for CHAR(n) effectively strip trailing whitespaces before doing the comparison. One might argue that doing this is not correct. The standard seems to mandate that all CHAR(n) values are actually considered to be of width n, and that trailing spaces are indeed relevant for comparison. In other words, stripping them would only be possible if it can be guaranteed that there are no characters in the character set that collate lower than a space. Any thoughts on this? I searched the mailing list archives, but couldn't find any relevant discussion. There were plenty of threads that argue whether or not it's semantically correct to strip trailing spaces from CHAR(n) values, but the issue of characters collating below a space does not seem to have brought up in any of those discussions before. [I am moving this thread to hackers because I think it needs internals review.] You have some good questions here, though there are two interrelated things going on here. First is collation, and the second is the trimming of spaces from char() comparisons. Let's look at collation first: test= SHOW lc_collate; lc_collate - en_US.UTF-8 (1 row) test= SELECT 'a c' UNION ALL SELECT 'ab' ORDER BY 1; ?column? -- ab a c (2 rows) You will notice spaces are not considered important in a UTF8 collation. If we do this in the C collation, we get a different result: test= CREATE DATABASE test2 WITH LC_COLLATE = 'C' TEMPLATE template0; CREATE DATABASE test= \c test2 You are now connected to database test2 as user postgres. test2= SELECT 'a c' UNION ALL SELECT 'ab' ORDER BY 1; ?column? -- a c ab (2 rows) Also, when using ORDER BY, it isn't clear if the values are ordered that way due to being greater/less-than, or just randomly. For example, I found your example above gave different ordering if I inserted the values differently, using a UTF8 collation. Let me use comparisons instead using UTF8 for clarity: test= SHOW lc_collate; lc_collate - en_US.UTF-8 (1 row) test= select 'a c' 'ab'; ?column? -- f (1 row) and C: test2= SHOW lc_collate; lc_collate C (1 row) test2= select 'a c' 'ab'; ?column? -- t (1 row) Now, let's look at ASCII characters less than space, first in UTF8: test= SHOW lc_collate; lc_collate - en_US.UTF-8 (1 row) test= select E'a\nb' E'a b'; ?column? -- f (1 row) and in C: test2= SHOW lc_collate; lc_collate C (1 row) test2= select E'a\nb' E'a b'; ?column? -- t (1 row) You can see that newline is greater than space in UTF8, but not in C. Now, on to the trailing space issue using the default TEXT value for strings, first in UTF8: test= SHOW lc_collate; lc_collate - en_US.UTF-8 (1 row) test= select E'ab\n' E'ab '; ?column? 
-- f (1 row) then in C: test2=# SHOW lc_collate; lc_collate C (1 row) test2=# select E'ab\n' < E'ab '; ?column? -- t (1 row) This matches the \n/space issue we saw above. Now, here is where CHAR() starts to show the unusual behavior you saw, first in UTF8: test=# SHOW lc_collate; lc_collate - en_US.UTF-8 (1 row) test=# select E'ab\n'::CHAR(10) < E'ab '::CHAR(10);
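To make the blank-padding rule concrete: under the standard's reading, a CHAR(n) comparison should behave like padding both sides to width n and comparing as plain text, which can be written with rpad(). A minimal sketch in a C-collation database, using the thread's test values (illustrative, not from the original mail):

    select E'foo\n'::char(10) < 'foo'::char(10);  -- current behavior: f (trailing spaces stripped first)
    select rpad(E'foo\n', 10) < rpad('foo', 10);  -- pad-then-compare semantics: t, since \n sorts below space

The two answers diverge precisely because \n collates below 0x20.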
Re: [HACKERS] buildfarm failures on smew and anole
On 10/11/2013 03:33 PM, Robert Haas wrote: The build is continuing to fail on smew and anole. The reason it's failing is because those machines are choosing max_connections = 10, which is not enough to run the regression tests. I think this is probably because of System V semaphore exhaustion. The machines are not choosing a small value for shared_buffers - they're still picking 128MB - so the problem is not the operating system's shared memory limit. But it might be that the operating system is short on some other resource that prevents starting up with a more normal value for max_connections. My best guess is System V semaphores; I think that one of the failed runs caused by the dynamic shared memory patch probably left a bunch of semaphores allocated, so the build will keep failing until those are manually cleaned up. Can the owners of these buildfarm machines please check whether there are extra semaphores allocated and if so free them? Or at least reboot, to see if that unbreaks the build? It is possible to set the buildfarm config build_env => { MAX_CONNECTIONS => 10 }, and the tests will run with that constraint. Not sure if this would help. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Auto-tuning work_mem and maintenance_work_mem
On Thu, Oct 10, 2013 at 10:20:36PM -0700, Josh Berkus wrote: Robert, The counter-proposal to auto-tuning is just to raise the default for work_mem to 4MB or 8MB. Given that Bruce's current formula sets it at 6MB for a server with 8GB RAM, I don't really see the benefit of going to a whole lot of code and formulas in order to end up at a figure only incrementally different from a new static default. Agreed. But what do you think the value SHOULD be on such a system? That's the problem: It Depends. One thing in particular which is an issue with calculating against max_connections is that users who don't need 100 connections seldom *reduce* max_connections. So that developer laptop which only needs 3 connections is still going to have a max_connections of 100, just like the DW server where m_c should probably be 30. I guess the point I'm making here is that raising the default value is not mutually exclusive with auto-tuning. We could quadruple the current defaults for work_mem and maintenance_work_mem and be better off right now, today. Then, we could improve things further in the future if and when we agree on an approach to auto-tuning. And people who don't use the auto-tuning will still have a better default. Seems fine to me. I think we are nearing a conclusion on these issues, and I thank everyone for the vigorous discussion. When Josh showed disappointment at the small increases in work_mem and maintenance_work_mem from autotuning, I realized the complexity of autotuning just wasn't warranted here. Andrew's concern about the risks of having a work_mem too high was also sobering. Effective_cache_size has neither of these issues, and hence was logical for auto-tuning. I know Robert originally suggested just improving the work_mem default --- I now agree with him, and am sorry it took me so long to realize he was right. One other problem with auto-tuning is that it really relies not only on allocated_memory, but also on max_connections and autovacuum_max_workers, which are going to be rather arbitrary and hard for a user to set well enough to help auto-tuning. Josh might be right that auto-tuning of work_mem has to be more dynamic, perhaps based on the number of _active_ backends, or the number of backends that have allocated or are currently using work_mem. Our new dynamic shared memory allocation routines might help here in allocating memory that can be easily purged from the process address space. I am now seeing a pattern that per-backend allocations really need run-time tuning, rather than being based on fixed GUC values. In summary, I think we need to: * decide on new defaults for work_mem and maintenance_work_mem * add an initdb flag to allow users/packagers to set shared_buffers? * add an autovacuum_work_mem setting? * change the default for temp_buffers? I will try to think some more about work_mem dynamic/runtime tuning and return to it later. I know Kevin has also thought about it. I am also interested in working on a server-side function that will make configuration suggestions or use ALTER SYSTEM to set values. I could do it in PL/pgSQL, but PL/Perl would allow me to run operating system commands to probe for OS information. The function could look at statistics and pg_buffercache output, and would be run during a typical workload. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god.
+ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
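For scale, the quadrupling Robert floated earlier in the thread would look like this with the ALTER SYSTEM facility Bruce mentions (a sketch assuming the in-progress ALTER SYSTEM patch lands; the values are simply the then-current 1MB/16MB defaults times four, shown for illustration only):

    ALTER SYSTEM SET work_mem = '4MB';               -- default is 1MB
    ALTER SYSTEM SET maintenance_work_mem = '64MB';  -- default is 16MB
    SELECT pg_reload_conf();                         -- both settings take effect on reload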
Re: [HACKERS] Auto-tuning work_mem and maintenance_work_mem
On 10/11/2013 01:11 PM, Bruce Momjian wrote: In summary, I think we need to: * decide on new defaults for work_mem and maintenance_work_mem * add an initdb flag to allow users/packagers to set shared_buffers? * add an autovacuum_work_mem setting? * change the default for temp_buffers? If we're changing defaults, bgwriter_lru_maxpages and vacuum_cost_limit could also use a bump; those thresholds were set for servers with 1GB of RAM. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
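For reference, the defaults Josh is referring to are bgwriter_lru_maxpages = 100 and vacuum_cost_limit = 200. A bump could be expressed with the same ALTER SYSTEM syntax sketched above (or an ordinary postgresql.conf edit); the specific numbers below are hypothetical, not values proposed in the thread:

    ALTER SYSTEM SET bgwriter_lru_maxpages = 400;  -- hypothetical 4x bump over the 100 default
    ALTER SYSTEM SET vacuum_cost_limit = 800;      -- hypothetical 4x bump over the 200 default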
Re: [HACKERS] [SQL] Comparison semantics of CHAR data type
Bruce Momjian br...@momjian.us wrote: Thomas Fanghaenel wrote: I was wondering about the proper semantics of CHAR comparisons in some corner cases that involve control characters with values that are less than 0x20 (space). What matters in general isn't where the characters fall when comparing individual bytes, but how the strings containing them sort according to the applicable collation. That said, my recollection of the spec is that when two CHAR(n) values are compared, the shorter should be blank-padded before making the comparison. *That* said, I think the general advice is to stay away from CHAR(n) in favor of VARCHAR(n) or TEXT, and I think that is good advice. I am sorry for this long email, but I would be interested to see what other hackers think about this issue. Since we only have the CHAR(n) type to improve compliance with the SQL specification, and we don't generally encourage its use, I think we should fix any non-compliant behavior. That seems to mean that if you take two CHAR values and compare them, it should give the same result as comparing the same two values as VARCHAR using the same collation, with the shorter value padded with spaces. So this is correct: test=# select 'ab'::char(3) collate "en_US" < E'ab\n'::char(3) collate "en_US"; ?column? -- t (1 row) ... because it matches: test=# select 'ab '::varchar(3) collate "en_US" < E'ab\n'::varchar(3) collate "en_US"; ?column? -- t (1 row) But this is incorrect: test=# select 'ab'::char(3) collate "C" < E'ab\n'::char(3) collate "C"; ?column? -- t (1 row) ... because it doesn't match: test=# select 'ab '::varchar(3) collate "C" < E'ab\n'::varchar(3) collate "C"; ?column? -- f (1 row) Of course, I have no skin in the game, because it took me about two weeks after my first conversion of a database with CHAR columns to PostgreSQL to change them all to VARCHAR, and I do that as part of all future conversions. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
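Kevin's pad-then-compare rule can also be checked directly with rpad(): if CHAR(n) followed the spec, the two queries below would agree. A sketch of the failing "C" case (illustrative, mirroring the examples above):

    select 'ab'::char(3) collate "C" < E'ab\n'::char(3) collate "C";  -- t (trailing blanks stripped)
    select rpad('ab', 3) collate "C" < E'ab\n' collate "C";           -- f (padded comparison: space > \n)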
Re: [HACKERS] [COMMITTERS] pgsql: Replace duplicate_oids with Perl implementation
Andrew Dunstan and...@dunslane.net writes: On 10/11/2013 03:57 AM, Tom Lane wrote: What about unused_oids? Here's a quick version I whipped up along the same lines that you can play with. There's probably a good case for combining them. Meh. To me, those two scripts are used in different scenarios, so I'm not especially in favor of merging them. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] drop-index-concurrently-1 on master fails at serializable
On 2013-10-08 15:01:26 -0700, Kevin Grittner wrote: Kevin Grittner kgri...@ymail.com wrote: [ isolation test failed at snapshot-based isolation levels ] Fix pushed, that looks for the right results based on isolation level. Hm, given what we're trying to test here, wouldn't it be better to explicitly use READ COMMITTED? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compact commits and changeset extraction
On 2013-10-01 10:12:13 -0400, Robert Haas wrote: On Tue, Oct 1, 2013 at 7:26 AM, Andres Freund and...@2ndquadrant.com wrote: On 2013-10-01 06:20:20 -0400, Robert Haas wrote: On Mon, Sep 30, 2013 at 5:34 PM, Andres Freund and...@2ndquadrant.com wrote: What's wrong with #1? It seems confusing that a changeset stream in database #1 will contain commits (without corresponding changes) from database #2. Seems like a POLA violation to me. I don't really see the problem. A transaction could be empty for lots of reasons; it may have obtained an XID without writing any data, or whatever it's changed may be outside the bounds of logical rep. Sure. But all of them will have had a corresponding action in the database. If your replication stream suddenly sees commits that you cannot connect to any application activity... And it would depend on the kind of commit; you won't see it if a non-compact commit was used. It also means we need to do work to handle that commit. If you have a busy and a less busy database and you're only replicating the non-busy one, that might be noticeable. Maybe. The original reason we added compact commits was because we thought that making unlogged tables logged and vice versa was going to require adding still more stuff to the commit record. I'm no longer sure that's the case and, in any case, nobody's worked out the design details. But I can't help thinking more stuff's likely to come up in the future. I'm certainly willing to entertain proposals for restructuring that, but I don't really want to just throw it out. Well, what I am thinking of - including reading data depending on a flag in ->xinfo - would give you extensibility without requiring different types of commits. And it would only blow up the size by whatever needs to be included. Maybe you should just skip replay of transactions with no useful content. Yes, I have thought about that as well. But I dislike it - how do we define "no useful content"? The only action we detected for that XID was the commit itself. What if the transaction was intentionally done to get an xid + timestamp in a multimaster system? What if it includes DDL but no logged data? Do we replay a transaction because of the pg_shdepend entry when creating a table in another database? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
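Andres's "xid + timestamp" case is easy to produce: a transaction can force an xid assignment without writing anything, yielding exactly the kind of commit record whose usefulness is hard to judge. A minimal illustration:

    BEGIN;
    SELECT txid_current();  -- forces assignment of an xid, but writes no table data
    COMMIT;                 -- the commit record has no associated changes to decode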
Re: [HACKERS] [COMMITTERS] pgsql: Replace duplicate_oids with Perl implementation
On 10/11/13 3:57 AM, Tom Lane wrote: Peter Eisentraut pete...@gmx.net writes: Replace duplicate_oids with Perl implementation It is more portable, more robust, and more readable. From: Andrew Dunstan and...@dunslane.net What about unused_oids? We are not planning to put unused_oids into the main build path, so there is much less of a need to make it more portable or robust. Also, as we have just (re-)learned, you cannot rely on /usr/bin/perl being there, so rewriting unused_oids in Perl would arguably make it less usable. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Extend namespace of valid guc names
On 2013-10-04 11:04:53 -0400, Robert Haas wrote: On Fri, Oct 4, 2013 at 10:14 AM, Andres Freund and...@2ndquadrant.com wrote: But that's not a new problem? It already exists and isn't really exacerbated by this. ... I agree that we could use some more infrastructure around configuration, but I fail to understand why it's this patch's duty to deliver it. And I don't see why this patch would endanger any more groundbreaking improvements. You keep saying the ship has already sailed, but I think that's inaccurate. IMHO, we haven't committed to anything in this area as a matter of policy; I think the lack of a policy is demonstrated by the inconsistencies you point out. Well, it is SQL-exposed behaviour that has existed that way since the initial introduction of custom_variable_classes in 2004. Even if allowing multiple levels wasn't originally intended, ISTM that that's a long time to now declare it a parsing bug. Especially as it doesn't actually hurt. Now, if we are already committed to a policy of supporting the use case you're targeting with this patch, then you're right: this is just a trivial bug fix, and we ought to just take it for what it is and fix whatever other issues come up later. But if we're not committed to such a policy, then support for multi-level GUCs is a new feature, and it's entirely right to think that, just like any other new feature, it needs to implement that feature completely rather than piecemeal. We know from experience that when certain features (e.g. hash indexes) are implemented incompletely, the resulting warts can remain behind more or less indefinitely. Well, currently we're not committed to supporting it in postgresql.conf, but I think there's little chance of removing it from SQL. So not allowing it in the former doesn't buy us anything. As I read the thread, Amit Kapila is in favor of your proposal. Pavel Stehule, Alvaro Herrera, and I all questioned whether multi-level GUCs were a good idea; at least 2 out of the 3 of us favor not committing the patch out of uncertainty that we wish to be on the hook to support such usages. Andrew Dunstan and Tom Lane agreed that the current behavior was inconsistent but neither clearly endorsed relaxing the checks in guc-file.l; in fact, Tom suggested tightening up SET instead. That's slightly different than how I read it ;). Tom seems to argue against restricting SET further (28392.1378476...@sss.pgh.pa.us). But yes, I certainly haven't convinced everyone. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
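For readers following along, the SQL-level behaviour Andres describes (accepted since the custom_variable_classes days) versus the config-file parser looks like this; the names here are made up for illustration:

    SET myext.node.option = 'on';  -- more than one dot: accepted at the SQL level
    SHOW myext.node.option;        -- returns 'on'
    -- the equivalent postgresql.conf line, myext.node.option = 'on',
    -- is what guc-file.l currently rejects and this patch would allow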
Re: [HACKERS] proposal: simple date constructor from numeric values
Pavel Stehule wrote: It was my mistake - I was confused by the timestamp with time zone type, which has zero related to date and time. fixed to immutable, fixed duplicate oid Thanks. I wasn't sure about the error message returned when times are outside range; how about this instead? I'm not wedded to this approach -- I can return to yours if this one isn't liked -- but I think the more specific messages are better. I realize this is inconsistent with the make_date case, which always displays the full date instead of specific fields, but I think it's more likely that someone is doing arithmetic to enter time fields than date. (Anyway, maybe this is not an important enough issue to create more work for translators.)

+	if (tm_hour < 0 || tm_hour > HOURS_PER_DAY)
+		ereport(ERROR,
+				(errcode(ERRCODE_DATETIME_FIELD_OVERFLOW),
+				 errmsg("hours field in time value out of range: \"%02d\"",
+						tm_hour)));
+
+	if (tm_min < 0 || tm_min > MINS_PER_HOUR - 1)
+		ereport(ERROR,
+				(errcode(ERRCODE_DATETIME_FIELD_OVERFLOW),
+				 errmsg("minutes field in time value out of range: \"%02d\"",
+						tm_min)));
+
+	if (sec < 0.0 || sec > (float8) SECS_PER_MINUTE)
+		ereport(ERROR,
+				(errcode(ERRCODE_DATETIME_FIELD_OVERFLOW),
+				 errmsg("seconds field in time value out of range: \"%0*.*f\"",
+						MAX_TIME_PRECISION + 3,
+						MAX_TIME_PRECISION, fabs(sec))));
+
+	/* test for > 24:00:00 */
+	if ((tm_hour == HOURS_PER_DAY && (tm_min > 0 || sec > 0.0)))
+		ereport(ERROR,
+				(errcode(ERRCODE_DATETIME_FIELD_OVERFLOW),
+				 errmsg("time value out of range: \"%02d:%02d:%0*.*f\"",
+						tm_hour, tm_min,
+						MAX_TIME_PRECISION + 3,
+						MAX_TIME_PRECISION, fabs(sec))));

Other than that (and fixing regression tests as appropriate), I think the attached, which has mild corrections over your v5, is ready to commit. (You had one missing semicolon in the float timestamp case.)
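Assuming the checks as quoted, the resulting behaviour would look roughly like this (a sketch based on the error messages above, not committed output):

    SELECT make_time(8, 15, 23.5);  -- 08:15:23.5
    SELECT make_time(24, 0, 0);     -- allowed: exactly 24:00:00
    SELECT make_time(24, 0, 0.5);   -- ERROR: time value out of range: "24:00:00.500000"
    SELECT make_time(8, 60, 0);     -- ERROR: minutes field in time value out of range: "60"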
--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
*************** SELECT SUBSTRING('XY1234Z', 'Y*?([0-9]{1,3})');
*** 6669,6674 ****
--- 6669,6716 ----
        <row>
         <entry>
          <indexterm>
+          <primary>make_date</primary>
+         </indexterm>
+         <literal>
+          <function>
+           make_date(<parameter>year</parameter> <type>int</type>,
+           <parameter>month</parameter> <type>int</type>,
+           <parameter>day</parameter> <type>int</type>)
+          </function>
+         </literal>
+        </entry>
+        <entry><type>date</type></entry>
+        <entry>
+         Create date from year, month and day fields
+        </entry>
+        <entry><literal>make_date(2013, 7, 15)</literal></entry>
+        <entry><literal>2013-07-15</literal></entry>
+       </row>
+
+       <row>
+        <entry>
+         <indexterm>
+          <primary>make_time</primary>
+         </indexterm>
+         <literal>
+          <function>
+           make_time(<parameter>hour</parameter> <type>int</type>,
+           <parameter>min</parameter> <type>int</type>,
+           <parameter>sec</parameter> <type>double precision</type>)
+          </function>
+         </literal>
+        </entry>
+        <entry><type>time</type></entry>
+        <entry>
+         Create time from hour, minutes and second fields
+        </entry>
+        <entry><literal>make_time(8, 15, 23.5)</literal></entry>
+        <entry><literal>08:15:23.5</literal></entry>
+       </row>
+
+       <row>
+        <entry>
+         <indexterm>
           <primary>now</primary>
          </indexterm>
          <literal><function>now()</function></literal>

*** a/src/backend/utils/adt/date.c
--- b/src/backend/utils/adt/date.c
*************** timetz_izone(PG_FUNCTION_ARGS)
*** 2729,2731 ****
--- 2729,2815 ----
  	PG_RETURN_TIMETZADT_P(result);
  }
+
+ /*
+  * make_date()
+  *	date constructor
+  */
+ Datum
+ make_date(PG_FUNCTION_ARGS)
+ {
+ 	struct pg_tm tm;
+ 	DateADT date;
+ 	int dterr;
+
+ 	tm.tm_year = PG_GETARG_INT32(0);
+ 	tm.tm_mon = PG_GETARG_INT32(1);
+ 	tm.tm_mday = PG_GETARG_INT32(2);
+
+ 	dterr = ValidateDate(DTK_DATE_M, true, false, false, &tm);
+
+ 	if (dterr != 0)
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_DATETIME_FIELD_OVERFLOW),
+ 				 errmsg("date field value out of range: \"%d-%d-%d\"",
+ 						tm.tm_year, tm.tm_mon, tm.tm_mday)));
+
+ 	if (!IS_VALID_JULIAN(tm.tm_year, tm.tm_mon, tm.tm_mday))
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
+ 				 errmsg("date out of range: \"%d-%d-%d\"",
+ 						tm.tm_year, tm.tm_mon, tm.tm_mday)));
+
+ 	date = date2j(tm.tm_year, tm.tm_mon, tm.tm_mday) - POSTGRES_EPOCH_JDATE;
+
+ 	PG_RETURN_DATEADT(date);
+ }
+
+ /*
+  * make_time()
+  *	time constructor
+  */
+ Datum
+ make_time(PG_FUNCTION_ARGS)
+ {
+ 	int tm_hour =
Re: [HACKERS] drop-index-concurrently-1 on master fails at serializable
Andres Freund and...@2ndquadrant.com wrote: On 2013-10-08 15:01:26 -0700, Kevin Grittner wrote: Kevin Grittner kgri...@ymail.com wrote: [ isolation test failed at snapshot-based isolation levels ] Fix pushed, that looks for the right results based on isolation level. Hm, given what we're trying to test here, wouldn't it be better to explicitly use READ COMMITTED? I thought about that approach, but it seemed better to make sure that things didn't get broken at any isolation level by patches dealing with DROP INDEX CONCURRENTLY. If you're sure that could never happen, we could save a few dozen lines of isolation test code. It's not like READ COMMITTED will never get tested -- I would bet that upwards of 99% of the make installcheck-world runs or make installcheck -C src/test/isolation runs are at that isolation level. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GIN improvements part 1: additional information
On 10.10.2013 13:57, Heikki Linnakangas wrote: On 09.10.2013 02:04, Tomas Vondra wrote: On 8.10.2013 21:59, Heikki Linnakangas wrote: On 08.10.2013 17:47, Alexander Korotkov wrote: Hi, Tomas! On Sun, Oct 6, 2013 at 3:58 AM, Tomas Vondra t...@fuzzy.cz wrote: I've attempted to rerun the benchmark tests I did a few weeks ago, but I got repeated crashes when loading the data (into a table with a tsvector+gin index). Right before a crash, there's this message in the log: PANIC: not enough space in leaf page! Thanks for testing. Heikki's version of the patch doesn't work for me either, even on much simpler examples. I can try to get it working if he answers my question about the GinDataLeafPageGetPostingList* macros. The new macros in that patch version were quite botched. Here's a new attempt. Nope, still the same errors :-( PANIC: not enough space in leaf page! LOG: server process (PID 29722) was terminated by signal 6: Aborted DETAIL: Failed process was running: autovacuum: ANALYZE public.messages I've continued hacking away at the patch, here's yet another version. I've done a lot of cleanup and refactoring to make the code more readable (I hope). I'm not sure what caused the panic you saw, but it's probably fixed now. Let me know if not. Yup, this version fixed the issues. I haven't been able to do any benchmarks yet, all I have is some basic stats:

                  |  HEAD   | patched
  ================+=========+=========
   load duration  | 1084 s  | 1086 s
   subject index  | 96 MB   | 96 MB
   body index     | 2349 MB | 2051 MB

So there's virtually no difference in speed (which is expected, AFAIK), and the large index on full message bodies is significantly smaller. regards Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Auto-tuning work_mem and maintenance_work_mem
Josh Berkus j...@agliodbs.com wrote: On 10/11/2013 01:11 PM, Bruce Momjian wrote: In summary, I think we need to: * decide on new defaults for work_mem and maintenance_work_mem * add an initdb flag to allow users/packagers to set shared_bufffers? * add an autovacuum_work_mem setting? * change the default for temp_buffers? If we're changing defaults, bgwriter_lru_maxpages and vacuum_cost_limit could also use a bump; those thresholds were set for servers with 1GB of RAM. +1 on those. Also, I have often had to bump cpu_tuple_cost into the 0.03 to 0.05 range to get a good plan. In general, this makes the exact settings of *_page_cost less fussy, and I have hit situations where I was completely unable to get a good plan to emerge without bumping cpu_tuple_cost relative to the other cpu costs. I know that it's possible to engineer a workload that shows any particular cost adjustment to make things worse, but in real-life production environments I have never seen an increase in this range make plan choice worse. Regarding the settings which have been the center of attention for most of this thread, I have had very good luck with starting work_mem at machine RAM * 0.25 / max_connections. I get the impression that Josh regards that as too low. My guess is that he deals more with data warehouse reporting systems than I do, where larger settings are both more beneficial and less likely to cause memory exhaustion than the typical systems I've worked with. That is the big problem with auto-configuration -- it depends so much on the workload. In the long run, an admission control policy and/or adaptive configuration based on the observed workload seems like what we *really* need. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
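Working Kevin's rule of thumb through for a hypothetical 8GB machine with the default max_connections of 100:

    SELECT pg_size_pretty((8::bigint * 1024 * 1024 * 1024 / 4 / 100)::bigint);
    -- about 20 MB of work_mem per the RAM * 0.25 / max_connections rule

which is noticeably higher than the 6MB figure Bruce's formula produced earlier in the thread, illustrating how much the workload assumptions change the answer.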
Re: [HACKERS] Heavily modified big table bloat even in auto vacuum is running
Haribabu kommi haribabu.ko...@huawei.com wrote: To handle the above case, instead of directly resetting the dead tuples to zero, what if the exact dead tuples are removed from the table stats? With this approach vacuum gets triggered frequently, thus reducing the bloat. A patch for the same is attached in the mail. Please add this to the open CommitFest to ensure that it gets reviewed: https://commitfest.postgresql.org/action/commitfest_view/open For more information about the process, see: http://wiki.postgresql.org/wiki/CommitFest You may also want to review: http://wiki.postgresql.org/wiki/Developer_FAQ -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Auto-tuning work_mem and maintenance_work_mem
From: Bruce Momjian br...@momjian.us On Thu, Oct 10, 2013 at 11:01:52PM +0900, MauMau wrote: Although this is not directly related to memory, could you set max_prepared_transactions = max_connections at initdb time? People must feel frustrated when they can't run applications on a Java or .NET application server and notice that they have to set max_prepared_transactions and restart PostgreSQL. This is far from friendly. I think the problem is that many users don't need prepared transactions and therefore don't want the overhead. Is that still accurate? I'm not sure if many use XA features, but I saw the questions and answers a few times, IIRC. In the trouble situation, PostgreSQL outputs an intuitive message like "increase max_prepared_transactions", so many users might possibly have been able to change the setting and solve the problem themselves without asking for help, feeling stress like "Why do I have to set this?" For example, max_prepared_transactions is called a "hideous creature" on the following page: https://community.jboss.org/wiki/InstallPostgreSQLOnFedora?_sscc=t According to the page below, the amount of memory consumed for this is (770 + 270 * max_locks_per_transaction) * max_prepared_transactions. With max_prepared_transactions = max_connections = 100 and the default max_locks_per_transaction = 64, this is only about 1.8MB. So the overhead is negligible. http://www.postgresql.org/docs/9.2/static/kernel-resources.html If the goal is to make PostgreSQL more friendly and run smoothly without frustration from the start, rather than perfect tuning, I think max_prepared_transactions = max_connections is an easy and good item. If the goal is limited to auto-tuning memory sizes, this improvement can be treated separately. Regards MauMau -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
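Redoing the arithmetic from the quoted formula with the values MauMau proposes (max_prepared_transactions = max_connections = 100, max_locks_per_transaction = 64):

    SELECT pg_size_pretty(((770 + 270 * 64) * 100)::bigint);  -- 1763 kB, i.e. roughly 1.8MB

which is still negligible on any machine running an XA-capable application server.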
Re: [HACKERS] Auto-tuning work_mem and maintenance_work_mem
From: Dimitri Fontaine dimi...@2ndquadrant.fr MauMau maumau...@gmail.com writes: Although this is not directly related to memory, could you set max_prepared_transactions = max_connections at initdb time? People must You really need to have a transaction manager around when issuing prepared transactions, as failing to commit/rollback them will prevent VACUUM and quickly lead you to an interesting situation. I understand this problem occurs only when the user has configured the application server to use distributed transactions, the application server crashed between prepare and commit/rollback, and the user doesn't recover the application server. So only improper operation produces the problem. Just setting max_prepared_transactions to non-zero doesn't cause the problem. Is this correct? Regards MauMau -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
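The failure mode Dimitri warns about is at least visible and recoverable from SQL. A sketch of the cleanup an operator would do after an application-server crash (the gid shown is hypothetical):

    SELECT gid, prepared, owner, database FROM pg_prepared_xacts;  -- find orphaned entries holding back VACUUM
    ROLLBACK PREPARED 'orphaned_gid_from_app_server';              -- resolve one of them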
Re: [HACKERS] Triggers on foreign tables
2013/10/10 Ronan Dunklau rdunk...@gmail.com: On Sunday 6 October 2013 22:33:23, Kohei KaiGai wrote: 2013/9/10 Ronan Dunklau rdunk...@gmail.com: For row-level triggers, it seems more complicated. From what I understand, OLD/NEW tuples are fetched from the heap using their ctid (except for BEFORE INSERT triggers). How could this be adapted for foreign tables? It seems to me the origin of the difficulty in supporting row-level triggers is that a foreign table does not have a back-door to see the older tuple to be updated, unlike regular tables. Core PostgreSQL knows the on-disk format used to store tuples within regular tables; thus, GetTupleForTrigger() can fetch a tuple identified by tupleid. On the other hand, a writable foreign table is also designed to identify a remote tuple with tupleid, as long as the FDW module is responsible for that. It is my understanding that the tupleid is one way for an FDW to identify a tuple, but the AddForeignUpdateTargets hook allows arbitrary resjunk columns to be added. It is these columns that are then used to identify the tuple to update. I don't think the tupleid itself can be used to identify a tuple for the trigger. Sorry, I'm uncertain about the point above. Are you saying the FDW driver may be able to handle well a case in which the remote tuple to be updated is different from the remote tuple fetched in the first stage? Or am I misunderstanding? From another standpoint, I'd like to withdraw my suggestion above, because an after-row update or delete trigger tries to fetch the older image of the tuple after the actual update/delete job. Even if the FDW is responsible for identifying a remote tuple using tupleid, the identified tuple we can fetch is the newer image, because the FDW driver has already issued a remote query to update or delete the target record. So, sooner or later, the FDW shall be responsible for keeping a complete tuple image when the foreign table has row-level triggers, until the write operation is finished. So, a straightforward idea is adding a new FDW interface to get an older image of the tuple to be updated. It makes it possible to implement something similar to GetTupleForTrigger() on foreign tables, even though it takes a secondary query to fetch a record from the remote server. Probably, that is a headache for us. One thing to pay attention to is that the tuple to be updated has already been fetched from the remote server, and the tuple fetched latest is (always?) the target of the update or delete. It is not a heavy job for the ForeignScanState node to hold a copy of the latest tuple. Isn't it a viable way to reference the relevant ForeignScanState to get the older image? It is however possible to have only an incomplete tuple. The FDW may have only fetched the tupleid-equivalent, and we would have to ensure that the reltargetlist contains every attribute that the trigger could need. Does the incomplete tuple mean a tuple image in which some columns are replaced with NULL due to optimization, as postgres_fdw does? If so, a solution is to force the FDW driver to fetch all the columns if the foreign table it manages has row-level triggers for update/delete. Or, perform a second round-trip to the foreign data store to fetch the whole tuple. It might be a solution for a before-row trigger, but not for an after-row trigger, unfortunately. If my assumption is right, all we need to enhance is: (1) add a store on ForeignScanState to hold the last tuple of the scan stage, (2) add GetForeignTupleForTrigger() to reference the store above on the relevant ForeignScanState.
I would add a 3) have a GetTupleForTrigger() hook that would get called with the stored tuple. I personally don't like this approach: there are already (n+1) round trips to the foreign store (one during the scan phase, and one per tuple during the update/delete phase); we don't need another one per tuple for the triggers. As I noted above, a second round trip is not a suitable design to support after-row triggers. Likely, the GetTupleForTrigger() hook would return a cached older image fetched during the first round of the remote scan. So, I guess every FDW driver would end up with a similar implementation to handle this situation, and thus it makes sense for the PostgreSQL core to have a feature to support this handling: one that keeps the last-fetched tuple and returns the older image for row-level triggers. How about your opinion? And I also want some comments from committers, not only mine. -- KaiGai Kohei kai...@kaigai.gr.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
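For context, the user-visible goal of the thread is to make statements like the following work on foreign tables (hypothetical table and function names); the AFTER ROW case is the hard one, since OLD must be the pre-update remote image discussed above:

    CREATE TRIGGER remote_audit
      AFTER UPDATE ON my_foreign_table
      FOR EACH ROW EXECUTE PROCEDURE my_audit_func();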