Re: [HACKERS] timeout of pg_receivexlog --status-interval
On Wed, Jul 16, 2014 at 2:29 AM, Sawada Masahiko wrote: > On Tue, Jul 15, 2014 at 7:38 PM, Fujii Masao wrote: >> On Thu, Jul 10, 2014 at 11:10 PM, Sawada Masahiko >> wrote: >>> Hi, >>> >>> At line 1047 of receivelog.c:CopyStreamPoll(), we set the timeoutptr >>> variable to NULL. If timeoutptr is set to NULL then the process will >>> wait until the socket becomes readable in select(), as follows.
>>>
>>> if (timeout_ms < 0)
>>>     timeoutptr = NULL;
>>> else
>>> {
>>>     timeout.tv_sec = timeout_ms / 1000L;
>>>     timeout.tv_usec = (timeout_ms % 1000L) * 1000L;
>>>     timeoutptr = &timeout;
>>> }
>>>
>>> ret = select(PQsocket(conn) + 1, &input_mask, NULL, NULL, timeoutptr);
>>>
>>> But line 1047 of receivelog.c is never executed, because timeout_ms is never allowed to be less than 0 in CopyStreamReceive(), the only function that calls CopyStreamPoll(). In the current code, if we specify -s 0 then CopyStreamPoll() is never called, and pg_receivexlog executes PQgetCopyData() and fails, over and over. >> >> Thanks for reporting this! Yep, this is a problem. >> >>> I think that this is a contradiction, and we should execute select() with NULL as the fifth argument. The attached patch allows select() to be executed with NULL, i.e., pg_receivexlog will wait until the socket becomes readable, with no timeout, if -s is set to 0. >> >> Your patch changed the code so that CopyStreamPoll is called even >> when the timeout is 0. I don't agree with this change because >> timeout=0 basically means that the caller doesn't request to block and >> there is no need to call CopyStreamPoll in this case. So I'm thinking to >> apply the attached patch. Thoughts? >> > > Thank you for the response. I think this is better. > > One other point about the select() function: I think the behavior is the same whether the fifth argument is NULL or 0 (i.e. zero seconds). No, per the manpage of select(2), select(2) with timeout=0 behaves differently from select(2) with timeout=NULL. So I don't think your suggestion is acceptable... Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
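The select(2) distinction at issue is easy to demonstrate outside PostgreSQL: a NULL timeout pointer blocks indefinitely, while a pointer to a zeroed struct timeval makes select() a non-blocking poll. A minimal standalone sketch (not pg_receivexlog code, just an illustration against stdin):

    #include <stdio.h>
    #include <sys/select.h>

    int
    main(void)
    {
        fd_set          input_mask;
        struct timeval  zero = {0, 0};
        int             ret;

        FD_ZERO(&input_mask);
        FD_SET(0, &input_mask);

        /* timeout = &zero: a pure poll; returns 0 at once if stdin isn't readable */
        ret = select(1, &input_mask, NULL, NULL, &zero);
        printf("timeout=0:    select() returned %d\n", ret);

        FD_ZERO(&input_mask);
        FD_SET(0, &input_mask);

        /* timeout = NULL: blocks until stdin becomes readable, no timeout at all */
        ret = select(1, &input_mask, NULL, NULL, NULL);
        printf("timeout=NULL: select() returned %d\n", ret);

        return 0;
    }

This is why timeout_ms < 0 (mapped to a NULL pointer) and timeout_ms == 0 (an immediate poll) are distinct cases in CopyStreamPoll().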
Re: [HACKERS] date-time library
Hi 2014-07-24 3:49 GMT+02:00 Charlie Holleran: > Postgres has fantastic date-time parsing. I've tried some Java libraries > that advertise amazing time parsing. But nothing seems to match Postgres's > time parsing. I started peeking through the source to find a reference to > a library that Postgres might be using for time parsing. No luck. Where > would one look in the source code for the date-time parsing code or library > reference? > > The code is in the file https://github.com/postgres/postgres/blob/master/src/backend/utils/adt/datetime.c Regards Pavel
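To make the pointer concrete: parsing in that file happens in two passes, and the backend's input functions drive them directly. This is roughly what timestamp_in() does (a simplified fragment of backend code; ParseDateTime(), DecodeDateTime() and DateTimeParseError() are the real entry points in datetime.c, declarations per utils/datetime.h):

    char        workbuf[MAXDATELEN + MAXDATEFIELDS];
    char       *field[MAXDATEFIELDS];
    int         ftype[MAXDATEFIELDS];
    int         nf, dtype, tz, dterr;
    fsec_t      fsec;
    struct pg_tm tt, *tm = &tt;

    /* pass 1: split the string into tokens (numbers, words, delimiters) */
    dterr = ParseDateTime(str, workbuf, sizeof(workbuf),
                          field, ftype, MAXDATEFIELDS, &nf);
    /* pass 2: decide what the tokens mean (date, time, timezone, specials) */
    if (dterr == 0)
        dterr = DecodeDateTime(field, ftype, nf, &dtype, tm, &fsec, &tz);
    if (dterr != 0)
        DateTimeParseError(dterr, str, "timestamp");

The flexibility Charlie noticed comes from DecodeDateTime()'s per-token heuristics and keyword tables, not from any external library.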
Re: [HACKERS] Stating the significance of Lehman & Yao in the nbtree README
On Wed, Jul 23, 2014 at 8:52 PM, Amit Kapila wrote: > As such there is no problem in saying the way you have mentioned, but > I feel it would be better if we can mention the mechanism of _bt_search() > as quoted by you upthread in the first line. > "> In more concrete terms, _bt_search() releases and only then acquires >> read locks during a descent of the tree (by calling >> _bt_relandgetbuf()), and, perhaps counterintuitively, that's just >> fine." I guess I could say that too. > One more point: why do you think it is important to add this new text > on top? I think adding new text after the "Lehman and Yao don't require read > locks, ..." paragraph is okay. I've added it to the top because it's really the most important point on Lehman and Yao. It's the _whole_ point. Consider how it's introduced here, for example: http://db.cs.berkeley.edu/jmh/cs262b/treeCCR.html Why should I "bury the lead"? -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
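For readers without nbtsearch.c at hand, the behavior being described - release the parent's read lock, only then take the child's - looks roughly like this in _bt_search()'s descent loop (a paraphrase: the function names are real, but argument lists are abbreviated and surrounding scaffolding is simplified):

    /* descend one level at a time toward the leaf */
    for (;;)
    {
        Page            page = BufferGetPage(buf);
        BTPageOpaque    opaque = (BTPageOpaque) PageGetSpecialPointer(page);
        OffsetNumber    offnum;
        IndexTuple      itup;
        BlockNumber     child;

        if (P_ISLEAF(opaque))
            break;                          /* reached the leaf level */

        /* pick the downlink to follow while holding this page's read lock */
        offnum = _bt_binsrch(rel, buf, keysz, scankey, nextkey);
        itup = (IndexTuple) PageGetItem(page, PageGetItemId(page, offnum));
        child = ItemPointerGetBlockNumber(&(itup->t_tid));

        /*
         * Release the parent's pin and read lock, and only then read-lock
         * the child; at no point are two read locks held concurrently.
         */
        buf = _bt_relandgetbuf(rel, buf, child, BT_READ);

        /*
         * The child could have split while no lock was held.  Lehman &
         * Yao's right-links and high keys make that recoverable: move
         * right until we reach the page covering our scan key.
         */
        buf = _bt_moveright(rel, buf, keysz, scankey, nextkey, BT_READ);
    }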
Re: [HACKERS] postgresql.auto.conf and reload
On Wed, Jul 23, 2014 at 11:02 PM, Alvaro Herrera wrote: > Tom Lane wrote: > > Alvaro Herrera writes: > > > Tom Lane wrote: > > >> No, ALTER SYSTEM is there now and it needs to work right in its first > > >> release. I will go fix this if nobody else does. > > > > > Just checking -- you didn't get around to dealing with this, right? > > > > Not yet... do you want it? > > I might give it a try, but not just yet. I'll let you know if I attempt > anything. I am not sure whether you have noticed, but I have posted a patch upthread to address this issue. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Stating the significance of Lehman & Yao in the nbtree README
On Wed, Jul 23, 2014 at 11:19 AM, Peter Geoghegan wrote: > On Tue, Jul 22, 2014 at 8:59 PM, Amit Kapila wrote: > > Okay, but how does this justify adding the below new text to the README? > > + Even with these read locks, Lehman and Yao's approach obviates the > > + need of earlier schemes to hold multiple read locks concurrently when > > + descending the tree as part of servicing index scans (pessimistic lock > > + coupling). > > > > Actually I think putting it can lead to inconsistency in the README. > > Currently it indicates that our algorithm is different from L&Y w.r.t taking > > Read Locks and has given an explanation for same. > > Not really. Firstly, that sentence acknowledges that there are read > locks where L&Y assume there will not be. "Even with these read locks" > references the first paragraph, where it is stated that Postgres > B-Trees still acquire read locks while descending the tree. I think here you want to state that the difference in Postgres is "as we are using the L&Y approach, it doesn't need to hold *multiple* read locks concurrently", and the L&Y approach that obviates this need is explained in the second line (which indicates the importance of maintaining right-links and high-keys to detect and recover from page splits). As such there is no problem in saying the way you have mentioned, but I feel it would be better if we can mention the mechanism of _bt_search() as quoted by you upthread in the first line. "> In more concrete terms, _bt_search() releases and only then acquires > read locks during a descent of the tree (by calling > _bt_relandgetbuf()), and, perhaps counterintuitively, that's just > fine." One more point: why do you think it is important to add this new text on top? I think adding new text after the "Lehman and Yao don't require read locks, ..." paragraph is okay. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
[HACKERS] date-time library
Postgres has fantastic date-time parsing. I've tried some Java libraries that advertise amazing time parsing. But nothing seems to match Postgres's time parsing. I started peeking through the source to find a reference to a library that Postgres might be using for time parsing. No luck. Where would one look in the source code for the date-time parsing code or library reference?
Re: [HACKERS] Making joins involving ctid work for the benefit of UPSERT
On Wed, Jul 23, 2014 at 9:58 AM, Robert Haas wrote: > I'd suggest something like: > > UPSERT table SET col = value [, ...] WHERE predicate; I think I see another problem with this. The UPDATE must provide a WHERE clause Var on a unique indexed column (let's say it's constrained to provide a "(Var [unique-indexed opclass support function 3 op] Const)" qual during parse analysis because you asked for UPSERT). But it can also affect 0 rows for other reasons, since this UPDATE qual can have arbitrary other expressions. So in general you have any number of reasons for not updating, which implies that you must insert, which might not be possible because there might even still be a duplicate key violation from a not-yet-visible-to-you row (even just in the unique index you intended to merge on). Whereas, when inserting, there is exactly one reason (per row) to go and update - a conflict (i.e. a would-be duplicate violation). And this is a conflict that is meaningful to every transaction, regardless of their snapshot, since it represents an objective fact about the physical presence of an index tuple. So, do you make the UPDATE differentiate between different reasons for the UPDATE failing, only inserting when an UPSERT would have happened had you omitted the extra stuff in your UPSERT predicate? Can you really differentiate anyway? And isn't the choice to do the insert based on what your MVCC snapshot happens to see - rather than something that there is necessarily objective truth on, the physical presence of a duplicate tuple in an index - rather arbitrary? It's a different situation to my implementation, where you do an insert-like thing, and then find a conflict row to lock and then update, because at that point the row is successfully locked, and the WHERE clause may be evaluated against rejects and the *most recent version* of the locked, conflict-causing row. The most recent, locked version is not arbitrary at all. That's the version you ought to evaluate a predicate against before finally deciding to UPDATE. I assume you agree with my view that UPSERT should always insert or update. This kind of business sounds closer to SQL MERGE, where an insert may not drive things, and you accept these kinds of anomalies in exchange for a lot more flexibility in not having inserts drive things. Did you happen to read my post on that? -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Diagnose incompatible OpenLDAP versions during build and test.
Noah Misch writes: > On Tue, Jul 22, 2014 at 11:51:18AM -0400, Tom Lane wrote: >> Hmm. I'm pretty sure it is not considered good style to drop AC_DEFUN >> blocks right into configure.in; at least, we have never done that before. >> PGAC_LDAP_SAFE should get defined somewhere in config/*.m4, instead. >> I am not real sure however which existing config/ file is appropriate, >> or whether we should create a new one. Peter, any opinion? > I don't mind reinstating the absence of AC_DEFUN from configure.in. Add > config/misc-libraries.m4, perhaps? Alternately, stretch programs.m4; it > already has two readline library checks. programs.m4 sounds like the place then. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: [COMMITTERS] pgsql: Diagnose incompatible OpenLDAP versions during build and test.
On Tue, Jul 22, 2014 at 11:51:18AM -0400, Tom Lane wrote: > Noah Misch writes: > > Diagnose incompatible OpenLDAP versions during build and test. > > Hmm. I'm pretty sure it is not considered good style to drop AC_DEFUN > blocks right into configure.in; at least, we have never done that before. > PGAC_LDAP_SAFE should get defined somewhere in config/*.m4, instead. > I am not real sure however which existing config/ file is appropriate, > or whether we should create a new one. Peter, any opinion? I don't mind reinstating the absence of AC_DEFUN from configure.in. Add config/misc-libraries.m4, perhaps? Alternately, stretch programs.m4; it already has two readline library checks. -- Noah Misch EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PDF builds broken again
Magnus Hagander wrote: > I think it would be very useful to have. Given how long it takes to > build, running it not all the time but every couple of days or weekly > or so would be quite useful. Then we'd catch things earlier and not > have to spend as much time trying to figure out exactly what broke... I have a half-setup buildfarm job on the machine that powers guaibasaurus that's intended to build HTML docs every five minutes or so. I guess I could see about finishing that, and having it do the PDF build once a week or so. -- Álvaro Herrera      http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Making joins involving ctid work for the benefit of UPSERT
On Wed, Jul 23, 2014 at 3:27 PM, Robert Haas wrote: > To the last question, yes. To the first point, I'm not bent on this > particular syntax, but I am +1 for the idea that the command is > something specifically upsert-like rather than something more generic > like an ON CONFLICT SELECT that lets you do arbitrary things with the > returned rows. FWIW I wasn't proposing that you'd be able to do arbitrary things. There'd be a few restrictions, such as forbidding unexpected DML commands, and requiring that only a special update reference the special rejected-projecting CTE. They just wouldn't be arbitrary restrictions. But that's not that interesting to me anymore; you and Andres want a dedicated DML command with the update part built in, that isn't as flexible. Okay, fine. I don't think it makes the implementation any easier, but that's what I'll do. > It's certainly arguable whether you should INSERT and then turn failures > into an update or try to UPDATE and then turn failures into an INSERT; > we might even want to have both options available, though that smells > a little like airing too much of our dirty laundry. But I think I > generally favor optimizing for the UPDATE case for more or less the > same reasons Kevin articulated. I don't see the connection between this and Kevin's remarks. And FWIW, I don't see a reason to favor inserts or updates. Fortunately, what I have balances both cases very well, and doesn't cause bloat. The work of descending the index to lock it isn't wasted if an update is required. My implementation decides to either insert or update at literally the latest possible moment. >> For one thing we cannot use any kind of scan unless there is a new >> mechanism for seeing not-yet-visible tuples, like some kind of >> non-MVCC snapshot that those scan nodes can selectively use. Now, I >> guess that you intend that in this scenario you go straight to 5, and >> then your insert finds the not-yet-visible conflict. But it's not >> clear that when 5 fails, and we pick up from 1 that we then get a new >> MVCC Snapshot or something else that gives us a hope of succeeding >> this time around. And it seems quite wasteful to not use knowledge of >> conflicts from the insert at all - that's one thing I'm trying to >> address here; removing unnecessary index scans when we actually know >> what our conflict TIDs are. > > Here you seem to be suggesting that I intended to propose your existing > design rather than something else, which I didn't. In this design, > you find the conflict (at most one) by scanning for the tuple to be > updated. Yes, but what if you don't see a conflict because it isn't visible to your snapshot, and then you insert, and only then (step 5), presumably with a dirty snapshot, you find a conflict? How does the loop terminate if that brings you back to step 1 with the same MVCC snapshot feeding the update? >>> 2. If you find more than one tuple that is visible to your scan, error. >> >> This point seems to concern making the UPSERT UPDATE only affect zero >> or one rows. Is that right? If so, why do you think that's a useful >> constraint to impose? > > Because nobody wants an operation to either insert 1 tuple or update > n>=1 tuples. The intention is that the predicate should probably be > something like WHERE unique_key = 'some_value', but you can use > something else. So it's kinda like saying which index you care about > for a particular operation, except without having to explicitly name > an index. But in any event you should use a predicate that uniquely > identifies the tuple you want to update. I agree that you want to uniquely identify each tuple. What I meant was, why should we not be able to upsert multiple rows in a single command? What's wrong with that? >> Point 4b ("if there is more than one such tuple...") seems like it >> needs some practical or theoretical justification - do you just not >> want to follow an update chain? If so, why? What would the error >> message actually be? And, to repeat myself: Why should an MVCC >> snapshot see more than one such tuple in the ordinary course of >> scanning a relation, since there is by definition a unique constraint >> that prevents this? Or maybe I'm wrong; maybe you don't even require >> that. That isn't clear. > > In case (4b), like in case (2), you've done something like UPSERT tab > SET ... WHERE non_unique_index = 'value_appearing_more_than_once'. > You shouldn't do that, because you have only one replacement tuple > (embodied in the SET clause). Oh, I totally agree that we should throw an error if any one row is affected more than once by the same command. Indeed, SQL MERGE requires this, since to do otherwise would leave the final value of the row unspecified. It seemed to me from your original explanation of point 4b that you were saying something much stronger. But if that's all you meant, then I agree. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Making joins involving ctid work for the benefit of UPSERT
Robert Haas wrote: > On Wed, Jul 23, 2014 at 4:05 PM, Peter Geoghegan wrote: > > On Wed, Jul 23, 2014 at 9:58 AM, Robert Haas wrote: > >> 2. If you find more than one tuple that is visible to your scan, error. > > > > This point seems to concern making the UPSERT UPDATE only affect zero > > or one rows. Is that right? If so, why do you think that's a useful > > constraint to impose? > > Because nobody wants an operation to either insert 1 tuple or update > n>=1 tuples. The intention is that the predicate should probably be > something like WHERE unique_key = 'some_value', but you can use > something else. So it's kinda like saying which index you care about > for a particular operation, except without having to explicitly name > an index. But in any event you should use a predicate that uniquely > identifies the tuple you want to update. This seemed a nice idea when I first read it earlier today, but now I'm not so sure. Are you saying that it wouldn't be allowed to use an UPSERT with some sort of join, such that each joined row would produce either one insert or one update? To clarify: suppose I import some external data into a temp table, then run UPSERT "USING" that table so that the rows end up in a permanent table; some of the rows might be already in the permanent table, some others might not. I would hope that by the time UPSERT is done, all the rows are in the permanent table. Would that raise an error, with your proposed design? -- Álvaro Herrera      http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()
Peter Geoghegan wrote: > Maybe that would be marginally better than classic Levenshtein > distance, but I doubt it would pay for itself. It's just more code to > maintain. Are we really expecting to not get the best possible > suggestion due to some number of transposition errors very frequently? > You still have to have a worse suggestion spuriously get ahead of > yours, and typically there just aren't that many to begin with. I'm > not targeting spelling errors so much as thinkos around plurals and > whether or not an underscore was used. Damerau-Levenshtein seems like > an algorithm with fairly specialized applications. Yes, it's for typos. I guess it's an infrequent scenario to have both a typoed column and a column that's missing the plural declension, which is the case in which Damerau-Levenshtein would be a win. -- Álvaro Herrera      http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()
On Thu, Jul 24, 2014 at 1:09 AM, Tom Lane wrote: > Robert Haas writes: > > There are several possible methods of doing that, but I think the best > > one is just to leave the SQL-callable C functions in fuzzystrmatch and > > move only the underlying code that supports into core. > > I hadn't been paying close attention to this thread, but I'd just assumed > that that would be the approach. > > It might be worth introducing new differently-named pg_proc entries for > the same functions in core, but only if we can agree that there are better > names for them than what the extension uses. > Yes, that's a point I raised upthread as well. What about renaming those functions as string_distance and string_distance_less_than? Then have only fuzzystrmatch do some DirectFunctionCall using the in-core functions? -- Michael
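The shim Michael is describing would be small. A hedged sketch of what the fuzzystrmatch side could look like, assuming the hypothetical in-core rename to string_distance (the PG_FUNCTION_INFO_V1/DirectFunctionCall2 pattern is real fmgr API; string_distance is only the name proposed above, not an existing function):

    #include "postgres.h"
    #include "fmgr.h"

    PG_FUNCTION_INFO_V1(levenshtein);

    /*
     * Keep the old SQL-callable name in fuzzystrmatch, delegating to the
     * (hypothetical) in-core implementation.
     */
    Datum
    levenshtein(PG_FUNCTION_ARGS)
    {
        return DirectFunctionCall2(string_distance,
                                   PG_GETARG_DATUM(0),
                                   PG_GETARG_DATUM(1));
    }

The extension's pg_proc entries and SQL names would stay unchanged, so existing users of fuzzystrmatch would notice nothing.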
Re: [HACKERS] Making joins involving ctid work for the benefit of UPSERT
On Wed, Jul 23, 2014 at 3:01 PM, Kevin Grittner wrote: > Could you clarify that? Does this mean that you feel that we > should write to the heap before reading the index to see if the row > will be a duplicate? If so, I think that is a bad idea, since this > will sometimes be used to apply a new data set which hasn't changed > much from the old, and that approach will perform poorly for this > use case, causing a lot of bloat. It certainly would work well for > the case that most of the rows are expected to be INSERTs rather > than DELETEs, but I'm not sure that's justification for causing > extreme bloat in the other cases. No, I think we should stagger ordinary index insertion in a way that locks indexes - we lock indexes, then if successful insert a heap tuple, before finally inserting index tuples using the existing heavyweight page-level index locks. My design doesn't cause bloat under any circumstances. Heikki's design, which he sketched with an actual POC implementation, involved possible bloating in the event of a conflict. He also had to go and delete the promise tuple (from within ExecInsert()) in the event of the conflict before row locking, in order to prevent unprincipled deadlocking. Andres wanted to do something else similarly involving "promise tuples", where the xid of the inserter was stored in indexes with a special flag. That could also cause bloat. I think that could be particularly bad when conflicts necessitate visiting indexes one by one to kill promise tuples, as opposed to just killing one heap tuple as in the case of Heikki's design. Anyway, both of those designs and my own are insert-driven. The main difference between the design that Heikki sketched and my own is that mine does not cause bloat, but is more invasive to the nbtree code (though less invasive to a lot of other places needed to make the deadlocking-ultimately-conflicting tuple killing work). But I believe that Heikki's design is identical to my own in terms of user-visible semantics. That said, his design was just a sketch, and it wouldn't be fair to hold him to it. > Also, just a reminder that I'm going to squawk loudly if the > implementation does not do something fairly predictable and sane > for the case that the table has more than one UNIQUE index and you > attempt to UPSERT a row that is a duplicate of one row on one of > the indexes and a different row on a different index. Duly noted. :-) I think that it's going to have to support that one way or the other. It might be the case that I'll want to make the choice of unique index optionally "implicit", but it's clear that we want to be able to specify a specific unique index in one form or another. Actually, I've already added that. It's just optional right now. I haven't found a better way than just specifying the name of the unique index in DML, which is ugly - which is the main reason I want to make it optional. Perhaps we can overcome this. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Making joins involving ctid work for the benefit of UPSERT
Peter Geoghegan wrote >> with semantics like this: >> >> 1. Search the table, using any type of scan you like, for a row >> matching the given predicate. > > Perhaps I've misunderstood, but this is fundamentally different to > what I'd always thought would need to happen. I always understood that > the UPSERT should be insert-driven, and so not really have an initial > scan at all in the same sense that every Insert lacks one. Moreover, > I've always thought that everyone agreed on that. We go to insert, and > then in the course of inserting index tuples do something with dirty > snapshots. That's how we get information on conflicts. SQL standard notwithstanding, there really is no way to pick one implementation and not have it be sub-optimal in some situations. Unless there is a high barrier, why not introduce syntax - SCAN FIRST; INSERT FIRST - that allows the user to specify the behavior that they expect would be most efficient given their existing/new data ratio? >> Having said all that, I believe the INSERT ON CONFLICT syntax is more >> easily comprehensible than previous proposals. But I still tend to >> agree with Andres that an explicit UPSERT syntax or something like it, >> that captures all of the MVCC games inside itself, is likely >> preferable from a user standpoint, whatever the implementation ends up >> looking like. > > Okay then. If you both feel that way, I will come up with something > closer to what you sketch. But for now I still strongly feel it ought > to be driven by an insert. Perhaps I've misunderstood you entirely, > though. Getting a little syntax crazy here, but having all of:

UPSERT [SCAN|INSERT] FIRST
INSERT ON CONFLICT UPDATE - same as INSERT FIRST
UPDATE ON MISSING INSERT - same as SCAN FIRST

with the corresponding 2 implementations would make the user interface slightly more complicated, but able to conform to the actual data that the user has. You could basically perform a two-phase pass where you run the user-requested algorithm and then for all failures attempt the alternate algorithm, and then error if both fail. I am not at all fluent on the concurrency issues here, and the MVCC violations and re-tries that might be considered, but at a high level there is disagreement here simply because both answers are "correct" and ideally both can be provided to the user. David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/Making-joins-involving-ctid-work-for-the-benefit-of-UPSERT-tp5811919p5812640.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] System shutdown signal on Windows (was Re: [GENERAL])
On 23 July 2014 22:16, Tom Lane wrote: > Krystian Bigaj writes: > > - when pg_console_handler receives CTRL_SHUTDOWN_EVENT from OS, then it > > calls pg_queue_signal(SIGINT). > > > Problems: > > - when OS is in shutdown path, then it sends CTRL_SHUTDOWN_EVENT, and *all* > > Postgres processes (main and sub/forked) will call pg_queue_signal(SIGINT) > > - so main and sub processes will start to shutdown independently? Can this > > have any bad consequences? > > Hm. We ought to have that sending SIGTERM instead, so as to mimic the > situation when Unix "init" is trying to shut down the system. It might be > that SIGINT will more or less work, but the postmaster logic is designed > to work with global SIGTERM as being the clean-up-ASAP trigger. As an > example, backends servicing remote applications (which will *not* have > already gotten killed) would not exit in response to SIGINT. > > > I think that CTRL_SHUTDOWN_EVENT should be removed from pg_console_handler, > > That does not sound like a good idea, at least not if Windows has the > same behavior as "init" does of proceeding to hard kills after some > grace period. > > regards, tom lane

I'm not really familiar with Unix and its SIG-commands; I know about SIGINT/SIGTERM only from the Postgres documentation. However, from what I see, when Postgres is run by pg_ctl as a service, it will receive SIGINT independently, and in general in unspecified order:
- *each* postgres.exe process will queue itself a SIGINT (because of CTRL_SHUTDOWN_EVENT),
- pg_ctl will send SIGINT to the main postmaster process (and possibly it will pass that command to sub-processes)

So there are two independent paths by which SIGINT is sent, and pg_ctl doesn't really have control over when postgres.exe receives SIGINT. This CTRL_SHUTDOWN_EVENT is not used when postgres.exe is run in a *user session* - so removing it won't change anything there. I see only two cases where CTRL_SHUTDOWN_EVENT might be needed (both where postgres.exe is run in a service session):
- postgres.exe run by pg_ctl.exe, but the pg_ctl service process was terminated/killed, and then the system was shut down
- someone starts postgres.exe from their own service, but doesn't send a SIGINT/SIGTERM command to postgres.exe on service system shutdown (though he must for service stop)

As I previously wrote, I have a workaround for it, so if you think that this change would break compatibility and don't want to change it, then I'm really fine with it.

However, I've probably found something in pg_ctl.c regarding shutdown, and maybe the cause of that suspicious postgres.exe process termination on Windows.

1) shutdownEvent is signaled in pgwin32_ServiceMain by SERVICE_CONTROL_STOP/SERVICE_CONTROL_SHUTDOWN in pgwin32_ServiceHandler. There dwWaitHint = 10000.

2)
	...
	/* Wait for quit... */
	ret = WaitForMultipleObjects(2, shutdownHandles, FALSE, INFINITE);

	pgwin32_SetServiceStatus(SERVICE_STOP_PENDING);
	switch (ret)
	{
		case WAIT_OBJECT_0:		/* shutdown event */
			kill(postmasterPID, SIGINT);

			/*
			 * Increment the checkpoint and try again. Abort after 12
			 * checkpoints as the postmaster has probably hung
			 */
			while (WaitForSingleObject(postmasterProcess, 5000) == WAIT_TIMEOUT
				   && status.dwCheckPoint < 12)
				status.dwCheckPoint++;

			<--- missing call to pgwin32_SetServiceStatus(SERVICE_STOP_PENDING) or SetServiceStatus(hStatus, (LPSERVICE_STATUS) &status);

			break;
	...

dwCheckPoint is incremented every 5000ms, but that status is never reported to the SCM (the pgwin32_SetServiceStatus/SetServiceStatus call is missing), so after 10s (dwWaitHint = 10000) the SCM will not have received an incremented dwCheckPoint, and it's allowed to kill that process (because the service didn't respond within dwWaitHint). It kills pg_ctl.exe, and when all services have stopped, it simply terminates the other remaining processes - in this case postgres.exe.

http://msdn.microsoft.com/en-us/library/windows/desktop/ms685996(v=vs.85).aspx dwWaitHint: ... If the amount of time specified by dwWaitHint passes, and dwCheckPoint has not been incremented or dwCurrentState has not changed, the service control manager or service control program can assume that an error has occurred and the service should be stopped. ...

So if the main postgres.exe (run by the pg_ctl.exe service process) won't shut down within 10s (for any reason), it might be terminated/killed by Windows/SCM.

Best regards, Krystian Bigaj
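The shape of the fix Krystian is pointing at would be to report each checkpoint increment back to the SCM, so dwWaitHint never elapses without a status update. A hedged sketch (pgwin32_SetServiceStatus is pg_ctl.c's existing helper; untested):

			while (WaitForSingleObject(postmasterProcess, 5000) == WAIT_TIMEOUT
				   && status.dwCheckPoint < 12)
			{
				status.dwCheckPoint++;
				/* tell the SCM we're still making progress on the stop */
				pgwin32_SetServiceStatus(SERVICE_STOP_PENDING);
			}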
Re: [HACKERS] Making joins involving ctid work for the benefit of UPSERT
On Wed, Jul 23, 2014 at 4:05 PM, Peter Geoghegan wrote: > On Wed, Jul 23, 2014 at 9:58 AM, Robert Haas wrote: >> This all seems completely to one side of Andres's point. I think what >> he's saying is: don't implement an SQL syntax of the form INSERT ON >> CONFLICT and let people use that to implement upsert. Instead, >> directly implement an SQL command called UPSERT or similar. That's >> long been my intuition as well. I think generality here is not your >> friend. > > Sure, I was just making one fairly narrow point in relation to Andres' > concern about executor structuring of the UPSERT. > >> I'd suggest something like: >> >> UPSERT table SET col = value [, ...] WHERE predicate; > > Without dismissing the overall concern shared by you and Andres, that > particular update-driven syntax doesn't strike me as something that > should be pursued. I would like to be able to insert or update more > than a single row at a time, for one thing. For another, what exactly > appears in the predicate? Is it more or less the same as an UPDATE's > predicate? To the last question, yes. To the first point, I'm not bent on this particular syntax, but I am +1 for the idea that the command is something specifically upsert-like rather than something more generic like an ON CONFLICT SELECT that lets you do arbitrary things with the returned rows. >> with semantics like this: >> >> 1. Search the table, using any type of scan you like, for a row >> matching the given predicate. > > Perhaps I've misunderstood, but this is fundamentally different to > what I'd always thought would need to happen. I always understood that > the UPSERT should be insert-driven, and so not really have an initial > scan at all in the same sense that every Insert lacks one. Moreover, > I've always thought that everyone agreed on that. We go to insert, and > then in the course of inserting index tuples do something with dirty > snapshots. That's how we get information on conflicts. It's certainly arguable whether you should INSERT and then turn failures into an update or try to UPDATE and then turn failures into an INSERT; we might even want to have both options available, though that smells a little like airing too much of our dirty laundry. But I think I generally favor optimizing for the UPDATE case for more or less the same reasons Kevin articulated. > For one thing we cannot use any kind of scan unless there is a new > mechanism for seeing not-yet-visible tuples, like some kind of > non-MVCC snapshot that those scan nodes can selectively use. Now, I > guess that you intend that in this scenario you go straight to 5, and > then your insert finds the not-yet-visible conflict. But it's not > clear that when 5 fails, and we pick up from 1 that we then get a new > MVCC Snapshot or something else that gives us a hope of succeeding > this time around. And it seems quite wasteful to not use knowledge of > conflicts from the insert at all - that's one thing I'm trying to > address here; removing unnecessary index scans when we actually know > what our conflict TIDs are. Here you seem to be suggesting that I intended to propose your existing design rather than something else, which I didn't. In this design, you find the conflict (at most one) by scanning for the tuple to be updated. >> 2. If you find more than one tuple that is visible to your scan, error. > This point seems to concern making the UPSERT UPDATE only affect zero > or one rows. Is that right? If so, why do you think that's a useful > constraint to impose? Because nobody wants an operation to either insert 1 tuple or update n>=1 tuples. The intention is that the predicate should probably be something like WHERE unique_key = 'some_value', but you can use something else. So it's kinda like saying which index you care about for a particular operation, except without having to explicitly name an index. But in any event you should use a predicate that uniquely identifies the tuple you want to update. >> 3. If you find exactly one tuple that is visible to your scan, update it. >> Stop. >> 4. If you find no tuple that is visible to your scan, but you do find >> at least one tuple whose xmin has committed and whose xmax has not >> committed, then (4a) if the isolation level is REPEATABLE READ or >> higher, error; (4b) if there is more than one such tuple, error; else >> (4b) update it, in violation of normal MVCC visibility rules. > > Point 4b ("if there is more than one such tuple...") seems like it > needs some practical or theoretical justification - do you just not > want to follow an update chain? If so, why? What would the error > message actually be? And, to repeat myself: Why should an MVCC > snapshot see more than one such tuple in the ordinary course of > scanning a relation, since there is by definition a unique constraint > that prevents this? Or maybe I'm wrong; maybe you don't even require > that. That isn't clear. In case (4b), like in case (2), you've done something like UPSERT tab SET ... WHERE non_unique_index = 'value_appearing_more_than_once'. You shouldn't do that, because you have only one replacement tuple (embodied in the SET clause).
Re: [HACKERS] Shapes on the regression test for polygon
On Wed, Jul 23, 2014 at 4:06 PM, Tom Lane wrote: > Robert Haas writes: >> On Mon, Jul 21, 2014 at 10:52 AM, Emre Hasegeli wrote: >>> The first two shapes on src/test/regress/sql/polygon.sql do not make >>> sense to me. > >> Well, I think the number of tabs that makes them look best depends on >> your tab-stop setting. At present, I find that with 8-space tabs >> things seem to line up pretty well, whereas with your patch, 4-space >> tabs line up well. > > Perhaps we should expand tabs to spaces in those ascii-art diagrams? > >> Personally, though, my vote would be to just leave it the way it is. >> I don't think there's really a problem here in need of solving. > > I concur with doubting that changing the actual regression test cases > is a good thing to do, at least not without more analysis. But "the > pictures make no sense with the wrong tab setting" is a pretty simple > issue, and one that I can see biting people. AFAICT, the pictures make no sense with the right tab setting, either. The possibility that someone might use the wrong tab setting is just icing on the cake. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Making joins involving ctid work for the benefit of UPSERT
Peter Geoghegan wrote: > I still strongly feel it ought to be driven by an insert Could you clarify that? Does this mean that you feel that we should write to the heap before reading the index to see if the row will be a duplicate? If so, I think that is a bad idea, since this will sometimes be used to apply a new data set which hasn't changed much from the old, and that approach will perform poorly for this use case, causing a lot of bloat. It certainly would work well for the case that most of the rows are expected to be INSERTs rather than UPDATEs, but I'm not sure that's justification for causing extreme bloat in the other cases. Also, just a reminder that I'm going to squawk loudly if the implementation does not do something fairly predictable and sane for the case that the table has more than one UNIQUE index and you attempt to UPSERT a row that is a duplicate of one row on one of the indexes and a different row on a different index. The example discussed during your PGCon talk was something like a city table with two columns, each with a UNIQUE constraint, containing:

 city_id | city_name
---------+-----------
       1 | Toronto
       2 | Ottawa

... and an UPSERT comes through for (1, 'Ottawa'). We would all like for that never to happen, but it will. There must be sane and documented behavior in that case. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Making joins involving ctid work for the benefit of UPSERT
On Wed, Jul 23, 2014 at 5:58 PM, Robert Haas wrote: > I'd suggest something like: > > UPSERT table SET col = value [, ...] WHERE predicate; I don't think this is actually good enough. Typical use cases are things like "increment this counter or insert a new one starting at 0". -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Verbose output of pg_dump not show schema name
On Thu, Apr 17, 2014 at 9:15 PM, Michael Paquier wrote: > > On Fri, Apr 18, 2014 at 4:29 AM, Fabrízio de Royes Mello > wrote: > > > > On Thu, Apr 17, 2014 at 12:46 PM, Bruce Momjian wrote: > >> > >> On Thu, Apr 17, 2014 at 11:44:37AM -0400, Tom Lane wrote: > >> > Bruce Momjian writes: > >> > > The idea is that we only need quotes when there are odd characters in > >> > > the identifier. We do that right now in some places, though I can't > >> > > find them in pg_dump. I know psql does that, see quote_ident(). > >> > > >> > I think our general style rule is that identifiers embedded in messages > >> > are always double-quoted. There's an exception for type names, but > >> > not otherwise. You're confusing the message case with printing SQL. > >> > >> OK. I was unclear if a status _display_ was a message like an error > >> message. > >> > > > > The attached patch fixes the missing double-quoting in the "dumping contents of > > table.." message and adds the schema name to other messages: > > - "reading indexes for table \"%s\".\"%s\"\n" > > - "reading foreign key constraints for table \"%s\".\"%s\"\n" > > - "reading triggers for table \"%s\".\"%s\"\n" > > > > - "finding the columns and types of table \"%s\".\"%s\"\n" > > - "finding default expressions of table \"%s\".\"%s\"\n" > > - "finding check constraints for table \"%s\".\"%s\"\n" > Cool additions. There may be a more elegant way to check if namespace > is NULL, but I couldn't come up with one myself. So patch may be fine. > Hi all, I think this small patch was lost. Is there something wrong? Regards, -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL >> Timbira: http://www.timbira.com.br >> Blog sobre TI: http://fabriziomello.blogspot.com >> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello >> Twitter: http://twitter.com/fabriziomello
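For context, the pattern at issue in pg_dump's verbose output is roughly this (a hedged sketch of one message site, not the patch itself; write_msg(), g_verbose and the TableInfo/DumpableObject fields are real pg_dump names):

	if (g_verbose)
	{
		/* schema-qualify the table name when we have a namespace */
		if (tbinfo->dobj.namespace != NULL)
			write_msg(NULL, "reading indexes for table \"%s\".\"%s\"\n",
					  tbinfo->dobj.namespace->dobj.name,
					  tbinfo->dobj.name);
		else
			write_msg(NULL, "reading indexes for table \"%s\"\n",
					  tbinfo->dobj.name);
	}

The "more elegant way" Michael alludes to would presumably factor the NULL check into a small helper rather than repeating the if/else at each of the six message sites.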
Re: [HACKERS] [GSoC2014] Patch ALTER TABLE ... SET LOGGED
On Tue, Jul 22, 2014 at 3:29 PM, Fabrízio de Royes Mello <fabriziome...@gmail.com> wrote: > On Tue, Jul 22, 2014 at 12:01 PM, Fabrízio de Royes Mello <fabriziome...@gmail.com> wrote: > > On Thu, Jul 17, 2014 at 8:02 PM, Andres Freund wrote: > > > > That means should I "FlushRelationBuffers(rel)" before changing the relpersistence? > > > > > > I don't think that'd help. I think what this means is that you simply cannot change the relpersistence of the old relation before the heap swap is successful. So I guess it has to be something like (pseudocode):
> > >
> > > OIDNewHeap = make_new_heap(..);
> > > newrel = heap_open(OIDNewHeap, AEL);
> > >
> > > /*
> > >  * Change the temporary relation to be unlogged/logged. We have to do
> > >  * that here so buffers for the new relfilenode will have the right
> > >  * persistency set while the original filenode's buffers won't get read
> > >  * in with the wrong (i.e. new) persistency setting. Otherwise a
> > >  * rollback after the rewrite would possibly result with buffers for the
> > >  * original filenode having the wrong persistency setting.
> > >  *
> > >  * NB: This relies on swap_relation_files() also swapping the
> > >  * persistency. That wouldn't work for pg_class, but that can't be
> > >  * unlogged anyway.
> > >  */
> > > AlterTableChangeCatalogToLoggedOrUnlogged(newrel);
> > > FlushRelationBuffers(newrel);
> > > /* copy heap data into newrel */
> > > finish_heap_swap();
> > >
> > > And then in swap_relation_files() also copy the persistency. > > > > > > That's the best I can come up with right now at least. > > > > Isn't it better if we can set the relpersistence as an argument to "make_new_heap"? > > > > I'm thinking to change make_new_heap:
> >
> > From:
> >
> > make_new_heap(Oid OIDOldHeap, Oid NewTableSpace, bool forcetemp,
> >               LOCKMODE lockmode)
> >
> > To:
> >
> > make_new_heap(Oid OIDOldHeap, Oid NewTableSpace, char relpersistence,
> >               LOCKMODE lockmode)
> >
> > That way we can create the new heap with the appropriate relpersistence, so swap_relation_files also copies the persistency, of course. > > > > And after copying the heap data to the new table (ATRewriteTable), change the relpersistence of the OldHeap's indexes, because in "finish_heap_swap" they'll be rebuilt. > > > > Thoughts? > > The attached patch implements my previous idea based on Andres' thoughts.

I didn't like the last version of the patch, especially this part:

+	/* check if SetUnlogged or SetLogged exists in subcmds */
+	for (pass = 0; pass < AT_NUM_PASSES; pass++)
+	{
+		List	   *subcmds = tab->subcmds[pass];
+		ListCell   *lcmd;
+
+		if (subcmds == NIL)
+			continue;
+
+		foreach(lcmd, subcmds)
+		{
+			AlterTableCmd *cmd = (AlterTableCmd *) lfirst(lcmd);
+
+			if (cmd->subtype == AT_SetUnLogged || cmd->subtype == AT_SetLogged)
+			{
+				/*
+				 * Change the temporary relation to be unlogged/logged. We have to do
+				 * that here so buffers for the new relfilenode will have the right
+				 * persistency set while the original filenode's buffers won't get read
+				 * in with the wrong (i.e. new) persistency setting. Otherwise a
+				 * rollback after the rewrite would possibly result with buffers for the
+				 * original filenode having the wrong persistency setting.
+				 *
+				 * NB: This relies on swap_relation_files() also swapping the
+				 * persistency. That wouldn't work for pg_class, but that can't be
+				 * unlogged anyway.
+				 */
+				if (cmd->subtype == AT_SetUnLogged)
+					newrelpersistence = RELPERSISTENCE_UNLOGGED;
+
+				isSetLoggedUnlogged = true;
+			}
+		}
+	}

So I did a refactoring, adding new items to AlteredTableInfo to pass the information through the phases.

Regards, -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL >> Timbira: http://www.timbira.com.br >> Blog sobre TI: http://fabriziomello.blogspot.com >> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello >> Twitter: http://twitter.com/fabriziomello

diff --git a/doc/src/sgml/ref/alter_table.sgml b/doc/src/sgml/ref/alter_table.sgml
index 69a1e14..2d131df 100644
--- a/doc/src/sgml/ref/alter_table.sgml
+++ b/doc/src/sgml/ref/alter_table.sgml
@@ -59,16 +59,17 @@ ALTER TABLE [ IF EXISTS ] name
     ENABLE ALWAYS RULE rewrite_rule_name
     CLUSTER ON index_name
     SET WITHOUT CLUSTER
+    SET {LOGGED | UNLOGGED}
     SET WITH OIDS
     SET WITHOUT OIDS
     SET ( storage_parameter =
Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()
On Wed, Jul 23, 2014 at 1:10 PM, Alvaro Herrera wrote: > I had two thoughts: > > 1. Should we consider making levenshtein available to frontend programs > as well as backend? I don't think so. Why would that be useful? > 2. Would it provide better matching to use Damerau-Levenshtein[1] instead > of raw Levenshtein? Maybe that would be marginally better than classic Levenshtein distance, but I doubt it would pay for itself. It's just more code to maintain. Are we really expecting to not get the best possible suggestion due to some number of transposition errors very frequently? You still have to have a worse suggestion spuriously get ahead of yours, and typically there just aren't that many to begin with. I'm not targeting spelling errors so much as thinkos around plurals and whether or not an underscore was used. Damerau-Levenshtein seems like an algorithm with fairly specialized applications. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
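For concreteness, the difference between the two algorithms is a single extra case in the inner loop. A self-contained sketch of the restricted Damerau-Levenshtein ("optimal string alignment") distance, with unit costs and a small fixed buffer - none of fuzzystrmatch's cost parameters or MAX_LEVENSHTEIN_STRLEN handling:

    #include <stdio.h>
    #include <string.h>

    static int
    dl_distance(const char *s, const char *t)
    {
        int     m = (int) strlen(s);
        int     n = (int) strlen(t);
        int     d[64][64];
        int     i, j;

        if (m >= 64 || n >= 64)
            return -1;          /* out of scope for this sketch */

        for (i = 0; i <= m; i++)
            d[i][0] = i;
        for (j = 0; j <= n; j++)
            d[0][j] = j;

        for (i = 1; i <= m; i++)
        {
            for (j = 1; j <= n; j++)
            {
                int     cost = (s[i - 1] == t[j - 1]) ? 0 : 1;
                int     best = d[i - 1][j] + 1;         /* deletion */

                if (d[i][j - 1] + 1 < best)
                    best = d[i][j - 1] + 1;             /* insertion */
                if (d[i - 1][j - 1] + cost < best)
                    best = d[i - 1][j - 1] + cost;      /* substitution */

                /* the Damerau extra: a swap of two adjacent characters */
                if (i > 1 && j > 1 &&
                    s[i - 1] == t[j - 2] && s[i - 2] == t[j - 1] &&
                    d[i - 2][j - 2] + 1 < best)
                    best = d[i - 2][j - 2] + 1;

                d[i][j] = best;
            }
        }
        return d[m][n];
    }

    int
    main(void)
    {
        /* one adjacent transposition: prints 1 here, but 2 under plain Levenshtein */
        printf("%d\n", dl_distance("relname", "relnmae"));
        return 0;
    }

Dropping the final if-block turns this back into classic Levenshtein, which is the sense in which the transposition handling is the only thing at stake in this subthread.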
Re: [HACKERS] PDF builds broken again
On Wed, Jul 23, 2014 at 10:15 PM, Andrew Dunstan wrote: > > On 07/23/2014 06:31 AM, Magnus Hagander wrote: >> >> >> >> Do we actually have any buildfarm boxes building the PDFs? And if so, >> any idea why they didn't catch it? > > > > AFAIK, nobody's ever asked for such a thing. The docs optional step just > builds the default docs target, which is simply the HTML docs. > > It should be possible, just a SMOC. I think it would be very useful to have. Given how long it takes to build, running it not all the time but every couple of days or weekly or so would be quite useful. Then we'd catch things earlier and not have to spend as much time trying to figure out exactly what broke... -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] System shutdown signal on Windows (was Re: [GENERAL])
Krystian Bigaj writes: > - when pg_console_handler receives CTRL_SHUTDOWN_EVENT from OS, then it > calls pg_queue_signal(SIGINT). > Problems: > - when OS is in shutdown path, then it sends CTRL_SHUTDOWN_EVENT, and *all* > Postgres processes (main and sub/forked) will call pg_queue_signal(SIGINT) > - so main and sub processes will start to shutdown independently? Can this > have any bad consequences? Hm. We ought to have that sending SIGTERM instead, so as to mimic the situation when Unix "init" is trying to shut down the system. It might be that SIGINT will more or less work, but the postmaster logic is designed to work with global SIGTERM as being the clean-up-ASAP trigger. As an example, backends servicing remote applications (which will *not* have already gotten killed) would not exit in response to SIGINT. > I think that CTRL_SHUTDOWN_EVENT should be removed from pg_console_handler, That does not sound like a good idea, at least not if Windows has the same behavior as "init" does of proceeding to hard kills after some grace period. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
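Tom's suggestion would be a one-line change to the backend's Windows console handler. A hedged sketch of src/backend/port/win32/signal.c's pg_console_handler with CTRL_SHUTDOWN_EVENT remapped (the event names and pg_queue_signal() are real; treat the exact control flow as an illustration, not the committed code):

    static BOOL WINAPI
    pg_console_handler(DWORD dwCtrlType)
    {
        if (dwCtrlType == CTRL_C_EVENT ||
            dwCtrlType == CTRL_BREAK_EVENT ||
            dwCtrlType == CTRL_CLOSE_EVENT)
        {
            pg_queue_signal(SIGINT);
            return TRUE;
        }
        if (dwCtrlType == CTRL_SHUTDOWN_EVENT)
        {
            /* mimic Unix init: a system shutdown is SIGTERM, not SIGINT */
            pg_queue_signal(SIGTERM);
            return TRUE;
        }
        return FALSE;
    }

As Tom notes, the payoff is that the postmaster's existing SIGTERM ("smart shutdown") machinery, including backends serving remote clients, handles the system-shutdown grace period the same way it does under init.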
Re: [HACKERS] PDF builds broken again
On 07/23/2014 06:31 AM, Magnus Hagander wrote: Do we actually have any buildfarm boxes building the PDFs? And if so, any idea why they didn't catch it? AFAIK, nobody's ever asked for such a thing. The docs optional step just builds the default docs target, which is simply the HTML docs. It should be possible, just a SMOC. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()
Peter Geoghegan wrote: > For some reason I thought that that was what Michael was proposing - a > more comprehensive move of code into core than the structuring that I > proposed. I actually thought about a Levenshtein distance operator at > one point months ago, before I entirely gave up on that. The > MAX_LEVENSHTEIN_STRLEN limitation made me think that the Levenshtein > distance functions are not suitable for core as is (although that > doesn't matter for my purposes, since all I need is something that > accommodates NAMEDATALEN sized strings). MAX_LEVENSHTEIN_STRLEN is a > considerable limitation for an in-core feature. I didn't get around to > forming an opinion on how and if that should be fixed. I had two thoughts: 1. Should we consider making levenshtein available to frontend programs as well as backend? 2. Would it provide better matching to use Damerau-Levenshtein[1] instead of raw Levenshtein? .oO(Would anyone be so bold as to attempt to implement bitap[2] using bitmapsets ...) [1] http://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance [2] http://en.wikipedia.org/wiki/Bitap_algorithm -- Álvaro Herrera      http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Shapes on the regression test for polygon
Robert Haas writes: > On Mon, Jul 21, 2014 at 10:52 AM, Emre Hasegeli wrote: >> The first two shapes on src/test/regress/sql/polygon.sql do not make >> sense to me. > Well, I think the number of tabs that makes them look best depends on > your tab-stop setting. At present, I find that with 8-space tabs > things seem to line up pretty well, whereas with your patch, 4-space > tabs line up well. Perhaps we should expand tabs to spaces in those ascii-art diagrams? > Personally, though, my vote would be to just leave it the way it is. > I don't think there's really a problem here in need of solving. I concur with doubting that changing the actual regression test cases is a good thing to do, at least not without more analysis. But "the pictures make no sense with the wrong tab setting" is a pretty simple issue, and one that I can see biting people. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Making joins involving ctid work for the benefit of UPSERT
On Wed, Jul 23, 2014 at 9:58 AM, Robert Haas wrote: > This all seems completely to one side of Andres's point. I think what > he's saying is: don't implement an SQL syntax of the form INSERT ON > CONFLICT and let people use that to implement upsert. Instead, > directly implement an SQL command called UPSERT or similar. That's > long been my intuition as well. I think generality here is not your > friend. Sure, I was just making one fairly narrow point in relation to Andres' concern about executor structuring of the UPSERT. > I'd suggest something like: > > UPSERT table SET col = value [, ...] WHERE predicate; Without dismissing the overall concern shared by you and Andres, that particular update-driven syntax doesn't strike me as something that should be pursued. I would like to be able to insert or update more than a single row at a time, for one thing. For another, what exactly appears in the predicate? Is it more or less the same as an UPDATE's predicate? > with semantics like this: > > 1. Search the table, using any type of scan you like, for a row > matching the given predicate. Perhaps I've misunderstood, but this is fundamentally different to what I'd always thought would need to happen. I always understood that the UPSERT should be insert-driven, and so not really have an initial scan at all in the same sense that every Insert lacks one. Moreover, I've always thought that everyone agreed on that. We go to insert, and then in the course of inserting index tuples do something with dirty snapshots. That's how we get information on conflicts. For one thing we cannot use any kind of scan unless there is a new mechanism for seeing not-yet-visible tuples, like some kind of non-MVCC snapshot that those scan nodes can selectively use. Now, I guess that you intend that in this scenario you go straight to 5, and then your insert finds the not-yet-visible conflict. But it's not clear that when 5 fails, and we pick up from 1 that we then get a new MVCC Snapshot or something else that gives us a hope of succeeding this time around. And it seems quite wasteful to not use knowledge of conflicts from the insert at all - that's one thing I'm trying to address here; removing unnecessary index scans when we actually know what our conflict TIDs are. > 2. If you find more than one tuple that is visible to your scan, error. This point seems to concern making the UPSERT UPDATE only affect zero or one rows. Is that right? If so, why do you think that's a useful constraint to impose? > 3. If you find exactly one tuple that is visible to your scan, update it. > Stop. > 4. If you find no tuple that is visible to your scan, but you do find > at least one tuple whose xmin has committed and whose xmax has not > committed, then (4a) if the isolation level is REPEATABLE READ or > higher, error; (4b) if there is more than one such tuple, error; else > (4b) update it, in violation of normal MVCC visibility rules. Point 4b ("if there is more than one such tuple...") seems like it needs some practical or theoretical justification - do you just not want to follow an update chain? If so, why? What would the error message actually be? And, to repeat myself: Why should an MVCC snapshot see more than one such tuple in the ordinary course of scanning a relation, since there is by definition a unique constraint that prevents this? Or maybe I'm wrong; maybe you don't even require that. That isn't clear. As you know, I've always opposed any type of READ COMMITTED serialization failure. If you're worried about lock starvation hazards - although I don't think strong concerns here are justified - we can always put in a fail-safe number of retries before throwing an error. I'm comfortable with that because I think, based on experimentation, that it's going to be virtually impossible to hit. Apparently other snapshot isolation databases acquire a new snapshot and re-do the command, rather than using an EvalPlanQual()-like mechanism and thereby violating snapshot isolation. Those other systems have just such a hard limit on retries before erroring, and it seems to work out okay for them. > 5. Construct a new tuple using the col/value pairs from the SET clause > and try to insert it. If this would fail with a unique index > violation, back out and go back to step 1. Again, I can't see why you'd do this step last and not first, since this is naturally the place to initially "see into the future" with a dirty snapshot. > Having said all that, I believe the INSERT ON CONFLICT syntax is more > easily comprehensible than previous proposals. But I still tend to > agree with Andres that an explicit UPSERT syntax or something like it, > that captures all of the MVCC games inside itself, is likely > preferable from a user standpoint, whatever the implementation ends up > looking like. Okay then. If you both feel that way, I will come up with something closer to what you sketch. But for now I still strongly feel it ought to be driven by an insert. Perhaps I've misunderstood you entirely, though.
Re: [HACKERS] Least Active Transaction ID function
Hi All, On Wed, Jul 23, 2014 at 5:01 PM, Rohit Goyal wrote: > Hi All, > > I am doing programming with postgresql source code. I want to find out the > function which can give me the least active transaction ID currently in the > system. > > Is there any function which can give me that? > > Regards, > Rohit Goyal > 1> I know that somewhere there is an active transaction list in postgresql. At any point in time, I want to get the smallest transaction present in this active tx list, or the transaction ID such that all transactions smaller than it are committed/aborted. Is there any function which can give me this value? 2> I found a function, *GetStableLatestTransactionId()*; please tell me what this function returns. I was not able to understand the description given above it. Thanks for your support in advance :)!! Regards, Rohit Goyal
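For anyone following along in the source, procarray.c already exposes functions close to what is being asked for here. A minimal sketch - the signatures shown are from the 9.4-era tree and may differ in other versions:

	#include "postgres.h"
	#include "storage/procarray.h"

	/*
	 * Sketch: two ways to approximate the "least active transaction ID".
	 * GetOldestActiveTransactionId() returns the oldest XID still
	 * running; GetOldestXmin() returns a cutoff XID below which all
	 * transactions are known committed or aborted, which is what
	 * question 1 above asks for.
	 */
	static void
	show_oldest_xids(void)
	{
		TransactionId oldest_running = GetOldestActiveTransactionId();
		TransactionId oldest_xmin = GetOldestXmin(NULL, true);

		elog(LOG, "oldest running xid: %u, oldest xmin: %u",
			 oldest_running, oldest_xmin);
	}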
Re: [HACKERS] Shapes on the regression test for polygon
On Mon, Jul 21, 2014 at 10:52 AM, Emre Hasegeli wrote: > The first two shapes on src/test/regress/sql/polygon.sql do not make > sense to me. They look more like polygons with some more tabs, > but still did not match the coordinates. I changed them to make > consistent with the shapes. I believe this was the intention of > the original author. Patch attached. Well, I think the number of tabs that makes them look best depends on your tab-stop setting. At present, I find that with 8-space tabs things seem to line up pretty well, whereas with your patch, 4-space tabs line up well. Either way, I have no idea what the picture is supposed to mean, because it looks to me like the original picture has circles at (3,3) and (2,1) and plus signs at (2,2), (3,1), and (4,0). With your patch applied, the circle at (2,1) has moved to (2,0.5). I can't correlate any of this to the test cases you modified (and which I see no reason to modify, whether they match the picture or not). And really, if the diagram is confusing, let's just remove it. We don't really need our regression test comments to illustrate what a triangle looks like, or how to do bad ASCII art. Personally, though, my vote would be to just leave it the way it is. I don't think there's really a problem here in need of solving. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Checkpointer crashes on slave in 9.4 on windows
On Mon, Jul 21, 2014 at 4:16 AM, Amit Kapila wrote: > During internal tests, we observed that the checkpointer > crashes on the slave with the below log, on > Windows: > > LOG: checkpointer process (PID 4040) was terminated by exception 0xC005 > HINT: See C include file "ntstatus.h" for a description of the hexadecimal > value. > LOG: terminating any other active server processes > > I debugged and found that it happens when the checkpointer > tries to update the shared memory config; below is the > call stack. > >> postgres.exe!LWLockAcquireCommon(LWLock * l=0x, LWLockMode >> mode=LW_EXCLUSIVE, unsigned __int64 * valptr=0x0020, unsigned >> __int64 val=18446744073709551615) Line 579 + 0x14 bytes C > postgres.exe!LWLockAcquireWithVar(LWLock * l=0x, unsigned > __int64 * valptr=0x0020, unsigned __int64 > val=18446744073709551615) Line 510 C > postgres.exe!WALInsertLockAcquireExclusive() Line 1627 C > postgres.exe!UpdateFullPageWrites() Line 9037 C > postgres.exe!UpdateSharedMemoryConfig() Line 1364 C > postgres.exe!CheckpointerMain() Line 359 C > postgres.exe!AuxiliaryProcessMain(int argc=2, char * * > argv=0x007d2180) Line 427 C > postgres.exe!SubPostmasterMain(int argc=4, char * * > argv=0x007d2170) Line 4635 C > postgres.exe!main(int argc=4, char * * argv=0x007d2170) Line 207 > C > > Basically, the issue is that during startup, when the > checkpointer tries to acquire the WAL insertion locks to > update the value of fullPageWrites, it crashes because > they are still not initialized. They will be initialized in > InitXLOGAccess(), which will get called via RecoveryInProgress() > (in case recovery is in progress) before doing the actual checkpoint. > However, we are trying to access them before that, which leads to > the crash. > > I think the reason it occurs only on Windows is that > on Linux, fork() ensures that the WAL insertion locks are > initialized with the same values as in the postmaster. > > To fix this issue, we need to ensure that the WAL insertion > locks get initialized before we use them. So one way is to > call InitXLOGAccess() before calling > CheckpointerMain(), as I have done in the attached patch; another > could be to call RecoveryInProgress() much earlier in the path > than now. So, this problem was introduced by Heikki's commit, 68a2e52bbaf98f136a96b3a0d734ca52ca440a95, to replace XLogInsert slots with regular LWLocks. I think the problem is that the initialization code really doesn't belong in InitXLOGAccess at all: 1. I think WALInsertLocks is just another global variable that needs to be saved and restored in EXEC_BACKEND mode and that it therefore ought to participate in the save_backend_variables() mechanism instead of having its own special-purpose mechanism to save and restore the value. 2. And I think that the LWLockRegisterTranche call belongs in XLOGShmemInit(), so that it's parallel to the other call in CreateLWLocks. I think that would be more robust, because while your fix will definitely work, we could easily reintroduce a similar platform-specific bug for some other auxiliary process. Using the mechanisms described above will mean that this is set up properly for everything that's attached to shared memory at all. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
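To make Robert's point 1 concrete: in EXEC_BACKEND builds, postmaster.c snapshots a set of global variables into a BackendParameters struct that the child process reads back after exec(). A hypothetical sketch of what including WALInsertLocks in that mechanism might look like - field placement and the elided argument lists are illustrative, not the committed fix:

	/* postmaster.c (sketch) */
	typedef struct
	{
		/* ... existing fields such as UsedShmemSegAddr, MainLWLockArray ... */
		WALInsertLockPadded *WALInsertLocks;	/* new: pointer into shmem */
	} BackendParameters;

	static bool
	save_backend_variables(BackendParameters *param /* , ... */ )
	{
		/* ... existing assignments ... */
		param->WALInsertLocks = WALInsertLocks;		/* saved for the child */
		return true;
	}

	static void
	restore_backend_variables(BackendParameters *param /* , ... */ )
	{
		/* ... existing assignments ... */
		WALInsertLocks = param->WALInsertLocks;		/* restored before use */
	}

Note that WALInsertLocks is currently static to xlog.c, so it would need to be exported, or given accessor functions, for this to compile.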
Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()
On Wed, Jul 23, 2014 at 8:57 AM, Robert Haas wrote: > There are several possible methods of doing that, but I think the best > one is just to leave the SQL-callable C functions in fuzzystrmatch and > move only the underlying code that supports them into core. For some reason I thought that that was what Michael was proposing - a more comprehensive move of code into core than the structuring that I proposed. I actually thought about a Levenshtein distance operator at one point months ago, before I entirely gave up on that. The MAX_LEVENSHTEIN_STRLEN limitation made me think that the Levenshtein distance functions are not suitable for core as is (although that doesn't matter for my purposes, since all I need is something that accommodates NAMEDATALEN-sized strings). MAX_LEVENSHTEIN_STRLEN is a considerable limitation for an in-core feature. I didn't get around to forming an opinion on whether and how that should be fixed. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] IS NOT DISTINCT FROM + Indexing
Jonathan S. Katz wrote: > Well that definitely answers "how hard would it be." - before > embarking on something laborious (as even just indexing is > nontrivial), I think it would be good to figure out how people are > using IS [NOT] DISTINCT FROM and if there is interest in having it be > indexable, let alone used in a JOIN optimization. It could become a > handy tool to simplify the SQL in queries that are returning a lot of > NULL / NOT NULL data mixed together. FWIW my project (abandoned for now, but I intend to get back to it someday) to catalog NOT NULL constraints in pg_constraint had me rewriting them into some kind of IS DISTINCT FROM NULL expression. (It was IS NOT NULL initially until somebody pointed out that that wouldn't work for composite type columns). I'm not sure how that interacts with what you're doing here, maybe not at all; I thought I'd mention it. -- Álvaro Herrera    http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PDF builds broken again
I wrote: > Oh ... the TUG page now has a recipe for finding the problem less > painfully, which I don't recall having seen before: > http://ftp.tug.org/mail-archives/pdftex/2002-February/002216.html > In short, you can add a "draft" option that lets PDF output get generated > anyway, and then you can actually see exactly what link is getting > split. Unfortunately, that recipe seems to have little to do with current reality :-(. Perhaps there's a \usepackage{hyperref} somewhere in the style files JadeTeX provides, but it's sure not in the tex-pdf files we generate, so editing it isn't a convenient solution. What I did to locate the problem was to add some garbage text in the para you identified, so that the A4 PDF would build again, and then look at the output. The problem turns out to be in the *next* para, which in A4 format breaks like this: pg_relation_size accepts the OID or name of a table, index or toast table, and returns the on- disk size in bytes. Specifying 'main' or leaving out the second argument returns the size of the main data fork of the relation. Specifying 'fsm' returns the size of the Free Space Map (see Section 54.3) associated with the relation. Specifying 'vm' returns the size of the Visibility Map (see Sec- tion 54.4) associated with the relation. Note that this function shows the size of only one fork; for most purposes it is more convenient to use the higher-level functions pg_total_relation_size or pg_table_size. So both the FSM and VM section-reference hyperlinks break across lines, and one or the other is falling across a page boundary in the problematic build. I think we'd better rearrange this para to avoid problems in future, especially in case somebody decides to invent more forks. I remember thinking that the possible options would look better if broken out as an itemizedlist anyway. Let me try that ... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] IS NOT DISTINCT FROM + Indexing
Jonathan S. Katz wrote: > before embarking on something laborious (as even just indexing > is nontrivial), I think it would be good to figure out how people > are using IS [NOT] DISTINCT FROM and if there is interest in > having it be indexable, let alone used in a JOIN optimization. > It could become a handy tool to simplify the SQL in queries that > are returning a lot of NULL / NOT NULL data mixed together. To prevent subtle inconsistencies, I think we would need to limit support to data types with a btree opclass which uses "=" as the equality operator on indexes using that opclass (either by default or explicitly). That limitation would still allow a lot of useful cases. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
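A sketch of the kind of check Kevin describes - given an index column's opclass, verify that its btree equality member really is the operator spelled "=". get_opclass_family() and get_opfamily_member() are existing lsyscache helpers; the wrapper itself is hypothetical:

	#include "postgres.h"
	#include "access/skey.h"		/* BTEqualStrategyNumber */
	#include "utils/lsyscache.h"

	/*
	 * Hypothetical helper: does this btree opclass use eqop as its
	 * equality operator for the given type?
	 */
	static bool
	opclass_equality_matches(Oid opclass, Oid typid, Oid eqop)
	{
		Oid			opfamily = get_opclass_family(opclass);

		return OidIsValid(opfamily) &&
			get_opfamily_member(opfamily, typid, typid,
								BTEqualStrategyNumber) == eqop;
	}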
Re: [HACKERS] postgresql.auto.conf and reload
Tom Lane wrote: > Alvaro Herrera writes: > > Tom Lane wrote: > >> No, ALTER SYSTEM is there now and it needs to work right in its first > >> release. I will go fix this if nobody else does. > > > Just checking -- you didn't get around to dealing with this, right? > > Not yet... do you want it? I might give it a try, but not just yet. I'll let you know if I attempt anything. -- Álvaro Herrera    http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Index-only scans for multicolumn GIST
> That seems like a nonstarter :-(. Index-only scans don't have a license > to return approximations. If we drop the behavior for circles, how much > functionality do we have left? It should work with exact operator classes: box_ops, point_ops, range_ops, inet_ops. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
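The reason "exact" operator classes are the interesting subset: their compress step is lossless, so handing the stored leaf key back is enough to satisfy an index-only scan. A hypothetical identity "fetch" support function could be as small as this - the function name and calling convention are illustrative, since no such GiST interface exists yet at this point:

	#include "postgres.h"
	#include "access/gist.h"

	/*
	 * Sketch of an identity fetch function for an exact opclass such as
	 * box_ops: the leaf key is the original datum, so just hand it back.
	 * A lossy opclass (e.g. circles stored as bounding boxes) cannot do
	 * this, which is why such opclasses would have to be excluded.
	 */
	Datum
	gist_identity_fetch(PG_FUNCTION_ARGS)
	{
		GISTENTRY  *entry = (GISTENTRY *) PG_GETARG_POINTER(0);

		PG_RETURN_POINTER(entry);
	}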
Re: [HACKERS] Making joins involving ctid work for the benefit of UPSERT
On Sat, Jul 19, 2014 at 10:04 PM, Peter Geoghegan wrote: > On Fri, Jul 18, 2014 at 11:23 AM, Andres Freund > wrote: >> Meh. An understandable syntax wouldn't require the pullups with a special >> scan node and such. > > Well, in general ExecModifyTable()/ExecUpdate() trusts the tid passed > to match the qual in the query. Unless you're willing to give up on > the idea of having a qual on the update (other than something like an > ON CONFLICTS qual, obviously) I think you'll probably end up with some > kind of pseudo-scan node anyway, if only for consistency with how > things already work, to give you somewhere to show the Filter in > explain output and so on. ExecScan() probably needs to ExecQual(). > Inserts don't have scan nodes, but updates do, and so it seems pretty > odd to make the "Insert" node (a ModifyTable node) pullup from a > result node (as always), and the "Update" node (also a ModifyTable > node) treat the first ModifyTable node as its own pseudo-scan node, > when in fact we are scanning the heap in a manner not unlike a > conventional update. Or do you envision a second result node where an > update qual might be evaluated? I am not enthused about either > possibility. > > The idea of not having the ability to put a predicate on the update at > all, much like the MySQL thing isn't completely outrageous, but it > certainly isn't going to go down well with those that have already > expressed concerns about how much of a foot gun this could be. This all seems completely to one side of Andres's point. I think what he's saying is: don't implement an SQL syntax of the form INSERT ON CONFLICT and let people use that to implement upsert. Instead, directly implement an SQL command called UPSERT or similar. That's long been my intuition as well. I think generality here is not your friend. I'd suggest something like: UPSERT table SET col = value [, ...] WHERE predicate; with semantics like this: 1. Search the table, using any type of scan you like, for a row matching the given predicate. 2. If you find more than one tuple that is visible to your scan, error. 3. If you find exactly one tuple that is visible to your scan, update it. Stop. 4. If you find no tuple that is visible to your scan, but you do find at least one tuple whose xmin has committed and whose xmax has not committed, then (4a) if the isolation level is REPEATABLE READ or higher, error; (4b) if there is more than one such tuple, error; else (4c) update it, in violation of normal MVCC visibility rules. 5. Construct a new tuple using the col/value pairs from the SET clause and try to insert it. If this would fail with a unique index violation, back out and go back to step 1. Having said all that, I believe the INSERT ON CONFLICT syntax is more easily comprehensible than previous proposals. But I still tend to agree with Andres that an explicit UPSERT syntax or something like it, that captures all of the MVCC games inside itself, is likely preferable from a user standpoint, whatever the implementation ends up looking like. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
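Expressed as control flow, the five steps above look roughly like this - a sketch only, with invented helper names:

	/* Sketch of the proposed UPSERT semantics; helpers are invented names. */
	for (;;)
	{
		int			nvisible;
		int			ninflight;

		scan_for_matches(rel, predicate, &nvisible, &ninflight);	/* step 1 */

		if (nvisible > 1)
			elog(ERROR, "predicate matched more than one row");		/* step 2 */
		if (nvisible == 1)
		{
			update_matched_tuple(rel, setlist);						/* step 3 */
			break;
		}
		if (ninflight > 0)											/* step 4 */
		{
			if (IsolationUsesXactSnapshot() || ninflight > 1)		/* 4a, 4b */
				elog(ERROR, "could not serialize access due to concurrent update");
			update_inflight_tuple(rel, setlist);	/* 4c: bends MVCC rules */
			break;
		}
		if (try_insert(rel, setlist))								/* step 5 */
			break;
		/* unique index violation: back out and retry from step 1 */
	}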
Re: [HACKERS] gaussian distribution pgbench -- splits v4
Hello Robert, Some review comments: Thanks a lot for your review. Please find attached two new parts of the patch (A for setrandom extension, B for pgbench embedded test case extension). 1. I suggest that getExponentialrand and getGaussianrand be renamed to getExponentialRand and getGaussianRand. Done. It was named like that because "getrand" was used for the uniform case. 2. I suggest that the code be changed so that the branch currently labeled as /* uniform with extra argument */ become a hard error instead of a warning. 3. Similarly, I suggest that the use of gaussian or uniform be an error when argc < 6 OR argc > 6. I also suggest that the parenthesized distribution type be dropped from the error message in all cases. I wish to agree, but my interpretation of the previous code is that they were ignored before, so ISTM that we are stuck with keeping the same unfortunate behavior. 4. This question mark seems like it should be a period: + * value fails the test? To be on the safe side, let us try over. Indeed. 5. With regards to the following paragraph: + The default random distribution is uniform, that is all values in the + range are drawn with equal probability. The gaussian and exponential + options allow to change this default. The mandatory + threshold double value controls the actual distribution + with gaussian or exponential. + This paragraph needs a bit of copy-editing. Here's an attempt: "By default, all values in the range are drawn with equal probability. The gaussian and exponential options modify this behavior; each requires a mandatory threshold which determines the precise shape of the distribution." The following paragraph should be changed to begin with "For a Gaussian distribution" and the one after "For an exponential distribution". Ok. I've kept "uniform" in the first sentence, because this is both an option name and it is significant in terms of probabilities. 6. Overall, I think the documentation here looks much better now, but I suggest adding one or two examples to the Gaussian section. Like this: for example, if threshold is 2.0, 68% of the values will fall in the middle third of the interval; with a threshold of 3.0, 99.7% of the values will fall in the middle third of the interval. These numbers are fabricated, and the middle third of the interval might not be the best part to talk about, but you get the idea (I hope). Done, with threshold value 4.0, so I have a "middle quarter" and a "middle half". 
-- Fabien.

diff --git a/contrib/pgbench/README b/contrib/pgbench/README
new file mode 100644
index 000..6881256
--- /dev/null
+++ b/contrib/pgbench/README
@@ -0,0 +1,5 @@
+# gaussian and exponential tests
+# with XXX as "expo" or "gauss"
+psql test < test-init.sql
+./pgbench -M prepared -f test-XXX-run.sql -t 100 -P 1 -n test
+psql test < test-XXX-check.sql
diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index 4aa8a50..e07206a 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -98,6 +98,8 @@ static int pthread_join(pthread_t th, void **thread_return);
 #define LOG_STEP_SECONDS 5 /* seconds between log messages */
 #define DEFAULT_NXACTS 10 /* default nxacts */
+#define MIN_GAUSSIAN_THRESHOLD 2.0 /* minimum threshold for gauss */
+
 int nxacts = 0; /* number of transactions per client */
 int duration = 0; /* duration in seconds */
@@ -471,6 +473,76 @@ getrand(TState *thread, int64 min, int64 max)
 	return min + (int64) ((max - min + 1) * pg_erand48(thread->random_state));
 }
+/*
+ * random number generator: exponential distribution from min to max inclusive.
+ * the threshold is so that the density of probability for the last cut-off max
+ * value is exp(-threshold).
+ */
+static int64
+getExponentialRand(TState *thread, int64 min, int64 max, double threshold)
+{
+	double cut, uniform, rand;
+	Assert(threshold > 0.0);
+	cut = exp(-threshold);
+	/* erand in [0, 1), uniform in (0, 1] */
+	uniform = 1.0 - pg_erand48(thread->random_state);
+	/*
+	 * inner expression in (cut, 1] (if threshold > 0),
+	 * rand in [0, 1)
+	 */
+	Assert((1.0 - cut) != 0.0);
+	rand = - log(cut + (1.0 - cut) * uniform) / threshold;
+	/* return int64 random number between min and max */
+	return min + (int64)((max - min + 1) * rand);
+}
+
+/* random number generator: gaussian distribution from min to max inclusive */
+static int64
+getGaussianRand(TState *thread, int64 min, int64 max, double threshold)
+{
+	double stdev;
+	double rand;
+
+	/*
+	 * Get user specified random number from this loop, with
+	 * -threshold < stdev <= threshold
+	 *
+	 * This loop is executed until the number is in the expected range.
+	 *
+	 * As the minimum threshold is 2.0, the probability of looping is low:
+	 * sqrt(-2 ln(r)) <= 2 => r >= e^{-2} ~ 0.135, then when taking the average
+	 * sine multiplier as 2/pi, we have an 8.6% looping probability in the
+	 * worst case. For a 5.0
Re: [HACKERS] [RFC] Should smgrtruncate() avoid sending sinval message for temp relations
From: "Tom Lane" This seems like a pretty unsafe suggestion, because the smgr level doesn't know or care whether relations are temp; files are files. In any case it would only paper over one specific instance of whatever problem you're worried about, because sinval messages definitely do need to be sent in general. I'm sorry I don't show the exact problem yet. Apart from that, I understood that you insist it's not appropriate for smgr to be aware of whether the file is a temporary relation, in terms of module layering. However, it doesn't seem so in the current implementation. md.c, which is a layer under or part of smgr, uses SmgrIsTemp(). In addition, as the name suggests, SmgrIsTemp() is a function of smgr, which is defined in smgr.h. So, it's not inappropriate for smgr to use it. What I wanted to ask is whether and why sinval messages are really necessary for session-private objects like temp relations. I thought shared inval is, as the name indicates, for objects "shared" among sessions. If so, sinval for session-private objects doesn't seem to match the concept. Regards MauMau -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()
Robert Haas writes: > There are several possible methods of doing that, but I think the best > one is just to leave the SQL-callable C functions in fuzzystrmatch and > move only the underlying code that supports them into core. I hadn't been paying close attention to this thread, but I'd just assumed that that would be the approach. It might be worth introducing new differently-named pg_proc entries for the same functions in core, but only if we can agree that there are better names for them than what the extension uses. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()
On Thu, Jul 17, 2014 at 9:34 AM, Michael Paquier wrote: > Patch 1 does a couple of things: > - fuzzystrmatch is bumped to 1.1, as Levenshtein functions are not part of > it anymore, and moved to core. > - Removal of the LESS_EQUAL flag that made the original submission patch > harder to understand. All the Levenshtein functions wrap a single common > function. > - Documentation is moved, and regression tests for Levenshtein functions are > added. > - Functions with costs are renamed with a suffix _with_costs. > After hacking this feature, I came up with the conclusion that it would be > better for the user experience to move directly into backend code all the > Levenshtein functions, instead of only moving in the common wrapper as Peter > did in his original patches. This is done this way to avoid keeping portions > of the same feature in two different places of the code (backend with common > routine, fuzzystrmatch with levenshtein functions) and concentrate all the > logic in a single place. Now, we may as well consider renaming the > levenshtein functions into smarter names, like str_distance, and keep > fuzzystrmatch at 1.0, having the functions levenshtein_* call only the > str_distance functions. This is not cool. Anyone who is running a 9.4 or earlier database using fuzzystrmatch and upgrades, either via dump-and-restore or pg_upgrade, to a version with this patch applied will have a broken database. They will still have the catalog entries for the 1.0 definitions, but those definitions won't be resolvable inside the new cluster's .so file. The user will get a fairly unfriendly error message that won't go away until they upgrade the extension, which may involve dealing with dependency hell since the new definitions are in a different place than the old definitions, and there may be dependencies on the old definitions. One of the great advantages of extension packaging is that this kind of problem is quite easily avoidable, so let's avoid it. There are several possible methods of doing that, but I think the best one is just to leave the SQL-callable C functions in fuzzystrmatch and move only the underlying code that supports them into core. Then, the whole thing will be completely transparent to users. They won't need to upgrade their fuzzystrmatch definitions at all, and everything will just work; under the covers, the fuzzystrmatch code will now be calling into core code rather than code located in that same module, but the user doesn't need to know or care about that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
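The shape Robert suggests, sketched out: the extension keeps its pg_proc entries and SQL-callable C functions, which become thin wrappers over a new core routine. The core function name used here, varstr_levenshtein(), is one plausible choice; treat it and its signature as assumptions:

	/* contrib/fuzzystrmatch/fuzzystrmatch.c (sketch) */
	#include "postgres.h"
	#include "fmgr.h"
	#include "utils/builtins.h"

	PG_FUNCTION_INFO_V1(levenshtein);

	Datum
	levenshtein(PG_FUNCTION_ARGS)
	{
		text	   *src = PG_GETARG_TEXT_PP(0);
		text	   *dst = PG_GETARG_TEXT_PP(1);

		/*
		 * All the real work now happens in core; the extension's
		 * SQL-visible definition is unchanged, so pg_upgrade'd clusters
		 * keep working.  Costs of 1/1/1 match the historical default.
		 */
		PG_RETURN_INT32(varstr_levenshtein(VARDATA_ANY(src),
										   VARSIZE_ANY_EXHDR(src),
										   VARDATA_ANY(dst),
										   VARSIZE_ANY_EXHDR(dst),
										   1, 1, 1));
	}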
Re: [HACKERS] gaussian distribution pgbench -- part 1/2
On Thu, Jul 17, 2014 at 12:09 AM, Fabien COELHO wrote: > pgbench with gaussian & exponential, part 1 of 2. > > This patch is a subset of the previous patch which only adds the two > new \setrandom gaussian and exponential variants, but not the > adapted pgbench test cases, as suggested by Fujii Masao. > There is no new code nor code changes. > > The corresponding documentation has been yet again extended wrt > the initial patch, so that what is achieved is hopefully unambiguous > (there are two mathematical formulas, tasty!), in answer to Andres Freund > comments, and partly to Robert Haas comments as well. > > This patch also provides several sql/pgbench scripts and a README, so > that the feature can be tested. I do not know whether these scripts > should make it to postgresql. I would say yes, otherwise there is no way > to test... > > part 2, which provides adapted pgbench test cases, will come later. Some review comments: 1. I suggest that getExponentialrand and getGaussianrand be renamed to getExponentialRand and getGaussianRand. 2. I suggest that the code be changed so that the branch currently labeled as /* uniform with extra argument */ become a hard error instead of a warning. 3. Similarly, I suggest that the use of gaussian or uniform be an error when argc < 6 OR argc > 6. I also suggest that the parenthesized distribution type be dropped from the error message in all cases. 4. This question mark seems like it should be a period: +* value fails the test? To be on the safe side, let us try over. 5. With regards to the following paragraph: + The default random distribution is uniform, that is all values in the + range are drawn with equal probability. The gaussian and exponential + options allow to change this default. The mandatory + threshold double value controls the actual distribution + with gaussian or exponential. + This paragraph needs a bit of copy-editing. Here's an attempt: "By default, all values in the range are drawn with equal probability. The gaussian and exponential options modify this behavior; each requires a mandatory threshold which determines the precise shape of the distribution." The following paragraph should be changed to begin with "For a Gaussian distribution" and the one after "For an exponential distribution". 6. Overall, I think the documentation here looks much better now, but I suggest adding one or two examples to the Gaussian section. Like this: for example, if threshold is 2.0, 68% of the values will fall in the middle third of the interval; with a threshold of 3.0, 99.7% of the values will fall in the middle third of the interval. These numbers are fabricated, and the middle third of the interval might not be the best part to talk about, but you get the idea (I hope). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
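For reference, the mapping the patch implements for the exponential case, reconstructed from the comments in getExponentialRand() (with $u$ uniform on $(0,1]$ and threshold $t > 0$):

\[
  \mathrm{rand} = -\,\frac{\ln\!\bigl(e^{-t} + (1 - e^{-t})\,u\bigr)}{t} \in [0, 1)
\]

and, for the Gaussian case, the middle-interval probabilities behind Fabien's choice of threshold 4.0 (values span roughly $(-4, 4]$ standard deviations, so, ignoring the tiny truncation correction):

\[
  P(\text{middle quarter}) \approx \Phi(1)-\Phi(-1) \approx 68.3\%, \qquad
  P(\text{middle half}) \approx \Phi(2)-\Phi(-2) \approx 95.4\%
\]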
Re: [HACKERS] Production block comparison facility
On 23 July 2014 15:14, Michael Paquier wrote: > I have spent some time digging more into this idea and finished with the > patch attached Thank you for investigating the idea. I'll review by Monday. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Should smgrtruncate() avoid sending sinval message for temp relations
"MauMau" writes: > Looking at smgrtruncate(), the sinval message is sent even when the > truncated relation is a temporary relation. However, I think the sinval > message is not necessary for temp relations, because each session doesn't > see the temp relations of other sessions. This seems like a pretty unsafe suggestion, because the smgr level doesn't know or care whether relations are temp; files are files. In any case it would only paper over one specific instance of whatever problem you're worried about, because sinval messages definitely do need to be sent in general. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Least Active Transaction ID function
Hi All, I am doing programming with postgresql source code. I want to find out the function which can give me the least active transaction ID currently in the system. Is there any function which can give me that? Regards, Rohit Goyal
Re: [HACKERS] PDF builds broken again
Magnus Hagander writes: > Have you figured out any way to actually track down which para has the > problem itself, or is it all manual work? Oh ... the TUG page now has a recipe for finding the problem less painfully, which I don't recall having seen before: http://ftp.tug.org/mail-archives/pdftex/2002-February/002216.html In short, you can add a "draft" option that lets PDF output get generated anyway, and then you can actually see exactly what link is getting split. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PDF builds broken again
Magnus Hagander writes: > On Wed, Jul 23, 2014 at 4:06 PM, Tom Lane wrote: >> A more robust fix would be to identify the para where the problem actually >> is and re-word it so that the link doesn't cross a *line* boundary (in >> either US or A4). That makes it safe as long as that particular para >> doesn't get reworded in future; whereas with what you did, addition or >> subtraction of a line anywhere in a pretty broad range could resurrect >> the issue. > Hmm. Good point. OTOH it only showed up in the backbranch (and only in > one of them), so I figured we might get away with it. > Have you figured out any way to actually track down which para has the > problem itself, or is it all manual work? My recollection is that it takes a bit of detective work but you can generally figure it out by eyeballing the TeX input file around the complained-of line number. The first trick is that our makefiles think the TeX input file is temp and delete it on failure, so you need to ask for it to be built, not the PDF file. The second trick is that the line number is not spot-on; the error seems to come out only when TeX decides where it's going to break the page, which it won't do until it has absorbed a few more lines than actually end up on the page. I think I've posted more details in some past thread on the issue ... oh, here: http://www.postgresql.org/message-id/9473.1296172...@sss.pgh.pa.us >> Of course, it would be a lot better if the toolchain didn't have this >> limitation (or at least managed to report it more usefully). I'm not >> holding my breath for that to happen though. > Yeah, they would probably have done it years ago if they were going to at > all... IIRC there actually was a patch that purported to fix this a year or so ago, but it didn't help :-( regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] [RFC] Should smgrtruncate() avoid sending sinval message for temp relations
Hello, I'm investigating a mysterious hang problem on PostgreSQL 9.2.8. If many sessions use temporary tables whose rows are deleted on commit, the hang occurs. I'd like to show you the stack trace, but I'm trying to figure out how to reproduce the problem. IIRC, the stack trace was as follows. The standby server was running normally.

...
SyncRepWaitForLSN
CommitTransaction
CommitTransactionCommand
ProcessCatchupEvent
HandleCatchupInterrupt
procsignal_sigusr1_handler
recv
...
ReadCommand
PostgresMain
...

Looking at smgrtruncate(), the sinval message is sent even when the truncated relation is a temporary relation. However, I think the sinval message is not necessary for temp relations, because each session doesn't see the temp relations of other sessions. So, the following code seems better. This avoids sinval queue overflow which leads to SIGUSR1. Is this correct?

if (SmgrIsTemp(reln))
    /* Do this on behalf of the sinval message handler */
    smgrclosenode(reln->smgr_rnode);
else
    CacheInvalidateSmgr(reln->smgr_rnode);

Regards MauMau -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Inconsistencies of service failure handling on Windows
On Wed, Jul 23, 2014 at 11:22 PM, Tom Lane wrote: > Was that a backend that you directly killed? Or the postmaster? The > subsequent connection failures suggest it was the postmaster. Killing > the postmaster is not a supported operation, not on Windows and not > anywhere else either. It's in the category of "doctor, it hurts when > I do this". > The headshot was done on random backends. Perhaps in some of those tests the postmaster was taken down though :) I didn't check postmaster.pid all the time. -- Michael
Re: [HACKERS] Inconsistencies of service failure handling on Windows
Michael Paquier writes: > While playing on Windows with services, I noticed an inconsistent behavior > in the way failures are handled when using a service for a Postgres > instance. > ... > However when a backend process is directly killed something different > happens. Was that a backend that you directly killed? Or the postmaster? The subsequent connection failures suggest it was the postmaster. Killing the postmaster is not a supported operation, not on Windows and not anywhere else either. It's in the category of "doctor, it hurts when I do this". regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Production block comparison facility
On Tue, Jul 22, 2014 at 4:49 PM, Michael Paquier wrote: > Then, looking at the code, we would need to tweak XLogInsert for the > WAL record construction to always do a FPW and to update > XLogCheckBufferNeedsBackup. Then for the redo part, we would need to > do some extra operations in the area of > RestoreBackupBlock/RestoreBackupBlockContents, including masking > operations before comparing the content of the FPW and the current > page. > > Does that sound right? > I have spent some time digging more into this idea and finished with the patch attached, doing the following: addition of a consistency check when a FPW is restored and applied on a given page. The consistency check is made of two phases: - Apply a mask on the FPW and the current page to eliminate potential conflicts like hint bits for example. - Check that the FPW is consistent with the current page, i.e. that the current page does not contain any new information that the FPW that was taken does not. This is done by checking the masked portions of the FPW and the current page. Also some more details: - If an inconsistency is found, a WARNING is simply logged. - The consistency check is done if the current page is not empty, and if the database has reached a consistent state. - The page masking API is taken from the WAL replay patch that was submitted in CF1 and plugged in as an independent set of APIs. - In the masking code, to facilitate checking whether a page is used by a sequence relation, SEQ_MAGIC as well as its opaque data structure are renamed and moved into sequence.h. - To facilitate debugging and comparison, the masked FPW and current page are also converted into hex. Things could be refactored and improved for sure, but this patch is already useful as-is so I am going to add it to the next commit fest. Comments are welcome. Regards, -- Michael

From 5a05a3751ae278ba8eb7b79f50a4f7b652a1179c Mon Sep 17 00:00:00 2001
From: Michael Paquier
Date: Wed, 23 Jul 2014 23:11:10 +0900
Subject: [PATCH] Add facility to check FPW consistency at WAL replay

The consistency check on FPW is made of two phases each time a FPW is restored in place of an existing page:
- Apply a mask on the FPW and the current page to eliminate potential conflicts like hint bits for example.
- Check that the FPW is consistent with the current page, i.e. that the current page does not contain any new information that the FPW that was taken does not. This is done by checking the masked portions of the FPW and the current page.
- If an inconsistency is found, a WARNING is simply logged.
This check is done only if the current page is not empty and if the database has reached a consistent state. 
---
 src/backend/access/transam/xlog.c    | 104 
 src/backend/commands/sequence.c      |  34 ++--
 src/backend/storage/buffer/Makefile  |   2 +-
 src/backend/storage/buffer/bufmask.c | 301 +++
 src/include/commands/sequence.h      |  13 ++
 src/include/storage/bufmask.h        |  21 +++
 6 files changed, 452 insertions(+), 23 deletions(-)
 create mode 100644 src/backend/storage/buffer/bufmask.c
 create mode 100644 src/include/storage/bufmask.h

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index e4f9595..a383126 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -47,6 +47,7 @@
 #include "replication/walsender.h"
 #include "storage/barrier.h"
 #include "storage/bufmgr.h"
+#include "storage/bufmask.h"
 #include "storage/fd.h"
 #include "storage/ipc.h"
 #include "storage/large_object.h"
@@ -4065,6 +4066,109 @@ RestoreBackupBlockContents(XLogRecPtr lsn, BkpBlock bkpb, char *blk,
 	page = (Page) BufferGetPage(buffer);
 
+	/*
+	 * Before saving the new contents of the backup block ensure that
+	 * it is consistent with the existing one. Apply masking to it and
+	 * then perform a comparison with what is going to be added. Do
+	 * only this operation once a consistent state has been reached by
+	 * server and only if the page that is being rewritten is not empty
+	 * to have a clean base for comparison.
+	 */
+	if (reachedConsistency &&
+		!PageIsEmpty(page))
+	{
+		RelFileNode rnode;
+		ForkNumber forknum;
+		BlockNumber blkno;
+		char *norm_new_page, *norm_old_page;
+		char *blk_data = (char *) palloc(BLCKSZ);
+		char old_buf[BLCKSZ * 2];
+		char new_buf[BLCKSZ * 2];
+		int j = 0;
+		int i;
+		bool inconsistent = false;
+
+		/*
+		 * Block data needs to be reformatted correctly, especially if
+		 * it has a hole. This is the same processing as below but
+		 * replicating this code saves memcpy calls if consistency
+		 * checks cannot be done on this backup block.
+		 */
+		if (bkpb.h
Re: [HACKERS] PDF builds broken again
On Wed, Jul 23, 2014 at 4:06 PM, Tom Lane wrote: > Magnus Hagander writes: >> On Wed, Jul 23, 2014 at 12:31 PM, Magnus Hagander >> wrote: >>> ! pdfTeX error (ext4): \pdfendlink ended up in different nesting level than >>> \pd > >> Additional point of info - the -US pdf's do build on this version, >> just not the -A4. > >> And with even more of those entries about overfull hbox, so clearly >> that was not the actual breakage. > > Yeah. What this actually is is the symptom of text crossing a page > boundary. The patch you made did not fix the problem (because there's no > hyperlink anywhere in that para); you just moved the problematic line > pair, which must be somewhere below here, up or down so it didn't fall > across a page break. Right - it fixed the symptoms only. (And now that you mention it I do remember the thing about ). > A more robust fix would be to identify the para where the problem actually > is and re-word it so that the link doesn't cross a *line* boundary (in > either US or A4). That makes it safe as long as that particular para > doesn't get reworded in future; whereas with what you did, addition or > subtraction of a line anywhere in a pretty broad range could resurrect > the issue. Hmm. Good point. OTOH it only showed up in the backbranch (and only in one of them), so I figured we might get away with it. Have you figured out any way to actually track down which para has the problem itself, or is it all manual work? > Of course, it would be a lot better if the toolchain didn't have this > limitation (or at least managed to report it more usefully). I'm not > holding my breath for that to happen though. Yeah, they would probably have done it years ago if they were going to at all... -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PDF builds broken again
Magnus Hagander writes: > On Wed, Jul 23, 2014 at 12:31 PM, Magnus Hagander wrote: >> ! pdfTeX error (ext4): \pdfendlink ended up in different nesting level than >> \pd > Additional point of info - the -US pdf's do build on this version, > just not the -A4. > And with even more of those entries about overfull hbox, so clearly > that was not the actual breakage. Yeah. What this actually is is the symptom of text crossing a page boundary. The patch you made did not fix the problem (because there's no hyperlink anywhere in that para); you just moved the problematic line pair, which must be somewhere below here, up or down so it didn't fall across a page break. A more robust fix would be to identify the para where the problem actually is and re-word it so that the link doesn't cross a *line* boundary (in either US or A4). That makes it safe as long as that particular para doesn't get reworded in future; whereas with what you did, addition or subtraction of a line anywhere in a pretty broad range could resurrect the issue. Of course, it would be a lot better if the toolchain didn't have this limitation (or at least managed to report it more usefully). I'm not holding my breath for that to happen though. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PDF builds broken again
On Wed, Jul 23, 2014 at 3:15 PM, Magnus Hagander wrote: > On Wed, Jul 23, 2014 at 1:31 PM, Magnus Hagander wrote: >> On Wed, Jul 23, 2014 at 12:31 PM, Magnus Hagander >> wrote: >>> It seems at least the 9.0 PDFs are broken (trying to build for the release): >>> >>> Lots of errors/warnings (and AFAIK no way to see which is which in the >>> output), but It hink this is the telltale as usual: >>> >>> Overfull \hbox (7.12454pt too wide) in paragraph at lines 88092--88092 >>> []\T1/pcr/m/n/9 CREATE FUNCTION getf1(myrowtype) RETURNS int AS 'SELECT >>> $1.f1' >>> LANGUAGE SQL;[] >>> [] >>> >>> and many more like it until >>> >>> Overfull \hbox (1.5pt too wide) in alignment at lines 241488--241741 >>> [] [] [] >>> [] >>> >>> [256.0.1 >>> ! pdfTeX error (ext4): \pdfendlink ended up in different nesting level than >>> \pd >>> fstartlink. >>> \AtBegShi@Output ...ipout \box \AtBeginShipoutBox >>> \fi \fi >>> l.241875 ...char95{}stat\char95{}file('filename'); >>> >>> >>> >>> Here is how much of TeX's memory you used: >>> 22467 strings out of 482156 >>> 171125 string characters out of 3785924 >>> 308594 words of memory out of 3085000 >>> 27304 multiletter control sequences out of 15000+50 >>> 80861 words of font info for 131 fonts, out of 300 for 9000 >>> 14 hyphenation exceptions out of 8191 >>> 30i,12n,43p,307b,1338s stack positions out of >>> 1500i,500n,1500p,20b,5s >>> ! ==> Fatal error occurred, no output PDF file produced! >>> >>> >>> >>> >>> Do we actually have any buildfarm boxes building the PDFs? And if so, >>> any idea why they didn't catch it? >>> >>> Do we have a reasonable way to figure out which commit actually broke >>> it, other than manually testing backing out each of the 11 commits >>> since 9.0.17? >> >> Additional point of info - the -US pdf's do build on this version, >> just not the -A4. >> >> And with even more of those entries about overfull hbox, so clearly >> that was not the actual breakage. > > And with some dissecting, the offending patch is > 6b2a1445ec8a631060c4cbff3f172bf31d3379b9. I ended up splitting the paragraph in two in order to get the PDFs to build. I've applied a patch for this to 9.0 only so we can keep building PDFs. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PDF builds broken again
On Wed, Jul 23, 2014 at 1:31 PM, Magnus Hagander wrote: > On Wed, Jul 23, 2014 at 12:31 PM, Magnus Hagander wrote: >> It seems at least the 9.0 PDFs are broken (trying to build for the release): >> >> Lots of errors/warnings (and AFAIK no way to see which is which in the >> output), but It hink this is the telltale as usual: >> >> Overfull \hbox (7.12454pt too wide) in paragraph at lines 88092--88092 >> []\T1/pcr/m/n/9 CREATE FUNCTION getf1(myrowtype) RETURNS int AS 'SELECT >> $1.f1' >> LANGUAGE SQL;[] >> [] >> >> and many more like it until >> >> Overfull \hbox (1.5pt too wide) in alignment at lines 241488--241741 >> [] [] [] >> [] >> >> [256.0.1 >> ! pdfTeX error (ext4): \pdfendlink ended up in different nesting level than >> \pd >> fstartlink. >> \AtBegShi@Output ...ipout \box \AtBeginShipoutBox >> \fi \fi >> l.241875 ...char95{}stat\char95{}file('filename'); >> >> >> >> Here is how much of TeX's memory you used: >> 22467 strings out of 482156 >> 171125 string characters out of 3785924 >> 308594 words of memory out of 3085000 >> 27304 multiletter control sequences out of 15000+50 >> 80861 words of font info for 131 fonts, out of 300 for 9000 >> 14 hyphenation exceptions out of 8191 >> 30i,12n,43p,307b,1338s stack positions out of >> 1500i,500n,1500p,20b,5s >> ! ==> Fatal error occurred, no output PDF file produced! >> >> >> >> >> Do we actually have any buildfarm boxes building the PDFs? And if so, >> any idea why they didn't catch it? >> >> Do we have a reasonable way to figure out which commit actually broke >> it, other than manually testing backing out each of the 11 commits >> since 9.0.17? > > Additional point of info - the -US pdf's do build on this version, > just not the -A4. > > And with even more of those entries about overfull hbox, so clearly > that was not the actual breakage. And with some dissecting, the offending patch is 6b2a1445ec8a631060c4cbff3f172bf31d3379b9. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SKIP LOCKED DATA (work in progress)
On Sun, Jun 29, 2014 at 9:01 PM, Thomas Munro wrote: > > Please find attached a rebased version of my SKIP LOCKED > patch (formerly SKIP LOCKED DATA), updated to support only the > Oracle-like syntax. > > Hi Thomas, Apologies for taking this long to get to reviewing this; I'd gotten a bit sidetracked with my own patches during this commitfest. I'm really glad to see this patch is back again. I think it will be very useful for processing queues. I could have made good use of it in my previous work, using it for sending unsent emails which were "queued" up in a table in the database. I've so far read over the patch and done only some brief tests of the functionality. Here's what I've picked up on so far: * In heap_lock_tuple() there are a few places where you test the wait_policy, but you're not always doing this in the same order. The previous code did if (nolock) first each time, but since there are now 3 values I think if (wait_policy == LockWaitError) should be first each time, as likely this is the most common case. * The following small whitespace change can be removed in gram.y: @@ -119,7 +119,6 @@ typedef struct PrivTarget #define CAS_NOT_VALID 0x10 #define CAS_NO_INHERIT 0x20 - #define parser_yyerror(msg) scanner_yyerror(msg, yyscanner) #define parser_errposition(pos) scanner_errposition(pos, yyscanner) * analyze.c. rc->waitPolicy = Max(rc->waitPolicy, waitPolicy); I'm not quite sure I understand this yet, but I'm guessing the original code was there so that something like: SELECT * FROM (SELECT * FROM a FOR UPDATE) a FOR UPDATE NOWAIT; would give: ERROR: could not obtain lock on row in relation "a" So it seems that, with the patch, as you've defined it by the order of the enum values in LockClauseWaitPolicy, SKIP LOCKED overrides FOR UPDATE. I'm wondering if this is dangerous. Should the following really skip locked tuples? SELECT * FROM (SELECT * FROM a FOR UPDATE SKIP LOCKED) a FOR UPDATE LIMIT 1; But on the other hand perhaps I've missed a discussion on this; if so, then I think the following comment should be updated to explain it all: * We also consider that NOWAIT wins if it's specified both ways. This * is a bit more debatable but raising an error doesn't seem helpful. * (Consider for instance SELECT FOR UPDATE NOWAIT from a view that * internally contains a plain FOR UPDATE spec.) * * plannodes.h -> RowWaitPolicy waitPolicy; /* NOWAIT and SKIP LOCKED DATA options */ Should be "NOWAIT and SKIP LOCKED options" since DATA has now been removed from the syntax. * nodeLockRows.c has an extra space in an if condition: else if (erm->waitPolicy == RWP_SKIP ) I'm also wondering about this block of code in general: if (erm->waitPolicy == RWP_WAIT) wait_policy = LockWaitBlock; else if (erm->waitPolicy == RWP_SKIP ) wait_policy = LockWaitSkip; else /* erm->waitPolicy == RWP_NOWAIT */ wait_policy = LockWaitError; Just after this code heap_lock_tuple() is called, and if that returns HeapTupleWouldBlock, the code does a goto lnext, which then will repeat that whole block again. I'm wondering why there are 2 enums for the same thing. If you just had 1, then you'd not need this block of code at all; you could just pass erm->waitPolicy to heap_lock_tuple(). 
* parsenodes.h comment does not meet project standards ( http://www.postgresql.org/docs/devel/static/source-format.html) typedef enum LockClauseWaitPolicy { /* order is important (see applyLockingClause which takes the greatest value when several wait policies have been specified), and values must match RowWaitPolicy from plannodes.h */ * parsenodes.h: remove "DATA" from LockClauseWaitPolicy waitPolicy; /* NOWAIT and SKIP LOCKED DATA */ I have noticed the /* TODO -- work out what needs to be released here */ comments in heap_lock_tuple(), and perhaps the lack of cleaning up is what is causing the following:

create table skiptest (id int primary key);
insert into skiptest (id) select x.x from generate_series(1,100) x(x);

Session 1: begin work; select * from skiptest for update limit 99;
Session 2: select * from skiptest for update skip locked limit 1;
WARNING: out of shared memory
ERROR: out of shared memory
HINT: You might need to increase max_locks_per_transaction.

Yet if I do:

session 1: begin work; select * from skiptest where id > 1 for update;
session 2: select * from skiptest for update skip locked limit 1;
id 1 (1 row)

That test makes me think your todo comments are in the right place; something is missing there for sure! * plays about with patch for a bit * I don't quite understand how heap_lock_tuple works, as, if I debug session 2 from the above set of queries, the first call to ConditionalLockTupleTuplock() (heapam.c line 4236) succeeds where I'd have thought it would fail, since session 1 should be locking this tuple. Later at line 4527, on the line if (!ConditionalXactLockTableWait(xwait)), it fails to grab the lock and returns HeapTupleWouldBlock. The above query seems to work ok if
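To illustrate the "one enum" suggestion from the review above: if parser, planner and executor shared a single wait-policy type - sketched below, with a hypothetical placement - the translation block in nodeLockRows.c disappears and erm->waitPolicy can be passed straight through, assuming heap_lock_tuple()'s nowait boolean argument is replaced by the policy:

	/* sketch: one shared enum instead of LockClauseWaitPolicy + RowWaitPolicy */
	typedef enum LockWaitPolicy
	{
		LockWaitBlock,				/* wait for the lock (default) */
		LockWaitSkip,				/* SKIP LOCKED: skip locked rows */
		LockWaitError				/* NOWAIT: error out on a locked row */
	} LockWaitPolicy;

	/* nodeLockRows.c then needs no mapping at all: */
	test = heap_lock_tuple(erm->relation, &tuple,
						   estate->es_output_cid,
						   lockmode, erm->waitPolicy,
						   true /* follow_updates */ ,
						   &buffer, &hufd);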
Re: [HACKERS] PDF builds broken again
On Wed, Jul 23, 2014 at 12:31 PM, Magnus Hagander wrote: > It seems at least the 9.0 PDFs are broken (trying to build for the release): > > Lots of errors/warnings (and AFAIK no way to see which is which in the > output), but It hink this is the telltale as usual: > > Overfull \hbox (7.12454pt too wide) in paragraph at lines 88092--88092 > []\T1/pcr/m/n/9 CREATE FUNCTION getf1(myrowtype) RETURNS int AS 'SELECT > $1.f1' > LANGUAGE SQL;[] > [] > > and many more like it until > > Overfull \hbox (1.5pt too wide) in alignment at lines 241488--241741 > [] [] [] > [] > > [256.0.1 > ! pdfTeX error (ext4): \pdfendlink ended up in different nesting level than > \pd > fstartlink. > \AtBegShi@Output ...ipout \box \AtBeginShipoutBox > \fi \fi > l.241875 ...char95{}stat\char95{}file('filename'); > > > > Here is how much of TeX's memory you used: > 22467 strings out of 482156 > 171125 string characters out of 3785924 > 308594 words of memory out of 3085000 > 27304 multiletter control sequences out of 15000+50 > 80861 words of font info for 131 fonts, out of 300 for 9000 > 14 hyphenation exceptions out of 8191 > 30i,12n,43p,307b,1338s stack positions out of 1500i,500n,1500p,20b,5s > ! ==> Fatal error occurred, no output PDF file produced! > > > > > Do we actually have any buildfarm boxes building the PDFs? And if so, > any idea why they didn't catch it? > > Do we have a reasonable way to figure out which commit actually broke > it, other than manually testing backing out each of the 11 commits > since 9.0.17? Additional point of info - the -US pdf's do build on this version, just not the -A4. And with even more of those entries about overfull hbox, so clearly that was not the actual breakage. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] PDF builds broken again
It seems at least the 9.0 PDFs are broken (trying to build for the release): Lots of errors/warnings (and AFAIK no way to see which is which in the output), but I think this is the telltale as usual:

Overfull \hbox (7.12454pt too wide) in paragraph at lines 88092--88092
[]\T1/pcr/m/n/9 CREATE FUNCTION getf1(myrowtype) RETURNS int AS 'SELECT $1.f1'
LANGUAGE SQL;[]
[]

and many more like it until

Overfull \hbox (1.5pt too wide) in alignment at lines 241488--241741
[] [] []
[]

[256.0.1
! pdfTeX error (ext4): \pdfendlink ended up in different nesting level than \pd
fstartlink.
\AtBegShi@Output ...ipout \box \AtBeginShipoutBox
\fi \fi
l.241875 ...char95{}stat\char95{}file('filename');

Here is how much of TeX's memory you used:
22467 strings out of 482156
171125 string characters out of 3785924
308594 words of memory out of 3085000
27304 multiletter control sequences out of 15000+50
80861 words of font info for 131 fonts, out of 300 for 9000
14 hyphenation exceptions out of 8191
30i,12n,43p,307b,1338s stack positions out of 1500i,500n,1500p,20b,5s
! ==> Fatal error occurred, no output PDF file produced!

Do we actually have any buildfarm boxes building the PDFs? And if so, any idea why they didn't catch it? Do we have a reasonable way to figure out which commit actually broke it, other than manually backing out and testing each of the 11 commits since 9.0.17? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [TODO] Process pg_hba.conf keywords as case-insensitive
Re: Viswanatham kirankumar 2014-07-23 > 3) A host name is not a SQL object, so it will be treated as case-sensitive, except that all, samehost, and samenet are considered keywords. > For these, users need to use quotes to differentiate between host names and keywords. DNS is case-insensitive, though most of the time case-preserving (nothing guarantees that it won't down-up-whatever-case the answer you get). (FTR, I'll retract my original complaint, the idea of using SQL-like case folding is nice.) Christoph -- c...@df7cb.de | http://www.df7cb.de/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
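For illustration, a small pg_hba.conf sketch of the quoting rule under discussion. The addresses and users here are invented, and treating ALL the same as all is what the patch proposes, not necessarily released behavior:

    # TYPE  DATABASE  USER      ADDRESS        METHOD
    host    all       postgres  192.0.2.0/24   md5   # keyword: matches any database
    host    "all"     postgres  192.0.2.0/24   md5   # quoted: a database literally named all
    host    ALL       postgres  samenet        md5   # also the keyword, if folded case-insensitively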
Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes
I'm trying to understand what it would take to get this patch into an acceptable form before the next commitfest. Both Abhijit and Andres have done extensive reviews of the patch and have given many useful suggestions to Rahila. While she has incorporated most of them, I feel we are still some distance away from having something that can be committed. Here are my observations based on the discussion on this thread so far.

1. Need for compressing full-page backups: There are a good number of benchmarks done by various people on this list which clearly show the need for the feature. Many people have already voiced their agreement on having this in core, even as a configurable parameter. There had been some requests for more benchmarks, such as response times immediately after a checkpoint or CPU consumption, which I'm not entirely sure have been done yet.

2. Need for different compression algorithms: There were requests for comparing different compression algorithms such as LZ4 and snappy. Based on the numbers that Rahila has posted, I can see LZ4 has the best compression ratio, at least for the TPC-C benchmarks she tried. Having said that, I was hoping to see more numbers in terms of CPU utilization, which would demonstrate the trade-off, if any. Anyway, there were also apprehensions expressed about whether to have a pluggable algorithm in the final patch that gets committed. If we do decide to support more compression algorithms, I like what Andres had done before, i.e. store the compression algorithm information in the varlena header. So basically, we should have an abstract API which takes a buffer and the desired algorithm and returns the compressed data along with a header encoding that information (a rough sketch follows this message). ISTM that the patch Andres had posted earlier was focused primarily on toast data, but I think we can make it more generic so that both toast and FPW can use it. Having said that, IMHO we should go one step at a time. We have been using pglz for compressing toast data for a long time, so we can continue to use the same for compressing full-page images. We can simultaneously work on adding more algorithms to core and choose the right candidate for different scenarios such as toast or FPW based on test evidence. But that work can happen independently of this patch.

3. Compressing one block vs. all blocks: Andres suggested that compressing all backup blocks in one go may give us a better compression ratio. This is worth trying. I'm wondering what would be the best way to do so with minimal changes to the xlog insertion code. Today, we add more rdata items for the backup block header(s) and the backup blocks themselves (2 per backup block if there is a "hole") beyond what the caller has supplied. If we have to compress all the backup blocks together, then one approach is to copy the backup block headers and the blocks to a temp buffer, compress that, and replace the previously added rdata entries with a single rdata (see the second sketch below). Is there a better way to handle multiple blocks in one go? We still need a way to tell the restore path that the WAL data is compressed. One way is to always add a varlena header irrespective of whether the blocks are compressed or not, but that looks like overkill. Another way is to add a new field to XLogRecord to record this information. It looks like we can do this without increasing the size of the header, since there are 2 bytes of padding after the xl_rmid field (see the third sketch below).

4. Handling holes in backup blocks: I think if we address (3), then this can be done easily. Alternatively, we can also memzero the "hole" and then compress the entire page; the compression algorithm should handle that well (see the last sketch below).

Thoughts/comments?

Thanks, Pavan
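A minimal sketch of the abstract compression API from point 2, assuming invented names (WalCompressAlgo, WalCompressHeader, wal_compress_buffer) and a placeholder compressor standing in for pglz or LZ4; this is not the actual PostgreSQL API:

    #include <stdint.h>
    #include <string.h>

    typedef enum
    {
        WAL_COMPRESS_NONE = 0,
        WAL_COMPRESS_PGLZ = 1,
        WAL_COMPRESS_LZ4 = 2
    } WalCompressAlgo;

    typedef struct
    {
        uint8_t  algo;     /* which WalCompressAlgo produced the payload */
        uint32_t rawsize;  /* uncompressed length, for the restore path */
    } WalCompressHeader;

    /*
     * Placeholder for a real compressor such as pglz or LZ4.  It only
     * copies, so wal_compress_buffer below falls back to "store raw".
     */
    static uint32_t
    toy_compress(const char *src, uint32_t srclen, char *dst)
    {
        memcpy(dst, src, srclen);
        return srclen;
    }

    /*
     * Compress src (srclen bytes) into dst, which must have room for
     * srclen + sizeof(WalCompressHeader) bytes.  Returns the total
     * number of bytes written (header + payload), or -1 if compression
     * did not save space and the caller should store the data raw.
     */
    static int
    wal_compress_buffer(const char *src, uint32_t srclen, char *dst,
                        WalCompressAlgo algo)
    {
        WalCompressHeader hdr;
        uint32_t    complen;

        if (algo == WAL_COMPRESS_NONE)
            return -1;

        complen = toy_compress(src, srclen, dst + sizeof(hdr));
        if (complen >= srclen)
            return -1;          /* incompressible, store raw */

        hdr.algo = (uint8_t) algo;
        hdr.rawsize = srclen;
        memcpy(dst, &hdr, sizeof(hdr));
        return (int) (sizeof(hdr) + complen);
    }

The restore path would read the header first, dispatch on algo, and decompress into a rawsize-byte buffer; whether the header is a full varlena or a smaller fixed struct like this one is an open design choice.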
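For point 3, a hedged sketch of the copy-to-a-temp-buffer approach; BkpBlockSketch and SKETCH_BLCKSZ are stand-ins for the real BkpBlock and BLCKSZ, and the rdata bookkeeping is omitted:

    #include <stdint.h>
    #include <string.h>

    #define SKETCH_BLCKSZ 8192       /* stand-in for BLCKSZ */

    typedef struct                   /* stand-in for BkpBlock */
    {
        uint32_t hole_offset;        /* start of the "hole" in the page */
        uint32_t hole_length;        /* length of the hole */
    } BkpBlockSketch;

    /*
     * Copy every backup block header plus its page image (with the hole
     * squeezed out) into one contiguous scratch buffer.  The result can
     * then be compressed with a single call and attached as a single
     * rdata entry in place of the per-block entries.  Returns the number
     * of bytes written to scratch.
     */
    static uint32_t
    gather_backup_blocks(const BkpBlockSketch *hdrs, char *const pages[],
                         int nblocks, char *scratch)
    {
        uint32_t    off = 0;
        int         i;

        for (i = 0; i < nblocks; i++)
        {
            uint32_t    tail = hdrs[i].hole_offset + hdrs[i].hole_length;

            memcpy(scratch + off, &hdrs[i], sizeof(BkpBlockSketch));
            off += sizeof(BkpBlockSketch);

            /* data before the hole, then data after it */
            memcpy(scratch + off, pages[i], hdrs[i].hole_offset);
            off += hdrs[i].hole_offset;
            memcpy(scratch + off, pages[i] + tail, SKETCH_BLCKSZ - tail);
            off += SKETCH_BLCKSZ - tail;
        }
        return off;                 /* compress scratch[0..off) in one go */
    }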
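For the header question in point 3, a simplified stand-in for XLogRecord showing how a one-byte algorithm field could live in the existing padding after xl_rmid; the field types and layout here are illustrative, not the real definition:

    #include <stdint.h>

    typedef struct
    {
        uint32_t xl_tot_len;    /* total length of the record */
        uint32_t xl_xid;        /* xact id */
        uint32_t xl_len;        /* length of rmgr data */
        uint8_t  xl_info;       /* flag bits */
        uint8_t  xl_rmid;       /* resource manager */
        uint8_t  xl_compress;   /* NEW: compression algorithm; this byte
                                 * was previously alignment padding, so
                                 * the header does not grow */
        uint8_t  xl_pad;        /* remaining padding byte */
        uint64_t xl_prev;       /* ptr to previous record */
        uint32_t xl_crc;        /* CRC of the record */
    } XLogRecordSketch;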
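And for point 4, the memzero alternative, a sketch assuming an in-memory copy of the page can be scribbled on before compression:

    #include <stdint.h>
    #include <string.h>

    /*
     * Keep the full page but overwrite the hole with zeros before
     * compressing.  Any reasonable compressor collapses the run of
     * zeros, and the restore path no longer has to reassemble the page
     * around a removed hole.
     */
    static void
    zero_page_hole(char *page, uint32_t hole_offset, uint32_t hole_length)
    {
        memset(page + hole_offset, 0, hole_length);
    }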
Re: [HACKERS] proposal (9.5) : psql unicode border line styles
2014-07-23 8:38 GMT+02:00 Tomas Vondra: > On 23 July 2014, 7:36, Pavel Stehule wrote: >> updated version is in attachment > OK, thanks. The new version seems OK to me. Thank you. Pavel > Tomas