Re: [HACKERS] pg_filedump 9.3: checksums (and a few other fixes)
On Mon, 2013-06-10 at 01:28 -0400, Alvaro Herrera wrote:
> Hm, note that XMAX_SHR_LOCK is two bits, so when that flag is present you
> will get the three lock modes displayed with the above code, which is
> probably going to be misleading. htup_details.h does this:
>
> /*
>  * Use these to test whether a particular lock is applied to a tuple
>  */
> #define HEAP_XMAX_IS_SHR_LOCKED(infomask) \
>     (((infomask) & HEAP_LOCK_MASK) == HEAP_XMAX_SHR_LOCK)
> #define HEAP_XMAX_IS_EXCL_LOCKED(infomask) \
>     (((infomask) & HEAP_LOCK_MASK) == HEAP_XMAX_EXCL_LOCK)
> #define HEAP_XMAX_IS_KEYSHR_LOCKED(infomask) \
>     (((infomask) & HEAP_LOCK_MASK) == HEAP_XMAX_KEYSHR_LOCK)
>
> Presumably it'd be better to do something similar.

I was hesitant to do too much interpretation of the bits. Do you think it
would be better to just remove the test for XMAX_SHR_LOCK?

Regards,
	Jeff Davis
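To make the distinction concrete, here is a minimal standalone sketch (not
pg_filedump code; the HEAP_* values are copied from htup_details.h, the
surrounding program is illustrative only) of why a plain bit test misfires
when HEAP_XMAX_SHR_LOCK is the union of the other two lock bits:

    #include <stdio.h>

    /* values as defined in src/include/access/htup_details.h */
    #define HEAP_XMAX_KEYSHR_LOCK  0x0010
    #define HEAP_XMAX_EXCL_LOCK    0x0040
    #define HEAP_XMAX_SHR_LOCK     (HEAP_XMAX_EXCL_LOCK | HEAP_XMAX_KEYSHR_LOCK)
    #define HEAP_LOCK_MASK         HEAP_XMAX_SHR_LOCK   /* i.e. 0x0050 */

    int main(void)
    {
        unsigned int infomask = HEAP_XMAX_EXCL_LOCK; /* tuple is EXCL-locked */

        /* naive test: "is the SHR flag present?" -- fires spuriously,
         * because the EXCL bit is part of the two-bit SHR pattern */
        if (infomask & HEAP_XMAX_SHR_LOCK)
            printf("naive test: looks share-locked (misleading)\n");

        /* mask-based test, as in HEAP_XMAX_IS_SHR_LOCKED(): compare the
         * whole two-bit field, so only the exact SHR pattern matches */
        if ((infomask & HEAP_LOCK_MASK) == HEAP_XMAX_SHR_LOCK)
            printf("mask test: share-locked\n");
        else
            printf("mask test: not share-locked\n");
        return 0;
    }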
Re: [HACKERS] Optimising Foreign Key checks
On Sun, Jun 09, 2013 at 10:51:43AM +0100, Simon Riggs wrote:
> On 9 June 2013 02:12, Noah Misch n...@leadboat.com wrote:
>> On Sat, Jun 08, 2013 at 08:20:42PM -0400, Robert Haas wrote:
>>> On Sat, Jun 8, 2013 at 5:41 PM, Noah Misch n...@leadboat.com wrote:
>>>> Likewise; I don't see why we couldn't perform an optimistic check
>>>> ASAP and schedule a final after-statement check when an early check
>>>> fails. That changes performance characteristics without changing
>>>> semantics.
>>>
>>> ...this seems like it might have some promise; but what if the action
>>> we're performing isn't idempotent? And how do we know?
>>
>> The action discussed so far is RI_FKey_check_ins(). It acquires a KEY
>> SHARE lock (idempotent by nature) on a row that it finds using B-tree
>> equality (presumed IMMUTABLE, thus idempotent). RI_FKey_check_upd() is
>> nearly the same action, so the same argument holds. Before treating any
>> other operation in the same way, one would need to conduct similar
>> analysis.
>
> As long as we are talking about FKs only, then this approach can work.
> All we are doing is applying the locks slightly earlier than before.
> Once locked they will prevent any later violations, so we are safe from
> anybody except *ourselves* making changes that would invalidate the
> earlier check. Trouble is, there are various ways I can see that as
> possible, so making a check early doesn't allow you to avoid making the
> check later as well.

This UPDATE or DELETE that invalidates the check by modifying the PK row
will fire the usual RI_FKey_*_{upd,del} trigger on the PK table. That
will (a) fail the transaction, (b) CASCADE to delete the new FK row, or
(c) update the new FK row's key column to NULL/DEFAULT. If (a) happens
we're of course fine. If (b) or (c) happens, the FK's AFTER check already
today becomes a no-op due to the visibility test in RI_FKey_check().
Consequently, I don't think later actions of the SQL statement can put us
in a position to need a second check.

> AFAICS there are weird cases where changing the way FKs execute will
> change the way complex trigger applications will execute. I don't see a
> way to avoid that other than do nothing. Currently, we execute the
> checks following the normal order of execution rules for triggers.
> Every idea we've had so far changes that in some way; variously in
> major or minor ways, but changed nonetheless.

I've tried to envision a trigger-creates-missing-references scenario that
would notice the difference. The trigger in question would, I'm
presuming, be an AFTER ROW INSERT trigger named such that it fires before
the FK trigger. The initial optimistic FK check would fail, so we would
queue a traditional FK AFTER trigger. From that point, the scenario
proceeds exactly as it proceeds today. Could you detail a problem
scenario you have in mind?

> Even the approach of deferring checks to allow them to be applied in a
> batch means we might change the way applications execute in detail.
> However, since the only possible change there would be to decrease the
> number of self-induced failures, that seems OK. So the question is how
> much change do we want to introduce? I'll guess "not much", rather than
> "lots" or "none".

The batch would need to fire at the trigger firing position of the *last*
queue entry it covers. If you run a final FK check earlier than that,
other AFTER triggers that expect to run before the FK check and affect
its outcome may not yet have run.

In contrast, an FK check running later than usual is mostly fine; whether
a tuple has been locked does not affect the semantics of later ordinary
actions in the transaction. (I say "ordinary" to exclude a function like
pgrowlocks() that makes a point to discern. Also note that the reasoning
about timing only applies to definitive FK checks that can throw errors;
a tentative, soft-failure check is acceptable anytime after the new row
is in place.) One can, however, at least construct problem cases. When a
query calls functions that perform non-transactional actions, changing
the timing of an ERROR with respect to those calls changes application
behavior. Taking locks in a different order affects the incidence of
deadlocks. Does compatibility to that degree have much value? I'd be
happy to file those in the "not much change" category.

> Proposal: Have a WHEN clause that accumulates values to be checked in a
> hash table up to work_mem in size, allowing us to eliminate the most
> common duplicates (just not *all* duplicates). If the value isn't a
> duplicate (or at least the first seen tuple with that value), we will
> queue up a check for later. That approach gives us *exactly* what we
> have now and works with the two common cases: i) few, mostly duplicated
> values, ii) many values, but clustered together. Then apply changes in
> batches at end of statement.

I'm still fine with this proposal, but it does not dramatically sidestep
these sorts of tricky situations. Suppose a COPY inserts rows with
fkcol=1 at TIDs (0,3),
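As an illustration of the quoted proposal, here is a minimal sketch, not
PostgreSQL code: a bounded hash set stands in for the work_mem-sized hash
table, and a real implementation would also need an overflow policy (queue
checks unconditionally once the table is full) that this toy omits.

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    #define SET_SLOTS 1024          /* stand-in for a work_mem-sized table */
    static int64_t slots[SET_SLOTS];
    static bool    used[SET_SLOTS];

    /* returns true if key was already present; inserts it otherwise */
    static bool seen_before(int64_t key)
    {
        uint64_t h = (uint64_t) key * 2654435761u % SET_SLOTS;
        while (used[h])
        {
            if (slots[h] == key)
                return true;          /* duplicate: skip the queued check */
            h = (h + 1) % SET_SLOTS;  /* linear probing */
        }
        used[h] = true;               /* first sighting: remember it ... */
        slots[h] = key;
        return false;                 /* ... and tell caller to queue a check */
    }

    int main(void)
    {
        /* six inserted FK values, mostly duplicated: only 3 checks queued */
        int64_t inserted_fk[] = {1, 1, 2, 1, 3, 2};
        for (int i = 0; i < 6; i++)
            if (!seen_before(inserted_fk[i]))
                printf("queue FK check for %lld\n", (long long) inserted_fk[i]);
        return 0;
    }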
Re: [HACKERS] JSON and unicode surrogate pairs
On 06/09/2013 07:47 PM, Tom Lane wrote:
> Andrew Dunstan and...@dunslane.net writes:
>> I did that, but it's evident from the buildfarm that there's more work
>> to do. The problem is that we do the de-escaping as we lex the json to
>> construct the look ahead token, and at that stage we don't know whether
>> or not it's really going to be needed. That means we can cause errors
>> to be raised in far too many places. It's failing on this line:
>>     converted = pg_any_to_server(utf8str, utf8len, PG_UTF8);
>> even though the operator in use (->) doesn't even use the de-escaped
>> value. The real solution is going to be to delay the de-escaping of the
>> string until it is known to be wanted. That's unfortunately going to be
>> a bit invasive, but I can't see a better solution. I'll work on it ASAP.
>
> Not sure that this idea isn't a dead end. IIUC, you're proposing to jump
> through hoops in order to avoid complaining about illegal JSON data,
> essentially just for backwards compatibility with 9.2's failure to
> complain about it. If we switch over to a pre-parsed (binary) storage
> format for JSON values, won't we be forced to throw these errors anyway?
> If so, maybe we should just take the compatibility hit now while there's
> still a relatively small amount of stored JSON data in the wild.

No, I probably haven't explained it very well. Here is the regression
diff from jacana:

  ERROR:  cannot call json_populate_recordset on a nested object
  -- handling of unicode surrogate pairs
  select json '{ "a": "\ud83d\ude04\ud83d\udc36" }' -> 'a' as correct;
  !           correct
  ! ----------------------------
  !  "\ud83d\ude04\ud83d\udc36"
  ! (1 row)
  !
  select json '{ "a": "\ud83d\ud83d" }' -> 'a'; -- 2 high surrogates in a row
  ERROR:  invalid input syntax for type json
  DETAIL:  high order surrogate must not follow a high order surrogate.
  --- 922,928 ----
  ERROR:  cannot call json_populate_recordset on a nested object
  -- handling of unicode surrogate pairs
  select json '{ "a": "\ud83d\ude04\ud83d\udc36" }' -> 'a' as correct;
  ! ERROR:  character with byte sequence 0xf0 0x9f 0x98 0x84 in encoding "UTF8" has no equivalent in encoding "WIN1252"
  select json '{ "a": "\ud83d\ud83d" }' -> 'a'; -- 2 high surrogates in a row
  ERROR:  invalid input syntax for type json
  DETAIL:  high order surrogate must not follow a high order surrogate.

The sequence in question is two perfectly valid surrogate pairs. ...

After thinking about this some more I have come to the conclusion that we
should only do any de-escaping of \u sequences, whether or not they are
for BMP characters, when the server encoding is utf8. For any other
encoding, which is already a violation of the JSON standard anyway, and
should be avoided if you're dealing with JSON, we should just pass them
through even in text output. This will be a simple and very localized
fix. We'll still have to deal with this issue when we get to binary
storage of JSON, but that's not something we need to confront today.

cheers

andrew
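For reference, the arithmetic behind those escapes. This self-contained
sketch (not PostgreSQL code) shows why the pair \ud83d\ude04 turns into
exactly the UTF-8 bytes named in the error message:

    #include <stdio.h>

    int main(void)
    {
        unsigned int hi = 0xD83D, lo = 0xDE04;   /* from "\ud83d\ude04" */

        /* standard UTF-16 surrogate-pair decoding */
        unsigned int cp = ((hi - 0xD800) << 10) + (lo - 0xDC00) + 0x10000;
        printf("code point: U+%X\n", cp);        /* U+1F604, an emoji */

        /* its 4-byte UTF-8 encoding is 0xF0 0x9F 0x98 0x84 -- the very
         * bytes pg_any_to_server() could not convert to WIN1252 on jacana */
        printf("UTF-8: %02X %02X %02X %02X\n",
               0xF0 | (cp >> 18),
               0x80 | ((cp >> 12) & 0x3F),
               0x80 | ((cp >> 6) & 0x3F),
               0x80 | (cp & 0x3F));
        return 0;
    }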
Re: [HACKERS] UTF-8 encoding problem w/ libpq
Thanks Andrew. I will test the next release.

Martin

-----Original Message-----
From: Andrew Dunstan [mailto:and...@dunslane.net]
Sent: 08 June 2013 16:43
To: Tom Lane
Cc: Heikki Linnakangas; k...@rice.edu; Martin Schäfer; pgsql-hack...@postgresql.org
Subject: Re: [HACKERS] UTF-8 encoding problem w/ libpq

On 06/03/2013 02:41 PM, Andrew Dunstan wrote:
> On 06/03/2013 02:28 PM, Tom Lane wrote:
>> ... I wonder though if we couldn't just fix this code to not do
>> anything to high-bit-set bytes in multibyte encodings.
>
> That's exactly what I suggested back in November.

This thread seems to have gone cold, so I have applied the fix I
originally suggested along these lines to all live branches. At least
that means we won't produce junk, but we still need to work out how to
downcase multi-byte characters. If anyone thinks there are other places
in the code that need similar treatment, they are welcome to find them. I
have not yet found one.

cheers

andrew
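The shape of the fix being described, leaving high-bit-set bytes alone
when the encoding is multibyte, is roughly the following. This is a
sketch assuming a simple NUL-terminated string, not the actual committed
patch:

    #include <ctype.h>
    #include <stdbool.h>

    /* Downcase ASCII letters only; in a multibyte encoding, any byte with
     * the high bit set may be part of a multi-byte character, so pass it
     * through untouched rather than producing junk. */
    static void
    downcase_ascii_only(char *str, bool encoding_is_multibyte)
    {
        for (char *p = *&str; *p; p++)
        {
            unsigned char ch = (unsigned char) *p;

            if (encoding_is_multibyte && ch >= 0x80)
                continue;            /* don't mangle part of an MB sequence */
            *p = (char) tolower(ch);
        }
    }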
[HACKERS] Postgresql for cygwin - 3rd
On 3/6/2013 11:46 PM, Andrew Dunstan wrote:
> Excellent. Will test it out soon.
>
> cheers
>
> andrew

Andrew, please find attached a full patch for cygwin relative to 9.3beta1:
- DLLTOOL/DLLWRAP are not used anymore, replaced by gcc also for
  postgres.exe (*)
- DLL versioning is added

Check failures:
- prepared_xacts is still freezing. The cygwin failure you highlighted
  was solved, so it should be something else
- attached the 2 regression diffs:

    tsearch     ... FAILED
    without_oid ... FAILED

  The second one seems new; not sure it is cygwin specific.

Regards
Marco

*) http://www.cygwin.com/ml/cygwin/2013-03/msg00032.html

--- origsrc/postgresql-9.3beta1/src/Makefile.global.in	2013-05-06 22:57:06.0 +0200
+++ src/postgresql-9.3beta1/src/Makefile.global.in	2013-06-08 15:33:28.587266200 +0200
@@ -508,6 +508,11 @@ ifeq ($(PORTNAME),win32)
 LIBS += -lws2_32 -lshfolder
 endif
 
+# missing for link on cygwin ?
+ifeq ($(PORTNAME),cygwin)
+LIBS += $(LDAP_LIBS_BE)
+endif
+
 # Not really standard libc functions, used by the backend.
 TAS = @TAS@
 
--- origsrc/postgresql-9.3beta1/src/Makefile.shlib	2013-05-06 22:57:06.0 +0200
+++ src/postgresql-9.3beta1/src/Makefile.shlib	2013-06-08 15:33:28.613267700 +0200
@@ -281,8 +281,9 @@ ifeq ($(PORTNAME), unixware)
 endif
 
 ifeq ($(PORTNAME), cygwin)
+  LINK.shared		= $(CC) -shared
 ifdef SO_MAJOR_VERSION
-shlib			= cyg$(NAME)$(DLSUFFIX)
+shlib			= cyg$(NAME)-$(SO_MAJOR_VERSION)$(DLSUFFIX)
 endif
 haslibarule	= yes
 endif
@@ -371,6 +372,12 @@ else # PORTNAME == cygwin || PORTNAME == win32
 
 # If SHLIB_EXPORTS is set, the rules below will build a .def file from
 # that. Else we build a temporary one here.
+ifeq ($(PORTNAME), cygwin)
+$(shlib) $(stlib): $(OBJS) | $(SHLIB_PREREQS)
+	$(CC) $(CFLAGS) -shared -o $(shlib) -Wl,--out-implib=$(stlib) $(OBJS) $(LDFLAGS) $(LDFLAGS_SL) $(SHLIB_LINK) $(LIBS) $(LDAP_LIBS_BE)
+
+
+else
 ifeq (,$(SHLIB_EXPORTS))
 DLL_DEFFILE = lib$(NAME)dll.def
 exports_file = $(DLL_DEFFILE)
@@ -387,6 +394,7 @@ $(shlib): $(OBJS) $(DLL_DEFFILE) | $(SHLIB_PREREQS)
 
 $(stlib): $(shlib) $(DLL_DEFFILE) | $(SHLIB_PREREQS)
 	$(DLLTOOL) --dllname $(shlib) $(DLLTOOL_LIBFLAGS) --def $(DLL_DEFFILE) --output-lib $@
+endif # PORTNAME == cygwin
 
 endif # PORTNAME == cygwin || PORTNAME == win32
 
--- origsrc/postgresql-9.3beta1/src/backend/Makefile	2013-05-06 22:57:06.0 +0200
+++ src/postgresql-9.3beta1/src/backend/Makefile	2013-06-08 15:33:28.633268800 +0200
@@ -62,18 +62,8 @@ endif
 
 ifeq ($(PORTNAME), cygwin)
 
-postgres: $(OBJS) postgres.def libpostgres.a
-	$(DLLTOOL) --dllname $@$(X) --output-exp $@.exp --def postgres.def
-	$(CC) $(CFLAGS) $(LDFLAGS) $(LDFLAGS_EX) -o $@$(X) -Wl,--base-file,$@.base $@.exp $(call expand_subsys,$(OBJS)) $(LIBS)
-	$(DLLTOOL) --dllname $@$(X) --base-file $@.base --output-exp $@.exp --def postgres.def
-	$(CC) $(CFLAGS) $(LDFLAGS) $(LDFLAGS_EX) -Wl,--stack,$(WIN32_STACK_RLIMIT) -o $@$(X) $@.exp $(call expand_subsys,$(OBJS)) $(LIBS)
-	rm -f $@.exp $@.base
-
-postgres.def: $(OBJS)
-	$(DLLTOOL) --export-all --output-def $@ $(call expand_subsys,$^)
-
-libpostgres.a: postgres.def
-	$(DLLTOOL) --dllname postgres.exe --def postgres.def --output-lib $@
+postgres libpostgres.a: $(OBJS)
+	$(CC) $(CFLAGS) $(LDFLAGS) $(LDFLAGS_EX) $(export_dynamic) $(call expand_subsys,$^) $(LIBS) -o $@ -Wl,--stack,$(WIN32_STACK_RLIMIT) -Wl,--export-all-symbols -Wl,--out-implib=libpostgres.a
 
 endif # cygwin
 
--- origsrc/postgresql-9.3beta1/src/interfaces/libpq/Makefile	2013-05-06 22:57:06.0 +0200
+++ src/postgresql-9.3beta1/src/interfaces/libpq/Makefile	2013-06-08 15:33:28.65427 +0200
@@ -45,7 +45,7 @@ OBJS += ip.o md5.o
 OBJS += encnames.o wchar.o
 
 ifeq ($(PORTNAME), cygwin)
-override shlib = cyg$(NAME)$(DLSUFFIX)
+override shlib = cyg$(NAME)-$(SO_MAJOR_VERSION)$(DLSUFFIX)
 endif
 
 ifeq ($(PORTNAME), win32)
--- origsrc/postgresql-9.3beta1/src/makefiles/Makefile.cygwin	2013-05-06 22:57:06.0 +0200
+++ src/postgresql-9.3beta1/src/makefiles/Makefile.cygwin	2013-06-08 16:03:24.065961700 +0200
@@ -1,6 +1,4 @@
 # src/makefiles/Makefile.cygwin
-DLLTOOL= dlltool
-DLLWRAP= dllwrap
 ifdef PGXS
 BE_DLLLIBS= -L$(libdir) -lpostgres
 else
@@ -44,6 +42,4 @@ endif
 
 # Rule for building a shared library from a single .o file
 %.dll: %.o
-	$(DLLTOOL) --export-all --output-def $*.def $<
-	$(DLLWRAP) -o $@ --def $*.def $< $(LDFLAGS) $(LDFLAGS_SL) $(BE_DLLLIBS)
-	rm -f $*.def
+	$(CC) $(CFLAGS) -shared -o $@ $< $(LDFLAGS) $(LDFLAGS_SL) $(BE_DLLLIBS)

*** /pub/devel/postgresql/postgresql-9.3beta1-1/src/postgresql-9.3beta1/src/test/regress/expected/tsearch.out	2013-05-06 22:57:06.0 +0200
--- /pub/devel/postgresql/postgresql-9.3beta1-1/build/src/test/regress/results/tsearch.out	2013-06-10
Re: [HACKERS] pg_dump with postgis extension dumps rules separately
Joe Conway m...@joeconway.com writes:
> OK, done this way and committed.

Thanks,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] UTF-8 encoding problem w/ libpq
On 04.06.2013 09:39, Martin Schäfer wrote:
>> Can't really blame Windows on that. On Windows, we don't require that
>> the encoding and LC_CTYPE's charset match. The OP used UTF-8 encoding
>> in the server, but LC_CTYPE='English_United Kingdom.1252', ie. LC_CTYPE
>> implies WIN1252 encoding. We allow that, and it generally works on
>> Windows because in varstr_cmp we use MultiByteToWideChar() followed by
>> wcscoll_l(), which doesn't care about the charset implied by LC_CTYPE.
>> But for isupper(), it matters.
>
> Does this mean that the UTF-8 messing up would disappear if the database
> were using a different locale for LC_CTYPE? If so, which locale should I
> use? This would be useful for a temporary workaround.

Maybe, not sure. The logical thing to do would be to set LC_CTYPE to
"English_United Kingdom.65001", which tells Windows to expect the UTF-8
charset. However, old discussions on this subject suggest that Windows
won't accept that:

http://www.postgresql.org/message-id/20071015090954.gd4...@svr2.hagander.net

It's still worth a try, I think. Things might've changed since then. If
that doesn't work, you could also try some other random codepages as a
workaround. If you're lucky, one of them might work better, even though
it would still be the wrong codepage for UTF-8.

- Heikki
Re: [HACKERS] Placing hints in line pointers
On Mon, Jun 10, 2013 at 3:43 AM, Jeff Davis pg...@j-davis.com wrote:
> We wouldn't need to do a FPW when a hint changes, we would only need to
> take a copy of the ItemId array, which is much smaller. And it could be
> protected by its own checksum.

I like the direction of this idea. One of the previous proposals was to
move all the hint bits to a dedicated area. This had a few problems:

1) Three areas would have meant we would have needed some tricky ability
to relocate the third area, since the current "grow from both ends"
technique doesn't scale to three. Throwing them in with the line pointers
might nicely solve this.

2) We kept finding more hint bits lying around. There are hint bits in
the heap page header, there are hint bits in indexes, and I thought we
might have found more hint bits in the tuple headers than the 4 you note.
I'm not sure this is still true incidentally; I think the PD_ALL_VISIBLE
flag might no longer be a hint bit? Was that the only one in the page
header? Are the index hint bits also relocatable to the line pointers in
the index pages?

3) The reduction in the checksum coverage. Personally I thought this was
a red herring -- they're hint bits, their whole raison d'être is to be a
performance optimization. But if you toss the line pointers in with them
then I see a problem. Protecting the line pointers is critically
important. A flipped bit in a line pointer could cause all kinds of
weirdness. And I don't think it would be easy to protect them with their
own checksum. You would have the same problems you currently have of a
process updating the hint bit behind your back while calculating the
checksum, or after your last WAL record but before the block is flushed.

Now if this is combined with the other idea -- masking out *just* the
hint bits from the checksum -- I wonder if moving them to the line
pointers doesn't make that more feasible. Since they would be in a
consistent location for every line pointer, instead of having to check on
each iteration if we're looking at the beginning of a tuple, the only
thing we would be counting on being correct before checking the checksum
would be the number of line pointers (rather than every line pointer
offset).

-- 
greg
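To make the masking idea concrete, here is a minimal sketch. It is
illustrative only: today's 32-bit ItemIdData (lp_off:15, lp_flags:2,
lp_len:15) has no spare bits, so LP_HINT_MASK is hypothetical, as is the
toy mixing step standing in for the real block checksum. The point is
just that with hints at a fixed position in every line pointer, they can
be cleared in a scratch copy before checksumming, so a concurrent
hint-bit flip can never invalidate the stored checksum:

    #include <stdint.h>
    #include <string.h>

    #define BLCKSZ       8192
    #define LP_HINT_MASK 0x00000003u  /* hypothetical hint bits per ItemId */

    /* nlinps and lp_array_off describe where the (4-byte-aligned) ItemId
     * array sits on the page */
    uint32_t
    checksum_page_masked(const char *page, int nlinps, size_t lp_array_off)
    {
        char     copy[BLCKSZ];
        uint32_t sum = 0;

        memcpy(copy, page, BLCKSZ);

        /* hint bits sit at a fixed position in every 4-byte line pointer,
         * so masking them is a simple pass over the array */
        uint32_t *lp = (uint32_t *) (copy + lp_array_off);
        for (int i = 0; i < nlinps; i++)
            lp[i] &= ~LP_HINT_MASK;

        for (size_t i = 0; i < BLCKSZ; i += sizeof(uint32_t))
        {
            uint32_t word;
            memcpy(&word, copy + i, sizeof(word));
            sum = (sum << 5) + sum + word;   /* toy mixing step */
        }
        return sum;
    }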
Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)
Hello Greg,

Thanks for this very detailed review and the suggestions! I'll submit a
new patch.

>> Question 1: should it report the maximum lag encountered?
>
> I haven't found the lag measurement to be very useful yet, outside of
> debugging the feature itself. Accordingly I don't see a reason to add
> even more statistics about the number outside of testing the code. I'm
> seeing some weird lag problems that this will be useful for though right
> now, more on that a few places below.

I'll explain below why it is really interesting to get this figure, and
why it is not really available as precisely elsewhere.

>> Question 2: the next step would be to have the current lag shown under
>> option --progress, but that would mean having a combined --throttle
>> --progress patch submission, or maybe dependencies between patches.
>
> This is getting too far ahead. Let's get the throttle part nailed down
> before introducing even more moving parts into this.

Ok!

> I've attached an updated patch that changes a few things around already.
> I'm not done with this yet and it needs some more review before commit,
> but it's not too far away from being ready.

Ok. I'll submit a new version by the end of the week.

> This feature works quite well. On a system that will run at 25K TPS
> without any limit, I did a run with 25 clients and a rate of 400/second,
> aiming at 10,000 TPS, and that's what I got:
>
>   number of clients: 25
>   number of threads: 1
>   duration: 60 s
>   number of transactions actually processed: 599620
>   average transaction lag: 0.307 ms
>   tps = 9954.779317 (including connections establishing)
>   tps = 9964.947522 (excluding connections establishing)
>
> I never thought of implementing the throttle like this before,

Stochastic processes are a little bit magic :-)

> but it seems to work out well so far. Check out tps.png to see the
> smoothness of the TPS curve (the graphs came out of pgbench-tools).
> There's a little more play outside of the target than ideal for this
> case. Maybe it's worth tightening the Poisson curve a bit around its
> center?

The point of a Poisson distribution is to model random events the kind of
which are a little bit irregular, such as web requests or queuing clients
at a taxi stop. I cannot really change the formula, but if you want to
argue with Siméon Denis Poisson, his current address is the 19th section
of Père Lachaise graveyard in Paris :-)

More seriously, the only parameter that can be changed is the 100.0,
which drives the granularity of the Poisson process. A smaller value
would mean a smaller potential multiplier, that is, how far from the
average time the schedule can go. This may count as "tightening",
although it would depart from a perfect process and possibly be a little
less smooth... for a given definition of "tight", "perfect" and "smooth"
:-)

> [...] What I did instead was think of this as a transaction rate target,
> which makes the help a whole lot simpler:
>
>   -R SPEC, --rate SPEC
>          target rate per client in transactions per second

Ok, I'm fine with this name.

> Made the documentation easier to write too. I'm not quite done with that
> yet, the docs wording in this updated patch could still be better.

I'm not an English native speaker; any help is welcome here. I'll do my
best.

> I personally would like this better if --rate specified a *total* rate
> across all clients.

Ok, I can do that, with some reworking so that the stochastic process is
shared by all threads instead of being within each client. This means a
lock between threads to access some variables, which should not impact
the test much.
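For readers following the thread: a rate-limited client draws each
inter-transaction delay from an exponential distribution, so transaction
starts form a Poisson process. A minimal sketch of the sampling discussed
above (not the patch code; names are illustrative, and 100.0 is the
granularity knob in question):

    #include <math.h>
    #include <stdint.h>
    #include <stdlib.h>

    /* throttle_delay is the mean time between transaction starts, in
     * microseconds (i.e. 1e6 / rate) */
    int64_t
    next_wait_us(double throttle_delay)
    {
        /* uniform draw in (0, 1]; never exactly 0, since log(0) blows up */
        double u = (rand() + 1.0) / ((double) RAND_MAX + 1.0);

        /* clamp the draw so the multiplier -log(u) never exceeds
         * log(100.0) ~= 4.6 times the mean -- the "granularity" knob */
        if (u < 1.0 / 100.0)
            u = 1.0 / 100.0;

        return (int64_t) (throttle_delay * -log(u));
    }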
Another option is to have a per-thread stochastic process.

> However, there are examples of both types of settings in the program
> already, so there's no one precedent for which is right here. -t is
> per-client and now -R is too; I'd prefer it to be like -T instead. It's
> not that important though, and the code is cleaner as it's written right
> now. Maybe this is better; I'm not sure.

I like the idea of just one process instead of a per-client one. I did
not try that at the beginning because the implementation is less
straightforward.

> On the topic of this weird latency spike issue, I did see that show up
> in some of the results too.

Your example illustrates *exactly* why the lag measure was added. The
Poisson process generates an ideal event line (that is, irregularly
scheduled transaction start times targeting the expected tps) which
induces a varying load that the database is trying to handle. If a
transaction cannot start right away, it is deferred with respect to its
scheduled start time. The measured lag reports exactly that: the clients
do not handle the load. There may be some catch-up later, that is, the
clients come back in line with the scheduled transactions. I need to put
this measure here because the scheduled time is only known to pgbench and
not available elsewhere. The max would really be more
[HACKERS] Improvement of checkpoint IO scheduler for stable transaction responses
Hi,

I have created a patch that improves the checkpoint IO scheduler for more
stable transaction responses.

* Problems with checkpoint IO scheduling under heavy transaction load

Under heavy transaction load, I think the PostgreSQL checkpoint scheduler
has two problems, one at the start and one at the end of a checkpoint.

The first problem is an IO burst when a checkpoint begins. It is caused
by full-page writes: right after a checkpoint starts writing pages, WAL
volume spikes, because each page touched for the first time must be
written to WAL in full. The WAL-based checkpoint scheduler then wrongly
judges that the checkpoint is behind schedule, even though it is not, and
speeds up. This hurts transaction response times. I think a purely
WAL-based schedule is not appropriate at the start of a checkpoint.

The second problem is an fsync freeze at the end of a checkpoint.
Normally, checkpoint writes are flushed in the background by the OS's IO
scheduler. But when that does not work well, the fsync calls at the end
of the checkpoint freeze IO and slow down transactions. An unexpectedly
slow transaction can trigger monitoring errors in an HA cluster and
degrades the user experience of the application. This is an especially
serious problem for cloud and virtual-server database systems, which do
not have much IO capacity. And postgresql.conf parameters do not help us
much here. We prefer fast transaction responses over a short checkpoint;
in fact, the checkpoint only becomes a little longer, which is not a
problem. You may suggest setting checkpoint_segments and
checkpoint_timeout to larger values, but a large checkpoint_segments
wastes file cache on WAL that is never read, and a large
checkpoint_timeout causes long crash recovery.

* How the patch improves the checkpoint IO scheduler

1. The full-page-write IO burst at the start of a checkpoint

My idea is very simple: when a checkpoint starts,
checkpoint_completion_target becomes more loose. I add three parameters
for this: 'checkpoint_smooth_target', 'checkpoint_smooth_margin', and
'checkpointer_write_delay'. 'checkpoint_smooth_target' is the point in
checkpoint progress up to which the IO schedule is smoothed.
'checkpoint_smooth_margin' smooths the schedule further; it is a
heuristic parameter, but it addresses the problem effectively.
'checkpointer_write_delay' is the sleep time used by the checkpoint
scheduler, nearly the same as 'bgwriter_delay' in 9.1 and older. For more
detail, please see the attached patch.

2. The fsync freeze at the end of a checkpoint

When the fsync freeze happens, fsyncing the files ever more rapidly is
pointless and stalls transactions. My idea is that if an fsync takes a
long time, the IO queue is flooded, and we should yield IO priority to
transactions to keep response times fast. The patch does this by
inserting a sleep after any fsync that took long. This seems to lengthen
the checkpoint, but not by very much: when fsync is slow, the IO queue is
already packed with other IO, including checkpoint writes, so sleeping
only yields IO priority to the transactions that are already executing.
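A minimal sketch of idea 2 (not the patch itself; the knobs
fsync_time_threshold and checkpointer_fsync_delay are hypothetical
stand-ins for the patch's parameters): time each fsync, and if it ran
long, sleep before the next one so concurrent transactions' IO can drain.

    #include <unistd.h>
    #include <sys/time.h>

    /* hypothetical knobs, in milliseconds */
    static const long fsync_time_threshold = 1000;
    static const long checkpointer_fsync_delay = 100;

    static long
    elapsed_ms(struct timeval a, struct timeval b)
    {
        return (b.tv_sec - a.tv_sec) * 1000 + (b.tv_usec - a.tv_usec) / 1000;
    }

    /* fsync one checkpointed file, then back off if the queue looks flooded */
    static void
    checkpoint_fsync_one(int fd)
    {
        struct timeval start, end;

        gettimeofday(&start, NULL);
        fsync(fd);
        gettimeofday(&end, NULL);

        /* a slow fsync means the IO queue is saturated: yield IO priority
         * to running transactions before issuing the next fsync */
        if (elapsed_ms(start, end) > fsync_time_threshold)
            usleep(checkpointer_fsync_delay * 1000);
    }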
I tested my patch with the DBT-2 benchmark; please see the results below.
My patch achieves higher throughput and faster responses than plain PG.
Checkpoints become a little longer than in plain PG, but not seriously
so.

* DBT-2 results with this patch (compared with original PG 9.2.4)

I used the DBT-2 benchmark software from OSDL, along with pg_statsinfo
and pg_stats_reporter.

- Patched PG (patched 9.2.4)
  DBT-2 result:     http://goo.gl/1PD3l
  statsinfo report: http://goo.gl/UlGAO
  settings:         http://goo.gl/X4Whu

- Original PG (9.2.4)
  DBT-2 result:     http://goo.gl/XVxtj
  statsinfo report: http://goo.gl/UT1Li
  settings:         http://goo.gl/eofmb

The 'Measurement Value' improved by 4%, 'new-order 90%tile' by 20%,
'new-order average' by 18%, 'new-order deviation' by 24%, and 'new-order
maximum' by 27%. In pg_stats_reporter's report I confirmed high
throughput and WAL IO while a checkpoint is executing. My patch achieves
fast transaction responses without blocking executing transactions. The
downside of my patch is a longer checkpoint: checkpoint time increased by
about 10% - 20%, but it still completes on schedule within
checkpoint_timeout. Please see the checkpoint result
(http://goo.gl/NsbC6).

* Test server
  Server: HP Proliant DL360 G7
  CPU:    Xeon E5640 2.66GHz (1P/4C)
  Memory: 18GB (PC3-10600R-9)
  Disk:   146GB (15k) * 4, RAID 1+0
  RAID controller: P410i/256MB

This is not an advertisement for pg_statsinfo and pg_stats_reporter :-)
They are free software. If you have comments or other ideas about my
patch, please send them to me.

Best Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center

diff --git a/src/backend/postmaster/checkpointer.c b/src/backend/postmaster/checkpointer.c
index fdf6625..a66ce36 100644
---
Re: [HACKERS] Configurable location for extension .control files
Tom Lane t...@sss.pgh.pa.us writes:
> Andres Freund and...@2ndquadrant.com writes:
>> I don't really care much about Oliver's usecase TBH, but I would very
>> much welcome making it easier for application developers to package
>> part of their in-database application code as extensions without
>> either requiring a selfcompiled postgres with a custom extension dir
>> or them having root access to the machine running postgres.
>
> Well, if you're installing an extension that includes C code, you're
> going to need root access anyway to install the shlib (at least on
> conservatively-configured machines).

At most places, that means you require the extension to be properly
packaged (RPM or DEB or something else) in a way that the sysadmins are
able to manage in production.

For sites that don't have in-house system packagers, it would be very
welcome to be able to set up PostgreSQL in a way that allows it to LOAD
the extension's binary code (.so, .dll, .dylib) from a non-root-owned
place even if you installed it from official packages. Of course, it
wouldn't be activated by default, and it would be frowned upon in the
docs for conservative production environments.

The use case still seems valid to me, and would nicely complete the
Extension Templates facility given the right tooling. Can I prepare a
patch allowing PostgreSQL to load extension control files and modules
from a non-default place in the file-system? Please?

> For pure-SQL extensions, Dimitri's been pushing a different approach
> that needn't involve the filesystem at all. We didn't get that finished
> in 9.3 but I think it's still on the agenda for 9.4.

Yes it is. The patch is listed in the next commitfest; I intend to check
it for bitrot and update it before the CF starts.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] erroneous restore into pg_catalog schema
Greg,

* Greg Stark (st...@mit.edu) wrote:
> On Tue, May 14, 2013 at 11:59 AM, Stephen Frost sfr...@snowman.net wrote:
>> * Dimitri Fontaine (dimi...@2ndquadrant.fr) wrote:
>>> I'm not sure I agree with that view about pg_catalog. Sometimes we
>>> talk about moving some parts of core in pre-installed extensions
>>> instead, and if we do that we will want those extensions to install
>>> themselves into pg_catalog.
>>
>> For my part, I'd still prefer to have those go into a different schema
>> than into pg_catalog. Perhaps that's overkill but I really do like the
>> separation of system tables from extensions which can be added and
>> removed..
>
> This was discussed previously. It's a bad idea. It's very tempting but
> it doesn't scale. Then every user needs to know every schema for every
> extension they might want to use.

Having a schema that isn't pg_catalog doesn't necessarily mean we need a
schema per extension. Just a 'pg_extensions' schema, which is added to
search_path behind the scenes (just like pg_catalog..), would be better
than having everything go into pg_catalog. I'd prefer that we let the
admins control both where extensions get installed to and what schemas
are added to the end of the search_path.

> It's exactly equivalent to the very common pattern of sysadmins
> installing things into /usr/local/apache, /usr/local/kde,
> /usr/local/gnome, /usr/local/pgsql, etc. Then every user needs a
> mile-long PATH, LD_LIBRARY_PATH, JAVACLASSPATH, etc. And every user has
> a slightly different ordering and slightly different subset of
> directories in their paths, resulting in different behaviours and
> errors for each user.

This would be because admins can't maintain control over the PATH
variable in every shell, not even to simply add things to it, and so
users end up building up their own PATH by hand over time. What's more,
even with a distro like Debian, you don't keep all of your system
configuration (eg: /etc) in the same place that all the user-called
binaries (/usr/bin) go, nor do you put the libraries (eg: functions in
extensions which are not intended to be user-facing) in the same place as
binaries.

> A correctly integrated package will use standard locations and then
> users can simply refer to the standard locations and find what's been
> installed. It would be ok to have a schema for all extensions
> separately from the core, but it can't be a schema for each extension
> or else we might as well not have the extension mechanism at all. Users
> would still need to install the extension by editing their config to
> refer to it.

... because we don't give the admins (or even the extensions
themselves..) any ability to handle this.

Thanks,

	Stephen
Re: [HACKERS] erroneous restore into pg_catalog schema
Greg Stark st...@mit.edu writes:
> On Tue, May 14, 2013 at 11:59 AM, Stephen Frost sfr...@snowman.net wrote:
>> For my part, I'd still prefer to have those go into a different schema
>> than into pg_catalog. Perhaps that's overkill but I really do like the
>> separation of system tables from extensions which can be added and
>> removed..
>
> This was discussed previously. It's a bad idea. It's very tempting but
> it doesn't scale. Then every user needs to know every schema for every
> extension they might want to use.

+1

Your description of how bad this idea is is the best I've read, I think:

> It's exactly equivalent to the very common pattern of sysadmins
> installing things into /usr/local/apache, /usr/local/kde,
> /usr/local/gnome, /usr/local/pgsql, etc. Then every user needs a
> mile-long PATH, LD_LIBRARY_PATH, JAVACLASSPATH, etc. And every user has
> a slightly different ordering and slightly different subset of
> directories in their paths, resulting in different behaviours and
> errors for each user.
>
> A correctly integrated package will use standard locations and then
> users can simply refer to the standard locations and find what's been
> installed. It would be ok to have a schema for all extensions
> separately from the core, but it can't be a schema for each extension
> or else we might as well not have the extension mechanism at all. Users
> would still need to install the extension by editing their config to
> refer to it.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] erroneous restore into pg_catalog schema
On Mon, Jun 10, 2013 at 2:03 PM, Stephen Frost sfr...@snowman.net wrote:
> Having a schema that isn't pg_catalog doesn't necessarily mean we need
> a schema per extension. Just a 'pg_extensions' schema, which is added
> to search_path behind the scenes (just like pg_catalog..), would be
> better than having everything go into pg_catalog.

Well, no objection here. That's just like having /usr/local/{lib,bin,etc}.

> I'd prefer that we let the admins control both where extensions get
> installed to and what schemas are added to the end of the search_path.

This I object to. That's like having /usr/local/{apache,pgsql,kde,gnome}/bin.

-- 
greg
Re: [HACKERS] Hard limit on WAL space used (because PANIC sucks)
From: Craig Ringer cr...@2ndquadrant.com
> The problem is that WAL for all tablespaces is mixed together in the
> archives. If you lose your tablespace then you have to keep *all* WAL
> around and replay *all* of it again when the tablespace comes back
> online. This would be very inefficient, and would require a lot of
> tricks to cope with applying WAL to a database that has an on-disk
> state in the future as far as the archives are concerned. It's not as
> simple as just replaying all WAL all over again - as I understand it,
> things like CLUSTER or TRUNCATE will result in relfilenodes not being
> where they're expected to be as far as old WAL archives are concerned.
> Selective replay would be required, and that leaves the door open to
> all sorts of new and exciting bugs in areas that'd hardly ever get
> tested.

Although I still lack understanding of the PostgreSQL implementation, I
have an optimistic feeling that such complexity would not be required.
While a tablespace is offline, subsequent access from new or existing
transactions is rejected with an error "tablespace is offline". So new
WAL records would not be generated for the offline tablespace. To take
the tablespace back online, the DBA performs per-tablespace archive
recovery: restore the tablespace's data files from the backup, then read
through the archived and pg_xlog/ WAL as usual, and selectively apply WAL
records for the tablespace. I don't think it's a must-be-fixed problem
that the WAL for all tablespaces is mixed in one location. I suppose we
can tolerate that archive recovery takes a long time.

> To solve the massive disk space explosion problem I imagine we'd have
> to have per-tablespace WAL. That'd cause a *huge* increase in fsync
> costs and loss of the rather nice property that WAL writes are nice
> sequential writes. It'd be complicated and probably cause nightmares
> during recovery, for archive-based replication, etc.

Per-tablespace WAL is very interesting for another reason --
massive-scale OLTP for database consolidation. This feature would
certainly be a breakthrough for amazing performance, because WAL is
usually the last bottleneck in OLTP. Yes, I can imagine recovery would be
much, much more complicated.

> None of these options seem exactly simple or pretty, especially given
> the additional complexities that'd be involved in allowing WAL records
> to be applied out-of-order, something that AFAIK _never_ happens at the
> moment.

As I mentioned above, in my shallow understanding, it seems that the
additional complexities can be controlled.

> The key problem, of course, is that this all sounds like a lot of
> complicated work for a case that's not really supposed to happen. Right
> now, the answer is "your database is unrecoverable, switch to your
> streaming warm standby and re-seed it from the standby". Not pretty,
> but at least there's the option of using a sync standby and avoiding
> data loss.

Sync standby... maybe. Let me consider this.

> How would you approach this?

Thanks Craig, you gave me some interesting insights. All of these topics
are interesting, and I'd like to work on them when I have acquired enough
knowledge and experience in PostgreSQL development.

Regards
MauMau
Re: [HACKERS] ALTER TABLE ... ALTER CONSTRAINT
Andres Freund and...@2ndquadrant.com writes:
> I haven't looked at the patch in detail, but I am very, very much in
> favor of the feature in general… I have wished for this more than once,

+1
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] Configurable location for extension .control files
Dimitri Fontaine dimi...@2ndquadrant.fr writes:
> For sites where they don't have in-house system packagers, it would be
> very welcome to be able to set up PostgreSQL in a way that allows it to
> LOAD the extension's binary code (.so, .dll, .dylib) from a
> non-root-owned place even if you installed it from official packages.
> Of course, it wouldn't be activated by default, and it would be frowned
> upon in the docs for conservative production environments.
>
> The use case still seems valid to me, and would nicely complete the
> Extension Templates facility given the right tooling. Can I prepare a
> patch allowing PostgreSQL to load extension control files and modules
> from a non-default place in the file-system? Please?

I don't see the need for this. The sort of situation you're describing
probably has PG installed at a non-default install --prefix anyway, and
thus the standard extension directory is already not root-owned.

More generally, it seems pretty insane to me to want to configure a
trusted PG installation so that it can load C code from an untrusted
place. The trust level cannot be any higher than the weakest link. Thus,
I don't see a scenario in which any packager would ship binaries using
such an option, even if it existed.

			regards, tom lane
Re: [HACKERS] JSON and unicode surrogate pairs
Andrew Dunstan and...@dunslane.net writes:
> After thinking about this some more I have come to the conclusion that
> we should only do any de-escaping of \u sequences, whether or not they
> are for BMP characters, when the server encoding is utf8. For any other
> encoding, which is already a violation of the JSON standard anyway, and
> should be avoided if you're dealing with JSON, we should just pass them
> through even in text output. This will be a simple and very localized
> fix.

Hmm. I'm not sure that users will like this definition --- it will seem
pretty arbitrary to them that conversion of \u sequences happens in some
databases and not others.

> We'll still have to deal with this issue when we get to binary storage
> of JSON, but that's not something we need to confront today.

Well, if we have to break backwards compatibility when we try to do
binary storage, we're not going to be happy either. So I think we'd
better have a plan in mind for what will happen then.

			regards, tom lane
Re: [HACKERS] Configurable location for extension .control files
On 2013-06-10 10:13:45 -0400, Tom Lane wrote:
> Dimitri Fontaine dimi...@2ndquadrant.fr writes:
>> For sites where they don't have in-house system packagers, it would be
>> very welcome to be able to set up PostgreSQL in a way that allows it
>> to LOAD the extension's binary code (.so, .dll, .dylib) from a
>> non-root-owned place even if you installed it from official packages.
>> Of course, it wouldn't be activated by default, and it would be
>> frowned upon in the docs for conservative production environments.
>> The use case still seems valid to me, and would nicely complete the
>> Extension Templates facility given the right tooling. Can I prepare a
>> patch allowing PostgreSQL to load extension control files and modules
>> from a non-default place in the file-system? Please?
>
> I don't see the need for this. The sort of situation you're describing
> probably has PG installed at a non-default install --prefix anyway, and
> thus the standard extension directory is already not root-owned.

Why does it need to be a non-distribution postgres? A customer needing to
develop a postgres extension is a fairly frequent thing, but often enough
they are still happy to use a distribution postgres.

> More generally, it seems pretty insane to me to want to configure a
> trusted PG installation so that it can load C code from an untrusted
> place. The trust level cannot be any higher than the weakest link.
> Thus, I don't see a scenario in which any packager would ship binaries
> using such an option, even if it existed.

I fail to see the logic here. We do allow LOAD with arbitrary paths. We
do allow untrusted languages. We do allow specifying arbitrary paths in
'C' CREATE FUNCTION statements. ... Sure, all of that requires superuser,
but so does anything in an extension that can cause an .so to be loaded?
Why are extensions different?

Greetings,

Andres Freund

--
Andres Freund                      http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Configurable location for extension .control files
On 2013-06-10 10:39:48 -0400, Tom Lane wrote:
> Andres Freund and...@2ndquadrant.com writes:
>> On 2013-06-10 10:13:45 -0400, Tom Lane wrote:
>>> More generally, it seems pretty insane to me to want to configure a
>>> trusted PG installation so that it can load C code from an untrusted
>>> place. The trust level cannot be any higher than the weakest link.
>>> Thus, I don't see a scenario in which any packager would ship
>>> binaries using such an option, even if it existed.
>
>> I fail to see the logic here.
>
> You are confusing location in the filesystem with permissions. Assuming
> that a sysadmin wants to allow, say, the postgres DBA to install random
> extensions, all he has to do is adjust the permissions on the
> .../extension directory to allow that (or not). Putting the extension
> directory somewhere else doesn't change that meaningfully, it just
> makes things more confusing and hence error-prone.

That's different because it a) affects all clusters on the machine and
b) will get reversed by package management on the next update.

> In any case, no packager is going to ship an insecure-by-default
> configuration, which is what Dimitri seems to be fantasizing would
> happen. It would have to be a local option to relax the permissions on
> the directory, no matter where it is.

*I* don't want that at all. All I'd like to have is a postgresql.conf
option specifying additional locations.

Greetings,

Andres Freund

--
Andres Freund                      http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] erroneous restore into pg_catalog schema
Stephen Frost sfr...@snowman.net writes:
> Having a schema that isn't pg_catalog doesn't necessarily mean we need
> a schema per extension. Just a 'pg_extensions' schema, which is added
> to search_path behind the scenes (just like pg_catalog..), would be
> better than having everything go into pg_catalog. I'd prefer that we
> let the admins control both where extensions get installed to and what
> schemas are added to the end of the search_path.

That was discussed in the scope of the first extension patch, and it took
us about a year to conclude not to try to solve search_path at the same
time as extensions. I'm not convinced we've had extensions for long
enough to be able to reach a conclusion already, but I'll watch that
conversation happen again with friendly interest.

My opinion is that a pg_extension schema with a proper and well
documented set of search_path properties would be good to have. A way to
rename it per-database doesn't strike me as that useful as long as we
have ALTER EXTENSION … SET SCHEMA …

The current default schema for installing extensions being public, almost
anything we do on that front will be an improvement.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] Configurable location for extension .control files
Andres Freund and...@2ndquadrant.com writes:
>> In any case, no packager is going to ship an insecure-by-default
>> configuration, which is what Dimitri seems to be fantasizing would
>> happen. It would have to be a local option to relax the permissions on
>> the directory, no matter where it is.
>
> *I* don't want that at all. All I'd like to have is a postgresql.conf
> option specifying additional locations.

Same from me. I think I would even take a non-plural "location".

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] Configurable location for extension .control files
Andres Freund and...@2ndquadrant.com writes:
> On 2013-06-10 10:13:45 -0400, Tom Lane wrote:
>> More generally, it seems pretty insane to me to want to configure a
>> trusted PG installation so that it can load C code from an untrusted
>> place. The trust level cannot be any higher than the weakest link.
>> Thus, I don't see a scenario in which any packager would ship binaries
>> using such an option, even if it existed.
>
> I fail to see the logic here.

You are confusing location in the filesystem with permissions. Assuming
that a sysadmin wants to allow, say, the postgres DBA to install random
extensions, all he has to do is adjust the permissions on the
.../extension directory to allow that (or not). Putting the extension
directory somewhere else doesn't change that meaningfully, it just makes
things more confusing and hence error-prone.

In any case, no packager is going to ship an insecure-by-default
configuration, which is what Dimitri seems to be fantasizing would
happen. It would have to be a local option to relax the permissions on
the directory, no matter where it is.

			regards, tom lane
Re: [HACKERS] erroneous restore into pg_catalog schema
* Dimitri Fontaine (dimi...@2ndquadrant.fr) wrote:
> My opinion is that a pg_extension schema with a proper and well
> documented set of search_path properties would be good to have. A way
> to rename it per-database doesn't strike me as that useful as long as
> we have ALTER EXTENSION … SET SCHEMA …

While having one place to put everything sounds great, it doesn't do a
whole lot of good if you consider conflicts - either because you want
multiple versions available or because there just happens to be some
overlap in function names (or similar). There are also extensions which
have more than just functions in them but also tables, which increases
the chances of a conflict happening. Having the extension authors end up
having to prefix everything with the name of the extension to avoid
conflicts would certainly be worse than actually using schemas.

Again, in PG there's a lot more control which the database admin has
and, imv, DBAs are going to be able to manage the extensions if they're
given the right tools. Saying "dump everything in one place because
that's the only place we can be sure all users will look at" just
doesn't fit. There also isn't one central authority which deals with how
extension components are named, unlike with package-based systems where
Debian or Red Hat or someone deals with those issues. Lastly, afaik, we
don't have any 'divert' or 'alternatives' type of system for dealing
with legitimate conflicts when they happen (and they will..).

Basically, there's a lot of infrastructure that goes into making "put
everything in /usr/bin" work, and we haven't got any of it, while we
also don't have the problem that is individual user shells with unique
.profile/.bashrc/.tcshrc files that set PATH variables.

> The current default schema where to install extensions being public,
> almost anything we do on that front will be an improvement.

Indeed.. I've never liked that.

Thanks,

	Stephen
Re: [HACKERS] erroneous restore into pg_catalog schema
* Greg Stark (st...@mit.edu) wrote: I'd prefer that we let the admins control both where extensions get installed to and what schemas are added to the end of the search_path. This I object to. That's like having /usr/local/{apache,pgsql,kde,gnome}/bin. ... or it's like giving the admins the ability to manage their systems and deal with conflicts or issues that we don't currently have any way to handle. Thanks, Stephen
Re: [HACKERS] pg_filedump 9.3: checksums (and a few other fixes)
Jeff Davis wrote: I was hesitant to do too much interpretation of the bits. Do you think it would be better to just remove the test for XMAX_SHR_LOCK? I don't know, but then I'm biased because I know what that specific bit combination means. I guess someone that doesn't know is going to be surprised by seeing both lock strength bits together .. but maybe they're just going to have a look at htup_details.h and instantly understand what's going on. Not sure how likely that is. -- Álvaro Herrera   http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] erroneous restore into pg_catalog schema
Stephen Frost sfr...@snowman.net writes: While having one place to put everything sounds great, it doesn't do a whole lot of good if you consider conflicts- either because you want multiple versions available or because there just happens to be some overlap in function names (or similar). There are also extensions which have more than just functions in them but also tables, which increases the chances of a conflict happening. Having the extension authors end up having to prefix everything with the name of the extension to avoid conflicts would certainly be worse than actually using schemas. Now you're not talking about *default* settings anymore, or are you? Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON and unicode surrogate pairs
On 06/10/2013 10:18 AM, Tom Lane wrote: Andrew Dunstan and...@dunslane.net writes: After thinking about this some more I have come to the conclusion that we should only do any de-escaping of \u sequences, whether or not they are for BMP characters, when the server encoding is utf8. For any other encoding, which is already a violation of the JSON standard anyway, and should be avoided if you're dealing with JSON, we should just pass them through even in text output. This will be a simple and very localized fix. Hmm. I'm not sure that users will like this definition --- it will seem pretty arbitrary to them that conversion of \u sequences happens in some databases and not others. Then what should we do when there is no matching codepoint in the database encoding? First we'll have to delay the evaluation so it's not done over-eagerly, and then we'll have to try the conversion and throw an error if it doesn't work. The second part is what's happening now, but the delayed evaluation is not. Or we could abandon the conversion altogether, but that doesn't seem very friendly either. I suspect the biggest case for people to use these sequences is where the database is UTF8 but the client encoding is not. Frankly, if you want to use Unicode escapes, you should really be using a UTF8 encoded database if at all possible. We'll still have to deal with this issue when we get to binary storage of JSON, but that's not something we need to confront today. Well, if we have to break backwards compatibility when we try to do binary storage, we're not going to be happy either. So I think we'd better have a plan in mind for what will happen then. I don't see any reason why we couldn't store the JSON strings with the Unicode escape sequences intact in the binary format. What the binary format buys us is that it has decomposed the JSON into a tree structure, so instead of parsing the JSON we can just walk the tree, but the leaf nodes of the tree are still (in the case of the nodes under discussion) text-like objects. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SPGist triple parity concept doesn't work
That would work fine as long as the invariant is maintained accurately. However, there are at least two cases where the existing code fails to maintain the invariant: Hmm. Didn't catch that during development. Thoughts? Give me some time to play around it. Will think. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_filedump 9.3: checksums (and a few other fixes)
Alvaro Herrera alvhe...@2ndquadrant.com writes: Jeff Davis wrote: I was hesitant to do too much interpretation of the bits. Do you think it would be better to just remove the test for XMAX_SHR_LOCK? I don't know, but then I'm biased because I know what that specific bit combination means. I guess someone that doesn't know is going to be surprised by seeing both lock strength bits together .. but maybe they're just going to have a look at htup_details.h and instantly understand what's going on. Not sure how likely that is. I think it's all right because there are only 4 combinations of the two bits and all 4 will be printed sensibly if we do it this way. It would be bad if pg_filedump could print invalid flag combinations in a misleading way --- but I don't see such a risk here. So we might as well go for readability. The thing I'm not too happy about is having to copy the checksum code into pg_filedump. We just got rid of the need to do that for the CRC code, and here it is coming back again. Can't we rearrange the core checksum code similarly to what we did for the CRC stuff recently, so that you only need access to a .h file for it? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SPGist triple parity concept doesn't work
On 7 June 2013 02:32, Tom Lane t...@sss.pgh.pa.us wrote: Hm, good point. That reinforces my feeling that the page-number-based approach isn't workable as a guarantee; though we might want to keep that layout rule as a heuristic that would help reduce contention. Can the locks just be taken in, say, numeric order of the pages involved? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON and unicode surrogate pairs
Andrew Dunstan and...@dunslane.net writes: Or we could abandon the conversion altogether, but that doesn't seem very friendly either. I suspect the biggest case for people to use these sequences is where the database is UTF8 but the client encoding is not. Well, if that's actually the biggest use-case, then maybe we should just say we're *not* in the business of converting those escapes. That would make things nice and consistent regardless of the DB encoding, and it would avoid the problem of being able to input a value and then not being able to output it again. It's legal, is it not, to just write the equivalent Unicode character in the JSON string and not use the escapes? If so I would think that that would be the most common usage. If someone's writing an escape, they probably had a reason for doing it that way, and might not appreciate our overriding their decision. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
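To make the two input forms concrete, a minimal psql sketch (assuming a UTF8 database; in 9.3 the json type stores the input text verbatim either way, and de-escaping only happens on text output):

select '{"a": "©"}'::json;              -- literal character, kept as-is
select '{"a": "\u00a9"}'::json;         -- escape sequence, also kept as-is
select '{"a": "\u00a9"}'::json ->> 'a'; -- text-returning operator: de-escaped to ©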
Re: [HACKERS] SPGist triple parity concept doesn't work
Teodor Sigaev teo...@sigaev.ru writes: Thoughts? Give me some time to play around it. Will think. I experimented a bit with the idea of taking a heavyweight lock to represent the right to alter an inner tuple. The results were pretty grim: an spgist index build example went from 225 ms to 380 ms, and a test case involving concurrent insertions (basically the complainant's original example pgbench-ified) went from 5400 tps to 4300 tps. I hadn't realized our heavyweight lock code was quite that expensive :-( Anyway I now think that we might be better off with the other idea of abandoning an insertion and retrying if we get a lock conflict. That would at least not create any performance penalty for non-concurrent scenarios; and even in concurrent cases, I suspect you'd have to be rather unlucky to get penalties as bad as the heavyweight-lock solution is showing. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON and unicode surrogate pairs
On 06/10/2013 11:43 AM, Tom Lane wrote: Andrew Dunstan and...@dunslane.net writes: Or we could abandon the conversion altogether, but that doesn't seem very friendly either. I suspect the biggest case for people to use these sequences is where the database is UTF8 but the client encoding is not. Well, if that's actually the biggest use-case, then maybe we should just say we're *not* in the business of converting those escapes. That would make things nice and consistent regardless of the DB encoding, and it would avoid the problem of being able to input a value and then not being able to output it again. It's legal, is it not, to just write the equivalent Unicode character in the JSON string and not use the escapes? If so I would think that that would be the most common usage. If someone's writing an escape, they probably had a reason for doing it that way, and might not appreciate our overriding their decision.

We never store the converted values in the JSON object, nor do we return them from functions that return JSON. But many of the functions and operators that process the JSON have variants that return text instead of JSON, and in those cases, when the value returned is a JSON string, we do the following to it:

* strip the outside quotes
* de-escape the various escaped characters (i.e. everything preceded by a backslash in the railroad diagram for string at http://www.json.org/)

Here's an example of the difference:

andrew=# select '{ "a": "\u00a9"}'::json -> 'a';
 ?column?
----------
 "\u00a9"
(1 row)

andrew=# select '{ "a": "\u00a9"}'::json ->> 'a';
 ?column?
----------
 ©
(1 row)

It's the process of producing the latter that is giving us a headache in non-UTF8 databases.

... [ more caffeine is consumed ] ...

I have just realized that the problem is actually quite a lot bigger than that. We also use this value for field name comparison. So, let us suppose that we have a LATIN1 database and a piece of JSON with a field name containing the Euro sign (\u20ac), a character that is not in LATIN1. Making that processable so it doesn't blow up would be mighty tricky and error prone. The non-orthogonality I suggested as a solution upthread is, by contrast, very small and easy to manage, and not terribly hard to explain - see attached.

cheers

andrew

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 3adb365..592420a 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -10161,6 +10161,17 @@ table2-mapping
   <note>
    <para>
+    The text-returning variants of these functions and operators will convert Unicode escapes
+    in the JSON text to the appropriate UTF8 character when the database encoding is UTF8. In
+    other encodings the escape sequence is simply preserved as part of the text value, since we
+    can't be sure that the Unicode code point has a matching code point in the database encoding.
+    In general, it is best to avoid mixing Unicode escapes in JSON with a non-UTF8 database
+    encoding, if possible.
+   </para>
+  </note>
+
+  <note>
+   <para>
     The <xref linkend="hstore"> extension has a cast from <type>hstore</type> to
     <type>json</type>, so that converted <type>hstore</type> values are
     represented as JSON objects, not as string values.
diff --git a/src/backend/utils/adt/json.c b/src/backend/utils/adt/json.c
index d8046c5..bb8aa4f 100644
--- a/src/backend/utils/adt/json.c
+++ b/src/backend/utils/adt/json.c
@@ -717,7 +717,6 @@ json_lex_string(JsonLexContext *lex)
 				{
 					char		utf8str[5];
 					int			utf8len;
-					char	   *converted;
 
 					if (ch >= 0xd800 && ch <= 0xdbff)
 					{
@@ -749,13 +748,31 @@ json_lex_string(JsonLexContext *lex)
 								 errdetail("low order surrogate must follow a high order surrogate."),
 								 report_json_context(lex)));
 
-					unicode_to_utf8(ch, (unsigned char *) utf8str);
-					utf8len = pg_utf_mblen((unsigned char *) utf8str);
-					utf8str[utf8len] = '\0';
-					converted = pg_any_to_server(utf8str, utf8len, PG_UTF8);
-					appendStringInfoString(lex->strval, converted);
-					if (converted != utf8str)
-						pfree(converted);
+					/*
+					 * For UTF8, replace the escape sequence by the actual utf8
+					 * character in lex->strval. For other encodings, just pass
+					 * the escape sequence through, since the chances are very
+					 * high that the database encoding won't have a matching
+					 * codepoint - that's one of the possible reasons that the
+					 * user used unicode escapes in the first place.
+					 */
+
+					if (GetDatabaseEncoding() == PG_UTF8)
+					{
+						unicode_to_utf8(ch, (unsigned char *) utf8str);
+						utf8len = pg_utf_mblen((unsigned char *) utf8str);
+						appendBinaryStringInfo(lex->strval, utf8str, utf8len);
+					}
+					else if (ch >= 0x10000)
+					{
+						/* must have been a surrogate pair */
+						appendBinaryStringInfo(lex->strval, s-12, 12);
+					}
+					else
+					{
+						/*
Re: [HACKERS] pg_filedump 9.3: checksums (and a few other fixes)
On Mon, 2013-06-10 at 11:38 -0400, Tom Lane wrote: The thing I'm not too happy about is having to copy the checksum code into pg_filedump. We just got rid of the need to do that for the CRC code, and here it is coming back again. Can't we rearrange the core checksum code similarly to what we did for the CRC stuff recently, so that you only need access to a .h file for it? The CRC implementation is entirely in header files. Do you think we need to go that far, or is it fine to just put it in libpgport and link that to pg_filedump? Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_filedump 9.3: checksums (and a few other fixes)
Jeff Davis wrote: On Mon, 2013-06-10 at 11:38 -0400, Tom Lane wrote: The thing I'm not too happy about is having to copy the checksum code into pg_filedump. We just got rid of the need to do that for the CRC code, and here it is coming back again. Can't we rearrange the core checksum code similarly to what we did for the CRC stuff recently, so that you only need access to a .h file for it? The CRC implementation is entirely in header files. Do you think we need to go that far, or is it fine to just put it in libpgport and link that to pg_filedump? If a lib is okay, use libpgcommon please, not libpgport. But I think a .h would be better, because there'd be no need to have a built source tree to build pg_filedump, only the headers installed. -- Álvaro Herrera   http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_filedump 9.3: checksums (and a few other fixes)
Alvaro Herrera alvhe...@2ndquadrant.com writes: Jeff Davis wrote: The CRC implementation is entirely in header files. Do you think we need to go that far, or is it fine to just put it in libpgport and link that to pg_filedump? If a lib is okay, use libpgcommon please, not libpgport. But I think a .h would be better, because there'd be no need to have a built source tree to build pg_filedump, only the headers installed. Neither lib would provide a useful solution so far as Red Hat's packaging is concerned, because those libs are not exposed to other packages (and never will be, unless we start supplying them as .so's) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] pg_isready (was: [WIP] pg_ping utility)
On Mon, Jun 3, 2013 at 2:14 AM, Fujii Masao masao.fu...@gmail.com wrote: Sorry for the delay in responding to you. On Mon, Feb 11, 2013 at 6:03 AM, Phil Sorber p...@omniti.com wrote: On Fri, Feb 8, 2013 at 1:07 PM, Phil Sorber p...@omniti.com wrote: On Fri, Feb 8, 2013 at 12:46 PM, Fujii Masao masao.fu...@gmail.com wrote: No maybe. But I think that all the client commands should follow the same rule. Otherwise a user would get confused when specifying options. I would consider the rest of the apps using it as a consensus. I will make sure it aligns in behavior. I've done as you suggested, and made sure they align with other command line utils. What I have found is that dbname is passed (almost) last in the param array so that it clobbers all previous values. I have made this patch as minimal as possible basing it off of master and not off of my previous attempt. For the record I still like the overall design of my previous attempt better, but I have not included a new version based on that here so as not to confuse the issue, however I would gladly do so upon request. Updated patch attached. Thanks! The patch basically looks good to me. I updated the patch against current master and fixed some problems: - Handle the hostaddr in the connection string properly. - Remove the character 'V' from the third argument of getopt_long(). - Handle the error cases of PQconninfoParse() and PQconndefaults(). - etc... Please see the attached patch. Committed. Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Vacuum, Freeze and Analyze: the big picture
I suspect vacuum, autovacuum, autovacuum tuning, table and index bloat, etc is just too complicated for a lot of people running Pg installs to really understand. I'd really, really love to see some feedback-based auto-tuning of vacuum. Heck, it's hard for *me* to understand, and I helped design it. I think there's no question that it could be vastly improved. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Freezing without write I/O
On 01.06.2013 23:21, Robert Haas wrote: On Sat, Jun 1, 2013 at 2:48 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: We define a new page-level bit, something like PD_RECENTLY_FROZEN. When this bit is set, it means there are no unfrozen tuples on the page with XIDs that predate the current half-epoch. Whenever we know this to be true, we set the bit. If the page LSN crosses more than one half-epoch boundary at a time, we freeze the page and set the bit. If the page LSN crosses exactly one half-epoch boundary, then (1) if the bit is set, we clear it and (2) if the bit is not set, we freeze the page and set the bit. Yep, I think that would work. Want to write the patch, or should I? ;-) Have at it. Here's a first draft. A lot of stuff missing and broken, but make check passes :-).

In the patch, instead of working with half-epochs, there are XID-LSN ranges, which can be of arbitrary size. An XID-LSN range consists of three values:

minlsn: The point in WAL where this range begins.
minxid - maxxid: The range of XIDs allowed in this range.

Every point in WAL belongs to exactly one range. The minxid-maxxid of the ranges can overlap. For example:

1. XIDs 25000942 - 2703    LSN > 0/3BB9938
2. XIDs 23000742 - 2603    LSN > 0/2AB9288
3. XIDs 22000721 - 2503    LSN > 0/1AB8BE0
4. XIDs 2202 - 2403        LSN > 0/10B1550

The invariant with the ranges is that a page with a certain LSN is only allowed to contain XIDs that belong to the range specified by that LSN. For example, if a page has LSN 0/350, it belongs to the 2nd range, and can only contain XIDs between 23000742 - 2603. If a backend updates the page, so that its LSN is updated to, say, 0/3D12345, all XIDs on the page older than 25000942 must be frozen first, to avoid violating the rule.

The system keeps track of a small number of these XID-LSN ranges. Where we currently truncate clog, we can also truncate the ranges with maxxid < the clog truncation point. Vacuum removes any dead tuples and updates relfrozenxid as usual, to make sure that there are no dead tuples or aborted XIDs older than the minxid of the oldest tracked XID-LSN range. It no longer needs to freeze old committed XIDs, however - that's the gain from this patch (except to uphold the invariant, if it has to remove some dead tuples on the page and update its LSN).

A new range is created whenever we reach the maxxid on the current one. The new range's minxid is set to the current global oldest xmin value, and maxxid is just the old maxxid plus a fixed number (1 million in the patch, but we probably want a higher value in reality). This ensures that if you modify a page and update its LSN, all the existing XIDs on the page that cannot be frozen yet are greater than the minxid of the latest range. In other words, you can always freeze old XIDs on a page, so that any remaining non-frozen XIDs are within the minxid-maxxid of the latest range.

The HeapTupleSatisfies functions are modified to look at the page's LSN first. If it's old enough, it doesn't look at the XIDs on the page level at all, and just considers everything on the page as visible to everyone (I'm calling this state a "mature" page).

I think the tricky part is going to be figuring out the synchronization around half-epoch boundaries.

Yep. I skipped all those difficult parts in this first version. There are two race conditions that need to be fixed:

1. When a page is updated, we check if it needs to be frozen. If its LSN is greater than the latest range's minlsn - IOW, if we've already modified the page, and thus frozen all older tuples, within the current range - no freezing is needed. However, it's possible that a new range is created immediately after we've checked that. When we then proceed to do the actual update on the page and WAL-log that, the new LSN falls within the next range, and we should've frozen the page. I'm planning to fix that by adding a parity bit on the page header. Each XID-LSN range is assigned a parity bit, 0 or 1. When we check if a page needs to be frozen on update, we make note of the latest range's parity bit, and write it in the page header. Later, when we look at the page's LSN to determine which XID-LSN range it belongs to, we compare the parity. If the parity doesn't match, we know that the race condition happened, so we treat the page as belonging to the previous range, not the one it would normally belong to, per the LSN.

2. When we look at a page, and determine that it's not old enough to be matured, we then check the clog as usual. A page is considered mature if the XID-LSN range (and corresponding clog pages) has already been truncated away. It's possible that between those steps, the XID-LSN range and clog are truncated away, so that the backend tries to access a clog page that doesn't exist anymore. To fix that, the XID-LSN range and clog truncation needs to be done in two steps. First, mark the truncation point in shared memory. Then
Re: [HACKERS] Hard limit on WAL space used (because PANIC sucks)
Josh, Daniel,

Right now, what we're telling users is "You can have continuous backup with Postgres, but you'd better hire an expensive consultant to set it up for you, or use this external tool of dubious provenance which there's no packages for, or you might accidentally cause your database to shut down in the middle of the night."

This is an outright falsehood. We are telling them, "You better know what you are doing" or "You should call a consultant". This is no different than, "You better know what you are doing" or "You should take driving lessons".

What I'm pointing out is that there is no simple case for archiving the way we have it set up. That is, every possible way to deploy PITR for Postgres involves complex, error-prone configuration, setup, and monitoring. I don't think that's necessary; simple cases should have simple solutions. If you do a quick survey of pgsql-general, you will see that the issue of databases shutting down unexpectedly due to archiving running them out of disk space is a very common problem. People shouldn't be afraid of their backup solutions.

I'd agree that one possible answer for this is to just get one of the external tools simplified, well-packaged, distributed, instrumented for common monitoring systems, and referenced in our main documentation.

I'd say Barman is the closest to a simple solution for the simple common case, at least for PITR. I've been able to give some clients Barman and have them deploy it themselves. This isn't true of the other tools I've tried. Too bad it's GPL, and doesn't do archiving-for-streaming.

I have a clear bias in experience here, but I can't relate to someone who sets up archives but is totally okay losing a segment unceremoniously, because it only takes one of those once in a while to make a really, really bad day. Who is this person that lackadaisically archives, and are they just fooling themselves? And where are these archivers that

If WAL archiving is your *second* level of redundancy, you will generally be willing to have it break rather than interfere with the production workload. This is particularly the case if you're using archiving just as a backup for streaming replication. Heck, I've had one client where archiving was being used *only* to spin up staging servers, and not for production at all; do you think they wanted production to shut down if they ran out of archive space (which it did)?

I'll also point out that archiving can silently fail for a number of reasons having nothing to do with safety options, such as an NFS mount in Linux silently going away (I've also had this happen), or network issues causing file corruption. Which just points out that we need better ways to detect gaps/corruption in archiving.

Anyway, what I'm pointing out is that this is a business decision, and there is no way that we can make a decision for the users what to do when we run out of WAL space. And that the "stop archiving" option needs to be there for users, as well as the "shut down" option, *without* requiring users to learn the internals of the archiving system to implement it, or to know the implied effects of non-obvious PostgreSQL settings.

-- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bad error message on valuntil
* Tom Lane wrote: it supposes that rolvaliduntil represents an expiration date for the user, but really it's only an expiration date for the password.) Does anyone think the docs for CREATE ROLE/VALID UNTIL should mention this more clearly? Currently, it is described as: "The VALID UNTIL clause sets a date and time after which the role's password is no longer valid. If this clause is omitted the password will be valid for all time." This is entirely correct, but I think it could be made clearer by adding a sentence like: "This clause does not apply to authentication methods that do not involve a password, such as trust, ident, and GSSAPI." And at the top of section 19.3 (Authentication Methods): "Time restrictions for the logon of users controlled by an external authentication service, such as GSSAPI or PAM, can be imposed by that service only, not by PostgreSQL itself." -- Christian -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
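A short illustration of the point being documented - the timestamp gates only password-based authentication (the role name and date here are made up for the example):

CREATE ROLE bob LOGIN PASSWORD 'secret' VALID UNTIL '2014-01-01';

After 2014-01-01, md5/password logins for bob are rejected, but if pg_hba.conf maps bob's connections to trust, ident, or GSSAPI, bob can still log in.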
Re: [HACKERS] ALTER DEFAULT PRIVILEGES FOR ROLE is broken
Agreed. Seems reasonable. Yay! -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Configurable location for extension .control files
*I* don't want that at all. All I'd like to have is a postgresql.conf option specifying additional locations. Same from me. I think I would even take non-plural location. I don't personally see a reason for plural locations, but it would be nice if it recursed (that is, looked for .so's in subdirectories). My reason for this is that I work on applications which have in-house extensions as well as public ones, and I'd like to keep the two separated by directory. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] DO ... RETURNING
Hallo Everybody

As far as I can see, currently you can not return anything out of a DO (anonymous code) block. Something like

DO LANGUAGE plpythonu RETURNS TABLE (name text, uid int, gid int) $$
with open('/etc/passwd') as f:
    fields = f.readline().split(':')
    while fields:
        name, uid, gid = fields[0], int(fields[2]), int(fields[3])
        yield name, uid, gid
        fields = f.readline().split(':')
$$;

As I did not pay attention when DO was introduced, I thought it is faster to ask here than read all possibly relevant mails in archives. So: has there been a discussion on extending the DO construct with the ability to return data out of it, similar to what named functions do? If there was, then what were the arguments against doing this? Or was this just that it was not thought important at that time?

-- Hannu Krosing PostgreSQL Consultant Performance, Scalability and High Availability 2ndQuadrant Nordic OÜ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
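A side note on the example itself: f.readline() returns an empty string at EOF, and ''.split(':') yields [''], which is truthy, so the loop as written never terminates cleanly. Under the same hypothetical syntax (today's DO accepts no RETURNS clause), a safer body would iterate the file directly:

DO LANGUAGE plpythonu RETURNS TABLE (name text, uid int, gid int) $$
with open('/etc/passwd') as f:
    for line in f:
        fields = line.rstrip('\n').split(':')
        yield fields[0], int(fields[2]), int(fields[3])
$$;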
Re: [HACKERS] DO ... RETURNING
2013/6/10 Hannu Krosing ha...@2ndquadrant.com: Hallo Everybody As far as I can see, currently you can not return anything out of a DO (anonymous code) block. Something like DO LANGUAGE plpythonu RETURNS TABLE (name text, uid int, gid int) $$ with open('/etc/passwd') as f: fields = f.readline().split(':') while fields: name, uid, gid = fields[0], int(fields[2]),int(fields[3]) yield name, uid, gid fields = f.readline().split(':') $$; As I did not pay attention when DO was introduced, I thought it is faster to ask here than read all possibly relevant mails in archives So: has there been a discussion on extending the DO construct with ability to rturn data out of it, similar to what named functions do. If there was then what were the arguments against doing this ? Or was this just that it was not thought important at that time ? I don't like this idea. I know so DO is +/- function, but it is too restrict. I hope so we will have a procedures with possibility unbound queries. and then you can do DO $$ SELECT * FROM pg_class; SELECT * FROM pg_proc; ... $$ LANGUAGE SQL; and you don't need to define output structure - what is much more user friendly. Regards Pavel -- Hannu Krosing PostgreSQL Consultant Performance, Scalability and High Availability 2ndQuadrant Nordic OÜ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] DO ... RETURNING
* Pavel Stehule (pavel.steh...@gmail.com) wrote: 2013/6/10 Hannu Krosing ha...@2ndquadrant.com: If there was then what were the arguments against doing this ? I don't recall offhand, but it would be *extremely* useful to have. Or was this just that it was not thought important at that time ? For my part, without looking at what needs to happen for it, big +1 for adding it. I don't like this idea. I know so DO is +/- function, but it is too restrict. I hope so we will have a procedures with possibility unbound queries. I don't see that as an argument against adding support for what can be done today within our existing structures and API. and you don't need to define output structure - what is much more user friendly. Sure, some day this would be a nice addition. There's no need to hold up adding support for a defined table return type for DO waiting for this other feature to happen though. Thanks, Stephen
Re: [HACKERS] DO ... RETURNING
On Mon, Jun 10, 2013 at 09:23:19PM +0200, Pavel Stehule wrote: 2013/6/10 Hannu Krosing ha...@2ndquadrant.com: Hallo Everybody As far as I can see, currently you can not return anything out of a DO (anonymous code) block. Something like DO LANGUAGE plpythonu RETURNS TABLE (name text, uid int, gid int) $$ with open('/etc/passwd') as f: fields = f.readline().split(':') while fields: name, uid, gid = fields[0], int(fields[2]),int(fields[3]) yield name, uid, gid fields = f.readline().split(':') $$; As I did not pay attention when DO was introduced, I thought it is faster to ask here than read all possibly relevant mails in archives So: has there been a discussion on extending the DO construct with ability to rturn data out of it, similar to what named functions do. If there was then what were the arguments against doing this ? Or was this just that it was not thought important at that time ? I don't like this idea. I know so DO is +/- function, but it is too restrict. I hope so we will have a procedures with possibility unbound queries. and then you can do DO $$ SELECT * FROM pg_class; SELECT * FROM pg_proc; ... $$ LANGUAGE SQL; and you don't need to define output structure - what is much more user friendly. If I understand the proposal correctly, the idea is only to try to return something when DO is invoked with RETURNING. 1. Did I understand correctly, Hannu? 2. If I did, does this alleviate your concerns, Pavel? Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] DO ... RETURNING
2013/6/10 David Fetter da...@fetter.org: On Mon, Jun 10, 2013 at 09:23:19PM +0200, Pavel Stehule wrote: 2013/6/10 Hannu Krosing ha...@2ndquadrant.com: Hallo Everybody As far as I can see, currently you can not return anything out of a DO (anonymous code) block. Something like DO LANGUAGE plpythonu RETURNS TABLE (name text, uid int, gid int) $$ with open('/etc/passwd') as f: fields = f.readline().split(':') while fields: name, uid, gid = fields[0], int(fields[2]),int(fields[3]) yield name, uid, gid fields = f.readline().split(':') $$; As I did not pay attention when DO was introduced, I thought it is faster to ask here than read all possibly relevant mails in archives So: has there been a discussion on extending the DO construct with ability to rturn data out of it, similar to what named functions do. If there was then what were the arguments against doing this ? Or was this just that it was not thought important at that time ? I don't like this idea. I know so DO is +/- function, but it is too restrict. I hope so we will have a procedures with possibility unbound queries. and then you can do DO $$ SELECT * FROM pg_class; SELECT * FROM pg_proc; ... $$ LANGUAGE SQL; and you don't need to define output structure - what is much more user friendly. If I understand the proposal correctly, the idea is only to try to return something when DO is invoked with RETURNING. 1. Did I understand correctly, Hannu? 2. If I did, does this alleviate your concerns, Pavel? not too much. Two different concepts in one statement is not good idea. What using a cursors as temporary solution? BEGIN; DO $$ BEGIN OPEN mycursor AS SELECT * FROM blablabla; END $$ FETCH FROM mycursor; COMMIT; Still I don't like this idea, because you should to support DO RETURNING in other statements - like INSERT INTO DO RETURNING ??? What about local temporary functions ?? CREATE TEMPORARY FUNCTION xx(a int) RETURNES TABLE (xxx) SELECT * FROM xxx; Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
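For the record, the cursor workaround does already work today if spelled with plpgsql's refcursor type, which lets the DO block pin the portal name (OPEN ... FOR rather than OPEN ... AS; mycursor and the pg_class query are placeholders):

BEGIN;
DO $$
DECLARE
    c refcursor := 'mycursor';   -- fixes the portal name seen outside the block
BEGIN
    OPEN c FOR SELECT relname FROM pg_class;
END
$$;
FETCH ALL FROM mycursor;
COMMIT;

The portal only lives until transaction end, so the FETCH must happen inside the same transaction.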
Re: [HACKERS] Revisit items marked 'NO' in sql_features.txt
On 10 June 2013 00:39, Robins Tharakan thara...@gmail.com wrote: While reviewing sql_features.txt, found a few items marked NO ('Not supported') whereas, at the outset, they seemed to be supported. Apologies, if this is already considered and / or still marked 'NO' for a reason, but a list of such items mentioned below:

I think you'll need to check the standard yourself and report back. I don't suppose anybody remembers the details enough without checking the standard. There's usually a small gotcha, and we are scrupulous to report we either fully meet the standard or do not.

F202	TRUNCATE TABLE: identity column restart option	NO
F263	Comma-separated predicates in simple CASE expression	NO
T041	Basic LOB data type support	01	BLOB data type	NO
T051	Row types	NO
T174	Identity columns	NO
T176	Sequence generator support	NO
T177	Sequence generator support: simple restart option	NO
T522	Default values for IN parameters of SQL-invoked procedures	NO
T571	Array-returning external SQL-invoked functions	NO

-- Robins Tharakan -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Improvement of checkpoint IO scheduler for stable transaction responses
On 10 June 2013 11:51, KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp wrote: I create patch which is improvement of checkpoint IO scheduler for stable transaction responses. Looks like good results, with good measurements. Should be an interesting discussion. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] DO ... RETURNING
On 06/10/2013 09:34 PM, David Fetter wrote: If I understand the proposal correctly, the idea is only to try to return something when DO is invoked with RETURNING. 1. Did I understand correctly, Hannu? Yes. Of course we could default it to RETURNS SETOF RECORD :) 2. If I did, does this alleviate your concerns, Pavel? Cheers, David. -- Hannu Krosing PostgreSQL Consultant Performance, Scalability and High Availability 2ndQuadrant Nordic OÜ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
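Worth noting: if DO defaulted to RETURNS SETOF RECORD, callers would still have to attach a column definition list, just as with any record-returning function today (my_record_func is hypothetical):

SELECT * FROM my_record_func() AS t(name text, uid int, gid int);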
Re: [HACKERS] DO ... RETURNING
On 06/10/2013 09:45 PM, Pavel Stehule wrote: 2013/6/10 David Fetter da...@fetter.org: On Mon, Jun 10, 2013 at 09:23:19PM +0200, Pavel Stehule wrote: 2013/6/10 Hannu Krosing ha...@2ndquadrant.com: Hallo Everybody As far as I can see, currently you can not return anything out of a DO (anonymous code) block. Something like DO LANGUAGE plpythonu RETURNS TABLE (name text, uid int, gid int) $$ with open('/etc/passwd') as f: fields = f.readline().split(':') while fields: name, uid, gid = fields[0], int(fields[2]),int(fields[3]) yield name, uid, gid fields = f.readline().split(':') $$; As I did not pay attention when DO was introduced, I thought it is faster to ask here than read all possibly relevant mails in archives So: has there been a discussion on extending the DO construct with ability to rturn data out of it, similar to what named functions do. If there was then what were the arguments against doing this ? Or was this just that it was not thought important at that time ? I don't like this idea. I know so DO is +/- function, but it is too restrict. I hope so we will have a procedures with possibility unbound queries. and then you can do DO $$ SELECT * FROM pg_class; SELECT * FROM pg_proc; ... $$ LANGUAGE SQL; and you don't need to define output structure - what is much more user friendly. If I understand the proposal correctly, the idea is only to try to return something when DO is invoked with RETURNING. 1. Did I understand correctly, Hannu? 2. If I did, does this alleviate your concerns, Pavel? not too much. Two different concepts in one statement is not good idea. What two different concepts do you mean ? What using a cursors as temporary solution? BEGIN; DO $$ BEGIN OPEN mycursor AS SELECT * FROM blablabla; END $$ FETCH FROM mycursor; COMMIT; How would this work in an SQL query ? SELECT * FROM (FETCH FROM mycursor ) mc; ? Still I don't like this idea, because you should to support DO RETURNING in other statements - like INSERT INTO DO RETURNING ??? Yes, I really would like DO to be full set returning construct similar to SELECT or I/U/D RETURNING. The syntax should be either RETURNS (as in function definition) or RETURNING as for I/U/D. I actually like the RETURNING better as it really does immediate return and not just defines a function returning something. What about local temporary functions ?? CREATE TEMPORARY FUNCTION xx(a int) RETURNES TABLE (xxx) SELECT * FROM xxx; You mean that we define and use it in the same statement and after ';' ends the statement it disappears from scope ? This would probably still bloat pg_function table ? -- Hannu Krosing PostgreSQL Consultant Performance, Scalability and High Availability 2ndQuadrant Nordic OÜ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] DO ... RETURNING
* Pavel Stehule (pavel.steh...@gmail.com) wrote: not too much. Two different concepts in one statement is not good idea. What are the different concepts..? We already have set returning functions, why would set returning anonymous functions be any different? What using a cursors as temporary solution? That only works when you want to just return the results of a table. What if you want to construct the data set in the DO block? Okay, fine, you could use a temp table, but what if you don't have rights to create temporary tables? Still I don't like this idea, because you should to support DO RETURNING in other statements - like INSERT INTO DO RETURNING ??? That would certainly be neat, but it doesn't have to be there in the first incarnation, or really, ever, if it turns out to be painful to do. What about local temporary functions ?? You can already create temporary functions by simply creating them in pg_temp. I'd like to see us add explicit support for them though, but I don't see this as related to the DO-RETURNING question. Thanks, Stephen
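For reference, the pg_temp trick in its minimal form - the function is created in the session's temporary schema, is dropped automatically at session end, and must be schema-qualified when called:

CREATE FUNCTION pg_temp.add_one(int) RETURNS int
    AS $$ SELECT $1 + 1 $$ LANGUAGE sql;

SELECT pg_temp.add_one(41);   -- 42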
Re: [HACKERS] Freezing without write I/O
On 10 June 2013 19:58, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 01.06.2013 23:21, Robert Haas wrote: On Sat, Jun 1, 2013 at 2:48 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: We define a new page-level bit, something like PD_RECENTLY_FROZEN. When this bit is set, it means there are no unfrozen tuples on the page with XIDs that predate the current half-epoch. Whenever we know this to be true, we set the bit. If the page LSN crosses more than one half-epoch boundary at a time, we freeze the page and set the bit. If the page LSN crosses exactly one half-epoch boundary, then (1) if the bit is set, we clear it and (2) if the bit is not set, we freeze the page and set the bit. Yep, I think that would work. Want to write the patch, or should I? ;-) Have at it. Here's a first draft. A lot of stuff missing and broken, but make check passes :-). Well done, looks like good progress. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parallell Optimizer
Hi, I asked a while ago in this group about the possibility to implement a parallel planner in a multithread way, and the replies were that the proposed approach couldn't be implemented, because the postgres is not thread-safe. With the new feature Background Worker Processes, such implementation would be possible? Well, there are versions of genetic algorithms that use the concept of islands in which the populations evolve in parallel in the different islands and allows interaction between the islands and so on. I'm working in an algorithm based on multiagent systems. At the present moment, I mean in H2, the agents are threads, there are a few locks related to agents solutions, and a few locks for the best current solution in the environment where the agents are 'running'. The agents can exchange messages with a purpose. The environment is shared by the all agents and they use the environment to get informations from another agents (current solution for example), tries to update the best current solution and so on. According with the answers, I think the feature Background Worker Processes still doesn't meets my needs. So, I'll keep monitoring the progress of this functionality to implement the planner in future. Thanks, Fred
Re: [HACKERS] Revisit items marked 'NO' in sql_features.txt
On 6/9/13 7:39 PM, Robins Tharakan wrote:

F202	TRUNCATE TABLE: identity column restart option	NO

We don't support identity columns.

F263	Comma-separated predicates in simple CASE expression	NO

We don't support that.

T041	Basic LOB data type support	01	BLOB data type	NO

We don't support the BLOB type.

T051	Row types	NO

Needs checking.

T174	Identity columns	NO

We don't support that.

T176	Sequence generator support	NO
T177	Sequence generator support: simple restart option	NO

I think someone has determined that our sequences don't match the SQL standard's sequences. I don't know why, though. There are some syntax differences, in any case.

T522	Default values for IN parameters of SQL-invoked procedures	NO

We don't support procedures.

T571	Array-returning external SQL-invoked functions	NO

PostgreSQL arrays are not compatible with SQL.

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER DEFAULT PRIVILEGES FOR ROLE is broken
On 6/7/13 12:57 PM, Tom Lane wrote: Hm. Throwing a NOTICE saying btw, this won't be of any value until the user has CREATE rights in that schema might be a reasonable compromise. Seems like a can of worms to me. There are many other cases where you need to do something else in order to make the thing you just did useful. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Cost limited statements RFC
On Sat, Jun 8, 2013 at 10:00 PM, Greg Smith g...@2ndquadrant.com wrote: But I have neither any firsthand experience nor any empirical reason to presume that the write limit needs to be lower when the read-rate is high. No argument from me that that this is an uncommon issue. Before getting into an example, I should highlight this is only an efficiency issue to me. If I can't blend the two rates together, what I'll have to do is set both read and write individually to lower values than I can right now. That's not terrible; I don't actually have a problem with that form of UI refactoring. I just need separate read and write limits of *some* form. If everyone thinks it's cleaner to give two direct limit knobs and eliminate the concept of multipliers and coupling, that's a reasonable refactoring. It just isn't the easiest change from what's there now, and that's what I was trying to push through before. OK, understood. Let's see what others have to say. On the throughput graph, + values above the axis are write throughput, while - ones are reads. It's subtle, but during the periods where the writes are heavy, the read I/O the server can support to the same drive drops too. Compare 6:00 (low writes, high reads) to 12:00 (high writes, low reads). When writes rise, it can't quite support the same read throughput anymore. This isn't that surprising on a system where reads cost more than writes do. That is indeed quite a surprising system, but I'm having trouble seeing the effect you're referring to, because it looks to me like a lot of the read peaks correspond to write peaks. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Batch API for After Triggers
On Sat, Jun 8, 2013 at 5:00 PM, Simon Riggs si...@2ndquadrant.com wrote: While fiddling with FK tuning, Noah suggested batching trigger executions together to avoid execution overhead. It turns out there is no easy way to write triggers that can take advantage of the knowledge that they are being executed as a set of trigger executions. Some API is required to allow a trigger to understand that there may be other related trigger executions in the very near future, so it can attempt to amortise call overhead across many invocations (batching). The attached patch adds two fields to the TriggerDesc trigger functions are handed, allowing them to inspect (if they choose) the additional fields and thus potentially use some form of batching. I'm unclear how this could be used in practice. Are the events in a batch guaranteed to, say, all be related to the same relation? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Server side lo-funcs name
On Sun, Jun 9, 2013 at 8:16 PM, Tatsuo Ishii is...@postgresql.org wrote: Recently we got a complaint about server side large object function names described in the doc: http://www.postgresql.org/message-id/51b2413f.8010...@gmail.com In the doc: http://www.postgresql.org/docs/9.3/static/lo-funcs.html "There are server-side functions callable from SQL that correspond to each of the client-side functions described above; indeed, for the most part the client-side functions are simply interfaces to the equivalent server-side functions" From the description it is hard for users to find out server side functions loread and lowrite because they are looking for lo_read and lo_write. So I think his complaint is fair. Included patches attempt to fix the problem.

+ each of the client-side functions described above(please note

This line contains an obvious whitespace error, but more than that, I think the resulting paragraph doesn't read very well this way. I would suggest adding a new paragraph further down, maybe like this:

--- a/doc/src/sgml/lobj.sgml
+++ b/doc/src/sgml/lobj.sgml
@@ -572,6 +572,14 @@
     The client-side functions do not require superuser privilege.
    </para>
 
+   <para>
+    The functionality of <function>lo_read</function> and
+    <function>lo_write</function> is also available via server-side calls,
+    but the names of the server-side functions differ from the client side
+    interfaces in that they do not contain underscores. You must call
+    these functions as <function>loread</> and <function>lowrite</>.
+   </para>
+
 </sect1>
 
 <sect1 id="lo-examplesect">

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
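To make the naming point concrete, a minimal server-side session using the underscore-less names (the OID and descriptor values are illustrative; 393216 is INV_READ | INV_WRITE, i.e. 0x60000):

BEGIN;
SELECT lo_create(0);            -- returns a fresh large object OID, say 16403
SELECT lo_open(16403, 393216);  -- returns a descriptor, say 0
SELECT lowrite(0, 'hello');     -- note: lowrite, not lo_write
SELECT lo_lseek(0, 0, 0);       -- rewind to the start
SELECT loread(0, 5);            -- note: loread, not lo_read
SELECT lo_close(0);
COMMIT;

The descriptors are only valid for the duration of the transaction, which is why the session is wrapped in BEGIN/COMMIT.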
Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)
Here is submission v9 based on your v8 version.

- the tps is global, with a mutex to share the global stochastic process
- there is an adaptation for the fork emulation - I do not know whether this works with Win32 pthread stuff.
- reduced multiplier ln(100) - ln(1000)
- avg & max throttling lag are reported

There's a little more play outside of the target than ideal for this case. Maybe it's worth tightening the Poisson curve a bit around its center? A stochastic process moves around the target value, but is not right on it.

-- Fabien.

diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index 8c202bf..6e52dee 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -75,6 +75,7 @@ static int pthread_join(pthread_t th, void **thread_return);
 #else
 /* Use emulation with fork. Rename pthread identifiers to avoid conflicts */
+#define PTHREAD_FORK_EMULATION
 #include <sys/wait.h>
 #define pthread_t			pg_pthread_t
@@ -82,6 +83,11 @@
 #define pthread_create		pg_pthread_create
 #define pthread_join		pg_pthread_join
 
+#define pthread_mutex_t				int
+#define PTHREAD_MUTEX_INITIALIZER	0
+#define pthread_mutex_lock(m)
+#define pthread_mutex_unlock(m)
+
 typedef struct fork_pthread *pthread_t;
 typedef int pthread_attr_t;
@@ -137,6 +143,12 @@
 int			unlogged_tables = 0;
 double		sample_rate = 0.0;
 
 /*
+ * When clients are throttled to a given rate limit, this is the target delay
+ * to reach that rate in usec. 0 is the default and means no throttling.
+ */
+int64		throttle_delay = 0;
+
+/*
  * tablespace selection
  */
 char	   *tablespace = NULL;
@@ -205,6 +217,7 @@ typedef struct
 	int			nvariables;
 	instr_time	txn_begin;		/* used for measuring transaction latencies */
 	instr_time	stmt_begin;		/* used for measuring statement latencies */
+	bool		throttled;		/* whether current transaction was throttled */
 	int			use_file;		/* index in sql_files for this client */
 	bool		prepared[MAX_FILES];
 } CState;
@@ -222,6 +235,8 @@ typedef struct
 	instr_time *exec_elapsed;	/* time spent executing cmds (per Command) */
 	int		   *exec_count;		/* number of cmd executions (per Command) */
 	unsigned short random_state[3];		/* separate randomness for each thread */
+	int64		throttle_lag;	/* transaction lag behind throttling */
+	int64		throttle_lag_max;
 } TState;
 
 #define INVALID_THREAD		((pthread_t) 0)
@@ -230,9 +245,17 @@ typedef struct
 {
 	instr_time	conn_time;
 	int			xacts;
+	int64		throttle_lag;
+	int64		throttle_lag_max;
 } TResult;
 
 /*
+ * throttling management
+ */
+pthread_mutex_t throttle_trigger_mutex = PTHREAD_MUTEX_INITIALIZER;
+int64		throttle_trigger;	/* previous/next throttling (us) */
+
+/*
  * queries read from files
  */
 #define SQL_COMMAND		1
@@ -355,6 +378,8 @@ usage(void)
 		"  -n           do not run VACUUM before tests\n"
 		"  -N           do not update tables \"pgbench_tellers\" and \"pgbench_branches\"\n"
 		"  -r           report average latency per command\n"
+		"  -R SPEC, --rate SPEC\n"
+		"               target rate in transactions per second\n"
 		"  -s NUM       report this scale factor in output\n"
 		"  -S           perform SELECT-only transactions\n"
 		"  -t NUM       number of transactions each client runs (default: 10)\n"
@@ -902,13 +927,53 @@ doCustom(TState *thread, CState *st, instr_time *conn_time, FILE *logfile, AggVa
 top:
 	commands = sql_files[st->use_file];
 
+	/* handle throttling once per transaction by inserting a sleep.
+	 * this is simpler than doing it at the end.
+	 */
+	if (throttle_delay && !st->throttled)
+	{
+		/* compute delay to approximate a Poisson distribution
+		 * 1000000 => 13.8 .. 0 multiplier
+		 *  100000 => 11.5 .. 0
+		 *   10000 =>  9.2 .. 0
+		 *    1000 =>  6.9 .. 0
+		 * if transactions are too slow or a given wait shorter than
+		 * a transaction, the next transaction will start right away.
+		 */
+		int64 wait = (int64)
+			throttle_delay * -log(getrand(thread, 1, 1000)/1000.0);
+
+		pthread_mutex_lock(&throttle_trigger_mutex);
+		throttle_trigger += wait;
+		st->until = throttle_trigger;
+		pthread_mutex_unlock(&throttle_trigger_mutex);
+
+		st->sleeping = 1;
+		st->throttled = true;
+		if (debug)
+			fprintf(stderr, "client %d throttling " INT64_FORMAT " us\n",
+					st->id, wait);
+	}
+
 	if (st->sleeping)
 	{							/* are we sleeping? */
 		instr_time	now;
+		int64		now_us;
 
 		INSTR_TIME_SET_CURRENT(now);
-		if (st->until <= INSTR_TIME_GET_MICROSEC(now))
+		now_us = INSTR_TIME_GET_MICROSEC(now);
+		if (st->until <= now_us)
+		{
 			st->sleeping = 0;	/* Done sleeping, go ahead with next command */
+			if (throttle_delay && st->state == 0)
+			{
+				/* measure lag of throttled transaction */
+				int64 lag = now_us - st->until;
+				thread->throttle_lag += lag;
+				if (lag > thread->throttle_lag_max)
+					thread->throttle_lag_max = lag;
+			}
+		}
 		else
 			return true;		/* Still sleeping, nothing to do here */
 	}
 @
Re: [HACKERS] JSON and unicode surrogate pairs
On Mon, Jun 10, 2013 at 10:18 AM, Tom Lane t...@sss.pgh.pa.us wrote: Well, if we have to break backwards compatibility when we try to do binary storage, we're not going to be happy either. So I think we'd better have a plan in mind for what will happen then. Who says we're ever going to do any such thing? This was extensively debated when we added the original type, and I thought that it was agreed that we might ultimately need both a type that stored JSON as text and another that stored it as binary. And we might need an XML-binary type as well. But there are also cases where storing the data as text is *better*, and I don't see us ever getting rid of that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Freezing without write I/O
On Mon, Jun 10, 2013 at 4:48 PM, Simon Riggs si...@2ndquadrant.com wrote: Well done, looks like good progress. +1. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON and unicode surrogate pairs
On 06/11/2013 12:07 AM, Robert Haas wrote: On Mon, Jun 10, 2013 at 10:18 AM, Tom Lane t...@sss.pgh.pa.us wrote: Well, if we have to break backwards compatibility when we try to do binary storage, we're not going to be happy either. So I think we'd better have a plan in mind for what will happen then. Who says we're ever going to do any such thing? This was extensively debated when we added the original type, and I thought that it was agreed that we might ultimately need both a type that stored JSON as text and another that stored it as binary. This is where the compatibility comes in - we do want both to accept the same textual format. And we might need an XML-binary type as well. But there are also cases where storing the data as text is *better*, Then use text :) and I don't see us ever getting rid of that. While JSON is a serialisation format, most things people want to use it for are actually structured types, not their serialisation to text. The serialisation should happen automatically. -- Hannu Krosing PostgreSQL Consultant Performance, Scalability and High Availability 2ndQuadrant Nordic OÜ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON and unicode surrogate pairs
On 06/10/2013 06:07 PM, Robert Haas wrote: On Mon, Jun 10, 2013 at 10:18 AM, Tom Lane t...@sss.pgh.pa.us wrote: Well, if we have to break backwards compatibility when we try to do binary storage, we're not going to be happy either. So I think we'd better have a plan in mind for what will happen then. Who says we're ever going to do any such thing? This was extensively debated when we added the original type, and I thought that it was agreed that we might ultimately need both a type that stored JSON as text and another that stored it as binary. And we might need an XML-binary type as well. But there are also cases where storing the data as text is *better*, and I don't see us ever getting rid of that. It was discussed at Pgcon as a result of Oleg and Teodor's talk, and at the Unconference. But in any case it's moot here. None of what I'm suggesting has anything to do with the storage representation of JSON, only with how we process it in whatever form. And none of it will break backwards compatibility at all. So, please, let's concentrate on the problem that's actually at hand. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hard limit on WAL space used (because PANIC sucks)
On Sat, Jun 8, 2013 at 11:07 AM, Joshua D. Drake j...@commandprompt.com wrote: On 06/08/2013 07:36 AM, MauMau wrote: 1. If the machine or postgres crashes while archive_command is copying a WAL file, later archive recovery fails. This is because cp leaves a file of less than 16MB in archive area, and postgres refuses to start when it finds such a small archive WAL file. Should that be changed? If the file is 16MB but it turns to gibberish after 3MB, recovery proceeds up to the gibberish. Given that, why should it refuse to start if the file is only 3MB to start with? The solution, which IIRC Tomas san told me here, is to do like cp %p /archive/dir/%f.tmp && mv /archive/dir/%f.tmp /archive/dir/%f. This will overwrite /archive/dir/%f if it already exists, which is usually recommended against. Although I don't know that it necessarily should be. One common problem with archiving is for a network glitch to occur during the archive command, so the archive command fails and tries again later. But the later tries will always fail, because the target was created before/during the glitch. Perhaps a more full featured archive command would detect and rename an existing file, rather than either overwriting it or failing. If we have no compunction about overwriting the file, then I don't see a reason to use the cp + mv combination. If the simple cp fails to copy the entire file, it will be tried again until it succeeds. Well it seems to me that one of the problems here is we tell people to use copy. We should be telling people to use a command (or supply a command) that is smarter than that. Actually we describe what archive_command needs to fulfill, and tell them to use something that accomplishes that. The example with cp is explicitly given as an example, not a recommendation. 3. You cannot know the reason of archive_command failure (e.g. archive area full) if you don't use PostgreSQL's server logging. This is because archive_command failure is not logged in syslog/eventlog. Wait, what? Is this true (someone else?) It is kind of true. PostgreSQL does not automatically arrange for the stderr of the archive_command to be sent to syslog. But archive_command can do whatever it wants, including arranging for its own failure messages to go to syslog. Cheers, Jeff
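To make the crash window concrete, a C sketch of the copy-to-temp-then-rename pattern discussed above (illustrative only, not PostgreSQL or MauMau's code; it assumes rename() is atomic on the target filesystem and treats short writes as failures for brevity):

#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

static int
archive_wal(const char *src, const char *final_path)
{
	char	tmp_path[1024];
	char	buf[65536];
	ssize_t n;
	int		in, out;

	snprintf(tmp_path, sizeof(tmp_path), "%s.tmp", final_path);

	if ((in = open(src, O_RDONLY)) < 0)
		return 1;
	if ((out = open(tmp_path, O_WRONLY | O_CREAT | O_TRUNC, 0600)) < 0)
	{
		close(in);
		return 1;
	}
	while ((n = read(in, buf, sizeof(buf))) > 0)
		if (write(out, buf, n) != n)
			goto fail;
	if (n < 0 || fsync(out) < 0)	/* make the copy durable first */
		goto fail;
	close(in);
	close(out);
	/* rename() is atomic on POSIX filesystems, so the file only appears
	 * under its final name once complete; note that it also silently
	 * replaces an existing target, the overwrite behavior debated above */
	return rename(tmp_path, final_path) == 0 ? 0 : 1;

fail:
	close(in);
	close(out);
	unlink(tmp_path);
	return 1;
}

int
main(int argc, char **argv)
{
	/* argv[1] and argv[2] would come from %p and /archive/dir/%f */
	if (argc != 3)
		return 1;
	return archive_wal(argv[1], argv[2]);
}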
Re: [HACKERS] Hard limit on WAL space used (because PANIC sucks)
On Mon, Jun 10, 2013 at 11:59 AM, Josh Berkus j...@agliodbs.com wrote: Anyway, what I'm pointing out is that this is a business decision, and there is no way that we can make a decision for the users what to do when we run out of WAL space. And that the stop archiving option needs to be there for users, as well as the shut down option. *without* requiring users to learn the internals of the archiving system to implement it, or to know the implied effects of non-obvious PostgreSQL settings. I don't doubt this, that's why I do have a no-op fallback for emergencies. The discussion was about defaults. I still think that drop-wal-from-archiving-whenever is not a good one. You may have noticed I also wrote that a neater, common way to drop WAL when under pressure might be nice, to avoid having it ad-hoc and all over, so it's not as though I wanted to suggest a Postgres feature to this effect was an anti-feature. And, as I wrote before, it's much easier to teach an external system to drop WAL than it is to teach Postgres to attenuate, hence the repeated correspondence from my fellows and myself about the attenuation side of the equation. Hope that clears things up about where I stand on the matter. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hard limit on WAL space used (because PANIC sucks)
Daniel, Jeff, I don't doubt this, that's why I do have a no-op fallback for emergencies. The discussion was about defaults. I still think that drop-wal-from-archiving-whenever is not a good one. Yeah, we can argue defaults for a long time. What would be better is some way to actually determine what the user is trying to do, or wants to happen. That's why I'd be in favor of an explicit setting; if there's a setting which says: on_archive_failure=shutdown ... then it's a LOT clearer to the user what will happen if the archive runs out of space, even if we make no change to the defaults. And if that setting is changeable on reload, it even becomes a way for users to get out of tight spots. You may have noticed I also wrote that a neater, common way to drop WAL when under pressure might be nice, to avoid having it ad-hoc and all over, so it's not as though I wanted to suggest a Postgres feature to this effect was an anti-feature. Yep. Drake was saying it was an anti-feature, though, so I was arguing with him. Well it seems to me that one of the problems here is we tell people to use copy. We should be telling people to use a command (or supply a command) that is smarter than that. Actually we describe what archive_command needs to fulfill, and tell them to use something that accomplishes that. The example with cp is explicitly given as an example, not a recommendation. If we offer cp as an example, we *are* recommending it. If we don't recommend it, we shouldn't have it as an example. In fact, if we don't recommend cp, then PostgreSQL should ship with some example shell scripts for archive commands, just as we do for init scripts. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hard limit on WAL space used (because PANIC sucks)
On Mon, Jun 10, 2013 at 4:42 PM, Josh Berkus j...@agliodbs.com wrote: Daniel, Jeff, I don't doubt this, that's why I do have a no-op fallback for emergencies. The discussion was about defaults. I still think that drop-wal-from-archiving-whenever is not a good one. Yeah, we can argue defaults for a long time. What would be better is some way to actually determine what the user is trying to do, or wants to happen. That's why I'd be in favor of an explicit setting; if there's a setting which says: on_archive_failure=shutdown ... then it's a LOT clearer to the user what will happen if the archive runs out of space, even if we make no change to the defaults. And if that setting is changeable on reload, it even becomes a way for users to get out of tight spots. I like your suggestion, save one thing: it's not a 'failure' of archiving if it cannot keep up, provided one subscribes to the view that archiving is not elective. I nit pick at this because one might think this has something to do with a non-zero return code from the archiving program, which already has a pretty alarmist message in event of transient failures (I think someone brought this up on -hackers but a few months ago...can't remember if that resulted in a change). I don't have a better suggestion that is less jargonrific though, but I wanted to express my general appreciation as to the shape of the suggestion. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hard limit on WAL space used (because PANIC sucks)
On 06/10/2013 04:42 PM, Josh Berkus wrote: Actually we describe what archive_command needs to fulfill, and tell them to use something that accomplishes that. The example with cp is explicitly given as an example, not a recommendation. If we offer cp as an example, we *are* recommending it. If we don't recommend it, we shouldn't have it as an example. In fact, if we don't recommend cp, then PostgreSQL should ship with some example shell scripts for archive commands, just as we do for init scripts. Not a bad idea. One that supports rsync and another that supports robocopy. That should cover every platform we support. JD -- Command Prompt, Inc. - http://www.commandprompt.com/ 509-416-6579 PostgreSQL Support, Training, Professional Services and Development High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc For my dreams of your image that blossoms a rose in the deeps of my heart. - W.B. Yeats -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] how to find out whether a view is updatable
Sorry for my late reply. On Sun, Jun 9, 2013 at 6:45 PM, Dean Rasheed dean.a.rash...@gmail.com wrote: I called it updatable rather than writable or read-only because it might perhaps be extended in the future with separate options for insertable and deletable. It could also be extended to give column-level control over updatability, or something like use_remote_updatability could be added, but that all feels like 9.4 material. Yes this is definitely material for 9.4. You should add this patch to the 1st commit fest. I'll add myself as a reviewer. Thanks, -- Michael
Re: [HACKERS] ALTER TABLE ... ALTER CONSTRAINT
On Mon, Jun 10, 2013 at 11:06 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Andres Freund and...@2ndquadrant.com writes: I haven't looked at the patch in detail, but I am very, very much in favor of the feature in general… I have wished for this more than once, +1 +1. It will be useful. -- Michael
Re: [HACKERS] Parallell Optimizer
On Sat, Jun 8, 2013 at 5:04 AM, Simon Riggs si...@2ndquadrant.com wrote: On 7 June 2013 20:23, Tom Lane t...@sss.pgh.pa.us wrote: As for other databases, I suspect that ones that have parallel execution are probably doing it with a thread model not a process model. Separate processes are more common because it covers the general case where query execution is spread across multiple nodes. Threads don't work across nodes and parallel queries predate (working) threading models. Indeed. Parallelism based on processes would be more convenient for master-master type of applications. Even if no master-master feature is implemented directly in core, at least a parallelism infrastructure based on processes could be used for this purpose. -- Michael
Re: [HACKERS] Parallell Optimizer
On Sat, Jun 8, 2013 at 5:04 AM, Simon Riggs si...@2ndquadrant.com wrote: On 7 June 2013 20:23, Tom Lane t...@sss.pgh.pa.us wrote: As for other databases, I suspect that ones that have parallel execution are probably doing it with a thread model not a process model. Separate processes are more common because it covers the general case where query execution is spread across multiple nodes. Threads don't work across nodes and parallel queries predate (working) threading models. Indeed. Parallelism based on processes would be more convenient for master-master type of applications. Even if no master-master feature is implemented directly in core, at least a parallelism infrastructure based on processes could be used for this purpose. As long as true synchronous replication is not implemented in core, I am not sure there's a value for parallel execution spreading across multiple nodes because of the delay of data update propagation. -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parallell Optimizer
On Tue, Jun 11, 2013 at 9:45 AM, Tatsuo Ishii is...@postgresql.org wrote: On Sat, Jun 8, 2013 at 5:04 AM, Simon Riggs si...@2ndquadrant.com wrote: On 7 June 2013 20:23, Tom Lane t...@sss.pgh.pa.us wrote: As for other databases, I suspect that ones that have parallel execution are probably doing it with a thread model not a process model. Separate processes are more common because it covers the general case where query execution is spread across multiple nodes. Threads don't work across nodes and parallel queries predate (working) threading models. Indeed. Parallelism based on processes would be more convenient for master-master type of applications. Even if no master-master feature is implemented directly in core, at least a parallelism infrastructure based on processes could be used for this purpose. As long as true synchronous replication is not implemented in core, I am not sure there's a value for parallel execution spreading across multiple nodes because of the delay of data update propagation. True, but we cannot drop the possibility to have such features in the future either, so a process-based model is safer regarding the possible range of features and applications we could gain with. -- Michael
Re: [HACKERS] Hard limit on WAL space used (because PANIC sucks)
Not a bad idea. One that supports rsync and another that supports robocopy. That should cover every platform we support. Example script:
=
#!/usr/bin/env bash

# Simple script to copy WAL archives from one server to another
# to be called as archive_command (call this as "wal_archive %p %f")

# Settings. Please change the below to match your configuration.

# holding directory for the archived log files on the replica
# this is NOT pg_xlog:
WALDIR=/var/lib/pgsql/archive_logs

# touch file to shut off archiving in case of filling up the disk:
NOARCHIVE=/var/lib/pgsql/NOARCHIVE

# replica IP, IPv6 or DNS address:
REPLICA=192.168.1.3

# put any special SSH options here,
# and the location of RSYNC:
export RSYNC_RSH=ssh
RSYNC=/usr/bin/rsync

## DO NOT CHANGE THINGS BELOW THIS LINE ##

SOURCE=$1   # %p
FILE=$2     # %f
DEST=${WALDIR}/${FILE}

# See whether we want all archiving off
test -f ${NOARCHIVE} && exit 0

# Copy the file to the spool area on the replica, error out if
# the transfer fails
${RSYNC} --quiet --archive --rsync-path=${RSYNC} ${SOURCE} \
	${REPLICA}:${DEST}
if [ $? -ne 0 ]; then
	exit 1
fi

-- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Don't downcase non-ascii identifier chars in multi-byte encoding
On Sun, Jun 09, 2013 at 11:39:18AM -0400, Tom Lane wrote: The key point for me is that if tolower() actually does anything in the previous state of the code, it's more than likely going to produce invalidly encoded data. The consequences of that can't be good. You can argue that there might be people out there for whom the transformation accidentally produced a validly-encoded string, but how likely is that really? It seems much more likely that the only reason we've not had more complaints is that on most popular platforms, the code accidentally fails to fire on any UTF8 characters (or any common ones, anyway). On those platforms, there will be no change of behavior. Your hypothesis is that almost all libc tolower() implementations will in every case either (a) turn a multi-byte character to byte soup not valid in the server encoding or (b) leave it unchanged? Quite possible. If that hypothesis holds, I agree that the committed change does not break compatibility. That carries a certain appeal. I still anticipate regretting that we have approved and made reliable this often-sufficed-by-accident behavior, particularly when the SQL standard calls for something else. But I think I now understand your reasoning. The resistance to moving this code to use towlower() for non-ASCII mainly comes from worries about speed, I think; although there was also something about downcasing conversions that change the string's byte length being problematic for some callers. Considering that using ASCII-only or quoted identifiers sidesteps the speed penalty altogether, that seems a poor cause for demur. Thanks, nm -- Noah Misch EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
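A small C illustration of the hazard described above (not PostgreSQL code; the locale name is an assumption and may not exist on a given system): byte-at-a-time tolower() applied to the UTF-8 bytes of 'Ä' can yield an invalid byte sequence when the single-byte table fires at all, whereas towlower() operates on whole codepoints:

#include <ctype.h>
#include <locale.h>
#include <stdio.h>
#include <wctype.h>

int
main(void)
{
	/* UTF-8 for 'Ä' (U+00C4): 0xC3 0x84 */
	unsigned char ident[] = { 0xC3, 0x84, 0 };

	/* single-byte locale; whether this succeeds, and what the tables
	 * then do to bytes above 127, is entirely platform-dependent */
	setlocale(LC_CTYPE, "de_DE.ISO-8859-1");

	for (unsigned char *p = ident; *p; p++)
	{
		unsigned char lowered = (unsigned char) tolower(*p);

		/* if the table maps 0xC3 ('Ã' in Latin-1) to 0xE3, the pair
		 * 0xE3 0x84 is no longer a valid UTF-8 sequence: byte soup */
		printf("0x%02X -> 0x%02X\n", *p, lowered);
	}

	/* the encoding-aware alternative: one towlower() per character;
	 * on platforms with Unicode wchar_t this yields U+00E4 */
	printf("towlower(U+00C4) = U+%04X\n", (unsigned) towlower(0x00C4));
	return 0;
}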
Re: [HACKERS] JSON and unicode surrogate pairs
On Mon, Jun 10, 2013 at 11:20:13AM -0400, Andrew Dunstan wrote: On 06/10/2013 10:18 AM, Tom Lane wrote: Andrew Dunstan and...@dunslane.net writes: After thinking about this some more I have come to the conclusion that we should only do any de-escaping of \u sequences, whether or not they are for BMP characters, when the server encoding is utf8. For any other encoding, which is already a violation of the JSON standard anyway, and should be avoided if you're dealing with JSON, we should just pass them through even in text output. This will be a simple and very localized fix. Hmm. I'm not sure that users will like this definition --- it will seem pretty arbitrary to them that conversion of \u sequences happens in some databases and not others. Yep. Suppose you have a LATIN1 database. Changing it to a UTF8 database where everyone uses client_encoding = LATIN1 should not change the semantics of successful SQL statements. Some statements that fail with one database encoding will succeed in the other, but a user should not witness a changed non-error result. (Except functions like decode() that explicitly expose byte representations.) Having SELECT '["\u00e4"]'::json -> 0 emit 'ä' in the UTF8 database and '\u00e4' in the LATIN1 database would move PostgreSQL in the wrong direction relative to that ideal. Then what should we do when there is no matching codepoint in the database encoding? First we'll have to delay the evaluation so it's not done over-eagerly, and then we'll have to try the conversion and throw an error if it doesn't work. The second part is what's happening now, but the delayed evaluation is not. +1 for doing it that way. Thanks, nm -- Noah Misch EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
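For the mechanics under discussion, a hedged C sketch (modeled loosely on the backend's unicode_to_utf8(); not the actual json.c code) of why UTF-8 is the easy case: a BMP codepoint always has a UTF-8 byte sequence, whereas another server encoding may have none at all, which is the failure Noah wants raised lazily at evaluation time:

#include <stdio.h>

/* encode one Unicode codepoint (BMP only, below 0x10000) as UTF-8 */
static int
unicode_to_utf8(unsigned int c, unsigned char *out)
{
	if (c < 0x80)
	{
		out[0] = (unsigned char) c;
		return 1;
	}
	if (c < 0x800)
	{
		out[0] = 0xC0 | (c >> 6);
		out[1] = 0x80 | (c & 0x3F);
		return 2;
	}
	out[0] = 0xE0 | (c >> 12);
	out[1] = 0x80 | ((c >> 6) & 0x3F);
	out[2] = 0x80 | (c & 0x3F);
	return 3;
}

int
main(void)
{
	unsigned char buf[4];
	int		len = unicode_to_utf8(0x00E4, buf);		/* \u00e4 from the example */

	for (int i = 0; i < len; i++)
		printf("0x%02X ", buf[i]);					/* prints 0xC3 0xA4, i.e. 'ä' */
	printf("\n");
	return 0;
}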
Re: [HACKERS] DO ... RETURNING
2013/6/10 Stephen Frost sfr...@snowman.net: * Pavel Stehule (pavel.steh...@gmail.com) wrote: not too much. Two different concepts in one statement is not good idea. What are the different concepts..? We already have set returning functions, why would set returning anonymous functions be any different? 1. DO as function 2. DO as batch What using a cursors as temporary solution? That only works when you want to just return the results of a table. What if you want to construct the data set in the DO block? Okay, fine, you could use a temp table, but what if you don't have rights to create temporary tables? Still I don't like this idea, because you should to support DO RETURNING in other statements - like INSERT INTO DO RETURNING ??? That would certainly be neat, but it doesn't have to be there in the first incarnation, or really, ever, if it turns out to be painful to do. this is reason, why I dislike it - It is introduce significant strange SQL extension What about local temporary functions ?? You can already create temporary functions by simply creating them in pg_temp. I'd like to see us add explicit support for them though, but I don't see this as related to the DO-RETURNING question. I don't think we have to introduce a new NON ANSI concept, when is possible using current feature. so for me -1 Pavel Thanks, Stephen -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] gitmaster.postgresql.org down?
$ git pull ssh: connect to host gitmaster.postgresql.org port 22: Connection timed out fatal: The remote end hung up unexpectedly -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimising Foreign Key checks
On Mon, Jun 10, 2013 at 09:05:40AM +0100, Simon Riggs wrote: Your earlier comments argue that it is OK to make an early check. The above seems to argue the opposite, not sure. I'll attempt to summarize. If we execute a traditional error-throwing FK check any earlier than we execute it today, applications with certain triggers will notice a behavior change (probably not OK). However, we *can* safely execute an optimistic FK check as early as just after ExecInsertIndexTuples(). If the optimistic check is successful, later activity cannot invalidate its success as concerns that particular inserted tuple. IIUYC we can do this: * search hash table for a value, if found, skip check and continue * if entry in hash not found make an immediate FK check * if the check passes, store value in hash table, if it fits * if check does not pass or value doesn't fit, queue up an after trigger queue entry Why shall doesn't-fit prompt an after-statement recheck? You do need a mechanism to invalidate table entries or the entire table. As a first cut at that, perhaps have parent table RI triggers empty any local hash tables of the same FK relationship. Note that invalidating table entries does not invalidate skips already done on the strength of those entries. except we want to batch things a little, so same algo just with a little batching. * search hash table for a value, if found, skip check and continue * if entry in hash not found add to next batch of checks and continue * when batch full make immediate FK checks for whole batch in one SQL stmt * if a check passes, store value in hash table, if it fits * if check does not pass or value doesn't fit, queue up an after trigger queue entry * when executing queue, use batches to reduce number of SQL stmts I think this all can be made to work, too. -- Noah Misch EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
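To make the control flow concrete, a toy C sketch of the skip-cache algorithm summarized above (all names hypothetical: a fixed-size hash stands in for the real per-FK table, immediate_fk_check() stands in for RI_FKey_check_ins(), and batching is omitted for brevity):

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define CACHE_SLOTS 1024

typedef struct
{
	bool	valid[CACHE_SLOTS];
	int64_t key[CACHE_SLOTS];
} FkCheckCache;

/* stand-in for RI_FKey_check_ins(): KEY SHARE lock + B-tree lookup */
static bool
immediate_fk_check(int64_t key)
{
	return key != 0;			/* pretend only key 0 lacks a parent row */
}

/* returns true if the FK is known satisfied; false means the caller
 * must queue a traditional after-statement trigger entry instead */
static bool
fk_check_cached(FkCheckCache *cache, int64_t key)
{
	unsigned	slot = (unsigned) key % CACHE_SLOTS;

	if (cache->valid[slot] && cache->key[slot] == key)
		return true;			/* found in hash: skip check and continue */

	if (!immediate_fk_check(key))
		return false;			/* early check failed: fall back to queue */

	cache->valid[slot] = true;	/* check passed and the value fits: cache it */
	cache->key[slot] = key;
	return true;
}

/* parent-side RI trigger action: empty local hash tables of this FK */
static void
fk_cache_invalidate(FkCheckCache *cache)
{
	memset(cache->valid, 0, sizeof(cache->valid));
}

int
main(void)
{
	static FkCheckCache cache;	/* zero-initialized: all slots invalid */

	printf("%d\n", fk_check_cached(&cache, 42));	/* miss: checks, caches */
	printf("%d\n", fk_check_cached(&cache, 42));	/* hit: check skipped */
	fk_cache_invalidate(&cache);					/* PK-side update ran */
	printf("%d\n", fk_check_cached(&cache, 42));	/* miss again */
	return 0;
}

As Noah notes, invalidating entries does not retract skips already taken on their strength; it only forces later rows back through the immediate check.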
Re: [HACKERS] DO ... RETURNING
2013/6/10 Hannu Krosing ha...@2ndquadrant.com: On 06/10/2013 09:45 PM, Pavel Stehule wrote: 2013/6/10 David Fetter da...@fetter.org: On Mon, Jun 10, 2013 at 09:23:19PM +0200, Pavel Stehule wrote: 2013/6/10 Hannu Krosing ha...@2ndquadrant.com: Hallo Everybody As far as I can see, currently you can not return anything out of a DO (anonymous code) block. Something like DO LANGUAGE plpythonu RETURNS TABLE (name text, uid int, gid int) $$ with open('/etc/passwd') as f: fields = f.readline().split(':') while fields: name, uid, gid = fields[0], int(fields[2]),int(fields[3]) yield name, uid, gid fields = f.readline().split(':') $$; As I did not pay attention when DO was introduced, I thought it is faster to ask here than read all possibly relevant mails in archives So: has there been a discussion on extending the DO construct with ability to return data out of it, similar to what named functions do. If there was then what were the arguments against doing this ? Or was this just that it was not thought important at that time ? I don't like this idea. I know so DO is +/- function, but it is too restrict. I hope so we will have a procedures with possibility unbound queries. and then you can do DO $$ SELECT * FROM pg_class; SELECT * FROM pg_proc; ... $$ LANGUAGE SQL; and you don't need to define output structure - what is much more user friendly. If I understand the proposal correctly, the idea is only to try to return something when DO is invoked with RETURNING. 1. Did I understand correctly, Hannu? 2. If I did, does this alleviate your concerns, Pavel? not too much. Two different concepts in one statement is not good idea. What two different concepts do you mean ? What using a cursors as temporary solution? BEGIN; DO $$ BEGIN OPEN mycursor AS SELECT * FROM blablabla; END $$ FETCH FROM mycursor; COMMIT; How would this work in an SQL query ? SELECT * FROM (FETCH FROM mycursor ) mc; we doesn't support it, but oracle, db2 allows SELECT * FROM TABLE(cursorname) ? Still I don't like this idea, because you should to support DO RETURNING in other statements - like INSERT INTO DO RETURNING ??? Yes, I really would like DO to be full set returning construct similar to SELECT or I/U/D RETURNING. The syntax should be either RETURNS (as in function definition) or RETURNING as for I/U/D. I actually like the RETURNING better as it really does immediate return and not just defines a function returning something. What about local temporary functions ?? CREATE TEMPORARY FUNCTION xx(a int) RETURNS TABLE (xxx) SELECT * FROM xxx; You mean that we define and use it in the same statement and after ';' ends the statement it disappears from scope ? This would probably still bloat pg_function table ? it is same hard issue like TEMPORARY TABLES Hannu, what is motivation for your proposal??? I have a two objections: * it is not too user friendly - you have to specify returns list every time, what is not comfort for very short life objects * it is on way to introduce lot of NOT ANSI SQL extensions, that are not in other databases, * it doesn't carry really new functionality Regards Pavel -- Hannu Krosing PostgreSQL Consultant Performance, Scalability and High Availability 2ndQuadrant Nordic OÜ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] gitmaster.postgresql.org down?
* Tatsuo Ishii (is...@postgresql.org) wrote: $ git pull ssh: connect to host gitmaster.postgresql.org port 22: Connection timed out fatal: The remote end hung up unexpectedly dekendi (the server hosting gitmaster) is currently offline. We're aware and are working on it. Thanks, Stephen
Re: [HACKERS] gitmaster.postgresql.org down?
On Tue, Jun 11, 2013 at 1:29 PM, Stephen Frost sfr...@snowman.net wrote: * Tatsuo Ishii (is...@postgresql.org) wrote: $ git pull ssh: connect to host gitmaster.postgresql.org port 22: Connection timed out fatal: The remote end hung up unexpectedly dekendi (the server hosting gitmaster) is currently offline. We're aware and are working on it. You can still fetch the latest code from github if you cannot wait: https://github.com/postgres/postgres -- Michael
Re: [HACKERS] DO ... RETURNING
Pavel, * Pavel Stehule (pavel.steh...@gmail.com) wrote: 2013/6/10 Stephen Frost sfr...@snowman.net: What are the different concepts..? We already have set returning functions, why would set returning anonymous functions be any different? 1. DO as function 2. DO as batch We already have set returning functions. Still I don't like this idea, because you should to support DO RETURNING in other statements - like INSERT INTO DO RETURNING ??? That would certainly be neat, but it doesn't have to be there in the first incarnation, or really, ever, if it turns out to be painful to do. this is reason, why I dislike it - It is introduce significant strange SQL extension DO already exists and isn't in the SQL standard. This isn't a significant diversion from that, imv. You can already create temporary functions by simply creating them in pg_temp. I'd like to see us add explicit support for them though, but I don't see this as related to the DO-RETURNING question. I don't think we have to introduce a new NON ANSI concept, when is possible using current feature. DO already exists and would cover certain cases that temporary functions don't today. Thanks, Stephen
Re: [HACKERS] gitmaster.postgresql.org down?
dekendi (the server hosting gitmaster) is currently offline. We're aware and are working on it. You can still fetch the latest code from github if you cannot wait: https://github.com/postgres/postgres Thanks but I need to commit/push something. -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] gitmaster.postgresql.org down?
* Tatsuo Ishii (is...@postgresql.org) wrote: dekendi (the server hosting gitmaster) is currently offline. We're aware and are working on it. You can still fetch the latest code from github if you cannot wait: https://github.com/postgres/postgres Thanks but I need to commit/push something. I was able to get a hold of someone over at rackspace and bring the box back up on an older kernel. Looks like this ancient DL585 doesn't particularly like the new 3.2 kernels. Everything should be back up in a few more minutes. We're checking to see if there are any firmware updates available and if anyone has experience with the kernel panic we got when trying 3.2.0-4 and we will likely need to reboot the system again in the future, but it's up and running for now. Apologies for the downtime. Thanks, Stephen
Re: [HACKERS] erroneous restore into pg_catalog schema
* Dimitri Fontaine (dimi...@2ndquadrant.fr) wrote: Stephen Frost sfr...@snowman.net writes: While having one place to put everything sounds great, it doesn't do a whole lot of good if you consider conflicts - either because you want multiple versions available or because there just happens to be some overlap in function names (or similar). There are also extensions which have more than just functions in them but also tables, which increases the chances of a conflict happening. Having the extension authors end up having to prefix everything with the name of the extension to avoid conflicts would certainly be worse than actually using schemas. Now you're not talking about *default* settings anymore, or are you? What happens with the default settings when you try to install two extensions that have overlapping function signatures..? I can't imagine it 'just works'.. And then what? Is there a way that an admin can set up search paths for individual users which provide the 'right' function and work even when the user decides to change their search_path? Thanks, Stephen
Re: [HACKERS] gitmaster.postgresql.org down?
I was able to get a hold of someone over at rackspace and bring the box back up on an older kernel. Looks like this ancient DL585 doesn't particularly like the new 3.2 kernels. Everything should be back up in a few more minutes. We're checking to see if there are any firmware updates available and if anyone has experience with the kernel panic we got when trying 3.2.0-4 and we will likely need to reboot the system again in the future, but it's up and running for now. Apologies for the downtime. Thanks, Stephen Thank you for taking care of this! -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Server side lo-funcs name
Recently we got a complaint about server side large object function names described in the doc: http://www.postgresql.org/message-id/51b2413f.8010...@gmail.com In the doc: http://www.postgresql.org/docs/9.3/static/lo-funcs.html There are server-side functions callable from SQL that correspond to each of the client-side functions described above; indeed, for the most part the client-side functions are simply interfaces to the equivalent server-side functions. From the description it is hard for users to find out the server side functions loread and lowrite because they are looking for lo_read and lo_write. So I think his complaint is fair. Included patches attempt to fix the problem. I have committed this. -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
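For readers hunting for those names: a minimal libpq sketch (not from the patch; the connection string is an assumption) showing that lowrite() and friends are invoked through SQL, unlike the client-side lo_read()/lo_write() wrappers the complaint confused them with:

#include <stdio.h>
#include <libpq-fe.h>

int
main(void)
{
	/* connection string is an assumption; adjust as needed */
	PGconn	   *conn = PQconnectdb("dbname=postgres");
	PGresult   *res;

	if (PQstatus(conn) != CONNECTION_OK)
		return 1;

	/* large object descriptors only live inside a transaction */
	PQclear(PQexec(conn, "BEGIN"));

	/* create an LO, open it for writing (131072 = INV_WRITE),
	 * then write through the server-side lowrite() */
	res = PQexec(conn,
		"SELECT lowrite(lo_open(lo_create(0), 131072), 'hello')");
	if (PQresultStatus(res) == PGRES_TUPLES_OK)
		printf("lowrite wrote %s bytes\n", PQgetvalue(res, 0, 0));
	PQclear(res);

	PQclear(PQexec(conn, "COMMIT"));
	PQfinish(conn);
	return 0;
}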