Re: pgbench cpu overhead (was Re: [HACKERS] lazy vxid locks, v1)
On 07/24/2011 03:50 AM, Jeff Janes wrote: On Mon, Jun 13, 2011 at 7:03 AM, Stefan Kaltenbrunner ste...@kaltenbrunner.cc wrote: On 06/13/2011 01:55 PM, Stefan Kaltenbrunner wrote: [...]

all those tests are done with pgbench running on the same box - which has a noticeable impact on the results, because pgbench uses ~1 core per 8 cores of the tested backend in CPU resources - though I don't think it causes any changes in the results that would show the performance behaviour in a different light.

Actual testing against sysbench with the very same workload shows the following performance behaviour:

with 40 threads (aka the peak performance point):
pgbench:  223308 tps
sysbench: 311584 tps

with 160 threads (backend contention dominated):
pgbench:  57075 tps
sysbench: 43437 tps

So it seems that sysbench has significantly less overhead than pgbench, and the lower throughput at the higher concurrency seems to be caused by sysbench being able to stress the backend even more than pgbench can.

For those curious - the profile for pgbench looks like:

samples  %        symbol name
29378    41.9087  doCustom
17502    24.9672  threadRun
7629     10.8830  pg_strcasecmp
5871      8.3752  compareVariables
2568      3.6633  getVariable
2167      3.0913  putVariable
2065      2.9458  replaceVariable
1971      2.8117  parseVariable
534       0.7618  xstrdup
278       0.3966  xrealloc
137       0.1954  xmalloc

Hi Stefan,

How was this profile generated? I get a similar profile using --enable-profiling and gprof, but I find it not believable. The complete absence of any calls to libpq is not credible. I don't know about your profiler, but with gprof they should be listed in the call graph even if they take a negligible amount of time. So I think pgbench is linking to libpq libraries that do not themselves support profiling (I have no idea how that could happen, though). If the call graphs are not getting recorded correctly, surely the timing can't be reliable either.
hmm - the profile was generated using oprofile, but now that you mention this aspect, I suppose that this was a profile run without opcontrol --separate=lib... I'm not currently in a position to retest that - but maybe you could do a run?

(I also tried profiling pgbench with perf, but in that case I get nothing other than kernel and libc calls showing up. I don't know what that means.)

To support this, I've dummied up doCustom so that it does all the work of deciding what needs to happen, executing the metacommands, and interpolating the variables into the SQL string, but then simply refrains from calling the PQ functions to send and receive the query and response. (I had to make a few changes around the select loop in threadRun to support this.) The result is that the dummy pgbench is charged with only 57% more CPU time than the stock one, but it gets over 10 times as many transactions done. I think this supports the notion that the CPU bottleneck is not in pgbench.c, but somewhere in libpq or the kernel.

interesting - iirc we actually had some reports about current libpq behaviour causing scaling issues on some OSes - see http://archives.postgresql.org/pgsql-hackers/2009-06/msg00748.php and some related threads. Iirc the final patch for that was never applied though and the original author lost interest. I think that I was able to measure some noticeable performance gains back in the day, but I don't think I still have the numbers somewhere.

Stefan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq SSL with non-blocking sockets
On 07/16/2011 12:46 AM, Tom Lane wrote: I think the direction to move in ought to be to use the existing buffer as-is, and have pqCheckOutBufferSpace refuse to enlarge the buffer while we are in "write frozen" state. It should be OK to append data to the buffer, though, so long as we remember how much we're allowed to pass to SSL_write when we next try to write.

An alternative to freezing the outBuffer would be to set the SSL_MODE_ACCEPT_MOVING_WRITE_BUFFER mode during SSL initialisation. That would enable the buffer address to be changed in between the SSL_write calls, so long as the content remains the same.

Attached is a patch that uses the single-buffer approach described by Tom, but with a moving SSL write buffer enabled. Modifying pqCheckOutBufferSpace is also an option, but it'd break some (arguably already broken) client applications that don't have proper retry handling. Notably, some versions of psycopg2 have problems with handling zero return values from PQputCopyData. So ISTM that from a backporting perspective the moving write buffer is a bit safer.

I'll see if I can come up with a test case for the SSL_read retry before attempting to fix that.

regards, Martin

diff --git a/src/interfaces/libpq/fe-misc.c b/src/interfaces/libpq/fe-misc.c
index 17dde4a..977e37c 100644
--- a/src/interfaces/libpq/fe-misc.c
+++ b/src/interfaces/libpq/fe-misc.c
@@ -793,20 +793,26 @@ pqSendSome(PGconn *conn, int len)
 	while (len > 0)
 	{
 		int			sent;
+		int			write_size = len;
 		char		sebuf[256];
 
-#ifndef WIN32
-		sent = pqsecure_write(conn, ptr, len);
-#else
-
+#ifdef WIN32
 		/*
 		 * Windows can fail on large sends, per KB article Q201213. The
 		 * failure-point appears to be different in different versions of
 		 * Windows, but 64k should always be safe.
 		 */
-		sent = pqsecure_write(conn, ptr, Min(len, 65536));
+		write_size = Min(len, 65536);
+#endif
+
+#ifdef USE_SSL
+		/* We have leftovers from previous SSL write, send these first. */
+		if (conn->ssl_retry_bytes)
+			write_size = conn->ssl_retry_bytes;
 #endif
 
+		sent = pqsecure_write(conn, ptr, write_size);
+
 		if (sent < 0)
 		{
 			/*
@@ -857,11 +863,30 @@ pqSendSome(PGconn *conn, int len)
 				return -1;
 			}
 		}
+#ifdef USE_SSL
+		else if (sent == 0)
+		{
+			/*
+			 * We are in non-blocking mode, set up for retrying a SSL write
+			 * operation. OpenSSL requires the same buffer to be passed to the
+			 * write retry, so we must remember the write size. However, we
+			 * don't need to keep the original buffer address as the
+			 * SSL_MODE_ACCEPT_MOVING_WRITE_BUFFER mode ought to be set.
+			 */
+			if (!conn->ssl_retry_bytes)
+				conn->ssl_retry_bytes = write_size;
+		}
+#endif
 		else
 		{
 			ptr += sent;
 			len -= sent;
 			remaining -= sent;
+#ifdef USE_SSL
+			/* Forget the retry buffer size if all the data has been sent */
+			if (conn->ssl_retry_bytes)
+				conn->ssl_retry_bytes = 0;
+#endif
 		}
 
 		if (len > 0)
diff --git a/src/interfaces/libpq/fe-secure.c b/src/interfaces/libpq/fe-secure.c
index cd1292c..540561d 100644
--- a/src/interfaces/libpq/fe-secure.c
+++ b/src/interfaces/libpq/fe-secure.c
@@ -757,6 +757,9 @@ init_ssl_system(PGconn *conn)
 #endif
 		return -1;
 	}
+
+	/* Enable moving write buffer to simplify write retry handling. */
+	SSL_CTX_set_mode(SSL_context, SSL_MODE_ACCEPT_MOVING_WRITE_BUFFER);
 }
 
 #ifdef ENABLE_THREAD_SAFETY
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index d56ef5d..ae212a6 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -407,6 +407,8 @@ struct pg_conn
 	X509	   *peer;			/* X509 cert of server */
 	char		peer_dn[256 + 1];	/* peer distinguished name */
 	char		peer_cn[SM_USER + 1];	/* peer common name */
+
+	int			ssl_retry_bytes;	/* buffer size for SSL write retry */
 #ifdef USE_SSL_ENGINE
 	ENGINE	   *engine;			/* SSL engine, if any */
 #else

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: pgbench cpu overhead (was Re: [HACKERS] lazy vxid locks, v1)
Jeff Janes jeff.ja...@gmail.com writes: How was this profile generated? I get a similar profile using --enable-profiling and gprof, but I find it not believable. The complete absence of any calls to libpq is not credible. I don't know about your profiler, but with gprof they should be listed in the call graph even if they take a negligible amount of time. So I think pgbench is linking to libpq libraries that do not themselves support profiling (I have no idea how that could happen though). If the calls graphs are not getting recorded correctly, surely the timing can't be reliable either. Last I checked, gprof simply does not work for shared libraries on Linux --- is that what you're testing on? If so, try oprofile or some other Linux-specific solution. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: pgbench cpu overhead (was Re: [HACKERS] lazy vxid locks, v1)
Stefan Kaltenbrunner ste...@kaltenbrunner.cc writes: interesting - iirc we actually had some reports about current libpq behaviour causing scaling issues on some OSes - see http://archives.postgresql.org/pgsql-hackers/2009-06/msg00748.php and some related threads. Iirc the final patch for that was never applied though and the original author lost interest, I think that I was able to measure some noticeable performance gains back in the days but I don't think I still have the numbers somewhere.

Huh? That patch did get applied in some form or other -- at least, libpq does contain references to both SO_NOSIGPIPE and MSG_NOSIGNAL these days. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq SSL with non-blocking sockets
Martin Pihlak martin.pih...@gmail.com writes: On 07/16/2011 12:46 AM, Tom Lane wrote: I think the direction to move in ought to be to use the existing buffer as-is, and have pqCheckOutBufferSpace refuse to enlarge the buffer while we are in write frozen state. It should be OK to append data to the buffer, though, so long as we remember how much we're allowed to pass to SSL_write when we next try to write. Alternative to freezing the outBuffer would be to set SSL_MODE_ACCEPT_MOVING_WRITE_BUFFER mode during SSL initialisation. Hmmm ... that is a really interesting flag. The documentation is inadequate, but I googled a bit and found this thread about it: http://www.mail-archive.com/openssl-users@openssl.org/msg45440.html in which it's stated that (1) the prohibition on moving the buffer is actually just a sort of sanity check, and can be turned off using this flag; (2) it is okay to increase, but not reduce, the length parameter to SSL_write in the next call after a WANT_READ/WANT_WRITE return. So this does look like it'd fix the issue for SSL_write, without needing to introduce a concept of a write frozen buffer state. I am wondering though how far back support for this flag exists in OpenSSL, and whether the situation is the same for SSL_read or not. I don't see any SSL_MODE_ACCEPT_MOVING_READ_BUFFER macro ... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq SSL with non-blocking sockets
I wrote: So this does look like it'd fix the issue for SSL_write, without needing to introduce a concept of a write frozen buffer state. I am wondering though how far back support for this flag exists in OpenSSL, A bit of archaeology reveals that the flag was introduced in OpenSSL 0.9.4, released in 1999. So it's probably Old Enough. (BTW, the 0.9.4 changelog credits this change to one Bodo Moeller ... so the comments from him in the other thread I linked to can be considered authoritative ...) Still wondering about the SSL_read end of it, though. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq SSL with non-blocking sockets
I wrote: Still wondering about the SSL_read end of it, though. And on that front, some digging around in the OpenSSL source code indicates that they do all their work in internal buffers, and transfer data into SSL_read's result buffer only when ready to return it. So the claim in the documentation that SSL_read has a restriction comparable to SSL_write is a lie: there is no case where they'll copy some data into the buffer and then return -1. So the SSL_MODE_ACCEPT_MOVING_WRITE_BUFFER solution looks like a good fix. I'll see about applying it. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] XPATH vs. server_encoding != UTF-8
On Jul24, 2011, at 01:25 , Florian Pflug wrote: On Jul23, 2011, at 22:49 , Peter Eisentraut wrote: On lör, 2011-07-23 at 17:49 +0200, Florian Pflug wrote: The current thread about JSON and the ensuing discussion about the XML types' behaviour in non-UTF8 databases made me try out how well XPATH() copes with that situation. The code, at least, looks suspicious - XPATH neither verifies that the server encoding is UTF-8, nor does it pass the server encoding on to libxml's xpath functions.

This issue is on the Todo list, and there are some archive links there.

Thanks for the pointer, but I think the discussion there doesn't really apply here.

Upon further reflection, I came to realize that it in fact does apply. All the non-XPath-related XML *parsing* seems to go through xml_parse(), but we also use libxml to write XML, making XMLELEMENT() and friends equally susceptible to all kinds of encoding trouble.

For the fun of it, try the following in an ISO-8859-1 database (with client_encoding correctly set up, so the umlaut-a reaches the backend unharmed):

select xmlelement(name r, xmlattributes('ä' as a));

you get

    xmlelement
-------------------
 <r a="&#x4000;"/>

Well, actually, you only get that about 9 times out of 10. Sometimes you instead get

        xmlelement
---------------------------
 <r a="&#x4001;\x01\x01"/>

It seems that libxml reads past the terminating zero byte if it's preceded by an invalid UTF-8 byte sequence (like 0xe4 0x00 in the example above). Ouch!

Also, passing encoding "ASCII" to libxml's parser doesn't prevent it from expanding entity references referring to characters outside the ASCII range. So even with my patch applied you can make XPATH() return wrong results. For example (0xe4 is the unicode codepoint representing umlaut-a):

select xpath('/r/@a', '<r a="&#xe4;"/>'::xml);

gives (*with* my patch applied)

 xpath
-------
 {ä}

So scratch the whole idea. There doesn't seem to be a simple way to make the XML type work sanely in a non-UTF-8 setting :-(.
Apart from simple input and output that is, which already seems to work correctly regardless of the server encoding. BTW, for the sake of getting this into the archives just in case someone decides to fix this and stumbles over this thread: It seems to me that the easiest way to fix XML generation in the non-UTF-8 case would be to cease using libxml for emitting XML at all. The only non-trivial use of libxml there is the escaping of attribute values, and we do already have our own escape_xml() function - it just needs to be taught the additional escapes needed for attribute values. (libxml is also used to convert binary values to base64 or hexadecimal notation, but there're no encoding issues there) best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: pgbench cpu overhead (was Re: [HACKERS] lazy vxid locks, v1)
On 07/24/2011 05:55 PM, Tom Lane wrote: Stefan Kaltenbrunner ste...@kaltenbrunner.cc writes: interesting - iirc we actually had some reports about current libpq behaviour causing scaling issues on some OSes - see http://archives.postgresql.org/pgsql-hackers/2009-06/msg00748.php and some related threads. Iirc the final patch for that was never applied though and the original author lost interest, I think that I was able to measure some noticeable performance gains back in the days but I don't think I still have the numbers somewhere.

Huh? That patch did get applied in some form or other -- at least, libpq does contain references to both SO_NOSIGPIPE and MSG_NOSIGNAL these days.

hmm yeah - you are right; when I looked that up a few hours ago I failed to find the right commit, but it was indeed committed: http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=cea80e726edd42a39bb0220290738f7825de8e57

I think I mentally mixed that up with the "compare word-at-a-time in bcTruelen" patch that was also discussed as affecting query rates for trivial queries. I actually wonder if -HEAD would show that issue even more clearly now that we have parts of Robert's performance work in the tree...

Stefan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Initial Review: JSON contrib modul was: Re: [HACKERS] Another swing at JSON
On Jul24, 2011, at 05:14 , Robert Haas wrote: On Fri, Jul 22, 2011 at 10:36 PM, Joey Adams joeyadams3.14...@gmail.com wrote: Interesting. This leads to a couple more questions:

* Should the JSON data type (eventually) have an equality operator?

+1.

+1.

* Should the JSON input function alphabetize object members by key?

I think it would probably be better if it didn't. I'm wary of overcanonicalization. It can be useful to have things come back out in more or less the format you put them in.

The downside being that we'd then either need to canonicalize in the equality operator, or live with either no equality operator or a rather strange one. Also, if we don't canonicalize now, we (or rather our users) are in for some pain should we ever decide to store JSON values in some form other than plain text. Because if we do that, we'd presumably want to order the members in some predefined way (by their hash value, for example). So, from me: +1 for alphabetizing members.

If we canonicalize strings and numbers and alphabetize object members, then our equality function is just texteq. The only stumbling block is canonicalizing numbers. Fortunately, JSON's definition of a number is its decimal syntax, so the algorithm is child's play:

* Figure out the digits and exponent.
* If the exponent is greater than 20 or less than -6 (arbitrary), use exponential notation.

The problem is: 2.718282e-1000 won't equal 0, as may be expected.

I doubt this matters much. I don't think that 2.718282e-100 SHOULD equal 0.

I agree. As for your proposed algorithm, I suggest to instead use exponential notation if it produces a shorter textual representation. In other words, for values between -1 and 1, we'd switch to exponential notation if there's more than 1 leading zero (to the right of the decimal point, of course), and for values outside that range if there are more than 2 trailing zeros and no decimal point. All after redundant zeros and decimal points are removed.
So we'd store

0       as 0
1       as 1
0.1     as 0.1
0.01    as 0.01
0.001   as 1e-3
10      as 10
100     as 100
1000    as 1e3
1000.1  as 1000.1
1001    as 1001

If, in the future, we add the ability to manipulate large JSON trees efficiently (e.g. by using an auxiliary table like TOAST does), we'll probably want unique members, so enforcing them now may be prudent.

I doubt you're going to want to reinvent TOAST, but I do think there are many advantages to forbidding duplicate keys. ISTM the question is whether to throw an error or just silently discard one of the k/v pairs. Keeping both should not be on the table, IMHO.

+1.

best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Problem with pg_upgrade's directory write check on Windows
Excerpts from Bruce Momjian's message of dom jul 24 01:46:08 -0400 2011: Robert Haas wrote: Should I fix this in pg_upgrade 9.1 for Windows or just in 9.2? The check works fine on non-Windows. Seems worth back-patching to me. Attached patch applied and backpatched to 9.1. I was able to test both code paths on my BSD machine by modifying the ifdefs. I will have EnterpriseDB do further testing.

Err, why not 9.0?

-- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] vacuumlo patch
Please consider adding this minor patch, or offering improvements.

*Problem*: vacuumlo required PostgreSQL to use more locks per transaction than possible, resulting in an error and a log file full of "ignored until end of transaction" warnings. (max_locks_per_transaction is limited by shmmax, which is limited by RAM.)

*Solution*: ask vacuumlo to complete after N lo_unlink(OID)s.

*Alternate Solution*: ask vacuumlo to start a new transaction after N lo_unlink(OID)s. (more complex, and can be implemented with the other solution in the shell)

*Design Decisions*: Add a flag so the new behavior is optional and it defaults to the old behavior.

*Implementation*: Followed code formatting style. It's likely pointless to check if an int is > 2147483647, but maybe it clarifies the code. Added a friendly error message and a line in the help.

git diff -U0 HEAD -- contrib/vacuumlo/vacuumlo.c

diff --git a/contrib/vacuumlo/vacuumlo.c b/contrib/vacuumlo/vacuumlo.c
index f6e2a28..b7c7d64 100644
--- a/contrib/vacuumlo/vacuumlo.c
+++ b/contrib/vacuumlo/vacuumlo.c
@@ -50,0 +51 @@ struct _param
+	int			transaction_limit;
@@ -284,0 +286 @@ vacuumlo(char *database, struct _param * param)
+		{
@@ -285,0 +288,5 @@ vacuumlo(char *database, struct _param * param)
+			if (param->transaction_limit != 0 && deleted >= param->transaction_limit)
+			{
+				break;
+			}
+		}
@@ -315,0 +323 @@ usage(const char *progname)
+	printf("  -l LIMIT     stop after removing LIMIT LOs\n");
@@ -344,0 +353 @@ main(int argc, char **argv)
+	param.transaction_limit = 0;
@@ -362 +371 @@ main(int argc, char **argv)
-		c = getopt(argc, argv, "h:U:p:vnwW");
+		c = getopt(argc, argv, "h:U:p:l:vnwW");
@@ -397,0 +407,8 @@ main(int argc, char **argv)
+			case 'l':
+				param.transaction_limit = strtol(optarg, NULL, 10);
+				if ((param.transaction_limit < 0) || (param.transaction_limit > 2147483647))
+				{
+					fprintf(stderr, "%s: invalid transaction limit number: %s, valid range is from 0 (disabled) to 2147483647.\n", progname, optarg);
+					exit(1);
+				}
+				break;
[HACKERS] SSL-mode error reporting in libpq
In testing the fix for the SSL problem that Martin Pihlak reported, I realized that libpq doesn't really cope very well with errors reported by OpenSSL. In the case at hand, SSL_write returns an SSL_ERROR_SSL code, which pqsecure_write quite reasonably handles by putting SSL error: bad write retry into conn-errorMessage. However, it then sets errno = ECONNRESET, which causes its caller pqSendSome() to overwrite that potentially-useful message with an outright lie: server closed the connection unexpectedly. I think what we ought to do is adjust the code so that in SSL mode, pqsecure_write is responsible for constructing all error messages and pqSendSome should just leave conn-errorMessage alone. We could perhaps go a bit further and make pqsecure_write responsible for the error message in non-SSL mode too, but it looks to me like pqSendSome has to have a switch on the errno anyway to decide whether to keep trying or not, so moving that responsibility would just lead to duplicative coding. Any objections? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq SSL with non-blocking sockets
Martin Pihlak martin.pih...@gmail.com writes: If possible I would like the fix to be backported as well. This is quite a nasty bug and difficult to track down. Especially as the actual SSL error messages are masked by the server closed the connection unexpectedly message. I've applied the simplified fix (just set SSL_MODE_ACCEPT_MOVING_WRITE_BUFFER) as well as a patch to improve the error reporting situation. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Problem with pg_upgrade's directory write check on Windows
Alvaro Herrera wrote: Excerpts from Bruce Momjian's message of dom jul 24 01:46:08 -0400 2011: Robert Haas wrote: Should I fix this in pg_upgrade 9.1 for Windows or just in 9.2? The check works fine on non-Windows. Seems worth back-patching to me. Attached patch applied and backpatched to 9.1. I was able to test both code paths on my BSD machine by modifying the ifdefs. I will have EnterpriseDB do further testing. Err, why not 9.0?

The check did not exist in 9.0 -- I mentioned that in the commit message. I could add the check into 9.0, but we usually don't backpatch such things.

-- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Initial Review: JSON contrib modul was: Re: [HACKERS] Another swing at JSON
On Sat, Jul 23, 2011 at 11:14 PM, Robert Haas robertmh...@gmail.com wrote: I doubt you're going to want to reinvent TOAST, ...

I was thinking about making it efficient to access or update foo.a.b.c.d[1000] in a huge JSON tree. Simply TOASTing the varlena text means we have to unpack the entire datum to access and update individual members. An alternative would be to split the JSON into chunks (possibly by using the pg_toast_id table) and have some sort of index that can be used to efficiently look up values by path. This would not be trivial, and I don't plan to implement it any time soon.

On Sun, Jul 24, 2011 at 2:19 PM, Florian Pflug f...@phlo.org wrote: On Jul24, 2011, at 05:14 , Robert Haas wrote: On Fri, Jul 22, 2011 at 10:36 PM, Joey Adams joeyadams3.14...@gmail.com wrote: ... Fortunately, JSON's definition of a number is its decimal syntax, so the algorithm is child's play:

* Figure out the digits and exponent.
* If the exponent is greater than 20 or less than -6 (arbitrary), use exponential notation.

I agree. As for your proposed algorithm, I suggest to instead use exponential notation if it produces a shorter textual representation. In other words, for values between -1 and 1, we'd switch to exponential notation if there's more than 1 leading zero (to the right of the decimal point, of course), and for values outside that range if there are more than 2 trailing zeros and no decimal point. All after redundant zeros and decimal points are removed.

So we'd store

0       as 0
1       as 1
0.1     as 0.1
0.01    as 0.01
0.001   as 1e-3
10      as 10
100     as 100
1000    as 1e3
1000.1  as 1000.1
1001    as 1001

Interesting idea. The reason I suggested using exponential notation only for extreme exponents (less than -6 or greater than +20) is partly for presentation value. Users might be annoyed to see 1000000 turned into 1e6. Moreover, applications working solely with integers that don't expect the floating point syntax may choke on the converted numbers.
32-bit integers can be losslessly encoded as IEEE double-precision floats (JavaScript's internal representation), and JavaScript's algorithm for converting a number to a string ([1], section 9.8.1) happens to preserve the integer syntax (I think). Should we follow the JavaScript standard for rendering numbers (which my suggestion approximates)? Or should we use the shortest encoding as Florian suggests? - Joey [1]: http://www.ecma-international.org/publications/files/ECMA-ST-ARCH/ECMA-262%205th%20edition%20December%202009.pdf -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] python cleanup
On 04/24/2011 07:31 PM, Peter Eisentraut wrote: On sön, 2011-04-24 at 12:25 -0400, Tom Lane wrote: This file is in fundamental violation of the first commandment of Postgres #includes, which is "thou shalt have no other gods before c.h". We need to put postgres.h *before* the Python.h include. I don't know what issues led to the current arrangement, but it is fraught with portability gotchas. In particular it's just about guaranteed to fail on platforms where stdio.h reacts to _FILE_OFFSET_BITS --- plpython.c is going to get compiled expecting a different stdio library than the rest of the backend.

Here is where this happened:

commit ab6ee1f9fc7039b1e8d8ebf939da3fd55e73efad
Author: Joe Conway m...@joeconway.com
Date:   Thu Aug 5 03:10:29 2004 +0000

    Move include for Python.h above postgres.h to eliminate compiler warning.

diff --git a/src/pl/plpython/plpython.c b/src/pl/plpython/plpython.c
index 76ea031..07eed86 100644
--- a/src/pl/plpython/plpython.c
+++ b/src/pl/plpython/plpython.c
@@ -29,11 +29,12 @@
  * MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS, OR MODIFICATIONS.
  *
  * IDENTIFICATION
- *	$PostgreSQL: pgsql/src/pl/plpython/plpython.c,v 1.52 2004/08/04 21:34:29 tgl Exp $
+ *	$PostgreSQL: pgsql/src/pl/plpython/plpython.c,v 1.53 2004/08/05 03:10:29 joe Exp $
  *
  *
  */
 
+#include <Python.h>
 #include "postgres.h"
 
 /* system stuff */
@@ -54,7 +55,6 @@
 #include "utils/syscache.h"
 #include "utils/typcache.h"
 
-#include <Python.h>
 #include <compile.h>
 #include <eval.h>

If you switch it back around, you indeed get a bunch of annoying warnings. This will need some playing around to get right.

On my Linux system the attached compiles without warnings. If this seems like the way to go I'll investigate more on Windows.
cheers

andrew

diff --git a/src/pl/plpython/plpython.c b/src/pl/plpython/plpython.c
index e03d7ce..cfe7f78 100644
--- a/src/pl/plpython/plpython.c
+++ b/src/pl/plpython/plpython.c
@@ -6,6 +6,21 @@
  *
  */
 
+#include "postgres.h"
+
+/*
+ * Save settings the Python headers might override
+ */
+#ifdef _POSIX_C_SOURCE
+#define _PGSAVE_POSIX_C_SOURCE _POSIX_C_SOURCE
+#undef _POSIX_C_SOURCE
+#endif
+
+#ifdef _XOPEN_SOURCE
+#define _PGSAVE_XOPEN_SOURCE _XOPEN_SOURCE
+#undef _XOPEN_SOURCE
+#endif
+
 #if defined(_MSC_VER) && defined(_DEBUG)
 /* Python uses #pragma to bring in a non-default libpython on VC++ if
  * _DEBUG is defined */
@@ -84,7 +99,20 @@ typedef int Py_ssize_t;
 	PyObject_HEAD_INIT(type) size,
 #endif
 
-#include "postgres.h"
+/*
+ * Restore settings the Python headers might have overridden.
+ */
+#ifdef _PGSAVE_POSIX_C_SOURCE
+#undef _POSIX_C_SOURCE
+#define _POSIX_C_SOURCE _PGSAVE_POSIX_C_SOURCE
+#undef _PGSAVE_POSIX_C_SOURCE
+#endif
+
+#ifdef _PGSAVE_XOPEN_SOURCE
+#undef _XOPEN_SOURCE
+#define _XOPEN_SOURCE _PGSAVE_XOPEN_SOURCE
+#undef _PGSAVE_XOPEN_SOURCE
+#endif
 
 /* system stuff */
 #include <unistd.h>

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq SSL with non-blocking sockets
On 24 July 2011 21:33, Tom Lane t...@sss.pgh.pa.us wrote: I've applied the simplified fix (just set SSL_MODE_ACCEPT_MOVING_WRITE_BUFFER) as well as a patch to improve the error reporting situation.

I'm not exactly sure why, and don't have time to investigate right now, but this commit (fee476da952a1f02f7ccf6e233fb4824c2bf6af4) appears to have broken the build for me:

gcc -O0 -g -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wformat-security -fno-strict-aliasing -fwrapv -g -pthread -D_REENTRANT -D_THREAD_SAFE -D_POSIX_PTHREAD_SEMANTICS -fpic -DFRONTEND -DUNSAFE_STAT_OK -I. -I../../../src/include -D_GNU_SOURCE -I../../../src/port -I../../../src/port -DSO_MAJOR_VERSION=5 -c -o fe-misc.o fe-misc.c -MMD -MP -MF .deps/fe-misc.Po
fe-misc.c: In function ‘pqReadData’:
fe-misc.c:651:11: error: ‘PGconn’ has no member named ‘ssl’
fe-misc.c:743:11: error: ‘PGconn’ has no member named ‘ssl’
fe-misc.c:761:10: error: ‘PGconn’ has no member named ‘ssl’
fe-misc.c: In function ‘pqSendSome’:
fe-misc.c:841:14: error: ‘PGconn’ has no member named ‘ssl’
fe-misc.c:861:14: error: ‘PGconn’ has no member named ‘ssl’

The problem goes away if I check out the commit made immediately prior to this one - d0c23026b2499ba9d6797359241ade076a5a677d. I'm building with my usual, rather generic settings for hacking.

-- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Initial Review: JSON contrib modul was: Re: [HACKERS] Another swing at JSON
On Jul25, 2011, at 00:48 , Joey Adams wrote:

On Sun, Jul 24, 2011 at 2:19 PM, Florian Pflug f...@phlo.org wrote:

On Jul24, 2011, at 05:14 , Robert Haas wrote:

On Fri, Jul 22, 2011 at 10:36 PM, Joey Adams joeyadams3.14...@gmail.com wrote:

... Fortunately, JSON's definition of a number is its decimal syntax, so the algorithm is child's play:

* Figure out the digits and exponent.
* If the exponent is greater than 20 or less than -6 (arbitrary), use exponential notation.

I agree. As for your proposed algorithm, I suggest instead using exponential notation whenever it produces a shorter textual representation. In other words, for values between -1 and 1, we'd switch to exponential notation if there's more than 1 leading zero (to the right of the decimal point, of course), and for values outside that range if there're more than 2 trailing zeros and no decimal point -- all after redundant zeros and decimal points are removed. So we'd store

0      as 0
1      as 1
0.1    as 0.1
0.01   as 0.01
0.001  as 1e-3
10     as 10
100    as 100
1000   as 1e3
1000.1 as 1000.1
1001   as 1001

Interesting idea. The reason I suggested using exponential notation only for extreme exponents (less than -6 or greater than +20) is partly for presentation value. Users might be annoyed to see 1000000 turned into 1e6.

I'm not concerned about that, but ...

Moreover, applications working solely with integers that don't expect the floating-point syntax may choke on the converted numbers.

... now that you say it, that's definitely a concern.

32-bit integers can be losslessly encoded as IEEE double-precision floats (JavaScript's internal representation), and JavaScript's algorithm for converting a number to a string ([1], section 9.8.1) happens to preserve the integer syntax (I think).

Indeed. In fact, it seems to be designed to use the integer syntax for all integral values with <= 66 binary digits. log10(2^66) ~ 19.87

Should we follow the JavaScript standard for rendering numbers (which my suggestion approximates)?
Or should we use the shortest encoding, as Florian suggests?

In the light of the above, consider my suggestion withdrawn. I now think we should just follow the JavaScript standard as closely as possible. As you said, it's pretty much the same as your suggestion, just more precise in the handling of some corner cases like infinity, NaN, +/-0, and some questions of leading and trailing zeros.

I wouldn't have made my suggestion had I realized earlier that the limit of 20 for the exponent was carefully chosen to ensure that the full range of a 64-bit integer value would be represented in non-exponential notation. I assumed the bracketed "arbitrary" in your description applied to both the upper (20) and the lower (-6) bound, when it really only applies to the lower bound. Sorry for that.

(I am now curious where the seemingly arbitrary lower bound of -6 comes from, though. The only explanation I can come up with is that somebody figured that 0.01 is still easily distinguished visually from 0.1, but not so much from 0.001.)

best regards,
Florian Pflug
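The JavaScript rendering rule referenced above can be sketched in C. This is a hypothetical illustration of the notation choice from ECMA-262 section 9.8.1 (not PostgreSQL code, and it omits sign, zero, NaN, and Infinity handling): writing the value as 0.digits * 10^n, plain decimal notation is used while -6 < n <= 21, and exponential notation otherwise.

```c
#include <stdio.h>
#include <string.h>

/* Append z zero characters to out (bounds-checked). */
static void append_zeros(char *out, size_t outlen, int z)
{
    size_t len = strlen(out);

    while (z-- > 0 && len + 1 < outlen)
        out[len++] = '0';
    out[len] = '\0';
}

/*
 * Hypothetical sketch of ECMA-262 9.8.1's notation choice.  "digits"
 * holds the significant digits (no leading or trailing zeros) and n
 * places the decimal point: the value is 0.<digits> * 10^n.
 */
static void js_number_to_string(const char *digits, int n,
                                char *out, size_t outlen)
{
    int k = (int) strlen(digits);   /* number of significant digits */

    if (n > 21 || n <= -6)
    {
        /* exponential notation: d[.ddd]e(+/-)x, with x = n - 1 */
        if (k == 1)
            snprintf(out, outlen, "%ce%+d", digits[0], n - 1);
        else
            snprintf(out, outlen, "%c.%se%+d", digits[0], digits + 1, n - 1);
    }
    else if (n >= k)
    {
        /* plain integer, padded with trailing zeros */
        snprintf(out, outlen, "%s", digits);
        append_zeros(out, outlen, n - k);
    }
    else if (n > 0)
    {
        /* decimal point falls inside the digit string */
        snprintf(out, outlen, "%.*s.%s", n, digits, digits + n);
    }
    else
    {
        /* 0.00...digits, with -n leading zeros after the point */
        snprintf(out, outlen, "0.");
        append_zeros(out, outlen, -n);
        strncat(out, digits, outlen - strlen(out) - 1);
    }
}
```

With this rule, digits "1" with n = 4 yields "1000" (so integers keep their integer syntax up to the 21-digit bound), while n = 22 yields "1e+21" and n = -6 yields "1e-7" -- matching the -6/+20 thresholds discussed above.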
Re: [HACKERS] libpq SSL with non-blocking sockets
On 07/24/2011 07:49 PM, Peter Geoghegan wrote:

On 24 July 2011 21:33, Tom Lane t...@sss.pgh.pa.us wrote:

I've applied the simplified fix (just set SSL_MODE_ACCEPT_MOVING_WRITE_BUFFER) as well as a patch to improve the error reporting situation.

I'm not exactly sure why, and don't have time to investigate right now, but this commit (fee476da952a1f02f7ccf6e233fb4824c2bf6af4) appears to have broken the build for me:

Yeah, looks like it fails unless --with-openssl is set.

cheers

andrew
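The failure mode Andrew identifies is the classic one for conditionally compiled struct fields: the member only exists in SSL builds, so every reference must sit behind the same symbol. A minimal standalone sketch of the pattern (MiniConn and its fields are stand-ins invented for illustration, not libpq's actual definitions):

```c
#include <stddef.h>

/*
 * Stand-in for libpq's pattern: the ssl field exists only in builds
 * configured with SSL support, so any unguarded reference to it makes
 * non-SSL builds fail exactly as reported above.
 */
typedef struct MiniConn
{
    int sock;
#ifdef USE_SSL
    void *ssl;              /* SSL state; absent without --with-openssl */
#endif
} MiniConn;

/* Every use of the conditional field is wrapped in the same #ifdef. */
static int conn_uses_ssl(const MiniConn *conn)
{
#ifdef USE_SSL
    return conn->ssl != NULL;
#else
    (void) conn;            /* silence unused-parameter warning */
    return 0;
#endif
}
```

The sketch compiles the same way with or without -DUSE_SSL, which is the property the broken commit lost.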
Re: [HACKERS] vacuumlo patch
Excerpts from Tim's message of dom jul 24 14:48:08 -0400 2011:

Please consider adding this minor patch, or offering improvements.

*Problem*: vacuumlo required PostgreSQL to use more locks per transaction than possible, resulting in an error and a log file full of "ignored until end of transaction" warnings. (max_locks_per_transaction is limited by shmmax, which is limited by RAM.)

Interesting. I cannot vouch for or against the patch, but please resubmit without the -U0 switch.

--
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] Policy on pulling in code from other projects?
Excerpts from Dave Page's message of sáb jul 23 02:25:30 -0400 2011:

Also consider whether the library is widely available on common distros or not. If not, packagers are going to have to start packaging it first, in order to build the PostgreSQL packages. This is a *huge* issue for us if we want to use wxWidgets addon libraries with pgAdmin.

More likely, they are going to ignore it and pass the --disable-liburi (whatever) configure parameter, and the functionality is going to be absent most of the time anyway.

--
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] python cleanup
Andrew Dunstan and...@dunslane.net writes:

On my Linux system the attached compiles without warnings. If this seems like the way to go I'll investigate more on Windows.

Hmm ...

+/*
+ * Save settings the Python headers might override
+ */
+#ifdef _POSIX_C_SOURCE
+#define _PGSAVE_POSIX_C_SOURCE _POSIX_C_SOURCE
+#undef _POSIX_C_SOURCE
+#endif

...

+/*
+ * Restore settings the Python headers might have overridden.
+ */
+#ifdef _PGSAVE_POSIX_C_SOURCE
+#undef _POSIX_C_SOURCE
+#define _POSIX_C_SOURCE _PGSAVE_POSIX_C_SOURCE
+#undef _PGSAVE_POSIX_C_SOURCE
+#endif

I don't believe that this sequence will restore the contents of the _POSIX_C_SOURCE macro to what it was before. For that matter, it's not even quite right about ensuring that the macro's defined-ness status is restored (what if the python headers set _POSIX_C_SOURCE when it wasn't set before?). We might not need more than defined-ness to be right, though.

What in the world are the python headers doing fooling with these macros, anyway??

regards, tom lane
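Tom's first objection is that `#define _POSIX_C_SOURCE _PGSAVE_POSIX_C_SOURCE` defines the macro to be the *token* `_PGSAVE_POSIX_C_SOURCE`, not its saved value -- and once the save macro is #undef'd, the token expands to nothing useful. On compilers that support it (GCC, Clang, and MSVC all do), `#pragma push_macro`/`pop_macro` saves and restores the definition itself. A standalone sketch using a stand-in macro name (FEATURE_TEST_MACRO here is hypothetical; the same dance would apply to _POSIX_C_SOURCE):

```c
#include <string.h>

/* Stand-in for _POSIX_C_SOURCE with a typical feature-test value. */
#define FEATURE_TEST_MACRO 200112L

/* Two-level stringification so the macro is expanded before quoting. */
#define PG_STR2(x) #x
#define PG_STR(x) PG_STR2(x)

static const char *const before = PG_STR(FEATURE_TEST_MACRO);

#pragma push_macro("FEATURE_TEST_MACRO")
#undef FEATURE_TEST_MACRO
#define FEATURE_TEST_MACRO 700          /* the "Python headers" clobber it */
static const char *const during = PG_STR(FEATURE_TEST_MACRO);
#pragma pop_macro("FEATURE_TEST_MACRO")

static const char *const after = PG_STR(FEATURE_TEST_MACRO);

/* Returns 1 if the original definition survived the round trip. */
static int macro_restored(void)
{
    return strcmp(before, after) == 0 && strcmp(during, "700") == 0;
}
```

Unlike the save/restore #defines in the patch, pop_macro also restores a previously-undefined state correctly, which addresses Tom's second objection.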
Re: [HACKERS] vacuumlo patch
Hi Álvaro, thanks for the response. Here is the requested diff with 3 lines of context:

git diff HEAD -- contrib/vacuumlo/vacuumlo.c

diff --git a/contrib/vacuumlo/vacuumlo.c b/contrib/vacuumlo/vacuumlo.c
index f6e2a28..b7c7d64 100644
--- a/contrib/vacuumlo/vacuumlo.c
+++ b/contrib/vacuumlo/vacuumlo.c
@@ -48,6 +48,7 @@ struct _param
 	char	   *pg_host;
 	int			verbose;
 	int			dry_run;
+	int			transaction_limit;
 };
 
 int			vacuumlo(char *, struct _param *);
@@ -282,7 +283,13 @@ vacuumlo(char *database, struct _param * param)
 				fprintf(stderr, "%s", PQerrorMessage(conn));
 			}
 			else
+			{
 				deleted++;
+				if (param->transaction_limit != 0 && deleted >= param->transaction_limit)
+				{
+					break;
+				}
+			}
 		}
 		else
 			deleted++;
@@ -313,6 +320,7 @@ usage(const char *progname)
 	printf("  -h HOSTNAME  database server host or socket directory\n");
 	printf("  -n           don't remove large objects, just show what would be done\n");
 	printf("  -p PORT      database server port\n");
+	printf("  -l LIMIT     stop after removing LIMIT LOs\n");
 	printf("  -U USERNAME  user name to connect as\n");
 	printf("  -w           never prompt for password\n");
 	printf("  -W           force password prompt\n");
@@ -342,6 +350,7 @@ main(int argc, char **argv)
 	param.pg_port = NULL;
 	param.verbose = 0;
 	param.dry_run = 0;
+	param.transaction_limit = 0;
 
 	if (argc > 1)
 	{
@@ -359,7 +368,7 @@ main(int argc, char **argv)
 	while (1)
 	{
-		c = getopt(argc, argv, "h:U:p:vnwW");
+		c = getopt(argc, argv, "h:U:p:l:vnwW");
 		if (c == -1)
 			break;
@@ -395,6 +404,14 @@ main(int argc, char **argv)
 			}
 			param.pg_port = strdup(optarg);
 			break;
+		case 'l':
+			param.transaction_limit = strtol(optarg, NULL, 10);
+			if ((param.transaction_limit < 0) || (param.transaction_limit > 2147483647))
+			{
+				fprintf(stderr, "%s: invalid transaction limit number: %s, valid range is from 0 (disabled) to 2147483647.\n", progname, optarg);
+				exit(1);
+			}
+			break;
 		case 'h':
 			param.pg_host = strdup(optarg);
 			break;
Re: [HACKERS] Tracing in Postgres
I want to retain all the error messages and error reporting used by Postgres. I don't intend to log any information other than what is provided by Postgres. But I want to replace the implementation of the logging/tracing in Postgres, so that the existing messages can be redirected to a file, a USB device, etc. There is an existing tracing framework for this; I intend to use the APIs provided by that framework.

Regards,
Harshitha

On Fri, Jul 22, 2011 at 9:08 PM, Robert Haas robertmh...@gmail.com wrote:

On Fri, Jul 22, 2011 at 5:00 AM, Harshitha S hershe...@gmail.com wrote:

I am trying to integrate a tracing framework into the Postgres code. I need to know if elog.c under backend/utils/error is the place where the changes can be made.

Depends on what your tracing framework is trying to do.

The tracing framework that I want to integrate has some additional capability. I want to replace the tracing and logging functionality in the existing Postgres framework with the APIs used in this framework, without making changes in every file. If anybody has any inputs on this, please help me.

If you just want to analyze the log messages, I would suggest letting PostgreSQL write them out to files and then postprocessing the files. It will be less work. Also, there's a pretty limited amount of processing that is safe to do in elog.c. Some arbitrary part of the system has blown up, so you can't (just to take one example) read from the database at that point.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
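The kind of indirection Harshitha describes -- keep the existing message-emitting call sites, but make only the output sink pluggable -- can be sketched generically. This is not PostgreSQL code; the function and variable names below are invented for illustration:

```c
#include <stdarg.h>
#include <stdio.h>

/* Pluggable output sink; NULL falls back to stderr. */
static FILE *trace_sink = NULL;

/* Point the tracer at a file, a USB device node, etc. */
static void trace_set_sink(FILE *f)
{
    trace_sink = f;
}

/* printf-style tracing call; appends a newline and flushes. */
static void trace_log(const char *fmt, ...)
{
    va_list ap;
    FILE *out = trace_sink ? trace_sink : stderr;

    va_start(ap, fmt);
    vfprintf(out, fmt, ap);
    va_end(ap);
    fputc('\n', out);
    fflush(out);
}
```

The point of the indirection is that swapping the destination touches one function, not every call site -- which is also why Robert's caveat applies: whatever the sink does must stay safe to run from error context.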
Re: Initial Review: JSON contrib modul was: Re: [HACKERS] Another swing at JSON
On Sun, Jul 24, 2011 at 2:19 PM, Florian Pflug f...@phlo.org wrote:

The downside being that we'd then either need to canonicalize in the equality operator, or live with either no equality operator or a rather strange one.

It just occurred to me that, even if we sort object members, texteq might not be a sufficient way to determine equality. In particular, IEEE floats treat +0 and -0 as two different things, but they are equal when compared. Note that we're only dealing with a decimal representation; we're not (currently) converting to double-precision representation and back.

Should we mimic IEEE floats and preserve -0 versus +0 while treating them as equal? Or should we treat JSON floats like numeric and convert -0 to 0 on input? Or should we do something else? I think converting -0 to 0 would be a bad idea, as it would violate the intuitive assumption that JSON can be used to marshal double-precision floats.

- Joey
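The +0/-0 problem is easy to see directly with C's IEEE 754 doubles: the two values compare equal numerically, yet render differently as text, which is exactly why texteq over a non-canonical text form is not a sound equality. A small illustration (the json_* names are hypothetical, not the module's API):

```c
#include <stdio.h>
#include <string.h>

/* Numeric comparison: IEEE 754 defines -0.0 == 0.0 as true. */
static int json_numbers_equal(double a, double b)
{
    return a == b;
}

/* Textual rendering: %g preserves the sign of negative zero. */
static void json_render_number(double v, char *buf, size_t len)
{
    snprintf(buf, len, "%g", v);
}
```

So a text-based equality sees "-0" != "0" even though the values are numerically equal -- either the input must be canonicalized or the comparison must parse the numbers.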
Re: Initial Review: JSON contrib modul was: Re: [HACKERS] Another swing at JSON
On Mon, Jul 25, 2011 at 1:05 AM, Joey Adams joeyadams3.14...@gmail.com wrote:

Should we mimic IEEE floats and preserve -0 versus +0 while treating them as equal? Or should we treat JSON floats like numeric and convert -0 to 0 on input? Or should we do something else? I think converting -0 to 0 would be a bad idea, as it would violate the intuitive assumption that JSON can be used to marshal double-precision floats.

On the other hand, JavaScript's own .toString and JSON.stringify turn -0 into 0, so JSON can't marshal -0 around anyway (in practice). Now I think turning -0 into 0 would be fine for canonicalizing numbers in json_in.

- Joey