Re: pgbench - adding pl/pgsql versions of tests
Hello Nathan, I'm unclear about what variety of scripts could be provided given the tables made available with pgbench. ISTM that other scenarios would involve both an initialization and associated scripts, and any proposal would be barred because it would open the door to anything. Why's that? Just a wild guess based on 19 years of occasional contributions to pg and pgbench in particular:-) I'm not aware of any project policy that prohibits such enhancements to pgbench. Attempts at extending pgbench often fall under "you can do it outside (eg with a custom script) so there is no need to put that in pgbench, as it would add to the maintenance burden with a weak benefit, proven by the fact that it is not there already". It might take some effort to gather consensus on a proposal like this, but IMHO that doesn't mean we shouldn't try. Done it in the past. Probably will do it again in the future:-) If the prevailing wisdom is that we shouldn't add more built-in scripts because there is an existing way to provide custom ones, then it's not clear that we should proceed with $SUBJECT, anyway. I'm afraid there is that argument. I do not think that this policy is good wrt $SUBJECT; ISTM that having an easy way to test something with a PL/pgSQL function would help promote the language by advertising/showing the potential performance benefit (or not, depending). Just one function would be enough for that. -- Fabien.
Re: pgbench: allow to exit immediately when any client is aborted
About pgbench exit on abort v4: Patch applies cleanly, compiles, local make check ok, doc looks ok. This looks ok to me. -- Fabien.
Re: LLVM 16 (opaque pointers)
[...] Further changes are already needed for their "main" branch (LLVM 17-to-be), so this won't quite be enough to shut seawasp up. For information, the physical server which was hosting my 2 bf animals (seawasp and moonjelly) has given up rebooting after a power cut a few weeks/months ago, and I have not set up a replacement (yet). -- Fabien.
Re: pgbench - adding pl/pgsql versions of tests
Hello Nathan, 1. so I don't have to create the script and function manually each time I want to test mainly the database (instead of the client-database system) 2. so that new users of PostgreSQL can easily see how much better OLTP workloads perform when packaged up as a server-side function I'm not sure we should add micro-optimized versions of the existing scripts to pgbench. Your point about demonstrating the benefits of server-side functions seems reasonable, but it also feels a bit like artificially improving pgbench numbers. I think I'd rather see some more variety in the built-in scripts so that folks can more easily test a wider range of common workloads. Perhaps this could include a test that is focused on server-side functions. ISTM that your argument suggests keeping the tpcb-like PL/pgSQL version. It is the most beneficial anyway as it merges 4/5 commands in one call, so it demonstrates the effect of investing in this kind of approach. I'm unclear about what variety of scripts could be provided given the tables made available with pgbench. ISTM that other scenarios would involve both an initialization and associated scripts, and any proposal would be barred because it would open the door to anything. In any case, it looks like there is unaddressed feedback for this patch, so I'm marking it as "waiting on author." Indeed. -- Fabien.
Re: pgbench with libevent?
4. libevent development seems sluggish: the last bugfix was published 3 years ago and version 2.2 has been baking for years, but the development seems lively (+100 contributors). Ugh, I would stay away from something like that. Would we become hostage to an undelivering group? No thanks. Ok. Or maybe libuv (used by nodejs?). Note: libev had no updates in 8 years. libev or libuv? No updates in 8 years => dead. No way. Sorry, it was not a typo, but the information was not very explicit. I have looked at 3 libraries: libevent, libuv and libev. libuv is quite lively, last updated 2023-06-30. libev is an often cited library, which indeed seems quite dead, so I was "noting" that I had discarded it, but it looked like a typo. Reworking based on wait events as proposed downthread sounds more promising. The wait event postgres backend implementation would require a lot of work to be usable in a client context. My current investigation is that libuv could be the reasonable target, if any, especially as it seems to provide a portable thread pool as well. -- Fabien.
Re: pgbench with libevent?
Interesting. In my understanding this also needs to make Latch frontend-friendly? It could be refactored to support a different subset of event types -- maybe just sockets, no latches and obviously no 'postmaster death'. But figuring out how to make latches work between threads might also be interesting for future projects... Maybe Fabien has completion-based I/O in mind (not just "readiness"). Pgbench is really a primitive client on top of libpq. ISTM that completion-based I/O would require enhancing libpq's asynchronicity, not just exposing its underlying fd to allow asynchronous implementations. Currently pgbench only actually "waits" for results from the server, testing PQisBusy to check whether they are there. That's something that some of those libraries can do, IIUC. For example, when your thread wakes up, it tells you "your socket read is finished, the data is already in your target buffer". Indeed, libevent has a higher level "buffer" oriented API. As opposed to "you can now call recv() without blocking", so you avoid another trip into the kernel. But that's also something we'll eventually want to figure out in the server. -- Fabien.
Re: pgbench with libevent?
Hello Thomas, Pgbench is managing clients I/Os manually with select or poll. Much of this could be managed by libevent. Or maybe libuv (used by nodejs?). From preliminary testing libevent seems not too good at the fine-grained time management used for throttling, whereas libuv advertises being good at it, although what it does remains to be seen. Do you think our WaitEventSet stuff could be good here, if made frontend-friendly? Interesting question. AFAICS, the answer is that it could indeed probably fit the task, but it would require significant work to make it thread-compatible, and to untangle it from IsUnderPostmaster/postmaster death, memory contexts, elog/ereport, and other backend-specific stuff. If you remove all that with a clean abstraction (quite a task), then once done the question could be why not use libevent/libuv/… in the backend instead of maintaining more or less the same thing inside postgres? So ISTM that as far as pgbench is concerned it would be much simpler to use libevent/libuv/… directly, if the pros are enough and the cons not prohibitive, and provided that the needed detailed features are really there. -- Fabien.
Re: Make psql's qeury canceling test simple by using signal() routine of IPC::Run
Bonjour Michaël, On Sun, Aug 13, 2023 at 11:22:33AM +0200, Fabien COELHO wrote: Test run is ok on my Ubuntu laptop. I have a few comments about this patch. Argh, sorry! I looked at what was removed (a lot) from the previous version, not what was remaining and should also have been removed. -- Fabien.
Re: pgbench with libevent?
Pgbench is managing clients I/Os manually with select or poll. Much of this could be managed by libevent. Or maybe libuv (used by nodejs?). From preliminary testing libevent seems not too good at the fine-grained time management used for throttling, whereas libuv advertises being good at it, although what it does remains to be seen. Note: libev had no updates in 8 years. -- Fabien.
pgbench with libevent?
Hello devs, Pgbench is managing clients I/Os manually with select or poll. Much of this could be managed by libevent.

Pros:
1. libevent is portable, stable, and widely used (eg Chromium, Memcached, PgBouncer).
2. libevent implements more I/O wait methods, which may be more efficient on some platforms (eg FreeBSD kqueue, Windows wepoll in libevent 2.2 alpha), and hides portability issues.
3. it would remove significant portions of unattractive pgbench code, esp. in threadRun, and around the socket/poll abstraction and portability layer.
4. depending on the number of event loops, the client load could be shared more evenly. Currently a thread only manages its own clients, so some client I/Os may be waiting to be processed while other threads could be available to process them.

Cons:
1. it adds a libevent dependency to postgres. This may be a no-go from the start.
2. this is a significant refactoring project which implies a new internal architecture and adds new code to process and generate appropriate events.
3. libevent's ability to function efficiently in a highly multithreaded environment is unclear. Should there be one event queue which generates a shared work queue? Or several event queues, one per thread (which would remove the sharing pro)? Or something in between? Some experiments and configurability may be desirable. This may also have an impact on the pgbench user interface and output depending on the result, eg there may be specialized event and worker threads, some statistics may be slightly different, new options may be needed…
4. libevent development seems sluggish: the last bugfix was published 3 years ago and version 2.2 has been baking for years, but the development seems lively (+100 contributors).

Neutral?
1. BSD 3-clause license.

Is pros > cons, or not? Other thoughts, pros, cons? -- Fabien.
Re: pgbench: allow to exit immediately when any client is aborted
Hello Yugo-san, I attached the updated patch v3 including changes above, a test, and fix of the typo you pointed out. I'm sorry but the test in the previous patch was incorrect. I attached the correct one. About pgbench exit on abort v3: Patch does not "git apply", but is ok with "patch" although there are some minor warnings. Compiles. Code is ok. Tests are ok. About Test: The code is simple to get an error quickly but after a few transactions, good. I'll do a minimal "-c 2 -j 2 -t 10" instead of "-c 4 -j 4 -T 10". -- Fabien.
Re: Make psql's qeury canceling test simple by using signal() routine of IPC::Run
Hello Yugo-san, Currently, the psql test of query cancelling (src/bin/psql/t/020_cancel.pl) gets the PPID of a running psql by using the "\!" meta command, and sends SIGINT to the process by using "kill". However, IPC::Run provides a signal() routine that sends a signal to a running process, so I think we can rewrite the test using this routine in a simpler fashion, as in the attached patch. What do you think about it? I'm the one who pointed out signal(), so I'm a little biased; nevertheless, regarding the patch: Patch applies with "patch". Test code is definitely much simpler. Test run is ok on my Ubuntu laptop. Let's drop 25 lines of perl! -- Fabien.
Re: pgbnech: allow to cancel queries during benchmark
Hello Yugo-san, Some feedback about v2. There is some dead code (&& false) which should be removed. Maybe it should check that cancel is not NULL before calling PQcancel? I think this is already checked as below, but am I missing something?

+ if (all_state[i].cancel != NULL)
+     (void) PQcancel(all_state[i].cancel, errbuf, sizeof(errbuf));

After looking at the code, I'm very unclear whether there may be some underlying race conditions, or not, depending on when the cancel is triggered. I think that some race conditions are still likely given the current thread 0 implementation, and dealing with them with a barrier or whatever is not desirable at all. In order to work around this issue, ISTM that we should go away from the simple and straightforward thread 0 approach, and the only clean way is that the cancelation should be managed by each thread for its own clients. I'd suggest to have advanceState call PQcancel when CancelRequested is set and switch to CSTATE_ABORTED to end properly. This means that there would be no need for the global client states pointer, so the patch should be smaller and simpler. Possibly there would be some shortcuts added here and there to avoid lingering after the control-C, in threadRun. I am not sure this approach is simpler than mine. My argument is more about latent race conditions and inter-thread interference than code simplicity. In multi-threads, only one thread can catch the signal and the other threads continue to run. Yep. This is why I see plenty of uncontrolled race conditions if thread 0 cancels clients which are managed in parallel by other threads and may be active. I'm not really motivated to delve into libpq internals to check whether there are possibly bad issues or not, but if two threads write messages to the same socket at the same time, I assume that this can be bad if you are unlucky.
ISTM that the rational convention should be that each thread cancels its own clients, which ensures that there is no bad interaction between threads. Therefore, if Ctrl+C is pressed while threads are waiting for responses from the backend in wait_on_socket_set, only one thread can be interrupted and return, but the other threads will continue to wait and cannot check CancelRequested. So, for implementing your suggestion, we need some hack to make all threads return from wait_on_socket_set when the event occurs, but I don't have an idea how to do this in a simpler way. In my patch, all threads can return from wait_on_socket_set at Ctrl+C because when thread #0 cancels all connections, the following error is sent to all sessions: ERROR: canceling statement due to user request and all threads will receive the response from the backend. Hmmm. I understand that the underlying issue you are raising is that other threads may be stuck while waiting on socket events and that with your approach they will be cleared somehow by thread #0. I'll say that (1) this point does not address potential race condition issues with several threads interacting directly with the same client; (2) thread 0 may also be stuck waiting for events, so the cancel is only taken into account when it is woken up. If we accept that each thread cancels its clients when it is woken up, which may imply some (usually small) delay, then it is not so different from the current version, because the code must wait for thread 0 to wake up anyway, and it solves (1). The current version does not address potential thread interactions. -- Fabien.
Re: pgbench: allow to exit immediately when any client is aborted
Hello Yugo-san, There are cases where "goto done" is used where some error like "invalid socket: ..." happens. I would like to make pgbench exit in such cases, too, so I chose to handle errors below the "done:" label. However, we can change it to call "exit" instead of "goto done" at each place. Do you think this is better? Good point. Now I understand the "!= FINISHED", because indeed in these cases "done" is reached with unfinished but not necessarily ABORTED clients, and the comment was somehow misleading. On reflection, there should be only one exit() call, thus I'd say to keep the "goto done" as you did, but to move the checking loop *before* the disconnect_all, and the overall section comment could be something like "possibly abort if any client is not finished, meaning some error occurred", which is consistent with the "!= FINISHED" condition. -- Fabien.
Re: pgbench: allow to exit immediately when any client is aborted
Hello Yugo-san, I attached v2 patch including the documentation and some comments in the code. I've looked at this patch. I'm unclear whether it does what it says: "exit immediately on abort"; I would expect a cold call to "exit" (with a clear message obviously) when the abort occurs. Currently it skips to "done", which starts by closing this particular thread's client connections, and only then calls "exit", so it is not "immediate". I do not think that this cleanup is necessary, because anyway all other threads will be brutally killed by the exit called by the aborted thread, so why bother disconnecting only some connections?

/* If any client is aborted, exit immediately. */
if (state[i].state != CSTATE_FINISHED)

For this comment, I would prefer "if ( ... == CSTATE_ABORTED)" rather than implying that not finished means aborted, and if you follow my previous remark then this code can be removed. Typo: "going to exist" -> "going to exit". Note that it is not "the whole thread" but "the program" which is exiting. There is no test. -- Fabien.
Re: pgbnech: allow to cancel queries during benchmark
Yugo-san, Some feedback about v1 of this patch. Patch applies cleanly, compiles. There are no tests, could there be one? ISTM that one could be done with a "SELECT pg_sleep(...)" script?? The global name "all_state" is quite ambiguous, what about "client_states" instead? Or maybe it could be avoided, see below. Instead of renaming "state" to "all_state" (or client_states as suggested above), I'd suggest to minimize the patch by keeping "state" local to main and adding a "client_states = state;" just after the allocation, or another approach, see below. Should PQfreeCancel be called on disconnections, in finishCon? I think that there may be a memory leak with the current implementation?? Maybe it should check that cancel is not NULL before calling PQcancel? After looking at the code, I'm very unclear whether there may be some underlying race conditions, or not, depending on when the cancel is triggered. I think that some race conditions are still likely given the current thread 0 implementation, and dealing with them with a barrier or whatever is not desirable at all. In order to work around this issue, ISTM that we should go away from the simple and straightforward thread 0 approach, and the only clean way is that the cancelation should be managed by each thread for its own clients. I'd suggest to have advanceState call PQcancel when CancelRequested is set and switch to CSTATE_ABORTED to end properly. This means that there would be no need for the global client states pointer, so the patch should be smaller and simpler. Possibly there would be some shortcuts added here and there to avoid lingering after the control-C, in threadRun. -- Fabien.
Re: pgbnech: allow to cancel queries during benchmark
Hello Yugo-san, In thread #0, setup_cancel_handler is called before the loop, and the CancelRequested flag is set when Ctrl+C is issued. In the loop, cancel requests are sent when the flag is set, only in thread #0. SIGINT is blocked in the other threads, but those threads will exit after their queries are cancelled. If thread safety is disabled or the OS is Windows, the signal is not blocked because pthread_sigmask cannot be used. (I didn't test the patch on Windows yet, though.) I chose the design where the signal handler and the query cancel are performed only in thread #0 because I wanted to make the behavior as predictable as possible. However, another design, in which all threads can receive SIGINT and the first thread that catches the signal is responsible for sending cancel requests for all connections, may also work. Also, the array of CState that contains all client states is changed to a global variable so that all connections can be accessed within a thread. As a side note, I actually considered another design that would create a special thread for handling query cancelling, in which SIGINT would be caught by sigwait, but I dropped the idea because it would add complexity due to code for Windows and --disable-thread-safety. I agree that the simpler the better. I would appreciate it if you could kindly review and test the patch! I'll try to have a look at it. -- Fabien.
Re: pgbench: option delaying queries till connections establishment?
Hello Dave, I am running pgbench with the following: pgbench -h localhost -c 100 -j 100 -t 2 -S -s 1000 pgbench -U pgbench --protocol=simple Without pgbouncer I get around 5k TPS, with pgbouncer I get around 15k TPS. Looking at the code, connection initiation time should not be part of the calculation, so I'm puzzled why pgbouncer is making such a dramatic difference? Turns out that for this specific test, pg is faster with a pooler. This does not tell "why". Does the pooler prepare statements, whereas "simple" does not? -- Fabien.
Re: [PATCH] pgbench: add multiconnect option
Hello Jelte, This patch seems to have quite some use case overlap with my patch which adds load balancing to libpq itself: https://www.postgresql.org/message-id/flat/pr3pr83mb04768e2ff04818eeb2179949f7...@pr3pr83mb0476.eurprd83.prod.outlook.com Thanks for the pointer. The end purpose of the patch is to allow pgbench to follow a failover at some point, at the client level, AFAICR. My patch is only able to add "random" load balancing though, not "round-robin". So this patch still definitely seems useful, even when mine gets merged. Yep. I'm not sure the end purpose is the same, but possibly the pgbench patch could take advantage of the libpq extension. I'm not sure that the support for the "working" connection is necessary from a feature perspective, though (usability/discoverability is another question). It's already possible to achieve the same behaviour by simply providing multiple host names in the connection string. You can even tell libpq to connect to a primary or secondary by using the target_session_attrs option. -- Fabien.
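For reference, the existing libpq behaviour mentioned here is driven purely by the connection string; host names and ports may be comma-separated lists, and target_session_attrs filters the candidates. A hypothetical example (host names are made up):

```
# try pg1 first, then pg2; only accept a connection that allows writes
host=pg1.example.com,pg2.example.com port=5432 dbname=bench target_session_attrs=read-write
```

libpq tries the hosts in order and keeps the first one that satisfies the requested session attributes, which already gives a basic failover behaviour without any pgbench change.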
Re: pgbench - adding pl/pgsql versions of tests
Hello, The attached patch adds pl/pgsql versions of the "tpcb-like" and "simple-update" internal test scripts. Why not, it makes sense because it is relevant to some usage patterns. Why not have the select version as well? If we are going to follow this road, we could also consider "combined" queries with \; as well?

$ pgbench -b list
Available builtin scripts:
  tpcb-like:
  plpgsql-tpcb-like:
  simple-update:
  plpgsql-simple-update:
  select-only:

which one can run using the -b / --builtin= option. ISTM that -b had a fast selection so that only a prefix was enough to select a script (-b se = -b select-only). Maybe such a convenient shortcut should be preserved; it seems that the long name will be needed for the pl versions. And a flag --no-functions which lets you not create the functions at init. Hmmm. Not so sure. There are also character flags to -I / --init: -Y to drop the functions and -y to create the functions. Creating is the default behaviour, but it can be disabled via the long flag --no-functions. Ok. I selected Yy as they were unused and can be thought of as an "inverted lambda symbol" :) :-) If there are no strong objections, I'll add it to the commitfest as well. Please do that. -- Fabien Coelho.
Re: [PATCH] random_normal function
Bonjour Michaël, Overall, I think that there should be a clearer discussion and plan about which random functionS postgres should provide to complement the standard instead of going there… randomly:-) So, what does the specification tell about seeds, normal and random functions? A bunch of DBMSs implement RAND, sometimes RANDOM, SEED or even NORMAL, using from time to time specific SQL keywords to do the work. I do not have the SQL standard, so I have no idea about what is in there. From a typical use case point of view, I'd say uniform, normal and exponential would make sense for floats. I'm also okay with generating uniform bytes pseudo-randomly. I'd be more at ease adding simple functions rather than a special heavy-on-keywords syntax, even if standard. Note that SQLValueFunction made the addition of more returning data types a bit more complicated (not much, still) than the new COERCE_SQL_SYNTAX by going through a mapping function, so the keyword/function mapping is straight-forward. I'm unclear about why this paragraph is here. -- Fabien.
Re: [PATCH] random_normal function
Hello Paul, My 0.02€ about the patch: Patch did not apply with "git apply", I had to "patch -p1 <", and there was a bunch of warnings. Patches compile and make check is okay. The first patch adds empty lines at the end of "sql/random.sql"; I think that it should not.

# About random_normal:

I'm fine with providing a random_normal implementation at the prng and SQL levels. There is already such an implementation in "pgbench.c", which outputs integers; I'd suggest that it should also use the new prng function, as there should not be two Box-Muller transformations in pg.

# About pg_prng_double_normal:

In the comment, I'd write "mean + stddev * val" instead of starting with the stddev part. Usually there is an empty line between the variable declarations and the first statement. There should be a comment about why it needs u1 larger than some epsilon. This constraint seems to generate a small bias? I'd suggest to add the UNLIKELY() compiler hint on the loop.

# About random_string:

Should the doc about random_string tell that the output bytes are expected to be uniformly distributed? Does it return "random values" or "pseudo-random values"? I do not understand why the "drandom_string" function is in "float.c", as it is not really related to floats. Also, it does not return a string but a bytea, so why call it "_string" in the first place? I do not think that it should use pg_strong_random, which may be very costly on some platforms. Also, pg_strong_random does not use the seed, so I do not understand why it needs to be checked. I'd suggest that the output should really be uniform pseudo-random, possibly based on the drandom() state, or maybe not.

Overall, I think that there should be a clearer discussion and plan about which random functionS postgres should provide to complement the standard instead of going there… randomly:-) -- Fabien.
Re: Order getopt arguments
Hello Peter, I had noticed that most getopt() or getopt_long() calls had their letter lists in pretty crazy orders. There might have been occasional attempts at grouping, but those then haven't been maintained as new options were added. To restore some sanity to this, I went through and ordered them alphabetically. I agree that a more or less random historical order does not make much sense. For pgbench, ISTM that sorting per functionality then alphabetically would be better than purely alphabetical, because it has 2 modes. Such sections might be (1) general, (2) connection, (3) common/shared, (4) initialization and (5) benchmarking, with some comments on each. What do you think? If okay, I'll send you a patch for that. -- Fabien.
Re: psql - factor out echo code
Now some of the output generated by test_extdepend gets a bit confusing:

+-- QUERY:
+
+
+-- QUERY:

That's not entirely this patch's fault. Still that's not really intuitive to see the output of a query that's just a blank spot.. Hmmm. What about adding an explicit \echo before these empty outputs to mitigate the possible induced confusion? \echo is not possible. Attached an attempt to improve the situation by replacing empty lines with comments in this test. -- Fabien.

diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 7672ed9e9d..16873aff62 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -5284,18 +5284,9 @@ echo_hidden_command(const char *query)
 {
 	if (pset.echo_hidden != PSQL_ECHO_HIDDEN_OFF)
 	{
-		printf(_("* QUERY **\n"
-				 "%s\n"
-				 "**\n\n"), query);
-		fflush(stdout);
+		echoQuery(stdout, true, query);
 
 		if (pset.logfile)
-		{
-			fprintf(pset.logfile,
-					_("* QUERY **\n"
-					  "%s\n"
-					  "**\n\n"), query);
-			fflush(pset.logfile);
-		}
+			echoQuery(pset.logfile, true, query);
 
 		if (pset.echo_hidden == PSQL_ECHO_HIDDEN_NOEXEC)
 			return false;
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index b989d792aa..30fd5bcda1 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -543,6 +543,18 @@ PrintTiming(double elapsed_msec)
 			   elapsed_msec, days, (int) hours, (int) minutes, seconds);
 }
 
+/*
+ * Echo user query
+ */
+void
+echoQuery(FILE *out, bool is_internal, const char *query)
+{
+	if (is_internal)
+		fprintf(out, _("-- INTERNAL QUERY:\n%s\n\n"), query);
+	else
+		fprintf(out, _("-- QUERY:\n%s\n\n"), query);
+	fflush(out);
+}
 
 /*
  * PSQLexec
@@ -569,18 +581,9 @@ PSQLexec(const char *query)
 
 	if (pset.echo_hidden != PSQL_ECHO_HIDDEN_OFF)
 	{
-		printf(_("* QUERY **\n"
-				 "%s\n"
-				 "**\n\n"), query);
-		fflush(stdout);
+		echoQuery(stdout, true, query);
 
 		if (pset.logfile)
-		{
-			fprintf(pset.logfile,
-					_("* QUERY **\n"
-					  "%s\n"
-					  "**\n\n"), query);
-			fflush(pset.logfile);
-		}
+			echoQuery(pset.logfile, true, query);
 
 		if (pset.echo_hidden == PSQL_ECHO_HIDDEN_NOEXEC)
 			return NULL;
@@ -796,10 +799,7 @@ ExecQueryTuples(const PGresult *result)
 	 * assumes that MainLoop did that, so we have to do it here.
 	 */
 	if (pset.echo == PSQL_ECHO_ALL && !pset.singlestep)
-	{
-		puts(query);
-		fflush(stdout);
-	}
+		echoQuery(stdout, false, query);
 
 	if (!SendQuery(query))
 	{
@@ -1058,13 +1058,7 @@ SendQuery(const char *query)
 	}
 
 	if (pset.logfile)
-	{
-		fprintf(pset.logfile,
-				_("* QUERY **\n"
-				  "%s\n"
-				  "**\n\n"), query);
-		fflush(pset.logfile);
-	}
+		echoQuery(pset.logfile, false, query);
 
 	SetCancelConn(pset.db);
 
diff --git a/src/bin/psql/common.h b/src/bin/psql/common.h
index f0820dd7d5..359c48143e 100644
--- a/src/bin/psql/common.h
+++ b/src/bin/psql/common.h
@@ -32,6 +32,7 @@ extern void psql_setup_cancel_handler(void);
 extern PGresult *PSQLexec(const char *query);
 extern int	PSQLexecWatch(const char *query, const printQueryOpt *opt, FILE *printQueryFout);
+extern void echoQuery(FILE *out, bool is_internal, const char *query);
 
 extern bool SendQuery(const char *query);
 extern bool is_superuser(void);
diff --git a/src/test/modules/test_extensions/expected/test_extdepend.out b/src/test/modules/test_extensions/expected/test_extdepend.out
index 0b62015d18..a08ef8d19d 100644
--- a/src/test/modules/test_extensions/expected/test_extdepend.out
+++ b/src/test/modules/test_extensions/expected/test_extdepend.out
@@ -11,17 +11,17 @@ SELECT * FROM test_extdep_commands;
  CREATE EXTENSION test_ext5 SCHEMA test_ext
  SET search_path TO test_ext
  CREATE TABLE a (a1 int)
- 
+ -- function b
  CREATE FUNCTION b() RETURNS TRIGGER LANGUAGE plpgsql AS          +
  $$ BEGIN NEW.a1 := NEW.a1 + 42; RETURN NEW; END; $$
  ALTER FUNCTION b() DEPENDS ON EXTENSION test_ext5
- 
+ -- trigger c
  CREATE TRIGGER c BEFORE INSERT ON a FOR EACH ROW EXECUTE PROCEDURE b()
  ALTER TRIGGER c ON a DEPENDS ON EXTENSION test_ext5
- 
+ -- materialized view d
  CREATE MATERIALIZED VIEW d AS SELECT * FROM a
  ALTER MATERIALIZED VIEW d DEPENDS ON EXTENSION test_ext5
- 
+ -- index e
  CREATE INDEX e ON a (a1)
 ALTER INDEX e DEPENDS ON EXTENSION test_ext5
 RESET search_path
@@ -29,24 +29,58 @@ SELECT * FROM test_extdep_commands;
 -- First, test that dependent objects go away when the extension is dropped.
 SELECT * FROM test_extdep_commands \gexec
+-- QUERY:
 CREATE SCHEMA test_ext
+
+-- QUERY:
 CREATE EXTENSION test_ext5 SCHEMA test_ext
+
+-- QUERY:
 SET search_path TO test_ext
+
+-- QUERY:
 CREATE TABLE a (a1 int)
+-- QUERY:
+ -- function b
+
+-- QUERY:
 CREATE FUNCTION b() RETURNS TRIGGER LANGUAGE plpgsql AS $$
Re: psql - factor out echo code
Hmm. The refactoring is worth it, as is the differentiation between QUERY and INTERNAL QUERY, since the same pattern is repeated 5 times. Now some of the output generated by test_extdepend gets a bit confusing: +-- QUERY: + + +-- QUERY: That's not entirely this patch's fault. Still, it's not really intuitive to see the output of a query that's just a blank spot.. Hmmm. What about adding an explicit \echo before these empty outputs to mitigate the possible induced confusion? -- Fabien.
Re: How to *really* quit psql?
Hello David, Question: is there any way to really abort a psql script from an included file? Under what circumstances would it be appropriate for a script to take it on itself to decide that? It has no way of knowing what the next -f option is or what the user intended. Can we add an exit code argument to the \quit meta-command that could be set to non-zero and, combined with ON_ERROR_STOP, produces the desired effect of aborting everything just like an error under ON_ERROR_STOP does (which is the workaround here I suppose, but an ugly one that involves the server). I like the simple idea of adding an optional exit status argument to \quit. I'm unsure whether "ON_ERROR_STOP" should or should not change the behavior, or whether it should just exit(n) with \quit n. Note that using \quit to abort a psql script is already used when loading extensions, to prevent them from being run directly by psql: -- from some sql files in "contrib/pg_stat_statements/": \echo Use "ALTER EXTENSION pg_stat_statements UPDATE TO '1.10'" to load this file. \quit But the same trick would fail if the guard is reached via an include. -- Fabien.
Re: How to *really* quit psql?
Hello David, vagrant@vagrant:~$ /usr/local/pgsql/bin/psql -v ON_ERROR_STOP=1 -f two.psql -f three.psql postgres ?column? -- 2 (1 row) ?column? -- 3 (1 row) (there is a \quit at the end of two.psql) Yep, that summarizes my issues! ON_ERROR_STOP applies only to SQL errors, so a script can only really stop by triggering an intentional SQL error. -- Fabien. quit.sql Description: application/sql error.sql Description: application/sql include.sql Description: application/sql exit.sql Description: application/sql
Re: How to *really* quit psql?
Hello Tom, - when the current script is included from something, you quit the current script and proceed after the \i of next -f, BAD Question: is there any way to really abort a psql script from an included file? Under what circumstances would it be appropriate for a script to take it on itself to decide that? The use case is psql scripts which update or clean up an application schema. For security, some of these scripts check for conditions (eg, we are not in production, the application schema is at the expected version, whatever…) and should abort if the conditions are not okay. As checking for the conditions requires a few lines of code and is always the same, a simple approach is to include another script which does the check and aborts the run if necessary, eg: ```sql -- this script should not run in "prod"! \ir not_in_prod.sql -- should have aborted if it is a "prod" version. DROP TABLE AllMyUsers CASCADE; DROP TABLE QuiteImportantData CASCADE; ``` It has no way of knowing what the next -f option is or what the user intended. The intention of the user who wrote the script is to abort in some cases, to avoid damaging the database contents. -- Fabien.
How to *really* quit psql?
Hello devs, I want to abort a psql script. How can I do that? The answer seems to be \quit, but it is not so simple: - when the current script is from a terminal, you exit psql, OK - when the current script is from a file (-f, <), you exit psql, OK - when the current script is included from something else, you quit only the current script, and execution proceeds after the \i, or with the next -f, BAD Question: is there any way to really abort a psql script from an included file? I've found "\! kill $PPID" which works with bash, but I'm not sure of its portability, and I was hoping for something straightforward and cleaner. If there is really no simple way, would it be okay to add a \exit which does that? -- Fabien.
Re: [PATCH] pgbench: add multiconnect option
Hello Ian, cfbot reports the patch no longer applies. As CommitFest 2022-11 is currently underway, this would be an excellent time to update the patch. Attached a v5 which is just a rebase. -- Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index 40e6a50a7f..a3ae7cc9be 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -29,7 +29,7 @@ PostgreSQL documentation pgbench option - dbname + dbname or conninfo @@ -169,6 +169,9 @@ pgbench options d not specified, the environment variable PGDATABASE is used. If that is not set, the user name specified for the connection is used. +Alternatively, the dbname can be +a standard connection information string. +Several connections can be provided. @@ -918,6 +921,21 @@ pgbench options d + + --connection-policy=policy + + +Set the connection policy when multiple connections are available. +Default is round-robin provided (ro). +Possible values are: +first (f), +random (ra), +round-robin (ro), +working (w). 
+ + + + -h hostname --host=hostname diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c index b208d74767..02f8278b34 100644 --- a/src/bin/pgbench/pgbench.c +++ b/src/bin/pgbench/pgbench.c @@ -301,13 +301,39 @@ uint32 max_tries = 1; bool failures_detailed = false; /* whether to group failures in * reports or logs by basic types */ +char *logfile_prefix = NULL; + +/* main connection definition */ const char *pghost = NULL; const char *pgport = NULL; const char *username = NULL; -const char *dbName = NULL; -char *logfile_prefix = NULL; const char *progname; +/* multi connections */ +typedef enum mc_policy_t +{ + MC_UNKNOWN = 0, + MC_FIRST, + MC_RANDOM, + MC_ROUND_ROBIN, + MC_WORKING +} mc_policy_t; + +/* connection info list */ +typedef struct connection_t +{ + const char *connection; /* conninfo or dbname */ + int errors; /* number of connection errors */ +} connection_t; + +static intn_connections = 0; +static connection_t *connections = NULL; +static mc_policy_t mc_policy = MC_ROUND_ROBIN; + +/* last used connection */ +// FIXME per thread? +static int current_connection = 0; + #define WSEP '@'/* weight separator */ volatile bool timer_exceeded = false; /* flag from signal handler */ @@ -871,7 +897,7 @@ usage(void) { printf("%s is a benchmarking tool for PostgreSQL.\n\n" "Usage:\n" - " %s [OPTION]... [DBNAME]\n" + " %s [OPTION]... 
[DBNAME or CONNINFO ...]\n" "\nInitialization options:\n" " -i, --initialize invokes initialization mode\n" " -I, --init-steps=[" ALL_INIT_STEPS "]+ (default \"" DEFAULT_INIT_STEPS "\")\n" @@ -929,6 +955,7 @@ usage(void) " -h, --host=HOSTNAME database server host or socket directory\n" " -p, --port=PORT database server port number\n" " -U, --username=USERNAME connect as specified database user\n" + " --connection-policy=Sset multiple connection policy (\"first\", \"rand\", \"round-robin\", \"working\")\n" " -V, --versionoutput version information, then exit\n" " -?, --help show this help, then exit\n" "\n" @@ -1535,13 +1562,89 @@ tryExecuteStatement(PGconn *con, const char *sql) PQclear(res); } +/* store a new connection information string */ +static void +push_connection(const char *c) +{ + connections = pg_realloc(connections, sizeof(connection_t) * (n_connections+1)); + connections[n_connections].connection = pg_strdup(c); + connections[n_connections].errors = 0; + n_connections++; +} + +/* switch connection */ +static int +next_connection(int *pci) +{ + int ci; + + ci = ((*pci) + 1) % n_connections; + *pci = ci; + + return ci; +} + +/* return the connection index to use for next attempt */ +static int +choose_connection(int *pci) +{ + int ci; + + switch (mc_policy) + { + case MC_FIRST: + ci = 0; + break; + case MC_RANDOM: + // FIXME should use a prng state ; not thread safe ; + ci = (int) getrand(_random_sequence, 0, n_connections-1); + *pci = ci; + break; + case MC_ROUND_ROBIN: + ci = next_connection(pci); + break; + case MC_WORKING: + ci = *pci; + break; + default: + pg_fatal("unexpected multi connection policy: %d", mc_policy); + exit(1); + } + + return ci; +} + +/* return multi-connection policy based on its name or shortest prefix */ +static mc_policy_t +get_connection_policy(const char *s) +{ + if (s == NULL || *s == '\0' || strcmp(s, "first") == 0 || strcmp(s, "f") == 0) + return MC_FIRST; + else if (strcmp(s, "random") == 0 || strcmp(s, "ra") == 0) + 
return MC_RANDOM; + else if (strcmp(s, "round-robin") == 0 || strcmp(s, "ro") == 0) + return MC_ROUND_ROBIN; + else if (strcmp(s, "working") == 0 ||
Re: [patch] \g with multiple result sets and \watch with copy queries
Bonjour Daniel, Good catch! Thanks for the quick fix! As usual, what is not tested does not work :-( Attached is a TAP test to check the expected behavior of \g with multiple commands. -- Fabien.diff --git a/src/bin/psql/t/001_basic.pl b/src/bin/psql/t/001_basic.pl index f447845717..c81feadd4e 100644 --- a/src/bin/psql/t/001_basic.pl +++ b/src/bin/psql/t/001_basic.pl @@ -325,4 +325,22 @@ is($row_count, '10', 'client-side error commits transaction, no ON_ERROR_STOP and multiple -c switches' ); +# test \g +psql_like($node, "SELECT 'un' \\g g_file_1.out", qr//, "one command \\g"); +my $c1 = slurp_file("g_file_1.out"); +like($c1, qr/un/); +unlink "g_file_1.out" or die $!; + +psql_like($node, "SELECT 'deux' \\; SELECT 'trois' \\g g_file_2.out", qr//, "two commands \\g"); +my $c2 = slurp_file("g_file_2.out"); +like($c2, qr/deux.*trois/s); +unlink "g_file_2.out" or die $!; + +psql_like($node, "\\set SHOW_ALL_RESULTS 0\nSELECT 'quatre' \\; SELECT 'cinq' \\g g_file_3.out", qr//, + "two commands \\g with only last result"); +my $c3 = slurp_file("g_file_3.out"); +like($c3, qr/cinq/); +unlike($c3, qr/quatre/); +unlink "g_file_3.out" or die $!; + done_testing();
Re: [PATCH] Introduce array_shuffle() and array_sample()
With our current PRNG infrastructure it doesn't cost much to have a separate PRNG for every purpose. I don't object to having array_shuffle() and array_sample() share one PRNG, but I don't think it should go much further than that. Thanks for your thoughts, Tom. I have a couple of questions. Should we introduce a new seed function for the new PRNG state, used by array_shuffle() and array_sample()? What would be a good name? Or should those functions use pg_global_prng_state? Is it safe to assume that pg_global_prng_state is seeded? I'd suggest using the existing global state. The global state should be seeded at process start, AFAICR. -- Fabien.
Re: [PATCH] Introduce array_shuffle() and array_sample()
Hello, Thank you for your feedback. I attached a patch that addresses most of your points. I'll look into it. It would help if the patch file name could include a version number at the end. Should the exchange be skipped when i == k? The additional branch is actually slower (on my machine, tested with a 10M-element int array) for 1D-arrays, which I assume is the most common case. Even with a 2D-array with a subarray size of 1M there is no difference in execution time. i == k should be relatively rare for large arrays, given a good prng. Ok, slower is bad. The random and array shuffling functions share a common state. I'm wondering whether it should or should not be so. It seems ok, but then ISTM that the documentation suggests implicitly that setseed applies to random() only, which is not the case anymore after the patch. I really like the idea of a prng state owned by the user, that is used by all user-facing random functions, but see that there might be concerns about this change. I would update the setseed() documentation, if this proposal is accepted. Do you think my patch should already include the documentation update? Dunno. I'm still wondering what it should do. I'm pretty sure that the documentation should be clear about a shared seed, if any. I do not think that departing from the standard is a good thing, either. If more samples are required than the number of elements, it does not error out. I'm wondering whether it should. The second argument to array_sample() is treated like a LIMIT, rather than the actual number of elements. I updated the documentation. My use-case is: take max random items from an array of unknown size. Hmmm. ISTM that the use case of wanting "this many" items would make more sense because it is stricter, so less likely to result in unforeseen consequences. On principle I do not like not knowing the output size. If someone wants a limit, they can easily "LEAST(#1 dim size, other limit)" to get it; it is easy enough with a strict function. 
Also, the sampling should not return its result in order when the number of elements required is the full array; ISTM that it should be shuffled there as well. You are the second person asking for this. It's done. I am thinking about ditching array_sample() and replacing it with a second signature for array_shuffle(): array_shuffle(array anyarray) -> anyarray array_shuffle(array anyarray, limit int) -> anyarray What do you think? I think that shuffle means shuffle, not partial shuffle, so a different name seems better to me. -- Fabien.
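For illustration, the shuffle/sample semantics debated above can be sketched as a plain Fisher–Yates pass, swapping unconditionally even when i == k (skipping the swap adds a branch that measured slower, as reported). This is a standalone sketch using libc rand() for brevity, not PostgreSQL's pg_prng, and the names are made up:

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Fisher-Yates shuffle over len elements.  The swap is done even when
 * i == k: as discussed above, skipping it adds a branch without
 * measurably helping, since i == k is rare for large arrays.
 */
static void
shuffle(int *a, int len, unsigned int seed)
{
	srand(seed);
	for (int i = len - 1; i > 0; i--)
	{
		int		k = rand() % (i + 1);	/* modulo bias ignored in this sketch */
		int		tmp = a[i];

		a[i] = a[k];
		a[k] = tmp;
	}
}

/*
 * "Sample n elements": shuffle, then keep the first n.  With this
 * LIMIT-like definition a full-size (or oversized) sample degenerates
 * into a full shuffle; a strict version would error out when n > len.
 */
static int
sample(int *a, int len, int n, unsigned int seed)
{
	shuffle(a, len, seed);
	return n < len ? n : len;
}
```

The clamp in sample() mirrors the LIMIT-like behavior the patch chose; a strict interpretation of the second argument, as argued for above, would raise an error instead.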
Re: [PATCH] Introduce array_shuffle() and array_sample()
I came to the same conclusions and went with Option 1 (see patch), mainly because most code in utils/adt is organized by type, and this way it is clear that this is a thin wrapper around pg_prng. Small patch update. I realized the new functions should live in array_userfuncs.c (rather than arrayfuncs.c), fixed some file headers and added some comments to the code. My 0.02€ about this patch: Use (void) when declaring no parameters in headers or in functions. Should the exchange be skipped when i == k? I do not see the point of having *only* inline functions in a C file. The external functions should not be inlined? The random and array shuffling functions share a common state. I'm wondering whether it should or should not be so. It seems ok, but then ISTM that the documentation suggests implicitly that setseed applies to random() only, which is not the case anymore after the patch. If more samples are required than the number of elements, it does not error out. I'm wondering whether it should. Also, the sampling should not return its result in order when the number of elements required is the full array; ISTM that it should be shuffled there as well. I must say that without significant knowledge of the array internal implementation, the swap code looks pretty strange. ISTM that the code would be clearer if pointer and array-reference styles were not intermixed. Maybe you could add a test with a 3D array? Some samples with NULLs? Unrelated: I notice again that (postgre)SQL does not provide a way to generate random integers. I do not see why not. Should we provide one? -- Fabien.
Re: Future Postgres 15 and Clang 15
Hello Thomas, llvmjit.c:1233:81: error: too few arguments to function call, expected 3, have 2 ref_gen = LLVMOrcCreateCustomCAPIDefinitionGenerator(llvm_resolve_symbols, NULL); Ah yes, I hadn't seen that one yet. That function grew a "Dispose" argument, which we can just pass NULL for, at a guess: https://github.com/llvm/llvm-project/commit/14b7c108a2bf46541efc3a5c9cbd589b3afc18e6 I agree with the guess. Whether the NULL induced semantics is the right one is much less obvious… The question is: should pg 15 try to be clang 15 ready? I'm afraid yes, as LLVM does 2 releases per year, so clang 15 should come out this Fall, together with pg 15. Possibly other changes will come before the releases:-/ OK let's try to get a patch ready first and then see what we can do. I'm more worried about code that compiles OK but then crashes or gives wrong query results (like the one for 9b4e4cfe) than I am about code that doesn't compile at all (meaning no one can actually ship it!). I think the way our two projects' release cycles work, there will occasionally be short periods where we can't use their very latest release, but we can try to avoid that... Yep. The part which would worry me is the code complexity and kludges induced by trying to support a moving API. Maybe careful header-handled macros can do the trick (eg for an added parameter as above), but I'm afraid it cannot always be that simple. -- Fabien.
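The "careful header-handled macros" idea mentioned above amounts to compile-time dispatch on the library's version macro, so call sites keep one spelling while the underlying signature gains a parameter. A sketch, with every name invented for the demo (this is not the real LLVM C API):

```c
#include <assert.h>
#include <stddef.h>

/* Pretend version macro; real code would read it from the library headers. */
#define FAKE_LLVM_VERSION_MAJOR 15

#if FAKE_LLVM_VERSION_MAJOR >= 15
/* Newer API: the constructor grew a third "dispose" argument. */
static int
fake_create_generator3(void *resolve_cb, void *ctx, void (*dispose) (void *))
{
	(void) resolve_cb;
	(void) ctx;
	(void) dispose;
	return 3;					/* parameter count, just for the demo */
}

/* Callers pass NULL for the new argument, as guessed for llvmjit.c. */
#define pg_create_generator(cb, ctx) fake_create_generator3((cb), (ctx), NULL)
#else
/* Older API: only two arguments. */
static int
fake_create_generator2(void *resolve_cb, void *ctx)
{
	(void) resolve_cb;
	(void) ctx;
	return 2;
}

#define pg_create_generator(cb, ctx) fake_create_generator2((cb), (ctx))
#endif
```

Call sites then write pg_create_generator(cb, ctx) unchanged across versions; the worry voiced above is precisely that not every API change reduces to such a mechanical shim.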
Future Postgres 15 and Clang 15
Just a note/reminder that "seawasp" has been unhappy for some days now because of yet another change in the unstable API provided by LLVM: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=seawasp&dt=2022-06-23%2023%3A18%3A17 llvmjit.c:1115:50: error: use of undeclared identifier 'LLVMJITCSymbolMapPair' LLVMOrcCSymbolMapPairs symbols = palloc0(sizeof(LLVMJITCSymbolMapPair) * LookupSetSize); llvmjit.c:1233:81: error: too few arguments to function call, expected 3, have 2 ref_gen = LLVMOrcCreateCustomCAPIDefinitionGenerator(llvm_resolve_symbols, NULL); The question is: should pg 15 try to be clang 15 ready? I'm afraid yes, as LLVM does 2 releases per year, so clang 15 should come out this Fall, together with pg 15. Possibly other changes will come before the releases:-/ -- Fabien.
Re: better page-level checksums
Hello Robert, I think for this purpose we should limit ourselves to algorithms whose output size is, at minimum, 64 bits, and ideally, a multiple of 64 bits. I'm sure there are plenty of options other than the ones that btrfs uses; I mentioned them only as a way of jump-starting the discussion. Note that SHA-256 and BLAKE2b apparently emit enormously wide 32-byte checksums. That's a lot of space to consume with a checksum, but your chances of a collision are very small indeed. My 0.02€ about that: you do not have to store the whole hash algorithm output; you can truncate or reduce it (eg by XOR-ing parts together) to whatever size makes sense for your application and security requirements. ISTM that 64 bits is more than enough for a page checksum, whatever the underlying hash algorithm. Also, ISTM that a checksum algorithm does not really need to be cryptographically strong, which means that cheaper alternatives are ok, although good quality should be sought nevertheless. -- Fabien.
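The truncate-or-reduce idea above can be sketched as XOR-folding a wide digest down to 64 bits. This is a generic illustration of the technique, not code from any patch:

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/*
 * Reduce an arbitrary-width digest (eg the 32 bytes of SHA-256) to a
 * 64-bit checksum by XOR-ing it together in 8-byte lanes.  Every bit
 * of the input still influences the output, and 64 bits is plenty
 * for detecting page corruption.
 */
static uint64_t
fold_digest64(const unsigned char *digest, size_t len)
{
	uint64_t	acc = 0;

	for (size_t i = 0; i < len; i += 8)
	{
		uint64_t	lane = 0;
		size_t		n = (len - i < 8) ? len - i : 8;

		memcpy(&lane, digest + i, n);	/* a short tail lane is zero-padded */
		acc ^= lane;
	}
	return acc;
}
```

Note that XOR-folding preserves corruption-detection quality but not cryptographic strength; as said above, that is fine for a checksum, which need not resist deliberate forgery.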
Re: pgcon unconference / impact of block size on performance
Hello Tomas, At one of the pgcon unconference sessions a couple days ago, I presented a bunch of benchmark results comparing performance with different data/WAL block sizes. Most of the OLTP results showed significant gains (up to 50%) with smaller (4k) data pages. You wrote something about SSD a long time ago, but the link is now dead: http://www.fuzzy.cz/en/articles/ssd-benchmark-results-read-write-pgbench/ See also: http://www.cybertec.at/postgresql-block-sizes-getting-started/ http://blog.coelho.net/database/2014/08/08/postgresql-page-size-for-SSD.html [...] The other important factor is the native SSD page, which is similar to sectors on HDD. SSDs however don't allow in-place updates, and have to reset/rewrite the whole native page. It's actually more complicated, because the reset happens at a much larger scale (~8MB block), so it does matter how quickly we "dirty" the data. The consequence is that using data pages smaller than the native page (depends on the device, but 4K seems to be the common value) either does not help or actually hurts the write performance. All the SSD results show this behavior - the Optane and Samsung nicely show that 4K is much better (in random write IOPS) than 8K, but 1-2K pages make it worse. Yep. ISTM that you should also consider the underlying FS block size. Ext4 uses 4 KiB by default, so if you write 2 KiB it will write 4 KiB anyway. There is not much doubt that with SSDs we should reduce the default page size. There are some negative impacts (eg more space lost to headers, and fewer tuples fitting per page), but I guess there should be an overall benefit. It would help a lot if it were possible to initdb with a different block size, without recompiling. -- Fabien.
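The filesystem round-up mentioned above is easy to quantify: a logical write is padded to a whole number of FS blocks. A trivial arithmetic sketch (generic illustration, not PostgreSQL code):

```c
#include <assert.h>
#include <stddef.h>

/*
 * Bytes the filesystem actually writes for a logical write of
 * "logical" bytes: writes are rounded up to whole FS blocks, so a
 * 2 KiB database page on default ext4 (4 KiB blocks) still costs
 * 4 KiB of I/O, which is why sub-4K pages do not help on such setups.
 */
static size_t
physical_write_bytes(size_t logical, size_t fs_block)
{
	return ((logical + fs_block - 1) / fs_block) * fs_block;
}
```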
Re: psql now shows zero elapsed time after an error
Probably it would be appropriate to add a test case. I'll propose something later. committed with a test Thanks! -- Fabien.
Re: psql now shows zero elapsed time after an error
Hello, Thanks for the catch and the proposed fix! Indeed, on errors the timing is not updated appropriately. ISTM that the best course is to update the elapsed time whenever a result is obtained, so that a sensible value is always available. See attached patch which is a variant of Richard's version. fabien=# SELECT 1 as one \; SELECT 1/0 \; SELECT 2 as two; ┌─┐ │ one │ ├─┤ │ 1 │ └─┘ (1 row) ERROR: division by zero Time: 0,352 ms Probably it would be appropriate to add a test case. I'll propose something later. -- Fabien.diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c index feb1d547d4..773673cc62 100644 --- a/src/bin/psql/common.c +++ b/src/bin/psql/common.c @@ -1560,6 +1560,16 @@ ExecQueryAndProcessResults(const char *query, double *elapsed_msec, bool *svpt_g else result = PQgetResult(pset.db); + /* + * Get current timing measure in case an error occurs + */ + if (timing) + { +INSTR_TIME_SET_CURRENT(after); +INSTR_TIME_SUBTRACT(after, before); +*elapsed_msec = INSTR_TIME_GET_MILLISEC(after); + } + continue; } else if (svpt_gone_p && !*svpt_gone_p) @@ -1612,7 +1622,7 @@ ExecQueryAndProcessResults(const char *query, double *elapsed_msec, bool *svpt_g last = (next_result == NULL); /* - * Get timing measure before printing the last result. + * Update current timing measure. * * It will include the display of previous results, if any. * This cannot be helped because the server goes on processing @@ -1623,7 +1633,7 @@ ExecQueryAndProcessResults(const char *query, double *elapsed_msec, bool *svpt_g * With combined queries, timing must be understood as an upper bound * of the time spent processing them. */ - if (last && timing) + if (timing) { INSTR_TIME_SET_CURRENT(after); INSTR_TIME_SUBTRACT(after, before);
Re: [HACKERS] WIP aPatch: Pgbench Serialization and deadlock errors
I find it a little annoying that there is no change in the tests; it means that the format is not checked at all :-( Yeah. Perhaps it's a little bit hard to perform this kind of test in the TAP tests? Not really. I'll look into it. -- Fabien.
Re: [HACKERS] WIP aPatch: Pgbench Serialization and deadlock errors
Ok. Fine with me. Possibly at some point there was the idea that there could be other failures counted, but there are none. Also, there have been questions about the failures-detailed option, or whether the reports should always be detailed, and the result may be some kind of not-so-convincing compromise. Attached is the patch to remove the column. Patch applies cleanly. Compilation ok. Global and local "make check" ok. Doc build ok. I find it a little annoying that there is no change in the tests; it means that the format is not checked at all :-( -- Fabien.
Re: random() function documentation
How about we just say "uses a linear-feedback shift register algorithm"? I think it'd be sufficient to just say that it's a deterministic pseudorandom number generator. I don't see much value in documenting the internal algorithm used. Hmmm… I'm not so sure. ISTM that people interested in using the user-facing random variants (only random?) would like a pointer to the algorithm, to check the expected quality of the produced pseudo-random stream? See attached. Should we perhaps also add a warning that the same seed is not guaranteed to produce the same sequence across different (major?) versions? I wouldn't bother, on the grounds that then we'd need such disclaimers in a whole lot of places. Others might see it differently though. Agreed. Agreed, though I think when the release notes are written, they ought to warn that the sequence will change with this release. Yes. -- Fabien.diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index 0a5c402640..7492454592 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -1832,10 +1832,11 @@ repeat('Pg', 4) PgPgPgPg - The random() function uses a simple linear - congruential algorithm. It is fast but not suitable for cryptographic - applications; see the module for a more - secure alternative. + The random() function uses + https://en.wikipedia.org/wiki/Xoroshiro128%2B;>xoroshiro128**, a + linear feedback shift register algorithm. + It is fast but not suitable for cryptographic applications; + see the module for a more secure alternative. If setseed() is called, the series of results of subsequent random() calls in the current session can be repeated by re-issuing setseed() with the same
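For reference, the generator named in the patch, xoroshiro128** as published by Blackman and Vigna, has a very small step function. This sketch uses the standard published constants and is not copied from the PostgreSQL sources:

```c
#include <assert.h>
#include <stdint.h>

static inline uint64_t
rotl64(uint64_t x, int k)
{
	return (x << k) | (x >> (64 - k));
}

/*
 * One step of xoroshiro128**: the "**" scrambler multiplies, rotates
 * and multiplies again, on top of the xor/rotate/shift state update
 * that gives the xorshift family its "linear feedback" character.
 * The state s[0]/s[1] must not be all-zero.
 */
static uint64_t
xoroshiro128ss_next(uint64_t s[2])
{
	uint64_t	s0 = s[0];
	uint64_t	s1 = s[1];
	uint64_t	result = rotl64(s0 * 5, 7) * 9;

	s1 ^= s0;
	s[0] = rotl64(s0, 24) ^ s1 ^ (s1 << 16);
	s[1] = rotl64(s1, 37);
	return result;
}
```

Re-seeding with the same pair of 64-bit values replays exactly the same stream, which is why the setseed() wording matters: any function drawing from a shared state participates in that replay.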
Re: [HACKERS] WIP aPatch: Pgbench Serialization and deadlock errors
Hello Tom, The buildfarm is still complaining about the synopsis being too wide for PDF format. I think what we ought to do is give up on using a synopsis for log lines at all, and instead convert the documentation into a tabular list of fields. Proposal attached, which also fixes a couple of outright errors. Looks ok. HTML doc generation ok. While looking at the HTML output, the "pgbench" command line just below wraps strangely: pgbench --aggregate-interval=10 --time=20 --client=10 --log --rate=1000 --latency-limit=10 --failures-detailed --max-tries=10 test ISTM that there should be no newline in the pgbench … section, although maybe it would trigger a complaint in the PDF format. One thing that this doesn't fix is that the existing text appears to suggest that the "failures" column is something different from the sum of the serialization_failures and deadlock_failures columns, which it's obvious from the code is not so. If this isn't a code bug then I think we ought to just drop that column entirely, because it's redundant. Ok. Fine with me. Possibly at some point there was the idea that there could be other failures counted, but there are none. Also, there have been questions about the failures-detailed option, or whether the reports should always be detailed, and the result may be some kind of not-so-convincing compromise. (BTW, now that I've read this stuff I am quite horrified by how the non-aggregated log format has been mangled for error retries, and will probably be submitting a proposal to change that. But that's a different issue.) Indeed, any improvement is welcome! -- Fabien.
Re: [HACKERS] WIP aPatch: Pgbench Serialization and deadlock errors
Hello Tatsuo-san, interval_start num_transactions sum_latency sum_latency_2 min_latency max_latency sum_lag sum_lag_2 min_lag max_lag skipped failures serialization_failures deadlock_failures retried retries I would suggest reordering the last chunk to: ... retried retries failures serfail dlfail because I intend to add connection-failure handling at some point, and it would make more sense to add the corresponding counts at the end with the other failures. -- Fabien Coelho - CRI, MINES ParisTech
Re: [HACKERS] WIP aPatch: Pgbench Serialization and deadlock errors
Or those three columns always, sum_lag sum_lag_2, min_lag max_lag, skipped, retried retries? What about this? (a log line is not actually folded) interval_start num_transactions sum_latency sum_latency_2 min_latency max_latency failures serialization_failures deadlock_failures retried retries [ sum_lag sum_lag_2 min_lag max_lag [ skipped ] ] My 0.02€: I agree that it would be better to have a more deterministic aggregated log format. ISTM that it should skip failures and lags if no fancy options have been selected, i.e.: [ fails ... retries [ sum_lag ... [ skipped ] ] ? Alternatively, as the failure stuff is added to the format, maybe it could be at the end: [ sum_lag ... [ skipped [ fails ... retries ] ] ] ? failures: always 0 (if --max-tries is 1, the default) sum of serialization_failures and deadlock_failures (if --max-tries is not 1) serialization_failures and deadlock_failures: always 0 (if --max-tries is 1, the default) 0 or more (if --max-tries is not 1) retried and retries: always 0 (if --max-tries is 1, the default) 0 or more (if --max-tries is not 1) Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp -- Fabien.
Re: [PATCH] pgbench: add multiconnect option
According to the cfbot this patch needs a rebase Indeed. v4 attached. -- Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index ebdb4b3f46..d96d2d291d 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -29,7 +29,7 @@ PostgreSQL documentation pgbench option - dbname + dbname or conninfo @@ -169,6 +169,9 @@ pgbench options d not specified, the environment variable PGDATABASE is used. If that is not set, the user name specified for the connection is used. +Alternatively, the dbname can be +a standard connection information string. +Several connections can be provided. @@ -918,6 +921,21 @@ pgbench options d + + --connection-policy=policy + + +Set the connection policy when multiple connections are available. +Default is round-robin provided (ro). +Possible values are: +first (f), +random (ra), +round-robin (ro), +working (w). + + + + -h hostname --host=hostname diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c index acf3e56413..d99d40fbb9 100644 --- a/src/bin/pgbench/pgbench.c +++ b/src/bin/pgbench/pgbench.c @@ -305,13 +305,39 @@ uint32 max_tries = 1; bool failures_detailed = false; /* whether to group failures in reports * or logs by basic types */ +char *logfile_prefix = NULL; + +/* main connection definition */ const char *pghost = NULL; const char *pgport = NULL; const char *username = NULL; -const char *dbName = NULL; -char *logfile_prefix = NULL; const char *progname; +/* multi connections */ +typedef enum mc_policy_t +{ + MC_UNKNOWN = 0, + MC_FIRST, + MC_RANDOM, + MC_ROUND_ROBIN, + MC_WORKING +} mc_policy_t; + +/* connection info list */ +typedef struct connection_t +{ + const char *connection; /* conninfo or dbname */ + int errors; /* number of connection errors */ +} connection_t; + +static intn_connections = 0; +static connection_t *connections = NULL; +static mc_policy_t mc_policy = MC_ROUND_ROBIN; + +/* last used connection */ +// FIXME per thread? 
+static int current_connection = 0; + #define WSEP '@'/* weight separator */ volatile bool timer_exceeded = false; /* flag from signal handler */ @@ -873,7 +899,7 @@ usage(void) { printf("%s is a benchmarking tool for PostgreSQL.\n\n" "Usage:\n" - " %s [OPTION]... [DBNAME]\n" + " %s [OPTION]... [DBNAME or CONNINFO ...]\n" "\nInitialization options:\n" " -i, --initialize invokes initialization mode\n" " -I, --init-steps=[" ALL_INIT_STEPS "]+ (default \"" DEFAULT_INIT_STEPS "\")\n" @@ -931,6 +957,7 @@ usage(void) " -h, --host=HOSTNAME database server host or socket directory\n" " -p, --port=PORT database server port number\n" " -U, --username=USERNAME connect as specified database user\n" + " --connection-policy=Sset multiple connection policy (\"first\", \"rand\", \"round-robin\", \"working\")\n" " -V, --versionoutput version information, then exit\n" " -?, --help show this help, then exit\n" "\n" @@ -1538,13 +1565,89 @@ tryExecuteStatement(PGconn *con, const char *sql) PQclear(res); } +/* store a new connection information string */ +static void +push_connection(const char *c) +{ + connections = pg_realloc(connections, sizeof(connection_t) * (n_connections+1)); + connections[n_connections].connection = pg_strdup(c); + connections[n_connections].errors = 0; + n_connections++; +} + +/* switch connection */ +static int +next_connection(int *pci) +{ + int ci; + + ci = ((*pci) + 1) % n_connections; + *pci = ci; + + return ci; +} + +/* return the connection index to use for next attempt */ +static int +choose_connection(int *pci) +{ + int ci; + + switch (mc_policy) + { + case MC_FIRST: + ci = 0; + break; + case MC_RANDOM: + // FIXME should use a prng state ; not thread safe ; + ci = (int) getrand(_random_sequence, 0, n_connections-1); + *pci = ci; + break; + case MC_ROUND_ROBIN: + ci = next_connection(pci); + break; + case MC_WORKING: + ci = *pci; + break; + default: + pg_log_fatal("unexpected multi connection policy: %d", mc_policy); + exit(1); + } + + return ci; +} + 
+/* return multi-connection policy based on its name or shortest prefix */ +static mc_policy_t +get_connection_policy(const char *s) +{ + if (s == NULL || *s == '\0' || strcmp(s, "first") == 0 || strcmp(s, "f") == 0) + return MC_FIRST; + else if (strcmp(s, "random") == 0 || strcmp(s, "ra") == 0) + return MC_RANDOM; + else if (strcmp(s, "round-robin") == 0 || strcmp(s, "ro") == 0) + return MC_ROUND_ROBIN; + else if (strcmp(s, "working") == 0 || strcmp(s, "w") == 0) + return MC_WORKING; + else + return MC_UNKNOWN; +} + +/* get backend connection info */
Re: psql - add SHOW_ALL_RESULTS option
Again, after the SendQuery refactoring extraction. I'm doing this locally, so don't feel obliged to send more of these. ;-) Good for me :-) -- Fabien.
Re: psql - add SHOW_ALL_RESULTS option
Attached a rebase. Again, after the SendQuery refactoring extraction. -- Fabien.diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out b/contrib/pg_stat_statements/expected/pg_stat_statements.out index e0abe34bb6..8f7f93172a 100644 --- a/contrib/pg_stat_statements/expected/pg_stat_statements.out +++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out @@ -50,8 +50,28 @@ BEGIN \; SELECT 2.0 AS "float" \; SELECT 'world' AS "text" \; COMMIT; + float +--- + 2.0 +(1 row) + + text +--- + world +(1 row) + -- compound with empty statements and spurious leading spacing \;\; SELECT 3 + 3 \;\;\; SELECT ' ' || ' !' \;\; SELECT 1 + 4 \;; + ?column? +-- +6 +(1 row) + + ?column? +-- + ! +(1 row) + ?column? -- 5 @@ -61,6 +81,11 @@ COMMIT; SELECT 1 + 1 + 1 AS "add" \gset SELECT :add + 1 + 1 AS "add" \; SELECT :add + 1 + 1 AS "add" \gset + add +- + 5 +(1 row) + -- set operator SELECT 1 AS i UNION SELECT 2 ORDER BY i; i diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index caabb06c53..f01adb1fd2 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -127,18 +127,11 @@ echo '\x \\ SELECT * FROM foo;' | psql commands included in the string to divide it into multiple transactions. (See for more details about how the server handles multi-query strings.) - Also, psql only prints the - result of the last SQL command in the string. - This is different from the behavior when the same string is read from - a file or fed to psql's standard input, - because then psql sends - each SQL command separately. - Because of this behavior, putting more than one SQL command in a - single -c string often has unexpected results. 
- It's better to use repeated -c commands or feed - multiple commands to psql's standard input, + If having several commands executed in one transaction is not desired, + use repeated -c commands or feed multiple commands to + psql's standard input, either using echo as illustrated above, or via a shell here-document, for example: @@ -3570,10 +3563,6 @@ select 1\; select 2\; select 3; commands included in the string to divide it into multiple transactions. (See for more details about how the server handles multi-query strings.) -psql prints only the last query result -it receives for each request; in this example, although all -three SELECTs are indeed executed, psql -only prints the 3. @@ -4160,6 +4149,18 @@ bar +SHOW_ALL_RESULTS + + +When this variable is set to off, only the last +result of a combined query (\;) is shown instead of +all of them. The default is on. The off behavior +is for compatibility with older versions of psql. + + + + + SHOW_CONTEXT diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c index c9847c8f9a..c84688e89c 100644 --- a/src/bin/psql/common.c +++ b/src/bin/psql/common.c @@ -32,9 +32,10 @@ static bool DescribeQuery(const char *query, double *elapsed_msec); static bool ExecQueryUsingCursor(const char *query, double *elapsed_msec); -static bool ExecQueryAndProcessResult(const char *query, double *elapsed_msec, bool *svpt_gone_p); static bool command_no_begin(const char *query); static bool is_select_command(const char *query); +static int SendQueryAndProcessResults(const char *query, double *pelapsed_msec, + bool is_watch, const printQueryOpt *opt, FILE *printQueryFout, bool *tx_ended); /* @@ -355,7 +356,7 @@ CheckConnection(void) * Returns true for valid result, false for error state. 
*/ static bool -AcceptResult(const PGresult *result) +AcceptResult(const PGresult *result, bool show_error) { bool OK; @@ -386,7 +387,7 @@ AcceptResult(const PGresult *result) break; } - if (!OK) + if (!OK && show_error) { const char *error = PQerrorMessage(pset.db); @@ -474,6 +475,18 @@ ClearOrSaveResult(PGresult *result) } } +/* + * Consume all results + */ +static void +ClearOrSaveAllResults() +{ + PGresult *result; + + while ((result = PQgetResult(pset.db)) != NULL) + ClearOrSaveResult(result); +} + /* * Print microtiming output. Always print raw milliseconds; if the interval @@ -574,7 +587,7 @@ PSQLexec(const char *query) ResetCancelConn(); - if (!AcceptResult(res)) + if (!AcceptResult(res, true)) { ClearOrSaveResult(res); res = NULL; @@ -597,11 +610,8 @@ int PSQLexecWatch(const char *query, const printQueryOpt *opt, FILE *printQueryFout) { bool timing = pset.timing; - PGresult *res; double elapsed_msec = 0; - instr_time before; - instr_time after; - FILE *fout; + int res; if (!pset.db) { @@
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
Or those three columns always: sum_lag, sum_lag_2, min_lag, max_lag, skipped, retried, retries? Anyway, now that the current CF is closing, it will not be possible to change the logging design soon. Or can we change the logging design even after the CF is closed? My 0.02€: I'm not sure how the official guidelines are to be interpreted in that case, but if the design is to be changed, ISTM that it is better to do it before a release rather than letting the release out with one format and changing it in the next release. -- Fabien.
Re: psql - add SHOW_ALL_RESULTS option
Attached a rebase. -- Fabien.diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out b/contrib/pg_stat_statements/expected/pg_stat_statements.out index e0abe34bb6..8f7f93172a 100644 --- a/contrib/pg_stat_statements/expected/pg_stat_statements.out +++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out @@ -50,8 +50,28 @@ BEGIN \; SELECT 2.0 AS "float" \; SELECT 'world' AS "text" \; COMMIT; + float +--- + 2.0 +(1 row) + + text +--- + world +(1 row) + -- compound with empty statements and spurious leading spacing \;\; SELECT 3 + 3 \;\;\; SELECT ' ' || ' !' \;\; SELECT 1 + 4 \;; + ?column? +-- +6 +(1 row) + + ?column? +-- + ! +(1 row) + ?column? -- 5 @@ -61,6 +81,11 @@ COMMIT; SELECT 1 + 1 + 1 AS "add" \gset SELECT :add + 1 + 1 AS "add" \; SELECT :add + 1 + 1 AS "add" \gset + add +- + 5 +(1 row) + -- set operator SELECT 1 AS i UNION SELECT 2 ORDER BY i; i diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index caabb06c53..f01adb1fd2 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -127,18 +127,11 @@ echo '\x \\ SELECT * FROM foo;' | psql commands included in the string to divide it into multiple transactions. (See for more details about how the server handles multi-query strings.) - Also, psql only prints the - result of the last SQL command in the string. - This is different from the behavior when the same string is read from - a file or fed to psql's standard input, - because then psql sends - each SQL command separately. - Because of this behavior, putting more than one SQL command in a - single -c string often has unexpected results. 
- It's better to use repeated -c commands or feed - multiple commands to psql's standard input, + If having several commands executed in one transaction is not desired, + use repeated -c commands or feed multiple commands to + psql's standard input, either using echo as illustrated above, or via a shell here-document, for example: @@ -3570,10 +3563,6 @@ select 1\; select 2\; select 3; commands included in the string to divide it into multiple transactions. (See for more details about how the server handles multi-query strings.) -psql prints only the last query result -it receives for each request; in this example, although all -three SELECTs are indeed executed, psql -only prints the 3. @@ -4160,6 +4149,18 @@ bar +SHOW_ALL_RESULTS + + +When this variable is set to off, only the last +result of a combined query (\;) is shown instead of +all of them. The default is on. The off behavior +is for compatibility with older versions of psql. + + + + + SHOW_CONTEXT diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c index d65b9a124f..c84688e89c 100644 --- a/src/bin/psql/common.c +++ b/src/bin/psql/common.c @@ -34,6 +34,8 @@ static bool DescribeQuery(const char *query, double *elapsed_msec); static bool ExecQueryUsingCursor(const char *query, double *elapsed_msec); static bool command_no_begin(const char *query); static bool is_select_command(const char *query); +static int SendQueryAndProcessResults(const char *query, double *pelapsed_msec, + bool is_watch, const printQueryOpt *opt, FILE *printQueryFout, bool *tx_ended); /* @@ -354,7 +356,7 @@ CheckConnection(void) * Returns true for valid result, false for error state. 
*/ static bool -AcceptResult(const PGresult *result) +AcceptResult(const PGresult *result, bool show_error) { bool OK; @@ -385,7 +387,7 @@ AcceptResult(const PGresult *result) break; } - if (!OK) + if (!OK && show_error) { const char *error = PQerrorMessage(pset.db); @@ -473,6 +475,18 @@ ClearOrSaveResult(PGresult *result) } } +/* + * Consume all results + */ +static void +ClearOrSaveAllResults() +{ + PGresult *result; + + while ((result = PQgetResult(pset.db)) != NULL) + ClearOrSaveResult(result); +} + /* * Print microtiming output. Always print raw milliseconds; if the interval @@ -573,7 +587,7 @@ PSQLexec(const char *query) ResetCancelConn(); - if (!AcceptResult(res)) + if (!AcceptResult(res, true)) { ClearOrSaveResult(res); res = NULL; @@ -596,11 +610,8 @@ int PSQLexecWatch(const char *query, const printQueryOpt *opt, FILE *printQueryFout) { bool timing = pset.timing; - PGresult *res; double elapsed_msec = 0; - instr_time before; - instr_time after; - FILE *fout; + int res; if (!pset.db) { @@ -609,77 +620,14 @@ PSQLexecWatch(const char *query, const printQueryOpt *opt, FILE *printQueryFout) } SetCancelConn(pset.db); - - if (timing) -
Re: psql - add SHOW_ALL_RESULTS option
Hello Peter, As Tom said earlier, wasn't this fixed by 618c16707? If not, is there any other discussion on the specifics of this issue? I'm not aware of one. This answer is that I had kept psql's calls to PQerrorMessage which reports errors from the connection, whereas it needed to change to PQresultErrorMessage to benefit from the libpq improvement. I made the added autocommit/on_error_rollback tests at the end really focus on multi-statement queries (\;), as was more or less intended. I updated the tap test. -- Fabien.diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out b/contrib/pg_stat_statements/expected/pg_stat_statements.out index e0abe34bb6..8f7f93172a 100644 --- a/contrib/pg_stat_statements/expected/pg_stat_statements.out +++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out @@ -50,8 +50,28 @@ BEGIN \; SELECT 2.0 AS "float" \; SELECT 'world' AS "text" \; COMMIT; + float +--- + 2.0 +(1 row) + + text +--- + world +(1 row) + -- compound with empty statements and spurious leading spacing \;\; SELECT 3 + 3 \;\;\; SELECT ' ' || ' !' \;\; SELECT 1 + 4 \;; + ?column? +-- +6 +(1 row) + + ?column? +-- + ! +(1 row) + ?column? -- 5 @@ -61,6 +81,11 @@ COMMIT; SELECT 1 + 1 + 1 AS "add" \gset SELECT :add + 1 + 1 AS "add" \; SELECT :add + 1 + 1 AS "add" \gset + add +- + 5 +(1 row) + -- set operator SELECT 1 AS i UNION SELECT 2 ORDER BY i; i diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index caabb06c53..f01adb1fd2 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -127,18 +127,11 @@ echo '\x \\ SELECT * FROM foo;' | psql commands included in the string to divide it into multiple transactions. (See for more details about how the server handles multi-query strings.) - Also, psql only prints the - result of the last SQL command in the string. 
- This is different from the behavior when the same string is read from - a file or fed to psql's standard input, - because then psql sends - each SQL command separately. - Because of this behavior, putting more than one SQL command in a - single -c string often has unexpected results. - It's better to use repeated -c commands or feed - multiple commands to psql's standard input, + If having several commands executed in one transaction is not desired, + use repeated -c commands or feed multiple commands to + psql's standard input, either using echo as illustrated above, or via a shell here-document, for example: @@ -3570,10 +3563,6 @@ select 1\; select 2\; select 3; commands included in the string to divide it into multiple transactions. (See for more details about how the server handles multi-query strings.) -psql prints only the last query result -it receives for each request; in this example, although all -three SELECTs are indeed executed, psql -only prints the 3. @@ -4160,6 +4149,18 @@ bar +SHOW_ALL_RESULTS + + +When this variable is set to off, only the last +result of a combined query (\;) is shown instead of +all of them. The default is on. The off behavior +is for compatibility with older versions of psql. + + + + + SHOW_CONTEXT diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c index d65b9a124f..c84688e89c 100644 --- a/src/bin/psql/common.c +++ b/src/bin/psql/common.c @@ -34,6 +34,8 @@ static bool DescribeQuery(const char *query, double *elapsed_msec); static bool ExecQueryUsingCursor(const char *query, double *elapsed_msec); static bool command_no_begin(const char *query); static bool is_select_command(const char *query); +static int SendQueryAndProcessResults(const char *query, double *pelapsed_msec, + bool is_watch, const printQueryOpt *opt, FILE *printQueryFout, bool *tx_ended); /* @@ -354,7 +356,7 @@ CheckConnection(void) * Returns true for valid result, false for error state. 
*/ static bool -AcceptResult(const PGresult *result) +AcceptResult(const PGresult *result, bool show_error) { bool OK; @@ -385,7 +387,7 @@ AcceptResult(const PGresult *result) break; } - if (!OK) + if (!OK && show_error) { const char *error = PQerrorMessage(pset.db); @@ -473,6 +475,18 @@ ClearOrSaveResult(PGresult *result) } } +/* + * Consume all results + */ +static void +ClearOrSaveAllResults() +{ + PGresult *result; + + while ((result = PQgetResult(pset.db)) != NULL) + ClearOrSaveResult(result); +} + /* * Print microtiming output. Always print raw milliseconds; if the interval @@ -573,7 +587,7 @@ PSQLexec(const char *query) ResetCancelConn(); - if (!AcceptResult(res)) + if
Re: psql - add SHOW_ALL_RESULTS option
As Tom said earlier, wasn't this fixed by 618c16707? If not, is there any other discussion on the specifics of this issue? I'm not aware of one. Hmmm… I'll try to understand why the doubled message seems to be still there. -- Fabien.
Re: [PATCH] pgbench: add multiconnect option
Pgbench is a simple benchmark tool by design, and I wonder if adding a multiconnect feature will cause pgbench to be used incorrectly. Maybe, but I do not see how it would be worse than what pgbench already allows. I agree that pgbench is simple; perhaps really too simple when it comes to measuring much more than basic query flows. What pgbench does have in its favor is being distributed with the core distribution. I think there is definitely space for a more complicated benchmarking tool that exercises more realistic query patterns and scenarios. Whether that should be distributed with the core is another question. As far as this feature is concerned, the source code impact of the patch is very small, so I do not think it is worth barring this feature on that ground. -- Fabien.
Re: psql - add SHOW_ALL_RESULTS option
Hello Peter, Attached v17 is another try. The point is to record the current status, whatever it is, buggy or not, and to update the test when libpq fixes things, whenever this is done. [...] The expected output (which passes) contains this line twice: psql::2: FATAL: terminating connection due to administrator command psql::2: FATAL: terminating connection due to administrator command If I paste this test case into current master without your patch, I only get this line once. So your patch is changing this output. The whole point of the libpq fixes was to not have this duplicate output. So I think something is still wrong somewhere. Hmmm. Yes and no:-) The previous path inside libpq silently ignores intermediate results: it skips all results to keep only the last one. The new approach does not discard results silently, hence the duplicated output, because they are actually there and have always been there in the first place, they were just ignored: the previous "good" result is really a side effect of a bad implementation in a corner case, which just becomes apparent when opening the list of results. So my opinion is still to dissociate the libpq "bug/behavior" fix from this feature, as they are only loosely connected, because it is a very corner case anyway. An alternative would be to remove the test case, but I'd prefer that it be kept. If you want to wait for libpq to provide a solution for this corner case, I'm afraid that "never" is the likely result, especially as no test case exercises this path to show that there is a problem somewhere, so nobody should care to fix it. I'm not sure it is even worth it given the highly special situation which triggers the issue, which is not much of an actual problem (ok, the user is told twice that there was a connection loss, no big deal). -- Fabien.
Re: [PATCH] pgbench: add multiconnect option
Hi Sami, Pgbench is a simple benchmark tool by design, and I wonder if adding a multiconnect feature will cause pgbench to be used incorrectly. Maybe, but I do not see how it would be worse than what pgbench already allows. A real-world use case will be helpful for this thread. Basically, more versatile testing for non-single-host setups. For instance, it would allow directly testing a multi-master setup, such as Bucardo, SymmetricDS or CockroachDB. It would be a first step on the path to interesting features such as: - testing a failover setup, where on connection error a client could connect to another host; - testing a primary/standby setup, with write transactions sent to the primary and read transactions sent to the standbys. Basically I have no doubt that it can be useful. For the current patch, should the report also cover per-database statistics (tps/latency/etc.)? That could be a "per-connection" option. If there is a reasonable use case, I think that it would be an easy enough feature to implement. Attached a rebased version. -- Fabien.
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index be1896fa99..69bd5b76f1 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -29,7 +29,7 @@ PostgreSQL documentation pgbench option - dbname + dbname or conninfo @@ -160,6 +160,9 @@ pgbench options d not specified, the environment variable PGDATABASE is used. If that is not set, the user name specified for the connection is used. +Alternatively, the dbname can be +a standard connection information string. +Several connections can be provided. @@ -843,6 +846,21 @@ pgbench options d + + --connection-policy=policy + + +Set the connection policy when multiple connections are available. +Default is round-robin provided (ro). +Possible values are: +first (f), +random (ra), +round-robin (ro), +working (w). 
+ + + + -h hostname --host=hostname diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c index 000ffc4a5c..5006e21766 100644 --- a/src/bin/pgbench/pgbench.c +++ b/src/bin/pgbench/pgbench.c @@ -278,13 +278,39 @@ bool is_connect; /* establish connection for each transaction */ bool report_per_command; /* report per-command latencies */ int main_pid; /* main process id used in log filename */ +char *logfile_prefix = NULL; + +/* main connection definition */ const char *pghost = NULL; const char *pgport = NULL; const char *username = NULL; -const char *dbName = NULL; -char *logfile_prefix = NULL; const char *progname; +/* multi connections */ +typedef enum mc_policy_t +{ + MC_UNKNOWN = 0, + MC_FIRST, + MC_RANDOM, + MC_ROUND_ROBIN, + MC_WORKING +} mc_policy_t; + +/* connection info list */ +typedef struct connection_t +{ + const char *connection; /* conninfo or dbname */ + int errors; /* number of connection errors */ +} connection_t; + +static intn_connections = 0; +static connection_t *connections = NULL; +static mc_policy_t mc_policy = MC_ROUND_ROBIN; + +/* last used connection */ +// FIXME per thread? +static int current_connection = 0; + #define WSEP '@'/* weight separator */ volatile bool timer_exceeded = false; /* flag from signal handler */ @@ -694,7 +720,7 @@ usage(void) { printf("%s is a benchmarking tool for PostgreSQL.\n\n" "Usage:\n" - " %s [OPTION]... [DBNAME]\n" + " %s [OPTION]... 
[DBNAME or CONNINFO ...]\n" "\nInitialization options:\n" " -i, --initialize invokes initialization mode\n" " -I, --init-steps=[" ALL_INIT_STEPS "]+ (default \"" DEFAULT_INIT_STEPS "\")\n" @@ -749,6 +775,7 @@ usage(void) " -h, --host=HOSTNAME database server host or socket directory\n" " -p, --port=PORT database server port number\n" " -U, --username=USERNAME connect as specified database user\n" + " --connection-policy=Sset multiple connection policy (\"first\", \"rand\", \"round-robin\", \"working\")\n" " -V, --versionoutput version information, then exit\n" " -?, --help show this help, then exit\n" "\n" @@ -1323,13 +1350,89 @@ tryExecuteStatement(PGconn *con, const char *sql) PQclear(res); } +/* store a new connection information string */ +static void +push_connection(const char *c) +{ + connections = pg_realloc(connections, sizeof(connection_t) * (n_connections+1)); + connections[n_connections].connection = pg_strdup(c); + connections[n_connections].errors = 0; + n_connections++; +} + +/* switch connection */ +static int +next_connection(int *pci) +{ + int ci; + + ci = ((*pci) + 1) % n_connections; + *pci = ci; + + return ci; +} + +/* return the connection index to use for next attempt */ +static int
Re: psql - add SHOW_ALL_RESULTS option
Hello Peter, See attached v16 which removes the libpq workaround. I suppose this depends on https://www.postgresql.org/message-id/flat/ab4288f8-be5c-57fb-2400-e3e857f53e46%40enterprisedb.com getting committed, because right now this makes the psql TAP tests fail because of the duplicate error message. How should we handle that? Ok, it seems I got the patch wrong. Attached v17 is another try. The point is to record the current status, whatever it is, buggy or not, and to update the test when libpq fixes things, whenever this is done. In this part of the patch, there seems to be part of a sentence missing: Indeed! The missing part was put back in v17. -- Fabien.diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out b/contrib/pg_stat_statements/expected/pg_stat_statements.out index e0abe34bb6..8f7f93172a 100644 --- a/contrib/pg_stat_statements/expected/pg_stat_statements.out +++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out @@ -50,8 +50,28 @@ BEGIN \; SELECT 2.0 AS "float" \; SELECT 'world' AS "text" \; COMMIT; + float +--- + 2.0 +(1 row) + + text +--- + world +(1 row) + -- compound with empty statements and spurious leading spacing \;\; SELECT 3 + 3 \;\;\; SELECT ' ' || ' !' \;\; SELECT 1 + 4 \;; + ?column? +-- +6 +(1 row) + + ?column? +-- + ! +(1 row) + ?column? -- 5 @@ -61,6 +81,11 @@ COMMIT; SELECT 1 + 1 + 1 AS "add" \gset SELECT :add + 1 + 1 AS "add" \; SELECT :add + 1 + 1 AS "add" \gset + add +- + 5 +(1 row) + -- set operator SELECT 1 AS i UNION SELECT 2 ORDER BY i; i diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index caabb06c53..f01adb1fd2 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -127,18 +127,11 @@ echo '\x \\ SELECT * FROM foo;' | psql commands included in the string to divide it into multiple transactions. (See for more details about how the server handles multi-query strings.) - Also, psql only prints the - result of the last SQL command in the string. 
- This is different from the behavior when the same string is read from - a file or fed to psql's standard input, - because then psql sends - each SQL command separately. - Because of this behavior, putting more than one SQL command in a - single -c string often has unexpected results. - It's better to use repeated -c commands or feed - multiple commands to psql's standard input, + If having several commands executed in one transaction is not desired, + use repeated -c commands or feed multiple commands to + psql's standard input, either using echo as illustrated above, or via a shell here-document, for example: @@ -3570,10 +3563,6 @@ select 1\; select 2\; select 3; commands included in the string to divide it into multiple transactions. (See for more details about how the server handles multi-query strings.) -psql prints only the last query result -it receives for each request; in this example, although all -three SELECTs are indeed executed, psql -only prints the 3. @@ -4160,6 +4149,18 @@ bar +SHOW_ALL_RESULTS + + +When this variable is set to off, only the last +result of a combined query (\;) is shown instead of +all of them. The default is on. The off behavior +is for compatibility with older versions of psql. + + + + + SHOW_CONTEXT diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c index d65b9a124f..2f3dd91602 100644 --- a/src/bin/psql/common.c +++ b/src/bin/psql/common.c @@ -34,6 +34,8 @@ static bool DescribeQuery(const char *query, double *elapsed_msec); static bool ExecQueryUsingCursor(const char *query, double *elapsed_msec); static bool command_no_begin(const char *query); static bool is_select_command(const char *query); +static int SendQueryAndProcessResults(const char *query, double *pelapsed_msec, + bool is_watch, const printQueryOpt *opt, FILE *printQueryFout, bool *tx_ended); /* @@ -354,7 +356,7 @@ CheckConnection(void) * Returns true for valid result, false for error state. 
*/ static bool -AcceptResult(const PGresult *result) +AcceptResult(const PGresult *result, bool show_error) { bool OK; @@ -385,7 +387,7 @@ AcceptResult(const PGresult *result) break; } - if (!OK) + if (!OK && show_error) { const char *error = PQerrorMessage(pset.db); @@ -473,6 +475,18 @@ ClearOrSaveResult(PGresult *result) } } +/* + * Consume all results + */ +static void +ClearOrSaveAllResults() +{ + PGresult *result; + + while ((result = PQgetResult(pset.db)) != NULL) + ClearOrSaveResult(result); +} + /* * Print microtiming output. Always
Re: [PATCH] pgbench: add multiconnect option
Hello Greg, It looks like David sent a patch and Fabien sent a followup patch. But there hasn't been a whole lot of discussion or further patches. It sounds like there are some basic questions about what the right interface should be. Are there specific questions that would be helpful for moving forward? Review the designs and patches and tell us what you think? Personally, I think that allowing multiple connections is a good thing, especially if the code impact is reduced, which is the case with the version I sent. Then for me the next step would be to have reconnection on errors, so as to implement a client-side failover policy that could help test the performance impact of a server failover. I have done that internally, but it requires the "Pgbench Serialization and deadlock errors" patch to land, as it would just be another error that can be handled. -- Fabien.
Re: psql - add SHOW_ALL_RESULTS option
Are you planning to send a rebased patch for this commit fest? Argh, I did it in a reply in another thread:-( Attached v15. To help move things forward, I'd suggest that we should not care too much about corner-case repetition of some error messages which are due to libpq internals, so I could remove the ugly buffer reset from the patch and have the repetition, and if/when the issue is fixed later in libpq then the repetition will be removed, fine! The issue is that we just expose the strange behavior of libpq, which is libpq's to solve, not psql's. See attached v16 which removes the libpq workaround. -- Fabien.
diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out b/contrib/pg_stat_statements/expected/pg_stat_statements.out index e0abe34bb6..8f7f93172a 100644 --- a/contrib/pg_stat_statements/expected/pg_stat_statements.out +++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out @@ -50,8 +50,28 @@ BEGIN \; SELECT 2.0 AS "float" \; SELECT 'world' AS "text" \; COMMIT; + float +--- + 2.0 +(1 row) + + text +--- + world +(1 row) + -- compound with empty statements and spurious leading spacing \;\; SELECT 3 + 3 \;\;\; SELECT ' ' || ' !' \;\; SELECT 1 + 4 \;; + ?column? +-- +6 +(1 row) + + ?column? +-- + ! +(1 row) + ?column? -- 5 @@ -61,6 +81,11 @@ COMMIT; SELECT 1 + 1 + 1 AS "add" \gset SELECT :add + 1 + 1 AS "add" \; SELECT :add + 1 + 1 AS "add" \gset + add +- + 5 +(1 row) + -- set operator SELECT 1 AS i UNION SELECT 2 ORDER BY i; i diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index caabb06c53..f01adb1fd2 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -127,18 +127,11 @@ echo '\x \\ SELECT * FROM foo;' | psql commands included in the string to divide it into multiple transactions. (See for more details about how the server handles multi-query strings.) - Also, psql only prints the - result of the last SQL command in the string. 
- This is different from the behavior when the same string is read from - a file or fed to psql's standard input, - because then psql sends - each SQL command separately. - Because of this behavior, putting more than one SQL command in a - single -c string often has unexpected results. - It's better to use repeated -c commands or feed - multiple commands to psql's standard input, + If having several commands executed in one transaction is not desired, + use repeated -c commands or feed multiple commands to + psql's standard input, either using echo as illustrated above, or via a shell here-document, for example: @@ -3570,10 +3563,6 @@ select 1\; select 2\; select 3; commands included in the string to divide it into multiple transactions. (See for more details about how the server handles multi-query strings.) -psql prints only the last query result -it receives for each request; in this example, although all -three SELECTs are indeed executed, psql -only prints the 3. @@ -4160,6 +4149,18 @@ bar +SHOW_ALL_RESULTS + + +When this variable is set to off, only the last +result of a combined query (\;) is shown instead of +all of them. The default is on. The off behavior +is for compatibility with older versions of psql. + + + + + SHOW_CONTEXT diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c index d65b9a124f..7903075975 100644 --- a/src/bin/psql/common.c +++ b/src/bin/psql/common.c @@ -34,6 +34,8 @@ static bool DescribeQuery(const char *query, double *elapsed_msec); static bool ExecQueryUsingCursor(const char *query, double *elapsed_msec); static bool command_no_begin(const char *query); static bool is_select_command(const char *query); +static int SendQueryAndProcessResults(const char *query, double *pelapsed_msec, + bool is_watch, const printQueryOpt *opt, FILE *printQueryFout, bool *tx_ended); /* @@ -354,7 +356,7 @@ CheckConnection(void) * Returns true for valid result, false for error state. 
*/ static bool -AcceptResult(const PGresult *result) +AcceptResult(const PGresult *result, bool show_error) { bool OK; @@ -385,7 +387,7 @@ AcceptResult(const PGresult *result) break; } - if (!OK) + if (!OK && show_error) { const char *error = PQerrorMessage(pset.db); @@ -473,6 +475,18 @@ ClearOrSaveResult(PGresult *result) } } +/* + * Consume all results + */ +static void +ClearOrSaveAllResults() +{ + PGresult *result; + + while ((result = PQgetResult(pset.db)) != NULL) + ClearOrSaveResult(result); +} + /* * Print microtiming output. Always print raw milliseconds; if the interval @@ -573,7
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
Hello Yugo-san, About Pgbench error handling v16: This patch set needs a minor rebase because of 506035b0. Otherwise, the patch compiles, global and local "make check" are ok, and doc generation is ok. This patch is in good shape, the code and comments are clear. Some minor remarks below, including typos and a few small suggestions. ## About v16-1 This refactoring patch adds a struct for managing pgbench variables, instead of mixing fields into the client state (CState) struct. Patch compiles, global and local "make check" are both ok. Although this patch is not necessary to add the feature, I'm fine with it as it improves pgbench source code readability. ## About v16-2 This last patch adds handling of serialization and deadlock errors to pgbench transactions. This feature is desirable because it enlarges the performance testing options, and makes pgbench behave more like a database client application. Possible future extensions enabled by this patch include handling disconnection errors by trying to reconnect, for instance. The documentation is clear and well written, at least for my non-native speaker eyes and ears. English: "he will be aborted" -> "it will be aborted". I'm fine with renaming --report-latencies to --report-per-command, as the latter is clearer about what the option does. I'm still not sure I like the "failure detailed" option; ISTM that the report could always be detailed. That would remove some complexity, and I do not think that people executing a bench with error handling would mind having the details. No big deal. printVerboseErrorMessages: I'd make the buffer static and initialized only once, so that there is no significant malloc/free cycle involved when calling the function. advanceConnectionState: I'd really prefer not to add new variables (res, status) in the loop scope, and only declare them when actually needed in the state branches, so as to avoid any unwanted interaction between states.
typo: "fullowing" -> "following" Pipeline cleaning: the advance function is already so long, I'd put that in a separate function and call it. I think that the report should not remove data when they are 0, otherwise it makes it harder to script around it (in failures_detailed on line 6284). The tests cover the different cases. I tried to suggest a simpler approach in a previous round, but it seems not so simple to do. They could be simplified later, if possible. -- Fabien.
Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)
Hello Justin, Review about v34, up from v32 which I reviewed in January. v34 is an updated version of v32, without the part about lstat at the end of the series. All 7 patches apply cleanly. # part 01 One liner doc improvement about the directory flag. OK. # part 02 Add tests for various options on pg_ls_dir, and for pg_stat_file, which were not exercised before. "make check" is ok. OK. # part 03 This patch adds a new pg_ls_dir_metadata. Internally, this is an extension of the pg_ls_dir_files function which is used by other pg_ls functions. Doc ok. New tests are added which check that the result columns are configured as required, including error cases. "make check" is ok. OK. # part 04 Add a new "isdir" column to "pg_ls_tmpdir" output. This is a small behavioral change. I'm ok with that, however I must say that I'm still unsure why we would not jump directly to a "type" char column. What is wrong with outputting 'd' or '-' instead of true or false? This way, the interface need not change if "lstat" is used later? ISTM that the interface issue should be somehow independent of the implementation issue, and we should choose directly the right/best interface? Independently, the documentation could be clearer about what happens to "isdir" when the file is a link? It may say that the behavior may change in the future? About homogeneity, I note that some people may be against adding "isdir" to other ls functions. I must say that I cannot see a good argument not to do it, and that I hate dealing with systems which are not homogeneous because it creates surprises and some loss of time. "make check" ok. OK. # part 05 Add isdir to other ls functions. Doc is updated. Same as above, I'd prefer a char instead of a bool, as it is more extendable and future-proof. OK. # part 06 This part extends and adds a test for pg_ls_logdir. "make check" is ok. OK. # part 07 This part extends pg_stat_file with more date information. "make check" is ok. OK. # doc Overall doc generation is OK. 
-- Fabien.
Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)
Hello Justin, I hope to look at it over the week-end. -- Fabien Coelho - CRI, MINES ParisTech
Re: cpluspluscheck complains about use of register
It seems we should just remove the use of register? I have a vague idea that it was once important to say "register" if you are going to use the variable in an asm snippet that requires it to be in a register. That might be wrong, or it might be obsolete even if once true. We could try taking these out and seeing if the buildfarm complains. We have several inline asm statements not using register despite using variables in a register (e.g. pg_atomic_compare_exchange_u32_impl()), so I wouldn't expect a problem with compilers we support. Should we make configure test for -Wregister? There's at least one additional use of register that we'd have to change (pg_regexec). From a compilation perspective, "register" tells the compiler that you cannot take a pointer to the variable, i.e. it generates an error if someone adds something like: void *p = &the_variable; Removing the "register" declaration means that such protection would be removed, and creating such a pointer could drastically reduce compiler optimization opportunities. -- Fabien.
Re: Commitfest 2022-03 Patch Triage Part 1b
Just FYI. Better to follow up to the thread for the patch that's already in the CF. Otherwise your patch will be missed by someone who looks at the CF entry to see the latest patch. Indeed. Done. -- Fabien.
Re: psql - add SHOW_ALL_RESULTS option
Are you planning to send a rebased patch for this commit fest? Argh, I did it in a reply in another thread:-( Attached v15. So as to help moves things forward, I'd suggest that we should not to care too much about corner case repetition of some error messages which are due to libpq internals, so I could remove the ugly buffer reset from the patch and have the repetition, and if/when the issue is fixed later in libpq then the repetition will be removed, fine! The issue is that we just expose the strange behavior of libpq, which is libpq to solve, not psql. What do you think? -- Fabien.diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out b/contrib/pg_stat_statements/expected/pg_stat_statements.out index e0abe34bb6..8f7f93172a 100644 --- a/contrib/pg_stat_statements/expected/pg_stat_statements.out +++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out @@ -50,8 +50,28 @@ BEGIN \; SELECT 2.0 AS "float" \; SELECT 'world' AS "text" \; COMMIT; + float +--- + 2.0 +(1 row) + + text +--- + world +(1 row) + -- compound with empty statements and spurious leading spacing \;\; SELECT 3 + 3 \;\;\; SELECT ' ' || ' !' \;\; SELECT 1 + 4 \;; + ?column? +-- +6 +(1 row) + + ?column? +-- + ! +(1 row) + ?column? -- 5 @@ -61,6 +81,11 @@ COMMIT; SELECT 1 + 1 + 1 AS "add" \gset SELECT :add + 1 + 1 AS "add" \; SELECT :add + 1 + 1 AS "add" \gset + add +- + 5 +(1 row) + -- set operator SELECT 1 AS i UNION SELECT 2 ORDER BY i; i diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index caabb06c53..f01adb1fd2 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -127,18 +127,11 @@ echo '\x \\ SELECT * FROM foo;' | psql commands included in the string to divide it into multiple transactions. (See for more details about how the server handles multi-query strings.) - Also, psql only prints the - result of the last SQL command in the string. 
- This is different from the behavior when the same string is read from - a file or fed to psql's standard input, - because then psql sends - each SQL command separately. - Because of this behavior, putting more than one SQL command in a - single -c string often has unexpected results. - It's better to use repeated -c commands or feed - multiple commands to psql's standard input, + If having several commands executed in one transaction is not desired, + use repeated -c commands or feed multiple commands to + psql's standard input, either using echo as illustrated above, or via a shell here-document, for example: @@ -3570,10 +3563,6 @@ select 1\; select 2\; select 3; commands included in the string to divide it into multiple transactions. (See for more details about how the server handles multi-query strings.) -psql prints only the last query result -it receives for each request; in this example, although all -three SELECTs are indeed executed, psql -only prints the 3. @@ -4160,6 +4149,18 @@ bar +SHOW_ALL_RESULTS + + +When this variable is set to off, only the last +result of a combined query (\;) is shown instead of +all of them. The default is on. The off behavior +is for compatibility with older versions of psql. + + + + + SHOW_CONTEXT diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c index d65b9a124f..3b2f6305b4 100644 --- a/src/bin/psql/common.c +++ b/src/bin/psql/common.c @@ -34,6 +34,8 @@ static bool DescribeQuery(const char *query, double *elapsed_msec); static bool ExecQueryUsingCursor(const char *query, double *elapsed_msec); static bool command_no_begin(const char *query); static bool is_select_command(const char *query); +static int SendQueryAndProcessResults(const char *query, double *pelapsed_msec, + bool is_watch, const printQueryOpt *opt, FILE *printQueryFout, bool *tx_ended); /* @@ -354,7 +356,7 @@ CheckConnection(void) * Returns true for valid result, false for error state. 
*/ static bool -AcceptResult(const PGresult *result) +AcceptResult(const PGresult *result, bool show_error) { bool OK; @@ -385,7 +387,7 @@ AcceptResult(const PGresult *result) break; } - if (!OK) + if (!OK && show_error) { const char *error = PQerrorMessage(pset.db); @@ -473,6 +475,18 @@ ClearOrSaveResult(PGresult *result) } } +/* + * Consume all results + */ +static void +ClearOrSaveAllResults() +{ + PGresult *result; + + while ((result = PQgetResult(pset.db)) != NULL) + ClearOrSaveResult(result); +} + /* * Print microtiming output. Always print raw milliseconds; if the interval @@ -573,7 +587,7 @@ PSQLexec(const char
Re: Commitfest 2022-03 Patch Triage Part 1b
Hello Greg, Peter posted an updated version of Fabien's patch about a month ago (which at this point no longer applies) Attached a v15 which is a rebase, after some minor changes in the source and some new test cases added (good!). fixing a few issues, but also pointing at old review comments still unaddressed. ISTM that all comments have been addressed. However, the latest patch raises issues about working around libpq corner-case behaviors which are really just that, corner cases. Since this was pushed, but had to be reverted, I assume there is a desire for the feature but it seems to need more work still. It looks like Peter and Fabien were debating the merits of a libpq change and probably that won't happen this release cycle. ISTM these are really very minor issues that could be resolved in this cycle. Is there a kernel of psql functionality that can be extracted from this without the libpq change in this release cycle or should it wait until we add the functionality to libpq? The patch can wait for the issues to be resolved one way or another before proceeding, *or* it can be applied, maybe with a small tweak, and the libpq issues be fixed separately. For a reminder, there are two actual "issues" (features or bugs) with libpq, which are made visible by the patch, but are pre-existing: (1) under some circumstances an infinite stream of empty results is returned, which has to be filtered out manually. (2) under some special circumstances some error messages may be output twice because of when libpq decides to reset the error buffer. (1) has been there for ever, and (2) is under investigation to possibly improve the situation, so as to remove a hack in the code to avoid it. The alternative which IMO would be ok is to admit that under some very special conditions the same error message may be output twice, and if it is resolved later on then fine. If it's the latter then perhaps we should move this to 16? I'm not that pessimistic! 
I may be proven wrong:-) -- Fabien.diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out b/contrib/pg_stat_statements/expected/pg_stat_statements.out index e0abe34bb6..8f7f93172a 100644 --- a/contrib/pg_stat_statements/expected/pg_stat_statements.out +++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out @@ -50,8 +50,28 @@ BEGIN \; SELECT 2.0 AS "float" \; SELECT 'world' AS "text" \; COMMIT; + float +--- + 2.0 +(1 row) + + text +--- + world +(1 row) + -- compound with empty statements and spurious leading spacing \;\; SELECT 3 + 3 \;\;\; SELECT ' ' || ' !' \;\; SELECT 1 + 4 \;; + ?column? +-- +6 +(1 row) + + ?column? +-- + ! +(1 row) + ?column? -- 5 @@ -61,6 +81,11 @@ COMMIT; SELECT 1 + 1 + 1 AS "add" \gset SELECT :add + 1 + 1 AS "add" \; SELECT :add + 1 + 1 AS "add" \gset + add +- + 5 +(1 row) + -- set operator SELECT 1 AS i UNION SELECT 2 ORDER BY i; i diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index caabb06c53..f01adb1fd2 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -127,18 +127,11 @@ echo '\x \\ SELECT * FROM foo;' | psql commands included in the string to divide it into multiple transactions. (See for more details about how the server handles multi-query strings.) - Also, psql only prints the - result of the last SQL command in the string. - This is different from the behavior when the same string is read from - a file or fed to psql's standard input, - because then psql sends - each SQL command separately. - Because of this behavior, putting more than one SQL command in a - single -c string often has unexpected results. 
- It's better to use repeated -c commands or feed - multiple commands to psql's standard input, + If having several commands executed in one transaction is not desired, + use repeated -c commands or feed multiple commands to + psql's standard input, either using echo as illustrated above, or via a shell here-document, for example: @@ -3570,10 +3563,6 @@ select 1\; select 2\; select 3; commands included in the string to divide it into multiple transactions. (See for more details about how the server handles multi-query strings.) -psql prints only the last query result -it receives for each request; in this example, although all -three SELECTs are indeed executed, psql -only prints the 3. @@ -4160,6 +4149,18 @@ bar +SHOW_ALL_RESULTS + + +When this variable is set to off, only the last +result of a combined query (\;) is shown instead of +all of them. The default is on. The off behavior +is for compatibility with older versions of
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
Hello Tatsuo-san, It seems the patch is ready for committer except below. Do you guys want to do more on below? I'm planning a new review of this significant patch, possibly over the next week-end, or the next. -- Fabien.
Re: Typo in pgbench messages.
So fine with me wrt having a more homogeneous report. So are you fine with Kawamoto-san's patch? Yes. Patch applies cleanly (hmmm, it would have been better to have it as an attachment). Make & make check ok. Fine with me. -- Fabien.
Re: Typo in pgbench messages.
Hello Daniel, I think that the break of typographical rules is intentional to allow such simplistic low-level stream handling through pipes or scripts. I'd prefer that the format is not changed. Maybe a comment could be added to explain the reason behind it. That doesn't sound like an overwhelmingly convincing argument to print some messages with 'X %' and others with 'X%'. Indeed. The no-space % are for database loading progress and the final report, which I happen not to process through pipes and which are more user-facing interfaces/reports. The stream piping need applies more to repeated lines such as those output by progress, which happen to avoid percentages anyway so the question does not arise. So fine with me wrt having a more homogeneous report. -- Fabien.
Re: Typo in pgbench messages.
Bonjour Michaël, I think it's better to back-patch this to stable branches if there's no objection. Thought? That's only cosmetic, so I would just bother about HEAD if I were to change something like that (I would not bother at all, personally). One argument against a backpatch is that this would be disruptive with tools that parse and analyze the output generated by pgbench. Fabien, don't you have some tools and/or wrappers doing exactly that? Yep, I like to "| cut -d' ' -fx" and other "line.split(' ')" or whatever. I think that the break of typographical rules is intentional to allow such simplistic low-level stream handling through pipes or scripts. I'd prefer that the format is not changed. Maybe a comment could be added to explain the reason behind it. -- Fabien.
Re: libpq async duplicate error results
Hello Tom, I concur with Fabien's analysis: we report the FATAL message from the server during the first PQgetResult, and then the second call discovers that the connection is gone and reports "server closed the connection unexpectedly". Those are two independent events (in libpq's view anyway) so reporting them separately is correct, or at least necessary unless you want to engage in seriously major redesign and behavioral changes. It is annoying that some of the text is duplicated in the second report, but in the end that's cosmetic, and I'm not sure we can improve it without breaking other things. In particular, we can't just think about what comes back in the PGgetResult calls, we also have to consider what PQerrorMessage(conn) will report afterwards. So I don't think that resetting conn->errorMessage during a PQgetResult series would be a good fix. An idea that I don't have time to pursue right now is to track how much of conn->errorMessage has been read out by PQgetResult, and only report newly-added text in later PQgetResult calls of a series, while PQerrorMessage would continue to report the whole thing. Buffer resets would occur where they do now. It'd probably be necessary to make a user-visible PQgetResult that becomes a wrapper around PQgetResultInternal, because internal callers such as PQexecFinish will need the old behavior, or at least some of them will. I agree that the message reset is not convincing, but it was the only way to get the expected behavior without changing the existing functions. I see two paths: (1) keep the duplicate message in this corner case. (2) do the hocus pocus you suggest around PQerrorMessage and PQgetResult to work around the duplication, just for this corner case. I'd tend to choose (1) to keep it simple, even if (2) is feasible. -- Fabien.
Re: Latest LLVM breaks our code again
Hello Andres, I'm doubtful that tracking development branches of LLVM is a good investment. Their development model is to do changes in-tree much more than we do. Adjusting to API changes the moment they're made will often end up with further changes to the same / related lines. Once they open the relevant release-NN branch, it's a different story. Maybe it'd make sense to disable --with-llvm on seawasp and have a separate animal that tracks the newest release branch? The point of seawasp is somehow to have an early warning of upcoming changes, especially the --with-llvm part. LLVM indeed is a moving target, but they very rarely back down on their API changes… As pg versions are expected to run for 5 years, they will encounter newer compiler versions, so being as ready as possible seems worthwhile. ISTM that there is no actual need to fix LLVM breakage on the spot, but I think it is pertinent to be ok near release time. This is why there is an "EXPERIMENTAL" marker in the system description. There can be some noise when the LLVM development version is broken, this has happened in the past with seawasp (llvm) and moonjelly (gcc), but not often. About tracking releases: this means more setups and runs and updates, and as I think they do care about compatibility *within* a release we would not see breakages so it would not be very useful, and we would only get the actual breakages at LLVM release time, which may be much later. For these reasons, I'm inclined to leave seawasp as it is. -- Fabien.
Re: Latest LLVM breaks our code again
Speaking of buildfarm breakage, seawasp has been failing for the past several days. It looks like bleeding-edge LLVM has again changed some APIs we depend on. First failure is here: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=seawasp=2022-01-28%2000%3A17%3A48 Indeed. I'm sorry for the late warning: clang could not compile with its previous version at some point so my weekly recompilation got stuck and I did not notice for some months. I repaired it by switching to compiling clang with gcc instead. ::sigh:: -- Fabien.
Re: psql - add SHOW_ALL_RESULTS option
Hello Peter, It would be better to do without. Also, it makes one wonder how others are supposed to use this multiple-results API properly, if even psql can't do it without extending libpq. Needs more thought. Fine with me! Obviously I'm okay if libpq is repaired instead of writing strange code on the client to deal with strange behavior. I have a new thought on this, as long as we are looking into libpq. Why can't libpq provide a variant of PQexec() that returns all results, instead of just the last one? It has all the information, all it has to do is return the results instead of throwing them away. Then the changes in psql would be very limited, and we don't have to re-invent PQexec() from its pieces in psql. And this would also make it easier for other clients and user code to make use of this functionality more easily. Attached is a rough draft of what this could look like. It basically works. Thoughts? My 0.02€: With this approach results are not available till the last one has been returned? If so, it loses the nice asynchronous property of getting results as they come when they come? This might or might not be desirable depending on the use case. For "psql", ISTM that we should want immediate and asynchronous anyway?? I'm unclear about what happens wrt client-side data buffering if several large results are returned? COPY?? Also, I guess the user must free the returned array on top of closing all results? -- Fabien.
Re: libpq async duplicate error results
command = SELECT pg_terminate_backend(pg_backend_pid()); result 1 status = PGRES_FATAL_ERROR error message = "FATAL: terminating connection due to administrator command " result 2 status = PGRES_FATAL_ERROR error message = "FATAL: terminating connection due to administrator command server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. " Also, why are there multiple results being generated in the first place? My interpretation is that the first message is a close message issued by the server before actually severing the connection, and the second message is generated by libpq when it notices that the connection has been closed, so there is some sense in having two results to report these two consecutive errors, and the question might be about when the buffer should be reset. -- Fabien.
Re: psql - add SHOW_ALL_RESULTS option
Hallo Peter, Attached v14 moves the status extraction before the possible clear. I've added a couple of results = NULL after such calls in the code. In the psql.sql test file, the test I previously added concluded with \set ECHO none, which was a mistake that I have now fixed. As a result, the tests that you added after that point didn't show their input lines, which was weird and not intentional. So the tests will now show a different output. Ok. I notice that this patch has recently gained a new libpq function. I gather that this is to work around the misbehaviors in libpq that we have discussed. Indeed. But I think if we are adding a libpq API function to work around a misbehavior in libpq, we might as well fix the misbehavior in libpq to begin with. Adding a new public libpq function is a significant step, needs documentation, etc. I'm not so sure. The choice is (1) change the behavior of an existing function or (2) add a new function. Whatever the existing function does, the usual answer to API changes is "someone is going to complain because it breaks their code", so "Returned with feedback", hence I did not even try. The advantage of (2) is that it does not harm anyone to have a new function that they just do not need to use. It would be better to do without. Also, it makes one wonder how others are supposed to use this multiple-results API properly, if even psql can't do it without extending libpq. Needs more thought. Fine with me! Obviously I'm okay if libpq is repaired instead of writing strange code on the client to deal with strange behavior. -- Fabien.
Re: psql - add SHOW_ALL_RESULTS option
Hello Andres, The reason this test constantly fails on cfbot windows is a use-after-free bug. Indeed! Thanks a lot for the catch and the debug! The ClearOrSaveResult function is quite annoying because it may or may not clear the result as a side effect. Attached v14 moves the status extraction before the possible clear. I've added a couple of results = NULL after such calls in the code. -- Fabien.diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out b/contrib/pg_stat_statements/expected/pg_stat_statements.out index e0abe34bb6..8f7f93172a 100644 --- a/contrib/pg_stat_statements/expected/pg_stat_statements.out +++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out @@ -50,8 +50,28 @@ BEGIN \; SELECT 2.0 AS "float" \; SELECT 'world' AS "text" \; COMMIT; + float +--- + 2.0 +(1 row) + + text +--- + world +(1 row) + -- compound with empty statements and spurious leading spacing \;\; SELECT 3 + 3 \;\;\; SELECT ' ' || ' !' \;\; SELECT 1 + 4 \;; + ?column? +-- +6 +(1 row) + + ?column? +-- + ! +(1 row) + ?column? -- 5 @@ -61,6 +81,11 @@ COMMIT; SELECT 1 + 1 + 1 AS "add" \gset SELECT :add + 1 + 1 AS "add" \; SELECT :add + 1 + 1 AS "add" \gset + add +- + 5 +(1 row) + -- set operator SELECT 1 AS i UNION SELECT 2 ORDER BY i; i diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index 1ab200a4ad..0a22850912 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -127,18 +127,11 @@ echo '\x \\ SELECT * FROM foo;' | psql commands included in the string to divide it into multiple transactions. (See for more details about how the server handles multi-query strings.) - Also, psql only prints the - result of the last SQL command in the string. - This is different from the behavior when the same string is read from - a file or fed to psql's standard input, - because then psql sends - each SQL command separately. 
- Because of this behavior, putting more than one SQL command in a - single -c string often has unexpected results. - It's better to use repeated -c commands or feed - multiple commands to psql's standard input, + If having several commands executed in one transaction is not desired, + use repeated -c commands or feed multiple commands to + psql's standard input, either using echo as illustrated above, or via a shell here-document, for example: @@ -3570,10 +3563,6 @@ select 1\; select 2\; select 3; commands included in the string to divide it into multiple transactions. (See for more details about how the server handles multi-query strings.) -psql prints only the last query result -it receives for each request; in this example, although all -three SELECTs are indeed executed, psql -only prints the 3. @@ -4160,6 +4149,18 @@ bar +SHOW_ALL_RESULTS + + +When this variable is set to off, only the last +result of a combined query (\;) is shown instead of +all of them. The default is on. The off behavior +is for compatibility with older versions of psql. + + + + + SHOW_CONTEXT diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c index 3503605a7d..47eabcbb8e 100644 --- a/src/bin/psql/common.c +++ b/src/bin/psql/common.c @@ -34,6 +34,8 @@ static bool DescribeQuery(const char *query, double *elapsed_msec); static bool ExecQueryUsingCursor(const char *query, double *elapsed_msec); static bool command_no_begin(const char *query); static bool is_select_command(const char *query); +static int SendQueryAndProcessResults(const char *query, double *pelapsed_msec, + bool is_watch, const printQueryOpt *opt, FILE *printQueryFout, bool *tx_ended); /* @@ -354,7 +356,7 @@ CheckConnection(void) * Returns true for valid result, false for error state. 
*/ static bool -AcceptResult(const PGresult *result) +AcceptResult(const PGresult *result, bool show_error) { bool OK; @@ -385,7 +387,7 @@ AcceptResult(const PGresult *result) break; } - if (!OK) + if (!OK && show_error) { const char *error = PQerrorMessage(pset.db); @@ -473,6 +475,18 @@ ClearOrSaveResult(PGresult *result) } } +/* + * Consume all results + */ +static void +ClearOrSaveAllResults() +{ + PGresult *result; + + while ((result = PQgetResult(pset.db)) != NULL) + ClearOrSaveResult(result); +} + /* * Print microtiming output. Always print raw milliseconds; if the interval @@ -573,7 +587,7 @@ PSQLexec(const char *query) ResetCancelConn(); - if (!AcceptResult(res)) + if (!AcceptResult(res, true)) { ClearOrSaveResult(res); res = NULL; @@ -596,11 +610,8 @@ int PSQLexecWatch(const char *query,
Re: psql - add SHOW_ALL_RESULTS option
Hello Peter, quite weird and confusing. Maybe this type of test should be done in the TAP framework instead. Attached v13 where the crash test is moved to tap. -- Fabien.diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out b/contrib/pg_stat_statements/expected/pg_stat_statements.out index e0abe34bb6..8f7f93172a 100644 --- a/contrib/pg_stat_statements/expected/pg_stat_statements.out +++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out @@ -50,8 +50,28 @@ BEGIN \; SELECT 2.0 AS "float" \; SELECT 'world' AS "text" \; COMMIT; + float +--- + 2.0 +(1 row) + + text +--- + world +(1 row) + -- compound with empty statements and spurious leading spacing \;\; SELECT 3 + 3 \;\;\; SELECT ' ' || ' !' \;\; SELECT 1 + 4 \;; + ?column? +-- +6 +(1 row) + + ?column? +-- + ! +(1 row) + ?column? -- 5 @@ -61,6 +81,11 @@ COMMIT; SELECT 1 + 1 + 1 AS "add" \gset SELECT :add + 1 + 1 AS "add" \; SELECT :add + 1 + 1 AS "add" \gset + add +- + 5 +(1 row) + -- set operator SELECT 1 AS i UNION SELECT 2 ORDER BY i; i diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index 1ab200a4ad..0a22850912 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -127,18 +127,11 @@ echo '\x \\ SELECT * FROM foo;' | psql commands included in the string to divide it into multiple transactions. (See for more details about how the server handles multi-query strings.) - Also, psql only prints the - result of the last SQL command in the string. - This is different from the behavior when the same string is read from - a file or fed to psql's standard input, - because then psql sends - each SQL command separately. - Because of this behavior, putting more than one SQL command in a - single -c string often has unexpected results. 
- It's better to use repeated -c commands or feed - multiple commands to psql's standard input, + If having several commands executed in one transaction is not desired, + use repeated -c commands or feed multiple commands to + psql's standard input, either using echo as illustrated above, or via a shell here-document, for example: @@ -3570,10 +3563,6 @@ select 1\; select 2\; select 3; commands included in the string to divide it into multiple transactions. (See for more details about how the server handles multi-query strings.) -psql prints only the last query result -it receives for each request; in this example, although all -three SELECTs are indeed executed, psql -only prints the 3. @@ -4160,6 +4149,18 @@ bar +SHOW_ALL_RESULTS + + +When this variable is set to off, only the last +result of a combined query (\;) is shown instead of +all of them. The default is on. The off behavior +is for compatibility with older versions of psql. + + + + + SHOW_CONTEXT diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c index f210ccbde8..b8e8c2b245 100644 --- a/src/bin/psql/common.c +++ b/src/bin/psql/common.c @@ -33,6 +33,8 @@ static bool DescribeQuery(const char *query, double *elapsed_msec); static bool ExecQueryUsingCursor(const char *query, double *elapsed_msec); static bool command_no_begin(const char *query); static bool is_select_command(const char *query); +static int SendQueryAndProcessResults(const char *query, double *pelapsed_msec, + bool is_watch, const printQueryOpt *opt, FILE *printQueryFout, bool *tx_ended); /* @@ -353,7 +355,7 @@ CheckConnection(void) * Returns true for valid result, false for error state. 
*/ static bool -AcceptResult(const PGresult *result) +AcceptResult(const PGresult *result, bool show_error) { bool OK; @@ -384,7 +386,7 @@ AcceptResult(const PGresult *result) break; } - if (!OK) + if (!OK && show_error) { const char *error = PQerrorMessage(pset.db); @@ -472,6 +474,18 @@ ClearOrSaveResult(PGresult *result) } } +/* + * Consume all results + */ +static void +ClearOrSaveAllResults() +{ + PGresult *result; + + while ((result = PQgetResult(pset.db)) != NULL) + ClearOrSaveResult(result); +} + /* * Print microtiming output. Always print raw milliseconds; if the interval @@ -572,7 +586,7 @@ PSQLexec(const char *query) ResetCancelConn(); - if (!AcceptResult(res)) + if (!AcceptResult(res, true)) { ClearOrSaveResult(res); res = NULL; @@ -595,11 +609,8 @@ int PSQLexecWatch(const char *query, const printQueryOpt *opt, FILE *printQueryFout) { bool timing = pset.timing; - PGresult *res; double elapsed_msec = 0; - instr_time before; - instr_time after; - FILE *fout; + int res; if (!pset.db) { @@ -608,77
Re: psql - add SHOW_ALL_RESULTS option
Hello Peter, With this "voluntary crash", the regression test output is now psql ... ok (test process exited with exit code 2) 281 ms Normally, I'd expect this during development if there was a crash somewhere, but showing this during a normal run now, and moreover still saying "ok", Well, from a testing perspective, the crash is voluntary and it is indeed ok:-) is quite weird and confusing. Maybe this type of test should be done in the TAP framework instead. It could. Another simpler option: add a "psql_voluntary_crash.sql" with just that test instead of modifying the "psql.sql" test script? That would keep the test exit-code information, but the name of the script would make things clearer? Also, if non-zero statuses do not look so ok, should they be noted as bad? -- Fabien.
Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)
Here is my review about v32: I forgot to mention that doc generation for the cumulated changes also works. -- Fabien.
Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)
Hello Justin, Happy new year! I think the 2nd half of the patches are still waiting for fixes to lstat() on windows. Not your problem? Here is my review about v32: All patches apply cleanly. # part 01 One-liner doc improvement to tell that creation time is only available on windows. It is indeed not available on Linux. OK. # part 02 Add tests for various options on pg_ls_dir, and for pg_stat_file, which were not exercised before. "make check" is ok. OK. # part 03 This patch adds a new pg_ls_dir_metadata. Internally, this is an extension of the pg_ls_dir_files function which is used by other pg_ls functions. Doc ok. About the code: ISTM that the "if (1) { if (2) continue; } else if (3) { if (4) continue; }" structure may be simplified to "if (1 && 2) continue; if (3 && 4) continue;", at least if IS_DIR and IS_REG are incompatible? Otherwise, at least "else if (3 && 4) continue"? The ifdef WIN32 (which probably detects windows 64 bits…) overwrites values[3]. ISTM it could be reordered so that there is no overwrite, with simpler single assignments: #ifndef WIN32 v = ...; #else v = ... ? ... : ...; #endif New tests are added which check that the result columns are configured as required, including error cases. "make check" is ok. OK. # part 04 Add a new "isdir" column to the "pg_ls_tmpdir" output. This is a small behavioral change. I'm ok with it, however I'm unsure why we would not jump directly to the "type" char column done later in the patch series. ISTM all such functions should be extended the same way for better homogeneity? That would also impact the "waldir", "archive_status", "logical_*", and "replslot" variants. "make check" ok. OK. # part 05 This patch applies my previous advice:-) ISTM that parts 4 and 5 should be one single patch. The test changes show that only waldir has a test. Would it be possible to add minimal tests to the other variants as well? "make check" ok. I'd consider adding such tests with part 02. OK. # part 06 This part extends and adds a test for pg_ls_logdir.
ISTM that it should be merged with the previous patches. "make check" is ok. OK. # part 07 This part extends pg_stat_file with more date information. ISTM that the documentation should be clear about the windows vs unix/cygwin specific data provided (change/creation). The code adds a new value_from_stat function to avoid code duplication. Fine with me. All pg_ls_*dir functions are impacted. Fine with me. "make check" is ok. OK. # part 08 This part substitutes lstat for stat. Fine with me. "make check" is ok. I guess that lstat does something under windows even if the concept of a link is somehow different there. Maybe the doc should say so somewhere? OK. # part 09 This part switches the added "isdir" to a "type" column. "make check" is ok. This is a definite improvement. OK. # part 10 This part adds a redundant "isdir" column. I do not see the point. "make check" is ok. NOT OK. # part 11 This part adds a recurse option. Why not. However, the true value does not seem to be tested? "make check" is ok. My opinion is unclear. Overall, ignoring part 10, this makes a definite improvement to postgres ls capabilities. I do not see any reason not to add them. -- Fabien.
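The control-flow simplification suggested for part 03 can be checked in isolation. A hedged sketch with placeholder booleans c1..c4 standing in for the actual flag tests (not the pg_ls_dir_files code), showing that the two forms agree exactly when c1 and c3 cannot hold together:

```c
#include <stdbool.h>

/* Nested form from the review:
 * if (1) { if (2) continue; } else if (3) { if (4) continue; } */
static bool skip_nested(bool c1, bool c2, bool c3, bool c4)
{
    if (c1)
    {
        if (c2)
            return true;        /* would be "continue" in the real loop */
    }
    else if (c3)
    {
        if (c4)
            return true;
    }
    return false;
}

/* Flattened form: if (1 && 2) continue; if (3 && 4) continue; */
static bool skip_flat(bool c1, bool c2, bool c3, bool c4)
{
    return (c1 && c2) || (c3 && c4);
}
```

With c1 and c3 both true and c2 false, the nested form skips nothing while the flattened one may, which is why the "at least if IS_DIR and IS_REG are incompatible" caveat matters.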
Re: psql - add SHOW_ALL_RESULTS option
[...] I agree that these two behaviors in libpq are dubious, especially the second one. I want to spend some time analyzing this more and see if fixes in libpq might be appropriate. Ok. My analysis is that fixing libpq behavior is not in the scope of a psql patch, and that if I were to do that it would surely delay the patch even further. Also, these issues/features are corner cases that probably very few people have bumped into. -- Fabien.
Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)
Hello Justin, It seems that the v31 patch does not apply anymore: postgresql> git apply ~/v31-0001-Document-historic-behavior-of-links-to-directori.patch error: patch failed: doc/src/sgml/func.sgml:27410 error: doc/src/sgml/func.sgml: patch does not apply -- Fabien.
Re: psql - add SHOW_ALL_RESULTS option
Hello Peter, I finally took some time to look at this. Attached v11 is a rebase. This patch still has a few of the problems reported earlier this year. In [0], it was reported that certain replication commands result in infinite loops because of faulty error handling. This still happens. I wrote a test for it, attached here. (I threw in a few more basic tests, just to have some more coverage that was lacking, and to have a file to put the new test in.) Hmmm… For some unclear reason, on errors in a PGRES_COPY_* state PQgetResult keeps returning an empty result. PQexec manually ignores it, so I did the same with a comment, but for me the real bug is somehow in PQgetResult's behavior… In [1], it was reported that server crashes produce duplicate error messages. This also still happens. I didn't write a test for it, maybe you have an idea. (Obviously, we could check whether the error message is literally there twice in the output, but that doesn't seem very general.) But it's easy to test manually: just have psql connect, shut down the server, then run a query. This is also a feature/bug of libpq which happens to be hidden by PQexec: when one command crashes, PQgetResult actually returns *2* results. The first one carries the FATAL message; the second one comes when libpq figures out that the connection was lost, with the second message appended to the first. PQexec just happens to silently ignore the first result. I added a manual reset of the error message when first shown so that it is not shown twice. It is unclear to me whether the reset should be somewhere in libpq instead. I added a voluntary crash at the end of the psql test. Attached v12 somehow fixes these issues in "psql" code rather than in libpq.
-- Fabien.diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out b/contrib/pg_stat_statements/expected/pg_stat_statements.out index e0abe34bb6..8f7f93172a 100644 --- a/contrib/pg_stat_statements/expected/pg_stat_statements.out +++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out @@ -50,8 +50,28 @@ BEGIN \; SELECT 2.0 AS "float" \; SELECT 'world' AS "text" \; COMMIT; + float +--- + 2.0 +(1 row) + + text +--- + world +(1 row) + -- compound with empty statements and spurious leading spacing \;\; SELECT 3 + 3 \;\;\; SELECT ' ' || ' !' \;\; SELECT 1 + 4 \;; + ?column? +-- +6 +(1 row) + + ?column? +-- + ! +(1 row) + ?column? -- 5 @@ -61,6 +81,11 @@ COMMIT; SELECT 1 + 1 + 1 AS "add" \gset SELECT :add + 1 + 1 AS "add" \; SELECT :add + 1 + 1 AS "add" \gset + add +- + 5 +(1 row) + -- set operator SELECT 1 AS i UNION SELECT 2 ORDER BY i; i diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index ae38d3dcc3..1d411ae124 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -127,18 +127,11 @@ echo '\x \\ SELECT * FROM foo;' | psql commands included in the string to divide it into multiple transactions. (See for more details about how the server handles multi-query strings.) - Also, psql only prints the - result of the last SQL command in the string. - This is different from the behavior when the same string is read from - a file or fed to psql's standard input, - because then psql sends - each SQL command separately. - Because of this behavior, putting more than one SQL command in a - single -c string often has unexpected results. 
- It's better to use repeated -c commands or feed - multiple commands to psql's standard input, + If having several commands executed in one transaction is not desired, + use repeated -c commands or feed multiple commands to + psql's standard input, either using echo as illustrated above, or via a shell here-document, for example: @@ -3564,10 +3557,6 @@ select 1\; select 2\; select 3; commands included in the string to divide it into multiple transactions. (See for more details about how the server handles multi-query strings.) -psql prints only the last query result -it receives for each request; in this example, although all -three SELECTs are indeed executed, psql -only prints the 3. @@ -4154,6 +4143,18 @@ bar +SHOW_ALL_RESULTS + + +When this variable is set to off, only the last +result of a combined query (\;) is shown instead of +all of them. The default is on. The off behavior +is for compatibility with older versions of psql. + + + + + SHOW_CONTEXT diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c index ec975c3e2a..e06699878b 100644 --- a/src/bin/psql/common.c +++ b/src/bin/psql/common.c @@ -33,6 +33,8 @@ static bool
Re: rand48 replacement
Pushed with that change and some others, notably: Thanks for the improvements and the push! -- Fabien.
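For reference, the generator these threads settled on, xoroshiro128**, is tiny. A standalone sketch of the published algorithm (the reference construction by Blackman & Vigna, not a copy of the committed pg_prng.c):

```c
#include <stdint.h>

typedef struct
{
    uint64_t s0;
    uint64_t s1;
} prng_state;

static inline uint64_t rotl64(uint64_t x, int k)
{
    return (x << k) | (x >> (64 - k));
}

/* xoroshiro128** 1.0: a linear state update followed by a
 * multiply-rotate-multiply scrambler on the output word */
static uint64_t xoroshiro128ss(prng_state *state)
{
    uint64_t s0 = state->s0;
    uint64_t s1 = state->s1;
    uint64_t result = rotl64(s0 * 5, 7) * 9;

    s1 ^= s0;
    state->s0 = rotl64(s0, 24) ^ s1 ^ (s1 << 16);
    state->s1 = rotl64(s1, 37);
    return result;
}
```

The state must never be all zeroes (an all-zero state maps to itself), which is why a fixed non-zero default initialization is needed at all.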
Re: rand48 replacement
Hello, They are not more nor less relevant than any other "random" constant. The state needs a default initialization. The point of using DK's is that it then cannot be some specially crafted value with some special property known only to the purveyor of the constant, which could be used by them to break the algorithm. Well, none of that is in the comment, which is probably just as well because it reads like baseless paranoia. Sure. Welcome to cryptography:-) *Any* initialization vector should be as good as any other; if it's not, that's an algorithm fault. Yep. (OK, I'll give it a pass for zeroes being bad, but otherwise not.) Ok. We can use any non-zero constant. What's wrong with constants provided by a Turing award computer scientist? I find them more attractive than some stupid 0x0123456789…. * Function names like these convey practically nothing to readers: +extern int64 pg_prng_i64(pg_prng_state *state); [...] The intention is obviously "postgres pseudo-random number generator for ". [...] What would you suggest? We have names for these types, and those abbreviations are (mostly) not them. Name-wise I'd be all right with pg_prng_int64 and so on, Ok. You prefer "uint64" to "u64". but I still expect that these functions' header comments should be explicit about uniformity and about the precise output range. Ok. As an example, it's far from obvious whether the minimum value of pg_prng_int32 should be zero or INT_MIN. (Actually, I suspect you ought to provide both of those cases.) I agree that it is not obvious. I added "p" for "positive" variants. I found one place where one could be used. And the output range of pg_prng_float8 is not merely unobvious, but not very easy to deduce from examining the code either; not that users should have to. Ok. BTW, why are we bothering with FIRST_BIT_MASK in the first place, rather than returning "v & 1" for pg_prng_bool? Because some PRNGs are very bad in the low bits; not xoroshiro stuff, though.
Good, but then you shouldn't write associated code as if that's still a problem, because you'll cause other people to think it's still a problem and write equally contorted code elsewhere. "v & 1" is a transparent way of producing a bool, while this code isn't. "v & 1" really produces an integer, not a bool. I'd prefer to actually generate a boolean and let the compiler optimizer do the cleaning. Some Xoshiro-family generators have "linear artifacts in the low bits". Although Xoroshiro128** is supposed to be immune, I thought it better to keep away from those, and I could not see why the last bit would be better than any other bit, so taking the first looked okay to me at least. I think that the attached v18 addresses most of your concerns. -- Fabien.diff --git a/configure index 896b781473..f8c8164428 100755 --- a/configure +++ b/configure @@ -16463,32 +16463,6 @@ esac fi -ac_fn_c_check_func "$LINENO" "random" "ac_cv_func_random" -if test "x$ac_cv_func_random" = xyes; then : - $as_echo "#define HAVE_RANDOM 1" >>confdefs.h - -else - case " $LIBOBJS " in - *" random.$ac_objext "* ) ;; - *) LIBOBJS="$LIBOBJS random.$ac_objext" - ;; -esac - -fi - -ac_fn_c_check_func "$LINENO" "srandom" "ac_cv_func_srandom" -if test "x$ac_cv_func_srandom" = xyes; then : - $as_echo "#define HAVE_SRANDOM 1" >>confdefs.h - -else - case " $LIBOBJS " in - *" srandom.$ac_objext "* ) ;; - *) LIBOBJS="$LIBOBJS srandom.$ac_objext" - ;; -esac - -fi - ac_fn_c_check_func "$LINENO" "strlcat" "ac_cv_func_strlcat" if test "x$ac_cv_func_strlcat" = xyes; then : $as_echo "#define HAVE_STRLCAT 1" >>confdefs.h diff --git a/configure.ac b/configure.ac index b50130b323..a5c10b8d56 100644 --- a/configure.ac +++ b/configure.ac @@ -1858,8 +1858,6 @@ AC_REPLACE_FUNCS(m4_normalize([ mkdtemp pread pwrite - random - srandom strlcat strlcpy strnlen diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c index 42a830c33b..659ac05655 100644 --- a/contrib/amcheck/verify_nbtree.c +++ 
b/contrib/amcheck/verify_nbtree.c @@ -32,6 +32,7 @@ #include "catalog/index.h" #include "catalog/pg_am.h" #include "commands/tablecmds.h" +#include "common/pg_prng.h" #include "lib/bloomfilter.h" #include "miscadmin.h" #include "storage/lmgr.h" @@ -466,8 +467,8 @@ bt_check_every_level(Relation rel, Relation heaprel, bool heapkeyspace, total_pages = RelationGetNumberOfBlocks(rel); total_elems = Max(total_pages * (MaxTIDsPerBTreePage / 3), (int64) state->rel->rd_rel->reltuples); - /* Random seed relies on backend srandom() call to avoid repetition */ - seed = random(); + /* Random seed relies on backend prng initialization to avoid repetition */ + seed = pg_prng_uint64(&pg_global_prng_state); /* Create Bloom filter to fingerprint index */ state->filter = bloom_create(total_elems, maintenance_work_mem, seed); state->heaptuplespresent = 0; diff --git
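Two of the points argued above — which bit to turn into a boolean, and what range a float8 helper actually covers — can be made concrete. A hedged sketch under the assumption of a uniform 64-bit input word; these helpers are illustrative, not the committed pg_prng.c code:

```c
#include <stdbool.h>
#include <stdint.h>

/* Boolean from the most significant bit: robust even for generators
 * with weak low bits (the concern raised in the thread). */
static bool prng_bool_high(uint64_t v)
{
    return (v & (UINT64_C(1) << 63)) != 0;
}

/* Boolean from the low bit: fine for xoroshiro128**, and arguably more
 * transparent; the != 0 comparison yields a genuine bool, not an int. */
static bool prng_bool_low(uint64_t v)
{
    return (v & 1) != 0;
}

/* Double uniformly distributed in [0, 1): keep the top 52 bits and
 * scale by 2^-52; since the mantissa is at most 2^52 - 1, the result
 * can never reach 1.0. */
static double prng_float8(uint64_t v)
{
    return (double) (v >> 12) * (1.0 / 4503599627370496.0); /* * 2^-52 */
}
```

The [0, 1) range is exactly the kind of fact the review asked to state in the function header comments rather than leave to be deduced from the code.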
Re: rand48 replacement
Hello Tom, Thanks for the feedback. +/* use Donald Knuth's LCG constants for default state */ How did Knuth get into this? This algorithm is certainly not his, so why are those constants at all relevant? They are not more nor less relevant than any other "random" constant. The state needs a default initialization. The point of using DK's is that it then cannot be some specially crafted value with some special property known only to the purveyor of the constant, which could be used by them to break the algorithm. https://en.wikipedia.org/wiki/Dual_EC_DRBG * I could do without the stream-of-consciousness notes in pg_prng.c. I think what's appropriate is to say "We use thus-and-such a generator which is documented here", maybe with a line or two about its properties. The stuff was really written essentially as a "why this" for the first patch, and to prevent questions about "why not this other generator" later, because it could never stop. * Function names like these convey practically nothing to readers: +extern int64 pg_prng_i64(pg_prng_state *state); +extern uint32 pg_prng_u32(pg_prng_state *state); +extern int32 pg_prng_i32(pg_prng_state *state); +extern double pg_prng_f64(pg_prng_state *state); +extern bool pg_prng_bool(pg_prng_state *state); The intention is obviously "postgres pseudo-random number generator for ". ISTM that it conveys (1) that it is postgres-specific stuff, (2) that it is a PRNG (which I find *MUCH* more informative than the misleading statement that something is random when it is not, and it is shorter) and (3) about the type it returns, because C does require functions to have distinct names. What would you suggest? and these functions' header comments add a grand total of zero bits of information. Yes, probably. I do not like not to comment at all on a function. What someone generally wants to know first about a PRNG is (a) is it uniform and (b) what is the range of outputs, neither of which are specified anywhere.
ISTM (b) is suggested thanks to the type, and (a) I'm not sure about a PRNG which would not at least claim to be uniform. Non-uniform PRNGs are usually built on top of a uniform one. What do you suggest as alternate names? +#define FIRST_BIT_MASK UINT64CONST(0x8000000000000000) +#define RIGHT_HALF_MASK UINT64CONST(0x00000000FFFFFFFF) +#define DMANTISSA_MASK UINT64CONST(0x000FFFFFFFFFFFFF) I'm not sure that these named constants are any more readable than writing the underlying constant, maybe less so --- in particular I think something based on (1<<52)-1 would be more appropriate for the float mantissa operations. We don't need RIGHT_HALF_MASK at all, the casts to uint32 or int32 will accomplish that just fine. Yep. I did it for uniformity. BTW, why are we bothering with FIRST_BIT_MASK in the first place, rather than returning "v & 1" for pg_prng_bool? Because some PRNGs are very bad in the low bits; not xoroshiro stuff, though. Is xoroshiro128ss less random in the low-order bits than the higher? If so, that would be a pretty important thing to document. If it's not, we shouldn't make the code look like it is. Dunno. Why should we prefer low bits? + * select in a range with bitmask rejection. What is "bitmask rejection"? Is it actually important to callers? No, but it is important to understand how it does it. That is the name of the technique which is implemented, which helps if you want to understand what is going on by googling it. This point could be moved inside the function. I think this should be documented more like "Produce a random integer uniformly selected from the range [rmin, rmax)." Sure. -- Fabien.
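Since "bitmask rejection" came up: the technique masks each random word down to the smallest all-ones mask covering the range and rejects overshoots, which preserves uniformity while accepting more than half of the draws. A standalone sketch with a splitmix64-style stand-in bit source (assumed here purely for the demo; any uniform 64-bit generator plugs in), using the GCC/Clang builtin __builtin_clzll:

```c
#include <stdint.h>

/* Stand-in 64-bit source: a splitmix64-style step, for demo only. */
static uint64_t demo_state = UINT64_C(0x123456789ABCDEF0);
static uint64_t demo_next64(void)
{
    uint64_t z = (demo_state += UINT64_C(0x9E3779B97F4A7C15));
    z = (z ^ (z >> 30)) * UINT64_C(0xBF58476D1CE4E5B9);
    z = (z ^ (z >> 27)) * UINT64_C(0x94D049BB133111EB);
    return z ^ (z >> 31);
}

/* Bitmask rejection: produce a random integer uniformly selected from
 * [rmin, rmax].  The mask is the smallest all-ones value >= range, so
 * each masked draw is accepted with probability > 1/2. */
static uint64_t prng_uint64_range(uint64_t rmin, uint64_t rmax)
{
    uint64_t range = rmax - rmin;
    uint64_t mask = UINT64_MAX >> __builtin_clzll(range | 1);

    for (;;)
    {
        uint64_t v = demo_next64() & mask;
        if (v <= range)             /* rejection keeps the distribution uniform */
            return rmin + v;
    }
}
```

Simply taking next() % (range + 1) would be biased whenever range + 1 does not divide 2^64, which is exactly the bias rejection avoids.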
Re: psql - add SHOW_ALL_RESULTS option
Hello Daniel, This patch still has a few of the problems reported earlier this year. The patch fails to apply and the thread seems to have taken a nap. I'm not napping:-) I just do not have enough time available this month. I intend to work on the patch in the next CF (January). AFAICR there is one necessary rebase and one bug to fix. -- Fabien.
Re: Triage on old commitfest entries
Hello Tom, As I threatened in another thread, I've looked through all of the oldest commitfest entries to see which ones should maybe be tossed, on the grounds that they're unlikely to ever get committed so we should stop pushing them forward to the next CF. psql - add SHOW_ALL_RESULTS option 11 Last substantive discussion 2021-09, currently passing cfbot This got committed and reverted once already. I have to be suspicious of whether this is a good design. Thoughts? ISTM that the main problem with this patch is that it touches a barely tested piece of software, aka "psql":-( The second problem is that the initial code is fragile because it handles different modes with pretty intricate code. So, on the first commit it broke a few untested things, among the many untested things. This resulted in more tests being added (sql, tap) so that the relevant features are covered, so my point of view is that this patch is currently a net improvement, both from an engineering perspective and in terms of the features it provides and enables. Also, there is some interest in getting it in. So I do not think that it deserves to be dropped. -- Fabien.
Re: psql - add SHOW_ALL_RESULTS option
Hello Peter, Attached v9 integrates your tests and makes them work. Attached v11 is a rebase. -- Fabien.diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out b/contrib/pg_stat_statements/expected/pg_stat_statements.out index b52d187722..0cf4a37a5f 100644 --- a/contrib/pg_stat_statements/expected/pg_stat_statements.out +++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out @@ -50,8 +50,28 @@ BEGIN \; SELECT 2.0 AS "float" \; SELECT 'world' AS "text" \; COMMIT; + float +--- + 2.0 +(1 row) + + text +--- + world +(1 row) + -- compound with empty statements and spurious leading spacing \;\; SELECT 3 + 3 \;\;\; SELECT ' ' || ' !' \;\; SELECT 1 + 4 \;; + ?column? +-- +6 +(1 row) + + ?column? +-- + ! +(1 row) + ?column? -- 5 @@ -61,6 +81,11 @@ COMMIT; SELECT 1 + 1 + 1 AS "add" \gset SELECT :add + 1 + 1 AS "add" \; SELECT :add + 1 + 1 AS "add" \gset + add +- + 5 +(1 row) + -- set operator SELECT 1 AS i UNION SELECT 2 ORDER BY i; i diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index 14e0a4dbe3..6866a06f2d 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -127,18 +127,11 @@ echo '\x \\ SELECT * FROM foo;' | psql commands included in the string to divide it into multiple transactions. (See for more details about how the server handles multi-query strings.) - Also, psql only prints the - result of the last SQL command in the string. - This is different from the behavior when the same string is read from - a file or fed to psql's standard input, - because then psql sends - each SQL command separately. - Because of this behavior, putting more than one SQL command in a - single -c string often has unexpected results. 
- It's better to use repeated -c commands or feed - multiple commands to psql's standard input, + If having several commands executed in one transaction is not desired, + use repeated -c commands or feed multiple commands to + psql's standard input, either using echo as illustrated above, or via a shell here-document, for example: @@ -3542,10 +3535,6 @@ select 1\; select 2\; select 3; commands included in the string to divide it into multiple transactions. (See for more details about how the server handles multi-query strings.) -psql prints only the last query result -it receives for each request; in this example, although all -three SELECTs are indeed executed, psql -only prints the 3. @@ -4132,6 +4121,18 @@ bar +SHOW_ALL_RESULTS + + +When this variable is set to off, only the last +result of a combined query (\;) is shown instead of +all of them. The default is on. The off behavior +is for compatibility with older versions of psql. + + + + + SHOW_CONTEXT diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c index 5640786678..481a316fed 100644 --- a/src/bin/psql/common.c +++ b/src/bin/psql/common.c @@ -33,6 +33,8 @@ static bool DescribeQuery(const char *query, double *elapsed_msec); static bool ExecQueryUsingCursor(const char *query, double *elapsed_msec); static bool command_no_begin(const char *query); static bool is_select_command(const char *query); +static int SendQueryAndProcessResults(const char *query, double *pelapsed_msec, + bool is_watch, const printQueryOpt *opt, FILE *printQueryFout, bool *tx_ended); /* @@ -353,7 +355,7 @@ CheckConnection(void) * Returns true for valid result, false for error state. 
*/ static bool -AcceptResult(const PGresult *result) +AcceptResult(const PGresult *result, bool show_error) { bool OK; @@ -384,7 +386,7 @@ AcceptResult(const PGresult *result) break; } - if (!OK) + if (!OK && show_error) { const char *error = PQerrorMessage(pset.db); @@ -472,6 +474,18 @@ ClearOrSaveResult(PGresult *result) } } +/* + * Consume all results + */ +static void +ClearOrSaveAllResults() +{ + PGresult *result; + + while ((result = PQgetResult(pset.db)) != NULL) + ClearOrSaveResult(result); +} + /* * Print microtiming output. Always print raw milliseconds; if the interval @@ -572,7 +586,7 @@ PSQLexec(const char *query) ResetCancelConn(); - if (!AcceptResult(res)) + if (!AcceptResult(res, true)) { ClearOrSaveResult(res); res = NULL; @@ -594,11 +608,8 @@ PSQLexec(const char *query) int PSQLexecWatch(const char *query, const printQueryOpt *opt, FILE *printQueryFout) { - PGresult *res; double elapsed_msec = 0; - instr_time before; - instr_time after; - FILE *fout; + int res; if (!pset.db) { @@ -607,77 +618,14 @@ PSQLexecWatch(const char *query, const printQueryOpt *opt,
Re: rand48 replacement
I guess the declaration needs PGDLLIMPORT. Indeed, thanks! Attached v16 adds that. -- Fabien.diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c index d19f73127c..b250ae912b 100644 --- a/contrib/amcheck/verify_nbtree.c +++ b/contrib/amcheck/verify_nbtree.c @@ -32,6 +32,7 @@ #include "catalog/index.h" #include "catalog/pg_am.h" #include "commands/tablecmds.h" +#include "common/pg_prng.h" #include "lib/bloomfilter.h" #include "miscadmin.h" #include "storage/lmgr.h" @@ -465,8 +466,8 @@ bt_check_every_level(Relation rel, Relation heaprel, bool heapkeyspace, total_pages = RelationGetNumberOfBlocks(rel); total_elems = Max(total_pages * (MaxTIDsPerBTreePage / 3), (int64) state->rel->rd_rel->reltuples); - /* Random seed relies on backend srandom() call to avoid repetition */ - seed = random(); + /* Random seed relies on backend prng initialization to avoid repetition */ + seed = pg_prng_u64(_global_prng_state); /* Create Bloom filter to fingerprint index */ state->filter = bloom_create(total_elems, maintenance_work_mem, seed); state->heaptuplespresent = 0; diff --git a/contrib/auto_explain/auto_explain.c b/contrib/auto_explain/auto_explain.c index e9092ba359..0c4279447c 100644 --- a/contrib/auto_explain/auto_explain.c +++ b/contrib/auto_explain/auto_explain.c @@ -16,6 +16,7 @@ #include "access/parallel.h" #include "commands/explain.h" +#include "common/pg_prng.h" #include "executor/instrument.h" #include "jit/jit.h" #include "utils/guc.h" @@ -275,8 +276,7 @@ explain_ExecutorStart(QueryDesc *queryDesc, int eflags) if (nesting_level == 0) { if (auto_explain_log_min_duration >= 0 && !IsParallelWorker()) - current_query_sampled = (random() < auto_explain_sample_rate * - ((double) MAX_RANDOM_VALUE + 1)); + current_query_sampled = pg_prng_f64(_global_prng_state) < auto_explain_sample_rate; else current_query_sampled = false; } diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c index 2c2f149fb0..146b524076 100644 --- 
a/contrib/file_fdw/file_fdw.c +++ b/contrib/file_fdw/file_fdw.c @@ -1188,7 +1188,7 @@ file_acquire_sample_rows(Relation onerel, int elevel, * Found a suitable tuple, so save it, replacing one old tuple * at random */ -int k = (int) (targrows * sampler_random_fract(rstate.randstate)); +int k = (int) (targrows * sampler_random_fract()); Assert(k >= 0 && k < targrows); heap_freetuple(rows[k]); diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c index 76d4fea21c..c139382170 100644 --- a/contrib/postgres_fdw/postgres_fdw.c +++ b/contrib/postgres_fdw/postgres_fdw.c @@ -5157,7 +5157,7 @@ analyze_row_processor(PGresult *res, int row, PgFdwAnalyzeState *astate) if (astate->rowstoskip <= 0) { /* Choose a random reservoir element to replace. */ - pos = (int) (targrows * sampler_random_fract(astate->rstate.randstate)); + pos = (int) (targrows * sampler_random_fract(>rstate.randstate)); Assert(pos >= 0 && pos < targrows); heap_freetuple(astate->rows[pos]); } diff --git a/contrib/tablefunc/tablefunc.c b/contrib/tablefunc/tablefunc.c index 52b272f298..ac6db426ac 100644 --- a/contrib/tablefunc/tablefunc.c +++ b/contrib/tablefunc/tablefunc.c @@ -36,6 +36,7 @@ #include "access/htup_details.h" #include "catalog/pg_type.h" +#include "common/pg_prng.h" #include "executor/spi.h" #include "funcapi.h" #include "lib/stringinfo.h" @@ -290,8 +291,8 @@ get_normal_pair(float8 *x1, float8 *x2) do { - u1 = (float8) random() / (float8) MAX_RANDOM_VALUE; - u2 = (float8) random() / (float8) MAX_RANDOM_VALUE; + u1 = pg_prng_f64(_global_prng_state); + u2 = pg_prng_f64(_global_prng_state); v1 = (2.0 * u1) - 1.0; v2 = (2.0 * u2) - 1.0; diff --git a/contrib/tsm_system_rows/tsm_system_rows.c b/contrib/tsm_system_rows/tsm_system_rows.c index 4996612902..1a46d4b143 100644 --- a/contrib/tsm_system_rows/tsm_system_rows.c +++ b/contrib/tsm_system_rows/tsm_system_rows.c @@ -69,7 +69,7 @@ static BlockNumber system_rows_nextsampleblock(SampleScanState *node, BlockNumbe static 
OffsetNumber system_rows_nextsampletuple(SampleScanState *node, BlockNumber blockno, OffsetNumber maxoffset); -static uint32 random_relative_prime(uint32 n, SamplerRandomState randstate); +static uint32 random_relative_prime(uint32 n, pg_prng_state *randstate); /* @@ -213,25 +213,25 @@ system_rows_nextsampleblock(SampleScanState *node, BlockNumber nblocks) if (sampler->step == 0) { /* Initialize now that we have scan descriptor */ - SamplerRandomState randstate; + pg_prng_state randstate; /* If relation is empty, there's nothing to scan */ if (nblocks == 0) return InvalidBlockNumber; /* We only need an RNG during this setup step */ - sampler_random_init_state(sampler->seed, randstate); + sampler_random_init_state(sampler->seed, ); /* Compute
Re: rand48 replacement
Attached v15 also does call srandom if it is there, and fixes yet another remaining random call. I think that I have now removed all references to "random" from pg source. However, the test still fails on windows, because the linker does not find a global variable when compiling extensions, but it seems to find the functions defined in the very same file... Link: 4130 C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\x86_amd64\link.exe /ERRORREPORT:QUEUE /OUT:".\Release\tablefunc\tablefunc.dll" /INCREMENTAL:NO /NOLOGO Release/postgres/postgres.lib kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib /NODEFAULTLIB:libc /DEF:"./Release/tablefunc/tablefunc.def" /MANIFEST /MANIFESTUAC:"level='asInvoker' uiAccess='false'" /manifest:embed /DEBUG /PDB:".\Release\tablefunc\tablefunc.pdb" /SUBSYSTEM:CONSOLE /STACK:"4194304" /TLBID:1 /DYNAMICBASE:NO /NXCOMPAT /IMPLIB:"Release/tablefunc/tablefunc.lib" /MACHINE:X64 /ignore:4197 /DLL .\Release\tablefunc\win32ver.res 4131 .\Release\tablefunc\tablefunc.obj 4132 Creating library Release/tablefunc/tablefunc.lib and object Release/tablefunc/tablefunc.exp 4133 tablefunc.obj : error LNK2001: unresolved external symbol pg_global_prng_state [C:\projects\postgresql\tablefunc.vcxproj] 4134 .\Release\tablefunc\tablefunc.dll : fatal error LNK1120: 1 unresolved externals [C:\projects\postgresql\tablefunc.vcxproj] 4135 Done Building Project "C:\projects\postgresql\tablefunc.vcxproj" (default targets) -- FAILED. The missing symbol is really defined in common/pg_prng.c which AFAICT is linked with postgres. If someone experienced with the windows compilation chain could give a hint of what is needed, I'd appreciate it! -- Fabien.
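For the record, the usual cause of this symptom on Windows is that data symbols, unlike functions, are not automatically imported across DLL boundaries, so the exported variable's declaration needs a PGDLLIMPORT marker. A minimal self-contained sketch of the mechanism (the macro guard and struct layout here are illustrative, not the real pg_prng.h):

```c
#include <stdint.h>

/* On Windows, PGDLLIMPORT must expand to __declspec(dllimport) when an
 * extension references a variable exported by the backend binary;
 * elsewhere it expands to nothing.  BUILDING_BACKEND is a hypothetical
 * guard for this sketch. */
#if defined(_WIN32) && !defined(BUILDING_BACKEND)
#define PGDLLIMPORT __declspec(dllimport)
#else
#define PGDLLIMPORT
#endif

typedef struct pg_prng_state
{
    uint64_t s0;
    uint64_t s1;
} pg_prng_state;

/* What the header must say for tablefunc.dll to link against the
 * backend-exported state: */
extern PGDLLIMPORT pg_prng_state pg_global_prng_state;
```

Without the marker, MSVC resolves backend *functions* fine (they go through the export table) but fails on the bare data symbol with exactly the LNK2001 shown above.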
Re: rand48 replacement
Just FTR, I strongly object to your removal of process-startup srandom() calls. Ok. The point of the patch is to replace and unify the postgres underlying PRNG, so there was some logic behind this removal. FTR, this was triggered by your comment on Jul 1: [...] I see that you probably did that because random.c and srandom.c depend on it, but I wonder why we don't make an effort to flush those altogether. It's surely pretty confusing to newbies that what appears to be a call of the libc primitives is no such thing. I understood "flushing s?random.c" to mean that it would be a good thing to remove their definitions, and hence their calls, whereas in the initial patch I had provided a replacement for srandom & random. -- Fabien.
Re: rand48 replacement
Hello Tom, Just FTR, I strongly object to your removal of process-startup srandom() calls. Ok. The point of the patch is to replace and unify the postgres underlying PRNG, so there was some logic behind this removal. Those are not only setting the seed for our own use, but also ensuring that things like random() calls within PL functions or other libraries aren't 100% predictable. Sure, they shouldn't be predictable. Attached v15 also does call srandom if it is there, and fixes yet another remaining random call. -- Fabien.diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c index d19f73127c..b250ae912b 100644 --- a/contrib/amcheck/verify_nbtree.c +++ b/contrib/amcheck/verify_nbtree.c @@ -32,6 +32,7 @@ #include "catalog/index.h" #include "catalog/pg_am.h" #include "commands/tablecmds.h" +#include "common/pg_prng.h" #include "lib/bloomfilter.h" #include "miscadmin.h" #include "storage/lmgr.h" @@ -465,8 +466,8 @@ bt_check_every_level(Relation rel, Relation heaprel, bool heapkeyspace, total_pages = RelationGetNumberOfBlocks(rel); total_elems = Max(total_pages * (MaxTIDsPerBTreePage / 3), (int64) state->rel->rd_rel->reltuples); - /* Random seed relies on backend srandom() call to avoid repetition */ - seed = random(); + /* Random seed relies on backend prng initialization to avoid repetition */ + seed = pg_prng_u64(_global_prng_state); /* Create Bloom filter to fingerprint index */ state->filter = bloom_create(total_elems, maintenance_work_mem, seed); state->heaptuplespresent = 0; diff --git a/contrib/auto_explain/auto_explain.c b/contrib/auto_explain/auto_explain.c index e9092ba359..0c4279447c 100644 --- a/contrib/auto_explain/auto_explain.c +++ b/contrib/auto_explain/auto_explain.c @@ -16,6 +16,7 @@ #include "access/parallel.h" #include "commands/explain.h" +#include "common/pg_prng.h" #include "executor/instrument.h" #include "jit/jit.h" #include "utils/guc.h" @@ -275,8 +276,7 @@ explain_ExecutorStart(QueryDesc *queryDesc, int eflags) if 
(nesting_level == 0) { if (auto_explain_log_min_duration >= 0 && !IsParallelWorker()) - current_query_sampled = (random() < auto_explain_sample_rate * - ((double) MAX_RANDOM_VALUE + 1)); + current_query_sampled = pg_prng_f64(&pg_global_prng_state) < auto_explain_sample_rate; else current_query_sampled = false; } diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c index 2c2f149fb0..146b524076 100644 --- a/contrib/file_fdw/file_fdw.c +++ b/contrib/file_fdw/file_fdw.c @@ -1188,7 +1188,7 @@ file_acquire_sample_rows(Relation onerel, int elevel, * Found a suitable tuple, so save it, replacing one old tuple * at random */ -int k = (int) (targrows * sampler_random_fract(rstate.randstate)); +int k = (int) (targrows * sampler_random_fract(&rstate.randstate)); Assert(k >= 0 && k < targrows); heap_freetuple(rows[k]); diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c index 76d4fea21c..c139382170 100644 --- a/contrib/postgres_fdw/postgres_fdw.c +++ b/contrib/postgres_fdw/postgres_fdw.c @@ -5157,7 +5157,7 @@ analyze_row_processor(PGresult *res, int row, PgFdwAnalyzeState *astate) if (astate->rowstoskip <= 0) { /* Choose a random reservoir element to replace.
*/ - pos = (int) (targrows * sampler_random_fract(astate->rstate.randstate)); + pos = (int) (targrows * sampler_random_fract(&astate->rstate.randstate)); Assert(pos >= 0 && pos < targrows); heap_freetuple(astate->rows[pos]); } diff --git a/contrib/tablefunc/tablefunc.c b/contrib/tablefunc/tablefunc.c index 779bd4415e..fdd117f81e 100644 --- a/contrib/tablefunc/tablefunc.c +++ b/contrib/tablefunc/tablefunc.c @@ -36,6 +36,7 @@ #include "access/htup_details.h" #include "catalog/pg_type.h" +#include "common/pg_prng.h" #include "executor/spi.h" #include "funcapi.h" #include "lib/stringinfo.h" @@ -290,8 +291,8 @@ get_normal_pair(float8 *x1, float8 *x2) do { - u1 = (float8) random() / (float8) MAX_RANDOM_VALUE; - u2 = (float8) random() / (float8) MAX_RANDOM_VALUE; + u1 = pg_prng_f64(&pg_global_prng_state); + u2 = pg_prng_f64(&pg_global_prng_state); v1 = (2.0 * u1) - 1.0; v2 = (2.0 * u2) - 1.0; diff --git a/contrib/tsm_system_rows/tsm_system_rows.c b/contrib/tsm_system_rows/tsm_system_rows.c index 4996612902..1a46d4b143 100644 --- a/contrib/tsm_system_rows/tsm_system_rows.c +++ b/contrib/tsm_system_rows/tsm_system_rows.c @@ -69,7 +69,7 @@ static BlockNumber system_rows_nextsampleblock(SampleScanState *node, BlockNumbe static OffsetNumber system_rows_nextsampletuple(SampleScanState *node, BlockNumber blockno, OffsetNumber maxoffset); -static uint32 random_relative_prime(uint32 n, SamplerRandomState randstate); +static uint32 random_relative_prime(uint32 n, pg_prng_state *randstate); /* @@ -213,25 +213,25 @@ system_rows_nextsampleblock(SampleScanState *node, BlockNumber nblocks) if
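The gist of the patch above is that every call site now draws from one explicitly seeded generator state instead of the libc random()/srandom() pair. The sketch below illustrates that API shape with a splitmix64 mixer; the names (prng_state, prng_u64, prng_f64) and the mixer choice are stand-ins for illustration, not the actual pg_prng implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a pg_prng-style generator state. */
typedef struct
{
	uint64_t	s;
} prng_state;

static void
prng_seed(prng_state *st, uint64_t seed)
{
	st->s = seed;
}

/* splitmix64 step: a simple, well-known 64-bit mixing function. */
static uint64_t
prng_u64(prng_state *st)
{
	uint64_t	z = (st->s += 0x9e3779b97f4a7c15ULL);

	z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9ULL;
	z = (z ^ (z >> 27)) * 0x94d049bb133111ebULL;
	return z ^ (z >> 31);
}

/* Keep the top 52 bits so the result fits a double's mantissa: value in [0, 1). */
static double
prng_f64(prng_state *st)
{
	return (double) (prng_u64(st) >> 12) / (double) (1ULL << 52);
}

/* Sanity check: range of f64 output, and determinism under an equal seed. */
static int
prng_selftest(void)
{
	prng_state	a, b;
	double		d;

	prng_seed(&a, 42);
	prng_seed(&b, 42);
	d = prng_f64(&a);
	if (!(d >= 0.0 && d < 1.0))
		return 0;
	(void) prng_f64(&b);		/* consume the same draw on the twin state */
	return prng_u64(&a) == prng_u64(&b);
}
```

The single-state design is what makes the amcheck and auto_explain hunks above one-liners: callers just pass a pointer to the shared state rather than relying on a process-wide srandom() having happened earlier.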
Re: rand48 replacement
[1]: http://cfbot.cputube.org/ Indeed. I wish that these results would be available from the cf interface. Attached a v11 which might improve things. Not enough. Here is a v12 which might improve things further. Not enough. Here is a v13 which might improve things further more. Not enough. Here is a v14 which might improve things further more again. Sorry for this noise due to blind windows tests. -- Fabien.diff --git a/configure b/configure index 7542fe30a1..7dc95e3960 100755 --- a/configure +++ b/configure @@ -800,6 +800,7 @@ infodir docdir oldincludedir includedir +runstatedir localstatedir sharedstatedir sysconfdir @@ -941,6 +942,7 @@ datadir='${datarootdir}' sysconfdir='${prefix}/etc' sharedstatedir='${prefix}/com' localstatedir='${prefix}/var' +runstatedir='${localstatedir}/run' includedir='${prefix}/include' oldincludedir='/usr/include' docdir='${datarootdir}/doc/${PACKAGE_TARNAME}' @@ -1193,6 +1195,15 @@ do | -silent | --silent | --silen | --sile | --sil) silent=yes ;; + -runstatedir | --runstatedir | --runstatedi | --runstated \ + | --runstate | --runstat | --runsta | --runst | --runs \ + | --run | --ru | --r) +ac_prev=runstatedir ;; + -runstatedir=* | --runstatedir=* | --runstatedi=* | --runstated=* \ + | --runstate=* | --runstat=* | --runsta=* | --runst=* | --runs=* \ + | --run=* | --ru=* | --r=*) +runstatedir=$ac_optarg ;; + -sbindir | --sbindir | --sbindi | --sbind | --sbin | --sbi | --sb) ac_prev=sbindir ;; -sbindir=* | --sbindir=* | --sbindi=* | --sbind=* | --sbin=* \ @@ -1330,7 +1341,7 @@ fi for ac_var in exec_prefix prefix bindir sbindir libexecdir datarootdir \ datadir sysconfdir sharedstatedir localstatedir includedir \ oldincludedir docdir infodir htmldir dvidir pdfdir psdir \ - libdir localedir mandir + libdir localedir mandir runstatedir do eval ac_val=\$$ac_var # Remove trailing slashes. 
@@ -1483,6 +1494,7 @@ Fine tuning of the installation directories: --sysconfdir=DIR read-only single-machine data [PREFIX/etc] --sharedstatedir=DIR modifiable architecture-independent data [PREFIX/com] --localstatedir=DIR modifiable single-machine data [PREFIX/var] + --runstatedir=DIR modifiable per-process data [LOCALSTATEDIR/run] --libdir=DIR object code libraries [EPREFIX/lib] --includedir=DIR C header files [PREFIX/include] --oldincludedir=DIR C header files for non-gcc [/usr/include] @@ -15026,7 +15038,7 @@ else We can't simply define LARGE_OFF_T to be 9223372036854775807, since some C++ compilers masquerading as C compilers incorrectly reject 9223372036854775807. */ -#define LARGE_OFF_T (((off_t) 1 << 62) - 1 + ((off_t) 1 << 62)) +#define LARGE_OFF_T ((((off_t) 1 << 31) << 31) - 1 + (((off_t) 1 << 31) << 31)) int off_t_is_large[(LARGE_OFF_T % 2147483629 == 721 && LARGE_OFF_T % 2147483647 == 1) ? 1 : -1]; @@ -15072,7 +15084,7 @@ else We can't simply define LARGE_OFF_T to be 9223372036854775807, since some C++ compilers masquerading as C compilers incorrectly reject 9223372036854775807. */ -#define LARGE_OFF_T (((off_t) 1 << 62) - 1 + ((off_t) 1 << 62)) +#define LARGE_OFF_T ((((off_t) 1 << 31) << 31) - 1 + (((off_t) 1 << 31) << 31)) int off_t_is_large[(LARGE_OFF_T % 2147483629 == 721 && LARGE_OFF_T % 2147483647 == 1) ? 1 : -1]; @@ -15096,7 +15108,7 @@ rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext We can't simply define LARGE_OFF_T to be 9223372036854775807, since some C++ compilers masquerading as C compilers incorrectly reject 9223372036854775807. */ -#define LARGE_OFF_T (((off_t) 1 << 62) - 1 + ((off_t) 1 << 62)) +#define LARGE_OFF_T ((((off_t) 1 << 31) << 31) - 1 + (((off_t) 1 << 31) << 31)) int off_t_is_large[(LARGE_OFF_T % 2147483629 == 721 && LARGE_OFF_T % 2147483647 == 1) ?
1 : -1]; @@ -15141,7 +15153,7 @@ else We can't simply define LARGE_OFF_T to be 9223372036854775807, since some C++ compilers masquerading as C compilers incorrectly reject 9223372036854775807. */ -#define LARGE_OFF_T (((off_t) 1 << 62) - 1 + ((off_t) 1 << 62)) +#define LARGE_OFF_T ((((off_t) 1 << 31) << 31) - 1 + (((off_t) 1 << 31) << 31)) int off_t_is_large[(LARGE_OFF_T % 2147483629 == 721 && LARGE_OFF_T % 2147483647 == 1) ? 1 : -1]; @@ -15165,7 +15177,7 @@ rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext We can't simply define LARGE_OFF_T to be 9223372036854775807, since some C++ compilers masquerading as C compilers incorrectly reject 9223372036854775807. */ -#define LARGE_OFF_T (((off_t) 1 << 62) - 1 + ((off_t) 1 << 62)) +#define LARGE_OFF_T ((((off_t) 1 << 31) << 31) - 1 + (((off_t) 1 << 31) << 31)) int off_t_is_large[(LARGE_OFF_T % 2147483629 == 721 && LARGE_OFF_T % 2147483647 == 1) ? 1 : -1]; @@ -16301,32 +16313,6 @@ esac fi -ac_fn_c_check_func "$LINENO"
Re: rand48 replacement
[1]: http://cfbot.cputube.org/ Indeed. I wish that these results would be available from the cf interface. Attached a v11 which might improve things. Not enough. Here is a v12 which might improve things further. Not enough. Here is a v13 which might improve things further more. -- Fabien.
Re: rand48 replacement
Hello again, Just wanted to let you know that cfbot [1] doesn't seem to be happy with the patch. Apparently, some tests are failing. To be honest, I didn't invest too much time into investigating this. Hopefully, it's not a big deal. [1]: http://cfbot.cputube.org/ Indeed. I wish that these results would be available from the cf interface. Attached a v11 which might improve things. Not enough. Here is a v12 which might improve things further. -- Fabien.
Re: psql - add SHOW_ALL_RESULTS option
Hallo Peter, It turns out that your v8 patch still has the issue complained about in [0]. The issue is that after COMMIT AND CHAIN, the internal savepoint is gone, but the patched psql still thinks it should be there and tries to release it, which leads to errors. Indeed. Thanks for the catch. Attached v9 integrates your tests and makes them work. -- Fabien.diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out b/contrib/pg_stat_statements/expected/pg_stat_statements.out index b52d187722..0cf4a37a5f 100644 --- a/contrib/pg_stat_statements/expected/pg_stat_statements.out +++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out @@ -50,8 +50,28 @@ BEGIN \; SELECT 2.0 AS "float" \; SELECT 'world' AS "text" \; COMMIT; + float +--- + 2.0 +(1 row) + + text +--- + world +(1 row) + -- compound with empty statements and spurious leading spacing \;\; SELECT 3 + 3 \;\;\; SELECT ' ' || ' !' \;\; SELECT 1 + 4 \;; + ?column? +-- +6 +(1 row) + + ?column? +-- + ! +(1 row) + ?column? -- 5 @@ -61,6 +81,11 @@ COMMIT; SELECT 1 + 1 + 1 AS "add" \gset SELECT :add + 1 + 1 AS "add" \; SELECT :add + 1 + 1 AS "add" \gset + add +- + 5 +(1 row) + -- set operator SELECT 1 AS i UNION SELECT 2 ORDER BY i; i diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index fcab5c0d51..7c5504bc74 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -127,18 +127,11 @@ echo '\x \\ SELECT * FROM foo;' | psql commands included in the string to divide it into multiple transactions. (See for more details about how the server handles multi-query strings.) - Also, psql only prints the - result of the last SQL command in the string. - This is different from the behavior when the same string is read from - a file or fed to psql's standard input, - because then psql sends - each SQL command separately. - Because of this behavior, putting more than one SQL command in a - single -c string often has unexpected results. 
- It's better to use repeated -c commands or feed - multiple commands to psql's standard input, + If having several commands executed in one transaction is not desired, + use repeated -c commands or feed multiple commands to + psql's standard input, either using echo as illustrated above, or via a shell here-document, for example: @@ -3542,10 +3535,6 @@ select 1\; select 2\; select 3; commands included in the string to divide it into multiple transactions. (See for more details about how the server handles multi-query strings.) -psql prints only the last query result -it receives for each request; in this example, although all -three SELECTs are indeed executed, psql -only prints the 3. @@ -4132,6 +4121,18 @@ bar +SHOW_ALL_RESULTS + + +When this variable is set to off, only the last +result of a combined query (\;) is shown instead of +all of them. The default is on. The off behavior +is for compatibility with older versions of psql. + + + + + SHOW_CONTEXT diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c index 5640786678..481a316fed 100644 --- a/src/bin/psql/common.c +++ b/src/bin/psql/common.c @@ -33,6 +33,8 @@ static bool DescribeQuery(const char *query, double *elapsed_msec); static bool ExecQueryUsingCursor(const char *query, double *elapsed_msec); static bool command_no_begin(const char *query); static bool is_select_command(const char *query); +static int SendQueryAndProcessResults(const char *query, double *pelapsed_msec, + bool is_watch, const printQueryOpt *opt, FILE *printQueryFout, bool *tx_ended); /* @@ -353,7 +355,7 @@ CheckConnection(void) * Returns true for valid result, false for error state. 
*/ static bool -AcceptResult(const PGresult *result) +AcceptResult(const PGresult *result, bool show_error) { bool OK; @@ -384,7 +386,7 @@ AcceptResult(const PGresult *result) break; } - if (!OK) + if (!OK && show_error) { const char *error = PQerrorMessage(pset.db); @@ -472,6 +474,18 @@ ClearOrSaveResult(PGresult *result) } } +/* + * Consume all results + */ +static void +ClearOrSaveAllResults() +{ + PGresult *result; + + while ((result = PQgetResult(pset.db)) != NULL) + ClearOrSaveResult(result); +} + /* * Print microtiming output. Always print raw milliseconds; if the interval @@ -572,7 +586,7 @@ PSQLexec(const char *query) ResetCancelConn(); - if (!AcceptResult(res)) + if (!AcceptResult(res, true)) { ClearOrSaveResult(res); res = NULL; @@ -594,11 +608,8 @@ PSQLexec(const char *query) int PSQLexecWatch(const char *query, const
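The core behavioral change in this patch is that psql now drains every result of a combined query rather than keeping only the last one: the ClearOrSaveAllResults() loop keeps calling PQgetResult() until it returns NULL. The toy below mocks that drain pattern with a fake result queue; FakeResult and fake_PQgetResult are invented stand-ins for the libpq types, purely to show the loop shape.

```c
#include <assert.h>
#include <stddef.h>

/* Fake stand-ins for PGresult / PQgetResult, just to show the drain loop. */
typedef struct
{
	int			value;
} FakeResult;

static FakeResult queue[] = {{1}, {2}, {3}};
static size_t queue_pos = 0;

/* Like PQgetResult: returns each pending result in turn, then NULL. */
static FakeResult *
fake_PQgetResult(void)
{
	if (queue_pos >= sizeof(queue) / sizeof(queue[0]))
		return NULL;			/* NULL terminates the result stream */
	return &queue[queue_pos++];
}

/* Drain-all pattern: process every result, not just the last one. */
static int
drain_all_results(void)
{
	FakeResult *res;
	int			processed = 0;

	while ((res = fake_PQgetResult()) != NULL)
		processed += res->value;	/* real psql would print or store it */

	return processed;
}
```

Pre-patch psql effectively threw away everything but the final result of a `\;`-combined query; draining and printing each one is what the new SHOW_ALL_RESULTS default exposes, with the off setting retained for the old behavior.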
Re: rand48 replacement
Hello Aleksander, Attached a v10 which is some kind of compromise where the interface uses inclusive min and max bounds, so that all values can be reached. Just wanted to let you know that cfbot [1] doesn't seem to be happy with the patch. Apparently, some tests are failing. To be honest, I didn't invest too much time into investigating this. Hopefully, it's not a big deal. [1]: http://cfbot.cputube.org/ Indeed. I wish that these results would be available from the cf interface. Attached a v11 which might improve things. Thanks for the ping! -- Fabien.
diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c index d19f73127c..b250ae912b 100644 --- a/contrib/amcheck/verify_nbtree.c +++ b/contrib/amcheck/verify_nbtree.c @@ -32,6 +32,7 @@ #include "catalog/index.h" #include "catalog/pg_am.h" #include "commands/tablecmds.h" +#include "common/pg_prng.h" #include "lib/bloomfilter.h" #include "miscadmin.h" #include "storage/lmgr.h" @@ -465,8 +466,8 @@ bt_check_every_level(Relation rel, Relation heaprel, bool heapkeyspace, total_pages = RelationGetNumberOfBlocks(rel); total_elems = Max(total_pages * (MaxTIDsPerBTreePage / 3), (int64) state->rel->rd_rel->reltuples); - /* Random seed relies on backend srandom() call to avoid repetition */ - seed = random(); + /* Random seed relies on backend prng initialization to avoid repetition */ + seed = pg_prng_u64(&pg_global_prng_state); /* Create Bloom filter to fingerprint index */ state->filter = bloom_create(total_elems, maintenance_work_mem, seed); state->heaptuplespresent = 0; diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c index 2c2f149fb0..146b524076 100644 --- a/contrib/file_fdw/file_fdw.c +++ b/contrib/file_fdw/file_fdw.c @@ -1188,7 +1188,7 @@ file_acquire_sample_rows(Relation onerel, int elevel, * Found a suitable tuple, so save it, replacing one old tuple * at random */ -int k = (int) (targrows * sampler_random_fract(rstate.randstate)); +int k = (int) (targrows *
sampler_random_fract(&rstate.randstate)); Assert(k >= 0 && k < targrows); heap_freetuple(rows[k]); diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c index 76d4fea21c..c139382170 100644 --- a/contrib/postgres_fdw/postgres_fdw.c +++ b/contrib/postgres_fdw/postgres_fdw.c @@ -5157,7 +5157,7 @@ analyze_row_processor(PGresult *res, int row, PgFdwAnalyzeState *astate) if (astate->rowstoskip <= 0) { /* Choose a random reservoir element to replace. */ - pos = (int) (targrows * sampler_random_fract(astate->rstate.randstate)); + pos = (int) (targrows * sampler_random_fract(&astate->rstate.randstate)); Assert(pos >= 0 && pos < targrows); heap_freetuple(astate->rows[pos]); } diff --git a/contrib/tsm_system_rows/tsm_system_rows.c b/contrib/tsm_system_rows/tsm_system_rows.c index 4996612902..1a46d4b143 100644 --- a/contrib/tsm_system_rows/tsm_system_rows.c +++ b/contrib/tsm_system_rows/tsm_system_rows.c @@ -69,7 +69,7 @@ static BlockNumber system_rows_nextsampleblock(SampleScanState *node, BlockNumbe static OffsetNumber system_rows_nextsampletuple(SampleScanState *node, BlockNumber blockno, OffsetNumber maxoffset); -static uint32 random_relative_prime(uint32 n, SamplerRandomState randstate); +static uint32 random_relative_prime(uint32 n, pg_prng_state *randstate); /* @@ -213,25 +213,25 @@ system_rows_nextsampleblock(SampleScanState *node, BlockNumber nblocks) if (sampler->step == 0) { /* Initialize now that we have scan descriptor */ - SamplerRandomState randstate; + pg_prng_state randstate; /* If relation is empty, there's nothing to scan */ if (nblocks == 0) return InvalidBlockNumber; /* We only need an RNG during this setup step */ - sampler_random_init_state(sampler->seed, randstate); + sampler_random_init_state(sampler->seed, &randstate); /* Compute nblocks/firstblock/step only once per query */ sampler->nblocks = nblocks; /* Choose random starting block within the relation */ /* (Actually this is the predecessor of the first block visited) */ - sampler->firstblock =
sampler_random_fract(randstate) * + sampler->firstblock = sampler_random_fract(&randstate) * sampler->nblocks; /* Find relative prime as step size for linear probing */ - sampler->step = random_relative_prime(sampler->nblocks, randstate); + sampler->step = random_relative_prime(sampler->nblocks, &randstate); } /* Reinitialize lb */ @@ -317,7 +317,7 @@ gcd(uint32 a, uint32 b) * (else return 1). */ static uint32 -random_relative_prime(uint32 n, SamplerRandomState randstate) +random_relative_prime(uint32 n, pg_prng_state *randstate) { uint32 r; diff --git a/contrib/tsm_system_time/tsm_system_time.c b/contrib/tsm_system_time/tsm_system_time.c index 788d8f9a68..36acc6c106 100644 --- a/contrib/tsm_system_time/tsm_system_time.c +++ b/contrib/tsm_system_time/tsm_system_time.c @@ -69,7 +69,7 @@
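On the v10 design point mentioned above — inclusive min and max bounds, so that all values in the range can be reached — a standard way to draw an unbiased integer in a closed range [min, max] from a raw 64-bit generator is rejection sampling. The sketch below is illustrative only: the splitmix64 mixer and the function names are assumptions, not the patch's actual code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 64-bit source (splitmix64); stands in for any u64 PRNG. */
static uint64_t
mix_u64(uint64_t *s)
{
	uint64_t	z = (*s += 0x9e3779b97f4a7c15ULL);

	z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9ULL;
	z = (z ^ (z >> 27)) * 0x94d049bb133111ebULL;
	return z ^ (z >> 31);
}

/*
 * Unbiased integer in [min, max], both bounds inclusive (caller ensures
 * min <= max).  Rejection sampling: discard the few low draws that would
 * otherwise bias the modulo reduction.
 */
static int64_t
random_in_range(uint64_t *state, int64_t min, int64_t max)
{
	uint64_t	range = (uint64_t) max - (uint64_t) min;
	uint64_t	span,
				skew,
				r;

	if (range == UINT64_MAX)	/* full 64-bit range: no reduction needed */
		return (int64_t) mix_u64(state);

	span = range + 1;
	skew = (UINT64_MAX % span + 1) % span;	/* = 2^64 mod span */

	do
		r = mix_u64(state);
	while (r < skew);			/* reject the biased remainder */

	return (int64_t) ((uint64_t) min + r % span);
}

/* Sanity check: stays within bounds and reaches both endpoints. */
static int
range_selftest(void)
{
	uint64_t	st = 123;
	int			seen[5] = {0, 0, 0, 0, 0};
	int			i;

	for (i = 0; i < 1000; i++)
	{
		int64_t		v = random_in_range(&st, -2, 2);

		if (v < -2 || v > 2)
			return 0;
		seen[v + 2] = 1;
	}
	return seen[0] && seen[1] && seen[2] && seen[3] && seen[4];
}
```

With exclusive upper bounds the maximum value is unreachable, which is exactly what the inclusive-bounds interface avoids; the rejection step is what keeps the distribution uniform when 2^64 is not a multiple of the range size.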
Re: Avoid stuck of pbgench due to skipped transactions
Hello Fujii-san, Stop counting skipped transactions under -T as soon as the timer is exceeded. Because otherwise it can take a very long time to count all of them, especially when quite a lot of them happen with an unrealistically high rate setting in -R, which would prevent pgbench from ending immediately. Because of this behavior, note that there is no guarantee that all skipped transactions are counted under -T, though there is under -t. This is OK in practice because it is very unlikely to happen with realistic settings. Ok, I find this text quite clear. One question is: which version do we want to back-patch to? If we consider it a "very minor bug fix" triggered by somewhat unrealistic options, I'd say 14 & dev, or possibly only dev. -- Fabien.
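To make the reviewed behavior concrete: under -T, pgbench stops counting skipped transactions once the run deadline is reached instead of enumerating the whole backlog. A simplified sketch of that logic follows; the helper and its parameters are hypothetical, not pgbench's actual scheduler code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Count how many scheduled transactions fall before `now`, but give up as
 * soon as the schedule passes the benchmark deadline (-T), instead of
 * enumerating a potentially huge backlog at high -R rates.
 */
static int64_t
count_skipped(int64_t *sched, int64_t interval, int64_t now, int64_t deadline)
{
	int64_t		skipped = 0;

	while (*sched < now)
	{
		if (*sched >= deadline)
			break;				/* timer exceeded: stop counting, exit quickly */
		skipped++;
		*sched += interval;		/* advance to the next scheduled start time */
	}
	return skipped;
}

/* Sanity check: only the 5 slots before the deadline (0,10,20,30,40) count. */
static int
skip_selftest(void)
{
	int64_t		s = 0;

	return count_skipped(&s, 10, 1000, 50) == 5 && s == 50;
}
```

The early break is the whole fix: without the deadline check, a very high -R rate can leave so many scheduled-but-skipped transactions behind `now` that pgbench keeps counting long after -T has expired, which is the stuck exit the commit message describes.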