Re: [HACKERS] VIP: new format for psql - shell - simple using psql in shell
> > bash isn't everywhere (on UNIX)... it is true - but first format - space is used as separator and space is escaped should be processed on every shell. Regards Pavel -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] VIP: new format for psql - shell - simple using psql in shell
On Sat, May 26, 2012 at 11:50 AM, Pavel Stehule wrote: > 2012/5/26 Tom Lane : >> Bruce Momjian writes: >>> On Sat, May 26, 2012 at 05:39:23PM +0200, Pavel Stehule wrote: I proposed new psql's format "shell". This format is optimized for processing returned result in shell: >> >>> I am unclear exactly how this relates to shells. >> >> What I'm unclear on is why we'd want to encourage that style of >> programming. The most charitable prediction of performance is that it >> would suck --- not only do you have all the inefficiencies inherent in >> row-by-row result processing with a shell script, but you're forcing a >> separate database connection for each query. And I don't actually see >> where it would be especially convenient to use, compared to say perl >> or python or other scripting languages. I'd rather see us worrying >> about the convenience of cases like >> >> psql ... | perl -e ... > > A performance is not important in this case - typical use case for > this feature are simple tasks - some simple maintaining - where people > can prepare SQL in psql, and later can reuse knowledge in some simple > scripts. Shell has one significant advantage against perl or python - > is everywhere (on UNIX) and it is best for very simple tasks. > > Regards > > Pavel > > > >> >> regards, tom lane > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers bash isn't everywhere (on UNIX)... -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] VIP: new format for psql - shell - simple using psql in shell
2012/5/26 Jan-Benedict Glaw : > Hi! > > On Sat, 2012-05-26 17:39:23 +0200, Pavel Stehule > wrote: >> postgres=# select * from foo; > [...] >> postgres=# \pset format shell >> Output format is shell. >> postgres=# select * from foo; >> a b c >> Hello,\ World 10 2012-05-26 >> Ahoj,\ Svete 20 2012-06-15 > [...] > > I like that idea! Up to now, I basically used IFS='|' with > tuples-only, but it's not clean wrt. '\n' within the data. > > I didn't check your patch if it gets this "right", but if it does, it > would ease daily work (where I check for '\n' in the data beforehand > and abort...) > please, test it. I am long time bash user, but my knowledge is not too strong, and any second ayes are welcome. Regards Pavel > MfG, JBG > > -- > Jan-Benedict Glaw jbg...@lug-owl.de +49-172-7608481 > Signature of: Fortschritt bedeutet, einen Schritt so zu machen, > the second : daß man den nächsten auch noch machen kann. > > -BEGIN PGP SIGNATURE- > Version: GnuPG v1.4.10 (GNU/Linux) > > iEYEARECAAYFAk/BJ6UACgkQHb1edYOZ4btypgCeKC4I2MwzPYPbTwjmFxAnzQPt > +ykAn3B6oNnutk80Ige31qxjzsXrTRid > =CJvM > -END PGP SIGNATURE- > -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] VIP: new format for psql - shell - simple using psql in shell
2012/5/26 Bruce Momjian : > On Sat, May 26, 2012 at 12:43:40PM -0400, Tom Lane wrote: >> Bruce Momjian writes: >> > On Sat, May 26, 2012 at 05:39:23PM +0200, Pavel Stehule wrote: >> >> I proposed new psql's format "shell". This format is optimized for >> >> processing returned result in shell: >> >> > I am unclear exactly how this relates to shells. >> >> What I'm unclear on is why we'd want to encourage that style of >> programming. The most charitable prediction of performance is that it >> would suck --- not only do you have all the inefficiencies inherent in >> row-by-row result processing with a shell script, but you're forcing a >> separate database connection for each query. And I don't actually see >> where it would be especially convenient to use, compared to say perl >> or python or other scripting languages. I'd rather see us worrying >> about the convenience of cases like > > Wouldn't you just us unaligned mode for this, and set IFS to '|'? > > $ psql --no-align --tuples-only -c 'SELECT 1,2' test > 1|2 > > $ export IFS="|" > $ IFS='|' sql --no-align --tuples-only -c 'SELECT 1,2' test | > while read x y; do echo $x; echo $y; done > 1 > 2 > > Are you worried about pipes in data? Does you idea fix this? I can do use IFS, but it is not easy when you would to work with multicolumn tables - because you have to two IFS. Processing single column tables is simple now - difference is in multicolumn tables. My idea is secure to separator - because separator is just space and new line and these symbols are escaped. Regards Pavel > > -- > Bruce Momjian http://momjian.us > EnterpriseDB http://enterprisedb.com > > + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] --disable-shared is entirely broken these days
On Sat, May 26, 2012 at 12:53 PM, Tom Lane wrote: > 2. Seeing that this is the first complaint since 9.0, should we decide > that --disable-shared is no longer worth supporting? Seems like we > should either make this case work or remove this switch. I notice > that the switch isn't documented anywhere in the SGML docs, either. > If we do keep it, we'd better document that it results in a severely > crippled version of Postgres. I can't really imagine the reasons to support this switch at this time. Compile times are fast, disk space cheap, and if one wanted to avoid the dynamic linker entirely and use features like replication (which, IIRC, is dynamically linked for libpqwalreceiver.so) and plpgsql then people are out of luck without some work of (perhaps?) questionable value. -- fdr -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] --disable-shared is entirely broken these days
A gripe today in pgsql-novice made me realize that configure's --disable-shared option has been useless since 9.0, because it prevents plpgsql.so from being built, which causes initdb to fail now that plpgsql is installed "by default" --- which actually seems to mean "you don't have any choice about whether to install it". So, a few questions: 1. Should we honor the promise implicit in the "by default" wording that there should be a way to initdb without plpgsql? 2. Seeing that this is the first complaint since 9.0, should we decide that --disable-shared is no longer worth supporting? Seems like we should either make this case work or remove this switch. I notice that the switch isn't documented anywhere in the SGML docs, either. If we do keep it, we'd better document that it results in a severely crippled version of Postgres. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Synchronized scans versus relcache reinitialization
I've been poking at Jeff Frost's and Greg Mullane's recent reports of high load due to many processes getting "stuck" in relcache init file rebuild operations. I can reproduce a similar behavior here by creating a database containing a whole lot of many-column views, thereby bloating pg_attribute to the gigabyte range, then manually removing the pg_internal.init file (simulating what would happen after a relcache inval on any system catalog), and then throwing a bunch of new connections at the database simultaneously. Each new connection tries to rebuild the init file, and they basically saturate the machine. I don't believe that this case quite matches what happened to either Jeff or Greg, but nonetheless it's quite reproducible and it needs to be fixed. I can identify three sub-issues: 1. If pg_attribute is larger than 1/4th of shared_buffers, the synchronized scan logic kicks in when we do seqscans to fill the tuple descriptors for core system catalogs. For this particular use case that's not merely not helpful, it's positively disastrous. The reason is that the desired rows are almost always in the first couple dozen blocks of pg_attribute, and the reading code in RelationBuildTupleDesc knows this and is coded to stop once it's collected the expected number of pg_attribute rows for the particular catalog. So even with a very large pg_attribute, not much work should be expended here. But the syncscan logic causes some of the heapscans to start from points later than block zero, causing them to miss the rows they need, so that the scan has to run to the end and wrap around before it finds all the rows it needs. In my test case on HEAD, this happens just once out of the eleven heapscans that occur in this phase, if a single backend is doing this in isolation. That increases the startup time from a few milliseconds to about eight-tenths of a second, due to having to scan all of pg_attribute. (In my test case, pg_attribute is fully cached in RAM, but most of it is in kernel buffers not PG buffers.) Bad as that is, it gets rapidly worse if there are multiple incoming new connections. All of them get swept up in the full-table syncscan started by the first arrival, so that now all rather than only some of their heapscans start from a point later than block zero, meaning that all eleven rather than just one of their heapscans are unduly expensive. It seems clear to me that we should just disable syncscans for the relcache reload heapscans. There is lots of downside due to breaking the early-exit optimization in RelationBuildTupleDesc, and basically no upside. I'm inclined to just modify systable_beginscan to prevent use of syncscan whenever indexOK is false. If we wanted to change its API we could make this happen only for RelationBuildTupleDesc's calls, but I don't see any upside for allowing syncscans for other forced-heapscan callers either. 2. The larger problem here is that when we have N incoming connections we let all N of them try to rebuild the init file independently. This doesn't make things faster for any one of them, and once N gets large enough it makes things slower for all of them. We would be better off letting the first arrival do the rebuild work while the others just sleep waiting for it. I believe that this fix would probably have ameliorated Jeff and Greg's cases, even though those do not seem to have triggered the syncscan logic. 3. Having now spent a good deal of time poking at this, I think that the syncscan logic is in need of more tuning, and I am wondering whether we should even have it turned on by default. It appears to be totally useless for fully-cached-in-RAM scenarios, even if most of the relation is out in kernel buffers rather than in shared buffers. The best case I saw was less than 2X speedup compared to N-times-the-single-client case, and that wasn't very reproducible, and it didn't happen at all unless I hacked BAS_BULKREAD mode to use a ring buffer size many times larger than the current 256K setting (otherwise the timing requirements are too tight for multiple backends to stay in sync --- a seqscan can blow through that much data in a fraction of a millisecond these days, if it's reading from kernel buffers). The current tuning may be all right for cases where you're actually reading from spinning rust, but that seems to be a decreasing fraction of real-world use cases. Anyway, I think we definitely need to fix systable_beginscan to not use syncscans; that's about a one-line change and seems plenty safe to backpatch. I also intend to look at avoiding concurrent relcache rebuilds, which I think should also be simple enough if we are willing to introduce an additional LWLock. (That would prevent concurrent init file rebuilds in different databases, but it's not clear that very many people care about such scenarios.) I am inclined to back-patch that as well; it's a bit riskier than the first change, but the first change is apparently not going
Re: [HACKERS] VIP: new format for psql - shell - simple using psql in shell
Hi! On Sat, 2012-05-26 17:39:23 +0200, Pavel Stehule wrote: > postgres=# select * from foo; [...] > postgres=# \pset format shell > Output format is shell. > postgres=# select * from foo; > a b c > Hello,\ World 10 2012-05-26 > Ahoj,\ Svete 20 2012-06-15 [...] I like that idea! Up to now, I basically used IFS='|' with tuples-only, but it's not clean wrt. '\n' within the data. I didn't check your patch if it gets this "right", but if it does, it would ease daily work (where I check for '\n' in the data beforehand and abort...) MfG, JBG -- Jan-Benedict Glaw jbg...@lug-owl.de +49-172-7608481 Signature of: Fortschritt bedeutet, einen Schritt so zu machen, the second : daß man den nächsten auch noch machen kann. signature.asc Description: Digital signature
Re: [HACKERS] Backends stalled in 'startup' state: index corruption
On Sat, May 26, 2012 at 01:25:29PM -0400, Tom Lane wrote: > Greg Sabino Mullane writes: > > On Sat, May 26, 2012 at 12:17:04PM -0400, Tom Lane wrote: > >> If you see any block numbers above about 20 then maybe the triggering > >> condition is a row relocation after all. > > > Highest was 13. > > Hm ... but wait, you said you'd done a VACUUM FULL on the catalogs. > So it's not clear whether this is reflective of the state at the time > the problem was happening. True. I'll try to get a high water mark when (er...if!) it happens again. -- Greg Sabino Mullane g...@endpoint.com End Point Corporation PGP Key: 0x14964AC8 pgpUK3N5QYoTd.pgp Description: PGP signature
Re: [HACKERS] Backends stalled in 'startup' state: index corruption
Greg Sabino Mullane writes: > On Sat, May 26, 2012 at 12:17:04PM -0400, Tom Lane wrote: >> If you see any block numbers above about 20 then maybe the triggering >> condition is a row relocation after all. > Highest was 13. Hm ... but wait, you said you'd done a VACUUM FULL on the catalogs. So it's not clear whether this is reflective of the state at the time the problem was happening. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] VIP: new format for psql - shell - simple using psql in shell
On Sat, May 26, 2012 at 12:43:40PM -0400, Tom Lane wrote: > Bruce Momjian writes: > > On Sat, May 26, 2012 at 05:39:23PM +0200, Pavel Stehule wrote: > >> I proposed new psql's format "shell". This format is optimized for > >> processing returned result in shell: > > > I am unclear exactly how this relates to shells. > > What I'm unclear on is why we'd want to encourage that style of > programming. The most charitable prediction of performance is that it > would suck --- not only do you have all the inefficiencies inherent in > row-by-row result processing with a shell script, but you're forcing a > separate database connection for each query. And I don't actually see > where it would be especially convenient to use, compared to say perl > or python or other scripting languages. I'd rather see us worrying > about the convenience of cases like Wouldn't you just us unaligned mode for this, and set IFS to '|'? $ psql --no-align --tuples-only -c 'SELECT 1,2' test 1|2 $ export IFS="|" $ IFS='|' sql --no-align --tuples-only -c 'SELECT 1,2' test | while read x y; do echo $x; echo $y; done 1 2 Are you worried about pipes in data? Does you idea fix this? -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] VIP: new format for psql - shell - simple using psql in shell
2012/5/26 Tom Lane : > Bruce Momjian writes: >> On Sat, May 26, 2012 at 05:39:23PM +0200, Pavel Stehule wrote: >>> I proposed new psql's format "shell". This format is optimized for >>> processing returned result in shell: > >> I am unclear exactly how this relates to shells. > > What I'm unclear on is why we'd want to encourage that style of > programming. The most charitable prediction of performance is that it > would suck --- not only do you have all the inefficiencies inherent in > row-by-row result processing with a shell script, but you're forcing a > separate database connection for each query. And I don't actually see > where it would be especially convenient to use, compared to say perl > or python or other scripting languages. I'd rather see us worrying > about the convenience of cases like > > psql ... | perl -e ... A performance is not important in this case - typical use case for this feature are simple tasks - some simple maintaining - where people can prepare SQL in psql, and later can reuse knowledge in some simple scripts. Shell has one significant advantage against perl or python - is everywhere (on UNIX) and it is best for very simple tasks. Regards Pavel > > regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] VIP: new format for psql - shell - simple using psql in shell
Bruce Momjian writes: > On Sat, May 26, 2012 at 05:39:23PM +0200, Pavel Stehule wrote: >> I proposed new psql's format "shell". This format is optimized for >> processing returned result in shell: > I am unclear exactly how this relates to shells. What I'm unclear on is why we'd want to encourage that style of programming. The most charitable prediction of performance is that it would suck --- not only do you have all the inefficiencies inherent in row-by-row result processing with a shell script, but you're forcing a separate database connection for each query. And I don't actually see where it would be especially convenient to use, compared to say perl or python or other scripting languages. I'd rather see us worrying about the convenience of cases like psql ... | perl -e ... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] VIP: new format for psql - shell - simple using psql in shell
2012/5/26 Bruce Momjian : > On Sat, May 26, 2012 at 05:39:23PM +0200, Pavel Stehule wrote: >> Hello >> >> I proposed new psql's format "shell". This format is optimized for >> processing returned result in shell: >> >> >> postgres=# select * from foo; >> a | b | c >> --++ >> Hello, World | 10 | 2012-05-26 >> Ahoj, Svete | 20 | 2012-06-15 >> (2 rows) >> >> postgres=# \pset format shell >> Output format is shell. >> postgres=# select * from foo; >> a b c >> Hello,\ World 10 2012-05-26 >> Ahoj,\ Svete 20 2012-06-15 >> >> postgres=# \x >> Expanded display is on. >> postgres=# select * from foo; >> ( c l ) >> ( [a]=Hello,\ World [b]=10 [c]=2012-05-26 ) >> ( [a]=Ahoj,\ Svete [b]=20 [c]=2012-06-15 ) > ... >> ) | while read dbname owner encoding collate ctype priv; > > I am unclear exactly how this relates to shells. Do shells read this > via read? I am unclear that would actually work. What do the brackets > mean? Does read process \space as a non-space? > "read" can read multicolumn files, where space is separator and real space is escaped. It is first sample. Second example is related to Bash's feature - associative array support - data has format that is same like assoc array Pavel > -- > Bruce Momjian http://momjian.us > EnterpriseDB http://enterprisedb.com > > + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Backends stalled in 'startup' state: index corruption
On Sat, May 26, 2012 at 12:17:04PM -0400, Tom Lane wrote: > If you see any block numbers above about 20 then maybe the triggering > condition is a row relocation after all. Highest was 13. -- Greg Sabino Mullane g...@endpoint.com End Point Corporation PGP Key: 0x14964AC8 pgpa6XGTGTEIZ.pgp Description: PGP signature
Re: [HACKERS] No, pg_size_pretty(numeric) was not such a hot idea
Fujii Masao writes: > On Sat, May 26, 2012 at 9:30 AM, Tom Lane wrote: >> The argument for adding pg_size_pretty(numeric) was pretty darn thin in >> the first place, IMHO; it does not seem to me that it justified this >> loss of usability. > Ouch! But removing pg_size_pretty(numeric) causes another usability > issue, e.g., pg_size_pretty(pg_xlog_location_diff(...)) fails. So how about > removing pg_size_pretty(bigint) to resolve those two issues? > I guess pg_size_pretty(numeric) is a bit slower than bigint version, but > I don't think that such a bit slowdown of pg_size_pretty() becomes > a matter practically. No? AFAICS that argument is based on wishful thinking, not evidence. I did some simple measurements and determined that at least on my development machine, pg_size_pretty(numeric) is about a factor of four slower than pg_size_pretty(bigint) --- and that's just counting the function itself, not any added coercion-to-numeric processing. Now maybe you could argue that it's never going to be used in a context where anyone cares about its performance at all, but I've got doubts about that. In any case, it's probably too late to do anything about this for 9.2; and once we ship it like that there will be little point in changing it later, since people will already have had to add explicit casts to any queries where the problem arises. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Backends stalled in 'startup' state: index corruption
Greg Sabino Mullane writes: > On Fri, May 25, 2012 at 07:02:42PM -0400, Tom Lane wrote: >> pg_attribute just enough smaller to avoid the scenario. Not sure about >> Greg's case, but he should be able to tell us the size of pg_attribute >> and his shared_buffers setting ... > pg_attribute around 5 MB (+6MB indexes), shared_buffers 4GB. However, > there is a *lot* of churn in pg_attribute and pg_class, mostly due > to lots of temporary tables. > P.S. Hmmm that's weird, I just double-checked the above and pg_attribute > is now 52MB/70MB (the original figures were from yesterday). At any rate, > nowhere near 1/4 shared buffers. Hmph. And Jeff swears his isn't large enough to trigger syncscans either. This could all be due to the "thundering herd" effect of a lot of processes all doing the same only-moderately-expensive-in-itself thing; except it's hard to see how the problem gets rolling unless the single-process cache reload time is already a lot more than the inter-arrival time. Would you guys please try this in the problem databases: select a.ctid, c.relname from pg_attribute a join pg_class c on a.attrelid=c.oid where c.relnamespace=11 and c.relkind in ('r','i') order by 1 desc; If you see any block numbers above about 20 then maybe the triggering condition is a row relocation after all. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] VIP: new format for psql - shell - simple using psql in shell
On Sat, May 26, 2012 at 05:39:23PM +0200, Pavel Stehule wrote: > Hello > > I proposed new psql's format "shell". This format is optimized for > processing returned result in shell: > > > postgres=# select * from foo; > a | b | c > --++ > Hello, World | 10 | 2012-05-26 > Ahoj, Svete | 20 | 2012-06-15 > (2 rows) > > postgres=# \pset format shell > Output format is shell. > postgres=# select * from foo; > a b c > Hello,\ World 10 2012-05-26 > Ahoj,\ Svete 20 2012-06-15 > > postgres=# \x > Expanded display is on. > postgres=# select * from foo; > ( c l ) > ( [a]=Hello,\ World [b]=10 [c]=2012-05-26 ) > ( [a]=Ahoj,\ Svete [b]=20 [c]=2012-06-15 ) ... > ) | while read dbname owner encoding collate ctype priv; I am unclear exactly how this relates to shells. Do shells read this via read? I am unclear that would actually work. What do the brackets mean? Does read process \space as a non-space? -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] No, pg_size_pretty(numeric) was not such a hot idea
On 26-05-2012 01:45, Fujii Masao wrote: > Ouch! But removing pg_size_pretty(numeric) causes another usability > issue, e.g., pg_size_pretty(pg_xlog_location_diff(...)) fails. So how about > removing pg_size_pretty(bigint) to resolve those two issues? > I guess pg_size_pretty(numeric) is a bit slower than bigint version, but > I don't think that such a bit slowdown of pg_size_pretty() becomes > a matter practically. No? > That's what I proposed at [1]. +1 for dropping the pg_size_pretty(bigint). [1] http://archives.postgresql.org/message-id/4f315f6c.8030...@timbira.com -- Euler Taveira de Oliveira - Timbira http://www.timbira.com.br/ PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] VIP: new format for psql - shell - simple using psql in shell
Hello I proposed new psql's format "shell". This format is optimized for processing returned result in shell: postgres=# select * from foo; a | b | c --++ Hello, World | 10 | 2012-05-26 Ahoj, Svete | 20 | 2012-06-15 (2 rows) postgres=# \pset format shell Output format is shell. postgres=# select * from foo; a b c Hello,\ World 10 2012-05-26 Ahoj,\ Svete 20 2012-06-15 postgres=# \x Expanded display is on. postgres=# select * from foo; ( c l ) ( [a]=Hello,\ World [b]=10 [c]=2012-05-26 ) ( [a]=Ahoj,\ Svete [b]=20 [c]=2012-06-15 ) shell scripts can looks like: ( psql -t -P format=shell postgres < ${row[$field]}" done; echo; done;) I invite any comments, mainly from bash or shell experts Regards Pavel Stehule format_shell.diff Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Backends stalled in 'startup' state: index corruption
On Fri, May 25, 2012 at 07:02:42PM -0400, Tom Lane wrote: > However, the remaining processes trying to > compute new init files would still have to complete the process, so I'd > expect there to be a diminishing effect --- the ones that were stalling > shouldn't all release exactly together. Unless there is some additional > effect that's syncing them all. (I wonder for instance if the syncscan > logic is kicking in here.) How fast would you expect that to happen? As far as I could tell, they all released at once, or at least within probably 15 seconds of each other; I wasn't running ps constantly. I could check the logs and get a better figure if you think it's an important data point. > One interesting question is why there's a thundering herd of new > arrivals in the first place. IIRC you said you were using a connection > pooler. I wonder if it has a bug^H^H^Hdesign infelicity that makes it > drop and reopen all its connections simultaneously. No, we are not. Or rather, there is some pooling, but there is also a fairly large influx of new connections. As far as I could tell, the few existing connections were not affected. > 1. Somebody decides to update one of those rows, and it gets dropped in > some remote region of the table. The only really plausible reason for > this is deciding to fool with the column-specific stats target > (attstattarget) of a system catalog. Does that sound like something > either of you might have done? No, zero chance of this, barring some rogue intruder on the network with a strange sense of humor. > pg_attribute just enough smaller to avoid the scenario. Not sure about > Greg's case, but he should be able to tell us the size of pg_attribute > and his shared_buffers setting ... pg_attribute around 5 MB (+6MB indexes), shared_buffers 4GB. However, there is a *lot* of churn in pg_attribute and pg_class, mostly due to lots of temporary tables. P.S. Hmmm that's weird, I just double-checked the above and pg_attribute is now 52MB/70MB (the original figures were from yesterday). At any rate, nowhere near 1/4 shared buffers. -- Greg Sabino Mullane g...@endpoint.com End Point Corporation PGP Key: 0x14964AC8 pgpGtYKGLr70y.pgp Description: PGP signature
Re: [HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile
On Sat, 26 May 2012, Stephen Frost wrote: Alright, can you just time 'cat' when they're started a few seconds or whatever apart from each other? I can't imagine it being affected in the same way as these, but seems like it wouldn't hurt to check. I've tryed cat'ting a created in advance 8gig file on the same filesystem. And during the multi-threaded run it takes ~4sec, roughly the same as when run separately. * Sergey E. Koposov, PhD, Research Associate Institute of Astronomy, University of Cambridge Madingley road, CB3 0HA, Cambridge, UK Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile
* Sergey Koposov (kopo...@ast.cam.ac.uk) wrote: > Turning off synch seq scans doesn't help either. 18 sec > multithreaded run vs 7 sec single threaded. Alright, can you just time 'cat' when they're started a few seconds or whatever apart from each other? I can't imagine it being affected in the same way as these, but seems like it wouldn't hurt to check. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile
On Sat, 26 May 2012, Robert Haas wrote: This theory is seeming fairly plausible - how can we test it? How about trying it with synchronize_seqscans = off? If the synchronized-seqscan logic is causing contention on the buf mapping locks and individual buffer locks, that should fix it. Turning off synch seq scans doesn't help either. 18 sec multithreaded run vs 7 sec single threaded. S * Sergey E. Koposov, PhD, Research Associate Institute of Astronomy, University of Cambridge Madingley road, CB3 0HA, Cambridge, UK Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile
On Fri, May 25, 2012 at 10:30 AM, Merlin Moncure wrote: > I think what's happening here is that the buffer partitions don't help > (in fact, they hurt) in the presence of multiple concurrent scans that > are operating on approximately the same data. Sooner or later the > scans line up on each other and start binding when reassigning lock > tags (which take out up to two ordered exclusive lwlocks). This is on > the read side, so the buffer access strategy is zero help (I confirmed > this off list). This theory is seeming fairly plausible - how can we test it? How about trying it with synchronize_seqscans = off? If the synchronized-seqscan logic is causing contention on the buf mapping locks and individual buffer locks, that should fix it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Interrupting long external library calls
On May26, 2012, at 12:40 , Simon Riggs wrote: > On 25 May 2012 17:34, Tom Lane wrote: >> I assume that the geos::util::Interrupt::request() call sets a flag >> somewhere that's going to be periodically checked in long-running >> loops. Would it be possible for the periodic checks to include a >> provision for a callback into Postgres-specific glue code, wherein >> you could test the same flags CHECK_FOR_INTERRUPTS does? A similar >> approach might then be usable in other contexts, and it seems safer >> to me than messing with a host environment's signal handling. > > Can we do that as a macro, e.g. > > POSTGRES_LIBRARY_INTERRUPT_CHECK() > > Otherwise the fix to this problem may be worse - faulty interrupt > handlers are worse than none at all. As it stands, ProcessInterrupts() sometimes returns instead of ereport()ing even if InterruptPending is set, interrupts aren't held off, and the code isn't in a critical section. That happens if QueryCancelPending (or however that's called) gets reset after a SIGINT arrived but before CHECK_FOR_INTERRUPTS() is called. Or at least that is how I interpret the comment at the bottom of that function... We could thus easily provide POSTGRES_LIBRARY_INTERRUPT_CHECK() if we're content with the (slim, probably) chance of false positives. Or we'd need to refactor things in a way that allows the hypothetical POSTGRES_LIBRARY_INTERRUPT_CHECK() to re-use the tests in ProcessInterrupts(), but without modifying any of the flags. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Interrupting long external library calls
On 25 May 2012 17:34, Tom Lane wrote: > Sandro Santilli writes: >> I ended up providing an explicit mechanism to request interruption of >> whatever the library is doing, and experimented (successfully so far) >> requesting the interruption from a SIGINT handler. > >> Do you see any major drawback in doing so ? > > This seems a bit fragile. It might work all right in Postgres, where > we tend to set up signal handlers just once at process start, but ISTM > other systems might assume they can change their signal handlers at > any time. The handler itself looks less than portable anyway --- > what about the SIGINFO case? > > I assume that the geos::util::Interrupt::request() call sets a flag > somewhere that's going to be periodically checked in long-running > loops. Would it be possible for the periodic checks to include a > provision for a callback into Postgres-specific glue code, wherein > you could test the same flags CHECK_FOR_INTERRUPTS does? A similar > approach might then be usable in other contexts, and it seems safer > to me than messing with a host environment's signal handling. Can we do that as a macro, e.g. POSTGRES_LIBRARY_INTERRUPT_CHECK() Otherwise the fix to this problem may be worse - faulty interrupt handlers are worse than none at all. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] FailedAssertion("!(PrivateRefCount[i] == 0)", File: "bufmgr.c", Line: 1741
pg 9.2 git master AMD 8120 (8-core) / 6 GB memory / Centos 6.2 I have experimented a bit with dropping a table from master, then querying that table from a sync-rep slave. It is a little worrying that this, the first test I tried, crashes the slave. There are two instance on one machine, head1 (=master) and head2 (=sync-rep slave). First, I generated a tab-separated file, a one off, to be used in the test: echo " copy ( select repeat('X',20)as c1 , repeat('X',20)as c2 , repeat('X',20)as c3 , repeat('X',20)as c4 , repeat('X',20)as c5 from generate_series(1, 20) ) to stdout csv delimiter E'\t'; " | $HOME/pg_stuff/pg_installations/pgsql.head1/bin/psql -p 6564 -d testdb > dropload_copy.txt That txt file is zipped, and the actual test consists of a bash while loop which 1. drops the table 2. loads the file into the table 3. Either: a. nothing b. does a select count(*) on the table So, it repeats the following: zcat dropload_copy.txt.gz \ | grep -v '^#' \ | $HOME/pg_stuff/pg_installations/pgsql.head1/bin/psql -p 6564 -d testdb -c " drop table if exists t; create table t ( c1 text, c2 text, c3 text, c4 text, c5 text ); copy t from stdin csv delimiter E'\t'; analyze t;"; PAUSE_DURATION=0 PSQL=$HOME/pg_stuff/pg_installations/pgsql.head1/bin/psql if [[ 0 -eq 1 ]]; # ON-OFF switch then echo "sleep $PAUSE_DURATION" sleep $PAUSE_DURATION; ( echo "select current_setting('port') port, count(*) from $schema.$table" | $PSQL -qtXp 6564 -d testdb # master echo "select current_setting('port') port, count(*) from $schema.$table" | $PSQL -qtXp 6565 -d testdb # wal_receiver_01 #echo "select current_setting('port') port, count(*) from $schema.$table" | $PSQL -qtXp 6566 -d testdb # wal_receiver_02 ) | grep -v '^$' fi This runs fine for hours on end, as long as the ON-OFF switch is disabled. But when that if-block is added the client crashes after a while (sometimes almost immediately; it never survives longer then 20 minutes): 2012-05-26 10:44:22.617 CEST 10274 ERROR: could not fsync file "base/21268/32807": No such file or directory 2012-05-26 10:44:28.465 CEST 10274 ERROR: could not fsync file "base/21268/32867": No such file or directory 2012-05-26 10:44:28.587 CEST 10270 FATAL: could not open file "base/21268/32994": No such file or directory 2012-05-26 10:44:28.588 CEST 10270 CONTEXT: writing block 2508 of relation base/21268/32994 xlog redo multi-insert (init): rel 1663/21268/33006; blk 3117; 58 tuples TRAP: FailedAssertion("!(PrivateRefCount[i] == 0)", File: "bufmgr.c", Line: 1741) 2012-05-26 10:44:31.131 CEST 10269 LOG: startup process (PID 10270) was terminated by signal 6: Aborted 2012-05-26 10:44:31.131 CEST 10269 LOG: terminating any other active server processes Crazy scenario , I'll admit, but surely this shouldn't be able to crash the client? I attach the logfiles of master(=head1) and slave (=head2). It show how the above ran for an hour without problems (while the ON/OFF switch was disabled), and how the crash came quickly when I switched it on (to add the select count(*) statements). Erik Rijkers logfile.head2 Description: Binary data logfile.head1 Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers