Re: [HACKERS] WIP: explain analyze with 'rows' but not timing
2011/12/23 Tom Lane :
> Pavel Stehule writes:
>> 2011/12/23 Tom Lane :
>>> Tomas Vondra writes:
>>>> The motivation for this patch was that collecting timing data often
>>>> causes performance issues and in some cases it's not needed. But is
>>>> this true for row counts?
>>> Perhaps more to the point, is there a use case for collecting timing
>>> data without row counts? I find it hard to visualize a valid reason.
>> yes - a searching of bad prediction
> No, because timing alone proves nothing at all. The machine could just
> have been overloaded.

Sorry, I didn't understand the question. Using only time is not practical.

Regards
Pavel

> regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Representation of index clause lists
Robert Haas writes:
> On Fri, Dec 23, 2011 at 7:53 PM, Tom Lane wrote:
>> Thoughts either way?
> OidFunctionCall11? How about making a struct out of some or all of
> those arguments and passing that?

Well, that was what I meant by the allusion to IndexPath --- I'm
inclined to pass "root" and "indexpath" as the only input arguments,
with the other inputs being found in the path struct. (Hm, maybe
"outer_rel" would have to remain separate too.) I guess we could also
think about merging the four output parameters into one output struct,
but I'm less excited about that since it would involve inventing a
struct that's not used otherwise.

> What about using arrays rather than Lists?

I think lists are more convenient to work with and iterate over, as
long as you only need iteration and not random access, which is the
case here AFAICS. In particular, if we changed to arrays then we'd
have to reinvent assorted operations like list_union that are
currently used on these lists, and I don't see any bang for the buck
there.

regards, tom lane
Re: [HACKERS] Representation of index clause lists
On Fri, Dec 23, 2011 at 7:53 PM, Tom Lane wrote:
> Thoughts either way?

OidFunctionCall11? How about making a struct out of some or all of
those arguments and passing that?

What about using arrays rather than Lists?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] patch: bytea_agg
Excerpts from Pavel Stehule's message of Fri Dec 23 18:36:11 -0300 2011:
> Hello
>
> 2011/12/23 Tom Lane :
>> I generally agree with Peter: string_agg makes sense here. The only
>> real argument against it is Pavel's point that he didn't include a
>> delimiter parameter, but that just begs the question why not. It
>> seems at least plausible that there would be use-cases for it.
>
> I don't know a real usage for bytea delimiter. Probably there is, but
> I expect the most common use case will be without a delimiter.

Well, sometimes bytea is used to store character strings when the
encoding information is to be handled by the app instead of having
Postgres know it, for various reasons. I haven't seen any of those
cases yet that would use string_agg(bytea), but I wouldn't foreclose
that possibility.

--
Álvaro Herrera
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
[HACKERS] Representation of index clause lists
No sooner had I committed e2c2c2e8b1df7dfdb01e7e6f6191a569ce3c3195 than
I started having second thoughts about the choice of representation.
The requirement is to tell which index column each potential index qual
is meant to be used with. I used a list-of-sublists representation, in
which each sublist corresponds to an index column, because that's what
group_clauses_by_indexkey has historically produced. But on reflection
that seems like a leftover Lisp hack more than it does a natural
choice.

The alternative that comes to mind is to use a flat list of quals and a
parallel integer list of column numbers. Places that need to track
what goes with what would chase both lists in parallel using forboth(),
while places that don't care could just ignore the integer list. This
would end up with a net savings of palloc overhead, because we could
get rid of the assorted calls to flatten_clausegroups_list and
flatten_indexorderbys_list that are currently needed by code that
doesn't want to think about column correspondences. And it just seems
a bit more natural.

The only downside I can think of is that amcostestimate functions would
need two more arguments, bringing us to the point where we need an
OidFunctionCall11 (!), or else do some more drastic restructuring like
passing them an IndexPath explicitly. But the existing commit has
already effectively changed their API, so this doesn't seem like a
showstopper.

Thoughts either way?

regards, tom lane
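The flat-list-plus-parallel-list proposal can be sketched outside of C. This is illustrative Python only, not PostgreSQL source: the real code would iterate with pg_list.h's forboth() macro, and the list names below are invented for the example.

```python
# Hypothetical sketch of the two representations discussed above.

def group_by_column(quals, columns):
    """Recover the per-column grouping when a caller actually needs it."""
    grouped = {}
    for qual, col in zip(quals, columns):  # forboth()-style parallel walk
        grouped.setdefault(col, []).append(qual)
    return grouped

# Old representation: list of sublists, one sublist per index column.
clausegroups = [["a = 1"], [], ["c > 2", "c < 9"]]

# New representation: flat qual list plus parallel column-number list.
indexquals = ["a = 1", "c > 2", "c < 9"]
indexqualcols = [0, 2, 2]

# Code that doesn't care about columns just ignores the second list --
# no flatten_clausegroups_list() call required.
assert indexquals == [q for group in clausegroups for q in group]
assert group_by_column(indexquals, indexqualcols) == \
    {0: ["a = 1"], 2: ["c > 2", "c < 9"]}
```

The point of the sketch is that the flat form already *is* the flattened list, while the grouping is still recoverable by walking the two lists in step.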
Re: [HACKERS] WIP: explain analyze with 'rows' but not timing
On 23.12.2011 22:37, Pavel Stehule wrote:
> 2011/12/23 Tom Lane:
>> Tomas Vondra writes:
>>> The motivation for this patch was that collecting timing data often
>>> causes performance issues and in some cases it's not needed. But is
>>> this true for row counts?
>> Perhaps more to the point, is there a use case for collecting timing
>> data without row counts? I find it hard to visualize a valid reason.
> yes - a searching of bad prediction

But that's the purpose of collecting row counts without timing data.
TL is asking about the opposite case - collecting timing data without
row counts. I can't imagine such a use case ...

Tomas
Re: [HACKERS] WIP: explain analyze with 'rows' but not timing
Pavel Stehule writes:
> 2011/12/23 Tom Lane :
>> Tomas Vondra writes:
>>> The motivation for this patch was that collecting timing data often
>>> causes performance issues and in some cases it's not needed. But is
>>> this true for row counts?
>> Perhaps more to the point, is there a use case for collecting timing
>> data without row counts? I find it hard to visualize a valid reason.
> yes - a searching of bad prediction

No, because timing alone proves nothing at all. The machine could just
have been overloaded.

regards, tom lane
Re: [HACKERS] patch: bytea_agg
Pavel Stehule wrote:
> maybe we can introduce a synonym type for bytea - like "binary
> string" or "bstring".

The standard mentions these names for binary strings:

BINARY, BINARY VARYING, or BINARY LARGE OBJECT

which have a certain symmetry with:

CHARACTER, CHARACTER VARYING, and CHARACTER LARGE OBJECT

-Kevin
Re: [HACKERS] WIP: explain analyze with 'rows' but not timing
2011/12/23 Tom Lane :
> Tomas Vondra writes:
>> The motivation for this patch was that collecting timing data often
>> causes performance issues and in some cases it's not needed. But is
>> this true for row counts?
>
> Perhaps more to the point, is there a use case for collecting timing
> data without row counts? I find it hard to visualize a valid reason.

yes - a searching of bad prediction

Regards
Pavel

> regards, tom lane
Re: [HACKERS] patch: bytea_agg
Hello

2011/12/23 Tom Lane :
> Robert Haas writes:
>> On Fri, Dec 23, 2011 at 2:16 PM, Peter Eisentraut wrote:
>>> On Fri, 2011-12-23 at 13:30 -0500, Robert Haas wrote:
>>>> Well, because it doesn't operate on strings.
>>> Sure, binary strings. Both the SQL standard and the PostgreSQL
>>> documentation use that term.
>> I'm unimpressed by that argument, but let's see what other people
>> think.
>
> I generally agree with Peter: string_agg makes sense here. The only
> real argument against it is Pavel's point that he didn't include a
> delimiter parameter, but that just begs the question why not. It
> seems at least plausible that there would be use-cases for it.

I don't know a real usage for a bytea delimiter. Probably there is,
but I expect the most common use case will be without a delimiter. And
when it is necessary, then || should be used. I see two ways:

a) use bytea_agg as it is now
b) use string_agg with a delimiter, which will usually be empty

Using string_agg for bytea is not too intuitive (but it makes sense) -
maybe we can introduce a synonym type for bytea - like "binary string"
or "bstring".

Regards
Pavel

> So I think we should try to make this as much like the text case as
> possible.
>
> regards, tom lane
Re: [HACKERS] WIP: explain analyze with 'rows' but not timing
Tomas Vondra writes:
> The motivation for this patch was that collecting timing data often
> causes performance issues and in some cases it's not needed. But is
> this true for row counts?

Perhaps more to the point, is there a use case for collecting timing
data without row counts? I find it hard to visualize a valid reason.

regards, tom lane
Re: [HACKERS] patch: bytea_agg
Robert Haas writes:
> On Fri, Dec 23, 2011 at 2:16 PM, Peter Eisentraut wrote:
>> On Fri, 2011-12-23 at 13:30 -0500, Robert Haas wrote:
>>> Well, because it doesn't operate on strings.
>> Sure, binary strings. Both the SQL standard and the PostgreSQL
>> documentation use that term.
> I'm unimpressed by that argument, but let's see what other people
> think.

I generally agree with Peter: string_agg makes sense here. The only
real argument against it is Pavel's point that he didn't include a
delimiter parameter, but that just begs the question why not. It seems
at least plausible that there would be use-cases for it.

So I think we should try to make this as much like the text case as
possible.

regards, tom lane
Re: [HACKERS] Page Checksums + Double Writes
Jeff Janes writes:
> I had a perhaps crazier idea. Aren't CLOG pages older than global
> xmin effectively read only? Could backends that need these bypass
> locking and shared memory altogether?

Hmm ... once they've been written out from the SLRU arena, yes. In
fact you don't need to go back as far as global xmin --- *any* valid
xmin is a sufficient boundary point. The only real problem is to know
whether the data's been written out from the shared area yet.

This idea has potential. I like it better than Robert's, mainly
because I do not want to see us put something in place that would lead
people to try to avoid rollbacks.

regards, tom lane
Re: [HACKERS] Page Checksums + Double Writes
Jeff Janes wrote:
> Could we get some major OLTP users to post their CLOG for
> analysis? I wouldn't think there would be much
> security/proprietary issues with CLOG data.

FWIW, I got the raw numbers to do my quick check using this Ruby script
(put together for me by Peter Brant). If it is of any use to anyone
else, feel free to use it and/or post any enhanced versions of it.

#!/usr/bin/env ruby

Dir.glob("*") do |file_name|
  contents = File.read(file_name)
  total = contents.enum_for(:each_byte).enum_for(:each_slice, 256).inject(0) do |count, chunk|
    if chunk.all? { |b| b == 0x55 }
      count + 1
    else
      count
    end
  end
  printf "%s %d\n", file_name, total
end

-Kevin
Re: [HACKERS] Page Checksums + Double Writes
Tom Lane wrote:
> Robert Haas writes:
>> An obvious problem is that, if the abort rate is significantly
>> different from zero, and especially if the aborts are randomly
>> mixed in with commits rather than clustered together in small
>> portions of the XID space, the CLOG rollup data would become
>> useless.
>
> Yeah, I'm afraid that with N large enough to provide useful
> acceleration, the cases where you'd actually get a win would be
> too thin on the ground to make it worth the trouble.

Just to get a real-life data point, I checked the pg_clog directory for
the Milwaukee County Circuit Courts. They have about 300 OLTP users,
plus replication feeds to the central servers. Looking at the files
present now, there are 19,104 blocks of 256 bytes (which should support
N of 1024, per Robert's example). Of those, 12,644 (just over 66%)
contain 256 bytes of hex 55. "Last modified" dates on the files go
back to the 4th of October, so this represents roughly three months
worth of real-life transactions.

-Kevin
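For readers wondering why hex 55 is the interesting value: as I understand the CLOG layout, pg_clog stores two status bits per transaction and 0b01 marks a commit, so a byte covering four committed xids is 0b01010101 = 0x55. A rough Python illustration (the constant name follows clog.h in the PostgreSQL sources, but verify against them):

```python
# Sketch of why a CLOG byte of 0x55 means "four committed transactions".
TRANSACTION_STATUS_COMMITTED = 0b01  # two status bits per xid in CLOG

def clog_byte(statuses):
    """Pack four 2-bit transaction statuses into one CLOG-style byte."""
    b = 0
    for i, status in enumerate(statuses):
        b |= status << (2 * i)
    return b

all_committed = clog_byte([TRANSACTION_STATUS_COMMITTED] * 4)
assert all_committed == 0x55

# So a 256-byte chunk that is entirely 0x55 covers 256 * 4 = 1024
# consecutive committed xids -- the condition Kevin's script counts,
# and exactly the N = 1024 group size in Robert's rollup example.
assert 256 * 4 == 1024
```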
Re: [HACKERS] Page Checksums + Double Writes
On 12/23/11, Robert Haas wrote:
> On Fri, Dec 23, 2011 at 11:14 AM, Kevin Grittner wrote:
>> Thoughts?
>
> Those are good thoughts.
>
> Here's another random idea, which might be completely nuts. Maybe we
> could consider some kind of summarization of CLOG data, based on the
> idea that most transactions commit.

I had a perhaps crazier idea. Aren't CLOG pages older than global xmin
effectively read only? Could backends that need these bypass locking
and shared memory altogether?

> An obvious problem is that, if the abort rate is significantly
> different from zero, and especially if the aborts are randomly mixed
> in with commits rather than clustered together in small portions of
> the XID space, the CLOG rollup data would become useless. On the
> other hand, if you're doing 10k tps, you only need to have a window
> of a tenth of a second or so where everything commits in order to
> start getting some benefit, which doesn't seem like a stretch.

Could we get some major OLTP users to post their CLOG for analysis? I
wouldn't think there would be many security/proprietary issues with
CLOG data.

Cheers,
Jeff
Re: [HACKERS] patch: bytea_agg
Robert Haas wrote:
> Peter Eisentraut wrote:
>> Robert Haas wrote:
>>> Peter Eisentraut wrote:
>>>> On Wed, 2011-12-21 at 11:04 +0100, Pavel Stehule wrote:
>>>>> this patch adds a bytea_agg aggregation.
>>>>>
>>>>> It allows fast bytea concatenation.
>>>> Why not call it string_agg? All the function names are the same
>>>> between text and bytea (e.g., ||, substr, position, length). It
>>>> would be nice not to introduce arbitrary differences.
>>> Well, because it doesn't operate on strings.
>> Sure, binary strings. Both the SQL standard and the PostgreSQL
>> documentation use that term.
> I'm unimpressed by that argument, but let's see what other people
> think.

I, for one, try to be consistent about saying "character strings" when
that is what I mean. Since at least the SQL-92 standard there have
been both "character strings" and "bit strings", with a certain amount
of symmetry in how they are handled. I don't remember when binary
strings were introduced, but that is the standard terminology. There
is, for example, a standard substring function for binary strings.

-Kevin
Re: [HACKERS] patch: bytea_agg
On Fri, Dec 23, 2011 at 2:16 PM, Peter Eisentraut wrote:
> On Fri, 2011-12-23 at 13:30 -0500, Robert Haas wrote:
>> On Fri, Dec 23, 2011 at 12:51 PM, Peter Eisentraut wrote:
>>> On Wed, 2011-12-21 at 11:04 +0100, Pavel Stehule wrote:
>>>> this patch adds a bytea_agg aggregation.
>>>>
>>>> It allows fast bytea concatenation.
>>>
>>> Why not call it string_agg? All the function names are the same
>>> between text and bytea (e.g., ||, substr, position, length). It
>>> would be nice not to introduce arbitrary differences.
>>
>> Well, because it doesn't operate on strings.
>
> Sure, binary strings. Both the SQL standard and the PostgreSQL
> documentation use that term.

I'm unimpressed by that argument, but let's see what other people
think.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] patch: bytea_agg
On Fri, 2011-12-23 at 13:30 -0500, Robert Haas wrote:
> On Fri, Dec 23, 2011 at 12:51 PM, Peter Eisentraut wrote:
>> On Wed, 2011-12-21 at 11:04 +0100, Pavel Stehule wrote:
>>> this patch adds a bytea_agg aggregation.
>>>
>>> It allows fast bytea concatenation.
>>
>> Why not call it string_agg? All the function names are the same
>> between text and bytea (e.g., ||, substr, position, length). It
>> would be nice not to introduce arbitrary differences.
>
> Well, because it doesn't operate on strings.

Sure, binary strings. Both the SQL standard and the PostgreSQL
documentation use that term.
Re: [HACKERS] Another hstore_type idea
On Fri, Dec 23, 2011 at 7:06 AM, Marc Mamin wrote:
> after reading the thread on "Typed hstore proposal", I wonder if
> another like-minded extension of hstore would be beneficial:
>
> add additional hstore types with numerical data type for the values.

I would expect the primary *performance* value in an "hstore extension"
to come from things that allow accessing data without needing to
"unbox" it. (I remember the concept of unboxing from APL; it seems to
have been subsumed by object oriented terminology...
http://en.wikipedia.org/wiki/Object_type_%28object-oriented_programming%29#Unboxing)

The big "win" comes not as much from type matching (which seems to me
like a morass, as you'll need the zillion Postgres types to cover all
the cases) as it comes from avoiding the need to take the "blobs" of
tuple data and re-parse them.

--
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"
Re: [HACKERS] WIP: explain analyze with 'rows' but not timing
On 23.12.2011 16:14, Tom Lane wrote:
> Tomas Vondra writes:
>> One thing I'm wondering about is that the InstrumentOptions are not
>> exclusive - INSTRUMENT_TIMER means 'collect timing and row counts'
>> while INSTRUMENT_ROWS means 'collect row counts'. Wouldn't it be
>> better to redefine INSTRUMENT_TIMER so that it collects just timing
>> info? I.e. to get the current behaviour, you'd have to do this
>>
>>    instrument_options |= (INSTRUMENT_TIMER | INSTRUMENT_ROWS)
>>
>> It's quite a trivial change in explain.c; the problem I have with it
>> is that it might break extensions.
>
> I'm not especially concerned by that angle --- we make bigger API
> changes all the time. But you could change the name, eg
>
>    instrument_options |= (INSTRUMENT_TIMING | INSTRUMENT_ROWS)
>
> and then #define INSTRUMENT_TIMER as the OR of the two real bits
> for backward compatibility.

OK, that seems like a good solution. But is it worth the additional
complexity in explain.c?

The motivation for this patch was that collecting timing data often
causes performance issues and in some cases it's not needed. But is
this true for row counts? Are there machines where collecting row
counts is above the noise level? I've never seen that, but that's not
a proof of nonexistence. If the overhead of this is negligible, then I
could just hide the row counts from the output.

Tomas
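Tom's suggested renaming amounts to treating the options as independent bits with a backward-compatible alias. A hypothetical sketch in Python (the real definitions would live in instrument.h; the bit values here are invented for illustration):

```python
# Independent instrumentation bits, plus a compatibility alias.
INSTRUMENT_TIMING = 1 << 0   # collect timing data only
INSTRUMENT_ROWS   = 1 << 1   # collect row counts only

# Old INSTRUMENT_TIMER meant "timing and row counts", so keep it as
# the OR of the two real bits -- existing extensions keep working.
INSTRUMENT_TIMER = INSTRUMENT_TIMING | INSTRUMENT_ROWS

options = INSTRUMENT_TIMER
assert options & INSTRUMENT_TIMING and options & INSTRUMENT_ROWS

# Row counts without timing -- the new case the patch enables:
options = INSTRUMENT_ROWS
assert not (options & INSTRUMENT_TIMING)
```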
Re: [HACKERS] Page Checksums + Double Writes
On Fri, Dec 23, 2011 at 12:42 PM, Tom Lane wrote:
> Robert Haas writes:
>> An obvious problem is that, if the abort rate is significantly
>> different from zero, and especially if the aborts are randomly mixed
>> in with commits rather than clustered together in small portions of
>> the XID space, the CLOG rollup data would become useless.
>
> Yeah, I'm afraid that with N large enough to provide useful
> acceleration, the cases where you'd actually get a win would be too
> thin on the ground to make it worth the trouble.

Well, I don't know: something like pgbench is certainly going to
benefit, because all the transactions commit. I suspect that's true
for many benchmarks. Whether it's true of real-life workloads is more
arguable, of course, but if the benchmarks aren't measuring things that
people really do with the database, then why are they designed the way
they are? I've certainly written applications that relied on the
database for integrity checking, so rollbacks were an expected
occurrence, but then again those were very low-velocity systems where
there wasn't going to be enough CLOG contention to matter anyway.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] patch: bytea_agg
On Fri, Dec 23, 2011 at 12:51 PM, Peter Eisentraut wrote:
> On Wed, 2011-12-21 at 11:04 +0100, Pavel Stehule wrote:
>> this patch adds a bytea_agg aggregation.
>>
>> It allows fast bytea concatenation.
>
> Why not call it string_agg? All the function names are the same
> between text and bytea (e.g., ||, substr, position, length). It
> would be nice not to introduce arbitrary differences.

Well, because it doesn't operate on strings. I argued when we added
string_agg that it ought to be called concat_agg, or something like
that, but I got shouted down. So now here we are.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] patch: bytea_agg
Hello

2011/12/23 Peter Eisentraut :
> On Wed, 2011-12-21 at 11:04 +0100, Pavel Stehule wrote:
>> this patch adds a bytea_agg aggregation.
>>
>> It allows fast bytea concatenation.
>
> Why not call it string_agg? All the function names are the same
> between text and bytea (e.g., ||, substr, position, length). It
> would be nice not to introduce arbitrary differences.

My opinion is not too strong, but I don't think using string_agg is a
good name (for bytea_agg) - the minimal (and only) reason being the
different API - there is no support for a delimiter. If I remember
well the discussion about string_agg, where the delimiter was
problematic, there was a request for an immutable interface for
aggregates - there was an issue with the ORDER clause. So bytea_agg is
a good name.

Regards
Pavel
Re: [HACKERS] patch: bytea_agg
On Wed, 2011-12-21 at 11:04 +0100, Pavel Stehule wrote:
> this patch adds a bytea_agg aggregation.
>
> It allows fast bytea concatenation.

Why not call it string_agg? All the function names are the same between
text and bytea (e.g., ||, substr, position, length). It would be nice
not to introduce arbitrary differences.
Re: [HACKERS] Page Checksums + Double Writes
Robert Haas writes:
> An obvious problem is that, if the abort rate is significantly
> different from zero, and especially if the aborts are randomly mixed
> in with commits rather than clustered together in small portions of
> the XID space, the CLOG rollup data would become useless.

Yeah, I'm afraid that with N large enough to provide useful
acceleration, the cases where you'd actually get a win would be too
thin on the ground to make it worth the trouble.

regards, tom lane
Re: [HACKERS] Page Checksums + Double Writes
On Fri, Dec 23, 2011 at 11:14 AM, Kevin Grittner wrote:
> Thoughts?

Those are good thoughts.

Here's another random idea, which might be completely nuts. Maybe we
could consider some kind of summarization of CLOG data, based on the
idea that most transactions commit.

We introduce the idea of a CLOG rollup page. On a CLOG rollup page,
each bit represents the status of N consecutive XIDs. If the bit is
set, that means all XIDs in that group are known to have committed. If
it's clear, then we don't know, and must fall through to a regular CLOG
lookup. If you let N = 1024, then 8K of CLOG rollup data is enough to
represent the status of 64 million transactions, which means that just
a couple of pages could cover as much of the XID space as you probably
need to care about. Also, you would need to replace CLOG summary pages
in memory only very infrequently.

Backends could test the bit without any lock. If it's set, they do
pg_read_barrier(), and then check the buffer label to make sure it's
still the summary page they were expecting. If so, no CLOG lookup is
needed. If the page has changed under us or the bit is clear, then we
fall through to a regular CLOG lookup.

An obvious problem is that, if the abort rate is significantly
different from zero, and especially if the aborts are randomly mixed in
with commits rather than clustered together in small portions of the
XID space, the CLOG rollup data would become useless. On the other
hand, if you're doing 10k tps, you only need to have a window of a
tenth of a second or so where everything commits in order to start
getting some benefit, which doesn't seem like a stretch.

Perhaps the CLOG rollup data wouldn't even need to be kept on disk. We
could simply have bgwriter (or bghinter) set the rollup bits in shared
memory for new transactions, as it becomes possible to do so, and let
lookups for XIDs prior to the last shutdown fall through to CLOG. Or,
if that's not appealing, we could reconstruct the data in memory by
groveling through the CLOG pages - or maybe just set summary bits only
for CLOG pages that actually get faulted in.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
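The rollup scheme above can be mocked up in a few lines. This is illustrative Python only - the real thing would be lockless C over shared memory, and the function names and structures here are invented for the sketch:

```python
# Hypothetical sketch of a CLOG rollup: one summary bit per group of N
# xids, set only when every xid in the group is known to have committed.
N = 1024

def build_rollup(committed_xids, max_xid):
    """Return the set of group numbers whose xids all committed."""
    rollup = set()
    for group in range(max_xid // N):
        if all(group * N + i in committed_xids for i in range(N)):
            rollup.add(group)
    return rollup

def did_commit(xid, rollup, slow_clog_lookup):
    if (xid // N) in rollup:         # fast path: no CLOG access needed
        return True
    return slow_clog_lookup(xid)     # fall through to a regular lookup

committed = set(range(2 * N))        # xids 0 .. 2047 all committed
rollup = build_rollup(committed, 4 * N)
assert rollup == {0, 1}
assert did_commit(5, rollup, lambda xid: False)           # fast path
assert not did_commit(3 * N, rollup, lambda xid: xid in committed)

# The arithmetic in the message: 8 KB of rollup bits at N = 1024 covers
# 8192 * 8 * 1024 xids, i.e. about 64 million.
assert 8192 * 8 * N == 67108864
```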
Re: [HACKERS] WIP: explain analyze with 'rows' but not timing
Excerpts from Tomas Vondra's message of Fri Dec 23 11:45:40 -0300 2011:
> One thing I'm wondering about is that the InstrumentOptions are not
> exclusive - INSTRUMENT_TIMER means 'collect timing and row counts'
> while INSTRUMENT_ROWS means 'collect row counts'. Wouldn't it be
> better to redefine INSTRUMENT_TIMER so that it collects just timing
> info? I.e. to get the current behaviour, you'd have to do this
>
>    instrument_options |= (INSTRUMENT_TIMER | INSTRUMENT_ROWS)
>
> It's quite a trivial change in explain.c; the problem I have with it
> is that it might break extensions.

No, maybe add INSTRUMENT_TIMER_ONLY and then define INSTRUMENT_TIMER
as setting both. That way you don't break compatibility.

--
Álvaro Herrera
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] xlog location arithmetic
On Fri, Dec 23, 2011 at 10:59 AM, Tom Lane wrote:
> Robert Haas writes:
>> On Fri, Dec 23, 2011 at 10:18 AM, Tom Lane wrote:
>>> Even if there are several, what exact advantage does a datatype
>>> offer over representing LSN values as numerics? It seems to me to
>>> be adding complication and extra code (I/O converters at least) for
>>> very little gain.
>
>> I guess I'm just constitutionally averse to labeling things as
>> "text" when they really aren't.
>
> Er ... text? I thought the proposal was to use numeric.

The proposal is to make a function that takes a text argument (which is
really an LSN, but we choose to represent it as text) and returns
numeric.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
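For concreteness, the arithmetic such a function would encapsulate looks roughly like this. A hedged sketch in Python, assuming the simple convention where the xlogid half counts whole 2^32-byte units; historical servers skipped the last 16 MB segment of each log file, which is exactly the wrinkle that makes a server-side function attractive rather than doing this by hand:

```python
# Convert an LSN given as text "XLOGID/XRECOFF" into one number, so
# that byte differences become plain subtraction. Sketch only -- it
# ignores the skipped-segment quirk of older server versions.
def lsn_to_int(lsn_text):
    hi, lo = lsn_text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)

assert lsn_to_int("0/0") == 0
assert lsn_to_int("C4/6A0FE8") == (0xC4 << 32) + 0x6A0FE8

# A byte difference between two xlog locations:
assert lsn_to_int("1/10") - lsn_to_int("0/FFFFFFF0") == 0x20
```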
Re: [HACKERS] Page Checksums + Double Writes
"Kevin Grittner" wrote: >> I would suggest you examine how to have an array of N bgwriters, >> then just slot the code for hinting into the bgwriter. That way a >> bgwriter can set hints, calc CRC and write pages in sequence on a >> particular block. The hinting needs to be synchronised with the >> writing to give good benefit. > > I'll think about that. I see pros and cons, and I'll have to see > how those balance out after I mull them over. I think maybe the best solution is to create some common code to use from both. The problem with *just* doing it in bgwriter is that it would not help much with workloads like Robert has been using for most of his performance testing -- a database which fits entirely in shared buffers and starts thrashing on CLOG. For a background hinter process my goal would be to deal with xids as they are passed by the global xmin value, so that you have a cheap way to know that they are ripe for hinting, and you can frequently hint a bunch of transactions that are all in the same CLOG page which is recent enough to likely be already loaded. Now, a background hinter isn't going to be a net win if it has to grovel through every tuple on every dirty page every time it sweeps through the buffers, so the idea depends on having a sufficiently efficient was to identify interesting buffers. I'm hoping to improve on this, but my best idea so far is to add a field to the buffer header for "earliest unhinted xid" for the page. Whenever this background process wakes up and is scanning through the buffers (probably just in buffer number order), it does a quick check, without any pin or lock, to see if the buffer is dirty and the earliest unhinted xid is below the global xmin. If it passes both of those tests, there is definitely useful work which can be done if the page doesn't get evicted before we can do it. We pin the page, recheck those conditions, and then we look at each tuple and hint where possible. 
As we go, we remember the earliest xid that we see which is *not* being hinted, to store back into the buffer header when we're done. Of course, we would also update the buffer header for new tuples or when an xmax is set if the xid involved precedes what we have in the buffer header. This would not only help avoid multiple page writes as unhinted tuples on the page are read, it would minimize thrashing on CLOG and move some of the hinting work from the critical path of reading a tuple into a background process. Thoughts? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
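To make the two-level filter concrete, here is a toy simulation of the sweep Kevin describes. The structures and field names (`earliest_unhinted_xid` and so on) are hypothetical stand-ins for illustration, not actual PostgreSQL code:

```python
from dataclasses import dataclass, field

@dataclass
class Buffer:
    dirty: bool
    earliest_unhinted_xid: int          # hypothetical per-buffer header field
    tuple_xids: list = field(default_factory=list)  # unhinted xids on the page

def sweep(buffers, global_xmin):
    """One pass of the background hinter: a cheap unlocked pre-check,
    then per-tuple hinting for buffers that pass both tests."""
    hinted = 0
    for buf in buffers:
        # Quick check without pin/lock: skip clean buffers and buffers
        # whose oldest unhinted xid has not yet passed the global xmin.
        if not buf.dirty or buf.earliest_unhinted_xid >= global_xmin:
            continue
        # (The real design would now pin the page and recheck the conditions.)
        remaining = []
        for xid in buf.tuple_xids:
            if xid < global_xmin:
                hinted += 1          # safe to set the hint bit
            else:
                remaining.append(xid)
        buf.tuple_xids = remaining
        # Remember the earliest xid we could NOT hint yet, and store it
        # back into the (hypothetical) buffer header field.
        buf.earliest_unhinted_xid = min(remaining, default=2**32)
    return hinted
```

The point of the design is visible in the pre-check: most buffers are rejected with two comparisons and no locking, so the sweep stays cheap even when few pages are ripe.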
Re: [HACKERS] xlog location arithmetic
Robert Haas writes: > On Fri, Dec 23, 2011 at 10:18 AM, Tom Lane wrote: >> Even if there are several, what exact advantage does a datatype offer >> over representing LSN values as numerics? It seems to me to be adding >> complication and extra code (I/O converters at least) for very little >> gain. > I guess I'm just constitutionally averse to labeling things as "text" > when they really aren't. Er ... text? I thought the proposal was to use numeric. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] xlog location arithmetic
On Fri, Dec 23, 2011 at 10:18 AM, Tom Lane wrote: > Robert Haas writes: >> On Fri, Dec 23, 2011 at 10:05 AM, Tom Lane wrote: >>> I too think a datatype is overkill, if we're only planning on providing >>> one function. > >> Are there any other functions we ought to provide? > > Even if there are several, what exact advantage does a datatype offer > over representing LSN values as numerics? It seems to me to be adding > complication and extra code (I/O converters at least) for very little > gain. I guess I'm just constitutionally averse to labeling things as "text" when they really aren't. I do it all the time in Perl, of course, but in PostgreSQL we have strong data typing, and it seems like we might as well use it. Also, we've occasionally talked (in the light of Pavan's single-pass vacuum patch, for example) about needing to store LSNs in system catalogs; and we're certainly not going to want to do that as text. I'll admit that it's not 100% clear that anything like this will ever happen, though, so maybe it's premature to worry about it. I can see I'm in the minority on this one, though... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review: Non-inheritable check constraints
On Thu, Dec 22, 2011 at 10:54 PM, Alvaro Herrera wrote: > I agree with Robert that this usage of ALTER TABLE ONLY is slightly > different from other usages of the same command, but I disagree that > this means that we need another command to do what we want to do here. > IOW, I prefer to keep the syntax we have. Another disadvantage of the current syntax becomes evident when you look at the pg_dump output. If you pg_dump a regular constraint, the constraint gets added as part of the table definition, and the rows are all checked as they are inserted. If you pg_dump an ONLY constraint, the constraint gets added after loading the data, requiring an additional full-table scan to validate it. >> > I am tempted to say we should revert this and rethink. I don't >> > believe we are only a small patch away from finding all the bugs here. >> >> Sure, if we all think that CREATE TABLE should support ONLY CONSTRAINT type >> of syntax, then +1 for reverting this and a subsequent revised submission. > > I don't think this is a given ... In fact, IMO if we're only two or > three fixes away from having it all nice and consistent, I think > reverting is not necessary. Sure. It's the "if" part of that sentence that I'm not too sure about. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] xlog location arithmetic
Robert Haas writes: > On Fri, Dec 23, 2011 at 10:05 AM, Tom Lane wrote: >> I too think a datatype is overkill, if we're only planning on providing >> one function. > Are there any other functions we ought to provide? Even if there are several, what exact advantage does a datatype offer over representing LSN values as numerics? It seems to me to be adding complication and extra code (I/O converters at least) for very little gain. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: explain analyze with 'rows' but not timing
Tomas Vondra writes: > One thing I'm wondering about is that the InstrumentOptions are not > exclusive - INSTRUMENT_TIMER means 'collect timing and row counts' while > INSTRUMENT_ROWS means 'collect row counts'. Wouldn't it be better to > redefine the INSTRUMENT_TIMER so that it collects just timing info. I.e. > to get the current behaviour, you'd have to do this >instrument_options |= (INSTRUMENT_TIMER | INSTRUMENT_ROWS) > It's quite trivial change in explain.c, the problem I have with that is > that it might break extensions. I'm not especially concerned by that angle --- we make bigger API changes all the time. But you could change the name, eg instrument_options |= (INSTRUMENT_TIMING | INSTRUMENT_ROWS) and then #define INSTRUMENT_TIMER as the OR of the two real bits for backward compatibility. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
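Tom's scheme — two independent bits plus a compatibility alias — would look roughly like this in outline (a sketch of the idea only, not the actual contents of instrument.h):

```python
# Hypothetical flag values; only the OR-relationship between them matters.
INSTRUMENT_TIMING  = 1 << 0   # collect timing data only
INSTRUMENT_ROWS    = 1 << 1   # collect row counts only
INSTRUMENT_BUFFERS = 1 << 2   # unchanged

# Backward-compatible alias: extensions still passing INSTRUMENT_TIMER
# get the old behaviour, i.e. timing plus row counts.
INSTRUMENT_TIMER = INSTRUMENT_TIMING | INSTRUMENT_ROWS

def collects_timing(instrument_options):
    return bool(instrument_options & INSTRUMENT_TIMING)

def collects_rows(instrument_options):
    return bool(instrument_options & INSTRUMENT_ROWS)
```

Under this layout, an EXPLAIN (ANALYZE, TIMING OFF) would set only INSTRUMENT_ROWS and thereby skip the gettimeofday() calls, while old callers using INSTRUMENT_TIMER are unaffected.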
Re: [HACKERS] xlog location arithmetic
On 12/23/2011 10:05 AM, Tom Lane wrote: > Magnus Hagander writes: >>> At this point, my question is: do we want to support the lsn data type idea or a basic function that implements the difference between LSNs? >> Personally I think a function is enough - it solves the only case that I've actually seen. But a datatype would be a more complete solution, of course - but it seems a bit of an overkill to me. Not really sure which way we should go - I was hoping somebody else would comment as well.. > I too think a datatype is overkill, if we're only planning on providing one function. Just emit the values as numeric and have done. +1. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] xlog location arithmetic
On Fri, Dec 23, 2011 at 10:05 AM, Tom Lane wrote: > I too think a datatype is overkill, if we're only planning on providing > one function. Are there any other functions we ought to provide? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] xlog location arithmetic
Magnus Hagander writes: >> At this point, my question is: do we want to support the lsn data type idea >> or >> a basic function that implements the difference between LSNs? > Personally I think a function is enough - it solves the only case that > I've actually seen. But a datatype would be a more complete solution, > of course - but it seems a bit of an overkill to me. Not really sure > which way we should go - I was hoping somebody else would comment as > well.. I too think a datatype is overkill, if we're only planning on providing one function. Just emit the values as numeric and have done. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: explain analyze with 'rows' but not timing
On 23.12.2011 14:57, Robert Haas wrote: > 2011/12/22 Tomas Vondra : >> The gettimeofday() calls are not exactly cheap in some cases, so why >> pay that price when all you need is the number of rows? > > Fair point. > >> The patch attached does this: >> >> 1) adds INSTRUMENT_ROWS, a new InstrumentOption >> >> - counts rows without timing (no gettimeofday() calls) >> - if you want timing info, use INSTRUMENT_TIMER >> >> 2) adds new option "TIMING" to EXPLAIN, i.e. >> >> EXPLAIN (ANALYZE ON, TIMING ON) SELECT ... >> >> 3) adds auto_explain.log_rows_only (false by default) >> >> - if you set this to 'true', then the instrumentation will just >> count rows, without calling gettimeofday() > > This seems like an unnecessarily confusing interface, because you've > named the auto_explain option differently from the EXPLAIN option and > given it (almost) the opposite sense: timing=off means the same thing > as log_rows_only=on. > > I think the EXPLAIN (TIMING) option is good the way you have it, but > then just have auto_explain.log_timing, with a default value of on. Makes sense. I've updated the patch to reflect this, so the option is called auto_explain.log_timing and is true by default. I'll add the patch to the next commit fest. One thing I'm wondering about is that the InstrumentOptions are not exclusive - INSTRUMENT_TIMER means 'collect timing and row counts' while INSTRUMENT_ROWS means 'collect row counts'. Wouldn't it be better to redefine INSTRUMENT_TIMER so that it collects just timing info? I.e. to get the current behaviour, you'd have to do this instrument_options |= (INSTRUMENT_TIMER | INSTRUMENT_ROWS) It's quite a trivial change in explain.c; the problem I have with that is that it might break extensions.
Tomas

diff --git a/contrib/auto_explain/auto_explain.c b/contrib/auto_explain/auto_explain.c
index b320698..4b52b26 100644
--- a/contrib/auto_explain/auto_explain.c
+++ b/contrib/auto_explain/auto_explain.c
@@ -23,6 +23,7 @@
 static int   auto_explain_log_min_duration = -1; /* msec or -1 */
 static bool auto_explain_log_analyze = false;
 static bool auto_explain_log_verbose = false;
 static bool auto_explain_log_buffers = false;
+static bool auto_explain_log_timing = false;
 static int auto_explain_log_format = EXPLAIN_FORMAT_TEXT;
 static bool auto_explain_log_nested_statements = false;
@@ -133,6 +134,17 @@ _PG_init(void)
                             NULL,
                             NULL);
 
+   DefineCustomBoolVariable("auto_explain.log_timing",
+                            "Collect timing data, not just row counts.",
+                            NULL,
+                            &auto_explain_log_timing,
+                            true,
+                            PGC_SUSET,
+                            0,
+                            NULL,
+                            NULL,
+                            NULL);
+
    EmitWarningsOnPlaceholders("auto_explain");
 
    /* Install hooks. */
@@ -170,7 +182,12 @@ explain_ExecutorStart(QueryDesc *queryDesc, int eflags)
    /* Enable per-node instrumentation iff log_analyze is required. */
    if (auto_explain_log_analyze && (eflags & EXEC_FLAG_EXPLAIN_ONLY) == 0)
    {
-       queryDesc->instrument_options |= INSTRUMENT_TIMER;
+       if (auto_explain_log_timing)
+           queryDesc->instrument_options |= INSTRUMENT_TIMER;
+       else
+           queryDesc->instrument_options |= INSTRUMENT_ROWS;
+
        if (auto_explain_log_buffers)
            queryDesc->instrument_options |= INSTRUMENT_BUFFERS;
    }
diff --git a/doc/src/sgml/ref/explain.sgml b/doc/src/sgml/ref/explain.sgml
index 8a9c9de..4488956 100644
--- a/doc/src/sgml/ref/explain.sgml
+++ b/doc/src/sgml/ref/explain.sgml
@@ -40,6 +40,7 @@ EXPLAIN [ ANALYZE ] [ VERBOSE ] statement
 boolean ]
 COSTS [ boolean ]
 BUFFERS [ boolean ]
+TIMING [ boolean ]
 FORMAT { TEXT | XML | JSON | YAML }
@@ -172,6 +173,20 @@ ROLLBACK;
+    TIMING
+
+
+      Include information on timing for each node. Specifically, include the
+      actual startup time and time spent in the node. This may involve a lot
+      of gettimeofday calls, and on some systems this means significant
+      slowdown of the query.
+      This parameter may only be used when ANALYZE is also
+      enabled. It defaults to TRUE.
+
+
+
+
+    FORMAT
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index e38de5
Re: [HACKERS] WIP: explain analyze with 'rows' but not timing
2011/12/22 Tomas Vondra : > The gettimeofday() calls are not exactly cheap in some cases, so why > pay that price when all you need is the number of rows? Fair point. > The patch attached does this: > > 1) adds INSTRUMENT_ROWS, a new InstrumentOption > > - counts rows without timing (no gettimeofday() calls) > - if you want timing info, use INSTRUMENT_TIMER > > 2) adds new option "TIMING" to EXPLAIN, i.e. > > EXPLAIN (ANALYZE ON, TIMING ON) SELECT ... > > 3) adds auto_explain.log_rows_only (false by default) > > - if you set this to 'true', then the instrumentation will just > count rows, without calling gettimeofday() This seems like an unnecessarily confusing interface, because you've named the auto_explain option differently from the EXPLAIN option and given it (almost) the opposite sense: timing=off means the same thing as log_rows_only=on. I think the EXPLAIN (TIMING) option is good the way you have it, but then just have auto_explain.log_timing, with a default value of on. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch: bytea_agg
On Thu, Dec 22, 2011 at 11:49 AM, Robert Haas wrote: > On Wed, Dec 21, 2011 at 5:04 AM, Pavel Stehule > wrote: >> this patch adds a bytea_agg aggregation. >> >> It allow fast bytea concatetation. > > Looks fine to me. I'll commit this, barring objections. Committed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [v9.2] Fix Leaky View Problem
On Fri, Dec 23, 2011 at 5:56 AM, Kohei KaiGai wrote: > I'd like the regression test on select_view test being committed also > to detect unexpected changed in the future. How about it? Can you resend that as a separate patch? I remember there were some things I didn't like about it, but I don't remember what they were at the moment... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Moving more work outside WALInsertLock
On Fri, Dec 23, 2011 at 3:15 AM, Heikki Linnakangas wrote: > On 23.12.2011 10:13, Heikki Linnakangas wrote: >> So, here's a WIP patch of what I've been working on. > > And here's the patch I forgot to attach.. Fails regression tests for me. I found this in postmaster.log:

PANIC:  could not find WAL buffer for 0/2ECA438
STATEMENT:  ANALYZE onek2;
LOG:  stuck waiting upto 0/300
LOG:  server process (PID 34529) was terminated by signal 6: Aborted
DETAIL:  Failed process was running: ANALYZE onek2;

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Another hstore_type idea
Hello, after reading the thread on "Typed hstore proposal", I wonder if another, similarly minded extension of hstore would be beneficial: add additional hstore types with a numerical data type for the values, e.g.: hstore_float : text -> float This should allow adding some nice aggregation functions on these hstores, e.g.

select distinct_sum(x) from (
  a -> 1
  b -> 5
  a -> 3
  c -> 2
)
=>
  a -> 4
  b -> 5
  c -> 2

I have a small case where I'm doing this with a custom type (text, float), but with poor performance. I guess that such an extension would make sense if it were to bring a significant performance gain compared to the custom type approach. best regards and Merry Christmas, Marc Mamin
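The distinct_sum semantics Marc sketches — summing values per key across a set of (key, value) pairs — are easy to pin down outside the database. This Python model is only an illustration of the intended behaviour, not a proposal for how an hstore_float aggregate would be implemented:

```python
from collections import defaultdict

def distinct_sum(pairs):
    """Sum float values grouped by text key, matching the example:
    {a->1, b->5, a->3, c->2}  =>  {a->4, b->5, c->2}."""
    totals = defaultdict(float)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)
```

A native hstore-based aggregate would presumably win over the custom-composite-type approach exactly because this grouping could be done in one pass over packed key/value data rather than through per-row composite construction.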
Re: [HACKERS] xlog location arithmetic
On Tue, Dec 20, 2011 at 14:08, Euler Taveira de Oliveira wrote: > On 20-12-2011 07:27, Magnus Hagander wrote: >> On Tue, Dec 6, 2011 at 19:06, Robert Haas wrote: >>> On Tue, Dec 6, 2011 at 1:00 PM, Euler Taveira de Oliveira >>> wrote: On 06-12-2011 13:11, Robert Haas wrote: > On Tue, Dec 6, 2011 at 5:14 AM, Magnus Hagander > wrote: >> I've been considering similar things, as you can find in the archives, >> but what I was thinking of was converting the number to just a plain >> bigint, then letting the user apply whatever arithmetic they wanted at the >> SQL level. I never got around to actually coding it, though. It could >> easily be extracted from your patch of course - and I think that's a >> more flexible approach. Is there some advantage to your method that >> I'm missing? > > I went so far as to put together an lsn data type. I didn't actually > get all that far with it, which is why I haven't posted it sooner, but > here's what I came up with. It's missing indexing support and stuff, > but that could be added if people like the approach. It solves this > problem by implementing -(lsn,lsn) => numeric (not int8, that can > overflow since it is not unsigned), which allows an lsn => numeric > conversion by just subtracting '0/0'::lsn. > Interesting approach. I don't want to go that far. If so, you want to change all of those functions that deal with LSNs and add some implicit conversion between text and lsn data types (for backward compatibility). As of int8, I'm >> >> As long as you have the conversion, you don't really need to change >> them, do you? It might be nice in some ways, but this is still a >> pretty internal operation, so I don't see it as critical. >> > For correctness, yes. > > At this point, my question is: do we want to support the lsn data type idea or > a basic function that implements the difference between LSNs? Personally I think a function is enough - it solves the only case that I've actually seen.
But a datatype would be a more complete solution, of course - but it seems a bit of an overkill to me. Not really sure which way we should go - I was hoping somebody else would comment as well.. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
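For context, the arithmetic under discussion is simple once the 'X/Y' text form is decoded: treat the LSN as a 64-bit number with the xlogid in the high 32 bits and the byte offset in the low 32. A sketch of the subtraction (done here in Python purely for illustration; note it ignores the pre-9.3 quirk that the last segment of each log file is skipped, so raw subtraction is not byte-exact across log-file boundaries on real WAL of that era):

```python
def lsn_to_int(lsn_text):
    """Parse an LSN like '0/2ECA438' into a 64-bit integer:
    high 32 bits = xlogid, low 32 bits = byte offset."""
    xlogid, offset = lsn_text.split("/")
    return (int(xlogid, 16) << 32) | int(offset, 16)

def lsn_diff(a, b):
    """Difference between two WAL locations; may be negative,
    which is why Robert's -(lsn,lsn) returns numeric, not int8."""
    return lsn_to_int(a) - lsn_to_int(b)
```

Converting to a plain number by subtracting '0/0', as Robert describes for his lsn type, is just lsn_diff(x, "0/0").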
Re: [HACKERS] reprise: pretty print viewdefs
Robert Haas writes: > I *still* spend a lot of time editing in a 25x80 window. 80 is a good choice whatever the screen size, because it's about the most efficient text width as far as eye movements are concerned: the eye is much better at going top-bottom than left-right. That's also why printed papers still pick shorter columns and website designs often do, too. If it's physically tiring to read your text, very few people will. Now, 25 lines… maybe you should tweak your terminal setups, or consider editing in a more comfortable environment :) Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [v9.2] Fix Leaky View Problem
2011/12/22 Robert Haas : > On Mon, Dec 12, 2011 at 12:00 PM, Kohei KaiGai wrote: >> The "v8.option-2" add checks around examine_simple_variable, and >> prevent to reference statistical data, if Var node tries to reference >> relation with security-barrier attribute. > > I adopted this approach, and committed this. > Thanks for your help and efforts. I'd like the regression test on select_view test being committed also to detect unexpected changed in the future. How about it? Best regards, -- KaiGai Kohei -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Making TEXT NUL-transparent
* Florian Pflug: > On Nov 24, 2011, at 10:54, Florian Weimer wrote: >>> Or is it not only about being able to *store* NULs in a text field? >> >> No, the entire core should be NUL-transparent. > That's unlikely to happen. Yes, with the type input/output functions tied to NUL-terminated strings, that seems indeed unlikely to happen. > A more realistic approach would be to solve this only for UTF-8 > encoded strings by encoding the NUL character not as a single 0 byte, > but as a sequence of non-0 bytes. 0xFF cannot occur in valid UTF-8, so that's one possibility. > Java, for example, seems to use it to serialize Strings (which may contain > NUL characters) to UTF-8. Only internally in the VM. UTF-8 produced by the I/O encoder/decoders produces and consumes NUL bytes. > Should you try to add a new encoding which supports that, you might also > want to allow CESU-8-style encoding of UTF-16 surrogate pairs. This means > that code points representable by UTF-16 surrogate pairs may be encoded by > separately encoding the two surrogate characters in UTF-8. I'm not sure if this is a good idea. The motivation behind CESU-8 is that it sorts byte-encoded strings in the same order as UTF-16, which is a completely separate concern. -- Florian Weimer BFK edv-consulting GmbH http://www.bfk.de/ Kriegsstraße 100 tel: +49-721-96201-1 D-76133 Karlsruhe fax: +49-721-96201-99 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
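The non-0-byte encoding of NUL alluded to in this thread is what Java's "modified UTF-8" does: U+0000 is written as the overlong two-byte sequence 0xC0 0x80, which is invalid in strict UTF-8 and therefore unambiguous. A minimal sketch of the round trip:

```python
def encode_modified_utf8(s):
    """Encode text so the result contains no 0x00 byte:
    NUL becomes the overlong pair C0 80 (Java's serialization trick)."""
    out = bytearray()
    for ch in s:
        if ch == "\x00":
            out += b"\xc0\x80"
        else:
            out += ch.encode("utf-8")
    return bytes(out)

def decode_modified_utf8(b):
    # Safe because C0 80 can never appear in strictly valid UTF-8.
    return b.replace(b"\xc0\x80", b"\x00").decode("utf-8")
```

The appeal for a NUL-terminated-string core is that such byte strings can pass through strlen()-style code untouched, while still round-tripping embedded NULs.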
Re: [HACKERS] Extensions and 9.2
On Wed, Dec 21, 2011 at 5:46 AM, Robert Haas wrote: > Assuming the command in > question can be stuffed inside a function, the most you're gaining is > a little notational convenience. I can answer that one (why a full-blown mechanism for a notational convenience). It has occurred to me to use this mechanism to support extensions, but I find the prospect of having to teach people special operators to understand how to use the standard extension feature an unstomachable prospect. The single biggest problem is that pg_restore will not know to call this function rather than run CREATE EXTENSION, and then two databases, prepared exactly the same, cannot be pg_dump-ed and restored in a reasonable way. If there's a definable whitelist, then this vital functionality will work. There are other sicknesses imposed if one has to hack up how to delegate extension creation to non-superusers:

* The postgres manual, wiki, and ecosystem of recipes on the web and internet at large basically are not going to work without modification. Death by a thousand cuts.

* "\df" in psql displays some new operators that you aren't familiar with. Also, pg_dump/pg_restore-generated SQL that you've put aside will not work, and it will look funny. This is an eyesore.

* If one tells someone else "yeah, the system supports extensions", they're going to type CREATE EXTENSION. And then it's not going to work, and then they're either going to give up, yak shave for a few hours figuring out what they were "supposed" to do for their provider or organization, or maybe worst of all hack their way around functionality the extension could have provided in a cleaner way had it just worked nice and tidy to begin with.

I find this functionality basically vital because it greatly decreases the friction between people, organizations, and software in the domain of deploying, reasoning, and communicating about the installation and status of Postgres extensions in a database.
-- fdr -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Real-life range datasets
Hello, On Thu, Dec 22, 2011 at 12:51 PM, Benedikt Grundmann <bgrundm...@janestreet.com> wrote: > I should be able to give you a table with the same characteristics as > the instruments table but bogus data by replacing all entries in the > table with random strings of the same length or something like that. > I can probably take a little bit of time during this or the next week > to generate such "fake" real world data ;-) Is there an ftp site to > upload the gzipped pg_dump file to? > Thank you very much for your response! I'm going to send you access details for upload soon. - With best regards, Alexander Korotkov.
Re: [HACKERS] Moving more work outside WALInsertLock
On 16.12.2011 15:42, Heikki Linnakangas wrote: On 16.12.2011 15:03, Simon Riggs wrote: On Fri, Dec 16, 2011 at 12:50 PM, Heikki Linnakangas wrote: On 16.12.2011 14:37, Simon Riggs wrote: I already proposed a design for that using page-level share locks any reason not to go with that? Sorry, I must've missed that. Got a link? From nearly 4 years ago. http://grokbase.com/t/postgresql.org/pgsql-hackers/2008/02/reworking-wal-locking/145qrhllcqeqlfzntvn7kjefijey Ah, thanks. That is similar to what I'm experimenting with, but a second lwlock is still fairly heavy-weight. I think with many backends, you will be beaten badly by contention on the spinlocks alone. I'll polish up and post what I've been experimenting with, so we can discuss that. So, here's a WIP patch of what I've been working on. The WAL insertion is split into two stages: 1. Reserve the space from the WAL stream. This is done while holding a spinlock. The page holding the reserved space doesn't necessarily need to be in cache yet, the reservation can run ahead of the WAL buffer cache. (quick testing suggests that a lwlock is too heavy-weight for this) 2. Ensure the page is in the WAL buffer cache. If not, initialize it, evicting old pages if needed. Then finish the CRC calculation of the header and memcpy the record in place. (if the record spans multiple pages, it operates on one page at a time, to avoid problems with running out of WAL buffers) As long as wal_buffers is high enough, and the I/O can keep up, stage 2 can happen in parallel in many backends. The WAL writer process pre-initializes new pages ahead of the insertions, so regular backends rarely need to do that. When a page is written out, with XLogWrite(), you need to wait for any in-progress insertions to the pages you're about to write out to finish. For that, every backend has a slot with an XLogRecPtr in shared memory. It's set to the position where that backend is currently inserting to.
If there's no insertion in-progress, it's invalid, but when it's valid it acts like a barrier, so that no-one is allowed to XLogWrite() beyond that position. That's very lightweight to the backends, but I'm using busy-waiting to wait on an insertion to finish ATM. That should be replaced with something smarter, that's the biggest missing part of the patch. One simple way to test the performance impact of this is:

psql -c "DROP TABLE IF EXISTS foo; CREATE TABLE foo (id int4); CHECKPOINT" postgres
echo "BEGIN; INSERT INTO foo SELECT i FROM generate_series(1, 1) i; ROLLBACK" > parallel-insert-test.sql
pgbench -n -T 10 -c4 -f parallel-insert-test.sql postgres

On my dual-core laptop, this patch increases the tps on that from about 60 to 110. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
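The reserve-then-copy split Heikki outlines can be mimicked with ordinary threads. The class below is purely illustrative — a mutex standing in for the spinlock, a bytearray for the WAL buffer cache — and not the patch's actual data structures:

```python
import threading

class WalStream:
    def __init__(self, size=1 << 20):
        self.reserve_lock = threading.Lock()  # stands in for the spinlock
        self.insert_pos = 0                   # next free byte in the stream
        self.buf = bytearray(size)            # stands in for the WAL buffers

    def insert(self, record: bytes) -> int:
        # Stage 1: reserve space from the WAL stream. Only this tiny
        # bump of insert_pos is serialized.
        with self.reserve_lock:
            start = self.insert_pos
            self.insert_pos += len(record)
        # Stage 2: copy the record into place. Many backends can do this
        # concurrently, each writing only its own reserved region.
        self.buf[start:start + len(record)] = record
        return start
```

The point of the design shows up in stage 2: the memcpy happens outside the lock, so the serialized section shrinks to a pointer bump — which is why contention drops and the pgbench numbers above improve.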