Re: [HACKERS] join removal
reducejoins.c ? flattenjoins.c ? filterjoins.c ?

-- dim

On 28 March 2010, at 22:12, Tom Lane wrote:

> Robert Haas writes:
>> On Sun, Mar 28, 2010 at 2:10 PM, Tom Lane wrote:
>>> joinremoval.c ?
>>
>> Maybe, except as I mentioned in the email linked upthread, my plan for implementing inner join removal would also include allowing join reordering in cases where we currently don't. So I don't want to sandbox it too tightly as join removal, per se, though that's certainly what we have on the table ATM. It's more like advanced open-heart join-tree surgery - like prepjointree, but much later in the process.
>
> Hm. At this point we're not really working with a join *tree* in any case --- the data structure we're mostly concerned with is the list of SpecialJoinInfo structs, and what we're trying to do is weaken the constraints described by that list. So I'd rather stay away from "tree" terminology.
>
> planjoins.c would fit with other names in the plan/ directory but it seems like a misnomer because we're not really "planning" any joins at this stage. adjustjoins.c? loosenjoins.c? weakenjoins.c?
>
> regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch for 9.1: initdb -C option
David Christensen wrote:
> Enclosed is a patch to add a -C option to initdb to allow you to easily append configuration directives to the generated postgresql.conf file for use in programmatic generation.

We had a patch not quite make it for 9.0 that switched over the postgresql.conf file to make it easy to scan a whole directory looking for configuration files: http://archives.postgresql.org/message-id/9837222c0910240641p7d75e2a4u2cfa6c1b5e603...@mail.gmail.com

The idea there was to eventually reduce the amount of postgresql.conf hacking that initdb and other tools have to do. Your patch would add more code into a path that I'd like to see reduced significantly. That implementation would make something easy enough for your use case too (the below is untested but shows the general idea):

$ for cluster in 1 2 3 4 5 6; do
>   initdb -D data$cluster
>   cat <<EOF > data$cluster/conf.d/99clustersetup
> port = 1234$cluster
> max_connections = 10
> shared_buffers = 1M
> EOF
> done

This would actually work just fine for what you're doing right now if you used ">> data$cluster/postgresql.conf" as the redirection target there. There would be duplicates, which I'm guessing is what you wanted to avoid with this patch, but the later values set for the parameters added to the end would win and be the active ones.

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us
Re: [HACKERS] Patch for 9.1: initdb -C option
David Christensen wrote:
> Enclosed is a patch to add a -C option to initdb to allow you to easily append configuration directives to the generated postgresql.conf file

Why don't you use just "echo 'options' >> $PGDATA/postgresql.conf"? Could you explain where the -C option is better than initdb + echo?

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center
[HACKERS] Patch for 9.1: initdb -C option
Hackers,

Enclosed is a patch to add a -C option to initdb to allow you to easily append configuration directives to the generated postgresql.conf file for use in programmatic generation. In my case, I'd been creating multiple db clusters with a script and would have specific overrides that I needed to make. This patch fell out of the desire to make this a little cleaner. Please review and comment.

From the commit message:

This is a simple mechanism to allow you to provide explicit overrides to any GUC at initdb time. As a basic example, consider the case where you are programmatically generating multiple db clusters in order to test various configurations:

$ for cluster in 1 2 3 4 5 6;
> do initdb -D data$cluster -C "port = 1234$cluster" -C 'max_connections = 10' -C shared_buffers=1M;
> done

A possible future improvement would be to provide some basic formatting corrections to allow specifications such as -C 'port 1234', -C port=1234, and -C 'port = 1234' to all be ultimately output as 'port = 1234' in the final output. This would be consistent with postmaster's parsing. The -C flag was chosen to be a mnemonic for "config".

Regards,

David
--
David Christensen
End Point Corporation
da...@endpoint.com

0001-Add-C-option-to-initdb-to-allow-invocation-time-GUC-.patch
Description: Binary data

initdb-dash-C.diff
Description: Binary data
Re: [HACKERS] Proposal: Add JSON support
2010/3/29 Andrew Dunstan :
> Robert Haas wrote:
>> On Sun, Mar 28, 2010 at 4:48 PM, Joseph Adams wrote:
>>> I'm wondering whether the internal representation of JSON should be plain JSON text, or some binary code that's easier to traverse and whatnot. For the sake of code size, just keeping it in text is probably best.
>>
>> +1 for text.
>
> Agreed.

There's another choice, called BSON: http://www.mongodb.org/display/DOCS/BSON
I've not researched it deeply yet, but it seems a reasonable format to store in databases, as it was invented for MongoDB.

>>> Now my thoughts and opinions on the JSON parsing/unparsing itself:
>>>
>>> It should be built-in, rather than relying on an external library (like XML does).
>>
>> Why? I'm not saying you aren't right, but you need to make an argument rather than an assertion. This is a community, so no one is entitled to decide anything unilaterally, and people want to be convinced - including me.
>
> Yeah, why? We should not be in the business of reinventing the wheel (and then maintaining the reinvented wheel), unless the code in question is *really* small.

The many JSON implementations in many languages show that parsing JSON is not so difficult to code, and that needs vary. Hence, I wonder if we could have our very own. Don't take this wrongly - I don't object to the text format, nor to using an external library.

Regards,
--
Hitoshi Harada
Re: [HACKERS] Proposal: Add JSON support
On Sun, Mar 28, 2010 at 11:24 PM, Joseph Adams wrote:
> I apologize; I was just starting the conversation with some of my ideas to receive feedback. I didn't want people to have to wade through too many "I think"s. I'll be sure to use tags in the future :-)

FWIW, I don't care at all whether you say "I think" or "I know"; the point is that you have to provide backup for any position you choose to take.

> My reasoning for "It should be built-in" is:
> * It would be nice to have a built-in serialization format that's available by default.
> * It might be a little faster because it doesn't have to link to an external library.

I don't think either of these reasons is valid.

> * The code to interface between JSON logic and PostgreSQL will probably be much larger than the actual JSON encoding/decoding itself.

If true, this is a good argument.

> * The externally-maintained and packaged libjson implementations I saw brought in lots of dependencies (e.g. glib).

As is this.

> * "Everyone else" (e.g. PHP) uses a statically-linked JSON implementation.

But this isn't.

> Is the code in question "*really*" small? Well, not really, but it's not enormous either. By the way, I found a bug in PHP's JSON_parser (json_decode("true "); /* with a space */ returns null instead of true). I'll have to get around to reporting that.
>
> Now, assuming JSON support is built-in to PostgreSQL and is enabled by default, it is my opinion that encoding issues should not be dealt with in the JSON code itself, but that the JSON code itself should assume UTF-8. I think conversions should be done to/from UTF-8 before passing it through the JSON code because this would likely be the smallest way to implement it (not necessarily the fastest, though).
>
> Mike Rylander pointed out something wonderful, and that is that JSON code can be stored in plain old ASCII using \u... . If a target encoding supports all of Unicode, the JSON serializer could be told not to generate \u escapes. Otherwise, the \u escapes would be necessary.
>
> Thus, here's an example of how (in my opinion) character sets and such should be handled in the JSON code:
>
> Suppose the client's encoding is UTF-16, and the server's encoding is Latin-1. When JSON is stored to the database:
> 1. The client is responsible and sends a valid UTF-16 JSON string.
> 2. PostgreSQL checks to make sure it is valid UTF-16, then converts it to UTF-8.
> 3. The JSON code parses it (to ensure it's valid).
> 4. The JSON code unparses it (to get a representation without needless whitespace). It is given a flag indicating it should only output ASCII text.
> 5. The ASCII is stored in the server, since it is valid Latin-1.
>
> When JSON is retrieved from the database:
> 1. ASCII is retrieved from the server
> 2. If user needs to extract one or more fields, the JSON is parsed, and the fields are extracted.
> 3. Otherwise, the JSON text is converted to UTF-16 and sent to the client.
>
> Note that I am being biased toward optimizing code size rather than speed.

Can you comment on my proposal elsewhere on this thread and compare your proposal to mine? In what ways are they different, and which is better, and why?

> Here's a question about semantics: should converting JSON to text guarantee that Unicode will be \u escaped, or should it render actual Unicode whenever possible (when the client uses a Unicode-complete charset)?

I feel pretty strongly that the data should be stored in the database in the format in which it will be returned to the user - any conversion which is necessary should happen on the way in. I am not 100% sure to what extent we should attempt to canonicalize the input and to what extent we should simply store it in whichever way the user chooses to provide it.

> As for reinventing the wheel, I'm in the process of writing yet another JSON implementation simply because I didn't find the other ones I looked at palatable. I am aiming for simple code, not fast code. I am using malloc for structures and realloc for strings/arrays rather than resorting to clever buffering tricks. Of course, I'll switch it over to palloc/repalloc before migrating it to PostgreSQL.

I'm not sure that optimizing for simplicity over speed is a good idea. I think we can reject implementations as unpalatable because they are slow or feature-poor or have licensing issues or are not actively maintained, but rejecting them because they use complex code in order to be fast doesn't seem like the right trade-off to me.

...Robert
Re: [HACKERS] Proposal: Add JSON support
On Sun, Mar 28, 2010 at 5:19 PM, Robert Haas wrote:
> On Sun, Mar 28, 2010 at 4:48 PM, Joseph Adams wrote:
>> Now my thoughts and opinions on the JSON parsing/unparsing itself:
>>
>> It should be built-in, rather than relying on an external library (like XML does).
>
> Why? I'm not saying you aren't right, but you need to make an argument rather than an assertion. This is a community, so no one is entitled to decide anything unilaterally, and people want to be convinced - including me.

I apologize; I was just starting the conversation with some of my ideas to receive feedback. I didn't want people to have to wade through too many "I think"s. I'll be sure to use tags in the future :-)

My reasoning for "It should be built-in" is:
* It would be nice to have a built-in serialization format that's available by default.
* It might be a little faster because it doesn't have to link to an external library.
* The code to interface between JSON logic and PostgreSQL will probably be much larger than the actual JSON encoding/decoding itself.
* The externally-maintained and packaged libjson implementations I saw brought in lots of dependencies (e.g. glib).
* "Everyone else" (e.g. PHP) uses a statically-linked JSON implementation.

Is the code in question "*really*" small? Well, not really, but it's not enormous either. By the way, I found a bug in PHP's JSON_parser (json_decode("true "); /* with a space */ returns null instead of true). I'll have to get around to reporting that.

Now, assuming JSON support is built-in to PostgreSQL and is enabled by default, it is my opinion that encoding issues should not be dealt with in the JSON code itself, but that the JSON code itself should assume UTF-8. I think conversions should be done to/from UTF-8 before passing it through the JSON code because this would likely be the smallest way to implement it (not necessarily the fastest, though).

Mike Rylander pointed out something wonderful, and that is that JSON code can be stored in plain old ASCII using \u... . If a target encoding supports all of Unicode, the JSON serializer could be told not to generate \u escapes. Otherwise, the \u escapes would be necessary.

Thus, here's an example of how (in my opinion) character sets and such should be handled in the JSON code:

Suppose the client's encoding is UTF-16, and the server's encoding is Latin-1. When JSON is stored to the database:
1. The client is responsible and sends a valid UTF-16 JSON string.
2. PostgreSQL checks to make sure it is valid UTF-16, then converts it to UTF-8.
3. The JSON code parses it (to ensure it's valid).
4. The JSON code unparses it (to get a representation without needless whitespace). It is given a flag indicating it should only output ASCII text.
5. The ASCII is stored in the server, since it is valid Latin-1.

When JSON is retrieved from the database:
1. ASCII is retrieved from the server
2. If user needs to extract one or more fields, the JSON is parsed, and the fields are extracted.
3. Otherwise, the JSON text is converted to UTF-16 and sent to the client.

Note that I am being biased toward optimizing code size rather than speed.

Here's a question about semantics: should converting JSON to text guarantee that Unicode will be \u escaped, or should it render actual Unicode whenever possible (when the client uses a Unicode-complete charset)?

As for reinventing the wheel, I'm in the process of writing yet another JSON implementation simply because I didn't find the other ones I looked at palatable. I am aiming for simple code, not fast code. I am using malloc for structures and realloc for strings/arrays rather than resorting to clever buffering tricks. Of course, I'll switch it over to palloc/repalloc before migrating it to PostgreSQL.
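[Editor's note: steps 3-5 of the storage workflow above can be sketched with an off-the-shelf JSON library. The snippet below uses Python's json module purely as a stand-in for the proposed backend code; the variable names and the use of this library are illustrative assumptions, not part of any patch.]

```python
import json

# Step 3: parse the incoming (already UTF-8-converted) text to validate it.
raw = '{ "name": "café",  "n": 1 }'
parsed = json.loads(raw)

# Step 4: unparse with no needless whitespace, emitting only ASCII;
# ensure_ascii turns every non-ASCII character into a \uXXXX escape.
canon = json.dumps(parsed, ensure_ascii=True, separators=(",", ":"))

# Step 5: canon is pure ASCII, so it can be stored under any server
# encoding, Latin-1 included.
# canon == '{"name":"caf\\u00e9","n":1}'
```

Round-tripping `canon` through the parser recovers the original value, which is what makes the ASCII form safe as the stored representation.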
Re: [HACKERS] GSoC Query
On Sun, Mar 28, 2010 at 10:01 PM, gaurav gupta wrote:
> My idea is to add a functionality of Auto tuning and Auto Indexing/Reindexing in DB languages.
>
> Though I am not working on this I have some idea about implementation. Idea is that on the no. of rows deleted, Inserted in the table we can make our system capable to reindex the table that will save the time of user.

Reindexing is not routine maintenance for PostgreSQL, so this seems fairly pointless.

> Similarly using the no. of select hits on a table we can check that if maximum no. of times it is on a non-index field we can index on that field to make select faster.

Well, a SELECT statement "hits" a whole row, not a single column; but even if you could somehow figure out a way to tally up per-column statistics (and it's certainly not obvious to me how to do such a thing), it doesn't follow that a column which is frequently accessed is a good candidate for indexing.

I don't think this is a good project for a first-time hacker, or something that can realistically be completed in one summer. It sounds more like a PhD project to me. I wrote to another student who is considering submitting a GSOC proposal with some ideas I thought might be suitable. You might want to review that email: http://archives.postgresql.org/pgsql-hackers/2010-03/msg01034.php

...Robert
[HACKERS] GSoC Query
Sir/Ma'am,

I am an M.Tech student and want to participate in GSoC. I have a project idea and want to discuss its feasibility, usability and chance of selection with you.

My idea is to add a functionality of Auto tuning and Auto Indexing/Reindexing in DB languages.

Though I am not working on this I have some idea about implementation. Idea is that on the no. of rows deleted, Inserted in the table we can make our system capable to reindex the table that will save the time of user. Similarly using the no. of select hits on a table we can check that if maximum no. of times it is on a non-index field we can index on that field to make select faster.

I am looking forward to hearing from you.

--
Thanks & Regards,
Gaurav Kumar Gupta
+91-9032844745
Re: [HACKERS] Proposal: Add JSON support
On Sun, Mar 28, 2010 at 8:33 PM, Robert Haas wrote:
> On Sun, Mar 28, 2010 at 8:23 PM, Mike Rylander wrote:
>> In practice, every parser/serializer I've used (including the one I helped write) allows (and, often, forces) any non-ASCII character to be encoded as \u followed by a string of four hex digits.
>
> Is it correct to say that the only feasible place where non-ASCII characters can be used is within string constants?

Yes. That includes object property strings -- they are quoted string literals.

> If so, it might be reasonable to disallow characters with the high-bit set unless the server encoding is one of the flavors of Unicode of which the spec approves. I'm tempted to think that when the server encoding is Unicode we really ought to allow Unicode characters natively, because turning a long string of two-byte wide chars into a long string of six-byte wide chars sounds pretty evil from a performance point of view.

+1

As an aside, \u-encoded (escaped) characters and native multi-byte sequences (of any RFC-allowable Unicode encoding) are exactly equivalent in JSON -- it's a storage and transmission format, and doesn't prescribe the application-internal representation of the data. If it's faster (which it almost certainly is) to not mangle the data when it's all staying server side, that seems like a useful optimization. For output to the client, however, it would be useful to provide a \u-escaping function, which (AIUI) should always be safe regardless of client encoding.

--
Mike Rylander
 | VP, Research and Design
 | Equinox Software, Inc. / The Evergreen Experts
 | phone: 1-877-OPEN-ILS (673-6457)
 | email: mi...@esilibrary.com
 | web: http://www.esilibrary.com
Re: [HACKERS] Proposal: Add JSON support
Andrew Dunstan wrote:
> Robert Haas wrote:
>> On Sun, Mar 28, 2010 at 8:23 PM, Mike Rylander wrote:
>>> In practice, every parser/serializer I've used (including the one I helped write) allows (and, often, forces) any non-ASCII character to be encoded as \u followed by a string of four hex digits.
>>
>> Is it correct to say that the only feasible place where non-ASCII characters can be used is within string constants? If so, it might be reasonable to disallow characters with the high-bit set unless the server encoding is one of the flavors of Unicode of which the spec approves. I'm tempted to think that when the server encoding is Unicode we really ought to allow Unicode characters natively, because turning a long string of two-byte wide chars into a long string of six-byte wide chars sounds pretty evil from a performance point of view.
>
> We support exactly one unicode encoding on the server side: utf8. And the maximum possible size of a validly encoded unicode char in utf8 is 4 (and that's pretty rare, IIRC).

Sorry. Disregard this. I see what you mean. Yeah, I think *requiring* non-ascii characters to be escaped would be evil.

cheers

andrew
Re: [HACKERS] Proposal: Add JSON support
Robert Haas wrote:
> On Sun, Mar 28, 2010 at 8:23 PM, Mike Rylander wrote:
>> In practice, every parser/serializer I've used (including the one I helped write) allows (and, often, forces) any non-ASCII character to be encoded as \u followed by a string of four hex digits.
>
> Is it correct to say that the only feasible place where non-ASCII characters can be used is within string constants? If so, it might be reasonable to disallow characters with the high-bit set unless the server encoding is one of the flavors of Unicode of which the spec approves. I'm tempted to think that when the server encoding is Unicode we really ought to allow Unicode characters natively, because turning a long string of two-byte wide chars into a long string of six-byte wide chars sounds pretty evil from a performance point of view.

We support exactly one unicode encoding on the server side: utf8. And the maximum possible size of a validly encoded unicode char in utf8 is 4 (and that's pretty rare, IIRC).

cheers

andrew
Re: [HACKERS] Proposal: Add JSON support
On Sun, Mar 28, 2010 at 8:23 PM, Mike Rylander wrote:
> In practice, every parser/serializer I've used (including the one I helped write) allows (and, often, forces) any non-ASCII character to be encoded as \u followed by a string of four hex digits.

Is it correct to say that the only feasible place where non-ASCII characters can be used is within string constants? If so, it might be reasonable to disallow characters with the high-bit set unless the server encoding is one of the flavors of Unicode of which the spec approves. I'm tempted to think that when the server encoding is Unicode we really ought to allow Unicode characters natively, because turning a long string of two-byte wide chars into a long string of six-byte wide chars sounds pretty evil from a performance point of view.

...Robert
Re: [HACKERS] Proposal: Add JSON support
On Sun, Mar 28, 2010 at 7:36 PM, Tom Lane wrote:
> Andrew Dunstan writes:
>> Here's another thought. Given that JSON is actually specified to consist of a string of Unicode characters, what will we deliver to the client where the client encoding is, say Latin1? Will it actually be a legal JSON byte stream?
>
> No, it won't. We will *not* be sending anything but latin1 in such a situation, and I really couldn't care less what the JSON spec says about it. Delivering wrongly-encoded data to a client is a good recipe for all sorts of problems, since the client-side code is very unlikely to be expecting that. A datatype doesn't get to make up its own mind whether to obey those rules. Likewise, data on input had better match client_encoding, because it's otherwise going to fail the encoding checks long before a json datatype could have any say in the matter.
>
> While I've not read the spec, I wonder exactly what "consist of a string of Unicode characters" should actually be taken to mean. Perhaps it only means that all the characters must be members of the Unicode set, not that the string can never be represented in any other encoding. There's more than one Unicode encoding anyway...

In practice, every parser/serializer I've used (including the one I helped write) allows (and, often, forces) any non-ASCII character to be encoded as \u followed by a string of four hex digits.

Whether it would be easy inside the backend, when generating JSON from user data stored in tables that are not in a UTF-8 encoded cluster, to convert to UTF-8, that's something else entirely. If it /is/ easy and safe, then it's just a matter of scanning for multi-byte sequences and replacing those with their \u equivalents. I have some simple and fast code I could share, if it's needed, though I suspect it's not. :)

UPDATE: Thanks, Robert, for pointing to the RFC.

--
Mike Rylander
 | VP, Research and Design
 | Equinox Software, Inc. / The Evergreen Experts
 | phone: 1-877-OPEN-ILS (673-6457)
 | email: mi...@esilibrary.com
 | web: http://www.esilibrary.com
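[Editor's note: the scan-and-escape step described above can be sketched as follows. This is a hypothetical illustration, not the code Mike offers to share: each character outside ASCII is replaced by a \uXXXX escape, and characters beyond the Basic Multilingual Plane become a UTF-16 surrogate pair, per RFC 4627 section 2.5. The function name is invented for the example.]

```python
def escape_non_ascii(s: str) -> str:
    # Pass ASCII through untouched; escape everything else so the
    # result is plain ASCII and safe under any client encoding.
    out = []
    for ch in s:
        cp = ord(ch)
        if cp < 0x80:
            out.append(ch)
        elif cp <= 0xFFFF:
            out.append("\\u%04x" % cp)
        else:
            # Characters above U+FFFF are written as a UTF-16
            # surrogate pair, as the JSON RFC requires.
            cp -= 0x10000
            out.append("\\u%04x\\u%04x"
                       % (0xD800 + (cp >> 10), 0xDC00 + (cp & 0x3FF)))
    return "".join(out)

# escape_non_ascii("café") == "caf\\u00e9"
```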
Re: [HACKERS] Proposal: Add JSON support
On Sun, Mar 28, 2010 at 7:36 PM, Tom Lane wrote:
> Andrew Dunstan writes:
>> Here's another thought. Given that JSON is actually specified to consist of a string of Unicode characters, what will we deliver to the client where the client encoding is, say Latin1? Will it actually be a legal JSON byte stream?
>
> No, it won't. We will *not* be sending anything but latin1 in such a situation, and I really couldn't care less what the JSON spec says about it. Delivering wrongly-encoded data to a client is a good recipe for all sorts of problems, since the client-side code is very unlikely to be expecting that. A datatype doesn't get to make up its own mind whether to obey those rules. Likewise, data on input had better match client_encoding, because it's otherwise going to fail the encoding checks long before a json datatype could have any say in the matter.
>
> While I've not read the spec, I wonder exactly what "consist of a string of Unicode characters" should actually be taken to mean. Perhaps it only means that all the characters must be members of the Unicode set, not that the string can never be represented in any other encoding. There's more than one Unicode encoding anyway...

See sections 2.5 and 3 of: http://www.ietf.org/rfc/rfc4627.txt?number=4627

...Robert
Re: [HACKERS] Proposal: Add JSON support
Andrew Dunstan writes:
> Here's another thought. Given that JSON is actually specified to consist of a string of Unicode characters, what will we deliver to the client where the client encoding is, say Latin1? Will it actually be a legal JSON byte stream?

No, it won't. We will *not* be sending anything but latin1 in such a situation, and I really couldn't care less what the JSON spec says about it. Delivering wrongly-encoded data to a client is a good recipe for all sorts of problems, since the client-side code is very unlikely to be expecting that. A datatype doesn't get to make up its own mind whether to obey those rules. Likewise, data on input had better match client_encoding, because it's otherwise going to fail the encoding checks long before a json datatype could have any say in the matter.

While I've not read the spec, I wonder exactly what "consist of a string of Unicode characters" should actually be taken to mean. Perhaps it only means that all the characters must be members of the Unicode set, not that the string can never be represented in any other encoding. There's more than one Unicode encoding anyway...

regards, tom lane
Re: [HACKERS] Proposal: Add JSON support
Tom Lane wrote:
> Andrew Dunstan writes:
>> Robert Haas wrote:
>>> I think you need to assume that the encoding will be the server encoding, not UTF-8. Although others on this list are better qualified to speak to that than I am.
>>
>> The trouble is that JSON is defined to be specifically Unicode, and in practice for us that means UTF8 on the server side. It could get a bit hairy, and it's definitely not something I think you can wave away with a simple "I'll just throw some encoding/decoding function calls at it."
>
> It's just text, no? Are there any operations where this actually makes a difference?

If we're going to provide operations on it that might involve some. I don't know.

> Like Robert, I'm *very* wary of trying to introduce any text storage into the backend that is in an encoding different from server_encoding. Even the best-case scenarios for that will involve multiple new places for encoding conversion failures to happen.

I agree entirely. All I'm suggesting is that there could be many wrinkles here.

Here's another thought. Given that JSON is actually specified to consist of a string of Unicode characters, what will we deliver to the client where the client encoding is, say Latin1? Will it actually be a legal JSON byte stream?

cheers

andrew
Re: [HACKERS] Alpha release this week?
On Sun, Mar 28, 2010 at 4:40 PM, Josh Berkus wrote:
> We've got two locations and some individuals signed up for a test-fest this weekend. Would it be possible to do an alpha release this week? It would really help to be testing later code than Alpha4.

I'm willing to do the CVS bits, if that's helpful. Or maybe Peter wants to do it. Anyway I have no problem with the idea.

...Robert
Re: [HACKERS] Proposal: Add JSON support
Andrew Dunstan writes:
> Robert Haas wrote:
>> I think you need to assume that the encoding will be the server encoding, not UTF-8. Although others on this list are better qualified to speak to that than I am.
>
> The trouble is that JSON is defined to be specifically Unicode, and in practice for us that means UTF8 on the server side. It could get a bit hairy, and it's definitely not something I think you can wave away with a simple "I'll just throw some encoding/decoding function calls at it."

It's just text, no? Are there any operations where this actually makes a difference?

Like Robert, I'm *very* wary of trying to introduce any text storage into the backend that is in an encoding different from server_encoding. Even the best-case scenarios for that will involve multiple new places for encoding conversion failures to happen.

regards, tom lane
[HACKERS] five-key syscaches
Per previous discussion, PFA a patch to change the maximum number of keys for a syscache from 4 to 5: http://archives.postgresql.org/pgsql-hackers/2010-02/msg01105.php

This is intended for application to 9.1, and is supporting infrastructure for knngist.

...Robert

syscache5.patch
Description: Binary data
Re: [HACKERS] Proposal: Add JSON support
Robert Haas wrote:
> On Sun, Mar 28, 2010 at 4:48 PM, Joseph Adams wrote:
>> I'm wondering whether the internal representation of JSON should be
>> plain JSON text, or some binary code that's easier to traverse and
>> whatnot. For the sake of code size, just keeping it in text is
>> probably best.
>
> +1 for text.

Agreed.

>> Now my thoughts and opinions on the JSON parsing/unparsing itself:
>>
>> It should be built-in, rather than relying on an external library
>> (like XML does).
>
> Why? I'm not saying you aren't right, but you need to make an argument
> rather than an assertion. This is a community, so no one is entitled to
> decide anything unilaterally, and people want to be convinced -
> including me.

Yeah, why? We should not be in the business of reinventing the wheel (and then maintaining the reinvented wheel), unless the code in question is *really* small.

>> As far as character encodings, I'd rather keep that out of the JSON
>> parsing/serializing code itself and assume UTF-8. Wherever I'm wrong,
>> I'll just throw encode/decode/validate operations at it.
>
> I think you need to assume that the encoding will be the server
> encoding, not UTF-8. Although others on this list are better
> qualified to speak to that than I am.

The trouble is that JSON is defined to be specifically Unicode, and in practice for us that means UTF8 on the server side. It could get a bit hairy, and it's definitely not something I think you can wave away with a simple "I'll just throw some encoding/decoding function calls at it."

cheers

andrew
Re: [HACKERS] Proposal: Add JSON support
On Sun, Mar 28, 2010 at 4:48 PM, Joseph Adams wrote:
> I'm wondering whether the internal representation of JSON should be
> plain JSON text, or some binary code that's easier to traverse and
> whatnot. For the sake of code size, just keeping it in text is
> probably best.

+1 for text.

> Now my thoughts and opinions on the JSON parsing/unparsing itself:
>
> It should be built-in, rather than relying on an external library
> (like XML does).

Why? I'm not saying you aren't right, but you need to make an argument rather than an assertion. This is a community, so no one is entitled to decide anything unilaterally, and people want to be convinced - including me.

> As far as character encodings, I'd rather keep that out of the JSON
> parsing/serializing code itself and assume UTF-8. Wherever I'm wrong,
> I'll just throw encode/decode/validate operations at it.

I think you need to assume that the encoding will be the server encoding, not UTF-8. Although others on this list are better qualified to speak to that than I am.

...Robert
Re: [HACKERS] join removal
On Sun, Mar 28, 2010 at 4:12 PM, Tom Lane wrote:
> Robert Haas writes:
>> On Sun, Mar 28, 2010 at 2:10 PM, Tom Lane wrote:
>>> joinremoval.c ?
>
>> Maybe, except as I mentioned in the email linked upthread, my plan for
>> implementing inner join removal would also include allowing join
>> reordering in cases where we currently don't. So I don't want to
>> sandbox it too tightly as join removal, per se, though that's
>> certainly what we have on the table ATM. It's more like advanced
>> open-heart join-tree surgery - like prepjointree, but much later in
>> the process.
>
> Hm. At this point we're not really working with a join *tree* in any
> case --- the data structure we're mostly concerned with is the list of
> SpecialJoinInfo structs, and what we're trying to do is weaken the
> constraints described by that list. So I'd rather stay away from "tree"
> terminology.
>
> planjoins.c would fit with other names in the plan/ directory but it
> seems like a misnomer because we're not really "planning" any joins
> at this stage.
>
> adjustjoins.c? loosenjoins.c? weakenjoins.c?

How about analyzejoins.c? Loosen and weaken don't seem like quite the right idea; adjust is a little generic and perhaps overused, but not bad. If you don't like analyzejoins then go with adjustjoins.

...Robert
[HACKERS] Proposal: Add JSON support
I introduced myself in the thread "Proposal: access control jails (and introduction as aspiring GSoC student)", and we discussed jails and session-local variables. But, as Robert Haas suggested, implementing variable support in the backend would probably be way too ambitious a project for a newbie like me. I decided instead to pursue the task of adding JSON support to PostgreSQL, hence the new thread.

I plan to reference datatype-xml.html and functions-xml.html in some design decisions, but there are some things that apply to XML that don't apply to JSON and vice versa. For instance, jsoncomment wouldn't make sense because (standard) JSON doesn't have comments.

For access, we might have something like json_get('foo[1].bar') and json_set('foo[1].bar', 'hello'). jsonforest and jsonagg would be beautiful. For mapping, jsonforest/jsonagg could be used to build a JSON string from a result set (SELECT jsonagg(jsonforest(col1, col2, ...)) FROM tbl), but I'm not sure of the best way to go the other way around (generate a result set from JSON). CSS-style selectors would be cool, but "selecting" is what SQL is all about, and I'm not sure having a json_select("dom-element[key=value]") function is a good, orthogonal approach.

I'm wondering whether the internal representation of JSON should be plain JSON text, or some binary code that's easier to traverse and whatnot. For the sake of code size, just keeping it in text is probably best.

Now my thoughts and opinions on the JSON parsing/unparsing itself:

It should be built-in, rather than relying on an external library (like XML does). Priorities of the JSON implementation, in descending order, are:
 * Small
 * Correct
 * Fast

Moreover, JSON operations shall not crash due to stack overflows. I'm thinking Bison/Flex is overkill for parsing JSON (I haven't seen any JSON implementations out there that use it anyway). I would probably end up writing the JSON parser/serializer manually. It should not take more than a week.

As far as character encodings, I'd rather keep that out of the JSON parsing/serializing code itself and assume UTF-8. Wherever I'm wrong, I'll just throw encode/decode/validate operations at it.

Thoughts?

Thanks.
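For scale, a hand-written recursive-descent validator of the sort described above — no Bison/Flex, and an explicit depth cap so malformed input cannot overflow the stack — can be quite small. This is a sketch only, not code from any patch; the names json_valid and JSON_MAX_DEPTH are invented for illustration, and real code would also need Unicode-aware string handling:

```c
#include <stdbool.h>
#include <ctype.h>
#include <string.h>

#define JSON_MAX_DEPTH 64  /* reject deeper nesting instead of recursing forever */

static const char *parse_value(const char *s, int depth);

static const char *skip_ws(const char *s)
{
    while (*s == ' ' || *s == '\t' || *s == '\n' || *s == '\r')
        s++;
    return s;
}

static const char *parse_string(const char *s)
{
    if (*s != '"')
        return NULL;
    s++;
    while (*s && *s != '"')
    {
        if (*s == '\\' && s[1])
            s++;                    /* skip the escaped character */
        s++;
    }
    return (*s == '"') ? s + 1 : NULL;
}

static const char *parse_number(const char *s)
{
    const char *start = s;

    if (*s == '-')
        s++;
    while (isdigit((unsigned char) *s))
        s++;
    if (s == start || (*start == '-' && s == start + 1))
        return NULL;                /* no digits seen */
    if (*s == '.')
    {
        s++;
        if (!isdigit((unsigned char) *s))
            return NULL;
        while (isdigit((unsigned char) *s))
            s++;
    }
    if (*s == 'e' || *s == 'E')
    {
        s++;
        if (*s == '+' || *s == '-')
            s++;
        if (!isdigit((unsigned char) *s))
            return NULL;
        while (isdigit((unsigned char) *s))
            s++;
    }
    return s;
}

static const char *parse_value(const char *s, int depth)
{
    if (depth > JSON_MAX_DEPTH)
        return NULL;
    s = skip_ws(s);
    if (*s == '"')
        return parse_string(s);
    if (*s == '{' || *s == '[')
    {
        bool is_obj = (*s == '{');
        char close = is_obj ? '}' : ']';

        s = skip_ws(s + 1);
        if (*s == close)
            return s + 1;           /* empty object/array */
        for (;;)
        {
            if (is_obj)
            {
                s = parse_string(skip_ws(s));
                if (!s)
                    return NULL;
                s = skip_ws(s);
                if (*s != ':')
                    return NULL;
                s++;
            }
            s = parse_value(s, depth + 1);
            if (!s)
                return NULL;
            s = skip_ws(s);
            if (*s == ',')
            {
                s++;
                continue;
            }
            return (*s == close) ? s + 1 : NULL;
        }
    }
    if (strncmp(s, "true", 4) == 0)
        return s + 4;
    if (strncmp(s, "false", 5) == 0)
        return s + 5;
    if (strncmp(s, "null", 4) == 0)
        return s + 4;
    return parse_number(s);
}

bool json_valid(const char *s)
{
    s = parse_value(s, 0);
    return s != NULL && *skip_ws(s) == '\0';
}
```

A validator like this is the easy half; the serializer and the access functions are where most of the design decisions above come into play.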
[HACKERS] Alpha release this week?
All,

We've got two locations and some individuals signed up for a test-fest this weekend. Would it be possible to do an alpha release this week? It would really help to be testing later code than Alpha4.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com
Re: [HACKERS] join removal
Robert Haas writes:
> On Sun, Mar 28, 2010 at 2:10 PM, Tom Lane wrote:
>> joinremoval.c ?

> Maybe, except as I mentioned in the email linked upthread, my plan for
> implementing inner join removal would also include allowing join
> reordering in cases where we currently don't. So I don't want to
> sandbox it too tightly as join removal, per se, though that's
> certainly what we have on the table ATM. It's more like advanced
> open-heart join-tree surgery - like prepjointree, but much later in
> the process.

Hm. At this point we're not really working with a join *tree* in any case --- the data structure we're mostly concerned with is the list of SpecialJoinInfo structs, and what we're trying to do is weaken the constraints described by that list. So I'd rather stay away from "tree" terminology.

planjoins.c would fit with other names in the plan/ directory but it seems like a misnomer because we're not really "planning" any joins at this stage.

adjustjoins.c? loosenjoins.c? weakenjoins.c?

			regards, tom lane
Re: [HACKERS] join removal
On Sun, Mar 28, 2010 at 2:10 PM, Tom Lane wrote:
> Robert Haas writes:
>> On Sun, Mar 28, 2010 at 2:04 PM, Tom Lane wrote:
>>> * in a new file in plan/. Not sure if it's worth this, though your
>>> thought that we might add more logic later makes it more defensible.
>
>> I sort of like the last of these ideas though I'm at a loss for what
>> to call it. Otherwise I kind of like planmain.c.
>
> joinremoval.c ?

Maybe, except as I mentioned in the email linked upthread, my plan for implementing inner join removal would also include allowing join reordering in cases where we currently don't. So I don't want to sandbox it too tightly as join removal, per se, though that's certainly what we have on the table ATM. It's more like advanced open-heart join-tree surgery - like prepjointree, but much later in the process.

...Robert
Re: [HACKERS] join removal
Robert Haas writes:
> On Sun, Mar 28, 2010 at 2:04 PM, Tom Lane wrote:
>> * in a new file in plan/. Not sure if it's worth this, though your
>> thought that we might add more logic later makes it more defensible.

> I sort of like the last of these ideas though I'm at a loss for what
> to call it. Otherwise I kind of like planmain.c.

joinremoval.c ?

			regards, tom lane
Re: [HACKERS] join removal
On Sun, Mar 28, 2010 at 2:04 PM, Tom Lane wrote:
> Robert Haas writes:
>> On Sun, Mar 28, 2010 at 12:19 AM, Tom Lane wrote:
>>> * I left join_is_removable where it was, mainly so that it was easy to
>>> compare how much it changed for this usage (not a lot). I'm not sure
>>> that joinpath.c is an appropriate place for it anymore, though I can't
>>> see any obviously better place either. Any thoughts on that?
>
>> I dislike the idea of leaving it in joinpath.c. I don't even think it
>> properly belongs in the path subdirectory since it no longer has
>> anything to do with paths. Also worth thinking about where we would
>> put the logic I pontificated about here:
>> http://archives.postgresql.org/pgsql-hackers/2009-10/msg01012.php
>
> The only argument I can see for leaving it where it is is that it
> depends on clause_sides_match_join, which we'd have to either duplicate
> or global-ize in order to continue sharing that code. However, since
> join_is_removable now needs a slightly different API for that anyway
> (cf changes in draft patch), it's probably better to not try to share it.
> So let's put the join removal code somewhere else. The reasonable
> alternatives seem to be:
>
> * in a new file in prep/. Although this clearly has the flavor of
> preprocessing, all the other work in prep/ is done before we get into
> query_planner(). So this choice seems a bit dubious.
>
> * directly in plan/planmain.c. Has the advantage of being where the
> caller is, so no globally visible function declaration needed. No other
> redeeming social value though.
>
> * in plan/initsplan.c. Somewhat reasonable, although that file is
> rather large already.
>
> * in a new file in plan/. Not sure if it's worth this, though your
> thought that we might add more logic later makes it more defensible.

I sort of like the last of these ideas though I'm at a loss for what to call it. Otherwise I kind of like planmain.c.

...Robert
Re: [HACKERS] join removal
Robert Haas writes:
> On Sun, Mar 28, 2010 at 12:19 AM, Tom Lane wrote:
>> * I left join_is_removable where it was, mainly so that it was easy to
>> compare how much it changed for this usage (not a lot). I'm not sure
>> that joinpath.c is an appropriate place for it anymore, though I can't
>> see any obviously better place either. Any thoughts on that?

> I dislike the idea of leaving it in joinpath.c. I don't even think it
> properly belongs in the path subdirectory since it no longer has
> anything to do with paths. Also worth thinking about where we would
> put the logic I pontificated about here:
> http://archives.postgresql.org/pgsql-hackers/2009-10/msg01012.php

The only argument I can see for leaving it where it is is that it depends on clause_sides_match_join, which we'd have to either duplicate or global-ize in order to continue sharing that code. However, since join_is_removable now needs a slightly different API for that anyway (cf changes in draft patch), it's probably better to not try to share it. So let's put the join removal code somewhere else. The reasonable alternatives seem to be:

* in a new file in prep/. Although this clearly has the flavor of preprocessing, all the other work in prep/ is done before we get into query_planner(). So this choice seems a bit dubious.

* directly in plan/planmain.c. Has the advantage of being where the caller is, so no globally visible function declaration needed. No other redeeming social value though.

* in plan/initsplan.c. Somewhat reasonable, although that file is rather large already.

* in a new file in plan/. Not sure if it's worth this, though your thought that we might add more logic later makes it more defensible.

Comments?

			regards, tom lane
Re: [HACKERS] join removal
I wrote:
> Robert Haas writes:
>> I'm alarmed by your follow-on statement that the current code can't
>> handle the two-levels of removable join case. Seems like it ought to
>> form {B C} as a path over {B} and then {A B C} as a path over {A}.

> Actually I think it ought to form {A B} as a no-op join and then be able
> to join {A B} to {C} as a no-op join. It won't recognize joining A to
> {B C} as a no-op because the RHS isn't a baserel. But yeah, I was quite
> surprised at the failure too. We should take the time to understand why
> it's failing before we go further.

OK, I traced through it, and the reason HEAD fails on this example is that it *doesn't* recognize {A B} as a feasible no-op join, for precisely the reason that it sees some B vars marked as being needed for the not-yet-done {B C} join. So that path is blocked, and the other possible path to the desired result is also blocked because it won't consider {B C} as a valid RHS for a removable join.

I don't see any practical way to escape the false-attr_needed problem given the current code structure. We could maybe hack our way to a solution by weakening the restriction against the RHS being a join, eg by noting that the best path for the RHS is a no-op join and then drilling down to the one baserel. But it seems pretty ugly.

So I think the conclusion is clear: we should consign the current join-removal code to the dustbin and pursue the preprocessing way instead. Will work on it today.

			regards, tom lane
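The attr_needed interaction described in this exchange can be modeled in a few lines. The sketch below is a toy (invented names, Python sets standing in for relid bitmaps — nothing like the real planner structures); it shows why clearing the removed rel's bits is what unblocks the second removal:

```python
# Toy model: a left join's inner rel is removable only when, after
# discounting rels already removed, no rel outside the join itself
# still "needs" the inner rel's vars (its attr_needed set).

def remove_useless_joins(joins, attr_needed):
    """joins: (outer, inner) pairs; attr_needed: inner rel -> set of rels
    that reference its vars. Returns the set of removed inner rels."""
    removed = set()
    progress = True
    while progress:                      # rescan until a pass removes nothing
        progress = False
        for outer, inner in joins:
            if inner in removed:
                continue
            still_needed = attr_needed[inner] - removed
            if still_needed <= {outer, inner}:
                removed.add(inner)       # clearing its bits may unblock others
                progress = True
    return removed

# A LEFT JOIN (B LEFT JOIN C ON Pbc) ON Pab: B's vars used in Pbc carry
# attr_needed {B, C}, which blocks removing B until C's bit is cleared.
joins = [("A", "B"), ("B", "C")]
attr_needed = {"B": {"A", "B", "C"}, "C": {"B", "C"}}
print(sorted(remove_useless_joins(joins, attr_needed)))  # -> ['B', 'C']
```

Without the `- removed` subtraction, the model removes only C and then stalls on B — the same false-attr_needed blockage the trace-through found in HEAD.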
Re: [HACKERS] More idle thoughts
Simon Riggs writes:
> On Fri, 2010-03-26 at 18:59 +0000, Greg Stark wrote:
>> It occurs to me we could do the same for CHECK_FOR_INTERRUPTS() by
>> conditionally having it call a function which calls gettimeofday and
>> compares with the previous timestamp received at the last CFI().

> Reducing latency sounds good, but what has CFI got to do with that?

It took me about five minutes to figure out what Greg was on about too. His point is that we need to locate code paths in which an extremely long time can pass between successive CFI calls, because that means the backend will fail to respond to SIGINT/SIGTERM for a long time. Instrumenting CFI itself is a possible tool for that.

			regards, tom lane
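As a sketch of the idea (hypothetical code, not from any patch — the names instrumented_cfi, timeval_diff_usec, and CFI_WARN_USEC are invented): the check could record a timestamp and complain whenever the gap since the previous check exceeds a threshold, pinpointing the code paths that go too long without responding to interrupts:

```c
#include <stdio.h>
#include <sys/time.h>

/* Hypothetical instrumentation: measure the wall-clock gap between
 * successive CHECK_FOR_INTERRUPTS()-style calls. */

#define CFI_WARN_USEC 100000L   /* complain about gaps longer than 100 ms */

long
timeval_diff_usec(const struct timeval *earlier, const struct timeval *later)
{
    return (later->tv_sec - earlier->tv_sec) * 1000000L
         + (later->tv_usec - earlier->tv_usec);
}

static struct timeval last_cfi;  /* zero-initialized: no previous call yet */

void
instrumented_cfi(const char *where)
{
    struct timeval now;

    gettimeofday(&now, NULL);
    if (last_cfi.tv_sec != 0)   /* skip the very first call */
    {
        long gap = timeval_diff_usec(&last_cfi, &now);

        if (gap > CFI_WARN_USEC)
            fprintf(stderr, "CFI gap of %ld us at %s\n", gap, where);
    }
    last_cfi = now;
    /* ...then perform the normal interrupt checks... */
}
```

The gettimeofday cost is why the thread suggests enabling this only conditionally, e.g. under a debug build or GUC.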
Re: [HACKERS] join removal
Simon Riggs writes:
> Does the new patch find more than two levels of join removal?

Well, I'd assume if it can do two nested levels then it should work for any number, but I plead guilty to not having actually tested that.

			regards, tom lane
Re: [HACKERS] More idle thoughts
On Fri, 2010-03-26 at 18:59 +0000, Greg Stark wrote:
> The Linux kernel had a big push to reduce latency, and one of the
> tricks they did was they replaced the usual interrupt points with a
> call which noted how long it had been since the last interrupt point.
> It occurs to me we could do the same for CHECK_FOR_INTERRUPTS() by
> conditionally having it call a function which calls gettimeofday and
> compares with the previous timestamp received at the last CFI().

Reducing latency sounds good, but what has CFI got to do with that?

--
Simon Riggs
www.2ndQuadrant.com
Re: [HACKERS] join removal
On Sun, 2010-03-28 at 02:15 -0400, Tom Lane wrote:
> I wrote:
>> [ crude patch ]
>
> Oh, btw, if you try to run the regression test additions in that patch
> against CVS HEAD, you'll find out that HEAD actually fails to optimize
> the two-levels-of-removable-joins case. Seems like another reason to
> press ahead with making the change.

Yes, please.

Does the new patch find more than two levels of join removal?

--
Simon Riggs
www.2ndQuadrant.com
Re: [HACKERS] join removal
Robert Haas writes:
> I'm alarmed by your follow-on statement that the current code can't
> handle the two-levels of removable join case. Seems like it ought to
> form {B C} as a path over {B} and then {A B C} as a path over {A}.

Actually I think it ought to form {A B} as a no-op join and then be able to join {A B} to {C} as a no-op join. It won't recognize joining A to {B C} as a no-op because the RHS isn't a baserel. But yeah, I was quite surprised at the failure too. We should take the time to understand why it's failing before we go further. I ran out of steam last night but will have a look into that today.

			regards, tom lane
Re: [HACKERS] join removal
On Sun, Mar 28, 2010 at 12:19 AM, Tom Lane wrote:
> Robert Haas writes:
>> On Sat, Mar 27, 2010 at 4:11 PM, Tom Lane wrote:
>>> I'm not seeing how that would occur or would matter, but the worst case
>>> answer is to restart the scan of the SpecialJoinInfos from scratch any
>>> time you succeed in doing a join removal.
>
>> Well, say you have something like
>
>> SELECT 1 FROM A LEFT JOIN (B LEFT JOIN C ON Pbc) ON Pab
>
>> I think that the SpecialJoinInfo structure for the join between B and
>> C will match the criteria I articulated upthread, but the one for the
>> join between A and {B C} will not. If C had not been in the query
>> from the beginning then we'd have had:
>
>> SELECT 1 FROM A LEFT JOIN B ON Pab
>
>> ...under which circumstances the SpecialJoinInfo would match the
>> aforementioned criteria.
>
> I experimented with this and found that you're correct: the tests on the
> different SpecialJoinInfos do interact, which I hadn't believed
> initially. The reason for this is that when we find out we can remove a
> particular rel, we have to remove the bits for it in other relations'
> attr_needed bitmaps. In the above example, we first discover we can
> remove C. Whatever B vars were used in Pbc will have an attr_needed
> set of {B,C}, and that C bit will prevent us from deciding that B can
> be removed when we are examining the upper SpecialJoinInfo (which will
> not consider C to be part of either min_lefthand or min_righthand).
> So we have to remove the C bits when we remove C.
>
> Attached is an extremely quick-and-dirty, inadequately commented draft
> patch that does it along the lines you are suggesting. This was just to
> see if I could get it to work at all; it's not meant for application in
> anything like its current state. However, I feel a very strong
> temptation to finish it up and apply it before we enter beta. As you
> noted, this way is a lot cheaper than the original coding, whether one
> focuses on the cost of failing cases or the cost when the optimization
> is successful. And if we hold it off till 9.1, then any bug fixes that
> have to be made in the area later will need to be made against two
> significantly different implementations, which will be a real PITA.
>
> Things that would need to be cleaned up:
>
> * I left join_is_removable where it was, mainly so that it was easy to
> compare how much it changed for this usage (not a lot). I'm not sure
> that joinpath.c is an appropriate place for it anymore, though I can't
> see any obviously better place either. Any thoughts on that?

I dislike the idea of leaving it in joinpath.c. I don't even think it properly belongs in the path subdirectory since it no longer has anything to do with paths. Also worth thinking about where we would put the logic I pontificated about here:

http://archives.postgresql.org/pgsql-hackers/2009-10/msg01012.php

> * The removed relation has to be taken out of the set of baserels
> somehow, else for example the Assert in make_one_rel will fail.
> The current hack is to change its reloptkind to RELOPT_OTHER_MEMBER_REL,
> which I think is a bit unclean. We could try deleting it from the
> simple_rel_array altogether, but I'm worried that that could result in
> dangling-pointer failures, since we're probably not going to go to the
> trouble of removing every single reference to the rel from the planner
> data structures. A possible compromise is to invent another reloptkind
> value that is only used for "dead" relations.

+1 for dead relation type.

> * It would be good to not count the removed relation in
> root->total_table_pages. If we made either of the changes suggested
> above then we could move the calculation of total_table_pages down to
> after remove_useless_joins and ignore the removed relation(s)
> appropriately. Otherwise I'm tempted to just subtract off the relation
> size from total_table_pages on-the-fly when we remove it.

Sounds good.

> * I'm not sure yet about the adjustment of PlaceHolder bitmaps --- we
> might need to break fix_placeholder_eval_levels into two steps to get
> it right.

Not familiar enough with that code to comment.

> * Still need to reverse out the now-dead code from the original patch,
> in particular the NoOpPath support.

Yeah.

> Thoughts?

I'm alarmed by your follow-on statement that the current code can't handle the two-levels of removable join case. Seems like it ought to form {B C} as a path over {B} and then {A B C} as a path over {A}. Given that it doesn't, we already have a fairly serious bug, so we've either got to put more work into the old implementation or switch to this new one - and I think at this point you and I are both fairly convinced that this is a better way going forward.

...Robert
[HACKERS] Re: [COMMITTERS] pgsql: Augment WAL records for btree delete with GetOldestXmin() to
On Sat, 2010-03-27 at 22:39 +0000, Greg Stark wrote:
> On Sat, Mar 27, 2010 at 7:36 PM, Simon Riggs wrote:
>> On Sat, 2010-03-27 at 19:15 +0000, Greg Stark wrote:
>>> If we're pruning an index entry to a heap tuple that has been HOT
>>> pruned wouldn't the HOT pruning record have already conflicted with
>>> any queries that could see it?
>>
>> Quite probably, but a query that started after that record arrived might
>> slip through. We have to treat each WAL record separately.
>
> Slip through? I'm not following you.

No, there is no possibility for it to slip through, you're right. (After much thinking.)

>> Do you agree with the conjecture? That LP_DEAD items can be ignored
>> because their xid would have been earlier than the latest LP_NORMAL
>> tuple we find? (on any page).
>>
>> Or is a slightly less strong condition true: we can ignore LP_DEAD items
>> on a page that is also referenced by an LP_NORMAL item.
>
> I don't like having dependencies on the precise logic in vacuum rather
> than only on the guarantees that vacuum provides. We want to improve
> the logic in vacuum and hot pruning to cover more cases and that will
> be harder if there's code elsewhere depending on its simple-minded xid
> <= globalxmin test.

Agreed

--
Simon Riggs
www.2ndQuadrant.com