[HACKERS] SSI error messages
Some of these new error messages from the SSI code are a mouthful:

    not enough elements in RWConflictPool to record a rw-conflict
    not enough elements in RWConflictPool to record a potential rw-conflict

These are basically "out of shared memory" conditions that could be moved to a DETAIL message.

    Canceled on identification as a pivot, during conflict out checking.
    Canceled on conflict out to old pivot %u.
    Canceled on identification as a pivot, with conflict out to old committed transaction %u.
    Canceled on conflict out to old pivot.
    Canceled on identification as a pivot, during conflict in checking.
    Canceled on identification as a pivot, during write.
    Canceled on conflict out to pivot %u, during read.
    Canceled on identification as a pivot, during commit attempt.
    Canceled on commit attempt with conflict in from prepared pivot.

These are DETAIL messages, with the rest of the report saying:

    ERROR: could not serialize access due to read/write dependencies among transactions
    HINT: The transaction might succeed if retried.

AFAICT, the documentation doesn't mention anything about this "pivot" that keeps coming up. Is it useful to have the user face this information? Is there anything a user can derive from seeing one of these messages versus another, in addition to the error and hint, that would help them address the situation?

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] ON COMMIT action not catalogued?
(Going through some loose ends in the information schema ...)

Is my understanding right that the ON COMMIT action of temporary tables is not recorded in the system catalogs anywhere? It would make sense and wouldn't be a big problem; I just want to make sure I didn't miss anything.
Re: [HACKERS] Is there a committer in the house?
On tor, 2011-07-14 at 12:01 -0700, Josh Berkus wrote:
> Currently we have 8 patches "Ready for Committer" in the current CF.
> Some of them have been in that status for some time.
>
> From traffic on this list, I'm getting the impression that nobody other
> than Robert, Heikki and Tom are committing other people's patches. I
> know we have more committers than that. Bruce? Simon? Itagaki? Bueller?

I'm still working on preparing 9.1. I'm not following 9.2 development yet.
Re: [HACKERS] Re: patch review : Add ability to constrain backend temporary file space
>>> I have looked into the v6 patches. One thing I would like to suggest
>>> is enhancing the error message when temp_file_limit will be exceeded:
>>>
>>> ERROR: aborting due to exceeding temp file limit
>>>
>>> Is it possible to add the current temp file limit to the message? For
>>> example:
>>>
>>> ERROR: aborting due to exceeding temp file limit 1kB
>>>
>>> I know the current setting of temp_file_limit can be viewed in other
>>> ways, but I think this will make an admin's or application developer's
>>> life a little bit easier.
>
> Hi Tatsuo,
>
> Yeah, good suggestion - I agree that it would be useful to supply
> extra detail. I'll amend and resubmit a new patch (along with whatever
> review modifications we come up with in the meantime)!
>
> Cheers
>
> Mark

Maybe we could add more info regarding the current usage and the requested amount, in addition to the temp file limit value. I mean something like:

ERROR: aborting due to exceeding temp file limit. Current usage 9000kB, requested size 1024kB, thus it will exceed temp file limit 1kB.

What do you think?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
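The check and message being proposed could be sketched roughly as follows (an illustrative Python sketch, not PostgreSQL's actual implementation, which is C code in the backend; the function name and units are hypothetical):

```python
# Illustrative sketch of the proposed richer error text. All names here
# are hypothetical; PostgreSQL's real check is C code in the backend.
def check_temp_file_limit(current_usage_kb, requested_kb, limit_kb):
    """Raise an error mentioning usage, request, and limit, as suggested."""
    if current_usage_kb + requested_kb > limit_kb:
        raise RuntimeError(
            "aborting due to exceeding temp file limit. "
            f"Current usage {current_usage_kb}kB, "
            f"requested size {requested_kb}kB, "
            f"thus it will exceed temp file limit {limit_kb}kB."
        )

# A request that stays under the limit passes silently:
check_temp_file_limit(100, 200, 1024)
```

The point of including all three numbers is that the admin can see at a glance how far over the limit the query would have gone, without consulting the server configuration separately.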
Re: [HACKERS] per-column generic option
(2011/07/15 4:17), Josh Berkus wrote:
> All,
>
> Is the spec for this feature still under discussion? I don't seem to
> see a consensus on this thread.

Yeah, a remaining concern is whether generic (FDW) options should be separated from the existing attoptions or not. Indeed, the reloptions/attoptions mechanism seems applicable to generic options as well, since both need to store multiple key-value pairs, but IMHO generic options should be kept separate from reloptions/attoptions for several reasons:

1) FDW options are handled only by the FDW, but reloptions/attoptions are handled by PG core modules such as the planner, AMs and autovacuum. If they could be separated completely, they would be able to share one attribute, but I worry that some reloptions/attoptions make sense for some FDWs. For instance, n_distinct might be useful to control the costs of a foreign table scan. (Though attoptions can't be set via CREATE/ALTER FOREIGN TABLE yet.)

2) In the future, a newly added option might conflict with somebody's FDW option. Robert Haas pointed out this issue some days ago. An FDW validator would reject unknown options, so every FDW would have to know all of the reloptions/attoptions to avoid this issue.

3) It would be difficult to unify the syntax used to set options, from the perspective of backward compatibility and syntax consistency. Other SQL/MED objects use syntax such as "OPTIONS (key 'value', ...)", but reloptions/attoptions use syntax such as "SET (key = value, ...)". Without syntax unification, some tools would have to check relkind before handling attoptions. For instance, pg_dump would have to choose which syntax to use to dump attoptions. That seems like undesired complexity.

Any comments/objections/questions are welcome.

Regards,
--
Shigeru Hanada
Re: [HACKERS] pg_class.relistemp
On Jul 14, 2011, at 6:43 PM, Robert Haas wrote:
>> Is that all I need to do, or is there something else I should be aware of
>> with regard to unlogged tables?
>
> Probably not, in this case. Just a thought: maybe you could rewrite the query
> to check whether the namespace name starts with pg_temp. Maybe that would be
> version-independent...

Ah, good idea, I forgot about pg_temp.

David
Re: [HACKERS] pg_class.relistemp
On Jul 14, 2011, at 5:13 PM, "David E. Wheeler" wrote:
> On Jul 14, 2011, at 3:05 PM, Tom Lane wrote:
>
>> Josh Berkus writes:
>>>> There are a ton of things that change with each release, and all we do
>>>> by putting in hacks for backwards compatibility is add bloat that needs
>>>> to be maintained, and encourage vendors to be lazy.
>>
>>> I don't agree that having comprehensive system views with multi-version
>>> stability would be a "hack".
>>
>> If we had that, it wouldn't be a hack.
>
> Is that an endorsement for adding such a feature?
>
>> Putting in a hack to cover the
>> specific case of relistemp, on the other hand, is just a hack.
>
> Sure.
>
>> The real question here, IMO, is "how many applications are there that
>> really need to know about temporary relations, but have no interest in
>> the related feature of unlogged relations?". Because only such apps
>> would be served by a compatibility hack for this. An app that thinks it
>> knows the semantics of relistemp, and isn't updated to grok unlogged
>> tables, may be worse than broken --- it may be silently incorrect.
>
> So pgTAP creates temporary tables to store result sets so that it can then
> compare the results of two queries. The function in question was getting a
> list of columns in such a temporary table in order to make sure that the
> types were the same between two such tables before comparing results. It
> checked relistemp to make sure it was looking at the temp table rather than
> some other table that might happen to have the same name.
>
> So now the query looks like this:
>
>    SELECT pg_catalog.format_type(a.atttypid, a.atttypmod)
>      FROM pg_catalog.pg_attribute a
>      JOIN pg_catalog.pg_class c ON a.attrelid = c.oid
>     WHERE c.relname = $1
>       -- AND c.relistemp            -- 8.3-9.0
>       AND c.relpersistence = 't'    -- 9.1
>       AND attnum > 0
>       AND NOT attisdropped
>     ORDER BY attnum
>
> Is that all I need to do, or is there something else I should be aware of
> with regard to unlogged tables?

Probably not, in this case. Just a thought: maybe you could rewrite the query to check whether the namespace name starts with pg_temp. Maybe that would be version-independent...

...Robert
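The check Robert is suggesting — test the schema name's prefix instead of any version-specific pg_class column — could be sketched like this (illustrative Python for the prefix test itself; in practice it would be a join against pg_namespace in the SQL query, and the pg_toast_temp case is an assumption on my part about temp toast schemas):

```python
# Version-independent temp-table test suggested above: temporary tables
# live in schemas named pg_temp_N (their toast tables in pg_toast_temp_N),
# so checking the namespace name prefix avoids relying on relistemp
# (8.3-9.0) or relpersistence (9.1+).
def is_temp_schema(nspname):
    return nspname.startswith("pg_temp_") or nspname.startswith("pg_toast_temp_")

print(is_temp_schema("pg_temp_3"))  # True
print(is_temp_schema("public"))     # False
```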
Re: [HACKERS] Single pass vacuum - take 1
On Thu, Jul 14, 2011 at 6:22 PM, Simon Riggs wrote:
> On Thu, Jul 14, 2011 at 4:57 PM, Pavan Deolasee wrote:
>
>> Thanks Simon for looking at the patch.
>
> Sorry, I didn't notice there was a patch attached. Not reviewed it. I
> thought we were still just talking.

No problem. Please review it when you have time.

>> I am not sure if the use case is really narrow.
>
> This is a very rare issue, because of all the work yourself and Heikki
> have put in.

I don't think it's a rare case, since a vacuum on any table, small or large, would take two passes today. Avoiding one of the two would help the system in general. HOT and other things help to a great extent, but unfortunately they don't make vacuum completely go away. So we still want to do vacuum in the most efficient way.

> It's only important when we have a (1) big table (hence long scan
> time), (2) a use case that avoids HOT *and* (3) we have dirtied a
> large enough section of table that the vacuum map is ineffective and
> we need to scan high % of table. That combination is pretty rare, so
> penalising everybody else with 8 bytes per block seems too much to me.

The additional space is allocated only for those pages which have dead-vacuum line pointers, and it stays only until the next HOT-prune operation on the page. So not everybody pays the penalty, even if we assume it's too much and that's what worries you most.

> I have great faith in your talents, just not sure about this
> particular use of them.

Not sure if that's a compliment or a criticism :-)

> I'm sorry to voice them now you've written the
> patch.

Yeah, I would have liked it if you had said this earlier, but I appreciate comments any time.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB     http://www.enterprisedb.com
Re: [HACKERS] Review of patch Bugfix for XPATH() if expression returns a scalar value
Hi Radosław,

Do you plan to comment further on this patch (the one that adds support for XPATH expressions returning scalar values, not the one that escapes text and attribute nodes which are selected), or should it be marked "Ready for Committer"? I'm asking because I just noticed that you marked the other patch as "Ready for Committer", but not this one.

best regards,
Florian Pflug
Re: [HACKERS] Single pass vacuum - take 1
On Thu, Jul 14, 2011 at 4:57 PM, Pavan Deolasee wrote:
> Thanks Simon for looking at the patch.

Sorry, I didn't notice there was a patch attached. Not reviewed it. I thought we were still just talking.

> I am not sure if the use case is really narrow.

This is a very rare issue, because of all the work yourself and Heikki have put in.

It's only important when we have (1) a big table (hence long scan time), (2) a use case that avoids HOT *and* (3) a dirtied section of the table large enough that the vacuum map is ineffective and we need to scan a high % of the table. That combination is pretty rare, so penalising everybody else with 8 bytes per block seems too much to me.

Big VACUUMs are a problem, but my observation would be that those are typically transaction-wraparound VACUUMs and the extra writes are not caused by row removal. So we do sometimes do Phase 2 and Phase 3 even when there is a very low number of row removals - since not all VACUUMs are triggered by changes.

> Today, we dirty the pages in both the passes and also emit WAL records.

This is exactly the thing I'm suggesting we avoid.

> Just the heap scan can take a very long time for large tables, blocking the
> autovacuum worker threads from doing useful work on other tables. If I am
> not wrong, we use ring buffers for vacuum which would most likely force
> those buffers to be written/read twice to the disk.

I think the problem comes from dirtying too many blocks. Scanning the tables using the ring buffer is actually fairly cheap. The second scan only touches the blocks that need secondary cleaning, so the cost of it is usually much less. I'm suggesting we write each block at most once, rather than twice as we do now. Yes, we have to do both scans. My idea does exactly the same number of writes as yours. On read-only I/O, your idea is clearly more efficient, but overall that's not by enough to justify the 8-byte-per-block overhead, IMHO.

> Which part of the patch do you think is very complex? We can try to simplify
> that. Or are you seeing any obvious bugs that I missed? IMHO taking out a
> phase completely from vacuum (as this patch does) can simplify things.

I have great faith in your talents, just not sure about this particular use of them. I'm sorry to voice them now you've written the patch.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
[HACKERS] Patch Review: Collect frequency statistics and selectivity estimation for arrays
Review of patch: https://commitfest.postgresql.org/action/patch_view?id=539

I glanced through the code and played around with the feature and, in general, it looks pretty good. But I have some comments/questions.

Design:

First, it makes me uncomfortable that you are using the MCV and histogram slot kinds in a way that is very different from other data types. I realize that tsvector uses MCV in the same way that you do, but:

1) I don't like that very much either.
2) tsvector is different in that equality (in the btree sense) doesn't make sense, whereas it does for arrays.

Using the histogram slot for the array lengths is also very surprising to me. Why not just use a new STA_KIND? It's not like we are running out of room, and this will be the second 'container' type that splits the container and stores stats about the elements.

Second, I'm fairly unclear about what the actual underlying statistical method is and what assumptions it makes. Before I can really review the method, I will probably need to see some documentation, as I say below. Do you have any tests/simulations that explore the estimation procedure that I can look at? When will it fail? In what cases does it improve estimation?

Documentation:

Seems a bit lacking - especially if you keep the non-standard usage of mcv/histograms. Also, it would be really nice to see some update to the row-estimation-examples page (in chapter 55).

The code comments are good in general, but there are not enough high-level comments. It would be really nice to have a comment that gives an overview of what each of the following functions is supposed to do:

    mcelem_array_selec
    mcelem_array_contain_overlap_selec
    and especially mcelem_array_contained_selec

Also, in compute_array_stats you reference compute_tsvector_stats for the algorithm, but I think that it would be smarter to either copy the relevant portions of the comment over or to reference a published document. If compute_tsvector_stats changes, I doubt that anyone will remember to update the comment.

Finally, it would be really nice to see, explicitly and in mathematical notation:

1) The data that is being collected and summarized (the statistic).
2) The estimate being derived from the statistic (i.e. the selectivity).
   i) Any assumptions being made (i.e. independence of elements within an array).

For instance, the only note I could find that actually addressed the estimator and the assumptions underneath it was

    + * mult * dist[i] / mcelem_dist[i] gives us probability
    + * of qual matching from assumption of independent element occurrence
    + * with condition that length = i.

and that's pretty cryptic. That should be expanded upon and placed more prominently.

A couple of nitpicks:

1) In calc_distr you go to some lengths to avoid round-off errors. Since it is certainly just the order of the estimate that matters, why not just perform the calculation in log space?
2) compute_array_stats is really long. Could you break it up? (I know that the other analyze functions are the same way, but why continue in that vein?)

Best,
Nathan
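Nitpick (1) above — doing the probability calculation in log space — can be illustrated with a short sketch (plain Python rather than the patch's C, and the per-element probabilities are made-up values):

```python
import math

# Multiplying many small per-element probabilities underflows a double
# quickly; summing their logarithms does not. This is the idea behind
# nitpick (1): accumulate the estimate in log space.
probs = [1e-5] * 80  # hypothetical per-element match probabilities

naive = 1.0
for p in probs:
    naive *= p  # 1e-400 is below double range, so this underflows to 0.0

log_est = sum(math.log(p) for p in probs)  # stays finite: 80 * log(1e-5)

print(naive)    # 0.0 due to underflow
print(log_est)  # about -921.03
```

Since only the order of magnitude of the estimate matters for planning, working with the log and exponentiating once at the end (or comparing logs directly) avoids the careful round-off bookkeeping in calc_distr.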
Re: [HACKERS] pg_class.relistemp
On Jul 14, 2011, at 3:05 PM, Tom Lane wrote:
> Josh Berkus writes:
>>> There are a ton of
>>> things that change with each release, and all we do by putting in
>>> hacks for backwards compatibility is add bloat that needs to be
>>> maintained, and encourage vendors to be lazy.
>
>> I don't agree that having comprehensive system views with multi-version
>> stability would be a "hack".
>
> If we had that, it wouldn't be a hack.

Is that an endorsement for adding such a feature?

> Putting in a hack to cover the
> specific case of relistemp, on the other hand, is just a hack.

Sure.

> The real question here, IMO, is "how many applications are there that
> really need to know about temporary relations, but have no interest in
> the related feature of unlogged relations?". Because only such apps
> would be served by a compatibility hack for this. An app that thinks it
> knows the semantics of relistemp, and isn't updated to grok unlogged
> tables, may be worse than broken --- it may be silently incorrect.

So pgTAP creates temporary tables to store result sets so that it can then compare the results of two queries. The function in question was getting a list of columns in such a temporary table in order to make sure that the types were the same between two such tables before comparing results. It checked relistemp to make sure it was looking at the temp table rather than some other table that might happen to have the same name.

So now the query looks like this:

    SELECT pg_catalog.format_type(a.atttypid, a.atttypmod)
      FROM pg_catalog.pg_attribute a
      JOIN pg_catalog.pg_class c ON a.attrelid = c.oid
     WHERE c.relname = $1
       -- AND c.relistemp            -- 8.3-9.0
       AND c.relpersistence = 't'    -- 9.1
       AND attnum > 0
       AND NOT attisdropped
     ORDER BY attnum

Is that all I need to do, or is there something else I should be aware of with regard to unlogged tables?

Thanks,
David
Re: [HACKERS] pg_class.relistemp
>> I don't agree that having comprehensive system views with multi-version
>> stability would be a "hack".
>
> If we had that, it wouldn't be a hack. Putting in a hack to cover the
> specific case of relistemp, on the other hand, is just a hack.

I agree.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
Re: [HACKERS] pg_class.relistemp
Josh Berkus writes:
>> There are a ton of
>> things that change with each release, and all we do by putting in
>> hacks for backwards compatibility is add bloat that needs to be
>> maintained, and encourage vendors to be lazy.

> I don't agree that having comprehensive system views with multi-version
> stability would be a "hack".

If we had that, it wouldn't be a hack. Putting in a hack to cover the specific case of relistemp, on the other hand, is just a hack.

The real question here, IMO, is "how many applications are there that really need to know about temporary relations, but have no interest in the related feature of unlogged relations?". Because only such apps would be served by a compatibility hack for this. An app that thinks it knows the semantics of relistemp, and isn't updated to grok unlogged tables, may be worse than broken --- it may be silently incorrect.

regards, tom lane
Re: [HACKERS] pg_class.relistemp
> As one of said vendors, I completely disagree.

I don't agree that you qualify as a vendor. You're on the friggin' core team. I'm talking about vendors like DbVisualizer or TOra, for which PostgreSQL is just one of the databases they support. If stuff breaks gratuitously, the reaction of some of them is always to either drop support or delay it for a year or more. This doesn't benefit our community.

> There are a ton of
> things that change with each release, and all we do by putting in
> hacks for backwards compatibility is add bloat that needs to be
> maintained, and encourage vendors to be lazy.

I don't agree that having comprehensive system views with multi-version stability would be a "hack".

> Breaking compatibility is actually something that is important to us - it
> forces us to fix obvious issues, and makes it much harder to
> inadvertently miss important changes.

What I'm hearing from you is: "Breaking backwards compatibility is something we should do more of because it lets us know which vendors are paying attention and weeds out the unfit." Is that what you meant to say?

That seems like a way to ensure that PostgreSQL support continues to be considered optional, or based on outdated versions, for multi-database tools.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
Re: [HACKERS] Is there a committer in the house?
Josh Berkus wrote:
>> ISTM that you are right that there are other committers that have not
>> done anything. How strange that you name myself, yet not others.
>
> Touchy today, eh?
>
> And I do name others, read the email again.
>
> Anyway, my question is: the list of committers I know of who have
> general knowledge of the codebase and can commit a wide variety of other
> people's patches are:
>
> Tom
> Robert
> Heikki
> Bruce
> Simon

I have found my reading of the email lists is too delayed to effectively commit others' patches.

--
Bruce Momjian
http://momjian.us
EnterpriseDB   http://enterprisedb.com

+ It's impossible for everything to be true. +
Re: [HACKERS] pg_class.relistemp
On Jul 14, 2011, at 2:10 PM, Dave Page wrote:
> Breaking compatibility is actually something that is important to us - it
> forces us to fix obvious issues, and makes it much harder to
> inadvertently miss important changes.

Agreed, but a deprecation cycle would be much appreciated.

David
Re: [HACKERS] Is there a committer in the house?
> ISTM that you are right that there are other committers that have not
> done anything. How strange that you name myself, yet not others.

Touchy today, eh?

And I do name others, read the email again.

Anyway, my question is: the list of committers I know of who have general knowledge of the codebase and can commit a wide variety of other people's patches are:

Tom
Robert
Heikki
Bruce
Simon

Who else?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
Re: [HACKERS] Small patch for GiST: move childoffnum to child
I think there are two bugs in the existing gistFindPath code:

    if (top->parent && XLByteLT(top->parent->lsn,
                                GistPageGetOpaque(page)->nsn) &&
        GistPageGetOpaque(page)->rightlink != InvalidBlockNumber /* sanity check */ )
    {
        /* page splited while we thinking of... */
        ptr = (GISTInsertStack *) palloc0(sizeof(GISTInsertStack));
        ptr->blkno = GistPageGetOpaque(page)->rightlink;
        ptr->childoffnum = InvalidOffsetNumber;
        ptr->parent = top;
        ptr->next = NULL;
        tail->next = ptr;
        tail = ptr;
    }

First, notice that we're setting "ptr->parent = top". 'top' is the current node we're processing, and ptr represents the node to the right of the current node. The current node is *not* the parent of the node to the right. I believe that line should be "ptr->parent = top->parent".

Second, we're adding the entry for the right sibling to the end of the list of nodes to visit. But when we process entries from the list, we exit immediately when we see a leaf page. That means that the right sibling can get queued up behind leaf pages, and thus never visited.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
Re: [HACKERS] WIP: Fast GiST index build
On 14.07.2011 11:33, Alexander Korotkov wrote:
> On Wed, Jul 13, 2011 at 5:59 PM, Heikki Linnakangas <
> heikki.linnakan...@enterprisedb.com> wrote:
>
>> One thing that caught my eye is that when you empty a buffer, you load the
>> entire subtree below that buffer, down to the next buffered or leaf level,
>> into memory. Every page in that subtree is kept pinned. That is a problem;
>> in the general case, the buffer manager can only hold a modest number of
>> pages pinned at a time. Consider that the minimum value for shared_buffers
>> is just 16. That's unrealistically low for any real system, but the default
>> is only 32MB, which equals just 4096 buffers. A subtree could easily be
>> larger than that.
>
> With level step = 1 we need only 2 levels in a subtree. With the minimum
> index tuple size (12 bytes) each page can hold at most 675 tuples. Thus I
> think the default shared_buffers is enough for level step = 1.

Hundreds of buffer pins is still a lot. And with level_step = 2, the number of pins required explodes to 675^2 = 455625.

Pinning a buffer that's already in the shared buffer cache is cheap; I doubt you're gaining much by keeping the private hash table in front of the buffer cache. Also, it's possible that not all of the subtree is actually required during the emptying, so in the worst case pre-loading the pages is counter-productive.

> I believe it's enough to add a check that we have sufficient
> shared_buffers, isn't it?

Well, what do you do if you deem that shared_buffers is too small? Fall back to the old method? Also, shared_buffers is shared by all backends, so you can't assume that you get to use all of it for the index build. You'd need a wide safety margin.

>> I don't think you're benefiting at all from the buffering that BufFile
>> does for you, since you're reading/writing a full block at a time anyway.
>> You might as well use the file API in fd.c directly, ie.
>> OpenTemporaryFile/FileRead/FileWrite.
>
> BufFile distributes temporary data across several files. AFAICS postgres
> avoids working with files larger than 1GB. The size of the tree buffers can
> easily be greater. Without BufFile I would need to maintain the set of
> files manually.

Ah, I see. Makes sense.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
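The pin-count arithmetic in the exchange above is easy to check with a back-of-the-envelope sketch (assuming the thread's figure of ~675 index tuples per page for 12-byte tuples; the function name is just illustrative):

```python
# Back-of-the-envelope check of the pin counts discussed above. A buffer
# emptying that loads the whole subtree down level_step levels touches up
# to 675^level_step pages, each of which is kept pinned.
MAX_TUPLES_PER_PAGE = 675  # figure quoted in the thread for 12-byte tuples

def max_pinned_pages(level_step):
    """Worst-case pages (hence pins) in a subtree spanning level_step levels."""
    return MAX_TUPLES_PER_PAGE ** level_step

print(max_pinned_pages(1))  # 675
print(max_pinned_pages(2))  # 455625 -- far beyond the default 4096 buffers
```

This is why level_step = 1 is marginally workable with default shared_buffers (32MB = 4096 8kB buffers) while level_step = 2 is not.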
Re: [HACKERS] pg_class.relistemp
On Thursday, July 14, 2011, Josh Berkus wrote:
>> There has never, ever, been a guarantee that the system catalogs don't
>> change across versions. Anybody issuing such queries should expect to
>> have to retest them and possibly change them in each new major release.
>
> I know that's always been our policy. It is, however,
> vendor-unfriendly, because we don't supply any interface for many things
> (such as temp tables) other than the system catalogs.
>
> So if we're going to break compatibility, then we could stand to make a
> little noise about it.

As one of said vendors, I completely disagree. There are a ton of things that change with each release, and all we do by putting in hacks for backwards compatibility is add bloat that needs to be maintained, and encourage vendors to be lazy.

Breaking compatibility is actually something that is important to us - it forces us to fix obvious issues, and makes it much harder to inadvertently miss important changes.

--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake
EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Understanding GIN posting trees
On 14.07.2011 17:28, Teodor Sigaev wrote:
>> Why is the posting tree a tree? AFAICS, we never search it using the TID,
>> it's always scanned in whole. It would be simpler to store the TIDs in a
>> posting list in no particular order. This could potentially make
>> insertions cheaper, as you could just append to the last posting list page
>> for the key, instead of traversing the posting tree to a particular
>> location. You could also pack the tids denser, as you wouldn't need to
>> reserve free space for additions in the middle.
>
> For a consistentFn call we need to collect all data for the current tid. We
> do that by scanning over logically ordered arrays of tids, and trees allow
> us to do that by scanning the leaf pages.

Oh, I see. You essentially do a merge join of all the posting trees of the query keys.

Hmm, but we do need to scan all the posting trees of all the matched keys in whole anyway. We could collect all TIDs in the posting lists of all the keys into separate TIDBitmaps, and then combine the bitmaps, calling consistentFn for each TID that was present in at least one bitmap. I guess the performance characteristics of that would be somewhat different from what we have now, and you'd need to keep a lot of in-memory bitmaps if the query contains a lot of keys.

While we're at it, it just occurred to me that if the number of query keys is limited, say <= 16, you could build a lookup table for each combination of keys either occurring or not. You could then use that instead of calling consistentFn for each possible match. You could even use the table to detect common cases like "all/any keys must match", perhaps opening some optimization opportunities elsewhere.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
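The lookup-table idea in the last paragraph could be sketched roughly like this (an illustrative Python sketch, not GIN's actual C implementation; the consistent function here is a stand-in):

```python
# Illustrative sketch of the lookup-table idea above (not GIN's C code).
# For <= 16 query keys, precompute the consistent function over every
# bitmask of "key present / key absent" (at most 2^16 = 65536 entries),
# then answer per-TID checks by a single table lookup.
def build_consistent_table(nkeys, consistent):
    """consistent(present) takes a tuple of bools, one per query key."""
    table = []
    for mask in range(1 << nkeys):
        present = tuple(bool(mask & (1 << k)) for k in range(nkeys))
        table.append(consistent(present))
    return table

# Stand-in consistent function: "all keys must match" (AND semantics).
nkeys = 3
table = build_consistent_table(nkeys, all)

# AND semantics is detectable from the table itself: only the all-ones
# mask is true, which is the kind of common case Heikki mentions.
print(table[(1 << nkeys) - 1])  # True  (all three keys present)
print(table[0b101])             # False (key 1 missing)
```

Scanning the table also makes the "all/any keys must match" detection trivial: AND leaves exactly one true entry, OR leaves exactly one false entry (the empty mask).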
Re: [HACKERS] Is there a committer in the house?
On Thu, Jul 14, 2011 at 8:01 PM, Josh Berkus wrote:
> Currently we have 8 patches "Ready for Committer" in the current CF.
> Some of them have been that status for some time.
>
> From traffic on this list, I'm getting the impression that nobody other
> than Robert, Heikki and Tom are committing other people's patches. I
> know we have more committers than that. Bruce? Simon? Itagaki? Bueller?

Someone called Simon Riggs has committed one patch by another author and reviewed 3 others, as well as spending many days working on bugs for beta. It would be sensible if people based their comments on actual events rather than negative impressions.

ISTM that you are right that there are other committers that have not done anything. How strange that you name myself, yet not others.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] pg_class.relistemp
On Thu, Jul 14, 2011 at 21:59, Josh Berkus wrote: > >> There has never, ever, been a guarantee that the system catalogs don't >> change across versions. Anybody issuing such queries should expect to >> have to retest them and possibly change them in each new major release. > > I know that's always been our policy. It is, however, > vendor-unfriendly because we don't supply any interface for many things > (such as temp tables) other than the system catalogs. > > So if we're going to break compatibility, then we could stand to make a > little noise about it. We've broken the admin apps in pretty much every single release. And they generally don't complain. If someone developing an admin app hasn't been doing extensive testing starting *at the latest* with beta1 (and recommended for each alpha), they shouldn't expect to release until quite long after the release. That said, a stable set of system views certainly wouldn't hurt - but making extra noise about a simple change like this one would likely not make a difference. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Reduced power consumption in WAL Writer process
On Wed, Jul 13, 2011 at 10:56 PM, Peter Geoghegan wrote: > Attached is patch for the WAL writer that removes its tight polling > loop (which probably doesn't get hit often in practice, as we just > sleep if wal_writer_delay is under a second), and, at least > potentially, reduces power consumption when idle by using a latch. > > I will break all remaining power consumption work down into > per-auxiliary process patches. I think that this is appropriate - if > we hit a snag on one of the processes, there is no need to have that > hold up everything. > > I've commented that we handle all expected signals, and therefore we > shouldn't worry about having timeout invalidated by signals, just as > with the archiver. Previously, we didn't even worry about Postmaster > death within the tight polling loop, presumably because > wal_writer_delay is typically small enough to avoid that being a > problem. I thought that WL_POSTMASTER_DEATH might be superfluous here, > but then again there is a codepath specifically for the case where > wal_writer_delay exceeds one second, so it is included in this initial > version. > > Comments? ISTM that this in itself isn't enough to reduce power consumption. Currently the only users of the WALWriter are asynchronous commits, so we should include within RecordTransactionCommit() a SetLatch() command for the WALWriter. That way we can have WALWriter sleep until it's needed. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_class.relistemp
> There has never, ever, been a guarantee that the system catalogs don't > change across versions. Anybody issuing such queries should expect to > have to retest them and possibly change them in each new major release. I know that's always been our policy. It is, however, vendor-unfriendly because we don't supply any interface for many things (such as temp tables) other than the system catalogs. So if we're going to break compatibility, then we could stand to make a little noise about it. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Need help understanding pg_locks
Looks good to me. --- Florian Pflug wrote: > On Jul14, 2011, at 22:18 , Bruce Momjian wrote: > > !OID of the database in which the lock target exists, or > > !zero if the lock is a shared object, or > > !null if the lock is on a transaction ID > > For consistency, I think it should say "target" in the second part > of the sentence also now, instead of "lock ... on". > > Updated patch attached. I tried to make the descriptions a > bit more consistent, replaced "object" by "target", and > added "targeted by" after the phrase which describes the > locked (or waited-for) object. > > best regards, > Florian Pflug > > diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml > index d4a1d36..33be5d0 100644 > *** a/doc/src/sgml/catalogs.sgml > --- b/doc/src/sgml/catalogs.sgml > *** > *** 6928,6936 > oid > linkend="catalog-pg-database">pg_database.oid > > !OID of the database in which the object exists, or > !zero if the object is a shared object, or > !null if the object is a transaction ID > > > > --- 6928,6936 > oid > linkend="catalog-pg-database">pg_database.oid > > !OID of the database in which the lock target exists, or > !zero if the target is a shared object, or > !null if the target is a transaction ID > > > > *** > *** 6938,6944 > oid > linkend="catalog-pg-class">pg_class.oid > > !OID of the relation, or null if the object is not > a relation or part of a relation > > > --- 6938,6944 > oid > linkend="catalog-pg-class">pg_class.oid > > !OID of the relation targeted by the lock, or null if the target is > not > a relation or part of a relation > > > *** > *** 6947,6954 > integer > > > !Page number within the relation, or null if the object > !is not a tuple or relation page > > > > --- 6947,6954 > integer > > > !Page number targeted by the lock within the relation, > !or null if the target is not a relation page or tuple > > > > *** > *** 6956,6962 > smallint > > > !Tuple number within the page, or null if the object is not a tuple > > > > --- 6956,6963 
> smallint > > > !Tuple number targeted by the lock within the page, > !or null if the target is not a tuple > > > > *** > *** 6964,6971 > text > > > !Virtual ID of a transaction, or null if the object is not a > !virtual transaction ID > > > > --- 6965,6972 > text > > > !Virtual ID of the transaction targeted by the lock, > !or null if the target is not a virtual transaction ID > > > > *** > *** 6973,6979 > xid > > > !ID of a transaction, or null if the object is not a transaction ID > > > > --- 6974,6981 > xid > > > !ID of the transaction targeted by the lock, > !or null if the target is not a transaction ID > > > > *** > *** 6981,6988 > oid > linkend="catalog-pg-class">pg_class.oid > > !OID of the system catalog containing the object, or null if the > !object is not a general database object > > > > --- 6983,6990 > oid > linkend="catalog-pg-class">pg_class.oid > > !OID of the system catalog containing the lock target, or null if the > !target is not a general database object > > > > *** > *** 6990,6997 > oid > any OID column > > !OID of the object within its system catalog, or null if the > !object is not a general database object. > For advisory locks it is used to distinguish the two key > spaces (1 for an int8 key, 2 for two int4 keys). > > --- 6992,6999 > oid > any OID column > > !OID of the lock target within its system catalog, or null if the > !target is not a general database object. > For advisory locks it is used to distinguish the two key > spaces (1 for an int8 key, 2 for two int4 keys). > > *** > *** 7001,7010 > smallint > > > !For a table column, this is the column
Re: [HACKERS] WIP: Fast GiST index build
On Wed, Jul 13, 2011 at 5:59 PM, Heikki Linnakangas < heikki.linnakan...@enterprisedb.com> wrote: > One thing that caught my eye is that when you empty a buffer, you load the > entire subtree below that buffer, down to the next buffered or leaf level, > into memory. Every page in that subtree is kept pinned. That is a problem; > in the general case, the buffer manager can only hold a modest number of > pages pinned at a time. Consider that the minimum value for shared_buffers > is just 16. That's unrealistically low for any real system, but the default > is only 32MB, which equals to just 4096 buffers. A subtree could easily be > larger than that. > With level step = 1 we need only 2 levels in the subtree. With the minimum index tuple size (12 bytes), each page can hold at most 675 tuples. Thus I think the default shared_buffers is enough for level step = 1. I believe it's enough to add a check that we have sufficient shared_buffers, isn't it? > I don't think you're benefiting at all from the buffering that BufFile does > for you, since you're reading/writing a full block at a time anyway. You > might as well use the file API in fd.c directly, ie. > OpenTemporaryFile/FileRead/**FileWrite. BufFile distributes temporary data across several files. AFAICS postgres avoids working with files larger than 1GB. The size of the tree buffers can easily be greater. Without BufFile I would need to maintain a set of files manually. -- With best regards, Alexander Korotkov.
Re: [HACKERS] WIP: Fast GiST index build
On 14.07.2011 23:41, Alexander Korotkov wrote: Do you think using "rightlink" as a pointer to the parent page is possible during index build? It would allow us to simplify the code significantly, because there would be no more need to maintain in-memory structures to remember parents. I guess, but where do you store the rightlink, then? Would you need a final pass through the index to fix all the rightlinks? I think you could use the NSN field. It's used to detect concurrent page splits, but those can't happen during index build, so you don't need that field during index build. You just have to make it look like an otherwise illegal NSN, so that it won't be mistaken for a real NSN after the index is built. Maybe add a new flag to mean that the NSN is actually invalid. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Need help understanding pg_locks
On Jul14, 2011, at 22:18 , Bruce Momjian wrote: > !OID of the database in which the lock target exists, or > !zero if the lock is a shared object, or > !null if the lock is on a transaction ID For consistency, I think it should say "target" in the second part of the sentence also now, instead of "lock ... on". Updated patch attached. I tried to make the descriptions a bit more consistent, replaced "object" by "target", and added "targeted by" after the phrase which describes the locked (or waited-for) object. best regards, Florian Pflug diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml index d4a1d36..33be5d0 100644 *** a/doc/src/sgml/catalogs.sgml --- b/doc/src/sgml/catalogs.sgml *** *** 6928,6936 oid pg_database.oid !OID of the database in which the object exists, or !zero if the object is a shared object, or !null if the object is a transaction ID --- 6928,6936 oid pg_database.oid !OID of the database in which the lock target exists, or !zero if the target is a shared object, or !null if the target is a transaction ID *** *** 6938,6944 oid pg_class.oid !OID of the relation, or null if the object is not a relation or part of a relation --- 6938,6944 oid pg_class.oid !OID of the relation targeted by the lock, or null if the target is not a relation or part of a relation *** *** 6947,6954 integer !Page number within the relation, or null if the object !is not a tuple or relation page --- 6947,6954 integer !Page number targeted by the lock within the relation, !or null if the target is not a relation page or tuple *** *** 6956,6962 smallint !Tuple number within the page, or null if the object is not a tuple --- 6956,6963 smallint !Tuple number targeted by the lock within the page, !or null if the target is not a tuple *** *** 6964,6971 text !Virtual ID of a transaction, or null if the object is not a !virtual transaction ID --- 6965,6972 text !Virtual ID of the transaction targeted by the lock, !or null if the target is not a virtual transaction ID 
*** *** 6973,6979 xid !ID of a transaction, or null if the object is not a transaction ID --- 6974,6981 xid !ID of the transaction targeted by the lock, !or null if the target is not a transaction ID *** *** 6981,6988 oid pg_class.oid !OID of the system catalog containing the object, or null if the !object is not a general database object --- 6983,6990 oid pg_class.oid !OID of the system catalog containing the lock target, or null if the !target is not a general database object *** *** 6990,6997 oid any OID column !OID of the object within its system catalog, or null if the !object is not a general database object. For advisory locks it is used to distinguish the two key spaces (1 for an int8 key, 2 for two int4 keys). --- 6992,6999 oid any OID column !OID of the lock target within its system catalog, or null if the !target is not a general database object. For advisory locks it is used to distinguish the two key spaces (1 for an int8 key, 2 for two int4 keys). *** *** 7001,7010 smallint !For a table column, this is the column number (the classid and objid refer to the !table itself). For all other object types, this column is !zero. Null if the object is not a general database object --- 7003,7013 smallint !Column number targeted by the lock (the classid and objid refer to the !table itself), !or zero if the target is some other general database object, !or null if the target is not a general database object -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscriptio
Re: [HACKERS] patch: enhanced get diagnostics statement 2
2011/7/14 Alvaro Herrera : > Excerpts from Pavel Stehule's message of jue jul 14 16:25:56 -0400 2011: >> 2011/7/14 Alvaro Herrera : >> > A couple items for this patch: > >> it is good idea > > Thanks ... I expect you're going to resubmit the patch based on David's > last version and my comments? > yes, but tomorrow, time to go sleep Regards Pavel > -- > Álvaro Herrera > The PostgreSQL Company - Command Prompt, Inc. > PostgreSQL Replication, Consulting, Custom Development, 24x7 support > -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Fast GiST index build
Do you think using "rightlink" as a pointer to the parent page is possible during index build? It would allow us to simplify the code significantly, because there would be no more need to maintain in-memory structures to remember parents. -- With best regards, Alexander Korotkov.
Re: [HACKERS] patch: enhanced get diagnostics statement 2
Excerpts from Pavel Stehule's message of jue jul 14 16:25:56 -0400 2011: > 2011/7/14 Alvaro Herrera : > > A couple items for this patch: > it is good idea Thanks ... I expect you're going to resubmit the patch based on David's last version and my comments? -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch: enhanced get diagnostics statement 2
2011/7/14 Alvaro Herrera : > A couple items for this patch: > > The docs state that the variable to receive each diagnostic value needs > to be "of the right data type" but fails to specify what it is. I think > it'd be good to turn that into a table with three > columns: name, type, description. > > This seems poor style: > > + case PLPGSQL_GETDIAG_ERROR_CONTEXT: > + case PLPGSQL_GETDIAG_ERROR_DETAIL: > + case PLPGSQL_GETDIAG_ERROR_HINT: > + case PLPGSQL_GETDIAG_RETURNED_SQLSTATE: > + case PLPGSQL_GETDIAG_MESSAGE_TEXT: > + if (!$2) > + ereport(ERROR, > + (errcode(ERRCODE_SYNTAX_ERROR), > + errmsg("EXCEPTION_CONTEXT or > EXCEPTION_DETAIL or EXCEPTION_HINT or RETURNED_SQLSTATE or MESSAGE_TEXT are > not allowed in current diagnostics statement"), > + parser_errposition(@1))); > + > > > I think we could replace this with something like > > + if (!$2) > + ereport(ERROR, > + (errcode(ERRCODE_SYNTAX_ERROR), > + errmsg("diagnostic value %s is > not allowed in GET CURRENT DIAGNOSTICS statement", stringify(ditem->kind)), > > > Other than that, and a few grammar fixes in code comments, this seems > good to me. > it is good idea Regards Pavel > -- > Álvaro Herrera > The PostgreSQL Company - Command Prompt, Inc. > PostgreSQL Replication, Consulting, Custom Development, 24x7 support > -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_class.relistemp
"David E. Wheeler" wrote: > A deprecation cycle at least might be useful. How about a "relistemp" extension on pgxn.org for the "generated column" function to provide the backward compatibility? Is the new extension mechanism a sane way to help those who need a phase-out period? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Need help understanding pg_locks
Florian Pflug wrote: > I still believe the chance of confusion to be extremely small, but since > you feel otherwise, what about "Targeted" instead of "Locked". As in > > OID of the relation targeted by the lock, or null if the lock does not > target a relation or part of a relation. > > Page number within the relation targeted by the lock, or null if the > lock does not target a tuple or a relation page. > > Virtual ID of the transaction targeted by the lock, or null if the lock > does not target a virtual transaction ID. > > "Protected"/"protects" instead of "Targeted"/"targets" would also work. > > Both avoid the imprecision of saying "Locked", and the ambiguity "on" - > which might either mean the physical location of the lock, or the object > its protecting/targeting. > > > I reworded that line to: > > > > + OID of the relation of the lock target, or null if the lock is not > > I'm not a huge fan of that. IMHO " .. of .. of .. " chains are hard to > read. Plus, there isn't such a thing as the "relation of a lock target" - > the relation *is* the lock target, not a part thereof. Agreed. I like "targeted by". New patch attached. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. 
+ diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml new file mode 100644 index c5851af..6fa6fa9 *** a/doc/src/sgml/catalogs.sgml --- b/doc/src/sgml/catalogs.sgml *** *** 6928,6936 oid pg_database.oid !OID of the database in which the object exists, or !zero if the object is a shared object, or !null if the lock object is on a transaction ID --- 6928,6936 oid pg_database.oid !OID of the database in which the lock target exists, or !zero if the lock is a shared object, or !null if the lock is on a transaction ID *** *** 6938,6944 oid pg_class.oid !OID of the relation, or null if the lock object is not on a relation or part of a relation --- 6938,6944 oid pg_class.oid !OID of the relation targeted by the lock, or null if the lock is not on a relation or part of a relation *** *** 6947,6953 integer !Page number within the relation, or null if the lock object is not on a tuple or relation page --- 6947,6953 integer !Page number within the relation targeted by the lock, or null if the lock is not on a tuple or relation page *** *** 6956,6963 smallint !Tuple number within the page, or null if the lock object is not !on a tuple --- 6956,6963 smallint !Tuple number within the page targeted by the lock, or null if !the lock is not on a tuple *** *** 6965,6971 text !Virtual ID of a transaction lock, or null if the lock object is not on a virtual transaction ID --- 6965,6971 text !Virtual ID of a transaction targeted by the lock, or null if the lock is not on a virtual transaction ID *** *** 6974,6980 xid !ID of a transaction lock, or null if the lock object is not on a transaction ID --- 6974,6981 xid !ID of a transaction targeted by the lock, or null if the lock !is not on a transaction ID *** *** 6982,6989 oid pg_class.oid !OID of the system catalog containing the object, or null if the !lock object is not on a general database object. 
--- 6983,6990 oid pg_class.oid !OID of the system catalog targeted by the lock, or null if the !lock is not on a general database object. *** *** 6991,6998 oid any OID column !OID of the object within its system catalog, or null if the !lock object is not on a general database object. For advisory locks it is used to distinguish the two key spaces (1 for an int8 key, 2 for two int4 keys). --- 6992,6999 oid any OID column !OID of the object within its system catalog targeted by the !lock, or null if the lock is not on a general database object. For advisory locks it is used to dis
Re: [HACKERS] pg_class.relistemp
On Jul 14, 2011, at 12:19 PM, Tom Lane wrote: >> Are they not testing our 9.1 betas? > > There has never, ever, been a guarantee that the system catalogs don't > change across versions. Anybody issuing such queries should expect to > have to retest them and possibly change them in each new major release. > I see nothing to sweat about here. A deprecation cycle at least might be useful. Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch: enhanced get diagnostics statement 2
A couple items for this patch: The docs state that the variable to receive each diagnostic value needs to be "of the right data type" but fails to specify what it is. I think it'd be good to turn that into a table with three columns: name, type, description. This seems poor style: + case PLPGSQL_GETDIAG_ERROR_CONTEXT: + case PLPGSQL_GETDIAG_ERROR_DETAIL: + case PLPGSQL_GETDIAG_ERROR_HINT: + case PLPGSQL_GETDIAG_RETURNED_SQLSTATE: + case PLPGSQL_GETDIAG_MESSAGE_TEXT: + if (!$2) + ereport(ERROR, + (errcode(ERRCODE_SYNTAX_ERROR), +errmsg("EXCEPTION_CONTEXT or EXCEPTION_DETAIL or EXCEPTION_HINT or RETURNED_SQLSTATE or MESSAGE_TEXT are not allowed in current diagnostics statement"), +parser_errposition(@1))); + I think we could replace this with something like + if (!$2) + ereport(ERROR, + (errcode(ERRCODE_SYNTAX_ERROR), +errmsg("diagnostic value %s is not allowed in GET CURRENT DIAGNOSTICS statement", stringify(ditem->kind)), Other than that, and a few grammar fixes in code comments, this seems good to me. -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Understanding GIN posting trees
On 14.07.2011 22:10, Tom Lane wrote: Heikki Linnakangas writes: Why is the posting tree a tree? AFAICS, we never search it using the TID, it's always scanned in whole. It would be simpler to store the TIDs in a posting list in no particular order. This could potentially make insertions cheaper, as you could just append to the last posting list page for the key, instead of traversing the posting tree to a particular location. You could also pack the tids denser, as you wouldn't need to reserve free space for additions in the middle. Surely VACUUM would like to search it by TID for deletion purposes? It doesn't; it scans all the TID lists in whole. I guess it could search by TID; that could be a win if there are only a few deleted tuples in a small range of pages. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_class.relistemp
Bruce Momjian writes: > Josh Berkus wrote: >> BTW, if we're dumping relistemp, we're going to need to notify every >> maker of a PostgreSQL admin interface before we release 9.1. > Are they not testing our 9.1 betas? There has never, ever, been a guarantee that the system catalogs don't change across versions. Anybody issuing such queries should expect to have to retest them and possibly change them in each new major release. I see nothing to sweat about here. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] per-column generic option
All, Is the spec for this feature still under discussion? I don't seem to see a consensus on this thread. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Understanding GIN posting trees
Heikki Linnakangas writes: > Why is the posting tree a tree? AFAICS, we never search it using the > TID, it's always scanned in whole. It would be simpler to store the TIDs > in a posting list in no particular order. This could potentially make > insertions cheaper, as you could just append to the last posting list > page for the key, instead of traversing the posting tree to a particular > location. You could also pack the tids denser, as you wouldn't need to > reserve free space for additions in the middle. Surely VACUUM would like to search it by TID for deletion purposes? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extension ownership and pg_dump
Heikki Linnakangas writes: > pg_dump seems to not dump the owner of extensions. Is that intentional? Yeah. Partly it was a matter of not wanting to implement ALTER EXTENSION OWNER, but I think there were some definitional issues too. See the archives from back around February or so. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Is there a committer in the house?
All, Currently we have 8 patches "Ready for Committer" in the current CF. Some of them have been that status for some time. >From traffic on this list, I'm getting the impression that nobody other than Robert, Heikki and Tom are committing other people's patches. I know we have more committers than that. Bruce? Simon? Itagaki? Bueller? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Extension ownership and pg_dump
pg_dump seems to not dump the owner of extensions. Is that intentional? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Allow pg_archivecleanup to ignore extensions
On 7/12/11 11:17 AM, Josh Berkus wrote: > On 7/12/11 7:38 AM, Simon Riggs wrote: >> On Sun, Jul 10, 2011 at 7:13 PM, Josh Berkus wrote: >> >>> This patch[1] is for some reason marked "waiting on Author". But I >>> can't find that there's been any review of it searching the list. >>> What's going on with it? Has it been reviewed? >> >> Yes, I reviewed it on list. Some minor changes were discussed. I'm >> with Greg now, so we'll discuss and handle it. > > I couldn't find the review searching the archives. Can you please link > it in the Commitfest application? Thanks. Given the total lack of activity on this patch, I'm bumping it. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Single pass vacuum - take 1
Excerpts from Pavan Deolasee's message of jue jul 14 13:54:36 -0400 2011: > On Thu, Jul 14, 2011 at 12:43 PM, Heikki Linnakangas < > heikki.linnakan...@enterprisedb.com> wrote: > > Seems odd to store relindxvacxlogid/off as two int32 columns. Store it in > > one uint64 column, or invent a new datatype for LSNs, or store it as text in > > %X/%X format. > > > > > Yeah. I don't remember what issues I faced with uint64 column, may be did > not get around the case where uint64 is not natively supported on the > platform. But %X/%X looks very reasonable. Will change to that. Where is this to be stored? AFAIR we no longer support platforms that do not have working 64 bit integer types. As a column name, relindxvacxlogid seems a bit unfortunate, BTW ... -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Need help understanding pg_locks
On Jul14, 2011, at 19:06 , Bruce Momjian wrote: > Florian Pflug wrote: >> On Jul13, 2011, at 21:08 , Bruce Momjian wrote: >>> + OID of the relation lock target, or null if the lock is not >>> on a relation or part of a relation >> >> That, however, not so much. "relation lock target" might easily >> be interpreted as the "relation's lock target" or the >> "relation lock's target" - at least by non-native speakers such >> as myself. The same is true for "transaction lock target" and >> friends. >> >> Can't we simply go with "Locked relation", "Locked transaction id" >> and so on (as in my versions B,C and D up-thread)? I can't really >> get excited about the slight imprecision caused by the fact that some >> rows describe aspiring lock holders instead of current lock holders. >> The existence of the "granted" column makes the situation pretty clear. >> >> Plus, it's technically not even wrong - a process is waiting because >> somebody else *is* actually holding a lock on the object. So >> the tuple/transaction/... is, in fact, a "Locked tuple/transaction/..." > > I think it will be very confusing to have "locked" refer to the person > holding the lock while the row is based on who is waiting for it. I still believe the chance of confusion to be extremely small, but since you feel otherwise, what about "Targeted" instead of "Locked". As in OID of the relation targeted by the lock, or null if the lock does not target a relation or part of a relation. Page number within the relation targeted by the lock, or null if the lock does not target a tuple or a relation page. Virtual ID of the transaction targeted by the lock, or null if the lock does not target a virtual transaction ID. "Protected"/"protects" instead of "Targeted"/"targets" would also work. Both avoid the imprecision of saying "Locked", and the ambiguity "on" - which might either mean the physical location of the lock, or the object it's protecting/targeting. 
> I reworded that line to: > > + OID of the relation of the lock target, or null if the lock is not I'm not a huge fan of that. IMHO " .. of .. of .. " chains are hard to read. Plus, there isn't such a thing as the "relation of a lock target" - the relation *is* the lock target, not a part thereof. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Single pass vacuum - take 1
On Thu, Jul 14, 2011 at 12:43 PM, Heikki Linnakangas < heikki.linnakan...@enterprisedb.com> wrote: > On 14.07.2011 18:57, Pavan Deolasee wrote: > >> On Thu, Jul 14, 2011 at 11:46 AM, Simon Riggs >> wrote: >> >>> I'd say that seems way too complex for such a small use case and we've >>> >>> only just fixed the bugs from 8.4 vacuum map complexity. The code's >>> looking very robust now and I'm uneasy that such changes are really >>> worth it. >>> >>> Thanks Simon for looking at the patch. >> >> I am not sure if the use case is really narrow. Today, we dirty the pages >> in >> both the passes and also emit WAL records. Just the heap scan can take a >> very long time for large tables, blocking the autovacuum worker threads >> from >> doing useful work on other tables. If I am not wrong, we use ring buffers >> for vacuum which would most-likely force those buffers to be written/read >> twice to the disk. >> > > Seems worthwhile to me. What bothers me a bit is the need for the new > 64-bit LSN value on each heap page. Also, note that temporary tables are not > WAL-logged, so there's no LSNs. > > Yeah, the good thing is we store it only when its needed. Temp and unlogged tables don't need any special handling because we don't rely on the WAL logging for the table itself. As you have noted down the thread, any counter which is guaranteed to not wrap-around would have worked. LSN is just very handy for the purpose (let me think more if we can just do with a flag). > How does this interact with the visibility map? If you set the visibility > map bit after vacuuming indexes, a subsequent vacuum will not visit the > page. The second vacuum will update relindxvacxlogid/off, but it will not > clean up the dead line pointers left behind by the first vacuum. Now the LSN > on the page differs from the one stored in pg_class, so subsequent pruning > will not remove the dead line pointers either. 
I think you can sidestep that > if you check that the page's vacuum LSN <= vacuum LSN in pg_class, instead > of equality. > > Hmm. We need to carefully see that. As the patch stands, we don't set the visibility map bit when there are any dead line pointers on the page. I'm not sure if it's clear, but the ItemIdIsDead() test accounts for dead as well as dead-vacuum line pointers, so the test in lazy_heap_scan() won't let the VM bit be set early. The next vacuum cycle would still look at the page and set the bit if the page appears all-visible at that time. Note that in the next vacuum cycle, we would first clear the dead-vacuum line pointers if it's not already done by some intermediate HOT-prune operation. > Ignoring the issue stated in the previous paragraph, I think you wouldn't > actually need a 64-bit LSN. A smaller counter is enough, as wrap-around > doesn't matter. In fact, a single bit would be enough. After a successful > vacuum, the counter on each heap page (with dead line pointers) is N, and > the value in pg_class is N. There are no other values on the heap, because > vacuum will have cleaned them up. When you begin the next vacuum, it will > stamp pages with N+1. So at any stage, there is only one of two values on > any page, so a single bit is enough. (But as I said, that doesn't hold if > vacuum skips some pages thanks to the visibility map) > > I am not sure if that can take care of the aborted vacuum case with a single bit or a counter that can wrap around. Let me think more about it. > Is there something in place to make sure that pruning uses an up-to-date > relindxvacxlogid/off value? I guess it doesn't matter if it's out-of-date, > you'll just miss the opportunity to remove some dead tuples. > > Yeah, exactly. We just rely on the value that was read when the pg_class tuple was last read. If we don't have the up-to-date value, we might miss some opportunity to clean up those dead-vacuum line pointers. 
> Seems odd to store relindxvacxlogid/off as two int32 columns. Store it in > one uint64 column, or invent a new datatype for LSNs, or store it as text in > %X/%X format. > > Yeah. I don't remember what issues I faced with a uint64 column; maybe I did not get around the case where uint64 is not natively supported on the platform. But %X/%X looks very reasonable. Will change to that. Thanks, Pavan -- Pavan Deolasee EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Understanding GIN posting trees
I have a couple of questions on GIN: The code seems to assume that it's possible for the same TID to appear twice for a single key (see addItemPointersToTuple()). I understand that it's possible for a single heap tuple to contain the same key twice. For example if you index an array of integers like [1,2,1]. But once you've inserted all the keys for a single heap item, you never try to insert the same TID again, so no duplicates should occur. Looking at the history, it looks like pre-8.4 we assumed that no such duplicates are possible. Duplicates of a single key for one column are eliminated in extractEntriesSU(), but apparently when the multi-column support was added, we didn't make the de-duplication run across the keys extracted from all columns. Now that the posting tree/list insertion code has to deal with duplicates anyway, the de-duplication performed in extractEntriesSU() seems pointless. But I wonder if it would be better to make extractEntriesSU() remove duplicates across all columns, so that the insertion code wouldn't need to deal with duplicates. During vacuuming of the pending list we could get a power loss, leaving some data in both the tree and the pending list; after restart, pgsql will try to insert the same data into the tree. Dealing with the duplicates in the insertion code isn't particularly difficult. And in fact, now that we only support the getbitmap method, we wouldn't really need to eliminate duplicates anyway. But I have an ulterior motive: Why is the posting tree a tree? AFAICS, we never search it using the TID, it's always scanned in full. It would be simpler to store the TIDs in a posting list in no particular order. This could potentially make insertions cheaper, as you could just append to the last posting list page for the key, instead of traversing the posting tree to a particular location. You could also pack the TIDs more densely, as you wouldn't need to reserve free space for additions in the middle. 
For the consistentFn call we need to collect all data for the current TID. We do that by scanning over logically ordered arrays of TIDs, and trees allow us to do that by scanning the leaf pages. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/
Re: [HACKERS] pg_class.relistemp
Josh Berkus wrote: > All, > > BTW, if we're dumping relistemp, we're going to need to notify every > maker of a PostgreSQL admin interface before we release 9.1. > > This is why we should have had a complete set of sysviews ... Are they not testing our 9.1 betas? -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Re: [HACKERS] pg_class.relistemp
All, BTW, if we're dumping relistemp, we're going to need to notify every maker of a PostgreSQL admin interface before we release 9.1. This is why we should have had a complete set of sysviews ... -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] [BUG] SSPI authentication fails on Windows when server parameter is localhost or domain name
On Wed, Jun 15, 2011 at 10:53, Ahmed Shinwari wrote: > Hi All, > > I faced a bug on Windows while connecting via SSPI authentication. I was > able to find the bug and have attached the patch. Details listed below; > > Postgres Installer: Version 9.0.4 > OS: Windows Server 2008 R2/Windows 7 Thanks - great analysis! However, I think there is a better fix for this - simply moving a } one line. In particular, I'm concerned about passing the same pointer both as input and output to the function - I couldn't find anything in the documentation saying this was safe (nor did I find anything saying it's unsafe, but.) Especially since this code clearly behaves differently on different versions - I've been completely unable to reproduce this on any of my test machines, but they are all Windows Server 2003. So - attached is a new version of the patch, how does this look to you? FYI, I've had Thom test this new version and it does appear to work fine in his scenario. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ diff --git a/src/backend/libpq/auth.c b/src/backend/libpq/auth.c index 7799111..936cfea 100644 --- a/src/backend/libpq/auth.c +++ b/src/backend/libpq/auth.c @@ -1349,16 +1349,22 @@ pg_SSPI_recvauth(Port *port) _("could not accept SSPI security context"), r); } + /* + * Overwrite the current context with the one we just received. + * If sspictx is NULL it was the first loop and we need to allocate + * a buffer for it. On subsequent runs, we can just overwrite the + * buffer contents since the size does not change. 
+ */ if (sspictx == NULL) { sspictx = malloc(sizeof(CtxtHandle)); if (sspictx == NULL) ereport(ERROR, (errmsg("out of memory"))); - - memcpy(sspictx, &newctx, sizeof(CtxtHandle)); } + memcpy(sspictx, &newctx, sizeof(CtxtHandle)); + if (r == SEC_I_CONTINUE_NEEDED) elog(DEBUG4, "SSPI continue needed");
Re: [HACKERS] help with sending email
On Jul 13, 2011, at 12:29 PM, "Fernando Acosta Torrelly" wrote: > Hi everybody: > > I am using pgmail to send email in an application, but I would like to use > html format > > Does anybody have an example of how to do this, or what do you recommend I > use for doing this? > > Thanks in advance for your attention. > This is the mailing list for PostgreSQL, not pgmail. And it is also a development mailing list; there are others for user questions. ...Robert
Re: [HACKERS] Re: patch review : Add ability to constrain backend temporary file space
On 14/07/11 18:48, Tatsuo Ishii wrote: Hi I am the new reviewer:-) I have looked into the v6 patches. One thing I would like to suggest is, enhancing the error message when temp_file_limit is exceeded. ERROR: aborting due to exceeding temp file limit Is it possible to add the current temp file limit to the message? For example, ERROR: aborting due to exceeding temp file limit 1kB I know the current setting of temp_file_limit can be viewed in other ways, but I think this will make admin's or application developer's life a little bit easier. Hi Tatsuo, Yeah, good suggestion - I agree that it would be useful to supply extra detail, I'll amend and resubmit a new patch (along with whatever review modifications we come up with in the meantime)! Cheers Mark
Re: [HACKERS] Need help understanding pg_locks
Florian Pflug wrote: > On Jul 13, 2011, at 21:08 , Bruce Momjian wrote: > > - OID of the database in which the object exists, or > > - zero if the object is a shared object, or > > - null if the lock object is on a transaction ID > > + OID of the database in which the lock target exists, or > > + zero if the lock is a shared object, or > > + null if the lock is on a transaction ID > > This sounds good. > > > + OID of the relation lock target, or null if the lock is not > >on a relation or part of a relation > > That, however, not so much. "relation lock target" might easily > be interpreted as the "relation's lock target" or the > "relation lock's target" - at least by non-native speakers such > as myself. The same is true for "transaction lock target" and > friends. > > Can't we simply go with "Locked relation", "Locked transaction id" > and so on (as in my versions B,C and D up-thread)? I can't really > get excited about the slight imprecision caused by the fact that some > rows describe aspiring lock holders instead of current lock holders. > The existence of the "granted" column makes the situation pretty clear. > > Plus, it's technically not even wrong - a process is waiting because > somebody else *is* actually holding a lock on the object. So > the tuple/transaction/... is, in fact, a "Locked tuple/transaction/..." I think it will be very confusing to have "locked" refer to the person holding the lock while the row is based on who is waiting for it. I reworded that line to: + OID of the relation of the lock target, or null if the lock is not Updated patch attached. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. 
+ diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml index c5851af..84c2257 100644 --- a/doc/src/sgml/catalogs.sgml +++ b/doc/src/sgml/catalogs.sgml @@ -6928,9 +6928,9 @@ oid pg_database.oid - OID of the database in which the object exists, or - zero if the object is a shared object, or - null if the lock object is on a transaction ID + OID of the database in which the lock target exists, or + zero if the lock is a shared object, or + null if the lock is on a transaction ID @@ -6938,7 +6938,7 @@ oid pg_class.oid - OID of the relation, or null if the lock object is not + OID of the relation of the lock target, or null if the lock is not on a relation or part of a relation @@ -6947,7 +6947,7 @@ integer - Page number within the relation, or null if the lock object + Page number within the relation, or null if the lock is not on a tuple or relation page @@ -6956,7 +6956,7 @@ smallint - Tuple number within the page, or null if the lock object is not + Tuple number within the page, or null if the lock is not on a tuple @@ -6965,7 +6965,7 @@ text - Virtual ID of a transaction lock, or null if the lock object is not + Virtual ID of a transaction lock target, or null if the lock is not on a virtual transaction ID @@ -6974,7 +6974,7 @@ xid - ID of a transaction lock, or null if the lock object is not on a transaction ID + ID of a transaction lock target, or null if the lock is not on a transaction ID @@ -6982,8 +6982,8 @@ oid pg_class.oid - OID of the system catalog containing the object, or null if the - lock object is not on a general database object. + OID of the system catalog containing the lock target, or null if the + lock is not on a general database object. @@ -6992,7 +6992,7 @@ any OID column OID of the object within its system catalog, or null if the - lock object is not on a general database object. + lock is not on a general database object. 
For advisory locks it is used to distinguish the two key spaces (1 for an int8 key, 2 for two int4 keys). @@ -7005,7 +7005,7 @@ For a table column, this is the column number (the classid and objid refer to the table itself). For all other object types, this column is - zero. Null if the lock object is not on a general database object. + zero. Null if the lock is not on a general database object. 
Re: [HACKERS] pg_class.relistemp
On Jul 14, 2011, at 9:55 AM, Heikki Linnakangas wrote: > Far back. But you only need it in >= 9.1. Older versions have the > pg_class.relistemp column anyway. Yeah. > Not sure how this helps, though. If you modify pgTAP to install that > automatically in pgTAP when dealing with a new server version, you might as > well modify its queries to use relispersistence = 't' directly when dealing > with a new server version. It works as a manual work-around if you can't > upgrade pgTAP, I guess. Yeah, that's what I'd rather avoid. I'll probably have to modify the function that makes the call to look at the version number. Annoying, but do-able. https://github.com/theory/pgtap/blob/master/sql/pgtap.sql.in#L5894 Best, David
Re: [HACKERS] pg_class.relistemp
"David E. Wheeler" wrote: > On Jul 13, 2011, at 12:57 PM, Kevin Grittner wrote: > >> create or replace function relistemp(rel pg_class) >> returns boolean language sql immutable strict as >> $$select $1.relpersistence = 't';$$; >> >> Just don't forget to use the table name or alias in front of >> it... :-) > > Oh, nice hack. How far back does that work (pgTAP runs on 8.0 and > higher)? As far as I know, the technique of creating a function with a record type as its only parameter to use as a "generated column" goes way back. This particular function won't work prior to 9.1, because you won't have the relpersistence column, but then, prior to 9.1 you wouldn't need to run this because you already have a relistemp column. -Kevin
Re: [HACKERS] pg_class.relistemp
On 14.07.2011 19:51, David E. Wheeler wrote: On Jul 13, 2011, at 12:57 PM, Kevin Grittner wrote: create or replace function relistemp(rel pg_class) returns boolean language sql immutable strict as $$select $1.relpersistence = 't';$$; Just don't forget to use the table name or alias in front of it... :-) Oh, nice hack. How far back does that work (pgTAP runs on 8.0 and higher)? Far back. But you only need it in >= 9.1. Older versions have the pg_class.relistemp column anyway. Not sure how this helps, though. If you modify pgTAP to install that automatically in pgTAP when dealing with a new server version, you might as well modify its queries to use relispersistence = 't' directly when dealing with a new server version. It works as a manual work-around if you can't upgrade pgTAP, I guess. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Full GUID support
On Jul 14, 2011, at 12:05 AM, Hiroshi Saito wrote: > Hi Thomas-san, Ralf-san. > > I appreciate your great work. > Thanks! > > CC to Postgres-ML. > > Regards, > Hiroshi Saito > > (2011/07/14 3:49), Thomas Lotterer wrote: > > Thanks for the hint. > > Our ftp daemon is dumping core. > > We are debugging ... Ah, good news, thanks. Where should I report stuff like this in the future? I sent a message about this to r...@engelschall.com on May 15th and also a Twitter DM but didn't hear back… Thanks, David
Re: [HACKERS] pg_class.relistemp
On Jul 13, 2011, at 12:57 PM, Kevin Grittner wrote: > create or replace function relistemp(rel pg_class) > returns boolean language sql immutable strict as > $$select $1.relpersistence = 't';$$; > > Just don't forget to use the table name or alias in front of it... :-) Oh, nice hack. How far back does that work (pgTAP runs on 8.0 and higher)? Thanks, David
Re: [HACKERS] Single pass vacuum - take 1
On 14.07.2011 18:57, Pavan Deolasee wrote: On Thu, Jul 14, 2011 at 11:46 AM, Simon Riggs wrote: I'd say that seems way too complex for such a small use case and we've only just fixed the bugs from 8.4 vacuum map complexity. The code's looking very robust now and I'm uneasy that such changes are really worth it. Thanks Simon for looking at the patch. I am not sure if the use case is really narrow. Today, we dirty the pages in both the passes and also emit WAL records. Just the heap scan can take a very long time for large tables, blocking the autovacuum worker threads from doing useful work on other tables. If I am not wrong, we use ring buffers for vacuum which would most likely force those buffers to be written/read twice to the disk. Seems worthwhile to me. What bothers me a bit is the need for the new 64-bit LSN value on each heap page. Also, note that temporary tables are not WAL-logged, so there's no LSNs. How does this interact with the visibility map? If you set the visibility map bit after vacuuming indexes, a subsequent vacuum will not visit the page. The second vacuum will update relindxvacxlogid/off, but it will not clean up the dead line pointers left behind by the first vacuum. Now the LSN on the page differs from the one stored in pg_class, so subsequent pruning will not remove the dead line pointers either. I think you can sidestep that if you check that the page's vacuum LSN <= vacuum LSN in pg_class, instead of equality. Ignoring the issue stated in the previous paragraph, I think you wouldn't actually need a 64-bit LSN. A smaller counter is enough, as wrap-around doesn't matter. In fact, a single bit would be enough. After a successful vacuum, the counter on each heap page (with dead line pointers) is N, and the value in pg_class is N. There are no other values on the heap, because vacuum will have cleaned them up. When you begin the next vacuum, it will stamp pages with N+1. 
So at any stage, there is only one of two values on any page, so a single bit is enough. (But as I said, that doesn't hold if vacuum skips some pages thanks to the visibility map) Is there something in place to make sure that pruning uses an up-to-date relindxvacxlogid/off value? I guess it doesn't matter if it's out-of-date, you'll just miss the opportunity to remove some dead tuples. Seems odd to store relindxvacxlogid/off as two int32 columns. Store it in one uint64 column, or invent a new datatype for LSNs, or store it as text in %X/%X format. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Reduced power consumption in WAL Writer process
On Thu, Jul 14, 2011 at 10:53 AM, Heikki Linnakangas wrote: > On 14.07.2011 12:42, Simon Riggs wrote: >> >> On Thu, Jul 14, 2011 at 9:57 AM, Fujii Masao >> wrote: >> >>> Currently walwriter might write out the WAL before a transaction commits. >>> IOW, walwriter tries to write out the WAL in wal_buffers in every >>> wakeup. >>> This might be useful for a long transaction which generates lots of WAL >>> records before commit. So we should call SetLatch() in XLogInsert() >>> instead >>> of RecordTransactionCommit()? Though I'm not sure how much walwriter >>> improves the performance of synchronous commit case.. >> >> Yeh, we did previously have a heuristic to write out the WAL when it >> was more than half full. Not sure I want to put exactly that code back >> into such a busy code path. >> >> I suggest that we set latch every time the wal buffers wrap. >> >> So at the bottom of AdvanceXLInsertBuffer(), if nextidx == 0 then >> SetLatch on the WALWriter. >> >> That's a simple test and we only check it if we're switching WAL buffer pages. > > That was my first thought too - but I wonder if that's too aggressive? A > backend that does for example a large bulk load will cycle through the > buffers real quick. It seems like a bad idea to wake up walwriter between > each buffer in that case. Then again, setting a latch that's already set is > cheap, so maybe it works fine in practice. > > Maybe it would be better to do it less frequently, say, every time you > switch to a new WAL segment. Or every 10 buffers or something like that. Yes, that's roughly what I'm saying. When nextidx == 0 is just after we wrapped wal_buffers, i.e. we only wake up the WALWriter every wal_buffers pages. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Reduced power consumption in WAL Writer process
On 14.07.2011 12:42, Simon Riggs wrote: On Thu, Jul 14, 2011 at 9:57 AM, Fujii Masao wrote: Currently walwriter might write out the WAL before a transaction commits. IOW, walwriter tries to write out the WAL in wal_buffers in every wakeup. This might be useful for a long transaction which generates lots of WAL records before commit. So we should call SetLatch() in XLogInsert() instead of RecordTransactionCommit()? Though I'm not sure how much walwriter improves the performance of synchronous commit case.. Yeh, we did previously have a heuristic to write out the WAL when it was more than half full. Not sure I want to put exactly that code back into such a busy code path. I suggest that we set latch every time the wal buffers wrap. So at the bottom of AdvanceXLInsertBuffer(), if nextidx == 0 then SetLatch on the WALWriter. That's a simple test and we only check it if we're switching WAL buffer pages. That was my first thought too - but I wonder if that's too aggressive? A backend that does for example a large bulk load will cycle through the buffers real quick. It seems like a bad idea to wake up walwriter between each buffer in that case. Then again, setting a latch that's already set is cheap, so maybe it works fine in practice. Maybe it would be better to do it less frequently, say, every time you switch to a new WAL segment. Or every 10 buffers or something like that. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Patch Review: Bugfix for XPATH() if text or attribute nodes are selected
On sön, 2011-07-10 at 11:40 -0700, Josh Berkus wrote: > Hackers, > > >> B. 6. Current behaviour _is intended_ (there is "if" to check node type) > >> and _"natural"_. In this particular case user ask for text content of some > >> node, and this content is actually "<". > > > > I don't buy that. The check for the node type is there because > > two different libxml functions are used to convert nodes to > > strings. The if has absolutely *zero* to do with escaping, except > > for that missing escape_xml() call in the "else" case. > > > > Secondly, there is little point in having a type XML if we > > don't actually ensure that values of that type can only contain > > well-formed XML. > > Can anyone else weigh in on this? Peter? Looks like a good change to me.
Re: [HACKERS] Patch Review: Bugfix for XPATH() if text or attribute nodes are selected
On ons, 2011-07-13 at 11:58 +0200, Nicolas Barbier wrote: > 2011/6/29, Florian Pflug : > > > Secondly, there is little point in having a type XML if we > > don't actually ensure that values of that type can only contain > > well-formed XML. > > +1. The fact that XPATH() must return a type that cannot depend on the > given expression (even if it is a constant string) may be unfortunate, > but returning XML-that-is-not-quite-XML sounds way worse to me. The example given was XPATH('/*/text()', '<') This XPath expression returns a node set, and XML is a serialization format of a node, so returning xml[] in this particular case seems entirely reasonable to me.
Re: [HACKERS] Single pass vacuum - take 1
On Tue, Jul 12, 2011 at 9:47 PM, Pavan Deolasee wrote: > http://archives.postgresql.org/pgsql-hackers/2011-05/msg01119.php > PFA a patch which implements the idea with some variation. > At the start of the first pass, we remember the current LSN. Every page that > needs some work is HOT-pruned so that dead tuples are truncated to dead line > pointers. We collect those dead line pointers and mark them as > dead-vacuumed. Since we don't have any LP flag bits available, we instead > just use the LP_DEAD flag along with offset value 1 to mark the line pointer > as dead-vacuumed. The page is defragmented and we store the LSN remembered > at the start of the pass in the page special area as vacuum LSN. We also > update the free space at that point because we are not going to do a second > pass on the page anymore. > > Once we collect all dead line pointers and mark them as dead-vacuumed, we > clean up the indexes and remove all index pointers pointing to those > dead-vacuumed line pointers. If the index vacuum finishes successfully, we > store the LSN in the pg_class row of the table (needs catalog changes). At > that point, we are certain that there are no index pointers pointing to > dead-vacuumed line pointers and they can be reclaimed at the next > opportunity. > > During normal operations or subsequent vacuum, if the page is chosen for > HOT-pruning, we check if it has any dead-vacuumed line pointers and if the > vacuum LSN stored on the page special area is equal to the one stored in the > pg_class row, and reclaim those dead-vacuum line pointers (the index > pointers to these line pointers are already taken care of). If the pg_class > LSN is not the same, the last vacuum probably did not finish completely and > we collect the dead-vacuum line pointers just like other dead line pointers > and try to clean up the index pointers as usual. > I ran a few pgbench tests with the patch. 
I don't see much difference in the > overall tps, but the vacuum time for the accounts table reduces by nearly > 50%. Nor do I see much difference in the overall bloat, but then pgbench > uses HOT very nicely and the accounts table got only a couple of vacuum cycles > in my 7-8 hour run. > There are a couple of things that probably need more attention. I am not sure > if we need to teach ANALYZE to treat dead line pointers differently. Since > they take up much less space than a dead tuple, they should definitely have > a lower weight, but at the same time, we need to take into account the > number of indexes on the table. The start of first pass LSN that we are > remembering is in fact the start of the WAL page and I think there could be > some issues with that, especially for very tiny tables. For example, first > vacuum may run completely. If another vacuum is started on the same table > and say it gets the same LSN (because we did not write more than 1 page > worth WAL in between) and if the second vacuum aborts after it cleaned up > few pages, we might get into some trouble. The likelihood of such things > happening is very small, but maybe it's worth taking care of it. Maybe we > can get the exact current LSN and not store it in the pg_class if we don't > do anything during the cycle. > Comments? Hi Pavan, I'd say that seems way too complex for such a small use case and we've only just fixed the bugs from 8.4 vacuum map complexity. The code's looking very robust now and I'm uneasy that such changes are really worth it. You're trying to avoid Phase 3, the second pass on the heap. Why not avoid the write in Phase 1 if it's clear that we'll need to come back again in Phase 3? So we either do a write in Phase 1 or in Phase 3, but never both? That minimises the writes, which are what hurt the most. We can reduce the overall cost simply by not doing Phase 2 and Phase 3 if the number of rows to remove is too few, say < 1%. 
-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Single pass vacuum - take 1
On Thu, Jul 14, 2011 at 11:46 AM, Simon Riggs wrote: > > > Hi Pavan, > > I'd say that seems way too complex for such a small use case and we've > only just fixed the bugs from 8.4 vacuum map complexity. The code's > looking very robust now and I'm uneasy that such changes are really > worth it. > > Thanks Simon for looking at the patch. I am not sure if the use case is really narrow. Today, we dirty the pages in both the passes and also emit WAL records. Just the heap scan can take a very long time for large tables, blocking the autovacuum worker threads from doing useful work on other tables. If I am not wrong, we use ring buffers for vacuum which would most likely force those buffers to be written/read twice to the disk. Which part of the patch do you think is very complex? We can try to simplify that. Or are you seeing any obvious bugs that I missed? IMHO taking out a phase completely from vacuum (as this patch does) can simplify things. > You're trying to avoid Phase 3, the second pass on the heap. Why not > avoid the write in Phase 1 if it's clear that we'll need to come back > again in Phase 3? So we either do a write in Phase 1 or in Phase 3, > but never both? That minimises the writes, which are what hurt the > most. > > You can possibly do the work in Phase 3, but that doesn't avoid the second scan. > We can reduce the overall cost simply by not doing Phase 2 and Phase 3 > if the number of rows to remove is too few, say < 1%. > If you have set the vacuum parameters such that it kicks in when there are say 5% updates/deletes, you would most likely have that much work to do anyway. Thanks, Pavan -- Pavan Deolasee EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] SAVEPOINTs and COMMIT performance
On Mon, Jun 6, 2011 at 10:33 AM, Heikki Linnakangas wrote: > On 06.02.2011 23:09, Simon Riggs wrote: >> >> On Sun, 2011-02-06 at 12:11 -0500, Bruce Momjian wrote: >>> >>> Did this ever get addressed? >> >> Patch attached. >> >> Seems like the easiest fix I can come up with. > >> @@ -2518,7 +2518,7 @@ CommitTransactionCommand(void) >> case TBLOCK_SUBEND: >> do >> { >> - CommitSubTransaction(); >> + CommitSubTransaction(true); >> s = CurrentTransactionState; /* changed >> by pop */ >> } while (s->blockState == TBLOCK_SUBEND); >> /* If we had a COMMIT command, finish off the main >> xact too */ > > We also get into this codepath at RELEASE SAVEPOINT, in which case it is > wrong to not reassign the locks to the parent subtransaction. Attached patch splits TBLOCK_SUBEND state into two new states: TBLOCK_SUBCOMMIT and TBLOCK_SUBRELEASE, so that we can do the right thing. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services savepoint_commit_performance.v2.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
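The reason one state was not enough can be sketched with a toy model (illustrative only — this is not the actual xact.c code): on top-level COMMIT, a subtransaction's locks can be transferred straight to the top transaction, while on RELEASE SAVEPOINT they must be reassigned to the parent subtransaction, so the cleanup loop needs to know which case it is in:

```c
/*
 * Toy model of why TBLOCK_SUBEND was split: the two ways a
 * subtransaction can end require different lock handling.
 * Names mirror the patch's new states, but the helper function
 * is invented for illustration.
 */
typedef enum
{
    TBLOCK_SUBCOMMIT,           /* ending because of top-level COMMIT */
    TBLOCK_SUBRELEASE           /* ending because of RELEASE SAVEPOINT */
} SubEndState;

/* Returns 1 when locks may go straight to the top-level transaction. */
static int
locks_go_to_top_xact(SubEndState state)
{
    return state == TBLOCK_SUBCOMMIT;
}
```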
Re: [HACKERS] patch for distinguishing PG instances in event log
2011/5/26 MauMau : > Hello, > > I wrote and attached a patch for the TODO item below (which I proposed). > > Allow multiple Postgres clusters running on the same machine to distinguish > themselves in the event log > http://archives.postgresql.org/pgsql-hackers/2011-03/msg01297.php > http://archives.postgresql.org/pgsql-hackers/2011-05/msg00574.php > > I changed two things from the original proposal. > > 1. regsvr32.exe needs /n when you specify event source > I described the reason in src/bin/pgevent/pgevent.c. > > 2. I moved the article for event log registration to a more suitable place > The traditional place and what I originally proposed were not the best, because > those who don't build from source won't read those places. > > I successfully tested event log registration/unregistration, event logging > with/without event_source parameter, and SHOWing event_source parameter with > psql on Windows Vista (32-bit). I would appreciate it if someone who has a > 64-bit environment could test it on 64-bit Windows. > > I'll add this patch to the first CommitFest of 9.2. Thank you in advance for > reviewing it. + + On Windows, you need to register an event source + and its library with the operating system in order + to make use of the eventlog option for + log_destination. + See for details. + * This part is not strictly correct - you don't *need* to do that, it just makes things look nicer, no? * Also, what is the use for set_eventlog_parameters()? It's just a string variable, it should work without it. * These days we avoid #ifdef'ing GUCs just because they are not available on a platform - so the list stays consistent. The GUC should be available on non-Windows platforms as well. * The GUC also needs to go in postgresql.conf.sample * We never build in unicode mode, so all those checks are unnecessary. * Are we really allowed to call MessageBox in DllRegisterServer? Won't that break badly in cases like silent installs? Attached is an updated patch, which doesn't work yet. 
I believe the changes to the backend are correct, but probably some of the cleanups and changes in the dll are incorrect, because I seem to be unable to register either the default or a custom handler so far. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index 842558d..583a5c9 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -2975,6 +2975,13 @@ local0.*/var/log/postgresql to the syslog daemon's configuration file to make it work. + + On Windows, you need to register an event source + and its library with the operating system in order + to make use of the eventlog option for + log_destination. + See for details. + @@ -3221,6 +3228,24 @@ local0.*/var/log/postgresql + + event_source (string) + + event_source configuration parameter + + + + When logging to event log is enabled, this parameter + determines the program name used to identify + PostgreSQL messages in + event log. The default is + PostgreSQL. + This parameter can only be set in the postgresql.conf + file or on the server command line. + + + + diff --git a/doc/src/sgml/installation.sgml b/doc/src/sgml/installation.sgml index 0410cff..41b9009 100644 --- a/doc/src/sgml/installation.sgml +++ b/doc/src/sgml/installation.sgml @@ -1552,19 +1552,6 @@ PostgreSQL, contrib and HTML documentation successfully made. Ready to install. - Registering eventlog on Windows: - -To register a Windows eventlog -library with the operating system, issue this command after installation: - -regsvr32 pgsql_library_directory/pgevent.dll - -This creates registry entries used by the event viewer. 
- - - - Uninstallation: To undo the installation use the command gmake diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml index ef83206..bfbb641 100644 --- a/doc/src/sgml/runtime.sgml +++ b/doc/src/sgml/runtime.sgml @@ -2294,4 +2294,54 @@ ssh -L 6:db.foo.com:5432 j...@shell.foo.com + + Registering Event Log on Windows + + + event log + event log + + + + To register a Windows event log + library with the operating system, issue this command: + +regsvr32 pgsql_library_directory/pgevent.dll + + This creates registry entries used by the event viewer, under the default event + source named "PostgreSQL". + + + + You can specify the event source name with /i option. In this case, -n option + is necessary, too: + +regsvr32 /n /i:event_source_name + pgsql_library_directory/pgevent.dll + + You also need to set in + postgresql.conf. Note that the event source specific
Re: [HACKERS] proposal: a validator for configuration files
On Jul 14, 2011, at 4:38 AM, Alvaro Herrera wrote: > Excerpts from Florian Pflug's message of mié jul 13 20:12:28 -0400 2011: >> On Jul14, 2011, at 01:38 , Alvaro Herrera wrote: >>> One strange thing here is that you could get two such messages; say if a >>> file has 100 parse errors and there are also valid lines that contain >>> bogus settings (foo = bar). I don't find this to be too problematic, >>> and I think fixing it would be excessively annoying. >>> >>> For example, a bogus run would end like this: >>> >>> 95 LOG: syntax error in file "/pgsql/install/HEAD/data/postgresql.conf" >>> line 4, near end of line >>> 96 LOG: syntax error in file "/pgsql/install/HEAD/data/postgresql.conf" >>> line 41, near end of line >>> 97 LOG: syntax error in file "/pgsql/install/HEAD/data/postgresql.conf" >>> line 104, near end of line >>> 98 LOG: syntax error in file "/pgsql/install/HEAD/data/postgresql.conf" >>> line 156, near end of line >>> 99 LOG: syntax error in file "/pgsql/install/HEAD/data/postgresql.conf" >>> line 208, near end of line >>> 100 LOG: syntax error in file "/pgsql/install/HEAD/data/postgresql.conf" >>> line 260, near end of line >>> 101 LOG: too many errors found, stopped processing file >>> "/pgsql/install/HEAD/data/postgresql.conf" >>> 102 LOG: unrecognized configuration parameter "plperl.err" >>> 103 LOG: unrecognized configuration parameter "this1" >>> 104 LOG: too many errors found, stopped processing file >>> "/pgsql/install/HEAD/data/postgresql.conf" >>> 105 FATAL: errors detected while parsing configuration files >> >> How about changing ParseConfigFile to say "too many *syntax* error found" >> instead? It'd be more precise, and we wouldn't emit exactly the >> same message twice. > > Yeah, I thought about doing it that way but refrained because it'd be > one more string to translate. That's a poor reason, I admit :-) I'll > change it. 
This happens because the check on the total number of errors so far is made only after coming across at least one unrecognized configuration option. What about adding one more check right after ParseConfigFile, so we can bail out early when overwhelmed with syntax errors? This would save a line in translation :). > >> Do you want me to take a closer look at your modified version of the >> patch before you commit, or did you post it more as a "FYI, this is >> how it's going to look like"? > > I know I'd feel more comfortable if you (and Alexey, and Selena) gave it > another look :-) I have checked it here and don't see any more problems with it. -- Command Prompt, Inc. http://www.CommandPrompt.com PostgreSQL Replication, Consulting, Custom Development, 24x7 support
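The early-bail-out idea above can be sketched roughly as follows. This is illustrative code only, not the real guc-file.c logic; the function name, parameters, and the cap value are invented for the example:

```c
#include <stdbool.h>

#define MAX_CONFIG_ERRORS 100   /* illustrative cap, as in the log above */

/*
 * Sketch: take the syntax-error count from the parse pass first and bail
 * out immediately if the cap is already hit, then count semantic errors
 * (unrecognized settings), stopping as soon as the combined total reaches
 * the cap.  Returns the number of errors counted.
 */
static int
check_config(int syntax_errors, int n_settings, const bool *setting_ok,
             bool *stopped_early)
{
    int errors = syntax_errors;

    *stopped_early = false;
    if (errors >= MAX_CONFIG_ERRORS)    /* check right after ParseConfigFile */
    {
        *stopped_early = true;
        return errors;
    }
    for (int i = 0; i < n_settings; i++)
    {
        if (!setting_ok[i] && ++errors >= MAX_CONFIG_ERRORS)
        {
            *stopped_early = true;
            break;
        }
    }
    return errors;
}
```

With this shape, a file that already produced 100 syntax errors never reaches the unrecognized-parameter pass, so the "too many errors" message is emitted only once.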
Re: [HACKERS] Patch Review: Bugfix for XPATH() if text or attribute nodes are selected
On Thu, 14 Jul 2011 15:15:33 +0300, Peter Eisentraut wrote: On sön, 2011-07-10 at 11:40 -0700, Josh Berkus wrote: Hackers, >> B. 6. Current behaviour _is intended_ (there is "if" to check node type) and _"natural"_. In this particular case the user asks for the text content of some node, and this content is actually "<". > > I don't buy that. The check for the node type is there because > two different libxml functions are used to convert nodes to > strings. The if has absolutely *zero* to do with escaping, except > for that missing escape_xml() call in the "else" case. > > Secondly, there is little point in having a type XML if we > don't actually ensure that values of that type can only contain > well-formed XML. Can anyone else weigh in on this? Peter? Looks like a good change to me. I'll bump it in a few hours, as I can't recall the password for my keyring. Now my hands are clean and it's no longer my business to worry about this. Best regards. Radek.
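The escaping Peter refers to can be illustrated with a small stand-alone version. This is not the backend's actual escape_xml() — it is a minimal sketch showing why a text node whose content is "<" must be entity-escaped to keep the value well-formed XML (the caller must free the result):

```c
#include <stdlib.h>
#include <string.h>

/*
 * Minimal illustration of XML text escaping: replace the characters
 * that would break well-formedness with their entity references.
 * Not the real PostgreSQL escape_xml(); invented for this example.
 */
static char *
escape_xml_text(const char *in)
{
    size_t  outlen = 0;
    char   *out, *p;
    const char *s;

    for (s = in; *s; s++)
        outlen += (*s == '<' || *s == '>') ? 4 : (*s == '&') ? 5 : 1;
    out = p = malloc(outlen + 1);
    for (s = in; *s; s++)
    {
        switch (*s)
        {
            case '<': memcpy(p, "&lt;", 4); p += 4; break;
            case '>': memcpy(p, "&gt;", 4); p += 4; break;
            case '&': memcpy(p, "&amp;", 5); p += 5; break;
            default: *p++ = *s;
        }
    }
    *p = '\0';
    return out;
}
```

So the text node "<" from the thread comes out as "&lt;", which is what the missing call in the "else" branch should produce.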
[HACKERS] Understanding GIN posting trees
I have a couple of questions on GIN: The code seems to assume that it's possible for the same TID to appear twice for a single key (see addItemPointersToTuple()). I understand that it's possible for a single heap tuple to contain the same key twice. For example if you index an array of integers like [1,2,1]. But once you've inserted all the keys for a single heap item, you never try to insert the same TID again, so no duplicates should occur. Looking at the history, it looks like pre-8.4 we assumed that no such duplicates are possible. Duplicates of a single key for one column are eliminated in extractEntriesSU(), but apparently when the multi-column support was added, we didn't make the de-duplication to run across the keys extracted from all columns. Now that the posting tree/list insertion code has to deal with duplicates anyway, the de-duplication performed in extractEntriesSU() seems pointless. But I wonder if it would be better to make extractEntriesSU() remove duplicates across all columns, so that the insertion code wouldn't need to deal with duplicates. Dealing with the duplicates in the insertion code isn't particularly difficult. And in fact, now that we only support the getbitmap method, we wouldn't really need to eliminate duplicates anyway. But I have an ulterior motive: Why is the posting tree a tree? AFAICS, we never search it using the TID, it's always scanned in whole. It would be simpler to store the TIDs in a posting list in no particular order. This could potentially make insertions cheaper, as you could just append to the last posting list page for the key, instead of traversing the posting tree to a particular location. You could also pack the tids denser, as you wouldn't need to reserve free space for additions in the middle. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
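The cross-column de-duplication Heikki suggests for extractEntriesSU() amounts to sorting the extracted entries and dropping duplicates before insertion. The sketch below is heavily simplified — real GIN entries are (attribute, key) pairs with datum comparators, while this uses plain integers — but it shows the shape of the idea:

```c
#include <stdlib.h>

/* Comparator for qsort; avoids overflow of the naive a - b form. */
static int
cmp_int(const void *a, const void *b)
{
    int ia = *(const int *) a, ib = *(const int *) b;

    return (ia > ib) - (ia < ib);
}

/*
 * Illustrative sketch of de-duplicating extracted keys across all
 * columns of one heap item: sort in place, then compact out repeats.
 * Returns the de-duplicated length.
 */
static int
dedup_keys(int *keys, int n)
{
    int j = 0;

    if (n == 0)
        return 0;
    qsort(keys, n, sizeof(int), cmp_int);
    for (int i = 1; i < n; i++)
        if (keys[i] != keys[j])
            keys[++j] = keys[i];
    return j + 1;
}
```

For the [1,2,1] array example from the message, the three extracted keys collapse to two, so the insertion code would never see the same TID twice for one key.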
Re: [HACKERS] Small patch for GiST: move childoffnum to child
On Thu, Jul 14, 2011 at 12:56 PM, Heikki Linnakangas < heikki.linnakan...@enterprisedb.com> wrote: > First, notice that we're setting "ptr->parent = top". 'top' is the current > node we're processing, and ptr represents the node to the right of the > current node. The current node is *not* the parent of the node to the right. > I believe that line should be "ptr->parent = top->parent".

I think the same.

> Second, we're adding the entry for the right sibling to the end of the list > of nodes to visit. But when we process entries from the list, we exit > immediately when we see a leaf page. That means that the right sibling can > get queued up behind leaf pages, and thus never visited.

I think a possible solution is to queue the right sibling immediately after the current page. Thus, this code fragment should look like this:

    if (top->parent && XLByteLT(top->parent->lsn, GistPageGetOpaque(page)->nsn) &&
        GistPageGetOpaque(page)->rightlink != InvalidBlockNumber /* sanity check */ )
    {
        /* page split while we were thinking of... */
        ptr = (GISTInsertStack *) palloc0(sizeof(GISTInsertStack));
        ptr->blkno = GistPageGetOpaque(page)->rightlink;
        ptr->childoffnum = InvalidOffsetNumber;
        ptr->parent = top->parent;
        ptr->next = top->next;
        top->next = ptr;
        if (tail == top)
            tail = ptr;
    }

-- With best regards, Alexander Korotkov.
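The linked-list manipulation in that fragment — splicing the right sibling in directly after the current node and keeping the tail pointer consistent — can be isolated into a tiny stand-alone model. GISTInsertStack is reduced to just the link fields here; this is an illustration of the list mechanics, not GiST code:

```c
#include <stddef.h>

/* Minimal stand-in for GISTInsertStack: just a block number and link. */
typedef struct Node
{
    int          blkno;
    struct Node *next;
} Node;

/*
 * Insert 'ptr' immediately after 'top' in the to-visit list, so the
 * right sibling is processed before any leaf pages queued later.
 * Returns the (possibly new) tail of the list.
 */
static Node *
insert_after(Node *top, Node *ptr, Node *tail)
{
    ptr->next = top->next;
    top->next = ptr;
    return (tail == top) ? ptr : tail;
}
```

If the current node was the tail, the inserted sibling becomes the new tail; otherwise the tail is unchanged — exactly the `if (tail == top) tail = ptr;` step in the corrected fragment.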
Re: [HACKERS] Reduced power consumption in WAL Writer process
On Thu, Jul 14, 2011 at 9:57 AM, Fujii Masao wrote: > Currently walwriter might write out the WAL before a transaction commits. > IOW, walwriter tries to write out the WAL in wal_buffers on every wakeup. > This might be useful for a long transaction which generates lots of WAL > records before commit. So we should call SetLatch() in XLogInsert() instead > of RecordTransactionCommit()? Though I'm not sure how much walwriter > improves the performance of the synchronous commit case.. Yeah, we did previously have a heuristic to write out the WAL when it was more than half full. I'm not sure I want to put exactly that code back into such a busy code path. I suggest that we set the latch every time the WAL buffers wrap. So at the bottom of AdvanceXLInsertBuffer(), if nextidx == 0 then SetLatch on the WALWriter. That's a simple test, and we only check it when we're switching WAL buffer pages. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
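Simon's "wake the WAL writer on buffer wrap" heuristic can be modeled in a few lines. This is an illustrative simulation, not the real AdvanceXLInsertBuffer() — the buffer count, function name, and the wakeup counter standing in for SetLatch() are all invented for the example:

```c
#define N_WAL_BUFFERS 8         /* illustrative wal_buffers setting */

static int wakeups = 0;         /* stand-in for SetLatch(walwriter latch) */

/*
 * Advance to the next WAL insertion buffer.  The latch is set only
 * when the index wraps to 0, i.e. once per full cycle through the
 * ring of wal_buffers -- a cheap test done only on page switches.
 */
static int
advance_buffer(int curidx)
{
    int nextidx = (curidx + 1) % N_WAL_BUFFERS;

    if (nextidx == 0)
        wakeups++;
    return nextidx;
}
```

Cycling through 16 page switches with 8 buffers wakes the writer exactly twice, so the wakeup rate scales with WAL volume rather than with a polling interval.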
Re: [HACKERS] WIP: Fast GiST index build
On Thu, Jul 14, 2011 at 12:42 PM, Heikki Linnakangas < heikki.linnakan...@enterprisedb.com> wrote: > Pinning a buffer that's already in the shared buffer cache is cheap, I > doubt you're gaining much by keeping the private hash table in front of the > buffer cache.

Yes, I see. Pinning a lot of buffers doesn't give significant benefits but produces a lot of problems. Also, removing the hash table can simplify the code.

> Also, it's possible that not all of the subtree is actually required during > the emptying, so in the worst case pre-loading them is counter-productive.

What do you think about pre-fetching all of the subtree? It requires actually loading level_step - 1 levels of it. In some cases it can still be counter-productive, but it is probably productive on average?

> Well, what do you do if you deem that shared_buffers is too small? Fall > back to the old method? Also, shared_buffers is shared by all backends, so > you can't assume that you get to use all of it for the index build. You'd > need a wide safety margin.

I assumed we would check whether there is enough shared_buffers before switching to the buffering method. But concurrent backends make this method unsafe. There are other difficulties with concurrent backends: it would be nice to estimate the usage of the effective cache by other backends before switching to the buffering method. If we don't take care of that, we might fail to switch to the buffering method in cases where it could give a significant benefit. -- With best regards, Alexander Korotkov.
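The cost of pre-fetching a subtree can be estimated with back-of-envelope arithmetic: loading the top `levels` levels of a subtree with fanout f touches roughly 1 + f + f^2 + ... pages, which is what has to fit in (the builder's share of) shared_buffers. The function below is purely illustrative arithmetic, not anything from the patch:

```c
/*
 * Rough page count for pre-fetching the top 'levels' levels of a
 * subtree with the given fanout: a geometric sum 1 + f + f^2 + ...
 * Illustrative only; real GiST fanout varies per page.
 */
static long
subtree_pages(long fanout, int levels)
{
    long pages = 0, level_width = 1;

    for (int i = 0; i < levels; i++)
    {
        pages += level_width;
        level_width *= fanout;
    }
    return pages;
}
```

For example, with a fanout around 100, pre-fetching level_step - 1 = 2 levels touches about 101 pages per subtree, which gives a feel for how quickly the requirement grows with level_step.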
Re: [HACKERS] Reduced power consumption in WAL Writer process
On Thu, Jul 14, 2011 at 5:39 PM, Simon Riggs wrote: > On Wed, Jul 13, 2011 at 10:56 PM, Peter Geoghegan > wrote: >> Attached is a patch for the WAL writer that removes its tight polling >> loop (which probably doesn't get hit often in practice, as we just >> sleep if wal_writer_delay is under a second), and, at least >> potentially, reduces power consumption when idle by using a latch. >> >> I will break all remaining power consumption work down into >> per-auxiliary process patches. I think that this is appropriate - if >> we hit a snag on one of the processes, there is no need to have that >> hold up everything. >> >> I've commented that we handle all expected signals, and therefore we >> shouldn't worry about having timeout invalidated by signals, just as >> with the archiver. Previously, we didn't even worry about Postmaster >> death within the tight polling loop, presumably because >> wal_writer_delay is typically small enough to avoid that being a >> problem. I thought that WL_POSTMASTER_DEATH might be superfluous here, >> but then again there is a codepath specifically for the case where >> wal_writer_delay exceeds one second, so it is included in this initial >> version. >> >> Comments? > > ISTM that this in itself isn't enough to reduce power consumption. > > Currently the only people that use WALWriter are asynchronous commits, > so we should include within RecordTransactionCommit() a SetLatch() > command for the WALWriter. > > That way we can have WALWriter sleep until its needed. +1 Currently walwriter might write out the WAL before a transaction commits. IOW, walwriter tries to write out the WAL in wal_buffers on every wakeup. This might be useful for a long transaction which generates lots of WAL records before commit. So should we call SetLatch() in XLogInsert() instead of RecordTransactionCommit()? Though I'm not sure how much walwriter improves the performance of the synchronous commit case.. 
Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
[HACKERS] help with sending email
Hi everybody: I am using pgmail to send email from an application, but I would like to use HTML format. Does anybody have an example of how to do this, or what do you recommend I use for this? Thanks in advance for your attention. Best Regards, Fernando Acosta Lima - Perú
Re: [HACKERS] Full GUID support
Hi Thomas-san, Ralf-san. I appreciate your great work. Thanks! CC to Postgres-ML. Regards, Hiroshi Saito (2011/07/14 3:49), Thomas Lotterer wrote: > Thanks for the hint. > Our ftp daemon is dumping core. > We are debugging ...