Re: [HACKERS] Not ready for 8.3
On Fri, 18 May 2007, Jim C. Nasby wrote: Rather than trying to get something "FAQ-perfect" right now, can you at least brain-dump what your current process is so that others can learn? I've created a wiki page for this and added Heikki's tips: http://developer.postgresql.org/index.php/Working_with_CVS Heikki's great summary helps (I think the one piece I was screwing up is covered there), and Pavan's comments adds some useful bits. The still missing part is how to make a real branch to work in, which is much easier to work with once you figure out how to do it than either using just diffs or tagging. I agree the wiki is the right place to hash this out; when that's converged to something useful, then take a snapshot and convert for the official manual. I'll work on it. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Signing off of patches (was Re: Not ready for 8.3)
On Fri, 2007-18-05 at 21:59 -0400, Tom Lane wrote: > I kinda think this is emphasizing the wrong end of the process. I don't disagree, but I think a tool like this would still be enormously helpful (to me, at any rate). While there's more to the process of feature development than just mailing patch hunks back and forth, Review Board seems a nice improvement for that part of the process. -Neil ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Signing off of patches (was Re: Not ready for 8.3)
On 05/18/2007 08:59:11 PM, Tom Lane wrote: I'd like to see something that emphasizes review and feedback at the stages of germinal idea, rough functional spec, implementation concept, etc. Reviewing the actual code is good too, but if that's the first stage that you ask for help at, you are off in the wrong direction; particularly so if you're a larval Postgres hacker. Speaking as a larval Postgres hacker I have trouble asking about the germinal idea and rough functional spec parts. Without having some clue about the implementation concept it's difficult for me to imagine whether or not I want to or will be able to put the effort into making the actual code work. Rather then spend time yakking about design, I figured I'd better put the effort into poking about the code enough to see what would be involved. It might be different if I "just felt like hacking about in Postgres", but I've got a specific need I want to see addressed. That constrains what I'm willing to work on so the pertinent question becomes how hard it's going to be to get the job done. I guess what it comes down to is that I trust my estimate of how hard it's going to be for me to learn to do something new, even a bad initial estimate, more than I trust some stranger who's not familiar with my skill set. So I try cutting some code just to get going. Then decide if I want to continue. Obviously a PG expert is going to be able to tell me a lot about what I'm trying to do. But that might take a lot of mentoring, and I'm also not ready to commit to paying back with code contribution that goes past my immediate need. Which makes me reluctant to ask for such help. I'd guess a lot of larval hackers are in my shoes. They're getting involved to address some specific need. Hope this helps. Karl <[EMAIL PROTECTED]> Free Software: "You don't pay back, you pay forward." -- Robert A. Heinlein ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] COPY into a view; help w. design & patch
On 05/18/2007 07:05:50 PM, Robert Haas wrote: I'm not sure exactly why you want to do with this, but it seems very similar to what you can already do using prepared statements. Instead of saying COPY INTO (statement) (column [, ...]) a1,a2,a3,...,an b1,b2,b3,...,bn c1,c2,c3,...,cn \. I don't really want to do this. I really want my users to be able to use the COPY statement without worrying about whether they are copying into a table or a view. COPY tableorview (column [, ...]) FROM stdin; a1,a2,a3,...,an b1,b2,b3,...,bn c1,c2,c3,...,cn \. I just wanted to cover all the options when going over the design choices, so came up with the COPY INTO syntax. You could instead say: BEGIN WORK PREPARE somestatementhandle (column[, ...]) AS statement EXECUTE somestatementhandle ('a1','a2','a3','...','an'); EXECUTE somestatementhandle ('b1','b2','b3','...','bn'); EXECUTE somestatementhandle ('c1','c2','c3','...','cn'); DEALLOCATE somestatementhandle COMMIT WORK The trouble is that my users have data, in excel spreadsheets and the like, and it needs to get into PostgreSQL. The data they have corresponds to views made in the db, but not to tables. I _could_ make tables that "correspond" to the views and put BEFORE INSERT triggers on them and have the triggers insert into the views (or the equalivent), but then the users would have to use the views for most things and the "corresponding tables" when doing a COPY or using the application's data import function. That's not so good for lots of reasons. Of course I could always write a special application for each view to import into each view, but why not have the COPY command there to do it for me? Karl <[EMAIL PROTECTED]> Free Software: "You don't pay back, you pay forward." -- Robert A. Heinlein ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Signing off of patches (was Re: Not ready for 8.3)
Alvaro Herrera <[EMAIL PROTECTED]> writes: > In order to make patch review more effective, perhaps we could use some > tools to help us. How about > http://www.chipx86.com/blog/?p=222 I kinda think this is emphasizing the wrong end of the process. "Code everything, then ask for comments" is about as far away from the correct mindset as one can readily get. I'd like to see something that emphasizes review and feedback at the stages of germinal idea, rough functional spec, implementation concept, etc. Reviewing the actual code is good too, but if that's the first stage that you ask for help at, you are off in the wrong direction; particularly so if you're a larval Postgres hacker. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Signing off of patches (was Re: Not ready for 8.3)
Andrew Dunstan wrote: > > Alvaro Herrera wrote: > > > >Having the Linux process still in memory, I started thinking that maybe > >what we need, is a sign-off process, whereby developer A reviews other > >developers' patches, make comments, and when the commented-on developer > >(call him B) has fixed the issues that A had, then A signs off B patch. > >In return for the favor, B also reviews and signs off A patches. > >Eventually this leads to more cooperation between otherwise independent > >developers. > > You're thinking of the wrong process. What you seem to be suggesting is > a process that mimics the Usenet Oracle[1], in which if you ask a > question you get given one to answer ... Interesting, but, err, not really what I was thinking. In order to make patch review more effective, perhaps we could use some tools to help us. How about http://www.chipx86.com/blog/?p=222 ? -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Signing off of patches (was Re: Not ready for 8.3)
Alvaro Herrera wrote: Having the Linux process still in memory, I started thinking that maybe what we need, is a sign-off process, whereby developer A reviews other developers' patches, make comments, and when the commented-on developer (call him B) has fixed the issues that A had, then A signs off B patch. In return for the favor, B also reviews and signs off A patches. Eventually this leads to more cooperation between otherwise independent developers. You're thinking of the wrong process. What you seem to be suggesting is a process that mimics the Usenet Oracle[1], in which if you ask a question you get given one to answer ... cheers andrew [1] http://cgi.cs.indiana.edu/~oracle/index.cgi (Zot) ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] COPY into a view; help w. design & patch
I'm not sure exactly why you want to do with this, but it seems very similar to what you can already do using prepared statements. Instead of saying COPY INTO (statement) (column [, ...]) a1,a2,a3,...,an b1,b2,b3,...,bn c1,c2,c3,...,cn \. You could instead say: BEGIN WORK PREPARE somestatementhandle (column[, ...]) AS statement EXECUTE somestatementhandle ('a1','a2','a3','...','an'); EXECUTE somestatementhandle ('b1','b2','b3','...','bn'); EXECUTE somestatementhandle ('c1','c2','c3','...','cn'); DEALLOCATE somestatementhandle COMMIT WORK See http://www.postgresql.org/docs/8.2/interactive/sql-prepare.html ...Robert -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Karl O. Pinc Sent: Friday, May 18, 2007 6:41 PM To: pgsql-hackers@postgresql.org Subject: [HACKERS] COPY into a view; help w. design & patch Hi, I'm attempting a patch that would allow the COPY command to copy into a view. I needed to code something so as to get a clue, but the design has not been discussed so I'm posting here rather than the patches list. I had a wee bit of discussion about this on IRC. There was a suggestion to add generality by to copying into a INSERT statement. However the INSERT statement and the COPY statement both list column names, and other issues came up and I stopped thinking about it. Better I think would be to have a syntax like: COPY INTO ( statement [; , ...]) ( column [, ...] ) FROM { 'filename' | STDIN } and so forth Statements would then have $1, $2 type arguments in them that correspond to the supplied column names. The column names wouldn't mean much, really they'd just be an indication of how many columns there are in the input data. Statements would usually be INSERT statements, but could be any other sort as well. (DELETE comes to mind as useful, as do INSERT statements that use a SELECT ... WHERE NOT EXISTS ... sort of query to deal with inserting into 1-to-many relationships.) I don't know if it would be tricky to use a semicolon as the token delimiting statements but I presume not. If this more general syntax were chosen the copying into a view would just be syntactic sugar for copying into an INSERT statement that inserted into the view. I haven't thought a whole lot about a generalized syntax. (The return code would be more like that of EXECUTE.) I've been focusing on copying into a view, which is what I want. At this point I'd much prefer just implementing the part that copies into a view as that's already something I need help with. (!) As far as the patch itself, it's time to ask for help. The basic idea is to prepare an INSERT statement into the view and then run it in a portal for each line of input. (This should generalize to the "more general" copy syntax above, I hope.) I've checked (I think) that my data structures are the same as what I'd get if I was doing a PREPARE and then a EXECUTE. There's issues of locking and snapshots and probably other stuff and I thought I was following the example of what the prepare/execute code does but I've obviously got something wrong. I'm having problems debugging what I've got. It works up to the point of the PortalRun() call, then it crashes. The regression tests are incomplete, done only to the point where, in copy2.sql, it crashes. The changes to outfuncs.c can be ignored. It won't be in the final patch. I've just been using it to dump various data structures as can be seen from the various bits of debugging stuff left in my patch. I've been debugging by running "make check" and then looking at the copy2.out section of regression.diffs. This isn't going to work if I'm going to poke about inside the PortalRun() call. Is there a make target that will setup the regression environment so that I can then run the backend via gdb, or something? I'm working on a system that has a live pg install on it and need to be careful not to break that. Please help with suggestions for design, code, and how to debug. As you may have noticed, this is the first time I've messed with the pg code. I could really use a lot of help. I've been at this for quite a while and you can see that I've not even gotten something to work. Thanks. Karl <[EMAIL PROTECTED]> Free Software: "You don't pay back, you pay forward." -- Robert A. Heinlein ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
[HACKERS] COPY into a view; help w. design & patch
Hi, I'm attempting a patch that would allow the COPY command to copy into a view. I needed to code something so as to get a clue, but the design has not been discussed so I'm posting here rather than the patches list. I had a wee bit of discussion about this on IRC. There was a suggestion to add generality by to copying into a INSERT statement. However the INSERT statement and the COPY statement both list column names, and other issues came up and I stopped thinking about it. Better I think would be to have a syntax like: COPY INTO ( statement [; , ...]) ( column [, ...] ) FROM { 'filename' | STDIN } and so forth Statements would then have $1, $2 type arguments in them that correspond to the supplied column names. The column names wouldn't mean much, really they'd just be an indication of how many columns there are in the input data. Statements would usually be INSERT statements, but could be any other sort as well. (DELETE comes to mind as useful, as do INSERT statements that use a SELECT ... WHERE NOT EXISTS ... sort of query to deal with inserting into 1-to-many relationships.) I don't know if it would be tricky to use a semicolon as the token delimiting statements but I presume not. If this more general syntax were chosen the copying into a view would just be syntactic sugar for copying into an INSERT statement that inserted into the view. I haven't thought a whole lot about a generalized syntax. (The return code would be more like that of EXECUTE.) I've been focusing on copying into a view, which is what I want. At this point I'd much prefer just implementing the part that copies into a view as that's already something I need help with. (!) As far as the patch itself, it's time to ask for help. The basic idea is to prepare an INSERT statement into the view and then run it in a portal for each line of input. (This should generalize to the "more general" copy syntax above, I hope.) I've checked (I think) that my data structures are the same as what I'd get if I was doing a PREPARE and then a EXECUTE. There's issues of locking and snapshots and probably other stuff and I thought I was following the example of what the prepare/execute code does but I've obviously got something wrong. I'm having problems debugging what I've got. It works up to the point of the PortalRun() call, then it crashes. The regression tests are incomplete, done only to the point where, in copy2.sql, it crashes. The changes to outfuncs.c can be ignored. It won't be in the final patch. I've just been using it to dump various data structures as can be seen from the various bits of debugging stuff left in my patch. I've been debugging by running "make check" and then looking at the copy2.out section of regression.diffs. This isn't going to work if I'm going to poke about inside the PortalRun() call. Is there a make target that will setup the regression environment so that I can then run the backend via gdb, or something? I'm working on a system that has a live pg install on it and need to be careful not to break that. Please help with suggestions for design, code, and how to debug. As you may have noticed, this is the first time I've messed with the pg code. I could really use a lot of help. I've been at this for quite a while and you can see that I've not even gotten something to work. Thanks. Karl <[EMAIL PROTECTED]> Free Software: "You don't pay back, you pay forward." -- Robert A. Heinlein copyview_v1.patch.gz Description: GNU Zip compressed data ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] Idea that might inspire more patch reviewing.
Bruce Momjian wrote: > In talking to people who are assigned to review patches or could review > patches, I often get the reply, "Oh, yea, I need to do that". Would it inspire more people to learn enough to become patch reviewers if patch authors scheduled walkthroughs of their patches with question and answer sessions over IRC or maybe even some voice conferencing system like skype? While it might not be of immediate value, I imagine a number of inspiring-to-be-hackers might find such walkthroughs enlightening, and if actual qualified reviewers participate in the Q&A during those walkthroughs seeing the kinds of questions raised would be quite educational as well. I don't know if this would help - I guess it needs 3 things: patch authors willing to do such walkthroughs, qualified people willing to participate in them, and enough wannabe hackers wanting to listen in. Do people think this would help- or is it just a clunkier way of doing what's already done via email on the patches list? ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Async commands (like drop index)
On Fri, May 18, 2007 at 01:39:56PM -0400, Neil Conway wrote: > On Fri, 2007-18-05 at 13:29 -0400, Alvaro Herrera wrote: > > I think what Joshua really wants is an equivalent of this > > That's not what his original email asked for, at any rate. > > > start: > > BEGIN; > > LOCK TABLE foo IN ACCESS EXCLUSIVE MODE NOWAIT; > > -- if fail, rollback and go to start > > DROP INDEX foo_idx; > > COMMIT; > > > > The idea is that the lock is only acquired if immediately available, > > thus not blocking other queries which would otherwise be blocked behind > > the DROP INDEX. > > ISTM this can easily be implemented with statement_timeout (which is > more general to boot). Well, with statement_timeout, and writing a function... This functionality would actually be useful, but I'm not sure if it's worth including in core. It would be really nice to have an example of how to do this in a PostgreSQL Cookbook somewhere though. -- Jim Nasby [EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Async commands (like drop index)
On Fri, 2007-18-05 at 13:29 -0400, Alvaro Herrera wrote: > I think what Joshua really wants is an equivalent of this That's not what his original email asked for, at any rate. > start: > BEGIN; > LOCK TABLE foo IN ACCESS EXCLUSIVE MODE NOWAIT; > -- if fail, rollback and go to start > DROP INDEX foo_idx; > COMMIT; > > The idea is that the lock is only acquired if immediately available, > thus not blocking other queries which would otherwise be blocked behind > the DROP INDEX. ISTM this can easily be implemented with statement_timeout (which is more general to boot). -Neil ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Async commands (like drop index)
Neil Conway wrote: > On Fri, 2007-18-05 at 11:47 -0500, Jim C. Nasby wrote: > > Assuming the concurrent psql stuff gets in, do you still see a use for > > this? > > I think concurrent psql (and/or async libpq) is the right way to handle > this sort of requirement. "DROP INDEX NOWAIT" is hacky, and would be > difficult (impossible?) to implement in a reasonable manner: the backend > is fundamentally single-threaded. Also, how does the client learn when > the DROP INDEX actually finishes? The client would either need to poll > the database, or we'd need to implement something like select() -- > neither is a very appealing alternative. I think what Joshua really wants is an equivalent of this: start: BEGIN; LOCK TABLE foo IN ACCESS EXCLUSIVE MODE NOWAIT; -- if fail, rollback and go to start DROP INDEX foo_idx; COMMIT; The idea is that the lock is only acquired if immediately available, thus not blocking other queries which would otherwise be blocked behind the DROP INDEX. -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Async commands (like drop index)
On Fri, 2007-18-05 at 11:47 -0500, Jim C. Nasby wrote: > Assuming the concurrent psql stuff gets in, do you still see a use for > this? I think concurrent psql (and/or async libpq) is the right way to handle this sort of requirement. "DROP INDEX NOWAIT" is hacky, and would be difficult (impossible?) to implement in a reasonable manner: the backend is fundamentally single-threaded. Also, how does the client learn when the DROP INDEX actually finishes? The client would either need to poll the database, or we'd need to implement something like select() -- neither is a very appealing alternative. -1 from me: this functionality belongs on the client-side, where asynchronous operations are much easier to manage. -Neil ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Async commands (like drop index)
Tom Lane wrote: "Joshua D. Drake" <[EMAIL PROTECTED]> writes: Yes. As cool as concurrent psql is... the majority of our users don't use it. They use PgAdminIII. So? IIRC pgAdmin can open up multiple connections already. This should be client agnostic imo. Just to be perfectly clear: the odds of making a single backend support concurrent operations in the foreseeable future are indistinguishable from zero. So if you want a behavior like this, it's going to have to happen by making the client open up multiple connections. I see no reason at all to tie such a feature to index-dropping in particular. I was just using the index dropping as something particularly useful. It could be anything. Also note that I really wasn't trying to detract from what's important right now. I just wanted to get this on the list for later discussion. Joshua D. Drake regards, tom lane -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/ ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Async commands (like drop index)
"Joshua D. Drake" <[EMAIL PROTECTED]> writes: > Yes. As cool as concurrent psql is... the majority of our users don't > use it. They use PgAdminIII. So? IIRC pgAdmin can open up multiple connections already. > This should be client agnostic imo. Just to be perfectly clear: the odds of making a single backend support concurrent operations in the foreseeable future are indistinguishable from zero. So if you want a behavior like this, it's going to have to happen by making the client open up multiple connections. I see no reason at all to tie such a feature to index-dropping in particular. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Async commands (like drop index)
Jim C. Nasby wrote: On Thu, May 17, 2007 at 12:30:45PM -0700, Joshua D. Drake wrote: It seems that it may be useful to allow something like: DROP INDEX NOWAIT. The idea being, that the terminal will come back, the index will be dropped in the background. If it doesn't drop, it rollback like normal and logs. I bring this up now, as an idea. We can argue about it later... :) Assuming the concurrent psql stuff gets in, do you still see a use for this? Yes. As cool as concurrent psql is... the majority of our users don't use it. They use PgAdminIII. This should be client agnostic imo. Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/ ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Async commands (like drop index)
On Thu, May 17, 2007 at 12:30:45PM -0700, Joshua D. Drake wrote: > It seems that it may be useful to allow something like: > > DROP INDEX NOWAIT. > > The idea being, that the terminal will come back, the index will be > dropped in the background. If it doesn't drop, it rollback like normal > and logs. > > I bring this up now, as an idea. We can argue about it later... :) Assuming the concurrent psql stuff gets in, do you still see a use for this? -- Jim Nasby [EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Signing off of patches (was Re: Not ready for 8.3)
On Fri, May 18, 2007 at 12:05:55PM -0400, Alvaro Herrera wrote: > there are no obvious, glaring mistakes could go a long way. (I have > this weird idea that I should not apply a patch unless someone else says > "hey, looks OK to me". Somehow, the mere lack of objections does not > increase my confidence.) I have nothing to contribute on the suggestion, since I can neither offer review nor patches. But I can offer an analogy that will maybe strengthen your point, and might offer a hint of how to make the developer community bigger. In the IETF working groups I follow, most of the chairs have decided to impose some baseline level of group review for protocol documents. In dnsop, for instance, we have a rule that if at least five people do not review an Internet Draft and agree to its publication, it just won't get advanced as a working group document. The idea is that, if we can't get that small number of reviews, then either the working group either isn't interested in the feature or topic, or the draft is a bad idea as it stands. As a result, if you want to have the suasion to get people to review your own submissions, you also have to do the work of reviewing others'. But it also means that if you're new to an area, you can become better in that area by doing document review. Probably, your own reviews won't uncover big flaws that those more experienced with the protocol will find; but you'll be able to make some small contributions that will allow you help in getting the documents finished. Also, while you're at it, you'll be forced to read all the referenced documents, which help you learn about the protocol and therefore make you more valuable to the WG. Perhaps, then, new contributors to Postgres could also take on the task of reviewing some of the patches, not as a matter of being the _only_ reviewer -- the new code still needs review by those more experienced with the rest of the code -- but as a first-pass review that will help in a "more eyeballs" sort of way. This would also have the happy paedogogical effect that those newer reviewers would learn more of the code in each cycle. I think this is similar to a previous suggestion someone made about "mentored review", but it doesn't require formal mentoring for it to get started. A -- Andrew Sullivan | [EMAIL PROTECTED] Users never remark, "Wow, this software may be buggy and hard to use, but at least there is a lot of code underneath." --Damien Katz ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Lack of urgency in 8.3 reviewing
On Fri, May 18, 2007 at 03:21:00PM +0200, Zdenek Kotala wrote: > What's about > > - full page writes improvement > > but I will have time after PGCon. Added you to the list for that at http://developer.postgresql.org/index.php/Todo:PatchStatus -- Jim Nasby [EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Lack of urgency in 8.3 reviewing
On Fri, May 18, 2007 at 12:33:11AM +0800, Cui Shijun wrote: > Thank you for your suggestions, I am thinking about "Full page writes > improvement". It seems not so complicated, just fit for a novice like > me. > I'll work on it. :-) Updated on http://developer.postgresql.org/index.php/Todo:PatchStatus -- Jim Nasby [EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Not ready for 8.3
On Thu, May 17, 2007 at 11:02:31PM -0400, Greg Smith wrote: > On Thu, 17 May 2007, David Fetter wrote: > > >Would you be interested in providing this meat? You're uniquely > >qualified because your shins still smart from all the things you > >barked them on :) > > Unfortunately I'm temporarily on the other side of this problem; all the > time I have to spare right now is going to into reviewing other people's > patches. > > Even more unfortunately, I'm not 100% happy with what I'm doing, and > certainly wouldn't want to present the shoddy techniques I'm using as > recommended. If anyone else has useful info on this subject (keeping a > local repository in sync with HEAD via rsync while working on branches) > they'd like to pass along, please drop me a message with whatever level of > documentation you might have--even rough notes would be a help. Once I'm > finished with the reviews I'm trying to contribute, I will sort through > any suggestions I get and turn that into formal documentation. This has > left enough shin damage that I'd enjoy completely slaying the cause. Rather than trying to get something "FAQ-perfect" right now, can you at least brain-dump what your current process is so that others can learn? I've created a wiki page for this and added Heikki's tips: http://developer.postgresql.org/index.php/Working_with_CVS -- Jim Nasby [EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[HACKERS] Signing off of patches (was Re: Not ready for 8.3)
Alvaro Herrera wrote: > Sorry, I wasn't trying to say that we should follow the Linux model all > that closely. I know there are regressions in the "point zero" > releases, and that there are bugs. This morning a friend IM'ed me a comment about Martin Michlmayr's PhD thesis, which is about release process on open source projects. It is an interesting read: http://www.cyrius.com/journal/phd?reverse=yes Now, I don't think we can apply any conclusions directly from his study. But it got me thinking anyway; what can we learn from them? I don't think time-based release is appropriate for us, because we already have the time constraints, but it is not good because we allow some patches to "starve" on patch queues, or we let the schedule slip. Having the Linux process still in memory, I started thinking that maybe what we need, is a sign-off process, whereby developer A reviews other developers' patches, make comments, and when the commented-on developer (call him B) has fixed the issues that A had, then A signs off B patch. In return for the favor, B also reviews and signs off A patches. Eventually this leads to more cooperation between otherwise independent developers. The point of all this is to have an incentive for developers to review patches. And what better incentive than having a chance that other developers will retribute by reviewing his own patches? This doesn't need to be anything too formal; I think that just having people feel that other people at least took a look at the thing and there are no obvious, glaring mistakes could go a long way. (I have this weird idea that I should not apply a patch unless someone else says "hey, looks OK to me". Somehow, the mere lack of objections does not increase my confidence.) Opinions? -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Lack of urgency in 8.3 reviewing
Alvaro Herrera wrote: Alvaro Herrera wrote: Zdenek Kotala wrote: Heikki Linnakangas wrote: Zdenek Kotala wrote: Heikki Linnakangas wrote: - heap page diagnostic functions I would like to take this review (after PGCon). Too late, Bruce applied it already :). Yes ... Never mind You know, the fact that it was applied does not mean that it doesn't need review. If there is a bug on it which your review can find, we would like to know before it is released. The review process is not just so that it can be flown past some evil committee. I forgot to add that it seems that a general feeling here is that as soon as code is committed, it is Tom Lane's problem if there is a bug, because he will track it down and fix it. So if it was committed, we can forget about it because he'll take care. I hope that's not how people think. I try to track down bugs when they're reported when they're in areas of code that I'm familiar with, but Tom still usually beats me to it. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Lack of urgency in 8.3 reviewing
Alvaro Herrera wrote: > Zdenek Kotala wrote: > > Heikki Linnakangas wrote: > > >Zdenek Kotala wrote: > > >>Heikki Linnakangas wrote: > > >> > > >>>- heap page diagnostic functions > > >> > > >>I would like to take this review (after PGCon). > > > > > >Too late, Bruce applied it already :). > > > > Yes ... Never mind > > You know, the fact that it was applied does not mean that it doesn't > need review. If there is a bug on it which your review can find, we > would like to know before it is released. > > The review process is not just so that it can be flown past some evil > committee. I forgot to add that it seems that a general feeling here is that as soon as code is committed, it is Tom Lane's problem if there is a bug, because he will track it down and fix it. So if it was committed, we can forget about it because he'll take care. -- Alvaro Herrera http://www.PlanetPostgreSQL.org/ "Granting software the freedom to evolve guarantees only different results, not better ones." (Zygo Blaxell) ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Lack of urgency in 8.3 reviewing
Zdenek Kotala wrote: > Heikki Linnakangas wrote: > >Zdenek Kotala wrote: > >>Heikki Linnakangas wrote: > >> > >>>- heap page diagnostic functions > >> > >>I would like to take this review (after PGCon). > > > >Too late, Bruce applied it already :). > > Yes ... Never mind You know, the fact that it was applied does not mean that it doesn't need review. If there is a bug on it which your review can find, we would like to know before it is released. The review process is not just so that it can be flown past some evil committee. -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Lack of urgency in 8.3 reviewing
Heikki Linnakangas wrote: Zdenek Kotala wrote: Heikki Linnakangas wrote: - heap page diagnostic functions I would like to take this review (after PGCon). Too late, Bruce applied it already :). Yes ... Never mind What's about - full page writes improvement but I will have time after PGCon. Zdenek ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Reducing stats collection overhead
Tom Lane wrote: > Arjen van der Meijden told me that according to the tweakers.net > benchmark, HEAD is noticeably slower than 8.2.4, and I soon confirmed > here that for small SELECT queries issued as separate transactions, > there's a significant difference. I think much of the difference stems > from the fact that we now have stats_row_level ON by default, and so > every transaction sends a stats message that wasn't there by default > in 8.2. When you're doing a few thousand transactions per second > (not hard for small read-only queries) that adds up. So, did this patch make the performance problem go away? -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] [GSOC] - I ntegrity check algorithm for data files
On Fri, May 18, 2007 at 12:52:20PM +0200, Robert Mach wrote: > Before presenting possible errors, I divided them into physical and > logical errors. Physical errors will refer to errors in the structure of > pages and tuples, whereas logical errors will depict errors that cause > incorrect performance of postgresql, but are correct according the > structure of data files. Have you looked at pgfsck? I think you also need to check much lower level stuff also. Let: - Page headers - Are all the item pointers in order an non-overlapping - Is each item correctly structured, do the XIDs look OK. What about the HINT bits. Can the tuple be decoded properly. One common problem with data corruption is that postgres with see a varlen header which says the daa is 2K (or 2M) whereas the item is only 50bytes long, so that's obviously broken. Only once you're sure about the actual structure can you worry about thing like toast pointer (which are important, but less likely to crash postgres). All this applies to index pages also. Hope this helps, -- Martijn van Oosterhout <[EMAIL PROTECTED]> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to > litigate. signature.asc Description: Digital signature
Re: Working with PostgreSQL source tree (was Re: [HACKERS] Not ready for 8.3)
On 5/18/07, Heikki Linnakangas <[EMAIL PROTECTED]> wrote: Pavan Deolasee wrote: > For primitive version control, I make a diff after any significant changes: >> >> ~/pg_sandbox/pgsql.cluster$ cvs diff -cN > cluster-mvcc-1.patch > > I usually commit each version and tag the tree. That helps me to get > diff between two versions as well. Doesn't that confuse rsync? I use cvsup. But thats not relevant here. Yes, cvsup would delete any local tags/branches and sync the tree. But then I don't sync the tree very often. And when I need to sync my work with CVS HEAD, I create all the required diffs, sync the tree and reapply the patches (and commit and tag) the tree again. Thats some work, but is I am OK with that because as I said we don't need to sync with head that often. Thanks -- Pavan Deolasee EnterpriseDB http://www.enterprisedb.com
[HACKERS] [GSOC] - I ntegrity check algorithm for data files
Hello, my name is Robert Mach and I am happy to by working on GSoC project for Postgresql. The name of the project is Integrity check algorithm for data files in Postgresql. So far, I have put together a list of possible checks for data failure and I would like to hear as much opinions on this list or even better some recommendation on what should be added to this list or removed. Before presenting possible errors, I divided them into physical and logical errors. Physical errors will refer to errors in the structure of pages and tuples, whereas logical errors will depict errors that cause incorrect performance of postgresql, but are correct according the structure of data files. In order to find PHYSICAL errors: - check whether the total size of all TOAST table chunks is the same as the size mentioned in the TOASTed table - in case of variable length representation of data (attlen = -1) compare the real size of stored data with the size of data mentioned within the varlena lenth word - count the number of rows in a table and compare it with the pg_class.reltuples of corresponding record in pg_class. -check the format of data according to the flags (that determine the representation) belonging to them. e.g.: in case of TOASTed values, is the size of pointer datum really 20B?, etc. -check the fields firmly defined by the structure for occurrence of odd data In order to find LOGICAL errors: -check the validity of all items in index (e.g. concurrent update and index scan with constrain could cause inconsistent snapshot of database used for creating the index)/ -After creating of index or running the index scan, check whether all tuples that should be indexed are really indexed (e.g. ) -check the validity and effectiveness of free space map (whether it is not considering valid data as free, whether the size is fitting the needs of database, etc.) -check whether all user-defined functions in the database are visible/usable for the users (maybe also verify privileges..) -check whether the constrains applied on items are fulfilled: - the uniqueness - range of values - correctness of foreign key values -in case of very large databases, check whether the wrap around of transactions IDs occurred. (Transaction ID = 2 to the power of 32) -check the integrity of catalogs I see different ways of delivering this functionality to Postgresql. The best of course would be to become part of Postgresql release either as a PostgreSQL command (like UPDATE) or as an postgresql server application like vacuumdb. Other possibility is to create a freestanding program that would be called with location of datafiles as arguments. Last possibility is to create an administrative console access (single user mode) to the database in which this integrity check could be fired. I hope to get lots of thoughts on this proposal as well as lots on other ideas on what should be checked in odrer to verify the integrity of data in Postgres. Cheers, Robert ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: Working with PostgreSQL source tree (was Re: [HACKERS] Not ready for 8.3)
Pavan Deolasee wrote: For primitive version control, I make a diff after any significant changes: ~/pg_sandbox/pgsql.cluster$ cvs diff -cN > cluster-mvcc-1.patch I usually commit each version and tag the tree. That helps me to get diff between two versions as well. Doesn't that confuse rsync? I use "diff patch-1 patch-2" for diffs between versions, but it's really hard to read. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: Working with PostgreSQL source tree (was Re: [HACKERS] Not ready for 8.3)
On 5/18/07, Heikki Linnakangas <[EMAIL PROTECTED]> wrote: When I start working on a subject, I make a new checkout from the local repository to a designated directory under "pg_sandbox"-directory. For example, when I started working on the mvcc-safe cluster patch: ~/pg_sandbox$ cvs -d /home/hlinnaka/pgcvsrepository cvs co pgsql ~/pg_sandbox$ mv pgsql pgsql.cluster You can use cvs co -d pgsql.cluster to save the above step. ~/pg_sandbox$ cd pgsql.cluster ~/pg_sandbox/pgsql.cluster$ cvs update -dP # Remove empty dirs. There's a checkout-flag for that as well, but I never remember it... ~/pg_sandbox/pgsql.cluster$ ./configure --enable-depend --enable-cassert --enable-debug --prefix=/home/hlinnaka/pgsql.cluster I find configuring with CFLAGS="-g -O0" more useful. For primitive version control, I make a diff after any significant changes: ~/pg_sandbox/pgsql.cluster$ cvs diff -cN > cluster-mvcc-1.patch I usually commit each version and tag the tree. That helps me to get diff between two versions as well. Thanks, Pavan -- Pavan Deolasee EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Lack of urgency in 8.3 reviewing
Zdenek Kotala wrote: Heikki Linnakangas wrote: - heap page diagnostic functions I would like to take this review (after PGCon). Too late, Bruce applied it already :). More eyeballs on it still wouldn't hurt of course. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Lack of urgency in 8.3 reviewing
Heikki Linnakangas wrote: - heap page diagnostic functions I would like to take this review (after PGCon). Zdenek ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Working with PostgreSQL source tree (was Re: [HACKERS] Not ready for 8.3)
Greg Smith wrote: On Thu, 17 May 2007, David Fetter wrote: Would you be interested in providing this meat? You're uniquely qualified because your shins still smart from all the things you barked them on :) Unfortunately I'm temporarily on the other side of this problem; all the time I have to spare right now is going to into reviewing other people's patches. Even more unfortunately, I'm not 100% happy with what I'm doing, and certainly wouldn't want to present the shoddy techniques I'm using as recommended. If anyone else has useful info on this subject (keeping a local repository in sync with HEAD via rsync while working on branches) they'd like to pass along, please drop me a message with whatever level of documentation you might have--even rough notes would be a help. Once I'm finished with the reviews I'm trying to contribute, I will sort through any suggestions I get and turn that into formal documentation. This has left enough shin damage that I'd enjoy completely slaying the cause. Let me describe my method. It's not ideal by any means, but I've been satisfied: First of all, having a local CVS repository is a must. I used to use cvsup until a couple of months ago, but it suddenly stopped working, so I switched to rsync and I haven't looked back. I have a one line shell script to update it: rsync --progress -avzCH --delete anoncvs.postgresql.org::pgsql-cvs /home/hlinnaka/pgcvsrepository When I start working on a subject, I make a new checkout from the local repository to a designated directory under "pg_sandbox"-directory. For example, when I started working on the mvcc-safe cluster patch: ~/pg_sandbox$ cvs -d /home/hlinnaka/pgcvsrepository cvs co pgsql ~/pg_sandbox$ mv pgsql pgsql.cluster ~/pg_sandbox$ cd pgsql.cluster ~/pg_sandbox/pgsql.cluster$ cvs update -dP # Remove empty dirs. There's a checkout-flag for that as well, but I never remember it... ~/pg_sandbox/pgsql.cluster$ ./configure --enable-depend --enable-cassert --enable-debug --prefix=/home/hlinnaka/pgsql.cluster For primitive version control, I make a diff after any significant changes: ~/pg_sandbox/pgsql.cluster$ cvs diff -cN > cluster-mvcc-1.patch The last number is a revision number, increment by one after each diff. That's it! -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] pg_standby question (solved)
On Sun, 2007-05-13 at 18:28 +0200, Pavel Stehule wrote: > I understand better it. Second cluster has to be an clone of first > cluster. -> don't use initdb for second cluster. Is possible add this > notice to pg_standby's README? The README says that you should read the main docs, which do describe this. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] interval / interval -> double operator
On 5/17/07, Tom Lane <[EMAIL PROTECTED]> wrote: "Andrew Hammond" <[EMAIL PROTECTED]> writes: > On 5/17/07, Tom Lane <[EMAIL PROTECTED]> wrote: >> What are the grounds for defining it that way rather than some other >> way? > The only alternative that came to mind when I wrote it was using a numeric > instead of float. No, I'm wondering what's the justification for smashing it to a single number at all, when the inputs are three-field values. Interval divided by float doesn't produce just a float, for example. I think I see what you're getting at here. '1 month' / '1 day' could return a number of reasonable values depending on how many days are in the month (28 to 31) and on how many hours are in a day (generally 24, but can be 23 or 25 for DST adjustments). The definition above simply assumes that EXTRACT(epoch...) does the Right Thing. Hmmm. I'm at a loss for the right way to solve this. It seems very reasonable to want to divide intervals by intervals (how many nanocenturies in a fortnight?), but I'm at a loss for how to do that correctly. I'll read the code from EXTRACT(epoch...) and see what happening there. Andrew
Re: [HACKERS] What is happening on buildfarm member baiji?
On Thu, May 17, 2007 at 07:51:34PM -0400, Tom Lane wrote: > Bruce Momjian <[EMAIL PROTECTED]> writes: > > Are we going to apply this? > > Not in the form submitted so far, but I trust Magnus is working on > fixing it. I am. Most likely won't have time to look at it properly until after pgcon, though. //Magnus ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster