Re: [HACKERS] Need a mentor, and a project.
2009/12/16 decibel deci...@decibel.org
On Dec 11, 2009, at 8:44 PM, Tom Lane wrote:
Bruce Momjian br...@momjian.us writes:
Ashish wrote:

I am thinking about starting with the following TODO item:
-- Have EXPLAIN ANALYZE issue NOTICE messages when the estimated and actual row counts differ by a specified percentage.

I even have a sample patch you can use as a start, attached.

IMO the hard part of the TODO item is to design a useful user interface for highlighting specific EXPLAIN entries (and NOTICE messages probably ain't it either). Getting the numbers is trivial.

What about prefixing explain output with line numbers? NOTICEs (or whatever mechanism) could then reference the line numbers.

+1

--
Lets call it Postgres

EnterpriseDB http://www.enterprisedb.com
gurjeet[.sin...@enterprisedb.com
singh.gurj...@{ gmail | hotmail | indiatimes | yahoo }.com
Twitter: singh_gurjeet
Skype: singh_gurjeet
Mail sent from my BlackLaptop device
Re: [HACKERS] Need a mentor, and a project.
On Dec 11, 2009, at 8:44 PM, Tom Lane wrote:
Bruce Momjian br...@momjian.us writes:
Ashish wrote:

I am thinking about starting with the following TODO item:
-- Have EXPLAIN ANALYZE issue NOTICE messages when the estimated and actual row counts differ by a specified percentage.

I even have a sample patch you can use as a start, attached.

IMO the hard part of the TODO item is to design a useful user interface for highlighting specific EXPLAIN entries (and NOTICE messages probably ain't it either). Getting the numbers is trivial.

What about prefixing explain output with line numbers? NOTICEs (or whatever mechanism) could then reference the line numbers.

Unfortunately, I think you'll be very hard-pressed to come up with a way to denote problems on the lines themselves, since horizontal space is already very hard to come by in complex plans.

--
Jim C. Nasby, Database Architect j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
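The line-number idea in the message above can be sketched roughly as follows. This is only an illustration under stated assumptions: `PlanLine` and `print_numbered_plan` are hypothetical names, not PostgreSQL structures or functions, and the "off by at least this factor" test is one possible rule, not something the thread settled on. The plan lines keep their full width apart from a short numeric prefix, and the messages that reference them are printed after the plan instead of on the lines themselves.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical flattened plan line; not a PostgreSQL structure. */
typedef struct PlanLine
{
    const char *text;       /* the EXPLAIN line as already rendered */
    double      est_rows;   /* planner's row estimate */
    double      act_rows;   /* rows actually returned per loop */
} PlanLine;

/*
 * Print each plan line with a line-number prefix, then one NOTICE-style
 * message per line whose estimate is off by at least "factor" in either
 * direction.  Returns the number of lines flagged.
 *
 * Assumption: row counts below one are treated as one, so that a zero
 * never causes a division by zero.
 */
int
print_numbered_plan(FILE *out, const PlanLine *lines, int nlines,
                    double factor)
{
    int         flagged = 0;
    int         i;

    assert(factor >= 1.0);

    /* First pass: the plan itself, with line numbers. */
    for (i = 0; i < nlines; i++)
        fprintf(out, "%3d: %s\n", i + 1, lines[i].text);

    /* Second pass: messages that refer back to the line numbers. */
    for (i = 0; i < nlines; i++)
    {
        double      est = lines[i].est_rows < 1.0 ? 1.0 : lines[i].est_rows;
        double      act = lines[i].act_rows < 1.0 ? 1.0 : lines[i].act_rows;
        double      ratio = (est > act) ? est / act : act / est;

        if (ratio >= factor)
        {
            fprintf(out, "NOTICE:  line %d: estimated %.0f rows, actual %.0f\n",
                    i + 1, lines[i].est_rows, lines[i].act_rows);
            flagged++;
        }
    }

    return flagged;
}
```

With the two nodes from the sample plan in the patch (1 estimated vs. 7 actual, 1311 estimated vs. 99 actual), a factor of 10 would flag only the index scan, while a factor of 5 would flag both.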
Re: [HACKERS] Need a mentor, and a project.
I am thinking about starting with the following TODO item:
-- Have EXPLAIN ANALYZE issue NOTICE messages when the estimated and actual row counts differ by a specified percentage.

I picked this because it is somewhat related to query processing which is what I am most interested in. It also seems like a good start up project for a newbie like me.

Before I start looking into what this would involve and start a conversation on designing a solution - I wanted to know what you guys think about this particular TODO, and its suitability to a newbie. Looking forward to your comments...

Thanks
Ashish

On Mon, 7 Dec 2009, Josh Berkus wrote:
On 12/7/09 4:41 PM, Ashish wrote:

Albe Joshua, thanks for the advice. I am in the process of deciding what to work on and am looking at the TODO list. I definitely do not intend to work in a vacuum :-) I am really excited about this and look forward to being challenged and learning a lot.

When you decide what you want to work on, let us know and we'll try to find you an appropriate mentor.

--Josh Berkus
Re: [HACKERS] Need a mentor, and a project.
Ashish wrote:

I am thinking about starting with the following TODO item:
-- Have EXPLAIN ANALYZE issue NOTICE messages when the estimated and actual row counts differ by a specified percentage.

I picked this because it is somewhat related to query processing which is what I am most interested in. It also seems like a good start up project for a newbie like me.

Before I start looking into what this would involve and start a conversation on designing a solution - I wanted to know what you guys think about this particular TODO, and its suitability to a newbie. Looking forward to your comments...

I even have a sample patch you can use as a start, attached.

--
Bruce Momjian br...@momjian.us http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

Index: doc/src/sgml/ref/explain.sgml
===
RCS file: /cvsroot/pgsql/doc/src/sgml/ref/explain.sgml,v
retrieving revision 1.38
diff -c -c -r1.38 explain.sgml
*** doc/src/sgml/ref/explain.sgml 18 Sep 2006 19:54:01 - 1.38
--- doc/src/sgml/ref/explain.sgml 22 Dec 2006 17:09:05 -
***************
*** 64,72 ****
  <para>
  The <literal>ANALYZE</literal> option causes the statement to be actually executed, not only
  planned. The total elapsed time expended within each plan node (in
! milliseconds) and total number of rows it actually returned are added to
! the display. This is useful for seeing whether the planner's estimates
! are close to reality.
  </para>

  <important>
--- 64,72 ----
  <para>
  The <literal>ANALYZE</literal> option causes the statement to be actually executed, not only
  planned. The total elapsed time expended within each plan node (in
! milliseconds) and total number of rows it actually returned and variance are added to
! the display. A sign of the variance indicates whether the estimate was too high or too low.
! This is useful for seeing how close the planner's estimates are to reality.
  </para>

  <important>
***************
*** 222,229 ****
  QUERY PLAN
  -
! HashAggregate (cost=39.53..39.53 rows=1 width=8) (actual time=0.661..0.672 rows=7 loops=1)
!   -&gt; Index Scan using test_pkey on test (cost=0.00..32.97 rows=1311 width=8) (actual time=0.050..0.395 rows=99 loops=1)
        Index Cond: ((id &gt; $1) AND (id &lt; $2))
  Total runtime: 0.851 ms
  (4 rows)
--- 222,229 ----
  QUERY PLAN
  -
! HashAggregate (cost=39.53..39.53 rows=1 width=8) (actual time=0.661..0.672 rows=7 var=-6.00 loops=1)
!   -&gt; Index Scan using test_pkey on test (cost=0.00..32.97 rows=1311 width=8) (actual time=0.050..0.395 rows=99 var=+12.24 loops=1)
        Index Cond: ((id &gt; $1) AND (id &lt; $2))
  Total runtime: 0.851 ms
  (4 rows)

Index: src/backend/commands/explain.c
===
RCS file: /cvsroot/pgsql/src/backend/commands/explain.c,v
retrieving revision 1.152
diff -c -c -r1.152 explain.c
*** src/backend/commands/explain.c 4 Oct 2006 00:29:51 - 1.152
--- src/backend/commands/explain.c 22 Dec 2006 17:09:09 -
***************
*** 57,62 ****
--- 57,63 ----
  static void show_sort_keys(Plan *sortplan, int nkeys, AttrNumber *keycols,
                 const char *qlabel,
                 StringInfo str, int indent, ExplainState *es);
+ static double ExplainVariance(double estimate, double actual);

  /*
   * ExplainQuery -
***************
*** 704,713 ****
      {
          double nloops = planstate->instrument->nloops;

!         appendStringInfo(str, " (actual time=%.3f..%.3f rows=%.0f loops=%.0f)",
                           1000.0 * planstate->instrument->startup / nloops,
                           1000.0 * planstate->instrument->total / nloops,
                           planstate->instrument->ntuples / nloops,
                           planstate->instrument->nloops);
      }
      else if (es->printAnalyze)
--- 705,716 ----
      {
          double nloops = planstate->instrument->nloops;

!         appendStringInfo(str, " (actual time=%.3f..%.3f rows=%.0f var=%+.2f loops=%.0f)",
                           1000.0 * planstate->instrument->startup / nloops,
                           1000.0 * planstate->instrument->total / nloops,
                           planstate->instrument->ntuples / nloops,
+                          ExplainVariance(plan->plan_rows,
+                                          planstate->instrument->ntuples / nloops),
                           planstate->instrument->nloops);
      }
      else if (es->printAnalyze)
***************
*** 1205,1207 ****
--- 1208,1225 ----
      appendStringInfo(str, "\n");
  }
+
+
+ static double ExplainVariance(double estimate, double
Re: [HACKERS] Need a mentor, and a project.
On Fri, Dec 11, 2009 at 9:00 PM, Ashish abin...@u.washington.edu wrote:

I am thinking about starting with the following TODO item:
-- Have EXPLAIN ANALYZE issue NOTICE messages when the estimated and actual row counts differ by a specified percentage.

I picked this because it is somewhat related to query processing which is what I am most interested in. It also seems like a good start up project for a newbie like me.

Before I start looking into what this would involve and start a conversation on designing a solution - I wanted to know what you guys think about this particular TODO, and its suitability to a newbie. Looking forward to your comments...

If we're going to do this, I think we should implement this as an optional behavior controlled by a new EXPLAIN option (maybe VARIANCE, following Bruce's patch?) and generate the output using ExplainProperty<some-data-type>. We could possibly make the option take an optional threshold indicating how much variance is required before the variance gets displayed, and display the variance for every node if VARIANCE is specified without an argument.

...Robert
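The optional-threshold behavior described above could look something like the sketch below. Everything named here is hypothetical: `VarianceOption` and `variance_should_print` are illustrative only and do not exist in the PostgreSQL source, and comparing the absolute value of the variance against the threshold is an assumption about what "how much variance is required" would mean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical option state for a VARIANCE option; the names are
 * illustrative and do not come from the PostgreSQL source.
 */
typedef struct VarianceOption
{
    bool        enabled;        /* VARIANCE was specified */
    bool        has_threshold;  /* VARIANCE was given an argument */
    double      threshold;      /* minimum magnitude worth displaying */
} VarianceOption;

/*
 * Decide whether a node's variance should be displayed.
 *
 * Without the option nothing is shown; with the option but no argument
 * every node is shown; with an argument only nodes whose variance is at
 * least that large, in either direction, are shown.
 */
bool
variance_should_print(const VarianceOption *opt, double variance)
{
    double      magnitude;

    assert(opt != NULL);

    if (!opt->enabled)
        return false;
    if (!opt->has_threshold)
        return true;

    magnitude = (variance < 0.0) ? -variance : variance;
    return magnitude >= opt->threshold;
}
```

Using the sign-independent magnitude means a threshold of 10 would show a node reported as var=+12.24 or var=-12.24 but hide one reported as var=-6.00.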
Re: [HACKERS] Need a mentor, and a project.
On Fri, Dec 11, 2009 at 9:05 PM, Bruce Momjian br...@momjian.us wrote:
Ashish wrote:

I am thinking about starting with the following TODO item:
-- Have EXPLAIN ANALYZE issue NOTICE messages when the estimated and actual row counts differ by a specified percentage.

I picked this because it is somewhat related to query processing which is what I am most interested in. It also seems like a good start up project for a newbie like me.

Before I start looking into what this would involve and start a conversation on designing a solution - I wanted to know what you guys think about this particular TODO, and its suitability to a newbie. Looking forward to your comments...

I even have a sample patch you can use as a start, attached.

Interesting. The logic in ExplainVariance() doesn't look right to me - the cases where one argument is zero seem like they will produce a differently-scaled result than otherwise.

...Robert
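The concern about the zero cases can be illustrated with a small sketch. The body of ExplainVariance() is cut off in this archive, so the function below is not the patch's code; `row_estimate_error` and its clamping rule are assumptions made for illustration. The sample output in the patch (rows=1 against actual rows=7 shown as var=-6.00, rows=1311 against actual rows=99 shown as var=+12.24) is consistent with computing larger/smaller - 1 and signing the result by direction. The sketch follows that scale and treats counts below one as one, so a zero on either side stays on the same scale as every other case.

```c
#include <assert.h>

/*
 * Hypothetical helper, not the ExplainVariance() from the patch.
 *
 * Returns a signed misestimation factor: positive when the estimate was
 * too high, negative when it was too low, 0.0 when both agree.
 *
 * Assumption: row counts below one are clamped to one, so the zero
 * cases need no special branch and cannot divide by zero.
 */
double
row_estimate_error(double estimate, double actual)
{
    double      est;
    double      act;

    assert(estimate >= 0.0 && actual >= 0.0);

    est = (estimate < 1.0) ? 1.0 : estimate;
    act = (actual < 1.0) ? 1.0 : actual;

    if (est >= act)
        return est / act - 1.0;     /* overestimate, e.g. 1311 vs. 99 */

    return -(act / est - 1.0);      /* underestimate, e.g. 1 vs. 7 */
}
```

Under this rule an estimate of 0 rows against 5 actual rows gives -4.00 and an estimate of 10 rows against 0 actual rows gives +9.00, the same values an estimate or actual count of 1 would give.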
Re: [HACKERS] Need a mentor, and a project.
Robert Haas wrote:
On Fri, Dec 11, 2009 at 9:05 PM, Bruce Momjian br...@momjian.us wrote:
Ashish wrote:

I am thinking about starting with the following TODO item:
-- Have EXPLAIN ANALYZE issue NOTICE messages when the estimated and actual row counts differ by a specified percentage.

I picked this because it is somewhat related to query processing which is what I am most interested in. It also seems like a good start up project for a newbie like me.

Before I start looking into what this would involve and start a conversation on designing a solution - I wanted to know what you guys think about this particular TODO, and its suitability to a newbie. Looking forward to your comments...

I even have a sample patch you can use as a start, attached.

Interesting. The logic in ExplainVariance() doesn't look right to me - the cases where one argument is zero seem like they will produce a differently-scaled result than otherwise.

Yea, it is just a starting point for him.

--
Bruce Momjian br...@momjian.us http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] Need a mentor, and a project.
Bruce Momjian br...@momjian.us writes:
Ashish wrote:

I am thinking about starting with the following TODO item:
-- Have EXPLAIN ANALYZE issue NOTICE messages when the estimated and actual row counts differ by a specified percentage.

I even have a sample patch you can use as a start, attached.

Of course, the reason that patch isn't already in there is that it's pretty much useless. It clutters what's already cluttered output and doesn't do much of anything to help draw one's attention to the larger estimation errors, which of course is what the TODO item is really about.

IMO the hard part of the TODO item is to design a useful user interface for highlighting specific EXPLAIN entries (and NOTICE messages probably ain't it either). Getting the numbers is trivial.

I'm not sure there is any really nice solution within the confines of plain ASCII text output. There was an interesting approach online at http://explain-analyze.info, but that site seems to be down now :-(

regards, tom lane
Re: [HACKERS] Need a mentor, and a project.
Tom Lane wrote:

I'm not sure there is any really nice solution within the confines of plain ASCII text output. There was an interesting approach online at http://explain-analyze.info, but that site seems to be down now :-(

Estimation error is one of the ideas. The other ones I have in mind are: (i) accumulative time or percentage per node and (ii) coloring nodes whose estimates are off (if the terminal supports colors). Of course, those features should be enabled using some explain options like ACCUMULATIVE and COLOR.

Another explain tool that has a similar approach is http://explain.depesz.com/ .

--
Euler Taveira de Oliveira
http://www.timbira.com/
Re: [HACKERS] Need a mentor, and a project.
On Mon, 2009-12-07 at 09:53 +0100, Albe Laurenz wrote:

I would start with the TODO list: http://wiki.postgresql.org/wiki/Todo
These are things for which there is a consensus that it would be a good idea to implement them.

The Todo list is not a list of things for which such a consensus exists. The Todo list is in general a list of things that someone thought should be considered at some point. But unless the item is linked to a mailing list thread that already shows a consensus about the feature, you need to start with a discussion about a plan.

So don't submit a project plan to your university or boss based on "I will work on item X because it's on the Todo list" without taking ample time to discuss things here first.
Re: [HACKERS] Need a mentor, and a project.
Peter Eisentraut wrote:

But unless the item is linked to a mailing list thread that already shows a consensus about the feature, you need to start with a discussion about a plan.

And realistically, even if the item is so linked, someone new to the project still shouldn't just plow away on it without asking for confirmation first anyway. There are many things on the TODO list that everyone would like to see fixed, where the problem is well defined and unambiguous, but the way the solution needs to be structured is much harder than is obvious. As the simplest example, we regularly have people show up with patches where the solution was "just add threading to the back-end here...", which might seem completely reasonable to someone new, but it will never get committed.

--
Greg Smith 2ndQuadrant Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com www.2ndQuadrant.com
Re: [HACKERS] Need a mentor, and a project.
abindra wrote:

Next quarter I am planning to do an Independent Study course where the main objective would be to allow me to get familiar with the internals of Postgres by working on a project(s). I would like to work on something that could possibly be accepted as a patch. This is (I think) somewhat similar to what students do during google summer and I was hoping to get some help here in terms of:
1. A good project to work on for a newbie.
2. Would someone be willing to be a mentor? It would be nice to be able to get some guidance on a one-to-one basis.

I would start with the TODO list: http://wiki.postgresql.org/wiki/Todo
These are things for which there is a consensus that it would be a good idea to implement them. Pick things that look interesting to you, and try to read the discussions in the archives that lead to the TODO items.

Bring the topic up in the hackers list, say that you would like to work on this or that TODO item, present your ideas of how you want to do it. Ask about things where you feel insecure.

If you get some support, proceed to write a patch. Ask for directions, post half-baked patches and ask for comments. That is because you will probably receive a good amount of criticism and maybe rejection, and if you invest a couple of months into working on something that nobody knows about *and* your work gets rejected, that is much worse than drawing fire right away.

It's probably not easy to find a mentor (unless you have money to give away), but you may find people who are interested in what you are doing and who will help you.

Yours,
Laurenz Albe
Re: [HACKERS] Need a mentor, and a project.
On Mon, Dec 07, 2009 at 09:53:32AM +0100, Albe Laurenz wrote:
abindra wrote:

Next quarter I am planning to do an Independent Study course where the main objective would be to allow me to get familiar with the internals of Postgres by working on a project(s). I would like to work on something that could possibly be accepted as a patch. This is (I think) somewhat similar to what students do during google summer and I was hoping to get some help here in terms of:
1. A good project to work on for a newbie.
2. Would someone be willing to be a mentor? It would be nice to be able to get some guidance on a one-to-one basis.

I would start with the TODO list: http://wiki.postgresql.org/wiki/Todo
These are things for which there is a consensus that it would be a good idea to implement them. Pick things that look interesting to you, and try to read the discussions in the archives that lead to the TODO items.

I agree the TODO list is a good place to start. Other good sources include the -hackers list and comments in the code. I was surprised when I began taking an interest in PostgreSQL how rarely interesting projects mentioned on -hackers made it into the TODO list; I've come to realize that the TODO contains, in general, very non-controversial items everyone is pretty sure we could use, whereas -hackers ranges freely over other topics which are still very interesting but often more controversial or less obviously necessary. Committed patches both large and small address TODO list items fairly rarely, so don't get too hung up on finding something from the TODO list alone.

Bring the topic up in the hackers list, say that you would like to work on this or that TODO item, present your ideas of how you want to do it. Ask about things where you feel insecure.

If you get some support, proceed to write a patch. Ask for directions, post half-baked patches and ask for comments. That is because you will probably receive a good amount of criticism and maybe rejection, and if you invest a couple of months into working on something that nobody knows about *and* your work gets rejected, that is much worse than drawing fire right away.

+1. Especially when developing a complex patch, and especially when you're new to the community, you need to avoid working in a vacuum, for social as well as technical reasons. The more complex a patch, the more consensus you'll eventually need to achieve before getting it committed, in general, and it helps to gain that consensus early on, rather than after you've written a lot of code. The keyword "proposal" might be a useful search term when digging in the -hackers archives for historical examples.

--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com
Re: [HACKERS] Need a mentor, and a project.
On Sun, Dec 6, 2009 at 9:24 PM, abin...@u.washington.edu wrote:

2. Would someone be willing to be a mentor? It would be nice to be able to get some guidance on a one-to-one basis.

I might be willing to do this, but if you pick a project that is outside my area of knowledge then I might not be able to help as much.

...Robert
Re: [HACKERS] Need a mentor, and a project.
Albe Joshua, thanks for the advice. I am in the process of deciding what to work on and am looking at the TODO list. I definitely do not intend to work in a vacuum :-) I am really excited about this and look forward to being challenged and learning a lot.

Regards
Ashish
Re: [HACKERS] Need a mentor, and a project.
On 12/7/09 4:41 PM, Ashish wrote:

Albe Joshua, thanks for the advice. I am in the process of deciding what to work on and am looking at the TODO list. I definitely do not intend to work in a vacuum :-) I am really excited about this and look forward to being challenged and learning a lot.

When you decide what you want to work on, let us know and we'll try to find you an appropriate mentor.

--Josh Berkus
Re: [HACKERS] Need a mentor, and a project.
Hi Robert,

Thanks. If I may, what encompasses your area of expertise...

BTW Congratulations on becoming a committer!

Regards
Ashish

On Mon, 7 Dec 2009, Robert Haas wrote:
On Sun, Dec 6, 2009 at 9:24 PM, abin...@u.washington.edu wrote:

2. Would someone be willing to be a mentor? It would be nice to be able to get some guidance on a one-to-one basis.

I might be willing to do this, but if you pick a project that is outside my area of knowledge then I might not be able to help as much.

...Robert
Re: [HACKERS] Need a mentor, and a project.
On Mon, Dec 7, 2009 at 8:04 PM, Ashish abin...@u.washington.edu wrote:

Hi Robert, Thanks. If I may, what encompasses your area of expertise... BTW Congratulations on becoming a committer!

Thanks. As others have said, it's probably best to pick a project first, or at least an area. It's more important to find something you're interested in working on than to think about working with some particular person.

...Robert
[HACKERS] Need a mentor, and a project.
Hello there,

I am a graduate student at the University of Washington, Tacoma (http://www.tacoma.washington.edu/tech/) with an interest in databases (especially query processing). I am familiar with database theory, and in an earlier life I was an application developer and did a lot of SQL/database related work. I have been interested in learning about and contributing to Postgres for a while now.

This quarter I was the TA for the undergrad intro to database class. I convinced my Prof. to use PostgreSQL to teach and it has been fun. It has also allowed me to familiarize myself with Postgres from an external user's point of view.

Next quarter I am planning to do an Independent Study course where the main objective would be to allow me to get familiar with the internals of Postgres by working on a project(s). I would like to work on something that could possibly be accepted as a patch. This is (I think) somewhat similar to what students do during google summer and I was hoping to get some help here in terms of:
1. A good project to work on for a newbie.
2. Would someone be willing to be a mentor? It would be nice to be able to get some guidance on a one-to-one basis.

Thanks for your time. If you have any questions or need more information, please do let me know.

Regards
Ashish