On Tue, Mar 28, 2017 at 10:36 PM, Craig Ringer <cr...@2ndquadrant.com> wrote:
> Personally I have to agree that the learning curve is very steep. Some
> of the docs and presentations help, but there's a LOT to understand.

Some small patches can be kept to a fairly narrow set of areas, and if you can find a similar capability you can crib technique for handling some of the more mysterious areas it might brush up against.

When I started working on my first *big* patch that was bound to touch many areas (around the start of development for 9.1) I counted lines of code and found over a million lines just in .c and .h files. We're now closing in on 1.5 million lines. That's not counting over 376,000 lines of documentation in .sgml files, over 12,000 lines of text in README* files, over 26,000 lines of perl code, over 103,000 lines of .sql code (60% of which is in regression tests), over 38,000 lines of .y code (for flex/bison parsing), about 9,000 lines of various types of code just for generating the configure file, and over 439,000 lines of .po files (for message translations). I'm sure I missed a lot of important stuff there, but it gives some idea of the challenge it is to get your head around it all.

My first advice is to try to identify which areas of the code you will need to touch, and read those over. Several times. Try to infer the API to areas *that* code needs to reference from looking at other code (as similar to what you want to work on as you can find), reading code comments and README files, and asking questions.

Secondly, there is a lot that is considered to be "coding rules" that is, as far as I've been able to tell, only contained inside the heads of veteran PostgreSQL coders, with occasional references in the discussion list archives. Asking questions, proposing approaches before coding, and showing work in progress early and often will help a lot in terms of discovering these issues and allowing you to rearrange things to fit these conventions.
If someone with the "gift of gab" is able to capture these and put them into a readily available form, that would be fantastic.

> * SSI (haven't gone there yet myself)

For anyone wanting to approach this area, there is a fair amount to look at. There is some overlap, but in rough order of "practical" to "theoretical foundation", you might want to look at:

https://www.postgresql.org/docs/current/static/transaction-iso.html

https://wiki.postgresql.org/wiki/SSI

The SQL standard

https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob_plain;f=src/backend/storage/lmgr/README-SSI;hb=refs/heads/master

http://www.vldb.org/pvldb/vol5.html

http://hdl.handle.net/2123/5353

Papers cited in these last two. I have found papers authored by Alan Fekete or Atul Adya particularly enlightening.

If any of the other areas that Craig listed have similar work available, maybe we should start a Wiki page where we list areas of code (starting with the list Craig included) as section headers, and put links to useful reading below each?

--
Kevin Grittner
VMware vCenter Server
https://www.vmware.com/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers