Re: [HACKERS] On How To Shorten the Steep Learning Curve Towards PG Hacking...

2017-05-01 Thread Kang Yuzhe
On Sun, Apr 30, 2017 at 10:47 AM, Andrew Borodin  wrote:
> Hi, Kang and everyone in this thread.
>
> I'm planning to present the online course "Hacking PostgreSQL: data
> access methods in action and under the hood" on edX on June 1st. It's
> not announced yet, links will be available later.
> In this course I'm describing information that was crucial for me to
> start hacking. Currently, my knowledge of technologies behind Postgres
> is quite limited, though I know the border of my knowledge quite well.
>
> Chances are that description from my point of view will not be helpful
> in some cases: before starting contributing to Postgres I had already
> held PhD in CS for database technology and I had already implemented 3
> different commercial DBMS (all in different technologies, PLs,
> paradigms, focuses, different problems being solved). And still,
> production of minimally viable contribution took 3 months (I was
> hacking for an hour a day, mostly at evenings).
> That's why I decided that it is worth talking about how to get there
> before I'm already there. It's quite easy to forget that some concepts
> are really hard before you get them.
>
> The course will cover:
> 1. Major differences of Postgres from others
> 2. Dev tools as I use them
> 3. Concept of paged memory, latches and paged data structures
> 4. WAL, recovery, replication
> 5. Concurrency and locking in B-trees
> 6. GiST internals
> 7. Extensions
> 8. Text search and some of GIN
> 9. Postgres community mechanics
> Every topic will consist of two parts: 1 - video lectures on YouTube
> (in English and Russian, BTW my English is far from perfect) with
> references to docs and other resources, 2 - practical tasks where you
> change code slightly and observe differences (this part is mostly to
> help the student to observe easy entry points).
>

Thanks Andrey in advance. I am looking forward to meeting you there on edX.
Regards,
Zeray


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] On How To Shorten the Steep Learning Curve Towards PG Hacking...

2017-04-30 Thread Andrew Borodin
Hi, Kang and everyone in this thread.

I'm planning to present the online course "Hacking PostgreSQL: data
access methods in action and under the hood" on edX on June 1st. It's
not announced yet, links will be available later.
In this course I'm describing information that was crucial for me to
start hacking. Currently, my knowledge of technologies behind Postgres
is quite limited, though I know the border of my knowledge quite well.

Chances are that description from my point of view will not be helpful
in some cases: before starting contributing to Postgres I had already
held PhD in CS for database technology and I had already implemented 3
different commercial DBMS (all in different technologies, PLs,
paradigms, focuses, different problems being solved). And still,
production of minimally viable contribution took 3 months (I was
hacking for an hour a day, mostly at evenings).
That's why I decided that it is worth talking about how to get there
before I'm already there. It's quite easy to forget that some concepts
are really hard before you get them.

The course will cover:
1. Major differences of Postgres from others
2. Dev tools as I use them
3. Concept of paged memory, latches and paged data structures
4. WAL, recovery, replication
5. Concurrency and locking in B-trees
6. GiST internals
7. Extensions
8. Text search and some of GIN
9. Postgres community mechanics
Every topic will consist of two parts: 1 - video lectures on YouTube
(in English and Russian, BTW my English is far from perfect) with
references to docs and other resources, 2 - practical tasks where you
change code slightly and observe differences (this part is mostly to
help the student to observe easy entry points).

Best regards, Andrey Borodin, Octonica.




Re: [HACKERS] On How To Shorten the Steep Learning Curve Towards PG Hacking...

2017-04-28 Thread Craig Ringer
On 28 Apr. 2017 17:04, "Kang Yuzhe"  wrote:

> Hello Simon,
> The journey that caused and is causing me a lot of pain is finding my way
> in PG development.
> Complex Code Reading like PG. Fully understanding the science of DBMS
> Engines: Query Processing, Storage stuff, Transaction Management and so
> on...
>
> Anyway as you said, the rough estimation towards any expertise seems to be
> in accordance with The 10,000 Hour Rule. I will strive based on this rule.

Start with not top-posting on the mailing list ;)

> For now, would you please tell me how to know the exact PG version for which a
> specific patch was developed?
> Given x patch, how do I know the specific PG version it was developed for?


If it was created by git format-patch then the base git revision will be
shown. This may be a commit from postgres public tree that you can find
with 'git branch --contains'.

Otherwise look at the proposed commit message if any, in the patch header.
Or the email it was attached to. If all else fails guess based on the date.
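
As a rough illustration of the header checks described above, here is a minimal Python sketch that pulls the relevant hints out of a patch file. It assumes the patch was produced by git format-patch: the first "From <sha>" line then names the commit the patch was generated from, and a "base-commit:" trailer is present only if the author passed --base. The function name describe_patch is hypothetical, not part of any PostgreSQL tooling.

```python
import re

def describe_patch(patch_text):
    """Collect hints about a patch's origin from git format-patch headers.

    Assumption: the input follows git format-patch conventions. Plain
    `diff` output carries none of these fields and yields all None.
    """
    info = {"commit": None, "date": None, "base_commit": None}
    for line in patch_text.splitlines():
        # "From <40-hex-sha> Mon Sep 17 00:00:00 2001" opens each patch.
        m = re.match(r"^From ([0-9a-f]{40}) ", line)
        if m and info["commit"] is None:
            info["commit"] = m.group(1)
        elif line.startswith("Date: ") and info["date"] is None:
            info["date"] = line[len("Date: "):].strip()
        elif line.startswith("base-commit: "):
            # Only written when the author used `git format-patch --base`.
            info["base_commit"] = line[len("base-commit: "):].strip()
    return info
```

If a commit hash is found and it exists in the public tree, it can be passed to 'git branch --contains' as described above; otherwise the date is the fallback for guessing the target branch.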


Re: [HACKERS] On How To Shorten the Steep Learning Curve Towards PG Hacking...

2017-04-28 Thread Kang Yuzhe
Hello Simon,
The journey that caused and is causing me a lot of pain is finding my way
in PG development.
Complex Code Reading like PG. Fully understanding the science of DBMS
Engines: Query Processing, Storage stuff, Transaction Management and so
on...

Anyway as you said, the rough estimation towards any expertise seems to be
in accordance with The 10,000 Hour Rule. I will strive based on this rule.

For now, would you please tell me how to know the exact PG version for which a
specific patch was developed?
Given x patch, how do I know the specific PG version it was developed for?

Regards,
Zeray



On Mon, Apr 17, 2017 at 7:33 PM, Simon Riggs  wrote:

> On 27 March 2017 at 13:00, Kang Yuzhe  wrote:
>
> > I have found PG source code reading and hacking to be one of the most
> > frustrating experiences in my life.  I believe that PG hacking should not be
> > a painful journey but an enjoyable one!
> >
> > It is my strong belief that out of my PG hacking frustrations, there may
> > come insights for the PG experts on ways to devise hands-on work with PG
> > internals so that newcomers will become great coders as quickly as possible.
>
> I'm here now because PostgreSQL has clear, well designed and
> maintained code with accurate docs, great comments and a helpful team.
>
> I'd love to see detailed cases where another project is better in a
> measurable way; I am willing to learn from that.
>
> Any journey to expertise takes 10,000 hours. There is no way to shorten
> that.
>
> What aspect of your journey caused you pain?
>
> --
> Simon Riggs                http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


Re: [HACKERS] On How To Shorten the Steep Learning Curve Towards PG Hacking...

2017-04-18 Thread Andrew Dunstan


On 04/18/2017 03:54 AM, Craig Ringer wrote:
>> But almost nothing about The Internals of PostgreSQL:
> Not surprising. They'd go out of date fast, be a huge effort to write
> and maintain, and sell poorly given the small audience.
>
> Print books probably aren't the way forward here.
>


Agreed,  a well organized web site would work much better.

cheers

andrew

-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: [HACKERS] On How To Shorten the Steep Learning Curve Towards PG Hacking...

2017-04-18 Thread Andrew Dunstan


On 04/18/2017 03:41 AM, Kang Yuzhe wrote:
>
>
> But almost nothing about The Internals of PostgreSQL:
> 1. The Internals of PostgreSQL:
> http://www.interdb.jp/pg/index.html  translated from Japanese Book
> 2. PostgreSQL数据库内核分析(Chinese) Book on the Internals of PostgreSQL:
> 3. PG Docs/site
> 4. some here and there which are less useful
>



I agree that this is an area where more material would be very welcome,
and not only to newcomers. #1 is useful as far as it goes, but the
missing bits (esp. Query Processing) are critical.



> Lastly, I have come to understand that PG community is not
> harsh/intimidating to newbies and thus, I am feeling at home.
>
>


Glad you have found it so.

cheers

andrew

-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: [HACKERS] On How To Shorten the Steep Learning Curve Towards PG Hacking...

2017-04-18 Thread Kang Yuzhe
Hello Amit,
Thanks again for being patient with me.
YES, I am working with the PostgreSQL source git repository but I don't
think I updated my local forked/cloned branch. I am also working on
standalone PG 9.6.2 source code as well.

I will try to fetch/pull the PG master content to my forked/cloned branch
and apply those current patches.

I will also try to reply to the email threads where I downloaded the
patches so that they can update their patches accordingly.

Thanks,
Zeray



On Tue, Apr 18, 2017 at 11:25 AM, Amit Langote <
langote_amit...@lab.ntt.co.jp> wrote:

> Hi,
>
> On 2017/04/18 16:54, Kang Yuzhe wrote:
> > Thanks Amit for taking your time and pointing to some useful stuff on the
> > Internals of PostgreSQL.
> >
> >
> > One thing I have learned is that PG community is not as hostile/harsh as I
> > imagined to newbies. Rather, it's the reverse.
> > I am feeling at home here.
> >
> > Amit, would you please help out on how to apply some patches in PG source
> > code. For example, there are two patches attached here: one on
> > CORRESPONDING CLAUSE and one on MERGE SQL Standard.
> >
> > There are some errors saying Hunk failed(src/backend/parser/gram.y.rej).
> >
> > postgresql-9.6.2$ patch --dry-run -p1 < corresponding_clause_v12.patch
> > patching file doc/src/sgml/queries.sgml
> > Hunk #1 succeeded at 1603 (offset 2 lines).
> > Hunk #2 succeeded at 1622 (offset 2 lines).
> > Hunk #3 succeeded at 1664 (offset 2 lines).
>
> [ ... ]
>
> > /postgresql-9.6.2$
>
> Firstly, it looks like you're trying to apply the patch to the 9.6 source
> tree (are you working with the PostgreSQL source git repository?).  But,
> since all the new feature patches are created against the master
> development branch of the git repository, the patch most likely won't
> apply cleanly against a source tree from the older branch.
>
> If you're not using the git repository currently, you may have better luck
> trying the development branch snapshot tarballs (see the link below):
>
> https://www.postgresql.org/ftp/snapshot/dev/
>
> Also, it's a good idea to reply on the email thread from where you
> downloaded the patch to ask them to update the patch, so that they can
> send a fresh patch that applies cleanly.
>
> The MERGE patch looks very old (from 2010 probably), so properly applying
> it to the source tree of today is going to be hard.  Actually, it most
> likely won't be in a working condition anymore.  You can try recently
> proposed patches, for example, those in the next commitfest:
>
> https://commitfest.postgresql.org/14/
>
> Patches listed on the above page are more likely to apply cleanly and be
> in working condition.  But of course, you will need to be interested in
> the topics those patches are related to.  There are some new SQL feature
> patches, for example:
>
> https://commitfest.postgresql.org/14/839/
>
> Thanks,
> Amit
>
>


Re: [HACKERS] On How To Shorten the Steep Learning Curve Towards PG Hacking...

2017-04-18 Thread Kang Yuzhe
Thanks Craig for teaching me a lot of things. I am learning a lot about why
PG hacking/development is the way it is.

Regarding interest and enthusiasm, no problem. What is lacking is the
skill set, and I believe that with interest, enthusiasm and your
support, we will expand PG hacking/devs/usage in Africa and other
continents.

People here in Africa are using Oracle/SQL Server/IBM products (generally
commercial products) even where PG is more than enough.

I want to change this scenario and trend, and I hope one day in the future
there will be a PG conference in Africa/Ethiopia, which is my country.

Thanks,
zeray




On Tue, Apr 18, 2017 at 10:54 AM, Craig Ringer 
wrote:

> On 18 April 2017 at 15:41, Kang Yuzhe  wrote:
> > Thanks Simon for taking your time and trying to tell and warn me of the
> > harsh reality: there is no shortcut to expertise. One has to fail and rise
> > on any journey to expertise.
>
> Yeah, just because Pg is hard doesn't mean it's notably bad or worse
> than other things. I generally find working on code in other projects,
> even smaller and simpler ones, a rather unpleasant change.
>
> That doesn't mean we can't do things to help interested new people get
> and stay engaged and grow into productive devs to grow our pool.
>
> > Overall, you are right. But I do believe that there is a way (some
> > techniques) to speed up any journey to expertise. One of them is mentorship.
> > For example (just an example), if you show me how to design and implement an FDW
> > to Hadoop/HBase, I believe that I will manage to design and implement an FDW
> > to Cassandra/MongoDB.
>
> TBH, that's the sort of thing where looking at existing examples is
> often the best way forward and will stay that way.
>
> What I'd like to do is make it easier to understand those examples by
> providing background and overview info on subsystems, so you can read
> the code and have more idea what it does and why.
>
> > But almost nothing about The Internals of PostgreSQL:
>
> Not surprising. They'd go out of date fast, be a huge effort to write
> and maintain, and sell poorly given the small audience.
>
> Print books probably aren't the way forward here.
>
> --
>  Craig Ringer   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
>


Re: [HACKERS] On How To Shorten the Steep Learning Curve Towards PG Hacking...

2017-04-18 Thread Amit Langote
Hi,

On 2017/04/18 16:54, Kang Yuzhe wrote:
> Thanks Amit for taking your time and pointing to some useful stuff on the
> Internals of PostgreSQL.
> 
> 
> One thing I have learned is that PG community is not as hostile/harsh as I
> imagined to newbies. Rather, it's the reverse.
> I am feeling at home here.
> 
> Amit, would you please help out on  how to apply some patches in PG source
> code. For example, there are two patches attached here: one on
> CORRESPONDING CLAUSE and one on MERGE SQL Standard.
> 
> There are some errors saying Hunk failed(src/backend/parser/gram.y.rej).
> 
> postgresql-9.6.2$ patch --dry-run -p1 < corresponding_clause_v12.patch
> patching file doc/src/sgml/queries.sgml
> Hunk #1 succeeded at 1603 (offset 2 lines).
> Hunk #2 succeeded at 1622 (offset 2 lines).
> Hunk #3 succeeded at 1664 (offset 2 lines).

[ ... ]

> /postgresql-9.6.2$

Firstly, it looks like you're trying to apply the patch to the 9.6 source
tree (are you working with the PostgreSQL source git repository?).  But,
since all the new feature patches are created against the master
development branch of the git repository, the patch most likely won't
apply cleanly against a source tree from the older branch.
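
To see at a glance which files a stale patch no longer fits, the dry-run output quoted above can be summarized mechanically. The following is a minimal Python sketch; it assumes GNU patch's usual message wording ("patching file ...", "Hunk #N succeeded at ...", "Hunk #N FAILED at ..."), and the function name summarize_patch_output is hypothetical.

```python
import re

# Assumption: GNU patch reports each hunk as "succeeded" or "FAILED".
HUNK_RE = re.compile(r"^Hunk #(\d+) (succeeded|FAILED) at (\d+)")

def summarize_patch_output(output):
    """Map each patched file to the hunk numbers that succeeded or failed."""
    summary = {}
    current = None
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("patching file "):
            current = line[len("patching file "):]
            summary[current] = {"succeeded": [], "failed": []}
            continue
        m = HUNK_RE.match(line)
        if m and current is not None:
            key = "failed" if m.group(2) == "FAILED" else "succeeded"
            summary[current][key].append(int(m.group(1)))
    return summary
```

Files whose "failed" list is non-empty are the ones for which patch leaves a .rej file, such as the gram.y.rej mentioned above.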

If you're not using the git repository currently, you may have better luck
trying the development branch snapshot tarballs (see the link below):

https://www.postgresql.org/ftp/snapshot/dev/

Also, it's a good idea to reply on the email thread from where you
downloaded the patch to ask them to update the patch, so that they can
send a fresh patch that applies cleanly.

The MERGE patch looks very old (from 2010 probably), so properly applying
it to the source tree of today is going to be hard.  Actually, it most
likely won't be in a working condition anymore.  You can try recently
proposed patches, for example, those in the next commitfest:

https://commitfest.postgresql.org/14/

Patches listed on the above page are more likely to apply cleanly and be
in working condition.  But of course, you will need to be interested in
the topics those patches are related to.  There are some new SQL feature
patches, for example:

https://commitfest.postgresql.org/14/839/

Thanks,
Amit





Re: [HACKERS] On How To Shorten the Steep Learning Curve Towards PG Hacking...

2017-04-18 Thread Craig Ringer
On 18 April 2017 at 15:41, Kang Yuzhe  wrote:
> Thanks Simon for taking your time and trying to tell and warn me of the
> harsh reality: there is no shortcut to expertise. One has to fail and rise
> on any journey to expertise.

Yeah, just because Pg is hard doesn't mean it's notably bad or worse
than other things. I generally find working on code in other projects,
even smaller and simpler ones, a rather unpleasant change.

That doesn't mean we can't do things to help interested new people get
and stay engaged and grow into productive devs to grow our pool.

> Overall, you are right. But I do believe that there is a way (some
> techniques) to speed up any journey to expertise. One of them is mentorship.
> For example (just an example), if you show me how to design and implement an FDW
> to Hadoop/HBase, I believe that I will manage to design and implement an FDW
> to Cassandra/MongoDB.

TBH, that's the sort of thing where looking at existing examples is
often the best way forward and will stay that way.

What I'd like to do is make it easier to understand those examples by
providing background and overview info on subsystems, so you can read
the code and have more idea what it does and why.

> But almost nothing about The Internals of PostgreSQL:

Not surprising. They'd go out of date fast, be a huge effort to write
and maintain, and sell poorly given the small audience.

Print books probably aren't the way forward here.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] On How To Shorten the Steep Learning Curve Towards PG Hacking...

2017-04-18 Thread Craig Ringer
On 18 April 2017 at 01:29, Alvaro Herrera  wrote:
> Craig Ringer wrote:
>
>> Personally I have to agree that the learning curve is very steep. Some
>> of the docs and presentations help, but there's a LOT to understand.
>
> There is a wiki page "Developer_FAQ" which is supposed to help answer
> these questions.  It is currently not very useful, because people
> stopped adding to it very early and is now mostly unmaintained, but
> I'm sure it could become a very useful central resource for this kind of
> information.

I add to it when I think of things.

But it'll become an unmaintainable useless morass if random things are
just indiscriminately added. Something more structured is needed to
cover subsystems, coding rules ("don't LWLockRelease() before
ereport(ERROR, ...)"), etc.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] On How To Shorten the Steep Learning Curve Towards PG Hacking...

2017-04-18 Thread Kang Yuzhe
Thanks Simon for taking your time and trying to tell and warn me of the
harsh reality: there is no shortcut to expertise. One has to fail and rise
on any journey to expertise.
Overall, you are right. But I do believe that there is a way (some
techniques) to speed up any journey to expertise. One of them is
mentorship. For example (just an example), if you show me how to design and
implement an FDW to Hadoop/HBase, I believe that I will manage to design and
implement an FDW to Cassandra/MongoDB.

The paths to expertise by working alone/the hard way and by working with
you as mentors are completely different. I believe that we humans have the
power to imitate and get innovative afterwards.

There are many books on PG business application development:
1. PostgreSQL Essential Reference / Barry Stinson
2. PostgreSQL: Introduction and Concepts / Bruce Momjian
3. PostgreSQL Cookbook: Over 90 hands-on recipes to effectively manage,
administer, and design solutions using PostgreSQL
4. PostgreSQL Developer's Handbook
5. PostgreSQL 9.0 High Performance
6. PostgreSQL Server Programming
7. PostgreSQL for Data Architects: Discover how to design, develop, and
maintain your database application effectively with PostgreSQL
8. Practical PostgreSQL
9. Practical SQL Handbook, The: Using SQL Variants, Fourth Edition
10. PostgreSQL: The comprehensive guide to building, programming, and
administering PostgreSQL databases, Second Edition
11. Beginning Databases with PostgreSQL: From Novice to Professional, Second
Edition
12. PostgreSQL Succinctly
13. PostgreSQL Up and Running
13.PostgreSQL Up and Running


But almost nothing about The Internals of PostgreSQL:
1. The Internals of PostgreSQL:
http://www.interdb.jp/pg/index.html  translated from Japanese Book
2. PostgreSQL数据库内核分析(Chinese) Book on the Internals of PostgreSQL:
3. PG Docs/site
4. some here and there which are less useful

Lastly, I have come to understand that PG community is not
harsh/intimidating to newbies and thus, I am feeling at home.

Regards,
Zeray

On Mon, Apr 17, 2017 at 7:33 PM, Simon Riggs  wrote:

> On 27 March 2017 at 13:00, Kang Yuzhe  wrote:
>
> > I have found PG source code reading and hacking to be one of the most
> > frustrating experiences in my life.  I believe that PG hacking should not be
> > a painful journey but an enjoyable one!
> >
> > It is my strong belief that out of my PG hacking frustrations, there may
> > come insights for the PG experts on ways to devise hands-on work with PG
> > internals so that newcomers will become great coders as quickly as possible.
>
> I'm here now because PostgreSQL has clear, well designed and
> maintained code with accurate docs, great comments and a helpful team.
>
> I'd love to see detailed cases where another project is better in a
> measurable way; I am willing to learn from that.
>
> Any journey to expertise takes 10,000 hours. There is no way to shorten
> that.
>
> What aspect of your journey caused you pain?
>
> --
> Simon Riggs                http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


Re: [HACKERS] On How To Shorten the Steep Learning Curve Towards PG Hacking...

2017-04-18 Thread Amit Langote
On 2017/04/18 15:31, Kang Yuzhe wrote:
> My question is why is it that there is a lot of hands-on material about PG
> application development (e.g. connecting to PG using Java/JDBC) but almost
> nothing about PG hacking hands-on lessons. For example, I wanna add the keyword
> "Encrypted" in CREATE TABLE t1(a int, b int encrypted) or CREATE TABLE t1(a
> int, b int) encrypted. Alas, it's not an easy task.

Regarding this part, at one of the links shared above [1], you can find
presentations with hands-on instructions about how to implement a new SQL
functionality by modifying various parts of the source code.  See these:

Implementing a TABLESAMPLE clause (by Neil Conway)
http://www.neilconway.org/talks/hacking/ottawa/ottawa_slides.pdf

Add support for the WHEN clause to the CREATE TRIGGER statement (by Neil
Conway)
http://www.neilconway.org/talks/hacking/hack_slides.pdf

(by Gavin Sherry)
https://linux.org.au/conf/2007/att_data/Miniconfs(2f)PostgreSQL/attachments/hacking_intro.pdf

Handout: The Implementation of TABLESAMPLE
http://www.neilconway.org/talks/hacking/ottawa/ottawa_handout.pdf

Handout: Adding WHEN clause to CREATE TRIGGER
http://www.neilconway.org/talks/hacking/hack_handout.pdf

Some of the details might be dated, because they were written more than 10
years ago, but will definitely get you motivated to dive more into the
source code.

Thanks,
Amit

[1] http://www.neilconway.org/talks/hacking/





Re: [HACKERS] On How To Shorten the Steep Learning Curve Towards PG Hacking...

2017-04-18 Thread Kang Yuzhe
Thanks Kevin for taking your time and explaining the real difficulty of
finding one's space/way in PG development. And thanks for your genuine advice,
which I have taken as is.
My question is why is it that there is a lot of hands-on material about PG
application development (e.g. connecting to PG using Java/JDBC) but almost
nothing about PG hacking hands-on lessons. For example, I wanna add the keyword
"Encrypted" in "CREATE TABLE t1(a int, b int encrypted)" or "CREATE TABLE
t1(a int, b int) encrypted". Alas, it's not an easy task.

Lastly, I have come to understand that PG community is not harsh to newbies
and thus, I am feeling at home.

Regards,
Zeray

On Mon, Apr 17, 2017 at 6:53 PM, Kevin Grittner  wrote:

> On Tue, Mar 28, 2017 at 10:36 PM, Craig Ringer 
> wrote:
>
> > Personally I have to agree that the learning curve is very steep. Some
> > of the docs and presentations help, but there's a LOT to understand.
>
> Some small patches can be kept to a fairly narrow set of areas, and
> if you can find a similar capability you can crib technique for
> handling some of the more mysterious areas it might brush up
> against.  When I started working on my first *big* patch that was
> bound to touch many areas (around the start of development for 9.1)
> I counted lines of code and found over a million lines just in .c
> and .h files.  We're now closing in on 1.5 million lines.  That's
> not counting over 376,000 lines of documentation in .sgml files,
> over 12,000 lines of text in README* files, over 26,000 lines of
> perl code, over 103,000 lines of .sql code (60% of which is in
> regression tests), over 38,000 lines of .y code (for flex/bison
> parsing), about 9,000 lines of various types of code just for
> generating the configure file, and over 439,000 lines of .po files
> (for message translations).  I'm sure I missed a lot of important
> stuff there, but it gives some idea of the challenge it is to get your
> head around it all.
>
> My first advice is to try to identify which areas of the code you
> will need to touch, and read those over.  Several times.  Try to
> infer the API to areas *that* code needs to reference from looking
> at other code (as similar to what you want to work on as you can
> find), reading code comments and README  files, and asking
> questions.  Secondly, there is a lot that is considered to be
> "coding rules" that is, as far as I've been able to tell, only
> contained inside the heads of veteran PostgreSQL coders, with
> occasional references in the discussion list archives.  Asking
> questions, proposing approaches before coding, and showing work in
> progress early and often will help a lot in terms of discovering
> these issues and allowing you to rearrange things to fit these
> conventions.  If someone with the "gift of gab" is able to capture
> these and put them into a readily available form, that would be
> fantastic.
>
> > * SSI (haven't gone there yet myself)
>
> For anyone wanting to approach this area, there is a fair amount to
> look at.  There is some overlap, but in rough order of "practical"
> to "theoretical foundation", you might want to look at:
>
> https://www.postgresql.org/docs/current/static/transaction-iso.html
>
> https://wiki.postgresql.org/wiki/SSI
>
> The SQL standard
>
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob_plain;f=src/backend/storage/lmgr/README-SSI;hb=refs/heads/master
>
> http://www.vldb.org/pvldb/vol5.html
>
> http://hdl.handle.net/2123/5353
>
> Papers cited in these last two.  I have found papers authored by
> Alan Fekete or Atul Adya particularly enlightening.
>
> If any of the other areas that Craig listed have similar work
> available, maybe we should start a Wiki page where we list areas of
> code (starting with the list Craig included) as section headers, and
> put links to useful reading below each?
>
> --
> Kevin Grittner
> VMware vCenter Server
> https://www.vmware.com/
>


Re: [HACKERS] On How To Shorten the Steep Learning Curve Towards PG Hacking...

2017-04-18 Thread Kang Yuzhe
Thanks Alvaro for taking your time and pointing me to "Developer_FAQ". I
knew this web page and there is good stuff in it.
The most important thing about "Developer_FAQ", I believe, is that it lists
vital books for PG developers.

Compared with the real challenge I am facing in finding my way in the rabbit
hole (the PG source code), "Developer_FAQ" is indeed less useful.

Of course, I am a beginner and I am just beginning, and one day I hope that with
your support I will figure out how to find my space in PG development.

My question is why is it that there is a lot of hands-on material about PG
application development (e.g. connecting to PG using Java/JDBC) but almost
nothing about PG hacking hands-on lessons. For example, I wanna add the keyword
"Encrypted" in CREATE TABLE t1(a int, b int encrypted) or CREATE TABLE t1(a
int, b int) encrypted. Alas, it's not an easy task.

Regards,
Zeray

On Mon, Apr 17, 2017 at 8:29 PM, Alvaro Herrera 
wrote:

> Craig Ringer wrote:
>
> > Personally I have to agree that the learning curve is very steep. Some
> > of the docs and presentations help, but there's a LOT to understand.
>
> There is a wiki page "Developer_FAQ" which is supposed to help answer
> these questions.  It is currently not very useful, because people
> stopped adding to it very early and is now mostly unmaintained, but
> I'm sure it could become a very useful central resource for this kind of
> information.
>
> --
> Álvaro Herrera                https://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


Re: [HACKERS] On How To Shorten the Steep Learning Curve Towards PG Hacking...

2017-04-17 Thread Alvaro Herrera
Craig Ringer wrote:

> Personally I have to agree that the learning curve is very steep. Some
> of the docs and presentations help, but there's a LOT to understand.

There is a wiki page "Developer_FAQ" which is supposed to help answer
these questions.  It is currently not very useful, because people
stopped adding to it very early and is now mostly unmaintained, but
I'm sure it could become a very useful central resource for this kind of
information.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] On How To Shorten the Steep Learning Curve Towards PG Hacking...

2017-04-17 Thread Simon Riggs
On 27 March 2017 at 13:00, Kang Yuzhe  wrote:

> I have found PG source code reading and hacking to be one of the most
> frustrating experiences in my life.  I believe that PG hacking should not be
> a painful journey but an enjoyable one!
>
> It is my strong belief that out of my PG hacking frustrations, there may
> come insights for the PG experts on ways to devise hands-on work with PG
> internals so that newcomers will become great coders as quickly as possible.

I'm here now because PostgreSQL has clear, well designed and
maintained code with accurate docs, great comments and a helpful team.

I'd love to see detailed cases where another project is better in a
measurable way; I am willing to learn from that.

Any journey to expertise takes 10,000 hours. There is no way to shorten that.

What aspect of your journey caused you pain?

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] On How To Shorten the Steep Learning Curve Towards PG Hacking...

2017-04-17 Thread Kevin Grittner
On Tue, Mar 28, 2017 at 10:36 PM, Craig Ringer  wrote:

> Personally I have to agree that the learning curve is very steep. Some
> of the docs and presentations help, but there's a LOT to understand.

Some small patches can be kept to a fairly narrow set of areas, and
if you can find a similar capability you can crib technique for
handling some of the more mysterious areas it might brush up
against.  When I started working on my first *big* patch that was
bound to touch many areas (around the start of development for 9.1)
I counted lines of code and found over a million lines just in .c
and .h files.  We're now closing in on 1.5 million lines.  That's
not counting over 376,000 lines of documentation in .sgml files,
over 12,000 lines of text in README* files, over 26,000 lines of
perl code, over 103,000 lines of .sql code (60% of which is in
regression tests), over 38,000 lines of .y code (for flex/bison
parsing), about 9,000 lines of various types of code just for
generating the configure file, and over 439,000 lines of .po files
(for message translations).  I'm sure I missed a lot of important
stuff there, but it gives some idea of the challenge it is to get
your head around it all.
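
For anyone who wants to reproduce counts like these on their own
checkout, a minimal sketch follows.  It assumes a plain walk over the
source tree and counts raw lines per file extension, so it will not
necessarily match whatever method produced the numbers above.  The
function name count_lines_by_extension and the extension list are
illustrative only and are not taken from any PostgreSQL tooling;
files without an extension (README* for example) are not covered.

```python
import os
from collections import Counter


def count_lines_by_extension(root, extensions):
    """Return a Counter mapping each wanted extension to its total line count.

    root       -- directory to walk recursively (e.g. a source checkout)
    extensions -- iterable of extensions including the dot, e.g. [".c", ".h"]
    """
    wanted = set(extensions)
    totals = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            ext = os.path.splitext(name)[1]
            if ext not in wanted:
                continue
            path = os.path.join(dirpath, name)
            # Read as bytes so odd encodings in .po or .sgml files cannot fail.
            with open(path, "rb") as fh:
                totals[ext] += sum(1 for _ in fh)
    return totals
```

Calling count_lines_by_extension("path/to/checkout", [".c", ".h"]) and
adding the two totals gives a figure comparable to the ".c and .h"
number; the path here is a placeholder.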

My first advice is to try to identify which areas of the code you
will need to touch, and read those over.  Several times.  Try to
infer the API to areas *that* code needs to reference from looking
at other code (as similar to what you want to work on as you can
find), reading code comments and README files, and asking
questions.  Secondly, there is a lot that is considered to be
"coding rules" that is, as far as I've been able to tell, only
contained inside the heads of veteran PostgreSQL coders, with
occasional references in the discussion list archives.  Asking
questions, proposing approaches before coding, and showing work in
progress early and often will help a lot in terms of discovering
these issues and allowing you to rearrange things to fit these
conventions.  If someone with the "gift of gab" is able to capture
these and put them into a readily available form, that would be
fantastic.

> * SSI (haven't gone there yet myself)

For anyone wanting to approach this area, there is a fair amount to
look at.  There is some overlap, but in rough order of "practical"
to "theoretical foundation", you might want to look at:

https://www.postgresql.org/docs/current/static/transaction-iso.html

https://wiki.postgresql.org/wiki/SSI

The SQL standard

https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob_plain;f=src/backend/storage/lmgr/README-SSI;hb=refs/heads/master

http://www.vldb.org/pvldb/vol5.html

http://hdl.handle.net/2123/5353

Papers cited in these last two.  I have found papers authored by
Alan Fekete or Atul Adya particularly enlightening.

If any of the other areas that Craig listed have similar work
available, maybe we should start a Wiki page where we list areas of
code (starting with the list Craig included) as section headers, and
put links to useful reading below each?

--
Kevin Grittner
VMware vCenter Server
https://www.vmware.com/




Re: [HACKERS] On How To Shorten the Steep Learning Curve Towards PG Hacking...

2017-03-29 Thread Kang Yuzhe
Thanks, Amit, for further confirming Craig's intention.

I am looking forward to seeing your "PG internal machinery under
microscope" blog. May health, persistence and courage be with YOU.

Regards,
Zeray

On Wed, Mar 29, 2017 at 10:36 AM, Amit Langote <
langote_amit...@lab.ntt.co.jp> wrote:

> On 2017/03/29 12:36, Craig Ringer wrote:
> > On 29 March 2017 at 10:53, Amit Langote 
> wrote:
> >> Hi,
> >>
> >> On 2017/03/28 15:40, Kang Yuzhe wrote:
> >>> Thanks Tsunakawa for such an informative reply.
> >>>
> >>> Almost all of the docs related to the internals of PG are of
> introductory
> >>> concepts only.
> >>> There is even more useful PG internals site entitled "The Internals of
> >>> PostgreSQL" in http://www.interdb.jp/pg/ translation of the Japanese
> PG
> >>> Internals.
> >>>
> >>> The query processing framework that is described in the manual as you
> >>> mentioned is of informative and introductory nature.
> >>> In theory, the query processing framework described in the manual is
> >>> understandable.
> >>>
> >>> Unfortunate, it is another story to understand how query processing
> >>> framework in PG codebase really works.
> >>> It has become a difficult task for me to walk through the PG source
> code
> >>> for example how SELECT/INSERT/TRUNCATE in the the different modules
> under
> >>> "src/..". really works.
> >>>
> >>> I wish there were Hands-On with PostgreSQL Internals like
> >>> https://bkmjournal.wordpress.com/2017/01/22/hands-on-with-
> postgresql-internals/
> >>> for more complex PG features.
> >>>
> >>> For example, MERGE SQL standard is not supported yet by PG.  I wish
> there
> >>> were Hands-On with PostgreSQL Internals for MERGE/UPSERT. How it is
> >>> implemented in parser/executor/storage etc. modules with detailed
> >>> explanation for each code and debugging and other important concepts
> >>> related to system programming.
> >>
> >> I am not sure if I can show you that one place where you could learn all
> >> of that, but many people who started with PostgreSQL development at some
> >> point started by exploring the source code itself (either for learning
> or
> >> to write a feature patch), articles on PostgreSQL wiki, and many related
> >> presentations accessible using the Internet. I liked the following among
> >> many others:
> >
> > Personally I have to agree that the learning curve is very steep. Some
> > of the docs and presentations help, but there's a LOT to understand.
>
> I agree too. :)
>
> > When you're getting started you're lost in a world of language you
> > don't know, and trying to understand one piece often gets you lost in
> > other pieces. In no particular order:
> >
> > * Memory contexts and palloc
> > * Managing transactions and how that interacts with memory contexts
> > and the default memory context
> > * Snapshots, snapshot push/pop, etc
> > * LWLocks, memory barriers, spinlocks, latches
> > * Heavyweight locks (and the different APIs to them)
> > * GUCs, their scopes, the rules around their callbacks, etc
> > * dynahash
> > * catalogs and oids and access methods
> > * The heap AM like heap_open
> > * relcache, catcache, syscache
> > * genam and the systable_ calls and their limitations with indexes
> > * The SPI
> > * When to use each of the above 4!
> > * Heap tuples and minimal tuples
> > * VARLENA
> > * GETSTRUCT, when you can/can't use it, other attribute fetching methods
> > * TOAST and detoasting datums.
> > * forming and deforming tuples
> > * LSNs, WAL/xlog generation and redo. Timelines. (ARGH, timelines).
> > * cache invalidations, when they can happen, and how to do anything
> > safely around them.
> > * TIDs, cmin and cmax, xmin and xmax
> > * postmaster, vacuum, bgwriter, checkpointer, startup process,
> > walsender, walreceiver, all our auxillary procs and what they do
> > * relmapper, relfilenodes vs relation oids, filenode extents
> > * ondisk structure, page headers, pages
> > * shmem management, buffers and buffer pins
> > * bgworkers
> > * PG_TRY() and PG_CATCH() and their limitations
> > * elog and ereport and errcontexts, exception unwinding/longjmp and
> > how it interacts with memory contexts, lwlocks, etc
> > * The nest of macros around datum manipulation and functions, PL
> > handlers. How to find the macros for the data types you want to work
> > with.
> > * Everything to do with the C API for arrays (is horrible)
> > * The details of the parse/rewrite/plan phases with rewrite calling
> > back into parse, paths, the mess with inheritance_planner, reading and
> > understanding plantrees
> > * The permissions and grants model and how to interact with it
> > * PGPROC, PGXACT, other main shmem structures
> > * Resource owners (which I still don't fully "get")
> > * Checkpoints, pg_control and ShmemVariableCache, crash recovery
> > * How globals are used in Pg and how they interact with fork()ing from
> > postmaster
> > * SSI (haven't gone there yet myself)
> > * 
>
> That is indeed a 

Re: [HACKERS] On How To Shorten the Steep Learning Curve Towards PG Hacking...

2017-03-29 Thread Amit Langote
On 2017/03/29 12:36, Craig Ringer wrote:
> On 29 March 2017 at 10:53, Amit Langote  wrote:
>> Hi,
>>
>> On 2017/03/28 15:40, Kang Yuzhe wrote:
>>> Thanks Tsunakawa for such an informative reply.
>>>
>>> Almost all of the docs related to the internals of PG are of introductory
>>> concepts only.
>>> There is even more useful PG internals site entitled "The Internals of
>>> PostgreSQL" in http://www.interdb.jp/pg/ translation of the Japanese PG
>>> Internals.
>>>
>>> The query processing framework that is described in the manual as you
>>> mentioned is of informative and introductory nature.
>>> In theory, the query processing framework described in the manual is
>>> understandable.
>>>
>>> Unfortunate, it is another story to understand how query processing
>>> framework in PG codebase really works.
>>> It has become a difficult task for me to walk through the PG source code
>>> for example how SELECT/INSERT/TRUNCATE in the the different modules under
>>> "src/..". really works.
>>>
>>> I wish there were Hands-On with PostgreSQL Internals like
>>> https://bkmjournal.wordpress.com/2017/01/22/hands-on-with-postgresql-internals/
>>> for more complex PG features.
>>>
>>> For example, MERGE SQL standard is not supported yet by PG.  I wish there
>>> were Hands-On with PostgreSQL Internals for MERGE/UPSERT. How it is
>>> implemented in parser/executor/storage etc. modules with detailed
>>> explanation for each code and debugging and other important concepts
>>> related to system programming.
>>
>> I am not sure if I can show you that one place where you could learn all
>> of that, but many people who started with PostgreSQL development at some
>> point started by exploring the source code itself (either for learning or
>> to write a feature patch), articles on PostgreSQL wiki, and many related
>> presentations accessible using the Internet. I liked the following among
>> many others:
> 
> Personally I have to agree that the learning curve is very steep. Some
> of the docs and presentations help, but there's a LOT to understand.

I agree too. :)

> When you're getting started you're lost in a world of language you
> don't know, and trying to understand one piece often gets you lost in
> other pieces. In no particular order:
> 
> * Memory contexts and palloc
> * Managing transactions and how that interacts with memory contexts
> and the default memory context
> * Snapshots, snapshot push/pop, etc
> * LWLocks, memory barriers, spinlocks, latches
> * Heavyweight locks (and the different APIs to them)
> * GUCs, their scopes, the rules around their callbacks, etc
> * dynahash
> * catalogs and oids and access methods
> * The heap AM like heap_open
> * relcache, catcache, syscache
> * genam and the systable_ calls and their limitations with indexes
> * The SPI
> * When to use each of the above 4!
> * Heap tuples and minimal tuples
> * VARLENA
> * GETSTRUCT, when you can/can't use it, other attribute fetching methods
> * TOAST and detoasting datums.
> * forming and deforming tuples
> * LSNs, WAL/xlog generation and redo. Timelines. (ARGH, timelines).
> * cache invalidations, when they can happen, and how to do anything
> safely around them.
> * TIDs, cmin and cmax, xmin and xmax
> * postmaster, vacuum, bgwriter, checkpointer, startup process,
> walsender, walreceiver, all our auxillary procs and what they do
> * relmapper, relfilenodes vs relation oids, filenode extents
> * ondisk structure, page headers, pages
> * shmem management, buffers and buffer pins
> * bgworkers
> * PG_TRY() and PG_CATCH() and their limitations
> * elog and ereport and errcontexts, exception unwinding/longjmp and
> how it interacts with memory contexts, lwlocks, etc
> * The nest of macros around datum manipulation and functions, PL
> handlers. How to find the macros for the data types you want to work
> with.
> * Everything to do with the C API for arrays (is horrible)
> * The details of the parse/rewrite/plan phases with rewrite calling
> back into parse, paths, the mess with inheritance_planner, reading and
> understanding plantrees
> * The permissions and grants model and how to interact with it
> * PGPROC, PGXACT, other main shmem structures
> * Resource owners (which I still don't fully "get")
> * Checkpoints, pg_control and ShmemVariableCache, crash recovery
> * How globals are used in Pg and how they interact with fork()ing from
> postmaster
> * SSI (haven't gone there yet myself)
> * 

That is indeed a big list of things to know and (have to) worry about.  If
we indeed come up with a PG-hackers-handbook someday, things in your list
could be organized such that it's clear to someone wanting to contribute
code which of those things they need to *absolutely* worry about and which
they don't.

> Personally I recall finding the magic of resource owner and memory
> context changing under me when I started/stopped xacts in a bgworker,
> along with the need to manage snapshots and SPI state to be distinctly
> 

Re: [HACKERS] On How To Shorten the Steep Learning Curve Towards PG Hacking...

2017-03-29 Thread Kang Yuzhe
Thank you all for pointing me to useful docs on PG kernel stuff, as well as
for being sympathetic to me and to a newbie question that appears to be
true and interesting but has yet to be addressed by PG experts.

Last but not least, *Craig Ringer*, you just nailed it!! You also made me
feel and think that my question is worth asking.

Regards,
Zeray

On Wed, Mar 29, 2017 at 6:36 AM, Craig Ringer  wrote:

> On 29 March 2017 at 10:53, Amit Langote 
> wrote:
> > Hi,
> >
> > On 2017/03/28 15:40, Kang Yuzhe wrote:
> >> Thanks Tsunakawa for such an informative reply.
> >>
> >> Almost all of the docs related to the internals of PG are of
> introductory
> >> concepts only.
> >> There is even more useful PG internals site entitled "The Internals of
> >> PostgreSQL" in http://www.interdb.jp/pg/ translation of the Japanese PG
> >> Internals.
> >>
> >> The query processing framework that is described in the manual as you
> >> mentioned is of informative and introductory nature.
> >> In theory, the query processing framework described in the manual is
> >> understandable.
> >>
> >> Unfortunate, it is another story to understand how query processing
> >> framework in PG codebase really works.
> >> It has become a difficult task for me to walk through the PG source code
> >> for example how SELECT/INSERT/TRUNCATE in the the different modules
> under
> >> "src/..". really works.
> >>
> >> I wish there were Hands-On with PostgreSQL Internals like
> >> https://bkmjournal.wordpress.com/2017/01/22/hands-on-with-
> postgresql-internals/
> >> for more complex PG features.
> >>
> >> For example, MERGE SQL standard is not supported yet by PG.  I wish
> there
> >> were Hands-On with PostgreSQL Internals for MERGE/UPSERT. How it is
> >> implemented in parser/executor/storage etc. modules with detailed
> >> explanation for each code and debugging and other important concepts
> >> related to system programming.
> >
> > I am not sure if I can show you that one place where you could learn all
> > of that, but many people who started with PostgreSQL development at some
> > point started by exploring the source code itself (either for learning or
> > to write a feature patch), articles on PostgreSQL wiki, and many related
> > presentations accessible using the Internet. I liked the following among
> > many others:
>
> Personally I have to agree that the learning curve is very steep. Some
> of the docs and presentations help, but there's a LOT to understand.
>
> When you're getting started you're lost in a world of language you
> don't know, and trying to understand one piece often gets you lost in
> other pieces. In no particular order:
>
> * Memory contexts and palloc
> * Managing transactions and how that interacts with memory contexts
> and the default memory context
> * Snapshots, snapshot push/pop, etc
> * LWLocks, memory barriers, spinlocks, latches
> * Heavyweight locks (and the different APIs to them)
> * GUCs, their scopes, the rules around their callbacks, etc
> * dynahash
> * catalogs and oids and access methods
> * The heap AM like heap_open
> * relcache, catcache, syscache
> * genam and the systable_ calls and their limitations with indexes
> * The SPI
> * When to use each of the above 4!
> * Heap tuples and minimal tuples
> * VARLENA
> * GETSTRUCT, when you can/can't use it, other attribute fetching methods
> * TOAST and detoasting datums.
> * forming and deforming tuples
> * LSNs, WAL/xlog generation and redo. Timelines. (ARGH, timelines).
> * cache invalidations, when they can happen, and how to do anything
> safely around them.
> * TIDs, cmin and cmax, xmin and xmax
> * postmaster, vacuum, bgwriter, checkpointer, startup process,
> walsender, walreceiver, all our auxillary procs and what they do
> * relmapper, relfilenodes vs relation oids, filenode extents
> * ondisk structure, page headers, pages
> * shmem management, buffers and buffer pins
> * bgworkers
> * PG_TRY() and PG_CATCH() and their limitations
> * elog and ereport and errcontexts, exception unwinding/longjmp and
> how it interacts with memory contexts, lwlocks, etc
> * The nest of macros around datum manipulation and functions, PL
> handlers. How to find the macros for the data types you want to work
> with.
> * Everything to do with the C API for arrays (is horrible)
> * The details of the parse/rewrite/plan phases with rewrite calling
> back into parse, paths, the mess with inheritance_planner, reading and
> understanding plantrees
> * The permissions and grants model and how to interact with it
> * PGPROC, PGXACT, other main shmem structures
> * Resource owners (which I still don't fully "get")
> * Checkpoints, pg_control and ShmemVariableCache, crash recovery
> * How globals are used in Pg and how they interact with fork()ing from
> postmaster
> * SSI (haven't gone there yet myself)
> * 
>
> Personally I recall finding the magic of resource owner and memory
> context changing under me when I 

Re: [HACKERS] On How To Shorten the Steep Learning Curve Towards PG Hacking...

2017-03-28 Thread Craig Ringer
On 29 March 2017 at 10:53, Amit Langote  wrote:
> Hi,
>
> On 2017/03/28 15:40, Kang Yuzhe wrote:
>> Thanks Tsunakawa for such an informative reply.
>>
>> Almost all of the docs related to the internals of PG are of introductory
>> concepts only.
>> There is even more useful PG internals site entitled "The Internals of
>> PostgreSQL" in http://www.interdb.jp/pg/ translation of the Japanese PG
>> Internals.
>>
>> The query processing framework that is described in the manual as you
>> mentioned is of informative and introductory nature.
>> In theory, the query processing framework described in the manual is
>> understandable.
>>
>> Unfortunate, it is another story to understand how query processing
>> framework in PG codebase really works.
>> It has become a difficult task for me to walk through the PG source code
>> for example how SELECT/INSERT/TRUNCATE in the the different modules under
>> "src/..". really works.
>>
>> I wish there were Hands-On with PostgreSQL Internals like
>> https://bkmjournal.wordpress.com/2017/01/22/hands-on-with-postgresql-internals/
>> for more complex PG features.
>>
>> For example, MERGE SQL standard is not supported yet by PG.  I wish there
>> were Hands-On with PostgreSQL Internals for MERGE/UPSERT. How it is
>> implemented in parser/executor/storage etc. modules with detailed
>> explanation for each code and debugging and other important concepts
>> related to system programming.
>
> I am not sure if I can show you that one place where you could learn all
> of that, but many people who started with PostgreSQL development at some
> point started by exploring the source code itself (either for learning or
> to write a feature patch), articles on PostgreSQL wiki, and many related
> presentations accessible using the Internet. I liked the following among
> many others:

Personally I have to agree that the learning curve is very steep. Some
of the docs and presentations help, but there's a LOT to understand.

When you're getting started you're lost in a world of language you
don't know, and trying to understand one piece often gets you lost in
other pieces. In no particular order:

* Memory contexts and palloc
* Managing transactions and how that interacts with memory contexts
and the default memory context
* Snapshots, snapshot push/pop, etc
* LWLocks, memory barriers, spinlocks, latches
* Heavyweight locks (and the different APIs to them)
* GUCs, their scopes, the rules around their callbacks, etc
* dynahash
* catalogs and oids and access methods
* The heap AM like heap_open
* relcache, catcache, syscache
* genam and the systable_ calls and their limitations with indexes
* The SPI
* When to use each of the above 4!
* Heap tuples and minimal tuples
* VARLENA
* GETSTRUCT, when you can/can't use it, other attribute fetching methods
* TOAST and detoasting datums.
* forming and deforming tuples
* LSNs, WAL/xlog generation and redo. Timelines. (ARGH, timelines).
* cache invalidations, when they can happen, and how to do anything
safely around them.
* TIDs, cmin and cmax, xmin and xmax
* postmaster, vacuum, bgwriter, checkpointer, startup process,
walsender, walreceiver, all our auxiliary procs and what they do
* relmapper, relfilenodes vs relation oids, filenode extents
* ondisk structure, page headers, pages
* shmem management, buffers and buffer pins
* bgworkers
* PG_TRY() and PG_CATCH() and their limitations
* elog and ereport and errcontexts, exception unwinding/longjmp and
how it interacts with memory contexts, lwlocks, etc
* The nest of macros around datum manipulation and functions, PL
handlers. How to find the macros for the data types you want to work
with.
* Everything to do with the C API for arrays (is horrible)
* The details of the parse/rewrite/plan phases with rewrite calling
back into parse, paths, the mess with inheritance_planner, reading and
understanding plantrees
* The permissions and grants model and how to interact with it
* PGPROC, PGXACT, other main shmem structures
* Resource owners (which I still don't fully "get")
* Checkpoints, pg_control and ShmemVariableCache, crash recovery
* How globals are used in Pg and how they interact with fork()ing from
postmaster
* SSI (haven't gone there yet myself)
* 

Personally I recall finding the magic of resource owner and memory
context changing under me when I started/stopped xacts in a bgworker,
along with the need to manage snapshots and SPI state to be distinctly
confusing.
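
As a rough illustration of why that feels like things changing under
you, here is a toy model (plain Python, not PostgreSQL code) of
hierarchical memory contexts where starting a transaction silently
switches the "current" context and committing frees everything
allocated inside it.  The class and method names are hypothetical
stand-ins; in the real backend the corresponding pieces are palloc,
CurrentMemoryContext, StartTransactionCommand and
CommitTransactionCommand, whose actual behaviour has more detail than
this sketch captures.

```python
class MemoryContext:
    """Toy stand-in for a memory context: owns allocations and child contexts."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        self.chunks = []  # values "allocated" in this context
        if parent is not None:
            parent.children.append(self)

    def alloc(self, value):
        """Record an allocation owned by this context and return it."""
        self.chunks.append(value)
        return value

    def reset(self):
        """Free all allocations here and delete every child context."""
        for child in list(self.children):
            child.delete()
        self.chunks.clear()

    def delete(self):
        """Reset this context, then detach it from its parent."""
        self.reset()
        if self.parent is not None:
            self.parent.children.remove(self)
            self.parent = None


class Backend:
    """Toy worker process that tracks which context is current."""

    def __init__(self):
        self.top = MemoryContext("Top")
        self.current = self.top
        self.xact = None

    def palloc(self, value):
        # Allocations always land in whatever context is current right now.
        return self.current.alloc(value)

    def start_transaction(self):
        self.xact = MemoryContext("Transaction", parent=self.top)
        self.current = self.xact  # the current context changes implicitly

    def commit_transaction(self):
        self.xact.delete()  # everything allocated in the transaction is freed
        self.xact = None
        self.current = self.top
```

In this model, a value allocated between start_transaction() and
commit_transaction() is gone after the commit, while a value allocated
before the transaction started survives in the top context.  Keeping a
pointer to the former across a commit is the kind of mistake that is
easy to make until the rule is known.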

There are various READMEs, blog posts, presentation slides/videos, etc
that explain bits and pieces. But not much exists to tie it together
into a comprehensible whole with simple, minimal explanations for each
part so someone who's new to it all can begin to get a handle on it,
find resources to learn more about subsystems they need to care about,
etc.

Lots of it boils down to "read the code". But so much code! You don't
know if what you're reading is really relevant or if it's even
correct, or if it makes 

Re: [HACKERS] On How To Shorten the Steep Learning Curve Towards PG Hacking...

2017-03-28 Thread Amit Langote
Hi,

On 2017/03/28 15:40, Kang Yuzhe wrote:
> Thanks Tsunakawa for such an informative reply.
> 
> Almost all of the docs related to the internals of PG are of introductory
> concepts only.
> There is even more useful PG internals site entitled "The Internals of
> PostgreSQL" in http://www.interdb.jp/pg/ translation of the Japanese PG
> Internals.
> 
> The query processing framework that is described in the manual as you
> mentioned is of informative and introductory nature.
> In theory, the query processing framework described in the manual is
> understandable.
> 
> Unfortunate, it is another story to understand how query processing
> framework in PG codebase really works.
> It has become a difficult task for me to walk through the PG source code
> for example how SELECT/INSERT/TRUNCATE in the the different modules under
> "src/..". really works.
> 
> I wish there were Hands-On with PostgreSQL Internals like
> https://bkmjournal.wordpress.com/2017/01/22/hands-on-with-postgresql-internals/
> for more complex PG features.
> 
> For example, MERGE SQL standard is not supported yet by PG.  I wish there
> were Hands-On with PostgreSQL Internals for MERGE/UPSERT. How it is
> implemented in parser/executor/storage etc. modules with detailed
> explanation for each code and debugging and other important concepts
> related to system programming.

I am not sure if I can show you that one place where you could learn all
of that, but many people who started with PostgreSQL development at some
point started by exploring the source code itself (either for learning or
to write a feature patch), articles on PostgreSQL wiki, and many related
presentations accessible using the Internet. I liked the following among
many others:

Introduction to Hacking PostgreSQL:
http://www.neilconway.org/talks/hacking/

Inside the PostgreSQL Query Optimizer:
http://www.neilconway.org/talks/optimizer/optimizer.pdf

Postgres Internals Presentations:
http://momjian.us/main/presentations/internals.html

Thanks,
Amit






Re: [HACKERS] On How To Shorten the Steep Learning Curve Towards PG Hacking...

2017-03-28 Thread Adrien Nayrat
On 03/27/2017 02:00 PM, Kang Yuzhe wrote:
> 1. Prepare Hands-on with PG internals
> 
>  For example, a complete Hands-on  with SELECT/INSERT SQL Standard PG 
> internals.
> The point is the experts can pick one fairly complex feature and walk it from
> Parser to Executor in a hands-on manner explaining step by step every 
> technical
> detail.
Hi,

Bruce Momjian has given several presentations on Postgres internals:
http://momjian.us/main/presentations/internals.html


Regards
-- 
Adrien NAYRAT






Re: [HACKERS] On How To Shorten the Steep Learning Curve Towards PG Hacking...

2017-03-28 Thread Kang Yuzhe
Thanks Tsunakawa for such an informative reply.

Almost all of the docs related to the internals of PG cover introductory
concepts only.
There is an even more useful PG internals site entitled "The Internals of
PostgreSQL" at http://www.interdb.jp/pg/, a translation of the Japanese PG
Internals.

The query processing framework that is described in the manual as you
mentioned is of informative and introductory nature.
In theory, the query processing framework described in the manual is
understandable.

Unfortunately, it is another story to understand how the query processing
framework in the PG codebase really works.
It has become a difficult task for me to walk through the PG source code,
for example how SELECT/INSERT/TRUNCATE really work in the different modules
under "src/..".

I wish there were Hands-On with PostgreSQL Internals like
https://bkmjournal.wordpress.com/2017/01/22/hands-on-with-postgresql-internals/
for more complex PG features.

For example, the SQL-standard MERGE is not yet supported by PG.  I wish there
were a Hands-On with PostgreSQL Internals for MERGE/UPSERT: how it is
implemented in the parser/executor/storage etc. modules, with a detailed
explanation of each piece of code, debugging, and other important concepts
related to system programming.

Regards,
Zeray



On Tue, Mar 28, 2017 at 6:04 AM, Tsunakawa, Takayuki <
tsunakawa.ta...@jp.fujitsu.com> wrote:

> From: pgsql-hackers-ow...@postgresql.org
> > [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Kang Yuzhe
>
> > 1. Prepare Hands-on with PG internals
> >
> >
> >  For example, a complete Hands-on  with SELECT/INSERT SQL Standard PG
> > internals. The point is the experts can pick one fairly complex feature
> > and walk it from Parser to Executor in a hands-on manner explaining step
> > by step every technical detail.
>
> I sympathize with you.  What level of detail do you have in mind?  The
> query processing framework is described in the manual:
>
> Chapter 50. Overview of PostgreSQL Internals
> https://www.postgresql.org/docs/devel/static/overview.html
>
> More detailed source code analysis is provided for very old PostgreSQL
> 7.4, but I guess it's not much different now.  The document is in Japanese
> only:
>
> http://ikubo.x0.com/PostgreSQL/pg_source.htm
>
> Are you thinking of something like this?
>
> MySQL Internals Manual
> https://dev.mysql.com/doc/internals/en/
>
>
>
>
>
> Regards
> Takayuki Tsunakawa
>
>


Re: [HACKERS] On How To Shorten the Steep Learning Curve Towards PG Hacking...

2017-03-27 Thread Tsunakawa, Takayuki
From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Kang Yuzhe

> 1. Prepare Hands-on with PG internals
> 
> 
>  For example, a complete Hands-on  with SELECT/INSERT SQL Standard PG
> internals. The point is the experts can pick one fairly complex feature
> and walk it from Parser to Executor in a hands-on manner explaining step
> by step every technical detail.

I sympathize with you.  What level of detail do you have in mind?  The query 
processing framework is described in the manual:

Chapter 50. Overview of PostgreSQL Internals
https://www.postgresql.org/docs/devel/static/overview.html

More detailed source code analysis is provided for very old PostgreSQL 7.4, but 
I guess it's not much different now.  The document is in Japanese only: 

http://ikubo.x0.com/PostgreSQL/pg_source.htm

Are you thinking of something like this?

MySQL Internals Manual
https://dev.mysql.com/doc/internals/en/





Regards
Takayuki Tsunakawa




Re: [HACKERS] On How To Shorten the Steep Learning Curve Towards PG Hacking...

2017-03-27 Thread Michael Paquier
On Mon, Mar 27, 2017 at 9:00 PM, Kang Yuzhe  wrote:
> 1. Prepare Hands-on with PG internals
>
>  For example, a complete Hands-on  with SELECT/INSERT SQL Standard PG
> internals. The point is the experts can pick one fairly complex feature and
> walk it from Parser to Executor in a hands-on manner explaining step by step
> every technical detail.

There are resources on the net, in English as well. Take for example
this manual explaining the internals of Postgres by Hironobu Suzuki:
http://www.interdb.jp/pg/
-- 
Michael




[HACKERS] On How To Shorten the Steep Learning Curve Towards PG Hacking...

2017-03-27 Thread Kang Yuzhe
Dear PG Hackers/Experts,


I am a newbie to PG hacking.
I have been reading the PG code base to find my space in it but without
success.

There are hundreds of hands-on tutorials for PG application development on
the web. Alas, there are almost none for PG hacking.

I have found reading and hacking the PG source code to be one of the most
frustrating experiences in my life.  I believe that PG hacking should not
be a painful journey but an enjoyable one!

It is my strong belief that out of my PG hacking frustrations, there may
come insights for the PG experts on how to devise hands-on material on PG
internals so that newcomers become great coders as quickly as possible.

I also believe that we should spend our time reading great papers and books
related to data management problems, but NOT reading the PG code base.

Here are my suggestions for how the experts could shorten the steep
learning curve towards PG hacking.

1. Prepare hands-on material on PG internals

For example, a complete hands-on walkthrough of the PG internals behind a
standard SQL SELECT/INSERT. The point is that the experts can pick one
fairly complex feature and walk it from parser to executor in a hands-on
manner, explaining every technical detail step by step.

2. Write a book on PG internals

There is one book on PG internals. Unfortunately, it's in Chinese.
Why not in English?
It is my strong belief that if there were a great book on PG internals
with hands-on exercises covering some of the basic PG internals machinery,
PG hacking would be almost as easy as PG application development.

If the experts help newbies understand the PG code base as quickly as
possible, that means more reviewers, more contributors and more users of PG,
which in turn means more PG usability, more PG popularity and a stronger PG
community.

These are my personal feelings, and I am ready to be corrected and advised
on the right way towards PG hacking.

Regards,
Zeray


[HACKERS] On How To Shorten The Steep Learning Curve Towards PG Hacking

2017-03-27 Thread Kang Yuzhe
Dear PG Hackers/Experts,

I am a newbie to PG hacking.
I have been reading the PG code base to find my space in it but without
success.

There are hundreds of hands-on tutorials for PG application development on
the web. Alas, there are almost none for PG hacking.

I have found reading and hacking the PG source code to be one of the most
frustrating experiences in my life.  I believe that PG hacking should not
be a painful journey but an enjoyable one!

It is my strong belief that out of my PG hacking frustrations, there may
come insights for the PG experts on how to devise hands-on material on PG
internals so that newcomers become great coders as quickly as possible.

I also believe that we should spend our time reading great papers and books
related to data management problems, but NOT reading the PG code base.

Here are my suggestions for how the experts could shorten the steep
learning curve towards PG hacking.

1. Prepare hands-on material on PG internals

For example, a complete hands-on walkthrough of the PG internals behind a
standard SQL SELECT/INSERT. The point is that the experts can pick one
fairly complex feature and walk it from parser to executor in a hands-on
manner, explaining every technical detail step by step.

2. Write a book on PG internals

There is one book on PG internals. Unfortunately, it's in Chinese.
Why not in English?
It is my strong belief that if there were a great book on PG internals
with hands-on exercises covering some of the basic PG internals machinery,
PG hacking would be almost as easy as PG application development.

If the experts help newbies understand the PG code base as quickly as
possible, that means more reviewers, more contributors and more users of PG,
which in turn means more PG usability, more PG popularity and a stronger PG
community.

These are my personal feelings, and I am ready to be corrected and advised
on the right way towards PG hacking.

Regards,
Zeray