> From: Robert Haas <robertmh...@gmail.com>
> Sent: 09 March 2017 01:09
>
>> The project that most caught my eye was on "Implementing push-based query
>> executor".
>> Although it completely fits my capabilities and current research, I have
>> some concerns on "The ability to understand and modify PostgresSQL executor
>> code" as I had not enough time to understand the dimension of the referred
>> changes.
>
> They are formidable.
>
> https://www.Postgresql.org/message-id/CA%2BTgmoaf_uR_wVMj53MVvyEQ_wRx62MM3QQwR6aPZe0Lbr%2BJew%40mail.gmail.com

I want to contribute valuable work, so I will focus on my second choice: "Sorting algorithms benchmark and implementation". Maybe when I am more familiar with the PostgreSQL project I will give the first one a try.

> From: pgsql-hackers-ow...@postgresql.org <pgsql-hackers-ow...@postgresql.org>
> on behalf of Kevin Grittner <kgri...@gmail.com>
> Sent: 17 March 2017 13:57
>
> Some ideas for desirable content:
>
> - A resume or CV of the student, including any prior GSoC work
> - Their reasons for wanting to participate
> - What else they have planned for the summer, and what their time
>   commitment to the GSoC work will be
> - A clear statement that there will be no intellectual property
>   problems with the work they will be doing -- that the PostgreSQL
>   community will be able to use their work without encumbrances
>   (e.g., there should be no agreements related to prior or
>   ongoing work which might assign the rights to the work they do
>   to someone else)
> - A description of what they will do, and how
> - Milestones with dates
> - What they consider to be the test that they have successfully
>   completed the project

Using the information posted HERE<https://www.Postgresql.org/developer/summerofcode/> and Kevin Grittner's suggestions, I would like to start writing my proposal as well as begin my work on the project.

In the last two weeks I have been using profiling tools such as dstat, top, iostat, ... on my university's cluster with the "NAS Parallel Benchmarks" package from NASA. Now I will start another academic work using DTrace on a Solaris machine.

I have permanent access to the SeARCH6 cluster, described HERE<http://search6.di.uminho.pt/wordpress/?page_id=55>. I know it is not that powerful, but it is quite heterogeneous, composed of many generations of processors, including both Intel many-core solutions (the KNC and the KNL, which is not listed), which I think is good for testing the algorithms in many different scenarios.
I have no permission to install new software, so I guess I can't use specific benchmarking software, but the cluster can still be used to test the algorithms alone, using some selected data sets.

The point here is just to let you know about relevant knowledge and resources that I may be able to use in the project. Other information about my motivations and competences can be found HERE<https://www.Postgresql.org/message-id/am5pr0801mb20202b4b6cf68e9292afaa90c1...@am5pr0801mb2020.eurprd08.prod.outlook.com>.

Anyway, I would like to accomplish some small goals before the 23 April deadline, so I can spot and be prepared for some of the trickier parts of the project.

As I will have classes and evaluations in June, and possibly an internship at the University of Texas in July, I will have to work on both tasks at the same time. So I made a schedule with what I think I can do, leaving August almost free to explore the project (micro optimisations, ...) or to compensate in case something doesn't go as expected. I would appreciate it if you could review it and advise me if I am heading in the wrong direction.

Schedule:

Before April 3:
  project specific work:
    - read all the suggested papers
    - implement all the sorting algorithms (functional but unoptimised versions)
    - validate core ideas with the community
  integration work:
    - read some of the PostgreSQL documentation and source code
    - read the HACKERS mailing list

April 3 - May 30:
  project specific work:
    - discuss possible benchmarks and optimisation possibilities
    - run a simple benchmark of the currently used sort
  integration work:
    - go further in understanding the PostgreSQL project
    - keep reading the mailing list and clarify possible doubts

May 30 - June 26 (Coding officially begins!):
    - set up the final benchmark environment
    - correctly benchmark the current sort
    - macro optimise all the implemented sorts and define performance goals
    - test the produced code vs the current one

June 26 - July 24:
  micro optimise all the algorithms:
    - study cache/memory issues, vectorisation, ...
    - first steps on parallelism
  do a full profile of the current work:
    - CPU and memory usage
    - execution time
    - number of operations (per second)

July 24 - August 29:
    - optimise parallel solutions
    - discuss some possible optimisations and test them
    - revise and document all the code
    - produce a valuable report for future reference

After August 29:
    - keep in contact and look for a possible project that fits my skills

A small aside: I read this INFO<https://wiki.postgresql.org/wiki/Mailing_Lists>, but I have been struggling with Internet-style quoting in Outlook's browser client and I end up doing it by hand. I have never used a development mailing list before, so any tip would be valuable.