Indrajith,

Thank you for your interest in joining the MADlib project.

Based on your follow-up questions, I'm guessing the question is more "how do I understand the code of the project" and less "what are the mechanics of contributing to an Apache project".

You might want to start by reading over the design documents on our community site:

http://madlib.incubator.apache.org/community.html

Some of that material may still need migration to the new incubator wiki, and some of it may be a bit out of date. If you want to help out, working on that migration would be a great way to get started with the project: it lets you work through the changes while also coming to understand the content.

Overall, you can think of MADlib as having three major components:

1. The Python driver functions
2. The C++ implementation functions
3. The C++ database abstraction layer

1. Python driver functions

The driver functions are mostly located in the subdirectories under:

https://github.com/apache/incubator-madlib/tree/master/src/ports/postgres/modules

These functions are the main entry point for user input and are largely responsible for the flow control of the algorithms. Generally an implementation consists of validating input parameters, executing SQL statements, evaluating the results, and potentially looping to execute more SQL statements until some convergence criterion has been met.

2. The C++ implementation functions

Mostly located under:

https://github.com/apache/incubator-madlib/tree/master/src/modules

These functions are the C++ definitions of the core functions and aggregates needed for particular algorithms. They are implemented in C++ rather than Python for performance reasons.

3. The C++ database abstraction layer

Mostly located under:

https://github.com/apache/incubator-madlib/tree/master/src/dbal

and

https://github.com/apache/incubator-madlib/tree/master/src/ports/postgres/dbconnector

These functions attempt to provide a programming interface that abstracts away the Postgres internal details, so that MADlib can support different backend platforms and focus on the internal functionality rather than the platform integration logic.

Finally... ask questions. We're happy to help you get familiar with the project.

Cheers,
Caleb

On Fri, Oct 23, 2015 at 9:00 AM, Indrajith Udayakumara <[email protected]> wrote:

> We are a team of undergraduates. We are interested to study madlib,
> so we need your help. How can we understand this from the begining?
> can we have some references? We would like to contribute this project.
>
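
P.S. To make the driver-function flow from item 1 a little more concrete, here is a rough sketch of the pattern: validate parameters, run a SQL statement, evaluate the result, and loop until convergence. This is not MADlib code. The function name fit_slope, the table layout (columns x and y), and the use of Python's built-in sqlite3 module are all assumptions chosen so the example runs anywhere; sqlite3 is only a stand-in for the database and is not how MADlib talks to Postgres.

```python
import sqlite3


def fit_slope(conn, table, learning_rate=0.1, tolerance=1e-9, max_iter=1000):
    """Hypothetical driver: fit y ~ w * x by gradient descent.

    Each iteration pushes the per-row work into one SQL aggregate and
    keeps only the flow control in Python.
    """
    # 1. Validate input parameters before touching the database.
    if not table.isidentifier():
        raise ValueError("table must be a plain identifier")
    if learning_rate <= 0 or max_iter <= 0:
        raise ValueError("learning_rate and max_iter must be positive")

    w = 0.0
    for iteration in range(1, max_iter + 1):
        # 2. Execute a SQL statement: the gradient of the mean squared error
        #    with respect to w (up to a constant factor), computed in the database.
        (gradient,) = conn.execute(
            f"SELECT AVG((? * x - y) * x) FROM {table}", (w,)
        ).fetchone()
        if gradient is None:
            raise ValueError("table is empty")

        # 3. Evaluate the result and update the model state.
        step = learning_rate * gradient
        w -= step

        # 4. Stop once the convergence criterion is met; otherwise loop
        #    and issue another SQL statement.
        if abs(step) < tolerance:
            return w, iteration
    return w, max_iter


# Usage with a tiny in-memory table where y is exactly 2 * x.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE points (x REAL, y REAL)")
demo.executemany("INSERT INTO points VALUES (?, ?)", [(1, 2), (2, 4), (3, 6)])
slope, iterations = fit_slope(demo, "points")
print(f"slope={slope:.6f} after {iterations} iterations")
```

The point of the split is the same as in the description above: the loop and the convergence check live in Python, while anything that touches every row is expressed as SQL so the database can do the heavy lifting.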
