Hello,

Great job! I have left inline comments on your questions.

Best regards,
Hyunsik

On Jul 16, 2013, at 5:43 AM, camelia c <[email protected]> wrote:

> Hello,
>
> Thank You very much for Your reply and the reference book.
> I managed to follow the steps and created a github account with a mirror of Apache TAJO at https://github.com/camelia-c/incubator-tajo .
> I uploaded a diagram at https://sites.google.com/site/gsoc2013tajo34/github
>
> In the diagram, the outerjoin_1 branch is intended for development whereas the master branch is for stable code. For the moment I didn't include any of my code on outer join yet because I want to firstly set up correctly the repository.

Your setup looks correct. You can work on your repository. In general, the master branch is used as the seed for new branches, and most work is done in other branches.

>
> Still, I am not completely sure of the following aspects:
>
> 1) I mention that immediately after the git clone command, I issued the following commands:
>
> $ git remote add --track master upstream git://github.com/apache/incubator-tajo.git
>
> $ git fetch upstream
> From git://github.com/apache/incubator-tajo
> * [new branch] master -> upstream/master
>

Use 'git pull' to synchronise your branch against the Apache repository. In your case, where the Apache remote repository is named 'upstream', check out the branch that you want to update and pull as follows:

$ git checkout [working_branch]
$ git pull upstream master

It will fetch the updated source code and try to merge it into your working code.

> $ git rebase upstream/master
> Current branch master is up to date.
>
> But I am not sure of whether they are enough to perform automatic synchronization....or should I still perform manual synchronization periodically?
>

Most users rebase their source code manually, because in many cases a merge needs hand work.

> 2) After this setup, is it still necessary to run periodically the command You suggested last time (git pull origin master)?
> In this configuration, the command $git pull origin master
> is going to synchronize the local repository on my computer with git://github.com/apache/incubator-tajo.git or with git://github.com/camelia-c/incubator-tajo.git?
>

Yes, occasionally, you should pull the updated source code from the Apache git repository.

>
> 3) If I run this command from the outerjoin_1 branch, i.e.
>
> git checkout outerjoin_1
> git pull origin master
>
> is it going to affect only the outerjoin_1 branch?

Yes, the 'pull' only affects your current branch (i.e., outerjoin_1 in your example).

>
> 4) And the last question, regarding execution: now the interactive shell is launched with $TAJO_HOME/bin/tsql instead of $TAJO_HOME/bin/tajo cli ?
>

Yes, tsql was added recently for convenience. tsql is equivalent to bin/tajo cli.

>
> Thank You very much!
>
> Yours sincerely,
> Camelia
>
>
> ________________________________
> From: Hyunsik Choi <[email protected]>
> To: camelia c <[email protected]>
> Cc: tajo-dev <[email protected]>
> Sent: Saturday, July 13, 2013 7:28 AM
> Subject: Re: [GSoc2013] - Outer Join
>
>
> Hi Camelia,
>
> I have left inline comments on your questions.
>
>
> On Fri, Jul 12, 2013 at 9:13 PM, camelia c <[email protected]> wrote:
>
>> Hello,
>>
>> Thank You very much for Your feedback!
>>
>> I completed the outer joins to inner joins rewriting part and I plan to
>> follow Your advice and move the rewriting methods to LogicalOptimizer.
>> The new processing is described in
>> https://sites.google.com/site/gsoc2013tajo34/home/validation , where I
>> also uploaded the source code as files.
>>
>>
> Your work looks good. However, first of all, I would like to encourage you
> to learn an SCM like Git.
>
> Actually, your source code cannot be merged into the current Tajo source
> code because Tajo source code has been changed by multiple developers. It
> is very hard to manage individual source code files against a changing
> source tree.
>
> The main objective of Google Summer of Code is to encourage open source
> participation. So, you need to learn the overall system of open source
> development. Above all, you should learn an SCM like Git.
>
>
>> 1) I think that the allTables data structure as well as the
>> validateOuterJoin and recursiveWhere methods should remain in class
>> QueryAnalyzer, as they belong to the stage where the query is analyzed and
>> validated.
>> In my opinion, only methods rewriteOuterJoin,
>> recursiveRewriteMultiNullSupplier, recursiveRewriteNullRestricted should be
>> moved to class LogicalOptimizer as they perform optimizations on the
>> logical plan.
>> What do You think about this?
>>
>>
> Sounds great. Let's go ahead with that :)
>
>
>> 2) I would like to kindly ask You how can I continually rebase my work on
>> the latest Tajo version, "rebase continually your work on updated source
>> code"?
>> Usually I issue this command:
>>
>> mvn package -DskipTests -Pdist -Dtar
>>
>> What should I do before this?
>>
>
> If you downloaded the source code via git, just type as follows:
>
> $ git pull origin master
>
> Probably, you will meet some conflicts. If you don't know git, you should
> learn git in order to solve the conflicts. You can refer to some manuals
> available online. I would like to recommend this one (http://git-scm.com/book).
>
>
>>
>> 3) I read on the mailing lists that the Tajo Cli changed and was improved.
>> But besides the query acceptance, does this affect in any way the stages of
>> the query processing, after its parsing?
>>
>>
> The Tajo Cli change was only on the client side. It does not affect the
> part on which you have worked.
>
>
>> 4) Also, I read some posts on the mailing list related to integration
>> tests.
>> Where can I find these and how should I use them in order to verify that
>> my work integrates well with the rest of the source code?
>>
>>
> The following command runs the unit tests and integration tests. It
> verifies most parts of Tajo.
>
> $ mvn clean install
>
>
>> My work so far only affects queries containing at least one outer join, so
>> for queries consisting only of inner joins no modification is made.
>
>
>> As a final remark, it was easier to manage the recursion without
>> EvalTreeUtil. Hope it's ok.
>>
>
> That's great.
>
>
>>
>> Thank You in advance!
>>
>> Yours sincerely,
>> Camelia
>>
>
> Best regards,
> Hyunsik
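
For reference, the synchronisation workflow discussed in this thread can be tried out without touching any real repository. The script below is only a sketch under stated assumptions: the directories apache, mirror.git and work, and the file outerjoin.txt, are made-up local stand-ins for the Apache repository, the GitHub mirror and a working clone, and no network access is used. The --no-rebase and --no-edit options are added here only so that the pull merges without prompting on recent git versions; they are not part of the commands quoted above.

```shell
#!/bin/sh
# Sketch of the sync workflow using throwaway local repositories.
# All repository and file names here are hypothetical stand-ins.
set -e
demo=$(mktemp -d)

# Stand-in for the Apache repository, with one commit on master.
git init -q "$demo/apache"
cd "$demo/apache"
git symbolic-ref HEAD refs/heads/master   # force the branch name to master
git config user.name "Demo"; git config user.email "demo@example.invalid"
echo "v1" > README
git add README
git commit -q -m "Initial import"

# Stand-in for the personal GitHub mirror, plus a local working clone of it.
git clone -q --bare "$demo/apache" "$demo/mirror.git"
git clone -q "$demo/mirror.git" "$demo/work"
cd "$demo/work"
git config user.name "Demo"; git config user.email "demo@example.invalid"
git remote add upstream "$demo/apache"    # 'origin' is the mirror, 'upstream' is Apache

# Development happens on a topic branch, not on master.
git checkout -q -b outerjoin_1
echo "outer join work" > outerjoin.txt
git add outerjoin.txt
git commit -q -m "Start outer join work"

# Meanwhile, the Apache repository moves ahead.
cd "$demo/apache"
echo "v2" > README
git commit -q -am "Upstream change"

# Synchronise the topic branch: fetch upstream master and merge it in.
cd "$demo/work"
git checkout -q outerjoin_1
git pull -q --no-rebase --no-edit upstream master

# Compare the two branches: only the checked-out branch received the update.
git log --oneline outerjoin_1
git log --oneline master
```

After the pull, outerjoin_1 contains both the local commit and the upstream change, while the local master branch is untouched until it is checked out and pulled separately. In this setup 'git pull origin master' would read from the mirror, not from the Apache stand-in.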
