[castor-dev] GSoC 2010 introduction

Dennis Butterstein Wed, 28 Apr 2010 07:23:04 -0700

Hi folks,
I'll take the chance to introduce myself to the community.

My name is Dennis Butterstein and I'll be contributing to Castor as part
of Google Summer of Code over the next few months. I hope to be able to
combine my work on Castor with my master's thesis, which I'll start
writing soon.


Roughly speaking, my subject will be to refactor the loading of entities
from the database so that it becomes more maintainable, more extensible
and clearer.

For those of you interested in details, I've added a more detailed
description taken from my application for Google Summer of Code (the
rest of you can ignore this part =) ):

    I have started to implement some refactorings to get to know the
    current codebase and the classes I will need to know in order to
    start working on GSoC at full force right on time. As stated in Jira
    issue CASTOR-2888, some refactorings are needed before the loading
    strategies themselves can be tackled. So the first step will be to
    adapt the KeyGenerator implementations to use the new
    CastorConnection, which wraps the PersistenceFactory in use as well
    as the connection in use (java.sql.Connection). By doing so we gain
    the ability to use CastorStatement for SQLStatementInsert as well.
    SQLStatementDelete, SQLStatementUpdate and SQLStatementInsert will
    then be constructed in a very similar way, which leads to a cleaner
    codebase.

    At present the SQLStatement classes construct not only the SQL
    strings but also the parameter map. To decouple the order of columns
    in select statements from the order of columns in the result set,
    they would have to construct a map of return values as well.

    To separate those steps, namely construction of the SQL string, the
    parameter map and the map of results, the visitor pattern could
    help. Another point is that using the visitor pattern will provide
    more flexibility in constructing query strings (e.g. a specific
    visitor could be used for each database).
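
As a rough sketch of the separation described above: one visitor builds only the SQL string, a second one builds only the parameter map, and both walk the same query model. All class names here (QueryVisitor, Select, Condition, SqlStringVisitor, ParameterMapVisitor) are hypothetical and are not Castor's actual API, and reading the "parameter map" as a map from column name to placeholder index is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical query model for illustration; not Castor's real classes.
interface QueryVisitor {
    void visit(Select select);
    void visit(Condition condition);
}

final class Condition {
    final String column;
    final Object value;

    Condition(String column, Object value) {
        this.column = column;
        this.value = value;
    }

    void accept(QueryVisitor visitor) {
        visitor.visit(this);
    }
}

final class Select {
    final String table;
    final List<String> columns;
    final List<Condition> conditions;

    Select(String table, List<String> columns, List<Condition> conditions) {
        this.table = table;
        this.columns = columns;
        this.conditions = conditions;
    }

    void accept(QueryVisitor visitor) {
        visitor.visit(this);
    }
}

// Builds only the SQL string, with '?' placeholders for the values.
class SqlStringVisitor implements QueryVisitor {
    private final StringBuilder sql = new StringBuilder();

    @Override
    public void visit(Select select) {
        sql.append("SELECT ").append(String.join(", ", select.columns));
        sql.append(" FROM ").append(select.table);
        for (int i = 0; i < select.conditions.size(); i++) {
            sql.append(i == 0 ? " WHERE " : " AND ");
            select.conditions.get(i).accept(this);
        }
    }

    @Override
    public void visit(Condition condition) {
        sql.append(condition.column).append(" = ?");
    }

    String sql() {
        return sql.toString();
    }
}

// Builds only the parameter map: column name -> 1-based placeholder index.
// Assumes each column appears in at most one condition.
class ParameterMapVisitor implements QueryVisitor {
    private final Map<String, Integer> parameters = new LinkedHashMap<>();

    @Override
    public void visit(Select select) {
        for (Condition condition : select.conditions) {
            condition.accept(this);
        }
    }

    @Override
    public void visit(Condition condition) {
        parameters.put(condition.column, parameters.size() + 1);
    }

    Map<String, Integer> parameters() {
        return parameters;
    }
}
```

A database-specific dialect could then be a subclass of SqlStringVisitor that overrides only the visit methods whose syntax differs, while the query model and ParameterMapVisitor stay untouched.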


    So I think that, based on these changes, it will be possible to
    start by using the visitor pattern to build the SQL query strings
    and the parameter map. This will serve as a reference implementation
    for recognizing and resolving possible problems.

    The subsequent task will be to integrate the visitor pattern into
    the current codebase, start using it and test it (not least to check
    that the entire functionality has been preserved).

    Then it will be time to add new functionality. The current select
    class hierarchy has to be extended to support joins and ordering. By
    mapping the columns of the select block to names that are used to
    access values of the result set, the sequence of columns and the
    access to values can be decoupled. Subsequently this functionality
    has to be integrated into the visitor pattern.
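
To illustrate the decoupling of column order from value access, here is a minimal sketch. The ColumnMap class is hypothetical (not part of Castor), and a plain Object[] stands in for one row of the result set so that the example runs without a database.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: remembers under which position each named column
// was added to the select block, so callers never deal with positions.
final class ColumnMap {
    private final Map<String, Integer> indexByName = new LinkedHashMap<>();

    // Registers a column and returns its 1-based index (JDBC-style).
    // Registering the same name twice returns the existing index.
    int add(String name) {
        Integer existing = indexByName.get(name);
        if (existing != null) {
            return existing;
        }
        int index = indexByName.size() + 1;
        indexByName.put(name, index);
        return index;
    }

    // Column list for the select block, in registration order.
    String selectList() {
        return String.join(", ", indexByName.keySet());
    }

    // Looks a value up by column name; 'row' stands in for a result set row.
    Object value(Object[] row, String name) {
        Integer index = indexByName.get(name);
        if (index == null) {
            throw new IllegalArgumentException("Unknown column: " + name);
        }
        return row[index - 1];
    }
}
```

With a real java.sql.ResultSet, the lookup would presumably end in resultSet.getObject(index) instead of an array access. Either way, reordering or adding columns in the select block would not affect the code that reads values by name.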

    At this point SQLStatementLoad can be refactored to use the select
    class hierarchy to build the query string, execute the statement and
    extract columns from the result set. Until now SQLStatementLoad has
    done these tasks on its own.

    After that we can use the select class hierarchy for
    SQLStatementQuery as well. First ParseTreeWalker, OQLQueryImpl and
    QueryResults (and some other classes) will have to be adapted to
    support the new class hierarchy. As a first step this will be done
    for OQL queries only. Whether to refactor ParseTreeWalker or to use
    the parser created during GSoC 2008 will have to be evaluated when
    the time comes.

    SQL pass-through queries will follow, but for them we will first
    have to evaluate how to get results and bind parameters in this
    case.

    Having done these things should make it much easier to adapt the
    loading strategies. Based on benchmarks (like the ones in
    cpaptf/src/site/resources/results/) obtained from a reference
    machine, I'll try to enhance the loading strategies step by step. I
    have thought about making some comparisons with other similar
    projects (e.g. Hibernate) if suitable benchmarks exist. We'll have
    to see whether it will be possible to implement an automated
    decision strategy that chooses the most efficient loading strategy.
    Another option: we could make the loading strategy configurable, as
    developers should know enough about their project to be able to
    estimate the dimensions of relations.

    My work will not contain the implementation of any loading strategy
    or similar. It will only evaluate possibilities and present
    benchmark results that point out the direction for future work.

I have just started refactoring SQLStatementUpdate to use CastorConnection.

Well, all that remains to be said is that I'm happy to get this chance
and I'm looking forward to working with you.

Here's to successful cooperation!
