I agree that there are parts of the current design that are frustrating,
even with the relatively small amount of time I have spent with the
freemarker templates.

I think Julian is worried that we would be moving to a more traditional
Java architecture and losing performance, but my impression is that you
want to expand the templates into source code structured like the current
template output and check that into version control.

If my impression is correct, we run the risk of needing to fix problems in
several places and of introducing inconsistencies between the copies. That
risk always needs to be balanced against debugging efficiency and execution
speed.

I believe there is actually a separate architecture/build problem, and
fixing it might address the issue with the functions more appropriately.
When I have worked on the value vectors, I have always been able to modify
the generated freemarker output, rebuild the project in my IDE, and get
immediate feedback on the changes. When changing the functions, on the
other hand, I have always had to run the build from the command line,
because we pick up the source code of the functions for Java code
generation from the JAR files created in the build. As far as I know, there
is no way to compile all of the current functions during the build without
re-running the freemarker generation. This pretty much requires making all
changes to the functions in the templates themselves. It leaves no
opportunity to debug quickly in the IDE, get the changes working with one
of the types, and then make the corresponding changes to the templates for
final testing with the other types after a run of the full build.

I have two suggestions that might address this problem. The first is to
take a look at the interpreted expression evaluator that was added for the
constant folding rule. To avoid the overhead of compiling Java during
planning, an alternative expression interpreter is used. The interpreter as
it works now (it was refactored a little from Jinfeng's original design to
solve some build issues) just creates instances of the Java classes for the
functions and uses reflection to tie together the results of each while
walking the expression tree. This allows stepping into the function bodies
directly in the IDE, with full debugging available for all of the members
of the function class. It also does not rely on fetching the source code
from the full build process, so you can make changes to functions and test
them immediately.
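
To make the idea concrete, here is a minimal Java sketch of that style of
reflective interpretation. It is not the actual interpreter code: every
name in it (ReflectiveInterpreterSketch, AddFunction, MultiplyFunction,
Literal, Call, and the eval(long, long) signature) is a hypothetical
stand-in, and the real function classes are more involved than a single
two-argument method. It only shows the shape of the approach: instantiate
the function class, invoke it through reflection, and feed each result to
the parent node while walking the tree.

```java
import java.lang.reflect.Method;

public class ReflectiveInterpreterSketch {

    /** Hypothetical function class; a breakpoint inside eval() works in the IDE. */
    public static class AddFunction {
        public long eval(long left, long right) { return left + right; }
    }

    /** Second hypothetical function class, to show nesting. */
    public static class MultiplyFunction {
        public long eval(long left, long right) { return left * right; }
    }

    /** Hypothetical expression tree: a node is either a literal or a call. */
    public interface Expr {}

    public static final class Literal implements Expr {
        final long value;
        public Literal(long value) { this.value = value; }
    }

    public static final class Call implements Expr {
        final Class<?> functionClass;
        final Expr left;
        final Expr right;
        public Call(Class<?> functionClass, Expr left, Expr right) {
            this.functionClass = functionClass;
            this.left = left;
            this.right = right;
        }
    }

    /** Walks the tree bottom-up, invoking each function class via reflection. */
    public static long evaluate(Expr expr) throws ReflectiveOperationException {
        if (expr instanceof Literal) {
            return ((Literal) expr).value;
        }
        Call call = (Call) expr;
        // Evaluate the children first; their results become this call's arguments.
        long left = evaluate(call.left);
        long right = evaluate(call.right);
        // Create an instance of the compiled function class; no generated source needed.
        Object function = call.functionClass.getDeclaredConstructor().newInstance();
        Method eval = call.functionClass.getMethod("eval", long.class, long.class);
        return (Long) eval.invoke(function, left, right);
    }

    public static void main(String[] args) throws Exception {
        // Represents 2 + (3 * 4).
        Expr tree = new Call(AddFunction.class,
                new Literal(2),
                new Call(MultiplyFunction.class, new Literal(3), new Literal(4)));
        System.out.println(evaluate(tree)); // prints 14
    }
}
```

Because the function bodies run as ordinary compiled classes, editing
AddFunction and re-running main from the IDE picks up the change without a
full build, which is the property I am after.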

Secondly, it may be worthwhile to allow the build to run while skipping the
freemarker generation. This would allow testing changes to the functions
after modifying the generated Java files for a small set of types, rather
than requiring a change to the template that must work for everything. It
would still require running most of the build between tests, but I think
this is the only way to test the functions in the context of the
code-generation-based evaluation.

- Jason

On Mon, Apr 27, 2015 at 3:09 PM, Julian Hyde <[email protected]> wrote:

> If you can write all of this as Java code and the JIT can handle it, that
> would be great. But I’d be surprised.
>
> If the freemarker template for a particular function is complex, that is
> probably because it is doing a lot of customization for particular cases.
> In other words, it is adding a lot of value.
>
> If the templates make simple things look simple and complex things
> possible, they are a textbook example of a good architecture. Don’t throw
> them out.
>
> You know the code a lot better than I do. If you have any counter-examples
> to what I’ve said above, please prove me wrong. :)
>
> Julian
>
>
>
> On Apr 27, 2015, at 2:57 PM, Mehant Baid <[email protected]> wrote:
>
> > Hey All,
> >
> > We use freemarker templates extensively for generating source code for
> > value vectors and functions. However, over time the template logic for
> > functions has become complicated by the need to modify the function
> > templates based on input data type, nullability, function type, etc.
> > This makes the template code extremely hard to read and debug. The
> > proposal going forward is to de-templatize the functions and check in
> > the source for the various functions (as we have been doing recently
> > for hash functions, etc.). Although it's not ideal, since it involves
> > some code duplication, the advantage is that the source is then 'IDE
> > friendly' and much more readable. As part of the work for DRILL-2870 I
> > am planning to de-templatize the aggregate functions.
> >
> > Thanks
> > Mehant
>
>
