Will do, Julian – sorry, I was traveling and missed this.
PS: Everyone – please feel free to take the lead in drafting a report if you
want to do that.
As a general update, I’d like to welcome Tarun Bansal as a committer. He is
helping make the data ingest part of Quickstep more robust (it is quite fragile
right now and a corrupt record crashes the system). He has an initial PR:
https://github.com/apache/incubator-quickstep/pull/99. Welcome Tarun!
In other updates, Harshad with help from others is working on fixing the big
performance issues that we have with Aggregation. The key change is combining
different aggregate handles into one. Rathijit started work on this (Thanks
Rathijit!) and has an open PR
(https://github.com/apache/incubator-quickstep/pull/90). There is a fair amount
of cleanup still needed to make this work, which Harshad is doing with Rathijit.
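To make the idea of combining aggregate handles concrete, here is a rough sketch of one way it can work. This is not the code in the PR: CombinedState and AggregateCombined are hypothetical names, and std::unordered_map stands in for Quickstep's own hash tables. The point is that SUM, COUNT, and MIN for a group share one payload, so each input row costs a single hash lookup instead of one lookup per aggregate.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <unordered_map>
#include <vector>

// Hypothetical combined payload: every aggregate state for one group lives
// in a single struct instead of in separate per-aggregate hash tables.
struct CombinedState {
  std::int64_t sum = 0;
  std::int64_t count = 0;
  std::int64_t min = std::numeric_limits<std::int64_t>::max();
};

// Computes SUM, COUNT, and MIN per group key in one pass.
// std::unordered_map is an assumption used only for illustration.
std::unordered_map<std::int32_t, CombinedState> AggregateCombined(
    const std::vector<std::int32_t> &keys,
    const std::vector<std::int64_t> &values) {
  assert(keys.size() == values.size());
  std::unordered_map<std::int32_t, CombinedState> groups;
  for (std::size_t i = 0; i < keys.size(); ++i) {
    // One lookup per row; all three aggregates are updated from it.
    CombinedState &state = groups[keys[i]];
    state.sum += values[i];
    ++state.count;
    if (values[i] < state.min) {
      state.min = values[i];
    }
  }
  return groups;
}
```

With separate handles, the same query would probe a hash table three times per row, once for each aggregate.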
One issue that keeps coming up is the TypedValue (this shows up in the
aggregate work too). It is too heavy-weight and a huge performance bottleneck.
Marc is looking at how it could be removed from another place where it is a
bottleneck, namely the ValueAccessors (which are themselves showing their
age and have design issues of their own).
Essentially, between the TypedValue, ValueAccessors, and HashTables (for
aggregation and joins) there are crucial design issues that need to be
addressed to remove performance bottlenecks from the core inner loops. The
approach being considered is to remove TypedValues as much as possible,
refactor code in the ValueAccessors and over time move to more transparent
iterators that also allow for compilers to unroll loops, and allow for more
efficient hash tables that have more flexibility in terms of the payload they
can take. Any other ideas are welcome. Essentially, we want
to spend a few months removing grunge code, making performance issues
transparent, and improving performance. This will also help new developers
approach the code.
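As a rough illustration of why a tagged value hurts in inner loops, here is a small sketch. TaggedValue is a hypothetical stand-in and not Quickstep's actual TypedValue, and SumBoxed and SumTyped are made-up names. The boxed loop checks the type tag for every value, while the typed loop resolves the type once at compile time and walks contiguous memory, which a compiler can unroll or vectorize much more easily.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a heavy-weight tagged value.
enum class TypeTag { kInt64, kDouble };

struct TaggedValue {
  TypeTag tag;
  union {
    std::int64_t i64;
    double f64;
  } payload;
};

// Boxed path: the type tag is inspected for every value in the inner loop.
double SumBoxed(const std::vector<TaggedValue> &column) {
  double total = 0;
  for (const TaggedValue &v : column) {
    switch (v.tag) {
      case TypeTag::kInt64:
        total += static_cast<double>(v.payload.i64);
        break;
      case TypeTag::kDouble:
        total += v.payload.f64;
        break;
    }
  }
  return total;
}

// Typed path: the element type is fixed at template instantiation, so the
// loop body is a plain add over contiguous values with no per-value branch.
template <typename T>
T SumTyped(const T *begin, const T *end) {
  assert(begin <= end);
  T total = 0;
  for (const T *it = begin; it != end; ++it) {
    total += *it;
  }
  return total;
}
```

A more transparent iterator over a column would look closer to the second function: the type dispatch happens once per block of values rather than once per value.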
Tentatively, we should plan for a first release of Quickstep before the winter
break. The first release can target the single-node case, focusing on
high performance for SQL on large-memory, multi-core hardware.
If you have ideas on early adopters that we should approach, please share them
with the group.
On 9/13/16, 12:35 AM, "Julian Hyde" <jh...@apache.org> wrote:
Quickstep didn’t file a report for the Board meeting this month. Can you
please be sure to file one next month?