Hello,

I'm very glad to hear that Tajo is getting stable and production ready.

Here's some more news. Some of us might have seen Hyunsik's presentations
mentioning an experimental project, the C++ Tajo worker.


Recently, I've been working on the C++ worker. Although it is somewhat behind
the schedule I expected, it can now communicate with the Tajo Master and
Query Master successfully.

This is not meant to replace the Java worker; it is an interchangeable
alternative. We could use Java workers as is, C++ workers only, or a mix of both.

It is designed as a vectorized execution engine, in the hope of processing
certain types of data structures very efficiently.
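To illustrate what "vectorized" means here, below is a minimal sketch of
batch-at-a-time filtering and aggregation over a column. It is not taken from
the actual worker code: the names Int64Column, filter_greater_than and
sum_selected are hypothetical, and the real worker generates its filter
evaluation with LLVM rather than using a hand-written loop like this one.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical column batch: the values of one column stored contiguously.
struct Int64Column {
    std::vector<std::int64_t> values;
};

// Evaluate "value > threshold" over the whole batch in one tight loop and
// return the indices of the qualifying rows (a selection vector).
std::vector<std::size_t> filter_greater_than(const Int64Column& col,
                                             std::int64_t threshold) {
    std::vector<std::size_t> selected;
    selected.reserve(col.values.size());
    for (std::size_t i = 0; i < col.values.size(); ++i) {
        if (col.values[i] > threshold) {
            selected.push_back(i);
        }
    }
    return selected;
}

// Aggregate only the rows named by the selection vector.
std::int64_t sum_selected(const Int64Column& col,
                          const std::vector<std::size_t>& selected) {
    std::int64_t total = 0;
    for (std::size_t idx : selected) {
        total += col.values[idx];
    }
    return total;
}
```

The idea is that each operator processes a whole batch of values per call
instead of one row at a time, which keeps the inner loops simple and
cache-friendly.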

These are the supported features right now.

- Reading and parsing CSV files on the Hadoop data node
- Filtering rows using LLVM-generated evaluation code
- Simple scalar functions
- Simple group by aggregation functions

I'm now working on the 'order by' clause and profiling to reach the expected
performance.

While working on these, I would like to ask the community to allow creating a
new git branch, say native_worker, cplus_worker, or a nicer name, so that we
can work on this project together.
 


There's still a long way to go on this project, but it could be improved with
the help of the Tajo community.


Thanks
Min
