Re: [Discussion] Code generation in carbon result preparation

2016-10-14 Thread Vimal Das Kammath
Hi Vishal,

I think we need both Solution 1 and Solution 2.

Solution 1 may need redesigning several parts of Carbon's query process,
from the scanner and aggregator through to result preparation. This can help
avoid the frequent cache invalidation.
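
To make the row-wise versus column-wise distinction concrete, here is a
minimal sketch. It is not Carbon's code: the class ResultFillSketch, its
method names, and the use of plain long[] columns are assumptions made only
for illustration.

```java
import java.util.Arrays;

final class ResultFillSketch {

    // Row-wise (horizontal) fill: for every row, visit every column.
    // With many columns, each inner step reads from a different source array.
    static Object[][] fillRowWise(long[][] columns, int rowCount) {
        Object[][] rows = new Object[rowCount][columns.length];
        for (int r = 0; r < rowCount; r++) {
            for (int c = 0; c < columns.length; c++) {
                rows[r][c] = columns[c][r];
            }
        }
        return rows;
    }

    // Column-wise (vertical) fill: copy one column at a time, so each
    // copy walks a single array sequentially.
    static long[][] fillColumnWise(long[][] columns, int rowCount) {
        long[][] batch = new long[columns.length][];
        for (int c = 0; c < columns.length; c++) {
            batch[c] = Arrays.copyOf(columns[c], rowCount);
        }
        return batch;
    }

    public static void main(String[] args) {
        long[][] columns = { {1, 2, 3}, {10, 20, 30} };
        System.out.println(Arrays.deepToString(fillRowWise(columns, 3)));
        System.out.println(Arrays.deepToString(fillColumnWise(columns, 3)));
    }
}
```

The column-wise variant keeps the result in columnar form, which is why the
later stages (aggregator, result consumer) would also need to change.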

In Solution 2, code generation will not solve the frequent cache invalidation
problem. However, it will surely help to improve performance by executing
specialised code instead of generalised code, especially as we support
several data types and our current code is generalised to handle all of them.
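
As a rough illustration of generalised versus specialised code, the sketch
below compares a generic row fill, which checks the data type for every
value, with a hand-written method of the kind a code generator might emit
for one specific projection (INT, DOUBLE). All names here (RowFillSketch,
DataType, fillGeneric, fillIntDouble) are hypothetical, and no particular
code generation library is implied.

```java
final class RowFillSketch {

    enum DataType { INT, LONG, DOUBLE }

    // Generalised path: the type is inspected for every value of every row.
    static Object readGeneric(DataType type, Object column, int row) {
        switch (type) {
            case INT:
                return ((int[]) column)[row];
            case LONG:
                return ((long[]) column)[row];
            case DOUBLE:
                return ((double[]) column)[row];
            default:
                throw new IllegalArgumentException("unsupported type: " + type);
        }
    }

    static Object[] fillGeneric(DataType[] types, Object[] columns, int row) {
        Object[] out = new Object[types.length];
        for (int c = 0; c < types.length; c++) {
            out[c] = readGeneric(types[c], columns[c], row);
        }
        return out;
    }

    // Specialised path: what generated code for the projection (INT, DOUBLE)
    // could look like. No switch, no casts, no loop over column metadata.
    static Object[] fillIntDouble(int[] c0, double[] c1, int row) {
        return new Object[] { c0[row], c1[row] };
    }

    public static void main(String[] args) {
        int[] ids = { 7, 8 };
        double[] prices = { 1.5, 2.5 };
        DataType[] types = { DataType.INT, DataType.DOUBLE };
        Object[] generic = fillGeneric(types, new Object[] { ids, prices }, 1);
        Object[] special = fillIntDouble(ids, prices, 1);
        System.out.println(java.util.Arrays.equals(generic, special));
    }
}
```

Both paths produce the same row; the specialised one simply removes the
per-value type dispatch.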

Regards
Vimal

On Thu, Oct 13, 2016 at 3:02 AM, Aniket Adnaik 
wrote:

> Hi Vishal,
>
> In general, it is a good idea to have a cache-efficient algorithm.
>
> For solution 1: how do you want to handle variable-length columns and
> nulls? Maybe you will have to maintain variable-length columns separately
> and use offsets?
>
> For solution 2: code generation may be the more efficient solution. We should
> find out all other places in the executor that can benefit from code generation
> apart from row formation. BTW, do you have any specific code generation
> library in mind?
>
> Best Regards,
> Aniket
>
> On Wed, Oct 12, 2016 at 10:02 AM, Kumar Vishal 
> wrote:
>
> > Hi Jacky,
> > Yes, result preparation on the executor side.
> >
> > -Regards
> > Kumar Vishal
> >
> > On Wed, Oct 12, 2016 at 9:33 PM, Jacky Li  wrote:
> >
> > > Hi Vishal,
> > >
> > > Which part of the preparation are you considering? The column stitching
> > > on the executor side?
> > >
> > > Regards,
> > > Jacky
> > >
> > > > On Oct 12, 2016, at 9:24 PM, Kumar Vishal wrote:
> > > >
> > > > Hi All,
> > > > Currently we prepare the final result row-wise. Because the number of
> > > > columns in the projection list is high (80 columns, mainly measure
> > > > columns or no-dictionary columns), a lot of CPU cache invalidation
> > > > happens, which slows down query performance.
> > > >
> > > > *I can think of two solutions for this problem.*
> > > > *Solution 1*. Fill column data vertically; currently it is filled
> > > > horizontally. (It may not solve all of the problem.)
> > > > *Solution 2*. Use code generation for result preparation.
> > > >
> > > > This is an initial idea.
> > > >
> > > > -Regards
> > > > Kumar Vishal
> > >
> > >
> > >
> > >
> >
>
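
On the question raised above about variable-length columns and nulls, one
common layout is to store all values back to back in a single byte array,
with an offsets array marking where each value starts and a separate bitmap
marking nulls. The sketch below assumes that layout; the class
VarLengthColumnSketch and its fields are hypothetical and are not taken from
Carbon.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

final class VarLengthColumnSketch {
    private final byte[] data;    // all non-null values, concatenated
    private final int[] offsets;  // value i occupies data[offsets[i] .. offsets[i + 1])
    private final BitSet nulls;   // bit i set means row i is null

    VarLengthColumnSketch(String[] values) {
        offsets = new int[values.length + 1];
        nulls = new BitSet(values.length);
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        for (int i = 0; i < values.length; i++) {
            if (values[i] == null) {
                nulls.set(i); // a null contributes no bytes
            } else {
                byte[] bytes = values[i].getBytes(StandardCharsets.UTF_8);
                buf.write(bytes, 0, bytes.length);
            }
            offsets[i + 1] = buf.size();
        }
        data = buf.toByteArray();
    }

    // Returns the value at the given row, or null if the row is null.
    String get(int row) {
        if (nulls.get(row)) {
            return null;
        }
        int start = offsets[row];
        int length = offsets[row + 1] - start;
        return new String(data, start, length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        VarLengthColumnSketch col =
            new VarLengthColumnSketch(new String[] { "ab", null, "", "xyz" });
        for (int i = 0; i < 4; i++) {
            System.out.println(col.get(i));
        }
    }
}
```

The null bitmap is what distinguishes a null from an empty string, since both
have zero length in the offsets array.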


Re: [Discussion] Code generation in carbon result preparation

2016-10-12 Thread Jacky Li
Hi Vishal,

Which part of the preparation are you considering? The column stitching on the
executor side?

Regards,
Jacky

> On Oct 12, 2016, at 9:24 PM, Kumar Vishal wrote:
> 
> Hi All,
> Currently we prepare the final result row-wise. Because the number of columns
> in the projection list is high (80 columns, mainly measure columns or
> no-dictionary columns), a lot of CPU cache invalidation happens, which
> slows down query performance.
> 
> *I can think of two solutions for this problem.*
> *Solution 1*. Fill column data vertically; currently it is filled
> horizontally. (It may not solve all of the problem.)
> *Solution 2*. Use code generation for result preparation.
> 
> This is an initial idea.
> 
> -Regards
> Kumar Vishal