Hi Vishal,
I think, we need both solution 1 & 2
Solution1 may need re-desiging several parts of Carbon's query process
starting from scanner, aggregator to result preparation. This can help
avoid the frequent cache invalidation.
In Solution2 code generation will not solve the frequent cache invalidation
problem. However, It will surely help to improve the performance by having
specialised code instead of executing generalised code. Especially as we
support several data types and our code is generalised for that. Code
generation will help to improve performance.
Regards
Vimal
On Thu, Oct 13, 2016 at 3:02 AM, Aniket Adnaik
wrote:
> Hi Vishal,
>
> In general, it is good idea to have a cache efficient algorithm.
>
> For solution-1 : how do you want to handle variable length columns and
> nulls? may be you will have to maintain variable length columns separately
> and use offsets ?
>
> For solution 2: code generation may be more efficient solution. We should
> find out all other places in executor that can benefit from code generation
> apart from row formation. BTW, any specific code generation library you
> have mind?
>
> Best Regards,
> Aniket
>
> On Wed, Oct 12, 2016 at 10:02 AM, Kumar Vishal
> wrote:
>
> > Hi Jacky,
> > Yes result preparation in exeutor side.
> >
> > -Regards
> > Kumar Vishal
> >
> > On Wed, Oct 12, 2016 at 9:33 PM, Jacky Li wrote:
> >
> > > Hi Vishal,
> > >
> > > Which part of the preparation are you considering? The column stitching
> > in
> > > the executor side?
> > >
> > > Regards,
> > > Jacky
> > >
> > > > 在 2016年10月12日,下午9:24,Kumar Vishal 写道:
> > > >
> > > > Hi All,
> > > > Currently we are preparing the final result row wise, as number of
> > > columns
> > > > present in project list(80 columns) is high mainly measure column or
> no
> > > > dictionary column there are lots of cpu cache invalidation is
> happening
> > > and
> > > > this is resulting to slower the query performance.
> > > >
> > > > *I can think of two solutions for this problem.*
> > > > *Solution 1*. Fill column data vertically, currently it is
> > > horizontally(It
> > > > may not solve all the problem)
> > > > *Solution 2*. Use code generation for result preparation.
> > > >
> > > > This is an initially idea.
> > > >
> > > > -Regards
> > > > Kumar Vishal
> > >
> > >
> > >
> > >
> >
>