I think that alignment is not the goal here, because the tuples themselves
are not aligned: there is no padding at their end. E.g. if a tuple's size
is 17 bytes, the first tuple will start at offset 0, the next at 17, and so
on. A comment about the lack of padding:
https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java#L68
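
To make the 17-byte example concrete, here is a minimal sketch (plain
Python, not Impala code; the function names and the 8-byte alignment are my
own assumptions for illustration) that computes where back-to-back tuples
start and whether those start offsets are 8-byte aligned:

```python
TUPLE_SIZE = 17  # bytes, the example size used above


def tuple_offsets(tuple_size, count):
    """Start offsets of `count` tuples stored back to back without padding."""
    return [i * tuple_size for i in range(count)]


def is_aligned(address, alignment):
    """True if `address` is a multiple of `alignment`."""
    return address % alignment == 0


offsets = tuple_offsets(TUPLE_SIZE, 4)
# Hypothetical 8-byte alignment requirement, e.g. for a BIGINT slot at the
# start of each tuple.
aligned_flags = [is_aligned(offset, 8) for offset in offsets]

print(offsets)        # [0, 17, 34, 51]
print(aligned_flags)  # [True, False, False, False]
```

Only the first tuple happens to be aligned, so sorting slots inside the
tuple cannot guarantee aligned access in general.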

I remember a different reason, but I couldn't find the comment that
mentioned it:
The goal of sorting is related to big tuples that span more than one cache
line. By sorting, all small slots move to the end (just before the null
indicator bytes), so we will have "dense" cache lines that are used by many
slots, and "sparse" ones used only by a few big slots. If all slots are
accessed with the same probability during expression evaluation, this
layout increases the chance that some "sparse" cache lines won't be
accessed at all, leading to less pressure on the cache.
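
A rough sketch of the "dense" vs. "sparse" idea, in plain Python rather
than Impala code. The slot sizes, the 64-byte cache line size, and the
assumption that the tuple starts on a cache line boundary are all made up
for illustration:

```python
from collections import Counter

CACHE_LINE = 64  # bytes; assumed cache line size


def layout_descending(slot_sizes):
    """Place slots back to back, largest first; returns (offset, size) pairs."""
    layout = []
    offset = 0
    for size in sorted(slot_sizes, reverse=True):
        layout.append((offset, size))
        offset += size
    return layout


def slots_per_cache_line(layout, line_size=CACHE_LINE):
    """Count how many slots overlap each cache line of the tuple."""
    counts = Counter()
    for offset, size in layout:
        first_line = offset // line_size
        last_line = (offset + size - 1) // line_size
        for line in range(first_line, last_line + 1):
            counts[line] += 1
    return dict(counts)


# Hypothetical tuple: one big 128-byte slot and seven small slots.
slot_sizes = [1, 8, 4, 128, 16, 1, 8, 4]
usage = slots_per_cache_line(layout_descending(slot_sizes))
print(usage)  # {0: 1, 1: 1, 2: 7}
```

Here cache lines 0 and 1 are "sparse" (only the big slot lives there),
while line 2 is "dense" and holds all seven small slots.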

Note that this is far from an optimal strategy IMPO. It could be improved
by considering when a slot will be used, e.g. slots used in predicates
could be moved next to the null indicator bytes. If the predicate fails,
the rest of the slots (and the memory that holds them) are never accessed.
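
To illustrate the suggestion, a hypothetical sketch in plain Python (this
is not how Impala lays out tuples today; the slot names, sizes, the single
null indicator byte, and the 64-byte cache line are assumptions). It
compares which cache lines are touched when evaluating a predicate on
"key" under a size-sorted layout and under a layout that puts the predicate
slot next to the null indicators:

```python
CACHE_LINE = 64  # bytes; assumed cache line size


def build_layout(slots):
    """slots: (name, size) pairs in layout order.

    Returns name -> (offset, size), with one null indicator byte appended
    after the last slot.
    """
    layout = {}
    offset = 0
    for name, size in slots:
        layout[name] = (offset, size)
        offset += size
    layout["null_indicators"] = (offset, 1)
    return layout


def lines_touched(layout, names, line_size=CACHE_LINE):
    """Set of cache line indexes overlapped by the named slots."""
    lines = set()
    for name in names:
        offset, size = layout[name]
        lines.update(range(offset // line_size,
                           (offset + size - 1) // line_size + 1))
    return lines


payload = [(f"payload_{i}", 8) for i in range(10)]
size_sorted = build_layout([("key", 16)] + payload)     # largest slot first
predicate_last = build_layout(payload + [("key", 16)])  # predicate slot last

# Evaluating the predicate needs the slot and its null indicator.
accessed = ["key", "null_indicators"]
size_sorted_lines = lines_touched(size_sorted, accessed)
predicate_last_lines = lines_touched(predicate_last, accessed)
print(size_sorted_lines)     # {0, 1}
print(predicate_last_lines)  # {1}
```

In this made-up case a failing predicate touches two cache lines with the
size-sorted layout but only one when the predicate slot sits next to the
null indicators.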




On Mon, Feb 8, 2021 at 2:24 PM Zoltán Borók-Nagy <borokna...@apache.org>
wrote:

> We don't require tuples to have any memory alignment, according to the
> comment in
>
> https://github.com/apache/impala/blob/81d5377c27f1940235db332e43f1d0f073cf3d2f/be/src/runtime/tuple.h#L61-L63
> , but I do believe we sort slots to get a packed and aligned memory layout
> for the tuples in most cases. CPU operations on aligned addresses are more
> efficient than operations on unaligned addresses.
>
> BR,
>     Zoltan
>
>
> On Mon, Feb 8, 2021 at 10:12 AM 许益铭 <x1860...@gmail.com> wrote:
>
> > Why does tuple memory need to be sorted by slot size? Does it bring any
> > optimization?
> >
>
