I think that alignment is not the goal here, because the tuples themselves are not aligned, as there is no padding at their end - e.g. if a tuple's size is 17 bytes, the first tuple will start at offset 0, the next at 17 ...
A comment about the lack of padding: https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java#L68
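
To make the 17-byte example concrete, here is a minimal sketch in plain Python (not Impala code). The function name and the 8-byte boundary are assumptions chosen only for illustration; the point is just that back-to-back tuples without trailing padding quickly land on unaligned start addresses.

```python
# Illustration only, not Impala code: fixed-size tuples stored back to back
# with no padding after each tuple.
TUPLE_SIZE = 17  # bytes, the example size used above
ALIGNMENT = 8    # assumed boundary of interest (hypothetical choice)


def tuple_offsets(tuple_size, count):
    """Return the start offset of each of `count` consecutive tuples."""
    return [i * tuple_size for i in range(count)]


offsets = tuple_offsets(TUPLE_SIZE, 4)
aligned = [off % ALIGNMENT == 0 for off in offsets]

for off, ok in zip(offsets, aligned):
    print(f"tuple at offset {off}: {'aligned' if ok else 'unaligned'}")
```

Only the first tuple starts on an 8-byte boundary here, so whatever ordering the slots have inside a tuple, their absolute addresses are not aligned in general.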
I remember a different reason, but I couldn't find the comment that mentioned it:
The goal of sorting is related to big tuples that span more than one cache line - by sorting, all the small slots move to the end (just before the null indicator flag bytes), so we will have "dense" cache lines that are used by many slots, and "sparse" ones that are used by only a few big slots. If all slots are accessed with the same probability during expression evaluation, this layout increases the chance that some "sparse" lines won't be accessed at all, leading to less pressure on the cache.

Note that this is far from an optimal strategy IMPO - it could be improved by considering when a slot will be used, e.g. slots used in predicates could be moved near the null indicator flags. If the predicate fails, the rest of the slots (and the cache lines that contain them) are not accessed at all.

On Mon, Feb 8, 2021 at 2:24 PM Zoltán Borók-Nagy <borokna...@apache.org> wrote:

> Though we don't require tuples to have any memory alignment based on the
> comment in
>
> https://github.com/apache/impala/blob/81d5377c27f1940235db332e43f1d0f073cf3d2f/be/src/runtime/tuple.h#L61-L63
> , but I do believe we sort slots to get a packed and aligned memory layout
> for the tuples in most cases. CPU operations on aligned addresses are more
> efficient than operations on unaligned addresses.
>
> BR,
> Zoltan
>
>
> On Mon, Feb 8, 2021 at 10:12 AM 许益铭 <x1860...@gmail.com> wrote:
>
> > why tuple memory need to be sorted by slot size? is has any optimize?
> >