Joe McDonnell created IMPALA-14181:
--------------------------------------

             Summary: DCHECK in Sorter::Run::ConvertValueOffsetsToPtrs() with low memory
                 Key: IMPALA-14181
                 URL: https://issues.apache.org/jira/browse/IMPALA-14181
             Project: IMPALA
          Issue Type: Bug
          Components: Backend
    Affects Versions: Impala 5.0.0
            Reporter: Joe McDonnell

When testing with tuple caching, queries start to use more memory. I hit this DCHECK while running test_sort.py's TestArraySort:
{noformat}
F20250623 14:59:49.492782 1002052 sorter.cc:829] d84fbaa341de8fbb:027dfece00000000] Check failed: page_offset == 0 (2311 vs. 0)
{noformat}
From here:
{noformat}
    if (page_index > var_len_pages_index_) {
      // We've reached the page boundary for the current var-len page.
      // This tuple will be returned in the next call to GetNext().
      DCHECK_GE(page_index, 0);
      DCHECK_LE(page_index, var_len_pages_.size());
      DCHECK_EQ(page_index, var_len_pages_index_ + 1);
      DCHECK_EQ(page_offset, 0); <-------------- HERE
      // The data is the first thing in the next page.
      // This must be the first slot with var len data for the
      // tuple. Var len data for tuple shouldn't be split
      // across blocks.
      DCHECK(AllPrevSlotsAreNullsOrSmall<ValueType>(tuple, slots, idx));
      return false;
    }{noformat}
This is easy to reproduce by using a slightly tighter memory limit. On my machine, the following reproduces it:
{noformat}
set max_sort_run_size=2;
set num_nodes=1;
-- ordinarily buffer_pool_limit=44m, but use a slightly tighter value
set buffer_pool_limit=41m;
select string_col, int_array, double_map, string_array, mixed
from functional_parquet.arrays_big
order by string_col;{noformat}
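
For readers unfamiliar with the invariant behind the failing check, here is a minimal standalone sketch of it. It is not the actual sorter code: the PageLocation struct, the Pack/Unpack helpers, the packed 64-bit encoding (page index in the high 32 bits, byte offset in the low 32 bits), and the CheckOffset function are all assumptions made for illustration. It only models the rule stated in the quoted comments: if a tuple's var-len data lives in the next page, it must start at offset 0 of that page.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical model of where a var-len value lives within a sort run.
struct PageLocation {
  uint32_t page_index;   // which var-len page holds the data
  uint32_t page_offset;  // byte offset of the data within that page
};

// Assumed encoding, for illustration only: page index in the high 32 bits,
// byte offset in the low 32 bits.
inline uint64_t Pack(uint32_t page_index, uint32_t page_offset) {
  return (static_cast<uint64_t>(page_index) << 32) | page_offset;
}

inline PageLocation Unpack(uint64_t packed) {
  return PageLocation{static_cast<uint32_t>(packed >> 32),
                      static_cast<uint32_t>(packed & 0xFFFFFFFFu)};
}

enum class OffsetCheck { kInCurrentPage, kStartsNextPage, kInvariantViolated };

// Classifies a packed offset relative to the var-len page currently being read.
// The "next page" branch mirrors the DCHECKs quoted above; treating an offset
// that points at an earlier page as a violation is an assumption of this sketch.
inline OffsetCheck CheckOffset(uint64_t packed, uint32_t current_page_index,
                               size_t num_pages) {
  const PageLocation loc = Unpack(packed);
  if (loc.page_index == current_page_index) return OffsetCheck::kInCurrentPage;
  if (loc.page_index > current_page_index) {
    const bool ok = loc.page_index <= num_pages &&
                    loc.page_index == current_page_index + 1 &&
                    loc.page_offset == 0;  // data must start the next page
    return ok ? OffsetCheck::kStartsNextPage : OffsetCheck::kInvariantViolated;
  }
  return OffsetCheck::kInvariantViolated;
}
```

In this model, CheckOffset(Pack(4, 2311), 3, 8) reports a violation, which corresponds to the "(2311 vs. 0)" in the log line: the offset points into the next page but not at its start. The page indexes 3 and 4 and the page count 8 are made-up values; only the offset 2311 comes from the log.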