Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/19660 )
Change subject: WIP - IMPALA-12019: Support ORDER BY for arrays of fixed length types in select list ...................................................................... Patch Set 3: (7 comments) http://gerrit.cloudera.org:8080/#/c/19660/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/19660/2//COMMIT_MSG@16 PS2, Line 16: > I think that the queries should be able to run without disabling codegen wi I don't think we have infrastructure to only disable codegen for the sorter, I don't know how difficult that would be... But I think we should try adding codegen in this change. http://gerrit.cloudera.org:8080/#/c/19660/2//COMMIT_MSG@27 PS2, Line 27: keys are strings which are var-len. > I think that it is ok to disable MAP sorting for now in the FE. My understa Sorting MAPs actually works, I added a test for it, but only on a small table. The problem is that our data generator can't generate maps with only fixed len data because the key has to be a string. http://gerrit.cloudera.org:8080/#/c/19660/2/be/src/runtime/sorter.cc File be/src/runtime/sorter.cc: http://gerrit.cloudera.org:8080/#/c/19660/2/be/src/runtime/sorter.cc@232 PS2, Line 232: > 'None' refers to the previous sentence, which changed. Could you rephrase t Done http://gerrit.cloudera.org:8080/#/c/19660/2/be/src/runtime/sorter.cc@234 PS2, Line 234: // is here. : DCHECK_EQ(&var_len_pages_.back(), cur_var_len_page); : uint8_t* var_data_ptr = cur_var_len_page->AllocateBytes(total_var_len); : if (INITIAL_RUN) { : CopyVarLenData(string_values, coll_values_and_sizes, var_data_ptr); : } else { : CopyVarLenDataConvertOffset(string_values, coll_values_and_sizes, : var_len_pages_.size() - 1, cur_var_len_page->data(), var_data_ptr); : } : } : ++num_tuples_; : ++*num_processed; : ++cur_input_index; : } : : // If there are still rows left to process, get a new page for the fixed-length : // tuples. If the run is already too long, return. : if (cur_input_index < batch->num_rows()) { : bool added; : RETURN_IF_ERROR(TryAddPage(add_mode, &fixed_len_pages_, &added)); : if (!added) return Status::OK(); : cur_fixed_len_page = &fixed_len_pages_.back(); : } : } : return Status::OK(); > This check should be done outside this hot loop, e.g. during initialization We used to allow "illegal" tuples if they are NULL (see L259 in PS2). For this we need to read the actual data that is not available at initialisation. On the other hand I don't think it's important that we allow these NULL tuples, especially as the FE will not allow them. I moved this check to Run::Init() and removed the exception to allow NULL tuples. Also, this is only a DCHECK in the end, so the compiler should optimise it out in release builds. http://gerrit.cloudera.org:8080/#/c/19660/2/be/src/runtime/tuple.cc File be/src/runtime/tuple.cc: http://gerrit.cloudera.org:8080/#/c/19660/2/be/src/runtime/tuple.cc@236 PS2, Line 236: RawValue::Write now handles NULL values, we don't have to handle it here. If the value is NULL then 'string_values' and 'collection_values' will remain empty and we don't enter the IF-THEN blocks. http://gerrit.cloudera.org:8080/#/c/19660/2/fe/src/main/java/org/apache/impala/analysis/QueryStmt.java File fe/src/main/java/org/apache/impala/analysis/QueryStmt.java: http://gerrit.cloudera.org:8080/#/c/19660/2/fe/src/main/java/org/apache/impala/analysis/QueryStmt.java@369 PS2, Line 369: MapType mapType = (MapType) type; > MAP sorting is not supported at the moment, right? No, it works, we just don't have a large table to test it on. I added a small table. http://gerrit.cloudera.org:8080/#/c/19660/2/tests/query_test/test_sort.py File tests/query_test/test_sort.py: http://gerrit.cloudera.org:8080/#/c/19660/2/tests/query_test/test_sort.py@312 PS2, Line 312: query_result = self.execute_query(quer > This is not clear to me - are we comparing only single a column? Shouldn't You're right, I changed it to compare whole rows (and transposition is not needed). -- To view, visit http://gerrit.cloudera.org:8080/19660 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic7974ef392c1412e8c60231e3420367bd189677a Gerrit-Change-Number: 19660 Gerrit-PatchSet: 3 Gerrit-Owner: Daniel Becker <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Daniel Becker <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Noemi Pap-Takacs <[email protected]> Gerrit-Reviewer: Peter Rozsa <[email protected]> Gerrit-Comment-Date: Fri, 14 Apr 2023 10:01:47 +0000 Gerrit-HasComments: Yes
