Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19660 )

Change subject: WIP - IMPALA-12019: Support ORDER BY for arrays of fixed length 
types in select list
......................................................................


Patch Set 3:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/19660/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19660/2//COMMIT_MSG@16
PS2, Line 16:
> I think that the queries should be able to run without disabling codegen wi
I don't think we have infrastructure to only disable codegen for the sorter, I 
don't know how difficult that would be... But I think we should try adding 
codegen in this change.


http://gerrit.cloudera.org:8080/#/c/19660/2//COMMIT_MSG@27
PS2, Line 27:    keys are strings which are var-len.
> I think that it is ok to disable MAP sorting for now in the FE. My understa
Sorting MAPs actually works, I added a test for it, but only on a small table. 
The problem is that our data generator can't generate maps with only fixed len 
data because the key has to be a string.


http://gerrit.cloudera.org:8080/#/c/19660/2/be/src/runtime/sorter.cc
File be/src/runtime/sorter.cc:

http://gerrit.cloudera.org:8080/#/c/19660/2/be/src/runtime/sorter.cc@232
PS2, Line 232:
> 'None' refers to the previous sentence, which changed. Could you rephrase t
Done


http://gerrit.cloudera.org:8080/#/c/19660/2/be/src/runtime/sorter.cc@234
PS2, Line 234:         // is here.
             :         DCHECK_EQ(&var_len_pages_.back(), cur_var_len_page);
             :         uint8_t* var_data_ptr = 
cur_var_len_page->AllocateBytes(total_var_len);
             :         if (INITIAL_RUN) {
             :           CopyVarLenData(string_values, coll_values_and_sizes, 
var_data_ptr);
             :         } else {
             :           CopyVarLenDataConvertOffset(string_values, 
coll_values_and_sizes,
             :               var_len_pages_.size() - 1, 
cur_var_len_page->data(), var_data_ptr);
             :         }
             :       }
             :       ++num_tuples_;
             :       ++*num_processed;
             :       ++cur_input_index;
             :     }
             :
             :     // If there are still rows left to process, get a new page 
for the fixed-length
             :     // tuples. If the run is already too long, return.
             :     if (cur_input_index < batch->num_rows()) {
             :       bool added;
             :       RETURN_IF_ERROR(TryAddPage(add_mode, &fixed_len_pages_, 
&added));
             :       if (!added) return Status::OK();
             :       cur_fixed_len_page = &fixed_len_pages_.back();
             :     }
             :   }
             :   return Status::OK();
> This check should be done outside this hot loop, e.g. during initialization
We used to allow "illegal" tuples if they are NULL (see L259 in PS2). For this 
we need to read the actual data that is not available at initialisation. On the 
other hand I don't think it's important that we allow these NULL tuples, 
especially as the FE will not allow them.

I moved this check to Run::Init() and removed the exception to allow NULL 
tuples.

Also, this is only a DCHECK in the end, so the compiler should optimise it out 
in release builds.


http://gerrit.cloudera.org:8080/#/c/19660/2/be/src/runtime/tuple.cc
File be/src/runtime/tuple.cc:

http://gerrit.cloudera.org:8080/#/c/19660/2/be/src/runtime/tuple.cc@236
PS2, Line 236:
RawValue::Write now handles NULL values, we don't have to handle it here. If 
the value is NULL then 'string_values' and 'collection_values' will remain 
empty and we don't enter the IF-THEN blocks.


http://gerrit.cloudera.org:8080/#/c/19660/2/fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
File fe/src/main/java/org/apache/impala/analysis/QueryStmt.java:

http://gerrit.cloudera.org:8080/#/c/19660/2/fe/src/main/java/org/apache/impala/analysis/QueryStmt.java@369
PS2, Line 369:         MapType mapType = (MapType) type;
> MAP sorting is not supported at the moment, right?
No, it works, we just don't have a large table to test it on. I added a small 
table.


http://gerrit.cloudera.org:8080/#/c/19660/2/tests/query_test/test_sort.py
File tests/query_test/test_sort.py:

http://gerrit.cloudera.org:8080/#/c/19660/2/tests/query_test/test_sort.py@312
PS2, Line 312:     query_result = self.execute_query(quer
> This is not clear to me - are we comparing only single a column? Shouldn't
You're right, I changed it to compare whole rows (and transposition is not 
needed).



--
To view, visit http://gerrit.cloudera.org:8080/19660
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic7974ef392c1412e8c60231e3420367bd189677a
Gerrit-Change-Number: 19660
Gerrit-PatchSet: 3
Gerrit-Owner: Daniel Becker <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Daniel Becker <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Noemi Pap-Takacs <[email protected]>
Gerrit-Reviewer: Peter Rozsa <[email protected]>
Gerrit-Comment-Date: Fri, 14 Apr 2023 10:01:47 +0000
Gerrit-HasComments: Yes

Reply via email to