Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/15963 )
Change subject: IMPALA-6692: Trigger sort node run before hitting memory limit. ...................................................................... Patch Set 9: (7 comments) The change looks good to me in general, I only have some nits and improvement suggestions. http://gerrit.cloudera.org:8080/#/c/15963/9//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/15963/9//COMMIT_MSG@7 PS9, Line 7: IMPALA-6692 Should this solve the issue in general, or this is just one step in that direction? The 3-6 second AddBatchTime still seems enough to fill the buffers and cause back pressure. http://gerrit.cloudera.org:8080/#/c/15963/9//COMMIT_MSG@23 PS9, Line 23: This patch speedup the decision to start the sort without waiting it : to hit memory limit first by capping the intermediary quicksort run to : lower memory limit, I was thinking about ways to minimize the regression this can cause in queries that spill more because of the sort_bytes_limit, e.g. where the memory need of the sort is between sort_bytes_limit and mem limit. A compromise could be to start using sort_bytes_limit only after the sorter had to spill at least once. This would mean no spilling if it is not needed at all, and the "big sort + spilling" could only occur once for an instance, so the penalty would not scale with the size of the data set. This may potentially allow us to set a safe default for sort_bytes_limit that rarely leads to regression but still helps in huge queries. http://gerrit.cloudera.org:8080/#/c/15963/9//COMMIT_MSG@47 PS9, Line 47: AddBatchTime Does this mean the maximum time of a single AddBatchTime call, or the total time of fragment instance that spent to most time in the function? http://gerrit.cloudera.org:8080/#/c/15963/9/be/src/runtime/sorter.cc File be/src/runtime/sorter.cc: http://gerrit.cloudera.org:8080/#/c/15963/9/be/src/runtime/sorter.cc@880 PS9, Line 880: || (state_->query_options().sort_bytes_limit > 0 : && buffer_pool_client_->GetUsedReservation() : >= state_->query_options().sort_bytes_limit) nit: indentation looks a bit messy, but I am not sure what is the rule here Moving the check to a function like bool ReachedSortBytesLimit() could help. http://gerrit.cloudera.org:8080/#/c/15963/9/be/src/runtime/sorter.cc@894 PS9, Line 894: timer.Stop(); I don't know if it matters, but this means that the counter won't be increased if the function returns with an error, while it is possible that a huge amount of time was spent here. http://gerrit.cloudera.org:8080/#/c/15963/9/be/src/service/query-options.cc File be/src/service/query-options.cc: http://gerrit.cloudera.org:8080/#/c/15963/9/be/src/service/query-options.cc@904 PS9, Line 904: RETURN_IF_ERROR(ParseMemValue(value, "sort bytes limit", &sort_bytes_limit)); I wonder if it would make sense add a minimum limit, e.g. 32MB. The problem with small values is that it would greatly increase the number of runs, which can lead to doing the merge phase of the sort in multiple levels, as all runs during a merge needs a buffer, see Sorter::MaxRunsInNextMerge(). It seems also ok to enforce this in Sorter without notifying the user. http://gerrit.cloudera.org:8080/#/c/15963/9/tests/query_test/test_sort.py File tests/query_test/test_sort.py: http://gerrit.cloudera.org:8080/#/c/15963/9/tests/query_test/test_sort.py@79 PS9, Line 79: query, exec_option, table_format=table_format) nit: +2 indentation -- To view, visit http://gerrit.cloudera.org:8080/15963 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2a0ba7c4bae4f1d300d4d9d7f594f63ced06a240 Gerrit-Change-Number: 15963 Gerrit-PatchSet: 9 Gerrit-Owner: Riza Suminto <riza.sumi...@cloudera.com> Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com> Gerrit-Reviewer: David Rorke <dro...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com> Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Comment-Date: Thu, 04 Jun 2020 14:31:00 +0000 Gerrit-HasComments: Yes