[Impala-ASF-CR] IMPALA-5706: Spilling sort optimisations

Gabor Kaszab (Code Review) Tue, 15 May 2018 07:06:07 -0700

Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9943 )


Change subject: IMPALA-5706: Spilling sort optimisations
......................................................................


Patch Set 8:

(16 comments)

http://gerrit.cloudera.org:8080/#/c/9943/7/be/src/runtime/sorter.cc
File be/src/runtime/sorter.cc:

http://gerrit.cloudera.org:8080/#/c/9943/7/be/src/runtime/sorter.cc@a1052
PS7, Line 1052:
> Can we preserve this DCHECK but make it about the next page being pinned, o
I think you meant the page under page_index + 2 as the one on page_index + 1 
should be always pinned due to double buffering. (And I DCHECK on this 
condition already.)

I'll add the DCHECK you mention to "if(is_pinned_)", however, it has a bit 
strange aftertaste :)

Update: I'll revert this as double-buffering is removed from the scope of this 
patch.


http://gerrit.cloudera.org:8080/#/c/9943/7/be/src/runtime/sorter.cc@880
PS7, Line 880:
> Comment needs updating.
Done


http://gerrit.cloudera.org:8080/#/c/9943/7/be/src/runtime/sorter.cc@881
PS7, Line 881:   // Pins the first var len page if there is any.
> Might be cleaner to rewrite these as loops, e.g.
It's rather min(2, fixed_len_pages_.size()) but get the point :) Was wondering 
whether to have nested if's or a loop. Went for the first one but this in fact 
seems cleaner.


http://gerrit.cloudera.org:8080/#/c/9943/7/be/src/runtime/sorter.cc@1049
PS7, Line 1049: st<StringV
> nit: idx/index inconsistency. I'm fine with either but the inconsistency in
Sure, will follow how it was before.

Update: not valid since double-buffering is no longer included.


http://gerrit.cloudera.org:8080/#/c/9943/7/be/src/runtime/sorter.cc@1052
PS7, Line 1052:     }
> This DCHECK doesn't seem useful - if we messed up the vector addressing ari
Thx, done.


http://gerrit.cloudera.org:8080/#/c/9943/7/be/src/runtime/sorter.cc@1551
PS7, Line 1551:     RETURN_IF_ERROR(AddBatchNoSpill(batch, cur_batch_index, 
&num_processed));
> Not a big deal but we could reduce the vertical whitespace in this function
I personally prefer vertical segmentation of the code but as I see in Impala 
the short functions are usually dense, so I'm fine removing the empty lines.


http://gerrit.cloudera.org:8080/#/c/9943/7/be/src/runtime/sorter.cc@1682
PS7, Line 1682:         if (num_required_buffers <= num_available_buffers) 
break;
> Maybe use range-based for loop? I find it more readable personally.
Done


http://gerrit.cloudera.org:8080/#/c/9943/7/be/src/runtime/sorter.cc@1706
PS7, Line 1706:
> Couldn't we compute this more accurately by doing a pass over sorted_runs_?
With dropping double-buffering this logic changes a bit. We always have 
fixed_len_pages so 1 buffer per run is needed for that, and in case that run 
has var_len_pages then it needs an additional buffer.


http://gerrit.cloudera.org:8080/#/c/9943/7/be/src/runtime/sorter.cc@1721
PS7, Line 1721:   }
> I think my above comment about accurately computing the requirement for the
I added a more sophisticated calculation for getting the max runs in the next 
merge. It feels for me that I over-engineered it a little bit, but now it takes 
into account whether each run has var len pages or not.


http://gerrit.cloudera.org:8080/#/c/9943/7/be/src/runtime/sorter.cc@1727
PS7, Line 1727:
> I had a bit of trouble reasoning through whether this calculation was corre
Nice catch! This in fact was incorrect.


http://gerrit.cloudera.org:8080/#/c/9943/7/be/src/runtime/sorter.cc@1730
PS7, Line 1730:   // its runs.
> If the goal is to minimize the number of "extra merges" per row, then it is
Interesting question. The current solution was meant to balance between 
optimising the distribution of runs between merges and not over-engineer the 
code.
Your solution also seems good, though. I'm wondering if finishing the 
intermediate merges quicker but then putting more stress on GetNext() where the 
final merge is done is actually better or not than the current solution of 
distributing the runs evenly. It's true that the amount of merges done on one 
particular row is lower with yours.
Let me give it a try, thanks for the suggestion.


http://gerrit.cloudera.org:8080/#/c/9943/7/testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test:

PS7:
> IMPALA-4835 probably caused a bunch of conflicts in the planner tests. Hope
I'll revert these changes as double-buffering is removed from this patch set.


http://gerrit.cloudera.org:8080/#/c/9943/7/tests/query_test/test_sort.py
File tests/query_test/test_sort.py:

http://gerrit.cloudera.org:8080/#/c/9943/7/tests/query_test/test_sort.py@45
PS7, Line 45:   def test_multiple_mem_limits(self, vector):
> Are these tests now tuned for a 3-node minicluster? Maybe we should skip va
Good point I haven't thought about that.

One note is that according to the comments, the purpose of this test is to 
trigger sorts with 0, 1 and 2 merges. I wonder, which test configuration this 
was ran on.

After removing double-buffering I have rewritten the mem_limit and what I 
observe is that non of these 3 sorts do a merge with a 3-node cluster.
With a 1-node cluster 150m and 110m mem_limits both make a 1 merge sort even 
though the second one should make an intermediate merge as well (according to 
the comment).

Do you think I should play around with the mem_limit to make a 2-merge sort on 
a 1-merge cluster?

Is there an easy way to detect if the test is run on a 1-node cluster or not?
I'd put the assert on the merge # behind a condition and execute only f the 
test is run on a 1-node cluster.
Anyway what cluster is this running on e.g. when ran in jenkins? Isn't it a 
3-node cluster?

Shouldn't we set num_nodes=1 to guarantee the number of merges are the same 
from run to run? Would we lose test coverage with that?


http://gerrit.cloudera.org:8080/#/c/9943/7/tests/query_test/test_sort.py@65
PS7, Line 65:         query, exec_option, table_format=table_format)
> Long line.
Done


http://gerrit.cloudera.org:8080/#/c/9943/7/tests/query_test/test_sort.py@121
PS7, Line 121:
> Could we get this to run quicker on a smaller table or selecting a smaller
Well, thanks for spotting this as the 28s is the runtime of the test I wrote 
initially but later on I enhanced it to reduce runtime but forgot to update the 
comment.
This actually runs for ~4.5s.


http://gerrit.cloudera.org:8080/#/c/9943/7/tests/query_test/test_sort.py@143
PS7, Line 143:     # when it finishes. As a result it has more memory for 
intermediate merges and
> nit: space after ,
Done



--
To view, visit http://gerrit.cloudera.org:8080/9943
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I74857c1694802e81f1cfc765d2b4e8bc644387f9
Gerrit-Change-Number: 9943
Gerrit-PatchSet: 8
Gerrit-Owner: Gabor Kaszab <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Gabor Kaszab <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Comment-Date: Tue, 15 May 2018 14:05:45 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5706: Spilling sort optimisations

Reply via email to