Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13412 )

Change subject: KUDU-2826: Add tail for mutation list to accelerate update in 
memrowset
......................................................................


Patch Set 3:

(3 comments)

> I run TestMemRowSet.TestMemRowSetInsertCountAndScan with --roundtrip_num_rows 
> = 1000000.

Based on the timing results; you might want to rerun with 10 million rows 
(instead of 1 million) to see clearer results.

> Time spent Scanning rows where none are committed: real 0.015s user 0.020s 
> sys 0.000s

This one caught my eye: how can the user CPU seconds exceed the wallclock time? 
It's possible if multiple cores participated in the workload, but the scan path 
is processed by a single thread, so not sure how that's possible.

> [Before change]:
> Time spent Inserting rows: real 0.001s user 0.000s sys 0.000s
> Time spent Updating rows: real 192.291s  user 191.650s sys 0.230s
>
> [After change]:
> Time spent Inserting rows: real 0.001s user 0.000s sys 0.000s
> Time spent Updating rows: real 3.130s  user 3.080s sys 0.040s
>
> The performance improvement by this change is adorable.

Very nice!

> By the way, the result of a Zipfian update YCSB workload will be done in few 
> days.

Good, looking forward to hearing that. Could you update your commit description 
summarizing your performance runs?

http://gerrit.cloudera.org:8080/#/c/13412/3/src/kudu/tablet/memrowset.h
File src/kudu/tablet/memrowset.h:

http://gerrit.cloudera.org:8080/#/c/13412/3/src/kudu/tablet/memrowset.h@175
PS3, Line 175:       // Together they comprise a doubly-linked list of all of 
the row's mutations,
Hmm, I misled you a bit: this isn't actually a doubly-linked list, because it's 
only possible to navigate from head to tail, not from tail to head.


http://gerrit.cloudera.org:8080/#/c/13412/3/src/kudu/tablet/mutation.h
File src/kudu/tablet/mutation.h:

http://gerrit.cloudera.org:8080/#/c/13412/3/src/kudu/tablet/mutation.h@86
PS3, Line 86:   void AppendToListAtomic(Mutation **redo_head, Mutation 
**redo_tail);
Nit: while you're here, could you change this "Mutation**" instead of "Mutation 
**"? That's more code style conformant.


http://gerrit.cloudera.org:8080/#/c/13412/3/src/kudu/tablet/mutation.cc
File src/kudu/tablet/mutation.cc:

http://gerrit.cloudera.org:8080/#/c/13412/3/src/kudu/tablet/mutation.cc@54
PS3, Line 54: void Mutation::AppendToListAtomic(Mutation **redo_head, Mutation 
**redo_tail) {
This is no longer "atomic" in that redo_head and redo_tail may disagree (if the 
mutation list was empty), or the second to last element's 'next' and redo_tail 
may disagree (if not empty). I don't think this is a problem, but I'm not sure.

As best I can tell, redo_tail is used by writers only and redo_head by readers 
only. Each row is locked when MRS changes are performed, so I wouldn't expect 
to see concurrent writers; only a single writer and concurrent readers for a 
given row. The changes here will be performed by that single writer, and since 
the readers never look at redo_tail, the "disagreements" outlined above 
shouldn't have any bearing on them.

What do you think?



--
To view, visit http://gerrit.cloudera.org:8080/13412
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I945f332f8241ecb3cc6ab66e67ee3ee2b4e49be8
Gerrit-Change-Number: 13412
Gerrit-PatchSet: 3
Gerrit-Owner: ZhangYao <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Grant Henke <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Yao Xu <[email protected]>
Gerrit-Reviewer: ZhangYao <[email protected]>
Gerrit-Comment-Date: Tue, 28 May 2019 04:50:26 +0000
Gerrit-HasComments: Yes

Reply via email to