Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/13412 )
Change subject: KUDU-2826: Add tail for mutation list to accelerate update in memrowset ...................................................................... Patch Set 3: (3 comments) > I run TestMemRowSet.TestMemRowSetInsertCountAndScan with --roundtrip_num_rows > = 1000000. Based on the timing results; you might want to rerun with 10 million rows (instead of 1 million) to see clearer results. > Time spent Scanning rows where none are committed: real 0.015s user 0.020s > sys 0.000s This one caught my eye: how can the user CPU seconds exceed the wallclock time? It's possible if multiple cores participated in the workload, but the scan path is processed by a single thread, so not sure how that's possible. > [Before change]: > Time spent Inserting rows: real 0.001s user 0.000s sys 0.000s > Time spent Updating rows: real 192.291s user 191.650s sys 0.230s > > [After change]: > Time spent Inserting rows: real 0.001s user 0.000s sys 0.000s > Time spent Updating rows: real 3.130s user 3.080s sys 0.040s > > The performance improvement by this change is adorable. Very nice! > By the way, the result of a Zipfian update YCSB workload will be done in few > days. Good, looking forward to hearing that. Could you update your commit description summarizing your performance runs? http://gerrit.cloudera.org:8080/#/c/13412/3/src/kudu/tablet/memrowset.h File src/kudu/tablet/memrowset.h: http://gerrit.cloudera.org:8080/#/c/13412/3/src/kudu/tablet/memrowset.h@175 PS3, Line 175: // Together they comprise a doubly-linked list of all of the row's mutations, Hmm, I misled you a bit: this isn't actually a doubly-linked list, because it's only possible to navigate from head to tail, not from tail to head. http://gerrit.cloudera.org:8080/#/c/13412/3/src/kudu/tablet/mutation.h File src/kudu/tablet/mutation.h: http://gerrit.cloudera.org:8080/#/c/13412/3/src/kudu/tablet/mutation.h@86 PS3, Line 86: void AppendToListAtomic(Mutation **redo_head, Mutation **redo_tail); Nit: while you're here, could you change this "Mutation**" instead of "Mutation **"? That's more code style conformant. http://gerrit.cloudera.org:8080/#/c/13412/3/src/kudu/tablet/mutation.cc File src/kudu/tablet/mutation.cc: http://gerrit.cloudera.org:8080/#/c/13412/3/src/kudu/tablet/mutation.cc@54 PS3, Line 54: void Mutation::AppendToListAtomic(Mutation **redo_head, Mutation **redo_tail) { This is no longer "atomic" in that redo_head and redo_tail may disagree (if the mutation list was empty), or the second to last element's 'next' and redo_tail may disagree (if not empty). I don't think this is a problem, but I'm not sure. As best I can tell, redo_tail is used by writers only and redo_head by readers only. Each row is locked when MRS changes are performed, so I wouldn't expect to see concurrent writers; only a single writer and concurrent readers for a given row. The changes here will be performed by that single writer, and since the readers never look at redo_tail, the "disagreements" outlined above shouldn't have any bearing on them. What do you think? -- To view, visit http://gerrit.cloudera.org:8080/13412 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I945f332f8241ecb3cc6ab66e67ee3ee2b4e49be8 Gerrit-Change-Number: 13412 Gerrit-PatchSet: 3 Gerrit-Owner: ZhangYao <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Andrew Wong <[email protected]> Gerrit-Reviewer: Grant Henke <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-Reviewer: Yao Xu <[email protected]> Gerrit-Reviewer: ZhangYao <[email protected]> Gerrit-Comment-Date: Tue, 28 May 2019 04:50:26 +0000 Gerrit-HasComments: Yes
