[kudu-CR] KUDU-1235. Add Get API

Binglin Chang (Code Review) Fri, 11 Mar 2016 00:39:42 -0800

Binglin Chang has posted comments on this change.

Change subject: KUDU-1235. Add Get API
......................................................................



Patch Set 1:

(8 comments)

> Do you have a comparison flame-graph of the tserver CPU usage handling a YCSB 
> workload of get vs scan

Will post them on jira

http://gerrit.cloudera.org:8080/#/c/2519/1/src/kudu/client/client-test.cc
File src/kudu/client/client-test.cc:

Line 888:   ASSERT_NO_FATAL_FAILURE(InsertTestRows(
> can you add a couple more tests here? I think there might be bugs with the 
ok


http://gerrit.cloudera.org:8080/#/c/2519/1/src/kudu/client/client.h
File src/kudu/client/client.h:

Line 1073:   Status SetProjectedColumns(const std::vector<std::string>& 
col_names) WARN_UNUSED_RESULT;
> rename to SetProjectedColumnNames (to match the non-deprecated API in KuduS
ok


http://gerrit.cloudera.org:8080/#/c/2519/1/src/kudu/client/getter-internal.cc
File src/kudu/client/getter-internal.cc:

Line 93:   RETURN_NOT_OK(SchemaToColumnPBs(GetCurrentProjection(), 
request_.mutable_projected_columns()));
> performance-wise, you could do this lazily only when the projection changes
OK


Line 111:         KuduClient::LEADER_ONLY,
> why restrict to leader only? I think this API should mirror the Scanner API
Sure, will add later.


http://gerrit.cloudera.org:8080/#/c/2519/1/src/kudu/tablet/tablet.cc
File src/kudu/tablet/tablet.cc:

Line 304:   if (row_key_util::IncrementKey(&new_row, &arena)) {
> this is likely to conflict a lot with the stuff Dan has been working on. Da
Another way is to stop use scanspec, and implement Get in every rowset class, 
which is a major effort, but should have much better performance I think.


Line 315:       return iter->NextBlock(dst);
> I'm not sure this is correct -- it's allowed for an iterator to return HasN
OK, will check and add tests.


Line 1551:     vector<shared_ptr<RowwiseIterator> > *iters) const {
> I don't think we need shared_ptr here -- this could be a vector<unique_ptr<
yes


Line 1567:     RETURN_NOT_OK_PREPEND(rs->NewRowIterator(projection, snap, 
&row_it),
> seems like we are leaving a lot of potential optimization on the table by n
After compaction, most request only ends up in very few RS (mostly 1), so I 
skeptical here.
One optimization I am think of is to lookup  diskrowset first, then memory 
rowset, cause it's most likely then the record is in diskrowset. 
Also yes construct PB is time consuming too.


-- 
To view, visit http://gerrit.cloudera.org:8080/2519
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id054a198373da04a3eca5b244516acc378e1e37c
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Binglin Chang <[email protected]>
Gerrit-Reviewer: Binglin Chang <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1235. Add Get API

Reply via email to