David Ribeiro Alves has posted comments on this change. Change subject: tablet: check for row presence in rowset-wise order ......................................................................
Patch Set 8: (14 comments) http://gerrit.cloudera.org:8080/#/c/6483/8/src/kudu/tablet/row_op.h File src/kudu/tablet/row_op.h: PS8, Line 85: Flag whether this op has already had 'present_in_rowset' filled in. rephrase here and elsewhere to "present and alive" http://gerrit.cloudera.org:8080/#/c/6483/8/src/kudu/tablet/tablet.cc File src/kudu/tablet/tablet.cc: PS8, Line 650: mutate->present_in_rowset->MutateRow( : ts, : *mutate->key_probe, : mutate->decoded_op.changelist, : tx_state->op_id(), : stats, : result.get()); inconsistent wrapping with MutateRow below. PS8, Line 658: rowset should refer here and elsewhere to "DiskRowSet" since the MRS is also a row set. Line 667: if (s.ok()) { PREDICT_TRUE PS8, Line 688: // TODO(todd) determine why we sometimes get empty writes! is this new or did it happen before this patch series? PS8, Line 696: pair<Slice, int> aren't you using a struct for this somewhere else? can you reuse the same struct or at least typedef it? PS8, Line 703: keys_and_indexes.emplace_back(op->key_probe->encoded_key_slice(), i); could you determine in this loop whether the keys were already sorted and remove the duplicates that you find along the way? this would allow to skip the sort and std::unique below if the client already sorted the keys and maybe also allow to build key vector in the same loop. if you feel this is early optimization maybe just leave a TODO PS8, Line 717: & nit: pair<Slice, int>& same below PS8, Line 740: const auto& ClearPending = [&]() { : next_key.clear(); : if (pending.empty()) return; : DCHECK(std::is_sorted(pending.begin(), pending.end(), : [&](const pair<RowSet*, int>& a, : const pair<RowSet*, int>& b) { : auto s_a = keys[a.second]; : auto s_b = keys[b.second]; : return s_a.compare(s_b) < 0; : })); : RowSet* rs = pending[0].first; : for (auto it = pending.begin(); : it != pending.end(); : ++it) { : DCHECK_EQ(it->first, rs); : int op_idx = keys_and_indexes[it->second].second; : RowOp* op = row_ops_base[op_idx]; : if (op->present_in_rowset) { : // Already found this op present somewhere. : continue; : } : : // TODO(todd) is CHECK_OK correct? it used to be that errors here : // would just be silently ignored, so this seems at least an improvement. : bool present = false; : CHECK_OK(rs->CheckRowPresent(*op->key_probe, &present, tx_state->mutable_op_stats(op_idx))); : if (present) { : op->present_in_rowset = rs; : } : } : pending.clear(); : }; this is quite the hefty anonymous function. move it somwhere else? PS8, Line 773: const TabletComponents* comps = DCHECK_NOTNULL(tx_state->tablet_components()); : comps->rowsets->ForEachRowSetContainingKeys( : keys, : [&](RowSet* rs, int i) { : if (!pending.empty() && rs != pending.back().first) { : ClearPending(); : } : pending.emplace_back(rs, i); : }); : ClearPending(); doc what this is doing PS8, Line 784: for (auto& p : keys_and_indexes) { : row_ops_base[p.second]->checked_present = true; : } is checked_present used before this? could we just set it in one of the earlier loops? http://gerrit.cloudera.org:8080/#/c/6483/8/src/kudu/tablet/tablet.h File src/kudu/tablet/tablet.h: PS8, Line 403: // Validate the contents of 'op' and return a bad Status if it is : // invalid. : Status ValidateOp(const RowOp& op) const; : : // Validate 'op' as in 'ValidateOp()' above. If it is invalid, marks : // the op as failed and returns false. If valid, marks the op as validated : // and returns true. : bool ValidateOpOrMarkFailed(RowOp* op) const; general wondering since this is not dependent on tablet internal state could we just do this before the op is submitted, outside of tablet.cc (Maybe some static method in RowOp or in a new tablet_util.cc file). tablet.cc is already pretty big and it's beginning to become untenable to navigate in it. This would avoid expanding it further and maybe even trim it a bit. It would also help slightly if we ever decide that we want to do this non-state changing work while replication is ongoing. http://gerrit.cloudera.org:8080/#/c/6483/8/src/kudu/tablet/transactions/write_transaction.cc File src/kudu/tablet/transactions/write_transaction.cc: PS8, Line 273: nit no need to wrap http://gerrit.cloudera.org:8080/#/c/6483/8/src/kudu/tablet/transactions/write_transaction.h File src/kudu/tablet/transactions/write_transaction.h: PS8, Line 163: // 'i'. nit no need to wrap -- To view, visit http://gerrit.cloudera.org:8080/6483 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9a678bb0da130ce94aae3634c82ace2d6898ffa2 Gerrit-PatchSet: 8 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd Lipcon <[email protected]> Gerrit-Reviewer: David Ribeiro Alves <[email protected]> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-HasComments: Yes
