Todd Lipcon has posted comments on this change.

Change subject: KUDU-1767. c++ client: Prevent concurrent flushes by default
......................................................................


Patch Set 2: Code-Review-1

I don't think this is something we should "fix".

Making only a single flush outstanding at a time will regress performance 
substantially, and I don't think the improved semantics are worth it.

Although this avoids one particular case of the insertion order not matching 
the order rows are applied to the session, there are plenty of others:
- if I write to row A and then to row B, and they are in different tablets, B 
can be assigned a timestamp earlier than A.
- if I write row A and its flush times out, and I then write row B, 
row A's write may actually still succeed after row B's.
- the typical use of auto-flush mode is bulk ingest, e.g. something like an MR 
job or Impala INSERT query. In that case, you'll almost certainly have tens or 
hundreds of parallel inserters, so you have no guarantee that two 
"sequential" rows in the insert data are actually picked up by the same 
inserter thread. For example, consider an MR job where the two versions of the 
same row key fall just across an HDFS block boundary. You have no ordering 
guarantee between them.

I think it's much better to just say "there are no ordering guarantees provided 
by auto_flush_background, but it has the best performance". If someone wants 
ordering, they need to explicitly wait for a row to be flushed before sending 
the next dependent row, rather than making all use cases take a big perf hit.
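
As a rough illustration of that explicit-wait pattern, here is a minimal sketch. It does not use the Kudu client: ToyStore, FlushAsync and WriteInOrder are hypothetical names invented for this example, and the artificial delays only stand in for variable flush latency. With the real client, the analogous step would presumably be a blocking KuduSession::Flush() between the two dependent writes.

```cpp
#include <chrono>
#include <future>
#include <map>
#include <mutex>
#include <string>
#include <thread>

// Hypothetical stand-in for a tablet: the last write applied wins.
class ToyStore {
 public:
  void Write(const std::string& key, int value) {
    std::lock_guard<std::mutex> l(mu_);
    rows_[key] = value;
  }
  int Read(const std::string& key) {
    std::lock_guard<std::mutex> l(mu_);
    return rows_.at(key);
  }

 private:
  std::mutex mu_;
  std::map<std::string, int> rows_;
};

// Hypothetical stand-in for a background flush: the write is applied on
// another thread after an artificial delay.
std::future<void> FlushAsync(ToyStore* store, std::string key, int value,
                             std::chrono::milliseconds delay) {
  return std::async(std::launch::async, [store, key, value, delay] {
    std::this_thread::sleep_for(delay);
    store->Write(key, value);
  });
}

// Writes two versions of the same row. The caller blocks until the first
// flush has completed before sending the dependent second write, so the
// second version is always applied last even though its flush is faster.
int WriteInOrder(ToyStore* store) {
  FlushAsync(store, "A", 1, std::chrono::milliseconds(50)).get();
  FlushAsync(store, "A", 2, std::chrono::milliseconds(1)).get();
  return store->Read("A");
}
```

Only the writer that actually depends on ordering pays for the wait here; independent writes can still be flushed concurrently.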

-- 
To view, visit http://gerrit.cloudera.org:8080/5533
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3e48a054093163824573962963ddcba122a948b8
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Dan Burkert <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <[email protected]>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-HasComments: No
