Todd Lipcon has posted comments on this change. Change subject: IMPALA-4134: Use AUTO_FLUSH_BACKGROUND for Kudu sink ......................................................................
Patch Set 3: (3 comments) http://gerrit.cloudera.org:8080/#/c/4670/3/be/src/exec/kudu-table-sink.cc File be/src/exec/kudu-table-sink.cc: Line 156: DCHECK(sink_action_ == TSinkAction::DELETE) << "Sink type not supported: " spurious change here Line 163: SCOPED_TIMER(profile()->total_time_counter()); I wonder if we should still keep some kind of timer which is specifically referring to the time spent in Kudu. We can't specifically measure the flushing, but we could do something like have a vector<unique_ptr<KuduWriteOperation>> which we fill up by looping over the row batch, and then in a SCOPED_TIMER(write_time_counter) apply them all? That way if we are writing faster than Kudu can absorb the writes, we'll see the blocking on the Apply() call register as significant time, and can still distinguish this time from time spent in output expr computation. Line 301: (*state->per_partition_status())[ROOT_PARTITION_KEY].num_appended_rows = if we don't update this to the end, does the query summary page on the web UI still show the number of rows written by the sink? (I don't know if that counter was the output side or the input side of the sink, actually) -- To view, visit http://gerrit.cloudera.org:8080/4670 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I665ce82d1caf64cf5bdf4bc0e15edbf9f1dc5ba0 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Todd Lipcon <[email protected]> Gerrit-Reviewer: Matthew Jacobs <[email protected]> Gerrit-Reviewer: Mostafa Mokhtar <[email protected]> Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-HasComments: Yes
