[Impala-ASF-CR] IMPALA-4134: Use AUTO FLUSH BACKGROUND for Kudu sink
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-4134: Use AUTO_FLUSH_BACKGROUND for Kudu sink .. Patch Set 1: Hold off on reviewing this. I'm going to make more changes. -- To view, visit http://gerrit.cloudera.org:8080/4728 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I5542b9a061b01c543a139e8722560b1365f06595 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4134: Use AUTO FLUSH BACKGROUND for Kudu sink
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-4134: Use AUTO_FLUSH_BACKGROUND for Kudu sink .. Patch Set 3: (3 comments) Thanks. I made a bunch of changes and ended up making it into a new commit. I'll abandon this one and post a new one with the new changes in a moment. http://gerrit.cloudera.org:8080/#/c/4670/3/be/src/exec/kudu-table-sink.cc File be/src/exec/kudu-table-sink.cc: Line 156: DCHECK(sink_action_ == TSinkAction::DELETE) << "Sink type not supported: " > spurious change here I actually meant to do this Line 163: SCOPED_TIMER(profile()->total_time_counter()); > I wonder if we should still keep some kind of timer which is specifically r Done Line 301: (*state->per_partition_status())[ROOT_PARTITION_KEY].num_appended_rows = > if we don't update this to the end, does the query summary page on the web >From what I can tell, this isn't something that is observable until the >fragment sends its final update to the coordinator. This is a per-partition >(TInsertStats) thing that gets collected at the end of the query. -- To view, visit http://gerrit.cloudera.org:8080/4670 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I665ce82d1caf64cf5bdf4bc0e15edbf9f1dc5ba0 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Todd LipconGerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4134: Use AUTO FLUSH BACKGROUND for Kudu sink
Todd Lipcon has posted comments on this change. Change subject: IMPALA-4134: Use AUTO_FLUSH_BACKGROUND for Kudu sink .. Patch Set 3: (3 comments) http://gerrit.cloudera.org:8080/#/c/4670/3/be/src/exec/kudu-table-sink.cc File be/src/exec/kudu-table-sink.cc: Line 156: DCHECK(sink_action_ == TSinkAction::DELETE) << "Sink type not supported: " spurious change here Line 163: SCOPED_TIMER(profile()->total_time_counter()); I wonder if we should still keep some kind of timer which is specifically referring to the time spent in Kudu. We can't specifically measure the flushing, but we could do something like have a vectorwhich we fill up by looping over the row batch, and then in a SCOPED_TIMER(write_time_counter) apply them all? That way if we are writing faster than Kudu can absorb the writes, we'll see the blocking on the Apply() call register as significant time, and can still distinguish this time from time spent in output expr computation. Line 301: (*state->per_partition_status())[ROOT_PARTITION_KEY].num_appended_rows = if we don't update this to the end, does the query summary page on the web UI still show the number of rows written by the sink? (I don't know if that counter was the output side or the input side of the sink, actually) -- To view, visit http://gerrit.cloudera.org:8080/4670 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I665ce82d1caf64cf5bdf4bc0e15edbf9f1dc5ba0 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4134: Use AUTO FLUSH BACKGROUND for Kudu sink
Matthew Jacobs has uploaded a new patch set (#3). Change subject: IMPALA-4134: Use AUTO_FLUSH_BACKGROUND for Kudu sink .. IMPALA-4134: Use AUTO_FLUSH_BACKGROUND for Kudu sink Improves performance of writes to Kudu. Testing: Manual cluster testing (which is where the default mutation buffer value of 100mb was determined). Change-Id: I665ce82d1caf64cf5bdf4bc0e15edbf9f1dc5ba0 --- M be/src/exec/kudu-table-sink.cc M be/src/exec/kudu-table-sink.h 2 files changed, 50 insertions(+), 39 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/4670/3 -- To view, visit http://gerrit.cloudera.org:8080/4670 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I665ce82d1caf64cf5bdf4bc0e15edbf9f1dc5ba0 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Todd LipconGerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Todd Lipcon