[Impala-ASF-CR] IMPALA-4134: Use AUTO FLUSH BACKGROUND for Kudu sink

2016-10-14 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-4134: Use AUTO_FLUSH_BACKGROUND for Kudu sink
..


Patch Set 1:

Hold off on reviewing this. I'm going to make more changes.

-- 
To view, visit http://gerrit.cloudera.org:8080/4728
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5542b9a061b01c543a139e8722560b1365f06595
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4134: Use AUTO FLUSH BACKGROUND for Kudu sink

2016-10-14 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-4134: Use AUTO_FLUSH_BACKGROUND for Kudu sink
..


Patch Set 3:

(3 comments)

Thanks. I made a bunch of changes and ended up making it into a new commit. 
I'll abandon this one and post a new one with the new changes in a moment.

http://gerrit.cloudera.org:8080/#/c/4670/3/be/src/exec/kudu-table-sink.cc
File be/src/exec/kudu-table-sink.cc:

Line 156: DCHECK(sink_action_ == TSinkAction::DELETE) << "Sink type not 
supported: "
> spurious change here
I actually meant to do this


Line 163:   SCOPED_TIMER(profile()->total_time_counter());
> I wonder if we should still keep some kind of timer which is specifically r
Done


Line 301:   
(*state->per_partition_status())[ROOT_PARTITION_KEY].num_appended_rows =
> if we don't update this to the end, does the query summary page on the web 
>From what I can tell, this isn't something that is observable  until the 
>fragment sends its final update to the coordinator. This is a per-partition 
>(TInsertStats) thing that gets collected at the end of the query.


-- 
To view, visit http://gerrit.cloudera.org:8080/4670
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I665ce82d1caf64cf5bdf4bc0e15edbf9f1dc5ba0
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4134: Use AUTO FLUSH BACKGROUND for Kudu sink

2016-10-13 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: IMPALA-4134: Use AUTO_FLUSH_BACKGROUND for Kudu sink
..


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/4670/3/be/src/exec/kudu-table-sink.cc
File be/src/exec/kudu-table-sink.cc:

Line 156: DCHECK(sink_action_ == TSinkAction::DELETE) << "Sink type not 
supported: "
spurious change here


Line 163:   SCOPED_TIMER(profile()->total_time_counter());
I wonder if we should still keep some kind of timer which is specifically 
referring to the time spent in Kudu. We can't specifically measure the 
flushing, but we could do something like have a 
vector which we fill up by looping over the row 
batch, and then in a SCOPED_TIMER(write_time_counter) apply them all? That way 
if we are writing faster than Kudu can absorb the writes, we'll see the 
blocking on the Apply() call register as significant time, and can still 
distinguish this time from time spent in output expr computation.


Line 301:   
(*state->per_partition_status())[ROOT_PARTITION_KEY].num_appended_rows =
if we don't update this to the end, does the query summary page on the web UI 
still show the number of rows written by the sink? (I don't know if that 
counter was the output side or the input side of the sink, actually)


-- 
To view, visit http://gerrit.cloudera.org:8080/4670
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I665ce82d1caf64cf5bdf4bc0e15edbf9f1dc5ba0
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4134: Use AUTO FLUSH BACKGROUND for Kudu sink

2016-10-13 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has uploaded a new patch set (#3).

Change subject: IMPALA-4134: Use AUTO_FLUSH_BACKGROUND for Kudu sink
..

IMPALA-4134: Use AUTO_FLUSH_BACKGROUND for Kudu sink

Improves performance of writes to Kudu.

Testing: Manual cluster testing (which is where the default
mutation buffer value of 100mb was determined).

Change-Id: I665ce82d1caf64cf5bdf4bc0e15edbf9f1dc5ba0
---
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-table-sink.h
2 files changed, 50 insertions(+), 39 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/4670/3
-- 
To view, visit http://gerrit.cloudera.org:8080/4670
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I665ce82d1caf64cf5bdf4bc0e15edbf9f1dc5ba0
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Todd Lipcon