[Impala-ASF-CR] IMPALA-4252: Min-max runtime filters for Kudu

2017-10-26 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7793 )

Change subject: IMPALA-4252: Min-max runtime filters for Kudu
..


Patch Set 7:

I know that He Lifu at NetEase also prototyped some support for pushing blooms 
down all the way into Kudu (not just the Impala scanner). On his TPCH 
benchmarks he got a ~50% speedup on Q17 and Q18, Q9, Q8, which I don't see in 
the results here.

I wonder if this is due to a different partitioning scheme or whether the big 
gain actually comes from pushing all the way down into the Kudu scanner. Any 
idea, given the plans for those queries?


--
To view, visit http://gerrit.cloudera.org:8080/7793
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02bad890f5b5f78388a3041bf38f89369b5e2f1c
Gerrit-Change-Number: 7793
Gerrit-PatchSet: 7
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Anonymous Coward #345
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 26 Oct 2017 20:35:31 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4252: Min-max runtime filters for Kudu

2017-10-24 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7793 )

Change subject: IMPALA-4252: Min-max runtime filters for Kudu
..


Patch Set 7:

> Patch Set 7:
>
> > > Patch Set 7:
>  > >
>  > > Perf results:
>  > > ...
>  >
>  > I'm surprised that only a few queries saw significant speedups. Is
>  > this in line with what you saw with Parquet runtime filters on
>  > TPC-H? Or are we losing a lot by using min/max instead of bloom or
>  > in-list style filters?
>
> Not sure about bloom filters perf, though I can run those numbers for 
> comparison.

I haven't looked at this patch, but had a question about the design:

Are we still pushing blooms across a join to prevent shuffling of data? Or are 
we now pushing _only_ min/max?

It seems there is value in pushing both: the bloom for evaluation on the other 
side of the join to prevent shuffling, and the min/max to push all the way to 
the scanner to reduce I/O.

Not sure if the patch is already doing this.


--
To view, visit http://gerrit.cloudera.org:8080/7793
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02bad890f5b5f78388a3041bf38f89369b5e2f1c
Gerrit-Change-Number: 7793
Gerrit-PatchSet: 7
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Anonymous Coward #345
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Tue, 24 Oct 2017 18:32:26 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4622: [DOCS] New Kudu ALTER TABLE syntax

2017-10-05 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8191 )

Change subject: IMPALA-4622: [DOCS] New Kudu ALTER TABLE syntax
..


Patch Set 4: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/8191
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7647dfe8acfdb598caf561d41d74e38a8b66ce9d
Gerrit-Change-Number: 8191
Gerrit-PatchSet: 4
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Ambreen Kazi 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Fri, 06 Oct 2017 01:31:01 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3267 [DOCS] Docs about nan/inf are incorrect

2017-06-05 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: IMPALA-3267 [DOCS] Docs about nan/inf are incorrect
..


Patch Set 1: Code-Review+1

lgtm, but I'm not a committer, so you'll need a +2 from someone else

-- 
To view, visit http://gerrit.cloudera.org:8080/7083
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0e5d950b250c2e4425bde7d9e0bccbb068a73e12
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Laurel Hale 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4616: [DOCS] Doc new ALTER TABLE options for Kudu

2017-06-01 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: IMPALA-4616: [DOCS] Doc new ALTER TABLE options for Kudu
..


Patch Set 1:

did we also add ALTER TABLE ALTER COLUMN SET DEFAULT and such?

-- 
To view, visit http://gerrit.cloudera.org:8080/7041
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5f87155dcba860f6fd9259387fea52bbe439e028
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5301: Set Kudu minicluster memory limit

2017-05-17 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: IMPALA-5301: Set Kudu minicluster memory limit
..


Patch Set 5: Code-Review+1

Seems fine from the Kudu side. Not quite familiar enough with the test_kudu.py 
part to give an authoritative +2 (and I guess I don't have the rights to do so 
anyway as a non-committer)

-- 
To view, visit http://gerrit.cloudera.org:8080/6844
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7fd7e1cd9dc781aaa672a2c68c845cb57ec885d5
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4758: (2/2) Impala-side changes to build with latest gutil

2017-03-28 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: IMPALA-4758: (2/2) Impala-side changes to build with latest 
gutil
..


Patch Set 9:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5688/9//COMMIT_MSG
Commit Message:

Line 32: a7c3f30 - Remove AMD Opteron Rev E workaround from atomicops
probably worth our taking this patch into kudu gutil as well


-- 
To view, visit http://gerrit.cloudera.org:8080/5688
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4ac21d7d6401f21fcdfdd1132b8f322bfba4bb80
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[Impala-ASF-CR] kudu: fix uninitialized variable usage warning

2017-02-21 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: kudu: fix uninitialized variable usage warning
..


Patch Set 1:

I doubt we'd see this in real life, since as far as I know glog always sets the 
severity to one of the ones covered in the case statement. Otherwise, DEBUG 
builds would be failing with the DCHECK here. But, the warning being emitted 
during compilation was bugging me.

-- 
To view, visit http://gerrit.cloudera.org:8080/6098
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4f87c39cf9259bec8ba9a84a2543fb1f2e4a3422
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[Impala-ASF-CR] build: don't look in system paths for kudu client

2017-02-21 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: build: don't look in system paths for kudu client
..


Patch Set 1:

I'm not sure I can trigger a GVM. Does your +2 automatically trigger it? LMK if 
there's anything I need to do to push this forward, thanks.

-- 
To view, visit http://gerrit.cloudera.org:8080/6097
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8dad3d92e0bd07c27c3a58f6964d4da88218049c
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[Impala-ASF-CR] Avoid including Boost cmake support from system path

2017-02-21 Thread Todd Lipcon (Code Review)
Todd Lipcon has abandoned this change.

Change subject: Avoid including Boost cmake support from system path
..


Abandoned

-- 
To view, visit http://gerrit.cloudera.org:8080/6096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: abandon
Gerrit-Change-Id: I9767952aada11d4d44da99bc0011d6d0829fad4c
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] Avoid including Boost cmake support from system path

2017-02-21 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: Avoid including Boost cmake support from system path
..


Patch Set 1:

Oh, interesting. He and I didn't coordinate, but I'm guessing we both tried 
building on the same dev box (the Kudu team shares a beefy box we often use for 
builds). His has a better commit message than mine, so I'll abandon this one.

-- 
To view, visit http://gerrit.cloudera.org:8080/6096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9767952aada11d4d44da99bc0011d6d0829fad4c
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[Impala-ASF-CR] build: don't look in system paths for kudu client

2017-02-21 Thread Todd Lipcon (Code Review)
Hello Matthew Jacobs,

I'd like you to do a code review.  Please visit

http://gerrit.cloudera.org:8080/6097

to review the following change.

Change subject: build: don't look in system paths for kudu client
..

build: don't look in system paths for kudu client

On my system which had the Kudu client library installed in /usr,
Impala was picking up the wrong version of the client. This instructs
CMake to only look in the specified toolchain directory to find Kudu.

Change-Id: I8dad3d92e0bd07c27c3a58f6964d4da88218049c
---
M CMakeLists.txt
1 file changed, 1 insertion(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/97/6097/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6097
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I8dad3d92e0bd07c27c3a58f6964d4da88218049c
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Matthew Jacobs 


[Impala-ASF-CR] kudu: fix uninitialized variable usage warning

2017-02-21 Thread Todd Lipcon (Code Review)
Hello Matthew Jacobs,

I'd like you to do a code review.  Please visit

http://gerrit.cloudera.org:8080/6098

to review the following change.

Change subject: kudu: fix uninitialized variable usage warning
..

kudu: fix uninitialized variable usage warning

This fixes a trivial warning of a potential usage of an uninitialized variable
in release builds. If for some reason Kudu sends us an invalid log severity
level, we'll now fall back to INFO instead of reading an uninitialized
severity value.

Change-Id: I4f87c39cf9259bec8ba9a84a2543fb1f2e4a3422
---
M be/src/exec/kudu-util.cc
1 file changed, 1 insertion(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/6098/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6098
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I4f87c39cf9259bec8ba9a84a2543fb1f2e4a3422
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Matthew Jacobs 


[Impala-ASF-CR] Updates to DML statements for Impala + Kudu

2017-01-18 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: Updates to DML statements for Impala + Kudu
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5646/1/docs/topics/impala_update.xml
File docs/topics/impala_update.xml:

Line 50:   [ WHERE where_conditions ]
> Example of the syntax? I get errors trying to put the join clause in differ
looks like the syntax is:

  UPDATE [target_alias|target_table_ref]
  SET COL = VAL [, COL = VAL]
  [FROM [table_ref, ...]]
  [WHERE ...]

eg:

UPDATE t1 SET c1='val' FROM t1 JOIN t2 ON t1.foo = t2.foo WHERE ...


-- 
To view, visit http://gerrit.cloudera.org:8080/5646
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I60512b7957fb53d86d3123a4f1d46fbb355f4665
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Ambreen Kazi 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Jean-Daniel Cryans 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Major update to Impala + Kudu page

2017-01-10 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: Major update to Impala + Kudu page
..


Patch Set 4:

(31 comments)

http://gerrit.cloudera.org:8080/#/c/5649/4/docs/topics/impala_alter_table.xml
File docs/topics/impala_alter_table.xml:

PS4, Line 887: Any dropped range must not contain
 :   any data values in that range.
I don't believe this is true -- dropping a range is an efficient way to 
bulk-delete a bunch of data for workloads like rolling-window time series


http://gerrit.cloudera.org:8080/#/c/5649/4/docs/topics/impala_compute_stats.xml
File docs/topics/impala_compute_stats.xml:

Line 55: 
nope, because afaik we don't have partition-level stats for kudu (they aren't 
partitions as far as HMS is concerned)


http://gerrit.cloudera.org:8080/#/c/5649/4/docs/topics/impala_create_table.xml
File docs/topics/impala_create_table.xml:

PS4, Line 107: expression
'expression' here makes it sound like we support computed defaults. Perhaps 
it's better to say 'constant' or 'value'?


Line 122:   RANGE
this should be RANGE (pk_col [, ...]) right?


PS4, Line 125: constant
perhaps say 'tuple' or something instead? for composite PKs you need something 
like: PARTITION ('foo', 1) < VALUES


http://gerrit.cloudera.org:8080/#/c/5649/4/docs/topics/impala_grant.xml
File docs/topics/impala_grant.xml:

Line 134:   Access to Kudu tables must be granted to roles as usual.
is it worth noting here that grant/revoke from Impala has no bearing on access 
via the direct Kudu APIs?


http://gerrit.cloudera.org:8080/#/c/5649/4/docs/topics/impala_kudu.xml
File docs/topics/impala_kudu.xml:

PS4, Line 39: use the Apache Kudu component
stored by Apache Kudu


PS4, Line 98: ultiple Kudu hosts.
should say "separated by commas"


PS4, Line 105: TBLPROPERTIES
should say which table property to set


PS4, Line 147: logically
I'd say these are physical


PS4, Line 155: Kudu tablet servers
no, just the number of replicas of a given table must be odd. You could have 4 
tablet servers, but replication count 3


PS4, Line 156:  When you set up a new cluster, the number of tablet servers 
might not
 :   exactly match the number of DataNodes. If you add or 
remove one host from an existing
 :   cluster, be careful not to change the number of tablet 
servers from 3 to 2, or from 99
 :   to 100, and so on.
I'd scrap this whole 


PS4, Line 200: cannot contain any duplicate values
worth noting that it's the _tuple_ that has to be unique. EG you can have 
PRIMARY KEY (a, b) and have values (1,1), (1,2), (2, 1), (2,2). I don't think 
that's clear here.


PS4, Line 460:   
 : Because primary key columns cannot contain any 
NULL values, you can
 : omit the NOT NULL clause for the 
primary key columns.
that's true, but I think it's good practice to specify NOT NULL regardless, in 
case we ever add the ability for a nullable PK.


Line 553: them with the default PLAIN encoding.
I'd disagree with this -- in many cases encodings have a net speedup - there's 
some small CPU cost but the compression ratio can be very good. Especially if 
you have something like a timestamp value or transaction ID in your PK, 
bitshuffle will compress it quite well and have a negligible effect on 
performance of a key lookup, since the cost of decoding one page is nothing 
compared to the cost of the random disk IO, etc.


PS4, Line 592: user_id values come from a specific set of strings
that's slightly unexpected. I think a better example for dictionary would be 
something like 'ship_status STRING" from tpch, or 'state' or 'country'


PS4, Line 597: subjects might start with the same first words
seems somewhat contrived. Prefix encoding isn't super useful for most cases so 
maybe not worth documenting. (we use it internally for storing the reified 
composite key index where it is quite effective)


PS4, Line 601: The original text
 : from the body column is the most 
frequently accessed, therefore it
 : uses the compression with the least CPU overhead to 
expand. The
 : spanish_translation is accessed 
less frequently, therefore it uses
 : a compression format that produces a smaller result 
but takes more CPU cycles to
hrm, I'm not sure I'd draw such a distinction between LZ4 and SNAPPY. In fact 
LZ4 has some extensions which we aren't yet using (but should be at some point) 
which will probably be able to get it to be more dense than snappy as well as 
being faster, at which point we'd stop encouraging snappy


PS4, Line 636: into units known as
 : cfiles. The cfiles for each column can 
have a specified block size
I think I'd not mention cfiles here, but say that there is an underlying unit 
of IO which is the "block size".

[Impala-ASF-CR] Updates to DML statements for Impala + Kudu

2017-01-10 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: Updates to DML statements for Impala + Kudu
..


Patch Set 1:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/5646/1/docs/topics/impala_delete.xml
File docs/topics/impala_delete.xml:

Line 49: DELETE [FROM] 
[database_name].table_name [ WHERE 
where_conditions ]
I think there is another form to describe:

DELETE  FROM  WHERE ..

eg:
DELETE t1 from todd_test t1 JOIN todd_test2 t2 ON t1.o_orderkey = t2.o_orderkey;


PS1, Line 65:  
:   Normally, a DELETE operation for a Kudu 
table fails if
:   some partition key columns are not found, due to their 
being deleted or changed
:   by a concurrent UPDATE or 
DELETE operation.
:   Specify DELETE IGNORE 
rest_of_statement to
:   make the DELETE continue in this case. The 
rows with the nonexistent
:   duplicate partition key column values are not removed.
: 
I don't think this is relevant anymore (the IGNORE keyword is gone)


Line 101: 
Yes it can, using syntax like:

DELETE c FROM my_second_table c, stock_symbols s WHERE c.name = s.symbol;

(but it only deletes from one of the tables)

Can also use JOIN syntax as shown above


http://gerrit.cloudera.org:8080/#/c/5646/1/docs/topics/impala_update.xml
File docs/topics/impala_update.xml:

Line 50:   [ WHERE where_conditions ]
similar to my comment on the DELETE, this can support UPDATE with JOIN


-- 
To view, visit http://gerrit.cloudera.org:8080/5646
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I60512b7957fb53d86d3123a4f1d46fbb355f4665
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Ambreen Kazi 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Jean-Daniel Cryans 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4134,IMPALA-3704: Kudu INSERT improvements

2016-10-19 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: IMPALA-4134,IMPALA-3704: Kudu INSERT improvements
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4728/4/be/src/exec/kudu-table-sink.cc
File be/src/exec/kudu-table-sink.cc:

Line 132:   KUDU_RETURN_IF_ERROR(s, "Unable to set flush mode"); // Return 
error status for RELEASE
> We have the same error weirdness for the other session_->Set*() calls. I'm 
fwiw, our reasoning for making even simple setter calls return Status is for 
API compatibility. Imagine that you want to write an app which runs against 
multiple versions of the Kudu client API. Old versions don't support this flush 
mode. Without a returned Status, we'd end up having to crash or something. Here 
we can return a reasonable status saying that the flush mode isn't supported.

For the case of Impala it's not as big a deal since you bundle a specific 
client version, but the Kudu API is publicly consumable and a lot of people 
would be linking against /usr/lib/libkudu_client.so rather than bundling.


-- 
To view, visit http://gerrit.cloudera.org:8080/4728
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5542b9a061b01c543a139e8722560b1365f06595
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4134: Use AUTO FLUSH BACKGROUND for Kudu sink

2016-10-13 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: IMPALA-4134: Use AUTO_FLUSH_BACKGROUND for Kudu sink
..


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/4670/3/be/src/exec/kudu-table-sink.cc
File be/src/exec/kudu-table-sink.cc:

Line 156: DCHECK(sink_action_ == TSinkAction::DELETE) << "Sink type not 
supported: "
spurious change here


Line 163:   SCOPED_TIMER(profile()->total_time_counter());
I wonder if we should still keep some kind of timer which is specifically 
referring to the time spent in Kudu. We can't specifically measure the 
flushing, but we could do something like have a 
vector which we fill up by looping over the row 
batch, and then in a SCOPED_TIMER(write_time_counter) apply them all? That way 
if we are writing faster than Kudu can absorb the writes, we'll see the 
blocking on the Apply() call register as significant time, and can still 
distinguish this time from time spent in output expr computation.


Line 301:   
(*state->per_partition_status())[ROOT_PARTITION_KEY].num_appended_rows =
if we don't update this to the end, does the query summary page on the web UI 
still show the number of rows written by the sink? (I don't know if that 
counter was the output side or the input side of the sink, actually)


-- 
To view, visit http://gerrit.cloudera.org:8080/4670
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I665ce82d1caf64cf5bdf4bc0e15edbf9f1dc5ba0
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[Impala-ASF-CR] WIP: Use AUTO FLUSH BACKGROUND for Kudu sink

2016-10-12 Thread Todd Lipcon (Code Review)
Todd Lipcon has uploaded a new patch set (#2).

Change subject: WIP: Use AUTO_FLUSH_BACKGROUND for Kudu sink
..

WIP: Use AUTO_FLUSH_BACKGROUND for Kudu sink

Change-Id: I665ce82d1caf64cf5bdf4bc0e15edbf9f1dc5ba0
---
M be/src/exec/kudu-table-sink-test.cc
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-table-sink.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
7 files changed, 73 insertions(+), 29 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/4670/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4670
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I665ce82d1caf64cf5bdf4bc0e15edbf9f1dc5ba0
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] WIP: Use AUTO FLUSH BACKGROUND for Kudu sink

2016-10-07 Thread Todd Lipcon (Code Review)
Hello Matthew Jacobs,

I'd like you to do a code review.  Please visit

http://gerrit.cloudera.org:8080/4670

to review the following change.

Change subject: WIP: Use AUTO_FLUSH_BACKGROUND for Kudu sink
..

WIP: Use AUTO_FLUSH_BACKGROUND for Kudu sink

Change-Id: I665ce82d1caf64cf5bdf4bc0e15edbf9f1dc5ba0
---
M be/src/exec/kudu-table-sink-test.cc
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-table-sink.h
3 files changed, 31 insertions(+), 27 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/4670/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4670
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I665ce82d1caf64cf5bdf4bc0e15edbf9f1dc5ba0
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Matthew Jacobs 


[Impala-ASF-CR] IMPALA-3918: Fix straggler Cloudera -> ASF license headers

2016-09-12 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: IMPALA-3918: Fix straggler Cloudera -> ASF license headers
..


Patch Set 1:

lgtm, though it might be a good idea to document the reasoning that the 
"relicense" is OK in the commit message so that if anyone is auditing the 
provenance of these files through git history later, you have some trail of why 
you decided to replace the license header.

-- 
To view, visit http://gerrit.cloudera.org:8080/4386
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ief3e09ac93137f870b7fb31a0bc851f24a17573b
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No