[Impala-ASF-CR] IMPALA-9778: Refactor HdfsPartition to be immutable

2020-06-05 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15985 )

Change subject: IMPALA-9778: Refactor HdfsPartition to be immutable
..


Patch Set 10:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15985/9/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:

http://gerrit.cloudera.org:8080/#/c/15985/9/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@86
PS9, Line 86: public class HdfsPartition implements FeFsPartition, 
PrunablePartition {
is it possible to annotate tihs as @Immutable? I think that'll trigger 
error-prone to check to make sure it truly is immutable



--
To view, visit http://gerrit.cloudera.org:8080/15985
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib52e5810d01d5e0c910daacb9c98977426d3914c
Gerrit-Change-Number: 15985
Gerrit-PatchSet: 10
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 05 Jun 2020 14:52:30 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3741 [part 2]: Push runtime bloom filter to Kudu

2020-04-09 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15683 )

Change subject: IMPALA-3741 [part 2]: Push runtime bloom filter to Kudu
..


Patch Set 3:

Sorry for repeated comments, but one more thought occurred me to me: the most 
effective case of min-max filter is when min == max, and thus the filter 
becomes an equality. An equality predicate can then serve to prune Kudu hash 
partitions, which isn't doable with bloom or range predicates.


--
To view, visit http://gerrit.cloudera.org:8080/15683
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9100076f68ea299ddb6ec8bc027cac7a47f5d754
Gerrit-Change-Number: 15683
Gerrit-PatchSet: 3
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Fri, 10 Apr 2020 04:12:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3741 [part 2]: Push runtime bloom filter to Kudu

2020-04-09 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15683 )

Change subject: IMPALA-3741 [part 2]: Push runtime bloom filter to Kudu
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15683/3/common/thrift/PlanNodes.thrift
File common/thrift/PlanNodes.thrift:

http://gerrit.cloudera.org:8080/#/c/15683/3/common/thrift/PlanNodes.thrift@124
PS3, Line 124:   BLOOM_MIN_MAX = 2
> Right, sortedness in kudu is based on the (compound) primary key. So, min-m
BTW forgot that I took some measurements of the ApplyPredicatePrimitive 
performance bfeore:

 int8   NOT NULL   (c >= 0 AND c < 2) 3152.2M evals/sec 0.88 cycles/eval
 int16  NOT NULL   (c >= 0 AND c < 2) 3309.6M evals/sec 0.85 cycles/eval
 int32  NOT NULL   (c >= 0 AND c < 2) 3384.0M evals/sec 0.85 cycles/eval
 int64  NOT NULL   (c >= 0 AND c < 2) 1847.6M evals/sec 1.57 cycles/eval
 float  NOT NULL   (c >= 0 AND c < 2) 3268.3M evals/sec 0.88 cycles/eval
 double NOT NULL   (c >= 0 AND c < 2) 2245.2M evals/sec 1.27 cycles/eval



--
To view, visit http://gerrit.cloudera.org:8080/15683
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9100076f68ea299ddb6ec8bc027cac7a47f5d754
Gerrit-Change-Number: 15683
Gerrit-PatchSet: 3
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Fri, 10 Apr 2020 03:56:13 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3741 [part 2]: Push runtime bloom filter to Kudu

2020-04-09 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15683 )

Change subject: IMPALA-3741 [part 2]: Push runtime bloom filter to Kudu
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15683/3/common/thrift/PlanNodes.thrift
File common/thrift/PlanNodes.thrift:

http://gerrit.cloudera.org:8080/#/c/15683/3/common/thrift/PlanNodes.thrift@124
PS3, Line 124:   BLOOM_MIN_MAX = 2
> This is my understanding, I'll check further within Kudu team.
Right, sortedness in kudu is based on the (compound) primary key. So, min-max 
predicates can help in the following ways:
(1) eliminate range partitions -- eg a min/max on a timestamp could would help 
a lot if the table is timestamp-partitioned by day, whereas a bloom filter on 
timestamp is probably not useful since cardinality is very high

(2) convert to primary key-based range scans -- if we have inequality/equality 
predicates on a prefix of the primary key, we can convert those predicates into 
a range scan on the table using b-tree style indexes. Again if you had a 
min-max on timestamp and timestamp were the leading portion of your key, it's a 
big win. Or, if you had a min-max on timestamp, a composite primary key of 
(entity_id, timestamp), and an equality predicate on entity_id, it would still 
do the PK range scan conversion for a big win.

(3) normal row-by-row evaluation - this is still useful since, particularly for 
primitive columns, it's very fast to evaluate (SIMD-accelerated in latest 
release, etc). It's likely much faster to evaluate than a bloom filter which 
will probably have some frequency of L1 cache misses if not L2/LLC, plus the 
hash calculation and mixing itself which likely takes a couple cycles. So, 
you're right that it may not eliminate much data in some queries, but the cost 
might be small enough that it's worth doing even if it eliminates 10-20%.

To validate this I loaded 800M rows into a table using 'kudu perf loadgen' and 
then compared scanning with

$ :./build/latest/bin/kudu perf table_scan localhost 
default.loadgen_auto_0b01a9eba6d54f2ba7fa5b59e3a597c5 --columns=int_val 
--num-threads=10 --predicates ["AND", [">=", "int_val", 0]]

vs the same without the predicate. In this case the predicate matches all rows, 
but there's no special short-circuit logic, so it actually evaluates it 
everywhere. Wall clock times:

with the predicate: 1.7023 +- 0.0266 seconds time elapsed  ( +-  
1.56% )
without the predicate: 1.6514 +- 0.0370 seconds time elapsed  ( +-  
2.24% )

Also running perf record -a, I can see that about 5% of the total CPU was used 
on the tserver in ApplyPredicatePrimitive.

Admittedly, this overhead will go up on a percentage basis after some other 
in-flight work we have is finished (eg more CPU-efficient integer encoding 
bankim's working on) but I think we also have more room to optimize 
ApplyPredicatePrimitive by using AVX2 instead of just SSE4 instructions when 
available.

As for the overhead of sending to the coordinator and broadcasting, isn't this 
a _tiny_ amount of data? ie it's only two values, which for primitives is going 
to be a max of 16 bytes each, and for varlen, you could feasibly truncate to a 
fixed size bound even in the case of a long string.

So, I'm generally in favor of sending and evaluating both. In the best case 
it's a huge win (eliminating range partitions) and in the worst case it's 
probably not a major cost.

One further thought here as a future extension: we could provide an API in the 
Kudu scanner that says "this predicate is optional: drop it if you think it's 
not filtering much". For the case of these propagated predicates feeding into a 
join, dropping them at runtime if they are ineffective is worthwhile (I think 
Impala already does something like this in its scanner path, right?)



--
To view, visit http://gerrit.cloudera.org:8080/15683
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9100076f68ea299ddb6ec8bc027cac7a47f5d754
Gerrit-Change-Number: 15683
Gerrit-PatchSet: 3
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Fri, 10 Apr 2020 03:49:11 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] WIP: Asynchronous code generation

2020-02-13 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15105 )

Change subject: WIP: Asynchronous code generation
..


Patch Set 9:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15105/9/be/src/codegen/codegen-fn-ptr.h
File be/src/codegen/codegen-fn-ptr.h:

http://gerrit.cloudera.org:8080/#/c/15105/9/be/src/codegen/codegen-fn-ptr.h@41
PS9, Line 41:   static constexpr std::memory_order mem_order_ = 
std::memory_order_acq_rel;
just passing through: I think you have it correct here. Relaxed is insuffiicent 
because it would permit reordering of the writing of the code and the storing 
of the pointer. In other words, your compilation thread does things like:

1. allocate memory for function machine code
2. write code to allocated memory
3. publish pointer

and without "release" semantics on the publishing of the pointer, there is no 
guarantee that #2 and #3 won't be reordered. In practice, this is really 
unlikely to happen, because there are probably some other locking operations in 
between #2 and #3 and because x86 is a TSO architecture, but worth doing it 
right regardless.

Also worth noting that acq/rel semantics on x86 isn't expensive, they are just 
simple load/store operations. It only prevents compiler reordering. So, there 
is little to gain by using relaxed.



--
To view, visit http://gerrit.cloudera.org:8080/15105
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b
Gerrit-Change-Number: 15105
Gerrit-PatchSet: 9
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 13 Feb 2020 21:18:56 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9116: KUDU-2989. Work around SASL bug when FQDN is >d characters

2019-11-01 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14614 )

Change subject: IMPALA-9116: KUDU-2989. Work around SASL bug when FQDN is >=64 
characters
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/14614
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4898814f2f7ab87151798336414dde7078d28a4a
Gerrit-Change-Number: 14614
Gerrit-PatchSet: 1
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Fri, 01 Nov 2019 17:17:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7506: support global INVALIDATE METADATA in local catalog mode

2019-10-22 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14307 )

Change subject: IMPALA-7506: support global INVALIDATE METADATA in local 
catalog mode
..


Patch Set 8:

(3 comments)

I generally like this solution -- seems simpler than trying to track the full 
contents of the cache.

Is it possible to use this solution for catalog v1 as well, and remove the 
CatalogObjectVersionSet thing? I think that thing is buggy/complex, maybe would 
be nice to just have one implementation to worry about. (ok to do that in a 
follow-up if it's not trivial, or if you think it's a risky change, maybe we 
shouldn't do it at all)

http://gerrit.cloudera.org:8080/#/c/14307/8/be/src/util/impalad-metrics.h
File be/src/util/impalad-metrics.h:

http://gerrit.cloudera.org:8080/#/c/14307/8/be/src/util/impalad-metrics.h@119
PS8, Line 119:   /// Minimal catalog object version in local catalog cache of 
Coordinator.
this isn't actually true -- it's no longer the minimum, but rather a lower 
bound, right? ie it's not a tight bound like it used to be in v1. We should 
probably rename the metric, variables, thrift fields, etc, to reflect that.


http://gerrit.cloudera.org:8080/#/c/14307/8/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/14307/8/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1587
PS8, Line 1587:   lastResetStartVersion_ = startVersion;
What if there are two concurrent calls to INVALIDATE METADATA? is it possible 
that the lastResetStartVersion_ will end up going downward, because the setting 
of startVersion is under a different lock acquisition from the version 
writelock above?


http://gerrit.cloudera.org:8080/#/c/14307/8/tests/authorization/test_grant_revoke.py
File tests/authorization/test_grant_revoke.py:

http://gerrit.cloudera.org:8080/#/c/14307/8/tests/authorization/test_grant_revoke.py@364
PS8, Line 364:   self.client.execute("grant role {0} to group 
foobar".format(role_name))
why were these changes necessary?



--
To view, visit http://gerrit.cloudera.org:8080/14307
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib61a7ab1ffa062620ffbc2dadc34bd7a8ca9e549
Gerrit-Change-Number: 14307
Gerrit-PatchSet: 8
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 22 Oct 2019 23:18:34 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9001: Fix SPNEGO for requests with no 'Authorization'

2019-10-04 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14352 )

Change subject: IMPALA-9001: Fix SPNEGO for requests with no 'Authorization'
..


Patch Set 1: Code-Review+2

I suppose adding a test for this with the ApacheDS KDC and HttpUrlConnection is 
too complicated? If so, that's fine.


--
To view, visit http://gerrit.cloudera.org:8080/14352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id9b6ac99b799324ec22e95fd1eb022d5ad6f54bd
Gerrit-Change-Number: 14352
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Fri, 04 Oct 2019 16:25:01 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8228: Ownership support for Ranger authz

2019-09-11 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14106 )

Change subject: IMPALA-8228: Ownership support for Ranger authz
..


Patch Set 11: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/14106
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I737b7164a3e7afb9996b3402e6872effd663f7b4
Gerrit-Change-Number: 14106
Gerrit-PatchSet: 11
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 12 Sep 2019 03:32:03 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8228: Ownership support for Ranger authz

2019-09-10 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14106 )

Change subject: IMPALA-8228: Ownership support for Ranger authz
..


Patch Set 6:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/14106/6/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

http://gerrit.cloudera.org:8080/#/c/14106/6/fe/src/main/java/org/apache/impala/service/Frontend.java@783
PS6, Line 783: TODO
mind filing a JIRA and putting the number here?


http://gerrit.cloudera.org:8080/#/c/14106/6/fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java
File 
fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java:

http://gerrit.cloudera.org:8080/#/c/14106/6/fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java@3060
PS6, Line 3060: authorize("select count(*) from functional.alltypes")
nit: might make this test a little easier to read to do something like create 
an Map:

ImmutableMap.builder()
  .put("select count(*) from functional.alltypes",
   selectError("functional.alltypes"))
  .put("select id from functional.alltypes",
   selectError("functional.alltypes"))
  ...
  .build();

and then you can loop over the queries in the before/after case below to avoid 
having to duplicate all the queries twice


http://gerrit.cloudera.org:8080/#/c/14106/6/tests/authorization/test_ranger.py
File tests/authorization/test_ranger.py:

http://gerrit.cloudera.org:8080/#/c/14106/6/tests/authorization/test_ranger.py@665
PS6, Line 665:  # Run show tables on the db. The resulting list should be 
empty. This happens
 : # because the created table's ownership information is 
not aggresively cached
 : # by the current Catalog implementations. Hence the 
analysis pass does not
 : # have access to the ownership information to verify if 
the current session
 : # user is actually the owner. We need to fix this by 
caching the HMS metadata
 : # more aggressively when the table loads.
this is true only in localcatalog, right? otherwise the 'create table' call 
would have fully populated the cache rather than leaving an incomplete table?


http://gerrit.cloudera.org:8080/#/c/14106/6/tests/authorization/test_ranger.py@669
PS6, Line 669: We need to fix this by caching the HMS metadata
 : # more aggressively when the table loads.
add a TODO(IMPALA-x) with jira



--
To view, visit http://gerrit.cloudera.org:8080/14106
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I737b7164a3e7afb9996b3402e6872effd663f7b4
Gerrit-Change-Number: 14106
Gerrit-PatchSet: 6
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Tue, 10 Sep 2019 23:49:33 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] [WIP] IMPALA-8228: Ownership support for Ranger authz

2019-08-20 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14106 )

Change subject: [WIP] IMPALA-8228: Ownership support for Ranger authz
..


Patch Set 2:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/14106/2/fe/src/main/java/org/apache/impala/analysis/Analyzer.java
File fe/src/main/java/org/apache/impala/analysis/Analyzer.java:

http://gerrit.cloudera.org:8080/#/c/14106/2/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@2673
PS2, Line 2673: FeTable table = getTable(fqTableName.getDb(), 
fqTableName.getTbl());
we seem to have lost the "checkNotNull' here? was that intentional?


http://gerrit.cloudera.org:8080/#/c/14106/2/fe/src/main/java/org/apache/impala/analysis/ResetMetadataStmt.java
File fe/src/main/java/org/apache/impala/analysis/ResetMetadataStmt.java:

http://gerrit.cloudera.org:8080/#/c/14106/2/fe/src/main/java/org/apache/impala/analysis/ResetMetadataStmt.java@138
PS2, Line 138: tableName_
this could be null in the case of non-table-specific statements, which seems 
like it would fail


http://gerrit.cloudera.org:8080/#/c/14106/2/fe/src/main/java/org/apache/impala/analysis/ResetMetadataStmt.java@189
PS2, Line 189: dbName
this should be 'database_' right?


http://gerrit.cloudera.org:8080/#/c/14106/2/fe/src/main/java/org/apache/impala/authorization/AuthorizableFactory.java
File fe/src/main/java/org/apache/impala/authorization/AuthorizableFactory.java:

http://gerrit.cloudera.org:8080/#/c/14106/2/fe/src/main/java/org/apache/impala/authorization/AuthorizableFactory.java@35
PS2, Line 35: String ownerUser
mind adding @Nullable annotations here and below, if this is allowed to be null 
to indicate no known owner?


http://gerrit.cloudera.org:8080/#/c/14106/2/fe/src/main/java/org/apache/impala/authorization/AuthorizableTable.java
File fe/src/main/java/org/apache/impala/authorization/AuthorizableTable.java:

http://gerrit.cloudera.org:8080/#/c/14106/2/fe/src/main/java/org/apache/impala/authorization/AuthorizableTable.java@30
PS2, Line 30:   private final String ownerUser_;
how about using @Nullable on this?


http://gerrit.cloudera.org:8080/#/c/14106/2/fe/src/main/java/org/apache/impala/authorization/AuthorizableTable.java@37
PS2, Line 37: ownerUser_ = ownerUser;
would an empty owner string be valid? maybe we should be checking that it's not 
empty?


http://gerrit.cloudera.org:8080/#/c/14106/2/fe/src/main/java/org/apache/impala/authorization/AuthorizableTable.java@55
PS2, Line 55:   public String getOwnerUser() { return ownerUser_; }
this is @Override right?


http://gerrit.cloudera.org:8080/#/c/14106/2/fe/src/main/java/org/apache/impala/authorization/PrivilegeRequestBuilder.java
File 
fe/src/main/java/org/apache/impala/authorization/PrivilegeRequestBuilder.java:

http://gerrit.cloudera.org:8080/#/c/14106/2/fe/src/main/java/org/apache/impala/authorization/PrivilegeRequestBuilder.java@71
PS2, Line 71: onTable
mind giving this a more explicit name like 'onTableWithUnknownOwner' or 
something? think that's better than just overloading, so it's clear that when 
you have an owner you should use a different call. Or, get rid of this overload 
and explicitly pass the null owner at call sites


http://gerrit.cloudera.org:8080/#/c/14106/2/fe/src/main/java/org/apache/impala/authorization/PrivilegeRequestBuilder.java@74
PS2, Line 74: // TableNotFound Analsis exceptions and instead mask that as 
an
typo



--
To view, visit http://gerrit.cloudera.org:8080/14106
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I737b7164a3e7afb9996b3402e6872effd663f7b4
Gerrit-Change-Number: 14106
Gerrit-PatchSet: 2
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Tue, 20 Aug 2019 17:25:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8869: Fix handling of HTTP keep-alive when returning 401

2019-08-19 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14076 )

Change subject: IMPALA-8869: Fix handling of HTTP keep-alive when returning 401
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/14076
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3d5f80dbcde5b623a1d0586b5d763a062dd21afa
Gerrit-Change-Number: 14076
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Tue, 20 Aug 2019 02:53:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8661 : Add randomized tests to stress MetastoreEventsProcessor

2019-07-26 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13932 )

Change subject: IMPALA-8661 : Add randomized tests to stress 
MetastoreEventsProcessor
..


Patch Set 2:

I'm out the next two weeks on PTO, so probably best if someone else reviews this


--
To view, visit http://gerrit.cloudera.org:8080/13932
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c85b83efd4f56b5ae0e8d1dc6a2ee2feb6721ce
Gerrit-Change-Number: 13932
Gerrit-PatchSet: 2
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Fri, 26 Jul 2019 23:26:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8783: Add Kerberos SPNEGO support to the http hs2 server

2019-07-26 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13918 )

Change subject: IMPALA-8783: Add Kerberos SPNEGO support to the http hs2 server
..


Patch Set 7: Code-Review+2

k, seems like this is good. Just to triple check, I think you should run one 
manual test where you do add some code locally like:

if (!token.empty()) { token[5]++; }

and make sure that it doesn't allow connecting :)


--
To view, visit http://gerrit.cloudera.org:8080/13918
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I15d9a842ab37ebc34b9fde5917137ff2961d870a
Gerrit-Change-Number: 13918
Gerrit-PatchSet: 7
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Fri, 26 Jul 2019 23:18:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8783: Add Kerberos SPNEGO support to the http hs2 server

2019-07-26 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13918 )

Change subject: IMPALA-8783: Add Kerberos SPNEGO support to the http hs2 server
..


Patch Set 6:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/13918/6/be/src/transport/THttpServer.cpp
File be/src/transport/THttpServer.cpp:

http://gerrit.cloudera.org:8080/#/c/13918/6/be/src/transport/THttpServer.cpp@184
PS6, Line 184:  if (has_ldap_ && (!got_negotiate_auth || !has_kerberos_)) {
shouldn't this be conditioned on got_basic_auth? otherwise a non-authenticated 
request is going to count as a basic_auth_failure


http://gerrit.cloudera.org:8080/#/c/13918/6/be/src/transport/THttpServer.cpp@196
PS6, Line 196: is_complete) {
 : if (metrics_enabled_)
combine these two?


http://gerrit.cloudera.org:8080/#/c/13918/6/be/src/transport/THttpServer.cpp@200
PS6, Line 200: total_negotiate_auth_failure_
shouldn't we only increment this if there was a token present? the normal flow 
is that someone will connect, send no auth header, and get a 401. We don't want 
to count this 401 as a failed authentication



--
To view, visit http://gerrit.cloudera.org:8080/13918
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I15d9a842ab37ebc34b9fde5917137ff2961d870a
Gerrit-Change-Number: 13918
Gerrit-PatchSet: 6
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Fri, 26 Jul 2019 22:41:42 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8783: Add Kerberos SPNEGO support to the http hs2 server

2019-07-26 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13918 )

Change subject: IMPALA-8783: Add Kerberos SPNEGO support to the http hs2 server
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/13918/1/be/src/transport/THttpServer.cpp
File be/src/transport/THttpServer.cpp:

http://gerrit.cloudera.org:8080/#/c/13918/1/be/src/transport/THttpServer.cpp@187
PS1, Line 187: bool is_complete;
Similar question to the earlier comment -- isn't it possible for SpnegoStep to 
return "OK" but not "complete"? In that case, it would be responding with a 
token, and you'd want to be sending the token back with the Unauthorized 
response?


http://gerrit.cloudera.org:8080/#/c/13918/3/be/src/transport/THttpServer.cpp
File be/src/transport/THttpServer.cpp:

http://gerrit.cloudera.org:8080/#/c/13918/3/be/src/transport/THttpServer.cpp@199
PS3, Line 199: call the auth function with an empty string to allow them to
 :   // generate return headers.
shouldn't they have generated the return header from the bad token? ie i htink 
the SPNEGO spec includes the possibility of a multi-step authentication, where 
the client sends a token, the server responds with a different token (but not 
authorized). Is that not the case?



--
To view, visit http://gerrit.cloudera.org:8080/13918
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I15d9a842ab37ebc34b9fde5917137ff2961d870a
Gerrit-Change-Number: 13918
Gerrit-PatchSet: 3
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Fri, 26 Jul 2019 20:39:50 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8783: Add Kerberos SPNEGO support to the http hs2 server

2019-07-25 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13918 )

Change subject: IMPALA-8783: Add Kerberos SPNEGO support to the http hs2 server
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/13918/1/be/src/rpc/authentication.cc
File be/src/rpc/authentication.cc:

http://gerrit.cloudera.org:8080/#/c/13918/1/be/src/rpc/authentication.cc@565
PS1, Line 565: connection_context->return_headers;
nit: std::move() around this


http://gerrit.cloudera.org:8080/#/c/13918/1/be/src/transport/THttpServer.cpp
File be/src/transport/THttpServer.cpp:

http://gerrit.cloudera.org:8080/#/c/13918/1/be/src/transport/THttpServer.cpp@256
PS1, Line 256: h << "WWW-Authenticate: Negotiate" << CRLF;
shouldn't we be using a Negotiate header which includes the token returned by 
gssapi here? I guess this will work for certain configurations where the 
server's initial challenge is empty, but I think there might be cases where 
this isn't the case.



--
To view, visit http://gerrit.cloudera.org:8080/13918
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I15d9a842ab37ebc34b9fde5917137ff2961d870a
Gerrit-Change-Number: 13918
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 25 Jul 2019 17:47:33 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Support SPNEGO for Impala webserver

2019-07-11 Thread Todd Lipcon (Code Review)
Hello Thomas Tauber-Marshall, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13774

to look at the new patch set (#4).

Change subject: Support SPNEGO for Impala webserver
..

Support SPNEGO for Impala webserver

This ports over changes from kudu commit
1f291b77ef0868ac888a850678adc2d7cce65529 which implemented SPNEGO for
the Kudu webserver.

Unfortunately, thorough testing of this is difficult given that curl
isn't currently in the toolchain. I was able to manually test this by
adding a 'sleep(1000)' call into the newly added test case, then setting
up $KRB5_CONFIG in my shell to point to the temporary KDC's environment,
and using 'curl -u : --negotiate http://...' to authenticate.

Strangely, using the version of curl on el7 didn't seem to work properly
(perhaps an el7 curl bug) but using curl on my Ubuntu 18 laptop I was
able to authenticate with SPNEGO.

Change-Id: Ife2b04310e1571d231bf8ee1bcfd3b7afc2edd8f
---
M be/src/exec/kudu-util.h
M be/src/gutil/strings/escaping.cc
M be/src/util/CMakeLists.txt
A be/src/util/kudu-status-util.h
M be/src/util/webserver-test.cc
M be/src/util/webserver.cc
M be/src/util/webserver.h
7 files changed, 267 insertions(+), 42 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/74/13774/4
--
To view, visit http://gerrit.cloudera.org:8080/13774
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ife2b04310e1571d231bf8ee1bcfd3b7afc2edd8f
Gerrit-Change-Number: 13774
Gerrit-PatchSet: 4
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] Support SPNEGO for Impala webserver

2019-07-09 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13774 )

Change subject: Support SPNEGO for Impala webserver
..


Patch Set 1:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/13774/1/be/src/util/kudu-status-util.h
File be/src/util/kudu-status-util.h:

http://gerrit.cloudera.org:8080/#/c/13774/1/be/src/util/kudu-status-util.h@26
PS1, Line 26: \
> Might as well fix the formatting while you're here
Done


http://gerrit.cloudera.org:8080/#/c/13774/1/be/src/util/webserver.cc
File be/src/util/webserver.cc:

http://gerrit.cloudera.org:8080/#/c/13774/1/be/src/util/webserver.cc@203
PS1, Line 203: // is invalid, a bad Status will be returned (and the other 
out-parameters left untouched).
> line too long (91 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/13774/1/be/src/util/webserver.cc@214
PS1, Line 214:   RETURN_NOT_OK(kudu::gssapi::SpnegoStep(neg_token, 
_token_b64, _complete, authn_user));
> line too long (96 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/13774/1/be/src/util/webserver.cc@370
PS1, Line 370: We assume that security::InitKerberosForServer()
I don't think the --principal is actually necessary -- the way SPNEGO works is 
that the client provides a principal, and the server will look in the 
configured keytab for any matching principal. This is what allows users to have 
multiple DNS names, and clients can connect to them via any name (useful for L4 
load balancing or CNAMEs to work, for example).

If we don't have KRB5_KTNAME set at this point, it'll be null, and we'll get 
the expected bad Status below. Does that seem right? We actually do have some 
test case for this on the equivalent Kudu code, but it wasn't really easy to 
port it over to Impala's tests.

> What about the opposite case - where --principal is set but 
> --webserver_require_spnego=false? Right now I think we just silently allow 
> this, but users may find it surprising that setting up kerberos doesn't 
> automatically secure the webserver. Maybe worth logging a warning in that 
> case?

The problem with this is that, while SPNEGO is a standard, it's a huge pain in 
the ass for users to set up, and in fact I think even on secure clusters, most 
users don't enable SPNEGO on their web UIs. If we started WARNING about being 
insecure if you don't enable SPNEGO I think people will get pretty grouchy.

That said, this is still useful because we'll use this to allow access via Knox 
Trusted Proxy support, which uses SPNEGO to authenticate the Knox service to 
Impala (and Knox itself gets authentication via pluggable modules, including 
SSO integration)



--
To view, visit http://gerrit.cloudera.org:8080/13774
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ife2b04310e1571d231bf8ee1bcfd3b7afc2edd8f
Gerrit-Change-Number: 13774
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Wed, 10 Jul 2019 04:45:14 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Support SPNEGO for Impala webserver

2019-07-09 Thread Todd Lipcon (Code Review)
Hello Thomas Tauber-Marshall, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13774

to look at the new patch set (#2).

Change subject: Support SPNEGO for Impala webserver
..

Support SPNEGO for Impala webserver

This ports over changes from kudu commit
1f291b77ef0868ac888a850678adc2d7cce65529 which implemented SPNEGO for
the Kudu webserver.

Unfortunately, thorough testing of this is difficult given that curl
isn't currently in the toolchain. I was able to manually test this by
adding a 'sleep(1000)' call into the newly added test case, then setting
up $KRB5_CONFIG in my shell to point to the temporary KDC's environment,
and using 'curl -u : --negotiate http://...' to authenticate.

Strangely, using the version of curl on el7 didn't seem to work properly
(perhaps an el7 curl bug) but using curl on my Ubuntu 18 laptop I was
able to authenticate with SPNEGO.

Change-Id: Ife2b04310e1571d231bf8ee1bcfd3b7afc2edd8f
---
M be/src/exec/kudu-util.h
M be/src/gutil/strings/escaping.cc
M be/src/util/CMakeLists.txt
A be/src/util/kudu-status-util.h
M be/src/util/webserver-test.cc
M be/src/util/webserver.cc
M be/src/util/webserver.h
7 files changed, 265 insertions(+), 42 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/74/13774/2
--
To view, visit http://gerrit.cloudera.org:8080/13774
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ife2b04310e1571d231bf8ee1bcfd3b7afc2edd8f
Gerrit-Change-Number: 13774
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] IMPALA-8748: Always pass hostname to RpcMgr::GetProxy()

2019-07-08 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13818 )

Change subject: IMPALA-8748: Always pass hostname to RpcMgr::GetProxy()
..


Patch Set 1: Code-Review+2

(1 comment)

+2 pending one issue with the DCHECK

http://gerrit.cloudera.org:8080/#/c/13818/1/be/src/rpc/rpc-mgr.inline.h
File be/src/rpc/rpc-mgr.inline.h:

http://gerrit.cloudera.org:8080/#/c/13818/1/be/src/rpc/rpc-mgr.inline.h@44
PS1, Line 44:   DCHECK(!IsResolvedAddress(MakeNetworkAddress(hostname))) << 
hostname << "\n"
I don't think this DCHECK is valid -- it's OK to run things in an environment 
without DNS, if you give the hosts principals like impala/1.2.3.4.



--
To view, visit http://gerrit.cloudera.org:8080/13818
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I85b661c8c3b3b67bfc1ce9e29aecb90a862666c0
Gerrit-Change-Number: 13818
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Tue, 09 Jul 2019 03:33:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8663 : FileMetadataLoader should skip hidden and tmp directories

2019-07-08 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13665 )

Change subject: IMPALA-8663 : FileMetadataLoader should skip hidden and tmp 
directories
..


Patch Set 12: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/13665
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2c4a22908304fe9e377d77d6c18d401c3f3294aa
Gerrit-Change-Number: 13665
Gerrit-PatchSet: 12
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Mon, 08 Jul 2019 22:52:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] build: use thin static archives

2019-07-02 Thread Todd Lipcon (Code Review)
Hello Tim Armstrong, Joe McDonnell, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13775

to look at the new patch set (#2).

Change subject: build: use thin static archives
..

build: use thin static archives

This enables thin static archives for our internal libraries during the
build. This makes linking much faster since the static archives just
point to object files instead of copying them.

This reduces the size of intermediate '.a' files for my debug build from
about 1.4GB to 58MB.

Incremental rebuilds are slightly sped up, though maybe in the realm of
noise (likely the performance improvement depends on how much RAM is
available for buffer caching the IO):

Without patch incremental build of catalogd after modifying 
CatalogObjects.thrift:
real0m53.433s
user7m6.400s
sys 1m21.610s

With patch:
real0m44.772s
user7m6.632s
sys 1m16.870s

Change-Id: I3b54b5be8658c951914758c406cca55d4cc1756e
---
M CMakeLists.txt
1 file changed, 9 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/13775/2
--
To view, visit http://gerrit.cloudera.org:8080/13775
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3b54b5be8658c951914758c406cca55d4cc1756e
Gerrit-Change-Number: 13775
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] build: use thin static archives

2019-07-02 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13775 )

Change subject: build: use thin static archives
..


Patch Set 1:

> Patch Set 1:
>
> I tried this out and I couldn't tell a difference when doing buildall.sh 
> -skiptests. Can you include a measurement of what got faster?

hmm, it should make the generated .a files much smaller and thus link faster. 
Let me run some tests, maybe I put this in the wrong spot in the CMakeLists.txt 
file (I copied it from Kudu where we saw a speedup but will admit I didn't 
verify it in Impala)


--
To view, visit http://gerrit.cloudera.org:8080/13775
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3b54b5be8658c951914758c406cca55d4cc1756e
Gerrit-Change-Number: 13775
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Tue, 02 Jul 2019 19:17:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] Support SPNEGO for Impala webserver

2019-07-01 Thread Todd Lipcon (Code Review)
Todd Lipcon has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/13774


Change subject: Support SPNEGO for Impala webserver
..

Support SPNEGO for Impala webserver

This ports over changes from kudu commit
1f291b77ef0868ac888a850678adc2d7cce65529 which implemented SPNEGO for
the Kudu webserver.

Unfortunately, thorough testing of this is difficult given that curl
isn't currently in the toolchain. I was able to manually test this by
adding a 'sleep(1000)' call into the newly added test case, then setting
up $KRB5_CONFIG in my shell to point to the temporary KDC's environment,
and using 'curl -u : --negotiate http://...' to authenticate.

Strangely, using the version of curl on el7 didn't seem to work properly
(perhaps an el7 curl bug) but using curl on my Ubuntu 18 laptop I was
able to authenticate with SPNEGO.

Change-Id: Ife2b04310e1571d231bf8ee1bcfd3b7afc2edd8f
---
M be/src/exec/kudu-util.h
M be/src/gutil/strings/escaping.cc
M be/src/util/CMakeLists.txt
A be/src/util/kudu-status-util.h
M be/src/util/webserver-test.cc
M be/src/util/webserver.cc
M be/src/util/webserver.h
7 files changed, 263 insertions(+), 42 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/74/13774/1
--
To view, visit http://gerrit.cloudera.org:8080/13774
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ife2b04310e1571d231bf8ee1bcfd3b7afc2edd8f
Gerrit-Change-Number: 13774
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 


[Impala-ASF-CR] build: use thin static archives

2019-07-01 Thread Todd Lipcon (Code Review)
Todd Lipcon has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/13775


Change subject: build: use thin static archives
..

build: use thin static archives

This enables thin static archives for our internal libraries during the
build. This makes linking much faster since the static archives just
point to object files instead of copying them.

Change-Id: I3b54b5be8658c951914758c406cca55d4cc1756e
---
M CMakeLists.txt
1 file changed, 9 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/13775/1
--
To view, visit http://gerrit.cloudera.org:8080/13775
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I3b54b5be8658c951914758c406cca55d4cc1756e
Gerrit-Change-Number: 13775
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 


[Impala-ASF-CR] IMPALA-8663 : FileMetadataLoader should skip hidden and tmp directories

2019-07-01 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13665 )

Change subject: IMPALA-8663 : FileMetadataLoader should skip hidden and tmp 
directories
..


Patch Set 11:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/13665/8/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
File fe/src/main/java/org/apache/impala/common/FileSystemUtil.java:

http://gerrit.cloudera.org:8080/#/c/13665/8/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@545
PS8, Line 545:
 :* If 'recursive' is true, all
> Actually, I added this line in the doc since I noticed that the RecursingIt
hrm, were the semantics already broken then? See the unit test 
AcidUtilsTest.testAcidStateNoBase:

assertFiltering(new String[]{
"base_01.txt",
"post_upgrade.txt",
"delta_05_05_/",
"delta_05_05_/lmn.txt",
"base_10/",
"delta_012_012_/",
"delta_012_012_/_0",
"delta_012_012_/_1"},
"", // writeIdList that accepts all transactions as valid
new String[]{
// Post upgrade files are ignored if there is a valid base.
"delta_012_012_/_0",
"delta_012_012_/_1"});


ie here the empty base directory has important semantics. Maybe we don't have a 
valid e2e test case to reproduce this yet because of this Hive bug: 
https://issues.apache.org/jira/browse/HIVE-21750 but it certainly seems like 
this is an issue. Let's file a JIRA to make sure we validate this case before 
releasing this feature.


http://gerrit.cloudera.org:8080/#/c/13665/11/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
File fe/src/main/java/org/apache/impala/common/FileSystemUtil.java:

http://gerrit.cloudera.org:8080/#/c/13665/11/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@731
PS11, Line 731:   // we use listFiles API to natively recurse on S3 for 
performance reasons, so we
Are we really gaining much from using this RecursiveIteratorWithFilter class 
for the S3 case now? It seems like this is getting to be a bit of spaghetti, 
with this being named "Recursive" but having a flag that makes it non-recursive.

What about just having a separate class like 'FilteringIterator' which wraps 
RemoteIterator and does filtering, and keep this one as always-recursive, so 
the code path is easier to follow?


http://gerrit.cloudera.org:8080/#/c/13665/11/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@736
PS11, Line 736: LOG.info("Ignoring {} since its contained in a ignored 
directory",
This probably shouldn't be at INFO level



--
To view, visit http://gerrit.cloudera.org:8080/13665
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2c4a22908304fe9e377d77d6c18d401c3f3294aa
Gerrit-Change-Number: 13665
Gerrit-PatchSet: 11
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Mon, 01 Jul 2019 22:40:45 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Update squeasel to 7973705170f4744d1806e32695f7ea1e8308ee95

2019-07-01 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13768 )

Change subject: Update squeasel to 7973705170f4744d1806e32695f7ea1e8308ee95
..


Patch Set 3: Code-Review+2

ps2 fixes the whitespace, ps3 is just a rebase. Carrying +2


--
To view, visit http://gerrit.cloudera.org:8080/13768
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I17f90561e0ea6b0917fff51b055225060a4fa549
Gerrit-Change-Number: 13768
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Mon, 01 Jul 2019 21:18:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] Update squeasel to 7973705170f4744d1806e32695f7ea1e8308ee95

2019-07-01 Thread Todd Lipcon (Code Review)
Hello Michael Ho, Lars Volker, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13768

to look at the new patch set (#2).

Change subject: Update squeasel to 7973705170f4744d1806e32695f7ea1e8308ee95
..

Update squeasel to 7973705170f4744d1806e32695f7ea1e8308ee95

This updates to the latest build of squeasel, which fixes a few small
issues and also improves support for keepalive. This patch doesn't
itself enable keepalive, but switches to the new APIs.

Change-Id: I17f90561e0ea6b0917fff51b055225060a4fa549
---
M be/src/thirdparty/squeasel/squeasel.c
M be/src/thirdparty/squeasel/squeasel.h
M be/src/util/webserver.cc
M be/src/util/webserver.h
4 files changed, 77 insertions(+), 31 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/13768/2
--
To view, visit http://gerrit.cloudera.org:8080/13768
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I17f90561e0ea6b0917fff51b055225060a4fa549
Gerrit-Change-Number: 13768
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 


[Impala-ASF-CR] IMPALA-6159 / KUDU-2192: Enable TCP keepalive for all outbound connections

2019-06-28 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13764 )

Change subject: IMPALA-6159 / KUDU-2192: Enable TCP keepalive for all outbound 
connections
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/13764
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaa1d66d83aea1cc82d07fc6217be5fc1306695bc
Gerrit-Change-Number: 13764
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Fri, 28 Jun 2019 23:42:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] Update squeasel to 7973705170f4744d1806e32695f7ea1e8308ee95

2019-06-28 Thread Todd Lipcon (Code Review)
Hello Michael Ho, Lars Volker,

I'd like you to do a code review. Please visit

http://gerrit.cloudera.org:8080/13768

to review the following change.


Change subject: Update squeasel to 7973705170f4744d1806e32695f7ea1e8308ee95
..

Update squeasel to 7973705170f4744d1806e32695f7ea1e8308ee95

This updates to the latest build of squeasel, which fixes a few small
issues and also improves support for keepalive. This patch doesn't
itself enable keepalive, but switches to the new APIs.

Change-Id: I17f90561e0ea6b0917fff51b055225060a4fa549
---
M be/src/thirdparty/squeasel/squeasel.c
M be/src/thirdparty/squeasel/squeasel.h
M be/src/util/webserver.cc
M be/src/util/webserver.h
4 files changed, 76 insertions(+), 31 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/13768/1
-- 
To view, visit http://gerrit.cloudera.org:8080/13768
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I17f90561e0ea6b0917fff51b055225060a4fa549
Gerrit-Change-Number: 13768
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 


[Impala-ASF-CR] Update kudu/security from 9ebcb77aa911aae76c48e717af24e643cb81908d

2019-06-28 Thread Todd Lipcon (Code Review)
Hello Michael Ho, Lars Volker,

I'd like you to do a code review. Please visit

http://gerrit.cloudera.org:8080/13767

to review the following change.


Change subject: Update kudu/security from 
9ebcb77aa911aae76c48e717af24e643cb81908d
..

Update kudu/security from 9ebcb77aa911aae76c48e717af24e643cb81908d

This updates the kudu security code to the latest version, which
includes support for GSSAPI calls, necessary for SPNEGO.

This is a straight rsync of the kudu/util source code except for some
minor CMakeLists changes that were carried over from the old version.

Change-Id: Ie7c91193fd49f8ca1234b23cf61fc90c1fdbe2e0
---
M be/src/kudu/security/CMakeLists.txt
A be/src/kudu/security/gssapi.cc
A be/src/kudu/security/gssapi.h
M be/src/kudu/security/init.cc
M be/src/kudu/security/openssl_util.cc
M be/src/kudu/security/test/mini_kdc.cc
M be/src/kudu/security/test/mini_kdc.h
M be/src/kudu/security/tls_socket-test.cc
M be/src/kudu/security/tls_socket.cc
M be/src/kudu/security/token-test.cc
M be/src/kudu/security/token.proto
M be/src/kudu/security/token_signer.cc
M be/src/kudu/security/token_signer.h
M be/src/kudu/security/token_verifier.cc
M be/src/kudu/util/test_util.cc
M be/src/kudu/util/test_util.h
16 files changed, 848 insertions(+), 172 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/13767/1
--
To view, visit http://gerrit.cloudera.org:8080/13767
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ie7c91193fd49f8ca1234b23cf61fc90c1fdbe2e0
Gerrit-Change-Number: 13767
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 


[Impala-ASF-CR] IMPALA-8585: Insert data into ACID table during dataload

2019-06-27 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13633 )

Change subject: IMPALA-8585: Insert data into ACID table during dataload
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13633/1/testdata/datasets/functional/functional_schema_template.sql
File testdata/datasets/functional/functional_schema_template.sql:

http://gerrit.cloudera.org:8080/#/c/13633/1/testdata/datasets/functional/functional_schema_template.sql@2137
PS1, Line 2137: -- Compactions could be also interesting, but seem to be 
sporadically very slow.
> can you file a JIRA for this issue that you mentioned?
mind filing this JIRA  and adding a TODO here? then we can commit this with the 
workaround for now but also track the fix



--
To view, visit http://gerrit.cloudera.org:8080/13633
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id459519d5b963ea760c44719c12736bf104938f9
Gerrit-Change-Number: 13633
Gerrit-PatchSet: 1
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 27 Jun 2019 15:25:55 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8663 : FileMetadataLoader should skip hidden and tmp directories

2019-06-27 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13665 )

Change subject: IMPALA-8663 : FileMetadataLoader should skip hidden and tmp 
directories
..


Patch Set 8:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/13665/8/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
File fe/src/main/java/org/apache/impala/common/FileSystemUtil.java:

http://gerrit.cloudera.org:8080/#/c/13665/8/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@541
PS8, Line 541: temporary directories
the code now seems to also skip any files that match this pattern, not just 
directories. I think that's probably OK but we should be clear on the semantics.


http://gerrit.cloudera.org:8080/#/c/13665/8/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@545
PS8, Line 545: all underlying files (except which are
 :* in the ignored directories)
does this mean we no longer yield directories, and only yield files? does this 
not break the case of an empty base data dir, which actually has semantic value?


http://gerrit.cloudera.org:8080/#/c/13665/8/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@568
PS8, Line 568: isS3AFileSystem(p)
why'd you change from isS3AFileSystem(fs) to isS3AFileSystem(p)? In the case 
that 'p' isn't a fully qualified path, this will check the wrong filesystem, 
and we don't have any preconditions checking that 'p' must be fully qualified.


http://gerrit.cloudera.org:8080/#/c/13665/8/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@568
PS8, Line 568: isS3AFileSystem
perhaps we can add a @VisibleForTesting way we can make this path get used even 
for non-S3, so we can get test coverage here?


http://gerrit.cloudera.org:8080/#/c/13665/8/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@626
PS8, Line 626: LISTING_TYPE
style nit: enums should be named like classes (ListingType)


http://gerrit.cloudera.org:8080/#/c/13665/8/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@699
PS8, Line 699:   // if the current file is on a ignored path return early
 :   if (isIgnoredPath(fileStatus)) return;
how does this prevent recursion into tmp dirs in the recursive listFiles case? 
It seems like this is only checking the filename (last path component) and not 
intermediate path components, which we'd need to be doing to filter out 
contents of subdirectories for the "native recursion"



--
To view, visit http://gerrit.cloudera.org:8080/13665
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2c4a22908304fe9e377d77d6c18d401c3f3294aa
Gerrit-Change-Number: 13665
Gerrit-PatchSet: 8
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Thu, 27 Jun 2019 06:46:34 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8627: re-enable catalog v2 in containers

2019-06-24 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13708 )

Change subject: IMPALA-8627: re-enable catalog v2 in containers
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13708/2/docker/catalogd/Dockerfile
File docker/catalogd/Dockerfile:

http://gerrit.cloudera.org:8080/#/c/13708/2/docker/catalogd/Dockerfile@27
PS2, Line 27:  "-catalog_topic_mode=minimal", 
"-hms_event_polling_interval_s=1",\
> Do you think it would be useful to enable them one by one so that triage of
I noticed we have some tests which try to detect whether "v2" is enabled, not 
specifically checking for polling vs localcatalog, and those tests fail if you 
enable one without the other. We may need to change those checks to be more 
specific to one feature or the other to do this.



-- 
To view, visit http://gerrit.cloudera.org:8080/13708
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3b4dd7060c3977c4a943b2492008c1dd601402a2
Gerrit-Change-Number: 13708
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Mon, 24 Jun 2019 19:28:51 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8682: Add authorized proxy user/group test coverage with Ranger

2019-06-21 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13679 )

Change subject: IMPALA-8682: Add authorized proxy user/group test coverage with 
Ranger
..


Patch Set 5: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/13679
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If6f797600720e6432b85cac8f13afe8fa5624596
Gerrit-Change-Number: 13679
Gerrit-PatchSet: 5
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Fri, 21 Jun 2019 17:39:37 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8682: Add authorized proxy user/group test coverage with Ranger

2019-06-20 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13679 )

Change subject: IMPALA-8682: Add authorized proxy user/group test coverage with 
Ranger
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/13679/4/tests/authorization/test_authorized_proxy.py
File tests/authorization/test_authorized_proxy.py:

http://gerrit.cloudera.org:8080/#/c/13679/4/tests/authorization/test_authorized_proxy.py@195
PS4, Line 195: #
what are these commented-out lines?


http://gerrit.cloudera.org:8080/#/c/13679/4/tests/authorization/test_authorized_proxy.py@231
PS4, Line 231: # Try to user we are not authorized to delegate to.
typo?



--
To view, visit http://gerrit.cloudera.org:8080/13679
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If6f797600720e6432b85cac8f13afe8fa5624596
Gerrit-Change-Number: 13679
Gerrit-PatchSet: 4
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 20 Jun 2019 16:21:37 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Backport KUDU-2871 (part 1): disable TLS 1.3.

2019-06-20 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13689 )

Change subject: Backport KUDU-2871 (part 1): disable TLS 1.3.
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/13689
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iae77e06906e01d8442e0f767e7f920bd330cc5da
Gerrit-Change-Number: 13689
Gerrit-PatchSet: 1
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 20 Jun 2019 15:44:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7534. Handle invalidation races in CatalogdMetaProvider

2019-06-19 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13664 )

Change subject: IMPALA-7534. Handle invalidation races in CatalogdMetaProvider
..


Patch Set 5:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/13664/4/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java:

http://gerrit.cloudera.org:8080/#/c/13664/4/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@243
PS4, Line 243:   final AtomicInteger piggybackSuccessCountForTests = new 
AtomicInteger();
> nit: final
Done


http://gerrit.cloudera.org:8080/#/c/13664/4/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@250
PS4, Line 250:   final AtomicInteger piggybackExceptionCountForTests = new 
AtomicInteger();
> nit: final
Done


http://gerrit.cloudera.org:8080/#/c/13664/4/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@434
PS4, Line 434: returns an invalidation for the table name list
> just for my sake of understanding. Is the invalidate of table name done in
yea, because a table was created, the list of tables in that DB gets 
invalidated.



--
To view, visit http://gerrit.cloudera.org:8080/13664
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I70f377db88e204825a909389f28dc3451815235c
Gerrit-Change-Number: 13664
Gerrit-PatchSet: 5
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 19 Jun 2019 21:39:13 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7534. Handle invalidation races in CatalogdMetaProvider

2019-06-19 Thread Todd Lipcon (Code Review)
Hello Bharath Vissapragada, Vihang Karajgaonkar, Tim Armstrong, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13664

to look at the new patch set (#5).

Change subject: IMPALA-7534. Handle invalidation races in CatalogdMetaProvider
..

IMPALA-7534. Handle invalidation races in CatalogdMetaProvider

This handles a race condition in which a cache invalidation concurrent
with a cache load would potentially be skipped, causing out-of-date data
to persist in the cache. This would present itself as spurious "table
not found" errors.

A new test case triggers the issue reliably by injecting latency into
the metadata fetch RPC and running DDLs concurrently on the same
database across 8 threads. With the fix, the test passes reliably.

Another option to fix this might have been to switch to Caffeine instead
of Guava's loading cache. However, Caffeine requires Java 8, and
LocalCatalog is being backported to Impala 2.x which still can run on
Java 7. So, working around the Guava issue will make backporting (and
future backports) easier.

Change-Id: I70f377db88e204825a909389f28dc3451815235c
---
M be/src/exec/catalog-op-executor.cc
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/test/java/org/apache/impala/catalog/local/CatalogdMetaProviderTest.java
M tests/custom_cluster/test_local_catalog.py
4 files changed, 260 insertions(+), 22 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/13664/5
--
To view, visit http://gerrit.cloudera.org:8080/13664
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I70f377db88e204825a909389f28dc3451815235c
Gerrit-Change-Number: 13664
Gerrit-PatchSet: 5
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-7534. Handle invalidation races in CatalogdMetaProvider

2019-06-18 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13664 )

Change subject: IMPALA-7534. Handle invalidation races in CatalogdMetaProvider
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13664/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java:

http://gerrit.cloudera.org:8080/#/c/13664/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@455
PS2, Line 455: //load to be triggered, which sees the post-invalidated 
data in catalogd.
> Yeah, I mean even a unit test would be helpful. The logic seems simple enou
I verified that the new e2e test does cover these cases, but also added a 
specific targeted unit test along with some test-only counters.



--
To view, visit http://gerrit.cloudera.org:8080/13664
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I70f377db88e204825a909389f28dc3451815235c
Gerrit-Change-Number: 13664
Gerrit-PatchSet: 4
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 18 Jun 2019 22:44:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7534. Handle invalidation races in CatalogdMetaProvider

2019-06-18 Thread Todd Lipcon (Code Review)
Hello Bharath Vissapragada, Vihang Karajgaonkar, Tim Armstrong, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13664

to look at the new patch set (#4).

Change subject: IMPALA-7534. Handle invalidation races in CatalogdMetaProvider
..

IMPALA-7534. Handle invalidation races in CatalogdMetaProvider

This handles a race condition in which a cache invalidation concurrent
with a cache load would potentially be skipped, causing out-of-date data
to persist in the cache. This would present itself as spurious "table
not found" errors.

A new test case triggers the issue reliably by injecting latency into
the metadata fetch RPC and running DDLs concurrently on the same
database across 8 threads. With the fix, the test passes reliably.

Another option to fix this might have been to switch to Caffeine instead
of Guava's loading cache. However, Caffeine requires Java 8, and
LocalCatalog is being backported to Impala 2.x which still can run on
Java 7. So, working around the Guava issue will make backporting (and
future backports) easier.

Change-Id: I70f377db88e204825a909389f28dc3451815235c
---
M be/src/exec/catalog-op-executor.cc
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/test/java/org/apache/impala/catalog/local/CatalogdMetaProviderTest.java
M tests/custom_cluster/test_local_catalog.py
4 files changed, 243 insertions(+), 22 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/13664/4
--
To view, visit http://gerrit.cloudera.org:8080/13664
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I70f377db88e204825a909389f28dc3451815235c
Gerrit-Change-Number: 13664
Gerrit-PatchSet: 4
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-7534. Handle invalidation races in CatalogdMetaProvider

2019-06-18 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13664 )

Change subject: IMPALA-7534. Handle invalidation races in CatalogdMetaProvider
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/13664/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java:

http://gerrit.cloudera.org:8080/#/c/13664/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@442
PS2, Line 442: //
> We're not setting hit = true, so we're counting this case as a cache miss?
yea, I was wondering whether to add another separate metric for "piggybacked 
hits" or something, but wasn't sure it was worth it. What do you think?


http://gerrit.cloudera.org:8080/#/c/13664/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@455
PS2, Line 455: boolean hit = false;
> Do we have tests for these exceptional code paths? Including the case when
I'm guessing this gets covered by our stress test which does concurrent queries 
and invalidates. I'll double check that this is actually covered and add 
another stress test if not.



--
To view, visit http://gerrit.cloudera.org:8080/13664
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I70f377db88e204825a909389f28dc3451815235c
Gerrit-Change-Number: 13664
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 18 Jun 2019 20:43:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7534. Handle invalidation races in CatalogdMetaProvider

2019-06-18 Thread Todd Lipcon (Code Review)
Hello Bharath Vissapragada, Vihang Karajgaonkar, Tim Armstrong, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13664

to look at the new patch set (#3).

Change subject: IMPALA-7534. Handle invalidation races in CatalogdMetaProvider
..

IMPALA-7534. Handle invalidation races in CatalogdMetaProvider

This handles a race condition in which a cache invalidation concurrent
with a cache load would potentially be skipped, causing out-of-date data
to persist in the cache. This would present itself as spurious "table
not found" errors.

A new test case triggers the issue reliably by injecting latency into
the metadata fetch RPC and running DDLs concurrently on the same
database across 8 threads. With the fix, the test passes reliably.

Another option to fix this might have been to switch to Caffeine instead
of Guava's loading cache. However, Caffeine requires Java 8, and
LocalCatalog is being backported to Impala 2.x which still can run on
Java 7. So, working around the Guava issue will make backporting (and
future backports) easier.

Change-Id: I70f377db88e204825a909389f28dc3451815235c
---
M be/src/exec/catalog-op-executor.cc
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M tests/custom_cluster/test_local_catalog.py
3 files changed, 159 insertions(+), 22 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/13664/3
--
To view, visit http://gerrit.cloudera.org:8080/13664
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I70f377db88e204825a909389f28dc3451815235c
Gerrit-Change-Number: 13664
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-7534. Handle invalidation races in CatalogdMetaProvider

2019-06-18 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13664 )

Change subject: IMPALA-7534. Handle invalidation races in CatalogdMetaProvider
..


Patch Set 2:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/13664/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java:

http://gerrit.cloudera.org:8080/#/c/13664/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@427
PS2, Line 427:   private  ValueType 
loadWithCaching(String itemString,
> I think this could use a method comment since the logic is non-trivial (pro
Ack


http://gerrit.cloudera.org:8080/#/c/13664/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@435
PS2, Line 435:   Object inCache = cache_.get(key, () -> f);
> Just for my understanding, here we are relying on the inherent thread-safet
yea, will add a comment


http://gerrit.cloudera.org:8080/#/c/13664/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@573
PS2, Line 573: Uninterruptibles.sleepUninterruptibly(10, 
TimeUnit.MILLISECONDS);
> Remove?
woops!


http://gerrit.cloudera.org:8080/#/c/13664/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@687
PS2, Line 687: RuntimeException
> Does this mean the query is not eventually retried?
Do you know if we ever interrupt threads? Otherwise I don't think this 
introduces any actual possibility for an exception.



--
To view, visit http://gerrit.cloudera.org:8080/13664
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I70f377db88e204825a909389f28dc3451815235c
Gerrit-Change-Number: 13664
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 18 Jun 2019 19:21:32 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8584: Add cookie support to the HTTP HS2 server

2019-06-18 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13672 )

Change subject: IMPALA-8584: Add cookie support to the HTTP HS2 server
..


Patch Set 2:

(5 comments)

didn't look in detail yet, a couple high-level questions first

http://gerrit.cloudera.org:8080/#/c/13672/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13672/2//COMMIT_MSG@7
PS2, Line 7: IMPALA-8584: Add cookie support to the HTTP HS2 server
Some questions about the design here:

- by persisting the cookies in a bimap, what happens when you have multiple 
clients separately authenticating as the same user? It seems like you'd end up 
handing the same UUID cookie to all of the remote users, and then they'd all 
expire at the same time, causing them to all have to go back to LDAP to 
reauthenticate simultaneously, which might be problematic for a large number of 
clients. It might be better to just use unique session IDs, rather than 
maintaining the reverse mapping from username->uuid.

- Similar question: if Impala is behind a load balancer of some kind, when a 
user reconnects, they'll use the same cookie to talk to a different impalad. 
That other impalad won't find the cookie and will force them to re-auth each 
time they switch backends I guess? In typical deployments, do people reconnect 
frequently through the load balancer or tend to use sticky sessions? An 
alternative here would be to use HMAC signatures in the cookie, using a secret 
shared via the statestore or somesuch.

Any idea what Hive does, for comparison? Would be good to document a couple of 
these design points in the commit message or the JIRA.


http://gerrit.cloudera.org:8080/#/c/13672/2/be/src/rpc/authentication.cc
File be/src/rpc/authentication.cc:

http://gerrit.cloudera.org:8080/#/c/13672/2/be/src/rpc/authentication.cc@546
PS2, Line 546: random_device
Any idea what the source of this entropy is? Is it subject to entropy deletion 
DOS attacks? (or entropy depletion associated perf problems?)


http://gerrit.cloudera.org:8080/#/c/13672/2/be/src/rpc/authentication.cc@562
PS2, Line 562:   cookie = Substitute("impala.hs2.auth=$0", PrintId(cookie_id));
Do we want any expiration on the cookie? Or is the default (session cookie) 
appropriate?

Also, I would guess we want this to be an http-only and secure cookie (assuming 
https is enabled)? Might be some other settings that would be a good idea for 
security: see https://odino.org/security-hardening-http-cookies/ which has a 
bunch of recommendations.


http://gerrit.cloudera.org:8080/#/c/13672/2/be/src/rpc/cookie-mgr.cc
File be/src/rpc/cookie-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/13672/2/be/src/rpc/cookie-mgr.cc@61
PS2, Line 61:   username_to_cookie_.emplace(std::piecewise_construct, 
std::forward_as_tuple(username),
hrm, this is odd. YOu can't just do .emplace(username, Cookie(cookie, 
UnixMillis()); here?


http://gerrit.cloudera.org:8080/#/c/13672/2/be/src/transport/THttpServer.cpp
File be/src/transport/THttpServer.cpp:

http://gerrit.cloudera.org:8080/#/c/13672/2/be/src/transport/THttpServer.cpp@104
PS2, Line 104: cookieValue_ = string(value);
the Cookie header can pass multiple cookies, separated by ';'. It doesn't seem 
like that's getting handled here? It might also be possible to pass multiple 
Cookie headers, but not sure about that one.



--
To view, visit http://gerrit.cloudera.org:8080/13672
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I647c06f94ef91aa3b6413e91576c4ec506ed57f4
Gerrit-Change-Number: 13672
Gerrit-PatchSet: 2
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Tue, 18 Jun 2019 19:19:31 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8667. Remove --pull incremental stats flag

2019-06-18 Thread Todd Lipcon (Code Review)
Hello Bharath Vissapragada,

I'd like you to do a code review. Please visit

http://gerrit.cloudera.org:8080/13671

to review the following change.


Change subject: IMPALA-8667. Remove --pull_incremental_stats flag
..

IMPALA-8667. Remove --pull_incremental_stats flag

This flag was added as a "chicken bit" -- so we could disable the new
feature if we had some problems with it. It's been out in the wild for a
number of months and we haven't seen any such problems, so at this point
let's stop maintaining the old code path.

Change-Id: I8878fcd8a2462963c7db3183a003bb9816dda8f9
---
M be/src/common/global-flags.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M docs/topics/impala_metadata.xml
M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/test/java/org/apache/impala/catalog/CatalogTest.java
M tests/common/custom_cluster_test_suite.py
M tests/conftest.py
D tests/custom_cluster/test_incremental_stats.py
15 files changed, 36 insertions(+), 180 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/13671/1
--
To view, visit http://gerrit.cloudera.org:8080/13671
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I8878fcd8a2462963c7db3183a003bb9816dda8f9
Gerrit-Change-Number: 13671
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 


[Impala-ASF-CR] IMPALA-7534. Handle invalidation races in CatalogdMetaProvider

2019-06-18 Thread Todd Lipcon (Code Review)
Hello Bharath Vissapragada, Vihang Karajgaonkar, Tim Armstrong, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13664

to look at the new patch set (#2).

Change subject: IMPALA-7534. Handle invalidation races in CatalogdMetaProvider
..

IMPALA-7534. Handle invalidation races in CatalogdMetaProvider

This handles a race condition in which a cache invalidation concurrent
with a cache load would potentially be skipped, causing out-of-date data
to persist in the cache. This would present itself as spurious "table
not found" errors.

A new test case triggers the issue reliably by injecting latency into
the metadata fetch RPC and running DDLs concurrently on the same
database across 8 threads. With the fix, the test passes reliably.

Change-Id: I70f377db88e204825a909389f28dc3451815235c
---
M be/src/exec/catalog-op-executor.cc
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M tests/custom_cluster/test_local_catalog.py
3 files changed, 133 insertions(+), 23 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/13664/2
--
To view, visit http://gerrit.cloudera.org:8080/13664
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I70f377db88e204825a909389f28dc3451815235c
Gerrit-Change-Number: 13664
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] WIP: IMPALA-7434. Handle invalidation races in CatalogdMetaProvider

2019-06-18 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13664 )

Change subject: WIP: IMPALA-7434. Handle invalidation races in 
CatalogdMetaProvider
..


Patch Set 1:

> Patch Set 1:
>
> I'm trying to understand the issue here. Apart from the versioning problem, 
> is the thread-safety issue in loadWithCaching() described in 
> https://softwaremill.com/race-condition-cache-guava-caffeine/ not a problem 
> here?

That's exactly the problem this is fixing. It ensures that an invalidate 
concurrent with loadWithCaching() will cause the concurrent load to _not_ write 
its result to the cache.

In terms of multiple threads calling loadWithCaching() at the same time, I 
think our code appropriately ensures that the first thread who gets there 
inserts a Future, and the second thread piggy-backs onto that same future.


--
To view, visit http://gerrit.cloudera.org:8080/13664
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I70f377db88e204825a909389f28dc3451815235c
Gerrit-Change-Number: 13664
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 18 Jun 2019 17:04:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8658: Populate missing Ranger audit fields

2019-06-18 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13601 )

Change subject: IMPALA-8658: Populate missing Ranger audit fields
..


Patch Set 8:

> Patch Set 8:
>
> > Patch Set 7: Verified-1
> >
> > Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/4483/
>
> InetAddresses.forName() only takes IP address and not a hostname. I'm 
> switching back to InetAddress.getByName(host) to get an IP address of a host.

That's exactly my confusion, though -- how are we getting something other than 
an IP address into this field? Are we doing reverse lookups somewhere on the 
C++ side? I'm afraid to be doing DNS lookups in the audit path here and 
potentially fail or slow down queries if DNS is unreliable (hint: DNS is always 
unreliable)


--
To view, visit http://gerrit.cloudera.org:8080/13601
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I167bc35411ad9b4164c292077ff082671967cff8
Gerrit-Change-Number: 13601
Gerrit-PatchSet: 8
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Tue, 18 Jun 2019 08:18:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8542. Add an access trace for the data cache

2019-06-18 Thread Todd Lipcon (Code Review)
Todd Lipcon has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/13425 )

Change subject: IMPALA-8542. Add an access trace for the data cache
..

IMPALA-8542. Add an access trace for the data cache

This adds a relatively simple JSON-formatted access trace for the data
cache feature. Each partition stores a trace file named 'trace.txt',
with each line representing a hit, miss, or store into the cache.

The trace is collected using the kudu::AsyncLogger class which handles
buffering and deferring the actual IO to a background thread.

By default, the full cache key info is written to the trace (including
the file paths), but a flag can enable anonymization (128-bit
city-hashing) of the file names in case any user would like to capture a
trace to be shared publically without divulging their table names.

Change-Id: I2302c19abb5db19f1d3d1cd727a82977a9e2ba9c
Reviewed-on: http://gerrit.cloudera.org:8080/13425
Tested-by: Impala Public Jenkins 
Reviewed-by: Michael Ho 
---
M be/src/runtime/io/data-cache-test.cc
M be/src/runtime/io/data-cache.cc
M be/src/runtime/io/data-cache.h
3 files changed, 297 insertions(+), 4 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Michael Ho: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/13425
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I2302c19abb5db19f1d3d1cd727a82977a9e2ba9c
Gerrit-Change-Number: 13425
Gerrit-PatchSet: 9
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] fe: improve logging for metadata loading

2019-06-17 Thread Todd Lipcon (Code Review)
Todd Lipcon has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/13463 )

Change subject: fe: improve logging for metadata loading
..

fe: improve logging for metadata loading

This annotates various catalogd-internal calls associated with
refreshing and loading metadata with a 'String why' parameter, useful
for logging. This can help understand why a particular piece of metadata
was loaded, and in the case of REFRESH calls, who issued the original
refresh.

Additionally, some of the log statements were improved to add a bit of
extra detail such as the list of partitions being refreshed
(abbreviated) and whether or not the table schema is being refreshed.

Change-Id: I4d90a5f075b05d2dc96692b3349abe35ce24b8b8
Reviewed-on: http://gerrit.cloudera.org:8080/13463
Tested-by: Impala Public Jenkins 
Reviewed-by: Vihang Karajgaonkar 
---
M fe/src/main/java/org/apache/impala/analysis/ResetMetadataStmt.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/DataSourceTable.java
M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java
M fe/src/main/java/org/apache/impala/catalog/View.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/catalog/CatalogObjectToFromThriftTest.java
M fe/src/test/java/org/apache/impala/catalog/CatalogTest.java
M fe/src/test/java/org/apache/impala/catalog/CatalogdTableInvalidatorTest.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/testutil/ImpaladTestCatalog.java
18 files changed, 207 insertions(+), 144 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Vihang Karajgaonkar: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/13463
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I4d90a5f075b05d2dc96692b3349abe35ce24b8b8
Gerrit-Change-Number: 13463
Gerrit-PatchSet: 6
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] WIP: IMPALA-7434. Handle invalidation races in CatalogdMetaProvider

2019-06-17 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13664 )

Change subject: WIP: IMPALA-7434. Handle invalidation races in 
CatalogdMetaProvider
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/13664/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13664/1//COMMIT_MSG@7
PS1, Line 7: WIP: IMPALA-7434. Handle invalidation races in CatalogdMetaProvider
oops this should say IMPALA-7534.


http://gerrit.cloudera.org:8080/#/c/13664/1/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java:

http://gerrit.cloudera.org:8080/#/c/13664/1/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@240
PS1, Line 240: 7543
also got it wrong here :)



--
To view, visit http://gerrit.cloudera.org:8080/13664
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I70f377db88e204825a909389f28dc3451815235c
Gerrit-Change-Number: 13664
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Tue, 18 Jun 2019 00:09:26 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] WIP: IMPALA-7434. Handle invalidation races in CatalogdMetaProvider

2019-06-17 Thread Todd Lipcon (Code Review)
Hello Bharath Vissapragada,

I'd like you to do a code review. Please visit

http://gerrit.cloudera.org:8080/13664

to review the following change.


Change subject: WIP: IMPALA-7434. Handle invalidation races in 
CatalogdMetaProvider
..

WIP: IMPALA-7434. Handle invalidation races in CatalogdMetaProvider

This handles a race condition in which a cache invalidation concurrent
with a cache load would potentially be skipped, causing out-of-date data
to persist in the cache. This would present itself as spurious "table
not found" errors.

A new test case triggers the issue reliably by injecting latency into
the metadata fetch RPC and running DDLs concurrently on the same
database across 8 threads. With the fix, the test passes reliably.

WIP because it probably needs a little more cleanup and testing, but
posting for early review

Change-Id: I70f377db88e204825a909389f28dc3451815235c
---
M be/src/exec/catalog-op-executor.cc
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M tests/custom_cluster/test_local_catalog.py
3 files changed, 113 insertions(+), 23 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/13664/1
--
To view, visit http://gerrit.cloudera.org:8080/13664
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I70f377db88e204825a909389f28dc3451815235c
Gerrit-Change-Number: 13664
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 


[Impala-ASF-CR] IMPALA-8671: Do not re-create RangerAuthorizationChecker instance on catalog update

2019-06-17 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13663 )

Change subject: IMPALA-8671: Do not re-create RangerAuthorizationChecker 
instance on catalog update
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/13663
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iabd5c1b5c7cafc7a03681def0c0f6f775d690c41
Gerrit-Change-Number: 13663
Gerrit-PatchSet: 4
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Mon, 17 Jun 2019 20:33:01 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8671: Do not re-create RangerAuthorizationChecker instance on catalog update

2019-06-17 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13663 )

Change subject: IMPALA-8671: Do not re-create RangerAuthorizationChecker 
instance on catalog update
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/13663/3/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationFactory.java
File 
fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationFactory.java:

http://gerrit.cloudera.org:8080/#/c/13663/3/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationFactory.java@79
PS3, Line 79: // Ranger plugin.
do we need to reinit the ranger plugin for explicit REFRESH AUTHORIZATION calls?


http://gerrit.cloudera.org:8080/#/c/13663/3/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationFactory.java@81
PS3, Line 81: authzChecker_ = new RangerAuthorizationChecker(authzConfig_);
this isn't safe under concurrency unless authzChecker is volatile. Should we 
just initialize it in the constructor?



--
To view, visit http://gerrit.cloudera.org:8080/13663
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iabd5c1b5c7cafc7a03681def0c0f6f775d690c41
Gerrit-Change-Number: 13663
Gerrit-PatchSet: 3
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Mon, 17 Jun 2019 20:15:04 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8671: Do not re-create RangerAuthorizationChecker instance on catalog update

2019-06-17 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13663 )

Change subject: IMPALA-8671: Do not re-create RangerAuthorizationChecker 
instance on catalog update
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13663/2/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationFactory.java
File 
fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationFactory.java:

http://gerrit.cloudera.org:8080/#/c/13663/2/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationFactory.java@39
PS2, Line 39: authzChecker_
where does this ever get set?



--
To view, visit http://gerrit.cloudera.org:8080/13663
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iabd5c1b5c7cafc7a03681def0c0f6f775d690c41
Gerrit-Change-Number: 13663
Gerrit-PatchSet: 2
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Mon, 17 Jun 2019 20:09:23 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] [WIP] IMPALA-8630: Hash the full path when calculating consistent remote placement

2019-06-17 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13545 )

Change subject: [WIP] IMPALA-8630: Hash the full path when calculating 
consistent remote placement
..


Patch Set 3: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/13545
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I46c739fc31af539af2b3509e2a161f4e29f44d7b
Gerrit-Change-Number: 13545
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Mon, 17 Jun 2019 17:59:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8617: Add support for lz4 in parquet

2019-06-17 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13582 )

Change subject: IMPALA-8617: Add support for lz4 in parquet
..


Patch Set 7: Code-Review+2

looks good. Thanks!


--
To view, visit http://gerrit.cloudera.org:8080/13582
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia6850a39ef3f1e0e7ba48e08eef1d4f7cbb74d0c
Gerrit-Change-Number: 13582
Gerrit-PatchSet: 7
Gerrit-Owner: Abhishek Rawat 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Mon, 17 Jun 2019 17:44:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8658: Populate missing Ranger audit fields

2019-06-17 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13601 )

Change subject: IMPALA-8658: Populate missing Ranger audit fields
..


Patch Set 5:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/13601/5/fe/src/main/java/org/apache/impala/authorization/AuthorizationChecker.java
File fe/src/main/java/org/apache/impala/authorization/AuthorizationChecker.java:

http://gerrit.cloudera.org:8080/#/c/13601/5/fe/src/main/java/org/apache/impala/authorization/AuthorizationChecker.java@43
PS5, Line 43:* @param sqlStmt the SQL statement to be logged for auditing
nit: add a @param for sessionState?


http://gerrit.cloudera.org:8080/#/c/13601/5/fe/src/main/java/org/apache/impala/authorization/ranger/RangerBufferAuditHandler.java
File 
fe/src/main/java/org/apache/impala/authorization/ranger/RangerBufferAuditHandler.java:

http://gerrit.cloudera.org:8080/#/c/13601/5/fe/src/main/java/org/apache/impala/authorization/ranger/RangerBufferAuditHandler.java@59
PS5, Line 59: clusterName_ = clusterName;
should you Preconditions.checkNotNull() for these? or 
Objects.firstNonNull(clusterName, "") or something? The above comment seems to 
indicate that if they end up null then ranger will hit NPE


http://gerrit.cloudera.org:8080/#/c/13601/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/13601/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@367
PS5, Line 367: requestingUser, ddlRequest.getHeader
it seems like 'requestingUser' always is being constructed out of 
ddlRequest.getHeader().getRequesting_user(), right? Given that, can we just 
pass ddlRequest.getHeader() and not the redundant username here?


http://gerrit.cloudera.org:8080/#/c/13601/5/fe/src/main/java/org/apache/impala/util/TSessionStateUtil.java
File fe/src/main/java/org/apache/impala/util/TSessionStateUtil.java:

http://gerrit.cloudera.org:8080/#/c/13601/5/fe/src/main/java/org/apache/impala/util/TSessionStateUtil.java@44
PS5, Line 44: 
InetAddresses.forString(session.getNetwork_address().getHostname())
: .getHostAddress();
hrm, now I'm wondering why we can't just use getNetowkr_address().getHostname() 
directly here?



--
To view, visit http://gerrit.cloudera.org:8080/13601
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I167bc35411ad9b4164c292077ff082671967cff8
Gerrit-Change-Number: 13601
Gerrit-PatchSet: 5
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Mon, 17 Jun 2019 17:30:03 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8599: Create a Maven module for query event hook API

2019-06-17 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13653 )

Change subject: IMPALA-8599: Create a Maven module for query event hook API
..


Patch Set 2: Code-Review+2

(1 comment)

Looks fine. We should probably reorganize all this stuff so that the different 
modules are submodules within a top-level directory with a pom to tie them 
together (multi-module maven project) instead of depending on each other as 
separate projects tied together by cmake, but I guess that's out of scope for 
this patch. Any interest in taking that on?

http://gerrit.cloudera.org:8080/#/c/13653/2/query-event-hook-api/pom.xml
File query-event-hook-api/pom.xml:

http://gerrit.cloudera.org:8080/#/c/13653/2/query-event-hook-api/pom.xml@51
PS2, Line 51:   
: org.apache.maven.plugins
: maven-compiler-pluginhttp://gerrit.cloudera.org:8080/13653
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I84c422d83c19b75c3d1d7a772b971f4f7704d44c
Gerrit-Change-Number: 13653
Gerrit-PatchSet: 2
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Mon, 17 Jun 2019 17:19:05 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] [WIP] IMPALA-8630: Hash the full path when calculating consistent remote placement

2019-06-17 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13545 )

Change subject: [WIP] IMPALA-8630: Hash the full path when calculating 
consistent remote placement
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13545/2/common/fbs/CatalogObjects.fbs
File common/fbs/CatalogObjects.fbs:

http://gerrit.cloudera.org:8080/#/c/13545/2/common/fbs/CatalogObjects.fbs@81
PS2, Line 81:   // A hash of the absolute file path
> I wonder if this even makes the struct bigger, given padding. Or if we can
FWIW we currently use a flatbuffer 'table' instead of a 'struct' which makes 
all the fields optional, and ends up taking extra space since there's some kind 
of vtable entry for each field or something. Given that I don't think there's 
padding involved. But we could probably get a win by moving all of the 
non-optional fields into a struct



--
To view, visit http://gerrit.cloudera.org:8080/13545
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I46c739fc31af539af2b3509e2a161f4e29f44d7b
Gerrit-Change-Number: 13545
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Mon, 17 Jun 2019 15:41:54 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] [WIP] IMPALA-8630: Hash the full path when calculating consistent remote placement

2019-06-14 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13545 )

Change subject: [WIP] IMPALA-8630: Hash the full path when calculating 
consistent remote placement
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/13545/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13545/2//COMMIT_MSG@27
PS2, Line 27: The alternative is to construct the DescriptorTbl in the scheduler
the advantage of the alternative is saving some extra memory per file 
descriptor. The FileDescs are persistent per-file so this can add up to many MB 
of increased consumption on a big catalogd. That said, with the recent work to 
reduce consumption and add eviction on catalogd, maybe it's not such a big deal 
as it used to be (eg IMPALA-7406 saved 100 bytes per FileFbDesc and here we're 
only losing a little bit of that gain)


http://gerrit.cloudera.org:8080/#/c/13545/2/be/src/scheduling/scheduler.cc
File be/src/scheduling/scheduler.cc:

http://gerrit.cloudera.org:8080/#/c/13545/2/be/src/scheduling/scheduler.cc@801
PS2, Line 801:   uint32_t hash = 
HashUtil::Hash(_file_split->full_path_hash,
 :   sizeof(hdfs_file_split->full_path_hash), 0);
why hash the hash instead of just making the hash here be a uint32_t directly?


http://gerrit.cloudera.org:8080/#/c/13545/2/common/fbs/CatalogObjects.fbs
File common/fbs/CatalogObjects.fbs:

http://gerrit.cloudera.org:8080/#/c/13545/2/common/fbs/CatalogObjects.fbs@82
PS2, Line 82: int
should be a uint?



--
To view, visit http://gerrit.cloudera.org:8080/13545
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I46c739fc31af539af2b3509e2a161f4e29f44d7b
Gerrit-Change-Number: 13545
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Fri, 14 Jun 2019 23:20:50 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] fe: improve logging for metadata loading

2019-06-14 Thread Todd Lipcon (Code Review)
Hello Vihang Karajgaonkar, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13463

to look at the new patch set (#4).

Change subject: fe: improve logging for metadata loading
..

fe: improve logging for metadata loading

This annotates various catalogd-internal calls associated with
refreshing and loading metadata with a 'String why' parameter, useful
for logging. This can help understand why a particular piece of metadata
was loaded, and in the case of REFRESH calls, who issued the original
refresh.

Additionally, some of the log statements were improved to add a bit of
extra detail such as the list of partitions being refreshed
(abbreviated) and whether or not the table schema is being refreshed.

Change-Id: I4d90a5f075b05d2dc96692b3349abe35ce24b8b8
---
M fe/src/main/java/org/apache/impala/analysis/ResetMetadataStmt.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/DataSourceTable.java
M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java
M fe/src/main/java/org/apache/impala/catalog/View.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/catalog/CatalogObjectToFromThriftTest.java
M fe/src/test/java/org/apache/impala/catalog/CatalogTest.java
M fe/src/test/java/org/apache/impala/catalog/CatalogdTableInvalidatorTest.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/testutil/ImpaladTestCatalog.java
18 files changed, 207 insertions(+), 144 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/13463/4
--
To view, visit http://gerrit.cloudera.org:8080/13463
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4d90a5f075b05d2dc96692b3349abe35ce24b8b8
Gerrit-Change-Number: 13463
Gerrit-PatchSet: 4
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] fe: improve logging for metadata loading

2019-06-14 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13463 )

Change subject: fe: improve logging for metadata loading
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/13463/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/13463/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@a948
PS3, Line 948:
> having this and the one after fetchAllPartitions call seems useful to quick
OK. Planning on working on a nicer patch to expose these timings as more 
structured information and even back to the profile of the user who submitted 
it, but will just add these back for now.


http://gerrit.cloudera.org:8080/#/c/13463/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@103
PS3, Line 103:
> nit, unnecessary new line
Done



--
To view, visit http://gerrit.cloudera.org:8080/13463
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4d90a5f075b05d2dc96692b3349abe35ce24b8b8
Gerrit-Change-Number: 13463
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 14 Jun 2019 22:58:30 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8542. Add an access trace for the data cache

2019-06-14 Thread Todd Lipcon (Code Review)
Hello Michael Ho, Lars Volker, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13425

to look at the new patch set (#7).

Change subject: IMPALA-8542. Add an access trace for the data cache
..

IMPALA-8542. Add an access trace for the data cache

This adds a relatively simple JSON-formatted access trace for the data
cache feature. Each partition stores a trace file named 'trace.txt',
with each line representing a hit, miss, or store into the cache.

The trace is collected using the kudu::AsyncLogger class which handles
buffering and deferring the actual IO to a background thread.

By default, the full cache key info is written to the trace (including
the file paths), but a flag can enable anonymization (128-bit
city-hashing) of the file names in case any user would like to capture a
trace to be shared publically without divulging their table names.

Change-Id: I2302c19abb5db19f1d3d1cd727a82977a9e2ba9c
---
M be/src/runtime/io/data-cache-test.cc
M be/src/runtime/io/data-cache.cc
M be/src/runtime/io/data-cache.h
3 files changed, 297 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/25/13425/7
--
To view, visit http://gerrit.cloudera.org:8080/13425
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2302c19abb5db19f1d3d1cd727a82977a9e2ba9c
Gerrit-Change-Number: 13425
Gerrit-PatchSet: 7
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] IMPALA-8459. LocalCatalog: Allow dropping tables that fail to load

2019-06-14 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13557 )

Change subject: IMPALA-8459. LocalCatalog: Allow dropping tables that fail to 
load
..


Patch Set 3:

Ran a build locally with LocalCatalog enabled. I got two failures but they were 
both "Exception: ("DB {0} didn't show up after {1}s", 'hms_sanity_db', 30)" 
which is tracked under another JIRA.


--
To view, visit http://gerrit.cloudera.org:8080/13557
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia85d6b10669866dcc9f2bf7385a4775e483dd6b0
Gerrit-Change-Number: 13557
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 14 Jun 2019 22:19:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8542. Add an access trace for the data cache

2019-06-13 Thread Todd Lipcon (Code Review)
Hello Michael Ho, Lars Volker, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13425

to look at the new patch set (#4).

Change subject: IMPALA-8542. Add an access trace for the data cache
..

IMPALA-8542. Add an access trace for the data cache

This adds a relatively simple JSON-formatted access trace for the data
cache feature. Each partition stores a trace file named 'trace.txt',
with each line representing a hit, miss, or store into the cache.

The trace is collected using the kudu::AsyncLogger class which handles
buffering and deferring the actual IO to a background thread.

By default, the full cache key info is written to the trace (including
the file paths), but a flag can enable anonymization (128-bit
city-hashing) of the file names in case any user would like to capture a
trace to be shared publically without divulging their table names.

Change-Id: I2302c19abb5db19f1d3d1cd727a82977a9e2ba9c
---
M be/src/runtime/io/data-cache-test.cc
M be/src/runtime/io/data-cache.cc
M be/src/runtime/io/data-cache.h
3 files changed, 296 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/25/13425/4
--
To view, visit http://gerrit.cloudera.org:8080/13425
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2302c19abb5db19f1d3d1cd727a82977a9e2ba9c
Gerrit-Change-Number: 13425
Gerrit-PatchSet: 4
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] IMPALA-8542. Add an access trace for the data cache

2019-06-13 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13425 )

Change subject: IMPALA-8542. Add an access trace for the data cache
..


Patch Set 5:

r4 fixes the comments, r5 is a rebase (had a trivial conflict to deal with from 
the mtime fix adding some DCHECKs in the key encoding path)


--
To view, visit http://gerrit.cloudera.org:8080/13425
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2302c19abb5db19f1d3d1cd727a82977a9e2ba9c
Gerrit-Change-Number: 13425
Gerrit-PatchSet: 5
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Fri, 14 Jun 2019 00:39:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8542. Add an access trace for the data cache

2019-06-13 Thread Todd Lipcon (Code Review)
Hello Michael Ho, Lars Volker, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13425

to look at the new patch set (#5).

Change subject: IMPALA-8542. Add an access trace for the data cache
..

IMPALA-8542. Add an access trace for the data cache

This adds a relatively simple JSON-formatted access trace for the data
cache feature. Each partition stores a trace file named 'trace.txt',
with each line representing a hit, miss, or store into the cache.

The trace is collected using the kudu::AsyncLogger class which handles
buffering and deferring the actual IO to a background thread.

By default, the full cache key info is written to the trace (including
the file paths), but a flag can enable anonymization (128-bit
city-hashing) of the file names in case any user would like to capture a
trace to be shared publically without divulging their table names.

Change-Id: I2302c19abb5db19f1d3d1cd727a82977a9e2ba9c
---
M be/src/runtime/io/data-cache-test.cc
M be/src/runtime/io/data-cache.cc
M be/src/runtime/io/data-cache.h
3 files changed, 296 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/25/13425/5
--
To view, visit http://gerrit.cloudera.org:8080/13425
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2302c19abb5db19f1d3d1cd727a82977a9e2ba9c
Gerrit-Change-Number: 13425
Gerrit-PatchSet: 5
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] IMPALA-8542. Add an access trace for the data cache

2019-06-13 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13425 )

Change subject: IMPALA-8542. Add an access trace for the data cache
..


Patch Set 3:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/13425/2/be/src/runtime/io/data-cache.cc
File be/src/runtime/io/data-cache.cc:

http://gerrit.cloudera.org:8080/#/c/13425/2/be/src/runtime/io/data-cache.cc@656
PS2, Line 656: // Limit the write concurrency to avoid blocking the caller 
(which could be calling
> I think we usually go with "start the arguments on a new line indented by f
Done


http://gerrit.cloudera.org:8080/#/c/13425/2/be/src/runtime/io/data-cache.cc@870
PS2, Line 870: ");
> We usually use all-caps for constants
Done


http://gerrit.cloudera.org:8080/#/c/13425/2/be/src/runtime/io/data-cache.cc@874
PS2, Line 874: case STORE: jw.String("S"); break;
> see above, google style guide disagrees
Done


http://gerrit.cloudera.org:8080/#/c/13425/3/be/src/runtime/io/data-cache.cc
File be/src/runtime/io/data-cache.cc:

http://gerrit.cloudera.org:8080/#/c/13425/3/be/src/runtime/io/data-cache.cc@88
PS3, Line 88: "(Advanced) Collect a trace of all lookups in the 
data cache.");
> nit: the indent here and other places in this change seems to be different
Done


http://gerrit.cloudera.org:8080/#/c/13425/3/be/src/runtime/io/data-cache.cc@98
PS3, Line 98: impala-cache-trace.txt
> The data cache by default removes all files with names containing prefix CA
it currently creates the trace file with open() flags to truncate the existing 
file if there is one, so it's essentially the same behavior. If a user wants to 
archive them before restarting, I think it's up to them. Is that what you were 
asking?


http://gerrit.cloudera.org:8080/#/c/13425/3/be/src/runtime/io/data-cache.cc@116
PS3, Line 116: if (file_) Flush();
> Does it make sense to also call file_->Close() here or does the d'tor take
dtor handles it


http://gerrit.cloudera.org:8080/#/c/13425/3/be/src/runtime/io/data-cache.cc@148
PS3, Line 148:
> nit: unnecessary blank line.
Done


http://gerrit.cloudera.org:8080/#/c/13425/3/be/src/runtime/io/data-cache.cc@186
PS3, Line 186:   unique_ptr underlying_logger_;
 :   unique_ptr logger_;
> Would be nice to briefly document these variables.
Done


http://gerrit.cloudera.org:8080/#/c/13425/3/be/src/runtime/io/data-cache.cc@498
PS3, Line 498:   if (FLAGS_data_cache_enable_tracing) {
 : tracer_.reset(new Tracer(path_ + "/" + TRACE_FILE_NAME));
 : RETURN_IF_ERROR(tracer_->Start());
 :   }
> I wonder how this may affect the available storage space allocated for the
yea, I didn't want to add the complexity of accounting for the trace file, 
since as you said, this is meant for experimentation.


http://gerrit.cloudera.org:8080/#/c/13425/3/be/src/runtime/io/data-cache.cc@887
PS3, Line 887: char buf[BUF_LEN];
> nit: same name as 'buf' declared outside of the scope. While functionally c
Done



--
To view, visit http://gerrit.cloudera.org:8080/13425
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2302c19abb5db19f1d3d1cd727a82977a9e2ba9c
Gerrit-Change-Number: 13425
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Fri, 14 Jun 2019 00:30:26 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8583: Add metrics for Basic auth

2019-06-13 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13640 )

Change subject: IMPALA-8583: Add metrics for Basic auth
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/13640
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6d1b1bcc093e4b00802bc108ff4d8d4e2cdfd88f
Gerrit-Change-Number: 13640
Gerrit-PatchSet: 3
Gerrit-Owner: Thomas Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Fri, 14 Jun 2019 00:19:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8658: Populate missing Ranger audit fields

2019-06-13 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13601 )

Change subject: IMPALA-8658: Populate missing Ranger audit fields
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13601/3/fe/src/main/java/org/apache/impala/util/TSessionStateUtil.java
File fe/src/main/java/org/apache/impala/util/TSessionStateUtil.java:

http://gerrit.cloudera.org:8080/#/c/13601/3/fe/src/main/java/org/apache/impala/util/TSessionStateUtil.java@47
PS3, Line 47: return 
InetAddress.getByName(session.getNetwork_address().getHostname())
> Ranger wants it to be an IP address, although technically it's just a strin
I guess I'm wondering: would this ever _not_ be a dotted-decimal address? I'm 
guessing we pick it up from the inbound socket in our thrift server. I'm afraid 
of accidentally doing resolution here. Maybe we should use 
https://google.github.io/guava/releases/20.0/api/docs/com/google/common/net/InetAddresses.html#forString-java.lang.String-
 here?



--
To view, visit http://gerrit.cloudera.org:8080/13601
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I167bc35411ad9b4164c292077ff082671967cff8
Gerrit-Change-Number: 13601
Gerrit-PatchSet: 3
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Fri, 14 Jun 2019 00:16:12 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8459. LocalCatalog: Allow dropping tables that fail to load

2019-06-13 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13557 )

Change subject: IMPALA-8459. LocalCatalog: Allow dropping tables that fail to 
load
..


Patch Set 1:

(4 comments)

r2 fixes the comments. Will rebase and run a full test run locally as suggested 
by Tim

http://gerrit.cloudera.org:8080/#/c/13557/1/fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java
File fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java:

http://gerrit.cloudera.org:8080/#/c/13557/1/fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java@40
PS1, Line 40:  * Represents a table with incomplete metadata. The metadata may 
be incomplete because
> yea I think this only gets used by the "v1" catalog. I'll add a comment.
Done


http://gerrit.cloudera.org:8080/#/c/13557/1/fe/src/main/java/org/apache/impala/catalog/local/FailedLoadLocalTable.java
File fe/src/main/java/org/apache/impala/catalog/local/FailedLoadLocalTable.java:

http://gerrit.cloudera.org:8080/#/c/13557/1/fe/src/main/java/org/apache/impala/catalog/local/FailedLoadLocalTable.java@33
PS1, Line 33:   private TableLoadingException cause_;
> nit: final?
Done


http://gerrit.cloudera.org:8080/#/c/13557/1/fe/src/main/java/org/apache/impala/catalog/local/FailedLoadLocalTable.java@37
PS1, Line 37: cause
> nit: checkNotNull()?
Done


http://gerrit.cloudera.org:8080/#/c/13557/1/fe/src/main/java/org/apache/impala/service/MetadataOp.java
File fe/src/main/java/org/apache/impala/service/MetadataOp.java:

http://gerrit.cloudera.org:8080/#/c/13557/1/fe/src/main/java/org/apache/impala/service/MetadataOp.java@317
PS1, Line 317: || table instanceof FeIncompleteTable
> Can't we just override isLoaded() in FailedLoad..Table impl to make it less
the problem with making 'isLoaded()' return false in the "failed table" case is 
that StmtMetadataLoader will then think it has to keep trying to load the 
table. Agree this stuff is pretty janky, mostly because the original 
implementation uses IncompleteTable to mean both an unloaded placeholder table 
and a loaded-but-failed table. Didn't want to try to refactor all of that 
legacy code here.



--
To view, visit http://gerrit.cloudera.org:8080/13557
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia85d6b10669866dcc9f2bf7385a4775e483dd6b0
Gerrit-Change-Number: 13557
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Thu, 13 Jun 2019 20:32:36 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8459. LocalCatalog: Allow dropping tables that fail to load

2019-06-13 Thread Todd Lipcon (Code Review)
Hello Bharath Vissapragada, Vihang Karajgaonkar, Tim Armstrong, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13557

to look at the new patch set (#2).

Change subject: IMPALA-8459. LocalCatalog: Allow dropping tables that fail to 
load
..

IMPALA-8459. LocalCatalog: Allow dropping tables that fail to load

This adds support in LocalCatalog for IncompleteTables. A new
FeIncompleteTable interface is introduced, and when a table fails to
load, it produces a FeIncompleteTable instead of an immediate error.

This allows DROP TABLE statements to get past analysis and execute even
when tables are in bad states (eg a Kudu table with no backing table in
Kudu itself).

This re-enables some tests that were previously disabled for
LocalCatalog.

Change-Id: Ia85d6b10669866dcc9f2bf7385a4775e483dd6b0
---
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
A fe/src/main/java/org/apache/impala/catalog/FeIncompleteTable.java
M fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java
A fe/src/main/java/org/apache/impala/catalog/local/FailedLoadLocalTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalKuduTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java
M fe/src/main/java/org/apache/impala/service/MetadataOp.java
M fe/src/test/java/org/apache/impala/catalog/local/LocalCatalogTest.java
M tests/common/skip.py
M tests/query_test/test_kudu.py
12 files changed, 133 insertions(+), 36 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/57/13557/2
--
To view, visit http://gerrit.cloudera.org:8080/13557
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia85d6b10669866dcc9f2bf7385a4775e483dd6b0
Gerrit-Change-Number: 13557
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] fe: improve logging for metadata loading

2019-06-13 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13463 )

Change subject: fe: improve logging for metadata loading
..


Patch Set 1:

(7 comments)

patch set r2 has fixes for the comments. r3 will rebase to tip of master

http://gerrit.cloudera.org:8080/#/c/13463/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/13463/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1954
PS1, Line 1954: why
> will do
Done


http://gerrit.cloudera.org:8080/#/c/13463/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/13463/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@103
PS1, Line 103: hiveexec
> mistake -- eclipse loves to auto-import shaded crap :(
Done


http://gerrit.cloudera.org:8080/#/c/13463/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@628
PS1, Line 628: new Exception()
> yea, this was some debugging junk I left in by mistake
Done


http://gerrit.cloudera.org:8080/#/c/13463/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@891
PS1, Line 891:   org.apache.hadoop.hive.metastore.api.Table msTbl, String 
why) throws TableLoadingException {
> line too long (98 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/13463/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/13463/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@813
PS1, Line 813: insert
> would be used to say processing table-level insert event from HMS for consi
Done


http://gerrit.cloudera.org:8080/#/c/13463/1/fe/src/test/java/org/apache/impala/catalog/CatalogTest.java
File fe/src/test/java/org/apache/impala/catalog/CatalogTest.java:

http://gerrit.cloudera.org:8080/#/c/13463/1/fe/src/test/java/org/apache/impala/catalog/CatalogTest.java@170
PS1, Line 170: assertNotNull(catalog_.getOrLoadTable(hbaseDb.getName(), 
"alltypessmallbinary", "test"));
> line too long (93 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/13463/1/fe/src/test/java/org/apache/impala/catalog/CatalogTest.java@172
PS1, Line 172: assertNotNull(catalog_.getOrLoadTable(hbaseDb.getName(), 
"hbasealltypeserror", "test"));
> line too long (92 > 90)
Done



--
To view, visit http://gerrit.cloudera.org:8080/13463
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4d90a5f075b05d2dc96692b3349abe35ce24b8b8
Gerrit-Change-Number: 13463
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Thu, 13 Jun 2019 20:23:24 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] fe: improve logging for metadata loading

2019-06-13 Thread Todd Lipcon (Code Review)
Hello Vihang Karajgaonkar, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13463

to look at the new patch set (#2).

Change subject: fe: improve logging for metadata loading
..

fe: improve logging for metadata loading

This annotates various catalogd-internal calls associated with
refreshing and loading metadata with a 'String why' parameter, useful
for logging. This can help understand why a particular piece of metadata
was loaded, and in the case of REFRESH calls, who issued the original
refresh.

Additionally, some of the log statements were improved to add a bit of
extra detail such as the list of partitions being refreshed
(abbreviated) and whether or not the table schema is being refreshed.

Change-Id: I4d90a5f075b05d2dc96692b3349abe35ce24b8b8
---
M fe/src/main/java/org/apache/impala/analysis/ResetMetadataStmt.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/DataSourceTable.java
M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java
M fe/src/main/java/org/apache/impala/catalog/View.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/catalog/CatalogObjectToFromThriftTest.java
M fe/src/test/java/org/apache/impala/catalog/CatalogTest.java
M fe/src/test/java/org/apache/impala/catalog/CatalogdTableInvalidatorTest.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/testutil/ImpaladTestCatalog.java
18 files changed, 207 insertions(+), 147 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/13463/2
--
To view, visit http://gerrit.cloudera.org:8080/13463
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4d90a5f075b05d2dc96692b3349abe35ce24b8b8
Gerrit-Change-Number: 13463
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] fe: improve logging for metadata loading

2019-06-13 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13463 )

Change subject: fe: improve logging for metadata loading
..


Patch Set 1:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/13463/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/13463/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@515
PS1, Line 515: needed to fetch partition stats
> Can we create constants for these reasons. Seems they never change.
Given they're only used in one place, and not 'magic numbers' that benefit from 
a descriptive variable name, not sure what the benefit is. Doesn't it just add 
another line of code and cause you to have to navigate around when reading the 
code?


http://gerrit.cloudera.org:8080/#/c/13463/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1954
PS1, Line 1954: why
> Nit, I think have nouns as the arguments are more readable. eg: reload(Tabl
will do


http://gerrit.cloudera.org:8080/#/c/13463/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/13463/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@103
PS1, Line 103: hiveexec
> Is there a particular reason to use the shaded version of Joiner?
mistake -- eclipse loves to auto-import shaded crap :(


http://gerrit.cloudera.org:8080/#/c/13463/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@628
PS1, Line 628: new Exception()
> This will potentially add a lot of traces in the log. Don't we generally lo
yea, this was some debugging junk I left in by mistake



--
To view, visit http://gerrit.cloudera.org:8080/13463
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4d90a5f075b05d2dc96692b3349abe35ce24b8b8
Gerrit-Change-Number: 13463
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Thu, 13 Jun 2019 20:10:04 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8658: Populate missing Ranger audit fields

2019-06-13 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13601 )

Change subject: IMPALA-8658: Populate missing Ranger audit fields
..


Patch Set 3:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/13601/3/common/thrift/JniCatalog.thrift
File common/thrift/JniCatalog.thrift:

http://gerrit.cloudera.org:8080/#/c/13601/3/common/thrift/JniCatalog.thrift@642
PS3, Line 642:   6: required string sql_stmt
Can you name this 'redacted_sql_stmt' or something so that it's clear this is 
supposed to be post-redaction?


http://gerrit.cloudera.org:8080/#/c/13601/3/common/thrift/JniCatalog.thrift@640
PS3, Line 640:
 :   // The SQL statement to be logged.
 :   6: required string sql_stmt
 :
 :   // The client IP address.
 :   7: required string client_ip
What do you think about moving these to TCatalogServiceRequestHeader? They seem 
generally useful for logging/tracing what's going on on the catalogd for any 
operation, and 'client_ip' certainly seems to fit nicely with 'requesting_user' 
which is already present there.


http://gerrit.cloudera.org:8080/#/c/13601/3/fe/src/main/java/org/apache/impala/analysis/GrantRevokePrivStmt.java
File fe/src/main/java/org/apache/impala/analysis/GrantRevokePrivStmt.java:

http://gerrit.cloudera.org:8080/#/c/13601/3/fe/src/main/java/org/apache/impala/analysis/GrantRevokePrivStmt.java@111
PS3, Line 111: } catch (UnknownHostException e) {
 :   throw new AnalysisException("Unable to get the client IP 
address", e);
seems weird that this can throw. Is it trying to reverse-lookup or something?


http://gerrit.cloudera.org:8080/#/c/13601/3/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java
File 
fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java:

http://gerrit.cloudera.org:8080/#/c/13601/3/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java@222
PS3, Line 222: } catch (UnknownHostException e) {
 :   throw new InternalException("Unable to get the client 
IP address", e);
 : }
similar to elsewhere, it seems like this should never happen if we just have IP 
addresses and not names. Perhaps we can catch it and throw as RuntimeException 
in getClientIpAddress()? Or we could use a different way of getting it that 
can't throw?


http://gerrit.cloudera.org:8080/#/c/13601/3/fe/src/main/java/org/apache/impala/util/TSessionStateUtil.java
File fe/src/main/java/org/apache/impala/util/TSessionStateUtil.java:

http://gerrit.cloudera.org:8080/#/c/13601/3/fe/src/main/java/org/apache/impala/util/TSessionStateUtil.java@47
PS3, Line 47: return 
InetAddress.getByName(session.getNetwork_address().getHostname())
We could avoid the 'throws UnknownHostException' by just using 
String.format("%s:%d", ...) with the host and port, no?



--
To view, visit http://gerrit.cloudera.org:8080/13601
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I167bc35411ad9b4164c292077ff082671967cff8
Gerrit-Change-Number: 13601
Gerrit-PatchSet: 3
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 13 Jun 2019 20:03:09 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8633 : Insert event should not error when table does not exists

2019-06-13 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13548 )

Change subject: IMPALA-8633 : Insert event should not error when table does not 
exists
..


Patch Set 6: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13548/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/13548/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@825
PS6, Line 825: e
 : .
tiny nit: can you move 'e' to the next line? this breaking is pretty unnatural



--
To view, visit http://gerrit.cloudera.org:8080/13548
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I961cd7cbede4c248dba538c7fabb4bc708e49693
Gerrit-Change-Number: 13548
Gerrit-PatchSet: 6
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Thu, 13 Jun 2019 19:56:08 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8585: Insert data into ACID table during dataload

2019-06-13 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13633 )

Change subject: IMPALA-8585: Insert data into ACID table during dataload
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13633/1/testdata/datasets/functional/functional_schema_template.sql
File testdata/datasets/functional/functional_schema_template.sql:

http://gerrit.cloudera.org:8080/#/c/13633/1/testdata/datasets/functional/functional_schema_template.sql@2137
PS1, Line 2137: -- Compactions could be also interesting, but seem to be 
sporadically very slow.
can you file a JIRA for this issue that you mentioned?



--
To view, visit http://gerrit.cloudera.org:8080/13633
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id459519d5b963ea760c44719c12736bf104938f9
Gerrit-Change-Number: 13633
Gerrit-PatchSet: 1
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 13 Jun 2019 19:52:04 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8583: Add metrics for BASIC auth

2019-06-13 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13640 )

Change subject: IMPALA-8583: Add metrics for BASIC auth
..


Patch Set 1:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/13640/1/be/src/rpc/auth-provider.h
File be/src/rpc/auth-provider.h:

http://gerrit.cloudera.org:8080/#/c/13640/1/be/src/rpc/auth-provider.h@33
PS1, Line 33: MatricGroup
this doesn't seem right


http://gerrit.cloudera.org:8080/#/c/13640/1/be/src/transport/THttpServer.h
File be/src/transport/THttpServer.h:

http://gerrit.cloudera.org:8080/#/c/13640/1/be/src/transport/THttpServer.h@84
PS1, Line 84: explicit
no need for explicit on multi-arg ctor


http://gerrit.cloudera.org:8080/#/c/13640/1/be/src/transport/THttpServer.cpp
File be/src/transport/THttpServer.cpp:

http://gerrit.cloudera.org:8080/#/c/13640/1/be/src/transport/THttpServer.cpp@152
PS1, Line 152:   total_basic_auth_success_->Increment(1);
semi-related to this patch: do we do any logging of the incorrect login 
attempts?


http://gerrit.cloudera.org:8080/#/c/13640/1/common/thrift/metrics.json
File common/thrift/metrics.json:

http://gerrit.cloudera.org:8080/#/c/13640/1/common/thrift/metrics.json@945
PS1, Line 945: BASIC auth
in the long description perhaps we should say Basic (username/password) Auth, 
or something? People may not understand the underlying HTTP mechanism but would 
like to understand that this represents incorrect password attempts


http://gerrit.cloudera.org:8080/#/c/13640/1/common/thrift/metrics.json@949
PS1, Line 949: BASIC
nit: BASIC isn't an acronym so I think it should just be capitalized as "Basic"


http://gerrit.cloudera.org:8080/#/c/13640/1/common/thrift/metrics.json@955
PS1, Line 955: auth
similar to above, maybe say something like "due to an invalid username and 
password"



--
To view, visit http://gerrit.cloudera.org:8080/13640
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6d1b1bcc093e4b00802bc108ff4d8d4e2cdfd88f
Gerrit-Change-Number: 13640
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 13 Jun 2019 19:45:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8617: Add support for lz4 in parquet

2019-06-13 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13582 )

Change subject: IMPALA-8617: Add support for lz4 in parquet
..


Patch Set 6:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/13582/6/be/src/exec/parquet/hdfs-parquet-table-writer.cc
File be/src/exec/parquet/hdfs-parquet-table-writer.cc:

http://gerrit.cloudera.org:8080/#/c/13582/6/be/src/exec/parquet/hdfs-parquet-table-writer.cc@97
PS6, Line 97: const string& column_name
style nit: take this by value, and then initialize the member variable as 
column_name_(std::move(column_name))


http://gerrit.cloudera.org:8080/#/c/13582/6/be/src/exec/parquet/hdfs-parquet-table-writer.cc@400
PS6, Line 400:   const string& column_name_;
nit: it's dangerous (and anti-style-guide) to store a reference as a member 
variable like this -- whoever passed in this string might well mutate or 
destroy the string leaving a dangling reference here. Typically we lean towards 
using pointers for cases like this, and documenting clearly in the constructor 
that there's a lifetime requirement for a passed-in raw pointer. In this case, 
though, it's not a hot path at all so I think you should just make this a 
'string' vs adding lifetime dependencies


http://gerrit.cloudera.org:8080/#/c/13582/6/be/src/exec/parquet/hdfs-parquet-table-writer.cc@773
PS6, Line 773: Parquet file '$0' column '$1' hit an error
nit: change this to something like "Error writing partquet file '$0' column 
'$1': $2" (so it's more obvious this is on the _write_ side rather than the 
read side


http://gerrit.cloudera.org:8080/#/c/13582/6/be/src/exec/parquet/hdfs-parquet-table-writer.cc@903
PS6, Line 903:   return Status(Substitute("Parquet file '$0' column '$1' 
hit an error. $2",
same


http://gerrit.cloudera.org:8080/#/c/13582/6/be/src/exec/parquet/hdfs-parquet-table-writer.cc@997
PS6, Line 997:   // Map parquet codecs to Impala codecs. Its important to do 
the above check before
 :   // we do any mapping.
 :   parquet::CompressionCodec::type parquet_codec = 
ConvertImpalaToParquetCodec(codec);
 :   codec = ConvertParquetToImpalaCodec(parquet_codec);
this code isn't very intuitive to me. What's going on here? We're mapping the 
impala codec to parquet, and then back to impala? Is this serving some kind of 
canonicalization? If so, can you add some comment explaining why we do this 
back-and-forth round-trip?


http://gerrit.cloudera.org:8080/#/c/13582/6/be/src/exec/parquet/parquet-column-readers.cc
File be/src/exec/parquet/parquet-column-readers.cc:

http://gerrit.cloudera.org:8080/#/c/13582/6/be/src/exec/parquet/parquet-column-readers.cc@1346
PS6, Line 1346: Parquet file '$0' column '$1' hit an error
nit: think this reads better as "Error decompressing parquet file '$0' column 
'$1': $2"

Is the offset of the page readily available? If not, no need to plumb it 
through a bunch of code, but if we can get it easily, I think that would aid in 
debugging if we ever hit this error


http://gerrit.cloudera.org:8080/#/c/13582/6/be/src/util/compress.cc
File be/src/util/compress.cc:

http://gerrit.cloudera.org:8080/#/c/13582/6/be/src/util/compress.cc@349
PS6, Line 349:   int64_t length = -1;
nit: we tend towards the "early return" style instead of the "one return" 
style. You can get rid of this local variable and just change the assignments 
below on L355 and L360 to return statements, and then get rid of the 'else' on 
line 356 and 361, for a net reduction of several lines of code


http://gerrit.cloudera.org:8080/#/c/13582/6/be/src/util/compress.cc@383
PS6, Line 383: CHECK(size > 0)
You can do:

CHECK_GT(size, 0) << "LZ4_compress_default() failed"

which will end up including the 'size' value in the failure.

That said, maybe we're better off with an if statement and return a bad Status 
here indicating a runtime error, since it seems you do have error checking at 
the call site, etc?



--
To view, visit http://gerrit.cloudera.org:8080/13582
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia6850a39ef3f1e0e7ba48e08eef1d4f7cbb74d0c
Gerrit-Change-Number: 13582
Gerrit-PatchSet: 6
Gerrit-Owner: Abhishek Rawat 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 13 Jun 2019 17:36:52 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8658: Populate missing Ranger audit fields

2019-06-13 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13601 )

Change subject: IMPALA-8658: Populate missing Ranger audit fields
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/13601/2/fe/src/main/java/org/apache/impala/authorization/ranger/RangerBufferAuditHandler.java
File 
fe/src/main/java/org/apache/impala/authorization/ranger/RangerBufferAuditHandler.java:

http://gerrit.cloudera.org:8080/#/c/13601/2/fe/src/main/java/org/apache/impala/authorization/ranger/RangerBufferAuditHandler.java@117
PS2, Line 117:   
auditEvent.setClientIP(InetAddress.getLocalHost().getHostAddress());
this seems like the impala IP, not the client's IP. I'm not sure exactly where, 
but I know we have the end user IP somewhere in our session state


http://gerrit.cloudera.org:8080/#/c/13601/2/fe/src/test/resources/ranger-hive-audit.xml
File fe/src/test/resources/ranger-hive-audit.xml:

http://gerrit.cloudera.org:8080/#/c/13601/2/fe/src/test/resources/ranger-hive-audit.xml@27
PS2, Line 27: ranger.plugin.hive.ambari.cluster.name
if this parameter isn't set by the end user, do we risk an uninterpretable NPE 
or something? Should we be checking it somewhere during initialization to make 
sure it's set?



--
To view, visit http://gerrit.cloudera.org:8080/13601
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I167bc35411ad9b4164c292077ff082671967cff8
Gerrit-Change-Number: 13601
Gerrit-PatchSet: 2
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 13 Jun 2019 16:03:40 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8654: Log the SQL statement in the Ranger audit log

2019-06-13 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13590 )

Change subject: IMPALA-8654: Log the SQL statement in the Ranger audit log
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/13590
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id9f584ac4209604675eb13b6d6f349c6cbb1a387
Gerrit-Change-Number: 13590
Gerrit-PatchSet: 4
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 13 Jun 2019 16:01:31 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8654: Log the SQL statement in the Ranger audit log

2019-06-11 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13590 )

Change subject: IMPALA-8654: Log the SQL statement in the Ranger audit log
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/13590
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id9f584ac4209604675eb13b6d6f349c6cbb1a387
Gerrit-Change-Number: 13590
Gerrit-PatchSet: 1
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Tue, 11 Jun 2019 23:36:40 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8617: Add support for lz4 in parquet

2019-06-11 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13582 )

Change subject: IMPALA-8617: Add support for lz4 in parquet
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13582/2/common/thrift/generate_error_codes.py
File common/thrift/generate_error_codes.py:

http://gerrit.cloudera.org:8080/#/c/13582/2/common/thrift/generate_error_codes.py@421
PS2, Line 421:"LZ4Block: Decompressed size is not correct."),
> I think this is probably a result of having abort_on_error=0 by default. Fo
it seems we're missing the underlying HDFS path which had the bad block, 
though. Is it possible to ensure that this gets propagated up? If that's a lot 
of scope creep for this commit maybe we can file it as a follow-up? My fear is 
that someone hits this issue and we can't tell them which file is broken.



--
To view, visit http://gerrit.cloudera.org:8080/13582
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia6850a39ef3f1e0e7ba48e08eef1d4f7cbb74d0c
Gerrit-Change-Number: 13582
Gerrit-PatchSet: 2
Gerrit-Owner: Abhishek Rawat 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Tue, 11 Jun 2019 23:08:11 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8649: Fix confusing SHOW GRANT error messages

2019-06-11 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13587 )

Change subject: IMPALA-8649: Fix confusing SHOW GRANT error messages
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/13587
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb88bdc19cd1223902b44e3634f756d086332266
Gerrit-Change-Number: 13587
Gerrit-PatchSet: 2
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Tue, 11 Jun 2019 23:06:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8617: Add support for lz4 in parquet

2019-06-11 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13582 )

Change subject: IMPALA-8617: Add support for lz4 in parquet
..


Patch Set 2:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/13582/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13582/2//COMMIT_MSG@21
PS2, Line 21:  <4 byte big endian uncompressed size>
nit: some weird formatting here and below


http://gerrit.cloudera.org:8080/#/c/13582/2//COMMIT_MSG@25
PS2, Line 25: parquer
typo


http://gerrit.cloudera.org:8080/#/c/13582/2/be/src/util/compress.cc
File be/src/util/compress.cc:

http://gerrit.cloudera.org:8080/#/c/13582/2/be/src/util/compress.cc@355
PS2, Line 355: length = sizeof (int32_t);
style nit: I don't think we usually have a space between sizeof and '('


http://gerrit.cloudera.org:8080/#/c/13582/2/be/src/util/compress.cc@381
PS2, Line 381: int64_t size = LZ4_compress_default(reinterpret_cast(input),
lz4.h indicates that this can return 0 to indicate failure of some kind. I 
can't think of any reasons it would happen aside from OOM but it's probably 
worth a CHECK


http://gerrit.cloudera.org:8080/#/c/13582/2/be/src/util/decompress-test.cc
File be/src/util/decompress-test.cc:

http://gerrit.cloudera.org:8080/#/c/13582/2/be/src/util/decompress-test.cc@559
PS2, Line 559:   EXPECT_EQ(Ubsan::MemCmp(input_, output, input_len), 0);
do we ever expect input_ or output_ to be null? if not, we can use regular 
memcmp here


http://gerrit.cloudera.org:8080/#/c/13582/2/common/thrift/generate_error_codes.py
File common/thrift/generate_error_codes.py:

http://gerrit.cloudera.org:8080/#/c/13582/2/common/thrift/generate_error_codes.py@421
PS2, Line 421:"LZ4Block: Decompressed size is not correct."),
when these errors get propagated up, do we get the context of which file had 
the issue?



--
To view, visit http://gerrit.cloudera.org:8080/13582
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia6850a39ef3f1e0e7ba48e08eef1d4f7cbb74d0c
Gerrit-Change-Number: 13582
Gerrit-PatchSet: 2
Gerrit-Owner: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Tue, 11 Jun 2019 20:51:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8551: Make the grant/revoke error messages to be more user friendly

2019-06-10 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13525 )

Change subject: IMPALA-8551: Make the grant/revoke error messages to be more 
user friendly
..


Patch Set 7: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/13525
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8995f5dc88b211cd3af415713802cfeac44fe576
Gerrit-Change-Number: 13525
Gerrit-PatchSet: 7
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Mon, 10 Jun 2019 19:57:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8633 : Insert event should not error when table does not exists

2019-06-10 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13548 )

Change subject: IMPALA-8633 : Insert event should not error when table does not 
exists
..


Patch Set 4:

(1 comment)

Is there any e2e test we could write to stress these paths? eg if you were to 
run the metastore poller with a 5 second frequency, and you did some random 
sequence of hive operations, would it end up triggering it? I'm wondering 
whether we have similar issues in other events. For example, if you ALTER TABLE 
followed by DROP, will Impala get desynchronized?

http://gerrit.cloudera.org:8080/#/c/13548/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/13548/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@829
PS4, Line 829: }
should this have an else { throw e }?



--
To view, visit http://gerrit.cloudera.org:8080/13548
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I961cd7cbede4c248dba538c7fabb4bc708e49693
Gerrit-Change-Number: 13548
Gerrit-PatchSet: 4
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Mon, 10 Jun 2019 19:10:43 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] fe: improve logging for metadata loading

2019-06-10 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13463 )

Change subject: fe: improve logging for metadata loading
..


Patch Set 1:

Vihang, any chance you can take a look at this?


--
To view, visit http://gerrit.cloudera.org:8080/13463
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4d90a5f075b05d2dc96692b3349abe35ce24b8b8
Gerrit-Change-Number: 13463
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Mon, 10 Jun 2019 19:11:03 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8551: Make the grant/revoke error messages to be more user friendly

2019-06-10 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13525 )

Change subject: IMPALA-8551: Make the grant/revoke error messages to be more 
user friendly
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13525/6/tests/authorization/test_ranger.py
File tests/authorization/test_ranger.py:

http://gerrit.cloudera.org:8080/#/c/13525/6/tests/authorization/test_ranger.py@572
PS6, Line 572: assert "Error granting a privilege to Ranger. Ranger 
error message: " \
don't these asserts need to be updated?



--
To view, visit http://gerrit.cloudera.org:8080/13525
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8995f5dc88b211cd3af415713802cfeac44fe576
Gerrit-Change-Number: 13525
Gerrit-PatchSet: 5
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Mon, 10 Jun 2019 19:03:03 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Bump CDP BUILD NUMBER to 1153860

2019-06-10 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13574 )

Change subject: Bump CDP_BUILD_NUMBER to 1153860
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/13574
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ief3458c7dbf2ee973b586e2d0d54cc0739b6ae3b
Gerrit-Change-Number: 13574
Gerrit-PatchSet: 1
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Mon, 10 Jun 2019 19:01:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8551: Bump CDP BUILD NUMBER to 1153860

2019-06-10 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13525 )

Change subject: IMPALA-8551: Bump CDP_BUILD_NUMBER to 1153860
..


Patch Set 5:

(2 comments)

Looks pretty good modulo some nits. One thing, though: would it be possible to 
split this into one commit to bump the CDP_BUILD_NUMBER and one to change the 
error messages? I'm afraid when we look through lists of commits to backport 
across branches, we'll see this and think "oh it's just a CDP version bump, we 
can skip this one and pick up a later one" whereas in fact it also contains a 
functional change.

http://gerrit.cloudera.org:8080/#/c/13525/5/fe/src/main/java/org/apache/impala/authorization/ranger/RangerCatalogdAuthorizationManager.java
File 
fe/src/main/java/org/apache/impala/authorization/ranger/RangerCatalogdAuthorizationManager.java:

http://gerrit.cloudera.org:8080/#/c/13525/5/fe/src/main/java/org/apache/impala/authorization/ranger/RangerCatalogdAuthorizationManager.java@173
PS5, Line 173: } catch (Exception e) {
pre-existing issue, but: is it possible to add a DEBUG level log message here 
which includes the stack trace of the exception? I'm afraid at some point we'll 
trigger an NPE or something within the Ranger client code, and the excepption 
will get swallowed, and all we'll see is "null" in the error message.

(same below)


http://gerrit.cloudera.org:8080/#/c/13525/5/fe/src/main/java/org/apache/impala/authorization/ranger/RangerCatalogdAuthorizationManager.java@174
PS5, Line 174:   throw new InternalException("Error granting a privilege to 
Ranger. " +
nit: can we reword this to "in Ranger" instead of "to Ranger" -- the privilege 
is being granted to some target user, not to the ranger service itself.



--
To view, visit http://gerrit.cloudera.org:8080/13525
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8995f5dc88b211cd3af415713802cfeac44fe576
Gerrit-Change-Number: 13525
Gerrit-PatchSet: 5
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Mon, 10 Jun 2019 15:50:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Update mustache to commit b290952d8eb93d085214d8c8c9eab8559df9f606

2019-06-07 Thread Todd Lipcon (Code Review)
Todd Lipcon has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/13563


Change subject: Update mustache to commit 
b290952d8eb93d085214d8c8c9eab8559df9f606
..

Update mustache to commit b290952d8eb93d085214d8c8c9eab8559df9f606

This fixes some issues with regard to template variable scoping that
will be useful for Knox integration.

Change-Id: If26f3aaa2a3279a1f6a300c4f4cee7ec899e22ed
---
M be/src/thirdparty/mustache/README
M be/src/thirdparty/mustache/mustache.cc
M be/src/thirdparty/mustache/mustache.h
M be/src/util/webserver.cc
4 files changed, 184 insertions(+), 102 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/13563/1
--
To view, visit http://gerrit.cloudera.org:8080/13563
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: If26f3aaa2a3279a1f6a300c4f4cee7ec899e22ed
Gerrit-Change-Number: 13563
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 


[Impala-ASF-CR] IMPALA-8551: Bump CDP BUILD NUMBER to 1153860

2019-06-07 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13525 )

Change subject: IMPALA-8551: Bump CDP_BUILD_NUMBER to 1153860
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13525/2/tests/authorization/test_ranger.py
File tests/authorization/test_ranger.py:

http://gerrit.cloudera.org:8080/#/c/13525/2/tests/authorization/test_ranger.py@578
PS2, Line 578:
> We can but we have to parse the error message and that can be somewhat erro
hrm. I agree that parsing the error message isn't super robust, but I think we 
could at least do something like a simple regex substitution to remove ^HTTP 
\d+ Error: .+' and replace it with something like "Ranger error: ... (HTTP code 
403)" so that it's clear that the error came from ranger and not produced 
within Impala or within the user's connection. I'm concerned that if a user 
using Hue sees an error pop up like this they may think there's something wrong 
with their HTTP connection to Hue, for example.



--
To view, visit http://gerrit.cloudera.org:8080/13525
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8995f5dc88b211cd3af415713802cfeac44fe576
Gerrit-Change-Number: 13525
Gerrit-PatchSet: 3
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Fri, 07 Jun 2019 17:13:07 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8459. LocalCatalog: Allow dropping tables that fail to load

2019-06-07 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13557 )

Change subject: IMPALA-8459. LocalCatalog: Allow dropping tables that fail to 
load
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/13557/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13557/1//COMMIT_MSG@17
PS1, Line 17: This re-enables some tests that were previously disabled for
> Were you able to run the end-to-end tests with the local catalog enabled? I
I ran the kudu tests but didn't re-run the full suite. You think it's worth 
running a parameterized job and make sure I didn't regress something else? Is 
there an existing jenkins job that can easily be configured like that?


http://gerrit.cloudera.org:8080/#/c/13557/1/fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java
File fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java:

http://gerrit.cloudera.org:8080/#/c/13557/1/fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java@40
PS1, Line 40:  * Represents a table with incomplete metadata. The metadata may 
be incomplete because
> Trying to understand - does this IncompleteTable implementation get used in
yea I think this only gets used by the "v1" catalog. I'll add a comment.



--
To view, visit http://gerrit.cloudera.org:8080/13557
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia85d6b10669866dcc9f2bf7385a4775e483dd6b0
Gerrit-Change-Number: 13557
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 07 Jun 2019 16:38:56 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8459. LocalCatalog: Allow dropping tables that fail to load

2019-06-07 Thread Todd Lipcon (Code Review)
Hello Tim Armstrong,

I'd like you to do a code review. Please visit

http://gerrit.cloudera.org:8080/13557

to review the following change.


Change subject: IMPALA-8459. LocalCatalog: Allow dropping tables that fail to 
load
..

IMPALA-8459. LocalCatalog: Allow dropping tables that fail to load

This adds support in LocalCatalog for IncompleteTables. A new
FeIncompleteTable interface is introduced, and when a table fails to
load, it produces a FeIncompleteTable instead of an immediate error.

This allows DROP TABLE statements to get past analysis and execute even
when tables are in bad states (eg a Kudu table with no backing table in
Kudu itself).

This re-enables some tests that were previously disabled for
LocalCatalog.

Change-Id: Ia85d6b10669866dcc9f2bf7385a4775e483dd6b0
---
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
A fe/src/main/java/org/apache/impala/catalog/FeIncompleteTable.java
M fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java
A fe/src/main/java/org/apache/impala/catalog/local/FailedLoadLocalTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalKuduTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java
M fe/src/main/java/org/apache/impala/service/MetadataOp.java
M fe/src/test/java/org/apache/impala/catalog/local/LocalCatalogTest.java
M tests/common/skip.py
M tests/query_test/test_kudu.py
12 files changed, 127 insertions(+), 36 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/57/13557/1
--
To view, visit http://gerrit.cloudera.org:8080/13557
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia85d6b10669866dcc9f2bf7385a4775e483dd6b0
Gerrit-Change-Number: 13557
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-8635. Use local metastore URIs when checking for Kudu/HMS integration

2019-06-07 Thread Todd Lipcon (Code Review)
Todd Lipcon has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/13555


Change subject: IMPALA-8635. Use local metastore URIs when checking for 
Kudu/HMS integration
..

IMPALA-8635. Use local metastore URIs when checking for Kudu/HMS integration

The patch for IMPALA-8504 (part 2) (6bb404dc359) checks to see if Impala
and Kudu are configured against the same metastore to determine if the
HMS integration is enabled. However, instead of using its own metastore
URI config, it uses the configuration stored on the remote HMS. This is
error prone because it's not common for the HMS configuration to store
its own URI. Instead, we should use our own config.

This patch changes to using the local configuration for this purpose.
More robust would be to use the HMS "UUID" support, since it's possible
that Kudu and Impala are talking to different HMS instances sharing a
backing DB, but that work is deferred to a later commit since it depends
on Kudu-side changes.

This commit doesn't add any tests, but fixes the existing tests when
running against Hive 3, where the HMS server side uses a different
configuration key for the metastore URIs.

Change-Id: Id7a4c2cc0580f7c4dc5cfceed30b91e87c547612
---
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java
3 files changed, 9 insertions(+), 17 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/13555/1
--
To view, visit http://gerrit.cloudera.org:8080/13555
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Id7a4c2cc0580f7c4dc5cfceed30b91e87c547612
Gerrit-Change-Number: 13555
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 


  1   2   3   4   5   6   7   8   9   10   >