[Impala-ASF-CR] IMPALA-5482: fix git checkout when workloads are modified

2017-06-09 Thread Jim Apple (Code Review)
Jim Apple has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7145

Change subject: IMPALA-5482: fix git checkout when workloads are modified
..

IMPALA-5482: fix git checkout when workloads are modified

When git checkout would overwrite changes, it fails and alerts the
user to do something with the changes. This patch removes any changes
to files induced by the workload copy-and-paste.

Testing: using a patch provided by Lars Volker that touched
testdata/workloads/ (https://gerrit.cloudera.org/#/c/7073/), I was
able to reproduce the problem he saw and see that this patch fixed it.

Change-Id: I9a0d004c353eb4b547aeaf3c56289594326653d7
---
M bin/single_node_perf_run.py
1 file changed, 4 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/7145/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7145
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I9a0d004c353eb4b547aeaf3c56289594326653d7
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 


[Impala-ASF-CR] IMPALA-5280: Coalesce chains of OR conditions to an IN predicate

2017-06-09 Thread Anonymous Coward (Code Review)
sakinape...@cloudera.com has posted comments on this change.

Change subject: IMPALA-5280: Coalesce chains of OR conditions to an IN predicate
..


Patch Set 2:

(21 comments)

Responding to comments...

http://gerrit.cloudera.org:8080/#/c/7110/1/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

Line 179: 
> Not really sure what this predicate does. Seems very specific, so probably 
changed it to  expr with equality condition and literal on the right.  Generic 
enough to keep it here.


Line 183:   return arg instanceof BinaryPredicate
> What's special about these literals? Why not all literals, i.e. isLiteral()
Done. bool and null has only few values. however if there are predicates then 
we dont want to restrict rewriting. hence chaged to isLiteral


http://gerrit.cloudera.org:8080/#/c/7110/1/fe/src/main/java/org/apache/impala/rewrite/CoalesceEqualityDisjunctsToInRule.java
File 
fe/src/main/java/org/apache/impala/rewrite/CoalesceEqualityDisjunctsToInRule.java:

Line 15
> several typos
Done


Line 16
> please follow the format of other rules to list examples
Done


Line 19
> CoalesceEqDisjunctsToInRule (a little shorter)
Done


Line 21
> remove, no logging please, too expensive
Done


Line 27
> very expensive, remove
Done


Line 28
> single line if
Done


Line 34
> single line
Done


Line 39
> single line
Done


Line 46
> Try to be consistent with rest of code base:
Done


Line 48
> whitespace (same below)
Done


Line 60
> else if seems clearer
Done


Line 64
> single line if
Done


Line 67
> Don't we need otherPred to be a BinaryPredicate with an EQ condition?
Added the condition in Expr.java


Line 69
> Predicates might not be normalized, i.e. you could have
Agree. do we capture this effort in a separate jira?


Line 73
> What is this check trying to do? It doesn't seem necessary.
Redundant. Trying to check if the elements in the IN clause and the candidate 
predicate literal is of same type or not. Removed.


Line 82
> We might want to extend NormalizeBinaryPredicates to make dealing with thes
Yes. added a combo testcase with normalized predicate rule for simple case.


Line 105
> What is this trying to do?
Redundant. removed.


Line 109
> Why not leftChild1.isLiteral()?
Done


http://gerrit.cloudera.org:8080/#/c/7110/1/fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
File fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java:

Line 80:   public void TestBetweenToCompoundRule() throws AnalysisException {
> nit: move to bottom
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7110
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If02396b752c5497de9a92828c24c8062027dc2e2
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: sakinape...@cloudera.com
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: sakinape...@cloudera.com
Gerrit-HasComments: Yes


[Impala-ASF-CR](asf-site) IMPALA-5470: Modernize links and instructions on download page

2017-06-09 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-5470: Modernize links and instructions on download page
..


Patch Set 2: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7124
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibec680428cb7e0df3f831527218d296214eff5c7
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: asf-site
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: No


[Impala-ASF-CR](asf-site) IMPALA-5470: Modernize links and instructions on download page

2017-06-09 Thread Jim Apple (Code Review)
Jim Apple has submitted this change and it was merged.

Change subject: IMPALA-5470: Modernize links and instructions on download page
..


IMPALA-5470: Modernize links and instructions on download page

This switches from dist.apache.org to www.apache.org/dist for links
for the hashes and signatures. It also includes instructions for how
to check them.

Change-Id: Ibec680428cb7e0df3f831527218d296214eff5c7
Reviewed-on: http://gerrit.cloudera.org:8080/7124
Reviewed-by: Lars Volker 
Tested-by: Jim Apple 
---
M downloads.html
1 file changed, 33 insertions(+), 7 deletions(-)

Approvals:
  Jim Apple: Verified
  Lars Volker: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/7124
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ibec680428cb7e0df3f831527218d296214eff5c7
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: asf-site
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Taras Bobrovytsky 


[Impala-ASF-CR](asf-site) IMPALA-5470: Modernize links and instructions on download page

2017-06-09 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-5470: Modernize links and instructions on download page
..


Patch Set 2: Code-Review+2

This seems reasonably small to give a +2 myself.

-- 
To view, visit http://gerrit.cloudera.org:8080/7124
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibec680428cb7e0df3f831527218d296214eff5c7
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: asf-site
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5413: Add a hive user for test seq writer hive compatibility.

2017-06-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5413: Add a hive user for 
test_seq_writer_hive_compatibility.
..


Patch Set 3: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7046
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1cc8824800e4339874b9c4e3a84969baf848d941
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5413: Add a hive user for test seq writer hive compatibility.

2017-06-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-5413: Add a hive user for 
test_seq_writer_hive_compatibility.
..


IMPALA-5413: Add a hive user for test_seq_writer_hive_compatibility.

This patch includes a change to the framework to permit the passing
of a username to the run_stmt_in_hive() method in the ImpalaTestSuite
class, but retains the same default value as before.

This is to allow a test to issue a 'select count(*) from foo' query
through hive. Hive needs to set up a job to perform this query, and
HDFS write access to do so. In typical cases, the HDFS user is 'hdfs'.
however it may be necessary to change this depending on the cluster.

On a local mini-cluster, the username appears to be irrelevant, so
this won't affect locally run tests.

Tested by running the core set of tests on a local minicluster to
make sure there were no regressions. Also confirmed that the test
in question now passes on a remote physical cluster.

Change-Id: I1cc8824800e4339874b9c4e3a84969baf848d941
Reviewed-on: http://gerrit.cloudera.org:8080/7046
Reviewed-by: David Knupp 
Tested-by: Impala Public Jenkins
---
M tests/common/impala_test_suite.py
M tests/query_test/test_compressed_formats.py
2 files changed, 10 insertions(+), 4 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  David Knupp: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/7046
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I1cc8824800e4339874b9c4e3a84969baf848d941
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-4965: Authorize access to runtime profile and exec summary

2017-06-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4965: Authorize access to runtime profile and exec 
summary
..


Patch Set 13: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7064
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I2255d587367c2d328590ae8534a5406c4b0c9b15
Gerrit-PatchSet: 13
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4965: Authorize access to runtime profile and exec summary

2017-06-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4965: Authorize access to runtime profile and exec 
summary
..


IMPALA-4965: Authorize access to runtime profile and exec summary

Bug:
When Sentry-based authorization is enabled, a user that isn't authorized
to EXPLAIN a statement that uses a view can still access unauthorized
information, such as view's definition, by running the statement and
asking for the query profile or the execution summary.

Fix:
During query compilation, determine if the user can access the the runtime
profile or the execution summary. Upon request for a runtime profile or
execution summary from a user, determine based on that information and
the user that is asking for the profile if the runtime profile
(or execution summary) will be returned or an authorization error.

The authorization rule enforced is the following:
- User A runs statement S, A asks for profile, A has profile access:
  Runtime profile is returned
- User A runs statement S, A asks for profile, A doesn't have profile access:
  Authorization error
- User A runs statement S, user B asks for profile:
  Authorization error.

This patch doesn't enforce access to the runtime profile or execution summary
through the Web UI.

Change-Id: I2255d587367c2d328590ae8534a5406c4b0c9b15
Reviewed-on: http://gerrit.cloudera.org:8080/7064
Reviewed-by: Dimitris Tsirogiannis 
Tested-by: Impala Public Jenkins
---
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/util/auth-util.cc
M be/src/util/auth-util.h
M common/thrift/Frontend.thrift
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
M fe/src/main/java/org/apache/impala/analysis/ShowCreateTableStmt.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M shell/impala_shell.py
M tests/authorization/test_authorization.py
17 files changed, 284 insertions(+), 54 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Dimitris Tsirogiannis: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/7064
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I2255d587367c2d328590ae8534a5406c4b0c9b15
Gerrit-PatchSet: 14
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 


[Impala-ASF-CR] IMPALA-5480: Improve missing filters message

2017-06-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5480: Improve missing filters message
..


Patch Set 1: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7142
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia87fabdfb591f33343020c4f3bb17dc0fb011046
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5480: Improve missing filters message

2017-06-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-5480: Improve missing filters message
..


IMPALA-5480: Improve missing filters message

Replace:

"Only following filters arrived: , waited 20ms"

with:

"Not all filters arrived (arrived: [], missing: [0]), waited 20ms"

This shows up in the logs and the profile.

Change-Id: Ia87fabdfb591f33343020c4f3bb17dc0fb011046
Testing: ran manually and read the message.
Reviewed-on: http://gerrit.cloudera.org:8080/7142
Reviewed-by: Alex Behm 
Tested-by: Impala Public Jenkins
---
M be/src/exec/hdfs-scan-node-base.cc
1 file changed, 8 insertions(+), 3 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Alex Behm: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/7142
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ia87fabdfb591f33343020c4f3bb17dc0fb011046
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins


[Impala-ASF-CR] IMPALA-5221: Avoid re-use of stale SASL contexts.

2017-06-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-5221: Avoid re-use of stale SASL contexts.
..


IMPALA-5221: Avoid re-use of stale SASL contexts.

The TSaslTransport is written as a thrift extension that is a wrapper
around the Cyrus-SASL APIs. This transport is then used by Impala's
RPC layer.

On RHEL7 systems that use newer versions of the Cyrus-SASL library,
we noticed that we sometimes crash inside the Cyrus-SASL thirdparty
while trying to lock an internal mutex. During my investigation, I
found that we needed to fix the order of negotiation that happens in
an edge case.

The steps to use the Cyrus-SASL APIs for SASL negitiation are the
following (Replace '_client_' with '_server_' for server calls):
sasl_client_new()
sasl_client_start()
sasl_client_step()
sasl_dispose()   < --- When we're done with the connection.

sasl_client_new() was being called in the constructor TSaslClient()
which is invoked from SaslAuthProvider::WrapClientTransport().

sasl_client_start() and sasl_client_step() were being called under
TSaslTransport::open(). If for some reason we hit an error during
SASL negotiation, the TSaslTransport::open() call would fail. When
we fail, the ThriftClientImpl::OpenWithRetry() does a retry, which
directly retries the negotiation from sasl_client_start(). This
caused the use of already freed resources from the first negotiation
failure, hence causing the crash.

To fix this, we make sure that on a negotiation failure, we dispose
of all the resources properly by calling sasl_dispose() and retry
the negotiation from the start by calling sasl_client_new() first, and
then the remaining steps. This is done by moving the sasl_client_new()
and sasl_server_new() calls out of the TSaslClient/TSaslServer
constructors and into a new call called TSasl*::setupSaslContext(),
which is called under TSaslTransport::open().

The patch is fairly large for the above mentioned change, however,
most of it is just plumbing.

Testing: Tested on systems with older SASL versions to make sure
we don't regress. Also tested on systems with newer SASL versions
where we previously saw the crash and verified that we don't see
them anymore.
Also, tested with GSSAPI and LDAP mechanisms.

Change-Id: I7d509e14dfe46fa28d0626cf84daf6de82955d80
Reviewed-on: http://gerrit.cloudera.org:8080/7116
Reviewed-by: Sailesh Mukil 
Tested-by: Impala Public Jenkins
---
M be/src/transport/TSasl.cpp
M be/src/transport/TSasl.h
M be/src/transport/TSaslClientTransport.cpp
M be/src/transport/TSaslClientTransport.h
M be/src/transport/TSaslServerTransport.cpp
M be/src/transport/TSaslServerTransport.h
M be/src/transport/TSaslTransport.cpp
M be/src/transport/TSaslTransport.h
8 files changed, 234 insertions(+), 82 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Sailesh Mukil: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/7116
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I7d509e14dfe46fa28d0626cf84daf6de82955d80
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-5453: test create table like file fails on enum.parquet

2017-06-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5453: test_create_table_like_file fails on enum.parquet
..


Patch Set 1: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7139
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6c386843e5ef5bf6fc208db1ff90be98fd8baacf
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5453: test create table like file fails on enum.parquet

2017-06-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-5453: test_create_table_like_file fails on enum.parquet
..


IMPALA-5453: test_create_table_like_file fails on enum.parquet

A recent addition to test_create_table_like_file (IMPALA-2525)
relies on a file, enum.parquet, being preloaded into HDFS, which
is done by create-load-data.sh.

The problem is that the test creates the table as an internal
table with its location as the directory containing enum.parquet.

When the test completes and the table is dropped, enum.parquet
is deleted, so the test cannot be successfully run again, and a
snapshot generated from the contents of HDFS afterwards will
not contain the file.

The fix is to create the table as an external table.

Testing:
- Ran the test and verfied enum.parquet is still present in HDFS.

Change-Id: I6c386843e5ef5bf6fc208db1ff90be98fd8baacf
Reviewed-on: http://gerrit.cloudera.org:8080/7139
Reviewed-by: Tim Armstrong 
Reviewed-by: Alex Behm 
Tested-by: Impala Public Jenkins
---
M 
testdata/workloads/functional-query/queries/QueryTest/create-table-like-file.test
1 file changed, 1 insertion(+), 1 deletion(-)

Approvals:
  Impala Public Jenkins: Verified
  Alex Behm: Looks good to me, approved
  Tim Armstrong: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/7139
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I6c386843e5ef5bf6fc208db1ff90be98fd8baacf
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5479: Propagate the argument type in RawValue::Compare()

2017-06-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5479: Propagate the argument type in RawValue::Compare()
..


Patch Set 2: Code-Review+2

The IR_ALWAYS_INLINE seems fine as-is

-- 
To view, visit http://gerrit.cloudera.org:8080/7140
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I10b5b284e3da03024476a9620a12d6e7fbf08b3c
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5413: Add a hive user for test seq writer hive compatibility.

2017-06-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5413: Add a hive user for 
test_seq_writer_hive_compatibility.
..


Patch Set 3:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/717/

-- 
To view, visit http://gerrit.cloudera.org:8080/7046
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1cc8824800e4339874b9c4e3a84969baf848d941
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5480: Improve missing filters message

2017-06-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5480: Improve missing filters message
..


Patch Set 1:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/716/

-- 
To view, visit http://gerrit.cloudera.org:8080/7142
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia87fabdfb591f33343020c4f3bb17dc0fb011046
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4965: Authorize access to runtime profile and exec summary

2017-06-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4965: Authorize access to runtime profile and exec 
summary
..


Patch Set 12: Verified-1

Build failed: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/712/

-- 
To view, visit http://gerrit.cloudera.org:8080/7064
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I2255d587367c2d328590ae8534a5406c4b0c9b15
Gerrit-PatchSet: 12
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5413: Add a hive user for test seq writer hive compatibility.

2017-06-09 Thread David Knupp (Code Review)
David Knupp has posted comments on this change.

Change subject: IMPALA-5413: Add a hive user for 
test_seq_writer_hive_compatibility.
..


Patch Set 3: Code-Review+2

Removed one blank # comment. Carrying +2 from Alex.

-- 
To view, visit http://gerrit.cloudera.org:8080/7046
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1cc8824800e4339874b9c4e3a84969baf848d941
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5479: Propagate the argument type in RawValue::Compare()

2017-06-09 Thread Michael Ho (Code Review)
Hello Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7140

to look at the new patch set (#2).

Change subject: IMPALA-5479: Propagate the argument type in RawValue::Compare()
..

IMPALA-5479: Propagate the argument type in RawValue::Compare()

CodegenAnyVal::Compare() generates code which calls the cross
compiled version of RawValue::Compare() without propagating
the type information into RawValue::Compare(). As a result,
the generated code of RawValue::Compare() is not any more
efficient than the interpreted version as we still have
the big switch statement in it.

This change creates a global constant for the argument 'type'
passed to RawValue::Compare(). By inlining the call to
RawValue::Compare(), LLVM was able to constant propagate
the type and eliminate the dead code for non-target types.

With this change, a query with top-n improves by 12% on average:

select l_orderkey, l_partkey, l_suppkey
from tpch50_parquet.lineitem
order by l_orderkey, l_partkey, l_suppkey
limit 1;

4.49s -> 3.95s

This change also adds the ALWAYS_INLINE attribute to
RuntimeFilter::Eval() as it's needed to propagate the type
after a recent change to not put ALWAYS_INLINE attribute
on all cross-compiled functions.

Change-Id: I10b5b284e3da03024476a9620a12d6e7fbf08b3c
---
M be/src/codegen/codegen-anyval.cc
M be/src/runtime/raw-value-ir.cc
M be/src/runtime/raw-value.h
M be/src/runtime/runtime-filter-ir.cc
M be/src/runtime/runtime-filter.h
5 files changed, 21 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/7140/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7140
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I10b5b284e3da03024476a9620a12d6e7fbf08b3c
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5479: Propagate the argument type in RawValue::Compare()

2017-06-09 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-5479: Propagate the argument type in RawValue::Compare()
..


Patch Set 2:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/7140/1/be/src/codegen/codegen-anyval.cc
File be/src/codegen/codegen-anyval.cc:

Line 702
> Its a little weird that it can't infer that this is already constant. The n
We were using a stack variable. LLVM is hesitant to propagate that as a 
constant.


http://gerrit.cloudera.org:8080/#/c/7140/1/be/src/runtime/raw-value-ir.cc
File be/src/runtime/raw-value-ir.cc:

PS1, Line 29: IR_ALWAYS_INLINE
> We don't need the attribute in both declaration and definition do we?
Agree that it's not needed but it seems clearer (for documentation purpose) to 
have it at the definition too. It seems to be the prevalent pattern in our 
cross-compiled code now.


http://gerrit.cloudera.org:8080/#/c/7140/1/be/src/runtime/runtime-filter-ir.cc
File be/src/runtime/runtime-filter-ir.cc:

PS1, Line 24: IR_ALWAYS_INLINE
> We don't need the attribute in both declaration and definition do we?
Agree that it's not needed but it seems clearer (for documentation purpose) to 
have it at the definition too. It seems to be the prevalent pattern in our 
cross-compiled code now. Do you feel strongly that we should clean them up ?


http://gerrit.cloudera.org:8080/#/c/7140/1/be/src/runtime/runtime-filter.h
File be/src/runtime/runtime-filter.h:

Line 63:   /// Inlined in IR so that the constant 'col_type' can be propagated.
> Maybe something like "Inlined in IR so that the constnt 'col_type' can be p
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7140
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I10b5b284e3da03024476a9620a12d6e7fbf08b3c
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5413: Add a hive user for test seq writer hive compatibility.

2017-06-09 Thread David Knupp (Code Review)
Hello Alex Behm,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7046

to look at the new patch set (#3).

Change subject: IMPALA-5413: Add a hive user for 
test_seq_writer_hive_compatibility.
..

IMPALA-5413: Add a hive user for test_seq_writer_hive_compatibility.

This patch includes a change to the framework to permit the passing
of a username to the run_stmt_in_hive() method in the ImpalaTestSuite
class, but retains the same default value as before.

This is to allow a test to issue a 'select count(*) from foo' query
through hive. Hive needs to set up a job to perform this query, and
HDFS write access to do so. In typical cases, the HDFS user is 'hdfs'.
however it may be necessary to change this depending on the cluster.

On a local mini-cluster, the username appears to be irrelevant, so
this won't affect locally run tests.

Tested by running the core set of tests on a local minicluster to
make sure there were no regressions. Also confirmed that the test
in question now passes on a remote physical cluster.

Change-Id: I1cc8824800e4339874b9c4e3a84969baf848d941
---
M tests/common/impala_test_suite.py
M tests/query_test/test_compressed_formats.py
2 files changed, 10 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/7046/3
-- 
To view, visit http://gerrit.cloudera.org:8080/7046
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1cc8824800e4339874b9c4e3a84969baf848d941
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-4965: Authorize access to runtime profile and exec summary

2017-06-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4965: Authorize access to runtime profile and exec 
summary
..


Patch Set 13:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/715/

-- 
To view, visit http://gerrit.cloudera.org:8080/7064
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I2255d587367c2d328590ae8534a5406c4b0c9b15
Gerrit-PatchSet: 13
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4965: Authorize access to runtime profile and exec summary

2017-06-09 Thread Dimitris Tsirogiannis (Code Review)
Hello Impala Public Jenkins, Alex Behm, Dan Hecht,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7064

to look at the new patch set (#13).

Change subject: IMPALA-4965: Authorize access to runtime profile and exec 
summary
..

IMPALA-4965: Authorize access to runtime profile and exec summary

Bug:
When Sentry-based authorization is enabled, a user that isn't authorized
to EXPLAIN a statement that uses a view can still access unauthorized
information, such as view's definition, by running the statement and
asking for the query profile or the execution summary.

Fix:
During query compilation, determine if the user can access the the runtime
profile or the execution summary. Upon request for a runtime profile or
execution summary from a user, determine based on that information and
the user that is asking for the profile if the runtime profile
(or execution summary) will be returned or an authorization error.

The authorization rule enforced is the following:
- User A runs statement S, A asks for profile, A has profile access:
  Runtime profile is returned
- User A runs statement S, A asks for profile, A doesn't have profile access:
  Authorization error
- User A runs statement S, user B asks for profile:
  Authorization error.

This patch doesn't enforce access to the runtime profile or execution summary
through the Web UI.

Change-Id: I2255d587367c2d328590ae8534a5406c4b0c9b15
---
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/util/auth-util.cc
M be/src/util/auth-util.h
M common/thrift/Frontend.thrift
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
M fe/src/main/java/org/apache/impala/analysis/ShowCreateTableStmt.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M shell/impala_shell.py
M tests/authorization/test_authorization.py
17 files changed, 284 insertions(+), 54 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/7064/13
-- 
To view, visit http://gerrit.cloudera.org:8080/7064
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2255d587367c2d328590ae8534a5406c4b0c9b15
Gerrit-PatchSet: 13
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 


[Impala-ASF-CR] IMPALA-5467: disable flaky BenchmarkTest to unblock builds

2017-06-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5467: disable flaky BenchmarkTest to unblock builds
..


Patch Set 1: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7135
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6aafbc620c1e839b07bf6020f9286ca5ee51197d
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5467: disable flaky BenchmarkTest to unblock builds

2017-06-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-5467: disable flaky BenchmarkTest to unblock builds
..


IMPALA-5467: disable flaky BenchmarkTest to unblock builds

Change-Id: I6aafbc620c1e839b07bf6020f9286ca5ee51197d
Reviewed-on: http://gerrit.cloudera.org:8080/7135
Reviewed-by: Alex Behm 
Reviewed-by: Zach Amsden 
Tested-by: Impala Public Jenkins
---
M be/src/util/benchmark-test.cc
1 file changed, 1 insertion(+), 1 deletion(-)

Approvals:
  Impala Public Jenkins: Verified
  Zach Amsden: Looks good to me, but someone else must approve
  Alex Behm: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/7135
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I6aafbc620c1e839b07bf6020f9286ca5ee51197d
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-5480: Improve missing filters message

2017-06-09 Thread Henry Robinson (Code Review)
Henry Robinson has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7142

Change subject: IMPALA-5480: Improve missing filters message
..

IMPALA-5480: Improve missing filters message

Replace:

"Only following filters arrived: , waited 20ms"

with:

"Not all filters arrived (arrived: [], missing: [0]), waited 20ms"

This shows up in the logs and the profile.

Change-Id: Ia87fabdfb591f33343020c4f3bb17dc0fb011046
Testing: ran manually and read the message.
---
M be/src/exec/hdfs-scan-node-base.cc
1 file changed, 8 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/7142/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7142
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia87fabdfb591f33343020c4f3bb17dc0fb011046
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 


[Impala-ASF-CR](asf-site) IMPALA-5470: Modernize links and instructions on download page

2017-06-09 Thread Jim Apple (Code Review)
Jim Apple has uploaded a new patch set (#2).

Change subject: IMPALA-5470: Modernize links and instructions on download page
..

IMPALA-5470: Modernize links and instructions on download page

This switches from dist.apache.org to www.apache.org/dist for links
for the hashes and signatures. It also includes instructions for how
to check them.

Change-Id: Ibec680428cb7e0df3f831527218d296214eff5c7
---
M downloads.html
1 file changed, 33 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/7124/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7124
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibec680428cb7e0df3f831527218d296214eff5c7
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: asf-site
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Taras Bobrovytsky 


[Impala-ASF-CR] IMPALA-5036: Parquet count star optimization

2017-06-09 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded a new patch set (#3).

Change subject: IMPALA-5036: Parquet count star optimization
..

IMPALA-5036: Parquet count star optimization

Instead of materializing empty rows when computing count star, we use
the data stored in the Parquet RowGroup.num_rows field. The Parquet
scanner tuple is modified to have one slot into which we will write the
num rows statistic. The aggregate function is changed from count to a
special sum function that gets initialized to 0. We also add a rewrite
rule so that count() is rewritten to count(*) in order to make
sure that this optimization is applied in all cases.

Testing:
- Added functional and planner tests

Change-Id: I536b85c014821296aed68a0c68faadae96005e62
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
M fe/src/main/java/org/apache/impala/analysis/FunctionParams.java
M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java
M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
A fe/src/main/java/org/apache/impala/rewrite/NormalizeCountStarRule.java
M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/distinct.test
M testdata/workloads/functional-planner/queries/PlannerTest/join-order.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-stats-agg.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
M testdata/workloads/functional-query/queries/QueryTest/aggregation.test
R testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test
M tests/query_test/test_aggregation.py
M tests/query_test/test_parquet_stats.py
24 files changed, 914 insertions(+), 74 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/6812/3
-- 
To view, visit http://gerrit.cloudera.org:8080/6812
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I536b85c014821296aed68a0c68faadae96005e62
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Taras Bobrovytsky 


[Impala-ASF-CR] IMPALA-5036: Parquet count star optimization

2017-06-09 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change.

Change subject: IMPALA-5036: Parquet count star optimization
..


Patch Set 2:

(17 comments)

http://gerrit.cloudera.org:8080/#/c/6812/2/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

Line 428: RETURN_IF_ERROR(
> In most cases we're allocating way too much here. We can get a more accurat
Done


Line 437:   int64_t* dst_slot = 
reinterpret_cast(dst_tuple->GetSlot(0));
> The dst_tuple->GetSlot(0) looks wrong. There's no guarantee that the target
Done


http://gerrit.cloudera.org:8080/#/c/6812/2/fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java
File fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java:

Line 333:   FunctionCallExpr f = (FunctionCallExpr) substitutedAgg;
> remove
Done


Line 359: //Preconditions.checkState(mergeAggInfo_ == null);
> ?
Looks like I didn't undo all the changes I was making when experimenting. Fixed.


http://gerrit.cloudera.org:8080/#/c/6812/2/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
File fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java:

Line 571:   public FunctionCallExpr getMergeAggInputFn() { return 
mergeAggInputFn_; }
> Do we need these getters/setters?
Yes, they are used on line 617, for example. I don't think it would be a good 
idea to make mergeAggInputFn_ public.


Line 621:   if (!(substitutedFn instanceof FunctionCallExpr)) return e;
> This looks like an error we should not ignore. Convert into Preconditions c
Done


http://gerrit.cloudera.org:8080/#/c/6812/2/fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java
File fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java:

Line 334: ArrayList materializedSlots = 
getMaterializedSlots();
> The previous implementation seemed cheaper, I'd prefer to leave it. Using h
Done


http://gerrit.cloudera.org:8080/#/c/6812/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

Line 105:  * TODO: pass in range restrictions.
> Might be good to add a word or two about the agg optimization flow, i.e., t
Done


Line 134:   // Set to true if the count(*) aggregation can be optimized by 
populating the tuple with
> Set to true if the query block of this scan has a count(*) aggregation that
Done


Line 238: // Create two functions that we will put into an smap. We want to 
replace the
> Integrate this into a function comment. It should describe what this functi
Done


Line 247: sd.setType(Type.BIGINT);
> set slot as non-nullable
Done


Line 253: sumFn.analyze(analyzer);
> use analyzeNoThrow() and remove the throws declaration
Done


Line 288:   (getTupleDesc().getMaterializedSlots().isEmpty() ||
> use desc_ instead of getTupleDesc()
Done


http://gerrit.cloudera.org:8080/#/c/6812/2/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java:

Line 1207:* If 'hdfsTblRef' only contains partition columns and 
'fastPartitionKeyScans'
> comment on new param
Done


Line 1268:* 'fastPartitionKeyScans' indicates whether to try to produce the 
slots with
> comment new param
Done


Line 1512:* 'fastPartitionKeyScans' indicates whether to try to produce the 
slots with
> comment new param
Done


http://gerrit.cloudera.org:8080/#/c/6812/2/fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
File fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java:

Line 378: RewritesOk("count(id)", rule, null);
> Also check count(1+1) and count(1+NULL)
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6812
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I536b85c014821296aed68a0c68faadae96005e62
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5221: Avoid re-use of stale SASL contexts.

2017-06-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5221: Avoid re-use of stale SASL contexts.
..


Patch Set 4:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/714/

-- 
To view, visit http://gerrit.cloudera.org:8080/7116
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7d509e14dfe46fa28d0626cf84daf6de82955d80
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5479: Propagate the argument type in RawValue::Compare()

2017-06-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5479: Propagate the argument type in RawValue::Compare()
..


Patch Set 1: Code-Review+2

(4 comments)

Nice!

http://gerrit.cloudera.org:8080/#/c/7140/1/be/src/codegen/codegen-anyval.cc
File be/src/codegen/codegen-anyval.cc:

Line 702
Its a little weird that it can't infer that this is already constant. The new 
code seems better anyway.


http://gerrit.cloudera.org:8080/#/c/7140/1/be/src/runtime/raw-value-ir.cc
File be/src/runtime/raw-value-ir.cc:

PS1, Line 29: IR_ALWAYS_INLINE
We don't need the attribute in both declaration and definition do we?


http://gerrit.cloudera.org:8080/#/c/7140/1/be/src/runtime/runtime-filter-ir.cc
File be/src/runtime/runtime-filter-ir.cc:

PS1, Line 24: IR_ALWAYS_INLINE
We don't need the attribute in both declaration and definition do we?


http://gerrit.cloudera.org:8080/#/c/7140/1/be/src/runtime/runtime-filter.h
File be/src/runtime/runtime-filter.h:

Line 63:   bool IR_ALWAYS_INLINE Eval(void* val, const ColumnType& col_type) 
const noexcept;
Maybe something like "Inlined in IR so that the constnt 'col_type' can be 
propagated"


-- 
To view, visit http://gerrit.cloudera.org:8080/7140
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I10b5b284e3da03024476a9620a12d6e7fbf08b3c
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] DRAFT: IMPALA-5036: Parquet count star optimization

2017-06-09 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded a new patch set (#3).

Change subject: DRAFT: IMPALA-5036: Parquet count star optimization
..

DRAFT: IMPALA-5036: Parquet count star optimization

Instead of materializing empty rows when computing count star, we use
the data stored in the Parquet RowGroup.num_rows field. The Parquet
scanner tuple is modified to have one slot into which we will write the
num rows statistic. The aggregate function is changed from count to a
special sum function that gets initialized to 0. We also add a rewrite
rule so that count() is rewritten to count(*) in order to make
sure that this optimization is applied in all cases.

Testing:
- Added functional and planner tests

Change-Id: I536b85c014821296aed68a0c68faadae96005e62
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
M fe/src/main/java/org/apache/impala/analysis/FunctionParams.java
M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java
M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
A fe/src/main/java/org/apache/impala/rewrite/NormalizeCountStarRule.java
M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/distinct.test
M testdata/workloads/functional-planner/queries/PlannerTest/join-order.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-stats-agg.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
M testdata/workloads/functional-query/queries/QueryTest/aggregation.test
R testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test
M tests/query_test/test_aggregation.py
M tests/query_test/test_parquet_stats.py
24 files changed, 911 insertions(+), 72 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/6812/3
-- 
To view, visit http://gerrit.cloudera.org:8080/6812
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I536b85c014821296aed68a0c68faadae96005e62
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Taras Bobrovytsky 


[Impala-ASF-CR] IMPALA-5479: Propagate the argument type in RawValue::Compare()

2017-06-09 Thread Michael Ho (Code Review)
Michael Ho has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7140

Change subject: IMPALA-5479: Propagate the argument type in RawValue::Compare()
..

IMPALA-5479: Propagate the argument type in RawValue::Compare()

CodegenAnyVal::Compare() generates code which calls the cross
compiled version of RawValue::Compare() without propagating
the type information into RawValue::Compare(). As a result,
the generated code of RawValue::Compare() is not any more
efficient than the interpreted version as we still have
the big switch statement in it.

This change creates a global constant for the argument 'type'
passed to RawValue::Compare(). By inlining the call to
RawValue::Compare(), LLVM was able to constant propagate
the type and eliminate the dead code for non-target types.

With this change, a query with top-n improves by 12% on average:

select l_orderkey, l_partkey, l_suppkey
from tpch50_parquet.lineitem
order by l_orderkey, l_partkey, l_suppkey
limit 1;

4.49s -> 3.95s

This change also adds the ALWAYS_INLINE attribute to
RuntimeFilter::Eval() as it's needed to propagate the type
after a recent change to not put ALWAYS_INLINE attribute
on all cross-compiled functions.

Change-Id: I10b5b284e3da03024476a9620a12d6e7fbf08b3c
---
M be/src/codegen/codegen-anyval.cc
M be/src/runtime/raw-value-ir.cc
M be/src/runtime/raw-value.h
M be/src/runtime/runtime-filter-ir.cc
M be/src/runtime/runtime-filter.h
5 files changed, 14 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/7140/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7140
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I10b5b284e3da03024476a9620a12d6e7fbf08b3c
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 


[Impala-ASF-CR] IMPALA-4622: Add ALTER COLUMN statement.

2017-06-09 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4622: Add ALTER COLUMN statement.
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6955/3/fe/src/main/java/org/apache/impala/analysis/AlterColumnStmt.java
File fe/src/main/java/org/apache/impala/analysis/AlterColumnStmt.java:

> I think the new syntax is still desirable. I don't think it matters too muc
I'm thinking the new syntax is closer to standard, so we should definitely move 
in that direction. In my mind, the CHANGE syntax should be preserved for 
compatibility, but we should avoid extending.


-- 
To view, visit http://gerrit.cloudera.org:8080/6955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id2e8bd65342b79644a0fdcd925e6f17797e89ad6
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5221: Avoid re-use of stale SASL contexts.

2017-06-09 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-5221: Avoid re-use of stale SASL contexts.
..


Patch Set 4: Code-Review+2

(3 comments)

Thanks for the review!

Carry +2. I'll run it through one last round of testing and merge it after that.

http://gerrit.cloudera.org:8080/#/c/7116/3/be/src/transport/TSasl.cpp
File be/src/transport/TSasl.cpp:

Line 115:   mechList(mechanisms) {
> Is it right to initialise chosenMech to a list of mechanisms? Should it sta
Good point. It shouldn't make a difference, but I've removed it anyway. I'll 
just run it though one last round of testing to make sure that this doesn't 
cause a regression.


http://gerrit.cloudera.org:8080/#/c/7116/3/be/src/transport/TSasl.h
File be/src/transport/TSasl.h:

PS3, Line 125: *
> nit: line break
Done


http://gerrit.cloudera.org:8080/#/c/7116/3/be/src/transport/TSaslTransport.h
File be/src/transport/TSaslTransport.h:

PS3, Line 177: /**
> nit: three / (or better to conform to other function comment style in this 
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7116
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7d509e14dfe46fa28d0626cf84daf6de82955d80
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5221: Avoid re-use of stale SASL contexts.

2017-06-09 Thread Sailesh Mukil (Code Review)
Hello Henry Robinson, Dan Hecht,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7116

to look at the new patch set (#4).

Change subject: IMPALA-5221: Avoid re-use of stale SASL contexts.
..

IMPALA-5221: Avoid re-use of stale SASL contexts.

The TSaslTransport is written as a thrift extension that is a wrapper
around the Cyrus-SASL APIs. This transport is then used by Impala's
RPC layer.

On RHEL7 systems that use newer versions of the Cyrus-SASL library,
we noticed that we sometimes crash inside the Cyrus-SASL thirdparty
while trying to lock an internal mutex. During my investigation, I
found that we needed to fix the order of negotiation that happens in
an edge case.

The steps to use the Cyrus-SASL APIs for SASL negitiation are the
following (Replace '_client_' with '_server_' for server calls):
sasl_client_new()
sasl_client_start()
sasl_client_step()
sasl_dispose()   < --- When we're done with the connection.

sasl_client_new() was being called in the constructor TSaslClient()
which is invoked from SaslAuthProvider::WrapClientTransport().

sasl_client_start() and sasl_client_step() were being called under
TSaslTransport::open(). If for some reason we hit an error during
SASL negotiation, the TSaslTransport::open() call would fail. When
we fail, the ThriftClientImpl::OpenWithRetry() does a retry, which
directly retries the negotiation from sasl_client_start(). This
caused the use of already freed resources from the first negotiation
failure, hence causing the crash.

To fix this, we make sure that on a negotiation failure, we dispose
of all the resources properly by calling sasl_dispose() and retry
the negotiation from the start by calling sasl_client_new() first, and
then the remaining steps. This is done by moving the sasl_client_new()
and sasl_server_new() calls out of the TSaslClient/TSaslServer
constructors and into a new call called TSasl*::setupSaslContext(),
which is called under TSaslTransport::open().

The patch is fairly large for the above mentioned change, however,
most of it is just plumbing.

Testing: Tested on systems with older SASL versions to make sure
we don't regress. Also tested on systems with newer SASL versions
where we previously saw the crash and verified that we don't see
them anymore.
Also, tested with GSSAPI and LDAP mechanisms.

Change-Id: I7d509e14dfe46fa28d0626cf84daf6de82955d80
---
M be/src/transport/TSasl.cpp
M be/src/transport/TSasl.h
M be/src/transport/TSaslClientTransport.cpp
M be/src/transport/TSaslClientTransport.h
M be/src/transport/TSaslServerTransport.cpp
M be/src/transport/TSaslServerTransport.h
M be/src/transport/TSaslTransport.cpp
M be/src/transport/TSaslTransport.h
8 files changed, 234 insertions(+), 82 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/16/7116/4
-- 
To view, visit http://gerrit.cloudera.org:8080/7116
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7d509e14dfe46fa28d0626cf84daf6de82955d80
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-5453: test create table like file fails on enum.parquet

2017-06-09 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-5453: test_create_table_like_file fails on enum.parquet
..


Patch Set 1: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7139/1/testdata/workloads/functional-query/queries/QueryTest/create-table-like-file.test
File 
testdata/workloads/functional-query/queries/QueryTest/create-table-like-file.test:

Line 57: create external table $DATABASE.like_enumtype_file like parquet
nice!


-- 
To view, visit http://gerrit.cloudera.org:8080/7139
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6c386843e5ef5bf6fc208db1ff90be98fd8baacf
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5453: test create table like file fails on enum.parquet

2017-06-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5453: test_create_table_like_file fails on enum.parquet
..


Patch Set 1:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/713/

-- 
To view, visit http://gerrit.cloudera.org:8080/7139
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6c386843e5ef5bf6fc208db1ff90be98fd8baacf
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5453: test create table like file fails on enum.parquet

2017-06-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5453: test_create_table_like_file fails on enum.parquet
..


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7139
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6c386843e5ef5bf6fc208db1ff90be98fd8baacf
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3905: HdfsScanner::GetNext() for Avro, RC, and Seq scans.

2017-06-09 Thread Alex Behm (Code Review)
Alex Behm has uploaded a new patch set (#4).

Change subject: IMPALA-3905: HdfsScanner::GetNext() for Avro, RC, and Seq scans.
..

IMPALA-3905: HdfsScanner::GetNext() for Avro, RC, and Seq scans.

Implements HdfsScanner::GetNext() for the Avro, RC File, and
Sequence File scanners. Changes ProcessSplit() to repeatedly call
GetNext() to share the core scanning code between the legacy
ProcessSplit() interface (ProcessSplit()) and the new GetNext()
interface.

Summary of changes:
- Slightly change code flow for initial scan range that
  only parses the file header. The new code sets
  'only_parsing_header_' in Open() and then honors
  that flag in GetNextInternal(). Before, all the logic
  was inside ProcessSpit().
- Replace 'finished_' with 'eos_'.
- Add a RowBatch parameter to various functions.
- Change Close() to free all resources when a nullptr
  RowBatch is passed.

Testing:
- Exhaustive tests passed on debug
- Core tests passed on asan
- TODO: Perf testing on cluster

Change-Id: Ie18f57b0d3fe0052a8ccd361b6a5fcdf979d0669
---
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/base-sequence-scanner.h
M be/src/exec/exec-node.cc
M be/src/exec/exec-node.h
M be/src/exec/hdfs-avro-scanner-ir.cc
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-avro-scanner.h
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-rcfile-scanner.cc
M be/src/exec/hdfs-rcfile-scanner.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scan-node-mt.h
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-scanner.h
M be/src/exec/hdfs-sequence-scanner.cc
M be/src/exec/hdfs-sequence-scanner.h
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/hdfs-text-scanner.h
M be/src/exec/kudu-scan-node.cc
M be/src/exec/scan-node.h
M be/src/util/blocking-queue.h
M testdata/workloads/functional-query/queries/DataErrorsTest/avro-errors.test
26 files changed, 708 insertions(+), 815 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/27/6527/4
-- 
To view, visit http://gerrit.cloudera.org:8080/6527
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie18f57b0d3fe0052a8ccd361b6a5fcdf979d0669
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 


[Impala-ASF-CR] IMPALA-5453: test create table like file fails on enum.parquet

2017-06-09 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7139

Change subject: IMPALA-5453: test_create_table_like_file fails on enum.parquet
..

IMPALA-5453: test_create_table_like_file fails on enum.parquet

A recent addition to test_create_table_like_file (IMPALA-2525)
relies on a file, enum.parquet, being preloaded into HDFS, which
is done by create-load-data.sh.

The problem is that the test creates the table as an internal
table with its location as the directory containing enum.parquet.

When the test completes and the table is dropped, enum.parquet
is deleted, so the test cannot be successfully run again, and a
snapshot generated from the contents of HDFS afterwards will
not contain the file.

The fix is to create the table as an external table.

Testing:
- Ran the test and verfied enum.parquet is still present in HDFS.

Change-Id: I6c386843e5ef5bf6fc208db1ff90be98fd8baacf
---
M 
testdata/workloads/functional-query/queries/QueryTest/create-table-like-file.test
1 file changed, 1 insertion(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/39/7139/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7139
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I6c386843e5ef5bf6fc208db1ff90be98fd8baacf
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-5221: Avoid re-use of stale SASL contexts.

2017-06-09 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-5221: Avoid re-use of stale SASL contexts.
..


Patch Set 3: Code-Review+2

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7116/3/be/src/transport/TSasl.h
File be/src/transport/TSasl.h:

PS3, Line 125: * C
nit: line break


http://gerrit.cloudera.org:8080/#/c/7116/3/be/src/transport/TSaslTransport.h
File be/src/transport/TSaslTransport.h:

PS3, Line 177: // 
nit: three / (or better to conform to other function comment style in this 
file).


-- 
To view, visit http://gerrit.cloudera.org:8080/7116
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7d509e14dfe46fa28d0626cf84daf6de82955d80
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5221: Avoid re-use of stale SASL contexts.

2017-06-09 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-5221: Avoid re-use of stale SASL contexts.
..


Patch Set 3: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7116/3/be/src/transport/TSasl.cpp
File be/src/transport/TSasl.cpp:

Line 115:   chosenMech(mechanisms),
Is it right to initialise chosenMech to a list of mechanisms? Should it stay 
blank until evaluateChallengeOrResponse()?


-- 
To view, visit http://gerrit.cloudera.org:8080/7116
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7d509e14dfe46fa28d0626cf84daf6de82955d80
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4965: Authorize access to runtime profile and exec summary

2017-06-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4965: Authorize access to runtime profile and exec 
summary
..


Patch Set 12:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/712/

-- 
To view, visit http://gerrit.cloudera.org:8080/7064
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I2255d587367c2d328590ae8534a5406c4b0c9b15
Gerrit-PatchSet: 12
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4965: Authorize access to runtime profile and exec summary

2017-06-09 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-4965: Authorize access to runtime profile and exec 
summary
..


Patch Set 12: Code-Review+2

Fix more clang-tidy issue. Keep Dan's +2

-- 
To view, visit http://gerrit.cloudera.org:8080/7064
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I2255d587367c2d328590ae8534a5406c4b0c9b15
Gerrit-PatchSet: 12
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4965: Authorize access to runtime profile and exec summary

2017-06-09 Thread Dimitris Tsirogiannis (Code Review)
Hello Impala Public Jenkins, Alex Behm, Dan Hecht,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7064

to look at the new patch set (#12).

Change subject: IMPALA-4965: Authorize access to runtime profile and exec 
summary
..

IMPALA-4965: Authorize access to runtime profile and exec summary

Bug:
When Sentry-based authorization is enabled, a user that isn't authorized
to EXPLAIN a statement that uses a view can still access unauthorized
information, such as view's definition, by running the statement and
asking for the query profile or the execution summary.

Fix:
During query compilation, determine if the user can access the the runtime
profile or the execution summary. Upon request for a runtime profile or
execution summary from a user, determine based on that information and
the user that is asking for the profile if the runtime profile
(or execution summary) will be returned or an authorization error.

The authorization rule enforced is the following:
- User A runs statement S, A asks for profile, A has profile access:
  Runtime profile is returned
- User A runs statement S, A asks for profile, A doesn't have profile access:
  Authorization error
- User A runs statement S, user B asks for profile:
  Authorization error.

This patch doesn't enforce access to the runtime profile or execution summary
through the Web UI.

Change-Id: I2255d587367c2d328590ae8534a5406c4b0c9b15
---
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/util/auth-util.cc
M be/src/util/auth-util.h
M common/thrift/Frontend.thrift
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
M fe/src/main/java/org/apache/impala/analysis/ShowCreateTableStmt.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M shell/impala_shell.py
M tests/authorization/test_authorization.py
17 files changed, 284 insertions(+), 54 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/7064/12
-- 
To view, visit http://gerrit.cloudera.org:8080/7064
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2255d587367c2d328590ae8534a5406c4b0c9b15
Gerrit-PatchSet: 12
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 


[Impala-ASF-CR] IMPALA-5467: disable flaky BenchmarkTest to unblock builds

2017-06-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5467: disable flaky BenchmarkTest to unblock builds
..


Patch Set 1:

Not sure. This happened on amazon EC2 vms so not sure what their TSC 
implementation looks like.

-- 
To view, visit http://gerrit.cloudera.org:8080/7135
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6aafbc620c1e839b07bf6020f9286ca5ee51197d
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5467: disable flaky BenchmarkTest to unblock builds

2017-06-09 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-5467: disable flaky BenchmarkTest to unblock builds
..


Patch Set 1: Code-Review+1

Well since the fix, we know this isn't being caused by contention, any idea 
what could actually be causing it?

Do we have AMD or older Intel hardware with flaky TSCs in service that could 
explain this?

-- 
To view, visit http://gerrit.cloudera.org:8080/7135
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6aafbc620c1e839b07bf6020f9286ca5ee51197d
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5221: Avoid re-use of stale SASL contexts.

2017-06-09 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has uploaded a new patch set (#3).

Change subject: IMPALA-5221: Avoid re-use of stale SASL contexts.
..

IMPALA-5221: Avoid re-use of stale SASL contexts.

The TSaslTransport is written as a thrift extension that is a wrapper
around the Cyrus-SASL APIs. This transport is then used by Impala's
RPC layer.

On RHEL7 systems that use newer versions of the Cyrus-SASL library,
we noticed that we sometimes crash inside the Cyrus-SASL thirdparty
while trying to lock an internal mutex. During my investigation, I
found that we needed to fix the order of negotiation that happens in
an edge case.

The steps to use the Cyrus-SASL APIs for SASL negitiation are the
following (Replace '_client_' with '_server_' for server calls):
sasl_client_new()
sasl_client_start()
sasl_client_step()
sasl_dispose()   < --- When we're done with the connection.

sasl_client_new() was being called in the constructor TSaslClient()
which is invoked from SaslAuthProvider::WrapClientTransport().

sasl_client_start() and sasl_client_step() were being called under
TSaslTransport::open(). If for some reason we hit an error during
SASL negotiation, the TSaslTransport::open() call would fail. When
we fail, the ThriftClientImpl::OpenWithRetry() does a retry, which
directly retries the negotiation from sasl_client_start(). This
caused the use of already freed resources from the first negotiation
failure, hence causing the crash.

To fix this, we make sure that on a negotiation failure, we dispose
of all the resources properly by calling sasl_dispose() and retry
the negotiation from the start by calling sasl_client_new() first, and
then the remaining steps. This is done by moving the sasl_client_new()
and sasl_server_new() calls out of the TSaslClient/TSaslServer
constructors and into a new call called TSasl*::setupSaslContext(),
which is called under TSaslTransport::open().

The patch is fairly large for the above mentioned change, however,
most of it is just plumbing.

Testing: Tested on systems with older SASL versions to make sure
we don't regress. Also tested on systems with newer SASL versions
where we previously saw the crash and verified that we don't see
them anymore.
Also, tested with GSSAPI and LDAP mechanisms.

Change-Id: I7d509e14dfe46fa28d0626cf84daf6de82955d80
---
M be/src/transport/TSasl.cpp
M be/src/transport/TSasl.h
M be/src/transport/TSaslClientTransport.cpp
M be/src/transport/TSaslClientTransport.h
M be/src/transport/TSaslServerTransport.cpp
M be/src/transport/TSaslServerTransport.h
M be/src/transport/TSaslTransport.cpp
M be/src/transport/TSaslTransport.h
8 files changed, 232 insertions(+), 82 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/16/7116/3
-- 
To view, visit http://gerrit.cloudera.org:8080/7116
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7d509e14dfe46fa28d0626cf84daf6de82955d80
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-5467: disable flaky BenchmarkTest to unblock builds

2017-06-09 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-5467: disable flaky BenchmarkTest to unblock builds
..


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7135
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6aafbc620c1e839b07bf6020f9286ca5ee51197d
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5467: disable flaky BenchmarkTest to unblock builds

2017-06-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5467: disable flaky BenchmarkTest to unblock builds
..


Patch Set 1:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/711/

-- 
To view, visit http://gerrit.cloudera.org:8080/7135
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6aafbc620c1e839b07bf6020f9286ca5ee51197d
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5467: disable flaky BenchmarkTest to unblock builds

2017-06-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7135

Change subject: IMPALA-5467: disable flaky BenchmarkTest to unblock builds
..

IMPALA-5467: disable flaky BenchmarkTest to unblock builds

Change-Id: I6aafbc620c1e839b07bf6020f9286ca5ee51197d
---
M be/src/util/benchmark-test.cc
1 file changed, 1 insertion(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/7135/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7135
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I6aafbc620c1e839b07bf6020f9286ca5ee51197d
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5446: dropped Sorter::Reset() status

2017-06-09 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-5446: dropped Sorter::Reset() status
..


Patch Set 1: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7134
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7d4f9e93a44531901e663b3f1e18edc514363f74
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: anujphadke 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5221: Avoid re-use of stale SASL contexts.

2017-06-09 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-5221: Avoid re-use of stale SASL contexts.
..


Patch Set 2:

> > (11 comments)
 > >
 > > Looks pretty good. Have you confirmed this works with LDAP as
 > well
 > > as GSSAPI?
 > 
 > Forgot to address the LDAP comment. I just got my hands on a
 > cluster with LDAP enabled, I'll test the patch on it and post an
 > update tomorrow.

Verified that there are no regressions on a cluster with LDAP enabled too.
- Deployed my patch onto an LDAP enabled cluster.
- I was able to connect with a client using LDAP credentials.
- Verified that I saw the following in the logs:

I0609 09:31:49.210407 11809 authentication.cc:249] Trying simple LDAP bind for: 
uid=admin,ou=users,dc=cloudera,dc=com
I0609 09:31:49.565165 11809 authentication.cc:261] LDAP bind successful
I0609 09:31:49.565322 11809 authentication.cc:459] Successfully authenticated 
client user "admin"

-- 
To view, visit http://gerrit.cloudera.org:8080/7116
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7d509e14dfe46fa28d0626cf84daf6de82955d80
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5413: Add a hive user for test seq writer hive compatibility.

2017-06-09 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-5413: Add a hive user for 
test_seq_writer_hive_compatibility.
..


Patch Set 2: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7046/2/tests/query_test/test_compressed_formats.py
File tests/query_test/test_compressed_formats.py:

Line 182:   #
remove?


-- 
To view, visit http://gerrit.cloudera.org:8080/7046
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1cc8824800e4339874b9c4e3a84969baf848d941
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: Yes


[Impala-ASF-CR](asf-site) IMPALA-5470: Modernize links and instructions on download page

2017-06-09 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-5470: Modernize links and instructions on download page
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7124/1/downloads.html
File downloads.html:

Line 189: To check a SHA1 sum, run sha1sum --check
> I think it can be a bit confusing that both sha sums use the same file exte
Agreed. Unfortunately, the sha1sums are already made and posted (2.7, 2.8, 2.9) 
and going forward the sha512 sums should follow the ASF release policy, which 
says to use ".sha".


PS1, Line 189: SHA1
> macOS doesn't ship these binaries. Instead it comes with a shasum binary, t
I can add that. Could you also tell me how to do md5sum and sha512sum?

Alternately, we could make this improvement in a follow-on patch, authored by 
someone who owns a mac.


-- 
To view, visit http://gerrit.cloudera.org:8080/7124
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibec680428cb7e0df3f831527218d296214eff5c7
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: asf-site
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5016: Try to rewrite coalesce() function in SimplifyConditionalsRule.

2017-06-09 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-5016: Try to rewrite coalesce() function in 
SimplifyConditionalsRule.
..


Patch Set 4:

(3 comments)

Final polish.

http://gerrit.cloudera.org:8080/#/c/7015/4//COMMIT_MSG
Commit Message:

Line 7: IMPALA-5016: Try to rewrite coalesce() function in 
SimplifyConditionalsRule.
Long lines here, and commit msg needs some polish.

I suggest as the title.

Simplify COALESCE() in SimplifyConditionalsRule.

Then you can copy+paste your comment from the 
simplifyCoalesceFunctionCallExpr() function here which already describes 
everything. Please be sure to stay within the line limit.


Line 13: add some unit tests for SimplifyConditionalsRule with coalesce() 
function.
replace with

- added unit tests in ExprRewriteRulesTest


http://gerrit.cloudera.org:8080/#/c/7015/4/fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java
File fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java:

Line 143: 
slotRef.getDesc().getParent().getTable().isClusteringColumn(slotRef.getDesc().getColumn()))
 {
long lines


-- 
To view, visit http://gerrit.cloudera.org:8080/7015
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0325a9e437261b15313bbdf2f22d83376a84b575
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: yu feng 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: yu feng 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5413: Add a hive user for test seq writer hive compatibility.

2017-06-09 Thread David Knupp (Code Review)
David Knupp has uploaded a new patch set (#2).

Change subject: IMPALA-5413: Add a hive user for 
test_seq_writer_hive_compatibility.
..

IMPALA-5413: Add a hive user for test_seq_writer_hive_compatibility.

This patch includes a change to the framework to permit the passing
of a username to the run_stmt_in_hive() method in the ImpalaTestSuite
class, but retains the same default value as before.

This is to allow a test to issue a 'select count(*) from foo' query
through hive. Hive needs to set up a job to perform this query, and
HDFS write access to do so. In typical cases, the HDFS user is 'hdfs'.
however it may be necessary to change this depending on the cluster.

On a local mini-cluster, the username appears to be irrelevant, so
this won't affect locally run tests.

Tested by running the core set of tests on a local minicluster to
make sure there were no regressions. Also confirmed that the test
in question now passes on a remote physical cluster.

Change-Id: I1cc8824800e4339874b9c4e3a84969baf848d941
---
M tests/common/impala_test_suite.py
M tests/query_test/test_compressed_formats.py
2 files changed, 11 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/7046/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7046
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1cc8824800e4339874b9c4e3a84969baf848d941
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 


[Impala-ASF-CR] IMPALA-4965: Authorize access to runtime profile and exec summary

2017-06-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4965: Authorize access to runtime profile and exec 
summary
..


Patch Set 11: Verified-1

Build failed: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/710/

-- 
To view, visit http://gerrit.cloudera.org:8080/7064
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I2255d587367c2d328590ae8534a5406c4b0c9b15
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5016: Try to rewrite coalesce() function in SimplifyConditionalsRule.

2017-06-09 Thread yu feng (Code Review)
yu feng has posted comments on this change.

Change subject: IMPALA-5016: Try to rewrite coalesce() function in 
SimplifyConditionalsRule.
..


Patch Set 4:

(35 comments)

responded to comments by alex.

http://gerrit.cloudera.org:8080/#/c/7015/1/fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java
File fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java:

Line 49:  * case when false then 0 when true then 1 end -> 1
> update to have an example with coalesce
Done


Line 83:*/
> please rename to
Done


Line 101:   /**
> Clearer to understand if you spell out the cases in bullet points, somethin
modify the comments for the function


Line 104:* COALESCE(, a, b) -> , when literal is not 
NullLiteral;
> please rename to
Done


Line 105:* COALESCE(, a, b) -> ,
> nit for consistency: we usually use 'numChilderen'
Done


Line 107:*/
> Preserve the type as follows:
Done


Line 112:   Expr childExpr = expr.getChildren().get(i);
> single line if+continue, remove else block, i.e.
Done


Line 114:   if (childExpr.isNullLiteral()) continue;
> Can you move these into the function comment please?
move those comments to the beginning of the function.


Line 125: return result;
> Why not skip leading NULLs in a separate step at the beginning? This soluti
I always break when meets the first non-NULL parameter, so it is O(N) for the 
function.


Line 136: 
> Use child.isLiteral() && !child.isNullLiteral().
Done


Line 139: if (!slotRef.getDesc().getIsNullable()) return true;
> Use Expr.unwrapExpr(false) here. This optimization is ok if the SlotRef has
Add some test case for cast function.


Line 141: if (slotRef.getDesc().getParent().getTable() instanceof HdfsTable 
&&
> Partition columns do not have stored stats, so this comment is misleading. 
Done


Line 143: 
slotRef.getDesc().getParent().getTable().isClusteringColumn(slotRef.getDesc().getColumn()))
 {
> We should also allow the simplification if !slotRef.getIsNullable()
check slot with !slotRef.getIsNullable(), But I find isNullable_ is set to 
false when aggregation and union statement, Can not construct test case in 
ExprRewriteRulesTest.java


Line 144:   HdfsTable table = (HdfsTable) 
slotRef.getDesc().getParent().getTable();
> You should be able to omit these first two checks. An analyzed SlotRef must
Done


Line 149:   }
> no NULL value
Done


http://gerrit.cloudera.org:8080/#/c/7015/2/fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java
File fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java:

Line 105:* COALESCE(, a, b) -> ,
> change  to 
Done


Line 113:   // Skip leading nulls.
> Nice!
Done


Line 128:   /**
> Flesh out the comment a little more. When does this function return true an
Done


Line 131:* child is a possibly cast SlotRef against a non-nullable slot;
> single-line if
Done


Line 134:   private boolean canSimplifyCoalesceUsingChild(Expr child) {
> use unwrapSlotRef()
Done


PS2, Line 144:   HdfsTable table = (HdfsTable) 
slotRef.getDesc().getParent().getTable();
change the single-line if


http://gerrit.cloudera.org:8080/#/c/7015/3/fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java
File fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java:

Line 105:* COALESCE(, a, b) -> ,
> long line
Done


Line 128:   /**
> Checks if the given child expr is nullable.
Done


http://gerrit.cloudera.org:8080/#/c/7015/1/fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
File fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java:

Line 337: 
> IMPALA-5016: Simplify COALESCE function
Done


Line 341: RewritesOk("if(false, max(id), min(id))", rule, "min(id)");
> grammar and typos
Done


Line 344: RewritesOk(
> grammar and typos (case -> case?), paarameter, please refrain from using al
Done


http://gerrit.cloudera.org:8080/#/c/7015/2/fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
File fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java:

Line 338: // IMPALA-5125: Exprs containing aggregates should not be 
rewritten if the rewrite
> // Test skipping leading nulls
Done


Line 340: RewritesOk("if(true, 0, sum(id))", rule, null);
> add one more leading null
Done


Line 341: RewritesOk("if(false, max(id), min(id))", rule, "min(id)");
> If the leading parameter is a non-NULL constant, rewrite to that constant.
Done


Line 345: "case when true then count(id) when false then sum(id) end", 
rule, "count(id)");
> If all parameters are NULL, rewrite to NULL
Done


Line 347: // IMPALA-5016: Simplify COALESCE function
> // Do not rewrite non-literal constant exprs (rely on constant folding).
Done


Line 355: RewritesOk("coalesce(id)", rule, "id");
> also add coalesce(year, bigint_col) to test an implicit cast around 

[Impala-ASF-CR] IMPALA-4192: Disentangle Expr and ExprContext

2017-06-09 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-4192: Disentangle Expr and ExprContext
..


Patch Set 19:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/5483/19/be/src/exec/analytic-eval-node.h
File be/src/exec/analytic-eval-node.h:

Line 304:   /// analytic_eval_fns_. When enough input rows have been consumed 
to produce the analytic
> long lines
Done


http://gerrit.cloudera.org:8080/#/c/5483/19/be/src/exec/exec-node.cc
File be/src/exec/exec-node.cc:

Line 498: //   %eval_ptr = getelementptr inbounds 
%"class.impala::ScalarExprEvaluator"*, 
> trailing
Done


http://gerrit.cloudera.org:8080/#/c/5483/19/be/src/exec/hbase-scan-node.cc
File be/src/exec/hbase-scan-node.cc:

Line 250: if (EvalConjuncts(_evals_[0], conjuncts_.size(), row)) {
> .data()
Done


http://gerrit.cloudera.org:8080/#/c/5483/19/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

Line 1238: if (ExecNode::EvalConjuncts([0], evals.size(), row)) {
> .data()
Done


http://gerrit.cloudera.org:8080/#/c/5483/19/be/src/exec/partitioned-aggregation-node.cc
File be/src/exec/partitioned-aggregation-node.cc:

Line 1410: //   %input0 = call { i8, double } 
@GetSlotRef(%"class.impala::ScalarExprEvaluator"* 
> trailing
Done


http://gerrit.cloudera.org:8080/#/c/5483/19/be/src/exprs/scalar-expr-evaluator.cc
File be/src/exprs/scalar-expr-evaluator.cc:

Line 218:   for (int i = 0; i < evals.size(); ++i) {
> single line
Done


http://gerrit.cloudera.org:8080/#/c/5483/19/be/src/runtime/data-stream-test.cc
File be/src/runtime/data-stream-test.cc:

Line 307:   is_asc_, nulls_first_));
> tab
TupleRowComparator will keep a reference to is_asc_ so it needs to exist after 
calling the constructor.


http://gerrit.cloudera.org:8080/#/c/5483/19/be/src/util/tuple-row-compare.h
File be/src/util/tuple-row-compare.h:

Line 141:   std::vector nulls_first_;
> const& here as well?
Not feasible given the way it's set up in the constructor.


-- 
To view, visit http://gerrit.cloudera.org:8080/5483
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iefdc9aeeba033355cb9497e3a5d2363627dcf2f3
Gerrit-PatchSet: 19
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4192: Disentangle Expr and ExprContext

2017-06-09 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-4192: Disentangle Expr and ExprContext
..


Patch Set 20: Code-Review+2

Carry +2 forward.

-- 
To view, visit http://gerrit.cloudera.org:8080/5483
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iefdc9aeeba033355cb9497e3a5d2363627dcf2f3
Gerrit-PatchSet: 20
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4192: Disentangle Expr and ExprContext

2017-06-09 Thread Michael Ho (Code Review)
Hello Marcel Kornacker, Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5483

to look at the new patch set (#20).

Change subject: IMPALA-4192: Disentangle Expr and ExprContext
..

IMPALA-4192: Disentangle Expr and ExprContext

This change separates Expr and ExprContext. This is a preparatory
step for factoring out static data (e.g. Exprs) of plan fragments
to be shared by multiple plan fragment instances.

This change includes the followings:

1. Include aggregate functions (AggFn) as Expr. This separates
   AggFn from its evaluator. AggFn is similar to existing Expr
   as both are represented as a tree of Expr nodes but it doesn't
   really make sense to call Get*Val() on AggFn. This change
   restructures the class hierarchy: much of the existing Expr
   class is now renamed to ScalarExpr. Expr is the parent class
   of both AggFn and ScalarExpr. Expr is defined to be a tree
   with root of either AggFn or ScalarExpr and all descendants
   being ScalarExpr.

2. ExprContext is renamed to ScalarExprEvaluator which is the
   interface for evaluating ScalarExpr; AggFnEvaluator is the
   interface for evaluating AggFn. Multiple evaluators can be
   instantiated per Expr. Expr contains static states of an
   expression while evaluator contains runtime states needed
   for execution (i.e. evaluating the expression).

3. Update all exec nodes to instantiate Expr and their evaluators
   separately. ExecNode::Init() will be responsible for creating
   all the Exprs in an ExecNode while their evaluators are created
   in ExecNode::Prepare(). Certain evaluators are also moved into
   the data structures which actually utilize them. For instance,
   HashTableCtx now owns the build and probe expression evaluators.
   Similarly, TupleRowComparator and Sorter also own the evaluators.
   ExecNode which utilizes these data structures are only responsible
   for creating the expressions used by these data structures.

4. All codegen functions take Exprs instead of evaluators.

5. The assignment of index into the FunctionContext vector is now done
   during the construction of ScalarExpr. Evaluators are only responsible
   for allocating and initializing the FunctionContexts.

6. Open(), Prepare() are now removed from Expr classes. The interface
   for creating any Expr is via either ScalarExpr::Create() or AggFn::Create()
   which will convert a thrift Expr into an initialized Expr object.
   Similarly, Create() interface is used for creating evaluators from an
   intialized Expr object.

This separation allows the future change to introduce PlanNode data structures.
The plan is to move all ExecNode::Init() logic to PlanNode and call them once
per plan fragment.

Change-Id: Iefdc9aeeba033355cb9497e3a5d2363627dcf2f3
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/impala-ir.cc
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/common/init.cc
M be/src/exec/CMakeLists.txt
M be/src/exec/aggregation-node-ir.cc
M be/src/exec/aggregation-node.cc
M be/src/exec/aggregation-node.h
M be/src/exec/analytic-eval-node.cc
M be/src/exec/analytic-eval-node.h
M be/src/exec/blocking-join-node.cc
M be/src/exec/blocking-join-node.h
M be/src/exec/data-sink.cc
M be/src/exec/data-sink.h
M be/src/exec/data-source-scan-node.cc
M be/src/exec/exchange-node.cc
M be/src/exec/exchange-node.h
M be/src/exec/exec-node.cc
M be/src/exec/exec-node.h
M be/src/exec/filter-context.cc
M be/src/exec/filter-context.h
M be/src/exec/hash-join-node-ir.cc
M be/src/exec/hash-join-node.cc
M be/src/exec/hash-join-node.h
M be/src/exec/hash-table-ir.cc
M be/src/exec/hash-table-test.cc
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
M be/src/exec/hash-table.inline.h
M be/src/exec/hbase-scan-node.cc
M be/src/exec/hbase-table-sink.cc
M be/src/exec/hbase-table-sink.h
M be/src/exec/hbase-table-writer.cc
M be/src/exec/hbase-table-writer.h
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-avro-scanner.h
M be/src/exec/hdfs-avro-table-writer.cc
M be/src/exec/hdfs-avro-table-writer.h
M be/src/exec/hdfs-parquet-scanner-ir.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-parquet-table-writer.cc
M be/src/exec/hdfs-parquet-table-writer.h
M be/src/exec/hdfs-rcfile-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scanner-ir.cc
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-scanner.h
M be/src/exec/hdfs-sequence-scanner.cc
M be/src/exec/hdfs-sequence-scanner.h
M be/src/exec/hdfs-sequence-table-writer.cc
M be/src/exec/hdfs-sequence-table-writer.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-sink.h
M be/src/exec/hdfs-table-writer.cc
M be/src/exec/hdfs-table-writer.h
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/hdfs-text-scanner.h
M 

[Impala-ASF-CR] IMPALA-5016: Try to rewrite coalesce() function in SimplifyConditionalsRule.

2017-06-09 Thread yu feng (Code Review)
yu feng has uploaded a new patch set (#4).

Change subject: IMPALA-5016: Try to rewrite coalesce() function in 
SimplifyConditionalsRule.
..

IMPALA-5016: Try to rewrite coalesce() function in SimplifyConditionalsRule.

try to rewrite coalesce() function with head-most nulls, constant values
and HDFS table partition metadatas. which will make sence in partiton purning

Testing:
add some unit tests for SimplifyConditionalsRule with coalesce() function.
run all test cases in SimplifyConditionalsRule and all passed.

Change-Id: I0325a9e437261b15313bbdf2f22d83376a84b575
---
M fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java
M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
2 files changed, 122 insertions(+), 17 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/7015/4
-- 
To view, visit http://gerrit.cloudera.org:8080/7015
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0325a9e437261b15313bbdf2f22d83376a84b575
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: yu feng 
Gerrit-Reviewer: Alex Behm