[Impala-ASF-CR] IMPALA-5940: Avoid log spew by using Status::Expected()

2017-10-10 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8255 )

Change subject: IMPALA-5940: Avoid log spew by using Status::Expected()
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8255/1/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/8255/1/be/src/service/impala-server.cc@1170
PS1, Line 1170: const string& err = Substitute("ReportExecStatus(): 
Received report for unknown "
> I find this to be a useful log when debugging user issues. It lets me know
Actually, we are still printing this line (see line 1172) below. We are just 
not printing the stack trace.



--
To view, visit http://gerrit.cloudera.org:8080/8255
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I87493394074a7fdb4e912f4ab2436dd7050d45b3
Gerrit-Change-Number: 8255
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 11 Oct 2017 05:54:25 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork

2017-10-10 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7938 )

Change subject: IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork
..


Patch Set 6:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/7938/6/be/src/exec/kudu-util.h
File be/src/exec/kudu-util.h:

http://gerrit.cloudera.org:8080/#/c/7938/6/be/src/exec/kudu-util.h@45
PS6, Line 45: (FromKuduStatus(status))
> nit: parenthesis not needed.
Done


http://gerrit.cloudera.org:8080/#/c/7938/6/be/src/rpc/auth-provider.h
File be/src/rpc/auth-provider.h:

http://gerrit.cloudera.org:8080/#/c/7938/6/be/src/rpc/auth-provider.h@144
PS6, Line 144: forks
> nit: forks Impalad and exec 'kinit'
Done


http://gerrit.cloudera.org:8080/#/c/7938/6/be/src/rpc/authentication.cc
File be/src/rpc/authentication.cc:

http://gerrit.cloudera.org:8080/#/c/7938/6/be/src/rpc/authentication.cc@77
PS6, Line 77: "Only used when FLAGS_use_krpc is false");
> Okay. So in other words, we could have fixed the old code as well, and elim
Yes, we could have, but we couldn't deprecate this flag in a minor release 
anyway, so I never went ahead with it.


http://gerrit.cloudera.org:8080/#/c/7938/6/be/src/rpc/thrift-server-test.cc
File be/src/rpc/thrift-server-test.cc:

http://gerrit.cloudera.org:8080/#/c/7938/6/be/src/rpc/thrift-server-test.cc@117
PS6, Line 117: !(GetParam() == USE_KUDU_KERBEROS)
> Did you flip the polarity ?
Oops, I did that for some local testing verification and forgot to turn it 
back. Switched it back.



--
To view, visit http://gerrit.cloudera.org:8080/7938
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9cea56cc6e7412d87f4c2e92399a2f91ea6af6c7
Gerrit-Change-Number: 7938
Gerrit-PatchSet: 6
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 11 Oct 2017 05:33:37 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork

2017-10-10 Thread Sailesh Mukil (Code Review)
Hello Michael Ho, Bikramjeet Vig, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/7938

to look at the new patch set (#7).

Change subject: IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork
..

IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork

Impala currently kinits by forking off a child process. This
has proved to be expensive in many cases since the subprocess
tries to reserve as much memory as Impala is currently using
which can be quite a lot.

This patch adds a flag called 'use_kudu_kinit' that defaults to
true. When it's true, it uses the Kudu security library's kinit code
that programatically uses the krb5 library to kinit.
When it's false, we run our current path which kicks off the
kinit-thread and forks off a kinit process periodically to reacquire
tickets based on FLAGS_kerberos_reinit_interval.

Converted existing tests in thrift-server-test to run with and
without kerberos. We now run this BE test with kerberos by using
Kudu's MiniKdc utility. This introduces a new dependency on some
kerberos binaries that are checked through FindKerberosPrograms.cmake.
Note that this is only a test dependency and not a dependency for
the impalad binaries and friends. Compilation will still succeed if
the kerberos binaries for the MiniKdc are not found, however, the
thrift-server-test will fail. We run with and without the
'use_kudu_kinit' flag.

Updated bin/bootstap_development.sh to install new sasl-gssapi
modules and the kerberos binaries required for the MiniKdc.

Testing: Verified with thrift-server-test and also manually on a
live kerberized cluster.

Change-Id: I9cea56cc6e7412d87f4c2e92399a2f91ea6af6c7
---
M CMakeLists.txt
M be/src/exec/kudu-util.h
M be/src/kudu/security/CMakeLists.txt
M be/src/kudu/security/init.cc
M be/src/kudu/security/test/mini_kdc.cc
M be/src/rpc/CMakeLists.txt
M be/src/rpc/auth-provider.h
M be/src/rpc/authentication.cc
M be/src/rpc/thrift-server-test.cc
M be/src/testutil/gtest-util.h
M bin/bootstrap_development.sh
A cmake_modules/FindKerberosPrograms.cmake
12 files changed, 230 insertions(+), 21 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/7938/7
--
To view, visit http://gerrit.cloudera.org:8080/7938
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9cea56cc6e7412d87f4c2e92399a2f91ea6af6c7
Gerrit-Change-Number: 7938
Gerrit-PatchSet: 7
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-5940: Avoid log spew by using Status::Expected()

2017-10-10 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8255 )

Change subject: IMPALA-5940: Avoid log spew by using Status::Expected()
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8255/1/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/8255/1/be/src/service/impala-server.cc@1170
PS1, Line 1170: const string& err = Substitute("ReportExecStatus(): 
Received report for unknown "
I find this to be a useful log when debugging user issues. It lets me know if 
there were late RPCs and in some cases if some fragment instances outlived the 
coordinator.

But I agree that this is a lot of noise if printed more than once. Maybe the 
effect of this log will be assuaged by IMPALA-4063?



--
To view, visit http://gerrit.cloudera.org:8080/8255
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I87493394074a7fdb4e912f4ab2436dd7050d45b3
Gerrit-Change-Number: 8255
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 11 Oct 2017 05:17:00 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5940: Avoid log spew by using Status::Expected()

2017-10-10 Thread Michael Ho (Code Review)
Michael Ho has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/8255


Change subject: IMPALA-5940: Avoid log spew by using Status::Expected()
..

IMPALA-5940: Avoid log spew by using Status::Expected()

This change converts two more callers of Status() to Status::Expected()
to avoid log spew and the overhead of stack crawling. These two call
sites can easily fill the log when there is any network issue.

Testing done:

Ran concurrent TPC-DS workload with 256 users and verified the log spew is not 
there.

Change-Id: I87493394074a7fdb4e912f4ab2436dd7050d45b3
---
M be/src/rpc/thrift-client.cc
M be/src/service/impala-server.cc
2 files changed, 6 insertions(+), 4 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/8255/1
--
To view, visit http://gerrit.cloudera.org:8080/8255
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I87493394074a7fdb4e912f4ab2436dd7050d45b3
Gerrit-Change-Number: 8255
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 


[Impala-ASF-CR] IMPALA-5429: Multi threaded block metadata loading

2017-10-10 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8235 )

Change subject: IMPALA-5429: Multi threaded block metadata loading
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8235/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8235/4//COMMIT_MSG@59
PS4, Line 59:  100K-PARTITIONS-1M-FILES-CUSTOM-11-REFRESH-PARTITION I -16.4%
> Thanks for pointing this out. I should've added more context here. Mostafa
I see. So are these literally as simple as, to pick the first one, a single 
refresh call?



--
To view, visit http://gerrit.cloudera.org:8080/8235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I07eaa7151dfc4d56da8db8c2654bd65d8f808481
Gerrit-Change-Number: 8235
Gerrit-PatchSet: 4
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Jim Apple 
Gerrit-Comment-Date: Wed, 11 Oct 2017 01:14:34 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork

2017-10-10 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7938 )

Change subject: IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7938/6/be/src/exec/kudu-util.h
File be/src/exec/kudu-util.h:

http://gerrit.cloudera.org:8080/#/c/7938/6/be/src/exec/kudu-util.h@45
PS6, Line 45: Status&
> Should this be Status instead of Status& ?
Nevermind. Using const reference is fine too. The returned object's lifetime is 
extended to that of the reference although I would expect the compiler to 
figure it out too even without using a reference.



--
To view, visit http://gerrit.cloudera.org:8080/7938
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9cea56cc6e7412d87f4c2e92399a2f91ea6af6c7
Gerrit-Change-Number: 7938
Gerrit-PatchSet: 6
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 11 Oct 2017 00:51:53 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork

2017-10-10 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7938 )

Change subject: IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork
..


Patch Set 6:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/7938/6/be/src/exec/kudu-util.h
File be/src/exec/kudu-util.h:

http://gerrit.cloudera.org:8080/#/c/7938/6/be/src/exec/kudu-util.h@45
PS6, Line 45: Status&
Should this be Status instead of Status& ?


http://gerrit.cloudera.org:8080/#/c/7938/6/be/src/exec/kudu-util.h@45
PS6, Line 45: (FromKuduStatus(status))
nit: parenthesis not needed.


http://gerrit.cloudera.org:8080/#/c/7938/6/be/src/rpc/auth-provider.h
File be/src/rpc/auth-provider.h:

http://gerrit.cloudera.org:8080/#/c/7938/6/be/src/rpc/auth-provider.h@144
PS6, Line 144: forks
nit: forks Impalad and exec 'kinit'


http://gerrit.cloudera.org:8080/#/c/7938/6/be/src/rpc/thrift-server-test.cc
File be/src/rpc/thrift-server-test.cc:

http://gerrit.cloudera.org:8080/#/c/7938/6/be/src/rpc/thrift-server-test.cc@117
PS6, Line 117: !(GetParam() == USE_KUDU_KERBEROS)
Did you flip the polarity ?



--
To view, visit http://gerrit.cloudera.org:8080/7938
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9cea56cc6e7412d87f4c2e92399a2f91ea6af6c7
Gerrit-Change-Number: 7938
Gerrit-PatchSet: 6
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 11 Oct 2017 00:36:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5429: Multi threaded block metadata loading

2017-10-10 Thread Bharath Vissapragada (Code Review)
Hello Jim Apple,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8235

to look at the new patch set (#5).

Change subject: IMPALA-5429: Multi threaded block metadata loading
..

IMPALA-5429: Multi threaded block metadata loading

Implements multi threaded block metadata loading on the Catalog
server where we fetch block metadata for multiple partitions
in parallel. Number of threads to load the metadata is controlled
by the following two parameters (set on the Catalog server startup)

-max_hdfs_parts_parallel_load(default=5)
-max_s3_parts_parallel_load(default=10)

We use different thread pool sizes for HDFS and S3 based tables
since S3 supports much higher throughput of RPC calls for listStatus
/listFiles. Based on our experiments, S3 showed a linear speed up
(up to ~113x) with increasing number of loading threads where as the
HDFS throughput was limited to ~5x in un-secure clusters and up to
~3.7x in secure clusters. We narrowed it down to scalability
bottlenecks in HDFS RPC implementation (HADOOP-14558) on both the
server and the client side.

One thing to note here is that the thread pool based metadata fetching
is implemented only for loading HDFS block metadata and not for loading
HMS partition information. Our experiments showed that while loading large
partitioned tables, ~90% of the time is spent in connecting to NN and
loading the HDFS block information and optimizing the rest ~10% makes
the code unnecessarily complex without much gain.

Additional notes:

- The multithreading approach is implemented for
  * INVALIDATE (loading from scratch),
  * REFRESH (reusing existing md) code paths,
  * ALTER TABLE ADD/RECOVER PARTITIONS.

- This patch makes the implementation of ListMap thread-safe since
we use that datastructure as a shared state between multiple partition
metadata loding threads.

- While the configuration param max_s3_parts_parallel_load says s3, it
applies for any FileSystem implementation that doesn't support storage
IDs (like ADLS).

Testing and Results:

- This patch doesn't add any new tests since there is enough test
coverage already. Passed core/exhaustive runs with HDFS/S3.

- We noticed up to ~113x speed up on S3 tables(thread_pool_size=160)
and up to ~5x speed up in un-secure HDFS clusters and ~3.7x in secure
HDFS clusters.

- Synthesized the following two large tables on HDFS and S3 and noticed
significant reduction in my test DDL queries.

  (1) 100K partitions + 1 million files
  (2) 80 partitions + 250K files

 100K-PARTITIONS-1M-FILES-CUSTOM-11-REFRESH-PARTITION I -16.4%
 100K-PARTITIONS-1M-FILES-CUSTOM-08-ADD-PARTITION I -17.25%
 80-PARTITIONS-250K-FILES-11-REFRESH-PARTITIONI -23.57%
 80-PARTITIONS-250K-FILES-S3-08-ADD-PARTITION I -23.87%
 80-PARTITIONS-250K-FILES-09-INVALIDATE   I -24.88%
 80-PARTITIONS-250K-FILES-03-RECOVER  I -35.90%
 80-PARTITIONS-250K-FILES-07-REFRESH  I -43.03%
 100K-PARTITIONS-1M-FILES-CUSTOM-12-QUERY-PARTITIONS  I -43.93%
 100K-PARTITIONS-1M-FILES-CUSTOM-05-QUERY-AFTER-INV   I -46.59%
 80-PARTITIONS-250K-FILES-10-REFRESH-AFTER-ADD-PARTITION  I -48.71%
 100K-PARTITIONS-1M-FILES-CUSTOM-07-REFRESH   I -49.02%
 80-PARTITIONS-250K-FILES-05-QUERY-AFTER-INV  I -49.05%
 100K-PARTITIONS-1M-FILES-CUSTOM-10-REFRESH-AFTER-ADD-PARTI -51.87%
 80-PARTITIONS-250K-FILES-S3-03-RECOVER   I -67.17%
 80-PARTITIONS-250K-FILES-S3-05-QUERY-AFTER-INV   I -76.45%
 80-PARTITIONS-250K-FILES-S3-07-REFRESH   I -87.04%
 80-PARTITIONS-250K-FILES-S3-10-REFRESH-AFTER-ADD-PARTI -88.57%

Change-Id: I07eaa7151dfc4d56da8db8c2654bd65d8f808481
---
M be/src/catalog/catalog.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M 
fe/src/main/java/org/apache/impala/catalog/HdfsPartitionLocationCompressor.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/main/java/org/apache/impala/util/ListMap.java
9 files changed, 386 insertions(+), 212 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/8235/5
--
To view, visit http://gerrit.cloudera.org:8080/8235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I07eaa7151dfc4d56da8db8c2654bd65d8f808481
Gerrit-Change-Number: 8235
Gerrit-PatchSet: 5
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Jim Apple 


[Impala-ASF-CR] IMPALA-5429: Multi threaded block metadata loading

2017-10-10 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8235 )

Change subject: IMPALA-5429: Multi threaded block metadata loading
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8235/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8235/4//COMMIT_MSG@59
PS4, Line 59:  100K-PARTITIONS-1M-FILES-CUSTOM-11-REFRESH-PARTITION I -16.4%
> What are these benchmarks?
Thanks for pointing this out. I should've added more context here. Mostafa 
worked on creating a metadata benchmark by synthesizing large partitioned 
tables and then running some operations that stress the Catalog. Looks like the 
code/scripts are not yet ready for consumption by everyone. I borrowed those 
from him to run these tests. I'll add that information to the commit message.



--
To view, visit http://gerrit.cloudera.org:8080/8235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I07eaa7151dfc4d56da8db8c2654bd65d8f808481
Gerrit-Change-Number: 8235
Gerrit-PatchSet: 4
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Jim Apple 
Gerrit-Comment-Date: Wed, 11 Oct 2017 00:24:11 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6002: Add a LLVM diagnostic handler for LLVM linker errors

2017-10-10 Thread Bikramjeet Vig (Code Review)
Hello Philip Zeyliger, anujphadke, Tim Armstrong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8233

to look at the new patch set (#3).

Change subject: IMPALA-6002: Add a LLVM diagnostic handler for LLVM linker 
errors
..

IMPALA-6002: Add a LLVM diagnostic handler for LLVM linker errors

Currently if llvm encounters an error it calls exit() and this kills
the impalad process. This patch adds a diagnostic handler that llvm
can use to pass the error up to impala instead of crashing it.

Testing:
Added a test which induces an error in llvm and makes sure that its
passed up to impala code and handled correctly.

Change-Id: I907ff1f8975c02d422b4a669afbfc2e536ddf1ff
---
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M tests/query_test/test_codegen.py
3 files changed, 75 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/33/8233/3
--
To view, visit http://gerrit.cloudera.org:8080/8233
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I907ff1f8975c02d422b4a669afbfc2e536ddf1ff
Gerrit-Change-Number: 8233
Gerrit-PatchSet: 3
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 


[Impala-ASF-CR] IMPALA-6002: Add a LLVM diagnostic handler for LLVM linker errors

2017-10-10 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8233 )

Change subject: IMPALA-6002: Add a LLVM diagnostic handler for LLVM linker 
errors
..


Patch Set 2:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/8233/2/be/src/codegen/llvm-codegen.h
File be/src/codegen/llvm-codegen.h:

http://gerrit.cloudera.org:8080/#/c/8233/2/be/src/codegen/llvm-codegen.h@781
PS2, Line 781:   /// Ensures that an attempt to read diagnostic state would 
reset the object.
> Comment is kinda cryptic. Maybe needs some reference to the LLVM Diagnostic
Done


http://gerrit.cloudera.org:8080/#/c/8233/2/be/src/codegen/llvm-codegen.h@784
PS2, Line 784:
> nit: extra blank line.
removed


http://gerrit.cloudera.org:8080/#/c/8233/2/be/src/codegen/llvm-codegen.h@785
PS2, Line 785: DiagnosticHandler() : encountered_error_(false) {}
> Let's initialise the member variables inline.
Removed


http://gerrit.cloudera.org:8080/#/c/8233/2/be/src/codegen/llvm-codegen.h@787
PS2, Line 787: /// Returns the error string if an error was encountered and 
resets the state in
> Can you give some hint about when a caller should call this? I.e. after an 
Done


http://gerrit.cloudera.org:8080/#/c/8233/2/be/src/codegen/llvm-codegen.h@789
PS2, Line 789: string
> std::string in a header (I guess there's a rogue "using std::string" somewh
Done


http://gerrit.cloudera.org:8080/#/c/8233/2/be/src/codegen/llvm-codegen.h@791
PS2, Line 791: /// Handler function that sets the state on an instance of 
this class
> I think the comments could make the relationship between these two function
Done


http://gerrit.cloudera.org:8080/#/c/8233/2/be/src/codegen/llvm-codegen.h@796
PS2, Line 796: string error_str_;
> Can you briefly document the member variables and the relationship between
removed


http://gerrit.cloudera.org:8080/#/c/8233/2/be/src/codegen/llvm-codegen.h@796
PS2, Line 796: string
> std::string.
Done


http://gerrit.cloudera.org:8080/#/c/8233/2/be/src/codegen/llvm-codegen.h@799
PS2, Line 799:   /// Maintains state set by diagnostic handler.
> Comment is kinda cryptic. It may not be necessary to have both a class and
Done


http://gerrit.cloudera.org:8080/#/c/8233/2/be/src/codegen/llvm-codegen.cc
File be/src/codegen/llvm-codegen.cc:

http://gerrit.cloudera.org:8080/#/c/8233/2/be/src/codegen/llvm-codegen.cc@1621
PS2, Line 1621: DiagnosticHandler* diagnostic_handler = 
reinterpret_cast(context);
> Can context be a nullptr in any scenarios? We should check for that before
I dont think there is any scenario as far as how we are using this handler, 
will a DCHECK(context != nullptr) suffice instead?


http://gerrit.cloudera.org:8080/#/c/8233/2/be/src/codegen/llvm-codegen.cc@1627
PS2, Line 1627: error_msg.flush();
> I feel like we should also log the error at LOG(INFO) level so that it does
Done



--
To view, visit http://gerrit.cloudera.org:8080/8233
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I907ff1f8975c02d422b4a669afbfc2e536ddf1ff
Gerrit-Change-Number: 8233
Gerrit-PatchSet: 2
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Wed, 11 Oct 2017 00:05:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4252: Min-max runtime filters for Kudu

2017-10-10 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7793 )

Change subject: IMPALA-4252: Min-max runtime filters for Kudu
..


Patch Set 6:

(1 comment)

I've been meaning to do another full pass over this. Should have time tomorrow.

http://gerrit.cloudera.org:8080/#/c/7793/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/7793/5//COMMIT_MSG@9
PS5, Line 9: This patch implements min-max filters for runtime filters. Each
> My intention (which I probably wasn't super consistent about, I'll try to c
This is good to know. I suppose another way of looking at it is that bloom 
filters, min-max filters are both kinds of runtime filters, and a composite 
filter with both kinds is also a kind of runtime filter. I'll keep this in mind 
doing my next pass.



--
To view, visit http://gerrit.cloudera.org:8080/7793
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02bad890f5b5f78388a3041bf38f89369b5e2f1c
Gerrit-Change-Number: 7793
Gerrit-PatchSet: 6
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 10 Oct 2017 23:34:11 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6023: Fix broken breakpad test

2017-10-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8240 )

Change subject: IMPALA-6023: Fix broken breakpad test
..


Patch Set 4: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/8240
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb5af3e72963280a6677a99aa6a0e5785443bb0c
Gerrit-Change-Number: 8240
Gerrit-PatchSet: 4
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 10 Oct 2017 23:08:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6023: Fix broken breakpad test

2017-10-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/8240 )

Change subject: IMPALA-6023: Fix broken breakpad test
..

IMPALA-6023: Fix broken breakpad test

We have a test to make sure that hitting a DCHECK will write a minidump.
We used to pass "-beeswax_port=1" to the server to trigger a DCHECK. A
while ago, this DCHECK seems to have been removed, but we still called
abort() if the ImpalaServer failed to start. This masked the slightly
altered behavior and the test still succeeded.

However, the fix for IMPALA-4786 changed the behavior to call exit(1)
instead of abort() if the ImpalaServer failed to start.

To fix the test, we change it to pass an unresolvable hostname to
impalad, which will result in a call to abort().

This change also splits the breakpad tests into core and exhaustive sets
to make sure that tests which depend on other parts of Impala are
included in every core run.

Change-Id: Ifb5af3e72963280a6677a99aa6a0e5785443bb0c
Reviewed-on: http://gerrit.cloudera.org:8080/8240
Reviewed-by: Michael Brown 
Tested-by: Impala Public Jenkins
---
M tests/custom_cluster/test_breakpad.py
1 file changed, 33 insertions(+), 23 deletions(-)

Approvals:
  Michael Brown: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/8240
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ifb5af3e72963280a6677a99aa6a0e5785443bb0c
Gerrit-Change-Number: 8240
Gerrit-PatchSet: 5
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-5789: Add always false flag in bloom filter

2017-10-10 Thread Tianyi Wang (Code Review)
Tianyi Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8170 )

Change subject: IMPALA-5789: Add always_false flag in bloom filter
..


Patch Set 4:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/8170/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8170/3//COMMIT_MSG@13
PS3, Line 13: Testing
> I assume you ran the existing runtime filter tests?
Yes. Commit message is updated with the clarification.


http://gerrit.cloudera.org:8080/#/c/8170/3/be/src/common/global-flags.cc
File be/src/common/global-flags.cc:

http://gerrit.cloudera.org:8080/#/c/8170/3/be/src/common/global-flags.cc@129
PS3, Line 129: testing purposes. Effective in debug builds only.");
> This is a little cluttered. I think its okay to just say "for testing purpo
Done


http://gerrit.cloudera.org:8080/#/c/8170/3/be/src/exec/filter-context.h
File be/src/exec/filter-context.h:

http://gerrit.cloudera.org:8080/#/c/8170/3/be/src/exec/filter-context.h@129
PS3, Line 129: CheckForAlwaysFalse(
> I think this needs to be renamed for two reasons:
Done


http://gerrit.cloudera.org:8080/#/c/8170/3/be/src/runtime/runtime-filter.inline.h
File be/src/runtime/runtime-filter.inline.h:

http://gerrit.cloudera.org:8080/#/c/8170/3/be/src/runtime/runtime-filter.inline.h@53
PS3, Line 53: return bloom_filter_ != BloomFilter::ALWAYS_TRUE_FILTER && 
bloom_filter_->AlwaysFalse();
: }
> single line?
Done


http://gerrit.cloudera.org:8080/#/c/8170/3/be/src/util/bloom-filter.h
File be/src/util/bloom-filter.h:

http://gerrit.cloudera.org:8080/#/c/8170/3/be/src/util/bloom-filter.h@111
PS3, Line 111: hasn't had any elements inserted.
> hasn't had any elements inserted
Done


http://gerrit.cloudera.org:8080/#/c/8170/3/tests/custom_cluster/test_always_false_filter.py
File tests/custom_cluster/test_always_false_filter.py:

http://gerrit.cloudera.org:8080/#/c/8170/3/tests/custom_cluster/test_always_false_filter.py@34
PS3, Line 34: f assert_file_rejected(result):
: assert re.search("Files rejected: 8 \(8\)", resu
> I think you can just add 'cursor' as a parameter to the 'test_' functions.
Done.



--
To view, visit http://gerrit.cloudera.org:8080/8170
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If680240a3cd4583fc97c3192177d86d9567c4f8d
Gerrit-Change-Number: 8170
Gerrit-PatchSet: 4
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 10 Oct 2017 22:47:39 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5789: Add always false flag in bloom filter

2017-10-10 Thread Tianyi Wang (Code Review)
Hello Thomas Tauber-Marshall, Sailesh Mukil, Tim Armstrong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8170

to look at the new patch set (#4).

Change subject: IMPALA-5789: Add always_false flag in bloom filter
..

IMPALA-5789: Add always_false flag in bloom filter

This patch adds an always_false flag in bloom filters. The flag is set
if nothing has been inserted into the bloom filter. HdfsScanner uses
this flag to early terminate the scan at file and split granularities.

Testing: It passes existing tests. A test case is added checking that an
always-false runtime filter can filter out files and splits.

Change-Id: If680240a3cd4583fc97c3192177d86d9567c4f8d
---
M be/src/common/global-flags.cc
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/base-sequence-scanner.h
M be/src/exec/filter-context.cc
M be/src/exec/filter-context.h
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scanner.cc
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator-filter-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/runtime-filter-ir.cc
M be/src/runtime/runtime-filter.h
M be/src/runtime/runtime-filter.inline.h
M be/src/util/bloom-filter-ir.cc
M be/src/util/bloom-filter.cc
M be/src/util/bloom-filter.h
M common/thrift/ImpalaInternalService.thrift
A tests/custom_cluster/test_always_false_filter.py
21 files changed, 229 insertions(+), 198 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/8170/4
--
To view, visit http://gerrit.cloudera.org:8080/8170
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If680240a3cd4583fc97c3192177d86d9567c4f8d
Gerrit-Change-Number: 8170
Gerrit-PatchSet: 4
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4236: Codegen CopyRows() for select nodes

2017-10-10 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8196 )

Change subject: IMPALA-4236: Codegen CopyRows() for select nodes
..


Patch Set 7:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/8196/7/be/src/exec/select-node-ir.cc
File be/src/exec/select-node-ir.cc:

http://gerrit.cloudera.org:8080/#/c/8196/7/be/src/exec/select-node-ir.cc@28
PS7, Line 28: limit_
This could return extra rows. E.g. if the limit for the node is 1025, we 
returned a batch of 1024 rows and are now processing another input batch of 
1024 rows, then this would return another 1024 rows instead of only 1 more.

It seems like we might be missing some test coverage here of select nodes with 
limits (it may also require a single node plan if there's an exchange inserted 
otherwise).

I think the old ReachedLimit() check plus FOREACH_ROW would work.


http://gerrit.cloudera.org:8080/#/c/8196/6/be/src/exec/select-node.h
File be/src/exec/select-node.h:

http://gerrit.cloudera.org:8080/#/c/8196/6/be/src/exec/select-node.h@70
PS6, Line 70:   Status CodegenCopyRows(RuntimeState* state);
> Good to know. Apparently missed this change while I was gone.
Yeah, there were actually two LLVM changes that might be of interest to you: 
https://gerrit.cloudera.org/#/q/status:merged+project:Impala-ASF+branch:master+topic:llvm-upgrade


http://gerrit.cloudera.org:8080/#/c/8196/7/be/src/exec/select-node.cc
File be/src/exec/select-node.cc:

http://gerrit.cloudera.org:8080/#/c/8196/7/be/src/exec/select-node.cc@91
PS7, Line 91:   DCHECK(*eos == false);
This isn't safe - the GetNext() API doesn't require that the caller initializes 
the output parameter. (IMO eos should be part of the ExecNode state rather than 
something transiently set as an output parameter, but that's a different 
discussion!)


http://gerrit.cloudera.org:8080/#/c/8196/7/be/src/exec/select-node.cc@112
PS7, Line 112:
nit: remove extra vertical whitespace



--
To view, visit http://gerrit.cloudera.org:8080/8196
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie0d496d004418468e16b6f564f90f45ebbf87c1e
Gerrit-Change-Number: 8196
Gerrit-PatchSet: 7
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 10 Oct 2017 22:38:56 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5448: fix invalid number of splits reported in Parquet scan node

2017-10-10 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8147 )

Change subject: IMPALA-5448: fix invalid number of splits reported in Parquet 
scan node
..


Patch Set 8:

Thanks for the contribution!


--
To view, visit http://gerrit.cloudera.org:8080/8147
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaacc2d775032f5707061e704f12e0a63cde695d1
Gerrit-Change-Number: 8147
Gerrit-PatchSet: 8
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 10 Oct 2017 22:22:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5425: Add test for validating input when setting query options

2017-10-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7805 )

Change subject: IMPALA-5425: Add test for validating input when setting query 
options
..


Patch Set 15: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/7805
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I510e02bb0776673d8cbfc22b903831882c6908d7
Gerrit-Change-Number: 7805
Gerrit-PatchSet: 15
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 10 Oct 2017 21:54:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5425: Add test for validating input when setting query options

2017-10-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/7805 )

Change subject: IMPALA-5425: Add test for validating input when setting query 
options
..

IMPALA-5425: Add test for validating input when setting query options

This patch adds multiple query option validation testcases to
be/src/service/query-options-test.cc
The test cases include parsing edge cases, bondary values, special
cases for some options and some testcases moved from
testdata/workloads/functional-query/queries/QueryTest/set.test
This patch also fixes a bug generating wrong error message for
query option RUNTIME_FILTER_WAIT_TIME_MS.

Change-Id: I510e02bb0776673d8cbfc22b903831882c6908d7
Reviewed-on: http://gerrit.cloudera.org:8080/7805
Reviewed-by: Dan Hecht 
Tested-by: Impala Public Jenkins
---
M be/src/service/query-options-test.cc
M be/src/service/query-options.cc
M testdata/workloads/functional-query/queries/QueryTest/set.test
3 files changed, 267 insertions(+), 128 deletions(-)

Approvals:
  Dan Hecht: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/7805
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I510e02bb0776673d8cbfc22b903831882c6908d7
Gerrit-Change-Number: 7805
Gerrit-PatchSet: 16
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4236: Codegen CopyRows() for select nodes

2017-10-10 Thread Bikramjeet Vig (Code Review)
Hello Michael Ho, Tim Armstrong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8196

to look at the new patch set (#7).

Change subject: IMPALA-4236: Codegen CopyRows() for select nodes
..

IMPALA-4236: Codegen CopyRows() for select nodes

Testing:
Added test case to verify that CopyRows in select node is successfully
codegened.

Performance:
Queries used (num_nodes set to 1):
500 Predicates: select * from (select * from tpch_parquet.lineitem
limit 6001215) t1 where l_partkey > 10 and l_extendedprice > 1 and
l_linenumber > 1 and l_comment >'foo0'  and l_comment >'foo500'
order by l_orderkey limit 10;

1 Predicate: select * from (select * from tpch_parquet.lineitem
limit 6001215) t1 where l_partkey > 10 order by l_orderkey limit 10;

+--+-+
|  |  500 Predicates  |   1 Predicate|
|  ++-++-+
|  |   After|   Before|   After|   Before|
+--++-++-+
| Select Node  | 12s385ms   | 1m1s| 234ms  | 797ms   |
| Codegen time | 2s619ms| 1s962ms | 200ms  | 181ms   |
+--++-++-+

Change-Id: Ie0d496d004418468e16b6f564f90f45ebbf87c1e
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/impala-ir.cc
M be/src/exec/CMakeLists.txt
A be/src/exec/select-node-ir.cc
M be/src/exec/select-node.cc
M be/src/exec/select-node.h
M tests/query_test/test_codegen.py
7 files changed, 121 insertions(+), 64 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/96/8196/7
--
To view, visit http://gerrit.cloudera.org:8080/8196
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie0d496d004418468e16b6f564f90f45ebbf87c1e
Gerrit-Change-Number: 8196
Gerrit-PatchSet: 7
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4236: Codegen CopyRows() for select nodes

2017-10-10 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8196 )

Change subject: IMPALA-4236: Codegen CopyRows() for select nodes
..


Patch Set 6:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/8196/6/be/src/exec/select-node-ir.cc
File be/src/exec/select-node-ir.cc:

http://gerrit.cloudera.org:8080/#/c/8196/6/be/src/exec/select-node-ir.cc@29
PS6, Line 29: while (child_row_idx_ < child_batch_num_rows)
> Have you looked into using FOREACH_ROW_LIMIT() ? May have some complication
Done


http://gerrit.cloudera.org:8080/#/c/8196/6/be/src/exec/select-node.h
File be/src/exec/select-node.h:

http://gerrit.cloudera.org:8080/#/c/8196/6/be/src/exec/select-node.h@40
PS6, Line 40:   virtual void Codegen(RuntimeState* state);
> While you are at it, would you mind adding override to these functions ?
Done


http://gerrit.cloudera.org:8080/#/c/8196/6/be/src/exec/select-node.h@63
PS6, Line 63: codegend_copy_rows_fn_;
> Feel free to ignore but I think we have recently switched to initializing f
I should do that change for all the member variables. Do you recommend I make 
that change in this commit?


http://gerrit.cloudera.org:8080/#/c/8196/6/be/src/exec/select-node.h@69
PS6, Line 69: /// Codegen CopyRows().
> May help to explain that this is mostly codegen'ing the conjuncts evaluatio
Done


http://gerrit.cloudera.org:8080/#/c/8196/6/be/src/exec/select-node.cc
File be/src/exec/select-node.cc:

http://gerrit.cloudera.org:8080/#/c/8196/6/be/src/exec/select-node.cc@91
PS6, Line 91:   if (ReachedLimit() || (child_row_idx_ == 
child_row_batch_->num_rows() && child_eos_)) {
> This is still a bit wonky (not your change but we should try to leave this
Done


http://gerrit.cloudera.org:8080/#/c/8196/6/be/src/exec/select-node.cc@107
PS6, Line 107:   child_row_batch_->TransferResourceOwnership(row_batch);
> Sorry for the additional churn but I think the resource transfer is overly
Done



--
To view, visit http://gerrit.cloudera.org:8080/8196
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie0d496d004418468e16b6f564f90f45ebbf87c1e
Gerrit-Change-Number: 8196
Gerrit-PatchSet: 6
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 10 Oct 2017 21:46:37 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-10-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..


Patch Set 15: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1323/


--
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 15
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Tue, 10 Oct 2017 21:46:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6002: Add a LLVM diagnostic handler for LLVM linker errors

2017-10-10 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8233 )

Change subject: IMPALA-6002: Add a LLVM diagnostic handler for LLVM linker 
errors
..


Patch Set 2:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/8233/2/be/src/codegen/llvm-codegen.h
File be/src/codegen/llvm-codegen.h:

http://gerrit.cloudera.org:8080/#/c/8233/2/be/src/codegen/llvm-codegen.h@781
PS2, Line 781:   /// Ensures that an attempt to read diagnostic state would 
reset the object.
Comment is kinda cryptic. Maybe needs some reference to the LLVM 
DiagnosticHandler API - I guess the idea of the class is to implement LLVM's 
DiagnosticHandler API and save the error information from callbacks.?


http://gerrit.cloudera.org:8080/#/c/8233/2/be/src/codegen/llvm-codegen.h@784
PS2, Line 784:
nit: extra blank line.


http://gerrit.cloudera.org:8080/#/c/8233/2/be/src/codegen/llvm-codegen.h@785
PS2, Line 785: DiagnosticHandler() : encountered_error_(false) {}
Let's initialise the member variables inline.


http://gerrit.cloudera.org:8080/#/c/8233/2/be/src/codegen/llvm-codegen.h@787
PS2, Line 787: /// Returns the error string if an error was encountered and 
resets the state in
Can you give some hint about when a caller should call this? I.e. after an LLVM 
API call that can fail but returns error info via this mechanism. We probably 
want to add a TODO and file a follow-up JIRA to make sure that we're checking 
for errors everywhere that we need to.


http://gerrit.cloudera.org:8080/#/c/8233/2/be/src/codegen/llvm-codegen.h@789
PS2, Line 789: string
std::string in a header (I guess there's a rogue "using std::string" somewhere 
in another header).


http://gerrit.cloudera.org:8080/#/c/8233/2/be/src/codegen/llvm-codegen.h@791
PS2, Line 791: /// Handler function that sets the state on an instance of 
this class
I think the comments could make the relationship between these two functions 
more obvious. E.g. GetErrorString() returns the last error that was reported 
via DiagnosticHandlerFn()


http://gerrit.cloudera.org:8080/#/c/8233/2/be/src/codegen/llvm-codegen.h@796
PS2, Line 796: string error_str_;
Can you briefly document the member variables and the relationship between 
them? Since we return an empty string to indicate an error above, is 
'encountered_error_' redundant? I.e. does {true, ""} behave differently to 
{false, ""}?


http://gerrit.cloudera.org:8080/#/c/8233/2/be/src/codegen/llvm-codegen.h@796
PS2, Line 796: string
std::string.


http://gerrit.cloudera.org:8080/#/c/8233/2/be/src/codegen/llvm-codegen.h@799
PS2, Line 799:   /// Maintains state set by diagnostic handler.
Comment is kinda cryptic. It may not be necessary to have both a class and 
member comment since there's only one instance of the class.


http://gerrit.cloudera.org:8080/#/c/8233/2/be/src/codegen/llvm-codegen.cc
File be/src/codegen/llvm-codegen.cc:

http://gerrit.cloudera.org:8080/#/c/8233/2/be/src/codegen/llvm-codegen.cc@1627
PS2, Line 1627: error_msg.flush();
I feel like we should also log the error at LOG(INFO) level so that it doesn't 
get swallowed silently in places where we're not checking for errors yet. These 
errors should be rare enough that they won't add too much to the logs.



--
To view, visit http://gerrit.cloudera.org:8080/8233
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I907ff1f8975c02d422b4a669afbfc2e536ddf1ff
Gerrit-Change-Number: 8233
Gerrit-PatchSet: 2
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Tue, 10 Oct 2017 21:00:23 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2758: Change BufferedTupleStream::GetRows to returning multi batches

2017-10-10 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8226 )

Change subject: IMPALA-2758: Change BufferedTupleStream::GetRows to returning 
multi batches
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8226/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8226/1//COMMIT_MSG@10
PS1, Line 10: which returns multiple batches instead of one.
> The callers iterates through the build rows for every row in null_probe_row
I'm ok with the stream staying pinned, the main goal was to remove the 
GetRows() API, which uses RowBatch in a weird way.

I think keeping the stream unpinned to handle many nulls on the build side is 
just a special case of
https://issues.apache.org/jira/browse/IMPALA-4857. I think that is harder to 
implement well - we'd want to consider doing some kind of blocked algorithm to 
reduce I/O.


http://gerrit.cloudera.org:8080/#/c/8226/1/be/src/runtime/buffered-tuple-stream.h
File be/src/runtime/buffered-tuple-stream.h:

http://gerrit.cloudera.org:8080/#/c/8226/1/be/src/runtime/buffered-tuple-stream.h@332
PS1, Line 332:   Status GetNext(
My intent was to remove the GetRows() API altogether and switch callers to 
calling GetNext() directly.

Creating many row batches is somewhat better, but there's still a lot of 
unnecessary memory overhead with all of the tuple_ptrs_.



--
To view, visit http://gerrit.cloudera.org:8080/8226
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3831c38994da2b69775a9809ff01de5d23584414
Gerrit-Change-Number: 8226
Gerrit-PatchSet: 1
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 10 Oct 2017 20:34:32 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC

2017-10-10 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8023 )

Change subject: IMPALA-4856: Port data stream service to KRPC
..


Patch Set 3:

(17 comments)

Some initial comments.

http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/common/status.cc
File be/src/common/status.cc:

http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/common/status.cc@246
PS3, Line 246:   }
this is the same as FromThrift() effectively, right? Can we make the two look 
the same to make that obvious?


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/common/status.cc@262
PS3, Line 262:
same comment. let's make this and ToThrift look the same so it's obvious they 
do the same things.
nit: also, could we order the functions consistently? We currently have 
ToThrift, FromThrift, FromProto, ToProto, and that ordering just makes it 
slightly slower to read through since it doesn't follow a pattern.


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/exec/exchange-node.cc
File be/src/exec/exchange-node.cc:

http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/exec/exchange-node.cc@109
PS3, Line 109:   RETURN_IF_CANCELLED(state);
why do we do this in some Open() but not all? Should we just do it in 
ExecNode::Open() and remove the ones in the derived classes?  okay to do 
separately from this patch.


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/rpc/rpc-mgr.h
File be/src/rpc/rpc-mgr.h:

http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/rpc/rpc-mgr.h@154
PS3, Line 154:   }
nit: one-liner?


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h
File be/src/runtime/krpc-data-stream-mgr.h:

http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@94
PS3, Line 94: buffer
what buffer? do you mean queue?


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@108
PS3, Line 108: During ordinary operation
what does that mean?  Is it saying that during unordinary operation, a sender 
can have both a TransmitData() and EndDataStream() call in-flight 
simultaneously?


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@140
PS3, Line 140: it will quietly drop its
what are "it" and "its" here? "the sender" and "the RPC's"?


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@141
PS3, Line 141: /// it returns.
is that still true now that we have cancellation of RPCs?


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@153
PS3, Line 153: fragment
sending fragment?


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@166
PS3, Line 166: request
is that talking about the 'request' field below, or something different?


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@175
PS3, Line 175:   const TransmitDataRequestPB* request;
what's the relationship between this and proto_batch?


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@178
PS3, Line 178:   /// responded to.
who owns it?


http://gerrit.cloudera.org:8080/#/c/8023/3/common/protobuf/data_stream_service.proto
File common/protobuf/data_stream_service.proto:

http://gerrit.cloudera.org:8080/#/c/8023/3/common/protobuf/data_stream_service.proto@29
PS3, Line 29:   optional int32 sender_id = 2;
:
:   optional int32 dest_node_id = 3;
what are "IDs" in these cases? let's improve the documentation here. Especially 
since type is no longer PlanNodeId (and why is that?).


http://gerrit.cloudera.org:8080/#/c/8023/3/common/protobuf/row_batch.proto
File common/protobuf/row_batch.proto:

http://gerrit.cloudera.org:8080/#/c/8023/3/common/protobuf/row_batch.proto@30
PS3, Line 30: int32
in thrift we had TTupleId. Is there a reason we aren't defining those types as 
well to make the structure clearer?


http://gerrit.cloudera.org:8080/#/c/8023/3/common/protobuf/row_batch.proto@32
PS3, Line 32: tuple_data
what's tuple_data? not a field in this structure...


http://gerrit.cloudera.org:8080/#/c/8023/3/common/protobuf/row_batch.proto@39
PS3, Line 39: Size
size of what?


http://gerrit.cloudera.org:8080/#/c/8023/3/common/protobuf/row_batch.proto@42
PS3, Line 42: (TODO(KRPC): native enum)
do we plan to fix that?



--
To view, visit http://gerrit.cloudera.org:8080/8023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic0b8c1e50678da66ab1547d16530f88b323ed8c1
Gerrit-Change-Number: 8023
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 10 Oct 2017 20:15:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5736: Add impala-shell argument to set default query options

2017-10-10 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8038 )

Change subject: IMPALA-5736: Add impala-shell argument to set default query 
options
..


Patch Set 3:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/8038/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8038/3//COMMIT_MSG@25
PS3, Line 25: EXPLAIN_LEVEL=2 and MT_DOP=1.:
> Maybe just make this the example in line 14 and 19?
Done


http://gerrit.cloudera.org:8080/#/c/8038/3/shell/impala_shell.py
File shell/impala_shell.py:

http://gerrit.cloudera.org:8080/#/c/8038/3/shell/impala_shell.py@1430
PS3, Line 1430:   options.set_query_options.update(
> Just to confirm: this is command line overriding what's in the config file,
Done


http://gerrit.cloudera.org:8080/#/c/8038/3/shell/impala_shell.py@1431
PS3, Line 1431:  [(k.upper(),v) for k,v in 
parse_variables(options.query_options).items()])
> nit: I believe pep8 style has a space after comma
Done


http://gerrit.cloudera.org:8080/#/c/8038/3/shell/option_parser.py
File shell/option_parser.py:

http://gerrit.cloudera.org:8080/#/c/8038/3/shell/option_parser.py@33
PS3, Line 33: def validate_and_convert_options(loaded_options, default_options):
> Let's be clear that these are impala_shell options and not impala_query opt
I did a bit of refactoring, I think that it is more clear now. Please comment 
if I should work on it more.


http://gerrit.cloudera.org:8080/#/c/8038/3/shell/option_parser.py@39
PS3, Line 39:   print >> sys.stderr, "WARNING: Unable to read configuration 
file correctly. " \
> Perhaps: "Ignoring unrecognized config option '%s'"?
Done


http://gerrit.cloudera.org:8080/#/c/8038/3/shell/option_parser.py@47
PS3, Line 47: loaded_options[i] = (option, True)
> It's weird that we're modifying loaded_options here.
The cause of the runtime error is fixed now, see impala_shell.py line 686-693.

Note that query options only work when the shell is connected to impalad and 
this was the reason why MT_DOP was not accepted here.


http://gerrit.cloudera.org:8080/#/c/8038/3/shell/option_parser.py@65
PS3, Line 65:   Validates some values (False, True, None)
> This only happens for the shell options part of it, yes?
Yes.


http://gerrit.cloudera.org:8080/#/c/8038/3/shell/option_parser.py@73
PS3, Line 73:   loaded_options = []
> This is both shell options and impala query options, yeah? Should we make t
Done


http://gerrit.cloudera.org:8080/#/c/8038/3/shell/option_parser.py@79
PS3, Line 79: loaded_options.append( ("set_query_options",
> It took me longer than I care to admit to understand what's going on here.
I have changed some of this, I hope that it is less messy now. Some of the 
option handling code in impala_shell.py is still a bit "strange" in my opinion, 
but I am not sure that rewriting it should be in the scope of this commit.


http://gerrit.cloudera.org:8080/#/c/8038/3/shell/option_parser.py@179
PS3, Line 179:  " KEY starts with an alphabetic 
character and"
> This is really about "key" being a valid query option, right? I think you c
Done



--
To view, visit http://gerrit.cloudera.org:8080/8038
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I26a3b67230c80a99bd246b6af205d558fec9a986
Gerrit-Change-Number: 8038
Gerrit-PatchSet: 3
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Tue, 10 Oct 2017 20:08:35 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5736: Add impala-shell argument to set default query options

2017-10-10 Thread Csaba Ringhofer (Code Review)
Hello Lars Volker, Matthew Jacobs, Philip Zeyliger,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8038

to look at the new patch set (#4).

Change subject: IMPALA-5736: Add impala-shell argument to set default query 
options
..

IMPALA-5736: Add impala-shell argument to set default query options

Query options can be set from command line and impala rc as
key=value pairs.

Examples:
command line:
impala-shell.sh -Q MT_DOP=1 --query_option=MAX_ERRORS=200

.impalarc:
[impala.query_options]
EXPLAIN_LEVEL=2
MT_DOP=2

The options set in command line will update the ones
in impalarc one by one, so the result of the example
above will be:
EXPLAIN_LEVEL=2
MT_DOP=1
MAX_ERRORS=200

Change-Id: I26a3b67230c80a99bd246b6af205d558fec9a986
---
M shell/impala_shell.py
M shell/option_parser.py
A tests/shell/impalarc_with_query_options
M tests/shell/test_shell_interactive.py
4 files changed, 83 insertions(+), 37 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/8038/4
--
To view, visit http://gerrit.cloudera.org:8080/8038
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I26a3b67230c80a99bd246b6af205d558fec9a986
Gerrit-Change-Number: 8038
Gerrit-PatchSet: 4
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Philip Zeyliger 


[Impala-ASF-CR] IMPALA-5243: Speed up code gen for wide Avro tables. (wip)

2017-10-10 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8211 )

Change subject: IMPALA-5243: Speed up code gen for wide Avro tables. (wip)
..


Patch Set 1:

This seems pretty beneficial - a 1-2s speedup is pretty significant for short 
running queries (which should be common for Avro since it's generally only 
recommended for smaller data sets). It feels like there's a bit of redundancy 
with the duplicated code to create functions with the same signatures but it's 
not immediately clear if there's a good abstraction to avoid it.


--
To view, visit http://gerrit.cloudera.org:8080/8211
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f1b390be4adf6e6699a18344234f8ff7ee74476
Gerrit-Change-Number: 8211
Gerrit-PatchSet: 1
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 10 Oct 2017 19:30:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6023: Fix broken breakpad test

2017-10-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8240 )

Change subject: IMPALA-6023: Fix broken breakpad test
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1325/


--
To view, visit http://gerrit.cloudera.org:8080/8240
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb5af3e72963280a6677a99aa6a0e5785443bb0c
Gerrit-Change-Number: 8240
Gerrit-PatchSet: 4
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 10 Oct 2017 19:06:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6002: Add a LLVM diagnostic handler for LLVM linker errors

2017-10-10 Thread anujphadke (Code Review)
anujphadke has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8233 )

Change subject: IMPALA-6002: Add a LLVM diagnostic handler for LLVM linker 
errors
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8233/2/be/src/codegen/llvm-codegen.cc
File be/src/codegen/llvm-codegen.cc:

http://gerrit.cloudera.org:8080/#/c/8233/2/be/src/codegen/llvm-codegen.cc@1621
PS2, Line 1621: DiagnosticHandler* diagnostic_handler = 
reinterpret_cast(context);
Can context be a nullptr in any scenarios? We should check for that before 
dereferencing it.



--
To view, visit http://gerrit.cloudera.org:8080/8233
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I907ff1f8975c02d422b4a669afbfc2e536ddf1ff
Gerrit-Change-Number: 8233
Gerrit-PatchSet: 2
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Tue, 10 Oct 2017 19:02:19 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork

2017-10-10 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7938 )

Change subject: IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7938/6/be/src/rpc/authentication.cc
File be/src/rpc/authentication.cc:

http://gerrit.cloudera.org:8080/#/c/7938/6/be/src/rpc/authentication.cc@77
PS6, Line 77: "Only used when FLAGS_use_krpc is false");
> Currently all our users use this flag. This needs to be set when we use our
Okay. So in other words, we could have fixed the old code as well, and 
eliminated this flag in that case.  (Let's not bother with that though, given 
we will rip that code out soon).



--
To view, visit http://gerrit.cloudera.org:8080/7938
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9cea56cc6e7412d87f4c2e92399a2f91ea6af6c7
Gerrit-Change-Number: 7938
Gerrit-PatchSet: 6
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 10 Oct 2017 18:45:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork

2017-10-10 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7938 )

Change subject: IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7938/6/be/src/rpc/authentication.cc
File be/src/rpc/authentication.cc:

http://gerrit.cloudera.org:8080/#/c/7938/6/be/src/rpc/authentication.cc@77
PS6, Line 77: "Only used when FLAGS_use_krpc is false");
> in what case would a user need to set this flag? and how is that case handl
Currently all our users use this flag. This needs to be set when we use our 
current way of forking kinit (i.e. in other words when FLAGS_use_krpc is false, 
in the context of this patch), since our RunKinit() function uses that to 
periodically wake up.

I'm not sure of the history of this flag, but it is a completely unnecessary 
flag. Our reinit interval should be based on the configured ticket lifetime in 
the system's krb5.conf file, which is what we've done in the kudu kinit case. 
It detects this ticket lifetime and wakes up periodically according to that to 
do the reinit.

Allowing a user to control this flag is very redundant and possibly confusing 
to users that understand kerberos.



--
To view, visit http://gerrit.cloudera.org:8080/7938
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9cea56cc6e7412d87f4c2e92399a2f91ea6af6c7
Gerrit-Change-Number: 7938
Gerrit-PatchSet: 6
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 10 Oct 2017 18:43:06 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork

2017-10-10 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7938 )

Change subject: IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7938/6/be/src/rpc/authentication.cc
File be/src/rpc/authentication.cc:

http://gerrit.cloudera.org:8080/#/c/7938/6/be/src/rpc/authentication.cc@77
PS6, Line 77: "Only used when FLAGS_use_krpc is false");
in what case would a user need to set this flag? and how is that case handled 
in the kudu kinit case?



--
To view, visit http://gerrit.cloudera.org:8080/7938
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9cea56cc6e7412d87f4c2e92399a2f91ea6af6c7
Gerrit-Change-Number: 7938
Gerrit-PatchSet: 6
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 10 Oct 2017 18:35:41 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork

2017-10-10 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7938 )

Change subject: IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7938/6/be/src/rpc/authentication.cc
File be/src/rpc/authentication.cc:

http://gerrit.cloudera.org:8080/#/c/7938/6/be/src/rpc/authentication.cc@842
PS6, Line 842: if (FLAGS_use_krpc_kinit) {
> can this path only be used with KRPC, or could it be used with thrift?
It can be used with both. The flag is there in case we run into unforeseen 
issues, so we can commit to removing it after one release.

I'll rename it to "use_kudu_kinit", since it's from the kudu security library 
and not the krpc library.



--
To view, visit http://gerrit.cloudera.org:8080/7938
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9cea56cc6e7412d87f4c2e92399a2f91ea6af6c7
Gerrit-Change-Number: 7938
Gerrit-PatchSet: 6
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 10 Oct 2017 18:29:27 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork

2017-10-10 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7938 )

Change subject: IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7938/6/be/src/rpc/authentication.cc
File be/src/rpc/authentication.cc:

http://gerrit.cloudera.org:8080/#/c/7938/6/be/src/rpc/authentication.cc@842
PS6, Line 842: if (FLAGS_use_krpc_kinit) {
can this path only be used with KRPC, or could it be used with thrift?

If the latter, why have the flag? (Or at least, can we commit to removing the 
flag and the old code after one release?)



--
To view, visit http://gerrit.cloudera.org:8080/7938
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9cea56cc6e7412d87f4c2e92399a2f91ea6af6c7
Gerrit-Change-Number: 7938
Gerrit-PatchSet: 6
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 10 Oct 2017 18:21:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Allow the SASL protocol service name to be configurable

2017-10-10 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8230 )

Change subject: Allow the SASL protocol service name to be configurable
..


Patch Set 1: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/8230
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9e30fe4461893b67527333259579e2304b19af1e
Gerrit-Change-Number: 8230
Gerrit-PatchSet: 1
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 10 Oct 2017 18:16:31 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5789: Add always false flag in bloom filter

2017-10-10 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8170 )

Change subject: IMPALA-5789: Add always_false flag in bloom filter
..


Patch Set 3:

(6 comments)

This is definitely going to conflict with my patch, but you should be able to 
get this in first so I'll probably have to do the painful rebasing.

http://gerrit.cloudera.org:8080/#/c/8170/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8170/3//COMMIT_MSG@13
PS3, Line 13: Testing
I assume you ran the existing runtime filter tests?


http://gerrit.cloudera.org:8080/#/c/8170/3/be/src/common/global-flags.cc
File be/src/common/global-flags.cc:

http://gerrit.cloudera.org:8080/#/c/8170/3/be/src/common/global-flags.cc@129
PS3, Line 129: order to provide a regression test for IMPALA-3798 and a test 
for IMPALA-5789.
This is a little cluttered. I think its okay to just say "for testing purposes" 
and anyone who wants to know the exact JIRAs can easily grep for uses of it.


http://gerrit.cloudera.org:8080/#/c/8170/3/be/src/exec/filter-context.h
File be/src/exec/filter-context.h:

http://gerrit.cloudera.org:8080/#/c/8170/3/be/src/exec/filter-context.h@129
PS3, Line 129: HasAlwaysFalseInList
I think this needs to be renamed for two reasons:
- We're eventually going to be adding 'in-list' filters (in addition to the 
existing "bloom" and soon "min-max" filters), so the "InList" here is 
potentially confusing.
- Its unusual for a function starting with "Has" to have side effects 
(incrementing the stats).

Maybe "CheckForAlwaysFalse"? Or something better you come up with.


http://gerrit.cloudera.org:8080/#/c/8170/3/be/src/runtime/runtime-filter.inline.h
File be/src/runtime/runtime-filter.inline.h:

http://gerrit.cloudera.org:8080/#/c/8170/3/be/src/runtime/runtime-filter.inline.h@53
PS3, Line 53: return bloom_filter_ != BloomFilter::ALWAYS_TRUE_FILTER &&
:   bloom_filter_->AlwaysFalse();
single line?


http://gerrit.cloudera.org:8080/#/c/8170/3/be/src/util/bloom-filter.h
File be/src/util/bloom-filter.h:

http://gerrit.cloudera.org:8080/#/c/8170/3/be/src/util/bloom-filter.h@111
PS3, Line 111: hasn't been inserted any elements
hasn't had any elements inserted


http://gerrit.cloudera.org:8080/#/c/8170/3/tests/custom_cluster/test_always_false_filter.py
File tests/custom_cluster/test_always_false_filter.py:

http://gerrit.cloudera.org:8080/#/c/8170/3/tests/custom_cluster/test_always_false_filter.py@34
PS3, Line 34: impalad = self.cluster.get_any_impalad()
: client = impalad.service.create_beeswax_client()
I think you can just add 'cursor' as a parameter to the 'test_' functions.



--
To view, visit http://gerrit.cloudera.org:8080/8170
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If680240a3cd4583fc97c3192177d86d9567c4f8d
Gerrit-Change-Number: 8170
Gerrit-PatchSet: 3
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 10 Oct 2017 18:11:44 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5425: Add test for validating input when setting query options

2017-10-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7805 )

Change subject: IMPALA-5425: Add test for validating input when setting query 
options
..


Patch Set 15:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1324/


--
To view, visit http://gerrit.cloudera.org:8080/7805
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I510e02bb0776673d8cbfc22b903831882c6908d7
Gerrit-Change-Number: 7805
Gerrit-PatchSet: 15
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 10 Oct 2017 17:54:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5425: Add test for validating input when setting query options

2017-10-10 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7805 )

Change subject: IMPALA-5425: Add test for validating input when setting query 
options
..


Patch Set 15: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/7805
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I510e02bb0776673d8cbfc22b903831882c6908d7
Gerrit-Change-Number: 7805
Gerrit-PatchSet: 15
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 10 Oct 2017 17:54:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-10-10 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..


Patch Set 15: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 15
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Tue, 10 Oct 2017 17:48:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6023: Fix broken breakpad test

2017-10-10 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8240 )

Change subject: IMPALA-6023: Fix broken breakpad test
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8240
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb5af3e72963280a6677a99aa6a0e5785443bb0c
Gerrit-Change-Number: 8240
Gerrit-PatchSet: 4
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 10 Oct 2017 17:47:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-10-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..


Patch Set 15:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1323/


--
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 15
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Tue, 10 Oct 2017 17:48:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6023: Fix broken breakpad test

2017-10-10 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8240 )

Change subject: IMPALA-6023: Fix broken breakpad test
..


Patch Set 4:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8240/3/tests/custom_cluster/test_breakpad.py
File tests/custom_cluster/test_breakpad.py:

http://gerrit.cloudera.org:8080/#/c/8240/3/tests/custom_cluster/test_breakpad.py@149
PS3, Line 149: @pytest.mark.execute_serially
> This was here due to needing DCHECK behavior before, right? What about abor
Done


http://gerrit.cloudera.org:8080/#/c/8240/3/tests/custom_cluster/test_breakpad.py@174
PS3, Line 174:
> Change to:
Done


http://gerrit.cloudera.org:8080/#/c/8240/3/tests/custom_cluster/test_breakpad.py@333
PS3, Line 333:
> Some blank lines at the end of the file.
Done



--
To view, visit http://gerrit.cloudera.org:8080/8240
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb5af3e72963280a6677a99aa6a0e5785443bb0c
Gerrit-Change-Number: 8240
Gerrit-PatchSet: 4
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 10 Oct 2017 17:26:24 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6023: Fix broken breakpad test

2017-10-10 Thread Lars Volker (Code Review)
Hello Michael Brown, Sailesh Mukil,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8240

to look at the new patch set (#4).

Change subject: IMPALA-6023: Fix broken breakpad test
..

IMPALA-6023: Fix broken breakpad test

We have a test to make sure that hitting a DCHECK will write a minidump.
We used to pass "-beeswax_port=1" to the server to trigger a DCHECK. A
while ago, this DCHECK seems to have been removed, but we still called
abort() if the ImpalaServer failed to start. This masked the slightly
altered behavior and the test still succeeded.

However, the fix for IMPALA-4786 changed the behavior to call exit(1)
instead of abort() if the ImpalaServer failed to start.

To fix the test, we change it to pass an unresolvable hostname to
impalad, which will result in a call to abort().

This change also splits the breakpad tests into core and exhaustive sets
to make sure that tests which depend on other parts of Impala are
included in every core run.

Change-Id: Ifb5af3e72963280a6677a99aa6a0e5785443bb0c
---
M tests/custom_cluster/test_breakpad.py
1 file changed, 33 insertions(+), 23 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/8240/4
--
To view, visit http://gerrit.cloudera.org:8080/8240
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifb5af3e72963280a6677a99aa6a0e5785443bb0c
Gerrit-Change-Number: 8240
Gerrit-PatchSet: 4
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-6002: Add a LLVM diagnostic handler for LLVM linker errors

2017-10-10 Thread Bikramjeet Vig (Code Review)
Hello Philip Zeyliger, Tim Armstrong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8233

to look at the new patch set (#2).

Change subject: IMPALA-6002: Add a LLVM diagnostic handler for LLVM linker 
errors
..

IMPALA-6002: Add a LLVM diagnostic handler for LLVM linker errors

Currently if llvm encounters an error it calls exit() and this kills
the impalad process. This patch adds a diagnostic handler that llvm
can use to pass the error up to impala instead of crashing it.

Testing:
Added a test which induces an error in llvm and makes sure that its
passed up to impala code and handled correctly.

Change-Id: I907ff1f8975c02d422b4a669afbfc2e536ddf1ff
---
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M tests/query_test/test_codegen.py
3 files changed, 72 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/33/8233/2
--
To view, visit http://gerrit.cloudera.org:8080/8233
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I907ff1f8975c02d422b4a669afbfc2e536ddf1ff
Gerrit-Change-Number: 8233
Gerrit-PatchSet: 2
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6002: Add a LLVM diagnostic handler for LLVM linker errors

2017-10-10 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8233 )

Change subject: IMPALA-6002: Add a LLVM diagnostic handler for LLVM linker 
errors
..


Patch Set 1:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/8233/1/be/src/codegen/llvm-codegen.h
File be/src/codegen/llvm-codegen.h:

http://gerrit.cloudera.org:8080/#/c/8233/1/be/src/codegen/llvm-codegen.h@783
PS1, Line 783:   class DiagnosticState {
> Based on http://llvm.org/doxygen/DiagnosticHandler_8h_source.html, it looks
that DiagnosticHandler was only added recently (20 days ago) and is not 
available in the llvm version we use. currently the only way is to set a 
handler callback function.


http://gerrit.cloudera.org:8080/#/c/8233/1/be/src/codegen/llvm-codegen.cc
File be/src/codegen/llvm-codegen.cc:

http://gerrit.cloudera.org:8080/#/c/8233/1/be/src/codegen/llvm-codegen.cc@186
PS1, Line 186:   >diagnostic_state_, true);
> Could you comment on what this third argument does?
Thanks for highlighting this. There are some cases where the diagnostic handler 
will not be called for some remarks based on filters set using commandline 
flags.

I had initially set this to true to receive all diagnostic messages regardless 
of filters, but since we are only interested in errors at the moment, we can do 
away with not receiving those remarks. hence I will revert this back to the 
default value of false.


http://gerrit.cloudera.org:8080/#/c/8233/1/be/src/codegen/llvm-codegen.cc@304
PS1, Line 304:   if (error || diagnostic_state_.EncounteredError()) {
> So do the diagnostic messages get printed anywhere?
Done, also removed this condition from the if statement because it seems that 
error is true (Linker::linkModules returns true) if a diagnostic handler is set 
and an error is encountered


http://gerrit.cloudera.org:8080/#/c/8233/1/be/src/codegen/llvm-codegen.cc@1621
PS1, Line 1621: DiagnosticPrinterRawOStream 
diagnostic_printer(llvm::errs());
> We talked about this interface offline and whether it was possible to get t
Done


http://gerrit.cloudera.org:8080/#/c/8233/1/be/src/codegen/llvm-codegen.cc@1622
PS1, Line 1622: info.print(diagnostic_printer);
> Does this make it to our logs?
Done


http://gerrit.cloudera.org:8080/#/c/8233/1/tests/query_test/test_codegen.py
File tests/query_test/test_codegen.py:

http://gerrit.cloudera.org:8080/#/c/8233/1/tests/query_test/test_codegen.py@45
PS1, Line 45:   def test_codegen_diagnostic_handler(self, vector):
> You could probably test this in be/src/codegen/llvm-codegen-test.cc with co
I tried moving this to llvm-codegen-test.cc but it turns out  that a lot ugly 
code needs to be added to induce a linkage error. I had 2 options there:
1) add a Gtest that calles LlvmCodeGen::LoadFunction twice on the same lib, but 
I need 2 different names to the same file to induce an error because there are 
checks to see if a file has already been linked. I am not sure I can do hdfs 
calls to create a copy of a lib through the backend test.

2) Since the methods related to this were private to LlvmCodegen class, I wrote 
a Test method in LlvmCodeGenTest to load module from a lib twice and call LLVM 
linker directly to those 2. but it turns out, that this adds alot of ugle code 
to the class LlvmCodeGenTest.

I think another alternative would be to figure out how we can induce another 
kind of error through a simpler way, but I'll have to look more into it. would 
really appreciate any suggestions here.
Thanks



--
To view, visit http://gerrit.cloudera.org:8080/8233
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I907ff1f8975c02d422b4a669afbfc2e536ddf1ff
Gerrit-Change-Number: 8233
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 10 Oct 2017 16:56:14 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6023: Fix broken breakpad test

2017-10-10 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8240 )

Change subject: IMPALA-6023: Fix broken breakpad test
..


Patch Set 3:

(3 comments)

Thanks for looking into this.

http://gerrit.cloudera.org:8080/#/c/8240/3/tests/custom_cluster/test_breakpad.py
File tests/custom_cluster/test_breakpad.py:

http://gerrit.cloudera.org:8080/#/c/8240/3/tests/custom_cluster/test_breakpad.py@149
PS3, Line 149: @SkipIfBuildType.not_dev_build
This was here due to needing DCHECK behavior before, right? What about abort() 
on release builds? Can this be removed?


http://gerrit.cloudera.org:8080/#/c/8240/3/tests/custom_cluster/test_breakpad.py@174
PS3, Line 174: super(TestBreakpadExhaustive).setup_class();
Change to:

  super(TestBreakpadExhaustive, cls).setup_class()

1. Drop the semi-colon
2. Pass in the cls argument to super()

Although what you wrote is syntactically valid, it's semantically invalid in 
this context and fails at runtime. More here: 
https://docs.python.org/2.7/library/functions.html#super

Btw, I found this doing an accounting of your tests to be run in both core and 
exhaustive:

  impala-py.test --workload_exploration_strategy functional-query:exhaustive 
tests/custom_cluster/test_breakpad.py


http://gerrit.cloudera.org:8080/#/c/8240/3/tests/custom_cluster/test_breakpad.py@333
PS3, Line 333:
Some blank lines at the end of the file.



--
To view, visit http://gerrit.cloudera.org:8080/8240
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb5af3e72963280a6677a99aa6a0e5785443bb0c
Gerrit-Change-Number: 8240
Gerrit-PatchSet: 3
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 10 Oct 2017 15:50:42 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-10-10 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..


Patch Set 11:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/exprs/cast-functions-ir.cc
File be/src/exprs/cast-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/exprs/cast-functions-ir.cc@277
PS11, Line 277:   }
> I mean the warning messages that you're adding here have no verification.
I have added tests to exprs.test.
There are files called "out-of-range-timestamp-abort-on-error.test" and 
"out-of-range-timestamp-continue-on-error.test", but they seem to be Parquet 
specific.


http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/runtime/timestamp-test.cc
File be/src/runtime/timestamp-test.cc:

http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/runtime/timestamp-test.cc@670
PS11, Line 670: *
> Yes, we should follow the style guide so we have a consistent style, which
Done


http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/runtime/timestamp-value.cc
File be/src/runtime/timestamp-value.cc:

http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/runtime/timestamp-value.cc@116
PS11, Line 116: (
> still not fixed:
Done



--
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 11
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Tue, 10 Oct 2017 13:55:42 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-10-10 Thread Csaba Ringhofer (Code Review)
Hello Lars Volker, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/7954

to look at the new patch set (#14).

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..

IMPALA-5664: Unix time to timestamp conversions may crash Impala

TimestampValue::FromSubsecondUnixTime() and UtcFromUnixTimeMicros()
are incorrect only in case of the last second of 1399, because these
sub-second values are rounded first towards 1400-01-01 00:00:00, which
is accepted as a valid date, and the sub-second part is subtracted
afterwards, leading to a date outside the valid interval. The maximum
case, -12-31 59:59:59 is a bit different, because as I understand,
with nanosecond precision posix times, the maximum value is actually
1-01-01. 00:00:00 minus 1 nanosec.

TimestampValue::FromUnixTimeNanos() can create problematic TimestampValues
both <1400 and 1<=.

These timestamps can cause problems, because most code assumes that if
HasDate/HasTime is true, then it really is a valid timestamp.

To fix this, the posix times are checked in the constructor of
TimestampValue, and if it is outside the valid interval,both time_
and date_ are set to not_a_date_time.

Also added logging to "cast to timestamp" functions.

Test:
select cast(-17987443200-0.1 as timestamp);
This query no longer crashes, but returns NULL, similarly to other
< 1400 timestamps.

Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
---
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-stats.inline.h
M be/src/exprs/cast-functions-ir.cc
M be/src/exprs/expr-test.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.cc
M be/src/runtime/timestamp-value.h
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
9 files changed, 183 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/7954/14
--
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 14
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-10-10 Thread Csaba Ringhofer (Code Review)
Hello Lars Volker, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/7954

to look at the new patch set (#13).

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..

IMPALA-5664: Unix time to timestamp conversions may crash Impala

TimestampValue::FromSubsecondUnixTime() and UtcFromUnixTimeMicros()
are incorrect only in case of the last second of 1399, because these
sub-second values are rounded first towards 1400-01-01 00:00:00, which
is accepted as a valid date, and the sub-second part is subtracted
afterwards, leading to a date outside the valid interval. The maximum
case, -12-31 59:59:59 is a bit different, because as I understand,
with nanosecond precision posix times, the maximum value is actually
1-01-01. 00:00:00 minus 1 nanosec.

TimestampValue::FromUnixTimeNanos() can create problematic TimestampValues
both <1400 and 1<=.

These timestamps can cause problems, because most code assumes that if
HasDate/HasTime is true, then it really is a valid timestamp.

To fix this, the posix times are checked in the constructor of
TimestampValue, and if it is outside the valid interval,both time_
and date_ are set to not_a_date_time.

Also added logging to "cast to timestamp" functions.

Test:
select cast(-17987443200-0.1 as timestamp);
This query no longer crashes, but returns NULL, similarly to other
< 1400 timestamps.

Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
---
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-stats.inline.h
M be/src/exprs/cast-functions-ir.cc
M be/src/exprs/expr-test.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.cc
M be/src/runtime/timestamp-value.h
8 files changed, 81 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/7954/13
--
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 13
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork

2017-10-10 Thread Sailesh Mukil (Code Review)
Hello Michael Ho, Bikramjeet Vig,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/7938

to look at the new patch set (#6).

Change subject: IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork
..

IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork

Impala currently kinits by forking off a child process. This
has proved to be expensive in many cases since the subprocess
tries to reserve as much memory as Impala is currently using
which can be quite a lot.

This patch adds a flag called 'use_krpc_kinit' that defaults to
true. When it's true, it uses the Kudu security library's kinit code
that programatically uses the krb5 library to kinit.
When it's false, we run our current path which kicks off the
kinit-thread and forks off a kinit process periodically to reacquire
tickets based on FLAGS_kerberos_reinit_interval.

Converted existing tests in thrift-server-test to run with and
without kerberos. We now run this BE test with kerberos by using
Kudu's MiniKdc utility. This introduces a new dependency on some
kerberos binaries that are checked through FindKerberosPrograms.cmake.
Note that this is only a test dependency and not a dependency for
the impalad binaries and friends. Compilation will still succeed if
the kerberos binaries for the MiniKdc are not found, however, the
thrift-server-test will fail. We run with and without the
'use_krpc_kinit' flag.

Updated bin/bootstap_development.sh to install new sasl-gssapi
modules and the kerberos binaries required for the MiniKdc.

Testing: Verified with thrift-server-test and also manually on a
live kerberized cluster.

Change-Id: I9cea56cc6e7412d87f4c2e92399a2f91ea6af6c7
---
M CMakeLists.txt
M be/src/exec/kudu-util.h
M be/src/kudu/security/CMakeLists.txt
M be/src/kudu/security/init.cc
M be/src/kudu/security/test/mini_kdc.cc
M be/src/rpc/CMakeLists.txt
M be/src/rpc/auth-provider.h
M be/src/rpc/authentication.cc
M be/src/rpc/thrift-server-test.cc
M be/src/testutil/gtest-util.h
M bin/bootstrap_development.sh
A cmake_modules/FindKerberosPrograms.cmake
12 files changed, 230 insertions(+), 21 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/7938/6
--
To view, visit http://gerrit.cloudera.org:8080/7938
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9cea56cc6e7412d87f4c2e92399a2f91ea6af6c7
Gerrit-Change-Number: 7938
Gerrit-PatchSet: 6
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil