[Impala-ASF-CR] IMPALA-6537: Add missing ODBC scalar functions

2018-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9376 )

Change subject: IMPALA-6537: Add missing ODBC scalar functions
..


Patch Set 5: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/9376
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia60af2b4de8c098be7ecb3e60840e459ae10d499
Gerrit-Change-Number: 9376
Gerrit-PatchSet: 5
Gerrit-Owner: Greg Rahn 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Comment-Date: Fri, 23 Feb 2018 07:19:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6537: Add missing ODBC scalar functions

2018-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/9376 )

Change subject: IMPALA-6537: Add missing ODBC scalar functions
..

IMPALA-6537: Add missing ODBC scalar functions

This patch contains the following builtin function changes:

New aliases for existing functions:
- LEFT() same as STRLEFT()
- RIGHT() same as STRRIGHT()
- WEEK() same as WEEKOFYEAR()

New functions:
- QUARTER()
- MONTHNAME()

Refactors:
- Remove TimestampFunctions::DayName and add LongDayName to match pattern of
  TimestampFunctions::ShortDayName

Additionally, it adds the unit of QUARTER to EXTRACT() and DATE_PART()

Testing:
- manual testing comparing the translated ODBC functions to the
  non-translated ones
- added at least one new expr-test for aliases
- new expr-tests added for new functions

Change-Id: Ia60af2b4de8c098be7ecb3e60840e459ae10d499
Reviewed-on: http://gerrit.cloudera.org:8080/9376
Reviewed-by: Alex Behm 
Tested-by: Impala Public Jenkins
---
M be/src/exprs/expr-test.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/exprs/timestamp-functions.cc
M be/src/exprs/timestamp-functions.h
M be/src/exprs/udf-builtins-ir.cc
M common/function-registry/impala_functions.py
M common/thrift/Exprs.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
10 files changed, 108 insertions(+), 53 deletions(-)

Approvals:
  Alex Behm: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/9376
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ia60af2b4de8c098be7ecb3e60840e459ae10d499
Gerrit-Change-Number: 9376
Gerrit-PatchSet: 6
Gerrit-Owner: Greg Rahn 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins


[Impala-ASF-CR] KUDU-2305: Fix local variable usage to handle 2GB messages

2018-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9407 )

Change subject: KUDU-2305: Fix local variable usage to handle 2GB messages
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/9407
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8e468099b16f7eb55de71b753acae8f57d18287c
Gerrit-Change-Number: 9407
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Fri, 23 Feb 2018 06:58:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6275: Fix warn stacktrace in successful CTAS

2018-02-22 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9364 )

Change subject: IMPALA-6275: Fix warn stacktrace in successful CTAS
..


Patch Set 6: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/9364
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f07a188458954802fda20e3b3b56280d96e788e
Gerrit-Change-Number: 9364
Gerrit-PatchSet: 6
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Comment-Date: Fri, 23 Feb 2018 05:48:09 +
Gerrit-HasComments: No


Re: [Impala-ASF-CR] IMPALA-6275: Fix warn stacktrace in successful CTAS

2018-02-22 Thread Alexander Behm
GVO hit IMPALA-6532, going to retry

On Thu, Feb 22, 2018 at 6:31 PM, Impala Public Jenkins (Code Review) <
ger...@cloudera.org> wrote:

> Impala Public Jenkins *posted comments* on this change.
>
> View Change 
>
> Patch set 5:Verified -1
>
> Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1988/
>
>
> To view, visit change 9364 . To
> unsubscribe, visit settings .
> Gerrit-Project: Impala-ASF
> Gerrit-Branch: master
> Gerrit-MessageType: comment
> Gerrit-Change-Id: I6f07a188458954802fda20e3b3b56280d96e788e
> Gerrit-Change-Number: 9364
> Gerrit-PatchSet: 5
> Gerrit-Owner: Fredy Wijaya 
> Gerrit-Reviewer: Alex Behm 
> Gerrit-Reviewer: Fredy Wijaya 
> Gerrit-Reviewer: Impala Public Jenkins
> Gerrit-Comment-Date: Fri, 23 Feb 2018 02:31:09 +
> Gerrit-HasComments: No
>
> --
> You received this message because you are subscribed to the Google Groups
> "impala-cr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to impala-cr+unsubscr...@cloudera.com.
> For more options, visit https://groups.google.com/a/cloudera.com/d/optout.
>


[Impala-ASF-CR] IMPALA-6567: ResetMetadataStmt analysis should not load tables.

2018-02-22 Thread Alex Behm (Code Review)
Alex Behm has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/9418 )

Change subject: IMPALA-6567: ResetMetadataStmt analysis should not load tables.
..

IMPALA-6567: ResetMetadataStmt analysis should not load tables.

This fixes a regression introduced by IMPALA-5152 where
invalidate metadata  and refresh  accidentally
required the target table to be loaded during analysis,
ultimately leading to a double load in some situations
(load during analysis, then another load during execution).
Since the purpose of these statements is to reload
metadata it does not make sense to require a table load
during analysis - that load happens during execution.

Note that REFRESH  PARTITION () still
requires the containing table to be loaded. This was
the behavior before IMPALA-5152 and this patch does
not attempt to improve that.

Testing:
- added new unit test
- ran FE tests locally
- validated the desired behavior by inspecting logs
  and the timeine from invalidate/refresh statements

Change-Id: I7033781ebf27ea53cfd26ff0e4f74d4f242bd1dc
---
M fe/src/main/java/org/apache/impala/analysis/ResetMetadataStmt.java
M fe/src/test/java/org/apache/impala/analysis/StmtMetadataLoaderTest.java
2 files changed, 24 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/9418/2
--
To view, visit http://gerrit.cloudera.org:8080/9418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7033781ebf27ea53cfd26ff0e4f74d4f242bd1dc
Gerrit-Change-Number: 9418
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Behm 


[Impala-ASF-CR] IMPALA-6567: ResetMetadataStmt analysis should not load tables.

2018-02-22 Thread Alex Behm (Code Review)
Alex Behm has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/9418


Change subject: IMPALA-6567: ResetMetadataStmt analysis should not load tables.
..

IMPALA-6567: ResetMetadataStmt analysis should not load tables.

This fixes a regression introduced by IMPALA-5152 where
invaidate metadata  and refresh  accidentally
required the target table to be loaded during analysis,
ultimately leading to a double load in some situations
(load during analysis, then another load during execution).
Since the purpose of these statements is to reload
metadata it does not make sense to require a table load
during analysis - that load happens during execution.

Note that REFRESH  PARTITION () still
requires the containing table to be loaded. This was
the behavior before IMPALA-5152 and this patch does
not attempt to improve that.

Testing:
- added new unit test
- ran FE tests locally
- validated the desired behavior by inspecting logs
  and the timeine from invalidate/refresh statements

Change-Id: I7033781ebf27ea53cfd26ff0e4f74d4f242bd1dc
---
M fe/src/main/java/org/apache/impala/analysis/ResetMetadataStmt.java
M fe/src/test/java/org/apache/impala/analysis/StmtMetadataLoaderTest.java
2 files changed, 24 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/9418/1
--
To view, visit http://gerrit.cloudera.org:8080/9418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I7033781ebf27ea53cfd26ff0e4f74d4f242bd1dc
Gerrit-Change-Number: 9418
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Behm 


[Impala-ASF-CR] IMPALA-6549: Enable file handle cache by default

2018-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/9371 )

Change subject: IMPALA-6549: Enable file handle cache by default
..

IMPALA-6549: Enable file handle cache by default

The file handle cache was disabled by default
due to two HDFS issues: HDFS-12528 and HDFS-14872.
Both have been fixed and the CDH components in
the toolchain include both fixes.

This reenables the file handle cache by default.

Change-Id: I6935825a1c4c7b2da0bb877f732027be1a57a8b7
Reviewed-on: http://gerrit.cloudera.org:8080/9371
Reviewed-by: Joe McDonnell 
Tested-by: Impala Public Jenkins
---
M be/src/runtime/io/disk-io-mgr.cc
1 file changed, 7 insertions(+), 9 deletions(-)

Approvals:
  Joe McDonnell: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/9371
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I6935825a1c4c7b2da0bb877f732027be1a57a8b7
Gerrit-Change-Number: 9371
Gerrit-PatchSet: 4
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-6549: Enable file handle cache by default

2018-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9371 )

Change subject: IMPALA-6549: Enable file handle cache by default
..


Patch Set 3: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/9371
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6935825a1c4c7b2da0bb877f732027be1a57a8b7
Gerrit-Change-Number: 9371
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Fri, 23 Feb 2018 05:02:30 +
Gerrit-HasComments: No


[Impala-ASF-CR](2.x) IMPALA-4835: Part 3: switch I/O buffers to buffer pool

2018-02-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9417 )

Change subject: IMPALA-4835: Part 3: switch I/O buffers to buffer pool
..


Patch Set 1:

There were some conflicts here, so resolving them before the cherry-pick job 
gets to it. Just doing an initial test to see if things are clean.


--
To view, visit http://gerrit.cloudera.org:8080/9417
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic09c6196b31e55b301df45cc56d0b72cfece6786
Gerrit-Change-Number: 9417
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 23 Feb 2018 05:01:29 +
Gerrit-HasComments: No


[Impala-ASF-CR](2.x) IMPALA-4835: Part 3: switch I/O buffers to buffer pool

2018-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9417 )

Change subject: IMPALA-4835: Part 3: switch I/O buffers to buffer pool
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1996/


--
To view, visit http://gerrit.cloudera.org:8080/9417
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic09c6196b31e55b301df45cc56d0b72cfece6786
Gerrit-Change-Number: 9417
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Comment-Date: Fri, 23 Feb 2018 04:54:06 +
Gerrit-HasComments: No


[Impala-ASF-CR](2.x) IMPALA-4835: Part 3: switch I/O buffers to buffer pool

2018-02-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/9417


Change subject: IMPALA-4835: Part 3: switch I/O buffers to buffer pool
..

IMPALA-4835: Part 3: switch I/O buffers to buffer pool

This is the final patch to switch the Disk I/O manager to allocate all
buffer from the buffer pool and to reserve the buffers required for
a query upfront.

* The planner reserves enough memory to run a single scanner per
  scan node.
* The multi-threaded scan node must increase reservation before
  spinning up more threads.
* The scanner implementations must be careful to stay within their
  assigned reservation.

The row-oriented scanners were most straightforward, since they only
have a single scan range active at a time. A single I/O buffer is
sufficient to scan the whole file but more I/O buffers can improve I/O
throughput.

Parquet is more complex because it issues a scan range per column and
the sizes of the columns on disk are not known during planning. To
deal with this, the reservation in the frontend is based on a
heuristic involving the file size and # columns. The Parquet scanner
can then divvy up reservation to columns based on the size of column
data on disk.

I adjusted how the 'mem_limit' is divided between buffer pool and non
buffer pool memory for low mem_limits to account for the increase in
buffer pool memory.

Testing:
* Added more planner tests to cover reservation calcs for scan node.
* Test scanners for all file formats with the reservation denial debug
  action, to test behaviour when the scanners hit reservation limits.
* Updated memory and buffer pool limits for tests.
* Added unit tests for dividing reservation between columns in parquet,
  since the algorithm is non-trivial.

Perf:
I ran TPC-H and targeted perf locally comparing with master. Both
showed small improvements of a few percent and no regressions of
note. Cluster perf tests showed no significant change.

Conflicts:
be/src/exec/parquet-column-readers.cc

testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test

testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test

Change-Id: Ic09c6196b31e55b301df45cc56d0b72cfece6786
---
M be/src/exec/CMakeLists.txt
A be/src/exec/hdfs-parquet-scanner-test.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scan-node-mt.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-readers.h
M be/src/exec/scanner-context.cc
M be/src/exec/scanner-context.h
M be/src/runtime/bufferpool/reservation-tracker-test.cc
M be/src/runtime/bufferpool/reservation-util.cc
M be/src/runtime/io/disk-io-mgr-stress-test.cc
M be/src/runtime/io/disk-io-mgr-stress.cc
M be/src/runtime/io/disk-io-mgr-stress.h
M be/src/runtime/io/disk-io-mgr-test.cc
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/disk-io-mgr.h
M be/src/runtime/io/request-context.cc
M be/src/runtime/io/request-context.h
M be/src/runtime/io/request-ranges.h
M be/src/runtime/io/scan-range.cc
M be/src/runtime/tmp-file-mgr.cc
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/util/BitUtil.java
M fe/src/test/java/org/apache/impala/util/BitUtilTest.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
M testdata/workloads/functional-planner/queries/PlannerTest/disable-codegen.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test
M testdata/workloads/functional-planner/queries/PlannerTest/max-row-size.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/partition-pruning.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/sort-expr-materialization.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test
M testdata/workloads/functional-planner/queries/PlannerTest/tablesample.test
M testdata/workloads/functional-planner/queries/PlannerTest/union.test
M 
testdata/workloads/functional-query/queries/QueryTest/admission-reject-min-reservation.test
M testdata/workloads/functional-query/queries/QueryTest/analytic-fns.test
M testdata/workloads/functional-query/queries/QueryTest/codegen-mem-limit.test
M 

[Impala-ASF-CR] IMPALA-6482: add EXEC TIME LIMIT S option

2018-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9227 )

Change subject: IMPALA-6482: add EXEC_TIME_LIMIT_S option
..


Patch Set 10: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/9227
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id81772ee223ffb64746e241027a5a734a811e1b8
Gerrit-Change-Number: 9227
Gerrit-PatchSet: 10
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 23 Feb 2018 04:22:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6482: add EXEC TIME LIMIT S option

2018-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/9227 )

Change subject: IMPALA-6482: add EXEC_TIME_LIMIT_S option
..

IMPALA-6482: add EXEC_TIME_LIMIT_S option

This is similar to the QUERY_TIMEOUT_S option and shares most of the
implementation. The difference is that the timeout doesn't reset at
any point.

The time limit is measured from the start of query execution,
after the query is admitted, so planning, scheduling and
time spent in admission control is not counted towards the time limit.

Also fix validation of the related QUERY_TIMEOUT_S option, which
previously could ignore invalid input.

Testing:
Added tests for various permutations:
* With and without query_timeout_s set
* With and without result fetching keeping the query active

Change-Id: Id81772ee223ffb64746e241027a5a734a811e1b8
Reviewed-on: http://gerrit.cloudera.org:8080/9227
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins
---
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-options-test.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M tests/custom_cluster/test_query_expiration.py
10 files changed, 304 insertions(+), 111 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/9227
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Id81772ee223ffb64746e241027a5a734a811e1b8
Gerrit-Change-Number: 9227
Gerrit-PatchSet: 11
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4835: Part 3: switch I/O buffers to buffer pool

2018-02-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/8966 )

Change subject: IMPALA-4835: Part 3: switch I/O buffers to buffer pool
..

IMPALA-4835: Part 3: switch I/O buffers to buffer pool

This is the final patch to switch the Disk I/O manager to allocate all
buffer from the buffer pool and to reserve the buffers required for
a query upfront.

* The planner reserves enough memory to run a single scanner per
  scan node.
* The multi-threaded scan node must increase reservation before
  spinning up more threads.
* The scanner implementations must be careful to stay within their
  assigned reservation.

The row-oriented scanners were most straightforward, since they only
have a single scan range active at a time. A single I/O buffer is
sufficient to scan the whole file but more I/O buffers can improve I/O
throughput.

Parquet is more complex because it issues a scan range per column and
the sizes of the columns on disk are not known during planning. To
deal with this, the reservation in the frontend is based on a
heuristic involving the file size and # columns. The Parquet scanner
can then divvy up reservation to columns based on the size of column
data on disk.

I adjusted how the 'mem_limit' is divided between buffer pool and non
buffer pool memory for low mem_limits to account for the increase in
buffer pool memory.

Testing:
* Added more planner tests to cover reservation calcs for scan node.
* Test scanners for all file formats with the reservation denial debug
  action, to test behaviour when the scanners hit reservation limits.
* Updated memory and buffer pool limits for tests.
* Added unit tests for dividing reservation between columns in parquet,
  since the algorithm is non-trivial.

Perf:
I ran TPC-H and targeted perf locally comparing with master. Both
showed small improvements of a few percent and no regressions of
note. Cluster perf tests showed no significant change.

Change-Id: Ic09c6196b31e55b301df45cc56d0b72cfece6786
Reviewed-on: http://gerrit.cloudera.org:8080/8966
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins
---
M be/src/exec/CMakeLists.txt
A be/src/exec/hdfs-parquet-scanner-test.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scan-node-mt.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-readers.h
M be/src/exec/scanner-context.cc
M be/src/exec/scanner-context.h
M be/src/runtime/bufferpool/reservation-tracker-test.cc
M be/src/runtime/bufferpool/reservation-util.cc
M be/src/runtime/io/disk-io-mgr-stress-test.cc
M be/src/runtime/io/disk-io-mgr-stress.cc
M be/src/runtime/io/disk-io-mgr-stress.h
M be/src/runtime/io/disk-io-mgr-test.cc
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/disk-io-mgr.h
M be/src/runtime/io/request-context.cc
M be/src/runtime/io/request-context.h
M be/src/runtime/io/request-ranges.h
M be/src/runtime/io/scan-range.cc
M be/src/runtime/tmp-file-mgr.cc
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/util/BitUtil.java
M fe/src/test/java/org/apache/impala/util/BitUtilTest.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
M testdata/workloads/functional-planner/queries/PlannerTest/disable-codegen.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test
M testdata/workloads/functional-planner/queries/PlannerTest/max-row-size.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/partition-pruning.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/sort-expr-materialization.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test
M testdata/workloads/functional-planner/queries/PlannerTest/tablesample.test
M testdata/workloads/functional-planner/queries/PlannerTest/union.test
M 
testdata/workloads/functional-query/queries/QueryTest/admission-reject-min-reservation.test
M testdata/workloads/functional-query/queries/QueryTest/analytic-fns.test
M testdata/workloads/functional-query/queries/QueryTest/codegen-mem-limit.test
M 
testdata/workloads/functional-query/queries/QueryTest/disk-spill-encryption.test
M 

[Impala-ASF-CR] IMPALA-4835: Part 2: Allocate scan range buffers upfront

2018-02-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/8707 )

Change subject: IMPALA-4835: Part 2: Allocate scan range buffers upfront
..

IMPALA-4835: Part 2: Allocate scan range buffers upfront

This change is a step towards reserving memory for buffers from the
buffer pool and constraining per-scanner memory requirements. This
change restructures the DiskIoMgr code so that each ScanRange operates
with a fixed set of buffers that are allocated upfront and recycled as
the I/O mgr works through the ScanRange.

One major change is that ScanRanges get blocked when a buffer is not
available and get unblocked when a client returns a buffer via
ReturnBuffer(). I was able to remove the logic to maintain the
blocked_ranges_ list by instead adding a separate set with all ranges
that are active.

There is also some miscellaneous cleanup included - e.g. reducing the
amount of code devoted to maintaining counters and metrics.

One tricky part of the existing code was the it called
IssueInitialRanges() with empty lists of files and depended on
DiskIoMgr::AddScanRanges() to not check for cancellation in that case.
See  IMPALA-6564. I changed the logic to not try to issue ranges for
empty lists of files.

I plan to merge this along with the actual buffer pool switch, but
separated it out to allow review of the DiskIoMgr changes separate from
other aspects of the buffer pool switchover.

Testing:
* Ran core and exhaustive tests.

Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f
Reviewed-on: http://gerrit.cloudera.org:8080/8707
Reviewed-by: Tim Armstrong 
Tested-by: Tim Armstrong 
---
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/base-sequence-scanner.h
M be/src/exec/hdfs-lzo-text-scanner.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-mt.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/scanner-context.cc
M be/src/runtime/bufferpool/buffer-pool.h
M be/src/runtime/io/disk-io-mgr-internal.h
M be/src/runtime/io/disk-io-mgr-stress-test.cc
M be/src/runtime/io/disk-io-mgr-stress.cc
M be/src/runtime/io/disk-io-mgr-test.cc
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/disk-io-mgr.h
M be/src/runtime/io/request-context.cc
M be/src/runtime/io/request-context.h
M be/src/runtime/io/request-ranges.h
M be/src/runtime/io/scan-range.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/runtime/tmp-file-mgr.h
M be/src/util/bit-util-test.cc
M be/src/util/bit-util.h
24 files changed, 1,024 insertions(+), 594 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/8707
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f
Gerrit-Change-Number: 8707
Gerrit-PatchSet: 35
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation

2018-02-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8414 )

Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation
..


Patch Set 25: Verified+1

Verified with https://jenkins.impala.io/job/gerrit-verify-dryrun/1991/


--
To view, visit http://gerrit.cloudera.org:8080/8414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1
Gerrit-Change-Number: 8414
Gerrit-PatchSet: 25
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 23 Feb 2018 04:17:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4835: Part 2: Allocate scan range buffers upfront

2018-02-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8707 )

Change subject: IMPALA-4835: Part 2: Allocate scan range buffers upfront
..


Patch Set 34: Verified+1

Verified with https://jenkins.impala.io/job/gerrit-verify-dryrun/1991/


--
To view, visit http://gerrit.cloudera.org:8080/8707
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f
Gerrit-Change-Number: 8707
Gerrit-PatchSet: 34
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 23 Feb 2018 04:17:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation

2018-02-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/8414 )

Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation
..

IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation

In preparation for switching the I/O mgr to the buffer pool, this
removes and cleans up a lot of code so that the switchover patch starts
from a cleaner slate.

* Remove the free buffer cache (which will be replaced by buffer pool's
  own caching).
* Make memory limit exceeded error checking synchronous (in anticipation
  of having to propagate buffer pool errors synchronously).
* Simplify error propagation - remove the (ineffectual) code that
  enqueued BufferDescriptors containing error statuses.
* Document locking scheme better in a few places, make it part of the
  function signature when it seemed reasonable.
* Move ReturnBuffer() to ScanRange, because it is intrinsically
  connected with the lifecycle of a scan range.
* Separate external ReturnBuffer() and internal CleanUpBuffer()
  interfaces - previously callers of ReturnBuffer() were fudging
  the num_buffers_in_reader accounting to make the external interface work.
* Eliminate redundant state in ScanRange: 'eosr_returned_' and
  'is_cancelled_'.
* Clarify the logic around calling Close() for the last
  BufferDescriptor.
  -> There appeared to be an implicit assumption that buffers would be
 freed in the order they were returned from the scan range, so that
 the "eos" buffer was returned last. Instead just count the number
 of outstanding buffers to detect the last one.
  -> Touching the is_cancelled_ field without holding a lock was hard to
 reason about - violated locking rules and it was unclear that it
 was race-free.
* Remove DiskIoMgr::Read() to simplify the interface. It is trivial to
  inline at the callsites.

This will probably regress performance somewhat because of the cache
removal, so my plan is to merge it around the same time as switching
the I/O mgr to allocate from the buffer pool. I'm keeping the patches
separate to make reviewing easier.

Testing:
* Ran exhaustive tests
* Ran the disk-io-mgr-stress-test overnight

Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1
Reviewed-on: http://gerrit.cloudera.org:8080/8414
Reviewed-by: Tim Armstrong 
Tested-by: Tim Armstrong 
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/scanner-context.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/io/disk-io-mgr-stress.cc
M be/src/runtime/io/disk-io-mgr-test.cc
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/disk-io-mgr.h
M be/src/runtime/io/request-context.cc
M be/src/runtime/io/request-context.h
M be/src/runtime/io/request-ranges.h
M be/src/runtime/io/scan-range.cc
M be/src/runtime/mem-tracker.h
M be/src/runtime/test-env.cc
M be/src/runtime/tmp-file-mgr-test.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/util/impalad-metrics.cc
M be/src/util/impalad-metrics.h
19 files changed, 577 insertions(+), 909 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/8414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1
Gerrit-Change-Number: 8414
Gerrit-PatchSet: 26
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4835: Part 3: switch I/O buffers to buffer pool

2018-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8966 )

Change subject: IMPALA-4835: Part 3: switch I/O buffers to buffer pool
..


Patch Set 25: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/8966
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic09c6196b31e55b301df45cc56d0b72cfece6786
Gerrit-Change-Number: 8966
Gerrit-PatchSet: 25
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 23 Feb 2018 04:08:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6008: Creating a UDF from a shared library with a .ll extenion crashes impala

2018-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9154 )

Change subject: IMPALA-6008: Creating a UDF from a shared library with a .ll 
extenion crashes impala
..


Patch Set 7: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/9154
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id060668802ca9c80367cdc0e8a823b968d549bbb
Gerrit-Change-Number: 9154
Gerrit-PatchSet: 7
Gerrit-Owner: anujphadke 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Fri, 23 Feb 2018 03:50:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2990: Add a warning message during cancellation

2018-02-22 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9413 )

Change subject: IMPALA-2990: Add a warning message during cancellation
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9413/1/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

http://gerrit.cloudera.org:8080/#/c/9413/1/be/src/runtime/query-state.cc@281
PS1, Line 281:   LOG(ERROR) << "Cancelling fragment instances due to 
failure to report status. "
 :  << "Query may hang. See IMPALA-2990.";
Can we include information about which query/backend this is? That would let us 
know which query might be hanging.



--
To view, visit http://gerrit.cloudera.org:8080/9413
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icfd56edfe74707592f4dd9c840550b13cb80e9a0
Gerrit-Change-Number: 9413
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Fri, 23 Feb 2018 02:50:02 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2990: Add a warning message during cancellation

2018-02-22 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9413 )

Change subject: IMPALA-2990: Add a warning message during cancellation
..


Patch Set 1: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/9413
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icfd56edfe74707592f4dd9c840550b13cb80e9a0
Gerrit-Change-Number: 9413
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Fri, 23 Feb 2018 02:41:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6347: Monitor queue depth size for outgoing RPCs for Reactor threads

2018-02-22 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9384 )

Change subject: IMPALA-6347: Monitor queue depth size for outgoing RPCs for 
Reactor threads
..


Patch Set 10:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9384/9/www/rpcz.tmpl
File www/rpcz.tmpl:

http://gerrit.cloudera.org:8080/#/c/9384/9/www/rpcz.tmpl@238
PS9, Line 238: asc"
> nit: asc seems to be a slightly more natural ordering for IP addresses, e.g
Done



--
To view, visit http://gerrit.cloudera.org:8080/9384
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I84fea531e98d3e84fcc57bf7533655218bc91f4c
Gerrit-Change-Number: 9384
Gerrit-PatchSet: 10
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Fri, 23 Feb 2018 02:39:42 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6275: Fix warn stacktrace in successful CTAS

2018-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9364 )

Change subject: IMPALA-6275: Fix warn stacktrace in successful CTAS
..


Patch Set 5: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1988/


--
To view, visit http://gerrit.cloudera.org:8080/9364
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f07a188458954802fda20e3b3b56280d96e788e
Gerrit-Change-Number: 9364
Gerrit-PatchSet: 5
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Comment-Date: Fri, 23 Feb 2018 02:31:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2990: Add a warning message during cancellation

2018-02-22 Thread Michael Ho (Code Review)
Michael Ho has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/9413


Change subject: IMPALA-2990: Add a warning message during cancellation
..

IMPALA-2990: Add a warning message during cancellation

Until IMPALA-2990 is fixed, there is no easy way to
prevent query hangs due to cancellation of fragment
instances in a backend node when it fails to report
status to the coordinator. Given it's rather hard to
diagnose IMPALA-2990, this change adds a log statement
at the point of cancellation to warn about IMPALA-2990.
This hopefully makes diagnostics slightly easier.

Change-Id: Icfd56edfe74707592f4dd9c840550b13cb80e9a0
---
M be/src/runtime/query-state.cc
1 file changed, 7 insertions(+), 2 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/13/9413/1
--
To view, visit http://gerrit.cloudera.org:8080/9413
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Icfd56edfe74707592f4dd9c840550b13cb80e9a0
Gerrit-Change-Number: 9413
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 


[Impala-ASF-CR] IMPALA-6347: Monitor queue depth size for outgoing RPCs for Reactor threads

2018-02-22 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9384 )

Change subject: IMPALA-6347: Monitor queue depth size for outgoing RPCs for 
Reactor threads
..


Patch Set 9:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9384/9/www/rpcz.tmpl
File www/rpcz.tmpl:

http://gerrit.cloudera.org:8080/#/c/9384/9/www/rpcz.tmpl@238
PS9, Line 238: desc
nit: asc seems to be a slightly more natural ordering for IP addresses, e.g. 
10.0.0.1 would come before 10.0.0.2.



--
To view, visit http://gerrit.cloudera.org:8080/9384
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I84fea531e98d3e84fcc57bf7533655218bc91f4c
Gerrit-Change-Number: 9384
Gerrit-PatchSet: 9
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Fri, 23 Feb 2018 02:29:04 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6347: Monitor queue depth size for outgoing RPCs for Reactor threads

2018-02-22 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9384 )

Change subject: IMPALA-6347: Monitor queue depth size for outgoing RPCs for 
Reactor threads
..


Patch Set 9:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/9384/8/www/rpcz.tmpl
File www/rpcz.tmpl:

http://gerrit.cloudera.org:8080/#/c/9384/8/www/rpcz.tmpl@100
PS8, Line 100: >{{remote_ip}}
> I think the id is not needed anymore, here and in the line below.
Done


http://gerrit.cloudera.org:8080/#/c/9384/8/www/rpcz.tmpl@229
PS8, Line 229:   var table = $('#per_conn_metrics').DataTable();
> remove
Done



--
To view, visit http://gerrit.cloudera.org:8080/9384
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I84fea531e98d3e84fcc57bf7533655218bc91f4c
Gerrit-Change-Number: 9384
Gerrit-PatchSet: 9
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Fri, 23 Feb 2018 02:22:44 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6347: Monitor queue depth size for outgoing RPCs for Reactor threads

2018-02-22 Thread Sailesh Mukil (Code Review)
Hello Michael Ho, Lars Volker,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9384

to look at the new patch set (#9).

Change subject: IMPALA-6347: Monitor queue depth size for outgoing RPCs for 
Reactor threads
..

IMPALA-6347: Monitor queue depth size for outgoing RPCs for Reactor threads

On systems with slow networking large queuing can occur in the Reactor
threads. It would be good to quantify how much queueing occurred.

This patch extracts the OutboundTransfer queue size information from
KRPC and displays it on the /rpcz webpage.

Change-Id: I84fea531e98d3e84fcc57bf7533655218bc91f4c
---
M be/src/rpc/rpc-mgr.cc
M www/rpcz.tmpl
2 files changed, 46 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/84/9384/9
--
To view, visit http://gerrit.cloudera.org:8080/9384
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I84fea531e98d3e84fcc57bf7533655218bc91f4c
Gerrit-Change-Number: 9384
Gerrit-PatchSet: 9
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-6347: Monitor queue depth size for outgoing RPCs for Reactor threads

2018-02-22 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9384 )

Change subject: IMPALA-6347: Monitor queue depth size for outgoing RPCs for 
Reactor threads
..


Patch Set 8:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/9384/8/www/rpcz.tmpl
File www/rpcz.tmpl:

http://gerrit.cloudera.org:8080/#/c/9384/8/www/rpcz.tmpl@100
PS8, Line 100:  id="{{remote_ip}}_remote_ip"
I think the id is not needed anymore, here and in the line below.


http://gerrit.cloudera.org:8080/#/c/9384/8/www/rpcz.tmpl@229
PS8, Line 229:   var table;
remove



--
To view, visit http://gerrit.cloudera.org:8080/9384
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I84fea531e98d3e84fcc57bf7533655218bc91f4c
Gerrit-Change-Number: 9384
Gerrit-PatchSet: 8
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Fri, 23 Feb 2018 02:20:50 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6347: Monitor queue depth size for outgoing RPCs for Reactor threads

2018-02-22 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9384 )

Change subject: IMPALA-6347: Monitor queue depth size for outgoing RPCs for 
Reactor threads
..


Patch Set 7:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/9384/7/www/rpcz.tmpl
File www/rpcz.tmpl:

http://gerrit.cloudera.org:8080/#/c/9384/7/www/rpcz.tmpl@99
PS7, Line 99:   

[Impala-ASF-CR] IMPALA-6347: Monitor queue depth size for outgoing RPCs for Reactor threads

2018-02-22 Thread Sailesh Mukil (Code Review)
Hello Michael Ho, Lars Volker,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9384

to look at the new patch set (#8).

Change subject: IMPALA-6347: Monitor queue depth size for outgoing RPCs for 
Reactor threads
..

IMPALA-6347: Monitor queue depth size for outgoing RPCs for Reactor threads

On systems with slow networking large queuing can occur in the Reactor
threads. It would be good to quantify how much queueing occurred.

This patch extracts the OutboundTransfer queue size information from
KRPC and displays it on the /rpcz webpage.

Change-Id: I84fea531e98d3e84fcc57bf7533655218bc91f4c
---
M be/src/rpc/rpc-mgr.cc
M www/rpcz.tmpl
2 files changed, 47 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/84/9384/8
--
To view, visit http://gerrit.cloudera.org:8080/9384
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I84fea531e98d3e84fcc57bf7533655218bc91f4c
Gerrit-Change-Number: 9384
Gerrit-PatchSet: 8
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-6537: Add missing ODBC scalar functions

2018-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9376 )

Change subject: IMPALA-6537: Add missing ODBC scalar functions
..


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1995/


--
To view, visit http://gerrit.cloudera.org:8080/9376
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia60af2b4de8c098be7ecb3e60840e459ae10d499
Gerrit-Change-Number: 9376
Gerrit-PatchSet: 5
Gerrit-Owner: Greg Rahn 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Comment-Date: Fri, 23 Feb 2018 02:03:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6537: Add missing ODBC scalar functions

2018-02-22 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9376 )

Change subject: IMPALA-6537: Add missing ODBC scalar functions
..


Patch Set 5: Code-Review+2

Trivial fixes with expected test output and a missing "return"


--
To view, visit http://gerrit.cloudera.org:8080/9376
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia60af2b4de8c098be7ecb3e60840e459ae10d499
Gerrit-Change-Number: 9376
Gerrit-PatchSet: 5
Gerrit-Owner: Greg Rahn 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Comment-Date: Fri, 23 Feb 2018 02:03:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6537: Add missing ODBC scalar functions

2018-02-22 Thread Alex Behm (Code Review)
Alex Behm has uploaded a new patch set (#5) to the change originally created by 
Greg Rahn. ( http://gerrit.cloudera.org:8080/9376 )

Change subject: IMPALA-6537: Add missing ODBC scalar functions
..

IMPALA-6537: Add missing ODBC scalar functions

This patch contains the following builtin function changes:

New aliases for existing functions:
- LEFT() same as STRLEFT()
- RIGHT() same as STRRIGHT()
- WEEK() same as WEEKOFYEAR()

New functions:
- QUARTER()
- MONTHNAME()

Refactors:
- Remove TimestampFunctions::DayName and add LongDayName to match pattern of
  TimestampFunctions::ShortDayName

Additionally, it adds the unit of QUARTER to EXTRACT() and DATE_PART()

Testing:
- manual testing comparing the translated ODBC functions to the
  non-translated ones
- added at least one new expr-test for aliases
- new expr-tests added for new functions

Change-Id: Ia60af2b4de8c098be7ecb3e60840e459ae10d499
---
M be/src/exprs/expr-test.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/exprs/timestamp-functions.cc
M be/src/exprs/timestamp-functions.h
M be/src/exprs/udf-builtins-ir.cc
M common/function-registry/impala_functions.py
M common/thrift/Exprs.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
10 files changed, 108 insertions(+), 53 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/9376/5
--
To view, visit http://gerrit.cloudera.org:8080/9376
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia60af2b4de8c098be7ecb3e60840e459ae10d499
Gerrit-Change-Number: 9376
Gerrit-PatchSet: 5
Gerrit-Owner: Greg Rahn 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins


[Impala-ASF-CR] KUDU-2305: Fix local variable usage to handle 2GB messages

2018-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9407 )

Change subject: KUDU-2305: Fix local variable usage to handle 2GB messages
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1994/


--
To view, visit http://gerrit.cloudera.org:8080/9407
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8e468099b16f7eb55de71b753acae8f57d18287c
Gerrit-Change-Number: 9407
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Fri, 23 Feb 2018 01:51:21 +
Gerrit-HasComments: No


[Impala-ASF-CR](2.x) IMPALA-5152: Introduce metadata loading phase

2018-02-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9408 )

Change subject: IMPALA-5152: Introduce metadata loading phase
..


Patch Set 1: Code-Review+1

I took a look at the conflicts and satisfied myself that they were resolved 
correctly. In both cases this patch was adjusting code that existed in master 
but not in 2.x.


--
To view, visit http://gerrit.cloudera.org:8080/9408
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: comment
Gerrit-Change-Id: I68d32d5acd4a6f6bc6cedb05e6cc5cf604d24a55
Gerrit-Change-Number: 9408
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 23 Feb 2018 01:18:08 +
Gerrit-HasComments: No


[Impala-ASF-CR](2.x) IMPALA-5152: Introduce metadata loading phase

2018-02-22 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9408 )

Change subject: IMPALA-5152: Introduce metadata loading phase
..


Patch Set 1:

(2 comments)

Did not cherry-pick clean. I'm still running the tests.

http://gerrit.cloudera.org:8080/#/c/9408/1/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
File fe/src/main/java/org/apache/impala/analysis/SelectStmt.java:

http://gerrit.cloudera.org:8080/#/c/9408/1/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java@545
PS1, Line 545:   if 
(havingClause_.contains(Predicates.instanceOf(Subquery.class))) {
There was a conflict here with alias substitution changes that are in master 
but not here in 2.x.


http://gerrit.cloudera.org:8080/#/c/9408/1/fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java:

http://gerrit.cloudera.org:8080/#/c/9408/1/fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java@2219
PS1, Line 2219:   private void testDecimalExpr(String expr, Type expectedType) {
There was a conflict here with the patch in master that turns on DECIMAL_V2 on 
by default.



--
To view, visit http://gerrit.cloudera.org:8080/9408
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: comment
Gerrit-Change-Id: I68d32d5acd4a6f6bc6cedb05e6cc5cf604d24a55
Gerrit-Change-Number: 9408
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Comment-Date: Fri, 23 Feb 2018 01:09:08 +
Gerrit-HasComments: Yes


[Impala-ASF-CR](2.x) IMPALA-5152: Introduce metadata loading phase

2018-02-22 Thread Alex Behm (Code Review)
Hello Impala Public Jenkins,

I'd like you to do a code review. Please visit

http://gerrit.cloudera.org:8080/9408

to review the following change.


Change subject: IMPALA-5152: Introduce metadata loading phase
..

IMPALA-5152: Introduce metadata loading phase

Reworks the collection and loading of missing metadata
when compiling a statement. Introduces a new
metadata-loading phase between parsing and analysis.
Summary of the new compilation flow:
1. Parse statement.
2. Collect all table references from the parsed
   statement and generate a list of tables that need
   to be loaded for analysis to succeed.
3. Request missing metadata and wait for it to arrive.
   As views become loaded we expand the set of required
   tables based on the view definitions.
   This step populates a statement-local table cache
   that contains all loaded tables relevant to the
   statement.
4. Create a new Analyzer with the table cache and
   analyze the statement. During analysis only the
   table cache is consulted for table metadata, the
   ImpaladCatalog is not used for that purpose anymore.
5. Authorize the statement.
6. Plan generation as usual.

The intent of the existing code was to collect all tables
missing metadata during analysis, load the metadata, and then
re-analyze the statement (and repeat those steps until all
metadata is loaded).
Unfortunately, the relevant code was hard-to-follow, subtle
and not well tested, and therefore it was broken in several
ways over the course of time. For example, the introduction
of path analysis for nested types subtly broke the intended
behavior, and there are other similar examples.

The serial table loading observed in the JIRA was caused by the
following code in the resolution of table references:
for (all path interpretations) {
  try {
// Try to resolve the path; might call getTable() which
// throws for nonexistent tables.
  } catch (AnalysisException e) {
if (analyzer.hasMissingTbls()) throw e;
  }
}

The following example illustrates the problem:
SELECT * FROM a.b, x.y
When resolving the path "a.b" we consider that "a" could be a
database or a table. Similarly, "b" could be a table or a
nested collection.
If the path resolution for "a.b" adds a missing table entry,
then the path resolution for "x.y" could exit prematurely,
without trying the other path interpretations that would
lead to adding the expected missing table. So effectively,
the tables end up being loaded one-by-one.

Testing:
- A core/hdfs run succeeded
- No new tests were added because the existing functional tests
  provide good coverage of various metadata loading scenarios.
- The issue reported in IMPALA-5152 is basically impossible now.
  Adding FE unit tests for that bug specifically would require
  ugly changes to the new code to enable such testing.

Change-Id: I68d32d5acd4a6f6bc6cedb05e6cc5cf604d24a55
Reviewed-on: http://gerrit.cloudera.org:8080/8958
Reviewed-by: Alex Behm 
Tested-by: Impala Public Jenkins
---
M fe/src/main/java/org/apache/impala/analysis/AlterTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/AuthorizationStmt.java
M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateDbStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateFunctionStmtBase.java
M fe/src/main/java/org/apache/impala/analysis/CreateOrAlterViewStmtBase.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/DescribeDbStmt.java
M fe/src/main/java/org/apache/impala/analysis/DescribeTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/DropStatsStmt.java
M fe/src/main/java/org/apache/impala/analysis/DropTableOrViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/FromClause.java
M fe/src/main/java/org/apache/impala/analysis/GrantRevokePrivStmt.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/LimitElement.java
M fe/src/main/java/org/apache/impala/analysis/LoadDataStmt.java
M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java
M fe/src/main/java/org/apache/impala/analysis/Path.java
M fe/src/main/java/org/apache/impala/analysis/PrivilegeSpec.java
M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
M fe/src/main/java/org/apache/impala/analysis/ResetMetadataStmt.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/SetStmt.java
M fe/src/main/java/org/apache/impala/analysis/ShowCreateFunctionStmt.java
M 

[Impala-ASF-CR] IMPALA-6537: Add missing ODBC scalar functions

2018-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9376 )

Change subject: IMPALA-6537: Add missing ODBC scalar functions
..


Patch Set 4: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1987/


--
To view, visit http://gerrit.cloudera.org:8080/9376
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia60af2b4de8c098be7ecb3e60840e459ae10d499
Gerrit-Change-Number: 9376
Gerrit-PatchSet: 4
Gerrit-Owner: Greg Rahn 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Comment-Date: Fri, 23 Feb 2018 01:02:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] KUDU-2305: Fix local variable usage to handle 2GB messages

2018-02-22 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9407 )

Change subject: KUDU-2305: Fix local variable usage to handle 2GB messages
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/9407
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8e468099b16f7eb55de71b753acae8f57d18287c
Gerrit-Change-Number: 9407
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Fri, 23 Feb 2018 01:00:03 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6347: Monitor queue depth size for outgoing RPCs for Reactor threads

2018-02-22 Thread Sailesh Mukil (Code Review)
Hello Michael Ho, Lars Volker,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9384

to look at the new patch set (#7).

Change subject: IMPALA-6347: Monitor queue depth size for outgoing RPCs for 
Reactor threads
..

IMPALA-6347: Monitor queue depth size for outgoing RPCs for Reactor threads

On systems with slow networking large queuing can occur in the Reactor
threads. It would be good to quantify how much queueing occurred.

This patch extracts the OutboundTransfer queue size information from
KRPC and displays it on the /rpcz webpage.

Change-Id: I84fea531e98d3e84fcc57bf7533655218bc91f4c
---
M be/src/rpc/rpc-mgr.cc
M www/rpcz.tmpl
2 files changed, 52 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/84/9384/7
--
To view, visit http://gerrit.cloudera.org:8080/9384
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I84fea531e98d3e84fcc57bf7533655218bc91f4c
Gerrit-Change-Number: 9384
Gerrit-PatchSet: 7
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-6347: Monitor queue depth size for outgoing RPCs for Reactor threads

2018-02-22 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9384 )

Change subject: IMPALA-6347: Monitor queue depth size for outgoing RPCs for 
Reactor threads
..


Patch Set 6:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/9384/5/be/src/rpc/rpc-mgr.cc
File be/src/rpc/rpc-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/9384/5/be/src/rpc/rpc-mgr.cc@208
PS5, Line 208:   Value per_conn_metrics(kArrayType);
> nit: maybe add a comment similar to the one in L225?
Done


http://gerrit.cloudera.org:8080/#/c/9384/5/be/src/rpc/rpc-mgr.cc@233
PS5, Line 233:   document->AddMember("per_conn_metrics", per_conn_metrics, 
document->GetAllocator());
> Move to L222 right after its construction?
Done


http://gerrit.cloudera.org:8080/#/c/9384/5/www/rpcz.tmpl
File www/rpcz.tmpl:

http://gerrit.cloudera.org:8080/#/c/9384/5/www/rpcz.tmpl@96
PS5, Line 96:   
> The other .tmpl files have an empty tbody here, maybe it could increase rea
Done


http://gerrit.cloudera.org:8080/#/c/9384/5/www/rpcz.tmpl@237
PS5, Line 237:   for (var i = 0; i < json["per_conn_metrics"].length; ++i) {
> Can you rewrite this to use two nested calls to map()?
Good point, we don't want the list to keep fluttering for large clusters.

I had to adjust the syntax a bit, but I got that to work.



--
To view, visit http://gerrit.cloudera.org:8080/9384
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I84fea531e98d3e84fcc57bf7533655218bc91f4c
Gerrit-Change-Number: 9384
Gerrit-PatchSet: 6
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Fri, 23 Feb 2018 00:45:42 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6482: add EXEC TIME LIMIT S option

2018-02-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9227 )

Change subject: IMPALA-6482: add EXEC_TIME_LIMIT_S option
..


Patch Set 10: Code-Review+2

Carry +2


--
To view, visit http://gerrit.cloudera.org:8080/9227
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id81772ee223ffb64746e241027a5a734a811e1b8
Gerrit-Change-Number: 9227
Gerrit-PatchSet: 10
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 23 Feb 2018 00:41:37 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6482: add EXEC TIME LIMIT S option

2018-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9227 )

Change subject: IMPALA-6482: add EXEC_TIME_LIMIT_S option
..


Patch Set 10:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1993/


--
To view, visit http://gerrit.cloudera.org:8080/9227
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id81772ee223ffb64746e241027a5a734a811e1b8
Gerrit-Change-Number: 9227
Gerrit-PatchSet: 10
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 23 Feb 2018 00:41:48 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6482: add EXEC TIME LIMIT S option

2018-02-22 Thread Tim Armstrong (Code Review)
Hello Philip Zeyliger, Bikramjeet Vig, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9227

to look at the new patch set (#10).

Change subject: IMPALA-6482: add EXEC_TIME_LIMIT_S option
..

IMPALA-6482: add EXEC_TIME_LIMIT_S option

This is similar to the QUERY_TIMEOUT_S option and shares most of the
implementation. The difference is that the timeout doesn't reset at
any point.

The time limit is measured from the start of query execution,
after the query is admitted, so planning, scheduling and
time spent in admission control is not counted towards the time limit.

Also fix validation of the related QUERY_TIMEOUT_S option, which
previously could ignore invalid input.

Testing:
Added tests for various permutations:
* With and without query_timeout_s set
* With and without result fetching keeping the query active

Change-Id: Id81772ee223ffb64746e241027a5a734a811e1b8
---
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-options-test.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M tests/custom_cluster/test_query_expiration.py
10 files changed, 304 insertions(+), 111 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/27/9227/10
--
To view, visit http://gerrit.cloudera.org:8080/9227
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id81772ee223ffb64746e241027a5a734a811e1b8
Gerrit-Change-Number: 9227
Gerrit-PatchSet: 10
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6482: add EXEC TIME LIMIT S option

2018-02-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9227 )

Change subject: IMPALA-6482: add EXEC_TIME_LIMIT_S option
..


Patch Set 9:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/9227/9/be/src/service/impala-server.h
File be/src/service/impala-server.h:

http://gerrit.cloudera.org:8080/#/c/9227/9/be/src/service/impala-server.h@1038
PS9, Line 1038:  a query
> query execution
Done


http://gerrit.cloudera.org:8080/#/c/9227/9/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/9227/9/be/src/service/impala-server.cc@984
PS9, Line 984:  has timeout of
> might be nice to make these more specific:
Done


http://gerrit.cloudera.org:8080/#/c/9227/9/be/src/service/impala-server.cc@990
PS9, Line 990: has time limit
> has an execution time limit of
Done


http://gerrit.cloudera.org:8080/#/c/9227/9/be/src/service/impala-server.cc@1848
PS9, Line 1848: If the last-observed expiration time for this query is still in 
the future,
> not your change, but this seems wrong given that queries_by_timestamp has d
I think the logic is valid but the explanation is confusing. The basic thing is 
that they're processed in ascending order of deadline so you can break as soon 
as you see the first future deadline.

I *think* this comment was trying to say that the actual expiry time may be 
later than the original deadline, but it can't be earlier. I don't think that's 
necessary to explain this logic, so I removed it and left a simpler comment.


http://gerrit.cloudera.org:8080/#/c/9227/9/be/src/service/impala-server.cc@1864
PS9, Line 1864: time
> execution time?
Done


http://gerrit.cloudera.org:8080/#/c/9227/9/be/src/service/impala-server.cc@1865
PS9, Line 1865: time
> execution time?
Done


http://gerrit.cloudera.org:8080/#/c/9227/9/be/src/service/impala-server.cc@1872
PS9, Line 1872:
> DCHECK(kind == IDLE_TIMEOUT)
Done


http://gerrit.cloudera.org:8080/#/c/9227/9/be/src/service/impala-server.cc@1874
PS9, Line 1874: it
> last_active_ms?
Done


http://gerrit.cloudera.org:8080/#/c/9227/9/tests/custom_cluster/test_query_expiration.py
File tests/custom_cluster/test_query_expiration.py:

http://gerrit.cloudera.org:8080/#/c/9227/9/tests/custom_cluster/test_query_expiration.py@88
PS9, Line 88: self.assert_impalad_log_contains('INFO', "Expiring query due 
to client inactivity: "
: "[0-9a-f]+:[0-9a-f]+, last activity was at: 
\d\d\d\d-\d\d-\d\d \d\d:\d\d:\d\d")
: self.assert_impalad_log_contains('INFO',
: "Expiring query [0-9a-f]+:[0-9a-f]+ due to time limit of 
1s")
> why do we do that instead of checking the query status message? which we do
Good point. I added in __expect_expired too. That required a change below to 
ignore "Invalid or unknown query handle" errors from close_query(), which 
happen because fetching from an expired handle unregisters the query after 
returning the error.



--
To view, visit http://gerrit.cloudera.org:8080/9227
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id81772ee223ffb64746e241027a5a734a811e1b8
Gerrit-Change-Number: 9227
Gerrit-PatchSet: 9
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 23 Feb 2018 00:38:41 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6482: add EXEC TIME LIMIT S option

2018-02-22 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9227 )

Change subject: IMPALA-6482: add EXEC_TIME_LIMIT_S option
..


Patch Set 9: Code-Review+2

(9 comments)

http://gerrit.cloudera.org:8080/#/c/9227/9/be/src/service/impala-server.h
File be/src/service/impala-server.h:

http://gerrit.cloudera.org:8080/#/c/9227/9/be/src/service/impala-server.h@1038
PS9, Line 1038:  a query
query execution


http://gerrit.cloudera.org:8080/#/c/9227/9/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/9227/9/be/src/service/impala-server.cc@984
PS9, Line 984:  has timeout of
might be nice to make these more specific:
has an idle timeout of


http://gerrit.cloudera.org:8080/#/c/9227/9/be/src/service/impala-server.cc@990
PS9, Line 990: has time limit
has an execution time limit of


http://gerrit.cloudera.org:8080/#/c/9227/9/be/src/service/impala-server.cc@1848
PS9, Line 1848: If the last-observed expiration time for this query is still in 
the future,
not your change, but this seems wrong given that queries_by_timestamp has 
deadlines for various queries right?


http://gerrit.cloudera.org:8080/#/c/9227/9/be/src/service/impala-server.cc@1864
PS9, Line 1864: time
execution time?


http://gerrit.cloudera.org:8080/#/c/9227/9/be/src/service/impala-server.cc@1865
PS9, Line 1865: time
execution time?


http://gerrit.cloudera.org:8080/#/c/9227/9/be/src/service/impala-server.cc@1872
PS9, Line 1872:
DCHECK(kind == IDLE_TIMEOUT)


http://gerrit.cloudera.org:8080/#/c/9227/9/be/src/service/impala-server.cc@1874
PS9, Line 1874: it
last_active_ms?


http://gerrit.cloudera.org:8080/#/c/9227/9/tests/custom_cluster/test_query_expiration.py
File tests/custom_cluster/test_query_expiration.py:

http://gerrit.cloudera.org:8080/#/c/9227/9/tests/custom_cluster/test_query_expiration.py@88
PS9, Line 88: self.assert_impalad_log_contains('INFO', "Expiring query due 
to client inactivity: "
: "[0-9a-f]+:[0-9a-f]+, last activity was at: 
\d\d\d\d-\d\d-\d\d \d\d:\d\d:\d\d")
: self.assert_impalad_log_contains('INFO',
: "Expiring query [0-9a-f]+:[0-9a-f]+ due to time limit of 
1s")
why do we do that instead of checking the query status message? which we do 
below with __expect_expired, no?



--
To view, visit http://gerrit.cloudera.org:8080/9227
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id81772ee223ffb64746e241027a5a734a811e1b8
Gerrit-Change-Number: 9227
Gerrit-PatchSet: 9
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 23 Feb 2018 00:06:32 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5752: Add support for DECIMAL on Kudu tables

2018-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/9368 )

Change subject: IMPALA-5752: Add support for DECIMAL on Kudu tables
..

IMPALA-5752: Add support for DECIMAL on Kudu tables

Adds support for the Kudu DECIMAL type introduced in Kudu 1.7.0.

Note: Adding support for Kudu decimal min/max filters is
tracked in IMPALA-6533.

Tests:
* Added Kudu create with decimal test to AnalyzeDDLTest.java
* Added Kudu table_format to test_decimal_queries.py
** Both decimal.test and decimal-exprs.test workloads
* Added decimal queries to the following Kudu workloads:
** kudu_create.test
** kudu_delete.test
** kudu_insert.test
** kudu_update.test
** kudu_upsert.test

Change-Id: I3a9fe5acadc53ec198585d765a8cfb0abe56e199
Reviewed-on: http://gerrit.cloudera.org:8080/9368
Reviewed-by: Dimitris Tsirogiannis 
Tested-by: Impala Public Jenkins
---
M be/src/exec/kudu-scanner.cc
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-util.cc
M be/src/exec/kudu-util.h
M be/src/exprs/kudu-partition-expr.cc
M fe/src/main/java/org/apache/impala/catalog/KuduColumn.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/KuduUtil.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M testdata/workloads/functional-query/queries/QueryTest/kudu_create.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_delete.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_describe.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_update.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_upsert.test
M tests/query_test/test_decimal_queries.py
M tests/query_test/test_kudu.py
21 files changed, 882 insertions(+), 637 deletions(-)

Approvals:
  Dimitris Tsirogiannis: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/9368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I3a9fe5acadc53ec198585d765a8cfb0abe56e199
Gerrit-Change-Number: 9368
Gerrit-PatchSet: 10
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-5752: Add support for DECIMAL on Kudu tables

2018-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9368 )

Change subject: IMPALA-5752: Add support for DECIMAL on Kudu tables
..


Patch Set 9: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/9368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3a9fe5acadc53ec198585d765a8cfb0abe56e199
Gerrit-Change-Number: 9368
Gerrit-PatchSet: 9
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Fri, 23 Feb 2018 00:03:53 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6549: Enable file handle cache by default

2018-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9371 )

Change subject: IMPALA-6549: Enable file handle cache by default
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1992/


--
To view, visit http://gerrit.cloudera.org:8080/9371
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6935825a1c4c7b2da0bb877f732027be1a57a8b7
Gerrit-Change-Number: 9371
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Thu, 22 Feb 2018 23:56:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] KUDU-2301: (Part-1) Add instrumentation on a per connection level

2018-02-22 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9383 )

Change subject: KUDU-2301: (Part-1) Add instrumentation on a per connection 
level
..


Patch Set 2: Code-Review+1

> Looks good to me, manually compared the two diffs. Can you add some
 > pointer to the commit message to point to the Kudu review? Usually
 > cherry-pick would have added those I think (e.g. 
 > https://gerrit.cloudera.org/#/c/9361/).

Done. Carry Lars' +1.


--
To view, visit http://gerrit.cloudera.org:8080/9383
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iae1a5fe0066adf644a9cac41ad6696e1bbf00465
Gerrit-Change-Number: 9383
Gerrit-PatchSet: 2
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 22 Feb 2018 23:55:31 +
Gerrit-HasComments: No


[Impala-ASF-CR] KUDU-2301: (Part-1) Add instrumentation on a per connection level

2018-02-22 Thread Sailesh Mukil (Code Review)
Hello Michael Ho, Lars Volker, Kudu Jenkins, Todd Lipcon,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9383

to look at the new patch set (#2).

Change subject: KUDU-2301: (Part-1) Add instrumentation on a per connection 
level
..

KUDU-2301: (Part-1) Add instrumentation on a per connection level

This patch returns the OutboundTransfer queue size on a per
connection level and makes them accessible via the
DumpRunningRpcs() call.

A test is added in rpc-test to ensure that this metric works as
expected.

A future patch will add more metrics.

Change-Id: Iae1a5fe0066adf644a9cac41ad6696e1bbf00465
Reviewed-on: http://gerrit.cloudera.org:8080/9343
Tested-by: Kudu Jenkins
Reviewed-by: Todd Lipcon 
---
M be/src/kudu/rpc/connection.cc
M be/src/kudu/rpc/connection.h
M be/src/kudu/rpc/messenger.h
M be/src/kudu/rpc/rpc-test.cc
M be/src/kudu/rpc/rpc_introspection.proto
5 files changed, 77 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/9383/2
--
To view, visit http://gerrit.cloudera.org:8080/9383
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iae1a5fe0066adf644a9cac41ad6696e1bbf00465
Gerrit-Change-Number: 9383
Gerrit-PatchSet: 2
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] IMPALA-6347: Monitor queue depth size for outgoing RPCs for Reactor threads

2018-02-22 Thread Sailesh Mukil (Code Review)
Hello Michael Ho, Lars Volker,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9384

to look at the new patch set (#6).

Change subject: IMPALA-6347: Monitor queue depth size for outgoing RPCs for 
Reactor threads
..

IMPALA-6347: Monitor queue depth size for outgoing RPCs for Reactor threads

On systems with slow networking large queuing can occur in the Reactor
threads. It would be good to quantify how much queueing occurred.

This patch extracts the OutboundTransfer queue size information from
KRPC and displays it on the /rpcz webpage.

Change-Id: I84fea531e98d3e84fcc57bf7533655218bc91f4c
---
M be/src/rpc/rpc-mgr.cc
M www/rpcz.tmpl
2 files changed, 46 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/84/9384/6
--
To view, visit http://gerrit.cloudera.org:8080/9384
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I84fea531e98d3e84fcc57bf7533655218bc91f4c
Gerrit-Change-Number: 9384
Gerrit-PatchSet: 6
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-6549: Enable file handle cache by default

2018-02-22 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9371 )

Change subject: IMPALA-6549: Enable file handle cache by default
..


Patch Set 3:

GVO hit IMPALA-6532


--
To view, visit http://gerrit.cloudera.org:8080/9371
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6935825a1c4c7b2da0bb877f732027be1a57a8b7
Gerrit-Change-Number: 9371
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Thu, 22 Feb 2018 23:54:33 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6549: Enable file handle cache by default

2018-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9371 )

Change subject: IMPALA-6549: Enable file handle cache by default
..


Patch Set 3: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1986/


--
To view, visit http://gerrit.cloudera.org:8080/9371
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6935825a1c4c7b2da0bb877f732027be1a57a8b7
Gerrit-Change-Number: 9371
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Thu, 22 Feb 2018 23:53:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6347: Monitor queue depth size for outgoing RPCs for Reactor threads

2018-02-22 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9384 )

Change subject: IMPALA-6347: Monitor queue depth size for outgoing RPCs for 
Reactor threads
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9384/4/www/rpcz.tmpl
File www/rpcz.tmpl:

http://gerrit.cloudera.org:8080/#/c/9384/4/www/rpcz.tmpl@239
PS4, Line 239: var row_array = $.map(conn_json, function(el) { return el });
 : table.row.add(row_array).draw();
> still needed?
Nope, removed it.



--
To view, visit http://gerrit.cloudera.org:8080/9384
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I84fea531e98d3e84fcc57bf7533655218bc91f4c
Gerrit-Change-Number: 9384
Gerrit-PatchSet: 5
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Thu, 22 Feb 2018 23:48:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6347: Monitor queue depth size for outgoing RPCs for Reactor threads

2018-02-22 Thread Sailesh Mukil (Code Review)
Hello Michael Ho, Lars Volker,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9384

to look at the new patch set (#5).

Change subject: IMPALA-6347: Monitor queue depth size for outgoing RPCs for 
Reactor threads
..

IMPALA-6347: Monitor queue depth size for outgoing RPCs for Reactor threads

On systems with slow networking large queuing can occur in the Reactor
threads. It would be good to quantify how much queueing occurred.

This patch extracts the OutboundTransfer queue size information from
KRPC and displays it on the /rpcz webpage.

Change-Id: I84fea531e98d3e84fcc57bf7533655218bc91f4c
---
M be/src/rpc/rpc-mgr.cc
M www/rpcz.tmpl
2 files changed, 46 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/84/9384/5
--
To view, visit http://gerrit.cloudera.org:8080/9384
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I84fea531e98d3e84fcc57bf7533655218bc91f4c
Gerrit-Change-Number: 9384
Gerrit-PatchSet: 5
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] KUDU-2305: Fix local variable usage to handle 2GB messages

2018-02-22 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9407 )

Change subject: KUDU-2305: Fix local variable usage to handle 2GB messages
..


Patch Set 1:

Clean cherry-pick


--
To view, visit http://gerrit.cloudera.org:8080/9407
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8e468099b16f7eb55de71b753acae8f57d18287c
Gerrit-Change-Number: 9407
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Thu, 22 Feb 2018 23:21:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] KUDU-2305: Fix local variable usage to handle 2GB messages

2018-02-22 Thread Joe McDonnell (Code Review)
Joe McDonnell has removed Todd Lipcon from this change.  ( 
http://gerrit.cloudera.org:8080/9407 )

Change subject: KUDU-2305: Fix local variable usage to handle 2GB messages
..


Removed reviewer Todd Lipcon.
--
To view, visit http://gerrit.cloudera.org:8080/9407
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: deleteReviewer
Gerrit-Change-Id: I8e468099b16f7eb55de71b753acae8f57d18287c
Gerrit-Change-Number: 9407
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 


[Impala-ASF-CR] KUDU-2305: Fix local variable usage to handle 2GB messages

2018-02-22 Thread Joe McDonnell (Code Review)
Joe McDonnell has removed Dan Burkert from this change.  ( 
http://gerrit.cloudera.org:8080/9407 )

Change subject: KUDU-2305: Fix local variable usage to handle 2GB messages
..


Removed reviewer Dan Burkert.
--
To view, visit http://gerrit.cloudera.org:8080/9407
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: deleteReviewer
Gerrit-Change-Id: I8e468099b16f7eb55de71b753acae8f57d18287c
Gerrit-Change-Number: 9407
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] KUDU-2305: Fix local variable usage to handle 2GB messages

2018-02-22 Thread Joe McDonnell (Code Review)
Joe McDonnell has removed Kudu Jenkins from this change.  ( 
http://gerrit.cloudera.org:8080/9407 )

Change subject: KUDU-2305: Fix local variable usage to handle 2GB messages
..


Removed reviewer Kudu Jenkins.
--
To view, visit http://gerrit.cloudera.org:8080/9407
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: deleteReviewer
Gerrit-Change-Id: I8e468099b16f7eb55de71b753acae8f57d18287c
Gerrit-Change-Number: 9407
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] KUDU-2305: Fix local variable usage to handle 2GB messages

2018-02-22 Thread Joe McDonnell (Code Review)
Hello Dan Burkert, Kudu Jenkins, Todd Lipcon,

I'd like you to do a code review. Please visit

http://gerrit.cloudera.org:8080/9407

to review the following change.


Change subject: KUDU-2305: Fix local variable usage to handle 2GB messages
..

KUDU-2305: Fix local variable usage to handle 2GB messages

The maximum value for rpc_max_message_size (an int32_t) is
INT_MAX. However, when using this maximum value, certain
local variables in SerializeMessage() and InboundTransfer
can overflow if the message size approaches INT_MAX.

This changes certain variables to handle messages that
approach INT_MAX. Specifically:
 - Local variables in SerializeMessage() become int64_t
   rather than int.
 - The total message size prefixed to the message is now
   treated as a uint32_t everywhere. This impacts
   InboundTransfer::total_length_/cur_offset_ and
   a local variable in ParseMessage().

These changes mean that a sender can serialize a message
that is slightly larger than INT_MAX due to the added
header size, but the receiver will reject all messages
exceeding rpc_max_message_size (maximum INT_MAX).

For testing, this modifies rpc-test's TestRpcSidecarLimits
to send a message of size INT_MAX rather than
rpc_max_message_size+1. This increases the test runtime
from about 50ms to 2 seconds.

Change-Id: I8e468099b16f7eb55de71b753acae8f57d18287c
Reviewed-on: http://gerrit.cloudera.org:8080/9355
Tested-by: Kudu Jenkins
Reviewed-by: Todd Lipcon 
Reviewed-by: Dan Burkert 
---
M be/src/kudu/rpc/rpc-test.cc
M be/src/kudu/rpc/serialization.cc
M be/src/kudu/rpc/transfer.cc
M be/src/kudu/rpc/transfer.h
4 files changed, 19 insertions(+), 9 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/9407/1
--
To view, visit http://gerrit.cloudera.org:8080/9407
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I8e468099b16f7eb55de71b753acae8f57d18287c
Gerrit-Change-Number: 9407
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] KUDU-2301: (Part-1) Add instrumentation on a per connection level

2018-02-22 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9383 )

Change subject: KUDU-2301: (Part-1) Add instrumentation on a per connection 
level
..


Patch Set 1: Code-Review+1

Looks good to me, manually compared the two diffs. Can you add some pointer to 
the commit message to point to the Kudu review? Usually cherry-pick would have 
added those I think (e.g. https://gerrit.cloudera.org/#/c/9361/).


--
To view, visit http://gerrit.cloudera.org:8080/9383
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iae1a5fe0066adf644a9cac41ad6696e1bbf00465
Gerrit-Change-Number: 9383
Gerrit-PatchSet: 1
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Thu, 22 Feb 2018 23:17:03 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4835: Part 3: switch I/O buffers to buffer pool

2018-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8966 )

Change subject: IMPALA-4835: Part 3: switch I/O buffers to buffer pool
..


Patch Set 25:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1991/


--
To view, visit http://gerrit.cloudera.org:8080/8966
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic09c6196b31e55b301df45cc56d0b72cfece6786
Gerrit-Change-Number: 8966
Gerrit-PatchSet: 25
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 22 Feb 2018 23:07:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4835: Part 3: switch I/O buffers to buffer pool

2018-02-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8966 )

Change subject: IMPALA-4835: Part 3: switch I/O buffers to buffer pool
..


Patch Set 25: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8966
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic09c6196b31e55b301df45cc56d0b72cfece6786
Gerrit-Change-Number: 8966
Gerrit-PatchSet: 25
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 22 Feb 2018 23:07:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation

2018-02-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8414 )

Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation
..


Patch Set 25: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1
Gerrit-Change-Number: 8414
Gerrit-PatchSet: 25
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 22 Feb 2018 23:06:53 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4835: Part 2: Allocate scan range buffers upfront

2018-02-22 Thread Tim Armstrong (Code Review)
Hello Michael Ho, Tianyi Wang, Bikramjeet Vig, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8707

to look at the new patch set (#34).

Change subject: IMPALA-4835: Part 2: Allocate scan range buffers upfront
..

IMPALA-4835: Part 2: Allocate scan range buffers upfront

This change is a step towards reserving memory for buffers from the
buffer pool and constraining per-scanner memory requirements. This
change restructures the DiskIoMgr code so that each ScanRange operates
with a fixed set of buffers that are allocated upfront and recycled as
the I/O mgr works through the ScanRange.

One major change is that ScanRanges get blocked when a buffer is not
available and get unblocked when a client returns a buffer via
ReturnBuffer(). I was able to remove the logic to maintain the
blocked_ranges_ list by instead adding a separate set with all ranges
that are active.

There is also some miscellaneous cleanup included - e.g. reducing the
amount of code devoted to maintaining counters and metrics.

One tricky part of the existing code was the it called
IssueInitialRanges() with empty lists of files and depended on
DiskIoMgr::AddScanRanges() to not check for cancellation in that case.
See  IMPALA-6564. I changed the logic to not try to issue ranges for
empty lists of files.

I plan to merge this along with the actual buffer pool switch, but
separated it out to allow review of the DiskIoMgr changes separate from
other aspects of the buffer pool switchover.

Testing:
* Ran core and exhaustive tests.

Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f
---
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/base-sequence-scanner.h
M be/src/exec/hdfs-lzo-text-scanner.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-mt.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/scanner-context.cc
M be/src/runtime/bufferpool/buffer-pool.h
M be/src/runtime/io/disk-io-mgr-internal.h
M be/src/runtime/io/disk-io-mgr-stress-test.cc
M be/src/runtime/io/disk-io-mgr-stress.cc
M be/src/runtime/io/disk-io-mgr-test.cc
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/disk-io-mgr.h
M be/src/runtime/io/request-context.cc
M be/src/runtime/io/request-context.h
M be/src/runtime/io/request-ranges.h
M be/src/runtime/io/scan-range.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/runtime/tmp-file-mgr.h
M be/src/util/bit-util-test.cc
M be/src/util/bit-util.h
24 files changed, 1,024 insertions(+), 594 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/8707/34
--
To view, visit http://gerrit.cloudera.org:8080/8707
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f
Gerrit-Change-Number: 8707
Gerrit-PatchSet: 34
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4835: Part 2: Allocate scan range buffers upfront

2018-02-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8707 )

Change subject: IMPALA-4835: Part 2: Allocate scan range buffers upfront
..


Patch Set 34: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8707
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f
Gerrit-Change-Number: 8707
Gerrit-PatchSet: 34
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 22 Feb 2018 23:06:37 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4835: Part 2: Allocate scan range buffers upfront

2018-02-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8707 )

Change subject: IMPALA-4835: Part 2: Allocate scan range buffers upfront
..


Patch Set 33:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8707/33/be/src/exec/hdfs-scan-node-base.cc
File be/src/exec/hdfs-scan-node-base.cc:

http://gerrit.cloudera.org:8080/#/c/8707/33/be/src/exec/hdfs-scan-node-base.cc@448
PS33, Line 448:   // CANCELLED errors - see IMPALA-6564.
> Okay, and I guess in the case we hit this but num_unqueued_files_ has also
Seems like a good amount of paranoia.



--
To view, visit http://gerrit.cloudera.org:8080/8707
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f
Gerrit-Change-Number: 8707
Gerrit-PatchSet: 33
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 22 Feb 2018 23:06:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6347: Monitor queue depth size for outgoing RPCs for Reactor threads

2018-02-22 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9384 )

Change subject: IMPALA-6347: Monitor queue depth size for outgoing RPCs for 
Reactor threads
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/9384/3/www/rpcz.tmpl
File www/rpcz.tmpl:

http://gerrit.cloudera.org:8080/#/c/9384/3/www/rpcz.tmpl@227
PS3, Line 227:   "pageLength": 50,
> I couldn't pass the JSON in directly, since it didn't accept that format. S
Nice!


http://gerrit.cloudera.org:8080/#/c/9384/4/www/rpcz.tmpl
File www/rpcz.tmpl:

http://gerrit.cloudera.org:8080/#/c/9384/4/www/rpcz.tmpl@239
PS4, Line 239: var conn_remote_ip = conn_json["remote_ip"];
 : var outbound_queue_size = conn_json["outbound_queue_size"];
still needed?



--
To view, visit http://gerrit.cloudera.org:8080/9384
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I84fea531e98d3e84fcc57bf7533655218bc91f4c
Gerrit-Change-Number: 9384
Gerrit-PatchSet: 4
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Thu, 22 Feb 2018 22:54:30 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6543: Limit RowBatch serialization size to INT MAX

2018-02-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9367 )

Change subject: IMPALA-6543: Limit RowBatch serialization size to INT_MAX
..


Patch Set 2: Code-Review+1

Did anyone else want to take a look?


--
To view, visit http://gerrit.cloudera.org:8080/9367
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3b022acdf3bc93912d6d98829b30e44b65890d91
Gerrit-Change-Number: 9367
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 22 Feb 2018 22:53:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4835: Part 2: Allocate scan range buffers upfront

2018-02-22 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8707 )

Change subject: IMPALA-4835: Part 2: Allocate scan range buffers upfront
..


Patch Set 33: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8707/33/be/src/exec/hdfs-scan-node-base.cc
File be/src/exec/hdfs-scan-node-base.cc:

http://gerrit.cloudera.org:8080/#/c/8707/33/be/src/exec/hdfs-scan-node-base.cc@448
PS33, Line 448:   // CANCELLED errors - see IMPALA-6564.
> Yeah, those are added from within the scanner code while processing a scan
Okay, and I guess in the case we hit this but num_unqueued_files_ has also 
reached 0.

Maybe HdfsScanNodeBase::AddDiskIoRanges() should have:

// Otherwise, the request context is already cancelled.
DCHECK(!progress_.done());

but okay to leave out if you think it's too much paranoia.



--
To view, visit http://gerrit.cloudera.org:8080/8707
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f
Gerrit-Change-Number: 8707
Gerrit-PatchSet: 33
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 22 Feb 2018 22:50:50 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6008: Creating a UDF from a shared library with a .ll extenion crashes impala

2018-02-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9154 )

Change subject: IMPALA-6008: Creating a UDF from a shared library with a .ll 
extenion crashes impala
..


Patch Set 7: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/9154
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id060668802ca9c80367cdc0e8a823b968d549bbb
Gerrit-Change-Number: 9154
Gerrit-PatchSet: 7
Gerrit-Owner: anujphadke 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Thu, 22 Feb 2018 22:43:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6008: Creating a UDF from a shared library with a .ll extenion crashes impala

2018-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9154 )

Change subject: IMPALA-6008: Creating a UDF from a shared library with a .ll 
extenion crashes impala
..


Patch Set 7:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1989/


--
To view, visit http://gerrit.cloudera.org:8080/9154
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id060668802ca9c80367cdc0e8a823b968d549bbb
Gerrit-Change-Number: 9154
Gerrit-PatchSet: 7
Gerrit-Owner: anujphadke 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Thu, 22 Feb 2018 22:44:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6008: Creating a UDF from a shared library with a .ll extenion crashes impala

2018-02-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9154 )

Change subject: IMPALA-6008: Creating a UDF from a shared library with a .ll 
extenion crashes impala
..


Patch Set 6: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/9154
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id060668802ca9c80367cdc0e8a823b968d549bbb
Gerrit-Change-Number: 9154
Gerrit-PatchSet: 6
Gerrit-Owner: anujphadke 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Thu, 22 Feb 2018 22:43:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4835: Part 2: Allocate scan range buffers upfront

2018-02-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8707 )

Change subject: IMPALA-4835: Part 2: Allocate scan range buffers upfront
..


Patch Set 33:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8707/33/be/src/exec/hdfs-scan-node-base.cc
File be/src/exec/hdfs-scan-node-base.cc:

http://gerrit.cloudera.org:8080/#/c/8707/33/be/src/exec/hdfs-scan-node-base.cc@448
PS33, Line 448:   // CANCELLED errors - see IMPALA-6564.
> what about other places we call AddScanRanges()? Are those "safe" because w
Yeah, those are added from within the scanner code while processing a scan 
range, so by definition we can't have completed progress.



--
To view, visit http://gerrit.cloudera.org:8080/8707
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f
Gerrit-Change-Number: 8707
Gerrit-PatchSet: 33
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 22 Feb 2018 22:42:12 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4835: Part 2: Allocate scan range buffers upfront

2018-02-22 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8707 )

Change subject: IMPALA-4835: Part 2: Allocate scan range buffers upfront
..


Patch Set 33:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8707/33/be/src/exec/hdfs-scan-node-base.cc
File be/src/exec/hdfs-scan-node-base.cc:

http://gerrit.cloudera.org:8080/#/c/8707/33/be/src/exec/hdfs-scan-node-base.cc@448
PS33, Line 448:   // CANCELLED errors - see IMPALA-6564.
what about other places we call AddScanRanges()? Are those "safe" because we're 
guaranteed to not have completed progress_ in those cases?



--
To view, visit http://gerrit.cloudera.org:8080/8707
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f
Gerrit-Change-Number: 8707
Gerrit-PatchSet: 33
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 22 Feb 2018 22:38:05 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6482: add EXEC TIME LIMIT S option

2018-02-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9227 )

Change subject: IMPALA-6482: add EXEC_TIME_LIMIT_S option
..


Patch Set 8:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/9227/8/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/9227/8/be/src/service/impala-server.cc@1855
PS8, Line 1855:   // Query was deleted or expired already.
> maybe add: "or expired already from a previous expiration event"
Done


http://gerrit.cloudera.org:8080/#/c/9227/8/be/src/service/impala-server.cc@1894
PS8, Line 1894: queries_by_timestamp_.erase(expiration_event++);
> how about
That makes sense. Iterators are confusing. I changed this and the other 
occurrences of the same pattern in this function.



-- 
To view, visit http://gerrit.cloudera.org:8080/9227
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id81772ee223ffb64746e241027a5a734a811e1b8
Gerrit-Change-Number: 9227
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 22 Feb 2018 22:31:25 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6482: add EXEC TIME LIMIT S option

2018-02-22 Thread Tim Armstrong (Code Review)
Hello Philip Zeyliger, Bikramjeet Vig, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9227

to look at the new patch set (#9).

Change subject: IMPALA-6482: add EXEC_TIME_LIMIT_S option
..

IMPALA-6482: add EXEC_TIME_LIMIT_S option

This is similar to the QUERY_TIMEOUT_S option and shares most of the
implementation. The difference is that the timeout doesn't reset at
any point.

The time limit is measured from the start of query execution,
after the query is admitted, so planning, scheduling and
time spent in admission control is not counted towards the time limit.

Also fix validation of the related QUERY_TIMEOUT_S option, which
previously could ignore invalid input.

Testing:
Added tests for various permutations:
* With and without query_timeout_s set
* With and without result fetching keeping the query active

Change-Id: Id81772ee223ffb64746e241027a5a734a811e1b8
---
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-options-test.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M tests/custom_cluster/test_query_expiration.py
10 files changed, 288 insertions(+), 108 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/27/9227/9
--
To view, visit http://gerrit.cloudera.org:8080/9227
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id81772ee223ffb64746e241027a5a734a811e1b8
Gerrit-Change-Number: 9227
Gerrit-PatchSet: 9
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6530: Track time spent opening HDFS file handles

2018-02-22 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9370 )

Change subject: IMPALA-6530: Track time spent opening HDFS file handles
..


Patch Set 4: Code-Review+2

Carrying +2


--
To view, visit http://gerrit.cloudera.org:8080/9370
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia560af2d9b12f158e8811900a7b9d98f8e760858
Gerrit-Change-Number: 9370
Gerrit-PatchSet: 4
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 22 Feb 2018 22:28:26 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6530: Track time spent opening HDFS file handles

2018-02-22 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9370 )

Change subject: IMPALA-6530: Track time spent opening HDFS file handles
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9370/3/be/src/runtime/io/disk-io-mgr.h
File be/src/runtime/io/disk-io-mgr.h:

http://gerrit.cloudera.org:8080/#/c/9370/3/be/src/runtime/io/disk-io-mgr.h@362
PS3, Line 362:   Status ReopenCachedHdfsFileHandle(const hdfsFS& fs, 
std::string* fname, int64_t mtime,
> Probably worth copying the comment about 'reader' from GetCachedHdfsFileHan
Fixed up comments for GetExclusiveHdfsFileHandle, GetCachedHdfsFileHandle, and 
ReopenCachedHdfsFileHandle to make them more consistent.



--
To view, visit http://gerrit.cloudera.org:8080/9370
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia560af2d9b12f158e8811900a7b9d98f8e760858
Gerrit-Change-Number: 9370
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 22 Feb 2018 22:27:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6530: Track time spent opening HDFS file handles

2018-02-22 Thread Joe McDonnell (Code Review)
Hello Tim Armstrong, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9370

to look at the new patch set (#4).

Change subject: IMPALA-6530: Track time spent opening HDFS file handles
..

IMPALA-6530: Track time spent opening HDFS file handles

When the HDFS NameNode is overloaded, opening file
handles can be a significant source of query execution
time. Currently, there is no statistic to track this
time at the HDFS scan node level.

This introduces a statistic "TotalRawHdfsOpenFileTime(*)"
to track the time spent in HdfsOpenFile(). Here is
an example of this statistic populated for the query
"select * from functional_parquet.widetable_1000_cols",
which is dominated by file handle opening time:
- CachedFileHandlesHitCount: 0 (0)
- CachedFileHandlesMissCount: 1.00K (1001)
...
- ScannerThreadsTotalWallClockTime: 980.432ms
  - MaterializeTupleTime(*): 1.759ms
  - ScannerThreadsSysTime: 4.000ms
  - ScannerThreadsUserTime: 56.000ms
- TotalRawHdfsOpenFileTime(*): 894.285ms
- TotalRawHdfsReadTime(*): 25.188ms

To make the TotalRawHdfsReadTime mutually exclusive
from the TotalRawHdfsOpenFileTime, the timer tracking
for the read timer moves from DiskIoMgr::ReadRange()
to inside the ScanRange::Read() function. This allows
it to exclude the portion of ScanRange::Read() that
is getting a file handle from the file handle cache.

Change-Id: Ia560af2d9b12f158e8811900a7b9d98f8e760858
---
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/scan-node.cc
M be/src/exec/scan-node.h
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/disk-io-mgr.h
M be/src/runtime/io/request-context.h
M be/src/runtime/io/scan-range.cc
M be/src/util/impalad-metrics.cc
M be/src/util/impalad-metrics.h
9 files changed, 104 insertions(+), 66 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/9370/4
--
To view, visit http://gerrit.cloudera.org:8080/9370
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia560af2d9b12f158e8811900a7b9d98f8e760858
Gerrit-Change-Number: 9370
Gerrit-PatchSet: 4
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6482: add EXEC TIME LIMIT S option

2018-02-22 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9227 )

Change subject: IMPALA-6482: add EXEC_TIME_LIMIT_S option
..


Patch Set 8: Code-Review+1

(3 comments)

http://gerrit.cloudera.org:8080/#/c/9227/7/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/9227/7/be/src/service/impala-server.cc@992
PS7, Line 992:
> I made the naming of the variables consistent - exec_time_limit_s and idle_
yea that makes sense


http://gerrit.cloudera.org:8080/#/c/9227/8/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/9227/8/be/src/service/impala-server.cc@1855
PS8, Line 1855:   // Query was deleted or expired already.
maybe add: "or expired already from a previous expiration event"


http://gerrit.cloudera.org:8080/#/c/9227/8/be/src/service/impala-server.cc@1894
PS8, Line 1894: queries_by_timestamp_.erase(expiration_event++);
how about
expiration_event = queries_by_timestamp_.erase(expiration_event);

as per c++11 standard



--
To view, visit http://gerrit.cloudera.org:8080/9227
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id81772ee223ffb64746e241027a5a734a811e1b8
Gerrit-Change-Number: 9227
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 22 Feb 2018 22:01:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4835: Part 3: switch I/O buffers to buffer pool

2018-02-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8966 )

Change subject: IMPALA-4835: Part 3: switch I/O buffers to buffer pool
..


Patch Set 24: Code-Review+2

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8966/20/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/8966/20/be/src/exec/hdfs-parquet-scanner.cc@1766
PS20, Line 1766: bytes_to_add = 0;
> Right, but don't we reserve 3 * 8MB per column anyway? Oh, but then we'll r
Yeah we probably have enough reservation to buffer all the columns in that case 
(although not always since you have some internal fragmentation within the 
buffers). I also wanted to be reasonably efficient if we got the minimum 
reservation, not the ideal.

Oh and there's the unusual case where the row groups are not aligned to the 
HDFS blocks and the row group is actually much larger than the input split 
size. I'm not concerned about perf in that case but it's another reason things 
can get interesting.


http://gerrit.cloudera.org:8080/#/c/8966/22/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/8966/22/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1353
PS22, Line 1353: quantizedMaxScanRangeBytes));
> I feel this should go at the beginning of this function to indicate this fu
Done



--
To view, visit http://gerrit.cloudera.org:8080/8966
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic09c6196b31e55b301df45cc56d0b72cfece6786
Gerrit-Change-Number: 8966
Gerrit-PatchSet: 24
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 22 Feb 2018 21:57:03 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4835: Part 2: Allocate scan range buffers upfront

2018-02-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8707 )

Change subject: IMPALA-4835: Part 2: Allocate scan range buffers upfront
..


Patch Set 33:

Added a fix for IMPALA-6564


--
To view, visit http://gerrit.cloudera.org:8080/8707
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f
Gerrit-Change-Number: 8707
Gerrit-PatchSet: 33
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 22 Feb 2018 21:56:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4835: Part 2+3: switch I/O buffers to buffer pool

2018-02-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has abandoned this change. ( http://gerrit.cloudera.org:8080/9302 
)

Change subject: IMPALA-4835: Part 2+3: switch I/O buffers to buffer pool
..


Abandoned
--
To view, visit http://gerrit.cloudera.org:8080/9302
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: I94704bbcb2c438f8fffd1168e42e6b2ac496dc0d
Gerrit-Change-Number: 9302
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4835: Part 2: Allocate scan range buffers upfront

2018-02-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has abandoned this change. ( http://gerrit.cloudera.org:8080/9405 
)

Change subject: IMPALA-4835: Part 2: Allocate scan range buffers upfront
..


Abandoned
--
To view, visit http://gerrit.cloudera.org:8080/9405
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: I060b3bfc6f60bd5afc0df8c09b0954411b86ff4b
Gerrit-Change-Number: 9405
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4835: Part 2: Allocate scan range buffers upfront

2018-02-22 Thread Tim Armstrong (Code Review)
Hello Michael Ho, Tianyi Wang, Bikramjeet Vig, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8707

to look at the new patch set (#33).

Change subject: IMPALA-4835: Part 2: Allocate scan range buffers upfront
..

IMPALA-4835: Part 2: Allocate scan range buffers upfront

This change is a step towards reserving memory for buffers from the
buffer pool and constraining per-scanner memory requirements. This
change restructures the DiskIoMgr code so that each ScanRange operates
with a fixed set of buffers that are allocated upfront and recycled as
the I/O mgr works through the ScanRange.

One major change is that ScanRanges get blocked when a buffer is not
available and get unblocked when a client returns a buffer via
ReturnBuffer(). I was able to remove the logic to maintain the
blocked_ranges_ list by instead adding a separate set with all ranges
that are active.

There is also some miscellaneous cleanup included - e.g. reducing the
amount of code devoted to maintaining counters and metrics.

One tricky part of the existing code was the it called
IssueInitialRanges() with empty lists of files and depended on
DiskIoMgr::AddScanRanges() to not check for cancellation in that case.
See  IMPALA-6564. I changed the logic to not try to issue ranges for
empty lists of files.

I plan to merge this along with the actual buffer pool switch, but
separated it out to allow review of the DiskIoMgr changes separate from
other aspects of the buffer pool switchover.

Testing:
* Ran core and exhaustive tests.

Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f
---
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/base-sequence-scanner.h
M be/src/exec/hdfs-lzo-text-scanner.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-mt.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/scanner-context.cc
M be/src/runtime/bufferpool/buffer-pool.h
M be/src/runtime/io/disk-io-mgr-internal.h
M be/src/runtime/io/disk-io-mgr-stress-test.cc
M be/src/runtime/io/disk-io-mgr-stress.cc
M be/src/runtime/io/disk-io-mgr-test.cc
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/disk-io-mgr.h
M be/src/runtime/io/request-context.cc
M be/src/runtime/io/request-context.h
M be/src/runtime/io/request-ranges.h
M be/src/runtime/io/scan-range.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/runtime/tmp-file-mgr.h
M be/src/util/bit-util-test.cc
M be/src/util/bit-util.h
24 files changed, 1,023 insertions(+), 594 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/8707/33
--
To view, visit http://gerrit.cloudera.org:8080/8707
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f
Gerrit-Change-Number: 8707
Gerrit-PatchSet: 33
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4835: Part 3: switch I/O buffers to buffer pool

2018-02-22 Thread Tim Armstrong (Code Review)
Hello Bikramjeet Vig, Alex Behm, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8966

to look at the new patch set (#24).

Change subject: IMPALA-4835: Part 3: switch I/O buffers to buffer pool
..

IMPALA-4835: Part 3: switch I/O buffers to buffer pool

This is the final patch to switch the Disk I/O manager to allocate all
buffer from the buffer pool and to reserve the buffers required for
a query upfront.

* The planner reserves enough memory to run a single scanner per
  scan node.
* The multi-threaded scan node must increase reservation before
  spinning up more threads.
* The scanner implementations must be careful to stay within their
  assigned reservation.

The row-oriented scanners were most straightforward, since they only
have a single scan range active at a time. A single I/O buffer is
sufficient to scan the whole file but more I/O buffers can improve I/O
throughput.

Parquet is more complex because it issues a scan range per column and
the sizes of the columns on disk are not known during planning. To
deal with this, the reservation in the frontend is based on a
heuristic involving the file size and # columns. The Parquet scanner
can then divvy up reservation to columns based on the size of column
data on disk.

I adjusted how the 'mem_limit' is divided between buffer pool and non
buffer pool memory for low mem_limits to account for the increase in
buffer pool memory.

Testing:
* Added more planner tests to cover reservation calcs for scan node.
* Test scanners for all file formats with the reservation denial debug
  action, to test behaviour when the scanners hit reservation limits.
* Updated memory and buffer pool limits for tests.
* Added unit tests for dividing reservation between columns in parquet,
  since the algorithm is non-trivial.

Perf:
I ran TPC-H and targeted perf locally comparing with master. Both
showed small improvements of a few percent and no regressions of
note. Cluster perf tests showed no significant change.

Change-Id: Ic09c6196b31e55b301df45cc56d0b72cfece6786
---
M be/src/exec/CMakeLists.txt
A be/src/exec/hdfs-parquet-scanner-test.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scan-node-mt.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-readers.h
M be/src/exec/scanner-context.cc
M be/src/exec/scanner-context.h
M be/src/runtime/bufferpool/reservation-tracker-test.cc
M be/src/runtime/bufferpool/reservation-util.cc
M be/src/runtime/io/disk-io-mgr-stress-test.cc
M be/src/runtime/io/disk-io-mgr-stress.cc
M be/src/runtime/io/disk-io-mgr-stress.h
M be/src/runtime/io/disk-io-mgr-test.cc
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/disk-io-mgr.h
M be/src/runtime/io/request-context.cc
M be/src/runtime/io/request-context.h
M be/src/runtime/io/request-ranges.h
M be/src/runtime/io/scan-range.cc
M be/src/runtime/tmp-file-mgr.cc
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/util/BitUtil.java
M fe/src/test/java/org/apache/impala/util/BitUtilTest.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
M testdata/workloads/functional-planner/queries/PlannerTest/disable-codegen.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test
M testdata/workloads/functional-planner/queries/PlannerTest/max-row-size.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/partition-pruning.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/sort-expr-materialization.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test
M testdata/workloads/functional-planner/queries/PlannerTest/tablesample.test
M testdata/workloads/functional-planner/queries/PlannerTest/union.test
M 
testdata/workloads/functional-query/queries/QueryTest/admission-reject-min-reservation.test
M testdata/workloads/functional-query/queries/QueryTest/analytic-fns.test
M testdata/workloads/functional-query/queries/QueryTest/codegen-mem-limit.test
M 
testdata/workloads/functional-query/queries/QueryTest/disk-spill-encryption.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level0.test
M 

[Impala-ASF-CR] IMPALA-6553: [DOCS] load catalog in background default change

2018-02-22 Thread Balazs Jeszenszky (Code Review)
Balazs Jeszenszky has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9389 )

Change subject: IMPALA-6553: [DOCS] load_catalog_in_background default change
..


Patch Set 4: Code-Review+1

LGTM, after looking around commit comments are often omitted for docs changes.


--
To view, visit http://gerrit.cloudera.org:8080/9389
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I548b2d1532c12f8d3c795a940b7f980482ecf09b
Gerrit-Change-Number: 9389
Gerrit-PatchSet: 4
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 22 Feb 2018 21:55:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4835: Part 2: Allocate scan range buffers upfront

2018-02-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/9405


Change subject: IMPALA-4835: Part 2: Allocate scan range buffers upfront
..

IMPALA-4835: Part 2: Allocate scan range buffers upfront

This change is a step towards reserving memory for buffers from the
buffer pool and constraining per-scanner memory requirements. This
change restructures the DiskIoMgr code so that each ScanRange operates
with a fixed set of buffers that are allocated upfront and recycled as
the I/O mgr works through the ScanRange.

One major change is that ScanRanges get blocked when a buffer is not
available and get unblocked when a client returns a buffer via
ReturnBuffer(). I was able to remove the logic to maintain the
blocked_ranges_ list by instead adding a separate set with all ranges
that are active.

There is also some miscellaneous cleanup included - e.g. reducing the
amount of code devoted to maintaining counters and metrics.

One tricky part of the existing code was the it called
IssueInitialRanges() with empty lists of files and depended on
DiskIoMgr::AddScanRanges() to not check for cancellation in that case.
See  IMPALA-6564. I changed the logic to not try to issue ranges for
empty lists of files.

I plan to merge this along with the actual buffer pool switch, but
separated it out to allow review of the DiskIoMgr changes separate from
other aspects of the buffer pool switchover.

Testing:
* Ran core and exhaustive tests.

Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f

Fix spurious cancel.

Change-Id: I060b3bfc6f60bd5afc0df8c09b0954411b86ff4b
---
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/base-sequence-scanner.h
M be/src/exec/hdfs-lzo-text-scanner.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-mt.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/scanner-context.cc
M be/src/runtime/bufferpool/buffer-pool.h
M be/src/runtime/io/disk-io-mgr-internal.h
M be/src/runtime/io/disk-io-mgr-stress-test.cc
M be/src/runtime/io/disk-io-mgr-stress.cc
M be/src/runtime/io/disk-io-mgr-test.cc
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/disk-io-mgr.h
M be/src/runtime/io/request-context.cc
M be/src/runtime/io/request-context.h
M be/src/runtime/io/request-ranges.h
M be/src/runtime/io/scan-range.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/runtime/tmp-file-mgr.h
M be/src/util/bit-util-test.cc
M be/src/util/bit-util.h
24 files changed, 1,023 insertions(+), 594 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/9405/1
--
To view, visit http://gerrit.cloudera.org:8080/9405
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I060b3bfc6f60bd5afc0df8c09b0954411b86ff4b
Gerrit-Change-Number: 9405
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4835: Part 3: switch I/O buffers to buffer pool

2018-02-22 Thread Tim Armstrong (Code Review)
Hello Bikramjeet Vig, Alex Behm, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8966

to look at the new patch set (#23).

Change subject: IMPALA-4835: Part 3: switch I/O buffers to buffer pool
..

IMPALA-4835: Part 3: switch I/O buffers to buffer pool

This is the final patch to switch the Disk I/O manager to allocate all
buffer from the buffer pool and to reserve the buffers required for
a query upfront.

* The planner reserves enough memory to run a single scanner per
  scan node.
* The multi-threaded scan node must increase reservation before
  spinning up more threads.
* The scanner implementations must be careful to stay within their
  assigned reservation.

The row-oriented scanners were most straightforward, since they only
have a single scan range active at a time. A single I/O buffer is
sufficient to scan the whole file but more I/O buffers can improve I/O
throughput.

Parquet is more complex because it issues a scan range per column and
the sizes of the columns on disk are not known during planning. To
deal with this, the reservation in the frontend is based on a
heuristic involving the file size and # columns. The Parquet scanner
can then divvy up reservation to columns based on the size of column
data on disk.

I adjusted how the 'mem_limit' is divided between buffer pool and non
buffer pool memory for low mem_limits to account for the increase in
buffer pool memory.

Testing:
* Added more planner tests to cover reservation calcs for scan node.
* Test scanners for all file formats with the reservation denial debug
  action, to test behaviour when the scanners hit reservation limits.
* Updated memory and buffer pool limits for tests.
* Added unit tests for dividing reservation between columns in parquet,
  since the algorithm is non-trivial.

Perf:
I ran TPC-H and targeted perf locally comparing with master. Both
showed small improvements of a few percent and no regressions of
note. Cluster perf tests showed no significant change.

Change-Id: Ic09c6196b31e55b301df45cc56d0b72cfece6786
---
M be/src/exec/CMakeLists.txt
A be/src/exec/hdfs-parquet-scanner-test.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scan-node-mt.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-readers.h
M be/src/exec/scanner-context.cc
M be/src/exec/scanner-context.h
M be/src/runtime/bufferpool/reservation-tracker-test.cc
M be/src/runtime/bufferpool/reservation-util.cc
M be/src/runtime/io/disk-io-mgr-stress-test.cc
M be/src/runtime/io/disk-io-mgr-stress.cc
M be/src/runtime/io/disk-io-mgr-stress.h
M be/src/runtime/io/disk-io-mgr-test.cc
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/disk-io-mgr.h
M be/src/runtime/io/request-context.cc
M be/src/runtime/io/request-context.h
M be/src/runtime/io/request-ranges.h
M be/src/runtime/io/scan-range.cc
M be/src/runtime/tmp-file-mgr.cc
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/util/BitUtil.java
M fe/src/test/java/org/apache/impala/util/BitUtilTest.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
M testdata/workloads/functional-planner/queries/PlannerTest/disable-codegen.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test
M testdata/workloads/functional-planner/queries/PlannerTest/max-row-size.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/partition-pruning.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/sort-expr-materialization.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test
M testdata/workloads/functional-planner/queries/PlannerTest/tablesample.test
M testdata/workloads/functional-planner/queries/PlannerTest/union.test
M 
testdata/workloads/functional-query/queries/QueryTest/admission-reject-min-reservation.test
M testdata/workloads/functional-query/queries/QueryTest/analytic-fns.test
M testdata/workloads/functional-query/queries/QueryTest/codegen-mem-limit.test
M 
testdata/workloads/functional-query/queries/QueryTest/disk-spill-encryption.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level0.test
M 

[Impala-ASF-CR] IMPALA-6347: Monitor queue depth size for outgoing RPCs for Reactor threads

2018-02-22 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9384 )

Change subject: IMPALA-6347: Monitor queue depth size for outgoing RPCs for 
Reactor threads
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9384/3/www/rpcz.tmpl
File www/rpcz.tmpl:

http://gerrit.cloudera.org:8080/#/c/9384/3/www/rpcz.tmpl@227
PS3, Line 227:   "pageLength": 50
> If you add a mapping of keys to columns here, will you be able to feed the
I couldn't pass the JSON in directly, since it didn't accept that format. So I 
converted it to an array format before adding it.

But I moved the mapping of keys to columns at the initialization here, so we 
don't need to modify the function body as more metrics become available.



--
To view, visit http://gerrit.cloudera.org:8080/9384
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I84fea531e98d3e84fcc57bf7533655218bc91f4c
Gerrit-Change-Number: 9384
Gerrit-PatchSet: 3
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Thu, 22 Feb 2018 21:49:44 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6347: Monitor queue depth size for outgoing RPCs for Reactor threads

2018-02-22 Thread Sailesh Mukil (Code Review)
Hello Michael Ho, Lars Volker,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9384

to look at the new patch set (#4).

Change subject: IMPALA-6347: Monitor queue depth size for outgoing RPCs for 
Reactor threads
..

IMPALA-6347: Monitor queue depth size for outgoing RPCs for Reactor threads

On systems with slow networking large queuing can occur in the Reactor
threads. It would be good to quantify how much queueing occurred.

This patch extracts the OutboundTransfer queue size information from
KRPC and displays it on the /rpcz webpage.

Change-Id: I84fea531e98d3e84fcc57bf7533655218bc91f4c
---
M be/src/rpc/rpc-mgr.cc
M www/rpcz.tmpl
2 files changed, 49 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/84/9384/4
--
To view, visit http://gerrit.cloudera.org:8080/9384
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I84fea531e98d3e84fcc57bf7533655218bc91f4c
Gerrit-Change-Number: 9384
Gerrit-PatchSet: 4
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation

2018-02-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8414 )

Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation
..


Patch Set 24: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1
Gerrit-Change-Number: 8414
Gerrit-PatchSet: 24
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 22 Feb 2018 21:40:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4835: Part 2: Allocate scan range buffers upfront

2018-02-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8707 )

Change subject: IMPALA-4835: Part 2: Allocate scan range buffers upfront
..


Patch Set 32: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8707
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f
Gerrit-Change-Number: 8707
Gerrit-PatchSet: 32
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 22 Feb 2018 21:40:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6543: Limit RowBatch serialization size to INT MAX

2018-02-22 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9367 )

Change subject: IMPALA-6543: Limit RowBatch serialization size to INT_MAX
..


Patch Set 1:

(1 comment)

Updated row-batch-serialization-test to test each of the limits.

http://gerrit.cloudera.org:8080/#/c/9367/1/be/src/runtime/row-batch.cc
File be/src/runtime/row-batch.cc:

http://gerrit.cloudera.org:8080/#/c/9367/1/be/src/runtime/row-batch.cc@346
PS1, Line 346: INT_MAX
> We've standardised on numeric_limits::max(). I kind of like INT_MAX be
Fixed.



--
To view, visit http://gerrit.cloudera.org:8080/9367
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3b022acdf3bc93912d6d98829b30e44b65890d91
Gerrit-Change-Number: 9367
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 22 Feb 2018 21:35:10 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6543: Limit RowBatch serialization size to INT MAX

2018-02-22 Thread Joe McDonnell (Code Review)
Hello Michael Ho, Lars Volker, Sailesh Mukil, Tim Armstrong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9367

to look at the new patch set (#2).

Change subject: IMPALA-6543: Limit RowBatch serialization size to INT_MAX
..

IMPALA-6543: Limit RowBatch serialization size to INT_MAX

The serialization format of a row batch relies on
tuple offsets. In its current form, the tuple offsets
are int32s. This means that it is impossible to generate
a valid serialization of a row batch that is larger
than INT_MAX.

This changes RowBatch::SerializeInternal() to return an
error if trying to serialize a row batch larger than INT_MAX.
This prevents a DCHECK on debug builds when creating a row
larger than 2GB.

This also changes the compression logic in RowBatch::Serialize()
to avoid a DCHECK if LZ4 will not be able to compress the
row batch. Instead, it returns an error.

This modifies row-batch-serialize-test to verify behavior at
each of the limits. Specifically:
RowBatches up to size LZ4_MAX_INPUT_SIZE succeed.
RowBatches with size range [LZ4_MAX_INPUT_SIZE+1, INT_MAX]
fail on LZ4 compression.
RowBatches with size > INT_MAX fail with RowBatch too large.

Change-Id: I3b022acdf3bc93912d6d98829b30e44b65890d91
---
M be/src/runtime/row-batch-serialize-test.cc
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
M common/thrift/generate_error_codes.py
4 files changed, 126 insertions(+), 12 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/9367/2
--
To view, visit http://gerrit.cloudera.org:8080/9367
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3b022acdf3bc93912d6d98829b30e44b65890d91
Gerrit-Change-Number: 9367
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6347: Monitor queue depth size for outgoing RPCs for Reactor threads

2018-02-22 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9384 )

Change subject: IMPALA-6347: Monitor queue depth size for outgoing RPCs for 
Reactor threads
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/9384/2/be/src/rpc/rpc-mgr.cc
File be/src/rpc/rpc-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/9384/2/be/src/rpc/rpc-mgr.cc@208
PS2, Line 208: Value remote_ip_str(conn.remote_ip().c_str(), 
document->GetAllocator());
> In what case would it not be unique? I think we can rely on the fact that K
I was thinking about the case where that KRPC guarantee breaks but you're 
right, it's probably not worth the effort.


http://gerrit.cloudera.org:8080/#/c/9384/3/www/rpcz.tmpl
File www/rpcz.tmpl:

http://gerrit.cloudera.org:8080/#/c/9384/3/www/rpcz.tmpl@227
PS3, Line 227: var key = keys[l];
If you add a mapping of keys to columns here, will you be able to feed the json 
into rows.add()?



--
To view, visit http://gerrit.cloudera.org:8080/9384
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I84fea531e98d3e84fcc57bf7533655218bc91f4c
Gerrit-Change-Number: 9384
Gerrit-PatchSet: 2
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Thu, 22 Feb 2018 21:25:57 +
Gerrit-HasComments: Yes


  1   2   >