[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls

2018-02-19 Thread Vuk Ercegovac (Code Review)
Hello Lars Volker, Dimitris Tsirogiannis, Tim Armstrong, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8523

to look at the new patch set (#11).

Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls
..

IMPALA-5931: Generates scan ranges in planner for s3/adls

Currently, for filesystems that do not include physical
block information (e.g., block replica locations, caching),
synthetic blocks are generated and stored in the catalog
when metadata is loaded. Example file systems for which this is done
includes S3, ADLS, and local fs.

This change avoids generating these blocks when metadata is loaded.
Instead, scan ranges are directly generated from such files by the
backend coordinator. Previously, all scan ranges were produced by
the planner in HDFSScanNode in the frontend. Now, those files without
block information are sent to the coordinator represented by a split
specification that determines how the coordinator will create scan ranges
to send to executors.

This change reduces the space needed in the catalog and reduces the
scan range data structures that are passed from the frontend to the
backend when planning and coordinating a query.
In addition a bug is avoided where non-splittable files were being
split anyways to support the query parameter that places a limit on
scan ranges.

Testing:
- added backend scheduler tests
- mixed-filesystems test covers tables/queries with multiple fs's.
- local fs tests cover the code paths in this change
- all core tests pass when configured with s3
- manually tried larger local filesystem tables (tpch) with multiple
  partitions and observed the same scan ranges.
- TODO: adls testing

Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5
---
M be/src/scheduling/query-schedule.h
M be/src/scheduling/scheduler-test-util.cc
M be/src/scheduling/scheduler-test-util.h
M be/src/scheduling/scheduler-test.cc
M be/src/scheduling/scheduler.cc
M be/src/scheduling/scheduler.h
M be/src/util/CMakeLists.txt
A be/src/util/flat_buffer.cc
A be/src/util/flat_buffer.h
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
13 files changed, 641 insertions(+), 191 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/8523/11
--
To view, visit http://gerrit.cloudera.org:8080/8523
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5
Gerrit-Change-Number: 8523
Gerrit-PatchSet: 11
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-5801: Clean up codegen GetType() interface

2018-02-19 Thread Laszlo Gaal (Code Review)
Laszlo Gaal has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9063 )

Change subject: IMPALA-5801: Clean up codegen GetType() interface
..


Patch Set 8:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/9063/8/be/src/codegen/llvm-codegen.h
File be/src/codegen/llvm-codegen.h:

http://gerrit.cloudera.org:8080/#/c/9063/8/be/src/codegen/llvm-codegen.h@272
PS8, Line 272:
Nit: extra whitespace


http://gerrit.cloudera.org:8080/#/c/9063/8/be/src/codegen/llvm-codegen.h@277
PS8, Line 277:
Nit: extra whitespace


http://gerrit.cloudera.org:8080/#/c/9063/8/be/src/codegen/llvm-codegen.h@506
PS8, Line 506: Value*
Being curious: is there any specific reason for not changing null_ptr_value() 
to return llvm::Constant*, just like for "true" and "false"? "nullptr" is a 
literal as well as the others.



--
To view, visit http://gerrit.cloudera.org:8080/9063
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib146ca21af82023b0b460f813eae3435b4b2eb77
Gerrit-Change-Number: 9063
Gerrit-PatchSet: 8
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 19 Feb 2018 13:46:00 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5801: Clean up codegen GetType() interface

2018-02-19 Thread Csaba Ringhofer (Code Review)
Hello Lars Volker, Laszlo Gaal, Gabor Kaszab, Tim Armstrong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9063

to look at the new patch set (#9).

Change subject: IMPALA-5801: Clean up codegen GetType() interface
..

IMPALA-5801: Clean up codegen GetType() interface

Several functions that return llvm::(Pointer)Type were renamed
to make them shorter or indicate their roles more clearly. Some
additional convenience functions were created to make some common
codegen tasks simpler:

- Get(Ptr)Type functions with string parameter are renamed to
  GetNamed(Ptr)Type
- GetStruct(Ptr)Type template functions are created to make
  GetNamedType(MyStruct::LLVM_CLASS_NAME) calls simpler (some
  classes had LLVM_CLASS_NAME as private, these are changed to
  public)
- integer type convenience functions are renamed to indicate
  bit width instead of matching SQL type (e.g. int_type->i32_type)
- similar convenience functions were created for ptr to primitive
  types, int_ptr_type
- Get(Ptr)Type functions with ColumnType parameter are renamed
  to GetSlot(Ptr)Type
- GetIntConstant function is split to several functions depending
  on the type of the integer e.g. GetI32Constant

The renamed functions can be found in llvm-codegen.h/cc. Changes
in other files are mainly renamed calls with the same functionality.

Testing:
No new tests are necessary, as no functionality was changed.

Change-Id: Ib146ca21af82023b0b460f813eae3435b4b2eb77
---
M be/src/benchmarks/hash-benchmark.cc
M be/src/codegen/codegen-anyval.cc
M be/src/codegen/llvm-codegen-test.cc
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exec/exec-node.cc
M be/src/exec/filter-context.cc
M be/src/exec/hash-table.cc
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-avro-scanner.h
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-scanner.cc
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/text-converter.cc
M be/src/exprs/compound-predicates.cc
M be/src/exprs/scalar-expr.cc
M be/src/exprs/scalar-fn-call.cc
M be/src/exprs/slot-ref.cc
M be/src/runtime/descriptors.cc
M be/src/runtime/tuple.cc
M be/src/runtime/types.cc
M be/src/util/tuple-row-compare.cc
26 files changed, 357 insertions(+), 393 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/9063/9
--
To view, visit http://gerrit.cloudera.org:8080/9063
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib146ca21af82023b0b460f813eae3435b4b2eb77
Gerrit-Change-Number: 9063
Gerrit-PatchSet: 9
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5801: Clean up codegen GetType() interface

2018-02-19 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9063 )

Change subject: IMPALA-5801: Clean up codegen GetType() interface
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9063/8/be/src/codegen/llvm-codegen.h
File be/src/codegen/llvm-codegen.h:

http://gerrit.cloudera.org:8080/#/c/9063/8/be/src/codegen/llvm-codegen.h@249
PS8, Line 249:   llvm::PointerType* GetNamedPtrPtrType(const std::string& name);
For me simply reading this header file doesn't bring me closer to understand 
the difference between these 2 functions:
llvm::PointerType* GetNamedPtrPtrType(const std::string& name);
llvm::PointerType* GetNamedPtrType(const std::string& name);

Same return type, same param, the name differs only with an extra 'Ptr' that 
I'm sure makes sense but I find a bit confusing.

What do you think?



--
To view, visit http://gerrit.cloudera.org:8080/9063
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib146ca21af82023b0b460f813eae3435b4b2eb77
Gerrit-Change-Number: 9063
Gerrit-PatchSet: 8
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 19 Feb 2018 14:22:41 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5801: Clean up codegen GetType() interface

2018-02-19 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9063 )

Change subject: IMPALA-5801: Clean up codegen GetType() interface
..


Patch Set 9:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/9063/9/be/src/codegen/llvm-codegen.h
File be/src/codegen/llvm-codegen.h:

http://gerrit.cloudera.org:8080/#/c/9063/9/be/src/codegen/llvm-codegen.h@529
PS9, Line 529:   llvm::Constant* GetI8Constant(uint64_t val) {
 : return llvm::ConstantInt::get(context(), llvm::APInt(8, 
val));
 :   }
 :   llvm::Constant* GetI16Constant(uint64_t val) {
 : return llvm::ConstantInt::get(context(), llvm::APInt(16, 
val));
 :   }
 :   llvm::Constant* GetI32Constant(uint64_t val) {
 : return llvm::ConstantInt::get(context(), llvm::APInt(32, 
val));
 :   }
 :   llvm::Constant* GetI64Constant(uint64_t val) {
I have written the val arguments type to uint64_t to minimize the change 
compared to the original behavior.


http://gerrit.cloudera.org:8080/#/c/9063/8/be/src/codegen/llvm-codegen.h
File be/src/codegen/llvm-codegen.h:

http://gerrit.cloudera.org:8080/#/c/9063/8/be/src/codegen/llvm-codegen.h@272
PS8, Line 272: G
> Nit: extra whitespace
Done


http://gerrit.cloudera.org:8080/#/c/9063/8/be/src/codegen/llvm-codegen.h@277
PS8, Line 277: G
> Nit: extra whitespace
Done


http://gerrit.cloudera.org:8080/#/c/9063/8/be/src/codegen/llvm-codegen.h@506
PS8, Line 506: Consta
> Being curious: is there any specific reason for not changing null_ptr_value
There was no specific reason. I had to change true/false_value to Constant 
because of GetBoolConstant, so I also changed the function that returns them.

Note that some other functions' return type could be changed from Value* to 
Constant*, for example CastPtrToLlvmPtr(). It could be also nice to change 
several variable's type on the call sites of these functions.



--
To view, visit http://gerrit.cloudera.org:8080/9063
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib146ca21af82023b0b460f813eae3435b4b2eb77
Gerrit-Change-Number: 9063
Gerrit-PatchSet: 9
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 19 Feb 2018 14:32:39 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5752: Add support for DECIMAL on Kudu tables

2018-02-19 Thread Grant Henke (Code Review)
Hello Thomas Tauber-Marshall, Taras Bobrovytsky,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9306

to look at the new patch set (#6).

Change subject: IMPALA-5752: Add support for DECIMAL on Kudu tables
..

IMPALA-5752: Add support for DECIMAL on Kudu tables

Adds support for the Kudu DECIMAL type introduced in Kudu 1.7.0.

Note: Adding support for Kudu decimal min/max filters is
tracked in IMPALA-6533.

Tests:
* Added Kudu create with decimal test to AnalyzeDDLTest.java
* Added Kudu table_format to test_decimal_queries.py
** Both decimal.test and decimal-exprs.test workloads
* Added decimal queries to the following Kudu workloads:
** kudu_create.test
** kudu_delete.test
** kudu_insert.test
** kudu_update.test
** kudu_upsert.test

Change-Id: I2f927fce25dc4fa9529a4e0b688825699f5c0ea6
---
M be/src/exec/kudu-scanner.cc
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-util.cc
M be/src/exec/kudu-util.h
M be/src/exprs/kudu-partition-expr.cc
M fe/src/main/java/org/apache/impala/catalog/KuduColumn.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/KuduUtil.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M testdata/workloads/functional-query/queries/QueryTest/kudu_create.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_delete.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_describe.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_update.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_upsert.test
M tests/query_test/test_decimal_queries.py
M tests/query_test/test_kudu.py
21 files changed, 784 insertions(+), 626 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/9306/6
--
To view, visit http://gerrit.cloudera.org:8080/9306
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2f927fce25dc4fa9529a4e0b688825699f5c0ea6
Gerrit-Change-Number: 9306
Gerrit-PatchSet: 6
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-6538: Fix read path when Parquet min/max statistics contain NaN

2018-02-19 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/9358


Change subject: IMPALA-6538: Fix read path when Parquet min/max statistics 
contain NaN
..

IMPALA-6538: Fix read path when Parquet min/max statistics contain NaN

If the first number in a row group written by Impala is NaN,
then Impala writes incorrect statistics in the metadata.
This will result in incorrect results when filtering the
data.

This commit fixes the read path when encountering NaNs in
Parquet min/max statistics. If max is NaN, then Impala
will interpret it as infinity. If min is NaN, then -infinity
will be used.

I added some tests to QueryTest/parqet-stats.test

Change-Id: If3897fc1426541239223670812f59e2bed32f455
---
M be/src/exec/parquet-column-stats.cc
M testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test
2 files changed, 48 insertions(+), 4 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/9358/1
--
To view, visit http://gerrit.cloudera.org:8080/9358
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: If3897fc1426541239223670812f59e2bed32f455
Gerrit-Change-Number: 9358
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-5801: Clean up codegen GetType() interface

2018-02-19 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9063 )

Change subject: IMPALA-5801: Clean up codegen GetType() interface
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9063/8/be/src/codegen/llvm-codegen.h
File be/src/codegen/llvm-codegen.h:

http://gerrit.cloudera.org:8080/#/c/9063/8/be/src/codegen/llvm-codegen.h@249
PS8, Line 249:   llvm::PointerType* GetNamedPtrPtrType(const std::string& name);
> For me simply reading this header file doesn't bring me closer to understan
The difference is the same as between T* and T**.

I would keep it as it is, as the Ptr/PtrPtr convention is used in codegen since 
a long time, but I am curious, if you have a better naming convention in mind.



--
To view, visit http://gerrit.cloudera.org:8080/9063
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib146ca21af82023b0b460f813eae3435b4b2eb77
Gerrit-Change-Number: 9063
Gerrit-PatchSet: 8
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 19 Feb 2018 15:13:00 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6538: Fix read path when Parquet min/max statistics contain NaN

2018-02-19 Thread Zoltan Borok-Nagy (Code Review)
Hello Gabor Kaszab, Tim Armstrong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9358

to look at the new patch set (#2).

Change subject: IMPALA-6538: Fix read path when Parquet min/max statistics 
contain NaN
..

IMPALA-6538: Fix read path when Parquet min/max statistics contain NaN

If the first number in a row group written by Impala is NaN,
then Impala writes incorrect statistics in the metadata.
This will result in incorrect results when filtering the
data.

This commit fixes the read path when encountering NaNs in
Parquet min/max statistics. If max is NaN, then Impala
will interpret it as infinity. If min is NaN, then -infinity
will be used.

I added some tests to QueryTest/parqet-stats.test

Change-Id: If3897fc1426541239223670812f59e2bed32f455
---
M be/src/exec/parquet-column-stats.cc
M testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test
2 files changed, 48 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/9358/2
--
To view, visit http://gerrit.cloudera.org:8080/9358
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If3897fc1426541239223670812f59e2bed32f455
Gerrit-Change-Number: 9358
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6423: HDFS scanner doesn't check RuntimeState::is cancelled()

2018-02-19 Thread Attila Jeges (Code Review)
Attila Jeges has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9352 )

Change subject: IMPALA-6423: HDFS scanner doesn't check 
RuntimeState::is_cancelled()
..


Patch Set 2: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/9352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3eb960d9d6f2dde4b2cb4988aa06288d11802b9a
Gerrit-Change-Number: 9352
Gerrit-PatchSet: 2
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 19 Feb 2018 15:18:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6538: Fix read path when Parquet min/max statistics contain NaN

2018-02-19 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9358 )

Change subject: IMPALA-6538: Fix read path when Parquet min/max statistics 
contain NaN
..


Patch Set 2:

(2 comments)

Seems fine in general. I have some minor comments.

http://gerrit.cloudera.org:8080/#/c/9358/2/be/src/exec/parquet-column-stats.cc
File be/src/exec/parquet-column-stats.cc:

http://gerrit.cloudera.org:8080/#/c/9358/2/be/src/exec/parquet-column-stats.cc@29
PS2, Line 29: void ChangeNaNToInf(void *slot, ColumnStatsBase::StatsField 
stats_field) {
1: Can this be part of ColumnStatsBase?
2: Could you add some description to the function? e.g. can we expect changes 
on slot parameter?


http://gerrit.cloudera.org:8080/#/c/9358/2/testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test
File testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test:

http://gerrit.cloudera.org:8080/#/c/9358/2/testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test@496
PS2, Line 496: create table test_nan(val double) stored as parquet;
Could you add a description here why we test this? (to check that when the 
first item in the row group is NaN then it doesn't ruin min/max stats and as a 
result it doesn't rule out the whole row group)



--
To view, visit http://gerrit.cloudera.org:8080/9358
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If3897fc1426541239223670812f59e2bed32f455
Gerrit-Change-Number: 9358
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 19 Feb 2018 15:40:18 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5801: Clean up codegen GetType() interface

2018-02-19 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9063 )

Change subject: IMPALA-5801: Clean up codegen GetType() interface
..


Patch Set 9:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9063/8/be/src/codegen/llvm-codegen.h
File be/src/codegen/llvm-codegen.h:

http://gerrit.cloudera.org:8080/#/c/9063/8/be/src/codegen/llvm-codegen.h@249
PS8, Line 249:   llvm::PointerType* GetNamedPtrPtrType(const std::string& name);
> The difference is the same as between T* and T**.
Thanks for the explanation!
Well, the naming feels a bit strange for me, but I don't have a better one up 
my sleeve :)



--
To view, visit http://gerrit.cloudera.org:8080/9063
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib146ca21af82023b0b460f813eae3435b4b2eb77
Gerrit-Change-Number: 9063
Gerrit-PatchSet: 9
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 19 Feb 2018 16:03:49 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5752: Add support for DECIMAL on Kudu tables

2018-02-19 Thread Grant Henke (Code Review)
Hello Thomas Tauber-Marshall, Taras Bobrovytsky,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9306

to look at the new patch set (#7).

Change subject: IMPALA-5752: Add support for DECIMAL on Kudu tables
..

IMPALA-5752: Add support for DECIMAL on Kudu tables

Adds support for the Kudu DECIMAL type introduced in Kudu 1.7.0.

Note: Adding support for Kudu decimal min/max filters is
tracked in IMPALA-6533.

Tests:
* Added Kudu create with decimal test to AnalyzeDDLTest.java
* Added Kudu table_format to test_decimal_queries.py
** Both decimal.test and decimal-exprs.test workloads
* Added decimal queries to the following Kudu workloads:
** kudu_create.test
** kudu_delete.test
** kudu_insert.test
** kudu_update.test
** kudu_upsert.test

Change-Id: I2f927fce25dc4fa9529a4e0b688825699f5c0ea6
---
M be/src/exec/kudu-scanner.cc
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-util.cc
M be/src/exec/kudu-util.h
M be/src/exprs/kudu-partition-expr.cc
M fe/src/main/java/org/apache/impala/catalog/KuduColumn.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/KuduUtil.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M testdata/workloads/functional-query/queries/QueryTest/kudu_create.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_delete.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_describe.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_update.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_upsert.test
M tests/query_test/test_decimal_queries.py
M tests/query_test/test_kudu.py
21 files changed, 784 insertions(+), 626 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/9306/7
--
To view, visit http://gerrit.cloudera.org:8080/9306
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2f927fce25dc4fa9529a4e0b688825699f5c0ea6
Gerrit-Change-Number: 9306
Gerrit-PatchSet: 7
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-4835: Part 2: Allocate scan range buffers upfront

2018-02-19 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8707 )

Change subject: IMPALA-4835: Part 2: Allocate scan range buffers upfront
..


Patch Set 28:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8707/26/be/src/runtime/io/disk-io-mgr.h
File be/src/runtime/io/disk-io-mgr.h:

http://gerrit.cloudera.org:8080/#/c/8707/26/be/src/runtime/io/disk-io-mgr.h@290
PS26, Line 290: ge can be sc
> That seems like an improvement to me. Up to you if you want to incorporate
Done


http://gerrit.cloudera.org:8080/#/c/8707/28/be/src/runtime/io/request-context.h
File be/src/runtime/io/request-context.h:

http://gerrit.cloudera.org:8080/#/c/8707/28/be/src/runtime/io/request-context.h@50
PS28, Line 50: AllocateBuffers
> Do you mean AllocateBuffersForRange() or AddUnusedBuffers()?
Done


http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/request-context.cc
File be/src/runtime/io/request-context.cc:

http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/request-context.cc@160
PS25, Line 160: ScanRange* scan_range = static_cast(range);
> I guess it's okay.
Will leave for now. It seems like it could do with some cleanup.



--
To view, visit http://gerrit.cloudera.org:8080/8707
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f
Gerrit-Change-Number: 8707
Gerrit-PatchSet: 28
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 19 Feb 2018 18:16:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4835: Part 2: Allocate scan range buffers upfront

2018-02-19 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8707 )

Change subject: IMPALA-4835: Part 2: Allocate scan range buffers upfront
..


Patch Set 28: Code-Review+2

Carry +2


--
To view, visit http://gerrit.cloudera.org:8080/8707
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f
Gerrit-Change-Number: 8707
Gerrit-PatchSet: 28
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 19 Feb 2018 18:17:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4835: Part 2: Allocate scan range buffers upfront

2018-02-19 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8707 )

Change subject: IMPALA-4835: Part 2: Allocate scan range buffers upfront
..


Patch Set 30: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8707
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f
Gerrit-Change-Number: 8707
Gerrit-PatchSet: 30
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 19 Feb 2018 18:17:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4835: Part 2: Allocate scan range buffers upfront

2018-02-19 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8707 )

Change subject: IMPALA-4835: Part 2: Allocate scan range buffers upfront
..


Patch Set 29: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8707
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f
Gerrit-Change-Number: 8707
Gerrit-PatchSet: 29
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 19 Feb 2018 18:17:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4835: Part 2: Allocate scan range buffers upfront

2018-02-19 Thread Tim Armstrong (Code Review)
Hello Michael Ho, Tianyi Wang, Bikramjeet Vig, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8707

to look at the new patch set (#29).

Change subject: IMPALA-4835: Part 2: Allocate scan range buffers upfront
..

IMPALA-4835: Part 2: Allocate scan range buffers upfront

This change is a step towards reserving memory for buffers from the
buffer pool and constraining per-scanner memory requirements. This
change restructures the DiskIoMgr code so that each ScanRange operates
with a fixed set of buffers that are allocated upfront and recycled as
the I/O mgr works through the ScanRange.

One major change is that ScanRanges get blocked when a buffer is not
available and get unblocked when a client returns a buffer via
ReturnBuffer(). I was able to remove the logic to maintain the
blocked_ranges_ list by instead adding a separate set with all ranges
that are active.

There is also some miscellaneous cleanup included - e.g. reducing the
amount of code devoted to maintaining counters and metrics.

I plan to merge this along with the actual buffer pool switch, but
separated it out to allow review of the DiskIoMgr changes separate from
other aspects of the buffer pool switchover.

Testing:
* Ran core and exhaustive tests.

Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-scan-node-mt.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/scanner-context.cc
M be/src/runtime/bufferpool/buffer-pool.h
M be/src/runtime/io/disk-io-mgr-internal.h
M be/src/runtime/io/disk-io-mgr-stress-test.cc
M be/src/runtime/io/disk-io-mgr-stress.cc
M be/src/runtime/io/disk-io-mgr-test.cc
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/disk-io-mgr.h
M be/src/runtime/io/request-context.cc
M be/src/runtime/io/request-context.h
M be/src/runtime/io/request-ranges.h
M be/src/runtime/io/scan-range.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/runtime/tmp-file-mgr.h
M be/src/util/bit-util-test.cc
M be/src/util/bit-util.h
20 files changed, 994 insertions(+), 580 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/8707/29
--
To view, visit http://gerrit.cloudera.org:8080/8707
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f
Gerrit-Change-Number: 8707
Gerrit-PatchSet: 29
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation

2018-02-19 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8414 )

Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation
..


Patch Set 23: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1
Gerrit-Change-Number: 8414
Gerrit-PatchSet: 23
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 19 Feb 2018 18:17:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4835: Part 2: Allocate scan range buffers upfront

2018-02-19 Thread Tim Armstrong (Code Review)
Hello Michael Ho, Tianyi Wang, Bikramjeet Vig, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8707

to look at the new patch set (#31).

Change subject: IMPALA-4835: Part 2: Allocate scan range buffers upfront
..

IMPALA-4835: Part 2: Allocate scan range buffers upfront

This change is a step towards reserving memory for buffers from the
buffer pool and constraining per-scanner memory requirements. This
change restructures the DiskIoMgr code so that each ScanRange operates
with a fixed set of buffers that are allocated upfront and recycled as
the I/O mgr works through the ScanRange.

One major change is that ScanRanges get blocked when a buffer is not
available and get unblocked when a client returns a buffer via
ReturnBuffer(). I was able to remove the logic to maintain the
blocked_ranges_ list by instead adding a separate set with all ranges
that are active.

There is also some miscellaneous cleanup included - e.g. reducing the
amount of code devoted to maintaining counters and metrics.

I plan to merge this along with the actual buffer pool switch, but
separated it out to allow review of the DiskIoMgr changes separate from
other aspects of the buffer pool switchover.

Testing:
* Ran core and exhaustive tests.

Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-scan-node-mt.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/scanner-context.cc
M be/src/runtime/bufferpool/buffer-pool.h
M be/src/runtime/io/disk-io-mgr-internal.h
M be/src/runtime/io/disk-io-mgr-stress-test.cc
M be/src/runtime/io/disk-io-mgr-stress.cc
M be/src/runtime/io/disk-io-mgr-test.cc
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/disk-io-mgr.h
M be/src/runtime/io/request-context.cc
M be/src/runtime/io/request-context.h
M be/src/runtime/io/request-ranges.h
M be/src/runtime/io/scan-range.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/runtime/tmp-file-mgr.h
M be/src/util/bit-util-test.cc
M be/src/util/bit-util.h
20 files changed, 996 insertions(+), 580 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/8707/31
--
To view, visit http://gerrit.cloudera.org:8080/8707
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f
Gerrit-Change-Number: 8707
Gerrit-PatchSet: 31
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4835: Part 2: Allocate scan range buffers upfront

2018-02-19 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8707 )

Change subject: IMPALA-4835: Part 2: Allocate scan range buffers upfront
..


Patch Set 31: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8707
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f
Gerrit-Change-Number: 8707
Gerrit-PatchSet: 31
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 19 Feb 2018 18:37:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4835: Part 2: Allocate scan range buffers upfront

2018-02-19 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8707 )

Change subject: IMPALA-4835: Part 2: Allocate scan range buffers upfront
..


Patch Set 28:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8707/28/be/src/runtime/io/disk-io-mgr.h
File be/src/runtime/io/disk-io-mgr.h:

http://gerrit.cloudera.org:8080/#/c/8707/28/be/src/runtime/io/disk-io-mgr.h@81
PS28, Line 81: ///  2. GetNextRange(): returns to the caller the next scan 
range it should process.
> it'd be good to mention the StartScanRange() alternative path.
Done



--
To view, visit http://gerrit.cloudera.org:8080/8707
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f
Gerrit-Change-Number: 8707
Gerrit-PatchSet: 28
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 19 Feb 2018 18:37:18 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5752: Add support for DECIMAL on Kudu tables

2018-02-19 Thread Grant Henke (Code Review)
Hello Thomas Tauber-Marshall, Taras Bobrovytsky,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9306

to look at the new patch set (#8).

Change subject: IMPALA-5752: Add support for DECIMAL on Kudu tables
..

IMPALA-5752: Add support for DECIMAL on Kudu tables

Adds support for the Kudu DECIMAL type introduced in Kudu 1.7.0.

Note: Adding support for Kudu decimal min/max filters is
tracked in IMPALA-6533.

Tests:
* Added Kudu create with decimal test to AnalyzeDDLTest.java
* Added Kudu table_format to test_decimal_queries.py
** Both decimal.test and decimal-exprs.test workloads
* Added decimal queries to the following Kudu workloads:
** kudu_create.test
** kudu_delete.test
** kudu_insert.test
** kudu_update.test
** kudu_upsert.test

Change-Id: I2f927fce25dc4fa9529a4e0b688825699f5c0ea6
---
M be/src/exec/kudu-scanner.cc
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-util.cc
M be/src/exec/kudu-util.h
M be/src/exprs/kudu-partition-expr.cc
M fe/src/main/java/org/apache/impala/catalog/KuduColumn.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/KuduUtil.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M testdata/workloads/functional-query/queries/QueryTest/kudu_create.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_delete.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_describe.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_update.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_upsert.test
M tests/query_test/test_decimal_queries.py
M tests/query_test/test_kudu.py
21 files changed, 794 insertions(+), 632 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/9306/8
--
To view, visit http://gerrit.cloudera.org:8080/9306
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2f927fce25dc4fa9529a4e0b688825699f5c0ea6
Gerrit-Change-Number: 9306
Gerrit-PatchSet: 8
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-4835: Part 3: switch I/O buffers to buffer pool

2018-02-19 Thread Tim Armstrong (Code Review)
Hello Bikramjeet Vig,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8966

to look at the new patch set (#17).

Change subject: IMPALA-4835: Part 3: switch I/O buffers to buffer pool
..

IMPALA-4835: Part 3: switch I/O buffers to buffer pool

This is the final patch to switch the Disk I/O manager to allocate all
buffer from the buffer pool and to reserve the buffers required for
a query upfront.

* The planner reserves enough memory to run a single scanner per
  scan node.
* The multi-threaded scan node must increase reservation before
  spinning up more threads.
* The scanner implementations must be careful to stay within their
  assigned reservation.

The row-oriented scanners were most straightforward, since they only
have a single scan range active at a time. A single I/O buffer is
sufficient to scan the whole file but more I/O buffers can improve I/O
throughput.

Parquet is more complex because it issues a scan range per column and
the sizes of the columns on disk are not known during planning. To
deal with this, the reservation in the frontend is based on a
heuristic involving the file size and # columns. The Parquet scanner
can then divvy up reservation to columns based on the size of column
data on disk.

I adjusted how the 'mem_limit' is divided between buffer pool and non
buffer pool memory for low mem_limits to account for the increase in
buffer pool memory.

Testing:
* Added more planner tests to cover reservation calcs for scan node.
* Test scanners for all file formats with the reservation denial debug
  action, to test behaviour when the scanners hit reservation limits.
* Updated memory and buffer pool limits for tests.
* Added unit tests for dividing reservation between columns in parquet,
  since the algorithm is non-trivial.

Perf:
I ran TPC-H and targeted perf locally comparing with master. Both
showed small improvements of a few percent and no regressions of
note. Cluster perf tests showed no significant change.

Change-Id: Ic09c6196b31e55b301df45cc56d0b72cfece6786
---
M be/src/exec/CMakeLists.txt
A be/src/exec/hdfs-parquet-scanner-test.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scan-node-mt.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-readers.h
M be/src/exec/scanner-context.cc
M be/src/exec/scanner-context.h
M be/src/runtime/bufferpool/reservation-tracker-test.cc
M be/src/runtime/bufferpool/reservation-util.cc
M be/src/runtime/io/disk-io-mgr-stress-test.cc
M be/src/runtime/io/disk-io-mgr-stress.cc
M be/src/runtime/io/disk-io-mgr-stress.h
M be/src/runtime/io/disk-io-mgr-test.cc
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/disk-io-mgr.h
M be/src/runtime/io/request-context.cc
M be/src/runtime/io/request-context.h
M be/src/runtime/io/request-ranges.h
M be/src/runtime/io/scan-range.cc
M be/src/runtime/tmp-file-mgr.cc
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/util/BitUtil.java
M fe/src/test/java/org/apache/impala/util/BitUtilTest.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
M testdata/workloads/functional-planner/queries/PlannerTest/disable-codegen.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test
M testdata/workloads/functional-planner/queries/PlannerTest/max-row-size.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/partition-pruning.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/sort-expr-materialization.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test
M testdata/workloads/functional-planner/queries/PlannerTest/tablesample.test
M testdata/workloads/functional-planner/queries/PlannerTest/union.test
M 
testdata/workloads/functional-query/queries/QueryTest/admission-reject-min-reservation.test
M testdata/workloads/functional-query/queries/QueryTest/analytic-fns.test
M testdata/workloads/functional-query/queries/QueryTest/codegen-mem-limit.test
M 
testdata/workloads/functional-query/queries/QueryTest/disk-spill-encryption.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level0.test
M testdata/workloads/functional-query/queries/Que

[Impala-ASF-CR] IMPALA-4835: Part 3: switch I/O buffers to buffer pool

2018-02-19 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8966 )

Change subject: IMPALA-4835: Part 3: switch I/O buffers to buffer pool
..


Patch Set 17:

Rebased and fixed a couple of bugs with reservation calculations that I found 
running tpch_nested at scale.

Bug 1: Rounded up instead of down with 
RoundUpToPowerOfTwo(reservation_to_distribute), leading to reservation being 
incorrectly distributed between columns. Added backend test to repro.
Bug 2: Didn't reserve memory for columns in nested collections. Added planner 
tests to directly check this.


--
To view, visit http://gerrit.cloudera.org:8080/8966
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic09c6196b31e55b301df45cc56d0b72cfece6786
Gerrit-Change-Number: 8966
Gerrit-PatchSet: 17
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 19 Feb 2018 20:50:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6538: Fix read path when Parquet min/max statistics contain NaN

2018-02-19 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9358 )

Change subject: IMPALA-6538: Fix read path when Parquet min/max statistics 
contain NaN
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9358/2/testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test
File testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test:

http://gerrit.cloudera.org:8080/#/c/9358/2/testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test@520
PS2, Line 520: select * from test_nan where not val >= 0
Maybe also test !=?



--
To view, visit http://gerrit.cloudera.org:8080/9358
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If3897fc1426541239223670812f59e2bed32f455
Gerrit-Change-Number: 9358
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 19 Feb 2018 20:59:05 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6423: HDFS scanner doesn't check RuntimeState::is cancelled()

2018-02-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9352 )

Change subject: IMPALA-6423: HDFS scanner doesn't check 
RuntimeState::is_cancelled()
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1952/


--
To view, visit http://gerrit.cloudera.org:8080/9352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3eb960d9d6f2dde4b2cb4988aa06288d11802b9a
Gerrit-Change-Number: 9352
Gerrit-PatchSet: 3
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 19 Feb 2018 21:01:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6423: HDFS scanner doesn't check RuntimeState::is cancelled()

2018-02-19 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9352 )

Change subject: IMPALA-6423: HDFS scanner doesn't check 
RuntimeState::is_cancelled()
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/9352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3eb960d9d6f2dde4b2cb4988aa06288d11802b9a
Gerrit-Change-Number: 9352
Gerrit-PatchSet: 2
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 19 Feb 2018 21:00:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6423: HDFS scanner doesn't check RuntimeState::is cancelled()

2018-02-19 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9352 )

Change subject: IMPALA-6423: HDFS scanner doesn't check 
RuntimeState::is_cancelled()
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/9352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3eb960d9d6f2dde4b2cb4988aa06288d11802b9a
Gerrit-Change-Number: 9352
Gerrit-PatchSet: 3
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 19 Feb 2018 21:00:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool

2018-02-19 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9344 )

Change subject: IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from 
BufferPool
..


Patch Set 7: Code-Review+2

(2 comments)

http://gerrit.cloudera.org:8080/#/c/9344/6/be/src/runtime/row-batch.cc
File be/src/runtime/row-batch.cc:

http://gerrit.cloudera.org:8080/#/c/9344/6/be/src/runtime/row-batch.cc@193
PS6, Line 193:   row_batch->tuple_ptrs_info_->client = client;
> We rely on tuple_ptrs_info_ being set to distinguish where tuple_ptrs_ is a
Ah I see, I follow now.


http://gerrit.cloudera.org:8080/#/c/9344/7/be/src/runtime/row-batch.cc
File be/src/runtime/row-batch.cc:

http://gerrit.cloudera.org:8080/#/c/9344/7/be/src/runtime/row-batch.cc@206
PS7, Line 206:   row_batch->num_rows_ = header.num_rows();
We're expecting that capacity_ == header.num_rows() here, right? Can we add a 
DCHECK



--
To view, visit http://gerrit.cloudera.org:8080/9344
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If4b1a45f68b9df0d3b539511e15aff15700246f2
Gerrit-Change-Number: 9344
Gerrit-PatchSet: 7
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 19 Feb 2018 21:09:40 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6538: Fix read path when Parquet min/max statistics contain NaN

2018-02-19 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9358 )

Change subject: IMPALA-6538: Fix read path when Parquet min/max statistics 
contain NaN
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/9358/2/be/src/exec/parquet-column-stats.cc
File be/src/exec/parquet-column-stats.cc:

http://gerrit.cloudera.org:8080/#/c/9358/2/be/src/exec/parquet-column-stats.cc@109
PS2, Line 109:   ChangeNaNToInf(slot, stats_field);
I think we can just return false here if a value is NaN. That way the caller 
will know that it shouldn't use the stats (see comment of the ReadFromThrift()).


http://gerrit.cloudera.org:8080/#/c/9358/2/testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test
File testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test:

http://gerrit.cloudera.org:8080/#/c/9358/2/testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test@497
PS2, Line 497: insert into test_nan values (cast('NaN' as double)), (42);
Can you also add tests where NaN is inserted in the middle of two values? For 
those, "where not val >= 0" should be covered, too.



--
To view, visit http://gerrit.cloudera.org:8080/9358
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If3897fc1426541239223670812f59e2bed32f455
Gerrit-Change-Number: 9358
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 19 Feb 2018 22:17:30 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6269: Expose KRPC metrics on debug webpage

2018-02-19 Thread Lars Volker (Code Review)
Hello Michael Ho, Sailesh Mukil,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9292

to look at the new patch set (#12).

Change subject: IMPALA-6269: Expose KRPC metrics on debug webpage
..

IMPALA-6269: Expose KRPC metrics on debug webpage

This change exposes KRPC metrics on the /rpcz debug web page.

This change also exposes metrics for rejected RPCs on the /metrics debug
web page. See here for an example: https://git.io/vAczm

This change also fixes a bug in PrettyPrinter::GetByteUnit(), which
previously did not work for unsigned values due to an implicit cast.

This change contains tests to check that the metrics show up in /rpcz
and /metrics and that they update as expected when executing queries.

This change is based on a change by Sailesh Mukil.

Change-Id: I7af7c1a84a5be82c979ca4ef1edf35167493be3f
---
M be/src/rpc/impala-service-pool.cc
M be/src/rpc/impala-service-pool.h
M be/src/rpc/rpc-mgr.cc
M be/src/rpc/rpc-mgr.h
M be/src/rpc/rpc-trace.cc
M be/src/rpc/rpc-trace.h
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/service/impalad-main.cc
M be/src/util/histogram-metric.h
M be/src/util/pretty-printer.h
M common/thrift/Metrics.thrift
M common/thrift/metrics.json
A tests/custom_cluster/test_krpc_metrics.py
M tests/webserver/test_web_pages.py
M www/rpcz.tmpl
16 files changed, 505 insertions(+), 107 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/9292/12
--
To view, visit http://gerrit.cloudera.org:8080/9292
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7af7c1a84a5be82c979ca4ef1edf35167493be3f
Gerrit-Change-Number: 9292
Gerrit-PatchSet: 12
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-6269: Expose KRPC metrics on debug webpage

2018-02-19 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9292 )

Change subject: IMPALA-6269: Expose KRPC metrics on debug webpage
..


Patch Set 11:

(11 comments)

Thanks for the review. Please see PS11 and my inline comments. I will rebase 
the change next and then address comments related to the deprecated startup 
options.

http://gerrit.cloudera.org:8080/#/c/9292/11/be/src/rpc/impala-service-pool.h
File be/src/rpc/impala-service-pool.h:

http://gerrit.cloudera.org:8080/#/c/9292/11/be/src/rpc/impala-service-pool.h@83
PS11, Line 83: map
> Will an unordered_map suffice ? It seems to be more performant.
Done


http://gerrit.cloudera.org:8080/#/c/9292/11/be/src/rpc/impala-service-pool.h@85
PS11, Line 85:   // Number of RPCs that were rejected due to the queue being 
full.
> Owned by the rpc metrics group singleton.
I added to the comment that this is not owned, it feels more future proof to 
not include the actual owner and I think we omit it in other places, too. Let 
me know if you would like to include the owning entity here.


http://gerrit.cloudera.org:8080/#/c/9292/11/be/src/rpc/impala-service-pool.h@86
PS11, Line 86:   IntCounter* rpcs_queue_overflow_;
> = nullptr
Done


http://gerrit.cloudera.org:8080/#/c/9292/11/be/src/rpc/impala-service-pool.cc
File be/src/rpc/impala-service-pool.cc:

http://gerrit.cloudera.org:8080/#/c/9292/11/be/src/rpc/impala-service-pool.cc@246
PS11, Line 246: Value service_name_val(service_name().c_str(), 
document->GetAllocator());
> nit: indent wrong
Done


http://gerrit.cloudera.org:8080/#/c/9292/11/be/src/rpc/impala-service-pool.cc@264
PS11, Line 264:
> nit: indent
Done


http://gerrit.cloudera.org:8080/#/c/9292/11/be/src/rpc/rpc-mgr.h
File be/src/rpc/rpc-mgr.h:

http://gerrit.cloudera.org:8080/#/c/9292/11/be/src/rpc/rpc-mgr.h@183
PS11, Line 183:  Not owned.
> shared_ptr<> means co-ownership no ?
Done


http://gerrit.cloudera.org:8080/#/c/9292/11/be/src/rpc/rpc-trace.cc
File be/src/rpc/rpc-trace.cc:

http://gerrit.cloudera.org:8080/#/c/9292/11/be/src/rpc/rpc-trace.cc@147
PS11, Line 147: MetricGroup* metrics
> Should callers now be passing the exec_env_->rpc_metrics() so all RPC relat
I think this would break compatibility with previous versions of Impala, for 
example users might have tools that rely on the metric group hierarchy as it 
is. I created IMPALA-6540 for this.


http://gerrit.cloudera.org:8080/#/c/9292/11/be/src/util/histogram-metric.h
File be/src/util/histogram-metric.h:

http://gerrit.cloudera.org:8080/#/c/9292/11/be/src/util/histogram-metric.h@96
PS11, Line 96: //
> nit: ///
Done


http://gerrit.cloudera.org:8080/#/c/9292/11/tests/custom_cluster/test_krpc_metrics.py
File tests/custom_cluster/test_krpc_metrics.py:

http://gerrit.cloudera.org:8080/#/c/9292/11/tests/custom_cluster/test_krpc_metrics.py@52
PS11, Line 52: -datastream_service_queue_depth=1
> This flag is being deprecated here https://gerrit.cloudera.org/#/c/9282/
I will address all other comments, then rebase the change, and then address 
this one.


http://gerrit.cloudera.org:8080/#/c/9292/11/tests/custom_cluster/test_krpc_metrics.py@60
PS11, Line 60: [0]
> This may not be necessary now but it'd be better to not assume that DataStr
The idea was that there needs to be at least one service, but you're right, 
this depends on the datastream service specific startup flag so I'll change 
this together with the flag after the rebase.


http://gerrit.cloudera.org:8080/#/c/9292/11/tests/custom_cluster/test_krpc_metrics.py@70
PS11, Line 70: -datastream_service_queue_depth=1
> Same here
See above.



--
To view, visit http://gerrit.cloudera.org:8080/9292
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7af7c1a84a5be82c979ca4ef1edf35167493be3f
Gerrit-Change-Number: 9292
Gerrit-PatchSet: 11
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Mon, 19 Feb 2018 22:34:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] KUDU-2004. Undefined behavior in TlsSocket::Writev()

2018-02-19 Thread Michael Ho (Code Review)
Hello Mike Percy, Kudu Jenkins,

I'd like you to do a code review. Please visit

http://gerrit.cloudera.org:8080/9359

to review the following change.


Change subject: KUDU-2004. Undefined behavior in TlsSocket::Writev()
..

KUDU-2004. Undefined behavior in TlsSocket::Writev()

TlsSocket::Writev() was attempting to use the value of nwritten from
TlsSocket::Write(), but in the case of an error that value was never
set or initialized. A simple check to make sure the result from
TlsSocket::Write() wasn't an error was added, otherwise we break out of
the write loop to cleanup and return the error (thus skipping the line
that uses nwritten)

Dist job result from before the fix:
http://dist-test.cloudera.org/job?job_id=efan.1496860112.16151

Dist job result from after the fix:
http://dist-test.cloudera.org/job?job_id=efan.1497036430.19311

Change-Id: Ia5b5bbb3fd2ec8fcd1a48873446f3aa09546eaac
Reviewed-on: http://gerrit.cloudera.org:8080/7141
Tested-by: Kudu Jenkins
Reviewed-by: Mike Percy 
---
M be/src/kudu/security/tls_socket.cc
M be/src/kudu/util/net/socket.h
2 files changed, 10 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/9359/1
--
To view, visit http://gerrit.cloudera.org:8080/9359
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia5b5bbb3fd2ec8fcd1a48873446f3aa09546eaac
Gerrit-Change-Number: 9359
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Edward Fancher 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 


[Impala-ASF-CR] KUDU-2004. Undefined behavior in TlsSocket::Writev()

2018-02-19 Thread Michael Ho (Code Review)
Michael Ho has removed Kudu Jenkins from this change.  ( 
http://gerrit.cloudera.org:8080/9359 )

Change subject: KUDU-2004. Undefined behavior in TlsSocket::Writev()
..


Removed reviewer Kudu Jenkins.
--
To view, visit http://gerrit.cloudera.org:8080/9359
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: deleteReviewer
Gerrit-Change-Id: Ia5b5bbb3fd2ec8fcd1a48873446f3aa09546eaac
Gerrit-Change-Number: 9359
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Edward Fancher 
Gerrit-Reviewer: Mike Percy 


[Impala-ASF-CR] [security] test and fixes for TLS socket EINTR issues

2018-02-19 Thread Michael Ho (Code Review)
Hello Alexey Serbin, Dan Burkert, Kudu Jenkins,

I'd like you to do a code review. Please visit

http://gerrit.cloudera.org:8080/9360

to review the following change.


Change subject: [security] test and fixes for TLS socket EINTR issues
..

[security] test and fixes for TLS socket EINTR issues

SSL_{read,write}() can return SSL_ERROR_WANT_{READ,WRITE}
correspondingly when signal interrupts recv()/send() calls even if
SSL_MODE_AUTO_RETRY is set in the TLS context.  To handle that
properly in Socket::Blocking{Recv,Write}() methods, return
NetworkError() with appropriate POSIX error code from
TlsSocket::{Recv,Write}().

As a by-product, this changelist fixes flakiness in TestUniqueClientIds
scenario of the ClientStressTest test and other flaky tests which failed
with errors like below:

  Bad status: IO error: Could not connect to the cluster: \
Client connection negotiation failed: client connection to \
IP:port: Read zero bytes on a blocking Recv() call: \
Transferred 0 of 4 bytes

Prior to this fix, the test failure ratio observed with dist-test
for TSAN builds was about 6% in multiple 1K runs.  After the fix,
no failures observed.

Change-Id: Ibec9049186f79f1c43295e4735538ed7ba4e516e
Reviewed-on: http://gerrit.cloudera.org:8080/8462
Tested-by: Kudu Jenkins
Reviewed-by: Dan Burkert 
Reviewed-by: Alexey Serbin 
---
M be/src/kudu/security/CMakeLists.txt
M be/src/kudu/security/tls_handshake.cc
M be/src/kudu/security/tls_socket.cc
M be/src/kudu/util/net/socket.cc
A security/tls_socket-test.cc
5 files changed, 227 insertions(+), 7 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/9360/1
--
To view, visit http://gerrit.cloudera.org:8080/9360
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ibec9049186f79f1c43295e4735538ed7ba4e516e
Gerrit-Change-Number: 9360
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] KUDU-2218. tls socket: properly handle temporary socket errors in Writev

2018-02-19 Thread Michael Ho (Code Review)
Hello Alexey Serbin, Kudu Jenkins,

I'd like you to do a code review. Please visit

http://gerrit.cloudera.org:8080/9361

to review the following change.


Change subject: KUDU-2218. tls_socket: properly handle temporary socket errors 
in Writev
..

KUDU-2218. tls_socket: properly handle temporary socket errors in Writev

This fixes a bug which caused RaftConsensusITest.TestLargeBatches to
fail when run under stress, as in the following command line:

taskset -c 0-4 \
 build/latest/bin/raft_consensus-itest \
   --gtest_filter=\*LargeBat\* \
   --stress-cpu-threads=8

This would produce an error like:
Network error: failed to write to TLS socket: error:1409F07F:SSL 
routines:SSL3_WRITE_PENDING:bad write retry:s3_pkt.c:878

This means that we were retrying a write after getting EAGAIN, but with
a different buffer than the first time.

I tracked this down to mishandling of temporary socket errors in
TlsSocket::Writev(). In the case that we successfully write part of the
io vector but hit such an error trying to write a later element in the
vector, we were still propagating the error back up to the caller. The
caller didn't realize that part of the write was successful, and thus it
would retry the write from the beginning.

The fix is to fix the above, but also to enable partial writes in
TlsContext. The new test fails if either of the above two changes are
backed out.

Change-Id: If797f220f42bfb2e6f452b66f15e7a758e883472
Reviewed-on: http://gerrit.cloudera.org:8080/8570
Tested-by: Kudu Jenkins
Reviewed-by: Alexey Serbin 
---
M be/src/kudu/security/tls_context.cc
M be/src/kudu/security/tls_socket.cc
M security/tls_socket-test.cc
3 files changed, 219 insertions(+), 70 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/61/9361/1
--
To view, visit http://gerrit.cloudera.org:8080/9361
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: If797f220f42bfb2e6f452b66f15e7a758e883472
Gerrit-Change-Number: 9361
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] [security] test and fixes for TLS socket EINTR issues

2018-02-19 Thread Michael Ho (Code Review)
Michael Ho has removed Kudu Jenkins from this change.  ( 
http://gerrit.cloudera.org:8080/9360 )

Change subject: [security] test and fixes for TLS socket EINTR issues
..


Removed reviewer Kudu Jenkins.
--
To view, visit http://gerrit.cloudera.org:8080/9360
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: deleteReviewer
Gerrit-Change-Id: Ibec9049186f79f1c43295e4735538ed7ba4e516e
Gerrit-Change-Number: 9360
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] [security] test and fixes for TLS socket EINTR issues

2018-02-19 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9360 )

Change subject: [security] test and fixes for TLS socket EINTR issues
..


Patch Set 1:

Clean application of the original patch.


--
To view, visit http://gerrit.cloudera.org:8080/9360
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibec9049186f79f1c43295e4735538ed7ba4e516e
Gerrit-Change-Number: 9360
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Mon, 19 Feb 2018 22:38:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] KUDU-2218. tls socket: properly handle temporary socket errors in Writev

2018-02-19 Thread Michael Ho (Code Review)
Michael Ho has removed Kudu Jenkins from this change.  ( 
http://gerrit.cloudera.org:8080/9361 )

Change subject: KUDU-2218. tls_socket: properly handle temporary socket errors 
in Writev
..


Removed reviewer Kudu Jenkins.
--
To view, visit http://gerrit.cloudera.org:8080/9361
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: deleteReviewer
Gerrit-Change-Id: If797f220f42bfb2e6f452b66f15e7a758e883472
Gerrit-Change-Number: 9361
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] KUDU-2004. Undefined behavior in TlsSocket::Writev()

2018-02-19 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9359 )

Change subject: KUDU-2004. Undefined behavior in TlsSocket::Writev()
..


Patch Set 1:

Clean application of the original patch.


--
To view, visit http://gerrit.cloudera.org:8080/9359
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5b5bbb3fd2ec8fcd1a48873446f3aa09546eaac
Gerrit-Change-Number: 9359
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Edward Fancher 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Mon, 19 Feb 2018 22:37:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] KUDU-2218. tls socket: properly handle temporary socket errors in Writev

2018-02-19 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9361 )

Change subject: KUDU-2218. tls_socket: properly handle temporary socket errors 
in Writev
..


Patch Set 1:

Clean application of the original patch.


--
To view, visit http://gerrit.cloudera.org:8080/9361
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If797f220f42bfb2e6f452b66f15e7a758e883472
Gerrit-Change-Number: 9361
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Mon, 19 Feb 2018 22:38:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6269: Expose KRPC metrics on debug webpage

2018-02-19 Thread Lars Volker (Code Review)
Hello Michael Ho, Sailesh Mukil,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9292

to look at the new patch set (#13).

Change subject: IMPALA-6269: Expose KRPC metrics on debug webpage
..

IMPALA-6269: Expose KRPC metrics on debug webpage

This change exposes KRPC metrics on the /rpcz debug web page.

This change also exposes metrics for rejected RPCs on the /metrics debug
web page. See here for an example: https://git.io/vAczm

This change also fixes a bug in PrettyPrinter::GetByteUnit(), which
previously did not work for unsigned values due to an implicit cast.

This change contains tests to check that the metrics show up in /rpcz
and /metrics and that they update as expected when executing queries.

This change is based on a change by Sailesh Mukil.

Change-Id: I7af7c1a84a5be82c979ca4ef1edf35167493be3f
---
M be/src/rpc/impala-service-pool.cc
M be/src/rpc/impala-service-pool.h
M be/src/rpc/rpc-mgr.cc
M be/src/rpc/rpc-mgr.h
M be/src/rpc/rpc-trace.cc
M be/src/rpc/rpc-trace.h
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/service/impalad-main.cc
M be/src/util/histogram-metric.h
M be/src/util/pretty-printer.h
M common/thrift/Metrics.thrift
M common/thrift/metrics.json
A tests/custom_cluster/test_krpc_metrics.py
M tests/webserver/test_web_pages.py
M www/rpcz.tmpl
16 files changed, 501 insertions(+), 103 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/9292/13
--
To view, visit http://gerrit.cloudera.org:8080/9292
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7af7c1a84a5be82c979ca4ef1edf35167493be3f
Gerrit-Change-Number: 9292
Gerrit-PatchSet: 13
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-6269: Expose KRPC metrics on debug webpage

2018-02-19 Thread Lars Volker (Code Review)
Hello Michael Ho, Sailesh Mukil,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9292

to look at the new patch set (#14).

Change subject: IMPALA-6269: Expose KRPC metrics on debug webpage
..

IMPALA-6269: Expose KRPC metrics on debug webpage

This change exposes KRPC metrics on the /rpcz debug web page.

This change also exposes metrics for rejected RPCs on the /metrics debug
web page. See here for an example: https://git.io/vAczm

This change also fixes a bug in PrettyPrinter::GetByteUnit(), which
previously did not work for unsigned values due to an implicit cast.

This change contains tests to check that the metrics show up in /rpcz
and /metrics and that they update as expected when executing queries.

This change is based on a change by Sailesh Mukil.

Change-Id: I7af7c1a84a5be82c979ca4ef1edf35167493be3f
---
M be/src/rpc/impala-service-pool.cc
M be/src/rpc/impala-service-pool.h
M be/src/rpc/rpc-mgr.cc
M be/src/rpc/rpc-mgr.h
M be/src/rpc/rpc-trace.cc
M be/src/rpc/rpc-trace.h
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/service/impalad-main.cc
M be/src/util/histogram-metric.h
M be/src/util/pretty-printer.h
M common/thrift/Metrics.thrift
M common/thrift/metrics.json
A tests/custom_cluster/test_krpc_metrics.py
M tests/webserver/test_web_pages.py
M www/rpcz.tmpl
16 files changed, 503 insertions(+), 103 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/9292/14
--
To view, visit http://gerrit.cloudera.org:8080/9292
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7af7c1a84a5be82c979ca4ef1edf35167493be3f
Gerrit-Change-Number: 9292
Gerrit-PatchSet: 14
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-6269: Expose KRPC metrics on debug webpage

2018-02-19 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9292 )

Change subject: IMPALA-6269: Expose KRPC metrics on debug webpage
..


Patch Set 13:

(6 comments)

PS13 contains the rebase. PS14 should address all left-over comments.

http://gerrit.cloudera.org:8080/#/c/9292/11/be/src/rpc/impala-service-pool.cc
File be/src/rpc/impala-service-pool.cc:

http://gerrit.cloudera.org:8080/#/c/9292/11/be/src/rpc/impala-service-pool.cc@71
PS11, Line 71:   for (const auto& method : method_infos) {
 : const string& method_name = method.first;
 : string payload_size_name = Substitute("
> These lines can be combined into:
Done


http://gerrit.cloudera.org:8080/#/c/9292/11/be/src/rpc/impala-service-pool.cc@76
PS11, Line 76:
> Does this need to be defined in metrics.json too ?
No, since it is not added to any metric groups and thus also not displayed on 
/metrics. I think we can add it there, but it would be inconsistent with the 
KRPC metrics, so I think it would be better to add them to /metrics in a 
separate change that also figures out how to add KRPC metrics there to begin 
with.


http://gerrit.cloudera.org:8080/#/c/9292/11/tests/custom_cluster/test_krpc_metrics.py
File tests/custom_cluster/test_krpc_metrics.py:

http://gerrit.cloudera.org:8080/#/c/9292/11/tests/custom_cluster/test_krpc_metrics.py@52
PS11, Line 52: -datastream_service_queue_depth=1
> I will address all other comments, then rebase the change, and then address
Replaced with the new flag.


http://gerrit.cloudera.org:8080/#/c/9292/11/tests/custom_cluster/test_krpc_metrics.py@60
PS11, Line 60: [0]
> The idea was that there needs to be at least one service, but you're right,
Done.


http://gerrit.cloudera.org:8080/#/c/9292/11/tests/custom_cluster/test_krpc_metrics.py@70
PS11, Line 70: -datastream_service_queue_depth=1
> See above.
Replaced with the new flag.


http://gerrit.cloudera.org:8080/#/c/9292/11/tests/webserver/test_web_pages.py
File tests/webserver/test_web_pages.py:

http://gerrit.cloudera.org:8080/#/c/9292/11/tests/webserver/test_web_pages.py@250
PS11, Line 250: rpcz['services'][0]['rpc_method_metrics']
> same comment as above
Done



--
To view, visit http://gerrit.cloudera.org:8080/9292
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7af7c1a84a5be82c979ca4ef1edf35167493be3f
Gerrit-Change-Number: 9292
Gerrit-PatchSet: 13
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Mon, 19 Feb 2018 23:37:50 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5801: Clean up codegen GetType() interface

2018-02-19 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9063 )

Change subject: IMPALA-5801: Clean up codegen GetType() interface
..


Patch Set 9: Code-Review+2

(2 comments)

This seems like an improvement. Thanks for doing the cleanup. Will start the 
merge once Gabor +1s.

http://gerrit.cloudera.org:8080/#/c/9063/9/be/src/codegen/llvm-codegen.h
File be/src/codegen/llvm-codegen.h:

http://gerrit.cloudera.org:8080/#/c/9063/9/be/src/codegen/llvm-codegen.h@270
PS9, Line 270: site
nit: side?


http://gerrit.cloudera.org:8080/#/c/9063/9/be/src/exec/hdfs-scanner.cc
File be/src/exec/hdfs-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/9063/9/be/src/exec/hdfs-scanner.cc@333
PS9, Line 333:   llvm::PointerType* uint8_ptr_type = codegen->i8_ptr_type();
Thanks for this cleanup



--
To view, visit http://gerrit.cloudera.org:8080/9063
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib146ca21af82023b0b460f813eae3435b4b2eb77
Gerrit-Change-Number: 9063
Gerrit-PatchSet: 9
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 19 Feb 2018 23:38:52 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6497: add "Last row fetched" and AC events

2018-02-19 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9271 )

Change subject: IMPALA-6497: add "Last row fetched" and AC events
..


Patch Set 7:

Any more comments?


--
To view, visit http://gerrit.cloudera.org:8080/9271
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I32a707e4660061e75c86ad967f1fac6f6737da7e
Gerrit-Change-Number: 9271
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 19 Feb 2018 23:40:53 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6497: add "Last row fetched" and AC events

2018-02-19 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9271 )

Change subject: IMPALA-6497: add "Last row fetched" and AC events
..


Patch Set 7: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/9271
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I32a707e4660061e75c86ad967f1fac6f6737da7e
Gerrit-Change-Number: 9271
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 19 Feb 2018 23:43:29 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6482: add QUERY TIME LIMIT S option

2018-02-19 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9227 )

Change subject: IMPALA-6482: add QUERY_TIME_LIMIT_S option
..


Patch Set 5:

(2 comments)

I just looked at how these two were different.

http://gerrit.cloudera.org:8080/#/c/9227/5/be/src/service/impala-server.h
File be/src/service/impala-server.h:

http://gerrit.cloudera.org:8080/#/c/9227/5/be/src/service/impala-server.h@1035
PS5, Line 1035: // The query is cancelled if the query has not been 
inactive this long. The event
"has not been inactive" is confusing. Is the "not" there correct?


http://gerrit.cloudera.org:8080/#/c/9227/5/common/thrift/ImpalaInternalService.thrift
File common/thrift/ImpalaInternalService.thrift:

http://gerrit.cloudera.org:8080/#/c/9227/5/common/thrift/ImpalaInternalService.thrift@271
PS5, Line 271:   // Time, in s, before a query will be timed out after it 
starts executing.
Mention explicitly that this doesn't include waiting for admission?



--
To view, visit http://gerrit.cloudera.org:8080/9227
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id81772ee223ffb64746e241027a5a734a811e1b8
Gerrit-Change-Number: 9227
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 19 Feb 2018 23:54:57 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6497: add "Last row fetched" and AC events

2018-02-19 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9271 )

Change subject: IMPALA-6497: add "Last row fetched" and AC events
..


Patch Set 8: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/9271
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I32a707e4660061e75c86ad967f1fac6f6737da7e
Gerrit-Change-Number: 9271
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 20 Feb 2018 00:08:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6497: add "Last row fetched" and AC events

2018-02-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9271 )

Change subject: IMPALA-6497: add "Last row fetched" and AC events
..


Patch Set 8:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1953/


--
To view, visit http://gerrit.cloudera.org:8080/9271
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I32a707e4660061e75c86ad967f1fac6f6737da7e
Gerrit-Change-Number: 9271
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 20 Feb 2018 00:08:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] KUDU-2004. Undefined behavior in TlsSocket::Writev()

2018-02-19 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9359 )

Change subject: KUDU-2004. Undefined behavior in TlsSocket::Writev()
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/9359
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5b5bbb3fd2ec8fcd1a48873446f3aa09546eaac
Gerrit-Change-Number: 9359
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Edward Fancher 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 20 Feb 2018 00:18:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] KUDU-2218. tls socket: properly handle temporary socket errors in Writev

2018-02-19 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9361 )

Change subject: KUDU-2218. tls_socket: properly handle temporary socket errors 
in Writev
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/9361
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If797f220f42bfb2e6f452b66f15e7a758e883472
Gerrit-Change-Number: 9361
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Tue, 20 Feb 2018 00:18:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] [security] test and fixes for TLS socket EINTR issues

2018-02-19 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9360 )

Change subject: [security] test and fixes for TLS socket EINTR issues
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/9360
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibec9049186f79f1c43295e4735538ed7ba4e516e
Gerrit-Change-Number: 9360
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Tue, 20 Feb 2018 00:18:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool

2018-02-19 Thread Michael Ho (Code Review)
Hello Tim Armstrong, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9344

to look at the new patch set (#8).

Change subject: IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from 
BufferPool
..

IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool

Previously, tuple pointers of a row batch are allocated from
the heap via malloc() and tuple data is allocated from the
MemPool associated with the RowBatch. This change converts
the allocations of tuple pointers and tuple data to using
BufferPool for row batches allocated from KrpcDataStreamRecvr.
The primary motivation for this change is to take advantage of
the fact that buffers allocated from BufferPool always go back
to the per-core arena they came from when they are freed. This
alleviates the TCMalloc imbalance between the RPC service threads
and the fragment execution threads. As described in IMPALA-5518,
row batches are always allocated from the service threads' TCMalloc
cache and placed into the fragment execution threads' TCMalloc cache
when they're freed. This leads to underflow and overflow in those
threads' caches and high contention for the spinlock of the central
free list. With BufferPool, the memory always went back to its
originating arena so this kind of imbalance is less likely to occur.
This also dovetails with the long term plan to put most allocations
under BufferPool and have each operators in the plan reserved
appropriate amount of memory before execution.

Note that the proper reservation mechanism of the exchange node
hasn't yet been implemented in this change so the buffer pool client
handle used for allocating buffers has an ad-hoc set-up of no reservation
limit and using root reservation tracker as parent. This needs to be
fixed as part of IMPALA-6524. The default buffer pool limit is also
bumped to 85% to account for the extra usage from the exchange nodes.
The minimum buffer size is also lowered to 8KB to reduce amount of memory
wastage as a row batch's tuple pointers / tuple data can sometimes be
much smaller than 64KB.

Testing done: Debug core build.

Change-Id: If4b1a45f68b9df0d3b539511e15aff15700246f2
---
M be/src/common/global-flags.cc
M be/src/exec/blocking-join-node.cc
M be/src/exec/exchange-node.cc
M be/src/exec/exchange-node.h
M be/src/exec/partial-sort-node.cc
M be/src/runtime/data-stream-mgr-base.h
M be/src/runtime/data-stream-mgr.cc
M be/src/runtime/data-stream-mgr.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/exec-env.h
M be/src/runtime/krpc-data-stream-mgr.cc
M be/src/runtime/krpc-data-stream-mgr.h
M be/src/runtime/krpc-data-stream-recvr.cc
M be/src/runtime/krpc-data-stream-recvr.h
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
M be/src/service/query-options-test.cc
M testdata/workloads/functional-query/queries/QueryTest/bloom_filters.test
M tests/query_test/test_mem_usage_scaling.py
19 files changed, 242 insertions(+), 95 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/9344/8
--
To view, visit http://gerrit.cloudera.org:8080/9344
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If4b1a45f68b9df0d3b539511e15aff15700246f2
Gerrit-Change-Number: 9344
Gerrit-PatchSet: 8
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool

2018-02-19 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9344 )

Change subject: IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from 
BufferPool
..


Patch Set 8:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/9344/6/be/src/runtime/row-batch.h
File be/src/runtime/row-batch.h:

http://gerrit.cloudera.org:8080/#/c/9344/6/be/src/runtime/row-batch.h@411
PS6, Line 411:   /// Creates an empty row batch based on the serialized row 
batch header. Called from
> Seems weird to set capacity_ at this point since it's not where the buffer
Fixed in PS8. Deferred setting capacity_ until the buffer is allocated.


http://gerrit.cloudera.org:8080/#/c/9344/7/be/src/runtime/row-batch.cc
File be/src/runtime/row-batch.cc:

http://gerrit.cloudera.org:8080/#/c/9344/7/be/src/runtime/row-batch.cc@206
PS7, Line 206:   row_batch->num_rows_ = header.num_rows();
> We're expecting that capacity_ == header.num_rows() here, right? Can we add
Good point. Actually, I took Dan's suggestion and deferred setting capacity_ 
until the buffer is allocated.



--
To view, visit http://gerrit.cloudera.org:8080/9344
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If4b1a45f68b9df0d3b539511e15aff15700246f2
Gerrit-Change-Number: 9344
Gerrit-PatchSet: 8
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 20 Feb 2018 00:19:31 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool

2018-02-19 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9344 )

Change subject: IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from 
BufferPool
..


Patch Set 8: Code-Review+2

Carry Tim's +2


--
To view, visit http://gerrit.cloudera.org:8080/9344
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If4b1a45f68b9df0d3b539511e15aff15700246f2
Gerrit-Change-Number: 9344
Gerrit-PatchSet: 8
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 20 Feb 2018 00:19:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool

2018-02-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9344 )

Change subject: IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from 
BufferPool
..


Patch Set 8:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1954/


--
To view, visit http://gerrit.cloudera.org:8080/9344
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If4b1a45f68b9df0d3b539511e15aff15700246f2
Gerrit-Change-Number: 9344
Gerrit-PatchSet: 8
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 20 Feb 2018 00:27:00 +
Gerrit-HasComments: No


[Impala-ASF-CR] KUDU-2004. Undefined behavior in TlsSocket::Writev()

2018-02-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9359 )

Change subject: KUDU-2004. Undefined behavior in TlsSocket::Writev()
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1955/


--
To view, visit http://gerrit.cloudera.org:8080/9359
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5b5bbb3fd2ec8fcd1a48873446f3aa09546eaac
Gerrit-Change-Number: 9359
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Edward Fancher 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 20 Feb 2018 00:30:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6482: add QUERY TIME LIMIT S option

2018-02-19 Thread Tim Armstrong (Code Review)
Hello Philip Zeyliger, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9227

to look at the new patch set (#6).

Change subject: IMPALA-6482: add QUERY_TIME_LIMIT_S option
..

IMPALA-6482: add QUERY_TIME_LIMIT_S option

This is similar to the QUERY_TIMEOUT_S option and shares most of the
implementation. The difference is that the timeout doesn't reset at
any point.

The time limit is measured from the start of query execution,
after the query is admitted, so planning, scheduling and
time spent in admission control is not counted towards the time limit.

Also fix validation of the related QUERY_TIMEOUT_S option, which
previously could ignore invalid input.

Testing:
Added tests for various permutations:
* With and without query_timeout_s set
* With and without result fetching keeping the query active

Change-Id: Id81772ee223ffb64746e241027a5a734a811e1b8
---
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-options-test.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M tests/custom_cluster/test_query_expiration.py
10 files changed, 243 insertions(+), 91 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/27/9227/6
--
To view, visit http://gerrit.cloudera.org:8080/9227
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id81772ee223ffb64746e241027a5a734a811e1b8
Gerrit-Change-Number: 9227
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6482: add QUERY TIME LIMIT S option

2018-02-19 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9227 )

Change subject: IMPALA-6482: add QUERY_TIME_LIMIT_S option
..


Patch Set 5:

(2 comments)

I also considered calling this something like QUERY_EXEC_TIME_LIMIT_S to hint 
that it doesn't include planning, admission, etc, but it seemed a little 
verbose.

http://gerrit.cloudera.org:8080/#/c/9227/5/be/src/service/impala-server.h
File be/src/service/impala-server.h:

http://gerrit.cloudera.org:8080/#/c/9227/5/be/src/service/impala-server.h@1035
PS5, Line 1035: // The query is cancelled if the query has not been 
inactive this long. The event
> "has not been inactive" is confusing. Is the "not" there correct?
oops, had an unintended double negative


http://gerrit.cloudera.org:8080/#/c/9227/5/common/thrift/ImpalaInternalService.thrift
File common/thrift/ImpalaInternalService.thrift:

http://gerrit.cloudera.org:8080/#/c/9227/5/common/thrift/ImpalaInternalService.thrift@271
PS5, Line 271:   // Time, in s, before a query will be timed out after it 
starts executing.
> Mention explicitly that this doesn't include waiting for admission?
Done, here and in the other .thrift



--
To view, visit http://gerrit.cloudera.org:8080/9227
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id81772ee223ffb64746e241027a5a734a811e1b8
Gerrit-Change-Number: 9227
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 20 Feb 2018 00:34:46 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6423: HDFS scanner doesn't check RuntimeState::is cancelled()

2018-02-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9352 )

Change subject: IMPALA-6423: HDFS scanner doesn't check 
RuntimeState::is_cancelled()
..


Patch Set 3: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/9352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3eb960d9d6f2dde4b2cb4988aa06288d11802b9a
Gerrit-Change-Number: 9352
Gerrit-PatchSet: 3
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 20 Feb 2018 00:45:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6423: HDFS scanner doesn't check RuntimeState::is cancelled()

2018-02-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/9352 )

Change subject: IMPALA-6423: HDFS scanner doesn't check 
RuntimeState::is_cancelled()
..

IMPALA-6423: HDFS scanner doesn't check RuntimeState::is_cancelled()

There are cancellation checks in HDFS scanner code that are using
ScannerContext::cancelled(), however, if an internal cancellation
happens then RuntimeState::is_cancelled() should also be checked to
detect these. As a solution I extended ScannerContext::cancelled() to
check RuntimeState::is_cancelled() as well.

Creating automatic tests for this is difficult as the result of the
query is actually the same with and without this change. With manual
tests I verifyed that in case an internal cancel happens (I made the
exchange node fail during a select query) this change prevents
CommitRows() being executed multiple times even if the query had been
cancelled before the first call.

Change-Id: I3eb960d9d6f2dde4b2cb4988aa06288d11802b9a
Reviewed-on: http://gerrit.cloudera.org:8080/9352
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins
---
M be/src/exec/scanner-context.cc
M be/src/exec/scanner-context.h
2 files changed, 4 insertions(+), 2 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/9352
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I3eb960d9d6f2dde4b2cb4988aa06288d11802b9a
Gerrit-Change-Number: 9352
Gerrit-PatchSet: 4
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6269: Expose KRPC metrics on debug webpage

2018-02-19 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9292 )

Change subject: IMPALA-6269: Expose KRPC metrics on debug webpage
..


Patch Set 14: Code-Review+1

(4 comments)

http://gerrit.cloudera.org:8080/#/c/9292/8/be/src/rpc/impala-service-pool.cc
File be/src/rpc/impala-service-pool.cc:

http://gerrit.cloudera.org:8080/#/c/9292/8/be/src/rpc/impala-service-pool.cc@76
PS8, Line 76: }
:
: ImpalaServicePool::~ImpalaServic
Maybe this is something to consider moving into RpcMethodInfo as a Kudu patch.

In that case, we can get it like we get the handler latency in L273.

Not for now, but if you agree, we can add a TODO here.


http://gerrit.cloudera.org:8080/#/c/9292/8/be/src/rpc/rpc-mgr.cc
File be/src/rpc/rpc-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/9292/8/be/src/rpc/rpc-mgr.cc@128
PS8, Line 128: rviceIf* service_p
This might stop us from working with different types of ServiceIfs in the 
future if we need to. But since we don't have any other types of ServiceIfs, 
it's okay for now.

Something to keep a note of though.


http://gerrit.cloudera.org:8080/#/c/9292/14/tests/webserver/test_web_pages.py
File tests/webserver/test_web_pages.py:

http://gerrit.cloudera.org:8080/#/c/9292/14/tests/webserver/test_web_pages.py@251
PS14, Line 251: 'impala.DataStreamService'
Why not keep this in a local array? Then we can add more services to this test 
as we add them to the code base.

Then we can remove all references to 'DataStreamService' and use the array 
entry.


http://gerrit.cloudera.org:8080/#/c/9292/8/www/rpcz.tmpl
File www/rpcz.tmpl:

http://gerrit.cloudera.org:8080/#/c/9292/8/www/rpcz.tmpl@42
PS8, Line 42: 
:   Queue Size
:   Idle threads
:   Maximum Queue Size
:   RPCs rejected due to queue overflow
: 
: 
:   {{queue_size}}
:   {{idle_threads}}
:   {{service_max_queue_size}}
:   
:   {{rpcs_queue_overflow}}
: 
Would specifying a colspan for these guys align them better?



--
To view, visit http://gerrit.cloudera.org:8080/9292
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7af7c1a84a5be82c979ca4ef1edf35167493be3f
Gerrit-Change-Number: 9292
Gerrit-PatchSet: 14
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 20 Feb 2018 00:48:29 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6269: Expose KRPC metrics on debug webpage

2018-02-19 Thread Lars Volker (Code Review)
Hello Michael Ho, Sailesh Mukil,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9292

to look at the new patch set (#15).

Change subject: IMPALA-6269: Expose KRPC metrics on debug webpage
..

IMPALA-6269: Expose KRPC metrics on debug webpage

This change exposes KRPC metrics on the /rpcz debug web page.

This change also exposes metrics for rejected RPCs on the /metrics debug
web page. See here for an example: https://git.io/vAczm

This change also fixes a bug in PrettyPrinter::GetByteUnit(), which
previously did not work for unsigned values due to an implicit cast.

This change contains tests to check that the metrics show up in /rpcz
and /metrics and that they update as expected when executing queries.

This change is based on a change by Sailesh Mukil.

Change-Id: I7af7c1a84a5be82c979ca4ef1edf35167493be3f
---
M be/src/rpc/impala-service-pool.cc
M be/src/rpc/impala-service-pool.h
M be/src/rpc/rpc-mgr.cc
M be/src/rpc/rpc-mgr.h
M be/src/rpc/rpc-trace.cc
M be/src/rpc/rpc-trace.h
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/service/impalad-main.cc
M be/src/util/histogram-metric.h
M be/src/util/pretty-printer.h
M common/thrift/Metrics.thrift
M common/thrift/metrics.json
A tests/custom_cluster/test_krpc_metrics.py
M tests/webserver/test_web_pages.py
M www/rpcz.tmpl
16 files changed, 506 insertions(+), 103 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/9292/15
--
To view, visit http://gerrit.cloudera.org:8080/9292
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7af7c1a84a5be82c979ca4ef1edf35167493be3f
Gerrit-Change-Number: 9292
Gerrit-PatchSet: 15
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-6269: Expose KRPC metrics on debug webpage

2018-02-19 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9292 )

Change subject: IMPALA-6269: Expose KRPC metrics on debug webpage
..


Patch Set 14:

(4 comments)

Thanks for the review. Please see PS15 and my inline comments.

http://gerrit.cloudera.org:8080/#/c/9292/8/be/src/rpc/impala-service-pool.cc
File be/src/rpc/impala-service-pool.cc:

http://gerrit.cloudera.org:8080/#/c/9292/8/be/src/rpc/impala-service-pool.cc@76
PS8, Line 76: }
:
: ImpalaServicePool::~ImpalaServic
> Maybe this is something to consider moving into RpcMethodInfo as a Kudu pat
Good point, filed KUDU-2313 and left a comment here.


http://gerrit.cloudera.org:8080/#/c/9292/8/be/src/rpc/rpc-mgr.cc
File be/src/rpc/rpc-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/9292/8/be/src/rpc/rpc-mgr.cc@128
PS8, Line 128: rviceIf* service_p
> This might stop us from working with different types of ServiceIfs in the f
That's true. Non-generated services don't necessarily use a map to lookup 
methods. When adding the required getter to Kudu it felt far easier to only add 
it to the generated services interface than require a change to all services. I 
agree that we should revisit this if we need a custom service. An alternative 
then would be to register the methods ourselves, since we already have to type 
them out in code.


http://gerrit.cloudera.org:8080/#/c/9292/14/tests/webserver/test_web_pages.py
File tests/webserver/test_web_pages.py:

http://gerrit.cloudera.org:8080/#/c/9292/14/tests/webserver/test_web_pages.py@251
PS14, Line 251: 'impala.DataStreamService'
> Why not keep this in a local array? Then we can add more services to this t
I factored the name into a parameter to make the code more generic. The next 
service might require a different query to exercise it though and not knowing 
which one exactly we're going to add, I'd like to defer factoring the method 
into a loop to a future change.


http://gerrit.cloudera.org:8080/#/c/9292/8/www/rpcz.tmpl
File www/rpcz.tmpl:

http://gerrit.cloudera.org:8080/#/c/9292/8/www/rpcz.tmpl@42
PS8, Line 42: 
:   Queue Size
:   Idle threads
:   Maximum Queue Size
:   RPCs rejected due to queue overflow
: 
: 
:   {{queue_size}}
:   {{idle_threads}}
:   {{service_max_queue_size}}
:   
:   {{rpcs_queue_overflow}}
: 
> Would specifying a colspan for these guys align them better?
I couldn't find a way to make it look nicer using colspan but putting in 
another layer of tables seems to have improved it.



--
To view, visit http://gerrit.cloudera.org:8080/9292
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7af7c1a84a5be82c979ca4ef1edf35167493be3f
Gerrit-Change-Number: 9292
Gerrit-PatchSet: 14
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 20 Feb 2018 01:42:45 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6269: Expose KRPC metrics on debug webpage

2018-02-19 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9292 )

Change subject: IMPALA-6269: Expose KRPC metrics on debug webpage
..


Patch Set 15: Code-Review+1

(2 comments)

http://gerrit.cloudera.org:8080/#/c/9292/8/be/src/rpc/rpc-mgr.cc
File be/src/rpc/rpc-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/9292/8/be/src/rpc/rpc-mgr.cc@128
PS8, Line 128: rviceIf* service_p
> That's true. Non-generated services don't necessarily use a map to lookup m
Ok thanks for the explanation. We can address it when it becomes necessary. 
This looks good for now.


http://gerrit.cloudera.org:8080/#/c/9292/14/tests/webserver/test_web_pages.py
File tests/webserver/test_web_pages.py:

http://gerrit.cloudera.org:8080/#/c/9292/14/tests/webserver/test_web_pages.py@251
PS14, Line 251:
> I factored the name into a parameter to make the code more generic. The nex
Makes sense, the query can be a parameter too then (like a 2-D array). But lets 
defer that for the future.



--
To view, visit http://gerrit.cloudera.org:8080/9292
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7af7c1a84a5be82c979ca4ef1edf35167493be3f
Gerrit-Change-Number: 9292
Gerrit-PatchSet: 15
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 20 Feb 2018 01:48:27 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6269: Expose KRPC metrics on debug webpage

2018-02-19 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9292 )

Change subject: IMPALA-6269: Expose KRPC metrics on debug webpage
..


Patch Set 15: Code-Review+2

(5 comments)

http://gerrit.cloudera.org:8080/#/c/9292/11/be/src/rpc/impala-service-pool.cc
File be/src/rpc/impala-service-pool.cc:

http://gerrit.cloudera.org:8080/#/c/9292/11/be/src/rpc/impala-service-pool.cc@76
PS11, Line 76:
> No, since it is not added to any metric groups and thus also not displayed
Is there a JIRA for it ? Will it be done as part of IMPALA-6540 ?


http://gerrit.cloudera.org:8080/#/c/9292/15/be/src/rpc/impala-service-pool.cc
File be/src/rpc/impala-service-pool.cc:

http://gerrit.cloudera.org:8080/#/c/9292/15/be/src/rpc/impala-service-pool.cc@254
PS15, Line 254: const string KrpcHistogramToString(const kudu::Histogram* 
histogram) {
It may be more extensible to not return a formatted string. Instead, we can 
consider returning just a list of numbers and let the caller present them in a 
more flexible way. As the code stands now, it would require the caller to parse 
the string to extract the numbers.

If you think this makes sense, please add a TODO here.


http://gerrit.cloudera.org:8080/#/c/9292/15/be/src/rpc/rpc-mgr.cc
File be/src/rpc/rpc-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/9292/15/be/src/rpc/rpc-mgr.cc@164
PS15, Line 164: void RpcMgr::Shutdown() {
We can probably give up ownership of the acceptor_pool_ before calling 
messenger_->Shutdown().


http://gerrit.cloudera.org:8080/#/c/9292/11/be/src/rpc/rpc-trace.cc
File be/src/rpc/rpc-trace.cc:

http://gerrit.cloudera.org:8080/#/c/9292/11/be/src/rpc/rpc-trace.cc@147
PS11, Line 147: MetricGroup* metrics
> I think this would break compatibility with previous versions of Impala, fo
Makes sense.


http://gerrit.cloudera.org:8080/#/c/9292/15/be/src/util/histogram-metric.h
File be/src/util/histogram-metric.h:

http://gerrit.cloudera.org:8080/#/c/9292/15/be/src/util/histogram-metric.h@96
PS15, Line 96: type
nit: long line



--
To view, visit http://gerrit.cloudera.org:8080/9292
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7af7c1a84a5be82c979ca4ef1edf35167493be3f
Gerrit-Change-Number: 9292
Gerrit-PatchSet: 15
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 20 Feb 2018 02:17:42 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5752: Add support for DECIMAL on Kudu tables

2018-02-19 Thread Grant Henke (Code Review)
Hello Thomas Tauber-Marshall, Taras Bobrovytsky,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9306

to look at the new patch set (#9).

Change subject: IMPALA-5752: Add support for DECIMAL on Kudu tables
..

IMPALA-5752: Add support for DECIMAL on Kudu tables

Adds support for the Kudu DECIMAL type introduced in Kudu 1.7.0.

Note: Adding support for Kudu decimal min/max filters is
tracked in IMPALA-6533.

Tests:
* Added Kudu create with decimal test to AnalyzeDDLTest.java
* Added Kudu table_format to test_decimal_queries.py
** Both decimal.test and decimal-exprs.test workloads
* Added decimal queries to the following Kudu workloads:
** kudu_create.test
** kudu_delete.test
** kudu_insert.test
** kudu_update.test
** kudu_upsert.test

Change-Id: I2f927fce25dc4fa9529a4e0b688825699f5c0ea6
---
M be/src/exec/kudu-scanner.cc
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-util.cc
M be/src/exec/kudu-util.h
M be/src/exprs/kudu-partition-expr.cc
M fe/src/main/java/org/apache/impala/catalog/KuduColumn.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/KuduUtil.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M testdata/workloads/functional-query/queries/QueryTest/kudu_create.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_delete.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_describe.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_update.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_upsert.test
M tests/query_test/test_decimal_queries.py
M tests/query_test/test_kudu.py
21 files changed, 795 insertions(+), 633 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/9306/9
--
To view, visit http://gerrit.cloudera.org:8080/9306
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2f927fce25dc4fa9529a4e0b688825699f5c0ea6
Gerrit-Change-Number: 9306
Gerrit-PatchSet: 9
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-6497: add "Last row fetched" and AC events

2018-02-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9271 )

Change subject: IMPALA-6497: add "Last row fetched" and AC events
..


Patch Set 8: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1953/


--
To view, visit http://gerrit.cloudera.org:8080/9271
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I32a707e4660061e75c86ad967f1fac6f6737da7e
Gerrit-Change-Number: 9271
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 20 Feb 2018 03:49:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool

2018-02-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/9344 )

Change subject: IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from 
BufferPool
..

IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool

Previously, tuple pointers of a row batch are allocated from
the heap via malloc() and tuple data is allocated from the
MemPool associated with the RowBatch. This change converts
the allocations of tuple pointers and tuple data to using
BufferPool for row batches allocated from KrpcDataStreamRecvr.
The primary motivation for this change is to take advantage of
the fact that buffers allocated from BufferPool always go back
to the per-core arena they came from when they are freed. This
alleviates the TCMalloc imbalance between the RPC service threads
and the fragment execution threads. As described in IMPALA-5518,
row batches are always allocated from the service threads' TCMalloc
cache and placed into the fragment execution threads' TCMalloc cache
when they're freed. This leads to underflow and overflow in those
threads' caches and high contention for the spinlock of the central
free list. With BufferPool, the memory always went back to its
originating arena so this kind of imbalance is less likely to occur.
This also dovetails with the long term plan to put most allocations
under BufferPool and have each operators in the plan reserved
appropriate amount of memory before execution.

Note that the proper reservation mechanism of the exchange node
hasn't yet been implemented in this change so the buffer pool client
handle used for allocating buffers has an ad-hoc set-up of no reservation
limit and using root reservation tracker as parent. This needs to be
fixed as part of IMPALA-6524. The default buffer pool limit is also
bumped to 85% to account for the extra usage from the exchange nodes.
The minimum buffer size is also lowered to 8KB to reduce amount of memory
wastage as a row batch's tuple pointers / tuple data can sometimes be
much smaller than 64KB.

Testing done: Debug core build.

Change-Id: If4b1a45f68b9df0d3b539511e15aff15700246f2
Reviewed-on: http://gerrit.cloudera.org:8080/9344
Reviewed-by: Michael Ho 
Tested-by: Impala Public Jenkins
---
M be/src/common/global-flags.cc
M be/src/exec/blocking-join-node.cc
M be/src/exec/exchange-node.cc
M be/src/exec/exchange-node.h
M be/src/exec/partial-sort-node.cc
M be/src/runtime/data-stream-mgr-base.h
M be/src/runtime/data-stream-mgr.cc
M be/src/runtime/data-stream-mgr.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/exec-env.h
M be/src/runtime/krpc-data-stream-mgr.cc
M be/src/runtime/krpc-data-stream-mgr.h
M be/src/runtime/krpc-data-stream-recvr.cc
M be/src/runtime/krpc-data-stream-recvr.h
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
M be/src/service/query-options-test.cc
M testdata/workloads/functional-query/queries/QueryTest/bloom_filters.test
M tests/query_test/test_mem_usage_scaling.py
19 files changed, 242 insertions(+), 95 deletions(-)

Approvals:
  Michael Ho: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/9344
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: If4b1a45f68b9df0d3b539511e15aff15700246f2
Gerrit-Change-Number: 9344
Gerrit-PatchSet: 9
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool

2018-02-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9344 )

Change subject: IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from 
BufferPool
..


Patch Set 8: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/9344
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If4b1a45f68b9df0d3b539511e15aff15700246f2
Gerrit-Change-Number: 9344
Gerrit-PatchSet: 8
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 20 Feb 2018 04:08:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] KUDU-2004. Undefined behavior in TlsSocket::Writev()

2018-02-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9359 )

Change subject: KUDU-2004. Undefined behavior in TlsSocket::Writev()
..


Patch Set 1: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1955/


--
To view, visit http://gerrit.cloudera.org:8080/9359
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5b5bbb3fd2ec8fcd1a48873446f3aa09546eaac
Gerrit-Change-Number: 9359
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Edward Fancher 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 20 Feb 2018 04:12:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] KUDU-2004. Undefined behavior in TlsSocket::Writev()

2018-02-19 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9359 )

Change subject: KUDU-2004. Undefined behavior in TlsSocket::Writev()
..


Patch Set 1:

GVO failed due to infrastructure issue (IMPALA-6394)


--
To view, visit http://gerrit.cloudera.org:8080/9359
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5b5bbb3fd2ec8fcd1a48873446f3aa09546eaac
Gerrit-Change-Number: 9359
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Edward Fancher 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 20 Feb 2018 05:25:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] KUDU-2004. Undefined behavior in TlsSocket::Writev()

2018-02-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9359 )

Change subject: KUDU-2004. Undefined behavior in TlsSocket::Writev()
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1956/


--
To view, visit http://gerrit.cloudera.org:8080/9359
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5b5bbb3fd2ec8fcd1a48873446f3aa09546eaac
Gerrit-Change-Number: 9359
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Edward Fancher 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 20 Feb 2018 05:26:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6497: add "Last row fetched" and AC events

2018-02-19 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9271 )

Change subject: IMPALA-6497: add "Last row fetched" and AC events
..


Patch Set 8:

Failing in "Waiting for HDFS replication"


--
To view, visit http://gerrit.cloudera.org:8080/9271
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I32a707e4660061e75c86ad967f1fac6f6737da7e
Gerrit-Change-Number: 9271
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 20 Feb 2018 05:55:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6497: add "Last row fetched" and AC events

2018-02-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9271 )

Change subject: IMPALA-6497: add "Last row fetched" and AC events
..


Patch Set 8:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1957/


--
To view, visit http://gerrit.cloudera.org:8080/9271
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I32a707e4660061e75c86ad967f1fac6f6737da7e
Gerrit-Change-Number: 9271
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 20 Feb 2018 05:55:40 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6269: Expose KRPC metrics on debug webpage

2018-02-19 Thread Lars Volker (Code Review)
Hello Michael Ho, Sailesh Mukil,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9292

to look at the new patch set (#16).

Change subject: IMPALA-6269: Expose KRPC metrics on debug webpage
..

IMPALA-6269: Expose KRPC metrics on debug webpage

This change exposes KRPC metrics on the /rpcz debug web page.

This change also exposes metrics for rejected RPCs on the /metrics debug
web page. See here for an example: https://git.io/vAczm

This change also fixes a bug in PrettyPrinter::GetByteUnit(), which
previously did not work for unsigned values due to an implicit cast.

This change contains tests to check that the metrics show up in /rpcz
and /metrics and that they update as expected when executing queries.

This change is based on a change by Sailesh Mukil.

Change-Id: I7af7c1a84a5be82c979ca4ef1edf35167493be3f
---
M be/src/rpc/impala-service-pool.cc
M be/src/rpc/impala-service-pool.h
M be/src/rpc/rpc-mgr.cc
M be/src/rpc/rpc-mgr.h
M be/src/rpc/rpc-trace.cc
M be/src/rpc/rpc-trace.h
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/service/impalad-main.cc
M be/src/util/histogram-metric.h
M be/src/util/pretty-printer.h
M common/thrift/Metrics.thrift
M common/thrift/metrics.json
A tests/custom_cluster/test_krpc_metrics.py
M tests/webserver/test_web_pages.py
M www/rpcz.tmpl
16 files changed, 507 insertions(+), 103 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/9292/16
--
To view, visit http://gerrit.cloudera.org:8080/9292
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7af7c1a84a5be82c979ca4ef1edf35167493be3f
Gerrit-Change-Number: 9292
Gerrit-PatchSet: 16
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-6269: Expose KRPC metrics on debug webpage

2018-02-19 Thread Lars Volker (Code Review)
Hello Michael Ho, Sailesh Mukil,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9292

to look at the new patch set (#17).

Change subject: IMPALA-6269: Expose KRPC metrics on debug webpage
..

IMPALA-6269: Expose KRPC metrics on debug webpage

This change exposes KRPC metrics on the /rpcz debug web page.

This change also exposes metrics for rejected RPCs on the /metrics debug
web page. See here for an example: https://git.io/vAczm

This change also fixes a bug in PrettyPrinter::GetByteUnit(), which
previously did not work for unsigned values due to an implicit cast.

This change contains tests to check that the metrics show up in /rpcz
and /metrics and that they update as expected when executing queries.

This change is based on a change by Sailesh Mukil.

Change-Id: I7af7c1a84a5be82c979ca4ef1edf35167493be3f
---
M be/src/rpc/impala-service-pool.cc
M be/src/rpc/impala-service-pool.h
M be/src/rpc/rpc-mgr.cc
M be/src/rpc/rpc-mgr.h
M be/src/rpc/rpc-trace.cc
M be/src/rpc/rpc-trace.h
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/service/impalad-main.cc
M be/src/util/histogram-metric.h
M be/src/util/pretty-printer.h
M common/thrift/Metrics.thrift
M common/thrift/metrics.json
A tests/custom_cluster/test_krpc_metrics.py
M tests/webserver/test_web_pages.py
M www/rpcz.tmpl
16 files changed, 516 insertions(+), 103 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/9292/17
--
To view, visit http://gerrit.cloudera.org:8080/9292
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7af7c1a84a5be82c979ca4ef1edf35167493be3f
Gerrit-Change-Number: 9292
Gerrit-PatchSet: 17
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-6269: Expose KRPC metrics on debug webpage

2018-02-19 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9292 )

Change subject: IMPALA-6269: Expose KRPC metrics on debug webpage
..


Patch Set 17:

(4 comments)

Please see my inline comments and PS17. It also removes the now obsolete queue 
size and replaces it with memory usage values.

http://gerrit.cloudera.org:8080/#/c/9292/11/be/src/rpc/impala-service-pool.cc
File be/src/rpc/impala-service-pool.cc:

http://gerrit.cloudera.org:8080/#/c/9292/11/be/src/rpc/impala-service-pool.cc@76
PS11, Line 76:
> Is there a JIRA for it ? Will it be done as part of IMPALA-6540 ?
I don't think this would be part of IMPALA-6540, and I filed IMPALA-6541 for 
it. This particular metric here should be moved to KRPC's internal code 
(KUDU-2313).


http://gerrit.cloudera.org:8080/#/c/9292/15/be/src/rpc/impala-service-pool.cc
File be/src/rpc/impala-service-pool.cc:

http://gerrit.cloudera.org:8080/#/c/9292/15/be/src/rpc/impala-service-pool.cc@254
PS15, Line 254: const string KrpcHistogramToString(const kudu::Histogram* 
histogram) {
> It may be more extensible to not return a formatted string. Instead, we can
Currently we're exposing only what is required to fulfill ToJson()'s contract. 
Once we want to wrap Kudu's Histograms and register them inside our metric 
groups, we will have to alter this abstraction (but I don't know how to do that 
yet). I created IMPALA-6541 to track this effort.


http://gerrit.cloudera.org:8080/#/c/9292/15/be/src/rpc/rpc-mgr.cc
File be/src/rpc/rpc-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/9292/15/be/src/rpc/rpc-mgr.cc@164
PS15, Line 164: void RpcMgr::Shutdown() {
> We can probably give up ownership of the acceptor_pool_ before calling mess
Good point, done.


http://gerrit.cloudera.org:8080/#/c/9292/15/be/src/util/histogram-metric.h
File be/src/util/histogram-metric.h:

http://gerrit.cloudera.org:8080/#/c/9292/15/be/src/util/histogram-metric.h@96
PS15, Line 96:
> nit: long line
Done



--
To view, visit http://gerrit.cloudera.org:8080/9292
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7af7c1a84a5be82c979ca4ef1edf35167493be3f
Gerrit-Change-Number: 9292
Gerrit-PatchSet: 17
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 20 Feb 2018 07:12:32 +
Gerrit-HasComments: Yes