[Impala-ASF-CR] IMPALA-5482: fix git checkout when workloads are modified

2017-06-10 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-5482: fix git checkout when workloads are modified
..


Patch Set 1: Code-Review+2

Thank you for fixing this!

-- 
To view, visit http://gerrit.cloudera.org:8080/7145
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9a0d004c353eb4b547aeaf3c56289594326653d7
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5031: Remove undefined behavior "reference binding to null"

2017-06-10 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-5031: Remove undefined behavior "reference binding to 
null"
..


Patch Set 5: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7008
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifef5c3de686318bff4eac7dfad6e1639ddadb8f2
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5031: Remove undefined behavior "reference binding to null"

2017-06-10 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-5031: Remove undefined behavior "reference binding to 
null"
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7008/4/be/src/exec/parquet-column-stats.inline.h
File be/src/exec/parquet-column-stats.inline.h:

PS4, Line 74: bytes_needed > 0
> I agree, let's add a DCHECK.
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7008
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifef5c3de686318bff4eac7dfad6e1639ddadb8f2
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5031: Remove undefined behavior "reference binding to null"

2017-06-10 Thread Jim Apple (Code Review)
Hello Tim Armstrong, Dan Hecht,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7008

to look at the new patch set (#5).

Change subject: IMPALA-5031: Remove undefined behavior "reference binding to 
null"
..

IMPALA-5031: Remove undefined behavior "reference binding to null"

When p has type T* and p is nullptr, then T x = p[0] has undefined
behavior (obviously). Less obviously, T& x = p[0] also has undefined
behavior -- references cannot be null. That undefined behavior can be
caused by expressions like [0] when v is a vector of size 0. The
issue is that the expression is parsed as &(v[0]), aka
&(v.operator[](0)). The return type of the operator[] method is T&,
and when v is empty the return value is a null reference.

This was recognized by the C++ standards committe and fixed in
Committee Draft 2008. See LWG Defect Report 464:

http://www.open-std.org/jtc1/sc22/wg21/docs/lwg-defects.html#464

This patch was created by running the following command, then fixing
the resulting compile errors:

find be/src -type f -execdir \
  sed -i 's/&\([A-Za-z0-9_]\+\)\[0\]/\1.data()/g' \{\} \;

Change-Id: Ifef5c3de686318bff4eac7dfad6e1639ddadb8f2
---
M be/src/exec/aggregation-node.cc
M be/src/exec/data-source-scan-node.cc
M be/src/exec/delimited-text-parser-test.cc
M be/src/exec/hash-join-node-ir.cc
M be/src/exec/hash-join-node.cc
M be/src/exec/hash-table.cc
M be/src/exec/hbase-scan-node.cc
M be/src/exec/hdfs-parquet-scanner-ir.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-rcfile-scanner.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scanner.h
M be/src/exec/hdfs-sequence-scanner.cc
M be/src/exec/hdfs-sequence-table-writer.cc
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/kudu-scanner.cc
M be/src/exec/nested-loop-join-node.cc
M be/src/exec/old-hash-table-ir.cc
M be/src/exec/old-hash-table.cc
M be/src/exec/parquet-column-stats.h
M be/src/exec/parquet-column-stats.inline.h
M be/src/exec/partitioned-aggregation-node-ir.cc
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exec/partitioned-hash-join-node-ir.cc
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/select-node.cc
M be/src/exec/unnest-node.cc
M be/src/experiments/tuple-splitter-test.cc
M be/src/exprs/expr-context.cc
M be/src/exprs/expr-test.cc
M be/src/exprs/expr-value.h
M be/src/exprs/string-functions-ir.cc
M be/src/rpc/authentication.cc
M be/src/rpc/thrift-util-test.cc
M be/src/runtime/bufferpool/buffer-pool-test.cc
M be/src/runtime/disk-io-mgr-test.cc
M be/src/runtime/parallel-executor-test.cc
M be/src/runtime/row-batch-serialize-test.cc
M be/src/runtime/tmp-file-mgr-test.cc
M be/src/udf/udf-test-harness.h
M be/src/util/bitmap.h
M be/src/util/coding-util-test.cc
M be/src/util/coding-util.cc
M be/src/util/dict-encoding.h
M be/src/util/runtime-profile.cc
M be/src/util/streaming-sampler.h
M be/src/util/tuple-row-compare.h
M be/src/util/uid-util.h
M be/src/util/webserver.cc
49 files changed, 113 insertions(+), 109 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/08/7008/5
-- 
To view, visit http://gerrit.cloudera.org:8080/7008
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifef5c3de686318bff4eac7dfad6e1639ddadb8f2
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC

2017-06-10 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-4856: Port data stream service to KRPC
..


Patch Set 1:

This patch passes core, exhaustive and ASAN tests. It can execute 32 concurrent 
streams of TPCDS-Q17 @ scale factor 3 on a 138-node cluster with Kerberos 
enabled. (I don't believe the previous implementation could do this effectively 
because of the number of Thrift connections required). 

Some perf results from a 20-node cluster:

++--+---++-++---++-+---+
| Workload   | Query| File Format   | Avg(s) | Base Avg(s) | 
Delta(Avg) | StdDev(%) | Base StdDev(%) | Num Clients | Iters |
++--+---++-++---++-+---+
| TPCH(_300) | TPCH-Q3  | parquet / none / none | 32.55  | 28.18   |   
+15.51%  |   4.71%   |   1.17%| 1   | 3 |
| TPCH(_300) | TPCH-Q13 | parquet / none / none | 24.43  | 22.21   |   
+9.99%   |   0.61%   |   0.70%| 1   | 3 |
| TPCH(_300) | TPCH-Q8  | parquet / none / none | 7.53   | 7.05|   
+6.69%   |   1.70%   |   2.09%| 1   | 3 |
| TPCH(_300) | TPCH-Q22 | parquet / none / none | 6.35   | 6.04|   
+5.19%   |   0.37%   |   0.76%| 1   | 3 |
| TPCH(_300) | TPCH-Q14 | parquet / none / none | 4.28   | 4.10|   
+4.36%   |   0.03%   |   0.73%| 1   | 3 |
| TPCH(_300) | TPCH-Q15 | parquet / none / none | 3.53   | 3.41|   
+3.69%   |   0.61%   |   1.42%| 1   | 3 |
| TPCH(_300) | TPCH-Q16 | parquet / none / none | 6.09   | 5.87|   
+3.63%   |   0.15%   |   1.78%| 1   | 3 |
| TPCH(_300) | TPCH-Q11 | parquet / none / none | 1.73   | 1.70|   
+2.22%   |   0.10%   |   0.95%| 1   | 3 |
| TPCH(_300) | TPCH-Q21 | parquet / none / none | 105.84 | 103.71  |   
+2.06%   |   0.57%   |   0.44%| 1   | 3 |
| TPCH(_300) | TPCH-Q9  | parquet / none / none | 30.76  | 30.46   |   
+1.00%   |   2.57%   |   1.22%| 1   | 3 |
| TPCH(_300) | TPCH-Q1  | parquet / none / none | 22.14  | 21.94   |   
+0.91%   |   0.81%   |   0.86%| 1   | 3 |
| TPCH(_300) | TPCH-Q4  | parquet / none / none | 5.09   | 5.05|   
+0.79%   |   0.48%   |   2.54%| 1   | 3 |
| TPCH(_300) | TPCH-Q18 | parquet / none / none | 31.76  | 32.54   |   
-2.39%   |   0.44%   |   0.03%| 1   | 3 |
| TPCH(_300) | TPCH-Q2  | parquet / none / none | 1.98   | 2.04|   
-2.74%   |   7.17%   |   7.41%| 1   | 3 |
| TPCH(_300) | TPCH-Q5  | parquet / none / none | 47.62  | 48.98   |   
-2.79%   |   0.51%   |   0.16%| 1   | 3 |
| TPCH(_300) | TPCH-Q20 | parquet / none / none | 3.18   | 3.27|   
-2.89%   |   1.34%   |   1.98%| 1   | 3 |
| TPCH(_300) | TPCH-Q6  | parquet / none / none | 1.32   | 1.37|   
-3.72%   |   0.03%   |   4.00%| 1   | 3 |
| TPCH(_300) | TPCH-Q10 | parquet / none / none | 9.00   | 9.48|   
-5.06%   |   0.16%   |   0.69%| 1   | 3 |
| TPCH(_300) | TPCH-Q17 | parquet / none / none | 5.16   | 5.75|   
-10.18%  |   6.44%   |   2.63%| 1   | 3 |
| TPCH(_300) | TPCH-Q12 | parquet / none / none | 3.01   | 3.39|   
-11.38%  |   2.43%   |   0.06%| 1   | 3 |
| TPCH(_300) | TPCH-Q19 | parquet / none / none | 25.20  | 28.82   | I 
-12.57%  |   0.01%   |   0.75%| 1   | 3 |
| TPCH(_300) | TPCH-Q7  | parquet / none / none | 45.32  | 61.16   | I 
-25.91%  |   0.55%   |   2.22%| 1   | 3 |
++--+---++-++---++-+---+

Primitives (note the significant regression in many_independent_fragments, that 
needs further attention)

+-++---++-++---++-+---+
| Workload| Query  
| File Format   | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%) | Base 
StdDev(%) | Num Clients | Iters |
+-++---++-++---++-+---+
| TARGETED-PERF(_300) | primitive_many_independent_fragments   
| parquet / none / none | 377.69 | 189.40  | R +99.42%  |   0.32%   |   
0.22%| 1   |