[Impala-ASF-CR] IMPALA-4282: Remove max length check for type strings.

2017-02-15 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change.

Change subject: IMPALA-4282: Remove max length check for type strings.
..


Patch Set 2: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/6034
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f5e503e14feee857bbdf80b3ef4f5a8b57fb2d
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4282: Remove max length check for type strings.

2017-02-15 Thread Alex Behm (Code Review)
Alex Behm has uploaded a new patch set (#2).

Change subject: IMPALA-4282: Remove max length check for type strings.
..

IMPALA-4282: Remove max length check for type strings.

During analysis, Impala used to enforce the default HMS
limit on the number of characters of a type string.
That enforcement has the benefit of a clear error
message, but has the severe drawback of not having
any workaround because the limit is baked into the code.
Going above the 4000 limit is pretty easy with complex
types, and several users have already run into this
issue.

This patch removes the Impala-side enforcement such
that the 4000 limit can be increased by altering
the corresponding columns in the Metastore's backend
database.

Change-Id: I01f5e503e14feee857bbdf80b3ef4f5a8b57fb2d
---
M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java
M fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
3 files changed, 1 insertion(+), 50 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/6034/2
-- 
To view, visit http://gerrit.cloudera.org:8080/6034
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I01f5e503e14feee857bbdf80b3ef4f5a8b57fb2d
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 


[Impala-ASF-CR] IMPALA-4282: Remove max length check for type strings.

2017-02-15 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4282: Remove max length check for type strings.
..


Patch Set 2:

Ahh right. Thanks. Done.

-- 
To view, visit http://gerrit.cloudera.org:8080/6034
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f5e503e14feee857bbdf80b3ef4f5a8b57fb2d
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4282: Remove max length check for type strings.

2017-02-15 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change.

Change subject: IMPALA-4282: Remove max length check for type strings.
..


Patch Set 1:

Looks like you forgot to git-add changes from ColumnDef.java.

-- 
To view, visit http://gerrit.cloudera.org:8080/6034
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f5e503e14feee857bbdf80b3ef4f5a8b57fb2d
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4282: Remove max length check for type strings.

2017-02-15 Thread Alex Behm (Code Review)
Alex Behm has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/6034

Change subject: IMPALA-4282: Remove max length check for type strings.
..

IMPALA-4282: Remove max length check for type strings.

During analysis, Impala used to enforce the default HMS
limit on the number of characters of a type string.
That enforcement has the benefit of a clear error
message, but has the severe drawback of not having
any workaround because the limit is baked into the code.
Going above the 4000 limit is pretty easy with complex
types, and several users have already run into this
issue.

This patch removes the Impala-side enforcement such
that the 4000 limit can be increased by altering
the corresponding column in the Metastore's backend
database.

Change-Id: I01f5e503e14feee857bbdf80b3ef4f5a8b57fb2d
---
M fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
2 files changed, 0 insertions(+), 42 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/6034/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6034
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I01f5e503e14feee857bbdf80b3ef4f5a8b57fb2d
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 


[Impala-ASF-CR] IMPALA-4854: Fix incremental stats with complex types.

2017-02-15 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change.

Change subject: IMPALA-4854: Fix incremental stats with complex types.
..


Patch Set 1: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/6033
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6e0335048d688ee25ff55c6628d0f6f8ecc1dd8a
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4854: Fix incremental stats with complex types.

2017-02-15 Thread Alex Behm (Code Review)
Alex Behm has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/6033

Change subject: IMPALA-4854: Fix incremental stats with complex types.
..

IMPALA-4854: Fix incremental stats with complex types.

The bug: Compute incremental stats used to always do a
full stats recomputation for tables with complex types.
The logic for detecting schema changes (e.g. an added
column) did not take into consideration that columns
with complex types are ignored in the stats computation,
and should therefore not be recognized as a new column
that does not yet have stats.

Testing:
- Added a new regression test
- Locally ran test_compute_stats.py and the FE tests

Change-Id: I6e0335048d688ee25ff55c6628d0f6f8ecc1dd8a
---
M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
M 
testdata/workloads/functional-query/queries/QueryTest/compute-stats-incremental.test
2 files changed, 47 insertions(+), 9 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/33/6033/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6033
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I6e0335048d688ee25ff55c6628d0f6f8ecc1dd8a
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 


[Impala-ASF-CR] IMPALA-2328: Read support for min/max Parquet statistics

2017-02-15 Thread Lars Volker (Code Review)
Lars Volker has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/6032

Change subject: IMPALA-2328: Read support for min/max Parquet statistics
..

IMPALA-2328: Read support for min/max Parquet statistics

This change adds support for skipping row groups based on Parquet row
group statistics. With this change we only support reading statistics
from Parquet files for numerical types (bool, integer, floating point)
and for simple predicates of the formsor
  , where  is LT, LE, GE, GT, and EQ.

Change-Id: I39b836165756fcf929c801048d91c50c8fdcdae4
---
M be/src/exec/CMakeLists.txt
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
A be/src/exec/parquet-column-stats.cc
M be/src/exec/parquet-column-stats.h
M be/src/exec/parquet-metadata-utils.cc
M be/src/exec/parquet-metadata-utils.h
M be/src/exprs/expr.h
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/rewrite/NormalizeExprsRule.java
A testdata/workloads/functional-query/queries/QueryTest/parquet_stats.test
M tests/query_test/test_insert_parquet.py
15 files changed, 589 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/32/6032/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6032
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I39b836165756fcf929c801048d91c50c8fdcdae4
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 


[Impala-ASF-CR] IMPALA-2020: Add rounding for decimal casts

2017-02-15 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-2020: Add rounding for decimal casts
..


Patch Set 11:

I have not updated the Python test yet, as that would require a rebase.  Python 
test and benchmark will come in next patch set, this is just to give a clean 
slate for review.

-- 
To view, visit http://gerrit.cloudera.org:8080/5951
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I2daf186b4770a022f9cb349d512067a1dd624810
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4926: Upgrade LZ4 to 1.7.5

2017-02-15 Thread Henry Robinson (Code Review)
Henry Robinson has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/6030

Change subject: IMPALA-4926: Upgrade LZ4 to 1.7.5
..

IMPALA-4926: Upgrade LZ4 to 1.7.5

LZ4 has deprecated the method names LZ4_compress and LZ4_uncompress, so
rename those to their new versions LZ4_compress_default() and
LZ4_decompress_fast() respectively. Otherwise no changes are required
for compatibility with LZ4 1.7.5.

Change-Id: I10e4561d0e940fa15ca8248c8277acfc6dff3da3
---
M be/src/util/compress.cc
M be/src/util/decompress.cc
M bin/impala-config.sh
3 files changed, 6 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/6030/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6030
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I10e4561d0e940fa15ca8248c8277acfc6dff3da3
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 


[Impala-ASF-CR] IMPALA-4840: Fix REFRESH performance regression.

2017-02-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4840: Fix REFRESH performance regression.
..


IMPALA-4840: Fix REFRESH performance regression.

The fix for IMPALA-4172 introduced a regression in
performance of the REFRESH command. The regression
stems from the fact that we reload the block metadata
of every valid data file without considering whether it
has changed since the last load. This caused unnecessary
metadata loads for unchanged files and thus increasing
the runtime.

The fix involves having the refresh codepath (and other
operations that use the same codepath like insert etc.) to
reload the metadata of only modified files by doing a
listStatus() on the partition directory and checking the
last modified time of each file. Without this patch, we relied
on listFiles(), which fetched the block locations irrespective of
whether the file has changed and it was significantly slower on
unchanged tables. The initial/invalidate metadata load still
fetches the block locations in bulk using listFiles(). The
side effect of this change is that the refresh no longer picks up
block location changes after HDFS block rebalancing. We suggest
using "invalidate metadata" for that which loads the metadata from
scratch.

Additionally, this commit enables the reuse of metadata during
table refresh (which was disabled in IMPALA-4172) to prevent
reloading metadata from HMS everytime.

Change-Id: I859b9fe93563ba886d0b5db6db42a14c88caada8
Reviewed-on: http://gerrit.cloudera.org:8080/6009
Reviewed-by: Dimitris Tsirogiannis 
Tested-by: Impala Public Jenkins
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
2 files changed, 110 insertions(+), 32 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Dimitris Tsirogiannis: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/6009
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I859b9fe93563ba886d0b5db6db42a14c88caada8
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Impala Public Jenkins


[Impala-ASF-CR] IMPALA-4787: Optimize APPX MEDIAN() memory usage

2017-02-15 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded a new patch set (#2).

Change subject: IMPALA-4787: Optimize APPX_MEDIAN() memory usage
..

IMPALA-4787: Optimize APPX_MEDIAN() memory usage

Before this change, ReservoirSample functions (such as APPX_MEDIAN())
allocated memory for 20,000 elements up front per grouping key. This
caused inefficient memory usage for aggregations with many grouping
keys.

This patch fixes this by initially allocating memory for 20 elements.
Once the buffer becomes full, we allocate room for 200 elements and
copy the original buffer into the new one. We continue increasing the
buffer size this way until the buffer has room for 20,000 elements as
before.

Testing:
Ran BE and some relevant EE tests locally.
No new tests were added, the existing tests should provide enough
coverage.

Memory benchmark:
The following query was used to verify that this patch reduces memory
usage:
SELECT APPX_MEDIAN(ss_sold_date_sk)
FROM tpcds.store_sales
GROUP BY ss_customer_sk;

Peak Mem in Agg nodes is reduced from 4.94 GB to 20.67 MB.

Summary before:

Operator   #Hosts   Avg Time   Max Time#Rows  Est. #Rows  Peak Mem  
Est. Peak Mem  Detail

04:EXCHANGE 15.856ms5.856ms   14.82K  15.21K 0  
  -1.00 B  UNPARTITIONED
03:AGGREGATE33s721ms3s789ms   14.82K  15.21K   4.94 GB  
 10.00 MB  FINALIZE
02:EXCHANGE 3  139.276ms  157.753ms   15.60K  15.21K 0  
0  HASH(ss_customer_sk)
01:AGGREGATE32s851ms3s026ms   15.60K  15.21K   5.29 GB  
 10.00 MB  STREAMING
00:SCAN HDFS3   24.245ms   35.727ms  183.59K 183.59K   4.60 MB  
384.00 MB  tpcds.store_sales

Summary after:

Operator   #Hosts   Avg Time  Max Time#Rows  Est. #Rows  Peak Mem  Est. 
Peak Mem  Detail
---
04:EXCHANGE 11.869ms   1.869ms   14.82K  15.21K 0   
 -1.00 B  UNPARTITIONED
03:AGGREGATE3   34.225ms  37.978ms   14.82K  15.21K  20.67 MB   
10.00 MB  FINALIZE
02:EXCHANGE 3  860.854us   1.024ms   15.60K  15.21K 0   
   0  HASH(ss_customer_sk)
01:AGGREGATE3   74.048ms  93.605ms   15.60K  15.21K   9.99 MB   
10.00 MB  STREAMING
00:SCAN HDFS3   68.162ms  78.181ms  183.59K 183.59K   3.84 MB  
384.00 MB  tpcds.store_sales

Change-Id: I99adaad574d4fb0a3cf38c6cbad8b2a23df12968
---
M be/src/exprs/aggregate-functions-ir.cc
M testdata/workloads/functional-query/queries/QueryTest/alloc-fail-init.test
2 files changed, 136 insertions(+), 33 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/25/6025/2
-- 
To view, visit http://gerrit.cloudera.org:8080/6025
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I99adaad574d4fb0a3cf38c6cbad8b2a23df12968
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Taras Bobrovytsky 


[Impala-ASF-CR] IMPALA-4787: Optimize APPX MEDIAN() memory usage

2017-02-15 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded a new patch set (#2).

Change subject: IMPALA-4787: Optimize APPX_MEDIAN() memory usage
..

IMPALA-4787: Optimize APPX_MEDIAN() memory usage

Before this change, ReservoirSample functions (such as APPX_MEDIAN())
allocated memory for 20,000 elements up front per grouping key. This
caused inefficient memory usage for aggregations with many grouping
keys.

This patch fixes this by initially allocating memory for 20 elements.
Once the buffer becomes full, we allocate room for 200 elements and
copy the original buffer into the new one. We continue increasing the
buffer size this way until the buffer has room for 20,000 elements as
before.

Testing:
Ran BE and some relevant EE tests locally.
No new tests were added, the existing tests should provide enough
coverage.

Memory benchmark:
The following query was used to verify that this patch reduces memory
usage:
SELECT APPX_MEDIAN(ss_sold_date_sk)
FROM tpcds.store_sales
GROUP BY ss_customer_sk;

Peak Mem in Agg nodes is reduced from 4.94 GB to 20.67 MB.

Summary before:

Operator   #Hosts   Avg Time   Max Time#Rows  Est. #Rows  Peak Mem  
Est. Peak Mem  Detail

04:EXCHANGE 15.856ms5.856ms   14.82K  15.21K 0  
  -1.00 B  UNPARTITIONED
03:AGGREGATE33s721ms3s789ms   14.82K  15.21K   4.94 GB  
 10.00 MB  FINALIZE
02:EXCHANGE 3  139.276ms  157.753ms   15.60K  15.21K 0  
0  HASH(ss_customer_sk)
01:AGGREGATE32s851ms3s026ms   15.60K  15.21K   5.29 GB  
 10.00 MB  STREAMING
00:SCAN HDFS3   24.245ms   35.727ms  183.59K 183.59K   4.60 MB  
384.00 MB  tpcds.store_sales

Summary after:

Operator   #Hosts   Avg Time  Max Time#Rows  Est. #Rows  Peak Mem  Est. 
Peak Mem  Detail
---
04:EXCHANGE 11.869ms   1.869ms   14.82K  15.21K 0   
 -1.00 B  UNPARTITIONED
03:AGGREGATE3   34.225ms  37.978ms   14.82K  15.21K  20.67 MB   
10.00 MB  FINALIZE
02:EXCHANGE 3  860.854us   1.024ms   15.60K  15.21K 0   
   0  HASH(ss_customer_sk)
01:AGGREGATE3   74.048ms  93.605ms   15.60K  15.21K   9.99 MB   
10.00 MB  STREAMING
00:SCAN HDFS3   68.162ms  78.181ms  183.59K 183.59K   3.84 MB  
384.00 MB  tpcds.store_sales

Change-Id: I99adaad574d4fb0a3cf38c6cbad8b2a23df12968
---
M be/src/exprs/aggregate-functions-ir.cc
M testdata/workloads/functional-query/queries/QueryTest/alloc-fail-init.test
2 files changed, 136 insertions(+), 33 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/25/6025/2
-- 
To view, visit http://gerrit.cloudera.org:8080/6025
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I99adaad574d4fb0a3cf38c6cbad8b2a23df12968
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Taras Bobrovytsky 


[Impala-ASF-CR] IMPALA-4787: Optimize APPX MEDIAN() memory usage

2017-02-15 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded a new patch set (#3).

Change subject: IMPALA-4787: Optimize APPX_MEDIAN() memory usage
..

IMPALA-4787: Optimize APPX_MEDIAN() memory usage

Before this change, ReservoirSample functions (such as APPX_MEDIAN())
allocated memory for 20,000 elements up front per grouping key. This
caused inefficient memory usage for aggregations with many grouping
keys.

This patch fixes this by initially allocating memory for 20 elements.
Once the buffer becomes full, we allocate room for 200 elements and
copy the original buffer into the new one. We continue increasing the
buffer size this way until the buffer has room for 20,000 elements as
before.

Testing:
Ran BE and some relevant EE tests locally.
No new tests were added, the existing tests should provide enough
coverage.

Memory benchmark:
The following query was used to verify that this patch reduces memory
usage:
SELECT APPX_MEDIAN(ss_sold_date_sk)
FROM tpcds.store_sales
GROUP BY ss_customer_sk;

Peak Mem in Agg nodes is reduced from 4.94 GB to 20.67 MB.

Summary before:

Operator   #Hosts   Avg Time   Max Time#Rows  Est. #Rows  Peak Mem  
Est. Peak Mem  Detail

04:EXCHANGE 15.856ms5.856ms   14.82K  15.21K 0  
  -1.00 B  UNPARTITIONED
03:AGGREGATE33s721ms3s789ms   14.82K  15.21K   4.94 GB  
 10.00 MB  FINALIZE
02:EXCHANGE 3  139.276ms  157.753ms   15.60K  15.21K 0  
0  HASH(ss_customer_sk)
01:AGGREGATE32s851ms3s026ms   15.60K  15.21K   5.29 GB  
 10.00 MB  STREAMING
00:SCAN HDFS3   24.245ms   35.727ms  183.59K 183.59K   4.60 MB  
384.00 MB  tpcds.store_sales

Summary after:

Operator   #Hosts   Avg Time  Max Time#Rows  Est. #Rows  Peak Mem  Est. 
Peak Mem  Detail
---
04:EXCHANGE 11.869ms   1.869ms   14.82K  15.21K 0   
 -1.00 B  UNPARTITIONED
03:AGGREGATE3   34.225ms  37.978ms   14.82K  15.21K  20.67 MB   
10.00 MB  FINALIZE
02:EXCHANGE 3  860.854us   1.024ms   15.60K  15.21K 0   
   0  HASH(ss_customer_sk)
01:AGGREGATE3   74.048ms  93.605ms   15.60K  15.21K   9.99 MB   
10.00 MB  STREAMING
00:SCAN HDFS3   68.162ms  78.181ms  183.59K 183.59K   3.84 MB  
384.00 MB  tpcds.store_sales

Change-Id: I99adaad574d4fb0a3cf38c6cbad8b2a23df12968
---
M be/src/exprs/aggregate-functions-ir.cc
M testdata/workloads/functional-query/queries/QueryTest/alloc-fail-init.test
2 files changed, 136 insertions(+), 33 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/25/6025/3
-- 
To view, visit http://gerrit.cloudera.org:8080/6025
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I99adaad574d4fb0a3cf38c6cbad8b2a23df12968
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Taras Bobrovytsky 


[Impala-ASF-CR] IMPALA-2020: Add rounding for decimal casts

2017-02-15 Thread Zach Amsden (Code Review)
Zach Amsden has uploaded a new patch set (#11).

Change subject: IMPALA-2020: Add rounding for decimal casts
..

IMPALA-2020: Add rounding for decimal casts

This change adds support for DECIMAL_V2 rounding behavior for both
DECIMAL to INT and DOUBLE to DECIMAL casts.  The round behavior
implemented for exact halves is round halves away from zero (e.g
(0.5 -> 1) and (-0.5 -> -1)).

Testing: Added expr-test and decimal-test test coverage as well as
manual testing.

[localhost:21000] > select cast(0.5 AS int);
+--+
| cast(0.5 as int) |
+--+
| 0|
+--+
Fetched 1 row(s) in 0.01s
[localhost:21000] > select cast(cast(0.5999 as float) as decimal(5,1));
+-+
| cast(cast(0.5999 as float) as decimal(5,1)) |
+-+
| 0.5 |
+-+
Fetched 1 row(s) in 0.01s
[localhost:21000] > set decimal_v2=1;
DECIMAL_V2 set to 1
[localhost:21000] > select cast(0.5 AS int);
+--+
| cast(0.5 as int) |
+--+
| 1|
+--+
Fetched 1 row(s) in 0.01s
[localhost:21000] > select cast(cast(0.5999 as float) as decimal(5,1));
+-+
| cast(cast(0.5999 as float) as decimal(5,1)) |
+-+
| 0.6 |
+-+
Fetched 1 row(s) in 0.01s

Change-Id: I2daf186b4770a022f9cb349d512067a1dd624810
---
M be/src/exprs/decimal-operators-ir.cc
M be/src/exprs/expr-test.cc
M be/src/exprs/expr.h
M be/src/exprs/literal.cc
M be/src/runtime/decimal-test.cc
M be/src/runtime/decimal-value.h
M be/src/runtime/decimal-value.inline.h
M be/src/udf/udf.h
8 files changed, 374 insertions(+), 82 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/51/5951/11
-- 
To view, visit http://gerrit.cloudera.org:8080/5951
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2daf186b4770a022f9cb349d512067a1dd624810
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-4787: Optimize APPX MEDIAN() memory usage

2017-02-15 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded a new patch set (#2).

Change subject: IMPALA-4787: Optimize APPX_MEDIAN() memory usage
..

IMPALA-4787: Optimize APPX_MEDIAN() memory usage

Before this change, ReservoirSample functions (such as APPX_MEDIAN())
allocated memory for 20,000 elements up front per grouping key. This
caused inefficient memory usage for aggregations with many grouping
keys.

This patch fixes this by initially allocating memory for 20 elements.
Once the buffer becomes full, we allocate room for 200 elements and
copy the original buffer into the new one. We continue increasing the
buffer size this way until the buffer has room for 20,000 elements as
before.

Testing:
Ran BE and some relevant EE tests locally.
No new tests were added, the existing tests should provide enough
coverage.

Change-Id: I99adaad574d4fb0a3cf38c6cbad8b2a23df12968
---
M be/src/exprs/aggregate-functions-ir.cc
M testdata/workloads/functional-query/queries/QueryTest/alloc-fail-init.test
2 files changed, 136 insertions(+), 33 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/25/6025/2
-- 
To view, visit http://gerrit.cloudera.org:8080/6025
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I99adaad574d4fb0a3cf38c6cbad8b2a23df12968
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Taras Bobrovytsky 


[Impala-ASF-CR] IMPALA-4787: Optimize APPX MEDIAN() memory usage

2017-02-15 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change.

Change subject: IMPALA-4787: Optimize APPX_MEDIAN() memory usage
..


Patch Set 1:

(14 comments)

http://gerrit.cloudera.org:8080/#/c/6025/1/be/src/exprs/aggregate-functions-ir.cc
File be/src/exprs/aggregate-functions-ir.cc:

Line 146: // We assume that the dst buffer has already been allocated earlier. 
This increases
> can simplify with:
Done


Line 150:   uint8_t* ptr = ctx->Reallocate(dst->ptr, buf_len);
> DCHECK(dst->ptr != NULL);
Done


Line 918: const static int NUM_SAMPLES = NUM_BUCKETS * NUM_SAMPLES_PER_BUCKET;
> NUM_SAMPLES -> MAX_NUM_SAMPLES
Done


Line 974:   ReservoirSample* at(int64_t idx) {
> single line, const function
Done


Line 979:   void push_back(ReservoirSample s) {
> use const&, otherwise 's' is copied twice
Done


Line 982: ReservoirSample* arr = 
reinterpret_cast(begin());
> no need to reinterpret_cast
Done


Line 987:   ReservoirSample* begin() {
> const function (and elsewhere)
Done


Line 994:   ReservoirSample* end() {
> single line
Done


Line 1005: void resizeReservoirSampleState(FunctionContext* ctx, StringVal* 
buffer) {
> Why not move this function into ReservoirSampleState and make ReservoirSamp
Unfortunately it's not possible to encapsulate because we also need to update 
the pointer in the StringVal buffer. It would be more complicated and error 
prone to pass it as one of parameters into this function.


Line 1007:   // can fit 10 times as many samples, up to a maximum of 20,000. We 
do not allocate
> instead of 20,000 use the constant MAX_NUM_SAMPLES
Done


Line 1033:   // If the array gets filled due to updates or merges, we will 
reallocate a larger
> suggest "we reallocate a larger buffer to hold up to a maximum of MAX_NUM_S
Done


Line 1035:   int init_size = (NUM_SAMPLES / 1000);
> It would be nice to have this encapsulated in the ReservoirSampleState as w
As mentioned above, it's not possible to encapsulate cleanly.


Line 1061:   resizeReservoirSampleState(ctx, dst);
> if ReallocBuffer() failed, dst->ptr will be in an undefined state, and may 
I just added a check here, because it's not possible to move the logic as you 
suggested.


Line 1147:   DCHECK_EQ(src_val.len, sizeof(ReservoirSampleState) +
> you can add a helper function to ReservoirSampleState to do this size compu
Done. Also added IsFull().


-- 
To view, visit http://gerrit.cloudera.org:8080/6025
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I99adaad574d4fb0a3cf38c6cbad8b2a23df12968
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2020: Add rounding for decimal casts

2017-02-15 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-2020: Add rounding for decimal casts
..


Patch Set 10:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5951/10/be/src/runtime/decimal-value.inline.h
File be/src/runtime/decimal-value.inline.h:

PS10, Line 127: typename RESULT_T::underlying_type_t
> It's applicable to the next line too, right ? The following looks easier to
I agree that does look nicer, but I can't do it without polluting the client 
namespace with whatever typedef I use, unless I put it inside another 
namespace, like detail, which doesn't really solve the problem either.


-- 
To view, visit http://gerrit.cloudera.org:8080/5951
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I2daf186b4770a022f9cb349d512067a1dd624810
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2020: Add rounding for decimal casts

2017-02-15 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-2020: Add rounding for decimal casts
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5951/5/be/src/exprs/literal.cc
File be/src/exprs/literal.cc:

Line 185: value_.decimal4_val = Decimal4Value::FromDouble(type, v, 
, true);
> Shouldn't this still be dependent on the behavior we want to test ? We test
I really don't want to complicate tests any more than necessary so since this 
is just for tests, I just gave this the sane expected behavior.  If this 
happens to break some tests, we'll find out and fix them, but this should not 
change Impala behavior.


-- 
To view, visit http://gerrit.cloudera.org:8080/5951
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I2daf186b4770a022f9cb349d512067a1dd624810
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4546: Fix Moscow timezone conversion after 2014

2017-02-15 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded a new patch set (#2).

Change subject: IMPALA-4546: Fix Moscow timezone conversion after 2014
..

IMPALA-4546: Fix Moscow timezone conversion after 2014

In 2014 Moscow timezone rules changed from UTC+3 with no DST to UTC+4
with no DST. A special case has been added to timestamp functions to
handle this.

Testing:
Added BE Expr tests.

Change-Id: Id6e3f2c9f6ba29749a26bc1087e664637bc02528
---
M be/src/exprs/expr-test.cc
M be/src/exprs/timestamp-functions.cc
M be/src/exprs/timezone_db.cc
M be/src/exprs/timezone_db.h
4 files changed, 90 insertions(+), 44 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/69/5969/2
-- 
To view, visit http://gerrit.cloudera.org:8080/5969
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id6e3f2c9f6ba29749a26bc1087e664637bc02528
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Taras Bobrovytsky 


[Impala-ASF-CR] IMPALA-4546: Fix Moscow timezone conversion after 2014

2017-02-15 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded a new patch set (#2).

Change subject: IMPALA-4546: Fix Moscow timezone conversion after 2014
..

IMPALA-4546: Fix Moscow timezone conversion after 2014

In 2014 Moscow timezone rules changed from UTC+3 with no DST to UTC+4
with no DST. A special case has been added to timestamp functions to
handle this.

Testing:
Added BE Expr tests.

Change-Id: Id6e3f2c9f6ba29749a26bc1087e664637bc02528
---
M be/src/exprs/expr-test.cc
M be/src/exprs/timestamp-functions.cc
M be/src/exprs/timezone_db.cc
M be/src/exprs/timezone_db.h
4 files changed, 90 insertions(+), 44 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/69/5969/2
-- 
To view, visit http://gerrit.cloudera.org:8080/5969
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id6e3f2c9f6ba29749a26bc1087e664637bc02528
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Taras Bobrovytsky 


[Impala-ASF-CR] IMPALA-4546: Fix Moscow timezone conversion after 2014

2017-02-15 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change.

Change subject: IMPALA-4546: Fix Moscow timezone conversion after 2014
..


Patch Set 1:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/5969/1/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

Line 3588:   // IMPALA-4209: Moscow time change in 2011.
> Moscow could use a test on its own, rather than adding so many lines to Tim
Done


PS1, Line 3590: cast(from_utc_timestamp('2010-10-30 22:59:00', "
  :   "'Europe/Moscow') as string)
> This could be turned into a template of calling TestStringValue with "cast(
Done


Line 3596:   TestIsNull("cast(to_utc_timestamp('2010-10-31 02:00:00', "
> Each one of these lines could use a short comment like "spring forward into
Done


http://gerrit.cloudera.org:8080/#/c/5969/1/be/src/exprs/timestamp-functions.cc
File be/src/exprs/timestamp-functions.cc:

PS1, Line 194: January 19, 1992 
> What about in 1991?
Added comment. In general, Timestamp timezone conversions are broken.


http://gerrit.cloudera.org:8080/#/c/5969/1/be/src/exprs/timezone_db.h
File be/src/exprs/timezone_db.h:

PS1, Line 41: "tz"
> 'tz', because the parameter tz is a std::string and so "tz" is a valid valu
Done


Line 45:   /// Moscow timezone UTC+3 with DST, for use before 27 March 2011.
> Between 27 March 2011 and 26 October 2014, which timezone was Moscow in?
Moscow was in TIMEZONE_MSK_PRE_2014 timezone between 2011 and 2014, which is 
UTC+4


PS1, Line 48:  October 26 2014
> "26 October 2014" for parallelism with above comment.
I changed both to Month, Day, Year.


-- 
To view, visit http://gerrit.cloudera.org:8080/5969
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id6e3f2c9f6ba29749a26bc1087e664637bc02528
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4937: Remove unused kudu scanner keep alive variable

2017-02-15 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4937: Remove unused kudu scanner keep alive variable
..


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/6021
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I99f56f93f1ff8543cbe415542464a053b24968e2
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4787: Optimize APPX MEDIAN() memory usage

2017-02-15 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4787: Optimize APPX_MEDIAN() memory usage
..


Patch Set 1:

(14 comments)

Did you verify the memory savings with a simple experiment?

http://gerrit.cloudera.org:8080/#/c/6025/1/be/src/exprs/aggregate-functions-ir.cc
File be/src/exprs/aggregate-functions-ir.cc:

Line 146: // We assume that the dst buffer has already been allocated earlier. 
This increases
can simplify with:

Same as AllocBuffer() above but for re-allocating 'dst->ptr'.


Line 150:   uint8_t* ptr = ctx->Reallocate(dst->ptr, buf_len);
DCHECK(dst->ptr != NULL);


Line 918: const static int NUM_SAMPLES = NUM_BUCKETS * NUM_SAMPLES_PER_BUCKET;
NUM_SAMPLES -> MAX_NUM_SAMPLES


Line 974:   ReservoirSample* at(int64_t idx) {
single line, const function


Line 979:   void push_back(ReservoirSample s) {
use const&, otherwise 's' is copied twice


Line 982: ReservoirSample* arr = 
reinterpret_cast(begin());
no need to reinterpret_cast


Line 987:   ReservoirSample* begin() {
const function (and elsewhere)


Line 994:   ReservoirSample* end() {
single line


Line 1005: void resizeReservoirSampleState(FunctionContext* ctx, StringVal* 
buffer) {
Why not move this function into ReservoirSampleState and make 
ReservoirSampleState auto-resize itself during push_back()? You can add ctx as 
a param to push_back.

That way all the logic is encapsulated inside ReservoirSampleState and not 
throughout various agg fn implementations


Line 1007:   // can fit 10 times as many samples, up to a maximum of 20,000. We 
do not allocate
instead of 20,000 use the constant MAX_NUM_SAMPLES


Line 1033:   // If the array gets filled due to updates or merges, we will 
reallocate a larger
suggest "we reallocate a larger buffer to hold up to a maximum of 
MAX_NUM_SAMPLES ReservoirSamples."


Line 1035:   int init_size = (NUM_SAMPLES / 1000);
It would be nice to have this encapsulated in the ReservoirSampleState as well.


Line 1061:   resizeReservoirSampleState(ctx, dst);
if ReallocBuffer() failed, dst->ptr will be in an undefined state, and may 
cause a crash

I suggest moving the reallocation logic into the push_back() of 
ReservoirSampleState. You might use the return code of push_back() to signal 
success/failure, so you can handle the failure case here.


Line 1147:   DCHECK_EQ(src_val.len, sizeof(ReservoirSampleState) +
you can add a helper function to ReservoirSampleState to do this size 
computation


-- 
To view, visit http://gerrit.cloudera.org:8080/6025
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I99adaad574d4fb0a3cf38c6cbad8b2a23df12968
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4787: Optimize APPX MEDIAN() memory usage

2017-02-15 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-4787: Optimize APPX_MEDIAN() memory usage
..


Patch Set 1:

I'll do a CR later tonight or tmr morning.

-- 
To view, visit http://gerrit.cloudera.org:8080/6025
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I99adaad574d4fb0a3cf38c6cbad8b2a23df12968
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4933: Force thrift to initialize SSL on process startup

2017-02-15 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-4933: Force thrift to initialize SSL on process startup
..


Patch Set 1:

Running jenkins job now, will also deploy to a secure cluster for additional 
testing

-- 
To view, visit http://gerrit.cloudera.org:8080/6027
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I245a8a001103ddca7f07349faa82bbb5b9fe3ab0
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4933: Force thrift to initialize SSL on process startup

2017-02-15 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/6027

Change subject: IMPALA-4933: Force thrift to initialize SSL on process startup
..

IMPALA-4933: Force thrift to initialize SSL on process startup

OpenSSL initialization functions are not threadsafe and are
currently called by both thrift, squeasel, and the Kudu
client. This change forces thrift to initialize OpenSSL on
startup by adding a TSSLSocketFactory to the AuthManager
which initializes OpenSSL upon creation and lives the
lifetime of the process. Then, squeasel is configured to
skip the OpenSSL initialization.

TODO: When the Kudu client supports a flag to disable its
initialization path (KUDU-1738), Impala will call that. In
the meantime, there will continue to be some small
likelihood of a race.

Change-Id: I245a8a001103ddca7f07349faa82bbb5b9fe3ab0
---
M be/src/rpc/authentication.cc
M be/src/rpc/authentication.h
M be/src/util/webserver.cc
3 files changed, 18 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/27/6027/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6027
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I245a8a001103ddca7f07349faa82bbb5b9fe3ab0
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 


[Impala-ASF-CR] IMPALA-4840: Fix REFRESH performance regression.

2017-02-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4840: Fix REFRESH performance regression.
..


Patch Set 4:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/278/

-- 
To view, visit http://gerrit.cloudera.org:8080/6009
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I859b9fe93563ba886d0b5db6db42a14c88caada8
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4840: Fix REFRESH performance regression.

2017-02-15 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-4840: Fix REFRESH performance regression.
..


Patch Set 4: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/6009
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I859b9fe93563ba886d0b5db6db42a14c88caada8
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4822: Implement dynamic log level changes

2017-02-15 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has uploaded a new patch set (#13).

Change subject: IMPALA-4822: Implement dynamic log level changes
..

IMPALA-4822: Implement dynamic log level changes

Very often we have to change the logging levels
of Impalads and Catalog server for debugging purposes.
Currently, there is no way to do that without a restart
of the daemons, which is not a viable option in production
deployments.

This patch addresses this supportability gap by exposing
the ability to set dynamic logging levels using a simple
web endpoint on Impalad/Catalog/Statestore web pages.

This includes setting VLOG levels (equivalent to --v flag)
as well as setting log4j levels on the Frontend and the
Catalog JVMs.

Change-Id: I588418e9bcb0b66d33138baf96207a5a35bfbd63
---
M be/src/catalog/catalog-server.cc
M be/src/service/impala-http-handler.cc
M be/src/statestore/statestore.cc
M be/src/util/jni-util.cc
M be/src/util/jni-util.h
M be/src/util/logging-support.cc
M be/src/util/logging-support.h
M common/thrift/Logging.thrift
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/util/GlogAppender.java
M tests/webserver/test_web_pages.py
A www/log_level.tmpl
12 files changed, 473 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/5792/13
-- 
To view, visit http://gerrit.cloudera.org:8080/5792
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I588418e9bcb0b66d33138baf96207a5a35bfbd63
Gerrit-PatchSet: 13
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 


[Impala-ASF-CR] IMPALA-4822: Implement dynamic log level changes

2017-02-15 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change.

Change subject: IMPALA-4822: Implement dynamic log level changes
..


Patch Set 12:

(16 comments)

Fxed a minor bug in the web UI where the setting for java log level changes 
were showing up on the statestore web UI which doesn't have a JVM.

http://gerrit.cloudera.org:8080/#/c/5792/12/be/src/catalog/catalog-server.cc
File be/src/catalog/catalog-server.cc:

Line 209: 
> remove blank line
Done


http://gerrit.cloudera.org:8080/#/c/5792/12/be/src/util/jni-util.h
File be/src/util/jni-util.h:

PS12, Line 230: the above
> explicitly say LoadJniMethod. Otherwise someone might put a method between 
haha, good point. Changed it.


PS12, Line 234: It seems these
  :   /// must be defined in the header to compile properly.
> can you remove this line (I probably wrote it...)? templates need to be in 
Done


http://gerrit.cloudera.org:8080/#/c/5792/12/be/src/util/logging-support.cc
File be/src/util/logging-support.cc:

PS12, Line 129: Status SetJavaLogLevel(const TSetJavaLogLevelParams& params, 
string* result) {
  :   RETURN_IF_ERROR(
  :   JniUtil::CallJniMethod(log4j_logger_class_, 
set_log_level_method, params, result));
  :   return Status::OK();
  : }
> Is this called in more than one place? Otherwise, let's just inline it into
Yea that is the only place. I've written this way think it would be more 
readable but I agree it looks redundant.


PS12, Line 143: const Status& status
> What I mean was: rather than constructing a status every time you want to c
Ah sorry misunderstood. Made the change.


PS12, Line 149: SetDisplayResult
> how about 'AddDocumentMember()" ?
That sounds better. Done.


PS12, Line 158: log_getclass->second == nullptr
> Please remove the comparisons to nullptr for strings, and replace with .emp
Done


PS12, Line 215: return
> Maybe make this case an error as well so that user knows what's missing.
Done


PS12, Line 223: std::
> remove
Done


PS12, Line 229: std::
> remove
Done


Line 307: void RegisterLogLevelCallbacks(const string& url, Webserver* 
webserver, bool register_log4j_handlers) {
> long line (consider using git-clang-format)
Done.


PS12, Line 310: set_glog_log
> sorry to nitpick - but set_glog_log seems repetitive. Why not set_glog_leve
Valid point. Changed.


http://gerrit.cloudera.org:8080/#/c/5792/12/be/src/util/logging-support.h
File be/src/util/logging-support.h:

PS12, Line 39: init_log4j
> update
Done


PS12, Line 40: const std::string& url
> is this needed, or is it the same for all callers?
Removed, its the same for all callers.


http://gerrit.cloudera.org:8080/#/c/5792/12/fe/src/main/java/org/apache/impala/util/GlogAppender.java
File fe/src/main/java/org/apache/impala/util/GlogAppender.java:

Line 46: 
> remove blank line
Done


http://gerrit.cloudera.org:8080/#/c/5792/12/tests/webserver/test_web_pages.py
File tests/webserver/test_web_pages.py:

PS12, Line 86: \
> remove where you have used parentheses
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/5792
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I588418e9bcb0b66d33138baf96207a5a35bfbd63
Gerrit-PatchSet: 12
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4828: Alter Kudu schema outside Impala may crash on read

2017-02-15 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4828: Alter Kudu schema outside Impala may crash on read
..


Patch Set 7:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/5840/7/tests/query_test/test_kudu.py
File tests/query_test/test_kudu.py:

Line 101:   def test_kudu_col_changed(self, cursor, kudu_client, 
unique_database):
Tests lgtm, but there seems to be a lot of repeated boilerplate stuff. Maybe 
you can take another pass and see if it makes sense to factor out some parts to 
condense the code?


Line 106: kudu_tbl_name = "impala::%s.foo" % unique_database
seems worth factoring out this function and adding a comment that it must be 
kept in sync with the FE table-name-generation behavior


-- 
To view, visit http://gerrit.cloudera.org:8080/5840
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6d43f5bb9811e728ad592933066d006c8fb4553a
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4787: Optimize APPX MEDIAN() memory usage

2017-02-15 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/6025

Change subject: IMPALA-4787: Optimize APPX_MEDIAN() memory usage
..

IMPALA-4787: Optimize APPX_MEDIAN() memory usage

Before this change, ReservoirSample functions (such as APPX_MEDIAN())
allocated memory for 20,000 elements up front per grouping key. This
caused inefficient memory usage for aggregations with many grouping
keys.

This patch fixes this by initially allocating memory for 20 elements.
Once the buffer becomes full, we allocate room for 200 elements and
copy the original buffer into the new one. We continue increasing the
buffer size this way until the buffer has room for 20,000 elements as
before.

Testing:
Ran BE and some relevant EE tests locally.
No new tests were added, the existing tests should provide enough
coverage.

Change-Id: I99adaad574d4fb0a3cf38c6cbad8b2a23df12968
---
M be/src/exprs/aggregate-functions-ir.cc
1 file changed, 124 insertions(+), 30 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/25/6025/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6025
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I99adaad574d4fb0a3cf38c6cbad8b2a23df12968
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 


[Impala-ASF-CR] IMPALA-4848: Add WIDHT BUCKET() function

2017-02-15 Thread anujphadke (Code Review)
anujphadke has posted comments on this change.

Change subject: IMPALA-4848: Add WIDHT_BUCKET() function
..


Patch Set 1:

Will add a commit message.

Results of the tests match the postgres output.
aphadke-MBP-2:~ aphadke$ 
"/Applications/Postgres.app/Contents/Versions/9.6/bin/psql" -p5432 -d "aphadke"
psql (9.6.1)
Type "help" for help.

aphadke=# 
aphadke=# 
aphadke=# 
aphadke=# 
aphadke=# select width_bucket(6.3, 2, 17, 2);
 width_bucket 
--
1
(1 row)

aphadke=# select width_bucket(11, 6, 14, 3);
 width_bucket 
--
2
(1 row)

aphadke=# select width_bucket(-1, -5, 5, 3);
 width_bucket 
--
2
(1 row)

aphadke=# select width_bucket(1, -5, 5, 3);
 width_bucket 
--
2
(1 row)

aphadke=# select width_bucket(3, 5, 20.1, 4);
 width_bucket 
--
0
(1 row)

aphadke=# select width_bucket(22, 5, 20.1, 4);
 width_bucket 
--
5
(1 row)

aphadke=# select width_bucket(NULL, 2, 14, 4);
 width_bucket 
--
 
(1 row)

aphadke=# select width_bucket(NULL, 2, 14, 0);
 width_bucket 
--
 
(1 row)

aphadke=# select width_bucket(22, 5, 20.1, 0);
ERROR:  count must be greater than zero
aphadke=# select width_bucket(22, 5, 5, 0);
ERROR:  count must be greater than zero
aphadke=#

-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke 
Gerrit-Reviewer: anujphadke 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3586: Implement union passthrough

2017-02-15 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-3586: Implement union passthrough
..


Patch Set 7:

(40 comments)

http://gerrit.cloudera.org:8080/#/c/5816/7/be/src/exec/analytic-eval-node.cc
File be/src/exec/analytic-eval-node.cc:

Line 145:   DCHECK(child(0)->row_desc().IsPrefixOfEquivalent(row_desc()));
This should be IsPrefixOf() because we sanity checking the row descriptors of 
exec nodes (and not row batches).


http://gerrit.cloudera.org:8080/#/c/5816/7/be/src/exec/union-node.cc
File be/src/exec/union-node.cc:

Line 115:   // passthrough case, the child can be closed right away because the 
row batch received
the child can be closed right away -> the child was already closed?


Line 116:   // from the child is copied (more details below).
accuracy: the row batch wasn't copied


Line 121:   if (child_idx_ < children_.size() && 
isChildPassThrough(child_idx_)) {
Suggest comment:
// Handle passthrough children. We pass 'row_batch' directly into the GetNext() 
call on the child.


Line 122: // If the rows from the current child can be passed through, the 
physical row layout
This comment doesn't seem to add anything, I suggest removing it.


Line 131: // It may be possible that the row batch is not empty, so we save 
the previous value.
More details would be helpful. If the batch has rows at this point, I'd imagine 
it can cause all sorts of other problems. How can the batch already have rows?


Line 148:   // 'needs_deep_copy' lets us safely close the child in the next 
GetNext() call.
style: 'needs_deep_copy' is not a visible variable here, suggest just saying 
"Marking the batch as needing a deep copy let's us ...


Line 154: 
DCHECK that child_idx_ is not passthrough here


http://gerrit.cloudera.org:8080/#/c/5816/7/be/src/exec/union-node.h
File be/src/exec/union-node.h:

Line 67:   // Which children can be passed through, without being materialized.
... without evaluating and materializing their exprs.


http://gerrit.cloudera.org:8080/#/c/5816/7/be/src/runtime/descriptors.h
File be/src/runtime/descriptors.h:

Line 412:   /// Return true if the physical layout of this descriptor matches 
the physical layout
matches that of other_desc


Line 413:   /// of other_desc, but not necessarily ids.
bot not necessarily the id.


Line 565:   /// of other_desc, but not necessarily ids.
but not necessarily the ids


http://gerrit.cloudera.org:8080/#/c/5816/7/be/src/service/query-options.h
File be/src/service/query-options.h:

Line 38:   TImpalaQueryOptions::DISABLE_UNION_PASSTHROUGH + 1);\
I tend to prefer ENABLE_UNION_PASSTHROUGH. To me positive phrasing is a little 
easier to understand. What do you think?


http://gerrit.cloudera.org:8080/#/c/5816/7/fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java
File fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java:

Line 227:   public boolean isEquivalent(SlotDescriptor other) {
Unfortunately, the term 'equivalent' already has a different meaning in the FE 
for slots, so it would be good to the existing term fro this new one. Maybe 
isLayoutEquivalent()?


http://gerrit.cloudera.org:8080/#/c/5816/7/fe/src/main/java/org/apache/impala/analysis/UnionStmt.java
File fe/src/main/java/org/apache/impala/analysis/UnionStmt.java:

Line 616:   public List getUnionNodeResultExprs() {
getUnionResultExprs()


http://gerrit.cloudera.org:8080/#/c/5816/7/fe/src/main/java/org/apache/impala/planner/UnionNode.java
File fe/src/main/java/org/apache/impala/planner/UnionNode.java:

Line 47:  * a child has an identical tuple layout as the output of the union 
node, the
... as the output of the union node, and the child only has naked SlotRefs as 
result exprs, then the child is marked as 'passthrough'. The rows of 
passthrough children are directly returned by the union node, instead of 
materializing the child's result exprs into new tuples.


Line 57:   protected final List resultExprs_;
unionResultExprs_ to make distinguish it from the resultExprLists_ and such


Line 73:   // If false, no batches from child nodes would be passed through in 
the backend.
Comment should describe what this flag is. Also you mean "true" right?


Line 76:   // Indicates for which child nodes batches can be passed through in 
the backend.
remove "in the backend" (it's clear that execution happens there)


Line 81:   protected UnionNode(PlanNodeId id, TupleId tupleId) {
Is this c'tor still needed? If not, remove.


Line 89:   List resultExprs, boolean isInSubplan) {
indentation, unionResultExprs


Line 182:* Compute whether we can pass through rows without materializing 
for the given child.
Can combine/simplify like this:

Returns true if rows from the child with 'childTupleIds' and 'childResultExprs' 
can be returned directly by the union node (without materialization into a new 
tuple).

Might be good to list the conditions for passthrough 

[Impala-ASF-CR] IMPALA-4848: Add WIDHT BUCKET() function

2017-02-15 Thread anujphadke (Code Review)
anujphadke has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/6023

Change subject: IMPALA-4848: Add WIDHT_BUCKET() function
..

IMPALA-4848: Add WIDHT_BUCKET() function

Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
---
M be/src/exprs/expr-test.cc
M be/src/exprs/math-functions-ir.cc
M be/src/exprs/math-functions.h
M common/function-registry/impala_functions.py
4 files changed, 32 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/6023/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke 


[Impala-ASF-CR] IMPALA-4931: Update squeasel to include patch to disable SSL init

2017-02-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4931: Update squeasel to include patch to disable SSL 
init
..


Patch Set 2: Verified-1

Build failed: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/277/

-- 
To view, visit http://gerrit.cloudera.org:8080/6006
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3d28f2a5391b8a7d39a50002ff1d96ef3d927567
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4916: Fix maintenance of set of item sets in DisjointSet.

2017-02-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4916: Fix maintenance of set of item sets in DisjointSet.
..


Patch Set 3: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/5980
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I609c8795c09becd78815605ea8e82e2f99e82212
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4828: Alter Kudu schema outside Impala may crash on read

2017-02-15 Thread Matthew Jacobs (Code Review)
Hello Thomas Tauber-Marshall, Sailesh Mukil,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5840

to look at the new patch set (#7).

Change subject: IMPALA-4828: Alter Kudu schema outside Impala may crash on read
..

IMPALA-4828: Alter Kudu schema outside Impala may crash on read

Creating a table in Impala, changing the column schema
outside of Impala, and then reading again in Impala may
result in a crash. Impala may attempt to dereference
pointers that aren't there. This happens if a string column
is dropped and then a new, non string column is added with
the old string column's name.

The Kudu scan token contains the projection schema, and that
is validated when opening the Kudu scanner (with the
exception of KUDU-1881), but the issue is that during
planning, Impala assumes the types/nullability of columns
haven't changed when creating the scan tokens. This is fixed
by adding a check when creating the scan token, and failing
the query if the column types changed. Impala then relies on
the Kudu client to properly validate that the underlying
schema is still represented by the scan token, and that
deserialization will fail if it no longer matches. Test
cases were added for this particular crash scenario, which now
fails during planning as expected. This does not attempt to
validate the Kudu client validation at deserialization time,
though that would be valuable coverage to add in the future.

Columns being removed don't produce a crash; the query fails
gracefully. A test was added for this case.

Columns being added should not affect this scenario, but a
test was added anyway.

Change-Id: I6d43f5bb9811e728ad592933066d006c8fb4553a
---
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M tests/query_test/test_kudu.py
2 files changed, 216 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/5840/7
-- 
To view, visit http://gerrit.cloudera.org:8080/5840
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6d43f5bb9811e728ad592933066d006c8fb4553a
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-4840: Fix REFRESH performance regression.

2017-02-15 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4840: Fix REFRESH performance regression.
..


Patch Set 3: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/6009
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I859b9fe93563ba886d0b5db6db42a14c88caada8
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4885: Expose Jvm thread info in web UI

2017-02-15 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change.

Change subject: IMPALA-4885: Expose Jvm thread info in web UI
..


Patch Set 1:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/6013/1/be/src/util/thread.cc
File be/src/util/thread.cc:

Line 178:   void ThreadOverviewUrlCallback(bool track_jvm_threads,
Shouldn't this go above ThreadGroupUrlCallback() to match with the sample 
jsons? I think the one right above it now corresponds to  
ThreadGroupUrlCallback()? No?


PS1, Line 194: [this, track_jvm_threads] (const Webserver::ArgumentMap& args, 
Document* doc) {
 : this->ThreadOverviewUrlCallback(track_jvm_threads, args, doc)
Make this a method? MakeUrlCallBack or something, used multiple times.


PS1, Line 327: INFO
warn/error? How about bubbling this error up to the web endpoint by setting the 
"error" member? (Other places too).


http://gerrit.cloudera.org:8080/#/c/6013/1/be/src/util/thread.h
File be/src/util/thread.h:

Line 191: /// the "thread-manager."
Describe track_jvm_threads in the method comment?


http://gerrit.cloudera.org:8080/#/c/6013/1/common/thrift/Frontend.thrift
File common/thrift/Frontend.thrift:

Line 720:   // Summary of a JVM thread. Includes stacktraces, locked monitors
Do you think its good to include the locked synchronizers or the lock it is 
blocked on? (ThreadInfo#getLockInfo() / getLockOwnerId() etc.).. Helps 
debugging hangs.


-- 
To view, visit http://gerrit.cloudera.org:8080/6013
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id497043ab33dcf107a562f0b1ccd5c46095d397f
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Henry Robinson 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4840: Fix REFRESH performance regression.

2017-02-15 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has uploaded a new patch set (#3).

Change subject: IMPALA-4840: Fix REFRESH performance regression.
..

IMPALA-4840: Fix REFRESH performance regression.

The fix for IMPALA-4172 introduced a regression in
performance of the REFRESH command. The regression
stems from the fact that we reload the block metadata
of every valid data file without considering whether it
has changed since the last load. This caused unnecessary
metadata loads for unchanged files and thus increasing
the runtime.

The fix involves having the refresh codepath (and other
operations that use the same codepath like insert etc.) to
reload the metadata of only modified files by doing a
listStatus() on the partition directory and checking the
last modified time of each file. Without this patch, we relied
on listFiles(), which fetched the block locations irrespective of
whether the file has changed and it was significantly slower on
unchanged tables. The initial/invalidate metadata load still
fetches the block locations in bulk using listFiles(). The
side effect of this change is that the refresh no longer picks up
block location changes after HDFS block rebalancing. We suggest
using "invalidate metadata" for that which loads the metadata from
scratch.

Additionally, this commit enables the reuse of metadata during
table refresh (which was disabled in IMPALA-4172) to prevent
reloading metadata from HMS everytime.

Change-Id: I859b9fe93563ba886d0b5db6db42a14c88caada8
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
2 files changed, 110 insertions(+), 32 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/09/6009/3
-- 
To view, visit http://gerrit.cloudera.org:8080/6009
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I859b9fe93563ba886d0b5db6db42a14c88caada8
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 


[Impala-ASF-CR] IMPALA-4840: Fix REFRESH performance regression.

2017-02-15 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change.

Change subject: IMPALA-4840: Fix REFRESH performance regression.
..


Patch Set 2:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/6009/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

Line 787:* for modified files on HDFS. This method uses a 
FileSystem.listStatus() call on the
> Suggest something like:
Done


Line 789:* block locations using FileSystem.getFileBlockLocations() method. 
The initial table
> the FileSystem.getFileBlockLocations() method
Done


Line 793:* (up to ~40x slower in some cases) and hence it is implemented 
this way to optimize
> suggest moving this wording to the top as suggested above
Done


Line 836:   if (unknownDiskIdCount > 0 && LOG.isWarnEnabled()) {
> remove the disk id warning as you suggested
I still retained it in the initial table load using loadBlockMetadata(). I 
think that helps debugging. Removed it here though.


-- 
To view, visit http://gerrit.cloudera.org:8080/6009
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I859b9fe93563ba886d0b5db6db42a14c88caada8
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4840: Fix REFRESH performance regression.

2017-02-15 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4840: Fix REFRESH performance regression.
..


Patch Set 2:

(4 comments)

Getting close.

http://gerrit.cloudera.org:8080/#/c/6009/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

Line 787:* for modified files on HDFS. This method uses a 
FileSystem.listStatus() call on the
Suggest something like:

This method is optimized for the case where the files in the partition have not 
changed dramatically. It first uses FileSystem.listStatus() ... 

then you can mention the perf difference between these functions


Line 789:* block locations using FileSystem.getFileBlockLocations() method. 
The initial table
the FileSystem.getFileBlockLocations() method


Line 793:* (up to ~40x slower in some cases) and hence it is implemented 
this way to optimize
suggest moving this wording to the top as suggested above


Line 836:   if (unknownDiskIdCount > 0 && LOG.isWarnEnabled()) {
remove the disk id warning as you suggested


-- 
To view, visit http://gerrit.cloudera.org:8080/6009
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I859b9fe93563ba886d0b5db6db42a14c88caada8
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4828: Alter Kudu schema outside Impala may crash on read

2017-02-15 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-4828: Alter Kudu schema outside Impala may crash on read
..


Patch Set 6:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/5840/2/be/src/exec/kudu-scanner.cc
File be/src/exec/kudu-scanner.cc:

Line 149: 
> I think Thomas is right and the query may crash if there is a difference in
I couldn't reproduce any failures, but looking at TupleDescriptor I agree it 
seems like it could be possible to crash. Good call Thomas & Alex. I'll check 
that here too.

Will the Kudu-side validation check that the projection is 100% identical to 
what the scan token expects? Nullability etc. included?
In theory it should, thought it looks like it's missing on the Kudu side as 
well, so I filed a JIRA on them https://issues.apache.org/jira/browse/KUDU-1881 
. Thanks for pointing this out.


http://gerrit.cloudera.org:8080/#/c/5840/6/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
File fe/src/main/java/org/apache/impala/planner/KuduScanNode.java:

Line 161:   if (!colType.matchesType(kuduColType)) {
> Shouldn't this use equals() instead of matchesType()?
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/5840
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6d43f5bb9811e728ad592933066d006c8fb4553a
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4840: Fix REFRESH performance regression.

2017-02-15 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has uploaded a new patch set (#2).

Change subject: IMPALA-4840: Fix REFRESH performance regression.
..

IMPALA-4840: Fix REFRESH performance regression.

The fix for IMPALA-4172 introduced a regression in
performance of the REFRESH command. The regression
stems from the fact that we reload the block metadata
of every valid data file without considering whether it
has changed since the last load. This caused unnecessary
metadata loads for unchanged files and thus increasing
the runtime.

The fix involves having the refresh codepath (and other
operations that use the same codepath like insert etc.) to
reload the metadata of only modified files by doing a
listStatus() on the partition directory and checking the
last modified time of each file. Without this patch, we relied
on listFiles(), which fetched the block locations irrespective of
whether the file has changed and it was significantly slower on
unchanged tables. The initial/invalidate metadata load still
fetches the block locations in bulk using listFiles(). The
side effect of this change is that the refresh no longer picks up
block location changes after HDFS block rebalancing. We suggest
using "invalidate metadata" for that which loads the metadata from
scratch.

Additionally, this commit enables the reuse of metadata during
table refresh (which was disabled in IMPALA-4172) to prevent
reloading metadata from HMS everytime.

Change-Id: I859b9fe93563ba886d0b5db6db42a14c88caada8
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
2 files changed, 116 insertions(+), 32 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/09/6009/2
-- 
To view, visit http://gerrit.cloudera.org:8080/6009
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I859b9fe93563ba886d0b5db6db42a14c88caada8
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 


[Impala-ASF-CR] IMPALA-4931: Update squeasel to include patch to disable SSL init

2017-02-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4931: Update squeasel to include patch to disable SSL 
init
..


Patch Set 2:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/277/

-- 
To view, visit http://gerrit.cloudera.org:8080/6006
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3d28f2a5391b8a7d39a50002ff1d96ef3d927567
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4931: Update squeasel to include patch to disable SSL init

2017-02-15 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-4931: Update squeasel to include patch to disable SSL 
init
..


Patch Set 2: Code-Review+2

removed the old cmake line and rebased

-- 
To view, visit http://gerrit.cloudera.org:8080/6006
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3d28f2a5391b8a7d39a50002ff1d96ef3d927567
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4931: Update squeasel to include patch to disable SSL init

2017-02-15 Thread Matthew Jacobs (Code Review)
Hello Henry Robinson,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6006

to look at the new patch set (#2).

Change subject: IMPALA-4931: Update squeasel to include patch to disable SSL 
init
..

IMPALA-4931: Update squeasel to include patch to disable SSL init

Update Squeasel in thirdparty to
c304d3f3481b07bf153979155f02e0aab24d01de

Change-Id: I3d28f2a5391b8a7d39a50002ff1d96ef3d927567
---
M be/src/thirdparty/squeasel/squeasel.c
M be/src/util/CMakeLists.txt
2 files changed, 36 insertions(+), 26 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/6006/2
-- 
To view, visit http://gerrit.cloudera.org:8080/6006
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3d28f2a5391b8a7d39a50002ff1d96ef3d927567
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] Revert "IMPALA-4829: Change default Kudu read behavior for "RYW""

2017-02-15 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: Revert "IMPALA-4829: Change default Kudu read behavior for 
"RYW""
..


Patch Set 3: Code-Review+2

rebase

-- 
To view, visit http://gerrit.cloudera.org:8080/5970
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I995dec543946c9e0f79bc5b7e82568060a9d8262
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: No


[Impala-ASF-CR] Revert "IMPALA-4829: Change default Kudu read behavior for "RYW""

2017-02-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: Revert "IMPALA-4829: Change default Kudu read behavior for 
"RYW""
..


Patch Set 3:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/276/

-- 
To view, visit http://gerrit.cloudera.org:8080/5970
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I995dec543946c9e0f79bc5b7e82568060a9d8262
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: No


[Impala-ASF-CR] Revert "IMPALA-4829: Change default Kudu read behavior for "RYW""

2017-02-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: Revert "IMPALA-4829: Change default Kudu read behavior for 
"RYW""
..


Patch Set 2: Verified-1

Build failed: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/274/

-- 
To view, visit http://gerrit.cloudera.org:8080/5970
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I995dec543946c9e0f79bc5b7e82568060a9d8262
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4916: Fix maintenance of set of item sets in DisjointSet.

2017-02-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4916: Fix maintenance of set of item sets in DisjointSet.
..


Patch Set 3:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/275/

-- 
To view, visit http://gerrit.cloudera.org:8080/5980
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I609c8795c09becd78815605ea8e82e2f99e82212
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4916: Fix maintenance of set of item sets in DisjointSet.

2017-02-15 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4916: Fix maintenance of set of item sets in DisjointSet.
..


Patch Set 3: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/5980
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I609c8795c09becd78815605ea8e82e2f99e82212
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4916: Fix maintenance of set of item sets in DisjointSet.

2017-02-15 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4916: Fix maintenance of set of item sets in DisjointSet.
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/5980/2/fe/src/main/java/org/apache/impala/util/DisjointSet.java
File fe/src/main/java/org/apache/impala/util/DisjointSet.java:

PS2, Line 49:  private final IdentityHashMap uniqueSets_ =
:   new IdentityHashMap();
> nice idea :)
It was pretty much your idea :)


PS2, Line 111: checkState(removedValue == DUMMY_VALUE);
> I think checking for not NULL is also ok.
This one seems more explicit because if we got back a non-NULL value that is 
not DUMMY_VALUE, then something bad has happened.


-- 
To view, visit http://gerrit.cloudera.org:8080/5980
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I609c8795c09becd78815605ea8e82e2f99e82212
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Revert "IMPALA-4829: Change default Kudu read behavior for "RYW""

2017-02-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: Revert "IMPALA-4829: Change default Kudu read behavior for 
"RYW""
..


Patch Set 2:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/274/

-- 
To view, visit http://gerrit.cloudera.org:8080/5970
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I995dec543946c9e0f79bc5b7e82568060a9d8262
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4885: Expose Jvm thread info in web UI

2017-02-15 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-4885: Expose Jvm thread info in web UI
..


Patch Set 1:

A publicly available screenshot: 
https://drive.google.com/file/d/0B7VW0hRyzVlKUGdWakNIdlhtbTQ/view?usp=sharing

-- 
To view, visit http://gerrit.cloudera.org:8080/6013
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id497043ab33dcf107a562f0b1ccd5c46095d397f
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Henry Robinson 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2020: Make it easy to work with big numbers

2017-02-15 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-2020: Make it easy to work with big numbers
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5902/4/be/src/util/decimal-util.cc
File be/src/util/decimal-util.cc:

Line 25: static constexpr T AllNines(unsigned precision) {
If it's going to be this specific, why not

const int128_t DecimalUtil::MAX_UNSCALED_DECIMAL16 = 99 + 100 *
(MAX_UNSCALED_DECIMAL8 + (1 + MAX_UNSCALED_DECIMAL8) *
 static_cast(MAX_UNSCALED_DECIMAL8));


-- 
To view, visit http://gerrit.cloudera.org:8080/5902
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5095a366d914cebb0b64bd434a08dbb55c90ed30
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4885: Expose Jvm thread info in web UI

2017-02-15 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-4885: Expose Jvm thread info in web UI
..


Patch Set 1:

This is super useful. Could you post the screenshot somewhere that's publicly 
accessible?

-- 
To view, visit http://gerrit.cloudera.org:8080/6013
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id497043ab33dcf107a562f0b1ccd5c46095d397f
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Henry Robinson 
Gerrit-HasComments: No


[Impala-ASF-CR](asf-site) A blog post about IMPALA-4916

2017-02-15 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: A blog post about IMPALA-4916
..


Patch Set 3:

> (1 comment)
 > 
 > So, I believe that all of the pages still point to the old blog
 > site. This patch doesn't seem to address that just yet. Was that by
 > intention?

It wasn't but now that I'm thinking about it, I'd say this might be the right 
option. Maybe we should get a few posts +2ed and ready to commit, as well as a 
sustainable cadence going, before posting. What do you think?

-- 
To view, visit http://gerrit.cloudera.org:8080/5995
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifa90bf5621ef6466a4821f77a6e8a8b20c5512ae
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: asf-site
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4916: Fix maintenance of set of item sets in DisjointSet.

2017-02-15 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-4916: Fix maintenance of set of item sets in DisjointSet.
..


Patch Set 2: Code-Review+1

(2 comments)

http://gerrit.cloudera.org:8080/#/c/5980/2/fe/src/main/java/org/apache/impala/util/DisjointSet.java
File fe/src/main/java/org/apache/impala/util/DisjointSet.java:

PS2, Line 49:  private final IdentityHashMap uniqueSets_ =
:   new IdentityHashMap();
nice idea :)


PS2, Line 111: checkState(removedValue == DUMMY_VALUE);
I think checking for not NULL is also ok.


-- 
To view, visit http://gerrit.cloudera.org:8080/5980
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I609c8795c09becd78815605ea8e82e2f99e82212
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: Yes