[Impala-ASF-CR] Pin gen build version's git handling to typical git dir.

2017-11-14 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8500 )

Change subject: Pin gen_build_version's git handling to typical git dir.
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8500
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I98870a1cb2073ddcf9ba1620e3801e1964930edd
Gerrit-Change-Number: 8500
Gerrit-PatchSet: 2
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Wed, 15 Nov 2017 04:20:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] Pin gen build version's git handling to typical git dir.

2017-11-13 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8500 )

Change subject: Pin gen_build_version's git handling to typical git dir.
..


Patch Set 1:

Sorry, didn't realize this was the upstream commit.  What I meant to say was we 
had exactly the same issue where the wrong impala-lzo bits were getting used.  
The result was exactly the same compile failure:

hdfs-lzo-text-scanner.cc:589:44: error: no matching function for call to 
'impala::ScannerContext::ReleaseCompletedResources(bool)'
   context_->ReleaseCompletedResources(false);


--
To view, visit http://gerrit.cloudera.org:8080/8500
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I98870a1cb2073ddcf9ba1620e3801e1964930edd
Gerrit-Change-Number: 8500
Gerrit-PatchSet: 1
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Mon, 13 Nov 2017 22:44:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] Pin gen build version's git handling to typical git dir.

2017-11-13 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8500 )

Change subject: Pin gen_build_version's git handling to typical git dir.
..


Patch Set 1:

Sounds like https://jira.cloudera.com/browse/RELENG-2864


--
To view, visit http://gerrit.cloudera.org:8080/8500
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I98870a1cb2073ddcf9ba1620e3801e1964930edd
Gerrit-Change-Number: 8500
Gerrit-PatchSet: 1
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Mon, 13 Nov 2017 22:23:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6148: Specifying thirdparty deps as URLs

2017-11-07 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8456 )

Change subject: IMPALA-6148: Specifying thirdparty deps as URLs
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8456/2/bin/bootstrap_toolchain.py
File bin/bootstrap_toolchain.py:

http://gerrit.cloudera.org:8080/#/c/8456/2/bin/bootstrap_toolchain.py@112
PS2, Line 112:   return v
Why not cache v instead?



--
To view, visit http://gerrit.cloudera.org:8080/8456
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4628d86022d4bd8b762313f7056d76416a58b422
Gerrit-Change-Number: 8456
Gerrit-PatchSet: 2
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Tue, 07 Nov 2017 16:45:30 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5607: Add additional units to EXTRACT, DATE PART, TRUNC

2017-10-30 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8311 )

Change subject: IMPALA-5607: Add additional units to EXTRACT, DATE_PART, TRUNC
..


Patch Set 4:

(3 comments)

Let's figure out how to address MILLISECONDS in the JIRA.  We also should 
figure out what to do about rounding vs. truncation when converting to less 
precise units.

http://gerrit.cloudera.org:8080/#/c/8311/4/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/8311/4/be/src/exprs/expr-test.cc@6037
PS4, Line 6037: TYPE_BIGINT, 28123);
This is a compatibility breaking change; we need to document the issue.


http://gerrit.cloudera.org:8080/#/c/8311/4/be/src/exprs/udf-builtins-ir.cc
File be/src/exprs/udf-builtins-ir.cc:

http://gerrit.cloudera.org:8080/#/c/8311/4/be/src/exprs/udf-builtins-ir.cc@a193
PS4, Line 193:
This breaks compatibility; let's discuss what the most appropriate resolution 
is.


http://gerrit.cloudera.org:8080/#/c/8311/4/be/src/exprs/udf-builtins-ir.cc@134
PS4, Line 134:   return time.fractional_seconds() / NANOS_PER_MICRO / 
MICROS_PER_MILLI + time.seconds() * MILLIS_PER_SEC;
Too confusing - please use / (NANOS_PER_MICRO * MICROS_PER_MILLI)



--
To view, visit http://gerrit.cloudera.org:8080/8311
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If23ea310ddcc5c1c806b2b1ac24d24a4bd6aa4d9
Gerrit-Change-Number: 8311
Gerrit-PatchSet: 4
Gerrit-Owner: Kim Jin Chul 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Tue, 31 Oct 2017 00:19:57 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5607: Add additional units to EXTRACT, DATE PART, TRUNC

2017-10-26 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8311 )

Change subject: IMPALA-5607: Add additional units to EXTRACT, DATE_PART, TRUNC
..


Patch Set 2:

Looks like we want NANOSECONDS after all, so the BIGINT change is needed.


--
To view, visit http://gerrit.cloudera.org:8080/8311
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If23ea310ddcc5c1c806b2b1ac24d24a4bd6aa4d9
Gerrit-Change-Number: 8311
Gerrit-PatchSet: 2
Gerrit-Owner: Kim Jin Chul 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Thu, 26 Oct 2017 23:24:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5607: Add additional units to EXTRACT, DATE PART, TRUNC

2017-10-26 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8311 )

Change subject: IMPALA-5607: Add additional units to EXTRACT, DATE_PART, TRUNC
..


Patch Set 2:

(6 comments)

Overall this looks good - and thank you for the tests.  It is worth debating 
whether NANOSECONDs should be supported at all - let's do that in the JIRA.

http://gerrit.cloudera.org:8080/#/c/8311/2/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/8311/2/be/src/exprs/expr-test.cc@5959
PS2, Line 5959:   TestStringValue(
I don't think the additional test cases add any value unless you add greater 
precision, i.e.

"cast(trunc(cast('2012-09-10 07:02:03.1234' as timestamp), 'SS') as string)"


http://gerrit.cloudera.org:8080/#/c/8311/2/be/src/exprs/udf-builtins-ir.cc
File be/src/exprs/udf-builtins-ir.cc:

http://gerrit.cloudera.org:8080/#/c/8311/2/be/src/exprs/udf-builtins-ir.cc@120
PS2, Line 120:   return time.fractional_seconds() + time.seconds() * 1000 * 
1000 * 1000;
> The seconds field, including fractional parts, multiplied by 1 000 000 000;
Use NANOS_PER_SEC


http://gerrit.cloudera.org:8080/#/c/8311/2/be/src/exprs/udf-builtins-ir.cc@125
PS2, Line 125:   return ExtractNanosecond(time) / 1000;
Instead of dividing the result of another function, these should be implement 
directly, i.e.

  time.fractional_seconds() / NANOS_PER_MICRO + time.seconds() * MICROS_PER_SEC


http://gerrit.cloudera.org:8080/#/c/8311/2/be/src/exprs/udf-builtins-ir.cc@130
PS2, Line 130:   return ExtractMicrosecond(time) / 1000;
same comment applies


http://gerrit.cloudera.org:8080/#/c/8311/2/be/src/exprs/udf-builtins-ir.cc@154
PS2, Line 154: BigIntVal
> Return type should be changed from INT to BIGINT because INT data type's ra
We only need to change this if we actually implement nanosecond units for 
EXTRACT; no other referenced system in the JIRA currently supports that, so it 
is worth debating whether we should or not.


http://gerrit.cloudera.org:8080/#/c/8311/2/common/thrift/Exprs.thrift
File common/thrift/Exprs.thrift:

http://gerrit.cloudera.org:8080/#/c/8311/2/common/thrift/Exprs.thrift@92
PS2, Line 92:   MILLISECONDS,
We shouldn't have redundant field definitions here; instead we should be 
lenient with parsing and convert MILISECONDS to MILISECOND.



--
To view, visit http://gerrit.cloudera.org:8080/8311
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If23ea310ddcc5c1c806b2b1ac24d24a4bd6aa4d9
Gerrit-Change-Number: 8311
Gerrit-PatchSet: 2
Gerrit-Owner: Kim Jin Chul 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Thu, 26 Oct 2017 23:13:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6068: Fix dataload for complextypes fileformat

2017-10-24 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8350 )

Change subject: IMPALA-6068: Fix dataload for complextypes_fileformat
..


Patch Set 5: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8350
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7152306b2907198204a6d8d282a0bad561129b82
Gerrit-Change-Number: 8350
Gerrit-PatchSet: 5
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Tue, 24 Oct 2017 22:02:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6055: Fix hdfs encryption test far Hadoop 2.8+

2017-10-20 Thread Zach Amsden (Code Review)
Zach Amsden has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/8274 )

Change subject: IMPALA-6055: Fix hdfs encryption test far Hadoop 2.8+
..

IMPALA-6055: Fix hdfs encryption test far Hadoop 2.8+

Hadoop changed behavior regarding encrypted partitions and started
automatically provisioning .Trash directories in encrypted partitions.
This interplays poorly with the current test.

Change-Id: I30234aa50fea93f316e75beea2ced002dcea0c24
Reviewed-on: http://gerrit.cloudera.org:8080/8274
Tested-by: Impala Public Jenkins
Reviewed-by: Tim Armstrong 
---
M tests/metadata/test_hdfs_encryption.py
1 file changed, 28 insertions(+), 7 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Tim Armstrong: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/8274
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I30234aa50fea93f316e75beea2ced002dcea0c24
Gerrit-Change-Number: 8274
Gerrit-PatchSet: 5
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-6055: Fix hdfs encryption test far Hadoop 2.8+

2017-10-19 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8274 )

Change subject: IMPALA-6055: Fix hdfs encryption test far Hadoop 2.8+
..


Patch Set 4:

Looks like I finally got a good GVO on this.

Tim, is it alright if I carry the +2 on this, or would you rather I fix the 
long lines?  (Looks like there are a lot of long lines in this particular test 
however).


--
To view, visit http://gerrit.cloudera.org:8080/8274
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I30234aa50fea93f316e75beea2ced002dcea0c24
Gerrit-Change-Number: 8274
Gerrit-PatchSet: 4
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Thu, 19 Oct 2017 15:09:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6070: Parallel data load.

2017-10-18 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8320 )

Change subject: IMPALA-6070: Parallel data load.
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8320/1/testdata/bin/create-load-data.sh
File testdata/bin/create-load-data.sh:

http://gerrit.cloudera.org:8080/#/c/8320/1/testdata/bin/create-load-data.sh@492
PS1, Line 492:   run-step "Loading auxiliary workloads" load-aux-workloads.log 
load-aux-workloads
I don't see any reason this also couldn't run in parallel.



--
To view, visit http://gerrit.cloudera.org:8080/8320
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I836c4e1586f229621c102c4f4ba22ce7224ab9ac
Gerrit-Change-Number: 8320
Gerrit-PatchSet: 1
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Wed, 18 Oct 2017 23:44:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6055: Fix hdfs encryption test far Hadoop 2.8+

2017-10-18 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8274 )

Change subject: IMPALA-6055: Fix hdfs encryption test far Hadoop 2.8+
..


Patch Set 4:

Anyone know what's going on with this change not going through verification?  I 
got a pass on private build and test, and the failures here appear Kerberos 
related.  Trying a manual rebase...


-- 
To view, visit http://gerrit.cloudera.org:8080/8274
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I30234aa50fea93f316e75beea2ced002dcea0c24
Gerrit-Change-Number: 8274
Gerrit-PatchSet: 4
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Wed, 18 Oct 2017 21:51:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6055: Fix hdfs encryption test far Hadoop 2.8+

2017-10-18 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8274 )

Change subject: IMPALA-6055: Fix hdfs encryption test far Hadoop 2.8+
..


Patch Set 2:

Looks like a transient failure, retrying


--
To view, visit http://gerrit.cloudera.org:8080/8274
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I30234aa50fea93f316e75beea2ced002dcea0c24
Gerrit-Change-Number: 8274
Gerrit-PatchSet: 2
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Wed, 18 Oct 2017 16:05:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6055: Fix hdfs encryption test far Hadoop 2.8+

2017-10-13 Thread Zach Amsden (Code Review)
Hello Philip Zeyliger, Joe McDonnell,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8274

to look at the new patch set (#2).

Change subject: IMPALA-6055: Fix hdfs encryption test far Hadoop 2.8+
..

IMPALA-6055: Fix hdfs encryption test far Hadoop 2.8+

Hadoop changed behavior regarding encrypted partitions and started
automatically provisioning .Trash directories in encrypted partitions.
This interplays poorly with the current test.

Change-Id: I30234aa50fea93f316e75beea2ced002dcea0c24
---
M tests/metadata/test_hdfs_encryption.py
1 file changed, 28 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/74/8274/2
--
To view, visit http://gerrit.cloudera.org:8080/8274
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I30234aa50fea93f316e75beea2ced002dcea0c24
Gerrit-Change-Number: 8274
Gerrit-PatchSet: 2
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 


[Impala-ASF-CR] IMPALA-6055: Fix hdfs encryption test far Hadoop 2.8+

2017-10-13 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8274 )

Change subject: IMPALA-6055: Fix hdfs encryption test far Hadoop 2.8+
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8274/1/tests/metadata/test_hdfs_encryption.py
File tests/metadata/test_hdfs_encryption.py:

http://gerrit.cloudera.org:8080/#/c/8274/1/tests/metadata/test_hdfs_encryption.py@191
PS1, Line 191: # exists. This behavior is expected due to the difference in 
encryption zones
> "the difference in encryption zones between the .Trash and the warehouse di
Done


http://gerrit.cloudera.org:8080/#/c/8274/1/tests/metadata/test_hdfs_encryption.py@198
PS1, Line 198:   # New HDFS behavior succeeds the query and creates trash; 
the partition removal
> s/New HDFS/HDFS 2.8+/?
Done



--
To view, visit http://gerrit.cloudera.org:8080/8274
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I30234aa50fea93f316e75beea2ced002dcea0c24
Gerrit-Change-Number: 8274
Gerrit-PatchSet: 1
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Fri, 13 Oct 2017 23:10:09 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6055: Fix hdfs encryption test far Hadoop 2.8+

2017-10-13 Thread Zach Amsden (Code Review)
Zach Amsden has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/8274


Change subject: IMPALA-6055: Fix hdfs encryption test far Hadoop 2.8+
..

IMPALA-6055: Fix hdfs encryption test far Hadoop 2.8+

Hadoop changed behavior regarding encrypted partitions and started
automatically provisioning .Trash directories in encrypted partitions.
This interplays poorly with the current test.

N.B. this is probably worth fixing upstream.

Change-Id: I30234aa50fea93f316e75beea2ced002dcea0c24
---
M tests/metadata/test_hdfs_encryption.py
1 file changed, 27 insertions(+), 7 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/74/8274/1
--
To view, visit http://gerrit.cloudera.org:8080/8274
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I30234aa50fea93f316e75beea2ced002dcea0c24
Gerrit-Change-Number: 8274
Gerrit-PatchSet: 1
Gerrit-Owner: Zach Amsden 


[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

2017-09-28 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7438 )

Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
..


Patch Set 6:

The algorithm mentioned here, on page 17 seems quite useful:

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.111.9736=rep1=pdf


--
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-Change-Number: 7438
Gerrit-PatchSet: 6
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Thu, 28 Sep 2017 17:28:14 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

2017-09-26 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7438 )

Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
..


Patch Set 6:

I totally agree about doing anything fancy as a followup.  In particular, I did 
not find anything at all useful in AVX to help.  There is no vectored multiply 
large enough for this, and FMA operations won't help either as they basically 
only help with precision loss on floating point types.

Considering the perf difference is so extreme in some cases, I think we should 
strongly consider either living with the broken behavior until we can come up 
with a solution, or else verifying with users making use of the DECIMAL_V2 
feature that this will not be a problem.


--
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-Change-Number: 7438
Gerrit-PatchSet: 6
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Wed, 27 Sep 2017 03:00:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5975: Work around broken beeline clients

2017-09-26 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8132 )

Change subject: IMPALA-5975: Work around broken beeline clients
..


Patch Set 1:

>From private build and test 6418,

14:21:05 Executing: create-load-data.sh
14:21:05 Loading Hive Builtins (logging to 
/data/jenkins/workspace/impala-private-build-and-test/repos/Impala/logs/data_loading/load-hive-builtins.log)...
14:23:20 OK (Took: 2 min 15 sec)
14:23:20 Generating HBase data (logging to 
/data/jenkins/workspace/impala-private-build-and-test/repos/Impala/logs/data_loading/create-hbase.log)...
 
14:23:49 OK (Took: 0 min 29 sec)
14:23:49 Creating /test-warehouse HDFS directory (logging to 
/data/jenkins/workspace/impala-private-build-and-test/repos/Impala/logs/data_loading/create-test-warehouse-dir.log)...
14:23:53 OK (Took: 0 min 4 sec)
14:23:53 Derived params for create-load-data.sh:
14:23:53 EXPLORATION_STRATEGY=exhaustive
14:23:53 SKIP_METADATA_LOAD=0
14:23:53 SKIP_SNAPSHOT_LOAD=0
14:23:53 SNAPSHOT_FILE=
14:23:53 CM_HOST=
14:23:53 REMOTE_LOAD=
14:23:53 Starting Impala cluster (logging to 
/data/jenkins/workspace/impala-private-build-and-test/repos/Impala/logs/data_loading/start-impala-cluster.log)...
14:24:06 OK (Took: 0 min 13 sec)
14:24:06 Setting up HDFS environment (logging to 
/data/jenkins/workspace/impala-private-build-and-test/repos/Impala/logs/data_loading/setup-hdfs-env.log)...
14:24:21 OK (Took: 0 min 15 sec)
14:24:21 Loading custom schemas (logging to 
/data/jenkins/workspace/impala-private-build-and-test/repos/Impala/logs/data_loading/load-custom-schemas.log)...
14:25:24 OK (Took: 1 min 3 sec)
14:25:24 Loading functional-query data (logging to 
/data/jenkins/workspace/impala-private-build-and-test/repos/Impala/logs/data_loading/load-functional-query.log)...
 
15:17:32 OK (Took: 52 min 8 sec)
15:17:32 Loading TPC-H data (logging to 
/data/jenkins/workspace/impala-private-build-and-test/repos/Impala/logs/data_loading/load-tpch.log)...
15:42:56 OK (Took: 25 min 24 sec)
15:42:56 Loading nested data (logging to 
/data/jenkins/workspace/impala-private-build-and-test/repos/Impala/logs/data_loading/load-nested.log)...
15:48:33 OK (Took: 5 min 37 sec)
15:48:33 Loading TPC-DS data (logging to 
/data/jenkins/workspace/impala-private-build-and-test/repos/Impala/logs/data_loading/load-tpcds.log)...
16:16:20 OK (Took: 27 min 46 sec)
16:16:20 Loading auxiliary workloads (logging to 
/data/jenkins/workspace/impala-private-build-and-test/repos/Impala/logs/data_loading/load-aux-workloads.log)...
16:27:10 OK (Took: 10 min 51 sec)
16:27:10 Loading dependent tables (logging to 
/data/jenkins/workspace/impala-private-build-and-test/repos/Impala/logs/data_loading/copy-and-load-dependent-tables.log)...
16:28:18 OK (Took: 1 min 7 sec)
16:28:18 Loading custom data (logging to 
/data/jenkins/workspace/impala-private-build-and-test/repos/Impala/logs/data_loading/load-custom-data.log)...
16:30:16 OK (Took: 1 min 58 sec)
16:30:16 Creating many block table (logging to 
/data/jenkins/workspace/impala-private-build-and-test/repos/Impala/logs/data_loading/create-table-many-blocks.log)...
16:31:18 OK (Took: 1 min 2 sec)
16:31:18 Loading Kudu functional (logging to 
/data/jenkins/workspace/impala-private-build-and-test/repos/Impala/logs/data_loading/load-kudu.log)...
16:35:57 OK (Took: 4 min 39 sec)
16:35:57 Loading Kudu TPCH (logging to 
/data/jenkins/workspace/impala-private-build-and-test/repos/Impala/logs/data_loading/load-kudu-tpch.log)...
16:42:17 OK (Took: 6 min 20 sec)
16:42:17 Loading Hive UDFs (logging to 
/data/jenkins/workspace/impala-private-build-and-test/repos/Impala/logs/data_loading/build-and-copy-hive-udfs.log)...
16:43:46 OK (Took: 1 min 29 sec)
16:43:46 Running custom post-load steps (logging to 
/data/jenkins/workspace/impala-private-build-and-test/repos/Impala/logs/data_loading/custom-post-load-steps.log)...
16:44:20 OK (Took: 0 min 34 sec)
16:44:20 Caching test tables (logging to 
/data/jenkins/workspace/impala-private-build-and-test/repos/Impala/logs/data_loading/cache-test-tables.log)...
16:44:28 OK (Took: 0 min 8 sec)
16:44:28 Loading external data sources (logging to 
/data/jenkins/workspace/impala-private-build-and-test/repos/Impala/logs/data_loading/load-ext-data-source.log)...
16:44:39 OK (Took: 0 min 11 sec)
16:44:39 Splitting HBase (logging to 
/data/jenkins/workspace/impala-private-build-and-test/repos/Impala/logs/data_loading/create-hbase.log)...
16:45:15 OK (Took: 0 min 36 sec)
16:45:15 Creating internal HBase table (logging to 
/data/jenkins/workspace/impala-private-build-and-test/repos/Impala/logs/data_loading/create-internal-hbase-table.log)...
16:46:01 OK (Took: 0 min 46 sec)
16:46:01 Computing table stats (logging to 
/data/jenkins/workspace/impala-private-build-and-test/repos/Impala/logs/data_loading/compute-table-stats.log)...
16:58:20 OK 

[Impala-ASF-CR] IMPALA-5975: Work around broken beeline clients

2017-09-25 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8132 )

Change subject: IMPALA-5975: Work around broken beeline clients
..


Patch Set 1:

I tested this with a local data load, but we need to kick off a GVO run as well.


--
To view, visit http://gerrit.cloudera.org:8080/8132
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id8b9f3dde4445513f1f389785a002c6cc6b3dada
Gerrit-Change-Number: 8132
Gerrit-PatchSet: 1
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Mon, 25 Sep 2017 19:56:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5975: Work around broken beeline clients

2017-09-22 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8132 )

Change subject: IMPALA-5975: Work around broken beeline clients
..


Patch Set 1:

I will do that now.


--
To view, visit http://gerrit.cloudera.org:8080/8132
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id8b9f3dde4445513f1f389785a002c6cc6b3dada
Gerrit-Change-Number: 8132
Gerrit-PatchSet: 1
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Sat, 23 Sep 2017 02:10:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5975: Work around broken beeline clients

2017-09-22 Thread Zach Amsden (Code Review)
Zach Amsden has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/8132


Change subject: IMPALA-5975: Work around broken beeline clients
..

IMPALA-5975: Work around broken beeline clients

To make statements execute, some clients require always appending
a semi-colon to the end.  The workaround is quite simple.

Change-Id: Id8b9f3dde4445513f1f389785a002c6cc6b3dada
---
M testdata/bin/create-load-data.sh
M testdata/bin/create-table-many-blocks.sh
2 files changed, 4 insertions(+), 4 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/32/8132/1
--
To view, visit http://gerrit.cloudera.org:8080/8132
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Id8b9f3dde4445513f1f389785a002c6cc6b3dada
Gerrit-Change-Number: 8132
Gerrit-PatchSet: 1
Gerrit-Owner: Zach Amsden 


[Impala-ASF-CR] IMPALA-4513: Promote integer types for ABS()

2017-09-22 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8004 )

Change subject: IMPALA-4513: Promote integer types for ABS()
..


Patch Set 7:

Finally got back to this, looks like this was the only broken test.


--
To view, visit http://gerrit.cloudera.org:8080/8004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I86cc880e78258d5f90471bd8af4caeb4305eed77
Gerrit-Change-Number: 8004
Gerrit-PatchSet: 7
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Fri, 22 Sep 2017 22:32:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4513: Promote integer types for ABS()

2017-09-22 Thread Zach Amsden (Code Review)
Hello Lars Volker, Michael Brown, Tim Armstrong, Alex Behm, Dan Hecht, Impala 
Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8004

to look at the new patch set (#6).

Change subject: IMPALA-4513: Promote integer types for ABS()
..

IMPALA-4513: Promote integer types for ABS()

The internal representation of the most negative number
in two's complement requires 1 more bit to represent the
positive version.  This means ABS() must promote integer
types to the next highest width.

Change-Id: I86cc880e78258d5f90471bd8af4caeb4305eed77
---
M be/src/exprs/expr-test.cc
M be/src/exprs/math-functions-ir.cc
M be/src/exprs/math-functions.h
M common/function-registry/impala_functions.py
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
5 files changed, 41 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/8004/6
--
To view, visit http://gerrit.cloudera.org:8080/8004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I86cc880e78258d5f90471bd8af4caeb4305eed77
Gerrit-Change-Number: 8004
Gerrit-PatchSet: 6
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

2017-09-20 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
..


Patch Set 5:

Have we ever tried using SSE/AVX for multiplication?  It should be possible to 
avoid using int256_t altogether if we can do four 64-bit multiplies in 
parallel.  Are FMA instructions available for integer? (I have the manual in 
front of me but have to run and no time to look it up).  We should be able to 
do the division the same way with another round of multiplication.

-- 
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5920: Remove admission control dependency on YARN RM jar

2017-09-18 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-5920: Remove admission control dependency on YARN RM jar
..


Patch Set 5: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8035/5/common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationConfigurationException.java
File 
common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationConfigurationException.java:

Line 23: import org.apache.hadoop.classification.InterfaceStability.Unstable;
These don't look pretty, since we own this now I guess they could disappear 
(along with the @Private @Unstable annotations below).


-- 
To view, visit http://gerrit.cloudera.org:8080/8035
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7efdd8ebea298836ca2a82c0a4ae037ac9285bcf
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5920: Remove admission control dependency on YARN RM jar

2017-09-18 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-5920: Remove admission control dependency on YARN RM jar
..


Patch Set 4:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/8035/4/common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java
File 
common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java:

Line 64:   private final Map> 
queueAcls;
> I don't think it's worth changing this one otherwise I'll have to modify a 
That's fine.  We may even want extra ACLs sometime in the future - I can see a 
case for CANCEL_OTHERS


http://gerrit.cloudera.org:8080/#/c/8035/4/common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationConfigurationException.java
File 
common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationConfigurationException.java:

Line 30: public class AllocationConfigurationException extends Exception {
> We still use the resource parsing code that throws this, e.g. in FairSchedu
Fair enough.


http://gerrit.cloudera.org:8080/#/c/8035/4/common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java
File 
common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java:

Line 103:   public void serviceInit(Configuration conf) throws Exception {
> Yeah you're right, though I'm on the fence about how much we should doctor 
Can you file a JIRA?  This is actually a well compartmentalized ramp-up or 
beginner task (that even someone from outside Cloudera could do)


Line 223: Map minSharePreemptionTimeouts = new HashMap<>();
> I think it may be worth leaving this code as-is even if it's doing extra wo
Seems fine.


http://gerrit.cloudera.org:8080/#/c/8035/4/common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/FairSchedulerConfiguration.java
File 
common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/FairSchedulerConfiguration.java:

Line 70: Pattern pattern = Pattern.compile("(\\d+)(\\.\\d*)?\\s*" + units);
> Agreed, though I don't think it's worth spending time improving this code. 
Nope.  I have zero insight into how often this is called.  Obviously if it ends 
up being called every time we need to check a memory reservation it might be a 
problem, but that seems very unlikely.


http://gerrit.cloudera.org:8080/#/c/8035/4/common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/QueuePlacementRule.java
File 
common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/QueuePlacementRule.java:

Line 57:   
Is it worth fixing the trailing spaces so future diffs don't give the linter 
hives?


http://gerrit.cloudera.org:8080/#/c/8035/4/common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/SchedulingPolicy.java
File 
common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/SchedulingPolicy.java:

Line 25: return DEFAULT_POLICY;
> Per my other comments, I'd prefer not to change the parsing code too much y
Seems fine.


http://gerrit.cloudera.org:8080/#/c/8035/4/fe/src/test/java/org/apache/impala/util/TestRequestPoolService.java
File fe/src/test/java/org/apache/impala/util/TestRequestPoolService.java:

Line 124: new Configuration().get(HADOOP_SECURITY_AUTH_TO_LOCAL, 
"DEFAULT"));
> I'm actually not sure why the testResolvePrincipalName() was working before
If I had to guess, I'd guess that kerberos tests were flaky and got disabled or 
marked xfailed.


-- 
To view, visit http://gerrit.cloudera.org:8080/8035
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7efdd8ebea298836ca2a82c0a4ae037ac9285bcf
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5920: Remove admission control dependency on YARN RM jar

2017-09-18 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-5920: Remove admission control dependency on YARN RM jar
..


Patch Set 4:

(17 comments)

Looks good but I have a few comments - there is a lot more that could be 
pruned.  Let me know which if any you might pursue and I can give a +1

http://gerrit.cloudera.org:8080/#/c/8035/4/common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java
File 
common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java:

Line 61:   private final float queueMaxAMShareDefault;
Impala doesn't make use of YARN's concept of ApplicationMasters.  Why not kill 
this.


Line 64:   private final Map> 
queueAcls;
Useless (just SUBMIT_APPLICATIONS)


Line 69:   private final Map minSharePreemptionTimeouts;
Unused by Impala


Line 74:   private final Map fairSharePreemptionTimeouts;
Also unused by Impala


Line 80:   private final Map fairSharePreemptionThresholds;
Also unused


Line 82:   private final Map schedulingPolicies;
The only policy is DEFAULT_POLICY so this is also not needed.


Line 84:   private final SchedulingPolicy defaultSchedulingPolicy;
Ditto


http://gerrit.cloudera.org:8080/#/c/8035/4/common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationConfigurationException.java
File 
common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationConfigurationException.java:

Line 30: public class AllocationConfigurationException extends Exception {
The only point of this class is capturing configuration exceptions and 
throwing, but if you remove most of the throws, maybe this class is not 
necessary either.


http://gerrit.cloudera.org:8080/#/c/8035/4/common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java
File 
common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java:

Line 103:   public void serviceInit(Configuration conf) throws Exception {
103-165 duplicate code and functionality provided by 
impala.util.FileWatchService (minus the ALLOC_RELOAD_WAIT_MS check); we only 
needed this code before because this is the way yarn worked internally, but it 
would be better to have just one implementation here - maybe switch this to use 
FileWatchService.


Line 171:* classpath, but loaded like a regular File.
This seems like unnecessary functionality that we could deprecate if desired.


Line 223: Map minSharePreemptionTimeouts = new HashMap<>();
Preemption configuration does nothing in Impala; could also be killed.


Line 226: Map> queueAcls = new 
HashMap<>();
Impala only uses a single queueACL, SUBMIT_APPLICATIONS.  We could pull the 
entire QueueACL class into impala to get rid of one more Yarnish dependency.


Line 323: } else if 
("defaultQueueSchedulingPolicy".equals(element.getTagName())
Here's the useless code and exception throwing.


Line 533:   public interface Listener {
This can probably go away too.


http://gerrit.cloudera.org:8080/#/c/8035/4/common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/FairSchedulerConfiguration.java
File 
common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/FairSchedulerConfiguration.java:

Line 70: Pattern pattern = Pattern.compile("(\\d+)(\\.\\d*)?\\s*" + units);
This seems a pretty inefficient way to look up resources. This would be a 
candidate for memoizing if this is called often with the same 'units'.


http://gerrit.cloudera.org:8080/#/c/8035/4/common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/SchedulingPolicy.java
File 
common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/SchedulingPolicy.java:

Line 25: return DEFAULT_POLICY;
This whole class is pretty useless and in Impala's case, only serves to throw 
exceptions when the SchedulingPolicy can't be parsed as a valid policy.

Since that is now impossible, maybe just remove the class?


http://gerrit.cloudera.org:8080/#/c/8035/4/fe/src/test/java/org/apache/impala/util/TestRequestPoolService.java
File fe/src/test/java/org/apache/impala/util/TestRequestPoolService.java:

Line 124: new Configuration().get(HADOOP_SECURITY_AUTH_TO_LOCAL, 
"DEFAULT"));
Not clear to me why this is needed now.


-- 
To view, visit http://gerrit.cloudera.org:8080/8035
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: 

[Impala-ASF-CR] IMPALA-4513: Promote integer types for ABS()

2017-09-15 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-4513: Promote integer types for ABS()
..


Patch Set 5:

Sorry, been under C6 crunch.  Will get back to this today.

-- 
To view, visit http://gerrit.cloudera.org:8080/8004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I86cc880e78258d5f90471bd8af4caeb4305eed77
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4513: Promote integer types for ABS()

2017-09-08 Thread Zach Amsden (Code Review)
Hello Lars Volker,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/8004

to look at the new patch set (#4).

Change subject: IMPALA-4513: Promote integer types for ABS()
..

IMPALA-4513: Promote integer types for ABS()

The internal representation of the most negative number
in two's complement requires 1 more bit to represent the
positive version.  This means ABS() must promote integer
types to the next highest width.

Change-Id: I86cc880e78258d5f90471bd8af4caeb4305eed77
---
M be/src/exprs/expr-test.cc
M be/src/exprs/math-functions-ir.cc
M be/src/exprs/math-functions.h
M common/function-registry/impala_functions.py
4 files changed, 39 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/8004/4
-- 
To view, visit http://gerrit.cloudera.org:8080/8004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I86cc880e78258d5f90471bd8af4caeb4305eed77
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-4513: Promote integer types for ABS()

2017-09-08 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-4513: Promote integer types for ABS()
..


Patch Set 3:

(2 comments)

Decimal is immune to this because the range is clamped.  Float and double are 
immune because there is a sign bit.

http://gerrit.cloudera.org:8080/#/c/8004/3/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

Line 4215:   TestValue("abs(-127)", TYPE_SMALLINT, 127);
> It looks like we don't have other coverage for abs integral expressions. Ma
I tested this manually but didn't think the test was super useful.  Can add it 
anyway though (would protect against a bug which somehow reversed sign)


http://gerrit.cloudera.org:8080/#/c/8004/3/be/src/exprs/math-functions-ir.cc
File be/src/exprs/math-functions-ir.cc:

Line 70:   ctx->AddWarning("Expression overflowed, returning NULL");
> Maybe "abs() overflowed, returning NULL" to make it easier for users to tra
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/8004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I86cc880e78258d5f90471bd8af4caeb4305eed77
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4513: Promote integer types for ABS()

2017-09-07 Thread Zach Amsden (Code Review)
Zach Amsden has uploaded a new patch set (#3).

Change subject: IMPALA-4513: Promote integer types for ABS()
..

IMPALA-4513: Promote integer types for ABS()

The internal representation of the most negative number
in two's complement requires 1 more bit to represent the
positive version.  This means ABS() must promote integer
types to the next highest width.

Change-Id: I86cc880e78258d5f90471bd8af4caeb4305eed77
---
M be/src/exprs/expr-test.cc
M be/src/exprs/math-functions-ir.cc
M be/src/exprs/math-functions.h
M common/function-registry/impala_functions.py
4 files changed, 31 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/8004/3
-- 
To view, visit http://gerrit.cloudera.org:8080/8004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I86cc880e78258d5f90471bd8af4caeb4305eed77
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-4513: Promote integer types for ABS()

2017-09-07 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-4513: Promote integer types for ABS()
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8004/2/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

Line 4216:   TestValue("abs(-128)", TYPE_SMALLINT, 128);
> What does this do with abs(-100 - 27)? How about abs(-200 + 100)?
The types may be promoted or not depending on operator precedence (I believe we 
have some weirdness with unary minus) but the end result is now the same:

[localhost:21000] > select abs(-100 - 27);
Query: select abs(-100 - 27)
Query submitted at: 2017-09-07 19:30:10 (Coordinator: http://zach-pa:25000)
Query progress can be monitored at: 
http://zach-pa:25000/query_plan?query_id=c42cf1a096e10f3:48ac2186
++
| abs(-100 - 27) |
++
| 127|
++
Fetched 1 row(s) in 0.01s
[localhost:21000] > select abs(-200 + 100);
Query: select abs(-200 + 100)
Query submitted at: 2017-09-07 19:30:22 (Coordinator: http://zach-pa:25000)
Query progress can be monitored at: 
http://zach-pa:25000/query_plan?query_id=2548f7f5364aa478:70dc0e3e
+-+
| abs(-200 + 100) |
+-+
| 100 |
+-+
Fetched 1 row(s) in 0.01s


http://gerrit.cloudera.org:8080/#/c/8004/2/be/src/exprs/math-functions-ir.cc
File be/src/exprs/math-functions-ir.cc:

Line 65:   ctx->AddWarning("Expression overflowed, returning NULL");
> Do we have a test for this case?
Not sure it is possible to test the warning message without adding more 
machinery, but there is a test for the case.


Line 73: ONE_ARG_MATH_FN(Abs, BigIntVal, IntVal, llabs);
> Maybe mention the upcast in a comment here? Otherwise it seems easy to miss
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/8004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I86cc880e78258d5f90471bd8af4caeb4305eed77
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4513: Promote integer types for ABS()

2017-09-07 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-4513: Promote integer types for ABS()
..


Patch Set 2:

As per Greg's request, add overflow check for BIGINT.  All other types just get 
promoted.

-- 
To view, visit http://gerrit.cloudera.org:8080/8004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I86cc880e78258d5f90471bd8af4caeb4305eed77
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4513: Promote integer types for ABS()

2017-09-07 Thread Zach Amsden (Code Review)
Zach Amsden has uploaded a new patch set (#2).

Change subject: IMPALA-4513: Promote integer types for ABS()
..

IMPALA-4513: Promote integer types for ABS()

The internal representation of the most negative number
in 2's complement requires 1 more bit to represent the
positive version.  This means ABS() must promote integer
types to the next highest width.

Change-Id: I86cc880e78258d5f90471bd8af4caeb4305eed77
---
M be/src/exprs/expr-test.cc
M be/src/exprs/math-functions-ir.cc
M be/src/exprs/math-functions.h
M common/function-registry/impala_functions.py
4 files changed, 26 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/8004/2
-- 
To view, visit http://gerrit.cloudera.org:8080/8004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I86cc880e78258d5f90471bd8af4caeb4305eed77
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-4513: Promote integer types for ABS()

2017-09-07 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-4513: Promote integer types for ABS()
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8004/1/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

Line 4219:   TestValue("abs(-1 * cast(pow(2, 31) as int))", TYPE_BIGINT, 
2147483648);
Obviously there is going to be some debate about what to do about BIGINT types. 
 Let's do that in the JIRA.


-- 
To view, visit http://gerrit.cloudera.org:8080/8004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I86cc880e78258d5f90471bd8af4caeb4305eed77
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4513: Promote integer types for ABS()

2017-09-07 Thread Zach Amsden (Code Review)
Zach Amsden has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/8004

Change subject: IMPALA-4513: Promote integer types for ABS()
..

IMPALA-4513: Promote integer types for ABS()

The internal representation of the most negative number
in 2's complement requires 1 more bit to represent the
positive version.  This means ABS() must promote integer
types to the next highest width.

Change-Id: I86cc880e78258d5f90471bd8af4caeb4305eed77
---
M be/src/exprs/expr-test.cc
M be/src/exprs/math-functions-ir.cc
M be/src/exprs/math-functions.h
M common/function-registry/impala_functions.py
4 files changed, 15 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/8004/1
-- 
To view, visit http://gerrit.cloudera.org:8080/8004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I86cc880e78258d5f90471bd8af4caeb4305eed77
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 


[Impala-ASF-CR] IMPALA-5764: Allow overriding packaged components

2017-09-05 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-5764: Allow overriding packaged components
..


Patch Set 2:

Need to get back to this now that I have smokes fixed.  I imagine I'll want to 
do things a little bit differently to avoid breaking the various integration 
jobs.

-- 
To view, visit http://gerrit.cloudera.org:8080/7581
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I95c2662e6f62adc924cc5de7a371202126046545
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5854: Update external hadoop versions

2017-08-30 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-5854: Update external hadoop versions
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7892/1/bin/impala-config.sh
File bin/impala-config.sh:

PS1, Line 126: export KUDU_JAVA_VERSION=1.5.0-cdh5.14.0-SNAPSHOT
> I think this should be 1.6.0-cdh5.14.0-SNAPSHOT
Not sure how I missed that - this only seems to be used in the pom.xml and 
nowhere else.


-- 
To view, visit http://gerrit.cloudera.org:8080/7892
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4ab9512c3fa3e94c3a4fa9eeb49ff5563a114954
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5854: Update external hadoop versions

2017-08-30 Thread Zach Amsden (Code Review)
Hello Michael Brown,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7892

to look at the new patch set (#2).

Change subject: IMPALA-5854: Update external hadoop versions
..

IMPALA-5854: Update external hadoop versions

These versions need to be updated and new versions
need to be deployed to S3.

Change-Id: I4ab9512c3fa3e94c3a4fa9eeb49ff5563a114954
---
M bin/impala-config.sh
1 file changed, 6 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/7892/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7892
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4ab9512c3fa3e94c3a4fa9eeb49ff5563a114954
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-5854: Update external hadoop versions

2017-08-29 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-5854: Update external hadoop versions
..


Patch Set 1:

Components are already built, S3 buckets updated, and I ran a private 
exhaustive build & load job, which passed: 

http://sandbox.jenkins.cloudera.com/view/Impala/view/Private-Utility/job/impala-private-build-and-test/6245/console

-- 
To view, visit http://gerrit.cloudera.org:8080/7892
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4ab9512c3fa3e94c3a4fa9eeb49ff5563a114954
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5854: Update external hadoop versions

2017-08-29 Thread Zach Amsden (Code Review)
Zach Amsden has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7892

Change subject: IMPALA-5854: Update external hadoop versions
..

IMPALA-5854: Update external hadoop versions

These versions need to be updated and new versions
need to be deployed to S3.

Change-Id: I4ab9512c3fa3e94c3a4fa9eeb49ff5563a114954
---
M bin/impala-config.sh
1 file changed, 6 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/7892/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7892
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I4ab9512c3fa3e94c3a4fa9eeb49ff5563a114954
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 


[Impala-ASF-CR] IMPALA-5764: Allow overriding packaged components

2017-08-15 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-5764: Allow overriding packaged components
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7581/2/bin/impala-config.sh
File bin/impala-config.sh:

Line 180: export 
DOWNLOAD_PACKAGED_COMPONENTS=${DOWNLOAD_PACKAGED_COMPONENTS-"$NO_THIRDPARTY"}
> This is the code review tool for Apache Impala. It is fine to add features 
The proprietary stuff is what I'm actually trying to get rid of with this 
change.  Needless to say, downstream things that are affected by this will need 
to be fixed downstream; right now I'm unaware of any major problems there that 
cannot easily be fixed.


-- 
To view, visit http://gerrit.cloudera.org:8080/7581
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I95c2662e6f62adc924cc5de7a371202126046545
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5764: Allow overriding packaged components

2017-08-14 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-5764: Allow overriding packaged components
..


Patch Set 2:

(4 comments)

A couple of minor nits.  Also, let me make sure I didn't accidentally introduce 
typos during edits and that the script and data load still work.

http://gerrit.cloudera.org:8080/#/c/7581/2/README.md
File README.md:

Line 69: | PACKAGED_COMPONENTS_NAME | "cdh" | Identifier used to uniquify paths 
for potentially incompatible component builds. |
This doesn't currently match the "cdh5" label used in the shell script.


http://gerrit.cloudera.org:8080/#/c/7581/2/bin/impala-config.sh
File bin/impala-config.sh:

Line 180: export 
DOWNLOAD_PACKAGED_COMPONENTS=${DOWNLOAD_PACKAGED_COMPONENTS-"$NO_THIRDPARTY"}
Technically this "breaks" the C6 build, but because the cdh6.x version of 
Impala is pushed right now from CDH5 with a thirdparty directory, this is still 
working.  Will need to update the cauldron build to fix this.


Line 361:   export 
PACKAGED_COMPONENTS_HOME="$IMPALA_TOOLCHAIN/$PACKAGED_COMPONENTS_NAME"
This changes the download directory for the CDH pieces of the toolchain; I 
presume that is okay to do, might result in duplicate copies but nothing should 
break because of this.


Line 382: export 
MINI_DFS_BASE_DATA_DIR="$IMPALA_HOME/${PACKAGED_COMPONENTS_NAME}-hdfs-data"
This forces a data reload - unless there is any objection, I'm not going to try 
to change that.


-- 
To view, visit http://gerrit.cloudera.org:8080/7581
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I95c2662e6f62adc924cc5de7a371202126046545
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5764: Allow overriding packaged components

2017-08-10 Thread Zach Amsden (Code Review)
Zach Amsden has abandoned this change.

Change subject: IMPALA-5764: Allow overriding packaged components
..


Abandoned

Gerrit created another change for some reason.

-- 
To view, visit http://gerrit.cloudera.org:8080/7648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: abandon
Gerrit-Change-Id: If3e4748aae54d3c4f723591bfd2ce87594338cfd
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 


[Impala-ASF-CR] IMPALA-5764: Allow overriding packaged components

2017-08-10 Thread Zach Amsden (Code Review)
Zach Amsden has uploaded a new patch set (#2).

Change subject: IMPALA-5764: Allow overriding packaged components
..

IMPALA-5764: Allow overriding packaged components

For allowing multiple different distributions to build against the same
codebase, allow overriding various environment variables as well as what
packages to download into the toolchain.  This allows multiple sets of
packaged components to exist in the toolchain and testdata concurrently,
and to easily swap between them using environment overrides.

For example, one could specify overrides to packages and disambiguate
the resulting database paths:

PACKAGED_COMPONENTS_NAME=apache_components
PACKAGED_COMPONENTS_PATH='/apache_bucket/'
PACKAGED_COMPONENTS=avro,hadoop,hbase,hive,sentry,parquet
IMPALA_AVRO_VERSION=2.0.2-apache-local
IMPALA_HADOOP_VERSION=3.0.0-apache-experimental
IMPALA_HBASE_VERSION=1.2.0-apache-SNAPSHOT
IMPALA_HIVE_VERSION=1.1.0-apache-SNAPSHOT
IMPALA_SENTRY_VERSION=1.5.1-apache-SNAPSHOT
IMPALA_PARQUET_VERSION=1.5.0-apache-SNAPSHOT

And these packages would be downloaded from the '/apache_components/'
S3 bucket into the toolchain's apache_componenets directory.

Change-Id: I95c2662e6f62adc924cc5de7a371202126046545
---
M README.md
M bin/bootstrap_toolchain.py
M bin/impala-config.sh
M testdata/cluster/admin
4 files changed, 61 insertions(+), 49 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/81/7581/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7581
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I95c2662e6f62adc924cc5de7a371202126046545
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-5764: Allow overriding packaged components

2017-08-10 Thread Zach Amsden (Code Review)
Zach Amsden has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7648

Change subject: IMPALA-5764: Allow overriding packaged components
..

IMPALA-5764: Allow overriding packaged components

For allowing multiple different distributions to build against the same
codebase, allow overriding various environment variables as well as what
packages to download into the toolchain.  This allows multiple sets of
packaged components to exist in the toolchain and testdata concurrently,
and to easily swap between them using environment overrides.

For example, one could specify overrides to packages and disambiguate
the resulting database paths:

PACKAGED_COMPONENTS_NAME=apache_components
PACKAGED_COMPONENTS_PATH='/apache_bucket/'
PACKAGED_COMPONENTS=avro,hadoop,hbase,hive,sentry,parquet
IMPALA_AVRO_VERSION=2.0.2-apache-local
IMPALA_HADOOP_VERSION=3.0.0-apache-experimental
IMPALA_HBASE_VERSION=1.2.0-apache-SNAPSHOT
IMPALA_HIVE_VERSION=1.1.0-apache-SNAPSHOT
IMPALA_SENTRY_VERSION=1.5.1-apache-SNAPSHOT
IMPALA_PARQUET_VERSION=1.5.0-apache-SNAPSHOT

And these packages would be downloaded from the '/apache_components/'
S3 bucket into the toolchain's apache_componenets directory.

Change-Id: If3e4748aae54d3c4f723591bfd2ce87594338cfd
---
M README.md
M bin/bootstrap_toolchain.py
M bin/impala-config.sh
M testdata/cluster/admin
4 files changed, 61 insertions(+), 49 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/7648/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: If3e4748aae54d3c4f723591bfd2ce87594338cfd
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 


[Impala-ASF-CR] IMPALA-5764: Allow overriding packaged components

2017-08-10 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-5764: Allow overriding packaged components
..


Patch Set 1:

needs rebase against merged changes, coming presently

-- 
To view, visit http://gerrit.cloudera.org:8080/7581
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I95c2662e6f62adc924cc5de7a371202126046545
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5768: Better developer documentation

2017-08-07 Thread Zach Amsden (Code Review)
Hello Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7350

to look at the new patch set (#5).

Change subject: IMPALA-5768: Better developer documentation
..

IMPALA-5768: Better developer documentation

Guide to important environment variables for
build, impala paths and config cleanup.

Change-Id: I16d34cb4fa0c60c5ad6d9c8764cc0ec21c5cb368
---
M README.md
M bin/impala-config.sh
2 files changed, 75 insertions(+), 25 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/7350/5
-- 
To view, visit http://gerrit.cloudera.org:8080/7350
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I16d34cb4fa0c60c5ad6d9c8764cc0ec21c5cb368
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-5768: Better developer documentation

2017-08-04 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-5768: Better developer documentation
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7350/2/bin/impala-config.sh
File bin/impala-config.sh:

Line 456: LIBHDFS_OPTS="${LIBHDFS_OPTS}:${IMPALA_HOME}/be/build/debug/service"
> I agree that it's wrong, and /latest/ is the correct way to point it at tha
I got what appears to be a clean and functional build out of it - pretty sure 
all this stuff that keeps mentioning libbackend.so is obsolete.


-- 
To view, visit http://gerrit.cloudera.org:8080/7350
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I16d34cb4fa0c60c5ad6d9c8764cc0ec21c5cb368
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5768: Better developer documentation

2017-08-04 Thread Zach Amsden (Code Review)
Hello Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7350

to look at the new patch set (#4).

Change subject: IMPALA-5768: Better developer documentation
..

IMPALA-5768: Better developer documentation

Guide to important environment variables for
build, impala paths and config cleanup.

Change-Id: I16d34cb4fa0c60c5ad6d9c8764cc0ec21c5cb368
---
M README.md
M bin/impala-config.sh
2 files changed, 75 insertions(+), 25 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/7350/4
-- 
To view, visit http://gerrit.cloudera.org:8080/7350
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I16d34cb4fa0c60c5ad6d9c8764cc0ec21c5cb368
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-5768

2017-08-04 Thread Zach Amsden (Code Review)
Hello Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7350

to look at the new patch set (#3).

Change subject: IMPALA-5768
..

IMPALA-5768

Guide to important environment variables for
build, impala paths and config cleanup.

Change-Id: I16d34cb4fa0c60c5ad6d9c8764cc0ec21c5cb368
---
M README.md
M bin/impala-config.sh
2 files changed, 75 insertions(+), 25 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/7350/3
-- 
To view, visit http://gerrit.cloudera.org:8080/7350
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I16d34cb4fa0c60c5ad6d9c8764cc0ec21c5cb368
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-5764: Allow overriding packaged components

2017-08-04 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-5764: Allow overriding packaged components
..


Patch Set 1:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/7581/1//COMMIT_MSG
Commit Message:

Line 9: For allowing multiple different distributions to build against the same
> Maybe briefly mention the relevant environment variables so it's more disco
Done


Line 14: 
> can you comment on how other 'packages' can be set up? e.g. what's required
Yeah, these two changes are somewhat intertwined.  I'll put most of the 
explanation in README.md rather than the changelog which nobody will read ;)


http://gerrit.cloudera.org:8080/#/c/7581/1/bin/bootstrap_toolchain.py
File bin/bootstrap_toolchain.py:

Line 39: HOST = "https://native-toolchain.s3.amazonaws.com/build;
> It might make sense to also allow configuring this URL, so it can be pointe
I considered it, but it's easy to open a slippery slope to 'Everything is an 
environment variable!' and this gets used for both the components and the 
toolchain with different paths, so I refrained from letting the change snowball.


PS1, Line 340: llama,
> nit: remove llama (though not llama-minikdc)
Done


PS1, Line 341: minikidc
> typo: minikdc
Done.  Wait - that wasn't even my tyop!  Will fix anyway.


http://gerrit.cloudera.org:8080/#/c/7581/1/bin/impala-config.sh
File bin/impala-config.sh:

PS1, Line 131: cdh
> Keeping it as cdh_components kind-of makes sense to me as long as we're put
My intention was to leave this alone so that the existing workflows for both 
external and internal developers continue to work.

The override I had in mind for C6 looks something like:

PACKAGED_COMPONENTS_NAME=cdh6
PACKAGED_COMPONENTS_PATH="/cdh6_components/"
PACKAGED_COMPONENTS="avro,hadoop,hbase,hive,sentry"

Then we can have both C5 & C6 builder jobs run separately and deposit 
components where they won't conflict.  Also note we can pull in avro and drop 
llama-minikdc (or also pull in hbase-minikdc, if someone is brave enough to set 
that up)

And of course external thirdparty components are pretty easy to add as well, 
but with this arrangement (components override toolchain, as I want to happen 
for avro), one might argue for adding the other Apache components directly into 
the toolchain and letting the build override them.

But one step at a time, this should suffice for now.


-- 
To view, visit http://gerrit.cloudera.org:8080/7581
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I95c2662e6f62adc924cc5de7a371202126046545
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Guide to important environment variables for build, test, and mini-cluster operations.

2017-08-04 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: Guide to important environment variables for build, test, and 
mini-cluster operations.
..


Patch Set 2:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/7350/2/bin/impala-config.sh
File bin/impala-config.sh:

Line 343: if type nproc >/dev/null 2>&1; then
> I used "getconf _NPROCESSORS_ONLN" in infra/python/bootstrap_virtualenv.py.
Done


PS2, Line 433: nproc
> Hmm, should this use $CORES?
No, I think we only want to reduce concurrency for tests, which might actually 
start doing multi-threaded things, but for the build, full steam ahead.


Line 456: LIBHDFS_OPTS="${LIBHDFS_OPTS}:${IMPALA_HOME}/be/build/debug/service"
> Pretty sure we don't need this, since everything works on a release build. 
mabe safer to put be/build/latest/service ?


Line 473: 
LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:${IMPALA_HOME}/be/build/debug/service"
> Same here. Pretty sure that we don't need to add this to LD_LIBRARY_PATH.
Agree, but unsure about the combinations of shared / static builds needed to 
test this hypothesis.


Line 491: alias 
gerrit-verify-only="${IMPALA_AUX_TEST_HOME}/jenkins/gerrit-verify-only.sh"
> We can remove this while we're here too - it's referring to a Cloudera-inte
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7350
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I16d34cb4fa0c60c5ad6d9c8764cc0ec21c5cb368
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5764: Allow overriding packaged components

2017-08-03 Thread Zach Amsden (Code Review)
Zach Amsden has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7581

Change subject: IMPALA-5764: Allow overriding packaged components
..

IMPALA-5764: Allow overriding packaged components

For allowing multiple different distributions to build against the same
codebase, allow overriding various environment variables as well as what
packages to download into the toolchain.  This allows multiple sets of
packaged components to exist in the toolchain and testdata concurrently,
and to easily swap between them using environment overrides.

Change-Id: I95c2662e6f62adc924cc5de7a371202126046545
---
M bin/bootstrap_toolchain.py
M bin/impala-config.sh
M testdata/cluster/admin
3 files changed, 56 insertions(+), 43 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/81/7581/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7581
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I95c2662e6f62adc924cc5de7a371202126046545
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 


[Impala-ASF-CR] Guide to important environment variables for build, test, and mini-cluster operations.

2017-08-03 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: Guide to important environment variables for build, test, and 
mini-cluster operations.
..


Patch Set 2:

I cut this back in scope somewhat and made it use README.md so that it shows up 
nicely formatted.  I'll go on and document the mini-cluster separately and file 
a JIRA for this if everyone is happy with the format.

-- 
To view, visit http://gerrit.cloudera.org:8080/7350
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I16d34cb4fa0c60c5ad6d9c8764cc0ec21c5cb368
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: No


[Impala-ASF-CR] Guide to important environment variables for build, test, and mini-cluster operations.

2017-08-03 Thread Zach Amsden (Code Review)
Zach Amsden has uploaded a new patch set (#2).

Change subject: Guide to important environment variables for build, test, and 
mini-cluster operations.
..

Guide to important environment variables for
build, test, and mini-cluster operations.

Change-Id: I16d34cb4fa0c60c5ad6d9c8764cc0ec21c5cb368
---
M README.md
M bin/impala-config.sh
2 files changed, 72 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/7350/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7350
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I16d34cb4fa0c60c5ad6d9c8764cc0ec21c5cb368
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-4939, IMPALA-4939: Decimal V2 multiplication

2017-07-26 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-4939, IMPALA-4939: Decimal V2 multiplication
..


Patch Set 1:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/7438/1/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

Line 1554: }
> I think we aren't using 128 bit literals in that many places in our codebas
Agreed, this is fine for now.


http://gerrit.cloudera.org:8080/#/c/7438/1/be/src/runtime/decimal-value.inline.h
File be/src/runtime/decimal-value.inline.h:

Line 301: needs_int256 = DecimalUtil::MAX_UNSCALED_DECIMAL16 / abs(y) < 
abs(x);
> Are you talking about using some existing special functions for counting th
A while loop will perform terribly, but there are intrinsics that can be used 
to count the zeros efficiently.  If we do that, this division can be optimized 
away if we note the number of zeros in the absolute value of the multiplicands 
is greater than 128.

We still need to check for a value > MAX_UNSCALED_DECIMAL16, but I think this 
could be a win over the division.

Maybe just adding a note or TODO for this for now.


http://gerrit.cloudera.org:8080/#/c/7438/1/be/src/util/bit-util.h
File be/src/util/bit-util.h:

Line 65: return value < 0 ? -1 : 1;
> The way it was written before didn't work with int256, so I made this chang
I think we want to think carefully about using the Sign function with int256.  
That boost type is likely to do something crazy here no matter what we code.  
It's also always signed, and shifting it is a really bad idea.  Doing anything 
with it at all aside from the multiplication is going to be less than optimal.

Can we cheat and round in 128 bit precision instead?

I.e...
  int256 prod = x * y;
  result = prod / scale_multiplier;
  remainder = prod % scale_multiplier;
  // scale multiplier is always a power of 2, so
  if (remainder > scale_multiplier >> 1) result = 
  RoundAwayFromZero(result)

If we can get both remainder and quotient together without added cost from 
Boost, that is.

Worth running a simple benchmark to test, how about using TPC-DS and running:

 sum(l_quantity * l_tax) | sum(l_extendedprice * l_discount)


http://gerrit.cloudera.org:8080/#/c/7438/1/be/src/util/decimal-util.h
File be/src/util/decimal-util.h:

Line 336:   if (LIKELY(scale < 77)) return values[scale];
> It is possible to get large values, but when we are Dividing, not multiplyi
Really?  That's crazy.

Can we actually kill GetScaleMultiplier then?


http://gerrit.cloudera.org:8080/#/c/7438/1/fe/src/main/java/org/apache/impala/analysis/TypesUtil.java
File fe/src/main/java/org/apache/impala/analysis/TypesUtil.java:

Line 260: resultPrecision = p1 + p2 + 1;
> I agree that the +1 is messy, but to be compatible with other systems and f
I think the 0.999 * 0.999 case is a fluke - we are sacrificing precision for 
every single other maximum precision case just for the 1e-74 chance that this 
one case rounds nicely.

I'd strongly prefer to see the crazy +1 go the way of the dodo, understand the 
compatibility argument, but wonder how important that actually is.


-- 
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4939, IMPALA-4939: Decimal V2 multiplication

2017-07-17 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-4939, IMPALA-4939: Decimal V2 multiplication
..


Patch Set 1:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/7438/1/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

Line 1554: }
This is the second time a similar function has come up in code review.  What do 
you think about using user-defined literals or merging Jim Apple's code: 
https://github.com/jbapple/128-bit-literals


Line 1693:  { true, 0, 38, 6 }}},
Other test - 38 9's right of decimal * 38 9's right of decimal.  If I am 
correct this should round up to 1 in the v2 type scheme (due to the crazy +1 
and rounding).


http://gerrit.cloudera.org:8080/#/c/7438/1/be/src/runtime/decimal-value.inline.h
File be/src/runtime/decimal-value.inline.h:

Line 240:   if (needs_int256) {
UNLIKELY(needs_int256)


Line 251: if (delta_scale > 38) {
Can we explicitly handle the LIKELY(delta_scale == 0) case first?


Line 253:   // decimal(38, 38). The result should be a decimal(38, 37), so
Is this what our new formula gives?  It is slightly paradoxical, since it loses 
precision.  We can never have more than (p1-s1)+(p2-s2) digits to the left of 
the decimal.  I think we should reconsider our truncation rules as they apply 
to multiply.


Line 301: needs_int256 = DecimalUtil::MAX_UNSCALED_DECIMAL16 / abs(y) < 
abs(x);
I think this can be optimized a bit.  Why not use count of leading zeroes as an 
estimator of result?  Then we know needs_int256 = lzcnt(x) + lzcnt(y) < 128


Line 305:   if (needs_int256) {
UNLIKELY


Line 316: if (delta_scale > 38) {
As above, handle likely case of delta_scale == 0 first


http://gerrit.cloudera.org:8080/#/c/7438/1/be/src/util/bit-util.h
File be/src/util/bit-util.h:

Line 65: return value < 0 ? -1 : 1;
The reason it was written that way is I think this is undefined for unsigned 
integers and triggers pedantic warnings (also I think gcc and clang turn this 
into less than optimal code).

godbolt is not working for me right now so I'll check later.


http://gerrit.cloudera.org:8080/#/c/7438/1/be/src/util/decimal-util.h
File be/src/util/decimal-util.h:

Line 336:   if (LIKELY(scale < 77)) return values[scale];
Do we really need values this high?  Maybe truncate the scale earlier and add a 
DCHECK that we never request larger values?


http://gerrit.cloudera.org:8080/#/c/7438/1/fe/src/main/java/org/apache/impala/analysis/TypesUtil.java
File fe/src/main/java/org/apache/impala/analysis/TypesUtil.java:

Line 260: resultPrecision = p1 + p2 + 1;
This +1 is messy for several reasons - first, it isn't necessary at all except 
for rounding on precision overflow.  Second, it actually reduces precision - in 
general, v1 * v2 should have at most (p1 - s1) + (p2 - s2) digits left of the 
decimal point, instead we have one more.  Here we get less precise results for 
almost all cases, just to save the one case of 0.9... * 0.9 rounding up 
to 1.0.  I'd prefer to truncate that case and keep the extra precision.

Finally, it contributes to type inflation, creating longer types and more 
overhead than necessary.


-- 
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5164: Better benchmark heuristic

2017-07-10 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-5164: Better benchmark heuristic
..


Patch Set 1: -Code-Review

-- 
To view, visit http://gerrit.cloudera.org:8080/7389
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie6fb1283716dbc591c10238d85b9ab436d02fb96
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5164: Better benchmark heuristic

2017-07-10 Thread Zach Amsden (Code Review)
Zach Amsden has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7389

Change subject: IMPALA-5164: Better benchmark heuristic
..

IMPALA-5164: Better benchmark heuristic

If any benchmark took longer than 50ms to run, the test suite would
bail.  This doesn't seem reasonable for tests which have large data
structures.  Just increase the timeout in that case while still
trying to measure context-switch free runs.

Change-Id: Ie6fb1283716dbc591c10238d85b9ab436d02fb96
---
M be/src/benchmarks/bloom-filter-benchmark.cc
M be/src/util/benchmark.cc
2 files changed, 9 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/7389/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7389
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ie6fb1283716dbc591c10238d85b9ab436d02fb96
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 


[Impala-ASF-CR] IMPALA-5164: Better benchmark heuristic

2017-07-10 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-5164: Better benchmark heuristic
..


Patch Set 1: Code-Review+1

Turns out this approach works even better and avoids any special changes to 
benchmarks.

-- 
To view, visit http://gerrit.cloudera.org:8080/7389
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie6fb1283716dbc591c10238d85b9ab436d02fb96
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5164: Bloom filter benchmark fails to complete

2017-07-10 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-5164: Bloom filter benchmark fails to complete
..


Patch Set 1:

@jbapple, let's go ahead with this diff and if there are more failures, I'll 
attach more fixes.

-- 
To view, visit http://gerrit.cloudera.org:8080/7381
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifad385a633c45af7d1dbcc87e56661c413bde7dc
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5164: Bloom filter benchmark fails to complete

2017-07-10 Thread Zach Amsden (Code Review)
Hello Jim Apple,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7381

to look at the new patch set (#2).

Change subject: IMPALA-5164: Bloom filter benchmark fails to complete
..

IMPALA-5164: Bloom filter benchmark fails to complete

Due to large allocations during benchmark, we spend a lot of time
under memory pressure, which triggers the context switch detection.
Not using micro-benchmark heuristics for these benchmarks fixes the
problem.

Change-Id: Ifad385a633c45af7d1dbcc87e56661c413bde7dc
---
M be/src/benchmarks/bloom-filter-benchmark.cc
1 file changed, 2 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/81/7381/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7381
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifad385a633c45af7d1dbcc87e56661c413bde7dc
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-XXXX: Bloom filter benchmark fails to complete

2017-07-07 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-: Bloom filter benchmark fails to complete
..


Patch Set 1:

@jbapple, Is there a JIRA open for this already?

-- 
To view, visit http://gerrit.cloudera.org:8080/7381
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifad385a633c45af7d1dbcc87e56661c413bde7dc
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5623: Fix lag() on STRING cols to release UDF mem

2017-07-06 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-5623: Fix lag() on STRING cols to release UDF mem
..


Patch Set 1: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7371/1/testdata/workloads/functional-query/queries/QueryTest/analytic-fns.test
File testdata/workloads/functional-query/queries/QueryTest/analytic-fns.test:

Line 1924:  QUERY
Did you have to manually inspect the log output to validate the fix?  If so, is 
there a more automated way to do that?


-- 
To view, visit http://gerrit.cloudera.org:8080/7371
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id2b69b4ccb9cac076abca19bed6f0b1dd11dfff3
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5498: Support for partial sorts

2017-07-06 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-5498: Support for partial sorts
..


Patch Set 3:

(6 comments)

Looks promising!

http://gerrit.cloudera.org:8080/#/c/7267/3/be/src/exec/partial-sort-node.cc
File be/src/exec/partial-sort-node.cc:

Line 32: sorter_(NULL),
This doesn't need to handle offset?


Line 127: input_batch_index_ += num_processed;
Since this only requires a partial sort, you can exit early when the cumulative 
number of rows processed exceeds the limit.  This might end up sorting 
significantly fewer rows (at the cost of less well sorted data when a limit is 
present).


Line 140:   if (ReachedLimit()) {
I guess it is not clear to me why allowing a limit without an offset is useful. 
 If there is a limit, the total ordering of rows is undefined and there is no 
way to continue fetching them without also allowing an offset.  Maybe document 
this limitation?


http://gerrit.cloudera.org:8080/#/c/7267/3/be/src/exec/partial-sort-node.h
File be/src/exec/partial-sort-node.h:

Line 58: 
Just for general niceness, can you also update be/src/exec/sort-node.h's 
private SortInput method to have WARN_UNUSED_RESULT?


http://gerrit.cloudera.org:8080/#/c/7267/3/be/src/runtime/sorter.cc
File be/src/runtime/sorter.cc:

Line 1434:   DCHECK(unsorted_run_ != NULL);
Unnecessary - this is unconditionally dereferenced


http://gerrit.cloudera.org:8080/#/c/7267/3/common/thrift/PlanNodes.thrift
File common/thrift/PlanNodes.thrift:

Line 360:   // Not used with TSortType::PARTIAL.
Please ignore my earlier comments about offset handling (although a comment 
that offsets are not supported in exec/partion-sode-node.cc is still welcome.


-- 
To view, visit http://gerrit.cloudera.org:8080/7267
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ieec2a15a0cc5240b1c13682067ab64670d1e0a38
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Complete guide to important environment variables for build, test, and mini-cluster operations.

2017-07-05 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: Complete guide to important environment variables for build, 
test, and mini-cluster operations.
..


Patch Set 1:

(2 comments)

> > I personally abhor Google docs as they just seem to get lost in
 > the
 > > Wiki-sea over time.  If we are going to take time to document the
 > > various set of environment variables controlling important
 > aspects
 > > of the build, why not make that accessible to both the internal
 > and
 > > external developers by including it in the codebase.
 > >
 > > This is my first draft of the standard build settings.
 > 
 > What about the Wiki, like this page: 
 > https://cwiki.apache.org/confluence/display/IMPALA/Bootstrapping+an+Impala+Development+Environment+From+Scratch

Our wikis are pretty nice, and for things like ASAN, totally worth documenting 
separately, but for non-major features, keeping this up to date still requires 
updating things in two places.  In experience, this isn't something people will 
do reliably unless forced.  I think we could force at least minimal 
documentation by keeping a white-list of exported build variables and enforcing 
it with a pre-checkin lint rule.

http://gerrit.cloudera.org:8080/#/c/7350/1/bin/impala-config.sh
File bin/impala-config.sh:

Line 440
> Speaking confidently as the only team member with a fine arts education, I'
Glad to know that my alternative career as the artist Deth Leprikon remains a 
well kept secret.


http://gerrit.cloudera.org:8080/#/c/7350/1/docs/Developer-Guide/Build-Environment.txt
File docs/Developer-Guide/Build-Environment.txt:

Line 1: Environment Variables:
> How many of these are set in impala-config.sh, and how many are set by othe
Additional docs will be forthcoming


-- 
To view, visit http://gerrit.cloudera.org:8080/7350
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I16d34cb4fa0c60c5ad6d9c8764cc0ec21c5cb368
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Complete guide to important environment variables for build, test, and mini-cluster operations.

2017-06-30 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: Complete guide to important environment variables for build, 
test, and mini-cluster operations.
..


Patch Set 1:

I personally abhor Google docs as they just seem to get lost in the Wiki-sea 
over time.  If we are going to take time to document the various set of 
environment variables controlling important aspects of the build, why not make 
that accessible to both the internal and external developers by including it in 
the codebase.

This is my first draft of the standard build settings.

-- 
To view, visit http://gerrit.cloudera.org:8080/7350
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I16d34cb4fa0c60c5ad6d9c8764cc0ec21c5cb368
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: No


[Impala-ASF-CR] Complete guide to important environment variables for build, test, and mini-cluster operations.

2017-06-30 Thread Zach Amsden (Code Review)
Zach Amsden has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7350

Change subject: Complete guide to important environment variables for build, 
test, and mini-cluster operations.
..

Complete guide to important environment variables for
build, test, and mini-cluster operations.

Change-Id: I16d34cb4fa0c60c5ad6d9c8764cc0ec21c5cb368
---
M bin/impala-config.sh
A docs/Developer-Guide/Build-Environment.txt
2 files changed, 265 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/7350/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7350
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I16d34cb4fa0c60c5ad6d9c8764cc0ec21c5cb368
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 


[Impala-ASF-CR] IMPALA-5547: Rework FK/PK join detection.

2017-06-29 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-5547: Rework FK/PK join detection.
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7257/1/fe/src/main/java/org/apache/impala/planner/JoinNode.java
File fe/src/main/java/org/apache/impala/planner/JoinNode.java:

Line 253:   return getFkPkJoinCardinality(eqJoinConjunctSlots, lhsCard, 
rhsCard);
> 1. Changed the code to only pass those FK/PK conjuncts because that makes t
My concern was that if we have compelling evidence that a FK/PK relationship is 
not present, we shouldn't pass those eqJoin conjuncts to the FK/PK cardinality 
estimation - e.g.,

SELECT orders.customer_id, shipments.ship_id from orders, shipments, WHERE 
orders.order_date = shipments.ship_date

where the highly selective non-PK ship_date should not reduce the cardinality 
of the join - ideally, we multiply the cardinality by (|shipments| / 
NDV(shipments.ship_date))


-- 
To view, visit http://gerrit.cloudera.org:8080/7257
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I49074fe743a28573cff541ef7dbd0edd88892067
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5128: BE Test Infrastructure

2017-06-27 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-5128: BE Test Infrastructure
..


Patch Set 2: Code-Review-1

I think we have more important things to do and there are probably even more 
egregious tests.  No one seems to be complaining about the effects of the extra 
backend test run time for now, so ...

-- 
To view, visit http://gerrit.cloudera.org:8080/6477
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I71fe6de39dff21b21d322daf0232b0578bdff297
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5547: Rework FK/PK join detection.

2017-06-21 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-5547: Rework FK/PK join detection.
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7257/1/fe/src/main/java/org/apache/impala/planner/JoinNode.java
File fe/src/main/java/org/apache/impala/planner/JoinNode.java:

Line 91:   protected List fkPkEqJoinConjuncts_;
Class member is only used in one function in the base class.  Maybe document 
that this is preserved just for printing values and wrap it with a getter so it 
can't be modified?


Line 253:   return getFkPkJoinCardinality(eqJoinConjunctSlots, lhsCard, 
rhsCard);
This computes estimates based on both PK conjuncts and non-PK conjuncts using 
the same formula, instead of filtering out non-PK joins.  If we have a RHS 
which is an extremely selective non-PK, this could bias the cardinality 
estimate to a very low value, since we take the minimum computed cardinality 
from these assumed PK values.

Is there any value in preserving the >> construction, or should 
getFkPkEqJoinConjuncts consider the conjuncts by tid order and just return a 
flattened list (which can be passed to getFkPkJoinCardinality)


Line 275: for (List fkPkCandi: 
scanSlotsByTids.values()) {
fkPkCandi - this could use a better name, it isn't clear to me what this is 
trying to describe.


-- 
To view, visit http://gerrit.cloudera.org:8080/7257
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I49074fe743a28573cff541ef7dbd0edd88892067
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5548 Fix some minor nits with HDFS parquet column readers

2017-06-21 Thread Zach Amsden (Code Review)
Hello Michael Ho,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7240

to look at the new patch set (#3).

Change subject: IMPALA-5548 Fix some minor nits with HDFS parquet column readers
..

IMPALA-5548 Fix some minor nits with HDFS parquet column readers

Replace some std::map instances with unordered maps.  std::map is
unnecessary in many cases where an unordered map should suffice, and
in almost all circumstances, perform better.  Also discovered was a
slightly dangerous initialization of an empty NullOffsetIndicator,
which could conceivably result in undefined behavior by writing
arr[ofs] |= b, where ofs was mistakenly initialized to -1 (and b to 0,
so such behavior may not be detected, but could cause a crash by
underrunning a buffer).  Also add warn unused to status results while
we are changing this.

Testing: Running exhaustive tests against a Jenkins build.

Change-Id: I08e653cae6f2188599f4a23e4f44692166d9c119
---
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scanner.h
M be/src/exec/parquet-column-readers.h
M be/src/runtime/descriptors.h
5 files changed, 22 insertions(+), 19 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/7240/3
-- 
To view, visit http://gerrit.cloudera.org:8080/7240
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I08e653cae6f2188599f4a23e4f44692166d9c119
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-5548 Fix some minor issues with HDFS / parquet column readers

2017-06-21 Thread Zach Amsden (Code Review)
Hello Michael Ho,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7240

to look at the new patch set (#2).

Change subject: IMPALA-5548 Fix some minor issues with HDFS / parquet column 
readers
..

IMPALA-5548 Fix some minor issues with HDFS / parquet column readers

Replace some std::map instances with unordered maps.  std::map is
unnecessary in many cases where an unordered map should suffice, and in
almost all circumstances, perform better.  Also discovered was a
slightly dangerous initialization of an empty NullOffsetIndicator, which
could conceivably result in undefined behavior by writing arr[ofs] |= b,
where ofs was mistakenly initialized to -1 (and b to 0, so such behavior
may not be detected, but could cause a crash by underrunning a buffer).
Also add warn unused to status results while we are changing this.

Testing: Running exhaustive tests against a Jenkins build.

Change-Id: I08e653cae6f2188599f4a23e4f44692166d9c119
---
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scanner.h
M be/src/exec/parquet-column-readers.h
M be/src/runtime/descriptors.h
5 files changed, 22 insertions(+), 19 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/7240/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7240
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I08e653cae6f2188599f4a23e4f44692166d9c119
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-5548 Fix some minor issues with HDFS / parquet column readers

2017-06-21 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-5548 Fix some minor issues with HDFS / parquet column 
readers
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7240/1/be/src/exec/hdfs-scan-node-base.h
File be/src/exec/hdfs-scan-node-base.h:

Line 352:   FileFormatsMap per_type_files_;
> we iterate through this one. it doesn't look like the ordering has any inte
Done.  Not much point in putting this on HdfsFileFormat.


Line 464:   std::pair, int> 
FileTypeCountsMap;
> we iterate this one to print some info, and it seems like a consistent orde
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7240
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I08e653cae6f2188599f4a23e4f44692166d9c119
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4864 Speed up single slot predicates with dictionaries

2017-06-20 Thread Zach Amsden (Code Review)
Zach Amsden has abandoned this change.

Change subject: IMPALA-4864 Speed up single slot predicates with dictionaries
..


Abandoned

Not enough win for the complexity.

-- 
To view, visit http://gerrit.cloudera.org:8080/6726
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: abandon
Gerrit-Change-Id: I65981c89e5292086809ec1268f5a273f4c1fe054
Gerrit-PatchSet: 19
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-5548 Fix some minor issues with HDFS / parquet column readers

2017-06-20 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-5548 Fix some minor issues with HDFS / parquet column 
readers
..


Patch Set 1:

Some pretty simple fixes that came from work on IMPALA-4864.  Running build and 
test job presently.

-- 
To view, visit http://gerrit.cloudera.org:8080/7240
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I08e653cae6f2188599f4a23e4f44692166d9c119
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5548 Fix some minor issues with HDFS / parquet column readers

2017-06-20 Thread Zach Amsden (Code Review)
Zach Amsden has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7240

Change subject: IMPALA-5548 Fix some minor issues with HDFS / parquet column 
readers
..

IMPALA-5548 Fix some minor issues with HDFS / parquet column readers

Replace some std::map instances with unordered maps.  std::map is
unnecessary in many cases where an unordered map should suffice, and in
almost all circumstances, perform better.  Also discovered was a
slightly dangerous initialization of an empty NullOffsetIndicator, which
could conceivably result in undefined behavior by writing arr[ofs] |= b,
where ofs was mistakenly initialized to -1 (and b to 0, so such behavior
may not be detected, but could cause a crash by underrunning a buffer).

Testing: Running exhaustive tests against a Jenkins build.

Change-Id: I08e653cae6f2188599f4a23e4f44692166d9c119
---
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scanner.h
M be/src/exec/parquet-column-readers.h
M be/src/runtime/descriptors.h
5 files changed, 12 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/7240/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7240
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I08e653cae6f2188599f4a23e4f44692166d9c119
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 


[Impala-ASF-CR] IMPALA-4864 Speed up single slot predicates with dictionaries

2017-06-16 Thread Zach Amsden (Code Review)
Zach Amsden has uploaded a new patch set (#19).

Change subject: IMPALA-4864 Speed up single slot predicates with dictionaries
..

IMPALA-4864 Speed up single slot predicates with dictionaries

When dictionaries are present we can pre-evaluate conjuncts
against the dictionary values and simply look up the result.

Status: Runs with ASAN, runs without crashes on ee tests.  Performance
results inconclusive; this may not be worth the complexity unless we
get really selective or really expensive predicates.

Basic idea: since we codegen so early, before we know enough details
about the columns to know if they are dict filterable, if we do have
dictionary filtering predicates, we codegen a guard around each
dictionary filterable predicate evaluation.  This guard skips
evaluation of the predicate if it has already been evaluated by the
dictionary.  In this way, we can skip evaluation dynamically for
each row group as we learn which columns are dictionary filterable,
and then push predicate evaluation into the column reader.  Since the
branches will remain 100% predictable over the row group, this should
give us the fastest way to skip over predicate evaluation without
compromising the general case where we may be unable to evaluate
against the dictionary.  We can even do this with codegen turned
off, as a side effect of the way we generate the codegen'd function
when dictionary evaluation is enabled.

If dictionaries aren't available for some predicates, we automatically
fall back to evaluating those predicates in the original order,
preserving selectivity.  The overhead in this case is a perfectly
predictable extra conditional per dictionary predicate.

Performance:  Hard to get!  Simple predicates did not show
improvement, in fact regressed.  I used a TPC-H scale 30 dataset,
duplicated 3x into a 'biglineitem' table.

 select count(*) from biglineitem WHERE l_returnflag = 'A';
1.43s -> 1.53s

 select count(*) from biglineitem WHERE l_shipinstruct = 'DELIVER IN
PERSON';
1.43s -> 1.53s

 select count(*) from biglineitem WHERE l_quantity > 49;

0.93s -> 0.93s

 select count(*) from biglineitem WHERE instr(l_shipdate, '1994-11') >
0;

2.33s -> 1.03s

So this appears to only be a win for expensive predicates.

Update: I added changes to make computed predicate costs visible from
the frontend to the backend, and tried a TPC-DS scale 10 dataset, which
has better queries (lots of IN groups).  Still there is a regression in
raw query performance.

Update again: reversing one branch to UNLIKELY in the ir compiled code
gave these results on TPC-DSx10:

Q46 (limit modified to 10):
  3.05 -> 2.89 sec
Q27 (limit modified to 10):
  3.13 -> 3.09 sec

Update (6.5.2017): Switched back to bitset evaluation for scratch batch
rather than using an extra byte per tuple row.  Surprisingly, this is
the best performing diff yet for selective predicates.  Using TPC-DS-10
with a filter:

Query: select count(*) from
  store_sales,
  customer,
  customer_address
where store_sales.ss_customer_sk = customer.c_customer_sk and
  customer.c_current_addr_sk = customer_address.ca_address_sk
 and customer_address.ca_city IN (... list of ~200 cities)

This almost makes parity.  I get 3836140 rows in 1.41-1.45s with this
diff and in 1.39-1.45s with the same caching optimization for batch_size
on top of the last change (Tim's parquest column reader optimizations).
So in a totally fair comparison, we are still losing :(

Update (6.8.2017): Tried Tim's suggestions.  Best I could get on this
query was now only 1.54s.  My host OS kernel did change during this time
so I re-measured baseline and got a best time of 1.53s.  So we seem to
be making parity, but not showcasing a big win.

Maybe runtime filters will pay off better; they already are toggled
dynamically so nothing should be lost.

Update (6.16.2017): Tried runtime filters, simplified the logic as much
as possible.  Everything is still a wash:

Query: select count(*) from
  store_sales,
  customer,
  customer_address
where store_sales.ss_customer_sk = customer.c_customer_sk and
  customer.c_current_addr_sk = customer_address.ca_address_sk
 and customer_address.ca_city IN
 ('Providence', 'Mount Zion', 'Greenville', 'Mount Olive', 'Wildwood',
   'Hillcrest', 'Crossroads', 'Belmont', 'Wilson', 'Riverdale',
'Newport',
   'Springdale', 'Mountain View', 'Forest Hills', 'Bridgeport', 'White
Oak',
   'Oakwood', 'Newtown', 'Macedonia', 'Five Forks', 'Edgewood',
'Arlington',
   'Unionville', 'Red Hill', 'Clinton', 'Woodville', 'Summit',
'Jamestown',
   'Hamilton', 'Clifton', 'Union Hill', 'Sulphur Springs', 'Lincoln',
   'Lebanon', 'Farmington', 'Enterprise', 'Brownsville', 'Ashland',
'Woodland',
   'Waterloo', 'Valley View', 'Oak Ridge', 'Maple Grove', 'Kingston',
   'Jackson', 'Greenfield', 'Green Acres', 'Plainview', 'Marion',
'Florence',
   'Qarth', 'R\'lyeh', 'Mos Eisley', 'King\'s Landing', "Q'onos",

[Impala-ASF-CR] IMPALA-4864 Speed up single slot predicates with dictionaries

2017-06-16 Thread Zach Amsden (Code Review)
Zach Amsden has uploaded a new patch set (#18).

Change subject: IMPALA-4864 Speed up single slot predicates with dictionaries
..

IMPALA-4864 Speed up single slot predicates with dictionaries

When dictionaries are present we can pre-evaluate conjuncts
against the dictionary values and simply look up the result.

Status: Runs with ASAN, runs without crashes on ee tests.  Performance
results inconclusive; this may not be worth the complexity unless we
get really selective or really expensive predicates.

Basic idea: since we codegen so early, before we know enough details
about the columns to know if they are dict filterable, if we do have
dictionary filtering predicates, we codegen a guard around each
dictionary filterable predicate evaluation.  This guard skips
evaluation of the predicate if it has already been evaluated by the
dictionary.  In this way, we can skip evaluation dynamically for
each row group as we learn which columns are dictionary filterable,
and then push predicate evaluation into the column reader.  Since the
branches will remain 100% predictable over the row group, this should
give us the fastest way to skip over predicate evaluation without
compromising the general case where we may be unable to evaluate
against the dictionary.  We can even do this with codegen turned
off, as a side effect of the way we generate the codegen'd function
when dictionary evaluation is enabled.

If dictionaries aren't available for some predicates, we automatically
fall back to evaluating those predicates in the original order,
preserving selectivity.  The overhead in this case is a perfectly
predictable extra conditional per dictionary predicate.

Performance:  Hard to get!  Simple predicates did not show
improvement, in fact regressed.  I used a TPC-H scale 30 dataset,
duplicated 3x into a 'biglineitem' table.

 select count(*) from biglineitem WHERE l_returnflag = 'A';
1.43s -> 1.53s

 select count(*) from biglineitem WHERE l_shipinstruct = 'DELIVER IN
PERSON';
1.43s -> 1.53s

 select count(*) from biglineitem WHERE l_quantity > 49;

0.93s -> 0.93s

 select count(*) from biglineitem WHERE instr(l_shipdate, '1994-11') >
0;

2.33s -> 1.03s

So this appears to only be a win for expensive predicates.

Update: I added changes to make computed predicate costs visible from
the frontend to the backend, and tried a TPC-DS scale 10 dataset, which
has better queries (lots of IN groups).  Still there is a regression in
raw query performance.

Update again: reversing one branch to UNLIKELY in the ir compiled code
gave these results on TPC-DSx10:

Q46 (limit modified to 10):
  3.05 -> 2.89 sec
Q27 (limit modified to 10):
  3.13 -> 3.09 sec

Update (6.5.2017): Switched back to bitset evaluation for scratch batch
rather than using an extra byte per tuple row.  Surprisingly, this is
the best performing diff yet for selective predicates.  Using TPC-DS-10
with a filter:

Query: select count(*) from
  store_sales,
  customer,
  customer_address
where store_sales.ss_customer_sk = customer.c_customer_sk and
  customer.c_current_addr_sk = customer_address.ca_address_sk
 and customer_address.ca_city IN (... list of ~200 cities)

This almost makes parity.  I get 3836140 rows in 1.41-1.45s with this
diff and in 1.39-1.45s with the same caching optimization for batch_size
on top of the last change (Tim's parquest column reader optimizations).
So in a totally fair comparison, we are still losing :(

Update (6.8.2017): Tried Tim's suggestions.  Best I could get on this
query was now only 1.54s.  My host OS kernel did change during this time
so I re-measured baseline and got a best time of 1.53s.  So we seem to
be making parity, but not showcasing a big win.

Maybe runtime filters will pay off better; they already are toggled
dynamically so nothing should be lost.

Update (6.16.2017): Tried runtime filters, simplified the logic as much
as possible.  Everything is still a wash:

Query: select count(*) from
  store_sales,
  customer,
  customer_address
where store_sales.ss_customer_sk = customer.c_customer_sk and
  customer.c_current_addr_sk = customer_address.ca_address_sk
 and customer_address.ca_city IN
 ('Providence', 'Mount Zion', 'Greenville', 'Mount Olive', 'Wildwood',
   'Hillcrest', 'Crossroads', 'Belmont', 'Wilson', 'Riverdale',
'Newport',
   'Springdale', 'Mountain View', 'Forest Hills', 'Bridgeport', 'White
Oak',
   'Oakwood', 'Newtown', 'Macedonia', 'Five Forks', 'Edgewood',
'Arlington',
   'Unionville', 'Red Hill', 'Clinton', 'Woodville', 'Summit',
'Jamestown',
   'Hamilton', 'Clifton', 'Union Hill', 'Sulphur Springs', 'Lincoln',
   'Lebanon', 'Farmington', 'Enterprise', 'Brownsville', 'Ashland',
'Woodland',
   'Waterloo', 'Valley View', 'Oak Ridge', 'Maple Grove', 'Kingston',
   'Jackson', 'Greenfield', 'Green Acres', 'Plainview', 'Marion',
'Florence',
   'Qarth', 'R\'lyeh', 'Mos Eisley', 'King\'s Landing', "Q'onos",

[Impala-ASF-CR] IMPALA-5036: Parquet count star optimization

2017-06-13 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-5036: Parquet count star optimization
..


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/6812/3/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

Line 445:   *dst_slot = file_metadata_.row_groups[row_group_idx_].num_rows;
Bounds check against file_metadata_.num_rows (i.e. keep a running counter as 
below in row_group_rows_read_ and DCHECK_LE(row_group_rows_read_, 
file_metatdata_.num_rows)?


Line 454:   if (scan_node_->IsZeroSlotTableScan()) {
Why is this optimization not redundant now?  Maybe update the comment to 
indicate the type of query this now will operate on:

  e.g., select 1 from table


http://gerrit.cloudera.org:8080/#/c/6812/3/common/thrift/PlanNodes.thrift
File common/thrift/PlanNodes.thrift:

Line 226:   11: optional i64 parquet_count_star_slot_offset
> i32 right?
Would it be simpler to have this be one parameter and indicate truth by passing 
a value >= 0?


-- 
To view, visit http://gerrit.cloudera.org:8080/6812
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I536b85c014821296aed68a0c68faadae96005e62
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5467: disable flaky BenchmarkTest to unblock builds

2017-06-09 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-5467: disable flaky BenchmarkTest to unblock builds
..


Patch Set 1: Code-Review+1

Well since the fix, we know this isn't being caused by contention, any idea 
what could actually be causing it?

Do we have AMD or older Intel hardware with flaky TSCs in service that could 
explain this?

-- 
To view, visit http://gerrit.cloudera.org:8080/7135
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6aafbc620c1e839b07bf6020f9286ca5ee51197d
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4864 Speed up single slot predicates with dictionaries

2017-06-08 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-4864 Speed up single slot predicates with dictionaries
..


Patch Set 16:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6726/16/be/src/exec/parquet-column-readers.cc
File be/src/exec/parquet-column-readers.cc:

Line 420:   LIKELY(dictionary_results_.num_bits() > 0)) {
> I think the predicate evaluation on 40,000 values is probably cheap enough 
We can certainly try it.  I was worried the pre-computation might be expensive 
if we have, say, string manipulation in predicates, as opposed to inexpensive, 
simple comparisons.  Still, even if we have the same number of predicate 
evaluations, they end up going through the unoptimized EvalConjuncts() path, as 
opposed to the codegen'd path.

As for IS_FILTERED, that is set when the column reader is created.  
IS_DICT_ENCODED is determined per page.  We are left with no way to remove 
dictionary_results_.num_bits() on a per-row basis, since we can't unset 
IS_FILTERED, and IS_DICT_ENCODED may be true even if the encoding did not cover 
all values.

I'll try precomputing all the values and see what happens.


-- 
To view, visit http://gerrit.cloudera.org:8080/6726
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I65981c89e5292086809ec1268f5a273f4c1fe054
Gerrit-PatchSet: 16
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5164: Fix flaky benchmarks

2017-06-06 Thread Zach Amsden (Code Review)
Hello Impala Public Jenkins, Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6935

to look at the new patch set (#8).

Change subject: IMPALA-5164: Fix flaky benchmarks
..

IMPALA-5164: Fix flaky benchmarks

Improve benchmarks by detecting involuntary context switches.
If a server is heavily loaded due to some other jobs running,
benchmarking will not be reliable.  We can detect this by using
getrusage() to measure context switches.  If 90% or more of
benchmark time is lost due to switching, we abort the benchmark,
but catch the failure and exit silently.  We subtract out the
time during which we might have been switched out from the total
result time, and only count runs which were uninterrupted.

If 10% of the result is valid, that seems barely legitimate,
but we will warn when anything less than 50% of the benchmark
ran without switching.

Testing: ran the free-lists-benchmark until the first test passed
and then started "make -j 60" on an Impala directory.  Got this
result:

E0519 23:35:41.482010  8766 benchmark.cc:161] Benchmark failed to
complete due to context switching.
W0519 23:35:41.548840  8766 benchmark.cc:162] Exiting silently from
FreeLists 0 iters on 128 kb to avoid spurious test failure.

Also, fixed two benchmarks that had crashes on start due to static
initialization order problems and missing pieces of initialization.
Some benchmarks are still crashing (expr-benchmark and
network-perf-be), but those will require a lot more work to fix.

Change-Id: I7e0ddfaa9bd1340990ae2b6ff946d72bbd26c274
---
M be/src/benchmarks/free-lists-benchmark.cc
M be/src/benchmarks/hash-benchmark.cc
M be/src/benchmarks/lock-benchmark.cc
M be/src/util/benchmark-test.cc
M be/src/util/benchmark.cc
M be/src/util/benchmark.h
6 files changed, 108 insertions(+), 38 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/6935/8
-- 
To view, visit http://gerrit.cloudera.org:8080/6935
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7e0ddfaa9bd1340990ae2b6ff946d72bbd26c274
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-4864 Speed up single slot predicates with dictionaries

2017-06-05 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-4864 Speed up single slot predicates with dictionaries
..


Patch Set 16:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/6726/16/be/src/exec/hdfs-parquet-scanner-ir.cc
File be/src/exec/hdfs-parquet-scanner-ir.cc:

Line 89: idx = scratch_batch_.NextValidTuple(idx);
Surprising, despite apparently being costlier, the bitmap scan appears to be 
fast, especially when predicates are selective.


http://gerrit.cloudera.org:8080/#/c/6726/16/be/src/exec/parquet-column-readers.cc
File be/src/exec/parquet-column-readers.cc:

Line 420:   LIKELY(dictionary_results_.num_bits() > 0)) {
Not liking this branch, but it is unavoidable without pre-computation overhead. 
 When we have a partially dictionary encoding on a filtered column that spills 
to plain encoding, we don't pre-evaluate the dictionary against predicates.  We 
could do so anyway, but it may end up being an unnecessary up-front cost.  
However, the number of dictionary entries before spilling to plain is bounded, 
so the cost should be fixed.

Maybe we can afford to do so if the predicates are "inexpensive", but then we 
are still left with this check for expensive predicate (and have no way to 
determine cost here without another check).


http://gerrit.cloudera.org:8080/#/c/6726/16/be/src/exec/parquet-scratch-tuple-batch.h
File be/src/exec/parquet-scratch-tuple-batch.h:

Line 172: // pointer.  We don't need to transfer the extra filter byte.
Comment obsolete with latest diff.


http://gerrit.cloudera.org:8080/#/c/6726/16/be/src/runtime/descriptors.h
File be/src/runtime/descriptors.h:

Line 94:   NullIndicatorOffset(int byte_offset = 0, int bit_offset = -1)
This fixes an apparent bug, where byte_offset = -1 (passed in constructor for 
ParquetColumnReader as NullIndicatorOffset(-1, -1) ends up converting into code 
which will execute something like:

tuple_mem[-1] |= 0

Which despite not modifying memory, could be an out of bounds memory access.


http://gerrit.cloudera.org:8080/#/c/6726/16/be/src/util/bitmap-test.cc
File be/src/util/bitmap-test.cc:

Line 81:   for (index = bm.NextSetBit(); index < size; index = 
bm.NextSetBit(index)) {
test needs updating for behavior change: index -> index + 1


-- 
To view, visit http://gerrit.cloudera.org:8080/6726
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I65981c89e5292086809ec1268f5a273f4c1fe054
Gerrit-PatchSet: 16
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4864 Speed up single slot predicates with dictionaries

2017-06-05 Thread Zach Amsden (Code Review)
Zach Amsden has uploaded a new patch set (#16).

Change subject: IMPALA-4864 Speed up single slot predicates with dictionaries
..

IMPALA-4864 Speed up single slot predicates with dictionaries

When dictionaries are present we can pre-evaluate conjuncts
against the dictionary values and simply look up the result.

Status: Runs with ASAN, runs without crashes on ee tests.  Performance
results inconclusive; this may not be worth the complexity unless we
get really selective or really expensive predicates.

Basic idea: since we codegen so early, before we know enough details
about the columns to know if they are dict filterable, if we do have
dictionary filtering predicates, we codegen a guard around each
dictionary filterable predicate evaluation.  This guard skips
evaluation of the predicate if it has already been evaluated by the
dictionary.  In this way, we can skip evaluation dynamically for
each row group as we learn which columns are dictionary filterable,
and then push predicate evaluation into the column reader.  Since the
branches will remain 100% predictable over the row group, this should
give us the fastest way to skip over predicate evaluation without
compromising the general case where we may be unable to evaluate
against the dictionary.  We can even do this with codegen turned
off, as a side effect of the way we generate the codegen'd function
when dictionary evaluation is enabled.

If dictionaries aren't available for some predicates, we automatically
fall back to evaluating those predicates in the original order,
preserving selectivity.  The overhead in this case is a perfectly
predictable extra conditional per dictionary predicate.

Performance:  Hard to get!  Simple predicates did not show
improvement, in fact regressed.  I used a TPC-H scale 30 dataset,
duplicated 3x into a 'biglineitem' table.

 select count(*) from biglineitem WHERE l_returnflag = 'A';
1.43s -> 1.53s

 select count(*) from biglineitem WHERE l_shipinstruct = 'DELIVER IN
PERSON';
1.43s -> 1.53s

 select count(*) from biglineitem WHERE l_quantity > 49;

0.93s -> 0.93s

 select count(*) from biglineitem WHERE instr(l_shipdate, '1994-11') >
0;

2.33s -> 1.03s

So this appears to only be a win for expensive predicates.

Update: I added changes to make computed predicate costs visible from
the frontend to the backend, and tried a TPC-DS scale 10 dataset, which
has better queries (lots of IN groups).  Still there is a regression in
raw query performance.

Update again: reversing one branch to UNLIKELY in the ir compiled code
gave these results on TPC-DSx10:

Q46 (limit modified to 10):
  3.05 -> 2.89 sec
Q27 (limit modified to 10):
  3.13 -> 3.09 sec

Update (6.5.2017): Switched back to bitset evaluation for scratch batch
rather than using an extra byte per tuple row.  Surprisingly, this is
the best performing diff yet for selective predicates.  Using TPC-DS-10
with a filter:

Query: select count(*) from
  store_sales,
  customer,
  customer_address
where store_sales.ss_customer_sk = customer.c_customer_sk and
  customer.c_current_addr_sk = customer_address.ca_address_sk
 and customer_address.ca_city IN (... list of ~200 cities)

This almost makes parity.  I get 3836140 rows in 1.41-1.45s with this
diff and in 1.39-1.45s with the same caching optimization for batch_size
on top of the last change (Tim's parquest column reader optimizations).
So in a totally fair comparison, we are still losing :(

Change-Id: I65981c89e5292086809ec1268f5a273f4c1fe054
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/exec/exec-node.cc
M be/src/exec/exec-node.h
M be/src/exec/hdfs-parquet-scanner-ir.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scanner.h
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-readers.h
M be/src/exec/parquet-scratch-tuple-batch.h
M be/src/runtime/descriptors.h
M be/src/runtime/row-batch.h
M be/src/runtime/tuple.h
M be/src/util/bitmap-test.cc
M be/src/util/bitmap.h
M be/src/util/dict-encoding.h
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
20 files changed, 552 insertions(+), 165 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/6726/16
-- 
To view, visit http://gerrit.cloudera.org:8080/6726
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I65981c89e5292086809ec1268f5a273f4c1fe054
Gerrit-PatchSet: 16
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 

[Impala-ASF-CR] IMPALA-4864 Speed up single slot predicates with dictionaries

2017-06-01 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-4864 Speed up single slot predicates with dictionaries
..


Patch Set 14:

I'm getting kind of disheartened by this approach.  The biggest obstacle is 
that we don't know ahead of time what conjuncts will be useful for filtering, 
as we don't know which columns are fully dictionary encoded.  None of the 
approaches I have tried really make parity with the baseline, and only very 
expensive conjuncts on sets of small to mid-size distinct values appear to be a 
win.

It is hard to come up with a data set and query that perform much better 
without artificially making poor queries to begin with.  Nothing reasonable 
I've tried on TPC-Hx30 or TPC-DSx10 have produced great results.

-- 
To view, visit http://gerrit.cloudera.org:8080/6726
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I65981c89e5292086809ec1268f5a273f4c1fe054
Gerrit-PatchSet: 14
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4864 Speed up single slot predicates with dictionaries

2017-06-01 Thread Zach Amsden (Code Review)
Zach Amsden has uploaded a new patch set (#14).

Change subject: IMPALA-4864 Speed up single slot predicates with dictionaries
..

IMPALA-4864 Speed up single slot predicates with dictionaries

When dictionaries are present we can pre-evaluate conjuncts
against the dictionary values and simply look up the result.

Status: Runs with ASAN, runs without crashes on ee tests.  Performance
results inconclusive; this may not be worth the complexity unless we
get really selective or really expensive predicates.

Basic idea: since we codegen so early, before we know enough details
about the columns to know if they are dict filterable, if we do have
dictionary filtering predicates, we codegen a guard around each
dictionary filterable predicate evaluation.  This guard skips
evaluation of the predicate if it has already been evaluated by the
dictionary.  In this way, we can skip evaluation dynamically for
each row group as we learn which columns are dictionary filterable,
and then push predicate evaluation into the column reader.  Since the
branches will remain 100% predictable over the row group, this should
give us the fastest way to skip over predicate evaluation without
compromising the general case where we may be unable to evaluate
against the dictionary.  We can even do this with codegen turned
off, as a side effect of the way we generate the codegen'd function
when dictionary evaluation is enabled.

If dictionaries aren't available for some predicates, we automatically
fall back to evaluating those predicates in the original order,
preserving selectivity.  The overhead in this case is a perfectly
predictable extra conditional per dictionary predicate.

Performance:  Hard to get!  Simple predicates did not show
improvement, in fact regressed.  I used a TPC-H scale 30 dataset,
duplicated 3x into a 'biglineitem' table.

 select count(*) from biglineitem WHERE l_returnflag = 'A';
1.43s -> 1.53s

 select count(*) from biglineitem WHERE l_shipinstruct = 'DELIVER IN
PERSON';
1.43s -> 1.53s

 select count(*) from biglineitem WHERE l_quantity > 49;

0.93s -> 0.93s

 select count(*) from biglineitem WHERE instr(l_shipdate, '1994-11') >
0;

2.33s -> 1.03s

So this appears to only be a win for expensive predicates.

Update: I added changes to make computed predicate costs visible from
the frontend to the backend, and tried a TPC-DS scale 10 dataset, which
has better queries (lots of IN groups).  Still there is a regression in
raw query performance.

Change-Id: I65981c89e5292086809ec1268f5a273f4c1fe054
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/exec/exec-node.cc
M be/src/exec/exec-node.h
M be/src/exec/hdfs-parquet-scanner-ir.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scanner.h
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-readers.h
M be/src/exec/parquet-scratch-tuple-batch.h
M be/src/runtime/descriptors.h
M be/src/runtime/row-batch.h
M be/src/runtime/tuple.h
M be/src/util/bitmap-test.cc
M be/src/util/bitmap.h
M be/src/util/dict-encoding.h
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
20 files changed, 527 insertions(+), 142 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/6726/14
-- 
To view, visit http://gerrit.cloudera.org:8080/6726
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I65981c89e5292086809ec1268f5a273f4c1fe054
Gerrit-PatchSet: 14
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-4988: Add query option read parquet statistics

2017-06-01 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-4988: Add query option read_parquet_statistics
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7001/1/testdata/workloads/functional-query/queries/QueryTest/parquet_stats.test
File testdata/workloads/functional-query/queries/QueryTest/parquet_stats.test:

Line 455:  RESULTS
> What is the difference to =0?
Even better test - read a parquet file with stats filters that filter ~50%, 
select count(*), then disable the filters and validate the count is the same.


-- 
To view, visit http://gerrit.cloudera.org:8080/7001
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I427f7fde40d0f4b703751e40f3c2109a850643f7
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Wood 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5164: Fix flaky benchmarks

2017-05-31 Thread Zach Amsden (Code Review)
Hello Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6935

to look at the new patch set (#6).

Change subject: IMPALA-5164: Fix flaky benchmarks
..

IMPALA-5164: Fix flaky benchmarks

Improve benchmarks by detecting involuntary context switches.
If a server is heavily loaded due to some other jobs running,
benchmarking will not be reliable.  We can detect this by using
getrusage() to measure context switches.  If 90% or more of
benchmark time is lost due to switching, we abort the benchmark,
but catch the failure and exit silently.  We subtract out the
time during which we might have been switched out from the total
result time, and only count runs which were uninterrupted.

If 10% of the result is valid, that seems barely legitimate,
but we will warn when anything less than 50% of the benchmark
ran without switching.

Testing: ran the free-lists-benchmark until the first test passed
and then started "make -j 60" on an Impala directory.  Got this
result:

E0519 23:35:41.482010  8766 benchmark.cc:161] Benchmark failed to
complete due to context switching.
W0519 23:35:41.548840  8766 benchmark.cc:162] Exiting silently from
FreeLists 0 iters on 128 kb to avoid spurious test failure.

Also, fixed two benchmarks that had crashes on start due to static
initialization order problems and missing pieces of initialization.
Some benchmarks are still crashing (expr-benchmark and
network-perf-be), but those will require a lot more work to fix.

Change-Id: I7e0ddfaa9bd1340990ae2b6ff946d72bbd26c274
---
M be/src/benchmarks/free-lists-benchmark.cc
M be/src/benchmarks/hash-benchmark.cc
M be/src/benchmarks/lock-benchmark.cc
M be/src/util/benchmark.cc
M be/src/util/benchmark.h
5 files changed, 107 insertions(+), 37 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/6935/6
-- 
To view, visit http://gerrit.cloudera.org:8080/6935
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7e0ddfaa9bd1340990ae2b6ff946d72bbd26c274
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-5164: Fix flaky benchmarks

2017-05-31 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-5164: Fix flaky benchmarks
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6935/5/be/src/benchmarks/free-lists-benchmark.cc
File be/src/benchmarks/free-lists-benchmark.cc:

Line 323: static SystemAllocator* allocator = nullptr;
> what's the reason for this instead of static allocation? leave comment, so 
Undefined behavior due to undefined static initialization order in the 
SystemAllocator class.  Will leave a comment.


-- 
To view, visit http://gerrit.cloudera.org:8080/6935
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7e0ddfaa9bd1340990ae2b6ff946d72bbd26c274
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4864 Speed up single slot predicates with dictionaries

2017-05-31 Thread Zach Amsden (Code Review)
Zach Amsden has uploaded a new patch set (#13).

Change subject: IMPALA-4864 Speed up single slot predicates with dictionaries
..

IMPALA-4864 Speed up single slot predicates with dictionaries

When dictionaries are present we can pre-evaluate conjuncts
against the dictionary values and simply look up the result.

Status: Ready for review.  Runs with ASAN, runs without crashes
on ee tests.

Basic idea: since we codegen so early, before we know enough details
about the columns to know if they are dict filterable, if we do have
dictionary filtering predicates, we codegen a guard around each
dictionary filterable predicate evaluation.  This guard skips
evaluation of the predicate if it has already been evaluated by the
dictionary.  In this way, we can skip evaluation dynamically for
each row group as we learn which columns are dictionary filterable,
and then push predicate evaluation into the column reader.  Since the
branches will remain 100% predictable over the row group, this should
give us the fastest way to skip over predicate evaluation without
compromising the general case where we may be unable to evaluate
against the dictionary.  We can even do this with codegen turned
off, as a side effect of the way we generate the codegen'd function
when dictionary evaluation is enabled.

If dictionaries aren't available for some predicates, we automatically
fall back to evaluating those predicates in the original order,
preserving selectivity.  The overhead in this case is a perfectly
predictable extra conditional per dictionary predicate.

Performance:  Hard to get!  Simple predicates did not show
improvement, in fact regressed.  I used a TPC-H scale 30 dataset,
duplicated 3x into a 'biglineitem' table.

 select count(*) from biglineitem WHERE l_returnflag = 'A';
1.43s -> 1.53s

 select count(*) from biglineitem WHERE l_shipinstruct = 'DELIVER IN
PERSON';
1.43s -> 1.53s

 select count(*) from biglineitem WHERE l_quantity > 49;

0.93s -> 0.93s

 select count(*) from biglineitem WHERE instr(l_shipdate, '1994-11') >
0;

2.33s -> 1.03s

So this appears to only be a win for expensive predicates.

Change-Id: I65981c89e5292086809ec1268f5a273f4c1fe054
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/exec/exec-node.cc
M be/src/exec/exec-node.h
M be/src/exec/hdfs-parquet-scanner-ir.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scanner.h
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-readers.h
M be/src/exec/parquet-scratch-tuple-batch.h
M be/src/runtime/descriptors.h
M be/src/runtime/row-batch.h
M be/src/runtime/tuple.h
M be/src/util/bitmap-test.cc
M be/src/util/bitmap.h
M be/src/util/dict-encoding.h
18 files changed, 527 insertions(+), 150 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/6726/13
-- 
To view, visit http://gerrit.cloudera.org:8080/6726
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I65981c89e5292086809ec1268f5a273f4c1fe054
Gerrit-PatchSet: 13
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-4864 Speed up single slot predicates with dictionaries

2017-05-31 Thread Zach Amsden (Code Review)
Zach Amsden has uploaded a new patch set (#12).

Change subject: IMPALA-4864 Speed up single slot predicates with dictionaries
..

IMPALA-4864 Speed up single slot predicates with dictionaries

When dictionaries are present we can pre-evaluate conjuncts
against the dictionary values and simply look up the result.

Status: Ready for review.  Seems to be passing all tests, runs with
ASAN, and gives expected results.  Need to come up with some specific
test cases that exercise this functionality and measure performance.

Basic idea: since we codegen so early, before we know enough details
about the columns to know if they are dict filterable, if we do have
dictionary filtering predicates, we codegen a guard around each
dictionary filterable predicate evaluation.  This guard skips
evaluation of the predicate if it has already been evaluated by the
dictionary.  In this way, we can skip evaluation dynamically for
each row group as we learn which columns are dictionary filterable,
and then push predicate evaluation into the column reader.  Since the
branches will remain 100% predictable over the row group, this should
give us the fastest way to skip over predicate evaluation without
compromising the general case where we may be unable to evaluate
against the dictionary.  We can even do this with codegen turned
off, as a side effect of the way we generate the codegen'd function
when dictionary evaluation is enabled.

If dictionaries aren't available for some predicates, we automatically
fall back to evaluating those predicates in the original order,
preserving selectivity.  The overhead in this case is a perfectly
predictable extra conditional per dictionary predicate.

Change-Id: I65981c89e5292086809ec1268f5a273f4c1fe054
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/exec/exec-node.cc
M be/src/exec/exec-node.h
M be/src/exec/hdfs-parquet-scanner-ir.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scanner.h
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-readers.h
M be/src/exec/parquet-scratch-tuple-batch.h
M be/src/runtime/descriptors.h
M be/src/runtime/row-batch.h
M be/src/runtime/tuple.h
M be/src/util/bitmap-test.cc
M be/src/util/bitmap.h
M be/src/util/dict-encoding.h
18 files changed, 527 insertions(+), 150 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/6726/12
-- 
To view, visit http://gerrit.cloudera.org:8080/6726
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I65981c89e5292086809ec1268f5a273f4c1fe054
Gerrit-PatchSet: 12
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-4864 Speed up single slot predicates with dictionaries

2017-05-30 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-4864 Speed up single slot predicates with dictionaries
..


Patch Set 9:

(13 comments)

http://gerrit.cloudera.org:8080/#/c/6726/9/be/src/exec/exec-node.cc
File be/src/exec/exec-node.cc:

PS9, Line 567: skippable_map
> skippable_map != nullptr
Done


PS9, Line 586: 0 
> huh ?
Was diverting for testing


PS9, Line 586: (*skippable_map)[i] == true
> (*skippable_map)[i]
Done


PS9, Line 590: bool
> skip_val
Done


http://gerrit.cloudera.org:8080/#/c/6726/9/be/src/exec/hdfs-parquet-scanner-ir.cc
File be/src/exec/hdfs-parquet-scanner-ir.cc:

PS9, Line 88: LOG(INFO) << "tuple idx: " << tuple_index;
> debugging output ?
Yeah, removed this


http://gerrit.cloudera.org:8080/#/c/6726/9/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

PS9, Line 1050: std::map&
> DictionaryFilterPositionMap
I think it will need to be HdfsScanNodeBase::DictionaryFilterPositionMap - not 
sure even a using declaration will help have the syntax, but I'll try.


http://gerrit.cloudera.org:8080/#/c/6726/9/be/src/exec/hdfs-scan-node-base.h
File be/src/exec/hdfs-scan-node-base.h:

PS9, Line 167: typedef std::map 
DictFilterConjunctsPositionMap;
> Comment what this map stands for ?
Done


PS9, Line 366: Mapping to original position in conjuncts
> Mapping from SlotId to ...
Done


http://gerrit.cloudera.org:8080/#/c/6726/9/be/src/exec/parquet-column-readers.cc
File be/src/exec/parquet-column-readers.cc:

PS9, Line 206: IS_FILTERED
> Comment what IS_FILTERED stands for.
Done


PS9, Line 240: (filters == nullptr) ^ IS_FILTERED
> Is there a reason to not use DCHECK((filters == nullptr) != IS_FILTERED) ?
1 fewer character?  I can change this though.


http://gerrit.cloudera.org:8080/#/c/6726/9/be/src/exec/parquet-column-readers.h
File be/src/exec/parquet-column-readers.h:

PS9, Line 390: return false;
> Is this hard coded for testing ? If not, please document why it's always fa
No, it gets overridden in the derived class.


http://gerrit.cloudera.org:8080/#/c/6726/9/be/src/util/bitmap.h
File be/src/util/bitmap.h:

PS9, Line 86: (mask & buffer_[word_index])
> (mask & buffer[word_index]) == 0
Done


PS9, Line 99: buffer_.size()
> num_words_
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6726
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I65981c89e5292086809ec1268f5a273f4c1fe054
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5164: Fix flaky benchmarks

2017-05-30 Thread Zach Amsden (Code Review)
Hello Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6935

to look at the new patch set (#5).

Change subject: IMPALA-5164: Fix flaky benchmarks
..

IMPALA-5164: Fix flaky benchmarks

Improve benchmarks by detecting involuntary context switches.
If a server is heavily loaded due to some other jobs running,
benchmarking will not be reliable.  We can detect this by using
getrusage() to measure context switches.  If 90% or more of
benchmark time is lost due to switching, we abort the benchmark,
but catch the failure and exit silently.  We subtract out the
time during which we might have been switched out from the total
result time, and only count runs which were uninterrupted.

If 10% of the result is valid, that seems barely legitimate,
but we will warn when anything less than 50% of the benchmark
ran without switching.

Testing: ran the free-lists-benchmark until the first test passed
and then started "make -j 60" on an Impala directory.  Got this
result:

E0519 23:35:41.482010  8766 benchmark.cc:161] Benchmark failed to
complete due to context switching.
W0519 23:35:41.548840  8766 benchmark.cc:162] Exiting silently from
FreeLists 0 iters on 128 kb to avoid spurious test failure.

Also, fixed two benchmarks that had crashes on start due to static
initialization order problems and missing pieces of initialization.
Some benchmarks are still crashing (expr-benchmark and
network-perf-be), but those will require a lot more work to fix.

Change-Id: I7e0ddfaa9bd1340990ae2b6ff946d72bbd26c274
---
M be/src/benchmarks/free-lists-benchmark.cc
M be/src/benchmarks/hash-benchmark.cc
M be/src/benchmarks/lock-benchmark.cc
M be/src/util/benchmark.cc
M be/src/util/benchmark.h
5 files changed, 105 insertions(+), 37 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/6935/5
-- 
To view, visit http://gerrit.cloudera.org:8080/6935
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7e0ddfaa9bd1340990ae2b6ff946d72bbd26c274
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-5164: Fix flaky benchmarks

2017-05-30 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-5164: Fix flaky benchmarks
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6935/4/be/src/util/benchmark.h
File be/src/util/benchmark.h:

Line 84:   bool micro_;
> Nit: maybe call this micro_heuristics so it's more obvious that the constru
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6935
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7e0ddfaa9bd1340990ae2b6ff946d72bbd26c274
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4864 Speed up single slot predicates with dictionaries

2017-05-26 Thread Zach Amsden (Code Review)
Zach Amsden has uploaded a new patch set (#11).

Change subject: IMPALA-4864 Speed up single slot predicates with dictionaries
..

IMPALA-4864 Speed up single slot predicates with dictionaries

When dictionaries are present we can pre-evaluate conjuncts
against the dictionary values and simply look up the result.

Status: Ready for review.  Seems to be passing all tests, runs with
ASAN, and gives expected results.  Need to come up with some specific
test cases that exercise this functionality and measure performance.

Basic idea: since we codegen so early, before we know enough details
about the columns to know if they are dict filterable, if we do have
dictionary filtering predicates, we codegen a guard around each
dictionary filterable predicate evaluation.  This guard skips
evaluation of the predicate if it has already been evaluated by the
dictionary.  In this way, we can skip evaluation dynamically for
each row group as we learn which columns are dictionary filterable,
and then push predicate evaluation into the column reader.  Since the
branches will remain 100% predictable over the row group, this should
give us the fastest way to skip over predicate evaluation without
compromising the general case where we may be unable to evaluate
against the dictionary.  We can even do this with codegen turned
off, as a side effect of the way we generate the codegen'd function
when dictionary evaluation is enabled.

If dictionaries aren't available for some predicates, we automatically
fall back to evaluating those predicates in the original order,
preserving selectivity.  The overhead in this case is a perfectly
predictable extra conditional per dictionary predicate.

Change-Id: I65981c89e5292086809ec1268f5a273f4c1fe054
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/exec/exec-node.cc
M be/src/exec/exec-node.h
M be/src/exec/hash-join-node.cc
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-parquet-scanner-ir.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scanner.h
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-readers.h
M be/src/exec/parquet-scratch-tuple-batch.h
M be/src/exec/partitioned-hash-join-node.cc
M be/src/runtime/descriptors.h
M be/src/runtime/row-batch.h
M be/src/runtime/tuple.h
M be/src/util/bitmap-test.cc
M be/src/util/bitmap.h
M be/src/util/dict-encoding.h
21 files changed, 511 insertions(+), 148 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/6726/11
-- 
To view, visit http://gerrit.cloudera.org:8080/6726
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I65981c89e5292086809ec1268f5a273f4c1fe054
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-4864 Speed up single slot predicates with dictionaries

2017-05-26 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-4864 Speed up single slot predicates with dictionaries
..


Patch Set 10:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/6726/10/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

Line 167:   static const int MAX_BATCH_SIZE = 65536;
Didn't end up needing this.


http://gerrit.cloudera.org:8080/#/c/6726/10/be/src/util/bitmap.h
File be/src/util/bitmap.h:

Line 79:   /// Returns next set bit, or num_bits_ if the end is reached.
This function ended up unused as well.  What shall we do with it?


http://gerrit.cloudera.org:8080/#/c/6726/10/testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test:

Line 25:parquet dictionary predicates: int_col IS NULL, int_col > 1
Bogus test change needs to be backed out.


-- 
To view, visit http://gerrit.cloudera.org:8080/6726
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I65981c89e5292086809ec1268f5a273f4c1fe054
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4864 Speed up single slot predicates with dictionaries

2017-05-26 Thread Zach Amsden (Code Review)
Zach Amsden has uploaded a new patch set (#10).

Change subject: IMPALA-4864 Speed up single slot predicates with dictionaries
..

IMPALA-4864 Speed up single slot predicates with dictionaries

When dictionaries are present we can pre-evaluate conjuncts
against the dictionary values and simply look up the result.

Status: Ready for review.  Seems to be passing all tests, runs with
ASAN, and gives expected results.  Need to come up with some specific
test cases that exercise this functionality and measure performance.

Basic idea: since we codegen so early, before we know enough details
about the columns to know if they are dict filterable, if we do have
dictionary filtering predicates, we codegen a guard around each
dictionary filterable predicate evaluation.  This guard skips
evaluation of the predicate if it has already been evaluated by the
dictionary.  In this way, we can skip evaluation dynamically for
each row group as we learn which columns are dictionary filterable,
and then push predicate evaluation into the column reader.  Since the
branches will remain 100% predictable over the row group, this should
give us the fastest way to skip over predicate evaluation without
compromising the general case where we may be unable to evaluate
against the dictionary.  We can even do this with codegen turned
off, as a side effect of the way we generate the codegen'd function
when dictionary evaluation is enabled.

If dictionaries aren't available for some predicates, we automatically
fall back to evaluating those predicates in the original order,
preserving selectivity.  The overhead in this case is a perfectly
predictable extra conditional per dictionary predicate.

Change-Id: I65981c89e5292086809ec1268f5a273f4c1fe054
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/exec/exec-node.cc
M be/src/exec/exec-node.h
M be/src/exec/hash-join-node.cc
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-parquet-scanner-ir.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scanner.h
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-readers.h
M be/src/exec/parquet-scratch-tuple-batch.h
M be/src/exec/partitioned-hash-join-node.cc
M be/src/runtime/descriptors.h
M be/src/runtime/query-state.h
M be/src/runtime/row-batch.h
M be/src/runtime/tuple.h
M be/src/util/bitmap-test.cc
M be/src/util/bitmap.h
M be/src/util/dict-encoding.h
M 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test
23 files changed, 510 insertions(+), 150 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/6726/10
-- 
To view, visit http://gerrit.cloudera.org:8080/6726
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I65981c89e5292086809ec1268f5a273f4c1fe054
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 


  1   2   3   4   >