[Impala-ASF-CR] IMPALA-3884: Support TYPE TIMESTAMP for HashTableCtx::CodegenAssignNullValue()

2016-10-24 Thread Michael Ho (Code Review)
Michael Ho has submitted this change and it was merged.

Change subject: IMPALA-3884: Support TYPE_TIMESTAMP for 
HashTableCtx::CodegenAssignNullValue()
..


IMPALA-3884: Support TYPE_TIMESTAMP for HashTableCtx::CodegenAssignNullValue()

This change implements support for TYPE_TIMESTAMP for
HashTableCtx::CodegenAssignNullValue(). TimestampValue itself
is 16 bytes in size. To match RawValue::Write() in the
interpreted path, CodegenAssignNullValue() emits code to assign
HashUtil::FNV_SEED to both the upper and lower 64-bit of the
destination value. This change also fixes the handling of 128-bit
Decimal16Value in CodegenAssignNullValue() so the emitted code
matches the behavior of the interpreted path.

Change-Id: I0211d38cbef46331e0006fa5ed0680e6e0867bc8
Reviewed-on: http://gerrit.cloudera.org:8080/4794
Reviewed-by: Michael Ho 
Tested-by: Michael Ho 
---
M be/src/codegen/codegen-anyval.cc
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exec/hash-table.cc
M be/src/exec/old-hash-table.cc
M be/src/util/bit-util.h
M testdata/workloads/functional-query/queries/QueryTest/joins.test
7 files changed, 72 insertions(+), 37 deletions(-)

Approvals:
  Michael Ho: Looks good to me, approved; Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/4794
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I0211d38cbef46331e0006fa5ed0680e6e0867bc8
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3211: provide toolchain build id for bootstrapping

2016-10-24 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-3211: provide toolchain build id for bootstrapping
..


IMPALA-3211: provide toolchain build id for bootstrapping

Testing:
Ran a private build, which succeeded.

Change-Id: Ibcc25ae82511713d0ff05ded37ef162925f2f0fb
Reviewed-on: http://gerrit.cloudera.org:8080/4771
Reviewed-by: Tim Armstrong 
Tested-by: Internal Jenkins
---
M bin/bootstrap_toolchain.py
M bin/impala-config.sh
2 files changed, 14 insertions(+), 2 deletions(-)

Approvals:
  Internal Jenkins: Verified
  Tim Armstrong: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/4771
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ibcc25ae82511713d0ff05ded37ef162925f2f0fb
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3211: provide toolchain build id for bootstrapping

2016-10-24 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-3211: provide toolchain build id for bootstrapping
..


Patch Set 4: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4771
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibcc25ae82511713d0ff05ded37ef162925f2f0fb
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4339: ensure coredumps end up in IMPALA HOME

2016-10-24 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4339: ensure coredumps end up in IMPALA_HOME
..


Patch Set 2: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4785
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibc34d152139653374f940dc3edbca08e749bf55e
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4339: ensure coredumps end up in IMPALA HOME

2016-10-24 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4339: ensure coredumps end up in IMPALA_HOME
..


IMPALA-4339: ensure coredumps end up in IMPALA_HOME

Change-Id: Ibc34d152139653374f940dc3edbca08e749bf55e
Reviewed-on: http://gerrit.cloudera.org:8080/4785
Reviewed-by: Tim Armstrong 
Tested-by: Internal Jenkins
---
M buildall.sh
1 file changed, 3 insertions(+), 0 deletions(-)

Approvals:
  Internal Jenkins: Verified
  Tim Armstrong: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/4785
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ibc34d152139653374f940dc3edbca08e749bf55e
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4223: Handle truncated file read from HDFS cache

2016-10-24 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-4223: Handle truncated file read from HDFS cache
..


Patch Set 1:

> (1 comment)
 > 
 > The change makes sense and we should get it in regardless, but
 > could this have caused the original crash we saw?

I don't think this caused what we saw. I removed the DCHECK to see what would 
happen, since the original stack was from a Release build, but ran into another 
DCHECk. After removing that one (and a third one), too, the queries just 
started to "work", reporting a different warning about not being able to read 
from the file. The same also happened with a Release build, which I was unable 
to crash.

-- 
To view, visit http://gerrit.cloudera.org:8080/4828
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id1e1fdb0211819c5938956abb13b512350a46f1a
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3884: Support TYPE TIMESTAMP for HashTableCtx::CodegenAssignNullValue()

2016-10-24 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-3884: Support TYPE_TIMESTAMP for 
HashTableCtx::CodegenAssignNullValue()
..


Patch Set 3: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4794
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0211d38cbef46331e0006fa5ed0680e6e0867bc8
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3884: Support TYPE TIMESTAMP for HashTableCtx::CodegenAssignNullValue()

2016-10-24 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-3884: Support TYPE_TIMESTAMP for 
HashTableCtx::CodegenAssignNullValue()
..


Patch Set 3:

http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/344/

-- 
To view, visit http://gerrit.cloudera.org:8080/4794
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0211d38cbef46331e0006fa5ed0680e6e0867bc8
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3884: Support TYPE TIMESTAMP for HashTableCtx::CodegenAssignNullValue()

2016-10-24 Thread Michael Ho (Code Review)
Michael Ho has uploaded a new patch set (#3).

Change subject: IMPALA-3884: Support TYPE_TIMESTAMP for 
HashTableCtx::CodegenAssignNullValue()
..

IMPALA-3884: Support TYPE_TIMESTAMP for HashTableCtx::CodegenAssignNullValue()

This change implements support for TYPE_TIMESTAMP for
HashTableCtx::CodegenAssignNullValue(). TimestampValue itself
is 16 bytes in size. To match RawValue::Write() in the
interpreted path, CodegenAssignNullValue() emits code to assign
HashUtil::FNV_SEED to both the upper and lower 64-bit of the
destination value. This change also fixes the handling of 128-bit
Decimal16Value in CodegenAssignNullValue() so the emitted code
matches the behavior of the interpreted path.

Change-Id: I0211d38cbef46331e0006fa5ed0680e6e0867bc8
---
M be/src/codegen/codegen-anyval.cc
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exec/hash-table.cc
M be/src/exec/old-hash-table.cc
M be/src/util/bit-util.h
M testdata/workloads/functional-query/queries/QueryTest/joins.test
7 files changed, 72 insertions(+), 37 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/94/4794/3
-- 
To view, visit http://gerrit.cloudera.org:8080/4794
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0211d38cbef46331e0006fa5ed0680e6e0867bc8
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4309: Introduce Expr rewrite phase and supporting classes.

2016-10-24 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-4309: Introduce Expr rewrite phase and supporting 
classes.
..


Patch Set 6:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/4746/6/fe/src/main/java/org/apache/impala/planner/HdfsPartitionPruner.java
File fe/src/main/java/org/apache/impala/planner/HdfsPartitionPruner.java:

Line 117: // canEvalUsingPartitionMd() to fold constant expressions 
in-place.
comment why the cloning is necessary.


http://gerrit.cloudera.org:8080/#/c/4746/5/fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java
File fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java:

Line 38: if (!(expr instanceof BetweenPredicate)) return expr;
> 1. Sure, the driver can keep track of changes instead, but it seems like we
i think your example illustrates my point about making multiple passes.

a remove-redundant-conjuncts rule would look for a compound predicate and 
return a new compound predicate. a normalization rule would look for nested 
compound predicates and unnest them. (i agree that a rule needs to be able to 
look into the child exprs, but that can easily be done with tree predicates.) 
you'd want to apply the normalization rule repeatedly, until the tree stops 
changing, and then the redundancy removal rule.


http://gerrit.cloudera.org:8080/#/c/4746/6/fe/src/main/java/org/apache/impala/rewrite/ExprRewriter.java
File fe/src/main/java/org/apache/impala/rewrite/ExprRewriter.java:

Line 40:   public Expr rewrite(Expr expr, Analyzer analyzer) throws 
AnalysisException {
i would have expected this to apply a single rule to the entire expr tree until 
there are no more changes, then switch to the next rule, etc.

i guess it doesn't matter at the moment, since both approaches are equivalent 
for the between rule. probably best to see some more rules before tweaking the 
traversal patterns.


Line 52: Expr origExpr = rewrittenExpr.clone();
why is this necessary? if the convention is that apply() always returns the 
original expr if no transformation was done, then a pointer comparison should 
be sufficient.


-- 
To view, visit http://gerrit.cloudera.org:8080/4746
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I2279dc984bcf7742db4fa3b1aa67283ecbb05e6e
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4134,IMPALA-3704: Kudu INSERT improvements

2016-10-24 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4134,IMPALA-3704: Kudu INSERT improvements
..


IMPALA-4134,IMPALA-3704: Kudu INSERT improvements

1.) IMPALA-4134: Use Kudu AUTO FLUSH
Improves performance of writes to Kudu up to 4.2x in
bulk data loading tests (load 200 million rows from
lineitem).

2.) IMPALA-3704: Improve errors on PK conflicts
The Kudu client reports an error for every PK conflict,
and all errors were being returned in the error status.
As a result, inserts/updates/deletes could return errors
with thousands errors reported. This changes the error
handling to log all reported errors as warnings and
return only the first error in the query error status.

3.) Improve the DataSink reporting of the insert stats.
The per-partition stats returned by the data sink weren't
useful for Kudu sinks. Firstly, the number of appended rows
was not being displayed in the profile. Secondly, the
'stats' field isn't populated for Kudu tables and thus was
confusing in the profile, so it is no longer printed if it
is not set in the thrift struct.

Testing: Ran local tests, including new tests to verify
the query profile insert stats. Manual cluster testing was
conducted of the AUTO FLUSH functionality, and that testing
informed the default mutation buffer value of 100MB which
was found to provide good results.

Change-Id: I5542b9a061b01c543a139e8722560b1365f06595
Reviewed-on: http://gerrit.cloudera.org:8080/4728
Reviewed-by: Matthew Jacobs 
Tested-by: Internal Jenkins
---
M be/src/exec/data-sink.cc
M be/src/exec/hbase-table-sink.cc
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-table-sink.h
M be/src/runtime/coordinator.cc
M be/src/service/impala-beeswax-server.cc
M be/src/service/query-exec-state.cc
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M common/thrift/generate_error_codes.py
M shell/impala_client.py
M testdata/workloads/functional-query/queries/QueryTest/kudu_crud.test
M tests/beeswax/impala_beeswax.py
14 files changed, 248 insertions(+), 97 deletions(-)

Approvals:
  Matthew Jacobs: Looks good to me, approved
  Internal Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/4728
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I5542b9a061b01c543a139e8722560b1365f06595
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-24 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-2521: Add clustered hint to insert statements
..


Patch Set 10: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: No


[Impala-ASF-CR] Add distcc infrastructure.

2016-10-24 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: Add distcc infrastructure.
..


Add distcc infrastructure.

This has been working for several months, and it it was written mainly
by Casey Ching while he was at Cloudera working on Impala.

Change-Id: Ia4bc78ad46dda13e4533183195af632f46377cae
Reviewed-on: http://gerrit.cloudera.org:8080/4820
Reviewed-by: Jim Apple 
Tested-by: Internal Jenkins
---
M .gitignore
A bin/distcc/.gitignore
A bin/distcc/README.md
A bin/distcc/distcc.sh
A bin/distcc/distcc_env.sh
5 files changed, 330 insertions(+), 1 deletion(-)

Approvals:
  Jim Apple: Looks good to me, approved
  Internal Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/4820
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ia4bc78ad46dda13e4533183195af632f46377cae
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] Add distcc infrastructure.

2016-10-24 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: Add distcc infrastructure.
..


Patch Set 3: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4820
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4bc78ad46dda13e4533183195af632f46377cae
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4223: Handle truncated file read from HDFS cache

2016-10-24 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4223: Handle truncated file read from HDFS cache
..


Patch Set 1:

(1 comment)

The change makes sense and we should get it in regardless, but could this have 
caused the original crash we saw?

http://gerrit.cloudera.org:8080/#/c/4828/1/be/src/runtime/disk-io-mgr-scan-range.cc
File be/src/runtime/disk-io-mgr-scan-range.cc:

Line 445
This is definitely suspicious.

Sometimes I feel like some of the DCHECKS were expressions of hope rather than 
assertions.


-- 
To view, visit http://gerrit.cloudera.org:8080/4828
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id1e1fdb0211819c5938956abb13b512350a46f1a
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3211: provide toolchain build id for bootstrapping

2016-10-24 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3211: provide toolchain build id for bootstrapping
..


Patch Set 4: Code-Review+2

rebase

-- 
To view, visit http://gerrit.cloudera.org:8080/4771
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibcc25ae82511713d0ff05ded37ef162925f2f0fb
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4350: Crash with vlog level 2 in hash join node

2016-10-24 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4830

Change subject: IMPALA-4350: Crash with vlog level 2 in hash join node
..

IMPALA-4350: Crash with vlog level 2 in hash join node

Testing:
Reproduced by running test_mem_usage_scaling against an Impala with
-v=2. Confirmed the fix avoided the crash.

We don't have any routine testing of high log levels so there isn't
a clear way to get good coverage of all code paths where there might be
similar bugs.

Change-Id: Ieedd49d8a17709177a622ddc15b78a3f48e12d3f
---
M be/src/exec/partitioned-hash-join-node.cc
1 file changed, 13 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/4830/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4830
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ieedd49d8a17709177a622ddc15b78a3f48e12d3f
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4362: Misc. fixes for PFE counters

2016-10-24 Thread Henry Robinson (Code Review)
Henry Robinson has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4829

Change subject: IMPALA-4362: Misc. fixes for PFE counters
..

IMPALA-4362: Misc. fixes for PFE counters

* ExecTime was always 0, because it wasn't updated before the profile
  was reported to the coordinator
* Made ExecTree* counters children of their respective parents
* Moved ExecTreePrepareTime into the right profile
* Renamed timings profile to something a bit more obvious.

Change-Id: Iaa9dccdcd91297a11a093b921f8d84d32f32be84
---
M be/src/runtime/plan-fragment-executor.cc
1 file changed, 23 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/29/4829/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4829
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Iaa9dccdcd91297a11a093b921f8d84d32f32be84
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 


[Impala-ASF-CR] Minor compute stats script fixes

2016-10-24 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: Minor compute stats script fixes
..


Patch Set 1: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4825
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I326f4c370fda8d5e388af8e2395623185c06bc07
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-HasComments: No


[Impala-ASF-CR] Minor compute stats script fixes

2016-10-24 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: Minor compute stats script fixes
..


Minor compute stats script fixes

* Change run-step to output full log path
* Change text to say "Computing table stats" rather than "Computing
  HBase stats" when running compute-table-stats.sh

Change-Id: I326f4c370fda8d5e388af8e2395623185c06bc07
Reviewed-on: http://gerrit.cloudera.org:8080/4825
Reviewed-by: Alex Behm 
Tested-by: Internal Jenkins
---
M testdata/bin/create-load-data.sh
M testdata/bin/run-step.sh
2 files changed, 3 insertions(+), 4 deletions(-)

Approvals:
  Internal Jenkins: Verified
  Alex Behm: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/4825
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I326f4c370fda8d5e388af8e2395623185c06bc07
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins


[Impala-ASF-CR] IMPALA-4223: Handle truncated file read from HDFS cache

2016-10-24 Thread Lars Volker (Code Review)
Lars Volker has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4828

Change subject: IMPALA-4223: Handle truncated file read from HDFS cache
..

IMPALA-4223: Handle truncated file read from HDFS cache

While overwriting files on HDFS via Hive it can happen that Impala sees
a partially written, cached file. In these cases we did not correctly
handle the partial cached read.

This change adds a check and triggers a fall back to disk reads for such
errors. If the file is partially written to disk, too, then the query
will report a file corruption warning through the disk read path.

Change-Id: Id1e1fdb0211819c5938956abb13b512350a46f1a
---
M be/src/runtime/disk-io-mgr-scan-range.cc
1 file changed, 13 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/28/4828/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4828
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Id1e1fdb0211819c5938956abb13b512350a46f1a
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 


[Impala-ASF-CR] IMPALA-1169: Admission control info on the queries debug webpage

2016-10-24 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-1169: Admission control info on the queries debug webpage
..


Patch Set 4:

Also I had a previous question about how COMPUTE STATS behaves.

-- 
To view, visit http://gerrit.cloudera.org:8080/4756
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I504e3c829a14318721e3a42de6281bcc578f7283
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-24 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-2521: Add clustered hint to insert statements
..


Patch Set 10: Code-Review+1

Thanks!

-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4339: ensure coredumps end up in IMPALA HOME

2016-10-24 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4339: ensure coredumps end up in IMPALA_HOME
..


Patch Set 2:

Build failed: 
http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/352/

-- 
To view, visit http://gerrit.cloudera.org:8080/4785
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibc34d152139653374f940dc3edbca08e749bf55e
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-24 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-2521: Add clustered hint to insert statements
..


Patch Set 10:

(1 comment)

I added a test. I also had to rebase onto Dimitris' change, so now the diff 
between PS 9 and 10 is a bit messy. Sorry for that.

http://gerrit.cloudera.org:8080/#/c/4745/8/testdata/workloads/functional-planner/queries/PlannerTest/insert.test
File testdata/workloads/functional-planner/queries/PlannerTest/insert.test:

Line 635: |--02:AGGREGATE [FINALIZE]
> Sorry just thought of this. Can you also add a test like this that has a WH
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-24 Thread Lars Volker (Code Review)
Hello Alex Behm,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4745

to look at the new patch set (#10).

Change subject: IMPALA-2521: Add clustered hint to insert statements
..

IMPALA-2521: Add clustered hint to insert statements

This change introduces a clustered/noclustered hint for insert
statements. Specifying this hint adds an additional sort node to the
plan, just before the table sink. This has the effect that data will be
clustered by its partition prior to writing partitions, which therefore
can be written sequentially.

Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
---
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
M fe/src/main/java/org/apache/impala/analysis/SortInfo.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/insert.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
M testdata/workloads/functional-query/queries/QueryTest/insert.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_crud.test
10 files changed, 417 insertions(+), 43 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/4745/10
-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 


[Impala-ASF-CR] Removed dead join inversion code from Analyzer.

2016-10-24 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change.

Change subject: Removed dead join inversion code from Analyzer.
..


Patch Set 2:

Sure I can take over. Most of these can be fixed using "Quick Fix" by the IDE. 
I'll take a look.

-- 
To view, visit http://gerrit.cloudera.org:8080/4827
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9d8fff29c0f6b239796561c877acc709a178c108
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] Removed dead join inversion code from Analyzer.

2016-10-24 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: Removed dead join inversion code from Analyzer.
..


Patch Set 2: Code-Review-1

Do you want to take over the patch Bharath? Please keep in mind that some code 
(even if unused) is still useful, for example, validateValueTransferGraph().

-- 
To view, visit http://gerrit.cloudera.org:8080/4827
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9d8fff29c0f6b239796561c877acc709a178c108
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] Removed dead join inversion code from Analyzer.

2016-10-24 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change.

Change subject: Removed dead join inversion code from Analyzer.
..


Patch Set 1:

The above ones I pasted are apart from invertOuterJoinState() you removed here.

-- 
To view, visit http://gerrit.cloudera.org:8080/4827
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9d8fff29c0f6b239796561c877acc709a178c108
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] Removed dead join inversion code from Analyzer.

2016-10-24 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change.

Change subject: Removed dead join inversion code from Analyzer.
..


Patch Set 1:

I ran the code analysis tool from my IDE and found huge amount of un-used code 
in the frontend. I haven't verified each item but Analyzer class shows the 
following ones.

- isSubquery()
- resetSubquery()
- containsOuterjoinedTid(List)
- getEquivSlots(SlotId)
- removeRedundantExprs(List)
- isConjunctAssigned(Expr)
- hasUnassignedConjuncts()
- getTargetDbName(FunctionName)
- getConjunct(ExprId)
- getEqJoinConjuncts()
- validateValueTransferGraph)

As per the analysis there are 4533 such warnings across the code :)

-- 
To view, visit http://gerrit.cloudera.org:8080/4827
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9d8fff29c0f6b239796561c877acc709a178c108
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4339: ensure coredumps end up in IMPALA HOME

2016-10-24 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4339: ensure coredumps end up in IMPALA_HOME
..


Patch Set 2:

Build failed: 
http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/348/

-- 
To view, visit http://gerrit.cloudera.org:8080/4785
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibc34d152139653374f940dc3edbca08e749bf55e
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4134,IMPALA-3704: Kudu INSERT improvements

2016-10-24 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-4134,IMPALA-3704: Kudu INSERT improvements
..


Patch Set 8: Code-Review+2

(1 comment)

rebased and addressed dan's comment. carrying his +2.

http://gerrit.cloudera.org:8080/#/c/4728/7/be/src/exec/kudu-table-sink.cc
File be/src/exec/kudu-table-sink.cc:

PS7, Line 145: the 
> is this talking about the "Kudu client" or "Kudu server"?
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4728
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5542b9a061b01c543a139e8722560b1365f06595
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3788: Support for Kudu 'read-your-writes' consistency

2016-10-24 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has uploaded a new patch set (#3).

Change subject: IMPALA-3788: Support for Kudu 'read-your-writes' consistency
..

IMPALA-3788: Support for Kudu 'read-your-writes' consistency

Kudu provides an API to get/set a 'latest observed
timestamp' on clients to allow a client which inserts to
capture and send this timestamp to another client before a
read to ensure that data as of that timestamp is visible.
This adds support for this feature _for reads within a
session_ by capturing the latest observed timestamp when the
KuduTableSink is sending its last update to the coordinator.
The timestamp is sent with other post-write information, and
is aggregated (i.e. taking the max) at the coordinator. The
max is stored in the session, and that value is then set in
the Kudu client on future scans.

This is being tested by running the Kudu tests after
removing delays that were introduced to work around the
issue that reads might not be visible after a write. Before
this change, if there were no delay, inconsistent results
could be returned.

Change-Id: I6bcb5fc218ad4ab935343a55b2188441d8c7dfbd
---
M be/src/exec/kudu-scan-node.cc
M be/src/exec/kudu-table-sink.cc
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-exec-state.cc
M common/thrift/ImpalaInternalService.thrift
M tests/common/impala_test_suite.py
M tests/query_test/test_kudu.py
11 files changed, 55 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/4779/3
-- 
To view, visit http://gerrit.cloudera.org:8080/4779
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6bcb5fc218ad4ab935343a55b2188441d8c7dfbd
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 


[Impala-ASF-CR] IMPALA-4134,IMPALA-3704: Kudu INSERT improvements

2016-10-24 Thread Matthew Jacobs (Code Review)
Hello Alex Behm, Dan Hecht,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4728

to look at the new patch set (#8).

Change subject: IMPALA-4134,IMPALA-3704: Kudu INSERT improvements
..

IMPALA-4134,IMPALA-3704: Kudu INSERT improvements

1.) IMPALA-4134: Use Kudu AUTO FLUSH
Improves performance of writes to Kudu up to 4.2x in
bulk data loading tests (load 200 million rows from
lineitem).

2.) IMPALA-3704: Improve errors on PK conflicts
The Kudu client reports an error for every PK conflict,
and all errors were being returned in the error status.
As a result, inserts/updates/deletes could return errors
with thousands errors reported. This changes the error
handling to log all reported errors as warnings and
return only the first error in the query error status.

3.) Improve the DataSink reporting of the insert stats.
The per-partition stats returned by the data sink weren't
useful for Kudu sinks. Firstly, the number of appended rows
was not being displayed in the profile. Secondly, the
'stats' field isn't populated for Kudu tables and thus was
confusing in the profile, so it is no longer printed if it
is not set in the thrift struct.

Testing: Ran local tests, including new tests to verify
the query profile insert stats. Manual cluster testing was
conducted of the AUTO FLUSH functionality, and that testing
informed the default mutation buffer value of 100MB which
was found to provide good results.

Change-Id: I5542b9a061b01c543a139e8722560b1365f06595
---
M be/src/exec/data-sink.cc
M be/src/exec/hbase-table-sink.cc
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-table-sink.h
M be/src/runtime/coordinator.cc
M be/src/service/impala-beeswax-server.cc
M be/src/service/query-exec-state.cc
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M common/thrift/generate_error_codes.py
M shell/impala_client.py
M testdata/workloads/functional-query/queries/QueryTest/kudu_crud.test
M tests/beeswax/impala_beeswax.py
14 files changed, 248 insertions(+), 97 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/28/4728/8
-- 
To view, visit http://gerrit.cloudera.org:8080/4728
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5542b9a061b01c543a139e8722560b1365f06595
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] Removed dead join inversion code from Analyzer.

2016-10-24 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: Removed dead join inversion code from Analyzer.
..


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4827
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9d8fff29c0f6b239796561c877acc709a178c108
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3586: Clean up union-node.h/cc to enable improvements.

2016-10-24 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-3586: Clean up union-node.h/cc to enable improvements.
..


Patch Set 2:

This patch requires the fix for IMPALA-4336.
See: https://gerrit.cloudera.org/#/c/4815/

I'll hold off until that one gets reviewed/in.

-- 
To view, visit http://gerrit.cloudera.org:8080/4817
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5fc23985e8d51acb8a6920717ce4e2f0254fe70
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4134,IMPALA-3704: Kudu INSERT improvements

2016-10-24 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-4134,IMPALA-3704: Kudu INSERT improvements
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4728/6/be/src/exec/kudu-table-sink.cc
File be/src/exec/kudu-table-sink.cc:

PS6, Line 330: kudu_error_counter_->value()
> Good point, but this kinda means that IGNORE is buggy (at least theoretical
True. I think we just need to make sure the Kudu folks don't change the 
behavior without telling us so we can work with them to make sure that won't 
happen in practice, e.g. I could imagine a few things to consider if they want 
to set error buffer limits so we can make sure the buffer is big enough to 
handle at least all conflicts. Or they offer a way to report errors different 
for different types.

Ideally we could have a test case that ensures we don't hit any unexpected code 
paths, I'm not sure we can guarantee any test would overflow errors if they 
change the behavior in the future. I do have a test case that will result in 
key conflicts for >7000 rows, but depending on how they could change the code 
that might or might not result in errors overflowing.

I'll start a conversation with them about this.


-- 
To view, visit http://gerrit.cloudera.org:8080/4728
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5542b9a061b01c543a139e8722560b1365f06595
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3586: Clean up union-node.h/cc to enable improvements.

2016-10-24 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3586: Clean up union-node.h/cc to enable improvements.
..


Patch Set 2: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4817
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5fc23985e8d51acb8a6920717ce4e2f0254fe70
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-1169: Admission control info on the queries debug webpage

2016-10-24 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change.

Change subject: IMPALA-1169: Admission control info on the queries debug webpage
..


Patch Set 4:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/4756/3/be/src/scheduling/query-schedule.cc
File be/src/scheduling/query-schedule.cc:

Line 56: is_admitted_(false) {
> was this intentional? i don't see why this is necessary
Done


http://gerrit.cloudera.org:8080/#/c/4756/3/be/src/scheduling/simple-scheduler.cc
File be/src/scheduling/simple-scheduler.cc:

Line 900: // TODO-MT: call AdmitQuery() rather than always setting 
is_admitted
> call AdmitQuery() rather than always setting is_admitted
Done


http://gerrit.cloudera.org:8080/#/c/4756/3/be/src/scheduling/simple-scheduler.h
File be/src/scheduling/simple-scheduler.h:

Line 92:   virtual Status Schedule(QuerySchedule* schedule);
> Sorry, on second thought this doesn't add much. There's a comment in the ba
Done


http://gerrit.cloudera.org:8080/#/c/4756/3/be/src/service/query-exec-state.h
File be/src/service/query-exec-state.h:

PS3, Line 158:  and had the pool set yet,
> , not had the pool set yet, or this is StmtType doesn't go through admissio
Done


http://gerrit.cloudera.org:8080/#/c/4756/3/tests/common/impala_service.py
File tests/common/impala_service.py:

PS3, Line 83:   sleep(interval)
: assert 0, 'Metric value %s did not reach value %s in %ss' %\
: (metric_name, expected_value, timeout)
> can you move this up to be under read_debug_webpage and call it get_debug_w
Done


http://gerrit.cloudera.org:8080/#/c/4756/3/tests/custom_cluster/test_admission_controller.py
File tests/custom_cluster/test_admission_controller.py:

PS3, Line 603: 
> outdated
Done


PS3, Line 658: 
> counts from the queries page
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4756
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I504e3c829a14318721e3a42de6281bcc578f7283
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1169: Admission control info on the queries debug webpage

2016-10-24 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has uploaded a new patch set (#4).

Change subject: IMPALA-1169: Admission control info on the queries debug webpage
..

IMPALA-1169: Admission control info on the queries debug webpage

This patch adds a new event, 'Queued', to the query event log to
indicate when a query is queued by the admission controller. This
means that queries on the '/queries' page that are currently
queued will display this as their 'Last Event', making it possible
to see which queries are current queued.

It also adds a column to show the resource pool associated with
the queries.

Change-Id: I504e3c829a14318721e3a42de6281bcc578f7283
---
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/simple-scheduler.cc
M be/src/service/impala-http-handler.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-exec-state.h
M tests/common/impala_service.py
M tests/custom_cluster/test_admission_controller.py
M www/queries.tmpl
9 files changed, 75 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/56/4756/4
-- 
To view, visit http://gerrit.cloudera.org:8080/4756
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I504e3c829a14318721e3a42de6281bcc578f7283
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-3211: provide toolchain build id for bootstrapping

2016-10-24 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-3211: provide toolchain build id for bootstrapping
..


Patch Set 3: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4771
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibcc25ae82511713d0ff05ded37ef162925f2f0fb
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4134,IMPALA-3704: Kudu INSERT improvements

2016-10-24 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4134,IMPALA-3704: Kudu INSERT improvements
..


Patch Set 7: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4728/6/be/src/exec/kudu-table-sink.cc
File be/src/exec/kudu-table-sink.cc:

PS6, Line 330: 
> Yes, we still fail the query if the error buffer overflows since we can't b
Good point, but this kinda means that IGNORE is buggy (at least theoretically). 
 i.e. a user could get into a situation where they can't run a query because 
the number of to be ignored errors is too large, and that seems bad.


-- 
To view, visit http://gerrit.cloudera.org:8080/4728
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5542b9a061b01c543a139e8722560b1365f06595
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Removed dead join inversion code from Analyzer.

2016-10-24 Thread Alex Behm (Code Review)
Alex Behm has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4827

Change subject: Removed dead join inversion code from Analyzer.
..

Removed dead join inversion code from Analyzer.

Change-Id: I9d8fff29c0f6b239796561c877acc709a178c108
---
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
1 file changed, 0 insertions(+), 29 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/27/4827/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4827
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I9d8fff29c0f6b239796561c877acc709a178c108
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 


[Impala-ASF-CR] IMPALA-3586: Clean up union-node.h/cc to enable improvements.

2016-10-24 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-3586: Clean up union-node.h/cc to enable improvements.
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/4817/1/be/src/exec/union-node.cc
File be/src/exec/union-node.cc:

PS1, Line 72: for (int i = 0; i < const_expr_lists_.size(); ++i) {
: RETURN_IF_ERROR(Expr::Prepare(
: const_expr_lists_[i], state, row_desc(), 
expr_mem_tracker()));
: AddExprCtxsToFree(const_expr_lists_[i]);
: DCHECK_EQ(const_expr_lists_[i].size(), 
tuple_desc_->slots().size());
:   }
: 
:   // Prepare result expr lists.
:   for (int i = 0; i < child_expr_lists_.size(); ++i) {
: RETURN_IF_ERROR(Expr::Prepare(
: child_expr_lists_[i], state, child(i)->row_desc(), 
expr_mem_tracker()));
: AddExprCtxsToFree(child_expr_lists_[i]);
: DCHECK_EQ(child_expr_lists_[i].size(), 
tuple_desc_->slots().size());
:   }
> ranged-for loops as well?
Done for the first loop.
The second loop needs child(i)->row_desc().


PS1, Line 220: for (int i = 0; i < const_expr_lists_.size(); ++i) {
 : Expr::Close(const_expr_lists_[i], state);
 :   }
 :   for (int i = 0; i < child_expr_lists_.size(); ++i) {
 : Expr::Close(child_expr_lists_[i], state);
 :   }
> While you're here, can you rewrite as two ranged-for loops?
Done


http://gerrit.cloudera.org:8080/#/c/4817/1/be/src/exec/union-node.h
File be/src/exec/union-node.h:

Line 58:   /// Const exprs materialized by this node. These exprs don't refer 
to any children.
> Add that these are only materialized by the first fragment instance to avoi
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4817
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5fc23985e8d51acb8a6920717ce4e2f0254fe70
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Henry Robinson 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Add all build targets to CMake and speed up builds

2016-10-24 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: Add all build targets to CMake and speed up builds
..


Patch Set 2: Code-Review+1

Not +2 because I think it could use a review from someone who knows Cmake much 
better than I do

-- 
To view, visit http://gerrit.cloudera.org:8080/4790
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I23617adf13bdeb034c24f6bba14b5ae480e8dd26
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3586: Clean up union-node.h/cc to enable improvements.

2016-10-24 Thread Alex Behm (Code Review)
Hello Henry Robinson,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4817

to look at the new patch set (#2).

Change subject: IMPALA-3586: Clean up union-node.h/cc to enable improvements.
..

IMPALA-3586: Clean up union-node.h/cc to enable improvements.

This patch does not address IMPALA-3586, but it cleans up the
code in union-node.h/cc to make it easier to implement those
perf improvements.

The major simplification is to remove conjunct evaluation since
the planner does not assigns conjuncts to a union-node anymore.
Conjuncts are always pushed to the union operands.

Change-Id: Ia5fc23985e8d51acb8a6920717ce4e2f0254fe70
---
M be/src/exec/union-node.cc
M be/src/exec/union-node.h
2 files changed, 95 insertions(+), 133 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/17/4817/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4817
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia5fc23985e8d51acb8a6920717ce4e2f0254fe70
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Henry Robinson 


[Impala-ASF-CR] IMPALA-4134,IMPALA-3704: Kudu INSERT improvements

2016-10-24 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-4134,IMPALA-3704: Kudu INSERT improvements
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4728/6/be/src/exec/kudu-table-sink.cc
File be/src/exec/kudu-table-sink.cc:

PS6, Line 330: kudu_error_counter_->value()
> Does that reasoning hold up even if all errors are IGNORE type errors and w
Yes, we still fail the query if the error buffer overflows since we can't be 
sure that there weren't more serious errors missed. Regardless, I looked at the 
client code and it never even fails right now, it just keeps a vector of errors 
that can keep growing, not bounded.


-- 
To view, visit http://gerrit.cloudera.org:8080/4728
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5542b9a061b01c543a139e8722560b1365f06595
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3676: Use clang as a static analysis tool

2016-10-24 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-3676: Use clang as a static analysis tool
..


Patch Set 7:

(8 comments)

> (8 comments)
 > 
 > The commit message doesn't really tell the full story here. In
 > particular, you should say something about AlignmentNew and what it
 > is for.

Done

 > 
 > The rearrangement of class members to group by type seems like it
 > sacrifices readability for performance in some strange cases  - the
 > statestore is a good example, where there is only ever one
 > statestore and it is not (as far as we know) highly sensitive to
 > its in-memory layout. What criteria did you use to decide that a
 > class' members should be grouped-by-type?

I rearranged them all. Tim convinced me to roll some back.

http://gerrit.cloudera.org:8080/#/c/4758/6/be/src/common/init.cc
File be/src/common/init.cc:

PS6, Line 100: [[noreturn]]
> can this go on the previous line? It might be easier to read that way (simi
While it can, clang-format "fixes" it to this right away.


http://gerrit.cloudera.org:8080/#/c/4758/6/be/src/exprs/aggregate-functions-ir.cc
File be/src/exprs/aggregate-functions-ir.cc:

PS6, Line 173: 
 : 
> Why remove these comments?
Accident.


PS6, Line 176: 
> why remove this?
Put it back in the definition in the class.


http://gerrit.cloudera.org:8080/#/c/4758/6/be/src/exprs/aggregate-functions.h
File be/src/exprs/aggregate-functions.h:

PS6, Line 182: ec
> can you replace that with HLL_PRECISION?
Done


http://gerrit.cloudera.org:8080/#/c/4758/6/be/src/runtime/tmp-file-mgr.cc
File be/src/runtime/tmp-file-mgr.cc:

PS6, Line 140: uniqu
> remove std:: in cc files
Removed here and elsewhere, except for std::move. I see that usually referred 
to with the prefix by C++ people. Thoughts?


http://gerrit.cloudera.org:8080/#/c/4758/6/be/src/statestore/statestore.h
File be/src/statestore/statestore.h:

PS6, Line 82: CacheLineAligned
> why?
subscriber_heartbeat_threadpool_ is a ThreadPool, and 
ThreadPool::work_queue_ is a BlockingQueue, and BlcokingQueue is 
marked CACHE_ALIGNED, presumably to avoid false sharing.

I made CacheLineAligned use the C++11 alignas specifier and deleted 
CACHE_ALIGNED.


PS6, Line 253: boost::mutex exit_flag_lock_;
> This is a good example of where logical grouping is broken by grouping-by-t
rolled back


http://gerrit.cloudera.org:8080/#/c/4758/6/be/src/util/aligned-new.h
File be/src/util/aligned-new.h:

PS6, Line 27: Objects that should be allocate
> What does 'must be allocated' mean - for correctness or performance or both
either - performance for BlockingQueue and correctness for SIMD values when 
using aligned loads or stores.


-- 
To view, visit http://gerrit.cloudera.org:8080/4758
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4ed168488cb30ddeccd0087f3840541d858f9c06
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4134,IMPALA-3704: Kudu INSERT improvements

2016-10-24 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4134,IMPALA-3704: Kudu INSERT improvements
..


Patch Set 7:

(2 comments)

Code looks fine, but want to close on the question at line 330 before +2ing.

http://gerrit.cloudera.org:8080/#/c/4728/6/be/src/exec/kudu-table-sink.cc
File be/src/exec/kudu-table-sink.cc:

PS6, Line 330: 
> I think so: If that happens we'll fail the query. If the query fails, the '
Does that reasoning hold up even if all errors are IGNORE type errors and we 
overflow the client error buffer?


http://gerrit.cloudera.org:8080/#/c/4728/7/be/src/exec/kudu-table-sink.cc
File be/src/exec/kudu-table-sink.cc:

PS7, Line 145: Kudu
is this talking about the "Kudu client" or "Kudu server"?


-- 
To view, visit http://gerrit.cloudera.org:8080/4728
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5542b9a061b01c543a139e8722560b1365f06595
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3676: Use clang as a static analysis tool

2016-10-24 Thread Jim Apple (Code Review)
Hello Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4758

to look at the new patch set (#7).

Change subject: IMPALA-3676: Use clang as a static analysis tool
..

IMPALA-3676: Use clang as a static analysis tool

This patch adds a script to run clang-tidy over the whole code
base. It is a first step towards running clang-tidy over patches as a
tool to help users spot bugs before code review.

Because of the number of clang-tidy checks, this patch only addresses
some of them. In particular, only checks starting with 'clang' are
considered. Many of them which are flaky or not part of our style are
excluded from the analysis. This patch also exlcudes some checks which
are part of our current style but which would be too laborious to fix
over the entire codebase, like using nullptr rather than NULL.

This patch also fixes a number of small bugs found by clang-tidy.

Finally, this patch adds the class AlignedNew, the purpose of which is
to provide correct alignment on heap-allocated data. The global new
operator only guarantees 16-byte alignment. A class that includes a
member variable that must be aligned on a k-byte boundary for k>16 can
inherit from AlignedNew to ensure correct alignment on the heap,
quieting clang's -Wover-aligned warning. (Static and stack allocation
are required by the standard to respect the alignment of the type and
its member variables, so no extra code is needed for allocation in
those places.)

Change-Id: I4ed168488cb30ddeccd0087f3840541d858f9c06
---
A .clang-tidy
M CMakeLists.txt
M be/CMakeLists.txt
M be/src/benchmarks/bloom-filter-benchmark.cc
M be/src/benchmarks/in-predicate-benchmark.cc
M be/src/benchmarks/parse-timestamp-benchmark.cc
M be/src/catalog/catalog-server.cc
M be/src/catalog/catalog-server.h
M be/src/codegen/codegen-anyval.cc
M be/src/codegen/impala-ir.cc
M be/src/codegen/llvm-codegen.cc
M be/src/common/compiler-util.h
M be/src/common/init.cc
M be/src/common/logging.h
M be/src/common/status.h
M be/src/exec/delimited-text-parser.cc
M be/src/exec/delimited-text-parser.h
M be/src/exec/exec-node.h
M be/src/exec/external-data-source-executor.h
M be/src/exec/hash-table-test.cc
M be/src/exec/hbase-scan-node.h
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/hdfs-table-writer.h
M be/src/exec/parquet-column-readers.h
M be/src/exec/partitioned-hash-join-builder-ir.cc
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/scanner-context.inline.h
M be/src/exec/write-stream.h
M be/src/experiments/bit-stream-utils.8byte.inline.h
M be/src/experiments/tuple-splitter-test.cc
M be/src/exprs/aggregate-functions-ir.cc
M be/src/exprs/aggregate-functions.h
M be/src/exprs/compound-predicates.h
M be/src/exprs/expr-test.cc
M be/src/exprs/slot-ref.cc
M be/src/exprs/string-functions-ir.cc
M be/src/rpc/TAcceptQueueServer.h
M be/src/rpc/authentication.cc
M be/src/rpc/authentication.h
M be/src/rpc/thrift-server.h
M be/src/runtime/buffered-block-mgr.h
M be/src/runtime/buffered-tuple-stream.cc
M be/src/runtime/buffered-tuple-stream.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/data-stream-sender.cc
M be/src/runtime/data-stream-sender.h
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M be/src/runtime/exec-env.h
M be/src/runtime/free-pool-test.cc
M be/src/runtime/hbase-table.cc
M be/src/runtime/hdfs-fs-cache.h
M be/src/runtime/multi-precision-test.cc
M be/src/runtime/runtime-filter-bank.h
M be/src/runtime/runtime-filter.h
M be/src/runtime/scoped-buffer.h
M be/src/runtime/string-buffer.h
M be/src/runtime/timestamp-parse-util.cc
M be/src/runtime/timestamp-value.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/runtime/tmp-file-mgr.h
M be/src/scheduling/query-schedule.cc
M be/src/scheduling/query-schedule.h
M be/src/scheduling/simple-scheduler-test.cc
M be/src/service/fe-support.cc
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-exec-state.cc
M be/src/service/query-exec-state.h
M be/src/service/query-result-set.cc
M be/src/statestore/failure-detector.h
M be/src/statestore/statestore-subscriber.h
M be/src/statestore/statestore.cc
M be/src/statestore/statestore.h
M be/src/testutil/test-udas.cc
M be/src/testutil/test-udas.h
M be/src/transport/TSasl.h
M be/src/transport/TSaslClientTransport.h
M be/src/transport/TSaslServerTransport.h
M be/src/transport/TSaslTransport.h
M be/src/udf/uda-test-harness.h
M be/src/udf/uda-test.cc
M be/src/udf/udf-ir.cc
M be/src/udf/udf.cc
M be/src/udf/udf.h
M be/src/udf_samples/uda-sample.cc
A be/src/util/aligned-new.h
M be/src/util/benchmark-test.cc
M be/src/util/bit-stream-utils.inline.h
M be/src/util/bit-util-test.cc
M be/src/util/blocking-queue-test.cc
M be/src/util/blocking-queue.h
M be/src/util/bloom-filter-test.cc
M 

[Impala-ASF-CR] Add distcc infrastructure.

2016-10-24 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: Add distcc infrastructure.
..


Patch Set 3: Code-Review+2

Carry Tim's +2

-- 
To view, visit http://gerrit.cloudera.org:8080/4820
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4bc78ad46dda13e4533183195af632f46377cae
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-1702: Enforce single-table consistency in query analysis.

2016-10-24 Thread Huaisi Xu (Code Review)
Huaisi Xu has uploaded a new patch set (#17).

Change subject: IMPALA-1702: Enforce single-table consistency in query analysis.
..

IMPALA-1702: Enforce single-table consistency in query analysis.

Catalogd managed table id generation causes problems when
new updates arrive at frontend while queries are being
analyzed: references to the same table may suddenly refer
to a different version, and different tables may share the
same table id. This causes problems when one table overrides
another one in the thrift descriptor table sent to backend.

This commit removes the table id from the catalog Table
object; instead frontend assigns a unique id to each table
in DescriptorTable.toThrift(). It also implements a
referencedTables_ cache in Analyzer::globalState_ so that
calling Analyzer::getTable() on the same table returns the
same reference for the same query.

Change-Id: Ifad648b72684ae495ec387590ab1bc58ce5b39e2
---
M common/thrift/CatalogObjects.thrift
M common/thrift/Descriptors.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java
M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/DataSourceTable.java
M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
D fe/src/main/java/org/apache/impala/catalog/TableId.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java
M fe/src/main/java/org/apache/impala/catalog/View.java
M fe/src/main/java/org/apache/impala/common/Id.java
M fe/src/main/java/org/apache/impala/planner/HBaseTableSink.java
M fe/src/main/java/org/apache/impala/planner/HdfsTableSink.java
M fe/src/main/java/org/apache/impala/planner/JoinTableId.java
M fe/src/main/java/org/apache/impala/planner/KuduTableSink.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
27 files changed, 204 insertions(+), 219 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/4349/17
-- 
To view, visit http://gerrit.cloudera.org:8080/4349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifad648b72684ae495ec387590ab1bc58ce5b39e2
Gerrit-PatchSet: 17
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Huaisi Xu 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Huaisi Xu 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-1702: Enforce single-table consistency in query analysis.

2016-10-24 Thread Huaisi Xu (Code Review)
Hello Marcel Kornacker, Bharath Vissapragada, Alex Behm,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4349

to look at the new patch set (#17).

Change subject: IMPALA-1702: Enforce single-table consistency in query analysis.
..

IMPALA-1702: Enforce single-table consistency in query analysis.

Catalogd managed table id generation causes problems when
new updates arrive at frontend while queries are being
analyzed: references to the same table may suddenly refer
to a different version, and different tables may share the
same table id. This causes problems when one table overrides
another one in the thrift descriptor table sent to backend.

This commit removes the table id from the catalog Table
object; instead frontend assigns a unique id to each table
in DescriptorTable.toThrift(). It also implements a
referencedTables_ cache in Analyzer::globalState_ so that
calling Analyzer::getTable() on the same table returns the
same reference for the same query.

Change-Id: Ifad648b72684ae495ec387590ab1bc58ce5b39e2
---
M common/thrift/CatalogObjects.thrift
M common/thrift/Descriptors.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java
M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/DataSourceTable.java
M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
D fe/src/main/java/org/apache/impala/catalog/TableId.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java
M fe/src/main/java/org/apache/impala/catalog/View.java
M fe/src/main/java/org/apache/impala/common/Id.java
M fe/src/main/java/org/apache/impala/planner/HBaseTableSink.java
M fe/src/main/java/org/apache/impala/planner/HdfsTableSink.java
M fe/src/main/java/org/apache/impala/planner/JoinTableId.java
M fe/src/main/java/org/apache/impala/planner/KuduTableSink.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
27 files changed, 204 insertions(+), 219 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/4349/17
-- 
To view, visit http://gerrit.cloudera.org:8080/4349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifad648b72684ae495ec387590ab1bc58ce5b39e2
Gerrit-PatchSet: 17
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Huaisi Xu 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Huaisi Xu 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-1702: Enforce single-table consistency in query analysis.

2016-10-24 Thread Huaisi Xu (Code Review)
Huaisi Xu has posted comments on this change.

Change subject: IMPALA-1702: Enforce single-table consistency in query analysis.
..


Patch Set 16:

(10 comments)

Thanks Marcel.

http://gerrit.cloudera.org:8080/#/c/4349/15/fe/src/main/java/org/apache/impala/analysis/Analyzer.java
File fe/src/main/java/org/apache/impala/analysis/Analyzer.java:

Line 294: // The Impalad Catalog has the latest tables from the statestore. 
In order to use the
> add todo that we need to investigate whether to do this for other catalog o
Done


http://gerrit.cloudera.org:8080/#/c/4349/15/fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java
File fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java:

Line 43:  * them unique ids.
> "may be null" is a better phrasing.
Done


Line 44:  */
> "Set in QueryStmt.analyze()."
Done


Line 45: public class DescriptorTable {
> remove last sentence, that goes without saying.
Done


Line 179:   // inline view of a non-constant select has a non-materialized 
tuple descriptor
> "checking that"
Done


Line 180:   // in the descriptor table just for type checking, which we 
need to skip
> "same Table instance" is even more specific (because we're talking about ob
Done


Line 202:   for (SlotDescriptor slotD: tupleDesc.getMaterializedSlots()) {
> single line
Done


http://gerrit.cloudera.org:8080/#/c/4349/15/fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java
File fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java:

Line 201:   public void materializeSlots() {
> single line
done. but not sure if this is what you meant.


http://gerrit.cloudera.org:8080/#/c/4349/15/fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
File fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java:

Line 438:   if (descTbl.isSetTupleDescriptors()) {
> precede w/ blank line
Done


Line 447: if (execRequest.isSetFragments() && 
!execRequest.fragments.isEmpty()) {
> same here
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifad648b72684ae495ec387590ab1bc58ce5b39e2
Gerrit-PatchSet: 16
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Huaisi Xu 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Huaisi Xu 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4309: Introduce Expr rewrite phase and supporting classes.

2016-10-24 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4309: Introduce Expr rewrite phase and supporting 
classes.
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4746/5/fe/src/test/java/org/apache/impala/analysis/ExprRewriterTest.java
File fe/src/test/java/org/apache/impala/analysis/ExprRewriterTest.java:

Line 37
I removed this test until we agree on the approach first. I'll see how we can 
resurrect this test later (if possible).


-- 
To view, visit http://gerrit.cloudera.org:8080/4746
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I2279dc984bcf7742db4fa3b1aa67283ecbb05e6e
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4309: Introduce Expr rewrite phase and supporting classes.

2016-10-24 Thread Alex Behm (Code Review)
Alex Behm has uploaded a new patch set (#6).

Change subject: IMPALA-4309: Introduce Expr rewrite phase and supporting 
classes.
..

IMPALA-4309: Introduce Expr rewrite phase and supporting classes.

Introduces a new phase for rewriting Exprs after analysis and
before subquery rewriting. The transformed Exprs replace the
original ones in analyzed statements. If Exprs were changed,
the whole statement is reset() and re-analyzed, similar to how
subqueries are rewritten. If both Exprs and subqueries are
rewritten there is only one re-analysis of the changed statement.

The following new classes work together to perform transformations:
1. ExprRewriteRule
- base class for Expr transformation rules
2. ExprRewriter
- drives the transformation of Exprs using a list of
  ExprRewriteRules

Statements that have exprs to be rewritten need to implement
a new method rewriteExprs() that accepts an ExprRewriter.

As an example, this patch adds a rule for converting
BetweenPredicates into their equivalent CompoundPredicates.
The BetweenPredicate has been notoriously buggy due to a lack
of such a separate rewrite phase and is now cleaned up.

Testing:
1. Added a new test for checking that the rewrite framework
   covers all relevant statements, clauses and can properly
   handle nested statements and subqueries.
2. There are many existing tests for BetweePredicates and
   they all exercise the new rewrite rule/phase.
3. Ran a private core/hdfs run and it passed.

Change-Id: I2279dc984bcf7742db4fa3b1aa67283ecbb05e6e
---
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/BetweenPredicate.java
M fe/src/main/java/org/apache/impala/analysis/CreateOrAlterViewStmtBase.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java
M fe/src/main/java/org/apache/impala/analysis/SelectList.java
M fe/src/main/java/org/apache/impala/analysis/SelectListItem.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/StatementBase.java
M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
M fe/src/main/java/org/apache/impala/analysis/TableRef.java
M fe/src/main/java/org/apache/impala/analysis/UnionStmt.java
M fe/src/main/java/org/apache/impala/planner/HdfsPartitionPruner.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
A fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java
A fe/src/main/java/org/apache/impala/rewrite/ExprRewriteRule.java
A fe/src/main/java/org/apache/impala/rewrite/ExprRewriter.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeSubqueriesTest.java
A fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-kudu.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-nested.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-views.test
29 files changed, 513 insertions(+), 216 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/4746/6
-- 
To view, visit http://gerrit.cloudera.org:8080/4746
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2279dc984bcf7742db4fa3b1aa67283ecbb05e6e
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] Add all build targets to CMake and speed up builds

2016-10-24 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#2).

Change subject: Add all build targets to CMake and speed up builds
..

Add all build targets to CMake and speed up builds

Use CMake's dependency resolution always instead of serial execution of
targets via shell scripts.  This improves parallelism by building fe,
be, and other targets at the same time and avoid some overhead from
invoking "make" multiple times. This reduces the time taken for
an incremental compilation of fe and be from 56s to 24s with this
command:

  ./buildall.sh -debug -noclean -notests -skiptests -ninja

Also use Impala-lzo's build script. This depends on the IMPALA-4277
fixes to the Impala-lzo build script.

Also make sure that "make" builds all the same artifacts as buildall.sh
when run with no args.

Testing:
Ran a jenkins core job, also experimented locally.

Change-Id: I23617adf13bdeb034c24f6bba14b5ae480e8dd26
---
M CMakeLists.txt
M be/src/benchmarks/CMakeLists.txt
M be/src/service/CMakeLists.txt
M bin/make_impala.sh
M buildall.sh
M ext-data-source/CMakeLists.txt
M fe/CMakeLists.txt
7 files changed, 65 insertions(+), 52 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/4790/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4790
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I23617adf13bdeb034c24f6bba14b5ae480e8dd26
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4309: Introduce Expr rewrite phase and supporting classes.

2016-10-24 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4309: Introduce Expr rewrite phase and supporting 
classes.
..


Patch Set 5:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/4746/5/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
File fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java:

Line 355
> what was going on with this?
This was stale code from the first version of subquery rewriting which was 
later cleaned up. It's not needed.


Line 379: analysisResult_.stmt_.rewriteExprs(rewriter_);
> for the between transformation it's not necessary, but in general i'd expec
I was thinking that we'd cross that bridge when our set of rewrite rules 
becomes complex enough to warrant that, but easy enough to do now. Done.


http://gerrit.cloudera.org:8080/#/c/4746/5/fe/src/main/java/org/apache/impala/analysis/BetweenPredicate.java
File fe/src/main/java/org/apache/impala/analysis/BetweenPredicate.java:

Line 56:   "supported in a between predicate: " + toSqlImpl());
> capitalize between
Done


http://gerrit.cloudera.org:8080/#/c/4746/5/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
File fe/src/main/java/org/apache/impala/analysis/SelectStmt.java:

Line 59:   protected Expr havingClause_;  // original having clause
> why move it?
Reverted.


Line 874:   // Also rewrite exprs in the statements of subqueries.
> the problem you have is that the parent/child expr tree doesn't give you ac
Expr.getContainedExprs() does not help because we also need to set the 
rewritten exprs in the appropriate clauses of the contained QueryStmt. Sure, we 
can get the exprs and rewrite them, but then we don't know where to set them. 
We could make everything Reference.

Let me know if you want to go down the "everything is a ParseNode path" and 
I'll abandon this patch. That will be much more involved change and I 
personally don't think it's worth it. That change will save us these < 50 LOC 
to traverse the stmts and I don't see us rewriting non-Expr ParseNodes with 
that new infra (we'll likely keep our existing StmtRewriter).


http://gerrit.cloudera.org:8080/#/c/4746/5/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
File fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java:

Line 69
> huh?
Moved to caller to avoid several reset() and reanalyze() passes.


http://gerrit.cloudera.org:8080/#/c/4746/5/fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java
File fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java:

Line 38: numChanges_ = 0;
> should rules really have state?
1. Sure, the driver can keep track of changes instead, but it seems like we'll 
need excessive object creation:

for (Rule rule: rules) {
  Expr clone = expr.clone();
  clone = rule.apply(clone);
  changed = clone.equals(expr);
}

I made those changes.

2. I think there are some rewrite rules that are way easier to express if the 
rule itself can traverse the entire Expr tree. For example, removing redundant 
conjuncts or disjuncts seems a little cumbersome in your proposed approach. 
Seems like we'd need to have a normalization pass first to get the redundant 
exprs next to each other so a single rule application can detect the redundancy.


Line 46: if (expr instanceof BetweenPredicate) {
> if (!(expr instanceof ...)) return expr;
Done


http://gerrit.cloudera.org:8080/#/c/4746/5/fe/src/main/java/org/apache/impala/rewrite/ExprRewriteRule.java
File fe/src/main/java/org/apache/impala/rewrite/ExprRewriteRule.java:

Line 40:   public abstract Expr rewrite(Expr expr, Analyzer analyzer) throws 
AnalysisException;
> call it apply then?
Done


http://gerrit.cloudera.org:8080/#/c/4746/5/fe/src/main/java/org/apache/impala/rewrite/ExprRewriter.java
File fe/src/main/java/org/apache/impala/rewrite/ExprRewriter.java:

Line 31: public class ExprRewriter extends ExprRewriteRule {
> what's the point of making it a subclass of ExprRewriteRule?
1. No need to make subclass ExprRewriteRule. Done.

2. I don't think we want to hardcode the rules in here because we will want to 
do different sets of rewrites in different phases of planning. For example, the 
BETWEEN rewrite is a logical rewrite that we can do on the parse "tree". But 
constant folding will need to be done on the fully substituted Exprs, i.e., 
during planning. Of course, we can just brute-force it and always apply 
everything but I think it's easier to understand if we have a more nuanced 
application of rules.

3. I made the other changed you requested here:
* Keep applying rules until there no more changes.
* Have the driver traverse the expr tree and detect changes.


http://gerrit.cloudera.org:8080/#/c/4746/5/fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
File fe/src/test/java/org/apache/impala/common/FrontendTestBase.java:

Line 186: // Do not analyze the stmt to avoid applying 

[Impala-ASF-CR] IMPALA-4339: ensure coredumps end up in IMPALA HOME

2016-10-24 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4339: ensure coredumps end up in IMPALA_HOME
..


Patch Set 2: Code-Review+2

Rebase, carry +2

-- 
To view, visit http://gerrit.cloudera.org:8080/4785
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibc34d152139653374f940dc3edbca08e749bf55e
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4339: ensure coredumps end up in IMPALA HOME

2016-10-24 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4339: ensure coredumps end up in IMPALA_HOME
..


Patch Set 2: Verified-1

Build failed: 
http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/347/

-- 
To view, visit http://gerrit.cloudera.org:8080/4785
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibc34d152139653374f940dc3edbca08e749bf55e
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3211: provide toolchain build id for bootstrapping

2016-10-24 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#3).

Change subject: IMPALA-3211: provide toolchain build id for bootstrapping
..

IMPALA-3211: provide toolchain build id for bootstrapping

Testing:
Ran a private build, which succeeded.

Change-Id: Ibcc25ae82511713d0ff05ded37ef162925f2f0fb
---
M bin/bootstrap_toolchain.py
M bin/impala-config.sh
2 files changed, 14 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/4771/3
-- 
To view, visit http://gerrit.cloudera.org:8080/4771
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibcc25ae82511713d0ff05ded37ef162925f2f0fb
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4339: ensure coredumps end up in IMPALA HOME

2016-10-24 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4339: ensure coredumps end up in IMPALA_HOME
..


Patch Set 1: Verified-1

Build failed: 
http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/343/

-- 
To view, visit http://gerrit.cloudera.org:8080/4785
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibc34d152139653374f940dc3edbca08e749bf55e
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3211: provide toolchain build id for bootstrapping

2016-10-24 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3211: provide toolchain build id for bootstrapping
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4771/2/bin/impala-config.sh
File bin/impala-config.sh:

Line 56: : ${IMPALA_TOOLCHAIN_BUILD_ID=249-2267164200}
> This could use a bit more explanation. How should a person discover this va
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4771
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibcc25ae82511713d0ff05ded37ef162925f2f0fb
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1702: Enforce single-table consistency in query analysis.

2016-10-24 Thread Huaisi Xu (Code Review)
Huaisi Xu has posted comments on this change.

Change subject: IMPALA-1702: Enforce single-table consistency in query analysis.
..


Patch Set 16:

solved conflict first. addressing comment next

-- 
To view, visit http://gerrit.cloudera.org:8080/4349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifad648b72684ae495ec387590ab1bc58ce5b39e2
Gerrit-PatchSet: 16
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Huaisi Xu 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Huaisi Xu 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-1702: Enforce single-table consistency in query analysis.

2016-10-24 Thread Huaisi Xu (Code Review)
Huaisi Xu has uploaded a new patch set (#16).

Change subject: IMPALA-1702: Enforce single-table consistency in query analysis.
..

IMPALA-1702: Enforce single-table consistency in query analysis.

Catalogd managed table id generation causes problems when
new updates arrive at frontend while queries are being
analyzed: references to the same table may suddenly refer
to a different version, and different tables may share the
same table id. This causes problems when one table overrides
another one in the thrift descriptor table sent to backend.

This commit removes the table id from the catalog Table
object; instead frontend assigns a unique id to each table
in DescriptorTable.toThrift(). It also implements a
referencedTables_ cache in Analyzer::globalState_ so that
calling Analyzer::getTable() on the same table returns the
same reference for the same query.

Change-Id: Ifad648b72684ae495ec387590ab1bc58ce5b39e2
---
M common/thrift/CatalogObjects.thrift
M common/thrift/Descriptors.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java
M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/DataSourceTable.java
M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
D fe/src/main/java/org/apache/impala/catalog/TableId.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java
M fe/src/main/java/org/apache/impala/catalog/View.java
M fe/src/main/java/org/apache/impala/common/Id.java
M fe/src/main/java/org/apache/impala/planner/HBaseTableSink.java
M fe/src/main/java/org/apache/impala/planner/HdfsTableSink.java
M fe/src/main/java/org/apache/impala/planner/JoinTableId.java
M fe/src/main/java/org/apache/impala/planner/KuduTableSink.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
27 files changed, 199 insertions(+), 213 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/4349/16
-- 
To view, visit http://gerrit.cloudera.org:8080/4349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifad648b72684ae495ec387590ab1bc58ce5b39e2
Gerrit-PatchSet: 16
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Huaisi Xu 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Huaisi Xu 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-1702: Enforce single-table consistency in query analysis.

2016-10-24 Thread Huaisi Xu (Code Review)
Hello Marcel Kornacker, Bharath Vissapragada, Alex Behm,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4349

to look at the new patch set (#16).

Change subject: IMPALA-1702: Enforce single-table consistency in query analysis.
..

IMPALA-1702: Enforce single-table consistency in query analysis.

Catalogd managed table id generation causes problems when
new updates arrive at frontend while queries are being
analyzed: references to the same table may suddenly refer
to a different version, and different tables may share the
same table id. This causes problems when one table overrides
another one in the thrift descriptor table sent to backend.

This commit removes the table id from the catalog Table
object; instead frontend assigns a unique id to each table
in DescriptorTable.toThrift(). It also implements a
referencedTables_ cache in Analyzer::globalState_ so that
calling Analyzer::getTable() on the same table returns the
same reference for the same query.

Change-Id: Ifad648b72684ae495ec387590ab1bc58ce5b39e2
---
M common/thrift/CatalogObjects.thrift
M common/thrift/Descriptors.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java
M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/DataSourceTable.java
M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
D fe/src/main/java/org/apache/impala/catalog/TableId.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java
M fe/src/main/java/org/apache/impala/catalog/View.java
M fe/src/main/java/org/apache/impala/common/Id.java
M fe/src/main/java/org/apache/impala/planner/HBaseTableSink.java
M fe/src/main/java/org/apache/impala/planner/HdfsTableSink.java
M fe/src/main/java/org/apache/impala/planner/JoinTableId.java
M fe/src/main/java/org/apache/impala/planner/KuduTableSink.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
27 files changed, 199 insertions(+), 213 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/4349/16
-- 
To view, visit http://gerrit.cloudera.org:8080/4349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifad648b72684ae495ec387590ab1bc58ce5b39e2
Gerrit-PatchSet: 16
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Huaisi Xu 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Huaisi Xu 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] Add all build targets to CMake and speed up builds

2016-10-24 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: Add all build targets to CMake and speed up builds
..


Patch Set 1:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/4790/1/bin/make_impala.sh
File bin/make_impala.sh:

Line 90:   echo "[-fe_only] : Builds fe only."
> The reason for this flag is so that make_impala.sh still builds the backend
A lot of people depend on make_impala.sh only building the backend 
unfortunately (I know of several jenkins jobs). Doesn't seem worth the pain, 
especially since buildall.sh already was a standard way to build everything.


PS1, Line 92: ."
> cand benchmarks
Done


PS1, Line 96: like
> What else?
There used to be a test tarball. We might want to produce a source tarball at 
some point. Mainly just didn't want to add a new flag every time we added some 
new build artifact.


PS1, Line 174: exit
> This line can be omitted
Done


http://gerrit.cloudera.org:8080/#/c/4790/1/buildall.sh
File buildall.sh:

PS1, Line 305: all
> Should the meaning of "all" be put into make_impala.sh?
I think it makes more sense to do that at the CMake level. It already builds 
most of these things when run with no args, but I added in the remaining ones 
that were not part of ALL (cscope and the tarballs).


-- 
To view, visit http://gerrit.cloudera.org:8080/4790
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I23617adf13bdeb034c24f6bba14b5ae480e8dd26
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4153: Fix count(*) on all blank('') columns - test

2016-10-24 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4153: Fix count(*) on all blank('') columns - test
..


Patch Set 1:

Build failed: 
http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/342/

-- 
To view, visit http://gerrit.cloudera.org:8080/4755
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I23923f95f43d67977ee1520a1fc09ce297548b3f
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4134,IMPALA-3704: Kudu INSERT improvements

2016-10-24 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-4134,IMPALA-3704: Kudu INSERT improvements
..


Patch Set 6:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/4728/6/be/src/exec/kudu-table-sink.cc
File be/src/exec/kudu-table-sink.cc:

PS6, Line 145: .
> why?  (that's the more interesting part for the code reader since the code 
We want to end up with the ability to have multiple buffers where we start 
flushing one or more of them and continuing to write into new ones. This is 
because a buffer that is flushing must block and cannot be freed/re-used again 
until all pending ops in it complete.

The issue of why set 100mb total mutation space w/ a low threshold took a while 
for me to get enough info, but it turns out the API is just confusing. 
Unfortunately there are a few ways to specify the same behavior. Since we set 
the total buffer space and the flush watermark pct, that implies a number of 
max buffers (100/7). We could also set the max number of buffers and pick 
different values for the total buffer space / flush watermark pct to accomplish 
the same thing. This is a known issue and they're looking to improve it moving 
forward.

I'll update the comment.


Line 282:  << "previous write operation might be 
inconsistent.\n";
> can't the write operation be inconsistent even if there was no overflow? i.
Yeah I agree this is poorly worded, I think it'd just be better to take out the 
2nd part.


Line 298: error_msg_buffer << "Kudu error(s) reported, first error: " 
<< e.ToString();
> why not just create the Status here since we no longer will append to it la
Done


PS6, Line 330: kudu_error_counter_->value()
> this doesn't really work in case of GetPendingErrors() returning overflow, 
I think so: If that happens we'll fail the query. If the query fails, the 
'num_modified_rows' never gets put in the profile (Coordinator::Wait()) or 
returned via beeswax (impala-beeswax-server CloseInsertInternal()) -- both 
ignore the insert stats when query status is not ok.


http://gerrit.cloudera.org:8080/#/c/4728/6/be/src/exec/kudu-table-sink.h
File be/src/exec/kudu-table-sink.h:

PS6, Line 48: due to a 
> can't it return an error for other reasons?
Done


PS6, Line 49: be reported as warnings.
> isn't the first Kudu error also reported as the error status?
Done


PS6, Line 113: if the Kudu client may be
 :   /// getting behind.
> what does this mean? does the code use it, or is this saying a user could u
It's meant to be used as a profile counter, not for the code.
I tried to clarify.


http://gerrit.cloudera.org:8080/#/c/4728/6/common/thrift/generate_error_codes.py
File common/thrift/generate_error_codes.py:

PS6, Line 300: Error Kudu table 
> should this say "Error in Kudu table ..." or something?
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4728
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5542b9a061b01c543a139e8722560b1365f06595
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions

2016-10-24 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#7).

Change subject: IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions
..

IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions

This change enables codegen for all builtin aggregate functions,
e.g. timestamp functions and group_concat.

There are several parts to the change:
* Adding support for generic UDAs. Previous the codegen code did not
  handle multiple input arguments or NULL return values.
* Defaulting to using the UDA interface when there is not a special
  codegen path (we have implementations of all builtin aggregate
  functions for the interpreted path).
* Remove all the logic to disable codegen for the special cases that now
  are supported.

Also fix the generation of code to get/set NULL bits since I needed
to add functionality there anyway.

Testing:
Add tests that check that codegen was enabled for builtin aggregate
functions. Also fix some gaps in the preexisting tests.

Also add tests for UDAs that check input/output nulls are handled
correctly, in anticipation of enabling codegen for arbitrary UDAs.

Perf:
Ran local TPC-H and targeted perf. Spent a lot of time on TPC-H Q1,
since my original approach regressed it ~5%. In the end the problem was
to do with the ordering of loads/stores to the slot and null bit in the
generated code: the previous version of the code exploited some
properties of the particular aggregate function. I ended up replicating
this behaviour to avoid regressing perf.

Change-Id: Id9dc21d1d676505d3617e1e4f37557397c4fb260
---
M be/src/benchmarks/hash-benchmark.cc
M be/src/codegen/codegen-anyval.cc
M be/src/codegen/codegen-anyval.h
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/llvm-codegen-test.cc
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exec/aggregation-node.cc
M be/src/exec/exec-node.cc
M be/src/exec/hash-join-node.cc
M be/src/exec/hash-table.cc
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-scanner.cc
M be/src/exec/old-hash-table.cc
M be/src/exec/partitioned-aggregation-node-ir.cc
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exec/partitioned-aggregation-node.h
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/text-converter.cc
M be/src/exprs/aggregate-functions-ir.cc
M be/src/exprs/case-expr.cc
M be/src/exprs/compound-predicates.cc
M be/src/exprs/expr-codegen-test.cc
M be/src/exprs/expr.cc
M be/src/exprs/literal.cc
M be/src/exprs/null-literal.cc
M be/src/exprs/scalar-fn-call.cc
M be/src/exprs/slot-ref.cc
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M be/src/runtime/tuple.cc
M be/src/runtime/types.h
M be/src/testutil/test-udas.cc
M be/src/testutil/test-udfs.cc
M be/src/util/tuple-row-compare.cc
M testdata/workloads/functional-query/queries/QueryTest/uda.test
M tests/common/test_result_verifier.py
M tests/query_test/test_aggregation.py
M tests/query_test/test_udfs.py
39 files changed, 845 insertions(+), 561 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/4655/7
-- 
To view, visit http://gerrit.cloudera.org:8080/4655
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id9dc21d1d676505d3617e1e4f37557397c4fb260
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions

2016-10-24 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#7).

Change subject: IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions
..

IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions

This change enables codegen for all builtin aggregate functions,
e.g. timestamp functions and group_concat.

There are several parts to the change:
* Adding support for generic UDAs. Previous the codegen code did not
  handle multiple input arguments or NULL return values.
* Defaulting to using the UDA interface when there is not a special
  codegen path (we have implementations of all builtin aggregate
  functions for the interpreted path).
* Remove all the logic to disable codegen for the special cases that now
  are supported.

Also fix the generation of code to get/set NULL bits since I needed
to add functionality there anyway.

Testing:
Add tests that check that codegen was enabled for builtin aggregate
functions. Also fix some gaps in the preexisting tests.

Also add tests for UDAs that check input/output nulls are handled
correctly, in anticipation of enabling codegen for arbitrary UDAs.

Perf:
Ran local TPC-H and targeted perf. Spent a lot of time on TPC-H Q1,
since my original approach regressed it ~5%. In the end the problem was
to do with the ordering of loads/stores to the slot and null bit in the
generated code: the previous version of the code exploited some
properties of the particular aggregate function. I ended up replicating
this behaviour to avoid regressing perf.

Change-Id: Id9dc21d1d676505d3617e1e4f37557397c4fb260
---
M be/src/benchmarks/hash-benchmark.cc
M be/src/codegen/codegen-anyval.cc
M be/src/codegen/codegen-anyval.h
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/llvm-codegen-test.cc
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exec/aggregation-node.cc
M be/src/exec/exec-node.cc
M be/src/exec/hash-join-node.cc
M be/src/exec/hash-table.cc
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-scanner.cc
M be/src/exec/old-hash-table.cc
M be/src/exec/partitioned-aggregation-node-ir.cc
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exec/partitioned-aggregation-node.h
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/text-converter.cc
M be/src/exprs/aggregate-functions-ir.cc
M be/src/exprs/case-expr.cc
M be/src/exprs/compound-predicates.cc
M be/src/exprs/expr-codegen-test.cc
M be/src/exprs/expr.cc
M be/src/exprs/literal.cc
M be/src/exprs/null-literal.cc
M be/src/exprs/scalar-fn-call.cc
M be/src/exprs/slot-ref.cc
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M be/src/runtime/tuple.cc
M be/src/runtime/types.h
M be/src/testutil/test-udas.cc
M be/src/testutil/test-udfs.cc
M be/src/util/tuple-row-compare.cc
M testdata/workloads/functional-query/queries/QueryTest/uda.test
M tests/common/test_result_verifier.py
M tests/query_test/test_aggregation.py
M tests/query_test/test_udfs.py
39 files changed, 845 insertions(+), 561 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/4655/7
-- 
To view, visit http://gerrit.cloudera.org:8080/4655
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id9dc21d1d676505d3617e1e4f37557397c4fb260
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions

2016-10-24 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions
..


Patch Set 6:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/4655/6/be/src/exec/partitioned-aggregation-node.cc
File be/src/exec/partitioned-aggregation-node.cc:

PS6, Line 1063: if we initialize the destination value to 0 (with the NULL bit 
set)
> If I understand it correctly, this initialization is done at the initFn() o
We could do that but it doesn't seem worth it (the interpreted path has enough 
additional overhead that I doubt it makes a measurable difference).


PS6, Line 1617: Value* dst_ptr
> May help to explain what dst_ptr actually is ? It seems to be pointer to th
Done


PS6, Line 1631: (agg_op == AggFnEvaluator::MIN || agg_op == 
AggFnEvaluator::MAX))
> nit: I find it easier to read to check the agg_op first:
Done


PS6, Line 1639: slot_desc->CodegenSetNullIndicator
> Can this be skipped if (!slot_desc->is_nullable()) ? Same for below.
Min/Max should always be nullable (if there are no input values, the output is 
NULL). Added a DCHECK to assert this.


PS6, Line 1663: agg_op == AggFnEvaluator::MIN
> agg_op != OTHER ? or you wanna keep this list for clarity ?
Yeah I prefer this way, also since it won't break if someone adds a new 
aggregate op


PS6, Line 1718: false);
> Why not just pass true and skip calling CloneFunction() below ?
I somehow missed that we called Clonefunction() below.


PS6, Line 1742:   Value* dst_lowered_ptr = dst.GetLoweredPtr("dst_lowered_ptr");
  :   Type* dst_unlowered_ptr_type = 
CodegenAnyVal::GetUnloweredPtrType(codegen, dst_type);
  :   Value* dst_unlowered_ptr = builder->CreateBitCast(
  :   dst_lowered_ptr, dst_unlowered_ptr_type, 
"dst_unlowered_ptr");
> Why not dst.GetUnloweredPtr() ?
We use both dst_unlowered_ptr and dst_lowered_ptr so we need both variables. We 
can't just call both GetLoweredPtr() and GetUnloweredPtr() here because 
dst_lowered_ptr and dst_unlowered_ptr need to point to the same alloca().


-- 
To view, visit http://gerrit.cloudera.org:8080/4655
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id9dc21d1d676505d3617e1e4f37557397c4fb260
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3884: Support TYPE TIMESTAMP for HashTableCtx::CodegenAssignNullValue()

2016-10-24 Thread Michael Ho (Code Review)
Hello Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4794

to look at the new patch set (#3).

Change subject: IMPALA-3884: Support TYPE_TIMESTAMP for 
HashTableCtx::CodegenAssignNullValue()
..

IMPALA-3884: Support TYPE_TIMESTAMP for HashTableCtx::CodegenAssignNullValue()

This change implements support for TYPE_TIMESTAMP for
HashTableCtx::CodegenAssignNullValue(). TimestampValue itself
is 16 bytes in size. To match RawValue::Write() in the
interpreted path, CodegenAssignNullValue() emits code to assign
HashUtil::FNV_SEED to both the upper and lower 64-bit of the
destination value. This change also fixes the handling of 128-bit
Decimal16Value in CodegenAssignNullValue() so the emitted code
matches the behavior of the interpreted path.

Change-Id: I0211d38cbef46331e0006fa5ed0680e6e0867bc8
---
M be/src/codegen/codegen-anyval.cc
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exec/hash-table.cc
M be/src/exec/old-hash-table.cc
M be/src/util/bit-util.h
M testdata/workloads/functional-query/queries/QueryTest/joins.test
7 files changed, 72 insertions(+), 37 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/94/4794/3
-- 
To view, visit http://gerrit.cloudera.org:8080/4794
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0211d38cbef46331e0006fa5ed0680e6e0867bc8
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] Minor compute stats script fixes

2016-10-24 Thread Henry Robinson (Code Review)
Henry Robinson has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4825

Change subject: Minor compute stats script fixes
..

Minor compute stats script fixes

* Change run-step to output full log path
* Change text to say "Computing table stats" rather than "Computing
  HBase stats" when running compute-table-stats.sh

Change-Id: I326f4c370fda8d5e388af8e2395623185c06bc07
---
M testdata/bin/create-load-data.sh
M testdata/bin/run-step.sh
2 files changed, 3 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/25/4825/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4825
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I326f4c370fda8d5e388af8e2395623185c06bc07
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-24 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-2521: Add clustered hint to insert statements
..


Patch Set 9:

(6 comments)

Thanks for the comments, please see PS9.

http://gerrit.cloudera.org:8080/#/c/4745/5/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
File fe/src/main/java/org/apache/impala/analysis/InsertStmt.java:

Line 602:   }
> Thanks. The reason why it works without is that SortNode.init() during plan
Ok


Line 604:   if (matchFound && table_ instanceof KuduTable) {
> I vote for keep
Ok


http://gerrit.cloudera.org:8080/#/c/4745/8/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
File fe/src/main/java/org/apache/impala/analysis/InsertStmt.java:

Line 603:   // Store exprs for Kudu key columns.
> add line breaks between if blocks, it's harder to navigate without those ('
Done


http://gerrit.cloudera.org:8080/#/c/4745/8/fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
File fe/src/main/java/org/apache/impala/analysis/QueryStmt.java:

Line 237:* outputs of aggregation.
> something missing there. (i know, not your fault.)
Yes, I couldn't figure out in git blame what that sentence originally meant. 
Since I don't understand enough of the code to complete it, I removed it. Let 
me know if I should replace it with something else.


http://gerrit.cloudera.org:8080/#/c/4745/8/testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
File testdata/workloads/functional-planner/queries/PlannerTest/kudu.test:

Line 247: # IMPALA-2521: clustered insert into table adds sort node, correctly 
substituting exprs.
> you can't always see the correct substitution until you run the query (beca
Done


Line 249: select id, name, maxzip as zip
> use fewer columns, add an analytic function in the inline view
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Add distcc infrastructure.

2016-10-24 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: Add distcc infrastructure.
..


Patch Set 2:

Would be good to send out an email to dev@ once this goes in

-- 
To view, visit http://gerrit.cloudera.org:8080/4820
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4bc78ad46dda13e4533183195af632f46377cae
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] Add distcc infrastructure.

2016-10-24 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: Add distcc infrastructure.
..


Patch Set 2: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4820
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4bc78ad46dda13e4533183195af632f46377cae
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions

2016-10-24 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions
..


Patch Set 6:

(7 comments)

Looking good. Some more comments for now. Still doing another pass.

http://gerrit.cloudera.org:8080/#/c/4655/6/be/src/exec/partitioned-aggregation-node.cc
File be/src/exec/partitioned-aggregation-node.cc:

PS6, Line 1063: if we initialize the destination value to 0 (with the NULL bit 
set)
If I understand it correctly, this initialization is done at the initFn() of 
various aggregate operators (e.g. initNull() and initZero()) so this is even 
true for the interpreted path, right ? Not sure why we cannot do that even for 
the interpreted path too ?


PS6, Line 1617: Value* dst_ptr
May help to explain what dst_ptr actually is ? It seems to be pointer to the 
slot for this agg_fn_evaluator in the intermediate tuple


PS6, Line 1631: (agg_op == AggFnEvaluator::MIN || agg_op == 
AggFnEvaluator::MAX))
nit: I find it easier to read to check the agg_op first:

if ((agg_op == ..) && dst_is_numeric_or_bool)) {
...
}


PS6, Line 1639: slot_desc->CodegenSetNullIndicator
Can this be skipped if (!slot_desc->is_nullable()) ? Same for below.


PS6, Line 1663: agg_op == AggFnEvaluator::MIN
agg_op != OTHER ? or you wanna keep this list for clarity ?


PS6, Line 1718: false);
Why not just pass true and skip calling CloneFunction() below ?


PS6, Line 1742:   Value* dst_lowered_ptr = dst.GetLoweredPtr("dst_lowered_ptr");
  :   Type* dst_unlowered_ptr_type = 
CodegenAnyVal::GetUnloweredPtrType(codegen, dst_type);
  :   Value* dst_unlowered_ptr = builder->CreateBitCast(
  :   dst_lowered_ptr, dst_unlowered_ptr_type, 
"dst_unlowered_ptr");
Why not dst.GetUnloweredPtr() ?


-- 
To view, visit http://gerrit.cloudera.org:8080/4655
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id9dc21d1d676505d3617e1e4f37557397c4fb260
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3676: Use clang as a static analysis tool

2016-10-24 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-3676: Use clang as a static analysis tool
..


Patch Set 6:

(8 comments)

The commit message doesn't really tell the full story here. In particular, you 
should say something about AlignmentNew and what it is for.

The rearrangement of class members to group by type seems like it sacrifices 
readability for performance in some strange cases  - the statestore is a good 
example, where there is only ever one statestore and it is not (as far as we 
know) highly sensitive to its in-memory layout. What criteria did you use to 
decide that a class' members should be grouped-by-type?

http://gerrit.cloudera.org:8080/#/c/4758/6/be/src/common/init.cc
File be/src/common/init.cc:

PS6, Line 100: [[noreturn]]
can this go on the previous line? It might be easier to read that way (similar 
to how we do template<> declarations).


http://gerrit.cloudera.org:8080/#/c/4758/6/be/src/exprs/aggregate-functions-ir.cc
File be/src/exprs/aggregate-functions-ir.cc:

PS6, Line 173: 
 : 
Why remove these comments?


PS6, Line 176: 
why remove this?


http://gerrit.cloudera.org:8080/#/c/4758/6/be/src/exprs/aggregate-functions.h
File be/src/exprs/aggregate-functions.h:

PS6, Line 182: 10
can you replace that with HLL_PRECISION?


http://gerrit.cloudera.org:8080/#/c/4758/6/be/src/runtime/tmp-file-mgr.cc
File be/src/runtime/tmp-file-mgr.cc:

PS6, Line 140: std::
remove std:: in cc files


http://gerrit.cloudera.org:8080/#/c/4758/6/be/src/statestore/statestore.h
File be/src/statestore/statestore.h:

PS6, Line 82: CacheLineAligned
why?


PS6, Line 253: boost::mutex exit_flag_lock_;
This is a good example of where logical grouping is broken by grouping-by-type. 
The lock and the flag are related, but are not next to each other.


http://gerrit.cloudera.org:8080/#/c/4758/6/be/src/util/aligned-new.h
File be/src/util/aligned-new.h:

PS6, Line 27: Objects that must be allocated 
What does 'must be allocated' mean - for correctness or performance or both?


-- 
To view, visit http://gerrit.cloudera.org:8080/4758
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4ed168488cb30ddeccd0087f3840541d858f9c06
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3211: provide toolchain build id for bootstrapping

2016-10-24 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-3211: provide toolchain build id for bootstrapping
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4771/2//COMMIT_MSG
Commit Message:

PS2, Line 10: cceeded
I think the reason Gerrit displays these in blue while the first two letters 
are black is because "cceeded" is a 7-character git hash :-)


http://gerrit.cloudera.org:8080/#/c/4771/2/bin/impala-config.sh
File bin/impala-config.sh:

Line 56: : ${IMPALA_TOOLCHAIN_BUILD_ID=249-2267164200}
This could use a bit more explanation. How should a person discover this value? 
When should it be changed?


-- 
To view, visit http://gerrit.cloudera.org:8080/4771
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibcc25ae82511713d0ff05ded37ef162925f2f0fb
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Remove unused Bitmap code.

2016-10-24 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: Remove unused Bitmap code.
..


Patch Set 1: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4801
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I95fcaaa40243999800c2ec2ead5b3479d66a63e7
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Internal Jenkins
Gerrit-HasComments: No


[Impala-ASF-CR] Add all build targets to CMake and speed up builds

2016-10-24 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: Add all build targets to CMake and speed up builds
..


Patch Set 1:

(5 comments)

Have you tested this in a clean environment, maybe with Docker or Jenkins?

http://gerrit.cloudera.org:8080/#/c/4790/1/bin/make_impala.sh
File bin/make_impala.sh:

Line 90:   echo "[-fe_only] : Builds fe only."
The reason for this flag is so that make_impala.sh still builds the backend if 
you pass it no args?

If so, what would be the consequences of breaking compatibility and requiring 
-be if the user wants to build the backend?


PS1, Line 92: ."
cand benchmarks


PS1, Line 96: like
What else?


PS1, Line 174: exit
This line can be omitted


http://gerrit.cloudera.org:8080/#/c/4790/1/buildall.sh
File buildall.sh:

PS1, Line 305: all
Should the meaning of "all" be put into make_impala.sh?


-- 
To view, visit http://gerrit.cloudera.org:8080/4790
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I23617adf13bdeb034c24f6bba14b5ae480e8dd26
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Jim Apple 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Add distcc infrastructure.

2016-10-24 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: Add distcc infrastructure.
..


Patch Set 2:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/4820/1/bin/distcc/distcc.sh
File bin/distcc/distcc.sh:

Line 1: #!/bin/bash
> Needs a copyright header
Done


Line 46: fi
> Should be able to get rid of this, we don't pick the linker up from PATH no
Done


http://gerrit.cloudera.org:8080/#/c/4820/1/bin/distcc/distcc_env.sh
File bin/distcc/distcc_env.sh:

Line 1: # Licensed to the Apache Software Foundation (ASF) under one
> Needs a copyright header
Done


PS1, Line 44: and not fo
> Can you check if BUILD_FARM is defined and print a message? Otherwise peopl
Done


Line 131: IMPALA_USE_DISTCC=true
> All this linker stuff is irrelevant now that we're using the toolchain link
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4820
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4bc78ad46dda13e4533183195af632f46377cae
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Add distcc infrastructure.

2016-10-24 Thread Jim Apple (Code Review)
Jim Apple has uploaded a new patch set (#2).

Change subject: Add distcc infrastructure.
..

Add distcc infrastructure.

This has been working for several months, and it it was written mainly
by Casey Ching while he was at Cloudera working on Impala.

Change-Id: Ia4bc78ad46dda13e4533183195af632f46377cae
---
M .gitignore
A bin/distcc/.gitignore
A bin/distcc/README.md
A bin/distcc/distcc.sh
A bin/distcc/distcc_env.sh
5 files changed, 330 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/4820/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4820
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia4bc78ad46dda13e4533183195af632f46377cae
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3725 Support Kudu UPSERT in Impala

2016-10-24 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change.

Change subject: IMPALA-3725 Support Kudu UPSERT in Impala
..


Patch Set 12: Code-Review+1

(1 comment)

Carrying forward Alex's +2

http://gerrit.cloudera.org:8080/#/c/4047/11/be/src/exec/kudu-table-sink.cc
File be/src/exec/kudu-table-sink.cc:

Line 174: } else {
> else DCHECK that the referenced columns are empty?
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4047
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8df5cea36b642e267f85ff6b163f3dd96b8386e9
Gerrit-PatchSet: 12
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3552: make incremental stats max serialized size configurable

2016-10-24 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change.

Change subject: IMPALA-3552: make incremental stats max serialized size 
configurable
..


Patch Set 1:

@Yonghyun I had a chat with Alex on the best way to move forward. He seems to 
agree with using BackendConfig to pass the params. As per our discussion, its 
better if we create a thrift struct (ex: TBackendConfig or some such similar), 
serialize it and pass it using JNI, unpack it in the JniCatalog/Frontend and 
populate it in a map in the static BackendConfig class. We could have getter 
methods that return the appropriated type cast'ed value corresponding to a 
given config key. You could clean up all the existing config arguments too 
using this.

-- 
To view, visit http://gerrit.cloudera.org:8080/4772
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4d189ffc1c56e7a4e27e6233e9119250a2395a19
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Yonghyun Hwang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Huaisi Xu 
Gerrit-Reviewer: Yonghyun Hwang
Gerrit-HasComments: No


[Impala-ASF-CR] Enabling end-to-end tests on a remote cluster

2016-10-24 Thread David Knupp (Code Review)
David Knupp has uploaded a new patch set (#4).

Change subject: Enabling end-to-end tests on a remote cluster
..

Enabling end-to-end tests on a remote cluster

This patch enables data loading and running end-to-end tests on a remote
cluster. The requirements to run the tests on a remote cluster are

  - CDH cluster that is CM managed
  - KMS and KeyTrustee installed and available as service
  - Hive warehouse dir points to /test-warehouse

The new remote_load_data.py script takes a CM host as argument and will
load the test warehouse snapshot on the first cluster managed by this
instance of CM. It will automatically pick the necessary configuration
needed to perform the data load process.

Usage: remote_data_load.py [options] cm_host

Options:
  -h, --helpshow this help message and exit
  --cm-user=CM_USER Cloudera Manager admin user
  --cm-pass=CM_PASS Cloudera Manager admin user password
  --gateway=GATEWAY Gateway host to upload the data from. If not set, uses
the CM host as gateway.
  --ssh-user=SSH_USER   System user on the remote machine with passwordless
SSH configured.
  --no-load Do not try to load the snapshot
  --exploration-strategy=EXPLORATION_STRATEGY
  --testRun end-to-end tests against cluster

Change-Id: I1f443a1728a1d28168090c6f54e82dec2cb073e9
---
M bin/load-data.py
A bin/remote_data_load.py
M testdata/bin/compute-table-stats.sh
M testdata/bin/create-load-data.sh
M testdata/bin/create-table-many-blocks.sh
M testdata/bin/generate-schema-statements.py
M testdata/bin/load-test-warehouse-snapshot.sh
M testdata/bin/load_nested.py
M testdata/bin/run-step.sh
M testdata/bin/setup-hdfs-env.sh
10 files changed, 575 insertions(+), 60 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/69/4769/4
-- 
To view, visit http://gerrit.cloudera.org:8080/4769
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1f443a1728a1d28168090c6f54e82dec2cb073e9
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Harrison Sheinblatt 
Gerrit-Reviewer: Martin Grund 
Gerrit-Reviewer: Michael Brown 


[Impala-ASF-CR] Enabling end-to-end tests on a remote cluster

2016-10-24 Thread David Knupp (Code Review)
David Knupp has uploaded a new patch set (#4).

Change subject: Enabling end-to-end tests on a remote cluster
..

Enabling end-to-end tests on a remote cluster

This patch enables data loading and running end-to-end tests on a remote
cluster. The requirements to run the tests on a remote cluster are

  - CDH cluster that is CM managed
  - KMS and KeyTrustee installed and available as service
  - Hive warehouse dir points to /test-warehouse

The new remote_load_data.py script takes a CM host as argument and will
load the test warehouse snapshot on the first cluster managed by this
instance of CM. It will automatically pick the necessary configuration
needed to perform the data load process.

Usage: remote_data_load.py [options] cm_host

Options:
  -h, --helpshow this help message and exit
  --cm-user=CM_USER Cloudera Manager admin user
  --cm-pass=CM_PASS Cloudera Manager admin user password
  --gateway=GATEWAY Gateway host to upload the data from. If not set, uses
the CM host as gateway.
  --ssh-user=SSH_USER   System user on the remote machine with passwordless
SSH configured.
  --no-load Do not try to load the snapshot
  --exploration-strategy=EXPLORATION_STRATEGY
  --testRun end-to-end tests against cluster

Change-Id: I1f443a1728a1d28168090c6f54e82dec2cb073e9
---
M bin/load-data.py
A bin/remote_data_load.py
M testdata/bin/compute-table-stats.sh
M testdata/bin/create-load-data.sh
M testdata/bin/create-table-many-blocks.sh
M testdata/bin/generate-schema-statements.py
M testdata/bin/load-test-warehouse-snapshot.sh
M testdata/bin/load_nested.py
M testdata/bin/run-step.sh
M testdata/bin/setup-hdfs-env.sh
10 files changed, 575 insertions(+), 60 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/69/4769/4
-- 
To view, visit http://gerrit.cloudera.org:8080/4769
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1f443a1728a1d28168090c6f54e82dec2cb073e9
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Harrison Sheinblatt 
Gerrit-Reviewer: Martin Grund 
Gerrit-Reviewer: Michael Brown 


[Impala-ASF-CR] IMPALA-4300: Speed up BloomFilter::Or with SIMD

2016-10-24 Thread Jim Apple (Code Review)
Hello Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4813

to look at the new patch set (#4).

Change subject: IMPALA-4300: Speed up BloomFilter::Or with SIMD
..

IMPALA-4300: Speed up BloomFilter::Or with SIMD

The previous code was not written in a way that GCC could
auto-vectorize it. Manually vectorizing speeds up BloomFilter::Or by
up to 184x.

Change-Id: I840799d9cfb81285c796e2abfe2029bb869b0f67
---
M be/src/benchmarks/bloom-filter-benchmark.cc
M be/src/testutil/mem-util.h
M be/src/util/bloom-filter.cc
M be/src/util/cpu-info.cc
M be/src/util/cpu-info.h
5 files changed, 178 insertions(+), 75 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/13/4813/4
-- 
To view, visit http://gerrit.cloudera.org:8080/4813
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I840799d9cfb81285c796e2abfe2029bb869b0f67
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4153: Fix count(*) on all blank('') columns - test

2016-10-24 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4153: Fix count(*) on all blank('') columns - test
..


Patch Set 1: Verified-1

Build failed: 
http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/339/

-- 
To view, visit http://gerrit.cloudera.org:8080/4755
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I23923f95f43d67977ee1520a1fc09ce297548b3f
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-HasComments: No