[Impala-ASF-CR] IMPALA-6802 (part 3): Clean up authorization tests

2018-05-11 Thread Adam Holley (Code Review)
Adam Holley has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10358 )

Change subject: IMPALA-6802 (part 3): Clean up authorization tests
..


Patch Set 2: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/10358
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9681cc3c7094db33ab7c5caa69b99dd803b908b7
Gerrit-Change-Number: 10358
Gerrit-PatchSet: 2
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Comment-Date: Sat, 12 May 2018 03:44:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6907: Close stale connections to removed cluster members

2018-05-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10327 )

Change subject: IMPALA-6907: Close stale connections to removed cluster members
..


Patch Set 5: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/10327
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I41b7297cf665bf291b09b23524d19b1d10ab281d
Gerrit-Change-Number: 10327
Gerrit-PatchSet: 5
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Comment-Date: Sat, 12 May 2018 03:32:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6907: Close stale connections to removed cluster members

2018-05-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10327 )

Change subject: IMPALA-6907: Close stale connections to removed cluster members
..

IMPALA-6907: Close stale connections to removed cluster members

Previously, ImpalaServer::MembershipCallback() is used by each
Impala backend node to update cluster membership. It also removes
stale connections to nodes which are no longer members of the cluster.
However, the way it detects removed member is flawed as it relies
on query_locations_ to determine whether stale connections may
exist to the removed members. query_locations_ is a map of host
name to a set of queries running on that host. A entry for a remote
node only exists in query_locations_ if an Impalad node has acted
as coordinator of a query with fragment instances scheduled to run
on that remote node.

This change fixes this problem by closing connections to remote
hosts which are removed from the cluster regardless of whether
it can be found in query_locations_. A new test is added to
exercise this path by restarting Impalad backend nodes between
queries. Also change impala_cluster.py to use bin/start-impala.sh
to start Impala demon instead of directly forking and exec'ing
Impalad. This is needed as start-impala.sh sets up the proper
Java related environment variables.

Change-Id: I41b7297cf665bf291b09b23524d19b1d10ab281d
Reviewed-on: http://gerrit.cloudera.org:8080/10327
Reviewed-by: Michael Ho 
Tested-by: Impala Public Jenkins 
---
M be/src/service/impala-server.cc
M tests/common/impala_cluster.py
M tests/custom_cluster/test_restart_services.py
3 files changed, 56 insertions(+), 35 deletions(-)

Approvals:
  Michael Ho: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/10327
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I41b7297cf665bf291b09b23524d19b1d10ab281d
Gerrit-Change-Number: 10327
Gerrit-PatchSet: 6
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tianyi Wang 


[Impala-ASF-CR](2.x) IMPALA-7010: don't run memory usage tests on non-HDFS

2018-05-11 Thread Tim Armstrong (Code Review)
Hello Impala Public Jenkins,

I'd like you to do a code review. Please visit

http://gerrit.cloudera.org:8080/10387

to review the following change.


Change subject: IMPALA-7010: don't run memory usage tests on non-HDFS
..

IMPALA-7010: don't run memory usage tests on non-HDFS

Moved a number of tests with tuned mem_limits. In some cases
this required separating the tests from non-tuned functional
tests.

TestQueryMemLimit used very high and very low limits only, so seemed
safe to run in all configurations.

Change-Id: I9686195a29dde2d87b19ef8bb0e93e08f8bee662
Reviewed-on: http://gerrit.cloudera.org:8080/10370
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins 
---
A testdata/workloads/functional-query/queries/QueryTest/insert-mem-limit.test
M testdata/workloads/functional-query/queries/QueryTest/insert.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test
A 
testdata/workloads/functional-query/queries/QueryTest/kudu_insert_mem_limit.test
A 
testdata/workloads/functional-query/queries/QueryTest/nested-types-tpch-mem-limit.test
M testdata/workloads/functional-query/queries/QueryTest/nested-types-tpch.test
M testdata/workloads/functional-query/queries/QueryTest/spilling-aggs.test
R 
testdata/workloads/functional-query/queries/QueryTest/spilling-regression-exhaustive.test
M tests/common/skip.py
M tests/query_test/test_insert.py
M tests/query_test/test_kudu.py
M tests/query_test/test_mem_usage_scaling.py
M tests/query_test/test_nested_types.py
M tests/query_test/test_sort.py
M tests/query_test/test_spilling.py
15 files changed, 198 insertions(+), 157 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/10387/1
--
To view, visit http://gerrit.cloudera.org:8080/10387
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: newchange
Gerrit-Change-Id: I9686195a29dde2d87b19ef8bb0e93e08f8bee662
Gerrit-Change-Number: 10387
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR](2.x) IMPALA-7010: don't run memory usage tests on non-HDFS

2018-05-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10387 )

Change subject: IMPALA-7010: don't run memory usage tests on non-HDFS
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2460/


--
To view, visit http://gerrit.cloudera.org:8080/10387
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: comment
Gerrit-Change-Id: I9686195a29dde2d87b19ef8bb0e93e08f8bee662
Gerrit-Change-Number: 10387
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 12 May 2018 02:47:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6957: calc thread resource requirement in planner

2018-05-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10256 )

Change subject: IMPALA-6957: calc thread resource requirement in planner
..


Patch Set 8: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/10256
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I982837ef883457fa4d2adc3bdbdc727353469140
Gerrit-Change-Number: 10256
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 12 May 2018 01:43:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6957: calc thread resource requirement in planner

2018-05-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10256 )

Change subject: IMPALA-6957: calc thread resource requirement in planner
..

IMPALA-6957: calc thread resource requirement in planner

This only factors in fragment execution threads. E.g. this does *not*
try to account for the number of threads on the old Thrift RPC
code path if that is enabled.

This is loosely related to the old VCores estimate, but is different in
that it:
* Directly ties into the notion of required threads in
  ThreadResourceMgr.
* Is a strict upper bound on the number of such threads, rather than
  an estimate.

Does not include "optional" threads. ThreadResourceMgr in the backend
bounds the number of "optional" threads per impalad, so the number of
execution threads on a backend is limited by

  sum(required threads per query) +
  CpuInfo::num_cores() * FLAGS_num_threads_per_core

DCHECKS in the backend enforce that the calculation is correct. They
were actually hit in KuduScanNode because of some races in thread
management leading to multiple "required" threads running. Now the
first thread in the multithreaded scans never exits, which means
that it's always safe for any of the other threads to exit early,
which simplifies the logic a lot.

Testing:
Updated planner tests.

Ran core tests.

Change-Id: I982837ef883457fa4d2adc3bdbdc727353469140
Reviewed-on: http://gerrit.cloudera.org:8080/10256
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins 
---
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/kudu-scan-node.cc
M be/src/exec/kudu-scan-node.h
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/query-state.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/thread-resource-mgr.cc
M be/src/runtime/thread-resource-mgr.h
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/query-schedule.h
M be/src/scheduling/scheduler.cc
M common/thrift/Frontend.thrift
M common/thrift/ImpalaInternalService.thrift
M common/thrift/Planner.thrift
M fe/src/main/java/org/apache/impala/planner/AggregationNode.java
M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java
M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/PlanFragment.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/planner/ResourceProfile.java
M fe/src/main/java/org/apache/impala/planner/ResourceProfileBuilder.java
M fe/src/main/java/org/apache/impala/planner/SortNode.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
M testdata/workloads/functional-planner/queries/PlannerTest/disable-codegen.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/kudu-selectivity.test
M testdata/workloads/functional-planner/queries/PlannerTest/max-row-size.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/partition-pruning.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/sort-expr-materialization.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test
M testdata/workloads/functional-planner/queries/PlannerTest/tablesample.test
M testdata/workloads/functional-planner/queries/PlannerTest/union.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level0.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level1.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level2.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level3.test
43 files changed, 1,977 insertions(+), 1,931 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/10256
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I982837ef883457fa4d2adc3bdbdc727353469140
Gerrit-Change-Number: 10256
Gerrit-PatchSet: 9
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala 

[Impala-ASF-CR] IMPALA-6221: [DOCS] Correct syntax for Kudu Partition clause

2018-05-11 Thread Alex Rodoni (Code Review)
Alex Rodoni has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10353 )

Change subject: IMPALA-6221: [DOCS] Correct syntax for Kudu Partition clause
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10353/1/docs/topics/impala_create_table.xml
File docs/topics/impala_create_table.xml:

http://gerrit.cloudera.org:8080/#/c/10353/1/docs/topics/impala_create_table.xml@215
PS1, Line 215:
> missing a comma
Done



--
To view, visit http://gerrit.cloudera.org:8080/10353
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8d8ccdcc8b28fcd44511040f1e05969af83b3d1e
Gerrit-Change-Number: 10353
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Sat, 12 May 2018 01:33:17 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6221: [DOCS] Correct syntax for Kudu Partition clause

2018-05-11 Thread Alex Rodoni (Code Review)
Hello Thomas Tauber-Marshall, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10353

to look at the new patch set (#2).

Change subject: IMPALA-6221: [DOCS] Correct syntax for Kudu Partition clause
..

IMPALA-6221: [DOCS] Correct syntax for Kudu Partition clause

Change-Id: I8d8ccdcc8b28fcd44511040f1e05969af83b3d1e
---
M docs/topics/impala_create_table.xml
1 file changed, 2 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/10353/2
--
To view, visit http://gerrit.cloudera.org:8080/10353
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8d8ccdcc8b28fcd44511040f1e05969af83b3d1e
Gerrit-Change-Number: 10353
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-7010: don't run memory usage tests on non-HDFS

2018-05-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10370 )

Change subject: IMPALA-7010: don't run memory usage tests on non-HDFS
..


Patch Set 5:

This change did not cherrypick successfully into branch 2.x. To resolve this, 
please do the cherry-pick manually and submit it to Gerrit at refs/for/2.x or 
add an exception to the branch 2.x copy of bin/ignored_commits.json. Thanks, 
your friendly bot at https://jenkins.impala.io/job/cherrypick-2.x-and-test/482/ 
.


--
To view, visit http://gerrit.cloudera.org:8080/10370
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9686195a29dde2d87b19ef8bb0e93e08f8bee662
Gerrit-Change-Number: 10370
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 12 May 2018 01:30:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6983: stress: don't write a null runtime profile

2018-05-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10381 )

Change subject: IMPALA-6983: stress: don't write a null runtime profile
..


Patch Set 1: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/2457/


--
To view, visit http://gerrit.cloudera.org:8080/10381
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic8299a8a97ad1f2bd1f2927e3111db8df1d3a3e5
Gerrit-Change-Number: 10381
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Nithya Janarthanan 
Gerrit-Comment-Date: Sat, 12 May 2018 00:59:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7003: Deflake erasure coding data loading

2018-05-11 Thread Tianyi Wang (Code Review)
Tianyi Wang has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/10362 )

Change subject: IMPALA-7003: Deflake erasure coding data loading
..

IMPALA-7003: Deflake erasure coding data loading

Erasure coding data loading is flaky in two ways:
1. HBase sometimes doesn't work because of HBase-19369
2. Nested data loading sometimes fails because the HDFS namenode cannot
   find enough good datanodes.

For problem 1, this patch enables erasure coding only on /test-warehouse
directory. For problem 2, this patch sets
dfs.namenode.redundancy.considerLoad to false, preventing namenode from
excluding heavily-loaded datanodes.

Change-Id: I219106cd3ec7ffab7a834700f2a722b165e5f66c
---
M bin/impala-config.sh
M testdata/bin/create-load-data.sh
M testdata/bin/load-test-warehouse-snapshot.sh
M testdata/bin/setup-hdfs-env.sh
M testdata/cluster/node_templates/common/etc/hadoop/conf/hdfs-site.xml.tmpl
5 files changed, 44 insertions(+), 12 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/10362/2
--
To view, visit http://gerrit.cloudera.org:8080/10362
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I219106cd3ec7ffab7a834700f2a722b165e5f66c
Gerrit-Change-Number: 10362
Gerrit-PatchSet: 2
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tianyi Wang 


[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state

2018-05-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10158 )

Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify 
state
..

IMPALA-5384, part 2: Simplify Coordinator locking and clarify state

The is the final change to clarify and break up the Coordinator's lock.
The state machine for the coordinator is made explicit, distinguishing
between executing state and multiple terminal states. Logic to
transition into a terminal state is centralized in one location and
executes exactly once for each coordinator object.

Derived from a patch for IMPALA_5384 by Marcel Kornacker.

Testing:
- exhaustive functional tests
- stress test on minicluster with memory overcommitment. Verified from
  the logs that this exercises all these paths:
  - successful queries
  - client requested cancellation
  - error from exec FInstances RPC
  - error reported asynchronously via report status RPC
  - eos before backend execution completed

Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e
Reviewed-on: http://gerrit.cloudera.org:8080/10158
Reviewed-by: Dan Hecht 
Tested-by: Impala Public Jenkins 
---
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/service/client-request-state.cc
M be/src/service/impala-server.h
M be/src/util/counting-barrier.h
6 files changed, 393 insertions(+), 392 deletions(-)

Approvals:
  Dan Hecht: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/10158
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e
Gerrit-Change-Number: 10158
Gerrit-PatchSet: 22
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state

2018-05-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10158 )

Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify 
state
..


Patch Set 21: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/10158
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e
Gerrit-Change-Number: 10158
Gerrit-PatchSet: 21
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 12 May 2018 00:46:00 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6907: Close stale connections to removed cluster members

2018-05-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10327 )

Change subject: IMPALA-6907: Close stale connections to removed cluster members
..


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2459/


--
To view, visit http://gerrit.cloudera.org:8080/10327
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I41b7297cf665bf291b09b23524d19b1d10ab281d
Gerrit-Change-Number: 10327
Gerrit-PatchSet: 5
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Comment-Date: Sat, 12 May 2018 00:04:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5216: Make admission control queuing async

2018-05-11 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10060 )

Change subject: IMPALA-5216: Make admission control queuing async
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.cc
File be/src/service/client-request-state.cc:

http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.cc@1147
PS7, Line 1147: void ClientRequestState::FinishExecQueryOrDmlRequest() {
  : #ifndef NDEBUG
  :   SleepIfSetInDebugOptions(schedule_->query_options(), 
SLEEP_BEFORE_ADMISSION_MS);
  : #endif
  :   // Remove if-check after IMPALA-5814.
  :   if (ExecEnv::GetInstance()->admission_controller() != 
nullptr) {
  : Status admit_status = 
ExecEnv::GetInstance()->admission_controller()->AdmitQuery(
  : schedule_.get(), _outcome_);
  :
  : #ifndef NDEBUG
  : SleepIfSetInDebugOptions(schedule_->query_options(), 
SLEEP_AFTER_ADMISSION_MS);
  : #endif
> Given how small the window is, I think it's better to go with the simpler a
Yes, it is equivalent to cancellation happening right after this check, which 
would be caught by the cancellation check at Line 1173



--
To view, visit http://gerrit.cloudera.org:8080/10060
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I989cf5b259afb8f5bc5c35590c94961c81ce88bf
Gerrit-Change-Number: 10060
Gerrit-PatchSet: 8
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 11 May 2018 23:59:59 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state

2018-05-11 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10158 )

Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify 
state
..


Patch Set 21:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.cc@476
PS18, Line 476:   DCHECK(exec_status_.ok()) << exec_status_;
> I was thinking the ($3 -> $4) already gives that info, once you get used to
Agreed. That makes sense.



--
To view, visit http://gerrit.cloudera.org:8080/10158
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e
Gerrit-Change-Number: 10158
Gerrit-PatchSet: 21
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 11 May 2018 23:59:25 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6957: calc thread resource requirement in planner

2018-05-11 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10256 )

Change subject: IMPALA-6957: calc thread resource requirement in planner
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10256/4/testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test:

http://gerrit.cloudera.org:8080/#/c/10256/4/testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test@64
PS4, Line 64: instances
> Sounds like deferring it is a good idea :)
Agree, definitely outside of the scope of this change



--
To view, visit http://gerrit.cloudera.org:8080/10256
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I982837ef883457fa4d2adc3bdbdc727353469140
Gerrit-Change-Number: 10256
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 11 May 2018 23:16:57 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5216: Make admission control queuing async

2018-05-11 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10060 )

Change subject: IMPALA-5216: Make admission control queuing async
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10060/8/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/10060/8/be/src/service/impala-server.cc@1062
PS8, Line 1062: const PerBackendExecParams& per_backend_params =
  : request_state->schedule()->per_backend_exec_params();
  : if (!per_backend_params.empty()) {
  :   lock_guard l(query_locations_lock_);
  :   for (const auto& entry : per_backend_params) {
  : const TNetworkAddress& hostport = entry.first;
  : // Query may have been removed already by cancellation 
path. In particular, if
  : // node to fail was last sender to an exchange, the 
coordinator will realise and
  : // fail the query at the same time the failure 
detection path does the same
  : // thing. They will harmlessly race to remove the query 
from this map.
  : QueryLocations::iterator it = 
query_locations_.find(hostport);
  : if (it != query_locations_.end()) {
  :   it->second.erase(request_state->query_id());
  : }
  :   }
  : }
forgot to change this part, since after last patch query locations are always 
added even if coord has not been started.
Just need to enable this part to execute if a valid schedule exists



--
To view, visit http://gerrit.cloudera.org:8080/10060
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I989cf5b259afb8f5bc5c35590c94961c81ce88bf
Gerrit-Change-Number: 10060
Gerrit-PatchSet: 8
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 11 May 2018 23:03:22 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6802 (part 3): Clean up authorization tests

2018-05-11 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10358 )

Change subject: IMPALA-6802 (part 3): Clean up authorization tests
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10358/1/fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java
File fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java:

http://gerrit.cloudera.org:8080/#/c/10358/1/fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java@750
PS1, Line 750: .error(selectError("functional.alltypesagg"))
 : .error(selectError("functional.alltypesagg"), 
onServer(allExcept(
> I'm not sure this test makes sense because you can only grant SELECT at the
Done. I'll just remove all other onColumn error checks that don't make sense.



--
To view, visit http://gerrit.cloudera.org:8080/10358
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9681cc3c7094db33ab7c5caa69b99dd803b908b7
Gerrit-Change-Number: 10358
Gerrit-PatchSet: 2
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Comment-Date: Fri, 11 May 2018 22:53:49 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6802 (part 3): Clean up authorization tests

2018-05-11 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/10358 )

Change subject: IMPALA-6802 (part 3): Clean up authorization tests
..

IMPALA-6802 (part 3): Clean up authorization tests

The third part of this patch is to rewrite the following authorization
tests:
- with
- union
- reset metadata
- show

Testing:
- Added new authorization tests
- Ran all front-end tests

Change-Id: I9681cc3c7094db33ab7c5caa69b99dd803b908b7
Cherry-picks: not for 2.x
---
M fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java
1 file changed, 443 insertions(+), 27 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/10358/2
--
To view, visit http://gerrit.cloudera.org:8080/10358
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9681cc3c7094db33ab7c5caa69b99dd803b908b7
Gerrit-Change-Number: 10358
Gerrit-PatchSet: 2
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Fredy Wijaya 


[Impala-ASF-CR] IMPALA-5216: Make admission control queuing async

2018-05-11 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10060 )

Change subject: IMPALA-5216: Make admission control queuing async
..


Patch Set 7:

(40 comments)

http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/scheduling/admission-controller.h
File be/src/scheduling/admission-controller.h:

http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/scheduling/admission-controller.h@44
PS7, Line 44: ClientRequestState::admit_outcome_
> if this enum is meant to be consumed publicly, then it's best to not refer
Done


http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/scheduling/admission-controller.h@201
PS7, Line 201: admit_status
> admit_outcome?
Done


http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/scheduling/admission-controller.h@203
PS7, Line 203: cancelled
> given that there's no Cancel() API in this class, how is that done? (I'm gu
Done


http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/scheduling/admission-controller.h@204
PS7, Line 204: admit_status
> admit_outcome
Done


http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/scheduling/admission-controller.cc
File be/src/scheduling/admission-controller.cc:

http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/scheduling/admission-controller.cc@112
PS7, Line 112: Q
> nit: don't capitalize queue
Done


http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/scheduling/admission-controller.cc@899
PS7, Line 899: false;
> use && rather than ?:false
Done


http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.h
File be/src/service/client-request-state.h:

http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.h@72
PS7, Line 72:   /// Must *not* be called with lock_ held.
> given that the coordinator object is (unfortunately) current exposed by thi
Changed  coord() to GetCoord() and added those details to its comment instead.


http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.h@114
PS7, Line 114: non-error states (RUNNING_STATE and FINISHED_STATE)
> is PENDING_STATE one of those?
added it to this method, and updated the implementation to honor the state 
transition order of:
INITIALIZED_STATE -> PENDING_STATE -> RUNNING_STATE -> FINISHED_STATE


http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.h@152
PS7, Line 152: coord_
> let's try not to talk about private members when documenting the public int
Done


http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.h@172
PS7, Line 172:   Coordinator* coord() const {
 : return coord_exec_started_.Load() ? coord_.get() : nullptr;
 :   }
> it's confusing that coord() is not actually an accessor given it's name. If
Renamed to GetCoord().


For the second part of your comment:
That was my initial thought as well, but what happens is that there is a small 
window between calling coord->Exec() and setting the "coord_" variable in CRS, 
where the started fragments send in UpdateBackendExecStatus RPCs which we need 
to pass to the coord object that we currently do by looking up the CRS object 
directly in "client_request_state_map_". This worked up till now since coord_ 
is set before the Exec() call but now if we defer setting it, then we get a 
nullptr instead.

I thought of 2 approaches to to solve this:

1. the one that I implemented using "coord_exec_started_" for controlling 
external access and adding methods CRS::UpdateBackendExecStatus(), 
CRS::UpdateFilter() that access the "coord_" object directly.

OR

2. To defer setting the "coord_" variable in CRS and to serve the 
UpdateBackendExecStatus and UpdateFilter RPCs by keeping a list of active 
coordinators separately and looking them up there.


Update: As discussed, I will try to look into a way of cleaning this up.


http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.h@323
PS7, Line 323: Exec() has been successfully called on it
> why do we bother creating the coordinator object ahead of when we call Exec
see previous comment.


http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.cc
File be/src/service/client-request-state.cc:

http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.cc@481
PS7, Line 481: query-exec-state
> Is there a reason we can't update the group name to client-request-state? I
I dont see any reason not to. Since this only needs to be changed in 2 other 
places apart from this (seems like a small enough change), would you recommend 
I make the change in this commit itself?


http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.cc@482
PS7, Line 482: SubmitToAdmissionController, this,
 :   _controller_thread_,
> I think it's a bit confusing that we call this the "admission controller" t
Excellent idea. Done.
I am a bit iffy about calling it the 

[Impala-ASF-CR] IMPALA-5216: Make admission control queuing async

2018-05-11 Thread Bikramjeet Vig (Code Review)
Hello Tim Armstrong, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10060

to look at the new patch set (#8).

Change subject: IMPALA-5216: Make admission control queuing async
..

IMPALA-5216: Make admission control queuing async

Implement asynchronous admission control queuing. This is achieved by
running the admission control code-path in a separate thread. Major
changes include: propagating cancellation to the admission control
thread and dequeuing thread, and adding a new Query Operation State
called "PENDING" that represents the state between completion of
planning and starting of query execution.

Testing:
- Added a deterministic end to end test
- Ran multiple stress tests successfully with a cancellation probability
of 60% and with different values for the following parameters:
max_requests, queue_wait_timeout_ms. Ensured that the impalad was in a
valid state afterwards (no orphan fragments or wrong metrics).
- Ran all exhaustive tests and ASAN core tests successfully.

TODO: change terminology of "in_flight_query" to "submitted_queries".
  Need to identify all references of this terminology, eg. in
  comments, tests, variable names, etc.

Change-Id: I989cf5b259afb8f5bc5c35590c94961c81ce88bf
---
M be/src/common/atomic.h
M be/src/common/logging.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/admission-controller.h
M be/src/scheduling/query-schedule.cc
M be/src/scheduling/query-schedule.h
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/util/debug-util.cc
M be/src/util/debug-util.h
M be/src/util/promise-test.cc
M be/src/util/promise.h
M common/thrift/ImpalaService.thrift
M testdata/workloads/functional-query/queries/QueryTest/runtime_row_filters.test
M tests/authorization/test_authorization.py
M tests/beeswax/impala_beeswax.py
M tests/common/impala_connection.py
M tests/custom_cluster/test_admission_controller.py
M tests/custom_cluster/test_krpc_mem_usage.py
M tests/hs2/hs2_test_suite.py
M tests/hs2/test_hs2.py
M tests/query_test/test_observability.py
M www/query_backends.tmpl
30 files changed, 649 insertions(+), 227 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/10060/8
--
To view, visit http://gerrit.cloudera.org:8080/10060
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I989cf5b259afb8f5bc5c35590c94961c81ce88bf
Gerrit-Change-Number: 10060
Gerrit-PatchSet: 8
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7010: don't run memory usage tests on non-HDFS

2018-05-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10370 )

Change subject: IMPALA-7010: don't run memory usage tests on non-HDFS
..


Patch Set 4: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/10370
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9686195a29dde2d87b19ef8bb0e93e08f8bee662
Gerrit-Change-Number: 10370
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 11 May 2018 22:41:48 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7010: don't run memory usage tests on non-HDFS

2018-05-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10370 )

Change subject: IMPALA-7010: don't run memory usage tests on non-HDFS
..

IMPALA-7010: don't run memory usage tests on non-HDFS

Moved a number of tests with tuned mem_limits. In some cases
this required separating the tests from non-tuned functional
tests.

TestQueryMemLimit used very high and very low limits only, so seemed
safe to run in all configurations.

Change-Id: I9686195a29dde2d87b19ef8bb0e93e08f8bee662
Reviewed-on: http://gerrit.cloudera.org:8080/10370
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins 
---
A testdata/workloads/functional-query/queries/QueryTest/insert-mem-limit.test
M testdata/workloads/functional-query/queries/QueryTest/insert.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test
A 
testdata/workloads/functional-query/queries/QueryTest/kudu_insert_mem_limit.test
A 
testdata/workloads/functional-query/queries/QueryTest/nested-types-tpch-mem-limit.test
M testdata/workloads/functional-query/queries/QueryTest/nested-types-tpch.test
M testdata/workloads/functional-query/queries/QueryTest/spilling-aggs.test
R 
testdata/workloads/functional-query/queries/QueryTest/spilling-regression-exhaustive.test
M tests/common/skip.py
M tests/query_test/test_insert.py
M tests/query_test/test_kudu.py
M tests/query_test/test_mem_usage_scaling.py
M tests/query_test/test_nested_types.py
M tests/query_test/test_sort.py
M tests/query_test/test_spilling.py
15 files changed, 196 insertions(+), 156 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/10370
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I9686195a29dde2d87b19ef8bb0e93e08f8bee662
Gerrit-Change-Number: 10370
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6983: stress: don't write a null runtime profile

2018-05-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10381 )

Change subject: IMPALA-6983: stress: don't write a null runtime profile
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2457/


--
To view, visit http://gerrit.cloudera.org:8080/10381
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic8299a8a97ad1f2bd1f2927e3111db8df1d3a3e5
Gerrit-Change-Number: 10381
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Nithya Janarthanan 
Gerrit-Comment-Date: Fri, 11 May 2018 22:35:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6983: stress: don't write a null runtime profile

2018-05-11 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10381 )

Change subject: IMPALA-6983: stress: don't write a null runtime profile
..


Patch Set 1:

> I don't have permissions to give +2
 > Looks good to me

Appreciate the review! It's good to have reviews from non-committers as well, 
and reviewing code is a way to get you on a path toward committership.

https://cwiki.apache.org/confluence/display/IMPALA/Contributing+to+Impala

http://impala.apache.org/bylaws.html


--
To view, visit http://gerrit.cloudera.org:8080/10381
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic8299a8a97ad1f2bd1f2927e3111db8df1d3a3e5
Gerrit-Change-Number: 10381
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Nithya Janarthanan 
Gerrit-Comment-Date: Fri, 11 May 2018 22:35:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6957: calc thread resource requirement in planner

2018-05-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10256 )

Change subject: IMPALA-6957: calc thread resource requirement in planner
..


Patch Set 8:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2456/


--
To view, visit http://gerrit.cloudera.org:8080/10256
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I982837ef883457fa4d2adc3bdbdc727353469140
Gerrit-Change-Number: 10256
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 11 May 2018 22:29:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6957: calc thread resource requirement in planner

2018-05-11 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10256 )

Change subject: IMPALA-6957: calc thread resource requirement in planner
..


Patch Set 8: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/10256
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I982837ef883457fa4d2adc3bdbdc727353469140
Gerrit-Change-Number: 10256
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 11 May 2018 22:28:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6983: stress: don't write a null runtime profile

2018-05-11 Thread Nithya Janarthanan (Code Review)
Nithya Janarthanan has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10381 )

Change subject: IMPALA-6983: stress: don't write a null runtime profile
..


Patch Set 1: Code-Review+1

I don't have permissions to give +2
Looks good to me


--
To view, visit http://gerrit.cloudera.org:8080/10381
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic8299a8a97ad1f2bd1f2927e3111db8df1d3a3e5
Gerrit-Change-Number: 10381
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Nithya Janarthanan 
Gerrit-Comment-Date: Fri, 11 May 2018 22:23:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6221: [DOCS] Correct syntax for Kudu Partition clause

2018-05-11 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10353 )

Change subject: IMPALA-6221: [DOCS] Correct syntax for Kudu Partition clause
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10353/1/docs/topics/impala_create_table.xml
File docs/topics/impala_create_table.xml:

http://gerrit.cloudera.org:8080/#/c/10353/1/docs/topics/impala_create_table.xml@215
PS1, Line 215:
missing a comma



--
To view, visit http://gerrit.cloudera.org:8080/10353
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8d8ccdcc8b28fcd44511040f1e05969af83b3d1e
Gerrit-Change-Number: 10353
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Fri, 11 May 2018 21:52:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR](2.x) Ignore 3.x version update patch

2018-05-11 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10383 )

Change subject: Ignore 3.x version update patch
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/10383
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: comment
Gerrit-Change-Id: Iec50418e2a21d662bd2d8a7f8d55c24db367086c
Gerrit-Change-Number: 10383
Gerrit-PatchSet: 1
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Fri, 11 May 2018 21:31:32 +
Gerrit-HasComments: No


[Impala-ASF-CR](2.x) Ignore 3.x version update patch

2018-05-11 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10383 )

Change subject: Ignore 3.x version update patch
..

Ignore 3.x version update patch

Change-Id: Iec50418e2a21d662bd2d8a7f8d55c24db367086c
Reviewed-on: http://gerrit.cloudera.org:8080/10383
Reviewed-by: Michael Brown 
Tested-by: Sailesh Mukil 
---
M bin/ignored_commits.json
1 file changed, 3 insertions(+), 1 deletion(-)

Approvals:
  Michael Brown: Looks good to me, approved
  Sailesh Mukil: Verified

--
To view, visit http://gerrit.cloudera.org:8080/10383
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: merged
Gerrit-Change-Id: Iec50418e2a21d662bd2d8a7f8d55c24db367086c
Gerrit-Change-Number: 10383
Gerrit-PatchSet: 2
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR](2.x) Ignore 3.x version update patch

2018-05-11 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10383 )

Change subject: Ignore 3.x version update patch
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/10383
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: comment
Gerrit-Change-Id: Iec50418e2a21d662bd2d8a7f8d55c24db367086c
Gerrit-Change-Number: 10383
Gerrit-PatchSet: 1
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Brown 
Gerrit-Comment-Date: Fri, 11 May 2018 21:30:39 +
Gerrit-HasComments: No


[Impala-ASF-CR](2.x) Ignore 3.x version update patch

2018-05-11 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10383


Change subject: Ignore 3.x version update patch
..

Ignore 3.x version update patch

Change-Id: Iec50418e2a21d662bd2d8a7f8d55c24db367086c
---
M bin/ignored_commits.json
1 file changed, 3 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/10383/1
--
To view, visit http://gerrit.cloudera.org:8080/10383
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: newchange
Gerrit-Change-Id: Iec50418e2a21d662bd2d8a7f8d55c24db367086c
Gerrit-Change-Number: 10383
Gerrit-PatchSet: 1
Gerrit-Owner: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state

2018-05-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10158 )

Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify 
state
..


Patch Set 21:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2455/


--
To view, visit http://gerrit.cloudera.org:8080/10158
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e
Gerrit-Change-Number: 10158
Gerrit-PatchSet: 21
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 11 May 2018 21:18:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state

2018-05-11 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10158 )

Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify 
state
..


Patch Set 21: Code-Review+2

Rebase


--
To view, visit http://gerrit.cloudera.org:8080/10158
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e
Gerrit-Change-Number: 10158
Gerrit-PatchSet: 21
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 11 May 2018 21:18:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state

2018-05-11 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10158 )

Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify 
state
..


Patch Set 18:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.cc@476
PS18, Line 476:   DCHECK(exec_status_.ok()) << exec_status_;
> Maybe we can have a larger discussion later, and leave this as it is now. B
I was thinking the ($3 -> $4) already gives that info, once you get used to it. 
For an error that results in query error, it will look like:

(EXECUTING -> ERROR)

for other errors (that are effectively dropped), it will look like e.g.:

(CANCELLED -> CANCELLED)
(RETURNED_RESULTS -> RETURNED_RESULTS)
(ERROR -> ERROR)

which says that we didn't enter the error state for this status.

I'm definitely open to tweaking the condition for this log in the future, but 
just thinking given the amount of active development in the code that will call 
this path (i.e. IMPALA-2990), it might be helpful to have at least initially.



--
To view, visit http://gerrit.cloudera.org:8080/10158
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e
Gerrit-Change-Number: 10158
Gerrit-PatchSet: 18
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 11 May 2018 21:16:39 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state

2018-05-11 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10158 )

Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify 
state
..


Patch Set 20: Code-Review+2

(4 comments)

Thanks! This LGTM, +2-ing it since it's already gone through multiple rounds of 
review. All comments are just responses for discussion's sake, and don't 
require any code changes.

http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.h@325
PS18, Line 325: rySummary() and
> ExecState is a scalar (enum). We pass scalar by value - no reason to have i
I agree but on the other hand, keeping it const would help by avoiding 
accidental bugs such as overwriting the exec state in future changes, so it's 
better just to define the read only contract upfront. Also agree with no need 
for a ref.


http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h@279
PS14, Line 279:
> Cancel() is not called for (execution) errors. Thats part of the point of t
I think I meant CancelBackends() as the NotifyAll() lives there, but the 
comment you updated it to is much clearer. Thanks.


http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.cc@457
PS18, Line 457: ret_status
> exec_status_ is protected by the lock, so the copy is necessary.
Fair point, I missed that.


http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.cc@476
PS18, Line 476:   DCHECK(exec_status_.ok()) << exec_status_;
> My thinking was that it would be good to at least log the error, which is w
Maybe we can have a larger discussion later, and leave this as it is now. But 
just as someone who frequents the logs to debug issues, if I see an error for a 
query, I would assume by default that the query failed, which could lead to 
some confusion, if it in fact didn't fail.

Maybe a solution would be to meet in the middle and tag log messages such as 
this with something like " failed but query succeeded", etc. But we 
don't need to have that convo in this patch.



--
To view, visit http://gerrit.cloudera.org:8080/10158
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e
Gerrit-Change-Number: 10158
Gerrit-PatchSet: 20
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 11 May 2018 21:11:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3307: Add support for IANA time-zone db

2018-05-11 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9986 )

Change subject: IMPALA-3307: Add support for IANA time-zone db
..


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/9986/5/be/src/runtime/runtime-state.h
File be/src/runtime/runtime-state.h:

http://gerrit.cloudera.org:8080/#/c/9986/5/be/src/runtime/runtime-state.h@317
PS5, Line 317:   /// Query-global timezone used as local timezone when 
executing the query.
who owns it?  Let's at least say "Not owned."


http://gerrit.cloudera.org:8080/#/c/9986/5/be/src/util/filesystem-util.h
File be/src/util/filesystem-util.h:

http://gerrit.cloudera.org:8080/#/c/9986/5/be/src/util/filesystem-util.h@58
PS5, Line 58: Real
Should we call it GetCanonicalPath()? I assume it's related to 
IsCanonicalPath(), and so would be nice to name them similarly (and group them 
together).  Is it the case that IsCanonicalPath() always returns true for the 
*real_path returned by GetRealPath()?



--
To view, visit http://gerrit.cloudera.org:8080/9986
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I93c1fbffe81f067919706e30db0a34d0e58e7e77
Gerrit-Change-Number: 9986
Gerrit-PatchSet: 5
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 11 May 2018 21:07:49 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6957: calc thread resource requirement in planner

2018-05-11 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10256 )

Change subject: IMPALA-6957: calc thread resource requirement in planner
..


Patch Set 7: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/10256
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I982837ef883457fa4d2adc3bdbdc727353469140
Gerrit-Change-Number: 10256
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 11 May 2018 21:01:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6941: load more text scanner compression plugins

2018-05-11 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10165 )

Change subject: IMPALA-6941: load more text scanner compression plugins
..


Patch Set 11:

Thanks


--
To view, visit http://gerrit.cloudera.org:8080/10165
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2a9c4a4a11bed81df706e9e834400bfedfe48e6
Gerrit-Change-Number: 10165
Gerrit-PatchSet: 11
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 11 May 2018 20:56:14 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6941: load more text scanner compression plugins

2018-05-11 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10165 )

Change subject: IMPALA-6941: load more text scanner compression plugins
..


Patch Set 11:

I believe the test case below covers the interesting cases

testdata/workloads/functional-query/queries/QueryTest/unsupported-compression-partitions.test


--
To view, visit http://gerrit.cloudera.org:8080/10165
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2a9c4a4a11bed81df706e9e834400bfedfe48e6
Gerrit-Change-Number: 10165
Gerrit-PatchSet: 11
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 11 May 2018 20:22:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6907: Close stale connections to removed cluster members

2018-05-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10327 )

Change subject: IMPALA-6907: Close stale connections to removed cluster members
..


Patch Set 5: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/2453/


--
To view, visit http://gerrit.cloudera.org:8080/10327
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I41b7297cf665bf291b09b23524d19b1d10ab281d
Gerrit-Change-Number: 10327
Gerrit-PatchSet: 5
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Comment-Date: Fri, 11 May 2018 20:07:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6957: calc thread resource requirement in planner

2018-05-11 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10256 )

Change subject: IMPALA-6957: calc thread resource requirement in planner
..


Patch Set 6:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/10256/6/common/thrift/Frontend.thrift
File common/thrift/Frontend.thrift:

http://gerrit.cloudera.org:8080/#/c/10256/6/common/thrift/Frontend.thrift@416
PS6, Line 416: max_per_host_required_threads
> should we just go ahead and call this a reservation?  Or, we can wait for l
Done. Went through and updated variables and comments to use 
thread_reservation/mem_reservation.

There were some minor formatting changes from clang-format too.


http://gerrit.cloudera.org:8080/#/c/10256/6/common/thrift/Planner.thrift
File common/thrift/Planner.thrift:

http://gerrit.cloudera.org:8080/#/c/10256/6/common/thrift/Planner.thrift@80
PS6, Line 80: required_threads
> same question about reservation.
Done


http://gerrit.cloudera.org:8080/#/c/10256/6/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
File fe/src/main/java/org/apache/impala/planner/KuduScanNode.java:

http://gerrit.cloudera.org:8080/#/c/10256/6/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java@276
PS6, Line 276: .setMemEstimateBytes(0)
> why do that?
The ResourceProfileBuilder requires setting a mem estimate (to avoid PlanNodes 
implicitly leaving it at zero). This is just preserving the old behaviour 
(which doesn't make sense, but I've been deferring making piecemeal changes to 
the estimates - IMPALA-5013).



--
To view, visit http://gerrit.cloudera.org:8080/10256
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I982837ef883457fa4d2adc3bdbdc727353469140
Gerrit-Change-Number: 10256
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 11 May 2018 19:58:45 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6957: calc thread resource requirement in planner

2018-05-11 Thread Tim Armstrong (Code Review)
Hello Bikramjeet Vig, Alex Behm, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10256

to look at the new patch set (#7).

Change subject: IMPALA-6957: calc thread resource requirement in planner
..

IMPALA-6957: calc thread resource requirement in planner

This only factors in fragment execution threads. E.g. this does *not*
try to account for the number of threads on the old Thrift RPC
code path if that is enabled.

This is loosely related to the old VCores estimate, but is different in
that it:
* Directly ties into the notion of required threads in
  ThreadResourceMgr.
* Is a strict upper bound on the number of such threads, rather than
  an estimate.

Does not include "optional" threads. ThreadResourceMgr in the backend
bounds the number of "optional" threads per impalad, so the number of
execution threads on a backend is limited by

  sum(required threads per query) +
  CpuInfo::num_cores() * FLAGS_num_threads_per_core

DCHECKS in the backend enforce that the calculation is correct. They
were actually hit in KuduScanNode because of some races in thread
management leading to multiple "required" threads running. Now the
first thread in the multithreaded scans never exits, which means
that it's always safe for any of the other threads to exit early,
which simplifies the logic a lot.

Testing:
Updated planner tests.

Ran core tests.

Change-Id: I982837ef883457fa4d2adc3bdbdc727353469140
---
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/kudu-scan-node.cc
M be/src/exec/kudu-scan-node.h
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/query-state.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/thread-resource-mgr.cc
M be/src/runtime/thread-resource-mgr.h
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/query-schedule.h
M be/src/scheduling/scheduler.cc
M common/thrift/Frontend.thrift
M common/thrift/ImpalaInternalService.thrift
M common/thrift/Planner.thrift
M fe/src/main/java/org/apache/impala/planner/AggregationNode.java
M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java
M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/PlanFragment.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/planner/ResourceProfile.java
M fe/src/main/java/org/apache/impala/planner/ResourceProfileBuilder.java
M fe/src/main/java/org/apache/impala/planner/SortNode.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
M testdata/workloads/functional-planner/queries/PlannerTest/disable-codegen.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/kudu-selectivity.test
M testdata/workloads/functional-planner/queries/PlannerTest/max-row-size.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/partition-pruning.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/sort-expr-materialization.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test
M testdata/workloads/functional-planner/queries/PlannerTest/tablesample.test
M testdata/workloads/functional-planner/queries/PlannerTest/union.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level0.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level1.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level2.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level3.test
43 files changed, 1,977 insertions(+), 1,931 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/56/10256/7
--
To view, visit http://gerrit.cloudera.org:8080/10256
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I982837ef883457fa4d2adc3bdbdc727353469140
Gerrit-Change-Number: 10256
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6983: stress: don't write a null runtime profile

2018-05-11 Thread Michael Brown (Code Review)
Michael Brown has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10381


Change subject: IMPALA-6983: stress: don't write a null runtime profile
..

IMPALA-6983: stress: don't write a null runtime profile

This patch fixes a problem in which the writing of profiles always
assumed profiles would be collected during binary. That's not the case
when queries are too expensive to run. The simple fix is to just check
for None and not perform the write.

Change-Id: Ic8299a8a97ad1f2bd1f2927e3111db8df1d3a3e5
---
M tests/stress/concurrent_select.py
1 file changed, 3 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/81/10381/1
--
To view, visit http://gerrit.cloudera.org:8080/10381
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ic8299a8a97ad1f2bd1f2927e3111db8df1d3a3e5
Gerrit-Change-Number: 10381
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Brown 


[Impala-ASF-CR] IMPALA-7010: don't run memory usage tests on non-HDFS

2018-05-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10370 )

Change subject: IMPALA-7010: don't run memory usage tests on non-HDFS
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2454/


--
To view, visit http://gerrit.cloudera.org:8080/10370
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9686195a29dde2d87b19ef8bb0e93e08f8bee662
Gerrit-Change-Number: 10370
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 11 May 2018 19:14:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7010: don't run memory usage tests on non-HDFS

2018-05-11 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10370 )

Change subject: IMPALA-7010: don't run memory usage tests on non-HDFS
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/10370
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9686195a29dde2d87b19ef8bb0e93e08f8bee662
Gerrit-Change-Number: 10370
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 11 May 2018 19:14:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] Update version to 3.1.0-SNAPSHOT

2018-05-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10360 )

Change subject: Update version to 3.1.0-SNAPSHOT
..


Patch Set 2:

This change did not cherrypick successfully into branch 2.x. To resolve this, 
please do the cherry-pick manually and submit it to Gerrit at refs/for/2.x or 
add an exception to the branch 2.x copy of bin/ignored_commits.json. Thanks, 
your friendly bot at https://jenkins.impala.io/job/cherrypick-2.x-and-test/479/ 
.


--
To view, visit http://gerrit.cloudera.org:8080/10360
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia2e2df73d27fa332d17fec4e9aa469ea91b14167
Gerrit-Change-Number: 10360
Gerrit-PatchSet: 2
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Fri, 11 May 2018 18:58:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] Update version to 3.1.0-SNAPSHOT

2018-05-11 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10360 )

Change subject: Update version to 3.1.0-SNAPSHOT
..


Patch Set 2:

Sorry for the spam; I'm testing something on the cherrypicker jenkins job to 
send a message via gerrit review, and I'm struggling a little bit with the 
syntax!


--
To view, visit http://gerrit.cloudera.org:8080/10360
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia2e2df73d27fa332d17fec4e9aa469ea91b14167
Gerrit-Change-Number: 10360
Gerrit-PatchSet: 2
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Fri, 11 May 2018 18:54:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] Update version to 3.1.0-SNAPSHOT

2018-05-11 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10360 )

Change subject: Update version to 3.1.0-SNAPSHOT
..


Patch Set 2:

test
test


--
To view, visit http://gerrit.cloudera.org:8080/10360
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia2e2df73d27fa332d17fec4e9aa469ea91b14167
Gerrit-Change-Number: 10360
Gerrit-PatchSet: 2
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Fri, 11 May 2018 18:54:48 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7010: don't run memory usage tests on non-HDFS

2018-05-11 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10370 )

Change subject: IMPALA-7010: don't run memory usage tests on non-HDFS
..


Patch Set 3: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10370/2/tests/common/skip.py
File tests/common/skip.py:

http://gerrit.cloudera.org:8080/#/c/10370/2/tests/common/skip.py@130
PS2, Line 130:   qualified_path = pytest.mark.skipif(IS_LOCAL,
 :   reason="Tests rely on HDFS qualified paths")
> Done. The local filesystem tests run with a single impalad, so it ends up w
Ah, didn't realize that's that we only started one node for local fs tests. I'm 
not sure why that is, but okay.



--
To view, visit http://gerrit.cloudera.org:8080/10370
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9686195a29dde2d87b19ef8bb0e93e08f8bee662
Gerrit-Change-Number: 10370
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 11 May 2018 18:54:36 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Update version to 3.1.0-SNAPSHOT

2018-05-11 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10360 )

Change subject: Update version to 3.1.0-SNAPSHOT
..


Patch Set 2:

--help


--
To view, visit http://gerrit.cloudera.org:8080/10360
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia2e2df73d27fa332d17fec4e9aa469ea91b14167
Gerrit-Change-Number: 10360
Gerrit-PatchSet: 2
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Fri, 11 May 2018 18:53:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] Update version to 3.1.0-SNAPSHOT

2018-05-11 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10360 )

Change subject: Update version to 3.1.0-SNAPSHOT
..


Patch Set 2:

test


--
To view, visit http://gerrit.cloudera.org:8080/10360
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia2e2df73d27fa332d17fec4e9aa469ea91b14167
Gerrit-Change-Number: 10360
Gerrit-PatchSet: 2
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Fri, 11 May 2018 18:52:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7010: don't run memory usage tests on non-HDFS

2018-05-11 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10370 )

Change subject: IMPALA-7010: don't run memory usage tests on non-HDFS
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/10370/2/testdata/workloads/functional-query/queries/QueryTest/insert-mem-limit.test
File 
testdata/workloads/functional-query/queries/QueryTest/insert-mem-limit.test:

http://gerrit.cloudera.org:8080/#/c/10370/2/testdata/workloads/functional-query/queries/QueryTest/insert-mem-limit.test@3
PS2, Line 3: # Check that hdfs writers respects mem_limit.
> maybe add comments to the test files saying the mem limit is tuned for 3 no
Done


http://gerrit.cloudera.org:8080/#/c/10370/2/tests/common/skip.py
File tests/common/skip.py:

http://gerrit.cloudera.org:8080/#/c/10370/2/tests/common/skip.py@130
PS2, Line 130:   mem_usage_different = pytest.mark.skipif(IS_LOCAL,
 :   reason="Memory limit too low when running single node")
> do we still need that? should all of these tests be "tuned_for_minicluster"
Done. The local filesystem tests run with a single impalad, so it ends up with 
3x the memory consumption in some cases.



--
To view, visit http://gerrit.cloudera.org:8080/10370
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9686195a29dde2d87b19ef8bb0e93e08f8bee662
Gerrit-Change-Number: 10370
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 11 May 2018 18:41:43 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7010: don't run memory usage tests on non-HDFS

2018-05-11 Thread Tim Armstrong (Code Review)
Hello Sailesh Mukil, Impala Public Jenkins, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10370

to look at the new patch set (#3).

Change subject: IMPALA-7010: don't run memory usage tests on non-HDFS
..

IMPALA-7010: don't run memory usage tests on non-HDFS

Moved a number of tests with tuned mem_limits. In some cases
this required separating the tests from non-tuned functional
tests.

TestQueryMemLimit used very high and very low limits only, so seemed
safe to run in all configurations.

Change-Id: I9686195a29dde2d87b19ef8bb0e93e08f8bee662
---
A testdata/workloads/functional-query/queries/QueryTest/insert-mem-limit.test
M testdata/workloads/functional-query/queries/QueryTest/insert.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test
A 
testdata/workloads/functional-query/queries/QueryTest/kudu_insert_mem_limit.test
A 
testdata/workloads/functional-query/queries/QueryTest/nested-types-tpch-mem-limit.test
M testdata/workloads/functional-query/queries/QueryTest/nested-types-tpch.test
M testdata/workloads/functional-query/queries/QueryTest/spilling-aggs.test
R 
testdata/workloads/functional-query/queries/QueryTest/spilling-regression-exhaustive.test
M tests/common/skip.py
M tests/query_test/test_insert.py
M tests/query_test/test_kudu.py
M tests/query_test/test_mem_usage_scaling.py
M tests/query_test/test_nested_types.py
M tests/query_test/test_sort.py
M tests/query_test/test_spilling.py
15 files changed, 196 insertions(+), 156 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/10370/3
--
To view, visit http://gerrit.cloudera.org:8080/10370
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9686195a29dde2d87b19ef8bb0e93e08f8bee662
Gerrit-Change-Number: 10370
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-7010: don't run memory usage tests on non-HDFS

2018-05-11 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10370 )

Change subject: IMPALA-7010: don't run memory usage tests on non-HDFS
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/10370/2/testdata/workloads/functional-query/queries/QueryTest/insert-mem-limit.test
File 
testdata/workloads/functional-query/queries/QueryTest/insert-mem-limit.test:

http://gerrit.cloudera.org:8080/#/c/10370/2/testdata/workloads/functional-query/queries/QueryTest/insert-mem-limit.test@3
PS2, Line 3: # Check that hdfs writers respects mem_limit.
maybe add comments to the test files saying the mem limit is tuned for 3 node 
hdfs cluster, to help prevent these from accidentally being added back to the 
full matrix.


http://gerrit.cloudera.org:8080/#/c/10370/2/tests/common/skip.py
File tests/common/skip.py:

http://gerrit.cloudera.org:8080/#/c/10370/2/tests/common/skip.py@130
PS2, Line 130:   mem_usage_different = pytest.mark.skipif(IS_LOCAL,
 :   reason="Memory limit too low when running single node")
do we still need that? should all of these tests be "tuned_for_minicluster".

And I don't understand the reason here anyway - local fs tests still use the 
minicluster (as do hdfs and isilon for that matter). They just use a different 
default filesystem.



--
To view, visit http://gerrit.cloudera.org:8080/10370
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9686195a29dde2d87b19ef8bb0e93e08f8bee662
Gerrit-Change-Number: 10370
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Fri, 11 May 2018 17:36:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6907: Close stale connections to removed cluster members

2018-05-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10327 )

Change subject: IMPALA-6907: Close stale connections to removed cluster members
..


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2453/


--
To view, visit http://gerrit.cloudera.org:8080/10327
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I41b7297cf665bf291b09b23524d19b1d10ab281d
Gerrit-Change-Number: 10327
Gerrit-PatchSet: 5
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Comment-Date: Fri, 11 May 2018 17:31:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6907: Close stale connections to removed cluster members

2018-05-11 Thread Michael Ho (Code Review)
Hello Tianyi Wang, Sailesh Mukil,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10327

to look at the new patch set (#5).

Change subject: IMPALA-6907: Close stale connections to removed cluster members
..

IMPALA-6907: Close stale connections to removed cluster members

Previously, ImpalaServer::MembershipCallback() is used by each
Impala backend node to update cluster membership. It also removes
stale connections to nodes which are no longer members of the cluster.
However, the way it detects removed member is flawed as it relies
on query_locations_ to determine whether stale connections may
exist to the removed members. query_locations_ is a map of host
name to a set of queries running on that host. A entry for a remote
node only exists in query_locations_ if an Impalad node has acted
as coordinator of a query with fragment instances scheduled to run
on that remote node.

This change fixes this problem by closing connections to remote
hosts which are removed from the cluster regardless of whether
it can be found in query_locations_. A new test is added to
exercise this path by restarting Impalad backend nodes between
queries. Also change impala_cluster.py to use bin/start-impala.sh
to start Impala demon instead of directly forking and exec'ing
Impalad. This is needed as start-impala.sh sets up the proper
Java related environment variables.

Change-Id: I41b7297cf665bf291b09b23524d19b1d10ab281d
---
M be/src/service/impala-server.cc
M tests/common/impala_cluster.py
M tests/custom_cluster/test_restart_services.py
3 files changed, 56 insertions(+), 35 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/27/10327/5
--
To view, visit http://gerrit.cloudera.org:8080/10327
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I41b7297cf665bf291b09b23524d19b1d10ab281d
Gerrit-Change-Number: 10327
Gerrit-PatchSet: 5
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tianyi Wang 


[Impala-ASF-CR] IMPALA-6131: Track time of last statistics update in metadata

2018-05-11 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10116 )

Change subject: IMPALA-6131: Track time of last statistics update in metadata
..


Patch Set 10:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/10116/10/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/10116/10/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1235
PS10, Line 1235:* TODO: the following paragraph seems at least partially 
obsolate
> Why not clean it up then? Point (1) is obsolete but point (2) is still vali
Done


http://gerrit.cloudera.org:8080/#/c/10116/10/fe/src/main/java/org/apache/impala/catalog/KuduTable.java
File fe/src/main/java/org/apache/impala/catalog/KuduTable.java:

http://gerrit.cloudera.org:8080/#/c/10116/10/fe/src/main/java/org/apache/impala/catalog/KuduTable.java@244
PS10, Line 244: Long.toString(System.currentTimeMillis() / 1000));
> Doesn't the HMS set this in alter_table()?
It sets it if the property does not exist, but this would not work well for 
Kudu tables: after calling alter_table here, the table will not be loaded from 
HMS again (unlike in HDFS tables, where this is done to ensure HMS-Impala 
consistency). This means that if I would remove "transient_lastDdlTime" here, 
then HMS would calculate a valid value, but Impala catalogd would not know 
about this value.


http://gerrit.cloudera.org:8080/#/c/10116/10/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/10116/10/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@785
PS10, Line 785:   msTbl.putToParameters("impala.lastComputeStatsTime",
> Create a constant for the table property in Table similar to what we have i
I have put the constant to HdfsTable, because all the other property keys 
reside their, even those that are used by Kudu tables too.


http://gerrit.cloudera.org:8080/#/c/10116/10/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2868
PS10, Line 2868: // HMS updates transient_lastDdlTime if it is removed.
> I understand what this comment says, but I don't understand it's relevance
This comment remained here by mistake.


http://gerrit.cloudera.org:8080/#/c/10116/10/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2870
PS10, Line 2870: Long.toString(System.currentTimeMillis() / 1000));
> We're doing "System.currentTimeMillis() / 1000" in quite a few places, I su
I have created a bit different helper function.


http://gerrit.cloudera.org:8080/#/c/10116/10/tests/metadata/test_last_ddl_time_update.py
File tests/metadata/test_last_ddl_time_update.py:

http://gerrit.cloudera.org:8080/#/c/10116/10/tests/metadata/test_last_ddl_time_update.py@74
PS10, Line 74: def run_test(self, query, expect_changed_ddl_time, 
expect_changed_stats_time):
> Passing a ";" delimited string is really weird. Why not pass a list of quer
The "invalidate metadata %(TBL)s; describe %(TBL)s" combo was used the check 
that in case of Kudu tables, lastDdlTime is not increased by reloading the 
table (it used to increase it), so an extra query is needed after INVALIDATE 
METADATA to load the table. The two can be separated (like for other init 
queries) but I felt that they belong together.

I replaced the string splitting with variable argument list - I can replace 
this with two separate calls if needed.


http://gerrit.cloudera.org:8080/#/c/10116/10/tests/metadata/test_last_ddl_time_update.py@93
PS10, Line 93:   # Hive uses a seconds granularity on the last ddl time.
> Not just Hive - we do too. Just say "All times are stored in seconds" or so
Isn't it an HMS convention in this case? Or is there a reason behind not using 
higher precision timestamp?


http://gerrit.cloudera.org:8080/#/c/10116/10/tests/metadata/test_last_ddl_time_update.py@155
PS10, Line 155: h.expect_no_time_change("drop incremental stats %(TBL)s 
partition (j=1, s='2012')")
> use "drop stats" instead to wipe everything (including incremental stats)
Are you sure about this? I use DROP INCREMENTAL STATS on purpose to check +1 
statement's effect in the test suite.



--
To view, visit http://gerrit.cloudera.org:8080/10116
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59a671ac29d352bd92ce40d5cb6662bb23f146b5
Gerrit-Change-Number: 10116
Gerrit-PatchSet: 10
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 

[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state

2018-05-11 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10158 )

Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify 
state
..


Patch Set 18:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.cc@457
PS18, Line 457: ret_status
> Just fyi, there's no copy because return-value-optimisation applies to ret_
RVO eliminates the copy of ret_value into the thing that is accepting it, but 
there's still a copy from exec_status_ while holding the lock. But, yeah, agree 
with your point that it ultimately we end up with one copy from exec_status_ in 
both cases.



--
To view, visit http://gerrit.cloudera.org:8080/10158
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e
Gerrit-Change-Number: 10158
Gerrit-PatchSet: 18
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 11 May 2018 17:15:52 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6941: load more text scanner compression plugins

2018-05-11 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10165 )

Change subject: IMPALA-6941: load more text scanner compression plugins
..


Patch Set 11:

I wasn't worried so much about the load data part, but more once the partition 
with those files exists (either via load data or via some other engine), do we 
produce the proper runtime error. maybe the case is the same as the text 
parsing error test case, and so we do have coverage for that?


--
To view, visit http://gerrit.cloudera.org:8080/10165
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2a9c4a4a11bed81df706e9e834400bfedfe48e6
Gerrit-Change-Number: 10165
Gerrit-PatchSet: 11
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 11 May 2018 17:11:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6957: calc thread resource requirement in planner

2018-05-11 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10256 )

Change subject: IMPALA-6957: calc thread resource requirement in planner
..


Patch Set 6:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/10256/6/common/thrift/Frontend.thrift
File common/thrift/Frontend.thrift:

http://gerrit.cloudera.org:8080/#/c/10256/6/common/thrift/Frontend.thrift@416
PS6, Line 416: max_per_host_required_threads
should we just go ahead and call this a reservation?  Or, we can wait for later 
if/when we actually put some backend reservation accounting/limit in place. 
(Though we can still think of it as a reservation of an unlimited resource 
currently). Just thinking it would be nice standardize on resource terminology 
- this is effectively the min-thread-reservation. (and then we should clarify 
the thing above is a mem reservation).

Okay to defer or argue against of course.


http://gerrit.cloudera.org:8080/#/c/10256/6/common/thrift/Planner.thrift
File common/thrift/Planner.thrift:

http://gerrit.cloudera.org:8080/#/c/10256/6/common/thrift/Planner.thrift@80
PS6, Line 80: required_threads
same question about reservation.


http://gerrit.cloudera.org:8080/#/c/10256/6/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
File fe/src/main/java/org/apache/impala/planner/KuduScanNode.java:

http://gerrit.cloudera.org:8080/#/c/10256/6/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java@276
PS6, Line 276: .setMemEstimateBytes(0)
why do that?



--
To view, visit http://gerrit.cloudera.org:8080/10256
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I982837ef883457fa4d2adc3bdbdc727353469140
Gerrit-Change-Number: 10256
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 11 May 2018 17:09:36 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6941: load more text scanner compression plugins

2018-05-11 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10165 )

Change subject: IMPALA-6941: load more text scanner compression plugins
..


Patch Set 11:

Could you have a look at the fe changes, Alex? The main change is to  remove 
(untested) handling of unsupported compression types in the catalog and "LOAD 
DATA" planning that I convinced myself were useless. Instead the check for 
support happens during query execution.

One key thing to keep in mind is that unrecognised suffixes are treated as 
uncompressed, e.g. some_file.xz would be treated as an uncompressed file 
because we don't know about the ".xz" suffix.


--
To view, visit http://gerrit.cloudera.org:8080/10165
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2a9c4a4a11bed81df706e9e834400bfedfe48e6
Gerrit-Change-Number: 10165
Gerrit-PatchSet: 11
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 11 May 2018 17:05:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6941: load more text scanner compression plugins

2018-05-11 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10165 )

Change subject: IMPALA-6941: load more text scanner compression plugins
..


Patch Set 11:

There was no existing test coverage for the error path of "LOAD DATA" - the 
only tests load uncompressed files.

I think we probably should have that coverage for completeness but I feel a bit 
better about it now that there's no special code path that is untested - "LOAD 
DATA" is agnostic to file extensions now.


--
To view, visit http://gerrit.cloudera.org:8080/10165
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2a9c4a4a11bed81df706e9e834400bfedfe48e6
Gerrit-Change-Number: 10165
Gerrit-PatchSet: 11
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 11 May 2018 17:01:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state

2018-05-11 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10158 )

Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify 
state
..


Patch Set 18:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.cc@457
PS18, Line 457: ret_status
> exec_status_ is protected by the lock, so the copy is necessary.
Just fyi, there's no copy because return-value-optimisation applies to 
ret_status.



--
To view, visit http://gerrit.cloudera.org:8080/10158
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e
Gerrit-Change-Number: 10158
Gerrit-PatchSet: 18
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 11 May 2018 16:57:10 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6941: load more text scanner compression plugins

2018-05-11 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10165 )

Change subject: IMPALA-6941: load more text scanner compression plugins
..


Patch Set 11: Code-Review+1

That seems okay to me, though would be good to have Alex look at at least that 
last change.

Why didn't a test need to be updated with that change? Shouldn't it cause some 
case to go from a table loading error to a runtime error?


--
To view, visit http://gerrit.cloudera.org:8080/10165
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2a9c4a4a11bed81df706e9e834400bfedfe48e6
Gerrit-Change-Number: 10165
Gerrit-PatchSet: 11
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 11 May 2018 16:51:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state

2018-05-11 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10158 )

Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify 
state
..


Patch Set 20: Code-Review+1

Carry +1s


--
To view, visit http://gerrit.cloudera.org:8080/10158
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e
Gerrit-Change-Number: 10158
Gerrit-PatchSet: 20
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 11 May 2018 16:44:37 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6131: Track time of last statistics update in metadata

2018-05-11 Thread Csaba Ringhofer (Code Review)
Hello Lars Volker, Zoltan Borok-Nagy, Alex Behm,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10116

to look at the new patch set (#11).

Change subject: IMPALA-6131: Track time of last statistics update in metadata
..

IMPALA-6131: Track time of last statistics update in metadata

The timestamp of the last COMPUTE STATS operation is saved to
table property "impala.lastComputeStatsTime". The format is
the same as in "transient_lastDdlTime", so the two can be
compared to check if the schema has changed since computing
statistics.

Other changes:
- Handling of "transient_lastDdlTime" is simplified - the old
  logic set it to current time + 1, if the old version was
  >= current time, to ensure that it is always increased by
  DDL operations. This was useful in the past, as IMPALA-387
  used lastDdlTime to check if partition data needs to be
  reloaded, but since IMPALA-1480, Impala does not rely on
  lastDdlTime at all.

- Computing / setting stats on HDFS tables no longer increases
  "transient_lastDdlTime".

- When Kudu tables are (re)loaded, it is checked if their
  HMS representation is up to date, and if it is, then
  IMetaStoreClient.alter_table() is not called. The old
  logic always called alter_table() after loading metadata
  from Kudu. This change was needed to ensure that
  "transient_lastDdlTime" works similarly in HDFS and Kudu
  tables, and should also make (re)loading Kudu tables faster.

Notes:
- Kudu will be able to sync its tables to HMS in the near
  future (see KUDU-2191), so the Kudu metadata handling in
  Impala may need to be redesigned.

Testing:
tests/metadata/test_last_ddl_time_update.py is extended by
- also checking "impala.lastComputeStatsTime"
- testing more SQL statements
- tests for Kudu tables

Note that test_last_ddl_time_update.py is ran only in
exhaustive testing.

Change-Id: I59a671ac29d352bd92ce40d5cb6662bb23f146b5
---
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java
M tests/metadata/test_last_ddl_time_update.py
8 files changed, 220 insertions(+), 173 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/16/10116/11
--
To view, visit http://gerrit.cloudera.org:8080/10116
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I59a671ac29d352bd92ce40d5cb6662bb23f146b5
Gerrit-Change-Number: 10116
Gerrit-PatchSet: 11
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state

2018-05-11 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10158 )

Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify 
state
..


Patch Set 20:

(12 comments)

Thanks for taking a look Sailesh.

http://gerrit.cloudera.org:8080/#/c/10158/19/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

http://gerrit.cloudera.org:8080/#/c/10158/19/be/src/runtime/coordinator.h@93
PS19, Line 93: /// 2. exec_state_lock_, backend_states_init_lock_, filter_lock_,
> What about backend_states_init_lock_ ?
Done


http://gerrit.cloudera.org:8080/#/c/10158/19/be/src/runtime/coordinator.h@206
PS19, Line 206:
> comment requires an update.
referenced the class level comment.


http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.h@325
PS18, Line 325: rySummary() and
> const ref
ExecState is a scalar (enum). We pass scalar by value - no reason to have 
indirection.

I can make them const though, if you prefer, given these are leafs routines. 
Generally, I'm not a huge fan of const scalar args because it leads to const 
creep with little benefit (compiler has to perform analysis anyway, and usually 
easy enough to see when non-pointers are not modified/aliased).


http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.h@350
PS18, Line 350:  in all cases releases resources and cal
> These look like they're read only, so they should be passed as const refs.
Scalar and in registers, so not going to pass by reference.


http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.h@357
PS18, Line 357:
> const ref
made it const


http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.h@360
PS18, Line 360:
> const ref
made it const


http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h@127
PS14, Line 127: ();
> You mean Cancel() and UpdateBackendExecStatus()? Maybe this is not worth me
This comment was pre-existing but I'm fine with removing it. It's covered in 
the class comment.


http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h@279
PS14, Line 279:
> To be more explicit, we should say "... or when Cancel() is called."
Cancel() is not called for (execution) errors. Thats part of the point of the 
patch is to distinguish canceled state and error state.
(It's true that the impala server layer can implement an error at that layer 
using Cancel(), but that's not an execution error, and I'd rather not muddle 
the terminology).

Will update this comment to talk about backend cancellation, though.


http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.cc@457
PS18, Line 457: ret_status
> Preferable to get rid of the automatic variable 'ret_status' and just retur
exec_status_ is protected by the lock, so the copy is necessary.


http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.cc@462
PS18, Line 462: Status ret_status;
> Same here.
Same locking.


http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.cc@476
PS18, Line 476:   DCHECK(exec_status_.ok()) << exec_status_;
> If an error happens after exec_state_ is already set to ExecState::RETURNED
My thinking was that it would be good to at least log the error, which is why I 
wrote the code that way.  Alternatively, we could only log when a state 
transition happens, but it seems like it might be good to know when an backend 
error occurs even if not failing the query, but I'm also not a fan of log spew. 
Are you arguing for one way or the other?


http://gerrit.cloudera.org:8080/#/c/10158/18/be/src/runtime/coordinator.cc@486
PS18, Line 486: !status.ok() && !status.IsCancelled() && e
> (!status.ok() && !status.IsCancelled() && exec_status_.IsCancelled())
This path should be rare, but sure we can avoid the copy in that case.



--
To view, visit http://gerrit.cloudera.org:8080/10158
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e
Gerrit-Change-Number: 10158
Gerrit-PatchSet: 20
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 11 May 2018 16:37:02 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state

2018-05-11 Thread Dan Hecht (Code Review)
Hello Michael Ho, Sailesh Mukil, Tim Armstrong, Bikramjeet Vig,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10158

to look at the new patch set (#20).

Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify 
state
..

IMPALA-5384, part 2: Simplify Coordinator locking and clarify state

The is the final change to clarify and break up the Coordinator's lock.
The state machine for the coordinator is made explicit, distinguishing
between executing state and multiple terminal states. Logic to
transition into a terminal state is centralized in one location and
executes exactly once for each coordinator object.

Derived from a patch for IMPALA_5384 by Marcel Kornacker.

Testing:
- exhaustive functional tests
- stress test on minicluster with memory overcommitment. Verified from
  the logs that this exercises all these paths:
  - successful queries
  - client requested cancellation
  - error from exec FInstances RPC
  - error reported asynchronously via report status RPC
  - eos before backend execution completed

Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e
---
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/service/client-request-state.cc
M be/src/service/impala-server.h
M be/src/util/counting-barrier.h
6 files changed, 393 insertions(+), 392 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/10158/20
--
To view, visit http://gerrit.cloudera.org:8080/10158
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e
Gerrit-Change-Number: 10158
Gerrit-PatchSet: 20
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6997: Avoid redundant dumping in SetMemLimitExceeded()

2018-05-11 Thread Zoram Thanga (Code Review)
Zoram Thanga has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10364 )

Change subject: IMPALA-6997: Avoid redundant dumping in SetMemLimitExceeded()
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10364/1/be/src/runtime/runtime-state.cc
File be/src/runtime/runtime-state.cc:

http://gerrit.cloudera.org:8080/#/c/10364/1/be/src/runtime/runtime-state.cc@210
PS1, Line 210:   // (e.g. once per row) before the fragment aborts. See 
IMPALA-6997.
I wonder if we should explicitly check that the existing error is 
TErrorCode::MEM_LIMIT_EXCEEDED? I mean, could we come here after hitting some 
other error condition?



--
To view, visit http://gerrit.cloudera.org:8080/10364
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I92b87f370a68a2f695ebbc2520a98dd143730701
Gerrit-Change-Number: 10364
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Comment-Date: Fri, 11 May 2018 16:23:29 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6802 (part 3): Clean up authorization tests

2018-05-11 Thread Adam Holley (Code Review)
Adam Holley has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10358 )

Change subject: IMPALA-6802 (part 3): Clean up authorization tests
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10358/1/fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java
File fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java:

http://gerrit.cloudera.org:8080/#/c/10358/1/fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java@750
PS1, Line 750: .error(selectError("functional.alltypes"), onColumn("functional",
 : "alltypes", "id", allExcept(TPrivilegeLevel.ALL, 
TPrivilegeLevel.SELECT)));
I'm not sure this test makes sense because you can only grant SELECT at the 
column level.  Maybe change the column name and remove "allExcept".  I know 
this pattern is already used a lot of other places, but looking at the select 
tests, I don't think it's covered that the user is granted select to other 
(possibly all) columns except one, and that it should still fail.  Maybe the 
existing tests can remain if case additional column privileges are added, but 
we still need to cover the other case.



--
To view, visit http://gerrit.cloudera.org:8080/10358
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9681cc3c7094db33ab7c5caa69b99dd803b908b7
Gerrit-Change-Number: 10358
Gerrit-PatchSet: 1
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Comment-Date: Fri, 11 May 2018 15:02:28 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3307: Add support for IANA time-zone db

2018-05-11 Thread Attila Jeges (Code Review)
Hello Gabor Kaszab, Zoltan Borok-Nagy, Csaba Ringhofer,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9986

to look at the new patch set (#6).

Change subject: IMPALA-3307: Add support for IANA time-zone db
..

IMPALA-3307: Add support for IANA time-zone db

Impala currently uses two different libraries for timestamp
manipulations: boost and glibc.

Issues with boost:
- Time-zone database is currently hard coded in timezone_db.cc.
  Impala admins cannot update it without upgrading Impala.
- Time-zone database is flat, therefore can’t track year-to-year
  changes.
- Time-zone database is not updated on a regular basis.

Issues with glibc:
- Uses /usr/share/zoneinfo/ database which could be out of sync on
  some of the nodes in the Impala cluster.
- Uses the host system’s local time-zone. Different nodes in the
  Impala cluster might use a different local time-zone.
- Conversion functions take a global lock, which causes severe
  performance degradation.

In addition to the issues above, the fact that /usr/share/zoneinfo/
and the hard-coded boost time-zone database are both in use is a
source of inconsistency in itself.

This patch makes the following changes:
- Instead of boost and glibc, impalad uses Google's CCTZ to implement
  time-zone conversions.
- Introduces a new startup flag (--hdfs_zone_info_dir) to impalad to
  specify an HDFS/S3/ADLS location that contains the shared compiled
  IANA time-zone database. If the startup flag is set, impalad will
  use the specified time-zone database. Otherwise, impalad will use
  the default /usr/share/zoneinfo time-zone database.
- impalad reads the entire time-zone database into an in-memory
  map on startup for fast lookups.
- The name of the coordinator node’s local time-zone is saved to the
  query context when preparing query execution. This time-zone is used
  whenever the current time-zone is referred afterwards in an
  execution node.
- Introduces a new startup flag (--hdfs_zone_abbrev_conf) to impalad
  to specify an HDFS/S3/ADLS path to a shared config file that
  contains definitions for non-standard time-zone abbreviations.

Change-Id: I93c1fbffe81f067919706e30db0a34d0e58e7e77
---
M CMakeLists.txt
M be/CMakeLists.txt
M be/src/benchmarks/CMakeLists.txt
A be/src/benchmarks/convert-timestamp-benchmark.cc
M be/src/common/global-types.h
M be/src/exec/data-source-scan-node.cc
M be/src/exec/data-source-scan-node.h
M be/src/exec/hdfs-orc-scanner.cc
M be/src/exec/parquet-column-readers.cc
M be/src/exprs/CMakeLists.txt
M be/src/exprs/aggregate-functions-ir.cc
M be/src/exprs/cast-functions-ir.cc
M be/src/exprs/decimal-operators-ir.cc
M be/src/exprs/decimal-operators.h
M be/src/exprs/expr-test.cc
M be/src/exprs/literal.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/exprs/timestamp-functions.cc
A be/src/exprs/timezone_db-test.cc
M be/src/exprs/timezone_db.cc
M be/src/exprs/timezone_db.h
M be/src/runtime/raw-value-test.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.cc
M be/src/runtime/timestamp-value.h
M be/src/runtime/timestamp-value.inline.h
M be/src/service/impala-server.cc
M be/src/service/impalad-main.cc
M be/src/util/filesystem-util-test.cc
M be/src/util/filesystem-util.cc
M be/src/util/filesystem-util.h
M be/src/util/hdfs-util-test.cc
M be/src/util/hdfs-util.cc
M be/src/util/hdfs-util.h
M be/src/util/time-test.cc
M be/src/util/time.cc
M be/src/util/time.h
M bin/bootstrap_toolchain.py
M bin/impala-config.sh
A cmake_modules/FindCctz.cmake
M common/thrift/ImpalaInternalService.thrift
M common/thrift/metrics.json
M fe/src/test/java/org/apache/impala/testutil/TestUtils.java
M testdata/bin/create-load-data.sh
M testdata/data/timezoneverification.csv
A testdata/tzdb/abbrev.conf
A testdata/tzdb/zoneinfo/AmerICA/ArgeNTINA/MendOZA
A testdata/tzdb/zoneinfo/AmerICA/CancUN
A testdata/tzdb/zoneinfo/UTC
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
A tests/custom_cluster/test_custom_tzdb.py
53 files changed, 2,569 insertions(+), 1,092 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/9986/6
--
To view, visit http://gerrit.cloudera.org:8080/9986
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I93c1fbffe81f067919706e30db0a34d0e58e7e77
Gerrit-Change-Number: 9986
Gerrit-PatchSet: 6
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-3307: Add support for IANA time-zone db

2018-05-11 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9986 )

Change subject: IMPALA-3307: Add support for IANA time-zone db
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9986/5/be/src/runtime/timestamp-value.cc
File be/src/runtime/timestamp-value.cc:

http://gerrit.cloudera.org:8080/#/c/9986/5/be/src/runtime/timestamp-value.cc@139
PS5, Line 139:   // In case the resulting 'time_point' is ambiguous, we have to 
invalidate
 :   // TimestampValue.
 :   // 'civil_lookup' members and the details of handling 
ambiguity are described at:
 :   // 
https://github.com/google/cctz/blob/a2dd3d0fbc811fe0a1d4d2dbb0341f1a3d28cb2a/
 :   // include/cctz/time_zone.h#L106
 :   if (UNLIKELY(from_cl.kind != 
cctz::time_zone::civil_lookup::UNIQUE)
I have investigated a bit about this:

- there is a Jira that complains about this behavior: 
https://issues.apache.org/jira/browse/IMPALA-3169

- Hive does not work like this, it returns a "valid" timestamp for 
repeated/skipped hours:

select
 to_utc_timestamp(cast("2011-03-13 02:15:00" as timestamp), 
"America/Los_Angeles"),
 to_utc_timestamp(cast("2011-11-06 01:15:00" as timestamp), 
"America/Los_Angeles")
result: 2011-03-13 10:15:00.0   2011-11-06 09:15:00.0

I think that we should do the same, at least  for repeated values. I can 
imagine several valid queries where this would be the correct behavior, for 
example when we filter for a time interval.

So I vote for solving IMPALA-3169 in this patch by choosing pre or post time in 
non UNIQUE cases too. If there are no test cases yet for skipped/repeated 
hours, then we should create some and expect the same results that Hive returns.



--
To view, visit http://gerrit.cloudera.org:8080/9986
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I93c1fbffe81f067919706e30db0a34d0e58e7e77
Gerrit-Change-Number: 9986
Gerrit-PatchSet: 5
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 11 May 2018 14:04:02 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3307: Add support for IANA time-zone db

2018-05-11 Thread Attila Jeges (Code Review)
Attila Jeges has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9986 )

Change subject: IMPALA-3307: Add support for IANA time-zone db
..


Patch Set 4:

(18 comments)

http://gerrit.cloudera.org:8080/#/c/9986/4/CMakeLists.txt
File CMakeLists.txt:

http://gerrit.cloudera.org:8080/#/c/9986/4/CMakeLists.txt@281
PS4, Line 281: Cctz
> nit: CCTZ
Done


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/expr-test.cc@139
PS4, Line 139: new_time_zone_(time_zone), new_tz_
> From reading the names of these 2 variables it's not clear what de differen
Done


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/expr-test.cc@140
PS4, Line 140: /*overwrite*/
> Do you need this comment?
Removed it


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/expr-test.cc@153
PS4, Line 153: Timezone *
> nit: Timezone* Expect..()
Done


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/expr-test.cc@155
PS4, Line 155: new_tz_ = TimezoneDatabase::FindTimezone(new_time_zone_);
> I wonder if it makes sense to do this assignment in the constructor and the
The main reason I implemented the class this way was that some tests that use 
'ScopedTimeZoneOverride' don't need the actual Timezone pointer.

On the other hand, checking always that the timezone name is valid doesn't hurt 
anyone. Done.


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timestamp-functions-ir.cc
File be/src/exprs/timestamp-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timestamp-functions-ir.cc@503
PS4, Line 503: namespace
> Why did you need this namespace?
Putting  stuff into an unnamed namespace is a common C++ idiom: it says that 
everything in the unnamed namespace is "local" to this translation unit. 
They're not visible from the outside, and their names won't clash with names in 
other translation units.

Impala uses unnamed namespaces elsewhere too.


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timestamp-functions-ir.cc@504
PS4, Line 504: / TODO
> What is the plan to get rid of the "Duplicate code" TODOs in this review?
I removed the TODO comment.

I cannot think of a better place for this function. We can create a shared 
class for this function only but that would be awkward.


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timestamp-functions.cc
File be/src/exprs/timestamp-functions.cc:

http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timestamp-functions.cc@55
PS4, Line 55: // This should raise some sort of error or at least return 
null. Hive just ignores it.
> Shouldn't this be a TODO?
Fixed the comment.


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timestamp-functions.cc@87
PS4, Line 87: // This should raise some sort of error or at least return 
null. Hive just ignores it.
> Same as above
Fixed the comment.


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db-test.cc
File be/src/exprs/timezone_db-test.cc:

http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db-test.cc@57
PS4, Line 57: TzAbbev
> nit: TzAbbrev?
Done


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db-test.cc@68
PS4, Line 68:   // Abbreviations must start with an uppercase letter.
> If it has to start with an uppercase letter, can we add a test this with an
Done


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db-test.cc@105
PS4, Line 105:   // Misformatted time-zone names.
> Can you again play around with upper vs lower case letters here?
Done


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db.h
File be/src/exprs/timezone_db.h:

http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db.h@1
PS4, Line 1: // with the License.  You may obtain a copy of the License at
> Hmm, is the top of the Apache comment missing?
Not sure what happened there. I probably inadvertently executed some mysterious 
vim command :)


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db.h@91
PS4, Line 91: tz_seg
> nit: might be just my preference but tz_segment is still short and I think
Done


http://gerrit.cloudera.org:8080/#/c/9986/1/be/src/exprs/timezone_db.cc
File be/src/exprs/timezone_db.cc:

http://gerrit.cloudera.org:8080/#/c/9986/1/be/src/exprs/timezone_db.cc@147
PS1, Line 147:
> What I meant is that you could get rid of the offset_sec paramater if you c
I understand.

The problem is that then we would have to allocate an int64_t in the function 
and return a pointer to it wrapped in unique_ptr to avoid memory 
leaks. Seems more pain then gain.


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db.cc
File be/src/exprs/timezone_db.cc:

http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db.cc@365
PS4, Line 365:   hdfsFile hdfs_file = hdfsOpenFile(
> Just for my information: