[Impala-ASF-CR] IMPALA-6987: [DOCS] Refactor the INVALIDATE METADATA and REFRESH docs

2018-05-31 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10339 )

Change subject: IMPALA-6987: [DOCS] Refactor the INVALIDATE METADATA and 
REFRESH docs
..


Patch Set 4: Code-Review+2

(2 comments)

http://gerrit.cloudera.org:8080/#/c/10339/4/docs/topics/impala_refresh.xml
File docs/topics/impala_refresh.xml:

http://gerrit.cloudera.org:8080/#/c/10339/4/docs/topics/impala_refresh.xml@52
PS4, Line 52: Name Nodes
Don't remember how docs reference this but let's make sure the naming is 
consistent.


http://gerrit.cloudera.org:8080/#/c/10339/4/docs/topics/impala_refresh.xml@111
PS4, Line 111: Creating or updating functions
Functions are under databases not tables. Hence, when functions are created or 
update you need to run refresh database not refresh table.



--
To view, visit http://gerrit.cloudera.org:8080/10339
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2124e14900d0f82569c061cc46006447bb054b36
Gerrit-Change-Number: 10339
Gerrit-PatchSet: 4
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 01 Jun 2018 04:59:14 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7105. Ensure fe tests pass when running standalone

2018-05-31 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10570 )

Change subject: IMPALA-7105. Ensure fe tests pass when running standalone
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10570/1/fe/src/test/java/org/apache/impala/catalog/CatalogTest.java
File fe/src/test/java/org/apache/impala/catalog/CatalogTest.java:

http://gerrit.cloudera.org:8080/#/c/10570/1/fe/src/test/java/org/apache/impala/catalog/CatalogTest.java@40
PS1, Line 40: import org.apache.impala.service.FeSupport;
oops, don't need this import



--
To view, visit http://gerrit.cloudera.org:8080/10570
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I85286ba44f03526b487212a8ddadf71a9f427555
Gerrit-Change-Number: 10570
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 01 Jun 2018 04:46:54 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7105. Ensure fe tests pass when running standalone

2018-05-31 Thread Todd Lipcon (Code Review)
Todd Lipcon has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10570


Change subject: IMPALA-7105. Ensure fe tests pass when running standalone
..

IMPALA-7105. Ensure fe tests pass when running standalone

CatalogTest and HdfsStorageDescriptorTest would fail when run
individually because they relied on various initialization that only
occurs when FeSupport is loaded. These tests weren't explicitly loading
FeSupport, but instead relying on earlier tests in 'mvn test' to do so.

Tested that, with this change, I'm able to run these tests in Eclipse
and they pass. I also tested 'mvn test -DreuseForks=false' and verified
that all tests now pass in this mode.

Change-Id: I85286ba44f03526b487212a8ddadf71a9f427555
---
M fe/src/main/java/org/apache/impala/service/FeSupport.java
M fe/src/test/java/org/apache/impala/catalog/CatalogTest.java
M fe/src/test/java/org/apache/impala/catalog/HdfsStorageDescriptorTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
4 files changed, 14 insertions(+), 3 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/10570/1
--
To view, visit http://gerrit.cloudera.org:8080/10570
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I85286ba44f03526b487212a8ddadf71a9f427555
Gerrit-Change-Number: 10570
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 


[Impala-ASF-CR] IMPALA-7099: xfail test on affected filesystems

2018-05-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10566 )

Change subject: IMPALA-7099: xfail test on affected filesystems
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/10566
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie5dd744d48f099b3b12607ad91454e71733f1c6a
Gerrit-Change-Number: 10566
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 01 Jun 2018 03:57:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7099: xfail test on affected filesystems

2018-05-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10566 )

Change subject: IMPALA-7099: xfail test on affected filesystems
..

IMPALA-7099: xfail test on affected filesystems

As a stopgap, let's xfail the test on filesystems where this
is known to be broken.

Testing:
Tested that the test succeeded and xfailed without and with an fs
prefix, i.e.:

  impala-py.test tests/metadata/test_partition_metadata.py -k unsupport
  FILESYSTEM_PREFIX="s3a://" impala-py.test 
tests/metadata/test_partition_metadata.py -k unsupport

Change-Id: Ie5dd744d48f099b3b12607ad91454e71733f1c6a
Reviewed-on: http://gerrit.cloudera.org:8080/10566
Reviewed-by: Dan Hecht 
Tested-by: Impala Public Jenkins 
---
M tests/metadata/test_partition_metadata.py
1 file changed, 4 insertions(+), 1 deletion(-)

Approvals:
  Dan Hecht: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/10566
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ie5dd744d48f099b3b12607ad91454e71733f1c6a
Gerrit-Change-Number: 10566
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-7104: test bloom wait time failing with timeout on asan

2018-05-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10565 )

Change subject: IMPALA-7104: test_bloom_wait_time failing with timeout on asan
..

IMPALA-7104: test_bloom_wait_time failing with timeout on asan

test_bloom_wait_time checks that its query runs in less than a certain
amount of time, to ensure we aren't waiting on bloom filters that will
never arrive.

The test sometimes goes slightly over the timeout and fails on ASAN.
This patch uses a specific_build_type_timeout to extend the allowed
timeout for ASAN.

It also fixes a bug in a similar test where we were comparing a
duration in seconds to a timeout in milliseconds, and renames some
variables to make it clearer why this was a bug.

Change-Id: I0271eb83e65b3b5f17e32f6501032f0d12dde38e
Reviewed-on: http://gerrit.cloudera.org:8080/10565
Reviewed-by: Michael Brown 
Tested-by: Impala Public Jenkins 
---
M tests/query_test/test_runtime_filters.py
1 file changed, 4 insertions(+), 4 deletions(-)

Approvals:
  Michael Brown: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/10565
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I0271eb83e65b3b5f17e32f6501032f0d12dde38e
Gerrit-Change-Number: 10565
Gerrit-PatchSet: 2
Gerrit-Owner: Thomas Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Brown 


[Impala-ASF-CR] IMPALA-7104: test bloom wait time failing with timeout on asan

2018-05-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10565 )

Change subject: IMPALA-7104: test_bloom_wait_time failing with timeout on asan
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/10565
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0271eb83e65b3b5f17e32f6501032f0d12dde38e
Gerrit-Change-Number: 10565
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Brown 
Gerrit-Comment-Date: Fri, 01 Jun 2018 02:03:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] [DOCS] Not able to create new tables when the statestore is offline

2018-05-31 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10481 )

Change subject: [DOCS] Not able to create new tables when the statestore is 
offline
..


Patch Set 3:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/10481/3/docs/topics/impala_components.xml
File docs/topics/impala_components.xml:

http://gerrit.cloudera.org:8080/#/c/10481/3/docs/topics/impala_components.xml@117
PS3, Line 117: fail
... and less consistent as data is changed


http://gerrit.cloudera.org:8080/#/c/10481/3/docs/topics/impala_components.xml@119
PS3, Line 119: resumes its monitoring function.
and broadcasting


http://gerrit.cloudera.org:8080/#/c/10481/3/docs/topics/impala_components.xml@123
PS3, Line 123: create a new table
Tim's comment about how this generalizes beyond create-table should be applied 
here: If you issue any DDL statements while ..., the changes will only be 
visible at ...


http://gerrit.cloudera.org:8080/#/c/10481/3/docs/topics/impala_components.xml@125
PS3, Line 125: The queries that use the new table will fail
 : if issued at other coordinators.
just checked that queries will fail for create since the table load has to come 
back through the statestore. so the change is visible at the coordinator (show 
tables works) but the query will not work.



--
To view, visit http://gerrit.cloudera.org:8080/10481
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I41216fcf60f94f596dae0109a5730ce792b28244
Gerrit-Change-Number: 10481
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 01 Jun 2018 02:00:26 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7099: xfail test on affected filesystems

2018-05-31 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10566 )

Change subject: IMPALA-7099: xfail test on affected filesystems
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/10566
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie5dd744d48f099b3b12607ad91454e71733f1c6a
Gerrit-Change-Number: 10566
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 01 Jun 2018 01:20:25 +
Gerrit-HasComments: No


[Impala-ASF-CR](2.x) Ignore "IMPALA-7079: Disable the multiple blocks test in erasure coding build"

2018-05-31 Thread Tianyi Wang (Code Review)
Tianyi Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10568 )

Change subject: Ignore "IMPALA-7079: Disable the multiple blocks test in 
erasure coding build"
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/10568
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: comment
Gerrit-Change-Id: I037b3909a1f6afcf3242950da9a3fd7bb864d330
Gerrit-Change-Number: 10568
Gerrit-PatchSet: 1
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Comment-Date: Fri, 01 Jun 2018 01:14:23 +
Gerrit-HasComments: No


[Impala-ASF-CR](2.x) Ignore "IMPALA-7079: Disable the multiple blocks test in erasure coding build"

2018-05-31 Thread Tianyi Wang (Code Review)
Tianyi Wang has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10568 )

Change subject: Ignore "IMPALA-7079: Disable the multiple blocks test in 
erasure coding build"
..

Ignore "IMPALA-7079: Disable the multiple blocks test in erasure coding build"

The commit message in IMPALA-7079 mistakenly didn't include
"not for 2.x".

Change-Id: I037b3909a1f6afcf3242950da9a3fd7bb864d330
Reviewed-on: http://gerrit.cloudera.org:8080/10568
Reviewed-by: Thomas Marshall 
Tested-by: Tianyi Wang 
---
M bin/ignored_commits.json
1 file changed, 3 insertions(+), 1 deletion(-)

Approvals:
  Thomas Marshall: Looks good to me, approved
  Tianyi Wang: Verified

--
To view, visit http://gerrit.cloudera.org:8080/10568
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: merged
Gerrit-Change-Id: I037b3909a1f6afcf3242950da9a3fd7bb864d330
Gerrit-Change-Number: 10568
Gerrit-PatchSet: 2
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tianyi Wang 


[Impala-ASF-CR](2.x) Ignore "IMPALA-7079: Disable the multiple blocks test in erasure coding build"

2018-05-31 Thread Thomas Marshall (Code Review)
Thomas Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10568 )

Change subject: Ignore "IMPALA-7079: Disable the multiple blocks test in 
erasure coding build"
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/10568
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: comment
Gerrit-Change-Id: I037b3909a1f6afcf3242950da9a3fd7bb864d330
Gerrit-Change-Number: 10568
Gerrit-PatchSet: 1
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Fri, 01 Jun 2018 01:12:31 +
Gerrit-HasComments: No


[Impala-ASF-CR](2.x) Ignore "IMPALA-7079: Disable the multiple blocks test in erasure coding build"

2018-05-31 Thread Tianyi Wang (Code Review)
Tianyi Wang has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10568


Change subject: Ignore "IMPALA-7079: Disable the multiple blocks test in 
erasure coding build"
..

Ignore "IMPALA-7079: Disable the multiple blocks test in erasure coding build"

The commit message in IMPALA-7079 mistakenly didn't include
"not for 2.x".

Change-Id: I037b3909a1f6afcf3242950da9a3fd7bb864d330
---
M bin/ignored_commits.json
1 file changed, 3 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/10568/1
--
To view, visit http://gerrit.cloudera.org:8080/10568
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: newchange
Gerrit-Change-Id: I037b3909a1f6afcf3242950da9a3fd7bb864d330
Gerrit-Change-Number: 10568
Gerrit-PatchSet: 1
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Thomas Marshall 


[Impala-ASF-CR] [DOCS] Not able to create new tables when the statestore is offline

2018-05-31 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10481 )

Change subject: [DOCS] Not able to create new tables when the statestore is 
offline
..


Patch Set 3:

Thanks for the clarification! Vuk knows more about this than me so I'll let him 
+2


--
To view, visit http://gerrit.cloudera.org:8080/10481
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I41216fcf60f94f596dae0109a5730ce792b28244
Gerrit-Change-Number: 10481
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 01 Jun 2018 00:39:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7099: xfail test on affected filesystems

2018-05-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10566 )

Change subject: IMPALA-7099: xfail test on affected filesystems
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2584/


--
To view, visit http://gerrit.cloudera.org:8080/10566
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie5dd744d48f099b3b12607ad91454e71733f1c6a
Gerrit-Change-Number: 10566
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 01 Jun 2018 00:35:37 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7099: xfail test on affected filesystems

2018-05-31 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10566


Change subject: IMPALA-7099: xfail test on affected filesystems
..

IMPALA-7099: xfail test on affected filesystems

As a stopgap, let's xfail the test on filesystems where this
is known to be broken.

Testing:
Tested that the test succeeded and xfailed without and with an fs
prefix, i.e.:

  impala-py.test tests/metadata/test_partition_metadata.py -k unsupport
  FILESYSTEM_PREFIX="s3a://" impala-py.test 
tests/metadata/test_partition_metadata.py -k unsupport

Change-Id: Ie5dd744d48f099b3b12607ad91454e71733f1c6a
---
M tests/metadata/test_partition_metadata.py
1 file changed, 4 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/10566/1
--
To view, visit http://gerrit.cloudera.org:8080/10566
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ie5dd744d48f099b3b12607ad91454e71733f1c6a
Gerrit-Change-Number: 10566
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] [DOCS] Not able to create new tables when the statestore is offline

2018-05-31 Thread Alex Rodoni (Code Review)
Alex Rodoni has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10481 )

Change subject: [DOCS] Not able to create new tables when the statestore is 
offline
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/10481/2/docs/topics/impala_components.xml
File docs/topics/impala_components.xml:

http://gerrit.cloudera.org:8080/#/c/10481/2/docs/topics/impala_components.xml@111
PS2, Line 111: help when things go wrong
> Its also used to broadcast metadata to coordinators. Mentioning this will m
Done


http://gerrit.cloudera.org:8080/#/c/10481/2/docs/topics/impala_components.xml@115
PS2, Line 115: existing
> More specifically, "known to Impala". Whether they exist or not is a separa
Done


http://gerrit.cloudera.org:8080/#/c/10481/2/docs/topics/impala_components.xml@122
PS2, Line 122: You will not be able to create new tables if the 
statestore is
> Will need to double-check this, but the create should be visible at the coo
Done



--
To view, visit http://gerrit.cloudera.org:8080/10481
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I41216fcf60f94f596dae0109a5730ce792b28244
Gerrit-Change-Number: 10481
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 01 Jun 2018 00:16:57 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6034: Add Cpu and scanned bytes limits per query

2018-05-31 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10415 )

Change subject: IMPALA-6034: Add Cpu and scanned bytes limits per query
..


Patch Set 3:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator-backend-state.cc@213
PS3, Line 213: void Coordinator::BackendState::GetBackendResourceUtilization(
We could just return BackendResourceUtilization by value and achieve the same 
thing (copying all the members into the caller's stack allocation) I.e.


  BackendResourceUtilization 
Coordinator::BackendState::GetBackendResourceUtilization() {
lock_guard l(lock_);
return resource_utilization_;
  }


http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator.cc@815
PS3, Line 815: query_resource_utilization->peak_consumption = max(
If we add more counters like this at some point we should come up with some 
shared logic for doing the aggregation instead of stamping it out. That seems 
like overkill for now though...


http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/service/impala-server.cc@1011
PS3, Line 1011:   queries_by_timestamp_.emplace(ExpirationEvent{
We only need to queue one expiration event if both max_cpu_time_s and 
max_scan_bytes are set. I.e. it should be something like:

  if (max_cpu_time_s > 0 || max_scan_bytes > 0) {
if (max_cpu_time_s > 0) {
  VLOG_QUERY << "Query " << PrintId(query_id) << " has cpu limit of "
 << PrettyPrinter::Print(max_cpu_time_s, TUnit::TIME_S);
}
if (max_scan_bytes > 0) {
  VLOG_QUERY << "Query " << PrintId(query_id) << " has scan bytes limit of "
 << PrettyPrinter::Print(max_scan_bytes, TUnit::BYTES);
}
  queries_by_timestamp_.emplace(ExpirationEvent{
  now + 1000L, query_id, ExpirationKind::RESOURCE_LIMIT});
}


http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/service/impala-server.cc@1934
PS3, Line 1934: query_resouce_usage
type:query_resource_usage


http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/service/impala-server.cc@1972
PS3, Line 1972: }
> looks like we are not updating the deadline for RESOURCE_LIMIT type of expi
Ah, good catch. Yeah I think we should do something similar to the query 
timeouts where we re-enqueue it for a later time.


expiration_event = queries_by_timestamp_.erase(expiration_event);
queries_by_timestamp_.emplace(ExpirationEvent{
UnixMillis() + 1000L, query_id, 
ExpirationKind::RESOURCE_LIMIT});

I think otherwise we run into some potential fairness issues where we're always 
processing these resource limit checks before other time limits.



--
To view, visit http://gerrit.cloudera.org:8080/10415
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20
Gerrit-Change-Number: 10415
Gerrit-PatchSet: 3
Gerrit-Owner: Mostafa Mokhtar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 01 Jun 2018 00:16:18 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] [DOCS] Not able to create new tables when the statestore is offline

2018-05-31 Thread Alex Rodoni (Code Review)
Hello Tim Armstrong, Vuk Ercegovac,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10481

to look at the new patch set (#3).

Change subject: [DOCS] Not able to create new tables when the statestore is 
offline
..

[DOCS] Not able to create new tables when the statestore is offline

Change-Id: I41216fcf60f94f596dae0109a5730ce792b28244
---
M docs/topics/impala_components.xml
1 file changed, 16 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/81/10481/3
--
To view, visit http://gerrit.cloudera.org:8080/10481
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I41216fcf60f94f596dae0109a5730ce792b28244
Gerrit-Change-Number: 10481
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] [DOCS] Not able to create new tables when the statestore is offline

2018-05-31 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10481 )

Change subject: [DOCS] Not able to create new tables when the statestore is 
offline
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/10481/2/docs/topics/impala_components.xml
File docs/topics/impala_components.xml:

http://gerrit.cloudera.org:8080/#/c/10481/2/docs/topics/impala_components.xml@111
PS2, Line 111: help when things go wrong
Its also used to broadcast metadata to coordinators. Mentioning this will make 
the new statement on L122 seem less surprising.


http://gerrit.cloudera.org:8080/#/c/10481/2/docs/topics/impala_components.xml@115
PS2, Line 115: existing
More specifically, "known to Impala". Whether they exist or not is a separate 
issue.


http://gerrit.cloudera.org:8080/#/c/10481/2/docs/topics/impala_components.xml@122
PS2, Line 122: You will not be able to create new tables if the 
statestore is
> Is it specifically creating tables that you can't do? Or is it most (or all
Will need to double-check this, but the create should be visible at the 
coordinator that issues the create table. Other coordinators will not know 
about this table however, so queries that use that table at other coordinators 
will fail.



--
To view, visit http://gerrit.cloudera.org:8080/10481
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I41216fcf60f94f596dae0109a5730ce792b28244
Gerrit-Change-Number: 10481
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 31 May 2018 23:56:40 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] [DOCS] Update composite or nested types in UDF

2018-05-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10509 )

Change subject: [DOCS] Update composite or nested types in UDF
..

[DOCS] Update composite or nested types in UDF

Change-Id: Ieda055f22e3d1e3507aacc8ba1e86e56d6f2e142
Reviewed-on: http://gerrit.cloudera.org:8080/10509
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins 
---
M docs/topics/impala_udf.xml
1 file changed, 4 insertions(+), 3 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/10509
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ieda055f22e3d1e3507aacc8ba1e86e56d6f2e142
Gerrit-Change-Number: 10509
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] [DOCS] Update composite or nested types in UDF

2018-05-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10509 )

Change subject: [DOCS] Update composite or nested types in UDF
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/10509
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ieda055f22e3d1e3507aacc8ba1e86e56d6f2e142
Gerrit-Change-Number: 10509
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 31 May 2018 23:51:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] Revert "IMPALA-5893: Remove old kinit code for Impala 3"

2018-05-31 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10563 )

Change subject: Revert "IMPALA-5893: Remove old kinit code for Impala 3"
..


Patch Set 1: Code-Review+2

Please run some sanity tests (e.g. thrift with kudu kinit disabled) before 
merging.


--
To view, visit http://gerrit.cloudera.org:8080/10563
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idbcce96a1fab23445f42d224ca397897aa90e07d
Gerrit-Change-Number: 10563
Gerrit-PatchSet: 1
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Thu, 31 May 2018 23:43:31 +
Gerrit-HasComments: No


[Impala-ASF-CR] [DOCS] Update composite or nested types in UDF

2018-05-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10509 )

Change subject: [DOCS] Update composite or nested types in UDF
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-docs-submit/302/


--
To view, visit http://gerrit.cloudera.org:8080/10509
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ieda055f22e3d1e3507aacc8ba1e86e56d6f2e142
Gerrit-Change-Number: 10509
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 31 May 2018 23:41:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] [DOCS] Update composite or nested types in UDF

2018-05-31 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10509 )

Change subject: [DOCS] Update composite or nested types in UDF
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/10509
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ieda055f22e3d1e3507aacc8ba1e86e56d6f2e142
Gerrit-Change-Number: 10509
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 31 May 2018 23:35:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] [DOCS] Not able to create new tables when the statestore is offline

2018-05-31 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10481 )

Change subject: [DOCS] Not able to create new tables when the statestore is 
offline
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10481/2/docs/topics/impala_components.xml
File docs/topics/impala_components.xml:

http://gerrit.cloudera.org:8080/#/c/10481/2/docs/topics/impala_components.xml@122
PS2, Line 122: You will not be able to create new tables if the 
statestore is
Is it specifically creating tables that you can't do? Or is it most (or all) 
DDL operations that will fail?



--
To view, visit http://gerrit.cloudera.org:8080/10481
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I41216fcf60f94f596dae0109a5730ce792b28244
Gerrit-Change-Number: 10481
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 31 May 2018 23:35:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6035: Add query options to limit thread reservation

2018-05-31 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10365 )

Change subject: IMPALA-6035: Add query options to limit thread reservation
..


Patch Set 9:

Rebased, had to resolve some conflicts with Bikram's process mem limit patch.


--
To view, visit http://gerrit.cloudera.org:8080/10365
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5b5bbbdad5cd6b24442eb6c99a4d38c2ad710007
Gerrit-Change-Number: 10365
Gerrit-PatchSet: 9
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 31 May 2018 23:31:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6035: Add query options to limit thread reservation

2018-05-31 Thread Tim Armstrong (Code Review)
Hello Bikramjeet Vig, Mostafa Mokhtar, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10365

to look at the new patch set (#9).

Change subject: IMPALA-6035: Add query options to limit thread reservation
..

IMPALA-6035: Add query options to limit thread reservation

Adds two options: THREAD_RESERVATION_LIMIT and
THREAD_RESERVATION_AGGREGATE_LIMIT, which are both enforced by admission
control based on planner resource requirements and the schedule. The
mechanism used is the same as the minimum reservation checks.

THREAD_RESERVATION_LIMIT limits the total number of reserved threads in
fragments scheduled on a single backend.
THREAD_RESERVATION_AGGREGATE_LIMIT limits the sum of reserved threads
across all fragments.

This also slightly improves the minimum reservation error message to
include the host name.

Testing:
Added end-to-end tests that exercise the code paths.

Ran core tests.

Change-Id: I5b5bbbdad5cd6b24442eb6c99a4d38c2ad710007
---
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/query-schedule.h
M be/src/scheduling/scheduler.cc
M be/src/service/query-options-test.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M 
testdata/workloads/functional-query/queries/QueryTest/admission-reject-min-reservation.test
M testdata/workloads/functional-query/queries/QueryTest/runtime_row_filters.test
A testdata/workloads/functional-query/queries/QueryTest/thread-limits.test
A tests/query_test/test_resource_limits.py
12 files changed, 254 insertions(+), 29 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/10365/9
--
To view, visit http://gerrit.cloudera.org:8080/10365
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5b5bbbdad5cd6b24442eb6c99a4d38c2ad710007
Gerrit-Change-Number: 10365
Gerrit-PatchSet: 9
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6947: Kudu tests flaky due to rpc timeout

2018-05-31 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10466 )

Change subject: IMPALA-6947: Kudu tests flaky due to rpc timeout
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/10466/2/be/src/common/global-flags.cc
File be/src/common/global-flags.cc:

http://gerrit.cloudera.org:8080/#/c/10466/2/be/src/common/global-flags.cc@163
PS2, Line 163: kudu_rpc
Not sure if I am nitpicking here but I wonder if it's less confusing to call it 
kudu_client_rpc_timeout_ms ? In theory, one can consider data stream sender a 
client of KRPC but this is different from what we think as a Kudu client. So, 
does the differentiation make sense ?


http://gerrit.cloudera.org:8080/#/c/10466/2/be/src/common/global-flags.cc@164
PS2, Line 164: Kudu rpcs
Kudu client's rpcs ?


http://gerrit.cloudera.org:8080/#/c/10466/2/be/src/common/global-flags.cc@165
PS2, Line 165: kudu_operation_timeout_ms
So, kudu_operation_timeout will kick in first if it's >= 
kudu_operation_timeout_ms ?



--
To view, visit http://gerrit.cloudera.org:8080/10466
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c4a2d87934cd4eb98509512f7060a596894ed53
Gerrit-Change-Number: 10466
Gerrit-PatchSet: 2
Gerrit-Owner: Thomas Marshall 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Thu, 31 May 2018 23:27:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6035: Add query options to limit thread reservation

2018-05-31 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10365 )

Change subject: IMPALA-6035: Add query options to limit thread reservation
..


Patch Set 7:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/10365/7/be/src/scheduling/admission-controller.cc
File be/src/scheduling/admission-controller.cc:

http://gerrit.cloudera.org:8080/#/c/10365/7/be/src/scheduling/admission-controller.cc@426
PS7, Line 426:   max_thread_reservation = make_pair(, 
e.second.thread_reservation);
> I think the use of pairs hurts readability (since fields don't have meaning
I personally liked the more compact code. It does seem like a pattern that 
could be misused though. My tendency is to leave it as-is.


http://gerrit.cloudera.org:8080/#/c/10365/7/be/src/scheduling/admission-controller.cc@432
PS7, Line 432: chedule->query_options().__isset.buffer_pool_limit
 :   && schedule->query_options()
> query_opts
Done


http://gerrit.cloudera.org:8080/#/c/10365/7/be/src/scheduling/admission-controller.cc@440
PS7, Line 440:  else if
> Preexisting, but I think it's non-obvious why this is "else if" with the bu
Done


http://gerrit.cloudera.org:8080/#/c/10365/7/be/src/scheduling/admission-controller.cc@461
PS7, Line 461: else if
> given that these aren't really mutually exclusive (unlike the buffer pool l
Done


http://gerrit.cloudera.org:8080/#/c/10365/7/be/src/scheduling/query-schedule.h
File be/src/scheduling/query-schedule.h:

http://gerrit.cloudera.org:8080/#/c/10365/7/be/src/scheduling/query-schedule.h@49
PS7, Line 49: input to a BackendState
> input to AdmissionController and a BackendState
Done


http://gerrit.cloudera.org:8080/#/c/10365/4/testdata/workloads/functional-query/queries/QueryTest/resource-limits.test
File testdata/workloads/functional-query/queries/QueryTest/resource-limits.test:

http://gerrit.cloudera.org:8080/#/c/10365/4/testdata/workloads/functional-query/queries/QueryTest/resource-limits.test@1
PS4, Line 1: 
> Can you rename the file to "thread-limits.test"?
Done



--
To view, visit http://gerrit.cloudera.org:8080/10365
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5b5bbbdad5cd6b24442eb6c99a4d38c2ad710007
Gerrit-Change-Number: 10365
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 31 May 2018 23:22:14 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6035: Add query options to limit thread reservation

2018-05-31 Thread Tim Armstrong (Code Review)
Hello Bikramjeet Vig, Mostafa Mokhtar, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10365

to look at the new patch set (#8).

Change subject: IMPALA-6035: Add query options to limit thread reservation
..

IMPALA-6035: Add query options to limit thread reservation

Adds two options: THREAD_RESERVATION_LIMIT and
THREAD_RESERVATION_AGGREGATE_LIMIT, which are both enforced by admission
control based on planner resource requirements and the schedule. The
mechanism used is the same as the minimum reservation checks.

THREAD_RESERVATION_LIMIT limits the total number of reserved threads in
fragments scheduled on a single backend.
THREAD_RESERVATION_AGGREGATE_LIMIT limits the sum of reserved threads
across all fragments.

This also slightly improves the minimum reservation error message to
include the host name.

Testing:
Added end-to-end tests that exercise the code paths.

Ran core tests.

Change-Id: I5b5bbbdad5cd6b24442eb6c99a4d38c2ad710007
---
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/query-schedule.h
M be/src/scheduling/scheduler.cc
M be/src/service/query-options-test.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M 
testdata/workloads/functional-query/queries/QueryTest/admission-reject-min-reservation.test
M testdata/workloads/functional-query/queries/QueryTest/runtime_row_filters.test
A testdata/workloads/functional-query/queries/QueryTest/thread-limits.test
A tests/query_test/test_resource_limits.py
12 files changed, 253 insertions(+), 28 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/10365/8
--
To view, visit http://gerrit.cloudera.org:8080/10365
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5b5bbbdad5cd6b24442eb6c99a4d38c2ad710007
Gerrit-Change-Number: 10365
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6947: Kudu tests flaky due to rpc timeout

2018-05-31 Thread Thomas Marshall (Code Review)
Thomas Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10466 )

Change subject: IMPALA-6947: Kudu tests flaky due to rpc timeout
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/10466/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10466/1//COMMIT_MSG@18
PS1, Line 18: Testing:
: - Passed a full run of core tests
> Also, since this affects ASAN, I suggest trying this against an ASAN build.
Done


http://gerrit.cloudera.org:8080/#/c/10466/1/be/src/common/global-flags.cc
File be/src/common/global-flags.cc:

http://gerrit.cloudera.org:8080/#/c/10466/1/be/src/common/global-flags.cc@162
PS1, Line 162: // Timeout (ms) for Kudu rpcs set in the BE on the KuduClient.
 : DEFINE_int32(kudu_rpc_timeout_ms, 0, "(Advanced) Timeout 
(milliseconds) set for "
 : "individual Kudu rpcs. An operation may consist of several 
rpcs, so this is expected "
 :
> Is this an option that ought to be hidden away from users/customers? If yes
I don't think it needs to be hidden, a real user could conceivably use it, but 
I marked it Advanced.


http://gerrit.cloudera.org:8080/#/c/10466/1/bin/run-all-tests.sh
File bin/run-all-tests.sh:

http://gerrit.cloudera.org:8080/#/c/10466/1/bin/run-all-tests.sh@84
PS1, Line 84: : ${CODE_COVERAGE:=false}
> Since this only happens with Kudu on ASAN, I think we only want to enable t
Done



--
To view, visit http://gerrit.cloudera.org:8080/10466
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c4a2d87934cd4eb98509512f7060a596894ed53
Gerrit-Change-Number: 10466
Gerrit-PatchSet: 2
Gerrit-Owner: Thomas Marshall 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Thu, 31 May 2018 23:13:55 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Revert "IMPALA-5893: Remove old kinit code for Impala 3"

2018-05-31 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10563 )

Change subject: Revert "IMPALA-5893: Remove old kinit code for Impala 3"
..


Patch Set 1:

Is this a clean revert ?


--
To view, visit http://gerrit.cloudera.org:8080/10563
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idbcce96a1fab23445f42d224ca397897aa90e07d
Gerrit-Change-Number: 10563
Gerrit-PatchSet: 1
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Thu, 31 May 2018 23:12:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6947: Kudu tests flaky due to rpc timeout

2018-05-31 Thread Thomas Marshall (Code Review)
Hello Michael Brown,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10466

to look at the new patch set (#2).

Change subject: IMPALA-6947: Kudu tests flaky due to rpc timeout
..

IMPALA-6947: Kudu tests flaky due to rpc timeout

Some Kudu tests occasionally fail due to hitting an rpc timeout when
running on a heavily loaded machine. For normal operation, having a
low timeout is good for noticing issues quickly, but for tests there's
no real reason we can't set this higher to avoid flakiness.

This patch adds a flag, -kudu_rpc_timeout_ms, disabled by default. It
also sets -kudu_rpc_timeout_ms to 60s for ASAN in
start-impala-cluster.py, higher than Kudu's default value of 10s.

Testing:
- Passed a full run of core tests on ASAN.

Change-Id: I8c4a2d87934cd4eb98509512f7060a596894ed53
---
M be/src/common/global-flags.cc
M be/src/exec/kudu-util.cc
M bin/start-impala-cluster.py
3 files changed, 18 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/10466/2
--
To view, visit http://gerrit.cloudera.org:8080/10466
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8c4a2d87934cd4eb98509512f7060a596894ed53
Gerrit-Change-Number: 10466
Gerrit-PatchSet: 2
Gerrit-Owner: Thomas Marshall 
Gerrit-Reviewer: Michael Brown 


[Impala-ASF-CR] IMPALA-7099: fix test unsupported text compression delete path

2018-05-31 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10562 )

Change subject: IMPALA-7099: fix test_unsupported_text_compression delete path
..


Patch Set 1:

(1 comment)

As we discussed offline, this doesn't seem to be the root problem though given 
the assert is saying there are not enough files.

http://gerrit.cloudera.org:8080/#/c/10562/1/tests/metadata/test_partition_metadata.py
File tests/metadata/test_partition_metadata.py:

http://gerrit.cloudera.org:8080/#/c/10562/1/tests/metadata/test_partition_metadata.py@62
PS1, Line 62: self.filesystem_client.delete_file_dir(TBL_NAME, 
recursive=True)
> This also seems to have the same mistake.
Done



--
To view, visit http://gerrit.cloudera.org:8080/10562
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I564760c9eaa63514f2b59c250b4b000b7a960601
Gerrit-Change-Number: 10562
Gerrit-PatchSet: 1
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 31 May 2018 22:51:56 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7099: fix test unsupported text compression delete path

2018-05-31 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10562 )

Change subject: IMPALA-7099: fix test_unsupported_text_compression delete path
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10562/1/tests/metadata/test_partition_metadata.py
File tests/metadata/test_partition_metadata.py:

http://gerrit.cloudera.org:8080/#/c/10562/1/tests/metadata/test_partition_metadata.py@220
PS1, Line 220: [1:]
this seems wrong if FILESYSTEM_PREFIX is not empty.



--
To view, visit http://gerrit.cloudera.org:8080/10562
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I564760c9eaa63514f2b59c250b4b000b7a960601
Gerrit-Change-Number: 10562
Gerrit-PatchSet: 1
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 31 May 2018 22:52:16 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7099: fix test unsupported text compression delete path

2018-05-31 Thread Dan Hecht (Code Review)
Hello Tim Armstrong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10562

to look at the new patch set (#2).

Change subject: IMPALA-7099: fix test_unsupported_text_compression delete path
..

IMPALA-7099: fix test_unsupported_text_compression delete path

Looks like this test was meaning to delete the path at TBL_LOCATION,
not TBL_NAME.

Change-Id: I564760c9eaa63514f2b59c250b4b000b7a960601
---
M tests/metadata/test_partition_metadata.py
1 file changed, 2 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/10562/2
--
To view, visit http://gerrit.cloudera.org:8080/10562
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I564760c9eaa63514f2b59c250b4b000b7a960601
Gerrit-Change-Number: 10562
Gerrit-PatchSet: 2
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7044: Prevent overflow when computing Parquet block size

2018-05-31 Thread Lars Volker (Code Review)
Hello Thomas Marshall, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10483

to look at the new patch set (#6).

Change subject: IMPALA-7044: Prevent overflow when computing Parquet block size
..

IMPALA-7044: Prevent overflow when computing Parquet block size

When writing Parquet files we compute a minimum block size based on the
number of columns in the target table:

  3 * page_size * num_cols

For tables with a large number of columns (> ~10k), this value will get
larger than 2GB. When we pass it to hdfsOpenFile() in
HdfsTableSink::CreateNewTmpFile() it gets cast to a signed int32 and can
overflow.

To fix this we return an error if we detect that the minimum block size
exceed 2GB.

This change adds a test using CTAS into a table with 12k columns, making
sure that Impala returns the correct error.

Change-Id: I6e63420e5a093c0bbc789201771708865b16e138
---
M be/src/exec/hdfs-parquet-table-writer.cc
M be/src/exec/hdfs-parquet-table-writer.h
M be/src/exec/hdfs-table-sink.cc
M tests/query_test/test_insert_parquet.py
4 files changed, 51 insertions(+), 14 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/10483/6
--
To view, visit http://gerrit.cloudera.org:8080/10483
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6e63420e5a093c0bbc789201771708865b16e138
Gerrit-Change-Number: 10483
Gerrit-PatchSet: 6
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Thomas Marshall 


[Impala-ASF-CR] IMPALA-7104: test bloom wait time failing with timeout on asan

2018-05-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10565 )

Change subject: IMPALA-7104: test_bloom_wait_time failing with timeout on asan
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2583/


--
To view, visit http://gerrit.cloudera.org:8080/10565
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0271eb83e65b3b5f17e32f6501032f0d12dde38e
Gerrit-Change-Number: 10565
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Brown 
Gerrit-Comment-Date: Thu, 31 May 2018 22:33:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7104: test bloom wait time failing with timeout on asan

2018-05-31 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10565 )

Change subject: IMPALA-7104: test_bloom_wait_time failing with timeout on asan
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/10565
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0271eb83e65b3b5f17e32f6501032f0d12dde38e
Gerrit-Change-Number: 10565
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Marshall 
Gerrit-Reviewer: Michael Brown 
Gerrit-Comment-Date: Thu, 31 May 2018 22:30:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7104: test bloom wait time failing with timeout on asan

2018-05-31 Thread Thomas Marshall (Code Review)
Thomas Marshall has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10565


Change subject: IMPALA-7104: test_bloom_wait_time failing with timeout on asan
..

IMPALA-7104: test_bloom_wait_time failing with timeout on asan

test_bloom_wait_time checks that its query runs in less than a certain
amount of time, to ensure we aren't waiting on bloom filters that will
never arrive.

The test sometimes goes slightly over the timeout and fails on ASAN.
This patch uses a specific_build_type_timeout to extend the allowed
timeout for ASAN.

It also fixes a bug in a similar test where we were comparing a
duration in seconds to a timeout in milliseconds, and renames some
variables to make it clearer why this was a bug.

Change-Id: I0271eb83e65b3b5f17e32f6501032f0d12dde38e
---
M tests/query_test/test_runtime_filters.py
1 file changed, 4 insertions(+), 4 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/10565/1
--
To view, visit http://gerrit.cloudera.org:8080/10565
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I0271eb83e65b3b5f17e32f6501032f0d12dde38e
Gerrit-Change-Number: 10565
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Marshall 


[Impala-ASF-CR] [DOCS] SHOW TBLPROPERTIES not supported in impala

2018-05-31 Thread Alex Rodoni (Code Review)
Alex Rodoni has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10564


Change subject: [DOCS] SHOW TBLPROPERTIES not supported in impala
..

[DOCS] SHOW TBLPROPERTIES not supported in impala

Change-Id: I4643a6d759e334854b955bae75596ac43aa33b97
---
M docs/topics/impala_langref_unsupported.xml
1 file changed, 4 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/10564/1
--
To view, visit http://gerrit.cloudera.org:8080/10564
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I4643a6d759e334854b955bae75596ac43aa33b97
Gerrit-Change-Number: 10564
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 


[Impala-ASF-CR] Revert "IMPALA-5893: Remove old kinit code for Impala 3"

2018-05-31 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10563


Change subject: Revert "IMPALA-5893: Remove old kinit code for Impala 3"
..

Revert "IMPALA-5893: Remove old kinit code for Impala 3"

This reverts commit 4dc3d340d866ef9a7ef1f654cbfff402e0bc69cc.

Due to IMPALA-7072, we still need to keep the old way of Kinit-ing
around for some specific configurations as a fallback.

Change-Id: Idbcce96a1fab23445f42d224ca397897aa90e07d
---
M be/src/common/global-flags.cc
M be/src/rpc/auth-provider.h
M be/src/rpc/authentication.cc
M be/src/rpc/rpc-mgr-kerberized-test.cc
M be/src/rpc/thrift-server-test.cc
M be/src/testutil/mini-kdc-wrapper.h
6 files changed, 110 insertions(+), 11 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/10563/1
--
To view, visit http://gerrit.cloudera.org:8080/10563
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Idbcce96a1fab23445f42d224ca397897aa90e07d
Gerrit-Change-Number: 10563
Gerrit-PatchSet: 1
Gerrit-Owner: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-7099: fix test unsupported text compression delete path

2018-05-31 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10562 )

Change subject: IMPALA-7099: fix test_unsupported_text_compression delete path
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10562/1/tests/metadata/test_partition_metadata.py
File tests/metadata/test_partition_metadata.py:

http://gerrit.cloudera.org:8080/#/c/10562/1/tests/metadata/test_partition_metadata.py@62
PS1, Line 62: self.filesystem_client.delete_file_dir(TBL_NAME, 
recursive=True)
This also seems to have the same mistake.



--
To view, visit http://gerrit.cloudera.org:8080/10562
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I564760c9eaa63514f2b59c250b4b000b7a960601
Gerrit-Change-Number: 10562
Gerrit-PatchSet: 1
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 31 May 2018 22:16:10 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7099: fix test unsupported text compression delete path

2018-05-31 Thread Dan Hecht (Code Review)
Dan Hecht has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10562


Change subject: IMPALA-7099: fix test_unsupported_text_compression delete path
..

IMPALA-7099: fix test_unsupported_text_compression delete path

Looks like this test was meaning to delete the path at TBL_LOCATION,
not TBL_NAME.

Change-Id: I564760c9eaa63514f2b59c250b4b000b7a960601
---
M tests/metadata/test_partition_metadata.py
1 file changed, 1 insertion(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/10562/1
--
To view, visit http://gerrit.cloudera.org:8080/10562
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I564760c9eaa63514f2b59c250b4b000b7a960601
Gerrit-Change-Number: 10562
Gerrit-PatchSet: 1
Gerrit-Owner: Dan Hecht 


[Impala-ASF-CR] IMPALA-6714: [DOCS] ORC file format support

2018-05-31 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10525 )

Change subject: IMPALA-6714: [DOCS] ORC file format support
..


Patch Set 1:

(5 comments)

> Patch Set 1:
>
> (1 comment)

http://gerrit.cloudera.org:8080/#/c/10525/1/docs/topics/impala_orc.xml
File docs/topics/impala_orc.xml:

http://gerrit.cloudera.org:8080/#/c/10525/1/docs/topics/impala_orc.xml@93
PS1, Line 93: If you do not have an existing data file to use, begin by 
creating one in the appropriate format.
> The example below should be enough, remove.
OK. This is the same as other formats' docs. Do you think they should all be 
removed?


http://gerrit.cloudera.org:8080/#/c/10525/1/docs/topics/impala_orc.xml@133
PS1, Line 133: select * from
> Could you make all SQL keywords in uppercase as in the Hive examples below?
Sure


http://gerrit.cloudera.org:8080/#/c/10525/1/docs/topics/impala_orc.xml@152
PS1, Line 152: Enabling Compression for ORC Tables
> This section deals mostly with Hive - is there a Hive document that could b
I think it's reasonable. There're no details examples in the official site of 
ORC. For example, https://orc.apache.org/docs/hive-ddl.html


http://gerrit.cloudera.org:8080/#/c/10525/1/docs/topics/impala_orc.xml@260
PS1, Line 260: Most of the types have the same name in Impala except the BINARY 
type is STRING type in Impala,
 : and the DATE type is not supported in Impala.
> Turn into list (or box, similar to what Parquet has)
Sure


http://gerrit.cloudera.org:8080/#/c/10525/1/docs/topics/impala_orc.xml@269
PS1, Line 269: For example,
> Add examples of what works, and one which doesn't. Include exception text.
Sure



--
To view, visit http://gerrit.cloudera.org:8080/10525
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib1ee23ed844653c274babdce5a332dbe5c79b630
Gerrit-Change-Number: 10525
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 31 May 2018 22:03:03 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7044: Prevent overflow when computing Parquet block size

2018-05-31 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10483 )

Change subject: IMPALA-7044: Prevent overflow when computing Parquet block size
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10483/5/be/src/exec/hdfs-table-sink.cc
File be/src/exec/hdfs-table-sink.cc:

http://gerrit.cloudera.org:8080/#/c/10483/5/be/src/exec/hdfs-table-sink.cc@376
PS5, Line 376: HDFS block size must be smaller than 2GB.")
> It'd probably be good to include block_size in this error message, and also
should say: partition descriptor *or* the default_block_size logic



--
To view, visit http://gerrit.cloudera.org:8080/10483
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6e63420e5a093c0bbc789201771708865b16e138
Gerrit-Change-Number: 10483
Gerrit-PatchSet: 5
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Thu, 31 May 2018 21:13:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7044: Prevent overflow when computing Parquet block size

2018-05-31 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10483 )

Change subject: IMPALA-7044: Prevent overflow when computing Parquet block size
..


Patch Set 5: Code-Review+2

(1 comment)

I'm okay with this as is, but if you decide to try to simplify the logic around 
block size I could take another look.

http://gerrit.cloudera.org:8080/#/c/10483/5/be/src/exec/hdfs-table-sink.cc
File be/src/exec/hdfs-table-sink.cc:

http://gerrit.cloudera.org:8080/#/c/10483/5/be/src/exec/hdfs-table-sink.cc@376
PS5, Line 376: HDFS block size must be smaller than 2GB.")
It'd probably be good to include block_size in this error message, and also to 
indicate whether this came from the partition descriptor for the 
default_bloc_size logic. Otherwise, I think it'll be a bit of work to determine 
where the bad block size came from.



--
To view, visit http://gerrit.cloudera.org:8080/10483
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6e63420e5a093c0bbc789201771708865b16e138
Gerrit-Change-Number: 10483
Gerrit-PatchSet: 5
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Thu, 31 May 2018 21:12:37 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7044: Prevent overflow when computing Parquet block size

2018-05-31 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10483 )

Change subject: IMPALA-7044: Prevent overflow when computing Parquet block size
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10483/4/be/src/exec/hdfs-parquet-table-writer.cc
File be/src/exec/hdfs-parquet-table-writer.cc:

http://gerrit.cloudera.org:8080/#/c/10483/4/be/src/exec/hdfs-parquet-table-writer.cc@796
PS4, Line 796:   }
> The check in hdfs-table-writer is supposed to catch future errors and act a
Yes, understood. It just seems like all these limits are somewhat related and 
so it'd be nice if we can check them in a consistent place. But maybe that's 
not possible given how the code is currently structured.



--
To view, visit http://gerrit.cloudera.org:8080/10483
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6e63420e5a093c0bbc789201771708865b16e138
Gerrit-Change-Number: 10483
Gerrit-PatchSet: 4
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Thu, 31 May 2018 21:09:23 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7044: Prevent overflow when computing Parquet block size

2018-05-31 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10483 )

Change subject: IMPALA-7044: Prevent overflow when computing Parquet block size
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10483/4/be/src/exec/hdfs-table-sink.cc
File be/src/exec/hdfs-table-sink.cc:

http://gerrit.cloudera.org:8080/#/c/10483/4/be/src/exec/hdfs-table-sink.cc@371
PS4, Line 371: output_partition->partition_descriptor->block_size();
> My intent here is to catch any overflow that does not get caught elsewhere
I was talking specifically about the case when the partition descriptor's block 
size is > 2GB.  I'm fine with lines 374-377 as a validation before hdfsOpen(), 
but wondering what cases the block size on the partition descriptor can be so 
large and whether we need to just cope with it (since it's out of impala's 
control).



--
To view, visit http://gerrit.cloudera.org:8080/10483
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6e63420e5a093c0bbc789201771708865b16e138
Gerrit-Change-Number: 10483
Gerrit-PatchSet: 4
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Thu, 31 May 2018 21:07:49 +
Gerrit-HasComments: Yes


[Impala-ASF-CR](2.x) IMPALA-7061: Rework HBase splitting and assignment

2018-05-31 Thread Thomas Marshall (Code Review)
Thomas Marshall has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10545 )

Change subject: IMPALA-7061: Rework HBase splitting and assignment
..

IMPALA-7061: Rework HBase splitting and assignment

Some frontend PlannerTests rely on HBase tables being
arranged in a deterministic way. Specifically, the
HBase tables need to be split with specific region
boundaries and those regions need to be assigned to
specific HBase region servers.

Currently, the tables are created without splits and
testdata/bin/split-hbase.sh runs Java code in
HBaseTestDataRegionAssignment to split and assign
the tables. This runs during dataload via
testdata/bin/create-load-data.sh and during tests
with bin/run-all-tests.sh. There are problems with
both parts of this process. The table splitting is
flaky. Since significant time can pass between the
assignments and the tests, rebalancing means the
assignments are not always stable.

This changes the process so that the HBase tables are
created with the splits already specified via the
HBase shell. The splits remain stable over time.
PlannerTestBase runs the assignment code in
HBaseTestDataRegionAssignment at the start of
the PlannerTests. This makes the assignments
deterministic. No other tests depends on the
exact assignments, so this does not regress anything.

Testing:
 - Local testing
 - Ran gerrit-verify-dryrun-external

2.x does not have minicluster profiles, so the
HBaseTestDataRegionAssignment.java is in the regular
fe/src/test directories.

Change-Id: I3d639128a856254a6ccb93d6750f531974b5f897
Reviewed-on: http://gerrit.cloudera.org:8080/10447
Reviewed-by: Philip Zeyliger 
Tested-by: Impala Public Jenkins 
(cherry picked from commit 9a5410570e25431813b96e00f7b91db44f672f38)
Reviewed-on: http://gerrit.cloudera.org:8080/10545
Reviewed-by: Joe McDonnell 
Tested-by: Thomas Marshall 
---
M bin/run-all-tests.sh
A 
fe/src/test/java/org/apache/impala/datagenerator/HBaseTestDataRegionAssignment.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
M testdata/bin/create-load-data.sh
M testdata/bin/generate-schema-statements.py
D testdata/bin/split-hbase.sh
M testdata/datasets/functional/functional_schema_template.sql
D 
testdata/src/main/java/org/apache/impala/datagenerator/HBaseTestDataRegionAssigment.java
8 files changed, 165 insertions(+), 390 deletions(-)

Approvals:
  Joe McDonnell: Looks good to me, approved
  Thomas Marshall: Verified

--
To view, visit http://gerrit.cloudera.org:8080/10545
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: merged
Gerrit-Change-Id: I3d639128a856254a6ccb93d6750f531974b5f897
Gerrit-Change-Number: 10545
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Thomas Marshall 


[Impala-ASF-CR](2.x) IMPALA-7061: Rework HBase splitting and assignment

2018-05-31 Thread Thomas Marshall (Code Review)
Thomas Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10545 )

Change subject: IMPALA-7061: Rework HBase splitting and assignment
..


Patch Set 2: Verified+1

Successful gvo: https://jenkins.impala.io/job/gerrit-verify-dryrun/2582/


--
To view, visit http://gerrit.cloudera.org:8080/10545
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: comment
Gerrit-Change-Id: I3d639128a856254a6ccb93d6750f531974b5f897
Gerrit-Change-Number: 10545
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Thu, 31 May 2018 21:05:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7100: [DOCS] Consistent memory alloc across executor nodes

2018-05-31 Thread Alex Rodoni (Code Review)
Alex Rodoni has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10561


Change subject: IMPALA-7100: [DOCS] Consistent memory alloc across executor 
nodes
..

IMPALA-7100: [DOCS] Consistent memory alloc across executor nodes

Change-Id: I22926eb6050f9501624d2041f594fe4ef15be73b
---
M docs/topics/impala_prereqs.xml
1 file changed, 57 insertions(+), 39 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/61/10561/1
--
To view, visit http://gerrit.cloudera.org:8080/10561
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I22926eb6050f9501624d2041f594fe4ef15be73b
Gerrit-Change-Number: 10561
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 


[Impala-ASF-CR](2.x) IMPALA-7061: Rework HBase splitting and assignment

2018-05-31 Thread Thomas Marshall (Code Review)
Thomas Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10559 )

Change subject: IMPALA-7061: Rework HBase splitting and assignment
..


Patch Set 1:

This patch is: https://gerrit.cloudera.org/#/c/10545/ cherry-picked onto 
https://gerrit.cloudera.org/#/c/10556/

The motivation of creating this was to kick off a gvo: 
https://jenkins.impala.io/job/gerrit-verify-dryrun/2582/ that ran at the same 
time as the gvo for https://gerrit.cloudera.org/#/c/10556/

Assuming that gvo succeeds, I suggest we just go ahead and manually verify and 
merge https://gerrit.cloudera.org/#/c/10545/, instead of running yet another 
gvo.

This is a slight deviation from our normal procedure, but all of the usual 
testing will still have run, and this will let us get the 2.x builds passing a 
day sooner, since they are all failing deterministically at the moment waiting 
for fixes to go in that have been blocked by this.


--
To view, visit http://gerrit.cloudera.org:8080/10559
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: comment
Gerrit-Change-Id: I62ce82b23a2df0faea36baadfd48b99b8f973f72
Gerrit-Change-Number: 10559
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Thu, 31 May 2018 20:49:30 +
Gerrit-HasComments: No


[Impala-ASF-CR](2.x) IMPALA-7061: Rework HBase splitting and assignment

2018-05-31 Thread Thomas Marshall (Code Review)
Thomas Marshall has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10559


Change subject: IMPALA-7061: Rework HBase splitting and assignment
..

IMPALA-7061: Rework HBase splitting and assignment

Some frontend PlannerTests rely on HBase tables being
arranged in a deterministic way. Specifically, the
HBase tables need to be split with specific region
boundaries and those regions need to be assigned to
specific HBase region servers.

Currently, the tables are created without splits and
testdata/bin/split-hbase.sh runs Java code in
HBaseTestDataRegionAssignment to split and assign
the tables. This runs during dataload via
testdata/bin/create-load-data.sh and during tests
with bin/run-all-tests.sh. There are problems with
both parts of this process. The table splitting is
flaky. Since significant time can pass between the
assignments and the tests, rebalancing means the
assignments are not always stable.

This changes the process so that the HBase tables are
created with the splits already specified via the
HBase shell. The splits remain stable over time.
PlannerTestBase runs the assignment code in
HBaseTestDataRegionAssignment at the start of
the PlannerTests. This makes the assignments
deterministic. No other tests depends on the
exact assignments, so this does not regress anything.

Testing:
 - Local testing
 - Ran gerrit-verify-dryrun-external

2.x does not have minicluster profiles, so the
HBaseTestDataRegionAssignment.java is in the regular
fe/src/test directories.

Change-Id: I62ce82b23a2df0faea36baadfd48b99b8f973f72
---
M bin/run-all-tests.sh
A 
fe/src/test/java/org/apache/impala/datagenerator/HBaseTestDataRegionAssignment.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
M testdata/bin/create-load-data.sh
M testdata/bin/generate-schema-statements.py
D testdata/bin/split-hbase.sh
M testdata/datasets/functional/functional_schema_template.sql
D 
testdata/src/main/java/org/apache/impala/datagenerator/HBaseTestDataRegionAssigment.java
8 files changed, 165 insertions(+), 390 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/10559/1
--
To view, visit http://gerrit.cloudera.org:8080/10559
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: newchange
Gerrit-Change-Id: I62ce82b23a2df0faea36baadfd48b99b8f973f72
Gerrit-Change-Number: 10559
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 


[Impala-ASF-CR](2.x) IMPALA-7091: Address NullPointerException in HBaseTable.getRegionSize().

2018-05-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10556 )

Change subject: IMPALA-7091: Address NullPointerException in 
HBaseTable.getRegionSize().
..

IMPALA-7091: Address NullPointerException in HBaseTable.getRegionSize().

It's possible for "serverLoad.getRegionsLoad().get(info.getRegionName())"
to be null, which causes a NullPointerException in the planner, and
visible to the user. The code around it already says that it handles
errors by returning 0 for the size, and I've extended that to one more
case.

In practice, I have seen this come up in failures of the following test:

  failure.test_failpoints.TestFailpoints.test_failpoints[table_format: 
hbase/none | exec_option: {'batch_size': 0, 'num_nodes': 0, 
'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
'abort_on_error': 1, 'debug_action': None, 'exec_single_node_rows_threshold': 
0} | mt_dop: 4 | location: OPEN | action: MEM_LIMIT_EXCEEDED | query: select * 
from alltypessmall union all select * from alltypessmall]

I saw this failure only happen in some test-with-docker runs,
inconsistently.  The error is a little bit hard to spot, but by
correlating the timestamp of the failing test (which just complains
about NullPointerException), you can find a Java stack trace complaining
of a NPE in "regionLoad.getStorefileSizeMB()". I think the likely cause
is regionLoad being null.

Change-Id: I02f06daf69e7f7e97c9ecc13997147530c2f9d3f
Reviewed-on: http://gerrit.cloudera.org:8080/10531
Reviewed-by: Joe McDonnell 
Tested-by: Impala Public Jenkins 
(cherry picked from commit fc0c5f58f00b360a7651187e0b0fe79add426742)
Reviewed-on: http://gerrit.cloudera.org:8080/10556
Reviewed-by: Philip Zeyliger 
---
M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
1 file changed, 6 insertions(+), 2 deletions(-)

Approvals:
  Philip Zeyliger: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/10556
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: merged
Gerrit-Change-Id: I02f06daf69e7f7e97c9ecc13997147530c2f9d3f
Gerrit-Change-Number: 10556
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 


[Impala-ASF-CR](2.x) IMPALA-7091: Address NullPointerException in HBaseTable.getRegionSize().

2018-05-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10556 )

Change subject: IMPALA-7091: Address NullPointerException in 
HBaseTable.getRegionSize().
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/10556
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: comment
Gerrit-Change-Id: I02f06daf69e7f7e97c9ecc13997147530c2f9d3f
Gerrit-Change-Number: 10556
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Thu, 31 May 2018 20:37:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6119: Fix issue with multiple partitions sharing same location

2018-05-31 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10543 )

Change subject: IMPALA-6119: Fix issue with multiple partitions sharing same 
location
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/10543/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/10543/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1485
PS2, Line 1485:   if (partitions == null) {
> I'm not an expert on the Catalog code, so feel free to push back.
- I have seen tables with > 100k partitions. If we take this route, we'd need 
to loop through them in every "alter" request. I think that'd be expensive, but 
I could be wrong (add partition is perf critical operation AFAICT)

- Also, like you mentioned, this adds some extra memory, especially the String 
keys (the partition objects are just references) so we need to probably intern 
them.

- Either way, we can probably benchmark both the approaches on large 
partitioned tables and see which one is better.

Gabor, any thoughts on this?


http://gerrit.cloudera.org:8080/#/c/10543/1/tests/metadata/test_partition_metadata.py
File tests/metadata/test_partition_metadata.py:

http://gerrit.cloudera.org:8080/#/c/10543/1/tests/metadata/test_partition_metadata.py@159
PS1, Line 159:   assert data.split('\t') == ['21', '6']
> Nice catch as dropping a partition in this case would cause some issue. Not
I see. Did you get a chance to see how Apache Hive behaves in this case?



--
To view, visit http://gerrit.cloudera.org:8080/10543
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2a54bc8224bcefe65b83de2df58bb84629f2aa4a
Gerrit-Change-Number: 10543
Gerrit-PatchSet: 2
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 31 May 2018 20:32:08 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7044: Prevent overflow when computing Parquet block size

2018-05-31 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10483 )

Change subject: IMPALA-7044: Prevent overflow when computing Parquet block size
..


Patch Set 4:

(5 comments)

Thanks for the review. Please see PS5 and my inline comments. Let me know if 
you'd like to chat in person about the various error checks.

http://gerrit.cloudera.org:8080/#/c/10483/4/be/src/exec/hdfs-parquet-table-writer.cc
File be/src/exec/hdfs-parquet-table-writer.cc:

http://gerrit.cloudera.org:8080/#/c/10483/4/be/src/exec/hdfs-parquet-table-writer.cc@795
PS4, Line 795: num_cols
> it's a bit misleading to say "number of columns in the target table" and gi
Updated the message


http://gerrit.cloudera.org:8080/#/c/10483/4/be/src/exec/hdfs-parquet-table-writer.cc@796
PS4, Line 796:   }
> The preexisting block size logic is pretty convoluted. It appears that this
The check in hdfs-table-writer is supposed to catch future errors and act as a 
backstop to prevent the overflow (see my longer comment there).

The check in InitNewFile() below makes sure that the configured file size limit 
is large enough for the number of non-partitioning columns. Users can also hit 
it by setting the parquet_file_size option to limit the target file size.


http://gerrit.cloudera.org:8080/#/c/10483/4/be/src/exec/hdfs-parquet-table-writer.cc@925
PS4, Line 925: num_cols
> maybe call this num_file_columns to be clear that it's the file columns not
Done


http://gerrit.cloudera.org:8080/#/c/10483/4/be/src/exec/hdfs-parquet-table-writer.cc@979
PS4, Line 979: a table with " << num_cols << " columns
> not your change, but this is also confused.
Updated the message


http://gerrit.cloudera.org:8080/#/c/10483/4/be/src/exec/hdfs-table-sink.cc
File be/src/exec/hdfs-table-sink.cc:

http://gerrit.cloudera.org:8080/#/c/10483/4/be/src/exec/hdfs-table-sink.cc@371
PS4, Line 371: output_partition->partition_descriptor->block_size();
> in what cases can this be > 2GB? I'm wondering if we should just cap this v
My intent here is to catch any overflow that does not get caught elsewhere in 
the code. When calling hdfsOpenFile() below we must be sure that the value is 
<= 2GB. Introducing the possibility for such an error elsewhere in the code 
would not necessarily trigger it, so I opted to log it and abort the query 
instead of letting the overflow occur. At this point a user might not be able 
to fix the query, but at least they don't get unexpected results (incorrect 
results, crashes).

Should I add a comment to explain the intent better?



--
To view, visit http://gerrit.cloudera.org:8080/10483
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6e63420e5a093c0bbc789201771708865b16e138
Gerrit-Change-Number: 10483
Gerrit-PatchSet: 4
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Thu, 31 May 2018 20:02:41 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7044: Prevent overflow when computing Parquet block size

2018-05-31 Thread Lars Volker (Code Review)
Hello Thomas Marshall, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10483

to look at the new patch set (#5).

Change subject: IMPALA-7044: Prevent overflow when computing Parquet block size
..

IMPALA-7044: Prevent overflow when computing Parquet block size

When writing Parquet files we compute a minimum block size based on the
number of columns in the target table:

  3 * page_size * num_cols

For tables with a large number of columns (> ~10k), this value will get
larger than 2GB. When we pass it to hdfsOpenFile() in
HdfsTableSink::CreateNewTmpFile() it gets cast to a signed int32 and can
overflow.

To fix this we return an error if we detect that the minimum block size
exceed 2GB.

This change adds a test using CTAS into a table with 12k columns, making
sure that Impala returns the correct error.

Change-Id: I6e63420e5a093c0bbc789201771708865b16e138
---
M be/src/exec/hdfs-parquet-table-writer.cc
M be/src/exec/hdfs-parquet-table-writer.h
M be/src/exec/hdfs-table-sink.cc
M tests/query_test/test_insert_parquet.py
4 files changed, 41 insertions(+), 12 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/10483/5
--
To view, visit http://gerrit.cloudera.org:8080/10483
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6e63420e5a093c0bbc789201771708865b16e138
Gerrit-Change-Number: 10483
Gerrit-PatchSet: 5
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Thomas Marshall 


[Impala-ASF-CR] IMPALA-7012: Fix NPE when parsing unexpected tokens

2018-05-31 Thread Tianyi Wang (Code Review)
Tianyi Wang has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/10512 )

Change subject: IMPALA-7012: Fix NPE when parsing unexpected tokens
..

IMPALA-7012: Fix NPE when parsing unexpected tokens

Currently some token are not added to tokenIdMap in the sql scanner and
an NPE will be thrown if such are parsed as unexpected tokens. This
patch fixes this bug and adds a case to ParserTest.

Change-Id: I9c846fdfb22ba37bfc3b1985b9a044014ab58968
---
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
3 files changed, 36 insertions(+), 21 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/10512/3
--
To view, visit http://gerrit.cloudera.org:8080/10512
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9c846fdfb22ba37bfc3b1985b9a044014ab58968
Gerrit-Change-Number: 10512
Gerrit-PatchSet: 3
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Tianyi Wang 


[Impala-ASF-CR] PREVIEW: IMPALA-7078: improve memory consumption of wide Avro scans

2018-05-31 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10550 )

Change subject: PREVIEW: IMPALA-7078: improve memory consumption of wide Avro 
scans
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10550/5/be/src/exec/hdfs-scan-node.cc
File be/src/exec/hdfs-scan-node.cc:

http://gerrit.cloudera.org:8080/#/c/10550/5/be/src/exec/hdfs-scan-node.cc@301
PS5, Line 301:   if (materialized_row_batches_->Size() == 
max_materialized_row_batches_) break;
Unnecessary change.



--
To view, visit http://gerrit.cloudera.org:8080/10550
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iebd2600b4784fd19696c9b92eefb7d7e9db0c80b
Gerrit-Change-Number: 10550
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 31 May 2018 18:34:55 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Fix Indents from IMPALA-4970 fixing the mistake in indentation made previously

2018-05-31 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10560 )

Change subject: Fix Indents from IMPALA-4970 fixing the mistake in indentation 
made previously
..


Patch Set 1:

Same info, but its here as well:

https://cwiki.apache.org/confluence/display/IMPALA/Impala+Style+Guide
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65868536


--
To view, visit http://gerrit.cloudera.org:8080/10560
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4ea53e0780cca9f791808c2a73c7f72ba27c2159
Gerrit-Change-Number: 10560
Gerrit-PatchSet: 1
Gerrit-Owner: Rahul Shivu Mahadev 
Gerrit-Reviewer: Rahul Shivu Mahadev 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 31 May 2018 18:17:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] Fix Indents from IMPALA-4970 fixing the mistake in indentation made previously

2018-05-31 Thread Rahul Shivu Mahadev (Code Review)
Rahul Shivu Mahadev has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10560 )

Change subject: Fix Indents from IMPALA-4970 fixing the mistake in indentation 
made previously
..


Patch Set 1:

> Has anyone told you about clang-format? Might save you some time on
 > these nit-picky little issues.
 >
 > Running something like:
 > > sudo apt-get install clang-format-5.0
 > > git diff -U0 HEAD^ | clang-format-diff-5.0 -style file -p1 -i
 > will automatically apply our formatting rules to the patch you're
 > currently working on

 > Has anyone told you about clang-format? Might save you some time on
 > these nit-picky little issues.
 >
 > Running something like:
 > > sudo apt-get install clang-format-5.0
 > > git diff -U0 HEAD^ | clang-format-diff-5.0 -style file -p1 -i
 > will automatically apply our formatting rules to the patch you're
 > currently working on

 > Has anyone told you about clang-format? Might save you some time on
 > these nit-picky little issues.
 >
 > Running something like:
 > > sudo apt-get install clang-format-5.0
 > > git diff -U0 HEAD^ | clang-format-diff-5.0 -style file -p1 -i
 > will automatically apply our formatting rules to the patch you're
 > currently working on

That's helpful I shall check it out, thank you


--
To view, visit http://gerrit.cloudera.org:8080/10560
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4ea53e0780cca9f791808c2a73c7f72ba27c2159
Gerrit-Change-Number: 10560
Gerrit-PatchSet: 1
Gerrit-Owner: Rahul Shivu Mahadev 
Gerrit-Reviewer: Rahul Shivu Mahadev 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Thu, 31 May 2018 17:50:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] Fix Indents from IMPALA-4970 fixing the mistake in indentation made previously

2018-05-31 Thread Thomas Marshall (Code Review)
Thomas Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10560 )

Change subject: Fix Indents from IMPALA-4970 fixing the mistake in indentation 
made previously
..


Patch Set 1:

Has anyone told you about clang-format? Might save you some time on these 
nit-picky little issues.

Running something like:
> sudo apt-get install clang-format-5.0
> git diff -U0 HEAD^ | clang-format-diff-5.0 -style file -p1 -i
will automatically apply our formatting rules to the patch you're currently 
working on


--
To view, visit http://gerrit.cloudera.org:8080/10560
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4ea53e0780cca9f791808c2a73c7f72ba27c2159
Gerrit-Change-Number: 10560
Gerrit-PatchSet: 1
Gerrit-Owner: Rahul Shivu Mahadev 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Thu, 31 May 2018 17:47:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] Fix Indents from IMPALA-4970 fixing the mistake in indentation made previously

2018-05-31 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10560 )

Change subject: Fix Indents from IMPALA-4970 fixing the mistake in indentation 
made previously
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/10560
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4ea53e0780cca9f791808c2a73c7f72ba27c2159
Gerrit-Change-Number: 10560
Gerrit-PatchSet: 1
Gerrit-Owner: Rahul Shivu Mahadev 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Thu, 31 May 2018 17:41:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] Fix Indents from IMPALA-4970 fixing the mistake in indentation made previously

2018-05-31 Thread Rahul Shivu Mahadev (Code Review)
Rahul Shivu Mahadev has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10560


Change subject: Fix Indents from IMPALA-4970 fixing the mistake in indentation 
made previously
..

Fix Indents from IMPALA-4970
fixing the mistake in indentation made previously

Change-Id: I4ea53e0780cca9f791808c2a73c7f72ba27c2159
---
M be/src/runtime/coordinator.cc
1 file changed, 4 insertions(+), 4 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/10560/1
--
To view, visit http://gerrit.cloudera.org:8080/10560
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I4ea53e0780cca9f791808c2a73c7f72ba27c2159
Gerrit-Change-Number: 10560
Gerrit-PatchSet: 1
Gerrit-Owner: Rahul Shivu Mahadev 


[Impala-ASF-CR] IMPALA-6119: Fix issue with multiple partitions sharing same location

2018-05-31 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10543 )

Change subject: IMPALA-6119: Fix issue with multiple partitions sharing same 
location
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/10543/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/10543/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1485
PS2, Line 1485:   if (partitions == null) {
I'm not an expert on the Catalog code, so feel free to push back.

I'm just wondering if it would make more sense to not have many extra 
'locationToPartMap_' maps (one per HdfsTable) in memory, and rather just 
restructure the 'getPartitionsByPath()' function to loop through 
'partitionMap_' once and pick out all the partitions that match 
'partitionNames'.

These are the questions that motivated the above approach:
What's the largest we expect a 'partitionMap_' to be? And do we expect that 
such an approach would significantly slowdown the hot path in the catalog?


http://gerrit.cloudera.org:8080/#/c/10543/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1487
PS2, Line 1487: partitionPath.toString()
partition.getLocation() to avoid unnecessary indirection ?



--
To view, visit http://gerrit.cloudera.org:8080/10543
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2a54bc8224bcefe65b83de2df58bb84629f2aa4a
Gerrit-Change-Number: 10543
Gerrit-PatchSet: 2
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 31 May 2018 17:01:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR](2.x) IMPALA-7091: Address NullPointerException in HBaseTable.getRegionSize().

2018-05-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10556 )

Change subject: IMPALA-7091: Address NullPointerException in 
HBaseTable.getRegionSize().
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2576/


--
To view, visit http://gerrit.cloudera.org:8080/10556
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: comment
Gerrit-Change-Id: I02f06daf69e7f7e97c9ecc13997147530c2f9d3f
Gerrit-Change-Number: 10556
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Thu, 31 May 2018 16:53:42 +
Gerrit-HasComments: No


[Impala-ASF-CR](2.x) IMPALA-7091: Address NullPointerException in HBaseTable.getRegionSize().

2018-05-31 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10556 )

Change subject: IMPALA-7091: Address NullPointerException in 
HBaseTable.getRegionSize().
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/10556
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: comment
Gerrit-Change-Id: I02f06daf69e7f7e97c9ecc13997147530c2f9d3f
Gerrit-Change-Number: 10556
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Thu, 31 May 2018 16:52:58 +
Gerrit-HasComments: No


[Impala-ASF-CR](2.x) IMPALA-7091: Address NullPointerException in HBaseTable.getRegionSize().

2018-05-31 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10556 )

Change subject: IMPALA-7091: Address NullPointerException in 
HBaseTable.getRegionSize().
..


Patch Set 1:

Since the 2.x backport for IMPALA-7061 is hitting this, pulling this ahead 
manually to unblock things.

Clean cherry pick.


--
To view, visit http://gerrit.cloudera.org:8080/10556
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: comment
Gerrit-Change-Id: I02f06daf69e7f7e97c9ecc13997147530c2f9d3f
Gerrit-Change-Number: 10556
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Thu, 31 May 2018 16:47:39 +
Gerrit-HasComments: No


[Impala-ASF-CR](2.x) IMPALA-7091: Address NullPointerException in HBaseTable.getRegionSize().

2018-05-31 Thread Joe McDonnell (Code Review)
Hello Impala Public Jenkins,

I'd like you to do a code review. Please visit

http://gerrit.cloudera.org:8080/10556

to review the following change.


Change subject: IMPALA-7091: Address NullPointerException in 
HBaseTable.getRegionSize().
..

IMPALA-7091: Address NullPointerException in HBaseTable.getRegionSize().

It's possible for "serverLoad.getRegionsLoad().get(info.getRegionName())"
to be null, which causes a NullPointerException in the planner, and
visible to the user. The code around it already says that it handles
errors by returning 0 for the size, and I've extended that to one more
case.

In practice, I have seen this come up in failures of the following test:

  failure.test_failpoints.TestFailpoints.test_failpoints[table_format: 
hbase/none | exec_option: {'batch_size': 0, 'num_nodes': 0, 
'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
'abort_on_error': 1, 'debug_action': None, 'exec_single_node_rows_threshold': 
0} | mt_dop: 4 | location: OPEN | action: MEM_LIMIT_EXCEEDED | query: select * 
from alltypessmall union all select * from alltypessmall]

I saw this failure only happen in some test-with-docker runs,
inconsistently.  The error is a little bit hard to spot, but by
correlating the timestamp of the failing test (which just complains
about NullPointerException), you can find a Java stack trace complaining
of a NPE in "regionLoad.getStorefileSizeMB()". I think the likely cause
is regionLoad being null.

Change-Id: I02f06daf69e7f7e97c9ecc13997147530c2f9d3f
Reviewed-on: http://gerrit.cloudera.org:8080/10531
Reviewed-by: Joe McDonnell 
Tested-by: Impala Public Jenkins 
(cherry picked from commit fc0c5f58f00b360a7651187e0b0fe79add426742)
---
M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
1 file changed, 6 insertions(+), 2 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/56/10556/1
--
To view, visit http://gerrit.cloudera.org:8080/10556
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: newchange
Gerrit-Change-Id: I02f06daf69e7f7e97c9ecc13997147530c2f9d3f
Gerrit-Change-Number: 10556
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Philip Zeyliger 


[Impala-ASF-CR] IMPALA-6119: Fix issue with multiple partitions sharing same location

2018-05-31 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10543 )

Change subject: IMPALA-6119: Fix issue with multiple partitions sharing same 
location
..


Patch Set 2:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/10543/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/10543/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@189
PS1, Line 189: ).
> nit:Java style
Done


http://gerrit.cloudera.org:8080/#/c/10543/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@189
PS1, Line 189: from HdfsPar
> nit: remove, kinda obvious
Done


http://gerrit.cloudera.org:8080/#/c/10543/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1105
PS1, Line 1105: tition);
> can be inferred from the partition object?
Good point. Done


http://gerrit.cloudera.org:8080/#/c/10543/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1204
PS1, Line 1204: .s
> nit:Java
Done


http://gerrit.cloudera.org:8080/#/c/10543/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1205
PS1, Line 1205: updates l
> updates
Done


http://gerrit.cloudera.org:8080/#/c/10543/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1207
PS1, Line 1207: updat
> nit: update?
Done


http://gerrit.cloudera.org:8080/#/c/10543/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1214
PS1, Line 1214: Adds 'partition' to locationToPartMap_.
> May be just say "Adds 'partition' to locationTo..." Rest is implicit.
Done


http://gerrit.cloudera.org:8080/#/c/10543/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1216
PS1, Line 1216: addToLocationPartMap(Hd
> call it addToLocationPartMap may be?
Done


http://gerrit.cloudera.org:8080/#/c/10543/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1468
PS1, Line 1468:
  :* Given a set of partition names, returns the correspon
> I think the earlier description is more clear.
I wanted to somehow include to the comment that we now search for all the 
HdfsPartitions that point where the ones received by parameter. Rephrased the 
comment. What do you think?


http://gerrit.cloudera.org:8080/#/c/10543/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/10543/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2434
PS1, Line 2434:
> nit: call it updatePartitionLocation()?
Done


http://gerrit.cloudera.org:8080/#/c/10543/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2435
PS1, Line 2435:   applyAlterPartition(tbl, partition);
  :   } finally {
  :
> I don't think non HdfsTable is possible here. Look at catalog_.getHdfsParti
Done


http://gerrit.cloudera.org:8080/#/c/10543/1/tests/metadata/test_partition_metadata.py
File tests/metadata/test_partition_metadata.py:

http://gerrit.cloudera.org:8080/#/c/10543/1/tests/metadata/test_partition_metadata.py@159
PS1, Line 159:   assert data.split('\t') == ['21', '6']
> Could you drop one of the partitions that point to the same partition dir a
Nice catch as dropping a partition in this case would cause some issue. Note, 
that this is present without my code as well.

Basically, when I drop a partition than the hdfs directory is deleted no matter 
if other partitions would use it. However, as far as I see Impala's metadata is 
not aware of this and treat the other partitions pointing to this location 
valid including their stored file descriptors. Once some data is queried from 
these partitions than a "No such file or directory" error is shown. An 
invalidate metadata then solves the issue with querying the table.

Is that expected to keep the other partitions to the same location alive? I 
tested on 5.15 nightly and this issue is present there as well.



--
To view, visit http://gerrit.cloudera.org:8080/10543
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2a54bc8224bcefe65b83de2df58bb84629f2aa4a
Gerrit-Change-Number: 10543
Gerrit-PatchSet: 2
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 31 May 2018 16:30:16 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6119: Fix issue with multiple partitions sharing same location

2018-05-31 Thread Gabor Kaszab (Code Review)
Hello Bharath Vissapragada, Zoltan Borok-Nagy,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10543

to look at the new patch set (#2).

Change subject: IMPALA-6119: Fix issue with multiple partitions sharing same 
location
..

IMPALA-6119: Fix issue with multiple partitions sharing same location

When multiple partitions point to the same location and a new
data file is added to any of them then the expected behaviour is that
this new file is added to the other partitions pointing to the same
location as well. Apparently, this is not the case and right after
the insertion the new file is only visible in the partition where it
was inserted to and an invalidate metadata is needed to resolve this
inconsistency.
This fix addresses this issue with keeping track of a mapping between
locations and the HdfsPartitions pointing to it. When new files are
inserted into a partition then all the other partition's metadata are
reloaded that point to the same location as the one where the files
are inserted.

Testing:
There was an existing test that covered partitions pointing to the
same location. However, after each insert it executed a refresh to
reload the metadata for the entire table. This reload was removed
to cover the changes of this fix.
Another test is introduced to cover the case when the location of a
partition is altered.

Change-Id: I2a54bc8224bcefe65b83de2df58bb84629f2aa4a
---
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M tests/metadata/test_partition_metadata.py
3 files changed, 115 insertions(+), 12 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/10543/2
--
To view, visit http://gerrit.cloudera.org:8080/10543
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2a54bc8224bcefe65b83de2df58bb84629f2aa4a
Gerrit-Change-Number: 10543
Gerrit-PatchSet: 2
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] PREVIEW: IMPALA-7078: improve memory consumption of wide Avro scans

2018-05-31 Thread Tim Armstrong (Code Review)
Hello Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10550

to look at the new patch set (#5).

Change subject: PREVIEW: IMPALA-7078: improve memory consumption of wide Avro 
scans
..

PREVIEW: IMPALA-7078: improve memory consumption of wide Avro scans

Revert to the pre-IMPALA-3905 algorithm for deciding when to return a
batch from an Avro scan. The post-IMPALA-3905 algorithm is bad for
wide tables where there are only a small number of rows per Avro block.

Optimise memory transfer for selective scans - don't attach unused
decompression buffers to the output batch. Combined with the previous
change, this dramatically reduces the amount of memory transferred out
of scanner threads for selective scans of wide tables.

Cap the maximum row batch queue size at 5 * max_num_scanner_threads_
so that we guarantee lower amounts of memory in the row batch queue
when num_scanner_threads is set, rather than just achieving it
statistically because of the producer running slower relative to
consumer. It does not reduce the default significantly on typical
server configurations that would have 24+ cores except under high
concurrency or low memory environments where the number of scanner
threads is limited. We should evaluate reducing the default further
or otherwise better controlling memory consumption in a follow-up,
based on experiments.

Includes some observability improvements including additional
counters that will help diagnose issues like this more easily:
* Add counters to give some insight into row batch queue. Here's
  an excerpt:
   - RowBatchBytesEnqueued: 20.89 MB (21903380)
   - RowBatchQueueCapacity: 5 (5)
   - RowBatchQueueGetWaitTime: 59.187ms
   - RowBatchQueuePeakMemoryUsage: 8.85 MB (9279347)
   - RowBatchQueuePutWaitTime: 0.000ns
   - RowBatchesEnqueued: 6 (6)
* Don't create AverageScannerThreadConcurrency for MT scan node where
  it's not actually used.
* Track the row batch queue memory consumption against a sub-tracker
  HDFS_SCAN_NODE (id=2): Reservation=48.00 MB OtherMemory=588.00 KB Total=48.57 
MB Peak=48.62 MB
Queued Batches: Total=588.00 KB Peak=637.00 KB

Ran the repro in the JIRA. Memory consumption was reduced from ~500MB
to ~220MB on my system.

Testing:
Ran stress test for an hour on uncompressed and snappy-compressed avro.

TODO:
- Running exhaustive tests
- Running ASAN tests
- Perf benchmarks

Change-Id: Iebd2600b4784fd19696c9b92eefb7d7e9db0c80b
---
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/scan-node.h
M be/src/runtime/mem-pool.cc
M be/src/runtime/mem-pool.h
M be/src/runtime/mem-tracker-test.cc
M be/src/runtime/mem-tracker.cc
M be/src/runtime/mem-tracker.h
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
12 files changed, 242 insertions(+), 74 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/10550/5
--
To view, visit http://gerrit.cloudera.org:8080/10550
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iebd2600b4784fd19696c9b92eefb7d7e9db0c80b
Gerrit-Change-Number: 10550
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] PREVIEW: IMPALA-7078: improve memory consumption of wide Avro scans

2018-05-31 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10550 )

Change subject: PREVIEW: IMPALA-7078: improve memory consumption of wide Avro 
scans
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10550/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10550/2//COMMIT_MSG@18
PS2, Line 18: Cap the maximum row batch queue size at 5 * the number of active
: scanner threads.
> I want to get a tighter upper bound on the amount of memory in the queue an
I did some benchmarks and the dynamic queue size caused performance changes 
because the queue became full sooner and fewer scanner threads were started. I 
removed the dynamic queue sizing. I think it's worth capping the queue based on 
num_scanner_threads still so that setting num_scanner_threads has more teeth in 
limiting memory consumption.

I can still see the argument for separating that out from this patch though if 
you prefer.



--
To view, visit http://gerrit.cloudera.org:8080/10550
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iebd2600b4784fd19696c9b92eefb7d7e9db0c80b
Gerrit-Change-Number: 10550
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 31 May 2018 15:55:25 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7091: Address NullPointerException in HBaseTable.getRegionSize().

2018-05-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10531 )

Change subject: IMPALA-7091: Address NullPointerException in 
HBaseTable.getRegionSize().
..

IMPALA-7091: Address NullPointerException in HBaseTable.getRegionSize().

It's possible for "serverLoad.getRegionsLoad().get(info.getRegionName())"
to be null, which causes a NullPointerException in the planner, and
visible to the user. The code around it already says that it handles
errors by returning 0 for the size, and I've extended that to one more
case.

In practice, I have seen this come up in failures of the following test:

  failure.test_failpoints.TestFailpoints.test_failpoints[table_format: 
hbase/none | exec_option: {'batch_size': 0, 'num_nodes': 0, 
'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
'abort_on_error': 1, 'debug_action': None, 'exec_single_node_rows_threshold': 
0} | mt_dop: 4 | location: OPEN | action: MEM_LIMIT_EXCEEDED | query: select * 
from alltypessmall union all select * from alltypessmall]

I saw this failure only happen in some test-with-docker runs,
inconsistently.  The error is a little bit hard to spot, but by
correlating the timestamp of the failing test (which just complains
about NullPointerException), you can find a Java stack trace complaining
of a NPE in "regionLoad.getStorefileSizeMB()". I think the likely cause
is regionLoad being null.

Change-Id: I02f06daf69e7f7e97c9ecc13997147530c2f9d3f
Reviewed-on: http://gerrit.cloudera.org:8080/10531
Reviewed-by: Joe McDonnell 
Tested-by: Impala Public Jenkins 
---
M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
1 file changed, 6 insertions(+), 2 deletions(-)

Approvals:
  Joe McDonnell: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/10531
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I02f06daf69e7f7e97c9ecc13997147530c2f9d3f
Gerrit-Change-Number: 10531
Gerrit-PatchSet: 2
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 


[Impala-ASF-CR] IMPALA-7091: Address NullPointerException in HBaseTable.getRegionSize().

2018-05-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10531 )

Change subject: IMPALA-7091: Address NullPointerException in 
HBaseTable.getRegionSize().
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/10531
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02f06daf69e7f7e97c9ecc13997147530c2f9d3f
Gerrit-Change-Number: 10531
Gerrit-PatchSet: 1
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Thu, 31 May 2018 07:02:20 +
Gerrit-HasComments: No