[Impala-ASF-CR] IMPALA-8058: Fallback for HBase key scan range estimation

2019-01-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12192 )

Change subject: IMPALA-8058: Fallback for HBase key scan range estimation
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1832/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12192
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic01147abcb6b184071ba28b55aedc3bc49b322ce
Gerrit-Change-Number: 12192
Gerrit-PatchSet: 2
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Comment-Date: Sat, 19 Jan 2019 03:02:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8058: Fallback for HBase key scan range estimation

2019-01-18 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12192 )

Change subject: IMPALA-8058: Fallback for HBase key scan range estimation
..


Patch Set 2:

(3 comments)

Addressed code review comments. Rebased on latest master, which now shows 
cardinality estimates in the EXPLAIN plan, so updated the new test file 
accordingly.

http://gerrit.cloudera.org:8080/#/c/12192/1/fe/src/main/java/org/apache/impala/catalog/FeHBaseTable.java
File fe/src/main/java/org/apache/impala/catalog/FeHBaseTable.java:

http://gerrit.cloudera.org:8080/#/c/12192/1/fe/src/main/java/org/apache/impala/catalog/FeHBaseTable.java@409
PS1, Line 409:
> nit: add tablename, startKey and endKey too?
Done


http://gerrit.cloudera.org:8080/#/c/12192/1/fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
File fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java:

http://gerrit.cloudera.org:8080/#/c/12192/1/fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java@304
PS1, Line 304: // No useful estimate. Rely on HMS row count stats.
> Probably mention here that this doesn't work as expected if the stats are m
Done


http://gerrit.cloudera.org:8080/#/c/12192/1/fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java@304
PS1, Line 304: // No useful estimate. Rely on HMS row count stats.
> Probably mention here that this doesn't work as expected if the stats are m
Done



--
To view, visit http://gerrit.cloudera.org:8080/12192
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic01147abcb6b184071ba28b55aedc3bc49b322ce
Gerrit-Change-Number: 12192
Gerrit-PatchSet: 2
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Comment-Date: Sat, 19 Jan 2019 02:29:52 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8058: Fallback for HBase key scan range estimation

2019-01-18 Thread Paul Rogers (Code Review)
Hello Bharath Vissapragada, Zoram Thanga, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12192

to look at the new patch set (#2).

Change subject: IMPALA-8058: Fallback for HBase key scan range estimation
..

IMPALA-8058: Fallback for HBase key scan range estimation

Impala supports "pushing" of HBase key range predicates to HBase so that
Impala reads only rows within the target key range. The planner
estimates the cardinality of such scans by sampling the rows within the
range. However, we have seen cases where sampling returns rows for
unknown reasons. The planner then ends up without a good cardinality
estimate.  (Specifically, the code does a division by zero and produces
a huge estimate.  See the ticket for details.)

Impala appears to use the sampling strategy to compute cardinality
because HBase uses generally do not gather table stats. The resulting
estimates are often off by 2x or more. This is a problem in tests as it
causes cardinality numbers to vary greatly from the expected values.
Fortunately, tests do gather HMS stats. There may be cases where users
do as well. This fix exploints that fact.

This fix:

* Creates a fall-back strategy that uses table cardinality from HMS and
  the selectivity of the key predicates to estimate cardinality when the
  sampling approach fails.
* The fall-back strategy requires tracking the predicates used for HBase
  keys so that their selectivity can be applied during fall-back
  calculations.
* Moved HBase key calculation out of the SingleNodePlanner into the
  HBase scan node as suggested by a "TO DO" in the code. Doing so
  simplified the new code.
* In the spirit of IMPALA-7919, adds the key predicates to the HBase
  scan node in the EXPLAIN output.

Testing:

* Adds a query context option to disable the normal key sampling to
  force the use of the fall-back. Used for testing.
* Adds a new set of HBase test cases that use the new feature to check
  plans with the fall-back approach.
* Reran all existing tests.
* Compared cardinality numbers for the two modes: sampling and HMS using
  the cardinality features of IMPALA-8021. The two approaches provide
  different results, but this is mostly due to the missing selectivity
  estimates for inequality operators. (That's a fix for another time.)

Change-Id: Ic01147abcb6b184071ba28b55aedc3bc49b322ce
---
M common/thrift/ImpalaInternalService.thrift
M fe/src/main/java/org/apache/impala/catalog/FeHBaseTable.java
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/hbase-no-key-est.test
M testdata/workloads/functional-planner/queries/PlannerTest/hbase.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
10 files changed, 512 insertions(+), 116 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/12192/2
--
To view, visit http://gerrit.cloudera.org:8080/12192
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic01147abcb6b184071ba28b55aedc3bc49b322ce
Gerrit-Change-Number: 12192
Gerrit-PatchSet: 2
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Zoram Thanga 


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-18 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 20:

(36 comments)

http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@307
PS21, Line 307: BackendConfig.INSTANCE.getHMSPollingIntervalInSeconds()
> eventPollingInterval
Done


http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@319
PS21, Line 319:  LOG.error("Unable to fetch the current notification event id 
from metastore."
  :   + "Metastore event processing will be disabled.",
  :   e);
> nit: Multiple places in the code, try to format in fewer lines. Something l
I used the command given in 
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65868536 as 
advised by some of the other members of the team. Unfortunately, it doesn't 
like some of these lines.


http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@324
PS21, Line 324:   e);
> Will the code that handles this exception write to the log both the message
The LOG.error above logs the trace as well since it takes in the underlying 
cause exception as the second argument. The exception is uncaught in this 
particular case on the caller side, since we want CatalogD to not come up if it 
is configured to using event processing but for some reasons isn't able to 
instantiate one.


http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1206
PS21, Line 1206: // Even though we get the current notification event id 
before stopping the event
> clarification: reset() is equivalent to invalidating everything and fetchin
This is similar to Tim's comment and our discussion. The line 1216 gets the 
currentEventID (latest) from HMS. And the event processing begins from that id. 
So the event processing jumps to the latest event id once the reset() is 
complete. There is still a slight chance that we will reprocess some of the 
events which are generated during reset() execution. It is better than missing 
those events which would lead to inconsistent state between catalog and HMS as 
far as event processing is concerned. The comment gives details of that case.


http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1216
PS21, Line 1216: long currentEventId = 
metastoreEventProcessor_.getCurrentEventId();
> Should this be here? Or in the event processor itself? If here, why is it n
It needs to be here to avoid the race condition which can lead to possible 
missed events. We want to get the latestEventId before triggering the reset() 
so that we can then restart the event processing after reset from this eventId. 
If we move the code in the start() method, there is a chance that we will miss 
processing some of the events which happen during after reset is complete, but 
before we started the event processing. See Tim's comment in the earlier review 
comments which gives a nice example.


http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1360
PS21, Line 1360:   Reference dbWasFound, Reference 
tblWasFound) {
> I still think that using the reference is non-standard Java. A simple solut
used the second option of return boolean and throw exception when db is not 
found.


http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1363
PS21, Line 1363: Db db = getDb(dbName);
> Let's think about this. We return the flags because we want to know if the
if getting db before acquiring the lock has a race then it looks like addTable 
and renameTable also has a race. I will create separate JIRAs for fixing them


http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1372
PS21, Line 1372: tblWasFound.setRef(true);
> I wonder, do we really need to know if the table existed or not? Is it even
consider a case where user do create, drop, create with the same table name 
from Impala. In case of if the create event from the first create statement, 
the table presented in the event is stale and should not be used to add to the 
catalog. I think it is useful to log this information to make sure that the 
create event was ignored in such a case.



[Impala-ASF-CR] Update ASF copyright to current year, 2019

2019-01-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12242 )

Change subject: Update ASF copyright to current year, 2019
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/12242
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia76bfc03122ebdaedbe1cfdd4e166228c399257f
Gerrit-Change-Number: 12242
Gerrit-PatchSet: 1
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Sat, 19 Jan 2019 01:20:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] Update ASF copyright to current year, 2019

2019-01-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/12242 )

Change subject: Update ASF copyright to current year, 2019
..

Update ASF copyright to current year, 2019

Change-Id: Ia76bfc03122ebdaedbe1cfdd4e166228c399257f
Reviewed-on: http://gerrit.cloudera.org:8080/12242
Reviewed-by: Fredy Wijaya 
Tested-by: Impala Public Jenkins 
---
M NOTICE.txt
1 file changed, 1 insertion(+), 1 deletion(-)

Approvals:
  Fredy Wijaya: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/12242
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ia76bfc03122ebdaedbe1cfdd4e166228c399257f
Gerrit-Change-Number: 12242
Gerrit-PatchSet: 2
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-7694: Add host resource usage metrics to profile

2019-01-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12069 )

Change subject: IMPALA-7694: Add host resource usage metrics to profile
..


Patch Set 11:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1831/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12069
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3aedc20c553ab8d7ed50f72a1a936eba151487d9
Gerrit-Change-Number: 12069
Gerrit-PatchSet: 11
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 19 Jan 2019 00:04:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7731: Add Read/Exchange counters to profile

2019-01-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12229 )

Change subject: IMPALA-7731: Add Read/Exchange counters to profile
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1830/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12229
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ife7ec78fe42558429c1cbe6e5eba79842bffd648
Gerrit-Change-Number: 12229
Gerrit-PatchSet: 4
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Fri, 18 Jan 2019 23:53:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7694: Add host resource usage metrics to profile

2019-01-18 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12069 )

Change subject: IMPALA-7694: Add host resource usage metrics to profile
..


Patch Set 11:

PS 11 rebases PS10


--
To view, visit http://gerrit.cloudera.org:8080/12069
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3aedc20c553ab8d7ed50f72a1a936eba151487d9
Gerrit-Change-Number: 12069
Gerrit-PatchSet: 11
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 Jan 2019 23:30:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7694: Add host resource usage metrics to profile

2019-01-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12069 )

Change subject: IMPALA-7694: Add host resource usage metrics to profile
..


Patch Set 11:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/12069/11/bin/plot_profile_resource_usage.py
File bin/plot_profile_resource_usage.py:

http://gerrit.cloudera.org:8080/#/c/12069/11/bin/plot_profile_resource_usage.py@27
PS11, Line 27: d
flake8: E501 line too long (162 > 90 characters)


http://gerrit.cloudera.org:8080/#/c/12069/11/bin/plot_profile_resource_usage.py@31
PS11, Line 31: from impala_py_lib import profiles
flake8: E402 module level import not at top of file


http://gerrit.cloudera.org:8080/#/c/12069/11/bin/plot_profile_resource_usage.py@32
PS11, Line 32: import argparse
flake8: E402 module level import not at top of file


http://gerrit.cloudera.org:8080/#/c/12069/11/bin/plot_profile_resource_usage.py@33
PS11, Line 33: import sys
flake8: E402 module level import not at top of file


http://gerrit.cloudera.org:8080/#/c/12069/11/bin/plot_profile_resource_usage.py@34
PS11, Line 34: import matplotlib
flake8: E402 module level import not at top of file


http://gerrit.cloudera.org:8080/#/c/12069/11/bin/plot_profile_resource_usage.py@36
PS11, Line 36: import matplotlib.pyplot as plt
flake8: E402 module level import not at top of file


http://gerrit.cloudera.org:8080/#/c/12069/11/lib/python/impala_py_lib/profiles.py
File lib/python/impala_py_lib/profiles.py:

http://gerrit.cloudera.org:8080/#/c/12069/11/lib/python/impala_py_lib/profiles.py@28
PS11, Line 28: def decode_profile_line(line):
flake8: E302 expected 2 blank lines, found 1


http://gerrit.cloudera.org:8080/#/c/12069/11/tests/common/impala_service.py
File tests/common/impala_service.py:

http://gerrit.cloudera.org:8080/#/c/12069/11/tests/common/impala_service.py@91
PS11, Line 91: ;
flake8: E703 statement ends with a semicolon


http://gerrit.cloudera.org:8080/#/c/12069/11/tests/observability/test_plot_profile_resource_usage.py
File tests/observability/test_plot_profile_resource_usage.py:

http://gerrit.cloudera.org:8080/#/c/12069/11/tests/observability/test_plot_profile_resource_usage.py@25
PS11, Line 25: class TestPlotProfileResourceUsage(ImpalaTestSuite):
flake8: E302 expected 2 blank lines, found 1


http://gerrit.cloudera.org:8080/#/c/12069/11/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/12069/11/tests/query_test/test_observability.py@399
PS11, Line 399: e
flake8: F841 local variable 'expected_strs' is assigned to but never used



--
To view, visit http://gerrit.cloudera.org:8080/12069
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3aedc20c553ab8d7ed50f72a1a936eba151487d9
Gerrit-Change-Number: 12069
Gerrit-PatchSet: 11
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 Jan 2019 23:25:08 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7694: Add host resource usage metrics to profile

2019-01-18 Thread Lars Volker (Code Review)
Hello Michael Ho, Philip Zeyliger, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12069

to look at the new patch set (#11).

Change subject: IMPALA-7694: Add host resource usage metrics to profile
..

IMPALA-7694: Add host resource usage metrics to profile

This change adds a mechanism to collect host resource usage metrics to
profiles. Metric collection can be controlled through a new query option
'RESOURCE_TRACE_RATIO'. It specifies the probability with which metrics
collection will be enabled. Collection always happens per query for all
executors that run one or more fragment instances of the query.

This mechanism adds a new time series counter class that collects all
measured values and does not re-sample them. It will re-sample values
when printing them into a string profile, preserving up to 64 values,
but Thrift profiles will contain the full list of values.

We add a new section "Per Node Profiles" to the profile to store and
show these values:

Per Node Profiles:
  lv-desktop:22000:
CpuIoWaitPercentage (500.000ms): 0, 0
CpuSysPercentage (500.000ms): 1, 1
CpuUserPercentage (500.000ms): 4, 0
  - ScratchBytesRead: 0
  - ScratchBytesWritten: 0
  - ScratchFileUsedBytes: 0
  - ScratchReads: 0 (0)
  - ScratchWrites: 0 (0)
  - TotalEncryptionTime: 0.000ns
  - TotalReadBlockTime: 0.000ns

This change also uses the aforementioned mechanism to collect CPU usage
metrics (user, system, and IO wait time).

This change also adds a tool to decode a Thrift profile and plot the
contained usage metrics using matplotlib. Example:
https://user-images.githubusercontent.com/151514/49830685-bb7efd80-fd46-11e8-8e23-9f5bc47635c1.png

This change also includes a few minor improvements to make the resulting
code more readable:
- Extend the PeriodicCounterUpdater to call functions to update global
  metrics before updating the counters.
- Expose the scratch profile within the per node resource usage section.
- Improve documentation of the profile counter classes.
- Remove synchronization from StreamingSampler.
- Remove a few pieces of dead code that otherwise would have required
  updates.
- Factor some common code for profile decoding into the Impala python
  library

Testing: This change contains a unit test for the system level metrics
collection and e2e tests for the profile changes. It also contains a
test for the plotting tool.

Change-Id: I3aedc20c553ab8d7ed50f72a1a936eba151487d9
---
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/krpc-data-stream-recvr.cc
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/runtime-state.cc
M be/src/service/impala-server.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M be/src/util/CMakeLists.txt
M be/src/util/periodic-counter-updater.cc
M be/src/util/periodic-counter-updater.h
M be/src/util/pretty-printer.h
M be/src/util/runtime-profile-counters.h
M be/src/util/runtime-profile-test.cc
M be/src/util/runtime-profile.cc
M be/src/util/runtime-profile.h
M be/src/util/streaming-sampler.h
A be/src/util/system-state-info-test.cc
A be/src/util/system-state-info.cc
A be/src/util/system-state-info.h
M bin/parse-thrift-profile.py
A bin/plot_profile_resource_usage.py
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M common/thrift/Metrics.thrift
M common/thrift/RuntimeProfile.thrift
A lib/python/impala_py_lib/profiles.py
M tests/beeswax/impala_beeswax.py
M tests/common/impala_service.py
A tests/observability/test_plot_profile_resource_usage.py
M tests/query_test/test_observability.py
37 files changed, 1,284 insertions(+), 256 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/69/12069/11
--
To view, visit http://gerrit.cloudera.org:8080/12069
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3aedc20c553ab8d7ed50f72a1a936eba151487d9
Gerrit-Change-Number: 12069
Gerrit-PatchSet: 11
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7731: Add Read/Exchange counters to profile

2019-01-18 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12229 )

Change subject: IMPALA-7731: Add Read/Exchange counters to profile
..


Patch Set 4:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/12229/2/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/12229/2/be/src/runtime/coordinator-backend-state.cc@251
PS2, Line 251: }
> Is this how we determine whether the fragment had a scan in it or not? Poss
Done


http://gerrit.cloudera.org:8080/#/c/12229/3/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/12229/3/be/src/runtime/coordinator-backend-state.cc@524
PS3, Line 524: RuntimeProfile::Counter* c = 
p->GetCounter(ScanNode::SCAN_RANGES_COMPLETE_COUNTER);
> ??
Sry, done.


http://gerrit.cloudera.org:8080/#/c/12229/3/be/src/runtime/coordinator-backend-state.cc@531
PS3, Line 531: 
p->GetCounter(KrpcDataStreamSender::TOTAL_BYTES_SENT_COUNTER);
> This this be a constant somewhere?
Done


http://gerrit.cloudera.org:8080/#/c/12229/2/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/12229/2/be/src/runtime/coordinator.cc@779
PS2, Line 779:   total_utilization.bytes_read);
> Should we add comments here about what these new counters mean? (Especially
I added comments to all of them. Once IMPALA-7550 comes around we can mode them 
to the counter definition. (WIP here: https://gerrit.cloudera.org/#/c/12116/)


http://gerrit.cloudera.org:8080/#/c/12229/2/be/src/runtime/coordinator.cc@794
PS2, Line 794:   double xchg_scan_ratio = 0;
> nit: extra line
Done


http://gerrit.cloudera.org:8080/#/c/12229/3/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/12229/3/be/src/runtime/coordinator.cc@790
PS3, Line 790:   // node.
> Is it the case that
It follows directly from these three COUNTER_SET calls, the addition for 
TotalBytesSent happens in L783. I added a test to the python tests.


http://gerrit.cloudera.org:8080/#/c/12229/2/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/12229/2/tests/query_test/test_observability.py@391
PS2, Line 391:  "TotalInnerBytesSent", 
"ExchangeScanRatio",
> line has trailing whitespace
Done


http://gerrit.cloudera.org:8080/#/c/12229/2/tests/query_test/test_observability.py@391
PS2, Line 391:
> This query takes 3.4 secs on my machine, which is far from extreme, but I w
I picked a query that does a partitioned exchange without being able to push 
predicates down to the scan, so that the TxRatio would be large. This was the 
simplest one that I could come up with. It takes 2.4 s on my machine. Do you 
have an idea for a faster one.


http://gerrit.cloudera.org:8080/#/c/12229/2/tests/query_test/test_observability.py@391
PS2, Line 391:
> flake8: W291 trailing whitespace
Done


http://gerrit.cloudera.org:8080/#/c/12229/3/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/12229/3/tests/query_test/test_observability.py@390
PS3, Line 390: expected_counters = ["TotalBytesRead", "TotalBytesSent", 
"TotalScanBytesSent",
> Should we add small tests for the other new metrics you have?
I added a second test that checks all counters are present, and amended this 
one to check that the *BytesSent counters add up.



--
To view, visit http://gerrit.cloudera.org:8080/12229
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ife7ec78fe42558429c1cbe6e5eba79842bffd648
Gerrit-Change-Number: 12229
Gerrit-PatchSet: 4
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Fri, 18 Jan 2019 23:20:41 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7731: Add Read/Exchange counters to profile

2019-01-18 Thread Lars Volker (Code Review)
Hello Philip Zeyliger, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12229

to look at the new patch set (#4).

Change subject: IMPALA-7731: Add Read/Exchange counters to profile
..

IMPALA-7731: Add Read/Exchange counters to profile

Selective scans (and by extension selective fragment instances)
take higher performance hits when reading data remotely. They can
be identified by a low ratio between data being transmitted vs data
being read from HDFS.

This change adds several counters to the profile to make it easier to
identify queries based on their scan instance selectivity.

* TotalBytesSent - The total number of bytes sent by a query in
  exchange nodes. Does not include remote reads, data written to disk,
  or data sent to the client.

* TotalScanBytesSent - The total number of bytes sent by fragment
  instances that had a scan node in their plan.

* TotalInnerBytesSent - The total number of bytes sent by fragment
  instances that did not have a scan node in their plan, i.e. that
  received their input data from other instances through exchange node.

* ExchangeScanRatio - The ratio between TotalScanBytesSent and
  TotalBytesRead, i.e. the selectivity over all fragment instances that
  had a scan node in their plan. This counter is also added to each
  fragment instance.

* InnerNodeSelectivityRatio - The ratio between bytes sent by instances
  with a scan node in their plan and instances without a scan node in
  their plan. This indicates how well the inner nodes of the execution
  plan reduced the data volume.

Change-Id: Ife7ec78fe42558429c1cbe6e5eba79842bffd648
---
M be/src/exec/scan-node.cc
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/krpc-data-stream-sender.cc
M be/src/runtime/krpc-data-stream-sender.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/util/pretty-printer.h
M tests/query_test/test_observability.py
12 files changed, 141 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/29/12229/4
--
To view, visit http://gerrit.cloudera.org:8080/12229
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ife7ec78fe42558429c1cbe6e5eba79842bffd648
Gerrit-Change-Number: 12229
Gerrit-PatchSet: 4
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 


[Impala-ASF-CR] IMPALA-7832: Support for IF NOT EXISTS in ALTER TABLE ADD COLUMN(S)

2019-01-18 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12181 )

Change subject: IMPALA-7832: Support for IF NOT EXISTS in ALTER TABLE ADD 
COLUMN(S)
..


Patch Set 8:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/12181/3/fe/src/main/cup/sql-parser.cup
File fe/src/main/cup/sql-parser.cup:

http://gerrit.cloudera.org:8080/#/c/12181/3/fe/src/main/cup/sql-parser.cup@1097
PS3, Line 1097:   KW_ALTER KW_TABLE table_name:table KW_ADD KW_COLUMN 
if_not_exists_val:if_not_exists
> Is there anything in particular that you have in mind to be pulled out into
Each statement here starts with ALTER TABLE . In parser-speak:

alter_tbl_stmt ::=
  KW_ALTER KW_TABLE table_name:table alter_table_tail

alter_table_tail ::=
  KW_ADD KW_COLUMN ...
| KW_ADD ...
| KW_DROP ...

Would need to pass the column name around somehow, which is probably why we 
have the current structure.

Not a huge issue, we can always clean this up later.


http://gerrit.cloudera.org:8080/#/c/12181/8/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/12181/8/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2048
PS8, Line 2048:   Column col = tbl.getColumn(column.getColumnName());
What is the concurrency model? Suppose I start 100 threads each adding the same 
columns simultaneously. Is there a race condition between the check here and 
the actual add below?

Also, suppose I add 50 threads which remove the same columns. Is there a race 
condition where we don't add a column because it occurs in our cache, though it 
has actually been removed from the table in HMS?

We can't fix this issue, really, because there is no global locking. Still, we 
should think about expected behavior.


http://gerrit.cloudera.org:8080/#/c/12181/8/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java:

http://gerrit.cloudera.org:8080/#/c/12181/8/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@328
PS8, Line 328: AnalyzesOk("alter table functional.alltypes add column if 
not exists INT_COL int");
Just a side note: While this kind of test is better than no test, it is rather 
limited. Doing absolutely nothing in analyze() will allow this test to pass.

Would be better to do a sampling of deeper tests. That is, get back the 
statement and probe to ensure that the table name was set and resolved. That 
the column names and types were set.

Such checks are implicitly done later, but catching bugs is easier of we verify 
the data structures at each step.

Adding deep tests is a bit more work, so not required here as we've not done 
this in the past. Offering it as something to consider in the future. (I've 
started adding such tests for the analyzer, etc.)



--
To view, visit http://gerrit.cloudera.org:8080/12181
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I60ed22c8a8eefa10e94ad3dedf32fe67c16642d9
Gerrit-Change-Number: 12181
Gerrit-PatchSet: 8
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Comment-Date: Fri, 18 Jan 2019 23:01:00 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7731: Add Read/Exchange counters to profile

2019-01-18 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12229 )

Change subject: IMPALA-7731: Add Read/Exchange counters to profile
..


Patch Set 3: Code-Review+2

(4 comments)

Thanks!

http://gerrit.cloudera.org:8080/#/c/12229/3/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/12229/3/be/src/runtime/coordinator-backend-state.cc@524
PS3, Line 524: //if (!p->metadata().__isset.plan_node_id) continue;
??


http://gerrit.cloudera.org:8080/#/c/12229/3/be/src/runtime/coordinator-backend-state.cc@531
PS3, Line 531: RuntimeProfile::Counter* bytes_sent = 
p->GetCounter("TotalBytesSent");
This this be a constant somewhere?


http://gerrit.cloudera.org:8080/#/c/12229/3/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/12229/3/be/src/runtime/coordinator.cc@790
PS3, Line 790:   // node.
Is it the case that

TotalBytesSent == TotalScanBytesSent + TotalInnerBytesSent

?

Do we have a place to DCHECK assert that and/or test it?


http://gerrit.cloudera.org:8080/#/c/12229/3/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/12229/3/tests/query_test/test_observability.py@390
PS3, Line 390:   def test_exchange_scan_ratio_non_zero(self):
Should we add small tests for the other new metrics you have?



--
To view, visit http://gerrit.cloudera.org:8080/12229
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ife7ec78fe42558429c1cbe6e5eba79842bffd648
Gerrit-Change-Number: 12229
Gerrit-PatchSet: 3
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Fri, 18 Jan 2019 22:40:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7832: Support for IF NOT EXISTS in ALTER TABLE ADD COLUMN(S)

2019-01-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12181 )

Change subject: IMPALA-7832: Support for IF NOT EXISTS in ALTER TABLE ADD 
COLUMN(S)
..


Patch Set 8:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1829/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12181
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I60ed22c8a8eefa10e94ad3dedf32fe67c16642d9
Gerrit-Change-Number: 12181
Gerrit-PatchSet: 8
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Comment-Date: Fri, 18 Jan 2019 22:38:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7731: Add Read/Exchange counters to profile

2019-01-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12229 )

Change subject: IMPALA-7731: Add Read/Exchange counters to profile
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1828/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12229
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ife7ec78fe42558429c1cbe6e5eba79842bffd648
Gerrit-Change-Number: 12229
Gerrit-PatchSet: 3
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Fri, 18 Jan 2019 22:24:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3323: Fix unrecognizable shell option when --config file is specified

2019-01-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12245 )

Change subject: IMPALA-3323: Fix unrecognizable shell option when --config_file 
is specified
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1827/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12245
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iff371d038fa77ba659e9b7c7a4ed5b374237f2ea
Gerrit-Change-Number: 12245
Gerrit-PatchSet: 3
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Brown 
Gerrit-Comment-Date: Fri, 18 Jan 2019 22:18:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7832: Support for IF NOT EXISTS in ALTER TABLE ADD COLUMN(S)

2019-01-18 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12181 )

Change subject: IMPALA-7832: Support for IF NOT EXISTS in ALTER TABLE ADD 
COLUMN(S)
..


Patch Set 8: Code-Review+1

(5 comments)

Carry Bharath's +1.

http://gerrit.cloudera.org:8080/#/c/12181/7/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/12181/7/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@612
PS7, Line 612:
> nit: Column(s) have.., here and below.
Done


http://gerrit.cloudera.org:8080/#/c/12181/7/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2046
PS7, Line 2046: colsToAdd = new Ar
> nit: I think colsToAdd will be more clear. Please ignore if you disagree.
Yeah that's a better name. Done.


http://gerrit.cloudera.org:8080/#/c/12181/7/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2049
PS7, Line 2049: if (ifNotExists && col != null) continue;
  :   if (col != null) {
  :
> single line and else if below?
Done


http://gerrit.cloudera.org:8080/#/c/12181/7/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2053
PS7, Line 2053: tName(), tbl.getN
> We are past analysis at this point. Throw a CatalogException? Also, include
Done


http://gerrit.cloudera.org:8080/#/c/12181/7/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2057
PS7, Line 2057:
> we don't use it anymore, update?
Done



--
To view, visit http://gerrit.cloudera.org:8080/12181
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I60ed22c8a8eefa10e94ad3dedf32fe67c16642d9
Gerrit-Change-Number: 12181
Gerrit-PatchSet: 8
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Comment-Date: Fri, 18 Jan 2019 22:04:27 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7832: Support for IF NOT EXISTS in ALTER TABLE ADD COLUMN(S)

2019-01-18 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has uploaded a new patch set (#8). ( 
http://gerrit.cloudera.org:8080/12181 )

Change subject: IMPALA-7832: Support for IF NOT EXISTS in ALTER TABLE ADD 
COLUMN(S)
..

IMPALA-7832: Support for IF NOT EXISTS in ALTER TABLE ADD COLUMN(S)

This patch adds IF NOT EXISTS support in ALTER TABLE ADD COLUMN and
ALTER TABLE ADD COLUMNS. If IF NOT EXISTS is specified and a column
already exists with this name, no error is thrown. If IF NOT EXISTS
is specified for multiple columns and a column already exists, no
error is thrown and a new column that does not exist will be added.

Syntax:
ALTER TABLE tbl ADD COLUMN [IF NOT EXISTS] i int
ALTER TABLE tbl ADD [IF NOT EXISTS] COLUMNS (i int, j int)

Testing:
- Added new FE tests
- Ran all FE tests
- Updated E2E DDL tests
- Ran all E2E DDL tests

Change-Id: I60ed22c8a8eefa10e94ad3dedf32fe67c16642d9
---
M common/thrift/JniCatalog.thrift
M fe/src/main/cup/sql-parser.cup
R fe/src/main/java/org/apache/impala/analysis/AlterTableAddColsStmt.java
C fe/src/main/java/org/apache/impala/analysis/AlterTableReplaceColsStmt.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/AuthorizationStmtTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M testdata/workloads/functional-query/queries/QueryTest/alter-table.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test
11 files changed, 398 insertions(+), 182 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/81/12181/8
--
To view, visit http://gerrit.cloudera.org:8080/12181
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I60ed22c8a8eefa10e94ad3dedf32fe67c16642d9
Gerrit-Change-Number: 12181
Gerrit-PatchSet: 8
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 


[Impala-ASF-CR] IMPALA-7731: Add Read/Exchange counters to profile

2019-01-18 Thread Lars Volker (Code Review)
Hello Philip Zeyliger, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12229

to look at the new patch set (#3).

Change subject: IMPALA-7731: Add Read/Exchange counters to profile
..

IMPALA-7731: Add Read/Exchange counters to profile

Selective scans (and by extension selective fragment instances)
take higher performance hits when reading data remotely. They can
be identified by a low ratio between data being transmitted vs data
being read from HDFS.

This change adds several counters to the profile to make it easier to
identify queries based on their scan instance selectivity.

* TotalBytesRead - The total number of bytes read by a query.

* TotalBytesSent - The total number of bytes sent by a query in
  exchange nodes. Does not include remote reads, data written to disk,
  or data sent to the client.

* TotalScanBytesSent - The total number of bytes sent by fragment
  instances that had a scan node in their plan.

* TotalInnerBytesSent - The total number of bytes sent by fragment
  instances that did not have a scan node in their plan, i.e. that
  received their input data from other instances through exchange node.

* ExchangeScanRatio - The ratio between TotalScanBytesSent and
  TotalBytesRead, i.e. the selectivity over all fragment instances that
  had a scan node in their plan. This counter is also added to each
  fragment instance.

* InnerNodeSelectivityRatio - The ratio between bytes sent by instances
  with a scan node in their plan and instances without a scan node in
  their plan. This indicates how well the inner nodes of the execution
  plan reduced the data volume.

Change-Id: Ife7ec78fe42558429c1cbe6e5eba79842bffd648
---
M be/src/exec/scan-node.cc
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/krpc-data-stream-sender.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/util/pretty-printer.h
M tests/query_test/test_observability.py
11 files changed, 112 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/29/12229/3
--
To view, visit http://gerrit.cloudera.org:8080/12229
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ife7ec78fe42558429c1cbe6e5eba79842bffd648
Gerrit-Change-Number: 12229
Gerrit-PatchSet: 3
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 


[Impala-ASF-CR] IMPALA-3323: Fix unrecognizable shell option when --config file is specified

2019-01-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12245 )

Change subject: IMPALA-3323: Fix unrecognizable shell option when --config_file 
is specified
..


Patch Set 2:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/1826/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/12245
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iff371d038fa77ba659e9b7c7a4ed5b374237f2ea
Gerrit-Change-Number: 12245
Gerrit-PatchSet: 2
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Brown 
Gerrit-Comment-Date: Fri, 18 Jan 2019 21:52:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3323: Fix unrecognizable shell option when --config file is specified

2019-01-18 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/12245 )

Change subject: IMPALA-3323: Fix unrecognizable shell option when --config_file 
is specified
..

IMPALA-3323: Fix unrecognizable shell option when --config_file is specified

Impala shell defines a dictionary of default values for some shell
options. Before this patch, the logic for --config_file checks if
a shell option exists by using the default value dictionary, which
does not contain the exhaustive list of shell options. This causes
a valid option in the Impala shell config file to be treated as
unrecognizable shell option due to the option not having a default
value. The patch fixes the issue by changing the logic that checks
for the existence of an option using the option list from optparse.
The patch also fixes the missing dest parameter for ldap_password_cmd
option.

Testing:
- Updated test_shell_commandline::test_config_file
- Ran all shell tests

Change-Id: Iff371d038fa77ba659e9b7c7a4ed5b374237f2ea
---
M shell/impala_shell.py
M shell/option_parser.py
M tests/shell/good_impalarc
M tests/shell/test_shell_commandline.py
4 files changed, 21 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/12245/3
--
To view, visit http://gerrit.cloudera.org:8080/12245
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iff371d038fa77ba659e9b7c7a4ed5b374237f2ea
Gerrit-Change-Number: 12245
Gerrit-PatchSet: 3
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Brown 


[Impala-ASF-CR] IMPALA-3323: Fix unrecognizable shell option when --config file is specified

2019-01-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12245 )

Change subject: IMPALA-3323: Fix unrecognizable shell option when --config_file 
is specified
..


Patch Set 1:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/1825/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/12245
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iff371d038fa77ba659e9b7c7a4ed5b374237f2ea
Gerrit-Change-Number: 12245
Gerrit-PatchSet: 1
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Brown 
Gerrit-Comment-Date: Fri, 18 Jan 2019 21:33:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7988: support loading data with dockerized Impalas

2019-01-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12189 )

Change subject: IMPALA-7988: support loading data with dockerized Impalas
..


Patch Set 6: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/12189
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I98fb9c4f5a3a3bb15c7809eab28ec8e5f63ff517
Gerrit-Change-Number: 12189
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 Jan 2019 21:33:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7988: support loading data with dockerized Impalas

2019-01-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/12189 )

Change subject: IMPALA-7988: support loading data with dockerized Impalas
..

IMPALA-7988: support loading data with dockerized Impalas

This patch does the work to load data and run some end-to-end
query tests on a dockerised cluster. Changes were required
in start-impala-cluster.py/ImpalaCluster and in some configuration
files.

ImpalaCluster is used for various things, including discovering
service ports and testing for cluster readiness. This patch adds
basic support and uses it from start-impala-cluster.py to check
for cluster readiness. Some logic is moved from
start-impala-cluster.py to ImpalaCluster.

Limitations:
* We're fairly inconsistent about whether services listen only on
  a single interface (e.g. loopback, traditionally) or whether it
  listens on all interfaces. This doesn't fix all of those issues.
  E.g. HDFS datanodes listen on all interfaces to work around
  some issues.
* Many tests don't pass yet, particularly those using
  ImpalaCluster(), which isn't initialised with the appropriate
  docker arguments.

Testing:
Did a full data load locally using a dockerised Impala cluster:

  START_CLUSTER_ARGS="--docker_network=impala-cluster" \
  TEST_START_CLUSTER_ARGS="--docker_network=impala-cluster" \
  ./buildall.sh -format -testdata -ninja -notests -skiptests -noclean

Ran a selection of end-to-end tests touching HDFS, Kudu and HBase
tables after I loaded data locally.

Ran exhaustive tests with non-dockerised impala cluster.

Change-Id: I98fb9c4f5a3a3bb15c7809eab28ec8e5f63ff517
Reviewed-on: http://gerrit.cloudera.org:8080/12189
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins 
---
M bin/impala-config.sh
M bin/start-impala-cluster.py
M fe/src/test/resources/hbase-site.xml.template
M fe/src/test/resources/hive-default.xml
M fe/src/test/resources/mysql-hive-site.xml.template
M fe/src/test/resources/postgresql-hive-site.xml.template
M testdata/bin/create-load-data.sh
M testdata/bin/run-all.sh
M testdata/cluster/node_templates/common/etc/hadoop/conf/hdfs-site.xml.tmpl
M testdata/cluster/node_templates/common/etc/kudu/tserver.conf.tmpl
M tests/common/impala_cluster.py
11 files changed, 358 insertions(+), 229 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/12189
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I98fb9c4f5a3a3bb15c7809eab28ec8e5f63ff517
Gerrit-Change-Number: 12189
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] Update ASF copyright to current year, 2019

2019-01-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12242 )

Change subject: Update ASF copyright to current year, 2019
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3655/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/12242
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia76bfc03122ebdaedbe1cfdd4e166228c399257f
Gerrit-Change-Number: 12242
Gerrit-PatchSet: 1
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 18 Jan 2019 21:26:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7694: Add host resource usage metrics to profile

2019-01-18 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12069 )

Change subject: IMPALA-7694: Add host resource usage metrics to profile
..


Patch Set 10:

> What was the bug you found?

For step size 'step', the number of values we print is (num + 1) / step, but 
the code used num/step which printed the wrong number for 127 samples.


--
To view, visit http://gerrit.cloudera.org:8080/12069
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3aedc20c553ab8d7ed50f72a1a936eba151487d9
Gerrit-Change-Number: 12069
Gerrit-PatchSet: 10
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 Jan 2019 21:22:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3323: Fix unrecognizable shell option when --config file is specified

2019-01-18 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12245 )

Change subject: IMPALA-3323: Fix unrecognizable shell option when --config_file 
is specified
..


Patch Set 2:

(2 comments)

Fixed some flake8 errors.

http://gerrit.cloudera.org:8080/#/c/12245/1/shell/option_parser.py
File shell/option_parser.py:

http://gerrit.cloudera.org:8080/#/c/12245/1/shell/option_parser.py@55
PS1, Line 55: raise InvalidOptionValueError("Unexpected value in 
configuration file. '" + value
> flake8: E302 expected 2 blank lines, found 1
Done


http://gerrit.cloudera.org:8080/#/c/12245/1/shell/option_parser.py@75
PS1, Line 75: else:
> flake8: E302 expected 2 blank lines, found 1
Done



--
To view, visit http://gerrit.cloudera.org:8080/12245
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iff371d038fa77ba659e9b7c7a4ed5b374237f2ea
Gerrit-Change-Number: 12245
Gerrit-PatchSet: 2
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Brown 
Gerrit-Comment-Date: Fri, 18 Jan 2019 21:20:28 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3323: Fix unrecognizable shell option when --config file is specified

2019-01-18 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/12245 )

Change subject: IMPALA-3323: Fix unrecognizable shell option when --config_file 
is specified
..

IMPALA-3323: Fix unrecognizable shell option when --config_file is specified

Impala shell defines a dictionary of default values for some shell
options. Before this patch, the logic for --config_file checks if
a shell option exists by using the default value dictionary, which
does not contain the exhaustive list of shell options. This causes
a valid option in the Impala shell config file to be treated as
unrecognizable shell option due to the option not having a default
value. The patch fixes the issue by changing the logic that checks
for the existence of an option using the option list from optparse.
The patch also fixes the missing dest parameter for ldap_password_cmd
option.

Testing:
- Updated test_shell_commandline::test_config_file
- Ran all shell tests

Change-Id: Iff371d038fa77ba659e9b7c7a4ed5b374237f2ea
---
M shell/impala_shell.py
M shell/option_parser.py
M tests/shell/good_impalarc
M tests/shell/test_shell_commandline.py
4 files changed, 21 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/12245/2
--
To view, visit http://gerrit.cloudera.org:8080/12245
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iff371d038fa77ba659e9b7c7a4ed5b374237f2ea
Gerrit-Change-Number: 12245
Gerrit-PatchSet: 2
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Brown 


[Impala-ASF-CR] IMPALA-8073: Fix FE tests that leak HMS connections

2019-01-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12241 )

Change subject: IMPALA-8073: Fix FE tests that leak HMS connections
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1824/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12241
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I56930bc5f7a7bda4db11d4447461f907b1017b91
Gerrit-Change-Number: 12241
Gerrit-PatchSet: 6
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 Jan 2019 21:17:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8073: Fix FE tests that leak HMS connections

2019-01-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12241 )

Change subject: IMPALA-8073: Fix FE tests that leak HMS connections
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1823/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12241
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I56930bc5f7a7bda4db11d4447461f907b1017b91
Gerrit-Change-Number: 12241
Gerrit-PatchSet: 5
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 Jan 2019 21:16:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3323: Fix unrecognizable shell option when --config file is specified

2019-01-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12245 )

Change subject: IMPALA-3323: Fix unrecognizable shell option when --config_file 
is specified
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/12245/1/shell/option_parser.py
File shell/option_parser.py:

http://gerrit.cloudera.org:8080/#/c/12245/1/shell/option_parser.py@55
PS1, Line 55: def parse_shell_options(options, defaults, option_list):
flake8: E302 expected 2 blank lines, found 1


http://gerrit.cloudera.org:8080/#/c/12245/1/shell/option_parser.py@75
PS1, Line 75: def get_config_from_file(config_filename, option_list):
flake8: E302 expected 2 blank lines, found 1



--
To view, visit http://gerrit.cloudera.org:8080/12245
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iff371d038fa77ba659e9b7c7a4ed5b374237f2ea
Gerrit-Change-Number: 12245
Gerrit-PatchSet: 1
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Brown 
Gerrit-Comment-Date: Fri, 18 Jan 2019 21:15:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3323: Fix unrecognizable shell option when --config file is specified

2019-01-18 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/12245


Change subject: IMPALA-3323: Fix unrecognizable shell option when --config_file 
is specified
..

IMPALA-3323: Fix unrecognizable shell option when --config_file is specified

Impala shell defines a dictionary of default values for some shell
options. Before this patch, the logic for --config_file checks if
a shell option exists by using the default value dictionary, which
does not contain the exhaustive list of shell options. This causes
a valid option in the Impala shell config file to be treated as
unrecognizable shell option due to the option not having a default
value. The patch fixes the issue by changing the logic that checks
for the existence of an option using the option list from optparse.
The patch also fixes the missing dest parameter for ldap_password_cmd
option.

Testing:
- Updated test_shell_commandline::test_config_file
- Ran all shell tests

Change-Id: Iff371d038fa77ba659e9b7c7a4ed5b374237f2ea
---
M shell/impala_shell.py
M shell/option_parser.py
M tests/shell/good_impalarc
M tests/shell/test_shell_commandline.py
4 files changed, 14 insertions(+), 10 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/12245/1
--
To view, visit http://gerrit.cloudera.org:8080/12245
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Iff371d038fa77ba659e9b7c7a4ed5b374237f2ea
Gerrit-Change-Number: 12245
Gerrit-PatchSet: 1
Gerrit-Owner: Fredy Wijaya 


[Impala-ASF-CR] IMPALA-6664: Tag log statements with fragment or query ids.

2019-01-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12129 )

Change subject: IMPALA-6664: Tag log statements with fragment or query ids.
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1822/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12129
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6634ef9d1a7346339f24f2d40a7a3aa36a535da8
Gerrit-Change-Number: 12129
Gerrit-PatchSet: 6
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Comment-Date: Fri, 18 Jan 2019 21:08:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8092: Add an admission controller debug page

2019-01-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12244 )

Change subject: IMPALA-8092: Add an admission controller debug page
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1821/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12244
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iff055d9709ea1bcc2f492adcde92241b6149f766
Gerrit-Change-Number: 12244
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 Jan 2019 20:51:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8073: Fix FE tests that leak HMS connections

2019-01-18 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12241 )

Change subject: IMPALA-8073: Fix FE tests that leak HMS connections
..


Patch Set 6:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/12241/5/fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java
File fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java:

http://gerrit.cloudera.org:8080/#/c/12241/5/fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java@851
PS5, Line 851:   // running the query with authtest/hostn...@realm.com user 
which is converted to
> line too long (91 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/12241/5/fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java@1032
PS5, Line 1032:   // Create an analysis context + FE with the test user
> line too long (91 > 90)
Done



--
To view, visit http://gerrit.cloudera.org:8080/12241
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I56930bc5f7a7bda4db11d4447461f907b1017b91
Gerrit-Change-Number: 12241
Gerrit-PatchSet: 6
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 Jan 2019 20:47:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8073: Fix FE tests that leak HMS connections

2019-01-18 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has uploaded a new patch set (#6). ( 
http://gerrit.cloudera.org:8080/12241 )

Change subject: IMPALA-8073: Fix FE tests that leak HMS connections
..

IMPALA-8073: Fix FE tests that leak HMS connections

This patch fixes FE tests that leak the HMS connections. I was to
reproduce the issue by in SentryTestProxy by looping the test multiple
times.

Testing:
- Ran SentryProxyTest in a loop without any issue.
- Ran all FE tests

Change-Id: I56930bc5f7a7bda4db11d4447461f907b1017b91
---
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
M fe/src/test/java/org/apache/impala/analysis/AuditingTest.java
M fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java
M fe/src/test/java/org/apache/impala/analysis/StmtMetadataLoaderTest.java
M fe/src/test/java/org/apache/impala/catalog/CatalogObjectToFromThriftTest.java
M fe/src/test/java/org/apache/impala/catalog/CatalogTest.java
M fe/src/test/java/org/apache/impala/catalog/CatalogdTableInvalidatorTest.java
M fe/src/test/java/org/apache/impala/catalog/PartialCatalogInfoTest.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M fe/src/test/java/org/apache/impala/testutil/BlockIdGenerator.java
M fe/src/test/java/org/apache/impala/testutil/ImpaladTestCatalog.java
M fe/src/test/java/org/apache/impala/util/SentryProxyTest.java
12 files changed, 206 insertions(+), 180 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/41/12241/6
--
To view, visit http://gerrit.cloudera.org:8080/12241
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I56930bc5f7a7bda4db11d4447461f907b1017b91
Gerrit-Change-Number: 12241
Gerrit-PatchSet: 6
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-8073: Fix FE tests that leak HMS connections

2019-01-18 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12241 )

Change subject: IMPALA-8073: Fix FE tests that leak HMS connections
..


Patch Set 5:

- Updated Catalog interface to implement AutoCloseable.
- Fixed all other tests that leak HMS connections.


--
To view, visit http://gerrit.cloudera.org:8080/12241
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I56930bc5f7a7bda4db11d4447461f907b1017b91
Gerrit-Change-Number: 12241
Gerrit-PatchSet: 5
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 Jan 2019 20:45:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8073: Fix FE tests that leak HMS connections

2019-01-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12241 )

Change subject: IMPALA-8073: Fix FE tests that leak HMS connections
..


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/12241/5/fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java
File fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java:

http://gerrit.cloudera.org:8080/#/c/12241/5/fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java@851
PS5, Line 851:   // running the query with authtest/hostn...@realm.com user 
which is converted to user
line too long (91 > 90)


http://gerrit.cloudera.org:8080/#/c/12241/5/fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java@1032
PS5, Line 1032:   // Create an analysis context + FE with the test user (as 
defined in the policy file)
line too long (91 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/12241
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I56930bc5f7a7bda4db11d4447461f907b1017b91
Gerrit-Change-Number: 12241
Gerrit-PatchSet: 5
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 Jan 2019 20:45:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8073: Fix FE tests that leak HMS connections

2019-01-18 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/12241 )

Change subject: IMPALA-8073: Fix FE tests that leak HMS connections
..

IMPALA-8073: Fix FE tests that leak HMS connections

This patch fixes FE tests that leak the HMS connections. I was to
reproduce the issue by in SentryTestProxy by looping the test multiple
times.

Testing:
- Ran SentryProxyTest in a loop without any issue.
- Ran all FE tests

Change-Id: I56930bc5f7a7bda4db11d4447461f907b1017b91
---
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
M fe/src/test/java/org/apache/impala/analysis/AuditingTest.java
M fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java
M fe/src/test/java/org/apache/impala/analysis/StmtMetadataLoaderTest.java
M fe/src/test/java/org/apache/impala/catalog/CatalogObjectToFromThriftTest.java
M fe/src/test/java/org/apache/impala/catalog/CatalogTest.java
M fe/src/test/java/org/apache/impala/catalog/CatalogdTableInvalidatorTest.java
M fe/src/test/java/org/apache/impala/catalog/PartialCatalogInfoTest.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M fe/src/test/java/org/apache/impala/testutil/BlockIdGenerator.java
M fe/src/test/java/org/apache/impala/testutil/ImpaladTestCatalog.java
M fe/src/test/java/org/apache/impala/util/SentryProxyTest.java
12 files changed, 205 insertions(+), 180 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/41/12241/5
--
To view, visit http://gerrit.cloudera.org:8080/12241
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I56930bc5f7a7bda4db11d4447461f907b1017b91
Gerrit-Change-Number: 12241
Gerrit-PatchSet: 5
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7694: Add host resource usage metrics to profile

2019-01-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12069 )

Change subject: IMPALA-7694: Add host resource usage metrics to profile
..


Patch Set 10:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1820/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12069
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3aedc20c553ab8d7ed50f72a1a936eba151487d9
Gerrit-Change-Number: 12069
Gerrit-PatchSet: 10
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 Jan 2019 20:44:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6664: Tag log statements with fragment or query ids.

2019-01-18 Thread Philip Zeyliger (Code Review)
Hello Paul Rogers, Zoltan Borok-Nagy, Zoram Thanga, Tim Armstrong, Impala 
Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12129

to look at the new patch set (#6).

Change subject: IMPALA-6664: Tag log statements with fragment or query ids.
..

IMPALA-6664: Tag log statements with fragment or query ids.

This implements much of the desire in IMPALA-6664 to tag all log
statements with their query ids. It re-uses the existing ThreadDebugInfo
infrastructure as well as the existing
InstallLogMessageListenerFunction() patch to glog (currently used for
log redaction) to prefix log messages with fragment ids or query ids,
when available. The fragment id is the query id with the last bits
incremented, so it's possible to correlate a given query's log messages.
For example:

  $ grep 85420d575b9ff4b9:402b8868 logs/cluster/impalad.INFO
  I0108 10:39:16.453958 14752 impala-server.cc:1052] 
85420d575b9ff4b9:402b8868] Registered query 
query_id=85420d575b9ff4b9:402b8868 
session_id=aa45e480434f0516:101ae5ac12679d94
  I0108 10:39:16.454738 14752 Frontend.java:1242] 
85420d575b9ff4b9:402b8868] Analyzing query: select count(*) from 
tpcds.web_sales
  I0108 10:39:16.456627 14752 Frontend.java:1282] 
85420d575b9ff4b9:402b8868] Analysis finished.
  I0108 10:39:16.463538 14818 admission-controller.cc:598] 
85420d575b9ff4b9:402b8868] Schedule for 
id=85420d575b9ff4b9:402b8868 in pool_name=default-pool 
per_host_mem_estimate=180.02 MB PoolConfig: max_requests=-1 max_queued=200 
max_mem=-1.00 B
  I0108 10:39:16.463603 14818 admission-controller.cc:603] 
85420d575b9ff4b9:402b8868] Stats: agg_num_running=0, agg_num_queued=0, 
agg_mem_reserved=0,  local_host(local_mem_admitted=0, num_admitted_running=0, 
num_queued=0, backend_mem_reserved=0)
  I0108 10:39:16.463780 14818 admission-controller.cc:635] 
85420d575b9ff4b9:402b8868] Admitted query 
id=85420d575b9ff4b9:402b8868
  I0108 10:39:16.463896 14818 coordinator.cc:93] 
85420d575b9ff4b9:402b8868] Exec() 
query_id=85420d575b9ff4b9:402b8868 stmt=select count(*) from 
tpcds.web_sales
  I0108 10:39:16.464795 14818 coordinator.cc:356] 
85420d575b9ff4b9:402b8868] starting execution on 2 backends for 
query_id=85420d575b9ff4b9:402b8868
  I0108 10:39:16.466384 24891 impala-internal-service.cc:49] 
ExecQueryFInstances(): query_id=85420d575b9ff4b9:402b8868 
coord=pannier.sf.cloudera.com:22000 #instances=2
  I0108 10:39:16.467339 14818 coordinator.cc:370] 
85420d575b9ff4b9:402b8868] started execution on 2 backends for 
query_id=85420d575b9ff4b9:402b8868
  I0108 10:39:16.467536 14823 query-state.cc:579] 
85420d575b9ff4b9:402b8868] Executing instance. 
instance_id=85420d575b9ff4b9:402b8868 fragment_idx=0 
per_fragment_instance_idx=0 coord_state_idx=0 #in-flight=1
  I0108 10:39:16.467627 14824 query-state.cc:579] 
85420d575b9ff4b9:402b88680001] Executing instance. 
instance_id=85420d575b9ff4b9:402b88680001 fragment_idx=1 
per_fragment_instance_idx=0 coord_state_idx=0 #in-flight=2
  I0108 10:39:16.820933 14824 query-state.cc:587] 
85420d575b9ff4b9:402b88680001] Instance completed. 
instance_id=85420d575b9ff4b9:402b88680001 #in-flight=1 status=OK
  I0108 10:39:17.122299 14823 krpc-data-stream-mgr.cc:294] 
85420d575b9ff4b9:402b8868] DeregisterRecvr(): 
fragment_instance_id=85420d575b9ff4b9:402b8868, node=2
  I0108 10:39:17.123500 24038 coordinator.cc:709] Backend completed: 
host=pannier.sf.cloudera.com:22001 remaining=2 
query_id=85420d575b9ff4b9:402b8868
  I0108 10:39:17.123509 24038 coordinator-backend-state.cc:265] 
query_id=85420d575b9ff4b9:402b8868: first in-progress backend: 
pannier.sf.cloudera.com:22000
  I0108 10:39:17.167752 14752 impala-beeswax-server.cc:197] 
85420d575b9ff4b9:402b8868] get_results_metadata(): 
query_id=85420d575b9ff4b9:402b8868
  I0108 10:39:17.168762 14752 coordinator.cc:483] 
85420d575b9ff4b9:402b8868] ExecState: query 
id=85420d575b9ff4b9:402b8868 execution completed
  I0108 10:39:17.168808 14752 coordinator.cc:608] 
85420d575b9ff4b9:402b8868] Coordinator waiting for backends to finish, 
1 remaining. query_id=85420d575b9ff4b9:402b8868
  I0108 10:39:17.168880 14823 query-state.cc:587] 
85420d575b9ff4b9:402b8868] Instance completed. 
instance_id=85420d575b9ff4b9:402b8868 #in-flight=0 status=OK
  I0108 10:39:17.168977 14821 query-state.cc:252] UpdateBackendExecState(): 
last report for 85420d575b9ff4b9:402b8868
  I0108 10:39:17.174401 24038 coordinator.cc:709] Backend completed: 
host=pannier.sf.cloudera.com:22000 remaining=1 
query_id=85420d575b9ff4b9:402b8868
  I0108 10:39:17.174513 14752 coordinator.cc:814] 
85420d575b9ff4b9:402b8868] Release admission control 

[Impala-ASF-CR] IMPALA-7694: Add host resource usage metrics to profile

2019-01-18 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12069 )

Change subject: IMPALA-7694: Add host resource usage metrics to profile
..


Patch Set 10: Code-Review+2

What was the bug you found?


--
To view, visit http://gerrit.cloudera.org:8080/12069
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3aedc20c553ab8d7ed50f72a1a936eba151487d9
Gerrit-Change-Number: 12069
Gerrit-PatchSet: 10
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 Jan 2019 20:32:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8092: Add an admission controller debug page

2019-01-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12244 )

Change subject: IMPALA-8092: Add an admission controller debug page
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12244/1/be/src/service/impala-http-handler.cc
File be/src/service/impala-http-handler.cc:

http://gerrit.cloudera.org:8080/#/c/12244/1/be/src/service/impala-http-handler.cc@866
PS1, Line 866: void ImpalaHttpHandler::AdmissionControllerStateHandler(const 
Webserver::ArgumentMap& args,
line too long (91 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/12244
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iff055d9709ea1bcc2f492adcde92241b6149f766
Gerrit-Change-Number: 12244
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 Jan 2019 20:14:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8092: Add an admission controller debug page

2019-01-18 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/12244


Change subject: IMPALA-8092: Add an admission controller debug page
..

IMPALA-8092: Add an admission controller debug page

This patch adds a new debug page "admission_controller" that provides
the following details about all resource pools:
- Pool configuration
- Relevant pool stats
- Queued Queries in order of being queued (local to the coordinator)
- Running Queries (local to this coordinator)
- Histogram of the distribution of peak memory used by queries admitted
  to the pool

The aforementioned details are also available as JSON from the same
http endpoint.

Change-Id: Iff055d9709ea1bcc2f492adcde92241b6149f766
---
M be/src/runtime/coordinator.cc
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/admission-controller.h
M be/src/service/impala-http-handler.cc
M be/src/service/impala-http-handler.h
A www/admission_controller.tmpl
6 files changed, 659 insertions(+), 9 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/12244/1
--
To view, visit http://gerrit.cloudera.org:8080/12244
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Iff055d9709ea1bcc2f492adcde92241b6149f766
Gerrit-Change-Number: 12244
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 


[Impala-ASF-CR] IMPALA-7694: Add host resource usage metrics to profile

2019-01-18 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12069 )

Change subject: IMPALA-7694: Add host resource usage metrics to profile
..


Patch Set 10:

(5 comments)

Thanks for the reviews. Please see PS10.

http://gerrit.cloudera.org:8080/#/c/12069/8/be/src/util/pretty-printer.h
File be/src/util/pretty-printer.h:

http://gerrit.cloudera.org:8080/#/c/12069/8/be/src/util/pretty-printer.h@139
PS8, Line 139:   // Printed as integer values
 :   case TUnit::BASIS_POINTS: {
 : DCHECK_LE(value, 1);
 : ss << (value / 100);
 :
> Add "Prints integer values" as a comment on line 139-140, and we can call i
Done


http://gerrit.cloudera.org:8080/#/c/12069/8/be/src/util/runtime-profile.cc
File be/src/util/runtime-profile.cc:

http://gerrit.cloudera.org:8080/#/c/12069/8/be/src/util/runtime-profile.cc@793
PS8, Line 793: stream << endl;
> (Showing X of Y values from Thrift Profile)?
Done


http://gerrit.cloudera.org:8080/#/c/12069/8/be/src/util/system-state-info.h
File be/src/util/system-state-info.h:

http://gerrit.cloudera.org:8080/#/c/12069/8/be/src/util/system-state-info.h@70
PS8, Line 70:   enum PROC_STAT_CPU_VALUES {
: CPU_USER,
: CPU_NICE,
: CPU_SYSTEM,
: CPU_IDLE,
: CPU_IOWAIT
:   };
:
:   /// We store the CPU usage values in an array so that
> I think I do prefer a struct, but I'm open to leaving it the way it is.
I checked with Tim and we prefer to leave it the way it is for now.


http://gerrit.cloudera.org:8080/#/c/12069/8/be/src/util/system-state-info.cc
File be/src/util/system-state-info.cc:

http://gerrit.cloudera.org:8080/#/c/12069/8/be/src/util/system-state-info.cc@33
PS8, Line 33: SystemStateInfo::SystemStateInfo() {
> Indeed, TIL. It looks cleaner to me, although I had to search the internet
I saw that this now angers clang-tidy:

  /home/ubuntu/Impala/be/src/util/system-state-info.cc:33:52: warning: missing 
field 'system' initializer [clang-diagnostic-missing-field-initializers]

I'll wait for you thoughts on array vs struct in general, but if we stay with 
an array I'd rather use memset than add #pragma lines to ignore this.

Alternatively we could disable the warning in clang-tidy based on this 
discussion that points out that newer versions of GCC don't warn about this 
anymore.


http://gerrit.cloudera.org:8080/#/c/12069/8/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/12069/8/tests/query_test/test_observability.py@419
PS8, Line 419: """Tests that the thrift profile contains a time series 
counter for CPU resource
> Roughly, a C++ unit test could do:
Thanks for the pointer, I added such a test. It turned out a bit more 
complicated than that, but I also found a bug. :)



--
To view, visit http://gerrit.cloudera.org:8080/12069
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3aedc20c553ab8d7ed50f72a1a936eba151487d9
Gerrit-Change-Number: 12069
Gerrit-PatchSet: 10
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 Jan 2019 20:10:16 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7694: Add host resource usage metrics to profile

2019-01-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12069 )

Change subject: IMPALA-7694: Add host resource usage metrics to profile
..


Patch Set 10:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/12069/10/bin/plot_profile_resource_usage.py
File bin/plot_profile_resource_usage.py:

http://gerrit.cloudera.org:8080/#/c/12069/10/bin/plot_profile_resource_usage.py@27
PS10, Line 27: d
flake8: E501 line too long (162 > 90 characters)


http://gerrit.cloudera.org:8080/#/c/12069/10/bin/plot_profile_resource_usage.py@31
PS10, Line 31: from impala_py_lib import profiles
flake8: E402 module level import not at top of file


http://gerrit.cloudera.org:8080/#/c/12069/10/bin/plot_profile_resource_usage.py@32
PS10, Line 32: import argparse
flake8: E402 module level import not at top of file


http://gerrit.cloudera.org:8080/#/c/12069/10/bin/plot_profile_resource_usage.py@33
PS10, Line 33: import sys
flake8: E402 module level import not at top of file


http://gerrit.cloudera.org:8080/#/c/12069/10/bin/plot_profile_resource_usage.py@34
PS10, Line 34: import matplotlib
flake8: E402 module level import not at top of file


http://gerrit.cloudera.org:8080/#/c/12069/10/bin/plot_profile_resource_usage.py@36
PS10, Line 36: import matplotlib.pyplot as plt
flake8: E402 module level import not at top of file


http://gerrit.cloudera.org:8080/#/c/12069/10/lib/python/impala_py_lib/profiles.py
File lib/python/impala_py_lib/profiles.py:

http://gerrit.cloudera.org:8080/#/c/12069/10/lib/python/impala_py_lib/profiles.py@28
PS10, Line 28: def decode_profile_line(line):
flake8: E302 expected 2 blank lines, found 1


http://gerrit.cloudera.org:8080/#/c/12069/10/tests/common/impala_service.py
File tests/common/impala_service.py:

http://gerrit.cloudera.org:8080/#/c/12069/10/tests/common/impala_service.py@91
PS10, Line 91: ;
flake8: E703 statement ends with a semicolon


http://gerrit.cloudera.org:8080/#/c/12069/10/tests/observability/test_plot_profile_resource_usage.py
File tests/observability/test_plot_profile_resource_usage.py:

http://gerrit.cloudera.org:8080/#/c/12069/10/tests/observability/test_plot_profile_resource_usage.py@25
PS10, Line 25: class TestPlotProfileResourceUsage(ImpalaTestSuite):
flake8: E302 expected 2 blank lines, found 1


http://gerrit.cloudera.org:8080/#/c/12069/10/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/12069/10/tests/query_test/test_observability.py@369
PS10, Line 369: e
flake8: F841 local variable 'expected_strs' is assigned to but never used



--
To view, visit http://gerrit.cloudera.org:8080/12069
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3aedc20c553ab8d7ed50f72a1a936eba151487d9
Gerrit-Change-Number: 12069
Gerrit-PatchSet: 10
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 Jan 2019 20:10:41 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7694: Add host resource usage metrics to profile

2019-01-18 Thread Lars Volker (Code Review)
Hello Michael Ho, Philip Zeyliger, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12069

to look at the new patch set (#10).

Change subject: IMPALA-7694: Add host resource usage metrics to profile
..

IMPALA-7694: Add host resource usage metrics to profile

This change adds a mechanism to collect host resource usage metrics to
profiles. Metric collection can be controlled through a new query option
'RESOURCE_TRACE_RATIO'. It specifies the probability with which metrics
collection will be enabled. Collection always happens per query for all
executors that run one or more fragment instances of the query.

This mechanism adds a new time series counter class that collects all
measured values and does not re-sample them. It will re-sample values
when printing them into a string profile, preserving up to 64 values,
but Thrift profiles will contain the full list of values.

We add a new section "Per Node Profiles" to the profile to store and
show these values:

Per Node Profiles:
  lv-desktop:22000:
CpuIoWaitPercentage (500.000ms): 0, 0
CpuSysPercentage (500.000ms): 1, 1
CpuUserPercentage (500.000ms): 4, 0
  - ScratchBytesRead: 0
  - ScratchBytesWritten: 0
  - ScratchFileUsedBytes: 0
  - ScratchReads: 0 (0)
  - ScratchWrites: 0 (0)
  - TotalEncryptionTime: 0.000ns
  - TotalReadBlockTime: 0.000ns

This change also uses the aforementioned mechanism to collect CPU usage
metrics (user, system, and IO wait time).

This change also adds a tool to decode a Thrift profile and plot the
contained usage metrics using matplotlib. Example:
https://user-images.githubusercontent.com/151514/49830685-bb7efd80-fd46-11e8-8e23-9f5bc47635c1.png

This change also includes a few minor improvements to make the resulting
code more readable:
- Extend the PeriodicCounterUpdater to call functions to update global
  metrics before updating the counters.
- Expose the scratch profile within the per node resource usage section.
- Improve documentation of the profile counter classes.
- Remove synchronization from StreamingSampler.
- Remove a few pieces of dead code that otherwise would have required
  updates.
- Factor some common code for profile decoding into the Impala python
  library

Testing: This change contains a unit test for the system level metrics
collection and e2e tests for the profile changes. It also contains a
test for the plotting tool.

Change-Id: I3aedc20c553ab8d7ed50f72a1a936eba151487d9
---
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/krpc-data-stream-recvr.cc
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/runtime-state.cc
M be/src/service/impala-server.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M be/src/util/CMakeLists.txt
M be/src/util/periodic-counter-updater.cc
M be/src/util/periodic-counter-updater.h
M be/src/util/pretty-printer.h
M be/src/util/runtime-profile-counters.h
M be/src/util/runtime-profile-test.cc
M be/src/util/runtime-profile.cc
M be/src/util/runtime-profile.h
M be/src/util/streaming-sampler.h
A be/src/util/system-state-info-test.cc
A be/src/util/system-state-info.cc
A be/src/util/system-state-info.h
M bin/parse-thrift-profile.py
A bin/plot_profile_resource_usage.py
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M common/thrift/Metrics.thrift
M common/thrift/RuntimeProfile.thrift
A lib/python/impala_py_lib/profiles.py
M tests/beeswax/impala_beeswax.py
M tests/common/impala_service.py
A tests/observability/test_plot_profile_resource_usage.py
M tests/query_test/test_observability.py
37 files changed, 1,283 insertions(+), 255 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/69/12069/10
--
To view, visit http://gerrit.cloudera.org:8080/12069
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3aedc20c553ab8d7ed50f72a1a936eba151487d9
Gerrit-Change-Number: 12069
Gerrit-PatchSet: 10
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7980: Fix spinning threads because of buggy handling of num unqueued files .

2019-01-18 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12097 )

Change subject: IMPALA-7980: Fix spinning threads because of buggy handling of 
num_unqueued_files_.
..


Patch Set 2:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/12097/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/12097/2//COMMIT_MSG@19
PS2, Line 19: set to 1 initially; decrement at exit points of
: IssueInitialRanges
that seems like the negation of what I'd expect given the name above?


http://gerrit.cloudera.org:8080/#/c/12097/2//COMMIT_MSG@21
PS2, Line 21: base-sequence-scanner.cc.
did you consider an approach like adding something like this to the top level 
base class:

virtual bool AllScanRangesIssued() = 0;

and then in parquet you can implement it as 'return 
initial_scan_ranges_issued_;'

and then all the fancy counter stuff can go into base-sequence-scanner.cc?


http://gerrit.cloudera.org:8080/#/c/12097/2//COMMIT_MSG@38
PS2, Line 38: ~30 threads to be spinning in the kernel. We were able to use 
perf to finda lot
typo


http://gerrit.cloudera.org:8080/#/c/12097/2//COMMIT_MSG@74
PS2, Line 74: a an
typo


http://gerrit.cloudera.org:8080/#/c/12097/2/be/src/exec/hdfs-scan-node-base.h
File be/src/exec/hdfs-scan-node-base.h:

http://gerrit.cloudera.org:8080/#/c/12097/2/be/src/exec/hdfs-scan-node-base.h@266
PS2, Line 266: Should
Must?


http://gerrit.cloudera.org:8080/#/c/12097/2/be/src/exec/hdfs-scan-node-base.h@277
PS2, Line 277: plugins
impala has plugins?

perhaps we can use [[deprecated]] or something here? ATTRIBUTE_DEPRECATED? (do 
we have that in impala's copy of gutil?)


http://gerrit.cloudera.org:8080/#/c/12097/2/be/src/exec/hdfs-scan-node-base.h@287
PS2, Line 287: no_more_add_disk_io_ranges_possible_.Add(delta);
can you DCHECK that the result doesn't go below zero?


http://gerrit.cloudera.org:8080/#/c/12097/2/be/src/exec/hdfs-scan-node-base.h@471
PS2, Line 471: no_more_add_disk_io_ranges_possible_
> I feel like this should be called more_add_disk_io_ranges_possible_, since
yea, I was also confused by the inverted naming here.

If we want to make this reflect its actual numeric value, maybe something like 
remaining_scan_range_addition_steps ? ie the Parquet scanner has one "scan 
range addition step" whereas the text ones have one such step per file? Or 
perhaps remaining_scan_range_issuances_?


http://gerrit.cloudera.org:8080/#/c/12097/2/be/src/exec/hdfs-scan-node-base.cc
File be/src/exec/hdfs-scan-node-base.cc:

http://gerrit.cloudera.org:8080/#/c/12097/2/be/src/exec/hdfs-scan-node-base.cc@453
PS2, Line 453: UpdateNoMoreAddDiskIoRangesBarrier(-1);
is it important to not decrement this in the error cases below on line 471-494? 
ie why not just unconditionally decrement this, in the same way that you 
unconditionally set initial_ranges_issued_ = true on L448?


http://gerrit.cloudera.org:8080/#/c/12097/2/be/src/exec/hdfs-scan-node.h
File be/src/exec/hdfs-scan-node.h:

http://gerrit.cloudera.org:8080/#/c/12097/2/be/src/exec/hdfs-scan-node.h@139
PS2, Line 139:
 :   /// Number of times scanner thread didn't find work to do.
 :   RuntimeProfile::Counter* 
scanner_thread_workless_loops_counter_ = nullptr;
hrm, I see the value of this for your test, but not entirely convinced it makes 
sense to keep around in profiles. Curious on others' opinions



--
To view, visit http://gerrit.cloudera.org:8080/12097
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I133de13238d3d05c510e2ff771d48979125735b1
Gerrit-Change-Number: 12097
Gerrit-PatchSet: 2
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Fri, 18 Jan 2019 19:31:18 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8073: Fix flaky SentryProxyTest

2019-01-18 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12241 )

Change subject: IMPALA-8073: Fix flaky SentryProxyTest
..


Patch Set 3:

Also, you could implement AutoCloseable and use try-with-resources.


--
To view, visit http://gerrit.cloudera.org:8080/12241
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I56930bc5f7a7bda4db11d4447461f907b1017b91
Gerrit-Change-Number: 12241
Gerrit-PatchSet: 3
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 Jan 2019 19:42:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7832: Support for IF NOT EXISTS in ALTER TABLE ADD COLUMN(S)

2019-01-18 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12181 )

Change subject: IMPALA-7832: Support for IF NOT EXISTS in ALTER TABLE ADD 
COLUMN(S)
..


Patch Set 7: Code-Review+1

(5 comments)

Nice test coverage. Paul, looks like you had some comments about refactoring. 
Can you take the final pass?

http://gerrit.cloudera.org:8080/#/c/12181/7/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/12181/7/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@612
PS7, Line 612: Column
nit: Column(s) have.., here and below.


http://gerrit.cloudera.org:8080/#/c/12181/7/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2046
PS7, Line 2046: nonExistingColumns
nit: I think colsToAdd will be more clear. Please ignore if you disagree.


http://gerrit.cloudera.org:8080/#/c/12181/7/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2049
PS7, Line 2049: if (ifNotExists && col != null) {
  : continue;
  :   }
single line and else if below?


http://gerrit.cloudera.org:8080/#/c/12181/7/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2053
PS7, Line 2053: AnalysisException
We are past analysis at this point. Throw a CatalogException? Also, include the 
tbl name?


http://gerrit.cloudera.org:8080/#/c/12181/7/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2057
PS7, Line 2057: and ifNotExists is true, do nothing
we don't use it anymore, update?



--
To view, visit http://gerrit.cloudera.org:8080/12181
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I60ed22c8a8eefa10e94ad3dedf32fe67c16642d9
Gerrit-Change-Number: 12181
Gerrit-PatchSet: 7
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Comment-Date: Fri, 18 Jan 2019 19:41:34 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Update ASF copyright to current year, 2019

2019-01-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12242 )

Change subject: Update ASF copyright to current year, 2019
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1819/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12242
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia76bfc03122ebdaedbe1cfdd4e166228c399257f
Gerrit-Change-Number: 12242
Gerrit-PatchSet: 1
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 18 Jan 2019 19:18:29 +
Gerrit-HasComments: No


[Impala-ASF-CR] Update ASF copyright to current year, 2019

2019-01-18 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12242 )

Change subject: Update ASF copyright to current year, 2019
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/12242
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia76bfc03122ebdaedbe1cfdd4e166228c399257f
Gerrit-Change-Number: 12242
Gerrit-PatchSet: 1
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 18 Jan 2019 19:11:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] Update ASF copyright to current year, 2019

2019-01-18 Thread Jim Apple (Code Review)
Jim Apple has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/12242


Change subject: Update ASF copyright to current year, 2019
..

Update ASF copyright to current year, 2019

Change-Id: Ia76bfc03122ebdaedbe1cfdd4e166228c399257f
---
M NOTICE.txt
1 file changed, 1 insertion(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/12242/1
--
To view, visit http://gerrit.cloudera.org:8080/12242
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia76bfc03122ebdaedbe1cfdd4e166228c399257f
Gerrit-Change-Number: 12242
Gerrit-PatchSet: 1
Gerrit-Owner: Jim Apple 


[Impala-ASF-CR] IMPALA-8073: Fix flaky SentryProxyTest

2019-01-18 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12241 )

Change subject: IMPALA-8073: Fix flaky SentryProxyTest
..


Patch Set 3:

It looks like a few other tests like PartialCatalogInfoTest, CatalogTest, 
LocalCatalogTest, CatalogdTableInvalidatorTest and FrontendTestBase create 
catalogs and I don't think I see them close them - do they need to be fixed too?


--
To view, visit http://gerrit.cloudera.org:8080/12241
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I56930bc5f7a7bda4db11d4447461f907b1017b91
Gerrit-Change-Number: 12241
Gerrit-PatchSet: 3
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 Jan 2019 18:35:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8073: Fix flaky SentryProxyTest

2019-01-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12241 )

Change subject: IMPALA-8073: Fix flaky SentryProxyTest
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1818/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12241
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I56930bc5f7a7bda4db11d4447461f907b1017b91
Gerrit-Change-Number: 12241
Gerrit-PatchSet: 3
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 Jan 2019 18:25:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7988: support loading data with dockerized Impalas

2019-01-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12189 )

Change subject: IMPALA-7988: support loading data with dockerized Impalas
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1817/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12189
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I98fb9c4f5a3a3bb15c7809eab28ec8e5f63ff517
Gerrit-Change-Number: 12189
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 Jan 2019 18:02:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7800: Reject new connections after --fe service threads

2019-01-18 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12226 )

Change subject: IMPALA-7800: Reject new connections after --fe_service_threads
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12226/6/tests/custom_cluster/test_frontend_connection_limit.py
File tests/custom_cluster/test_frontend_connection_limit.py:

http://gerrit.cloudera.org:8080/#/c/12226/6/tests/custom_cluster/test_frontend_connection_limit.py@49
PS6, Line 49:   client.execute(query)
Come to think of it, I don't even think you need to do a query. Just connecting 
eats the thread on the server side, so I don't think you need threading in the 
test at all. Just make 5 impala clients, and they'll each eat the thread.



--
To view, visit http://gerrit.cloudera.org:8080/12226
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4b9d48aaff9e3b5854b5121798211c641c58a559
Gerrit-Change-Number: 12226
Gerrit-PatchSet: 6
Gerrit-Owner: Zoram Thanga 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Fri, 18 Jan 2019 18:00:17 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7800: Reject new connections after --fe service threads

2019-01-18 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12226 )

Change subject: IMPALA-7800: Reject new connections after --fe_service_threads
..


Patch Set 6:

(5 comments)

The historical tidbit that struck my interests is the following:

 * This helps solve IMPALA-4135, where connections were timing out while 
waiting in the
 * OS accept queue, by ensuring that accept() is called as quickly as possible.
 */
class TAcceptQueueServer : public TServer {

So, at some point in the past, users, when connecting to a busy impala, would 
hit a timeout. For our purposes, let's pretend that "busy" is that all 
fe_service_threads are handling a session, and those sessions are ... lasting 
forever. Then, we decided, with IMPALA-4135, to make that "connect timeout" not 
work, by accepting the connection right away, and users would block. And now, 
we have an option to go full circle, where the user will get rejected right 
away, albeit without a timeout, and with an error message.

Part of the problem stems from the fact that "idle" sessions still eat up a 
thread, at least for impala-shell. (I ran bin/start-impala-cluster.py 
--impalad_args=-fe_service_threads=1 and ran two impala-shell's. One of them is 
just stuck, with the TCP connection opened, but progress blocked in 
"self.imp_service.PingImpalaService()" from the perspective of impala-shell.)

What's the motivation to make this change, and, also to default it off? We can 
already tell that a user has N sessions because they're exposed in the UI. I 
think the metrics

impala.thrift-server.beeswax-frontend.connection-setup-queue-size
impala.thrift-server.beeswax-frontend.connections-in-use
impala.thrift-server.beeswax-frontend.total-connections

might be somewhat insufficient to tell us we're in this state, but exposing the 
task queue size directly as a metric is something we can do.

Today, without this flag, if there are many non-interactive, but well-behaved 
users, they can submit 256 queries concurrently, and only 64 of them will run 
at a time because of the connection limit, but everyone will eventually drain 
out. With this flag, we'll see cases where some of this queries get rejected.

http://gerrit.cloudera.org:8080/#/c/12226/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/12226/6//COMMIT_MSG@18
PS6, Line 18: By default we retain the current behavior. If
What's the point of this work if we don't change the default? When do we expect 
it to be used?


http://gerrit.cloudera.org:8080/#/c/12226/6/be/src/rpc/TAcceptQueueServer.cpp
File be/src/rpc/TAcceptQueueServer.cpp:

http://gerrit.cloudera.org:8080/#/c/12226/6/be/src/rpc/TAcceptQueueServer.cpp@166
PS6, Line 166: if (FLAGS_reject_request_on_max_fe_threads) {
 :   Synchronized s(tasksMonitor_);
 :   if (maxTasks_ > 0 && tasks_.size() >= maxTasks_) {
 : LOG(INFO) << "All " << maxTasks_ << " server threads are 
in use. "
 :   << "Rejecting new connection request.";
 : throw TTransportException("Server busy. Please try again 
later.");
 :   }
 : }
Regardless of the value of FLAGS_reject_request_on_max_fe_threads, do you want 
to (a) warn that tasks are sitting the queue in the log?

Note that we already have queue_size_metric_, so maybe it's already monitored.


http://gerrit.cloudera.org:8080/#/c/12226/6/tests/custom_cluster/test_frontend_connection_limit.py
File tests/custom_cluster/test_frontend_connection_limit.py:

http://gerrit.cloudera.org:8080/#/c/12226/6/tests/custom_cluster/test_frontend_connection_limit.py@35
PS6, Line 35:   statement = "select * from tpch_parquet.lineitem limit 5"
Please use a query with a "sleep(...)" in it. 'select * limit 5" can execute 
very quickly, and your test_server_busy() test might be flaky because of it.


http://gerrit.cloudera.org:8080/#/c/12226/6/tests/custom_cluster/test_frontend_connection_limit.py@47
PS6, Line 47: query = self.statement.format()
I don't think you need the .format(), either here or in line 99.

I also don't think you need the variable query, which you only use once, and is 
just self.statement.


http://gerrit.cloudera.org:8080/#/c/12226/6/tests/custom_cluster/test_frontend_connection_limit.py@50
PS6, Line 50: except Exception as e:
:   print 'Unexpected exception: ' + e.message
:   sys.exit(1)
It's not nice to sys.exit() from tests. The exception is enough. Just let it 
percolate out.



--
To view, visit http://gerrit.cloudera.org:8080/12226
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4b9d48aaff9e3b5854b5121798211c641c58a559
Gerrit-Change-Number: 12226
Gerrit-PatchSet: 6
Gerrit-Owner: Zoram Thanga 
Gerrit-Reviewer: Impala Public Jenkins 

[Impala-ASF-CR] IMPALA-8073: Fix flaky SentryProxyTest

2019-01-18 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/12241


Change subject: IMPALA-8073: Fix flaky SentryProxyTest
..

IMPALA-8073: Fix flaky SentryProxyTest

This patch fixes SentryProxyTest that leaks the HMS connections. I was
able to reproduce the issue by looping the test multiple times.

Testing:
- Ran SentryProxyTest in a loop without any issue.

Change-Id: I56930bc5f7a7bda4db11d4447461f907b1017b91
---
M fe/src/test/java/org/apache/impala/util/SentryProxyTest.java
1 file changed, 66 insertions(+), 56 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/41/12241/3
--
To view, visit http://gerrit.cloudera.org:8080/12241
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I56930bc5f7a7bda4db11d4447461f907b1017b91
Gerrit-Change-Number: 12241
Gerrit-PatchSet: 3
Gerrit-Owner: Fredy Wijaya 


[Impala-ASF-CR] IMPALA-7800: Reject new connections after --fe service threads

2019-01-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12226 )

Change subject: IMPALA-7800: Reject new connections after --fe_service_threads
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1816/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12226
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4b9d48aaff9e3b5854b5121798211c641c58a559
Gerrit-Change-Number: 12226
Gerrit-PatchSet: 6
Gerrit-Owner: Zoram Thanga 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Fri, 18 Jan 2019 17:54:33 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7832: Support for IF NOT EXISTS in ALTER TABLE ADD COLUMN(S)

2019-01-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12181 )

Change subject: IMPALA-7832: Support for IF NOT EXISTS in ALTER TABLE ADD 
COLUMN(S)
..


Patch Set 7:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1815/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12181
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I60ed22c8a8eefa10e94ad3dedf32fe67c16642d9
Gerrit-Change-Number: 12181
Gerrit-PatchSet: 7
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Comment-Date: Fri, 18 Jan 2019 17:29:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7800: Reject new connections after --fe service threads

2019-01-18 Thread Zoram Thanga (Code Review)
Zoram Thanga has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/12226


Change subject: IMPALA-7800: Reject new connections after --fe_service_threads
..

IMPALA-7800: Reject new connections after --fe_service_threads

The current implementation of the FE thrift server waits
indefinitely to open the new session, if the maximum number of
FE service threads specified by --fe_service_threads has been
allocated.

This patch introduces a startup flag to control
how the server should treat new connection requests if we have
run out of the configured number of server threads.

By default we retain the current behavior. If
--reject_request_on_max_fe_threads=true, however, the new
connection requests are rejected by the server.

Testing:

Added a new custom cluster test suite to exercise the
new code.

Ran core tests.

Change-Id: I4b9d48aaff9e3b5854b5121798211c641c58a559
---
M be/src/rpc/TAcceptQueueServer.cpp
A tests/custom_cluster/test_frontend_connection_limit.py
2 files changed, 115 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/12226/6
--
To view, visit http://gerrit.cloudera.org:8080/12226
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I4b9d48aaff9e3b5854b5121798211c641c58a559
Gerrit-Change-Number: 12226
Gerrit-PatchSet: 6
Gerrit-Owner: Zoram Thanga 


[Impala-ASF-CR] IMPALA-7988: support loading data with dockerized Impalas

2019-01-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12189 )

Change subject: IMPALA-7988: support loading data with dockerized Impalas
..


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3654/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/12189
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I98fb9c4f5a3a3bb15c7809eab28ec8e5f63ff517
Gerrit-Change-Number: 12189
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 Jan 2019 17:31:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7988: support loading data with dockerized Impalas

2019-01-18 Thread Tim Armstrong (Code Review)
Hello Joe McDonnell, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12189

to look at the new patch set (#6).

Change subject: IMPALA-7988: support loading data with dockerized Impalas
..

IMPALA-7988: support loading data with dockerized Impalas

This patch does the work to load data and run some end-to-end
query tests on a dockerised cluster. Changes were required
in start-impala-cluster.py/ImpalaCluster and in some configuration
files.

ImpalaCluster is used for various things, including discovering
service ports and testing for cluster readiness. This patch adds
basic support and uses it from start-impala-cluster.py to check
for cluster readiness. Some logic is moved from
start-impala-cluster.py to ImpalaCluster.

Limitations:
* We're fairly inconsistent about whether services listen only on
  a single interface (e.g. loopback, traditionally) or whether it
  listens on all interfaces. This doesn't fix all of those issues.
  E.g. HDFS datanodes listen on all interfaces to work around
  some issues.
* Many tests don't pass yet, particularly those using
  ImpalaCluster(), which isn't initialised with the appropriate
  docker arguments.

Testing:
Did a full data load locally using a dockerised Impala cluster:

  START_CLUSTER_ARGS="--docker_network=impala-cluster" \
  TEST_START_CLUSTER_ARGS="--docker_network=impala-cluster" \
  ./buildall.sh -format -testdata -ninja -notests -skiptests -noclean

Ran a selection of end-to-end tests touching HDFS, Kudu and HBase
tables after I loaded data locally.

Ran exhaustive tests with non-dockerised impala cluster.

Change-Id: I98fb9c4f5a3a3bb15c7809eab28ec8e5f63ff517
---
M bin/impala-config.sh
M bin/start-impala-cluster.py
M fe/src/test/resources/hbase-site.xml.template
M fe/src/test/resources/hive-default.xml
M fe/src/test/resources/mysql-hive-site.xml.template
M fe/src/test/resources/postgresql-hive-site.xml.template
M testdata/bin/create-load-data.sh
M testdata/bin/run-all.sh
M testdata/cluster/node_templates/common/etc/hadoop/conf/hdfs-site.xml.tmpl
M testdata/cluster/node_templates/common/etc/kudu/tserver.conf.tmpl
M tests/common/impala_cluster.py
11 files changed, 358 insertions(+), 229 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/12189/6
--
To view, visit http://gerrit.cloudera.org:8080/12189
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I98fb9c4f5a3a3bb15c7809eab28ec8e5f63ff517
Gerrit-Change-Number: 12189
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7988: support loading data with dockerized Impalas

2019-01-18 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12189 )

Change subject: IMPALA-7988: support loading data with dockerized Impalas
..


Patch Set 6: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/12189
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I98fb9c4f5a3a3bb15c7809eab28ec8e5f63ff517
Gerrit-Change-Number: 12189
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 Jan 2019 17:31:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7979: Enhance decoders to support value-skipping

2019-01-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12172 )

Change subject: IMPALA-7979: Enhance decoders to support value-skipping
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1814/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12172
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib848f1bd71735fe84e8064daf700417b32589f57
Gerrit-Change-Number: 12172
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 18 Jan 2019 17:30:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7988: support loading data with dockerized Impalas

2019-01-18 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12189 )

Change subject: IMPALA-7988: support loading data with dockerized Impalas
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12189/5/tests/common/impala_cluster.py
File tests/common/impala_cluster.py:

http://gerrit.cloudera.org:8080/#/c/12189/5/tests/common/impala_cluster.py@223
PS5, Line 223: _find_docker_containers
> Nit: I believe you can make this fully private (__find_docker_containers) i
Done



--
To view, visit http://gerrit.cloudera.org:8080/12189
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I98fb9c4f5a3a3bb15c7809eab28ec8e5f63ff517
Gerrit-Change-Number: 12189
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 Jan 2019 17:28:22 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7988: support loading data with dockerized Impalas

2019-01-18 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12189 )

Change subject: IMPALA-7988: support loading data with dockerized Impalas
..


Patch Set 5: Code-Review+2

(1 comment)

This looks good to me.

http://gerrit.cloudera.org:8080/#/c/12189/5/tests/common/impala_cluster.py
File tests/common/impala_cluster.py:

http://gerrit.cloudera.org:8080/#/c/12189/5/tests/common/impala_cluster.py@223
PS5, Line 223: _find_docker_containers
Nit: I believe you can make this fully private (__find_docker_containers) if 
you want. It looks like we only use _get_container_info elsewhere. Not 
important.



--
To view, visit http://gerrit.cloudera.org:8080/12189
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I98fb9c4f5a3a3bb15c7809eab28ec8e5f63ff517
Gerrit-Change-Number: 12189
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Comment-Date: Fri, 18 Jan 2019 17:20:35 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7979: Enhance decoders to support value-skipping

2019-01-18 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12172 )

Change subject: IMPALA-7979: Enhance decoders to support value-skipping
..


Patch Set 4:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/12172/1/be/src/exec/parquet/parquet-bool-decoder-test.cc
File be/src/exec/parquet/parquet-bool-decoder-test.cc:

http://gerrit.cloudera.org:8080/#/c/12172/1/be/src/exec/parquet/parquet-bool-decoder-test.cc@75
PS1, Line 75: 100 falses, 100 trues, 100 alternat
> Sorry, I am still not 100% ok with the testdata, as it will only use repeat
Added 100 alternating values


http://gerrit.cloudera.org:8080/#/c/12172/3/be/src/exec/parquet/parquet-bool-decoder-test.cc
File be/src/exec/parquet/parquet-bool-decoder-test.cc:

http://gerrit.cloudera.org:8080/#/c/12172/3/be/src/exec/parquet/parquet-bool-decoder-test.cc@73
PS3, Line 73: TestDecodeAnd
> The name seems a bit misleading, as the test checks both plain and RLE enco
Renamed this test


http://gerrit.cloudera.org:8080/#/c/12172/3/be/src/util/bit-stream-utils-test.cc
File be/src/util/bit-stream-utils-test.cc:

http://gerrit.cloudera.org:8080/#/c/12172/3/be/src/util/bit-stream-utils-test.cc@17
PS3, Line 17:
> optional: I am unsure whether we should do it in this change, but it would
For now I'll it here. I don't know if we could re-use some of these stuff for 
ORC or some other formats, probably not.


http://gerrit.cloudera.org:8080/#/c/12172/3/be/src/util/bit-stream-utils-test.cc@91
PS3, Line 91: // against a reader that doesn't skip.
> Might be worth adding a brief comment explaining the testing strategy, e.g.
Done


http://gerrit.cloudera.org:8080/#/c/12172/3/be/src/util/rle-test.cc
File be/src/util/rle-test.cc:

http://gerrit.cloudera.org:8080/#/c/12172/3/be/src/util/rle-test.cc@301
PS3, Line 301:   vector values{1, 0};
> Yeah, I like randomised tests. I don't think this code is so complex that i
Yeah, I was also thinking about some fuzz testing, but wasn't sure how to 
output the failed test cases in order to let users reproduce the issue for 
debugging.

I extended the string 'description' in ValidateRleSkip() to output all of its 
parameters, including the values as well.

Other approach would be to not use a random seed, so failures would be 
deterministic and reproducible.

Anyway, I kept this test case also.



--
To view, visit http://gerrit.cloudera.org:8080/12172
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib848f1bd71735fe84e8064daf700417b32589f57
Gerrit-Change-Number: 12172
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 18 Jan 2019 16:49:06 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7731: Add TxRatio counter to instances and query profile

2019-01-18 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12229 )

Change subject: IMPALA-7731: Add TxRatio counter to instances and query profile
..


Patch Set 2: Code-Review+1

(2 comments)

http://gerrit.cloudera.org:8080/#/c/12229/2/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/12229/2/be/src/runtime/coordinator.cc@794
PS2, Line 794:
nit: extra line


http://gerrit.cloudera.org:8080/#/c/12229/2/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/12229/2/tests/query_test/test_observability.py@391
PS2, Line 391: lineitem
This query takes 3.4 secs on my machine, which is far from extreme, but I would 
prefer to choose a cheaper one if it is enough for the test.



--
To view, visit http://gerrit.cloudera.org:8080/12229
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ife7ec78fe42558429c1cbe6e5eba79842bffd648
Gerrit-Change-Number: 12229
Gerrit-PatchSet: 2
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Fri, 18 Jan 2019 16:52:35 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7832: Support for IF NOT EXISTS in ALTER TABLE ADD COLUMN(S)

2019-01-18 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12181 )

Change subject: IMPALA-7832: Support for IF NOT EXISTS in ALTER TABLE ADD 
COLUMN(S)
..


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12181/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/12181/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2048
PS4, Line 2048: Column col = tbl.getColumn(column.getColumnName());
  :   if (ifNotExists && col != null) {
  :
> Not sure if we can skip the ifNotExists check. It is possible that the msTb
You're right. Done.



--
To view, visit http://gerrit.cloudera.org:8080/12181
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I60ed22c8a8eefa10e94ad3dedf32fe67c16642d9
Gerrit-Change-Number: 12181
Gerrit-PatchSet: 7
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Comment-Date: Fri, 18 Jan 2019 16:47:58 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7979: Enhance decoders to support value-skipping

2019-01-18 Thread Zoltan Borok-Nagy (Code Review)
Hello Tim Armstrong, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12172

to look at the new patch set (#4).

Change subject: IMPALA-7979: Enhance decoders to support value-skipping
..

IMPALA-7979: Enhance decoders to support value-skipping

This commit adds value-skipping functionality to the decoders.
Value-skipping is useful when we know that we won't need the
following N values, so instead of decoding and dropping them, we
can just "jump" through them.

This feature is a prerequisite for Parquet page skipping (IMPALA-5843).

I added backend tests for all the decoders. Backed tests related to
bitpacking are moved to the newly created bit-stream-utils-test.

Change-Id: Ib848f1bd71735fe84e8064daf700417b32589f57
---
M be/src/exec/parquet/CMakeLists.txt
A be/src/exec/parquet/parquet-bool-decoder-test.cc
M be/src/exec/parquet/parquet-bool-decoder.cc
M be/src/exec/parquet/parquet-bool-decoder.h
M be/src/util/CMakeLists.txt
A be/src/util/bit-stream-utils-test.cc
M be/src/util/bit-stream-utils.h
M be/src/util/bit-stream-utils.inline.h
M be/src/util/dict-encoding.h
M be/src/util/dict-test.cc
M be/src/util/rle-encoding.h
M be/src/util/rle-test.cc
12 files changed, 735 insertions(+), 130 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/72/12172/4
--
To view, visit http://gerrit.cloudera.org:8080/12172
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib848f1bd71735fe84e8064daf700417b32589f57
Gerrit-Change-Number: 12172
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-7832: Support for IF NOT EXISTS in ALTER TABLE ADD COLUMN(S)

2019-01-18 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has uploaded a new patch set (#7). ( 
http://gerrit.cloudera.org:8080/12181 )

Change subject: IMPALA-7832: Support for IF NOT EXISTS in ALTER TABLE ADD 
COLUMN(S)
..

IMPALA-7832: Support for IF NOT EXISTS in ALTER TABLE ADD COLUMN(S)

This patch adds IF NOT EXISTS support in ALTER TABLE ADD COLUMN and
ALTER TABLE ADD COLUMNS. If IF NOT EXISTS is specified and a column
already exists with this name, no error is thrown. If IF NOT EXISTS
is specified for multiple columns and a column already exists, no
error is thrown and a new column that does not exist will be added.

Syntax:
ALTER TABLE tbl ADD COLUMN [IF NOT EXISTS] i int
ALTER TABLE tbl ADD [IF NOT EXISTS] COLUMNS (i int, j int)

Testing:
- Added new FE tests
- Ran all FE tests
- Updated E2E DDL tests
- Ran all E2E DDL tests

Change-Id: I60ed22c8a8eefa10e94ad3dedf32fe67c16642d9
---
M common/thrift/JniCatalog.thrift
M fe/src/main/cup/sql-parser.cup
R fe/src/main/java/org/apache/impala/analysis/AlterTableAddColsStmt.java
C fe/src/main/java/org/apache/impala/analysis/AlterTableReplaceColsStmt.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/AuthorizationStmtTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M testdata/workloads/functional-query/queries/QueryTest/alter-table.test
9 files changed, 395 insertions(+), 179 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/81/12181/7
--
To view, visit http://gerrit.cloudera.org:8080/12181
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I60ed22c8a8eefa10e94ad3dedf32fe67c16642d9
Gerrit-Change-Number: 12181
Gerrit-PatchSet: 7
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-18 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 21:

(14 comments)

Patch code flow and logic lgtm generally. Just had some nits/clarifications. 
Publishing these while I'm looking at the tests.

http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@307
PS21, Line 307: BackendConfig.INSTANCE.getHMSPollingIntervalInSeconds()
eventPollingInterval


http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@319
PS21, Line 319:  LOG.error("Unable to fetch the current notification event id 
from metastore."
  :   + "Metastore event processing will be disabled.",
  :   e);
nit: Multiple places in the code, try to format in fewer lines. Something like

LOG.error("..."
+ ".", e)


http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1206
PS21, Line 1206: // Even though we get the current notification event id 
before stopping the event
clarification: reset() is equivalent to invalidating everything and fetching 
from scratch from HMS. In that case, do we still restart the event processing 
from where we left off (before reset()) or can we start with the latest event 
ID from HMS?


http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1804
PS21, Line 1804: Table incompleteTable = 
IncompleteTable.createUninitializedTable(db, tblName);
move it after L1807?


http://gerrit.cloudera.org:8080/#/c/12118/13/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java
File fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/12118/13/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@195
PS13, Line 195:
> curious to understand why do you think this method has a race?
nvm missed the synchronized part.


http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java:

http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@155
PS21, Line 155: event %d of type %s on table %s
I think it is cleaner to define a debugString(NotificationEvent event) {} that 
formats it cleanly and consistently across all invocations. We can use it to 
dump event state in multiple places.


http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@160
PS21, Line 160: if (!event.getEventType().equalsIgnoreCase("CREATE_DATABASE")
  :   && 
!event.getEventType().equalsIgnoreCase("DROP_DATABASE")) {
  : tblName_ = 
Preconditions.checkNotNull(event.getTableName());
  :   } else {
  : tblName_ = null;
  :   }
nit: Reformat to the following? (fewer branches and use enum)

tblName_ = null;
if (eventType_ == MetaStoreEventType.CREATE_TABLE) {
  tblName_ = ...
}


http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@181
PS21, Line 181: tblName_
checkNotNull()?


http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@198
PS21, Line 198:private final String dbName_;
  : private final String tblName_;
aren't these set in the parent c'tor?


http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@347
PS21, Line 347: droppedTable_
isn't this more like tblTobeDropped_ ?


http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@434
PS21, Line 434: //
nit: remove, you are already in a comment block.


http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
File 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@310
PS21, Line 310: if (currentEventId <= lastSyncedEventId_) {
  : return Collections.emptyList();
  :   }
nit: single line



[Impala-ASF-CR] IMPALA-6503: Support reading complex types from ORC format files

2019-01-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12168 )

Change subject: IMPALA-6503: Support reading complex types from ORC format files
..


Patch Set 8:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1813/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12168
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I244dc9d2b3e425393f90e45632cb8cdbea6cf790
Gerrit-Change-Number: 12168
Gerrit-PatchSet: 8
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 Jan 2019 14:41:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6503: Support reading complex types from ORC format files

2019-01-18 Thread Quanlong Huang (Code Review)
Hello Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12168

to look at the new patch set (#8).

Change subject: IMPALA-6503: Support reading complex types from ORC format files
..

IMPALA-6503: Support reading complex types from ORC format files

We’ve supported reading primitive types from ORC files (IMPALA-5717).
In this patch we add support for complex types (struct/array/map).

In IMPALA-5717, we depend on the ORC lib to read ORC binaries. The ORC
lib can materialize ORC column binaries into its representation
(orc::ColumnVectorBatch), so we don’t need to do anything about
decoding/decompression in hdfs-orc-scanner. Since it already supports
complex types, we’ll still depend on it.

What we need to add in IMPALA-6503 are two things:
1. Specify which nested columns we need to the ORC lib
2. Transform outputs of ORC lib (nested orc::ColumnVectorBatch) into
  Impala’s representation

To format the materialization, we implement several ORC column readers
used in hdfs-orc-scanner. Each kind of reader treats a column type.
Don’t like the Parquet readers (used in hdfs-parquet-scanner) which
materializes Parquet column binaries into tuple values directly, the ORC
readers (in hdfs-orc-scanner) just need to transform outputs of the ORC
lib into tuple/slot values.

Tests:
* Enable existing tests for complex types (test_nested_types.py,
test_tpch_nested_queries.py) for ORC.

Change-Id: I244dc9d2b3e425393f90e45632cb8cdbea6cf790
---
M be/src/exec/CMakeLists.txt
M be/src/exec/hdfs-orc-scanner.cc
M be/src/exec/hdfs-orc-scanner.h
A be/src/exec/orc-column-readers.cc
A be/src/exec/orc-column-readers.h
A be/src/exec/orc-metadata-utils.cc
A be/src/exec/orc-metadata-utils.h
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M fe/src/main/java/org/apache/impala/catalog/HdfsFileFormat.java
A testdata/ComplexTypesTbl/README
A testdata/ComplexTypesTbl/nonnullable.orc
A testdata/ComplexTypesTbl/nullable.orc
M testdata/bin/create-load-data.sh
M testdata/bin/load_nested.py
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M 
testdata/workloads/functional-planner/queries/PlannerTest/complex-types-file-formats.test
M testdata/workloads/functional-query/queries/QueryTest/max-nesting-depth.test
M 
testdata/workloads/functional-query/queries/QueryTest/nested-types-subplan.test
M 
testdata/workloads/functional-query/queries/QueryTest/nested-types-tpch-limit.test
M 
testdata/workloads/functional-query/queries/QueryTest/nested-types-tpch-mem-limit.test
M testdata/workloads/functional-query/queries/QueryTest/nested-types-tpch.test
M testdata/workloads/tpch_nested/tpch_nested_core.csv
M testdata/workloads/tpch_nested/tpch_nested_dimensions.csv
M testdata/workloads/tpch_nested/tpch_nested_exhaustive.csv
M testdata/workloads/tpch_nested/tpch_nested_pairwise.csv
M tests/query_test/test_nested_types.py
M tests/query_test/test_tpch_nested_queries.py
29 files changed, 1,695 insertions(+), 449 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/12168/8
--
To view, visit http://gerrit.cloudera.org:8080/12168
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I244dc9d2b3e425393f90e45632cb8cdbea6cf790
Gerrit-Change-Number: 12168
Gerrit-PatchSet: 8
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6664: Tag log statements with fragment or query ids.

2019-01-18 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12129 )

Change subject: IMPALA-6664: Tag log statements with fragment or query ids.
..


Patch Set 5:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/12129/2/be/src/common/thread-debug-info.h
File be/src/common/thread-debug-info.h:

http://gerrit.cloudera.org:8080/#/c/12129/2/be/src/common/thread-debug-info.h@a116
PS2, Line 116:
> Cool. I'll take a look at impala-gdb.py and fix it appropriately. Printing
Thanks! Do you plan to fix it in this change, or in a follow-up commit?


http://gerrit.cloudera.org:8080/#/c/12129/5/be/src/service/impala-hs2-server.cc
File be/src/service/impala-hs2-server.cc:

http://gerrit.cloudera.org:8080/#/c/12129/5/be/src/service/impala-hs2-server.cc@126
PS5, Line 126: ScopedThreadContext(GetThreadDebugInfo(), query_ctx.query_id);
Currently it creates a temporary object that is deleted promptly.

You need to create a variable so the TDI object only gets reset at the end of 
the function.


http://gerrit.cloudera.org:8080/#/c/12129/5/be/src/service/impala-internal-service.cc
File be/src/service/impala-internal-service.cc:

http://gerrit.cloudera.org:8080/#/c/12129/5/be/src/service/impala-internal-service.cc@49
PS5, Line 49:   ScopedThreadContext(GetThreadDebugInfo(), 
params.query_ctx.query_id);
It creates a temporary that is deleted promptly.



--
To view, visit http://gerrit.cloudera.org:8080/12129
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6634ef9d1a7346339f24f2d40a7a3aa36a535da8
Gerrit-Change-Number: 12129
Gerrit-PatchSet: 5
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Comment-Date: Fri, 18 Jan 2019 13:48:41 +
Gerrit-HasComments: Yes