[Impala-ASF-CR] IMPALA-7163: Implement a state machine for the QueryState class

2018-07-20 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10813 )

Change subject: IMPALA-7163: Implement a state machine for the QueryState class
..


Patch Set 6:

(13 comments)

http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/fragment-instance-state.cc
File be/src/runtime/fragment-instance-state.cc:

http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/fragment-instance-state.cc@a113
PS6, Line 113:
May make sense to replace this by checking the state. Doing a racy read of the 
state should be fine.


http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/query-state.h@198
PS6, Line 198:  struct FInstanceStatus {
If you take the suggestion to only keep a single 'query_status_', there doesn't 
seem to be need for this struct. We can replace it with a single class member 
to record the fragment instance ID of the first failed fragment instance.


http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/query-state.h@212
PS6, Line 212: TUniqueId finst_id_;
This is only meaningful when there is an error status, right ? Please document 
it.


http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/query-state.h@234
PS6, Line 234: FragmentInstanceState
fragment instance


http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/query-state.h@271
PS6, Line 271: INITIALIZED,
Consider calling it "PREPARING"


http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/query-state.h@274
PS6, Line 274: PREPARED,
Consider calling it "EXECUTING"


http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/query-state.h@308
PS6, Line 308:   /// The overall status of the Prepare phase for this query 
state.
 :   FInstanceStatus prepare_status_;
 :
 :   /// The overall status of the Exec phase (after Prepare()) for 
this query state.
 :   FInstanceStatus exec_status_;
 :
 :   /// Protects 'prepare_status_' and 'exec_status_'.
 :   SpinLock status_lock_;
It seems simpler to keep track of a single status for the first error hit 
instead of tracking the errors in different phases.

The behavior in the existing code is that the first fragment instance which 
hits the error will report it to the coordinator directly. In theory, a 
fragment instance F1 can hit an error during exec after prepare phase is done 
while another fragment instance F2 can hit error during its prepare phase after 
F1 hits the error. The new patch seems to set query_status_ to the error of F2 
while the coordinator will get the error of F1. So, this seems to cause 
divergence with the coordinator.

So, in that sense, it's necessary to keep a single status only.


http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/query-state.h@320
PS6, Line 320:   FInstanceStatus query_status_;
Mind documenting the thread safety about this variable ?


http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/query-state.cc@238
PS6, Line 238: case BackendExecState::INITIALIZED:
 : case BackendExecState::PREPARED:
 :   DCHECK(query_status_.ok()) << query_status_.ToString();
 :   if (!status.ok()) {
 : // Error while executing - go to ERROR state.
 : query_status_ = finst_status;
 : backend_exec_state_ = BackendExecState::ERROR;
Can this be merged with TransitionNonTerminalState() above as:

 if (query_status_.IsCancelled()) {
 new_state = BackendExecState::CANCELLED;
 } else if (!query_status_.ok()) {
 new_state = BackendExecState::ERROR;
 } else {
 new_state = state == PREPARING ? EXECUTING : FINISHED;
 }


http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/query-state.cc@250
PS6, Line 250: case BackendExecState::FINISHED
Under what circumstances would we call QueryState::UpdateBackendExecState() 
after entering a terminal state ?


http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/query-state.cc@457
PS6, Line 457: ErrorDuringPrepare(
Won't it be a problem if we call ErrorDuringPrepare() here as it only bump the 
instances_prepared_barrier_ by one instead of the number of remaining fragment 
instances ? In that sense, won't caller of WaitForPrepare() wait forever ?

Also, the thread creation failure path warrants a targeted test case. May be a 
debug action can be added in Thread::Create().


http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/query-state.cc@460
PS6, Line 460: return;
Seems wrong to return here. Don't we need to call ReportExecStatusAux() like we 
did in the past.


http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/query-state.cc@463
PS6, Line 463:   all_fis_status = 

[Impala-ASF-CR] IMPALA-7234: Improve memory estimates produced by the Planner

2018-07-20 Thread Pooja Nilangekar (Code Review)
Pooja Nilangekar has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/11001 )

Change subject: IMPALA-7234: Improve memory estimates produced by the Planner
..

IMPALA-7234: Improve memory estimates produced by the Planner

Previously, the planner used the getMajorityFormat to estimate
the memory requirements of its partitions. Additionally, before
IMPALA-6625 was merged, the majority format for a multi-format
table with no numerical majority was calculated using a HashMap,
thus producing non deterministic results. This change ensures that
the memory estimate is deterministic and always based on partition
that has the maximum memory requirement.

Testing: Ran all PlannerTests. Also, modified plans of scans with
multiple partitions to ensure that the memory estimate produced
corresponds to the partition with the maximum requirement.

Change-Id: I0666ae3d45fbd8615d3fa9a8626ebd29cf94fb4b
---
M fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java
M fe/src/main/java/org/apache/impala/catalog/FeFsTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsTableSink.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering-disabled.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test
8 files changed, 52 insertions(+), 64 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/11001/3
--
To view, visit http://gerrit.cloudera.org:8080/11001
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0666ae3d45fbd8615d3fa9a8626ebd29cf94fb4b
Gerrit-Change-Number: 11001
Gerrit-PatchSet: 3
Gerrit-Owner: Pooja Nilangekar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Pooja Nilangekar 


[Impala-ASF-CR] IMPALA-7209: Disallow self referencing in ALTER VIEW statements

2018-07-20 Thread Pooja Nilangekar (Code Review)
Pooja Nilangekar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10908 )

Change subject: IMPALA-7209: Disallow self referencing in ALTER VIEW statements
..


Patch Set 7:

(1 comment)

I am sorry I had to rebase this change. I couldn't get impalad to start due to 
IMPALA-7316.

http://gerrit.cloudera.org:8080/#/c/10908/6/fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java
File fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java:

http://gerrit.cloudera.org:8080/#/c/10908/6/fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java@69
PS6, Line 69: if (inlineViewRefs.contains(((InlineViewRef) tblRef).getView())) {
:   throw new AnalysisException(
:   String.format("Self-reference not allowed on view: %s", 
tblRef.toSql()));
: }
:
: createColumnAndViewDefs(analyzer);
: if (BackendConfig.INSTANCE.getComputeLineage() || 
RuntimeEnv.INSTANCE.isTestEnv()) {
:   computeLineageGraph(analyzer);
: }
:   }
:
:   @Override
:   public String toSql() {
: StringBuilder sb = new StringBuilder();
: sb.append("ALTER VIEW ");
: if (tableName_.getDb() != null) {
:   sb.append(tableName_.getDb() + ".");
: }
: sb.append(tableName_.getTbl());
: if (columnDefs_ != null) sb.append("(" + getColumnNames() + 
")");
: sb.append(" AS " + viewDefStmt_.toSql());
: return sb.toString();
:   }
: }
:
:
:
:
:
> How about factoring this out into a method in QueryStmt and have helper met
Done



--
To view, visit http://gerrit.cloudera.org:8080/10908
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I17c231c9d74d9d411463a408b086eb874090b9b7
Gerrit-Change-Number: 10908
Gerrit-PatchSet: 7
Gerrit-Owner: Pooja Nilangekar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Comment-Date: Sat, 21 Jul 2018 01:30:37 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7209: Disallow self referencing in ALTER VIEW statements

2018-07-20 Thread Pooja Nilangekar (Code Review)
Pooja Nilangekar has uploaded a new patch set (#7). ( 
http://gerrit.cloudera.org:8080/10908 )

Change subject: IMPALA-7209: Disallow self referencing in ALTER VIEW statements
..

IMPALA-7209: Disallow self referencing in ALTER VIEW statements

Previously, ALTER VIEW queries did not carry out reference checks
in the analysis phase. This allowed the DDL operation to succeed
but subsequent queries to the view threw StackOverflowError
because the catalog was unable to resolve the reference. With this
change, the AlterViewStmt checks for direct and in-direct self
references before altering a view.

Testing: Added tests to AnalyzeDDLTest to verify it.
Change-Id: I17c231c9d74d9d411463a408b086eb874090b9b7
---
M fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/UnionStmt.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
5 files changed, 93 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/08/10908/7
--
To view, visit http://gerrit.cloudera.org:8080/10908
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I17c231c9d74d9d411463a408b086eb874090b9b7
Gerrit-Change-Number: 10908
Gerrit-PatchSet: 7
Gerrit-Owner: Pooja Nilangekar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Pooja Nilangekar 


[Impala-ASF-CR] PREVIEW: IMPALA-110: Support for multiple DISTINCT

2018-07-20 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10771 )

Change subject: PREVIEW: IMPALA-110: Support for multiple DISTINCT
..


Patch Set 1:

(7 comments)

Did a quick initial pass over the backend part of this.

http://gerrit.cloudera.org:8080/#/c/10771/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10771/1//COMMIT_MSG@60
PS1, Line 60: - Ran hdfs/core tests
Can we add some spilling tests too?


http://gerrit.cloudera.org:8080/#/c/10771/1/be/src/exec/aggregation-node.cc
File be/src/exec/aggregation-node.cc:

http://gerrit.cloudera.org:8080/#/c/10771/1/be/src/exec/aggregation-node.cc@118
PS1, Line 118: // Separate input batch into mini batches destined for the 
different aggs.
I feel like there's a lot of unnecessary overhead in the pre-partitioning into 
mini-batches (the approach in general and maybe some of the specifics of how 
this code uses the RowBatch APIs).

I'd be interested to see how much time is spent here for a high-NDV aggregation 
to see if it's worth optimising.

We could avoid the construction of the intermediate batches if we pushed down 
logic into AddBatch so that it consumed rows from 'batch' up until it saw one 
that didn't match its index.


http://gerrit.cloudera.org:8080/#/c/10771/1/be/src/exec/grouping-aggregator-ir.cc
File be/src/exec/grouping-aggregator-ir.cc:

http://gerrit.cloudera.org:8080/#/c/10771/1/be/src/exec/grouping-aggregator-ir.cc@171
PS1, Line 171:   ++streaming_idx_;
Let's hoist the store to 'streaming_idx_' out of the loop.


http://gerrit.cloudera.org:8080/#/c/10771/1/be/src/exec/grouping-aggregator-ir.cc@190
PS1, Line 190: out_batch->ClearRow(row);
This seems expensive to include in the inner loop. Can we avoid doing this 
entirely in the non-multiagg case? And do it efficiently with a memset() before 
calling this function in the multiagg case.


http://gerrit.cloudera.org:8080/#/c/10771/1/be/src/exec/grouping-aggregator-ir.cc@191
PS1, Line 191: agg_idx_
Let's hoist this load out of the loop too. Can we codegen this out for the 
non-multi-agg case?


http://gerrit.cloudera.org:8080/#/c/10771/1/be/src/exec/grouping-aggregator.h
File be/src/exec/grouping-aggregator.h:

http://gerrit.cloudera.org:8080/#/c/10771/1/be/src/exec/grouping-aggregator.h@296
PS1, Line 296:   int32_t streaming_idx_;
Comments?


http://gerrit.cloudera.org:8080/#/c/10771/1/be/src/exec/grouping-aggregator.cc
File be/src/exec/grouping-aggregator.cc:

http://gerrit.cloudera.org:8080/#/c/10771/1/be/src/exec/grouping-aggregator.cc@285
PS1, Line 285: row_batch->ClearRow(row);
This seems like unnecessary overhead.



--
To view, visit http://gerrit.cloudera.org:8080/10771
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I055402eaef6d81e5f70e850d9f8a621e766830a4
Gerrit-Change-Number: 10771
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 21 Jul 2018 01:13:08 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7234: Improve memory estimates produced by the Planner

2018-07-20 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11001 )

Change subject: IMPALA-7234: Improve memory estimates produced by the Planner
..


Patch Set 1:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/11001/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11001/1//COMMIT_MSG@12
PS1, Line 12: calcuated
nit: typo


http://gerrit.cloudera.org:8080/#/c/11001/1//COMMIT_MSG@14
PS1, Line 14: esimate
nit: typo


http://gerrit.cloudera.org:8080/#/c/11001/1//COMMIT_MSG@18
PS1, Line 18: esimate
nit: typo


http://gerrit.cloudera.org:8080/#/c/11001/1/fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java
File fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java:

http://gerrit.cloudera.org:8080/#/c/11001/1/fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java@300
PS1, Line 300:   public static Set getFileFormats(
 :   Iterable partitions) {
 : Set fileFormats = Sets.newHashSet();
 : for (FeFsPartition partition : partitions) {
 :   fileFormats.add(partition.getFileFormat());
 : }
 : return fileFormats;
you can get rid of this since its used only in one place and move functionality 
to HdfsTable


http://gerrit.cloudera.org:8080/#/c/11001/1/fe/src/main/java/org/apache/impala/catalog/FeFsTable.java
File fe/src/main/java/org/apache/impala/catalog/FeFsTable.java:

http://gerrit.cloudera.org:8080/#/c/11001/1/fe/src/main/java/org/apache/impala/catalog/FeFsTable.java@97
PS1, Line 97:   /**
:* @return the set of file formats that the partitions in this 
table use.
:* This API is only used by the TableSink to write out 
partitions. It
:* should not be used for scanning.
:*/
:   public Set getFileFormats();
we can probably get rid of this since its used at only one place and move the 
functionality there. Interested in what others think about this.


http://gerrit.cloudera.org:8080/#/c/11001/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/11001/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1369
PS1, Line 1369: Henece
nit: typo


http://gerrit.cloudera.org:8080/#/c/11001/1/fe/src/main/java/org/apache/impala/planner/HdfsTableSink.java
File fe/src/main/java/org/apache/impala/planner/HdfsTableSink.java:

http://gerrit.cloudera.org:8080/#/c/11001/1/fe/src/main/java/org/apache/impala/planner/HdfsTableSink.java@83
PS1, Line 83:
: HdfsTable table = (HdfsTable) targetTable_;
: // TODO: Estimate the memory requirements more accurately by 
partition type.
: Set formats = table.getFileFormats();
you can pass the table directly and also simplify getPerPartitionMemReq() by 
using a contains(FORMAT) instead of a for-loop + switch-case. Also add 
corresponding comments.


http://gerrit.cloudera.org:8080/#/c/11001/1/fe/src/main/java/org/apache/impala/planner/HdfsTableSink.java@109
PS1, Line 109:  * Returns the per-partition memory requirement for inserting 
into the given
 :* file format.
update comment



--
To view, visit http://gerrit.cloudera.org:8080/11001
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0666ae3d45fbd8615d3fa9a8626ebd29cf94fb4b
Gerrit-Change-Number: 11001
Gerrit-PatchSet: 1
Gerrit-Owner: Pooja Nilangekar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Comment-Date: Sat, 21 Jul 2018 01:13:02 +
Gerrit-HasComments: Yes


[native-toolchain-CR] Bump Kudu version to 5211897

2018-07-20 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11003 )

Change subject: Bump Kudu version to 5211897
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/11003
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5aa226d74150bc6623d685e3d995e1699db4f2e8
Gerrit-Change-Number: 11003
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 21 Jul 2018 00:58:48 +
Gerrit-HasComments: No


[Impala-ASF-CR] Test gerrit trigger

2018-07-20 Thread Tim Armstrong (Code Review)
Tim Armstrong has abandoned this change. ( 
http://gerrit.cloudera.org:8080/11009 )

Change subject: Test gerrit trigger
..


Abandoned
--
To view, visit http://gerrit.cloudera.org:8080/11009
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: If08d7ce10631670ed913b4ff5c48f9ff07c2709c
Gerrit-Change-Number: 11009
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] Test gerrit trigger

2018-07-20 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/11009 )

Change subject: Test gerrit trigger
..

Test gerrit trigger

Change-Id: If08d7ce10631670ed913b4ff5c48f9ff07c2709c
---
M be/src/common/global-flags.cc
1 file changed, 3 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/09/11009/2
--
To view, visit http://gerrit.cloudera.org:8080/11009
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If08d7ce10631670ed913b4ff5c48f9ff07c2709c
Gerrit-Change-Number: 11009
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] Test gerrit trigger

2018-07-20 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11009


Change subject: Test gerrit trigger
..

Test gerrit trigger

Change-Id: If08d7ce10631670ed913b4ff5c48f9ff07c2709c
---
M be/src/common/global-flags.cc
1 file changed, 1 insertion(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/09/11009/1
--
To view, visit http://gerrit.cloudera.org:8080/11009
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: If08d7ce10631670ed913b4ff5c48f9ff07c2709c
Gerrit-Change-Number: 11009
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] Test gerrit trigger

2018-07-20 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11008


Change subject: Test gerrit trigger
..

Test gerrit trigger

Change-Id: If08d7ce10631670ed913b4ff5c48f9ff07c2709c
---
M be/src/common/global-flags.cc
1 file changed, 1 insertion(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/08/11008/1
--
To view, visit http://gerrit.cloudera.org:8080/11008
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: If08d7ce10631670ed913b4ff5c48f9ff07c2709c
Gerrit-Change-Number: 11008
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6299: Modify IRBuilder to use LLVM's CPU features

2018-07-20 Thread Pooja Nilangekar (Code Review)
Pooja Nilangekar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10979 )

Change subject: IMPALA-6299: Modify IRBuilder to use LLVM's CPU features
..


Patch Set 5:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/10979/2/be/src/codegen/llvm-codegen-test.cc
File be/src/codegen/llvm-codegen-test.cc:

http://gerrit.cloudera.org:8080/#/c/10979/2/be/src/codegen/llvm-codegen-test.cc@467
PS2, Line 467: data1_hash_fn, llvm::ArrayRef({llvm_data1, 
llvm_len1, seed}));
> Thanks for explaining. Might be worth leaving a brief comment explaining th
Done


http://gerrit.cloudera.org:8080/#/c/10979/2/be/src/codegen/llvm-codegen.h
File be/src/codegen/llvm-codegen.h:

http://gerrit.cloudera.org:8080/#/c/10979/2/be/src/codegen/llvm-codegen.h@737
PS2, Line 737:
> How about we move the method EnableCPUFeature out from this class and into
Done. I have moved the helper function to the test class and added a comment 
stating that cpu_attrs_ should be modified only for tests.


http://gerrit.cloudera.org:8080/#/c/10979/3/be/src/codegen/llvm-codegen.cc
File be/src/codegen/llvm-codegen.cc:

http://gerrit.cloudera.org:8080/#/c/10979/3/be/src/codegen/llvm-codegen.cc@1519
PS3, Line 1519:
> nit: not a big deal but just to be consistent with CpuInfo and the static f
Done



--
To view, visit http://gerrit.cloudera.org:8080/10979
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifece8949c143146d2a1b38d72d21c2d733bed90f
Gerrit-Change-Number: 10979
Gerrit-PatchSet: 5
Gerrit-Owner: Pooja Nilangekar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 21 Jul 2018 00:33:52 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6299: Modify IRBuilder to use LLVM's CPU features

2018-07-20 Thread Pooja Nilangekar (Code Review)
Pooja Nilangekar has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/10979 )

Change subject: IMPALA-6299: Modify IRBuilder to use LLVM's CPU features
..

IMPALA-6299: Modify IRBuilder to use LLVM's CPU features

Previously, the IRBuilder of the LlvmCodeGen class used CpuInfo's
list of enabled features to determine the validity of certain
instructions. It did not consider the whitelist which passed while
initilaizing the LlvmCodeGen class. Now, the IRBuilder inspects
its own cpu attributes before emitting instruction. This change
also adds functionality to modify the cpu attributes of the
LlvmCodeGen class for testing.

Testing: Verified that the current tests which use and modify
CpuInfo produce expected results.

Change-Id: Ifece8949c143146d2a1b38d72d21c2d733bed90f
---
M be/src/benchmarks/bloom-filter-benchmark.cc
M be/src/codegen/llvm-codegen-test.cc
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exec/filter-context.cc
M be/src/util/cpu-info.cc
6 files changed, 90 insertions(+), 22 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/10979/5
--
To view, visit http://gerrit.cloudera.org:8080/10979
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifece8949c143146d2a1b38d72d21c2d733bed90f
Gerrit-Change-Number: 10979
Gerrit-PatchSet: 5
Gerrit-Owner: Pooja Nilangekar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7173: [DOCS] Added check options in the load balancer examples

2018-07-20 Thread Alex Rodoni (Code Review)
Alex Rodoni has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11006


Change subject: IMPALA-7173: [DOCS] Added check options in the load balancer 
examples
..

IMPALA-7173: [DOCS] Added check options in the load balancer examples

Change-Id: Ic5148aecf4605373ea6a28bf032bcfdab4f822fd
---
M docs/topics/impala_proxy.xml
1 file changed, 4 insertions(+), 4 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/11006/1
--
To view, visit http://gerrit.cloudera.org:8080/11006
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ic5148aecf4605373ea6a28bf032bcfdab4f822fd
Gerrit-Change-Number: 11006
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 


[Impala-ASF-CR] IMPALA-3675: part 1: -Werror for ASAN

2018-07-20 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11002 )

Change subject: IMPALA-3675: part 1: -Werror for ASAN
..


Patch Set 4: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/2846/


--
To view, visit http://gerrit.cloudera.org:8080/11002
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If83ca41cde49fd6cfde479e45f9cfdd9e3a45bec
Gerrit-Change-Number: 11002
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 20 Jul 2018 23:55:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7257. Support Kudu tables in LocalCatalog

2018-07-20 Thread Tianyi Wang (Code Review)
Tianyi Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10912 )

Change subject: IMPALA-7257. Support Kudu tables in LocalCatalog
..


Patch Set 3:

(1 comment)

LGTM. Will +2 after the small fixes are applied.

http://gerrit.cloudera.org:8080/#/c/10912/3/fe/src/main/java/org/apache/impala/catalog/local/LocalKuduTable.java
File fe/src/main/java/org/apache/impala/catalog/local/LocalKuduTable.java:

http://gerrit.cloudera.org:8080/#/c/10912/3/fe/src/main/java/org/apache/impala/catalog/local/LocalKuduTable.java@140
PS3, Line 140:  kuduTableName_ instead of name below
> The line of code there is accessing KuduTableDescriptor::table_name() (set
You are right. It's only used for logging in the backend.



--
To view, visit http://gerrit.cloudera.org:8080/10912
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5b6a317ee895e43e00ade953e814867b56b4e6dd
Gerrit-Change-Number: 10912
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 20 Jul 2018 22:55:04 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3675: part 1: -Werror for ASAN

2018-07-20 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11002 )

Change subject: IMPALA-3675: part 1: -Werror for ASAN
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11002/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11002/4//COMMIT_MSG@9
PS4, Line 9: This fixes some clang warnings and enables -Werror going forward 
for
> I don't see any diffs, other than the mismatched struct/class ones. I feel
The clang-tidy precommit prevented most warnings from cropping up. The only 
fixes were from enabling the matched tags check and to suppress 
return-type-c-linkage that is also suppressed in clang-tidy.


http://gerrit.cloudera.org:8080/#/c/11002/4/be/CMakeLists.txt
File be/CMakeLists.txt:

http://gerrit.cloudera.org:8080/#/c/11002/4/be/CMakeLists.txt@105
PS4, Line 105: #   Enabled for DEBUG and ASAN builds which are built pre-commit.
> TSAN and UBSAN, too, or do you think that's overzealous?
Will do that. I don't expect any issues but will rerun the build job to check: 
https://jenkins.impala.io/job/all-build-options-ub1604/2165



--
To view, visit http://gerrit.cloudera.org:8080/11002
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If83ca41cde49fd6cfde479e45f9cfdd9e3a45bec
Gerrit-Change-Number: 11002
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 20 Jul 2018 22:51:42 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3675: part 1: -Werror for ASAN

2018-07-20 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11002 )

Change subject: IMPALA-3675: part 1: -Werror for ASAN
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11002/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11002/4//COMMIT_MSG@9
PS4, Line 9: This fixes some clang warnings and enables -Werror going forward 
for
I don't see any diffs, other than the mismatched struct/class ones. I feel like 
I'm missing something.


http://gerrit.cloudera.org:8080/#/c/11002/4/be/CMakeLists.txt
File be/CMakeLists.txt:

http://gerrit.cloudera.org:8080/#/c/11002/4/be/CMakeLists.txt@105
PS4, Line 105: #   Enabled for DEBUG and ASAN builds which are built pre-commit.
TSAN and UBSAN, too, or do you think that's overzealous?



--
To view, visit http://gerrit.cloudera.org:8080/11002
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If83ca41cde49fd6cfde479e45f9cfdd9e3a45bec
Gerrit-Change-Number: 11002
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jim Apple 
Gerrit-Comment-Date: Fri, 20 Jul 2018 22:44:17 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7257. Support Kudu tables in LocalCatalog

2018-07-20 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10912 )

Change subject: IMPALA-7257. Support Kudu tables in LocalCatalog
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10912/3/fe/src/main/java/org/apache/impala/catalog/local/LocalKuduTable.java
File fe/src/main/java/org/apache/impala/catalog/local/LocalKuduTable.java:

http://gerrit.cloudera.org:8080/#/c/10912/3/fe/src/main/java/org/apache/impala/catalog/local/LocalKuduTable.java@140
PS3, Line 140:  kuduTableName_ instead of name below
> It's eventually used here: https://github.com/apache/impala/blob/master/be/
The line of code there is accessing KuduTableDescriptor::table_name() (set down 
on line 149 of this file below). The table name passed in TTableDescriptor 
should be the _impala_ table name, though, which may not match the underlying 
kudu table name. (usually the underlying kudu table is 
"impala::dbname.tablename"). I don't know who accesses the 
TTableDescriptor.tablename



--
To view, visit http://gerrit.cloudera.org:8080/10912
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5b6a317ee895e43e00ade953e814867b56b4e6dd
Gerrit-Change-Number: 10912
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 20 Jul 2018 22:27:49 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5826: [DOCS] Documented the IDLE SESSION TIMEOUT query option

2018-07-20 Thread Alex Rodoni (Code Review)
Alex Rodoni has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11004


Change subject: IMPALA-5826: [DOCS] Documented the IDLE_SESSION_TIMEOUT query 
option
..

IMPALA-5826: [DOCS] Documented the IDLE_SESSION_TIMEOUT query option

Change-Id: I37182a3c5cf19fdcbb5f247ed71d43f963143510
---
M docs/impala.ditamap
A docs/topics/impala_idle_session_timeout.xml
M docs/topics/impala_timeouts.xml
3 files changed, 106 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/11004/1
--
To view, visit http://gerrit.cloudera.org:8080/11004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I37182a3c5cf19fdcbb5f247ed71d43f963143510
Gerrit-Change-Number: 11004
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 


[Impala-ASF-CR] IMPALA-7257. Support Kudu tables in LocalCatalog

2018-07-20 Thread Tianyi Wang (Code Review)
Tianyi Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10912 )

Change subject: IMPALA-7257. Support Kudu tables in LocalCatalog
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/10912/3/fe/src/main/java/org/apache/impala/catalog/KuduTable.java
File fe/src/main/java/org/apache/impala/catalog/KuduTable.java:

http://gerrit.cloudera.org:8080/#/c/10912/3/fe/src/main/java/org/apache/impala/catalog/KuduTable.java@258
PS3, Line 258:   public static List loadPartitionByParams(
> I think eventually they could become default methods in the FeKuduTable int
Another option is to put them in a class FeKuduTable.Common. It would be closer 
to the Java 8 code structure.


http://gerrit.cloudera.org:8080/#/c/10912/3/fe/src/main/java/org/apache/impala/catalog/local/LocalKuduTable.java
File fe/src/main/java/org/apache/impala/catalog/local/LocalKuduTable.java:

http://gerrit.cloudera.org:8080/#/c/10912/3/fe/src/main/java/org/apache/impala/catalog/local/LocalKuduTable.java@140
PS3, Line 140:  kuduTableName_ instead of name below
> I think that was a bug in the old implementation. The name that shows up in
It's eventually used here: 
https://github.com/apache/impala/blob/master/be/src/exec/kudu-scan-node-base.cc#L103.
 Why shouldn't it be the table name in Kudu?



--
To view, visit http://gerrit.cloudera.org:8080/10912
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5b6a317ee895e43e00ade953e814867b56b4e6dd
Gerrit-Change-Number: 10912
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 20 Jul 2018 21:26:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7256: Aggregator mem usage isn't reflected in summary

2018-07-20 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10989 )

Change subject: IMPALA-7256: Aggregator mem usage isn't reflected in summary
..


Patch Set 3: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/10989
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iba6ef207bed47810fc742aec3481db5f313cf97f
Gerrit-Change-Number: 10989
Gerrit-PatchSet: 3
Gerrit-Owner: Thomas Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 20 Jul 2018 20:59:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7256: Aggregator mem usage isn't reflected in summary

2018-07-20 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10989 )

Change subject: IMPALA-7256: Aggregator mem usage isn't reflected in summary
..

IMPALA-7256: Aggregator mem usage isn't reflected in summary

This patch fixes a bug where memory used by an Aggregator wasn't being
reflected in the exec summary for the corresponding aggregation node
by ensuring that the Aggregator's MemTracker is a child of the node's
MemTracker.

Testing:
- Manually ran a query and checked that all memory used by the
  Aggregator is accounted for in the exec summary.

Change-Id: Iba6ef207bed47810fc742aec3481db5f313cf97f
Reviewed-on: http://gerrit.cloudera.org:8080/10989
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M be/src/exec/aggregator.cc
M be/src/exec/aggregator.h
M be/src/exec/exec-node.cc
M be/src/exec/grouping-aggregator.cc
M be/src/exec/grouping-aggregator.h
M be/src/runtime/reservation-manager.cc
M be/src/runtime/reservation-manager.h
7 files changed, 23 insertions(+), 10 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/10989
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Iba6ef207bed47810fc742aec3481db5f313cf97f
Gerrit-Change-Number: 10989
Gerrit-PatchSet: 4
Gerrit-Owner: Thomas Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3675: part 1: -Werror for ASAN

2018-07-20 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11002 )

Change subject: IMPALA-3675: part 1: -Werror for ASAN
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2846/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/11002
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If83ca41cde49fd6cfde479e45f9cfdd9e3a45bec
Gerrit-Change-Number: 11002
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 20 Jul 2018 20:56:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3675: part 1: -Werror for ASAN

2018-07-20 Thread Tim Armstrong (Code Review)
Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11002

to look at the new patch set (#3).

Change subject: IMPALA-3675: part 1: -Werror for ASAN
..

IMPALA-3675: part 1: -Werror for ASAN

This fixes some clang warnings and enables -Werror going forward for
ASAN.

It adds -Wno-return-type-c-linkage to suppress a warning that otherwise
would fail the build.

It also enables the mismatched tags check and fixes the various
instances of it (this warning is irritating for YouCompleteMe users).

Change-Id: If83ca41cde49fd6cfde479e45f9cfdd9e3a45bec
---
M .clang-tidy
M be/CMakeLists.txt
M be/src/exec/hdfs-table-writer.h
M be/src/exprs/expr.h
M be/src/exprs/scalar-expr.h
M be/src/runtime/krpc-data-stream-recvr.h
M be/src/runtime/mem-tracker.h
7 files changed, 8 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/11002/3
--
To view, visit http://gerrit.cloudera.org:8080/11002
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If83ca41cde49fd6cfde479e45f9cfdd9e3a45bec
Gerrit-Change-Number: 11002
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-7219. 7.5% of Catalog Server heap wasted by empty HashMaps and ArrayLists

2018-07-20 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10982 )

Change subject: IMPALA-7219. 7.5% of Catalog Server heap wasted by empty 
HashMaps and ArrayLists
..


Patch Set 2:

Yep, that makes sense -- lazy initializing the StructType field in Table looks 
like an easy win here. You're right that changing the treatment of 
IncompleteTable would be a much deeper change.

Worth adding to the commit message, btw, that the "7.5% of catalog server heap" 
is in the case where the catalog is mostly made up of IncompleteTables. Or 
quantify the number of bytes per IncompleteTable that are saved. That way we 
can pretty easily estimate the absolute heap size savings on an arbitrary 
catalog just by knowing how many tables they have.


--
To view, visit http://gerrit.cloudera.org:8080/10982
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If9c75f65ecb3ba3f2c739fa483a84dc052f471c6
Gerrit-Change-Number: 10982
Gerrit-PatchSet: 2
Gerrit-Owner: Misha Dmitriev 
Gerrit-Reviewer: Misha Dmitriev 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 20 Jul 2018 20:36:58 +
Gerrit-HasComments: No


[native-toolchain-CR] Bump Kudu version to 5211897

2018-07-20 Thread Thomas Marshall (Code Review)
Thomas Marshall has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11003


Change subject: Bump Kudu version to 5211897
..

Bump Kudu version to 5211897

Testing:
- Ran a full toolchain build.

Change-Id: I5aa226d74150bc6623d685e3d995e1699db4f2e8
---
M buildall.sh
1 file changed, 1 insertion(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/native-toolchain 
refs/changes/03/11003/1
--
To view, visit http://gerrit.cloudera.org:8080/11003
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I5aa226d74150bc6623d685e3d995e1699db4f2e8
Gerrit-Change-Number: 11003
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Marshall 


[Impala-ASF-CR] IMPALA-7219. 7.5% of Catalog Server heap wasted by empty HashMaps and ArrayLists

2018-07-20 Thread Misha Dmitriev (Code Review)
Misha Dmitriev has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10982 )

Change subject: IMPALA-7219. 7.5% of Catalog Server heap wasted by empty 
HashMaps and ArrayLists
..


Patch Set 2:

Thank you for your comments/clarifications, guys. I see what you mean now.

I've checked the heap dump more carefully, and now I realize that all the 
StructType instances (all of which have both of their data fields, fieldMap_ 
and fields_, empty) come from IncompleteTable's. Here is the relevant excerpt 
from the jxray heap report:


182,587K (1.2%): org.apache.impala.catalog.StructType: 7790403 / 100% objects
↖org.apache.impala.catalog.ArrayType.itemType_
↖org.apache.impala.catalog.IncompleteTable.type_
↖{java.util.concurrent.ConcurrentHashMap}.values
↖org.apache.impala.catalog.CatalogObjectCache.metadataCache_
↖org.apache.impala.catalog.Db.tableCache_
↖{java.util.concurrent.ConcurrentHashMap}.values
↖java.util.concurrent.atomic.AtomicReference.value
↖org.apache.impala.catalog.CatalogServiceCatalog.dbCache_
↖org.apache.impala.catalog.BuiltinsDb.parentCatalog_
↖Java Static org.apache.impala.catalog.Catalog.builtinsDb_

So, StructType objects themselves take 1.2% of the heap, plus their empty 
fieldMap_ maps take 2.5% and empty fields_ lists take 1.2%. And looks like if 
we simply initialize these StructType objects lazily (more precisely, the 
'ArrayType type_' field in Table), we will save all the above memory, i.e. 
considerably more than the current patch saves. Furthermore, the lazy changes 
in StructType itself can be reverted, which would keep the code more readable.

Makes sense? I think this is the best I can do in Impala code with my current 
very limited understanding of it. More advanced changes, like making 
IncompleteTable not inherit from Table, would likely require a much deeper 
knowledge of how this code works.


--
To view, visit http://gerrit.cloudera.org:8080/10982
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If9c75f65ecb3ba3f2c739fa483a84dc052f471c6
Gerrit-Change-Number: 10982
Gerrit-PatchSet: 2
Gerrit-Owner: Misha Dmitriev 
Gerrit-Reviewer: Misha Dmitriev 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 20 Jul 2018 19:50:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6299: Modify IRBuilder to use LLVM's CPU features

2018-07-20 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10979 )

Change subject: IMPALA-6299: Modify IRBuilder to use LLVM's CPU features
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/10979/2/be/src/codegen/llvm-codegen.h
File be/src/codegen/llvm-codegen.h:

http://gerrit.cloudera.org:8080/#/c/10979/2/be/src/codegen/llvm-codegen.h@737
PS2, Line 737: enabled_cpu_attrs_
> Thanks for the explanation. I agree that we should be defensive. Maybe all
How about we move the method EnableCPUFeature out from this class and into the 
test class only and maintain this duplication of attributes there.
and also comment on cpu_attrs_ that it should not be modified except in 
InitializeLlvm() or for testing purposes.


http://gerrit.cloudera.org:8080/#/c/10979/3/be/src/codegen/llvm-codegen.cc
File be/src/codegen/llvm-codegen.cc:

http://gerrit.cloudera.org:8080/#/c/10979/3/be/src/codegen/llvm-codegen.cc@1519
PS3, Line 1519: long
nit: not a big deal but just to be consistent with CpuInfo and the static 
flags, use int64_t



--
To view, visit http://gerrit.cloudera.org:8080/10979
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifece8949c143146d2a1b38d72d21c2d733bed90f
Gerrit-Change-Number: 10979
Gerrit-PatchSet: 3
Gerrit-Owner: Pooja Nilangekar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 20 Jul 2018 19:10:42 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7219. 7.5% of Catalog Server heap wasted by empty HashMaps and ArrayLists

2018-07-20 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10982 )

Change subject: IMPALA-7219. 7.5% of Catalog Server heap wasted by empty 
HashMaps and ArrayLists
..


Patch Set 2:

Right, it seems like this is just a case of wastage by IncompleteTable objects, 
which contain the entirety of the 'Table' superclass but in fact use none of 
it. We could probably get away with a much cheaper implementation of 
IncompleteTable that has only the "name" field and nothing else.


--
To view, visit http://gerrit.cloudera.org:8080/10982
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If9c75f65ecb3ba3f2c739fa483a84dc052f471c6
Gerrit-Change-Number: 10982
Gerrit-PatchSet: 2
Gerrit-Owner: Misha Dmitriev 
Gerrit-Reviewer: Misha Dmitriev 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 20 Jul 2018 19:01:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6994: Avoid reloading a table's HMS data for file-only operations

2018-07-20 Thread Pranay Singh (Code Review)
Hello Bharath Vissapragada, Balazs Jeszenszky, Todd Lipcon, Vuk Ercegovac, 

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10587

to look at the new patch set (#4).

Change subject: IMPALA-6994: Avoid reloading a table's HMS data for file-only 
operations
..

IMPALA-6994: Avoid reloading a table's HMS data for file-only operations

The problem is that while inserting a new row to an existing partition
of a HDFS table or to an unpartitioned table, catalogd makes an unnecessary
call to Hive MetaStore to do a getTable() and list the partitions of the
table. In such a case where a row is being inserted to an existing
partition or to an unpartitioned table only the file metadata for
HDFS tables needs to be updated.

This change avoids calling the Hive Meta Store in updatePartitionsFromHms()
by providing a hint from the caller, to skip loading the partitions.

Testing:
  Ran core test without failure.

Change-Id: I331bb0371fde287f43a85b025b4f98cb45f3eb3c
---
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
2 files changed, 126 insertions(+), 52 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/10587/4
--
To view, visit http://gerrit.cloudera.org:8080/10587
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I331bb0371fde287f43a85b025b4f98cb45f3eb3c
Gerrit-Change-Number: 10587
Gerrit-PatchSet: 4
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Pranay Singh
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-7234: Improve memory estimates produced by the Planner

2018-07-20 Thread Pooja Nilangekar (Code Review)
Pooja Nilangekar has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11001


Change subject: IMPALA-7234: Improve memory estimates produced by the Planner
..

IMPALA-7234: Improve memory estimates produced by the Planner

Previously, the planner used the getMajorityFormat to estimate
the memory requirements of its partitions. Additionally, before
IMPALA-6625 was merged, the majority format for a multi-format
table with no numerical majority was calcuated using a HashMap,
thus producing non deterministic results. This change ensures that
the memory esimate is deterministic and always based on partition
that has the maximum memory requirement.

Testing: Ran all PlannerTests. Also, modified plans of scans with
multiple partitions to ensure that the memory esimate produced
corresponds to the partition with the maximum requirement.

Change-Id: I0666ae3d45fbd8615d3fa9a8626ebd29cf94fb4b
---
M fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java
M fe/src/main/java/org/apache/impala/catalog/FeFsTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsTableSink.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering-disabled.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test
8 files changed, 51 insertions(+), 63 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/11001/1
--
To view, visit http://gerrit.cloudera.org:8080/11001
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I0666ae3d45fbd8615d3fa9a8626ebd29cf94fb4b
Gerrit-Change-Number: 11001
Gerrit-PatchSet: 1
Gerrit-Owner: Pooja Nilangekar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Pooja Nilangekar 


[Impala-ASF-CR] IMPALA-6994: Avoid reloading a table's HMS data for file-only operations

2018-07-20 Thread Pranay Singh (Code Review)
Pranay Singh has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10587 )

Change subject: IMPALA-6994: Avoid reloading a table's HMS data for file-only 
operations
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/10587/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/10587/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1251
PS3, Line 1251:  RELOAD_PARTITION_METADATA if not set in flags results in not 
getting the
  :* partition details from HiveMetaStore in 
updatePartitionsFromHms().
> nit: this double negative is a bit confusing. Perhaps it's clearer to rephr
Changed the comment, thanks for rewording it.


http://gerrit.cloudera.org:8080/#/c/10587/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1254
PS3, Line 1254: or if there are no partitions in the table
> I'm not sure I follow this part. How does the caller know if there are no p
That's right the unpartitioned case doesn't have to consider this, I have 
removed the sentence.



--
To view, visit http://gerrit.cloudera.org:8080/10587
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I331bb0371fde287f43a85b025b4f98cb45f3eb3c
Gerrit-Change-Number: 10587
Gerrit-PatchSet: 3
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Pranay Singh
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 20 Jul 2018 18:57:05 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7219. 7.5% of Catalog Server heap wasted by empty HashMaps and ArrayLists

2018-07-20 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10982 )

Change subject: IMPALA-7219. 7.5% of Catalog Server heap wasted by empty 
HashMaps and ArrayLists
..


Patch Set 2:

My question (2) may have been unclear. I'm referring to StructType's HashMap 
that you're changed to be lazy initialized. In particular, I'm noting that its 
empty only when the no-arg constructor is called or when the input fields array 
is empty. Todd's question is basically the same (in StructType): why are so 
many empty (e.g., no field) structs created? We can look at call-sites.. 
perhaps we should change how/where these structs are used.


--
To view, visit http://gerrit.cloudera.org:8080/10982
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If9c75f65ecb3ba3f2c739fa483a84dc052f471c6
Gerrit-Change-Number: 10982
Gerrit-PatchSet: 2
Gerrit-Owner: Misha Dmitriev 
Gerrit-Reviewer: Misha Dmitriev 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 20 Jul 2018 18:56:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5542: Impala cannot scan Parquet decimal stored as int64 t/int32 t

2018-07-20 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11000 )

Change subject: IMPALA-5542: Impala cannot scan Parquet decimal stored as 
int64_t/int32_t
..


Patch Set 1:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/11000/1/be/src/exec/parquet-column-readers.cc
File be/src/exec/parquet-column-readers.cc:

http://gerrit.cloudera.org:8080/#/c/11000/1/be/src/exec/parquet-column-readers.cc@1536
PS1, Line 1536: DCHECK_EQ(sizeof(Decimal4Value::StorageType), 
slot_desc->type().GetByteSize());
i dont remember where the file level column metadata validation occurs in code, 
but can you add this check there in case the metadata is corrupted and/or does 
not match the hms column metadata


http://gerrit.cloudera.org:8080/#/c/11000/1/be/src/exec/parquet-common.h
File be/src/exec/parquet-common.h:

http://gerrit.cloudera.org:8080/#/c/11000/1/be/src/exec/parquet-common.h@62
PS1, Line 62:   static int ByteSize(const InternalType& v) { return 
sizeof(InternalType); }
> I'm still confused about what this function does. Can we rename to somethin
This methods represents the encoded byte size if the internalType is written to 
parquet. I guess the confusion lies where in there is special casing for when 
this is used for encoding/decoding of fixed/variable len types.
>From what i understand this used to get the encoded size of an internalType 
>only for fixed len types, except for String type.
---
returning -1 makes no sense and it only does that for cases which are 
impossible, hence the DCHECK(false) associated with them.

The part in HdfsParquetTableWriter is correct since a plain_encoded_value_size_ 
< 0 means that its a variable len encoded type and since impala supports only 
string type to be written as a variable len encoded type, it calls on 
ParquetPlainEncoder::ByteSize to get it encode byte size which.


http://gerrit.cloudera.org:8080/#/c/11000/1/be/src/exec/parquet-common.h@182
PS1, Line 182:   /// Decodes t from 'buffer', reading up to the byte before 
'buffer_end'. 'buffer'
 :   /// need not be aligned. If PARQUET_TYPE is 
FIXED_LEN_BYTE_ARRAY then 'fixed_len_size'
 :   /// is the size of the object. Otherwise, it is unused.
 :   /// Returns the number of bytes read or -1 if the value was 
not decoded successfully.
I think it will be super helpful to list which types are decoded by this 
template method and which ones have specialization. What do you all think?


http://gerrit.cloudera.org:8080/#/c/11000/1/be/src/exec/parquet-common.h@208
PS1, Line 208: // Only used for testing to test decimals stored as INT32.
 :   // Otherwise Impala stores decimals as FIXED_LEN_BYTE_ARRAY.
a bit misleading, this implies that impala only supports this type for testing 
(that too not sure if for reading or writing). and the second line implies this 
method is only used for writing.


http://gerrit.cloudera.org:8080/#/c/11000/1/be/src/exec/parquet-metadata-utils.cc
File be/src/exec/parquet-metadata-utils.cc:

http://gerrit.cloudera.org:8080/#/c/11000/1/be/src/exec/parquet-metadata-utils.cc@195
PS1, Line 195:  if (slot_desc->type().type == TYPE_DECIMAL) {
here it is! you should add checks here to make sure int32 and int64 encoded 
decimal types have the right precision.



--
To view, visit http://gerrit.cloudera.org:8080/11000
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8c41bfc7c1664bdba5099d3893dc8dbe4304794
Gerrit-Change-Number: 11000
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 20 Jul 2018 18:49:33 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7219. 7.5% of Catalog Server heap wasted by empty HashMaps and ArrayLists

2018-07-20 Thread Misha Dmitriev (Code Review)
Misha Dmitriev has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10982 )

Change subject: IMPALA-7219. 7.5% of Catalog Server heap wasted by empty 
HashMaps and ArrayLists
..


Patch Set 2:

Thank you for the quick response, Vuk. Answering your questions:

1. The respective IMPALA-7219 contains more details. In the case that I looked 
at, there was exactly the same number (a little less than 8 million) of both 
kinds of HashMaps. I.e. ~8M coming from IncompleteTable.colsByName_ and ~8M 
coming from StructType.fieldMap_. Each empty HashMap takes 48 bytes, so 
collectively they wasted ~700MB of memory.

2. Let me clarify. There is no such thing as an object (collection) "with no 
fields". When you call 'new HashMap()', you always create a HashMap object, 
which internally has fields like 'int size, threshold' etc. The good news is 
that the most important field, 'HashMap$Entry table[]' is null until the first 
element is added to this table. But just the above data fields, plus the 
12-byte internal header that each object in the JVM heap has, result in the 
fact that an empty, completely unpopulated HashMap, occupies 48 bytes. This is 
what I want to get rid of. If all this still seems a little vague, I suggest 
you take a look at the slides here: 
https://www.slideshare.net/MikhailMishaDmitriev/java-memory-analysis-problems-and-solutions?trk=v-feed
 In particular, slide 10 shows the internals and lifecycle of a HashMap.

So your suggestion in (2) may or may not make sense depending what exactly you 
had in mind, and needs some clarification.

Regarding tests: yes, I ran it through the Impala Jenkins build, and all tests 
passed: https://jenkins.impala.io/job/pre-review-test/186/


--
To view, visit http://gerrit.cloudera.org:8080/10982
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If9c75f65ecb3ba3f2c739fa483a84dc052f471c6
Gerrit-Change-Number: 10982
Gerrit-PatchSet: 2
Gerrit-Owner: Misha Dmitriev 
Gerrit-Reviewer: Misha Dmitriev 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 20 Jul 2018 18:41:03 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7325: Incorrect SHOW CREATE VIEW with built-in functions

2018-07-20 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10995 )

Change subject: IMPALA-7325: Incorrect SHOW CREATE VIEW with built-in functions
..


Patch Set 3: Code-Review+1

(1 comment)

Carry Adam's +1

http://gerrit.cloudera.org:8080/#/c/10995/2/fe/src/test/java/org/apache/impala/analysis/AuthorizationStmtTest.java
File fe/src/test/java/org/apache/impala/analysis/AuthorizationStmtTest.java:

http://gerrit.cloudera.org:8080/#/c/10995/2/fe/src/test/java/org/apache/impala/analysis/AuthorizationStmtTest.java@889
PS2, Line 889:
> nit: add empty line back
Done



--
To view, visit http://gerrit.cloudera.org:8080/10995
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia164c55fd9459cf5f11eb72561e9cd4ffe1d5367
Gerrit-Change-Number: 10995
Gerrit-PatchSet: 3
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 20 Jul 2018 18:28:57 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7325: Incorrect SHOW CREATE VIEW with built-in functions

2018-07-20 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/10995 )

Change subject: IMPALA-7325: Incorrect SHOW CREATE VIEW with built-in functions
..

IMPALA-7325: Incorrect SHOW CREATE VIEW with built-in functions

In the prior code, the authorization checker for the masked privilege
requests skips the check for system database access. As a result, certain
commands, such as SHOW CREATE VIEW that references built-in database
requires permission to access to the built-in database where accessing
built-in database should always be allowed. The patch fixes it by using
the authorizePrivilegeRequest() method that does a check on the system
database similar to how other authorization checks are performed.

Testing:
- Added new authorization test
- Ran all FE tests

Change-Id: Ia164c55fd9459cf5f11eb72561e9cd4ffe1d5367
---
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/test/java/org/apache/impala/analysis/AuthorizationStmtTest.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
3 files changed, 44 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/10995/3
--
To view, visit http://gerrit.cloudera.org:8080/10995
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia164c55fd9459cf5f11eb72561e9cd4ffe1d5367
Gerrit-Change-Number: 10995
Gerrit-PatchSet: 3
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-6299: Modify IRBuilder to use LLVM's CPU features

2018-07-20 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10979 )

Change subject: IMPALA-6299: Modify IRBuilder to use LLVM's CPU features
..


Patch Set 2:

(3 comments)

No worries about the rebase, it happens sometimes.

http://gerrit.cloudera.org:8080/#/c/10979/2/be/src/codegen/llvm-codegen-test.cc
File be/src/codegen/llvm-codegen-test.cc:

http://gerrit.cloudera.org:8080/#/c/10979/2/be/src/codegen/llvm-codegen-test.cc@467
PS2, Line 467:   CpuInfo::EnableFeature(CpuInfo::SSE4_2, false);
> Yes, because `expected_hash` is computed using the `HashUtil::Hash` functio
Thanks for explaining. Might be worth leaving a brief comment explaining this.


http://gerrit.cloudera.org:8080/#/c/10979/2/be/src/codegen/llvm-codegen.h
File be/src/codegen/llvm-codegen.h:

http://gerrit.cloudera.org:8080/#/c/10979/2/be/src/codegen/llvm-codegen.h@737
PS2, Line 737: current_cpu_attrs_
Thanks for the explanation. I agree that we should be defensive. Maybe all this 
needs is a slight clarifying comment about when the contents of the maps may 
change?

> The other option I can think of is trying to figure out from the context if 
> this is an impalad instance or a test call.
We have TestInfo::is_test() - you could add a DCHECK.


http://gerrit.cloudera.org:8080/#/c/10979/2/be/src/util/bit-util-test.cc
File be/src/util/bit-util-test.cc:

http://gerrit.cloudera.org:8080/#/c/10979/2/be/src/util/bit-util-test.cc@325
PS2, Line 325:   const int64_t CPU_INFO_FLAG = CpuInfo::SSSE3;
> I had to change it because I moved the declaration of CpuInfo::SSSE3 to cpu
Gotcha, didn't think of that



--
To view, visit http://gerrit.cloudera.org:8080/10979
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifece8949c143146d2a1b38d72d21c2d733bed90f
Gerrit-Change-Number: 10979
Gerrit-PatchSet: 2
Gerrit-Owner: Pooja Nilangekar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 20 Jul 2018 18:28:29 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7219. 7.5% of Catalog Server heap wasted by empty HashMaps and ArrayLists

2018-07-20 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10982 )

Change subject: IMPALA-7219. 7.5% of Catalog Server heap wasted by empty 
HashMaps and ArrayLists
..


Patch Set 2:

(5 comments)

See suggestion about changing IncompleteTable to not inherit from Table -- 
maybe now with the 'FeTable' interface we could avoid that?

http://gerrit.cloudera.org:8080/#/c/10982/2/fe/src/main/java/org/apache/impala/catalog/StructType.java
File fe/src/main/java/org/apache/impala/catalog/StructType.java:

http://gerrit.cloudera.org:8080/#/c/10982/2/fe/src/main/java/org/apache/impala/catalog/StructType.java@37
PS2, Line 37: public class StructType extends Type {
agreed with Vuk -- not clear to me why we have so many empty structs. I thought 
StructType is only used for describing the schema of a table, and a table with 
no fields seems uncommon. Do you have handy the "paths to GC root" for these 
empty fields?


http://gerrit.cloudera.org:8080/#/c/10982/2/fe/src/main/java/org/apache/impala/catalog/StructType.java@39
PS2, Line 39:   // Made non-final so that it can be initialized lazily to save 
memory when this map is not used.
nit: this comment makes sense when looking at this as a patch, but when someone 
sees the code later, the "Made non-final" thing isn't super clear.

Perhaps better to just describe the current treatment of the member, like: "Map 
of field name to field. Null is used to indicate empty to save memory."


http://gerrit.cloudera.org:8080/#/c/10982/2/fe/src/main/java/org/apache/impala/catalog/StructType.java@84
PS2, Line 84:   public StructField getField(String fieldName) {
I wonder if we could do even better here and only lazily construct a field map 
at places where method is called. it seems this is the only usage site of the 
field map, and best I can tell, this method is only called during query 
analysis, and never by the catalogd itself. Given that, it doesn't really make 
sense to keep around these members, when it would be pretty trivial to 
reconstruct the map as necessary during path resolution. Then we could fully 
remove all these maps (even the non-empty ones). Would that save significant 
memory?

Another thought here: it seems like the most common case where we have an empty 
struct woudl be a reference from IncompleteTable. I'm guessing you were looking 
at a catalogd dump from a cluster with a lot of tables? Maybe we could change 
IncompleteTable to not inherit from 'Table' so that it can avoid all of these 
members entirely?


http://gerrit.cloudera.org:8080/#/c/10982/2/fe/src/main/java/org/apache/impala/catalog/Table.java
File fe/src/main/java/org/apache/impala/catalog/Table.java:

http://gerrit.cloudera.org:8080/#/c/10982/2/fe/src/main/java/org/apache/impala/catalog/Table.java@320
PS2, Line 320: {
 :   return;
 : }
> nit: single line
same as below


http://gerrit.cloudera.org:8080/#/c/10982/2/fe/src/main/java/org/apache/impala/catalog/Table.java@447
PS2, Line 447:   public Column getColumn(String name) {
I think it would actually be an error to call getColumn before the colsByName_ 
map was initted. Such a call would indicate that columns haven't been loaded.



--
To view, visit http://gerrit.cloudera.org:8080/10982
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If9c75f65ecb3ba3f2c739fa483a84dc052f471c6
Gerrit-Change-Number: 10982
Gerrit-PatchSet: 2
Gerrit-Owner: Misha Dmitriev 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 20 Jul 2018 18:26:02 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7296: bytes limit for row batch queue

2018-07-20 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10977 )

Change subject: IMPALA-7296: bytes limit for row batch queue
..


Patch Set 5:

Mainly to avoid potentially making things worse for narrow rows. I'm open to 
changing it but this was the easiest way I could convince myself that it 
wouldn't make anything worse.

There are some weird cases like count(*) where we may end up with row batches 
that have very little memory - just the tuple_ptrs_ which aren't counted in the 
byte size. I did some rough calculations and in those cases, even if we include 
the tuple_ptrs_ size, we could potentially end up with 1000s of batches in the 
queue, which would mean more untracked memory.

An alternative is to add some constant number of bytes per row batch for 
overhead.


--
To view, visit http://gerrit.cloudera.org:8080/10977
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaa06d1d8da2a6d101efda08f620c0bf84a71e681
Gerrit-Change-Number: 10977
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 20 Jul 2018 18:22:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7325: Incorrect SHOW CREATE VIEW with built-in functions

2018-07-20 Thread Adam Holley (Code Review)
Adam Holley has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10995 )

Change subject: IMPALA-7325: Incorrect SHOW CREATE VIEW with built-in functions
..


Patch Set 2: Code-Review+1

(1 comment)

Good after nit.

http://gerrit.cloudera.org:8080/#/c/10995/2/fe/src/test/java/org/apache/impala/analysis/AuthorizationStmtTest.java
File fe/src/test/java/org/apache/impala/analysis/AuthorizationStmtTest.java:

http://gerrit.cloudera.org:8080/#/c/10995/2/fe/src/test/java/org/apache/impala/analysis/AuthorizationStmtTest.java@889
PS2, Line 889: @Test
nit: add empty line back



--
To view, visit http://gerrit.cloudera.org:8080/10995
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia164c55fd9459cf5f11eb72561e9cd4ffe1d5367
Gerrit-Change-Number: 10995
Gerrit-PatchSet: 2
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 20 Jul 2018 18:22:03 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6923: Update scripts in benchmark folder to store workload and few minor updates

2018-07-20 Thread Nithya Janarthanan (Code Review)
Nithya Janarthanan has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10100 )

Change subject: IMPALA-6923: Update scripts in benchmark folder to store 
workload and few minor updates
..


Patch Set 12:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/10100/12/tests/benchmark/perf_result_datastore.py
File tests/benchmark/perf_result_datastore.py:

http://gerrit.cloudera.org:8080/#/c/10100/12/tests/benchmark/perf_result_datastore.py@23
PS12, Line 23: from subprocess import *
> "*" is an antipattern for imports. It is better to be explicit about import
Done

Removed this import since it is not used in the file


http://gerrit.cloudera.org:8080/#/c/10100/12/tests/benchmark/perf_result_datastore.py@63
PS12, Line 63:   def insert_workload_summary(self, run_info_id):
 : self._insert_workload_summary(run_info_id)
> This doesn't seem to do anything useful, so just rename _insert_workload_su
Done. Good catch


http://gerrit.cloudera.org:8080/#/c/10100/12/tests/benchmark/perf_result_datastore.py@85
PS12, Line 85:   dont_save_profiles):
> Negatives are weird. It would be better to call this save_profiles and flip
Done


http://gerrit.cloudera.org:8080/#/c/10100/12/tests/benchmark/perf_result_datastore.py@88
PS12, Line 88: temp_profile_folder = 
os.path.join(home_dir,"workspace","impala-workload-runner","profiles")
> This is badly formated. There should be spaces between parameters. impala-f
Done


http://gerrit.cloudera.org:8080/#/c/10100/12/tests/benchmark/perf_result_datastore.py@155
PS12, Line 155: self._cache: Dictionary of with run_info/user as 
key/value
> This isn't quite right. self._cache appears to be a hash in which a tuple o
Done



--
To view, visit http://gerrit.cloudera.org:8080/10100
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ica7c00ad59963d466bae9e607a4692af0138962c
Gerrit-Change-Number: 10100
Gerrit-PatchSet: 12
Gerrit-Owner: Nithya Janarthanan 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Jinchul Kim 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Nithya Janarthanan 
Gerrit-Comment-Date: Fri, 20 Jul 2018 18:11:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7329: Blacklist CDH Maven snapshots repository

2018-07-20 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has removed Jim Apple from this change.  ( 
http://gerrit.cloudera.org:8080/10999 )

Change subject: IMPALA-7329: Blacklist CDH Maven snapshots repository
..


Removed reviewer Jim Apple.
--
To view, visit http://gerrit.cloudera.org:8080/10999
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: deleteReviewer
Gerrit-Change-Id: Id945bc2769f92f3df3bb4f617b00db77a6502ff3
Gerrit-Change-Number: 10999
Gerrit-PatchSet: 3
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Philip Zeyliger 


[Impala-ASF-CR] IMPALA-7219. 7.5% of Catalog Server heap wasted by empty HashMaps and ArrayLists

2018-07-20 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10982 )

Change subject: IMPALA-7219. 7.5% of Catalog Server heap wasted by empty 
HashMaps and ArrayLists
..


Patch Set 2:

(2 comments)

thx for the changes!
questions:
(1) do you know which one of these contributes to more empty collections 
(structs or tables) for the case(s) you looked at?
(2) an empty collection for a struct comes up when it has no fields (otherwise 
there will be no savings). have you seen examples of call-sites where these 
data structures are created with no fields? perhaps some of those are worth 
following up (separately).

http://gerrit.cloudera.org:8080/#/c/10982/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10982/2//COMMIT_MSG@14
PS2, Line 14:
pls add any tests that you ran. for this change, passing existing tests should 
be fine.


http://gerrit.cloudera.org:8080/#/c/10982/2/fe/src/main/java/org/apache/impala/catalog/Table.java
File fe/src/main/java/org/apache/impala/catalog/Table.java:

http://gerrit.cloudera.org:8080/#/c/10982/2/fe/src/main/java/org/apache/impala/catalog/Table.java@320
PS2, Line 320: {
 :   return;
 : }
nit: single line



--
To view, visit http://gerrit.cloudera.org:8080/10982
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If9c75f65ecb3ba3f2c739fa483a84dc052f471c6
Gerrit-Change-Number: 10982
Gerrit-PatchSet: 2
Gerrit-Owner: Misha Dmitriev 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 20 Jul 2018 17:55:58 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7256: Aggregator mem usage isn't reflected in summary

2018-07-20 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10989 )

Change subject: IMPALA-7256: Aggregator mem usage isn't reflected in summary
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2844/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/10989
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iba6ef207bed47810fc742aec3481db5f313cf97f
Gerrit-Change-Number: 10989
Gerrit-PatchSet: 3
Gerrit-Owner: Thomas Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 20 Jul 2018 17:45:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7256: Aggregator mem usage isn't reflected in summary

2018-07-20 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10989 )

Change subject: IMPALA-7256: Aggregator mem usage isn't reflected in summary
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/10989
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iba6ef207bed47810fc742aec3481db5f313cf97f
Gerrit-Change-Number: 10989
Gerrit-PatchSet: 3
Gerrit-Owner: Thomas Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 20 Jul 2018 17:45:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7219. 7.5% of Catalog Server heap wasted by empty HashMaps and ArrayLists

2018-07-20 Thread Misha Dmitriev (Code Review)
Misha Dmitriev has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10982


Change subject: IMPALA-7219. 7.5% of Catalog Server heap wasted by empty 
HashMaps and ArrayLists
..

IMPALA-7219. 7.5% of Catalog Server heap wasted by empty HashMaps and ArrayLists

This change switches initialization of HashMaps in IncompleteTable.colsByName_ 
and
StructType.fieldMap_ from eager to lazy. In this way, we avoid wasting memory
when HashMaps stay empty (even an empty HashMap uses at least 48 bytes in the 
heap).
This optimization becomes really relevant when a catalog server loads a very 
large
number (millions) of tables.

Change-Id: If9c75f65ecb3ba3f2c739fa483a84dc052f471c6
---
M fe/src/main/java/org/apache/impala/catalog/StructType.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
2 files changed, 35 insertions(+), 8 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/10982/2
--
To view, visit http://gerrit.cloudera.org:8080/10982
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: If9c75f65ecb3ba3f2c739fa483a84dc052f471c6
Gerrit-Change-Number: 10982
Gerrit-PatchSet: 2
Gerrit-Owner: Misha Dmitriev 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-7209: Disallow self referencing in ALTER VIEW statements

2018-07-20 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10908 )

Change subject: IMPALA-7209: Disallow self referencing in ALTER VIEW statements
..


Patch Set 6:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/10908/4/fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java
File fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java:

http://gerrit.cloudera.org:8080/#/c/10908/4/fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java@75
PS4, Line 75: InlineViewRef fromViewRef = (InlineViewRef) 
fromTblRef;
> That was my initial approach. However, I faced the issue that while getting
right, we ideally need to implement something like collectInlineViewRefs() in 
all the subclasses but that is probably not worth the effort since it is not 
used anywhere else.


http://gerrit.cloudera.org:8080/#/c/10908/6/fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java
File fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java:

http://gerrit.cloudera.org:8080/#/c/10908/6/fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java@69
PS6, Line 69: while (!subQueries.isEmpty()) {
:   QueryStmt stmt = subQueries.pop();
:   if (stmt instanceof SelectStmt) {
: List fromTblRefs = ((SelectStmt) 
stmt).getTableRefs();
: for (TableRef fromTblRef : fromTblRefs) {
:   if (fromTblRef instanceof InlineViewRef) {
: InlineViewRef fromViewRef = (InlineViewRef) 
fromTblRef;
: if (fromViewRef.getView() == ((InlineViewRef) 
tblRef).getView()) {
:   throw new AnalysisException(String.format(
:   "Self-reference not allowed on view: %s", 
tblRef.toSql()));
: } else {
:   subQueries.push(fromViewRef.getViewStmt());
: }
:   }
: }
: QueryStmt whereStmt = ((SelectStmt) 
stmt).getWhereQueryStmt();
: if (whereStmt != null) {
:   subQueries.add(whereStmt);
: }
:   } else if (stmt instanceof UnionStmt) {
: List unionOperands = ((UnionStmt) 
stmt).getOperands();
: for (UnionOperand operand : unionOperands) {
:   subQueries.push(operand.getQueryStmt());
: }
:   } else {
: throw new AnalysisException("Subqueries not supported for 
" +
: stmt.getClass().getSimpleName() + " statements");
:   }
:
How about factoring this out into a method in QueryStmt and have helper methods 
in SelectStmt and UnionStmt to collect each of their own view refs? That way 
you could do something like if 
(viewDefStmt_.collectInlineViewRefs().contains(tblRef.getView())...



--
To view, visit http://gerrit.cloudera.org:8080/10908
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I17c231c9d74d9d411463a408b086eb874090b9b7
Gerrit-Change-Number: 10908
Gerrit-PatchSet: 6
Gerrit-Owner: Pooja Nilangekar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Comment-Date: Fri, 20 Jul 2018 17:28:36 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7296: bytes limit for row batch queue

2018-07-20 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10977 )

Change subject: IMPALA-7296: bytes limit for row batch queue
..


Patch Set 5:

Hi Tim, will it be simpler in general to just use max_bytes instead of both 
(max_batches, max_bytes) to control the queue capacity ? Can you please 
elaborate on the reasoning for not doing so ?


--
To view, visit http://gerrit.cloudera.org:8080/10977
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaa06d1d8da2a6d101efda08f620c0bf84a71e681
Gerrit-Change-Number: 10977
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 20 Jul 2018 17:27:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6994: Avoid reloading a table's HMS data for file-only operations

2018-07-20 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10587 )

Change subject: IMPALA-6994: Avoid reloading a table's HMS data for file-only 
operations
..


Patch Set 3:

(3 comments)

Just left some notes on the HdfsTable side. I dont know the CatalogOpExecutor 
code well so probably better for someone else to take a look at that part.

http://gerrit.cloudera.org:8080/#/c/10587/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/10587/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1251
PS3, Line 1251:  RELOAD_PARTITION_METADATA if not set in flags results in not 
getting the
  :* partition details from HiveMetaStore in 
updatePartitionsFromHms().
nit: this double negative is a bit confusing. Perhaps it's clearer to rephrase 
like:

If RELOAD_PARTITION_METADATA is set in flags, the partition details will be 
reloaded from the HMS. In some cases, this may not be necessary, and when 
unset, an expensive call to the HMS can be avoided.


http://gerrit.cloudera.org:8080/#/c/10587/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1254
PS3, Line 1254: or if there are no partitions in the table
I'm not sure I follow this part. How does the caller know if there are no 
partitions in the table without fetching the list of partitions? In the case of 
an unpartitioned table (no partition keys) the code already takes care of this 
internally without the caller having to pass it, right?


http://gerrit.cloudera.org:8080/#/c/10587/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1418
PS3, Line 1418:  Updates the partitions of an HdfsTable so that they are in 
sync with the Hive
  :* Metastore.
this doesn't seem to be an accurate description of the function, since the 
function also reloads file metadata, right? Maybe the function should just be 
called updatePartitionMetadata? Then it's clearer that the flags specify 
whether we want to update the file metadata, the list of partitions, or both, 
right?



--
To view, visit http://gerrit.cloudera.org:8080/10587
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I331bb0371fde287f43a85b025b4f98cb45f3eb3c
Gerrit-Change-Number: 10587
Gerrit-PatchSet: 3
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Pranay Singh
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 20 Jul 2018 16:57:43 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7329: Blacklist CDH Maven snapshots repository

2018-07-20 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/10999 )

Change subject: IMPALA-7329: Blacklist CDH Maven snapshots repository
..

IMPALA-7329: Blacklist CDH Maven snapshots repository

The Impala development boostrapping depends on CDH Maven snapshots
which transitively pull dependencies from other repositories which
can cause the build to be non-reproducible, e.g. IMPALA-7316. This
patch makes the build to be reproducible by blacklisting
cdh.snapshots.repo so that Maven does not accidentally downloads the
latest CDH snapshots when running a build, which can cause
incompatibility issues.

Testing:
- Ran core tests with CDH_BUILD_NUMBER=422770 and did not hit the
  issue described in IMPALA-7316

Change-Id: Id945bc2769f92f3df3bb4f617b00db77a6502ff3
---
M impala-parent/pom.xml
1 file changed, 11 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/99/10999/3
--
To view, visit http://gerrit.cloudera.org:8080/10999
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id945bc2769f92f3df3bb4f617b00db77a6502ff3
Gerrit-Change-Number: 10999
Gerrit-PatchSet: 3
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Philip Zeyliger 


[Impala-ASF-CR] IMPALA-5542: Impala cannot scan Parquet decimal stored as int64 t/int32 t

2018-07-20 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11000 )

Change subject: IMPALA-5542: Impala cannot scan Parquet decimal stored as 
int64_t/int32_t
..


Patch Set 1:

(2 comments)

My main question arose out of me trying to understand the implications of the 
changes to ByteSize() - I feel like there's something weird there.

http://gerrit.cloudera.org:8080/#/c/11000/1/be/src/exec/parquet-common.h
File be/src/exec/parquet-common.h:

http://gerrit.cloudera.org:8080/#/c/11000/1/be/src/exec/parquet-common.h@62
PS1, Line 62:   static int ByteSize(const InternalType& v) { return 
sizeof(InternalType); }
I'm still confused about what this function does. Can we rename to something 
like "InternalByteSize()". What does it returning -1 mean also?

It's weird that it also sometimes seems to be used to return the encoded byte 
size - that seems wrong.

E.g. I'm looking at this logic in HdfsParquetTableWriter and wondering if it's 
wrong:

  *bytes_needed = plain_encoded_value_size_ < 0 ?
  ParquetPlainEncoder::ByteSize(*v) :
  plain_encoded_value_size_;


http://gerrit.cloudera.org:8080/#/c/11000/1/tests/query_test/test_scanners.py
File tests/query_test/test_scanners.py:

http://gerrit.cloudera.org:8080/#/c/11000/1/tests/query_test/test_scanners.py@293
PS1, Line 293:   def _create_table_from_file(self, table_name, unique_database):
It would be nice to use this for other tests (no requirement, just an 
observation :)).



--
To view, visit http://gerrit.cloudera.org:8080/11000
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8c41bfc7c1664bdba5099d3893dc8dbe4304794
Gerrit-Change-Number: 11000
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 20 Jul 2018 16:07:57 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6490: Reconnect shell when remote restarts

2018-07-20 Thread Le Minh Nghia (Code Review)
Le Minh Nghia has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10992 )

Change subject: IMPALA-6490: Reconnect shell when remote restarts
..


Patch Set 3:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/10992/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10992/3//COMMIT_MSG@9
PS3, Line 9: impad
> typo: impalad
Done


http://gerrit.cloudera.org:8080/#/c/10992/3/shell/impala_client.py
File shell/impala_client.py:

http://gerrit.cloudera.org:8080/#/c/10992/3/shell/impala_client.py@233
PS3, Line 233: If not, an exception
 : will be catched
> I think the internal workings of the function shouldn't be described in the
Done


http://gerrit.cloudera.org:8080/#/c/10992/3/shell/impala_shell.py
File shell/impala_shell.py:

http://gerrit.cloudera.org:8080/#/c/10992/3/shell/impala_shell.py@614
PS3, Line 614: promt
> typo: prompt
Why do I have so many typos? :(. Done


http://gerrit.cloudera.org:8080/#/c/10992/3/tests/custom_cluster/test_shell_interactive_reconnect.py
File tests/custom_cluster/test_shell_interactive_reconnect.py:

http://gerrit.cloudera.org:8080/#/c/10992/3/tests/custom_cluster/test_shell_interactive_reconnect.py@85
PS3, Line 85: self._start_impala_cluster(["--kill"])
: self._start_impala_cluster([])
> Is this really needed? Before each test a clean new impala cluster is start
Done



--
To view, visit http://gerrit.cloudera.org:8080/10992
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia13365a9696886f01294e98054cf4e7cd66ab712
Gerrit-Change-Number: 10992
Gerrit-PatchSet: 3
Gerrit-Owner: Le Minh Nghia 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Le Minh Nghia 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 20 Jul 2018 14:10:27 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6490: Reconnect shell when remote restarts

2018-07-20 Thread Le Minh Nghia (Code Review)
Hello Zoltan Borok-Nagy, Attila Jeges,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10992

to look at the new patch set (#5).

Change subject: IMPALA-6490: Reconnect shell when remote restarts
..

IMPALA-6490: Reconnect shell when remote restarts

If the remote impalad died while the shell waited for a
command to complete, the shell disconnected. Previously
after restarting the remote impalad, we need to run
"connect;" to reconnect, now the shell will automatically
reconnect.

Testing:
Added test_auto_connect_after_impalad_died in
test_shell_interactive_reconnect.py

Change-Id: Ia13365a9696886f01294e98054cf4e7cd66ab712
---
M shell/impala_client.py
M shell/impala_shell.py
M tests/common/custom_cluster_test_suite.py
M tests/custom_cluster/test_shell_interactive_reconnect.py
4 files changed, 52 insertions(+), 16 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/10992/5
--
To view, visit http://gerrit.cloudera.org:8080/10992
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia13365a9696886f01294e98054cf4e7cd66ab712
Gerrit-Change-Number: 10992
Gerrit-PatchSet: 5
Gerrit-Owner: Le Minh Nghia 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Le Minh Nghia 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-6490: Reconnect shell when remote restarts

2018-07-20 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10992 )

Change subject: IMPALA-6490: Reconnect shell when remote restarts
..


Patch Set 4:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/10992/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10992/3//COMMIT_MSG@9
PS3, Line 9: impad
typo: impalad


http://gerrit.cloudera.org:8080/#/c/10992/3/shell/impala_client.py
File shell/impala_client.py:

http://gerrit.cloudera.org:8080/#/c/10992/3/shell/impala_client.py@233
PS3, Line 233: If not, an exception
 : will be caught
I think the internal workings of the function shouldn't be described in the doc 
comment.
The user of this method only needs to know what will be the result on what 
condition.


http://gerrit.cloudera.org:8080/#/c/10992/3/shell/impala_shell.py
File shell/impala_shell.py:

http://gerrit.cloudera.org:8080/#/c/10992/3/shell/impala_shell.py@614
PS3, Line 614: promt
typo: prompt


http://gerrit.cloudera.org:8080/#/c/10992/3/tests/custom_cluster/test_shell_interactive_reconnect.py
File tests/custom_cluster/test_shell_interactive_reconnect.py:

http://gerrit.cloudera.org:8080/#/c/10992/3/tests/custom_cluster/test_shell_interactive_reconnect.py@85
PS3, Line 85: self._start_impala_cluster(["--kill"])
: self._start_impala_cluster([])
Is this really needed? Before each test a clean new impala cluster is started 
anyway.

Also, if you just want to restart the cluster, you don't need to kill it. You 
can just call _start_impala_cluster([])



--
To view, visit http://gerrit.cloudera.org:8080/10992
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia13365a9696886f01294e98054cf4e7cd66ab712
Gerrit-Change-Number: 10992
Gerrit-PatchSet: 4
Gerrit-Owner: Le Minh Nghia 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Le Minh Nghia 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 20 Jul 2018 13:55:53 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6490: Reconnect shell when remote restarts

2018-07-20 Thread Le Minh Nghia (Code Review)
Hello Zoltan Borok-Nagy, Attila Jeges,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10992

to look at the new patch set (#4).

Change subject: IMPALA-6490: Reconnect shell when remote restarts
..

IMPALA-6490: Reconnect shell when remote restarts

If the remote impad died while the shell waited for a
command to complete, the shell disconnected. Previously
after restarting the remote impalad, we need to run
"connect;" to reconnect, now the shell will automatically
reconnect.

Testing:
Added test_auto_connect_after_impalad_died in
test_shell_interactive_reconnect.py

Change-Id: Ia13365a9696886f01294e98054cf4e7cd66ab712
---
M shell/impala_client.py
M shell/impala_shell.py
M tests/common/custom_cluster_test_suite.py
M tests/custom_cluster/test_shell_interactive_reconnect.py
4 files changed, 54 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/10992/4
--
To view, visit http://gerrit.cloudera.org:8080/10992
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia13365a9696886f01294e98054cf4e7cd66ab712
Gerrit-Change-Number: 10992
Gerrit-PatchSet: 4
Gerrit-Owner: Le Minh Nghia 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Le Minh Nghia 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-6490: Reconnect shell when remote restarts

2018-07-20 Thread Le Minh Nghia (Code Review)
Le Minh Nghia has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10992 )

Change subject: IMPALA-6490: Reconnect shell when remote restarts
..


Patch Set 3:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/10992/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10992/2//COMMIT_MSG@12
PS2, Line 12: now
> now
Done


http://gerrit.cloudera.org:8080/#/c/10992/2/shell/impala_client.py
File shell/impala_client.py:

http://gerrit.cloudera.org:8080/#/c/10992/2/shell/impala_client.py@33
PS2, Line 33: from thrift.Thrift import TApplicationException, TException
:
> nit: One line
Done


http://gerrit.cloudera.org:8080/#/c/10992/2/shell/impala_client.py@234
PS2, Line 234: ll be catched and return False."""
 : if self.connect
> Comment is still incorrect.
Hopefully done.


http://gerrit.cloudera.org:8080/#/c/10992/2/shell/impala_shell.py
File shell/impala_shell.py:

http://gerrit.cloudera.org:8080/#/c/10992/2/shell/impala_shell.py@566
PS2, Line 566: nnected():
> test_connected() could be renamed to is_connected() or something similar to
Done


http://gerrit.cloudera.org:8080/#/c/10992/2/shell/impala_shell.py@566
PS2, Line 566: if not self.imp_client.is_connected():
 :   print_to_stderr
> Introducing 'connected' is unnecessary. Mathod should be called in the if s
Done


http://gerrit.cloudera.org:8080/#/c/10992/2/tests/custom_cluster/test_shell_interactive_reconnect.py
File tests/custom_cluster/test_shell_interactive_reconnect.py:

http://gerrit.cloudera.org:8080/#/c/10992/2/tests/custom_cluster/test_shell_interactive_reconnect.py@84
PS2, Line 84:  # reconnect.
> Would it be possible to use ImpalaShell() object here instead of pexpect? (
I added comments which explain my decision


http://gerrit.cloudera.org:8080/#/c/10992/2/tests/shell/test_shell_interactive.py
File tests/shell/test_shell_interactive.py:

http://gerrit.cloudera.org:8080/#/c/10992/2/tests/shell/test_shell_interactive.py@43
PS2, Line 43: SHELL_HISTORY_FILE = os.path.expanduser("~/.impalahistory")
> 'START_CLUSTER' is not used anywhere, you can remove this line.
Done



--
To view, visit http://gerrit.cloudera.org:8080/10992
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia13365a9696886f01294e98054cf4e7cd66ab712
Gerrit-Change-Number: 10992
Gerrit-PatchSet: 3
Gerrit-Owner: Le Minh Nghia 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Le Minh Nghia 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 20 Jul 2018 12:34:53 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5542: Impala cannot scan Parquet decimal stored as int64 t/int32 t

2018-07-20 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11000


Change subject: IMPALA-5542: Impala cannot scan Parquet decimal stored as 
int64_t/int32_t
..

IMPALA-5542: Impala cannot scan Parquet decimal stored as int64_t/int32_t

The Decimal type in Parquet is a logical type. That means
the Parquet file stores some physical/primitive type that
is annotated by the DECIMAL tag to make it behave like
decimals.

The allowed physical types for decimals are INT32, INT64,
FIXED, and BINARY. Before this commit Impala could only
read decimals stored as FIXED or BINARY.

Spark decided to write decimals as INT32 or INT64 when
their precision allows it:
(1 <= precision <= 9) ==> INT32
(10 <= precision <= 18) ==> INT64

I updated our column readers to accept INT32 and INT64
as valid physical types for decimals.

Testing:
* extended parquet-plain-test.cc
* added Parquet files generated by Spark 2.3.1
  and updated test_scanners.py

Change-Id: Ib8c41bfc7c1664bdba5099d3893dc8dbe4304794
---
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-common.h
M be/src/exec/parquet-metadata-utils.cc
M be/src/exec/parquet-plain-test.cc
M testdata/data/README
A testdata/data/decimal_stored_as_int32.parquet
A testdata/data/decimal_stored_as_int64.parquet
M 
testdata/workloads/functional-query/queries/QueryTest/parquet-decimal-formats.test
M tests/query_test/test_scanners.py
9 files changed, 82 insertions(+), 12 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/11000/1
--
To view, visit http://gerrit.cloudera.org:8080/11000
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib8c41bfc7c1664bdba5099d3893dc8dbe4304794
Gerrit-Change-Number: 11000
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-6490: Reconnect shell when remote restarts

2018-07-20 Thread Attila Jeges (Code Review)
Attila Jeges has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10992 )

Change subject: IMPALA-6490: Reconnect shell when remote restarts
..


Patch Set 2:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/10992/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10992/2//COMMIT_MSG@12
PS2, Line 12: noew
now


http://gerrit.cloudera.org:8080/#/c/10992/2/shell/impala_client.py
File shell/impala_client.py:

http://gerrit.cloudera.org:8080/#/c/10992/2/shell/impala_client.py@33
PS2, Line 33: from thrift.Thrift import TApplicationException
: from thrift.Thrift import TException
nit: One line


http://gerrit.cloudera.org:8080/#/c/10992/2/shell/impala_client.py@234
PS2, Line 234: "Checks to see if the current Impala connection is still alive. 
If not, an exception
 : will be raised.
Comment is still incorrect.


http://gerrit.cloudera.org:8080/#/c/10992/2/shell/impala_shell.py
File shell/impala_shell.py:

http://gerrit.cloudera.org:8080/#/c/10992/2/shell/impala_shell.py@566
PS2, Line 566: test_connection
test_connected() could be renamed to is_connected() or something similar to 
make clear that we expect a True/False return value.


http://gerrit.cloudera.org:8080/#/c/10992/2/shell/impala_shell.py@566
PS2, Line 566: connected = self.imp_client.test_connection()
 : if not connected:
Introducing 'connected' is unnecessary. Mathod should be called in the if 
statement.


http://gerrit.cloudera.org:8080/#/c/10992/2/tests/custom_cluster/test_shell_interactive_reconnect.py
File tests/custom_cluster/test_shell_interactive_reconnect.py:

http://gerrit.cloudera.org:8080/#/c/10992/2/tests/custom_cluster/test_shell_interactive_reconnect.py@84
PS2, Line 84:  proc = pexpect.spawn(SHELL_CMD)
Would it be possible to use ImpalaShell() object here instead of pexpect? (just 
like in 'test_auto_reconnect' above). 'ImpalaShell' class also has a 'prompt' 
member that we could use for this test (instead of pexpect.expect()).


http://gerrit.cloudera.org:8080/#/c/10992/2/tests/shell/test_shell_interactive.py
File tests/shell/test_shell_interactive.py:

http://gerrit.cloudera.org:8080/#/c/10992/2/tests/shell/test_shell_interactive.py@43
PS2, Line 43: START_CLUSTER = "%s/bin/start-impala-cluster.py" % 
os.environ['IMPALA_HOME']
'START_CLUSTER' is not used anywhere, you can remove this line.



--
To view, visit http://gerrit.cloudera.org:8080/10992
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia13365a9696886f01294e98054cf4e7cd66ab712
Gerrit-Change-Number: 10992
Gerrit-PatchSet: 2
Gerrit-Owner: Le Minh Nghia 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Le Minh Nghia 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 20 Jul 2018 10:35:11 +
Gerrit-HasComments: Yes