[Impala-ASF-CR] IMPALA-8593: Prohibit write operations for bucketed tables

2019-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13558 )

Change subject: IMPALA-8593: Prohibit write operations for bucketed tables
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/3861/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/13558
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia08d01168660830b6e0d08b55a95eac129889cec
Gerrit-Change-Number: 13558
Gerrit-PatchSet: 6
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Sudhanshu Arora 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 11 Jul 2019 04:28:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8593: Prohibit write operations for bucketed tables

2019-07-10 Thread Yongzhi Chen (Code Review)
Yongzhi Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13558 )

Change subject: IMPALA-8593: Prohibit write operations for bucketed tables
..


Patch Set 6:

(2 comments)

Patch set 6 addressed the review issues and was rebased to the latest hive bump.
Run cdp all core tests, the failures are the same as before the change to 
upstream master. Both have 12 failures:
My private cdp build:
https://master-02.jenkins.cloudera.com/job/impala-private-parameterized/5393/testReport/

upstream without the patch:
https://jenkins.impala.io/job/ubuntu-16.04-from-scratch-cdp-hive/23/testReport/

So no regressions.

http://gerrit.cloudera.org:8080/#/c/13558/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/13558/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2208
PS5, Line 2208:   } e
> What should happen if it is a full acid tables? If it shouldn't be possible
Done


http://gerrit.cloudera.org:8080/#/c/13558/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2210
PS5, Line 2210: 1. The property is not stor
> A comment could be added about the lack of capabilities for Kudu tables.
Done



--
To view, visit http://gerrit.cloudera.org:8080/13558
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia08d01168660830b6e0d08b55a95eac129889cec
Gerrit-Change-Number: 13558
Gerrit-PatchSet: 6
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Sudhanshu Arora 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 11 Jul 2019 04:00:00 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8593: Prohibit write operations for bucketed tables

2019-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13558 )

Change subject: IMPALA-8593: Prohibit write operations for bucketed tables
..


Patch Set 6:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/13558/6/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
File fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java:

http://gerrit.cloudera.org:8080/#/c/13558/6/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java@20
PS6, Line 20: import static 
org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.ACCESSTYPE_NONE;
line too long (91 > 90)


http://gerrit.cloudera.org:8080/#/c/13558/6/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java@21
PS6, Line 21: import static 
org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.ACCESSTYPE_READONLY;
line too long (95 > 90)


http://gerrit.cloudera.org:8080/#/c/13558/6/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java@22
PS6, Line 22: import static 
org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.ACCESSTYPE_READWRITE;
line too long (96 > 90)


http://gerrit.cloudera.org:8080/#/c/13558/6/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java@23
PS6, Line 23: import static 
org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.ACCESSTYPE_WRITEONLY;
line too long (96 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/13558
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia08d01168660830b6e0d08b55a95eac129889cec
Gerrit-Change-Number: 13558
Gerrit-PatchSet: 6
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Sudhanshu Arora 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 11 Jul 2019 03:49:28 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8593: Prohibit write operations for bucketed tables

2019-07-10 Thread Yongzhi Chen (Code Review)
Hello Vihang Karajgaonkar, Sudhanshu Arora, Zoltan Borok-Nagy, Sahil Takiar, 
Todd Lipcon, Joe McDonnell, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13558

to look at the new patch set (#6).

Change subject: IMPALA-8593: Prohibit write operations for bucketed tables
..

IMPALA-8593: Prohibit write operations for bucketed tables

This patch adds a method to check if a table bucketed.
For Hive 3, integrates with HMS translation layer for
capabilities checks.
Implements methods ensureTableWriteSupported and
ensureTableReadSupported.
Set default capabilities for tables.

Tests:
Added unit tests to ParserTest and AnalyzerTest.
Added bucketed tables which are required by IMPALA-8439.
Ran core tests(Hive 2 and Hive 3)

ToDo:
Integrate checking bucketed tables capabilities and creating
error messages with HMS translation after Hive provides the
required functions.
Enable capabilities checking for Kudu tables.

Change-Id: Ia08d01168660830b6e0d08b55a95eac129889cec
---
M fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/BaseTableRef.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeStmt.java
M fe/src/main/java/org/apache/impala/analysis/DropTableOrViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/LoadDataStmt.java
M fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java
M fe/src/main/java/org/apache/impala/catalog/MetaStoreClientPool.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M tests/metadata/test_ddl.py
M tests/metadata/test_show_create_table.py
19 files changed, 392 insertions(+), 9 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/13558/6
--
To view, visit http://gerrit.cloudera.org:8080/13558
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia08d01168660830b6e0d08b55a95eac129889cec
Gerrit-Change-Number: 13558
Gerrit-PatchSet: 6
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Sudhanshu Arora 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] WIP: IMPALA-8586: Support download URLs for CDP

2019-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13432 )

Change subject: WIP: IMPALA-8586: Support download URLs for CDP
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/3860/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/13432
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I67824fd82b820e68e9f5c87939ec94ca6abadb8c
Gerrit-Change-Number: 13432
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Thu, 11 Jul 2019 01:00:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5149: Provide query profile in JSON format

2019-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13801 )

Change subject: IMPALA-5149: Provide query profile in JSON format
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/3859/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/13801
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1
Gerrit-Change-Number: 13801
Gerrit-PatchSet: 6
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Thu, 11 Jul 2019 00:39:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8543: More diagnostics for TAcceptQueueServer

2019-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13790 )

Change subject: IMPALA-8543: More diagnostics for TAcceptQueueServer
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/13790
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I33b32352b457a2c8ec7bae6da46bb9c555dc9c36
Gerrit-Change-Number: 13790
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Thu, 11 Jul 2019 00:37:01 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8543: More diagnostics for TAcceptQueueServer

2019-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13790 )

Change subject: IMPALA-8543: More diagnostics for TAcceptQueueServer
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4594/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/13790
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I33b32352b457a2c8ec7bae6da46bb9c555dc9c36
Gerrit-Change-Number: 13790
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Thu, 11 Jul 2019 00:37:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8732: Use a serialized descriptor table in TQueryCtx

2019-07-10 Thread Joe McDonnell (Code Review)
Joe McDonnell has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/13772 )

Change subject: IMPALA-8732: Use a serialized descriptor table in TQueryCtx
..

IMPALA-8732: Use a serialized descriptor table in TQueryCtx

In IMPALA-8732, there is contention in tcmalloc when sending the
ExecQueryFInstances messages for a query referencing a large
number of partitions. This is because each thread in the
ExecEnv::exec_rpc_thread_pool_ is making a copy of the TQueryCtx,
which contains the TDescriptorTable and a large map of THdfsPartition
objects. Every thread in the exec_rpc_thread_pool_ is doing this
simultaneously.

The threads are copying this structure, but the TQueryCtx and its
corresponding TDescriptorTable is the same across all messages for
this query. Copying a large map of THdfsPartition objects is
wasteful, especially considering that the coordinator does not
need to access any of the information in TDescriptorTable before
sending it out to executors.

In future, the entire TQueryCtx can be serialized once and
embedded in its own sidecar. This change is limited to
TDescriptorTable to allow easier backports to older versions,
as this codepath has been converted from Thrift to KRPC and
a large amount of code has changed.

This changes TQueryCtx to contain a TDescriptorTableSerialized,
which is a binary blob containing the serialized form of
TDescriptorTable. This is serialized in the frontend and
passed directly through to executors. The old unserialized
TDescriptorTable form is maintained to enable frontend
planner tests (which use incomplete structures lacking some
required fields and cannot be serialized).

Testing:
 - Core and exhaustive tests pass

Change-Id: I458aa62dd4d1e4e4a7b1869a604623a69f3b2d9a
Reviewed-on: http://gerrit.cloudera.org:8080/13772
Reviewed-by: Joe McDonnell 
Tested-by: Impala Public Jenkins 
---
M be/src/runtime/coordinator.cc
M be/src/runtime/data-stream-test.cc
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M be/src/runtime/query-state.cc
M be/src/testutil/desc-tbl-builder.cc
M common/thrift/Descriptors.thrift
M common/thrift/ImpalaInternalService.thrift
M fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
11 files changed, 112 insertions(+), 26 deletions(-)

Approvals:
  Joe McDonnell: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/13772
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I458aa62dd4d1e4e4a7b1869a604623a69f3b2d9a
Gerrit-Change-Number: 13772
Gerrit-PatchSet: 7
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 


[Impala-ASF-CR] WIP: IMPALA-8586: Support download URLs for CDP

2019-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13432 )

Change subject: WIP: IMPALA-8586: Support download URLs for CDP
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13432/2/bin/bootstrap_toolchain.py
File bin/bootstrap_toolchain.py:

http://gerrit.cloudera.org:8080/#/c/13432/2/bin/bootstrap_toolchain.py@707
PS2, Line 707: if __name__ == "__main__": main()
flake8: E305 expected 2 blank lines after class or function definition, found 1



--
To view, visit http://gerrit.cloudera.org:8080/13432
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I67824fd82b820e68e9f5c87939ec94ca6abadb8c
Gerrit-Change-Number: 13432
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Thu, 11 Jul 2019 00:21:02 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] WIP: IMPALA-8586: Support download URLs for CDP

2019-07-10 Thread Joe McDonnell (Code Review)
Hello Vihang Karajgaonkar, Fredy Wijaya, Todd Lipcon, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13432

to look at the new patch set (#2).

Change subject: WIP: IMPALA-8586: Support download URLs for CDP
..

WIP: IMPALA-8586: Support download URLs for CDP

bin/bootstrap_toolchain.py has accumulated complexity over time.
CDH, CDP, and the native toolchain all use different download
machinery and naming. One feature that is needed on the CDP side
is the ability to specify the download URL in an IMPALA_*_URL
environment variable.

This adds that support and refactors CDH and native toolchain
downloads to use the new system.

Currently, there are multiple phases of downloads, each with their
own download functions and peculiarities to account for package
names and destinations for downloads. This changes the logic
so that a package will generate a DownloadUnpackAction that is
completely resolved. It contains everything about what to download
and where to put it as well as a needs_download() function and a
download() function. Once there is a list of DownloadUnpackAction
objects, they can all be downloaded and unpacked in a single phase.
This implements different types of packages as subclasses of
DownloadUnpackAction.

Kudu requires special handling and gets its own set of subclasses
to handle various subtleties like toolchain vs CDH Kudu, the Kudu
stub, and making sure that the "kudu" package and the "kudu-java"
package don't confuse each other.

Change-Id: I67824fd82b820e68e9f5c87939ec94ca6abadb8c
---
M bin/bootstrap_toolchain.py
M bin/impala-config.sh
2 files changed, 457 insertions(+), 332 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/32/13432/2
--
To view, visit http://gerrit.cloudera.org:8080/13432
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I67824fd82b820e68e9f5c87939ec94ca6abadb8c
Gerrit-Change-Number: 13432
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-8732: Use a serialized descriptor table in TQueryCtx

2019-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13772 )

Change subject: IMPALA-8732: Use a serialized descriptor table in TQueryCtx
..


Patch Set 6: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/13772
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I458aa62dd4d1e4e4a7b1869a604623a69f3b2d9a
Gerrit-Change-Number: 13772
Gerrit-PatchSet: 6
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Thu, 11 Jul 2019 00:17:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8571[WIP]: improve QueryEventHook execution

2019-07-10 Thread radford nguyen (Code Review)
radford nguyen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13748 )

Change subject: IMPALA-8571[WIP]: improve QueryEventHook execution
..


Patch Set 14:

> Patch Set 14:
>
> (8 comments)
>
> > Patch Set 14:
> >
> > (9 comments)
>
> Replied to comments


--
To view, visit http://gerrit.cloudera.org:8080/13748
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
Gerrit-Change-Number: 13748
Gerrit-PatchSet: 14
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 
Gerrit-Comment-Date: Thu, 11 Jul 2019 00:06:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5149: Provide query profile in JSON format

2019-07-10 Thread Jiawei Wang (Code Review)
Jiawei Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13801 )

Change subject: IMPALA-5149: Provide query profile in JSON format
..


Patch Set 6:

(1 comment)

Thanks for your feedback! I also added some tests for this.

http://gerrit.cloudera.org:8080/#/c/13801/2/be/src/service/impala-hs2-server.cc
File be/src/service/impala-hs2-server.cc:

http://gerrit.cloudera.org:8080/#/c/13801/2/be/src/service/impala-hs2-server.cc@979
PS2, Line 979:   SQLSTATE_GENERAL_ERROR);
 :   if (request.format == TRuntimeProfileFormat::THRIFT) {
 : return_val.__set_thrift_profile(thrift_profile);
 :   } else if (request.format == TRuntimeProfileFormat::JSON) {
 : rapidjson::StringBuffer sb;
 : rapidjson::PrettyWriter writer(sb);
 :
> should there be an equivalent here for JSON?
Done. I am putting the output inside the profile, which is a string.

Also, do we need to adopt this change to impala-beeswax-server? I looked at the 
code there but find that the function there only support text format download.



--
To view, visit http://gerrit.cloudera.org:8080/13801
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1
Gerrit-Change-Number: 13801
Gerrit-PatchSet: 6
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Thu, 11 Jul 2019 00:05:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8571[WIP]: improve QueryEventHook execution

2019-07-10 Thread radford nguyen (Code Review)
radford nguyen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13748 )

Change subject: IMPALA-8571[WIP]: improve QueryEventHook execution
..


Patch Set 14:

(8 comments)

> Patch Set 14:
>
> (9 comments)

Replied to comments

http://gerrit.cloudera.org:8080/#/c/13748/14/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/13748/14/be/src/service/impala-server.cc@275
PS14, Line 275: 1
> Since the hook execution is async (to query unregister) anyway, how about w
That's a good idea; I'll bump it


http://gerrit.cloudera.org:8080/#/c/13748/14/be/src/service/impala-server.cc@279
PS14, Line 279: true
> Should anyone ever have to set this to false? If not, I suggest removing th
Nobody has requested it, but I was under the impression that a user might want 
to use higher-priority threads for executing hooks.  They might also want to 
ensure that the threads will not be killed at shutdown until the hooks have 
completed executing.


http://gerrit.cloudera.org:8080/#/c/13748/14/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java
File 
fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java:

http://gerrit.cloudera.org:8080/#/c/13748/14/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@97
PS14, Line 97:  * ${hookClass}.${method}.execution.rejections
> Where are all these metrics exposed? I don't see the plumbing to make them
I haven't completed the plumbing yet; I'm still trying to figure out how it 
works.  This is the main reason the CR is still WIP


http://gerrit.cloudera.org:8080/#/c/13748/14/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@101
PS14, Line 101: exceptions
> nit: failures?
I can change it to failures if you think that's more expressive


http://gerrit.cloudera.org:8080/#/c/13748/14/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@131
PS14, Line 131: private final long hookTimeout_;
  :   private final TimeUnit hookTimeoutUnit_;
> Just force it to be secs or something?
Yes, it is forced to `SECONDS` in the `QueryHookManager` when instantiating 
this executor.  I am allowing a different time unit here internally to make the 
unit tests faster.


http://gerrit.cloudera.org:8080/#/c/13748/14/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@290
PS14, Line 290:   context);
> probably helps to include query ID explicitly.
Query ID is not currently available; are you suggesting that it be passed to 
the hook from the backend?  I do like that idea


http://gerrit.cloudera.org:8080/#/c/13748/14/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@327
PS14, Line 327: final Future f = hookExecutor_.submit(() -> 
{
> I don't think you need to offload this to a separate executor pool again.
I think I'm missing something.  If I just do new FutureTask<>(callHook).run(), 
won't that just run callHook in this thread?  And doesn't that mean that we 
won't get to line 337 until callHook has completed?


http://gerrit.cloudera.org:8080/#/c/13748/14/fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java
File fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java:

http://gerrit.cloudera.org:8080/#/c/13748/14/fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java@77
PS14, Line 77:   static final String BE_HOOKS_FLAG = "query_event_hook_classes";
> nit: any reason to remove private qualifier?
I used one of the constants in an error message logged by 
`FixedCapacityQueryHookExecutor`.  If you think it's cleaner, I can make these 
private again and maybe(?) make `FCQHExecutor` an inner class



--
To view, visit http://gerrit.cloudera.org:8080/13748
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
Gerrit-Change-Number: 13748
Gerrit-PatchSet: 14
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 
Gerrit-Comment-Date: Thu, 11 Jul 2019 00:04:35 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5149: Provide query profile in JSON format

2019-07-10 Thread Jiawei Wang (Code Review)
Hello Bharath Vissapragada, Greg Rahn, David Rorke, David Knupp, Sahil Takiar, 
Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13801

to look at the new patch set (#6).

Change subject: IMPALA-5149: Provide query profile in JSON format
..

IMPALA-5149: Provide query profile in JSON format

Description:
Today there is a text and Thrift version of the query profile, but
it would be useful to have a JSON version for portability and machine
consumption. The ultimate goal is to have a Download JSON format
profile link along with the other two formats.

Modification:
1.Add Json format download option in impalad profile page
2.Add ToJson() function for RuntimeProfile, Counters, EventSequence
3.Add JSON format into QueryStateRecord
4.Add tests for E2E test, unit test, hs2 test
5.Modify query profile page to different download option
6.Modify HS2 server to support get JSON format profile

Tests:
E2E tests:
* tests/webserver/test_web_pages.py - test_download_json_profile
HS2 tests:
* tests/hs2/test_hs2.py - test_get_profile_json
BE Unit tests:
* be/src/util/runtime-profile-test.cc
ToJson.RuntimeProfileToJsonTest
ToJson.TimeSeriesCounterToJsonTest

Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1
---
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
M be/src/service/impala-http-handler.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/util/CMakeLists.txt
M be/src/util/runtime-profile-counters.h
M be/src/util/runtime-profile-test.cc
M be/src/util/runtime-profile.cc
M be/src/util/runtime-profile.h
M common/thrift/ImpalaService.thrift
M common/thrift/RuntimeProfile.thrift
M tests/hs2/test_hs2.py
M tests/webserver/test_web_pages.py
M www/query_profile.tmpl
16 files changed, 525 insertions(+), 30 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/13801/6
--
To view, visit http://gerrit.cloudera.org:8080/13801
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1
Gerrit-Change-Number: 13801
Gerrit-PatchSet: 6
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Sahil Takiar 


[Impala-ASF-CR] IMPALA-8747: Fix string formatting on HS2 connection setup failure

2019-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/13815 )

Change subject: IMPALA-8747: Fix string formatting on HS2 connection setup 
failure
..

IMPALA-8747: Fix string formatting on HS2 connection setup failure

In impala_test_suite.py, the logging for HS2 connection failure
uses "{0}" replacement codes when the logging facility expects it
to use "%"-style codes. This fixes it to call .format() directly.

Change-Id: I4789712fcd71c887a9c6c890cd22a4f382faa3a5
Reviewed-on: http://gerrit.cloudera.org:8080/13815
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M tests/common/impala_test_suite.py
1 file changed, 1 insertion(+), 1 deletion(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/13815
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I4789712fcd71c887a9c6c890cd22a4f382faa3a5
Gerrit-Change-Number: 13815
Gerrit-PatchSet: 4
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-8747: Fix string formatting on HS2 connection setup failure

2019-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13815 )

Change subject: IMPALA-8747: Fix string formatting on HS2 connection setup 
failure
..


Patch Set 3: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/13815
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4789712fcd71c887a9c6c890cd22a4f382faa3a5
Gerrit-Change-Number: 13815
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Wed, 10 Jul 2019 23:35:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8729: [DOCS] Describe on-demand metadata feature

2019-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13802 )

Change subject: IMPALA-8729: [DOCS] Describe on-demand metadata feature
..


Patch Set 3: Verified+1

Build Successful

https://jenkins.impala.io/job/gerrit-docs-auto-test/383/ : Doc tests passed.


-- 
To view, visit http://gerrit.cloudera.org:8080/13802
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I64261625c1d9b122c7cca59f9b004dda05810351
Gerrit-Change-Number: 13802
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 10 Jul 2019 20:21:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8729: [DOCS] Describe on-demand metadata feature

2019-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13802 )

Change subject: IMPALA-8729: [DOCS] Describe on-demand metadata feature
..


Patch Set 3:

Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/383/

Testing docs change - this change appears to modify docs/ and no code. This is 
experimental - please report any issues to tarmstr...@cloudera.com or on this 
JIRA: IMPALA-7317


--
To view, visit http://gerrit.cloudera.org:8080/13802
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I64261625c1d9b122c7cca59f9b004dda05810351
Gerrit-Change-Number: 13802
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 10 Jul 2019 19:57:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8729: [DOCS] Describe on-demand metadata feature

2019-07-10 Thread Alex Rodoni (Code Review)
Hello Bharath Vissapragada, Balazs Jeszenszky, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13802

to look at the new patch set (#3).

Change subject: IMPALA-8729: [DOCS] Describe on-demand metadata feature
..

IMPALA-8729: [DOCS] Describe on-demand metadata feature

- Overview of on-demand metadata.
- Config flags to enable/disable on-demand metadata.

Change-Id: I64261625c1d9b122c7cca59f9b004dda05810351
---
M docs/topics/impala_metadata.xml
1 file changed, 98 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/13802/3
--
To view, visit http://gerrit.cloudera.org:8080/13802
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I64261625c1d9b122c7cca59f9b004dda05810351
Gerrit-Change-Number: 13802
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-8729: [DOCS] Describe on-demand metadata feature

2019-07-10 Thread Alex Rodoni (Code Review)
Alex Rodoni has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13802 )

Change subject: IMPALA-8729: [DOCS] Describe on-demand metadata feature
..


Patch Set 2:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/13802/2/docs/topics/impala_metadata.xml
File docs/topics/impala_metadata.xml:

http://gerrit.cloudera.org:8080/#/c/13802/2/docs/topics/impala_metadata.xml@52
PS2, Line 52: coordinators
> nit: coordinator
Done


http://gerrit.cloudera.org:8080/#/c/13802/2/docs/topics/impala_metadata.xml@60
PS2, Line 60: of impalad This is redundant.
Done


http://gerrit.cloudera.org:8080/#/c/13802/2/docs/topics/impala_metadata.xml@60
PS2, Line 60: new version of metadata
> I think calling it new version of metadata is confusing, may be say, with t
Done


http://gerrit.cloudera.org:8080/#/c/13802/2/docs/topics/impala_metadata.xml@71
PS2, Line 71:
: This is a preview feature in  and not generally
: available.
> IMO, we could get rid of this.
Done



--
To view, visit http://gerrit.cloudera.org:8080/13802
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I64261625c1d9b122c7cca59f9b004dda05810351
Gerrit-Change-Number: 13802
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 10 Jul 2019 19:56:42 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8593: Prohibit write operations for bucketed tables

2019-07-10 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13558 )

Change subject: IMPALA-8593: Prohibit write operations for bucketed tables
..


Patch Set 5: Code-Review+1

(2 comments)

http://gerrit.cloudera.org:8080/#/c/13558/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/13558/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2208
PS5, Line 2208: }
What should happen if it is a full acid tables? If it shouldn't be possible, 
that a precondition could be used instead of the "if".


http://gerrit.cloudera.org:8080/#/c/13558/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2210
PS5, Line 2210: !KuduTable.isKuduTable(tbl)
A comment could be added about the lack of capabilities for Kudu tables.



--
To view, visit http://gerrit.cloudera.org:8080/13558
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia08d01168660830b6e0d08b55a95eac129889cec
Gerrit-Change-Number: 13558
Gerrit-PatchSet: 5
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Sudhanshu Arora 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 10 Jul 2019 18:47:32 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8732: Use a serialized descriptor table in TQueryCtx

2019-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13772 )

Change subject: IMPALA-8732: Use a serialized descriptor table in TQueryCtx
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/3858/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/13772
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I458aa62dd4d1e4e4a7b1869a604623a69f3b2d9a
Gerrit-Change-Number: 13772
Gerrit-PatchSet: 5
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Wed, 10 Jul 2019 18:35:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8732: Use a serialized descriptor table in TQueryCtx

2019-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13772 )

Change subject: IMPALA-8732: Use a serialized descriptor table in TQueryCtx
..


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4593/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/13772
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I458aa62dd4d1e4e4a7b1869a604623a69f3b2d9a
Gerrit-Change-Number: 13772
Gerrit-PatchSet: 6
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Wed, 10 Jul 2019 17:56:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8732: Use a serialized descriptor table in TQueryCtx

2019-07-10 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13772 )

Change subject: IMPALA-8732: Use a serialized descriptor table in TQueryCtx
..


Patch Set 6: Code-Review+2

Carrying +2


--
To view, visit http://gerrit.cloudera.org:8080/13772
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I458aa62dd4d1e4e4a7b1869a604623a69f3b2d9a
Gerrit-Change-Number: 13772
Gerrit-PatchSet: 6
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Wed, 10 Jul 2019 17:56:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8732: Use a serialized descriptor table in TQueryCtx

2019-07-10 Thread Joe McDonnell (Code Review)
Hello Michael Ho, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13772

to look at the new patch set (#5).

Change subject: IMPALA-8732: Use a serialized descriptor table in TQueryCtx
..

IMPALA-8732: Use a serialized descriptor table in TQueryCtx

In IMPALA-8732, there is contention in tcmalloc when sending the
ExecQueryFInstances messages for a query referencing a large
number of partitions. This is because each thread in the
ExecEnv::exec_rpc_thread_pool_ is making a copy of the TQueryCtx,
which contains the TDescriptorTable and a large map of THdfsPartition
objects. Every thread in the exec_rpc_thread_pool_ is doing this
simultaneously.

The threads are copying this structure, but the TQueryCtx and its
corresponding TDescriptorTable is the same across all messages for
this query. Copying a large map of THdfsPartition objects is
wasteful, especially considering that the coordinator does not
need to access any of the information in TDescriptorTable before
sending it out to executors.

In future, the entire TQueryCtx can be serialized once and
embedded in its own sidecar. This change is limited to
TDescriptorTable to allow easier backports to older versions,
as this codepath has been converted from Thrift to KRPC and
a large amount of code has changed.

This changes TQueryCtx to contain a TDescriptorTableSerialized,
which is a binary blob containing the serialized form of
TDescriptorTable. This is serialized in the frontend and
passed directly through to executors. The old unserialized
TDescriptorTable form is maintained to enable frontend
planner tests (which use incomplete structures lacking some
required fields and cannot be serialized).

Testing:
 - Core and exhaustive tests pass

Change-Id: I458aa62dd4d1e4e4a7b1869a604623a69f3b2d9a
---
M be/src/runtime/coordinator.cc
M be/src/runtime/data-stream-test.cc
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M be/src/runtime/query-state.cc
M be/src/testutil/desc-tbl-builder.cc
M common/thrift/Descriptors.thrift
M common/thrift/ImpalaInternalService.thrift
M fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
11 files changed, 112 insertions(+), 26 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/72/13772/5
--
To view, visit http://gerrit.cloudera.org:8080/13772
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I458aa62dd4d1e4e4a7b1869a604623a69f3b2d9a
Gerrit-Change-Number: 13772
Gerrit-PatchSet: 5
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 


[Impala-ASF-CR] IMPALA-8732: Use a serialized descriptor table in TQueryCtx

2019-07-10 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13772 )

Change subject: IMPALA-8732: Use a serialized descriptor table in TQueryCtx
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13772/4/be/src/runtime/descriptors.h
File be/src/runtime/descriptors.h:

http://gerrit.cloudera.org:8080/#/c/13772/4/be/src/runtime/descriptors.h@539
PS4, Line 539: WARN_UNUSED_RESULT;
> Do DeserializeThrift() and CreateTblDecriptorInternal() also need this ?
Good point, I think we should go ahead and keep doing that. I added this to the 
functions that didn't have it.



--
To view, visit http://gerrit.cloudera.org:8080/13772
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I458aa62dd4d1e4e4a7b1869a604623a69f3b2d9a
Gerrit-Change-Number: 13772
Gerrit-PatchSet: 4
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Wed, 10 Jul 2019 17:53:46 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8543: More diagnostics for TAcceptQueueServer

2019-07-10 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13790 )

Change subject: IMPALA-8543: More diagnostics for TAcceptQueueServer
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/13790
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I33b32352b457a2c8ec7bae6da46bb9c555dc9c36
Gerrit-Change-Number: 13790
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Wed, 10 Jul 2019 17:45:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] Support SPNEGO for Impala webserver

2019-07-10 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13774 )

Change subject: Support SPNEGO for Impala webserver
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13774/1/be/src/util/webserver.cc
File be/src/util/webserver.cc:

http://gerrit.cloudera.org:8080/#/c/13774/1/be/src/util/webserver.cc@370
PS1, Line 370: LAGS_webserver_require_spnego) {
> I don't think the --principal is actually necessary

My point was that InitKerberosForServer() won't have actually been called 
unless --principal is set (see authentication.cc). Maybe my confusion is the 
word "assume" and it would be clearer to me if this was phrased as "checking" 
whether all of the necessary set up has been done? Unless we want to support a 
configuration where the webserver is secured with Kerberos but regular client 
connections aren't (which seems weird)

> I think people will get pretty grouchy

Works for me. Of course, we'll probably want to update the docs around 
Impala/Kerberos to mention this stuff



--
To view, visit http://gerrit.cloudera.org:8080/13774
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ife2b04310e1571d231bf8ee1bcfd3b7afc2edd8f
Gerrit-Change-Number: 13774
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Wed, 10 Jul 2019 17:43:43 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8571[WIP]: improve QueryEventHook execution

2019-07-10 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13748 )

Change subject: IMPALA-8571[WIP]: improve QueryEventHook execution
..


Patch Set 14:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/13748/14/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/13748/14/be/src/service/impala-server.cc@275
PS14, Line 275: 1
Since the hook execution is async (to query unregister) anyway, how about we 
bump this up a little? 3s or so?


http://gerrit.cloudera.org:8080/#/c/13748/14/be/src/service/impala-server.cc@277
PS14, Line 277: default 1.
not needed, the web UI shows the default separately. (same below)


http://gerrit.cloudera.org:8080/#/c/13748/14/be/src/service/impala-server.cc@279
PS14, Line 279: true
Should anyone ever have to set this to false? If not, I suggest removing this 
flag.


http://gerrit.cloudera.org:8080/#/c/13748/14/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java
File 
fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java:

http://gerrit.cloudera.org:8080/#/c/13748/14/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@97
PS14, Line 97:  * ${hookClass}.${method}.execution.rejections
Where are all these metrics exposed? I don't see the plumbing to make them 
available on /metrics (or may be I'm missing the link somwhere)


http://gerrit.cloudera.org:8080/#/c/13748/14/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@101
PS14, Line 101: exceptions
nit: failures?


http://gerrit.cloudera.org:8080/#/c/13748/14/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@131
PS14, Line 131: private final long hookTimeout_;
  :   private final TimeUnit hookTimeoutUnit_;
Just force it to be secs or something?


http://gerrit.cloudera.org:8080/#/c/13748/14/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@290
PS14, Line 290:   context);
probably helps to include query ID explicitly.


http://gerrit.cloudera.org:8080/#/c/13748/14/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@327
PS14, Line 327: final Future f = hookExecutor_.submit(() -> 
{
I don't think you need to offload this to a separate executor pool again.  You 
could probably just do new FutureTask<>(Callable).run() (essentially what the 
hookExecutor_ does for you) but I think that simplifies the code and make it 
easy to reason about (1:1 correspondence between threadPool thread accepting 
the task and the thread that actually executes the hook).


http://gerrit.cloudera.org:8080/#/c/13748/14/fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java
File fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java:

http://gerrit.cloudera.org:8080/#/c/13748/14/fe/src/main/java/org/apache/impala/hooks/QueryEventHookManager.java@77
PS14, Line 77:   static final String BE_HOOKS_FLAG = "query_event_hook_classes";
nit: any reason to remove private qualifier?



--
To view, visit http://gerrit.cloudera.org:8080/13748
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
Gerrit-Change-Number: 13748
Gerrit-PatchSet: 14
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 
Gerrit-Comment-Date: Wed, 10 Jul 2019 17:23:10 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8747: Fix string formatting on HS2 connection setup failure

2019-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13815 )

Change subject: IMPALA-8747: Fix string formatting on HS2 connection setup 
failure
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4592/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/13815
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4789712fcd71c887a9c6c890cd22a4f382faa3a5
Gerrit-Change-Number: 13815
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Wed, 10 Jul 2019 17:18:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8747: Fix string formatting on HS2 connection setup failure

2019-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13815 )

Change subject: IMPALA-8747: Fix string formatting on HS2 connection setup 
failure
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/13815
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4789712fcd71c887a9c6c890cd22a4f382faa3a5
Gerrit-Change-Number: 13815
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Wed, 10 Jul 2019 17:18:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8747: Fix string formatting on HS2 connection setup failure

2019-07-10 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13815 )

Change subject: IMPALA-8747: Fix string formatting on HS2 connection setup 
failure
..


Patch Set 2:

Test failure in an unrelated piece of code, retrying.


--
To view, visit http://gerrit.cloudera.org:8080/13815
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4789712fcd71c887a9c6c890cd22a4f382faa3a5
Gerrit-Change-Number: 13815
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Wed, 10 Jul 2019 17:18:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8741: Speed up bit unpacking by vectorisation

2019-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13807 )

Change subject: IMPALA-8741: Speed up bit unpacking by vectorisation
..


Patch Set 4:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/3857/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/13807
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9e452a547973778bbd8d768c608e1a32e948f947
Gerrit-Change-Number: 13807
Gerrit-PatchSet: 4
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 10 Jul 2019 11:46:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8741: Speed up bit unpacking by vectorisation

2019-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13807 )

Change subject: IMPALA-8741: Speed up bit unpacking by vectorisation
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/13807/4/be/src/util/vectorised_bit_unpacking_generator.py
File be/src/util/vectorised_bit_unpacking_generator.py:

http://gerrit.cloudera.org:8080/#/c/13807/4/be/src/util/vectorised_bit_unpacking_generator.py@183
PS4, Line 183: =
flake8: E999 SyntaxError: invalid syntax


http://gerrit.cloudera.org:8080/#/c/13807/4/be/src/util/vectorised_bit_unpacking_generator.py@1547
PS4, Line 1547: if __name__ == "__main__":
flake8: E305 expected 2 blank lines after class or function definition, found 1



--
To view, visit http://gerrit.cloudera.org:8080/13807
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9e452a547973778bbd8d768c608e1a32e948f947
Gerrit-Change-Number: 13807
Gerrit-PatchSet: 4
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 10 Jul 2019 11:06:16 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8741: Speed up bit unpacking by vectorisation

2019-07-10 Thread Daniel Becker (Code Review)
Daniel Becker has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/13807 )

Change subject: IMPALA-8741: Speed up bit unpacking by vectorisation
..

IMPALA-8741: Speed up bit unpacking by vectorisation

Adding a vectorised implementation to bit unpacking using AVX, AVX2 and
BMI2 instructions through compiler intrinsics.

Vectorised bit unpacking is implemented for bit widths from 1 to 16.
Higher bit widths would not benefit from this vectorisation algorithm.

We check at runtime whether the required instructions are available on
the CPU and fall back to the scalar implementation if not.

The vectorised unpacking functions are in the file
be/src/util/bit-packing-vectorized.h, which is generated by the python
script in be/src/util/vectorised_bit_unpacking_generator.py.

Also adding benchmarks comparing the scalar and the vectorised
implementations.

Testing:
  - Added tests for the vectorised unpacking implementations.

Change-Id: I9e452a547973778bbd8d768c608e1a32e948f947
---
M be/src/benchmarks/bit-packing-benchmark.cc
M be/src/util/bit-packing-test.cc
A be/src/util/bit-packing-vectorized.h
M be/src/util/bit-packing.cc
M be/src/util/bit-packing.h
M be/src/util/bit-packing.inline.h
M be/src/util/cpu-info.cc
M be/src/util/cpu-info.h
A be/src/util/vectorised_bit_unpacking_generator.py
9 files changed, 5,771 insertions(+), 73 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/13807/4
--
To view, visit http://gerrit.cloudera.org:8080/13807
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9e452a547973778bbd8d768c608e1a32e948f947
Gerrit-Change-Number: 13807
Gerrit-PatchSet: 4
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-8703: ISO:SQL:2016 datetime patterns - Milestone 1

2019-07-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13722 )

Change subject: IMPALA-8703: ISO:SQL:2016 datetime patterns - Milestone 1
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/3856/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/13722
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I19d8d097a45ae6f103b6cd1b2d81aad38dfd9e23
Gerrit-Change-Number: 13722
Gerrit-PatchSet: 4
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 10 Jul 2019 10:13:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8703: ISO:SQL:2016 datetime patterns - Milestone 1

2019-07-10 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13722 )

Change subject: IMPALA-8703: ISO:SQL:2016 datetime patterns - Milestone 1
..


Patch Set 4:

(12 comments)

Attila, with patch set 4 I have covered all your findings. The perf degradation 
is still an issue, though.

http://gerrit.cloudera.org:8080/#/c/13722/2/be/src/runtime/datetime-iso-sql-format-parser.cc
File be/src/runtime/datetime-iso-sql-format-parser.cc:

http://gerrit.cloudera.org:8080/#/c/13722/2/be/src/runtime/datetime-iso-sql-format-parser.cc@184
PS2, Line 184: >day)) {
 :   return false;
 : }
 :   }
 :
 :   return true;
 : }
 :
 : const char* 
DateTimeIsoSqlFormatParser::GetNextTokenGroupFromInput(const char* input_str,
 : int input_len, const DateTimeFormatToken& tok) {
 :   DCHECK(input_str != nullptr);
 :
 :   // Handle separately the meridiem indicators for two reasons.
 :   // 1: They might contain '.' that is not meant to be a 
separator character.
 :   // 2: The length of the token in the pattern might differ from 
the length of the token
 :   // in the input. E.g. "AM" should match with "P.M.".
 :   if (tok.type == MERIDIEM_IND
> This can be done more efficiently if you use hard-coded month ranges, like
Done


http://gerrit.cloudera.org:8080/#/c/13722/2/be/src/runtime/datetime-iso-sql-format-parser.cc@184
PS2, Line 184: >day)) {
 :   return false;
 : }
 :   }
 :
 :   return true;
 : }
 :
 : const char* 
DateTimeIsoSqlFormatParser::GetNextTokenGroupFromInput(const char* input_str,
 : int input_len, const DateTimeFormatToken& tok) {
 :   DCHECK(input_str != nullptr);
 :
 :   // Handle separately the meridiem indicators for two reasons.
 :   // 1: They might contain '.' that is not meant to be a 
separator character.
 :   // 2: The length of the token in the pattern might differ from 
the length of the token
 :   // in the input. E.g. "AM" should match with "P.M.".
 :   if (tok.type == MERIDIEM_IND
> Move this to a separate function.
Done


http://gerrit.cloudera.org:8080/#/c/13722/3/be/src/runtime/datetime-iso-sql-format-parser.cc
File be/src/runtime/datetime-iso-sql-format-parser.cc:

http://gerrit.cloudera.org:8080/#/c/13722/3/be/src/runtime/datetime-iso-sql-format-parser.cc@78
PS3, Line 78: if (!ParseAndValidate(current_pos, group_len, 0, , 
>year)) {
> line too long (92 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/13722/3/be/src/runtime/datetime-iso-sql-format-parser.cc@95
PS3, Line 95:   }
> line too long (91 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/13722/3/be/src/runtime/datetime-iso-sql-format-parser.cc@124
PS3, Line 124: if (!ParseAndValidate(current_pos, group_len, 0, 23, 
>hour)) return false;
> line too long (92 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/13722/3/be/src/runtime/datetime-iso-sql-format-parser.cc@128
PS3, Line 128: if (!ParseAndValidate(current_pos, group_len, 0, 59, 
>minute)) {
> line too long (92 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/13722/3/be/src/runtime/datetime-iso-sql-format-parser.cc@133
PS3, Line 133:   case SECOND_IN_MINUTE: {
> line too long (94 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/13722/2/be/src/runtime/datetime-parser-common.cc
File be/src/runtime/datetime-parser-common.cc:

http://gerrit.cloudera.org:8080/#/c/13722/2/be/src/runtime/datetime-parser-common.cc@185
PS2, Line 185:
 :
 :
 :
 :
 :
 :
 :
 :
 :
 :
> This function can be simplified if you use hard-coded month ranges. (See MO
Done


http://gerrit.cloudera.org:8080/#/c/13722/3/be/src/runtime/datetime-simple-date-format-parser.h
File be/src/runtime/datetime-simple-date-format-parser.h:

http://gerrit.cloudera.org:8080/#/c/13722/3/be/src/runtime/datetime-simple-date-format-parser.h@98
PS3, Line 98:   /// accept_time_toks_only -- if true, time tokens w/o date 
tokens are accepted.
> line too long (92 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/13722/3/be/src/runtime/datetime-simple-date-format-parser.h@104
PS3, Line 104:   /// Parse date/time string to find the corresponding default 
date/time format context.
> line too long (92 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/13722/3/be/src/runtime/datetime-simple-date-format-parser.h@160
PS3, Line 160:   /// Does only a basic validation on the parsed date/time 
values. The caller is
> line too long (92 > 90)
Done



[Impala-ASF-CR] IMPALA-8703: ISO:SQL:2016 datetime patterns - Milestone 1

2019-07-10 Thread Gabor Kaszab (Code Review)
Hello Attila Jeges, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13722

to look at the new patch set (#4).

Change subject: IMPALA-8703: ISO:SQL:2016 datetime patterns - Milestone 1
..

IMPALA-8703: ISO:SQL:2016 datetime patterns - Milestone 1

This enhancement introduces FORMAT clause for CAST() operator that is
applicable for casts between string types and timestamp types. Instead
of accepting SimpleDateFormat patterns the FORMAT clause supports
datetime patterns following the ISO:SQL:2016 standard.
Note, the CAST() operator without the FORMAT clause still uses
Impala's implementation of SimpleDateFormat handling. Similarly, the
existing conversion functions such as to_timestamp(), from_timestamp()
etc. remain unchanged and use SimpleDateFormat.

Milestone 1 contains all the format tokens covered by the SQL
standard. Further milestones will add more functionality on top of
this list to cover functionality provided by other RDBMS systems.

List of tokens implemented by this change:
- , YYY, YY, Y: Year tokens
- , RR: Round year tokens
- MM: Month
- DD: Day
- DDD: Day of year
- HH, HH12: Hour of day (1-12)
- HH24: Hour of day (0-23)
- MI: Minute
- SS: Second
- S: Second of day
- FF, FF1, ..., FF9: Fractional second
- AM, PM, A.M., P.M.: Meridiem indicators
- TZH: Timezone hour
- TZM: Timezone minute
- Separators: - . / , ' ; : space
- ISO8601 date indicators (T, Z)

Some notes about the matching algorithm:
- The parsing algorithm uses these tokens in a case insensitive
  manner.
- The separators are interchangeable with each other. For example a
  '-' separator in the format will match with a '.' character in the
  input.
- The length of the separator sequences is handled flexibly meaning
  that a single separator character in the format for instance would
  match with a multi-separator sequence in the input.
- In a string type to timestamp conversion the timezone offset tokens
  are parsed, expected to match with the input but they don't adjust
  the result as the input is already expected to be in UTC format.

Usage example:
SELECT CAST('01-02-2019' AS TIMESTAMP FORMAT 'MM-DD-');
SELECT CAST('2019.10.10 13:30:40.123456 +01:30' AS TIMESTAMP
FORMAT '-MM-DD HH24:MI:SS.FF9 TZH:TZM');
SELECT CAST(timestamp_column as STRING
FORMAT " MM HH12 YY") from some_table;

Change-Id: I19d8d097a45ae6f103b6cd1b2d81aad38dfd9e23
---
M be/src/benchmarks/convert-timestamp-benchmark.cc
M be/src/benchmarks/parse-timestamp-benchmark.cc
M be/src/common/init.cc
M be/src/exec/text-converter.inline.h
M be/src/exprs/CMakeLists.txt
A be/src/exprs/cast-expr.cc
A be/src/exprs/cast-expr.h
M be/src/exprs/cast-functions-ir.cc
M be/src/exprs/date-functions-ir.cc
M be/src/exprs/expr-test.cc
M be/src/exprs/scalar-expr-evaluator.h
M be/src/exprs/scalar-expr.cc
M be/src/exprs/scalar-expr.h
M be/src/exprs/timestamp-functions-ir.cc
M be/src/exprs/timestamp-functions.cc
M be/src/exprs/timestamp-functions.h
M be/src/runtime/CMakeLists.txt
M be/src/runtime/date-parse-util.cc
M be/src/runtime/date-parse-util.h
M be/src/runtime/date-test.cc
M be/src/runtime/date-value.cc
M be/src/runtime/date-value.h
A be/src/runtime/datetime-iso-sql-format-parser.cc
A be/src/runtime/datetime-iso-sql-format-parser.h
A be/src/runtime/datetime-iso-sql-format-tokenizer.cc
A be/src/runtime/datetime-iso-sql-format-tokenizer.h
D be/src/runtime/datetime-parse-util.h
A be/src/runtime/datetime-parser-common.cc
A be/src/runtime/datetime-parser-common.h
R be/src/runtime/datetime-simple-date-format-parser.cc
A be/src/runtime/datetime-simple-date-format-parser.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/timestamp-parse-util.cc
M be/src/runtime/timestamp-parse-util.h
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.cc
M be/src/runtime/timestamp-value.h
M be/src/service/impala-server.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M be/src/testutil/random-vector-generators.h
M be/src/util/dict-test.cc
M be/src/util/min-max-filter-test.cc
M be/src/util/string-parser.h
M common/thrift/Exprs.thrift
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/CastExpr.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
A 
testdata/workloads/functional-query/queries/QueryTest/cast_format_from_table.test
M testdata/workloads/functional-query/queries/QueryTest/date.test
A tests/query_test/test_cast_with_format.py
54 files changed, 3,340 insertions(+), 853 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/22/13722/4
--
To view, visit http://gerrit.cloudera.org:8080/13722
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF