[jira] [Assigned] (HIVE-21564) Load data into a bucketed table is ignoring partitions specs and loading data into default partition.

2019-04-01 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan reassigned HIVE-21564:
---


> Load data into a bucketed table is ignoring partitions specs and loading data 
> into default partition.
> -
>
> Key: HIVE-21564
> URL: https://issues.apache.org/jira/browse/HIVE-21564
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>
> When running the below command to load data into a bucketed table, the data is not 
> loaded into the specified partition; instead it is loaded into the default partition.
> {code}
> LOAD DATA INPATH '/tmp/files/00_0' OVERWRITE INTO TABLE call 
> PARTITION(year_partition=2012, month=12);
> SELECT * FROM call WHERE year_partition=2012 AND month=12; --> returns 0 rows.
> {code}
> {code}
> CREATE TABLE call( 
> date_time_date date, 
> ssn string, 
> name string, 
> location string) 
> PARTITIONED BY ( 
> year_partition int, 
> month int) 
> CLUSTERED BY ( 
> date_time_date) 
> SORTED BY ( 
> date_time_date ASC) 
> INTO 1 BUCKETS 
> STORED AS ORC;
> {code}
> If hive.exec.dynamic.partition is set to false, the load fails with the error below.
> {code}
> Error: Error while compiling statement: FAILED: SemanticException 1:18 
> Dynamic partition is disabled. Either enable it by setting 
> hive.exec.dynamic.partition=true or specify partition column values. Error 
> encountered near token 'month' (state=42000,code=4)
> {code}
> When we set hive.strict.checks.bucketing=false, the load works fine.
> This behaviour was imposed by HIVE-15148 to avoid incorrectly named data 
> files being loaded into bucketed tables. In the customer's use case, if the files 
> are named properly with a bucket id (0_0, 0_1, etc.), then it is safe to 
> set this flag to false.
> However, the current behaviour of loading into the default partition when 
> hive.strict.checks.bucketing=true and partitions are specified is a bug 
> introduced by HIVE-19311, which rewrites the given query into an INSERT 
> query (to handle incorrect file names and ORC versions) but fails to 
> carry the partition spec over to it. 
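
A minimal sketch of the workaround described above, assuming the `call` table from this report and input files already named with correct bucket ids; everything beyond the flag and the statements quoted in the description is illustrative, not the actual fix:

```sql
-- Relax the strict bucketing check; per HIVE-15148 this is safe only
-- when the input files already carry proper bucket ids (0_0, 0_1, ...).
SET hive.strict.checks.bucketing=false;

LOAD DATA INPATH '/tmp/files/00_0' OVERWRITE INTO TABLE call
PARTITION(year_partition=2012, month=12);

-- Restore the default strict behaviour afterwards.
SET hive.strict.checks.bucketing=true;

-- The rows should now land in the requested partition.
SELECT * FROM call WHERE year_partition=2012 AND month=12;
```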



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21539) GroupBy + where clause on same column results in incorrect query rewrite

2019-04-01 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-21539:
---
Attachment: HIVE-21539.2.patch

> GroupBy + where clause on same column results in incorrect query rewrite
> 
>
> Key: HIVE-21539
> URL: https://issues.apache.org/jira/browse/HIVE-21539
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 4.0.0
>Reporter: anishek
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-21539.1.patch, HIVE-21539.2.patch
>
>
> {code}
> create table a (i int, j string);
> insert into a values ( 1, 'a'),(2,'b');
> explain extended select min(j) from a where j='a' group by j;
> ++
> |  Explain   |
> ++
> | OPTIMIZED SQL: SELECT MIN(TRUE) AS `_o__c0`|
> | FROM `default`.`a` |
> | WHERE `j` = 'a'|
> | GROUP BY TRUE  |
> | STAGE DEPENDENCIES:|
> |   Stage-1 is a root stage  |
> |   Stage-0 depends on stages: Stage-1   |
> ||
> | STAGE PLANS:   |
> |   Stage: Stage-1   |
> | Tez|
> |   DagId: 
> anagarwal_20190318153535_25c1f460-1986-475e-9995-9f6342029dd8:11 |
> |   Edges:   |
> | Reducer 2 <- Map 1 (SIMPLE_EDGE)   |
> |   DagName: 
> anagarwal_20190318153535_25c1f460-1986-475e-9995-9f6342029dd8:11 |
> |   Vertices:|
> | Map 1  |
> | Map Operator Tree: |
> | TableScan  |
> |   alias: a |
> |   filterExpr: (j = 'a') (type: boolean) |
> |   Statistics: Num rows: 2 Data size: 170 Basic stats: 
> COMPLETE Column stats: COMPLETE |
> |   GatherStats: false   |
> |   Filter Operator  |
> | isSamplingPred: false  |
> | predicate: (j = 'a') (type: boolean) |
> | Statistics: Num rows: 1 Data size: 85 Basic stats: 
> COMPLETE Column stats: COMPLETE |
> | Select Operator|
> |   Statistics: Num rows: 1 Data size: 85 Basic stats: 
> COMPLETE Column stats: COMPLETE |
> |   Group By Operator|
> | aggregations: min(true)|
> | keys: true (type: boolean) |
> | mode: hash |
> | outputColumnNames: _col0, _col1 |
> | Statistics: Num rows: 1 Data size: 8 Basic stats: 
> COMPLETE Column stats: COMPLETE |
> | Reduce Output Operator |
> |   key expressions: _col0 (type: boolean) |
> |   null sort order: a   |
> |   sort order: +|
> |   Map-reduce partition columns: _col0 (type: 
> boolean) |
> |   Statistics: Num rows: 1 Data size: 8 Basic stats: 
> COMPLETE Column stats: COMPLETE |
> |   tag: -1  |
> |   value expressions: _col1 (type: boolean) |
> |   auto parallelism: true   |
> | Path -> Alias: |
> |   hdfs://localhost:9000/tmp/hive/warehouse/a [a] |
> | Path -> Partition: |
> |   hdfs://localhost:9000/tmp/hive/warehouse/a  |
> | Partition  |
> |   base file name: a|
> |   input format: org.apache.hadoop.mapred.TextInputFormat |
> |   output format: 
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat |
> |   properties:  |
> | COLUMN_STATS_ACCURATE 
> {"BASIC_STATS":"true","COLUMN_STATS":{"i":"true","j":"true"}} |
> | bucket_count -1|
> | bucketing_version 2|
> | column.name.delimiter ,|
> | columns i,j|
> | columns.comments   |
> {code}

[jira] [Updated] (HIVE-21539) GroupBy + where clause on same column results in incorrect query rewrite

2019-04-01 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-21539:
---
Status: Patch Available  (was: Open)


[jira] [Updated] (HIVE-21539) GroupBy + where clause on same column results in incorrect query rewrite

2019-04-01 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-21539:
---
Status: Open  (was: Patch Available)


[jira] [Assigned] (HIVE-21563) Improve Table#getEmptyTable performance by disable registerAllFunctionsOnce

2019-04-01 Thread Yuming Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuming Wang reassigned HIVE-21563:
--


> Improve Table#getEmptyTable performance by disable registerAllFunctionsOnce
> ---
>
> Key: HIVE-21563
> URL: https://issues.apache.org/jira/browse/HIVE-21563
> Project: Hive
>  Issue Type: Improvement
>Reporter: Yuming Wang
>Assignee: Yuming Wang
>Priority: Major
>
> We do not need to call registerAllFunctionsOnce from {{Table#getEmptyTable}}. The 
> stack trace:
> {noformat}
>   at 
> org.apache.hadoop.hive.ql.exec.Registry.registerGenericUDF(Registry.java:177)
>   at 
> org.apache.hadoop.hive.ql.exec.Registry.registerGenericUDF(Registry.java:170)
>   at 
> org.apache.hadoop.hive.ql.exec.FunctionRegistry.(FunctionRegistry.java:209)
>   at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:247)
>   at 
> org.apache.hadoop.hive.ql.metadata.Hive.registerAllFunctionsOnce(Hive.java:231)
>   at org.apache.hadoop.hive.ql.metadata.Hive.(Hive.java:388)
>   at org.apache.hadoop.hive.ql.metadata.Hive.create(Hive.java:332)
>   at org.apache.hadoop.hive.ql.metadata.Hive.getInternal(Hive.java:312)
>   at org.apache.hadoop.hive.ql.metadata.Hive.get(Hive.java:288)
>   at 
> org.apache.hadoop.hive.ql.session.SessionState.setAuthorizerV2Config(SessionState.java:913)
>   at 
> org.apache.hadoop.hive.ql.session.SessionState.setupAuth(SessionState.java:877)
>   at 
> org.apache.hadoop.hive.ql.session.SessionState.getAuthenticator(SessionState.java:1479)
>   at 
> org.apache.hadoop.hive.ql.session.SessionState.getUserFromAuthenticator(SessionState.java:1150)
>   at org.apache.hadoop.hive.ql.metadata.Table.getEmptyTable(Table.java:180)
> {noformat}





[jira] [Commented] (HIVE-9995) ACID compaction tries to compact a single file

2019-04-01 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-9995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16807409#comment-16807409
 ] 

Hive QA commented on HIVE-9995:
---



Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12964487/HIVE-9995.10.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 15890 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16811/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16811/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16811/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12964487 - PreCommit-HIVE-Build

> ACID compaction tries to compact a single file
> --
>
> Key: HIVE-9995
> URL: https://issues.apache.org/jira/browse/HIVE-9995
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Denys Kuzmenko
>Priority: Major
> Attachments: HIVE-9995.01.patch, HIVE-9995.02.patch, 
> HIVE-9995.03.patch, HIVE-9995.04.patch, HIVE-9995.05.patch, 
> HIVE-9995.06.patch, HIVE-9995.07.patch, HIVE-9995.08.patch, 
> HIVE-9995.09.patch, HIVE-9995.10.patch, HIVE-9995.WIP.patch
>
>
> Consider TestWorker.minorWithOpenInMiddle().
> Since there is an open txnId=23, there is no meaningful minor 
> compaction work to do. The system still tries to compact the single delta file 
> for the 21-22 id range, effectively copying the file onto itself.
> This is (1) inefficient and (2) can potentially affect a reader.
> (from a real cluster)
> Suppose we start with 
> {noformat}
> drwxr-xr-x   - ekoifman staff  0 2016-06-09 16:03 
> /user/hive/warehouse/t/base_016
> -rw-r--r--   1 ekoifman staff602 2016-06-09 16:03 
> /user/hive/warehouse/t/base_016/bucket_0
> drwxr-xr-x   - ekoifman staff  0 2016-06-09 16:07 
> /user/hive/warehouse/t/base_017
> -rw-r--r--   1 ekoifman staff588 2016-06-09 16:07 
> /user/hive/warehouse/t/base_017/bucket_0
> drwxr-xr-x   - ekoifman staff  0 2016-06-09 16:07 
> /user/hive/warehouse/t/delta_017_017_
> -rw-r--r--   1 ekoifman staff514 2016-06-09 16:06 
> /user/hive/warehouse/t/delta_017_017_/bucket_0
> drwxr-xr-x   - ekoifman staff  0 2016-06-09 16:07 
> /user/hive/warehouse/t/delta_018_018_
> -rw-r--r--   1 ekoifman staff612 2016-06-09 16:07 
> /user/hive/warehouse/t/delta_018_018_/bucket_0
> {noformat}
> then do _alter table T compact 'minor';_
> then we end up with 
> {noformat}
> drwxr-xr-x   - ekoifman staff  0 2016-06-09 16:07 
> /user/hive/warehouse/t/base_017
> -rw-r--r--   1 ekoifman staff588 2016-06-09 16:07 
> /user/hive/warehouse/t/base_017/bucket_0
> drwxr-xr-x   - ekoifman staff  0 2016-06-09 16:11 
> /user/hive/warehouse/t/delta_018_018
> -rw-r--r--   1 ekoifman staff500 2016-06-09 16:11 
> /user/hive/warehouse/t/delta_018_018/bucket_0
> drwxr-xr-x   - ekoifman staff  0 2016-06-09 16:07 
> /user/hive/warehouse/t/delta_018_018_
> -rw-r--r--   1 ekoifman staff612 2016-06-09 16:07 
> /user/hive/warehouse/t/delta_018_018_/bucket_0
> {noformat}
> So compaction created a new dir _/user/hive/warehouse/t/delta_018_018_





[jira] [Commented] (HIVE-21539) GroupBy + where clause on same column results in incorrect query rewrite

2019-04-01 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16807348#comment-16807348
 ] 

Hive QA commented on HIVE-21539:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 
33s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
14s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
44s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 
28s{color} | {color:blue} ql in master has 2258 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
5s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
13s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
45s{color} | {color:red} ql: The patch generated 2 new + 1 unchanged - 0 fixed 
= 3 total (was 1) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
9s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
15s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 27m 19s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-16807/dev-support/hive-personality.sh
 |
| git revision | master / 2111c01 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16807/yetus/diff-checkstyle-ql.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16807/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> GroupBy + where clause on same column results in incorrect query rewrite
> 
>
> Key: HIVE-21539
> URL: https://issues.apache.org/jira/browse/HIVE-21539
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 4.0.0
>Reporter: anishek
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-21539.1.patch
>
>

[jira] [Commented] (HIVE-21560) Update Derby DDL to use CLOB instead of LONG VARCHAR

2019-04-01 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16807370#comment-16807370
 ] 

Hive QA commented on HIVE-21560:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12964482/HIVE-21560.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16808/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16808/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16808/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2019-04-02 02:40:09.551
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-16808/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2019-04-02 02:40:09.554
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 2111c01 HIVE-21537: Scalar query rewrite could be improved to 
not generate an extra join if subquery is guaranteed to produce atmost one row 
(Vineet Garg, reviewed by Jesus Camacho Rodriguez)
+ git clean -f -d
Removing standalone-metastore/metastore-server/src/gen/
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 2111c01 HIVE-21537: Scalar query rewrite could be improved to 
not generate an extra join if subquery is guaranteed to produce atmost one row 
(Vineet Garg, reviewed by Jesus Camacho Rodriguez)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2019-04-02 02:40:10.213
+ rm -rf ../yetus_PreCommit-HIVE-Build-16808
+ mkdir ../yetus_PreCommit-HIVE-Build-16808
+ git gc
+ cp -R . ../yetus_PreCommit-HIVE-Build-16808
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-16808/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: a/metastore/scripts/upgrade/derby/058-HIVE-21560.derby.sql: does not 
exist in index
error: a/standalone-metastore/metastore-server/src/main/resources/package.jdo: 
does not exist in index
error: 
a/standalone-metastore/metastore-server/src/main/sql/derby/hive-schema-3.2.0.derby.sql:
 does not exist in index
error: 
a/standalone-metastore/metastore-server/src/main/sql/derby/hive-schema-4.0.0.derby.sql:
 does not exist in index
error: 
a/standalone-metastore/metastore-server/src/main/sql/derby/upgrade-3.1.0-to-3.2.0.derby.sql:
 does not exist in index
error: metastore/scripts/upgrade/derby/058-HIVE-21560.derby.sql: does not exist 
in index
error: scripts/upgrade/derby/058-HIVE-21560.derby.sql: does not exist in index
error: metastore-server/src/main/resources/package.jdo: does not exist in index
error: metastore-server/src/main/sql/derby/hive-schema-3.2.0.derby.sql: does 
not exist in index
error: metastore-server/src/main/sql/derby/hive-schema-4.0.0.derby.sql: does 
not exist in index
error: metastore-server/src/main/sql/derby/upgrade-3.1.0-to-3.2.0.derby.sql: 
does not exist in index
The patch does not appear to apply with p0, p1, or p2
+ result=1
+ '[' 1 -ne 0 ']'
+ rm -rf yetus_PreCommit-HIVE-Build-16808
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12964482 - PreCommit-HIVE-Build

> Update Derby DDL to use CLOB instead of LONG VARCHAR
> 
>
> Key: HIVE-21560
> URL: https://issues.apache.org/jira/browse/HIVE-21560
> Project: Hive
>  Issue Type: Bug
>Reporter: Shawn Weeks
>Assignee: Rajkumar Singh
>Priority: Minor
> Attachments: HIVE-21560.patch
>
>
> in the 

[jira] [Commented] (HIVE-21539) GroupBy + where clause on same column results in incorrect query rewrite

2019-04-01 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16807369#comment-16807369
 ] 

Hive QA commented on HIVE-21539:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12964480/HIVE-21539.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15861 tests 
executed
*Failed tests:*
{noformat}
TestMiniLlapLocalCliDriver - did not produce a TEST-*.xml file (likely timed 
out) (batchId=171)

[authorization_view_8.q,load_dyn_part5.q,vector_groupby_grouping_sets5.q,vector_complex_join.q,orc_llap.q,vectorization_7.q,cbo_gby.q,vectorized_dynamic_semijoin_reduction2.q,bucket_num_reducers_acid2.q,schema_evol_orc_vec_table.q,auto_sortmerge_join_1.q,results_cache_empty_result.q,lineage3.q,materialized_view_rewrite_empty.q,q93_with_constraints.q,vector_struct_in.q,bucketmapjoin3.q,vectorization_16.q,orc_ppd_schema_evol_2a.q,partition_ctas.q,vector_windowing_multipartitioning.q,vectorized_join46.q,orc_ppd_date.q,create_merge_compressed.q,vector_outer_join1.q,dynpart_sort_optimization_acid.q,vectorization_not.q,having.q,vector_topnkey.q,special_character_in_tabnames_1.q]
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16807/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16807/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16807/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12964480 - PreCommit-HIVE-Build

> GroupBy + where clause on same column results in incorrect query rewrite
> 
>
> Key: HIVE-21539
> URL: https://issues.apache.org/jira/browse/HIVE-21539
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 4.0.0
>Reporter: anishek
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-21539.1.patch
>
>

[jira] [Comment Edited] (HIVE-21166) Keyword as column name in DBS table of Hive metastore

2019-04-01 Thread Jianguo Tian (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16807359#comment-16807359
 ] 

Jianguo Tian edited comment on HIVE-21166 at 4/2/19 2:22 AM:
-

You can query the DESC column like this: 
{code:sql}
select `DESC` from DBS limit 10;
{code}


was (Author: jonnyr):
You can query DESC column like this: 
{code:java}
// select `DESC` from DBS limit 10;
{code}

> Keyword as column name in DBS table of Hive metastore
> -
>
> Key: HIVE-21166
> URL: https://issues.apache.org/jira/browse/HIVE-21166
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Vamsi UCSS
>Priority: Blocker
>
> The table "DBS" in the Hive metastore schema has a column called "DESC", which 
> is a Hive keyword. This causes any query that references the column unquoted to 
> fail with a syntax error.





[jira] [Commented] (HIVE-21166) Keyword as column name in DBS table of Hive metastore

2019-04-01 Thread Jianguo Tian (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807359#comment-16807359
 ] 

Jianguo Tian commented on HIVE-21166:
-

You can query the DESC column like this: 
{code:java}
// select `DESC` from DBS limit 10;
{code}

> Keyword as column name in DBS table of Hive metastore
> -
>
> Key: HIVE-21166
> URL: https://issues.apache.org/jira/browse/HIVE-21166
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Vamsi UCSS
>Priority: Blocker
>
> The table "DBS" in the Hive metastore schema has a column called "DESC", which 
> is a Hive keyword. As a result, any query that references this column without 
> quoting it fails with a syntax error.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20382) Materialized views: Introduce heuristic to favour incremental rebuild

2019-04-01 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807382#comment-16807382
 ] 

Hive QA commented on HIVE-20382:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m  
5s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
30s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
34s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 2s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
38s{color} | {color:blue} common in master has 63 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 
26s{color} | {color:blue} ql in master has 2258 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
23s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
29s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
 4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
37s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
46s{color} | {color:red} ql: The patch generated 6 new + 147 unchanged - 2 
fixed = 153 total (was 149) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  4m 
43s{color} | {color:red} ql generated 1 new + 2258 unchanged - 0 fixed = 2259 
total (was 2258) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
21s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
15s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 31m 46s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:ql |
|  |  
org.apache.hadoop.hive.ql.optimizer.calcite.rules.views.HiveMaterializedViewRule.MATERIALIZED_VIEW_REWRITING_RULES
 is a mutable array  At HiveMaterializedViewRule.java: At 
HiveMaterializedViewRule.java:[line 105] |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-16810/dev-support/hive-personality.sh
 |
| git revision | master / 2111c01 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16810/yetus/diff-checkstyle-ql.txt
 |
| findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16810/yetus/new-findbugs-ql.html
 |
| modules | C: common ql U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16810/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Materialized views: Introduce heuristic to favour incremental rebuild
> -
>
> Key: HIVE-20382
> URL: https://issues.apache.org/jira/browse/HIVE-20382
> Project: Hive
>  Issue Type: Improvement
>  Components: Materialized views
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20382.01.patch, HIVE-20382.patch, HIVE-20382.patch
>
>
> Currently, we do not expose stats over ROW\_\_ID.writeId to the optimizer 
> (this should be fixed by HIVE-20313). Even 

[jira] [Updated] (HIVE-21560) Update Derby DDL to use CLOB instead of LONG VARCHAR

2019-04-01 Thread Rajkumar Singh (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajkumar Singh updated HIVE-21560:
--
Attachment: HIVE-21560.01.patch
Status: Patch Available  (was: Open)

> Update Derby DDL to use CLOB instead of LONG VARCHAR
> 
>
> Key: HIVE-21560
> URL: https://issues.apache.org/jira/browse/HIVE-21560
> Project: Hive
>  Issue Type: Bug
>Reporter: Shawn Weeks
>Assignee: Rajkumar Singh
>Priority: Minor
> Attachments: HIVE-21560.01.patch, HIVE-21560.patch
>
>
> In the Hive 1.x and 2.x metastore schema for Derby, there are two columns in 
> "TBLS" that are defined as LONG VARCHAR. This causes larger CREATE VIEW 
> statements to fail when using the embedded metastore for testing.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21560) Update Derby DDL to use CLOB instead of LONG VARCHAR

2019-04-01 Thread Rajkumar Singh (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajkumar Singh updated HIVE-21560:
--
Status: Open  (was: Patch Available)

> Update Derby DDL to use CLOB instead of LONG VARCHAR
> 
>
> Key: HIVE-21560
> URL: https://issues.apache.org/jira/browse/HIVE-21560
> Project: Hive
>  Issue Type: Bug
>Reporter: Shawn Weeks
>Assignee: Rajkumar Singh
>Priority: Minor
> Attachments: HIVE-21560.01.patch, HIVE-21560.patch
>
>
> In the Hive 1.x and 2.x metastore schema for Derby, there are two columns in 
> "TBLS" that are defined as LONG VARCHAR. This causes larger CREATE VIEW 
> statements to fail when using the embedded metastore for testing.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21499) should not remove the function from registry if create command failed with AlreadyExistsException

2019-04-01 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-21499:

   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Rajkumar!

> should not remove the function from registry if create command failed with 
> AlreadyExistsException
> -
>
> Key: HIVE-21499
> URL: https://issues.apache.org/jira/browse/HIVE-21499
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.0
> Environment: Hive-3.1
>Reporter: Rajkumar Singh
>Assignee: Rajkumar Singh
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21499.01.patch, HIVE-21499.02.patch, 
> HIVE-21499.patch
>
>
> As part of HIVE-20953, we remove the function from the registry if its 
> creation fails for any reason. This leads to the following situation:
> 1. CREATE FUNCTION fails because the function already exists.
> 2. On the failure in #1, Hive clears the permanent function from the registry.
> 3. The function is then unusable until HiveServer2 is restarted.
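The guard described above can be sketched as follows. This is a minimal illustration only, assuming a simplified registry; the class, method, and map names are hypothetical, not Hive's actual Registry API:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: roll back the in-memory registration only when creation failed
// for a reason OTHER than "already exists". If the function already exists,
// the registry entry is still valid and must be kept.
public class FunctionRegistrySketch {
    public static final Map<String, String> registry = new HashMap<>();

    static class AlreadyExistsException extends RuntimeException {}

    public static void createFunction(String name, String className) {
        registry.put(name, className);
        try {
            persistToMetastore(name);
        } catch (AlreadyExistsException e) {
            // Function already exists in the metastore: keep the registry
            // entry so the function stays usable, then propagate the error.
            throw e;
        } catch (RuntimeException e) {
            // Any other failure: undo the in-memory registration.
            registry.remove(name);
            throw e;
        }
    }

    // Simulated metastore call: "dup" already exists, "broken" hits an
    // unrelated failure; anything else succeeds.
    static void persistToMetastore(String name) {
        if (name.equals("dup")) throw new AlreadyExistsException();
        if (name.equals("broken")) throw new RuntimeException("metastore down");
    }
}
```

With this shape, re-creating an existing function reports the error but leaves it callable, matching the desired behavior in #2 above.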



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20382) Materialized views: Introduce heuristic to favour incremental rebuild

2019-04-01 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-20382:
---
Attachment: HIVE-20382.02.patch

> Materialized views: Introduce heuristic to favour incremental rebuild
> -
>
> Key: HIVE-20382
> URL: https://issues.apache.org/jira/browse/HIVE-20382
> Project: Hive
>  Issue Type: Improvement
>  Components: Materialized views
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20382.01.patch, HIVE-20382.02.patch, 
> HIVE-20382.patch, HIVE-20382.patch
>
>
> Currently, we do not expose stats over ROW\_\_ID.writeId to the optimizer 
> (this should be fixed by HIVE-20313). Even if we did, we always assume 
> uniform distribution of the column values, which can easily lead to 
> overestimations on the number of rows read when we filter on 
> ROW\_\_ID.writeId for materialized views (think about a large transaction for 
> MV creation and then small ones for incremental maintenance). This 
> overestimation can lead to incremental view maintenance not being triggered 
> as cost of the incremental plan is overestimated (we think we will read more 
> rows than we actually do). This could be fixed by introducing histograms that 
> reflect better the column values distribution.
> Till both fixes are implemented, we will use a config variable that will 
> multiply the estimated cost of the rebuild plan and hence will be able to 
> favour incremental rebuild over full rebuild.
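As a rough illustration of the proposed heuristic: penalizing the full-rebuild cost by a configurable factor makes the incremental plan win unless it is far more expensive. The class name, method, and multiplier value below are hypothetical, not the actual config property:

```java
// Sketch of the cost heuristic: the estimated cost of the full rebuild plan
// is scaled by a configurable multiplier > 1, so the planner picks the
// incremental rebuild unless its (likely overestimated) cost exceeds even
// the penalized full-rebuild cost.
public class RebuildCostSketch {
    // Hypothetical config value; larger values favour incremental rebuild.
    static final double FULL_REBUILD_COST_MULTIPLIER = 10.0;

    public static boolean chooseIncremental(double incrementalCost, double fullCost) {
        return incrementalCost <= fullCost * FULL_REBUILD_COST_MULTIPLIER;
    }
}
```

For example, an incremental plan estimated at 5x the full-rebuild cost would still be chosen under a 10x multiplier, which compensates for the overestimation described above.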



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-9995) ACID compaction tries to compact a single file

2019-04-01 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-9995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807402#comment-16807402
 ] 

Hive QA commented on HIVE-9995:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 
 9s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
15s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
51s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 
28s{color} | {color:blue} ql in master has 2258 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
8s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
48s{color} | {color:green} ql: The patch generated 0 new + 823 unchanged - 11 
fixed = 823 total (was 834) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
5s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 27m 16s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-16811/dev-support/hive-personality.sh
 |
| git revision | master / 5bf5d14 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16811/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> ACID compaction tries to compact a single file
> --
>
> Key: HIVE-9995
> URL: https://issues.apache.org/jira/browse/HIVE-9995
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Denys Kuzmenko
>Priority: Major
> Attachments: HIVE-9995.01.patch, HIVE-9995.02.patch, 
> HIVE-9995.03.patch, HIVE-9995.04.patch, HIVE-9995.05.patch, 
> HIVE-9995.06.patch, HIVE-9995.07.patch, HIVE-9995.08.patch, 
> HIVE-9995.09.patch, HIVE-9995.10.patch, HIVE-9995.WIP.patch
>
>
> Consider TestWorker.minorWithOpenInMiddle().
> Since there is an open txnId=23, there is no meaningful minor compaction 
> work to do. The system still tries to compact a single delta file for the 
> 21-22 id range, effectively copying the file onto itself.
> This is (1) inefficient and (2) can potentially affect a reader.
> (from a real cluster)
> Suppose we start with 
> {noformat}
> drwxr-xr-x   - ekoifman staff  0 2016-06-09 16:03 
> /user/hive/warehouse/t/base_016
> -rw-r--r--   1 ekoifman staff602 2016-06-09 16:03 
> /user/hive/warehouse/t/base_016/bucket_0
> drwxr-xr-x   - ekoifman staff  0 2016-06-09 16:07 
> /user/hive/warehouse/t/base_017
> -rw-r--r--   1 ekoifman staff588 2016-06-09 16:07 
> /user/hive/warehouse/t/base_017/bucket_0
> drwxr-xr-x   - ekoifman staff  0 2016-06-09 16:07 
> /user/hive/warehouse/t/delta_017_017_
> -rw-r--r--   1 ekoifman staff514 

[jira] [Updated] (HIVE-21563) Improve Table#getEmptyTable performance by disable registerAllFunctionsOnce

2019-04-01 Thread Yuming Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuming Wang updated HIVE-21563:
---
Attachment: HIVE-21563.001.patch
Status: Patch Available  (was: Open)

> Improve Table#getEmptyTable performance by disable registerAllFunctionsOnce
> ---
>
> Key: HIVE-21563
> URL: https://issues.apache.org/jira/browse/HIVE-21563
> Project: Hive
>  Issue Type: Improvement
>Reporter: Yuming Wang
>Assignee: Yuming Wang
>Priority: Major
> Attachments: HIVE-21563.001.patch
>
>
> We do not need to run registerAllFunctionsOnce when calling 
> {{Table#getEmptyTable}}. The stack trace:
> {noformat}
>   at 
> org.apache.hadoop.hive.ql.exec.Registry.registerGenericUDF(Registry.java:177)
>   at 
> org.apache.hadoop.hive.ql.exec.Registry.registerGenericUDF(Registry.java:170)
>   at 
> org.apache.hadoop.hive.ql.exec.FunctionRegistry.(FunctionRegistry.java:209)
>   at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:247)
>   at 
> org.apache.hadoop.hive.ql.metadata.Hive.registerAllFunctionsOnce(Hive.java:231)
>   at org.apache.hadoop.hive.ql.metadata.Hive.(Hive.java:388)
>   at org.apache.hadoop.hive.ql.metadata.Hive.create(Hive.java:332)
>   at org.apache.hadoop.hive.ql.metadata.Hive.getInternal(Hive.java:312)
>   at org.apache.hadoop.hive.ql.metadata.Hive.get(Hive.java:288)
>   at 
> org.apache.hadoop.hive.ql.session.SessionState.setAuthorizerV2Config(SessionState.java:913)
>   at 
> org.apache.hadoop.hive.ql.session.SessionState.setupAuth(SessionState.java:877)
>   at 
> org.apache.hadoop.hive.ql.session.SessionState.getAuthenticator(SessionState.java:1479)
>   at 
> org.apache.hadoop.hive.ql.session.SessionState.getUserFromAuthenticator(SessionState.java:1150)
>   at org.apache.hadoop.hive.ql.metadata.Table.getEmptyTable(Table.java:180)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21561) Revert removal of TableType.INDEX_TABLE enum

2019-04-01 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807415#comment-16807415
 ] 

Hive QA commented on HIVE-21561:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 
14s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
36s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
11s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  2m 
48s{color} | {color:blue} standalone-metastore/metastore-common in master has 
29 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
59s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 19m 54s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-16813/dev-support/hive-personality.sh
 |
| git revision | master / 5bf5d14 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: standalone-metastore/metastore-common U: 
standalone-metastore/metastore-common |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16813/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Revert removal of TableType.INDEX_TABLE enum
> 
>
> Key: HIVE-21561
> URL: https://issues.apache.org/jira/browse/HIVE-21561
> Project: Hive
>  Issue Type: Bug
>Reporter: Jason Dere
>Assignee: Jason Dere
>Priority: Major
> Attachments: HIVE-21561.1.patch, HIVE-21561.2.patch
>
>
> Index tables have been removed from Hive as of HIVE-18715.
> However, in case users still have index tables defined in the metastore, we 
> should keep the TableType.INDEX_TABLE enum around so that users can drop 
> these tables. Without the enum defined, Hive cannot do anything with them, as 
> it fails with IllegalArgumentException when trying to call 
> TableType.valueOf() on INDEX_TABLE.
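The failure mode is standard Java enum behavior, which a small stand-in demonstrates (TableKind below is a hypothetical stand-in for Hive's TableType, which has more constants):

```java
// Enum.valueOf throws IllegalArgumentException for a name that is not a
// declared constant. Removing INDEX_TABLE from the enum therefore makes
// any metastore row with that table type unparseable.
public class EnumValueOfDemo {
    enum TableKind { MANAGED_TABLE, EXTERNAL_TABLE, VIRTUAL_VIEW } // INDEX_TABLE removed

    public static boolean canParse(String name) {
        try {
            TableKind.valueOf(name);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }
}
```

Restoring the INDEX_TABLE constant makes valueOf succeed again, letting DROP TABLE proceed on leftover index tables.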



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20382) Materialized views: Introduce heuristic to favour incremental rebuild

2019-04-01 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807392#comment-16807392
 ] 

Hive QA commented on HIVE-20382:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12964486/HIVE-20382.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 15890 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_mv] 
(batchId=195)
org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager2.testMetastoreTablesCleanup 
(batchId=327)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16810/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16810/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16810/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12964486 - PreCommit-HIVE-Build

> Materialized views: Introduce heuristic to favour incremental rebuild
> -
>
> Key: HIVE-20382
> URL: https://issues.apache.org/jira/browse/HIVE-20382
> Project: Hive
>  Issue Type: Improvement
>  Components: Materialized views
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20382.01.patch, HIVE-20382.patch, HIVE-20382.patch
>
>
> Currently, we do not expose stats over ROW\_\_ID.writeId to the optimizer 
> (this should be fixed by HIVE-20313). Even if we did, we always assume 
> uniform distribution of the column values, which can easily lead to 
> overestimations on the number of rows read when we filter on 
> ROW\_\_ID.writeId for materialized views (think about a large transaction for 
> MV creation and then small ones for incremental maintenance). This 
> overestimation can lead to incremental view maintenance not being triggered 
> as cost of the incremental plan is overestimated (we think we will read more 
> rows than we actually do). This could be fixed by introducing histograms that 
> reflect better the column values distribution.
> Till both fixes are implemented, we will use a config variable that will 
> multiply the estimated cost of the rebuild plan and hence will be able to 
> favour incremental rebuild over full rebuild.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21561) Revert removal of TableType.INDEX_TABLE enum

2019-04-01 Thread Jason Dere (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-21561:
--
Attachment: HIVE-21561.2.patch

> Revert removal of TableType.INDEX_TABLE enum
> 
>
> Key: HIVE-21561
> URL: https://issues.apache.org/jira/browse/HIVE-21561
> Project: Hive
>  Issue Type: Bug
>Reporter: Jason Dere
>Assignee: Jason Dere
>Priority: Major
> Attachments: HIVE-21561.1.patch, HIVE-21561.2.patch
>
>
> Index tables have been removed from Hive as of HIVE-18715.
> However, in case users still have index tables defined in the metastore, we 
> should keep the TableType.INDEX_TABLE enum around so that users can drop 
> these tables. Without the enum defined, Hive cannot do anything with them, as 
> it fails with IllegalArgumentException when trying to call 
> TableType.valueOf() on INDEX_TABLE.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-14836) Test the predicate pushing down support for Parquet vectorization read path

2019-04-01 Thread Xinli Shang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807321#comment-16807321
 ] 

Xinli Shang commented on HIVE-14836:


It looks like there are merge errors. Is this something you are going to fix?

> Test the predicate pushing down support for Parquet vectorization read path
> ---
>
> Key: HIVE-14836
> URL: https://issues.apache.org/jira/browse/HIVE-14836
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-14836.patch
>
>
> We should add more unit tests for predicate pushdown support in the Parquet 
> vectorized read path.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21558) Query based compaction fails if the temporary FS is different than the table FS

2019-04-01 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807322#comment-16807322
 ] 

Hive QA commented on HIVE-21558:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 
33s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
19s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
44s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 
30s{color} | {color:blue} ql in master has 2258 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
6s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
6s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
15s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 27m 30s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-16806/dev-support/hive-personality.sh
 |
| git revision | master / 2111c01 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16806/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Query based compaction fails if the temporary FS is different than the table 
> FS
> ---
>
> Key: HIVE-21558
> URL: https://issues.apache.org/jira/browse/HIVE-21558
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
> Attachments: HIVE-21558.02.patch, HIVE-21558.patch
>
>
> The Exception I got is like this:
> {code:java}
> 2019-04-01T13:45:44,035 ERROR [PeterVary-MBP15.local-33] compactor.Worker: 
> Caught exception while trying to compact 
> id:24,dbname:default,tableName:acid,partName:null,state:,type:MAJOR,properties:null,runAs:petervary,tooManyAborts:false,highestWriteId:9.
>  Marking failed to avoid repeated failures, 
> java.lang.IllegalArgumentException: Wrong FS: 
> pfile:/Users/petervary/data/apache/hive/warehouse/acid/base_009_v284/bucket_0,
>  expected: file:///
> at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:781)
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:86)
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:636)
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:930)
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:631)
> at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:454)
> at org.apache.hadoop.fs.FileSystem.isFile(FileSystem.java:1768)
> at 
> 

[jira] [Commented] (HIVE-21558) Query based compaction fails if the temporary FS is different than the table FS

2019-04-01 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807335#comment-16807335
 ] 

Hive QA commented on HIVE-21558:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12964479/HIVE-21558.02.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 15890 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16806/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16806/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16806/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12964479 - PreCommit-HIVE-Build

> Query based compaction fails if the temporary FS is different than the table 
> FS
> ---
>
> Key: HIVE-21558
> URL: https://issues.apache.org/jira/browse/HIVE-21558
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
> Attachments: HIVE-21558.02.patch, HIVE-21558.patch
>
>
> The Exception I got is like this:
> {code:java}
> 2019-04-01T13:45:44,035 ERROR [PeterVary-MBP15.local-33] compactor.Worker: 
> Caught exception while trying to compact 
> id:24,dbname:default,tableName:acid,partName:null,state:,type:MAJOR,properties:null,runAs:petervary,tooManyAborts:false,highestWriteId:9.
>  Marking failed to avoid repeated failures, 
> java.lang.IllegalArgumentException: Wrong FS: 
> pfile:/Users/petervary/data/apache/hive/warehouse/acid/base_009_v284/bucket_0,
>  expected: file:///
> at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:781)
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:86)
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:636)
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:930)
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:631)
> at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:454)
> at org.apache.hadoop.fs.FileSystem.isFile(FileSystem.java:1768)
> at 
> org.apache.hadoop.hive.ql.io.ProxyLocalFileSystem.rename(ProxyLocalFileSystem.java:34)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.commitCrudMajorCompaction(CompactorMR.java:583)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.runCrudCompaction(CompactorMR.java:401)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:248)
> at org.apache.hadoop.hive.ql.txn.compactor.Worker.run(Worker.java:195){code}
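The "Wrong FS" check can be illustrated outside Hadoop: a filesystem bound to one URI scheme rejects paths with a different scheme. A minimal, self-contained sketch, with plain java.net.URI standing in for Hadoop's FileSystem.checkPath (the class and method names here are illustrative, not the Hive/Hadoop API):

```java
import java.net.URI;

// Sketch of the "Wrong FS" failure mode, assuming only that the
// filesystem rejects paths whose scheme differs from its own scheme.
// WrongFsDemo and checkPath are illustrative names.
public class WrongFsDemo {
    static void checkPath(URI fsUri, URI path) {
        String scheme = path.getScheme();
        if (scheme != null && !scheme.equals(fsUri.getScheme())) {
            throw new IllegalArgumentException(
                "Wrong FS: " + path + ", expected: " + fsUri);
        }
    }

    public static void main(String[] args) {
        URI localFs = URI.create("file:///");
        // A temporary dir on a different scheme, as in the report above:
        URI tmpPath = URI.create("pfile:/tmp/warehouse/acid/base_9/bucket_0");
        try {
            checkPath(localFs, tmpPath);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // prints the "Wrong FS: ..." message
        }
    }
}
```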



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21556) Useless configuration for old jetty in log4j.properties

2019-04-01 Thread Chen Zhi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Zhi updated HIVE-21556:

Description: We have upgraded to jetty 9 in HIVE-16049; the configuration 
`org.mortbay` in log4j.properties for the old version of jetty is useless.   (was: 
We have upgraded to jetty 9 in 
[HIVE-16049](https://issues.apache.org/jira/browse/HIVE-16049); the 
configuration `org.mortbay` in log4j.properties for the old version of jetty is 
useless. )

> Useless configuration for old jetty in log4j.properties
> ---
>
> Key: HIVE-21556
> URL: https://issues.apache.org/jira/browse/HIVE-21556
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Chen Zhi
>Priority: Minor
>
> We have upgraded to jetty 9 in HIVE-16049; the configuration `org.mortbay` in 
> log4j.properties for the old version of jetty is useless. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21556) Useless configuration for old jetty in log4j.properties

2019-04-01 Thread Chen Zhi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Zhi updated HIVE-21556:

Labels: patch-available  (was: )

> Useless configuration for old jetty in log4j.properties
> ---
>
> Key: HIVE-21556
> URL: https://issues.apache.org/jira/browse/HIVE-21556
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Chen Zhi
>Priority: Minor
>  Labels: patch-available
> Attachments: HIVE-21556.1.patch
>
>
>  
> {code:java}
> logger.Mortbay.name = org.mortbay
> logger.Mortbay.level = INFO
> {code}
> The logger `Mortbay` in log4j.properties is used to control the logging 
> activities of jetty (6.x). However, we have upgraded to jetty 9 in HIVE-16049; 
> the package name has changed to `org.eclipse.jetty`, and we have added a new 
> logger to control jetty. `Mortbay` is now useless, so we can remove it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-21404) MSSQL upgrade script alters the wrong column

2019-04-01 Thread Zoltan Haindrich (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich reassigned HIVE-21404:
---

Assignee: David Lavati  (was: Zoltan Haindrich)

> MSSQL upgrade script alters the wrong column
> 
>
> Key: HIVE-21404
> URL: https://issues.apache.org/jira/browse/HIVE-21404
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.2.0
>Reporter: David Lavati
>Assignee: David Lavati
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.2.0
>
> Attachments: HIVE-21404.1.patch, HIVE-21404.2.patch, 
> HIVE-21404.3.patch, HIVE-21404.4.patch, HIVE-21404.4.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> HIVE-20221 changes PARTITION_PARAMS, so the following command is modifying 
> the wrong table:
> {{ALTER TABLE "SERDE_PARAMS" ALTER COLUMN "PARAM_VALUE" nvarchar(MAX);}}
> https://github.com/apache/hive/blob/d3b036920acde7bb04840697eb13038103b062b4/standalone-metastore/metastore-server/src/main/sql/mssql/upgrade-3.1.0-to-3.2.0.mssql.sql#L21



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21404) MSSQL upgrade script alters the wrong column

2019-04-01 Thread Zoltan Haindrich (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-21404:

Attachment: HIVE-21404.4.patch

> MSSQL upgrade script alters the wrong column
> 
>
> Key: HIVE-21404
> URL: https://issues.apache.org/jira/browse/HIVE-21404
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.2.0
>Reporter: David Lavati
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.2.0
>
> Attachments: HIVE-21404.1.patch, HIVE-21404.2.patch, 
> HIVE-21404.3.patch, HIVE-21404.4.patch, HIVE-21404.4.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> HIVE-20221 changes PARTITION_PARAMS, so the following command is modifying 
> the wrong table:
> {{ALTER TABLE "SERDE_PARAMS" ALTER COLUMN "PARAM_VALUE" nvarchar(MAX);}}
> https://github.com/apache/hive/blob/d3b036920acde7bb04840697eb13038103b062b4/standalone-metastore/metastore-server/src/main/sql/mssql/upgrade-3.1.0-to-3.2.0.mssql.sql#L21



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-21407) Parquet predicate pushdown is not working correctly for char column types

2019-04-01 Thread Marta Kuczora (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marta Kuczora reassigned HIVE-21407:


Assignee: Marta Kuczora

> Parquet predicate pushdown is not working correctly for char column types
> -
>
> Key: HIVE-21407
> URL: https://issues.apache.org/jira/browse/HIVE-21407
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Marta Kuczora
>Assignee: Marta Kuczora
>Priority: Major
>
> If the 'hive.optimize.index.filter' parameter is false, the filter predicate 
> is not pushed to parquet, so the filtering only happens within Hive. If the 
> parameter is true, the filter is pushed to parquet, but for a char type, the 
> value which is pushed to Parquet will be padded with spaces:
> {noformat}
>   @Override
>   public void setValue(String val, int len) {
> super.setValue(HiveBaseChar.getPaddedValue(val, len), -1);
>   }
> {noformat} 
> So if we have a char(10) column which contains the value "apple" and the 
> where condition looks like 'where c='apple'', the value pushed to Parquet will 
> be 'apple' followed by 5 spaces. But the stored values are not padded, so no 
> rows will be returned from Parquet.
> How to reproduce:
> {noformat}
> $ create table ppd (c char(10), v varchar(10), i int) stored as parquet;
> $ insert into ppd values ('apple', 'bee', 1),('apple', 'tree', 2),('hello', 
> 'world', 1),('hello','vilag',3);
> $ set hive.optimize.ppd.storage=true;
> $ set hive.vectorized.execution.enabled=true;
> $ set hive.vectorized.execution.enabled=false;
> $ set hive.optimize.ppd=true;
> $ set hive.optimize.index.filter=true;
> $ set hive.parquet.timestamp.skip.conversion=false;
> $ select * from ppd where c='apple';
> ++++
> | ppd.c  | ppd.v  | ppd.i  |
> ++++
> ++++
> $ set hive.optimize.index.filter=false; or set 
> hive.optimize.ppd.storage=false;
> $ select * from ppd where c='apple';
> +-+++
> |ppd.c| ppd.v  | ppd.i  |
> +-+++
> | apple   | bee| 1  |
> | apple   | tree   | 2  |
> +-+++
> {noformat}
> The issue surfaced after the fix for 
> [HIVE-21327|https://issues.apache.org/jira/browse/HIVE-21327] was uploaded 
> upstream. Before the HIVE-21327 fix, setting the parameter 
> 'hive.parquet.timestamp.skip.conversion' to true in the parquet_ppd_char.q 
> test hid this issue.
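The padding mismatch described above can be reproduced in a few lines. The getPaddedValue method below mimics the behavior quoted from the setValue snippet (pad with spaces to the declared length); it is an illustration, not the HiveBaseChar implementation:

```java
// Sketch of the char(10) padding mismatch: the predicate value is padded
// to the declared length, but the values stored in Parquet are not, so
// equality never holds. Illustrative names, not Hive API.
public class CharPaddingDemo {
    static String getPaddedValue(String val, int len) {
        StringBuilder sb = new StringBuilder(val);
        while (sb.length() < len) {
            sb.append(' ');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String stored = "apple";                     // value as written to Parquet (unpadded)
        String pushed = getPaddedValue("apple", 10); // predicate value for a char(10) column
        System.out.println(pushed.equals(stored));   // false -> Parquet returns no rows
    }
}
```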



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21540) Query with join condition having date literal throws SemanticException.

2019-04-01 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16806425#comment-16806425
 ] 

Hive QA commented on HIVE-21540:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
53s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
17s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
47s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 
22s{color} | {color:blue} ql in master has 2257 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
7s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
7s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
15s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 26m 39s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-16790/dev-support/hive-personality.sh
 |
| git revision | master / dc0b16a |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16790/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Query with join condition having date literal throws SemanticException.
> ---
>
> Key: HIVE-21540
> URL: https://issues.apache.org/jira/browse/HIVE-21540
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 3.1.0, 3.1.1
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: Analyzer, DateField, pull-request-available
> Attachments: HIVE-21540.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This semantic exception is thrown for the following query. 
> *SemanticException '2019-03-20' encountered with 0 children*
> {code}
> create table date_1 (key int, dd date);
> create table date_2 (key int, dd date);
> select d1.key, d2.dd from(
>   select key, dd as start_dd, current_date as end_dd from date_1) d1
>   join date_2 as d2 on d1.key = d2.key where d2.dd between start_dd and 
> end_dd;
> {code}
> When the WHERE condition below is commented out, the query completes 
> successfully.
> where d2.dd between start_dd and end_dd
> 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21556) Useless configuration for old jetty in log4j.properties

2019-04-01 Thread Chen Zhi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Zhi updated HIVE-21556:

Description: 
 
{code:java}
logger.Mortbay.name = org.mortbay
logger.Mortbay.level = INFO
{code}
The logger `Mortbay` in log4j.properties is used to control the logging activities 
of jetty (6.x). However, we have upgraded to jetty 9 in HIVE-16049; the package 
name has changed to `org.eclipse.jetty`, and we have added a new logger to 
control jetty. `Mortbay` is now useless, so we can remove it.

  was:
{{The logger `Mortbay`}}
{code:java}
// code placeholder
{code}
{{in log4j.properties is used to control the logging activities of jetty (6.x). 
However, we have upgraded to jetty 9 in HIVE-16049; the package name has changed 
to `org.eclipse.jetty`, and we have added the `Jetty` logger to control the new 
version. `Mortbay` is now useless, so we can remove it.}}


> Useless configuration for old jetty in log4j.properties
> ---
>
> Key: HIVE-21556
> URL: https://issues.apache.org/jira/browse/HIVE-21556
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Chen Zhi
>Priority: Minor
>
>  
> {code:java}
> logger.Mortbay.name = org.mortbay
> logger.Mortbay.level = INFO
> {code}
> The logger `Mortbay` in log4j.properties is used to control the logging 
> activities of jetty (6.x). However, we have upgraded to jetty 9 in HIVE-16049; 
> the package name has changed to `org.eclipse.jetty`, and we have added a new 
> logger to control jetty. `Mortbay` is now useless, so we can remove it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (HIVE-21109) Stats replication for ACID tables.

2019-04-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21109?focusedWorklogId=221121&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221121
 ]

ASF GitHub Bot logged work on HIVE-21109:
-

Author: ASF GitHub Bot
Created on: 01/Apr/19 06:49
Start Date: 01/Apr/19 06:49
Worklog Time Spent: 10m 
  Work Description: ashutosh-bapat commented on pull request #579: 
HIVE-21109 : Support stats replication for ACID tables.
URL: https://github.com/apache/hive/pull/579#discussion_r270732089
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
 ##
 @@ -2950,21 +2957,32 @@ public Partition createPartition(Table tbl, 
Map<String, String> partSpec) throws
 int size = addPartitionDesc.getPartitionCount();
 List<Partition> in =
 new ArrayList<>(size);
-AcidUtils.TableSnapshot tableSnapshot = AcidUtils.getTableSnapshot(conf, 
tbl, true);
 long writeId;
 String validWriteIdList;
-if (tableSnapshot != null && tableSnapshot.getWriteId() > 0) {
-  writeId = tableSnapshot.getWriteId();
-  validWriteIdList = tableSnapshot.getValidWriteIdList();
+
+// In case of replication, get the writeId from the source and use valid 
write Id list
+// for replication.
+if (addPartitionDesc.getReplicationSpec().isInReplicationScope() &&
+addPartitionDesc.getPartition(0).getWriteId() > 0) {
 
 Review comment:
  writeId will be 0 for non-transactional tables. Also, this is 
createPartitions code, which may get executed with writeId 0 for 
transactional tables as well. The condition, which I borrowed from the old 
code, is required so that we don't create a valid writeId list or try to get 
a table snapshot when writeId is zero.
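The guard being discussed reduces to a small decision: in replication scope with a positive writeId from the source, use the source writeId; otherwise fall back to the local table snapshot. A hedged sketch of that branch (chooseWriteId is an illustrative name, not the Hive API):

```java
// Sketch of the writeId selection discussed above. A writeId of 0 means
// "no valid source writeId" (e.g. a non-transactional table), so the
// code must not build a valid writeId list from it.
public class WriteIdChoice {
    static long chooseWriteId(boolean inReplicationScope, long sourceWriteId,
                              long snapshotWriteId) {
        if (inReplicationScope && sourceWriteId > 0) {
            return sourceWriteId;   // writeId replicated from the source cluster
        }
        return snapshotWriteId;     // local table snapshot fallback
    }

    public static void main(String[] args) {
        System.out.println(chooseWriteId(true, 7L, 3L));  // 7
        System.out.println(chooseWriteId(true, 0L, 3L));  // 3
        System.out.println(chooseWriteId(false, 7L, 3L)); // 3
    }
}
```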
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 221121)
Time Spent: 9.5h  (was: 9h 20m)

> Stats replication for ACID tables.
> --
>
> Key: HIVE-21109
> URL: https://issues.apache.org/jira/browse/HIVE-21109
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21109.01.patch, HIVE-21109.02.patch, 
> HIVE-21109.03.patch, HIVE-21109.04.patch, HIVE-21109.05.patch, 
> HIVE-21109.06.patch, HIVE-21109.07.patch, HIVE-21109.08.patch
>
>  Time Spent: 9.5h
>  Remaining Estimate: 0h
>
> Transactional tables require a writeID associated with the stats update. This 
> writeId needs to be in sync with the writeId on the source and hence needs to 
> be replicated from the source.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (HIVE-21109) Stats replication for ACID tables.

2019-04-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21109?focusedWorklogId=221119&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221119
 ]

ASF GitHub Bot logged work on HIVE-21109:
-

Author: ASF GitHub Bot
Created on: 01/Apr/19 06:48
Start Date: 01/Apr/19 06:48
Worklog Time Spent: 10m 
  Work Description: ashutosh-bapat commented on pull request #579: 
HIVE-21109 : Support stats replication for ACID tables.
URL: https://github.com/apache/hive/pull/579#discussion_r270732089
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
 ##
 @@ -2950,21 +2957,32 @@ public Partition createPartition(Table tbl, 
Map<String, String> partSpec) throws
 int size = addPartitionDesc.getPartitionCount();
 List<Partition> in =
 new ArrayList<>(size);
-AcidUtils.TableSnapshot tableSnapshot = AcidUtils.getTableSnapshot(conf, 
tbl, true);
 long writeId;
 String validWriteIdList;
-if (tableSnapshot != null && tableSnapshot.getWriteId() > 0) {
-  writeId = tableSnapshot.getWriteId();
-  validWriteIdList = tableSnapshot.getValidWriteIdList();
+
+// In case of replication, get the writeId from the source and use valid 
write Id list
+// for replication.
+if (addPartitionDesc.getReplicationSpec().isInReplicationScope() &&
+addPartitionDesc.getPartition(0).getWriteId() > 0) {
 
 Review comment:
  writeId will be 0 for non-transactional tables. Also, this is 
createPartitions code, which may get executed with writeId 0 for 
transactional tables as well.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 221119)
Time Spent: 9h 20m  (was: 9h 10m)

> Stats replication for ACID tables.
> --
>
> Key: HIVE-21109
> URL: https://issues.apache.org/jira/browse/HIVE-21109
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21109.01.patch, HIVE-21109.02.patch, 
> HIVE-21109.03.patch, HIVE-21109.04.patch, HIVE-21109.05.patch, 
> HIVE-21109.06.patch, HIVE-21109.07.patch, HIVE-21109.08.patch
>
>  Time Spent: 9h 20m
>  Remaining Estimate: 0h
>
> Transactional tables require a writeID associated with the stats update. This 
> writeId needs to be in sync with the writeId on the source and hence needs to 
> be replicated from the source.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21540) Query with join condition having date literal throws SemanticException.

2019-04-01 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16806457#comment-16806457
 ] 

Hive QA commented on HIVE-21540:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12964397/HIVE-21540.01.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 15884 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16790/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16790/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16790/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12964397 - PreCommit-HIVE-Build

> Query with join condition having date literal throws SemanticException.
> ---
>
> Key: HIVE-21540
> URL: https://issues.apache.org/jira/browse/HIVE-21540
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 3.1.0, 3.1.1
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: Analyzer, DateField, pull-request-available
> Attachments: HIVE-21540.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This semantic exception is thrown for the following query. 
> *SemanticException '2019-03-20' encountered with 0 children*
> {code}
> create table date_1 (key int, dd date);
> create table date_2 (key int, dd date);
> select d1.key, d2.dd from(
>   select key, dd as start_dd, current_date as end_dd from date_1) d1
>   join date_2 as d2 on d1.key = d2.key where d2.dd between start_dd and 
> end_dd;
> {code}
> When the WHERE condition below is commented out, the query completes 
> successfully.
> where d2.dd between start_dd and end_dd
> 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21392) Misconfigurations of DataNucleus log in log4j.properties

2019-04-01 Thread Chen Zhi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Zhi updated HIVE-21392:

Attachment: HIVE-21392.08.patch

> Misconfigurations of DataNucleus log in log4j.properties
> 
>
> Key: HIVE-21392
> URL: https://issues.apache.org/jira/browse/HIVE-21392
> Project: Hive
>  Issue Type: Improvement
>  Components: Logging
>Affects Versions: 2.0.0
>Reporter: Chen Zhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21392.02.patch, HIVE-21392.03.patch, 
> HIVE-21392.04.patch, HIVE-21392.05.patch, HIVE-21392.06.patch, 
> HIVE-21392.07.patch, HIVE-21392.08.patch, HIVE-21392.patch
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> In the patch of  
> [HIVE-12020|https://issues.apache.org/jira/browse/HIVE-12020], we changed the 
> DataNucleus-related logging configuration from nine fine-grained loggers to 
> three coarse-grained loggers (DataNucleus, Datastore and JPOX). As Prasanth 
> Jayachandran 
> [explains|https://issues.apache.org/jira/browse/HIVE-12020?focusedCommentId=15025612&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15025612],
>  these three loggers are the top-level loggers in DataNucleus, so that we 
> don't need to specify other loggers for DataNucleus. However, according to 
> the 
> [documents|http://www.datanucleus.org/products/accessplatform/logging.html] 
> and [source 
> codes|https://github.com/datanucleus/datanucleus-core/blob/master/src/main/java/org/datanucleus/util/NucleusLogger.java#L108]
>  of DataNucleus, the top-level logger in DataNucleus is `DataNucleus`. 
> Therefore, we just need to keep the right one.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (HIVE-21109) Stats replication for ACID tables.

2019-04-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21109?focusedWorklogId=221122&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221122
 ]

ASF GitHub Bot logged work on HIVE-21109:
-

Author: ASF GitHub Bot
Created on: 01/Apr/19 06:50
Start Date: 01/Apr/19 06:50
Worklog Time Spent: 10m 
  Work Description: ashutosh-bapat commented on pull request #579: 
HIVE-21109 : Support stats replication for ACID tables.
URL: https://github.com/apache/hive/pull/579#discussion_r270732089
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
 ##
 @@ -2950,21 +2957,32 @@ public Partition createPartition(Table tbl, 
Map<String, String> partSpec) throws
 int size = addPartitionDesc.getPartitionCount();
 List<Partition> in =
 new ArrayList<>(size);
-AcidUtils.TableSnapshot tableSnapshot = AcidUtils.getTableSnapshot(conf, 
tbl, true);
 long writeId;
 String validWriteIdList;
-if (tableSnapshot != null && tableSnapshot.getWriteId() > 0) {
-  writeId = tableSnapshot.getWriteId();
-  validWriteIdList = tableSnapshot.getValidWriteIdList();
+
+// In case of replication, get the writeId from the source and use valid 
write Id list
+// for replication.
+if (addPartitionDesc.getReplicationSpec().isInReplicationScope() &&
+addPartitionDesc.getPartition(0).getWriteId() > 0) {
 
 Review comment:
  writeId will be 0 for non-transactional tables. Also, this is 
createPartitions code, which may get executed with writeId = 0 for 
non-transactional modifications to partitions of transactional tables as well. 
The condition, which I borrowed from the old code, is required so that we don't 
create a valid writeId list or try to get a table snapshot when writeId is zero.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 221122)
Time Spent: 9h 40m  (was: 9.5h)

> Stats replication for ACID tables.
> --
>
> Key: HIVE-21109
> URL: https://issues.apache.org/jira/browse/HIVE-21109
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21109.01.patch, HIVE-21109.02.patch, 
> HIVE-21109.03.patch, HIVE-21109.04.patch, HIVE-21109.05.patch, 
> HIVE-21109.06.patch, HIVE-21109.07.patch, HIVE-21109.08.patch
>
>  Time Spent: 9h 40m
>  Remaining Estimate: 0h
>
> Transactional tables require a writeID associated with the stats update. This 
> writeId needs to be in sync with the writeId on the source and hence needs to 
> be replicated from the source.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21230) LEFT OUTER JOIN does not generate transitive IS NOT NULL filter on right side (HiveJoinAddNotNullRule bails out for outer joins)

2019-04-01 Thread Zoltan Haindrich (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-21230:

   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

pushed to master. Thank you Vineet !

> LEFT OUTER JOIN does not generate transitive IS NOT NULL filter on right side 
> (HiveJoinAddNotNullRule bails out for outer joins)
> 
>
> Key: HIVE-21230
> URL: https://issues.apache.org/jira/browse/HIVE-21230
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Vineet Garg
>Priority: Major
>  Labels: newbie
> Fix For: 4.0.0
>
> Attachments: HIVE-21230.1.patch, HIVE-21230.2.patch, 
> HIVE-21230.3.patch, HIVE-21230.4.patch, HIVE-21230.5.patch, 
> HIVE-21230.6.patch, HIVE-21230.7.patch, HIVE-21230.8.patch
>
>
> For instance, given the following query:
> {code:sql}
> SELECT t0.col0, t0.col1
> FROM
>   (
> SELECT col0, col1 FROM tab
>   ) AS t0
>   LEFT JOIN
>   (
> SELECT col0, col1 FROM tab
>   ) AS t1
> ON t0.col0 = t1.col0 AND t0.col1 = t1.col1
> {code}
> we could still infer that col0 and col1 cannot be null in the right input and 
> introduce the corresponding filter predicate. Currently, the rule just bails 
> out if it is not an inner join.
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveJoinAddNotNullRule.java#L79
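The inference described above can be sketched in a few lines: for t0 LEFT JOIN t1, every equi-join key on the right (non-preserved) side must be non-null for a row to match, so an IS NOT NULL filter can be pushed onto the right input. The names below are illustrative, not the Calcite/Hive rule API:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Sketch of deriving right-side IS NOT NULL filters from the equi-join
// keys of a LEFT OUTER JOIN. Illustrative names, not the
// HiveJoinAddNotNullRule implementation.
public class NotNullInference {
    static List<String> notNullFiltersForRightInput(List<String> rightJoinKeys) {
        // One IS NOT NULL predicate per right-side join key.
        return rightJoinKeys.stream()
                .map(k -> k + " IS NOT NULL")
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(notNullFiltersForRightInput(Arrays.asList("t1.col0", "t1.col1")));
        // [t1.col0 IS NOT NULL, t1.col1 IS NOT NULL]
    }
}
```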



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21402) Compaction state remains 'working' when major compaction fails

2019-04-01 Thread Peter Vary (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-21402:
--
   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master.

Thanks for the review [~vgumashta]!

> Compaction state remains 'working' when major compaction fails
> --
>
> Key: HIVE-21402
> URL: https://issues.apache.org/jira/browse/HIVE-21402
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 4.0.0
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21402.patch
>
>
> When Calcite is not on the HMS classpath and query-based compaction is 
> enabled, the compaction fails with a NoClassDefFoundError. Since the catch 
> block only catches Exceptions, the following code block is not executed:
> {code:java}
> } catch (Exception e) {
>   LOG.error("Caught exception while trying to compact " + ci +
>   ".  Marking failed to avoid repeated failures, " + 
> StringUtils.stringifyException(e));
>   msc.markFailed(CompactionInfo.compactionInfoToStruct(ci));
>   msc.abortTxns(Collections.singletonList(compactorTxnId));
> }
> {code}
> So the compaction is not marked as failed.
> It would be better to catch Throwable instead of Exception.
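The difference between catching Exception and Throwable can be shown in isolation: NoClassDefFoundError extends Error, not Exception, so only a catch (Throwable t) block sees it. A self-contained sketch, with the markFailed bookkeeping simulated by a flag:

```java
// Sketch of the suggested fix: an Error such as NoClassDefFoundError
// bypasses catch (Exception e), but catch (Throwable t) handles it and
// can perform the markFailed bookkeeping (simulated with a boolean).
public class CatchThrowableDemo {
    static void compact() {
        // Simulate the failure mode from the report: a missing class.
        throw new NoClassDefFoundError("calcite class missing");
    }

    static boolean compactAndMarkFailed() {
        boolean markedFailed = false;
        try {
            compact();
        } catch (Throwable t) {   // catch (Exception e) would NOT match an Error
            markedFailed = true;  // stands in for msc.markFailed(...)
        }
        return markedFailed;
    }

    public static void main(String[] args) {
        System.out.println(compactAndMarkFailed()); // true
    }
}
```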



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21404) MSSQL upgrade script alters the wrong column

2019-04-01 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16806551#comment-16806551
 ] 

Hive QA commented on HIVE-21404:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12964409/HIVE-21404.4.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 15885 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.metastore.TestObjectStore.catalogs (batchId=230)
org.apache.hadoop.hive.metastore.TestObjectStore.testDatabaseOps (batchId=230)
org.apache.hadoop.hive.metastore.TestObjectStore.testDeprecatedConfigIsOverwritten
 (batchId=230)
org.apache.hadoop.hive.metastore.TestObjectStore.testDirectSQLDropParitionsCleanup
 (batchId=230)
org.apache.hadoop.hive.metastore.TestObjectStore.testDirectSQLDropPartitionsCacheCrossSession
 (batchId=230)
org.apache.hadoop.hive.metastore.TestObjectStore.testDirectSqlErrorMetrics 
(batchId=230)
org.apache.hadoop.hive.metastore.TestObjectStore.testEmptyTrustStoreProps 
(batchId=230)
org.apache.hadoop.hive.metastore.TestObjectStore.testMasterKeyOps (batchId=230)
org.apache.hadoop.hive.metastore.TestObjectStore.testMaxEventResponse 
(batchId=230)
org.apache.hadoop.hive.metastore.TestObjectStore.testPartitionOps (batchId=230)
org.apache.hadoop.hive.metastore.TestObjectStore.testQueryCloseOnError 
(batchId=230)
org.apache.hadoop.hive.metastore.TestObjectStore.testRoleOps (batchId=230)
org.apache.hadoop.hive.metastore.TestObjectStore.testTableOps (batchId=230)
org.apache.hadoop.hive.metastore.TestObjectStore.testUseSSLProperty 
(batchId=230)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16792/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16792/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16792/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12964409 - PreCommit-HIVE-Build

> MSSQL upgrade script alters the wrong column
> 
>
> Key: HIVE-21404
> URL: https://issues.apache.org/jira/browse/HIVE-21404
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.2.0
>Reporter: David Lavati
>Assignee: David Lavati
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.2.0
>
> Attachments: HIVE-21404.1.patch, HIVE-21404.2.patch, 
> HIVE-21404.3.patch, HIVE-21404.4.patch, HIVE-21404.4.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> HIVE-20221 changed PARTITION_PARAMS, so the following command modifies the 
> wrong table:
> {{ALTER TABLE "SERDE_PARAMS" ALTER COLUMN "PARAM_VALUE" nvarchar(MAX);}}
> https://github.com/apache/hive/blob/d3b036920acde7bb04840697eb13038103b062b4/standalone-metastore/metastore-server/src/main/sql/mssql/upgrade-3.1.0-to-3.2.0.mssql.sql#L21



--


[jira] [Work logged] (HIVE-21109) Stats replication for ACID tables.

2019-04-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21109?focusedWorklogId=221179&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221179
 ]

ASF GitHub Bot logged work on HIVE-21109:
-

Author: ASF GitHub Bot
Created on: 01/Apr/19 09:38
Start Date: 01/Apr/19 09:38
Worklog Time Spent: 10m 
  Work Description: ashutosh-bapat commented on pull request #579: 
HIVE-21109 : Support stats replication for ACID tables.
URL: https://github.com/apache/hive/pull/579#discussion_r270786715
 
 

 ##
 File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestStatsReplicationScenarios.java
 ##
 @@ -269,11 +294,23 @@ private String dumpLoadVerify(List<String> tableNames, 
String lastReplicationId,
 WarehouseInstance.Tuple dumpTuple = primary.run("use " + primaryDbName)
 .dump(primaryDbName, lastReplicationId, withClauseList);
 
+
 // Load, if necessary changing configuration.
 if (parallelLoad) {
   replica.hiveConf.setBoolVar(HiveConf.ConfVars.EXECPARALLEL, true);
 }
 
+// Fail the load when testing the failure-and-retry scenario: fail while
+// setting the checkpoint for a table in the middle of the list of tables.
+if (failRetry) {
+  if (lastReplicationId == null) {
+failBootstrapLoad(dumpTuple, tableNames.size()/2);
+  } else {
+failIncrementalLoad(dumpTuple, tableNames.size()/2);
 
 Review comment:
   We are counting UpdateTableStats or UpdatePartStats events, not every 
event, so the load will fail only after encountering (number of tables / 2) 
events of those types; it cannot fail before applying any update-stats events. 
To be on the safer side, I have changed the code to fail after the second 
event, so that we have at least one successful application before failing. 
Since we perform multiple insert events per table, we can be sure that there 
are at least two events of each type.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 221179)
Time Spent: 9h 50m  (was: 9h 40m)

> Stats replication for ACID tables.
> --
>
> Key: HIVE-21109
> URL: https://issues.apache.org/jira/browse/HIVE-21109
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21109.01.patch, HIVE-21109.02.patch, 
> HIVE-21109.03.patch, HIVE-21109.04.patch, HIVE-21109.05.patch, 
> HIVE-21109.06.patch, HIVE-21109.07.patch, HIVE-21109.08.patch
>
>  Time Spent: 9h 50m
>  Remaining Estimate: 0h
>
> Transactional tables require a writeID associated with the stats update. This 
> writeId needs to be in sync with the writeId on the source and hence needs to 
> be replicated from the source.



--


[jira] [Work logged] (HIVE-21109) Stats replication for ACID tables.

2019-04-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21109?focusedWorklogId=221112&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221112
 ]

ASF GitHub Bot logged work on HIVE-21109:
-

Author: ASF GitHub Bot
Created on: 01/Apr/19 06:36
Start Date: 01/Apr/19 06:36
Worklog Time Spent: 10m 
  Work Description: ashutosh-bapat commented on pull request #579: 
HIVE-21109 : Support stats replication for ACID tables.
URL: https://github.com/apache/hive/pull/579#discussion_r270729852
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
 ##
 @@ -694,7 +695,9 @@ public void alterTable(String catName, String dbName, 
String tblName, Table newT
   AcidUtils.TableSnapshot tableSnapshot = null;
   if (transactional) {
 if (replWriteId > 0) {
-  ValidWriteIdList writeIds = 
AcidUtils.getTableValidWriteIdListWithTxnList(conf, dbName, tblName);
+  ValidWriteIdList writeIds = new 
ValidReaderWriteIdList(TableName.getDbTable(dbName, tblName),
 
 Review comment:
   Done.
 



Issue Time Tracking
---

Worklog Id: (was: 221112)
Time Spent: 8h 50m  (was: 8h 40m)

> Stats replication for ACID tables.
> --
>
> Key: HIVE-21109
> URL: https://issues.apache.org/jira/browse/HIVE-21109
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21109.01.patch, HIVE-21109.02.patch, 
> HIVE-21109.03.patch, HIVE-21109.04.patch, HIVE-21109.05.patch, 
> HIVE-21109.06.patch, HIVE-21109.07.patch, HIVE-21109.08.patch
>
>  Time Spent: 8h 50m
>  Remaining Estimate: 0h
>
> Transactional tables require a writeID associated with the stats update. This 
> writeId needs to be in sync with the writeId on the source and hence needs to 
> be replicated from the source.



--


[jira] [Updated] (HIVE-21556) Useless configuration for old jetty in log4j.properties

2019-04-01 Thread Chen Zhi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Zhi updated HIVE-21556:

Description: The configuration `org.mortbay` in log4j.properties is used to 
control the logging activity of Jetty. However, we upgraded to Jetty 9 in 
HIVE-16049 and the package name changed to `org.eclipse.jetty`, so this 
configuration is useless. I guess we can remove it.  (was: We upgraded to 
Jetty 9 in HIVE-16049, so the configuration `org.mortbay` in log4j.properties 
for the old version of Jetty is useless.)

> Useless configuration for old jetty in log4j.properties
> ---
>
> Key: HIVE-21556
> URL: https://issues.apache.org/jira/browse/HIVE-21556
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Chen Zhi
>Priority: Minor
>
> The configuration `org.mortbay` in log4j.properties is used to control the 
> logging activity of Jetty. However, we upgraded to Jetty 9 in HIVE-16049 
> and the package name changed to `org.eclipse.jetty`, so this configuration 
> is useless. I guess we can remove it.



--


[jira] [Work logged] (HIVE-21556) Useless configuration for old jetty in log4j.properties

2019-04-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21556?focusedWorklogId=221137&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221137
 ]

ASF GitHub Bot logged work on HIVE-21556:
-

Author: ASF GitHub Bot
Created on: 01/Apr/19 07:29
Start Date: 01/Apr/19 07:29
Worklog Time Spent: 10m 
  Work Description: coder-chenzhi commented on pull request #586: 
HIVE-21556 remove configuraion for old version of jetty (6.x)
URL: https://github.com/apache/hive/pull/586
 
 
   ```
   logger.Mortbay.name = org.mortbay
   logger.Mortbay.level = INFO
   ```
   The logger `Mortbay` in log4j.properties is used to control the logging 
activity of Jetty (6.x). However, we upgraded to Jetty 9 in HIVE-16049, the 
package name changed to `org.eclipse.jetty`, and we added a new logger to 
control Jetty, so `Mortbay` is useless. I guess we can remove it.
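For reference, a sketch of the change in log4j2 properties form: the stale Jetty 6.x logger is dropped and only a logger scoped to the Jetty 9 package takes effect (the `Jetty` logger name and INFO level here are illustrative assumptions, not necessarily what Hive's properties files use):

```properties
# Stale Jetty 6.x logger being removed - the org.mortbay package no longer exists:
# logger.Mortbay.name = org.mortbay
# logger.Mortbay.level = INFO

# Jetty 9 classes live under org.eclipse.jetty, so the logger that actually
# takes effect must target that package:
logger.Jetty.name = org.eclipse.jetty
logger.Jetty.level = INFO
```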
 



Issue Time Tracking
---

Worklog Id: (was: 221137)
Time Spent: 10m
Remaining Estimate: 0h

> Useless configuration for old jetty in log4j.properties
> ---
>
> Key: HIVE-21556
> URL: https://issues.apache.org/jira/browse/HIVE-21556
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Chen Zhi
>Priority: Minor
>  Labels: patch-available, pull-request-available
> Attachments: HIVE-21556.1.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
>  
> {code:java}
> logger.Mortbay.name = org.mortbay
> logger.Mortbay.level = INFO
> {code}
> The logger `Mortbay` in log4j.properties is used to control the logging 
> activity of Jetty (6.x). However, we upgraded to Jetty 9 in HIVE-16049, the 
> package name changed to `org.eclipse.jetty`, and we added a new logger to 
> control Jetty, so `Mortbay` is useless. I guess we can remove it.



--


[jira] [Updated] (HIVE-21556) Useless configuration for old jetty in log4j.properties

2019-04-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-21556:
--
Labels: patch-available pull-request-available  (was: patch-available)

> Useless configuration for old jetty in log4j.properties
> ---
>
> Key: HIVE-21556
> URL: https://issues.apache.org/jira/browse/HIVE-21556
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Chen Zhi
>Priority: Minor
>  Labels: patch-available, pull-request-available
> Attachments: HIVE-21556.1.patch
>
>
>  
> {code:java}
> logger.Mortbay.name = org.mortbay
> logger.Mortbay.level = INFO
> {code}
> The logger `Mortbay` in log4j.properties is used to control the logging 
> activity of Jetty (6.x). However, we upgraded to Jetty 9 in HIVE-16049, the 
> package name changed to `org.eclipse.jetty`, and we added a new logger to 
> control Jetty, so `Mortbay` is useless. I guess we can remove it.



--


[jira] [Assigned] (HIVE-21404) MSSQL upgrade script alters the wrong column

2019-04-01 Thread Zoltan Haindrich (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich reassigned HIVE-21404:
---

Assignee: Zoltan Haindrich  (was: David Lavati)

> MSSQL upgrade script alters the wrong column
> 
>
> Key: HIVE-21404
> URL: https://issues.apache.org/jira/browse/HIVE-21404
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.2.0
>Reporter: David Lavati
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.2.0
>
> Attachments: HIVE-21404.1.patch, HIVE-21404.2.patch, 
> HIVE-21404.3.patch, HIVE-21404.4.patch, HIVE-21404.4.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> HIVE-20221 changed PARTITION_PARAMS, so the following command modifies the 
> wrong table:
> {{ALTER TABLE "SERDE_PARAMS" ALTER COLUMN "PARAM_VALUE" nvarchar(MAX);}}
> https://github.com/apache/hive/blob/d3b036920acde7bb04840697eb13038103b062b4/standalone-metastore/metastore-server/src/main/sql/mssql/upgrade-3.1.0-to-3.2.0.mssql.sql#L21



--


[jira] [Work logged] (HIVE-21109) Stats replication for ACID tables.

2019-04-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21109?focusedWorklogId=221113&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221113
 ]

ASF GitHub Bot logged work on HIVE-21109:
-

Author: ASF GitHub Bot
Created on: 01/Apr/19 06:38
Start Date: 01/Apr/19 06:38
Worklog Time Spent: 10m 
  Work Description: ashutosh-bapat commented on pull request #579: 
HIVE-21109 : Support stats replication for ACID tables.
URL: https://github.com/apache/hive/pull/579#discussion_r270730194
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
 ##
 @@ -2950,21 +2957,32 @@ public Partition createPartition(Table tbl, 
Map<String, String> partSpec) throws
 int size = addPartitionDesc.getPartitionCount();
 List in =
 new ArrayList(size);
-AcidUtils.TableSnapshot tableSnapshot = AcidUtils.getTableSnapshot(conf, 
tbl, true);
 long writeId;
 String validWriteIdList;
-if (tableSnapshot != null && tableSnapshot.getWriteId() > 0) {
-  writeId = tableSnapshot.getWriteId();
-  validWriteIdList = tableSnapshot.getValidWriteIdList();
+
+// In case of replication, get the writeId from the source and use valid 
write Id list
+// for replication.
+if (addPartitionDesc.getReplicationSpec().isInReplicationScope() &&
+addPartitionDesc.getPartition(0).getWriteId() > 0) {
+  writeId = addPartitionDesc.getPartition(0).getWriteId();
+  validWriteIdList = new 
ValidReaderWriteIdList(TableName.getDbTable(tbl.getDbName(),
 
 Review comment:
   Done.
 



Issue Time Tracking
---

Worklog Id: (was: 221113)
Time Spent: 9h  (was: 8h 50m)

> Stats replication for ACID tables.
> --
>
> Key: HIVE-21109
> URL: https://issues.apache.org/jira/browse/HIVE-21109
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21109.01.patch, HIVE-21109.02.patch, 
> HIVE-21109.03.patch, HIVE-21109.04.patch, HIVE-21109.05.patch, 
> HIVE-21109.06.patch, HIVE-21109.07.patch, HIVE-21109.08.patch
>
>  Time Spent: 9h
>  Remaining Estimate: 0h
>
> Transactional tables require a writeID associated with the stats update. This 
> writeId needs to be in sync with the writeId on the source and hence needs to 
> be replicated from the source.



--


[jira] [Updated] (HIVE-21556) Useless configuration for old jetty in log4j.properties

2019-04-01 Thread Chen Zhi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Zhi updated HIVE-21556:

Description: The configuration `org.mortbay` in log4j.properties is used to 
control the logging activity of Jetty (6.x). However, we upgraded to Jetty 9 
in HIVE-16049 and the package name changed to `org.eclipse.jetty`. This 
configuration is useless. I guess we can remove it.  (was: The configuration 
`org.mortbay` in log4j.properties is used to control the logging activity of 
Jetty. However, we upgraded to Jetty 9 in HIVE-16049 and the package name 
changed to `org.eclipse.jetty`, so this configuration is useless. I guess we 
can remove it.)

> Useless configuration for old jetty in log4j.properties
> ---
>
> Key: HIVE-21556
> URL: https://issues.apache.org/jira/browse/HIVE-21556
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Chen Zhi
>Priority: Minor
>
> The configuration `org.mortbay` in log4j.properties is used to control the 
> logging activity of Jetty (6.x). However, we upgraded to Jetty 9 in 
> HIVE-16049 and the package name changed to `org.eclipse.jetty`. This 
> configuration is useless. I guess we can remove it.



--


[jira] [Commented] (HIVE-21392) Misconfigurations of DataNucleus log in log4j.properties

2019-04-01 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16806487#comment-16806487
 ] 

Hive QA commented on HIVE-21392:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12964407/HIVE-21392.08.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16791/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16791/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16791/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2019-04-01 07:43:32.550
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-16791/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2019-04-01 07:43:32.553
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at dc0b16a HIVE-21001: Upgrade to calcite-1.19 (Zoltan Haindrich 
reviewed by Jesus Camacho Rodriguez)
+ git clean -f -d
Removing standalone-metastore/metastore-server/src/gen/
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at dc0b16a HIVE-21001: Upgrade to calcite-1.19 (Zoltan Haindrich 
reviewed by Jesus Camacho Rodriguez)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2019-04-01 07:43:33.315
+ rm -rf ../yetus_PreCommit-HIVE-Build-16791
+ mkdir ../yetus_PreCommit-HIVE-Build-16791
+ git gc
+ cp -R . ../yetus_PreCommit-HIVE-Build-16791
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-16791/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: patch failed: common/src/main/resources/hive-log4j2.properties:51
error: repository lacks the necessary blob to fall back on 3-way merge.
error: common/src/main/resources/hive-log4j2.properties: patch does not apply
error: patch failed: 
common/src/test/resources/hive-exec-log4j2-test.properties:42
error: repository lacks the necessary blob to fall back on 3-way merge.
error: common/src/test/resources/hive-exec-log4j2-test.properties: patch does 
not apply
error: patch failed: common/src/test/resources/hive-log4j2-test.properties:49
error: repository lacks the necessary blob to fall back on 3-way merge.
error: common/src/test/resources/hive-log4j2-test.properties: patch does not 
apply
error: patch failed: data/conf/hive-log4j2.properties:50
error: repository lacks the necessary blob to fall back on 3-way merge.
error: data/conf/hive-log4j2.properties: patch does not apply
error: patch failed: 
hcatalog/src/test/e2e/templeton/deployers/config/hive/hive-log4j2.properties:49
error: repository lacks the necessary blob to fall back on 3-way merge.
error: 
hcatalog/src/test/e2e/templeton/deployers/config/hive/hive-log4j2.properties: 
patch does not apply
error: patch failed: 
llap-server/src/main/resources/llap-cli-log4j2.properties:58
error: repository lacks the necessary blob to fall back on 3-way merge.
error: llap-server/src/main/resources/llap-cli-log4j2.properties: patch does 
not apply
error: patch failed: 
llap-server/src/main/resources/llap-daemon-log4j2.properties:100
error: repository lacks the necessary blob to fall back on 3-way merge.
error: llap-server/src/main/resources/llap-daemon-log4j2.properties: patch does 
not apply
error: patch failed: 
llap-server/src/test/resources/llap-daemon-log4j2.properties:64
error: repository lacks the necessary blob to fall back on 3-way merge.
error: llap-server/src/test/resources/llap-daemon-log4j2.properties: patch does 
not apply
error: patch failed: 

[jira] [Comment Edited] (HIVE-21540) Query with join condition having date literal throws SemanticException.

2019-04-01 Thread Zoltan Haindrich (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16806491#comment-16806491
 ] 

Zoltan Haindrich edited comment on HIVE-21540 at 4/1/19 7:53 AM:
-

+1
I think we might also be missing TOK_TIMESTAMPLITERAL and 
TOK_TIMESTAMPLOCALTZLITERAL from that switch statement


was (Author: kgyrtkirk):
+1


> Query with join condition having date literal throws SemanticException.
> ---
>
> Key: HIVE-21540
> URL: https://issues.apache.org/jira/browse/HIVE-21540
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 3.1.0, 3.1.1
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: Analyzer, DateField, pull-request-available
> Attachments: HIVE-21540.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This semantic exception is thrown for the following query:
> *SemanticException '2019-03-20' encountered with 0 children*
> {code}
> create table date_1 (key int, dd date);
> create table date_2 (key int, dd date);
> select d1.key, d2.dd from(
>   select key, dd as start_dd, current_date as end_dd from date_1) d1
>   join date_2 as d2 on d1.key = d2.key where d2.dd between start_dd and 
> end_dd;
> {code}
> When the WHERE condition below is commented out, the query completes 
> successfully.
> where d2.dd between start_dd and end_dd
> 
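To illustrate the failure mode being discussed: a token-type switch that lacks a case for a literal kind falls through to a default that errors out ("encountered with 0 children"). The token constants and handler below are invented for the sketch, not Hive's actual parser code.

```java
public class LiteralSwitchSketch {
    // Illustrative token-type constants (not Hive's real values).
    static final int TOK_DATELITERAL = 1;
    static final int TOK_TIMESTAMPLITERAL = 2;
    static final int TOK_TIMESTAMPLOCALTZLITERAL = 3;

    /** Returns a type name, or null when the switch misses the token kind. */
    static String resolveLiteral(int tokenType) {
        switch (tokenType) {
            case TOK_DATELITERAL:
                return "date";
            case TOK_TIMESTAMPLITERAL:          // cases the comment suggests adding
                return "timestamp";
            case TOK_TIMESTAMPLOCALTZLITERAL:
                return "timestamp with local time zone";
            default:
                return null; // caller would raise "encountered with 0 children"
        }
    }

    public static void main(String[] args) {
        System.out.println(resolveLiteral(TOK_DATELITERAL));
        System.out.println(resolveLiteral(TOK_TIMESTAMPLOCALTZLITERAL));
    }
}
```

Without the two timestamp cases, those literals would hit the default branch and fail the same way the date literal did before the patch.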



--


[jira] [Updated] (HIVE-21316) Comparison of varchar column and string literal should happen in varchar

2019-04-01 Thread Zoltan Haindrich (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-21316:

   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thank you, Vineet, for reviewing the changes!

> Comparison of varchar column and string literal should happen in varchar
> -
>
> Key: HIVE-21316
> URL: https://issues.apache.org/jira/browse/HIVE-21316
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21316.01.patch, HIVE-21316.02.patch, 
> HIVE-21316.03.patch, HIVE-21316.04.patch, HIVE-21316.05.patch, 
> HIVE-21316.06.patch, HIVE-21316.06.patch, HIVE-21316.07.patch, 
> HIVE-21316.07.patch, HIVE-21316.07.patch, HIVE-21316.08.patch, 
> HIVE-21316.08.patch
>
>
> This is most probably the root cause behind HIVE-21310 as well.



--


[jira] [Updated] (HIVE-21509) LLAP may cache corrupted column vectors and return wrong query result

2019-04-01 Thread Adam Szita (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated HIVE-21509:
--
Status: Patch Available  (was: In Progress)

> LLAP may cache corrupted column vectors and return wrong query result
> -
>
> Key: HIVE-21509
> URL: https://issues.apache.org/jira/browse/HIVE-21509
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Adam Szita
>Assignee: Adam Szita
>Priority: Major
> Attachments: HIVE-21509.0.wip.patch, HIVE-21509.1.wip.patch, 
> HIVE-21509.2.patch, HIVE-21509.3.patch
>
>
> In some scenarios, LLAP might store column vectors in the cache that get 
> reused and reset just before their original content would be written.
> This is a concurrency issue and is therefore flaky. It is not easy to 
> reproduce, but the odds of surfacing it can be improved by setting the LLAP 
> executor and IO thread counts this way:
>  * set hive.llap.daemon.num.executors=32;
>  * set hive.llap.io.threadpool.size=1;
>  * use TPC-DS input data for the store_sales table, with at least a few 
> hundred thousand rows, in text format:
> {code:java}
> ROW FORMAT SERDE    'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'  
> WITH SERDEPROPERTIES (    'field.delim'='|',    'serialization.format'='|')  
> STORED AS INPUTFORMAT    'org.apache.hadoop.mapred.TextInputFormat'  
> OUTPUTFORMAT    
> 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'{code}
>  * having more splits makes the issue more likely to show itself, so it is 
> worth setting _set tez.grouping.min-size=1024; set tez.grouping.max-size=1024;_
>  * run this query on the table: select min(ss_sold_date_sk) from store_sales;
> The first query result is correct (2450816 in my case). Repeating the query 
> will trigger reading from the LLAP cache and produce a wrong result: 0.
> If one wants to make sure of running into this issue, place a 
> Thread.sleep(250) at the beginning of VectorDeserializeOrcWriter#run().
>  
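A toy model of the aliasing hazard the description outlines (plain Java, no LLAP classes; names are invented for the sketch): if the cache keeps a reference to a vector that the writer then resets for reuse, a later cache read sees zeros instead of the original values, while caching a defensive copy preserves them.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class CacheAliasSketch {
    static final Map<String, long[]> cache = new HashMap<>();

    /** Buggy: caches the live vector, which the writer later resets for reuse. */
    static long[] cacheByReference(long[] vector) {
        cache.put("chunk", vector);
        Arrays.fill(vector, 0);          // writer resets the vector before its content is "written"
        return cache.get("chunk");       // stale read: now all zeros
    }

    /** Safe: caches a defensive copy before the vector is reused. */
    static long[] cacheByCopy(long[] vector) {
        cache.put("chunk", vector.clone());
        Arrays.fill(vector, 0);
        return cache.get("chunk");       // copy still holds the original values
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(cacheByReference(new long[]{2450816}))); // [0]
        System.out.println(Arrays.toString(cacheByCopy(new long[]{2450816})));      // [2450816]
    }
}
```

This mirrors the symptom above: the first (uncached) read returns 2450816, while the repeat read served from the corrupted cache entry returns 0.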



--


[jira] [Updated] (HIVE-21509) LLAP may cache corrupted column vectors and return wrong query result

2019-04-01 Thread Adam Szita (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated HIVE-21509:
--
Attachment: HIVE-21509.3.patch

> LLAP may cache corrupted column vectors and return wrong query result
> -
>
> Key: HIVE-21509
> URL: https://issues.apache.org/jira/browse/HIVE-21509
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Adam Szita
>Assignee: Adam Szita
>Priority: Major
> Attachments: HIVE-21509.0.wip.patch, HIVE-21509.1.wip.patch, 
> HIVE-21509.2.patch, HIVE-21509.3.patch
>
>
> In some scenarios, LLAP might store column vectors in the cache that get 
> reused and reset just before their original content would be written.
> This is a concurrency issue and is therefore flaky. It is not easy to 
> reproduce, but the odds of surfacing it can be improved by setting the LLAP 
> executor and IO thread counts this way:
>  * set hive.llap.daemon.num.executors=32;
>  * set hive.llap.io.threadpool.size=1;
>  * use TPC-DS input data for the store_sales table, with at least a few 
> hundred thousand rows, in text format:
> {code:java}
> ROW FORMAT SERDE    'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'  
> WITH SERDEPROPERTIES (    'field.delim'='|',    'serialization.format'='|')  
> STORED AS INPUTFORMAT    'org.apache.hadoop.mapred.TextInputFormat'  
> OUTPUTFORMAT    
> 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'{code}
>  * having more splits makes the issue more likely to show itself, so it is 
> worth setting _set tez.grouping.min-size=1024; set tez.grouping.max-size=1024;_
>  * run this query on the table: select min(ss_sold_date_sk) from store_sales;
> The first query result is correct (2450816 in my case). Repeating the query 
> will trigger reading from the LLAP cache and produce a wrong result: 0.
> If one wants to make sure of running into this issue, place a 
> Thread.sleep(250) at the beginning of VectorDeserializeOrcWriter#run().
>  



--


[jira] [Updated] (HIVE-21509) LLAP may cache corrupted column vectors and return wrong query result

2019-04-01 Thread Adam Szita (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated HIVE-21509:
--
Status: In Progress  (was: Patch Available)

> LLAP may cache corrupted column vectors and return wrong query result
> -
>
> Key: HIVE-21509
> URL: https://issues.apache.org/jira/browse/HIVE-21509
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Adam Szita
>Assignee: Adam Szita
>Priority: Major
> Attachments: HIVE-21509.0.wip.patch, HIVE-21509.1.wip.patch, 
> HIVE-21509.2.patch, HIVE-21509.3.patch
>
>
> In some scenarios, LLAP might store column vectors in the cache that get 
> reused and reset just before their original content would be written.
> This is a concurrency issue and is therefore flaky. It is not easy to 
> reproduce, but the odds of surfacing it can be improved by setting the LLAP 
> executor and IO thread counts this way:
>  * set hive.llap.daemon.num.executors=32;
>  * set hive.llap.io.threadpool.size=1;
>  * use TPC-DS input data for the store_sales table, with at least a few 
> hundred thousand rows, in text format:
> {code:java}
> ROW FORMAT SERDE    'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'  
> WITH SERDEPROPERTIES (    'field.delim'='|',    'serialization.format'='|')  
> STORED AS INPUTFORMAT    'org.apache.hadoop.mapred.TextInputFormat'  
> OUTPUTFORMAT    
> 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'{code}
>  * having more splits makes the issue more likely to show itself, so it is 
> worth setting _set tez.grouping.min-size=1024; set tez.grouping.max-size=1024;_
>  * run this query on the table: select min(ss_sold_date_sk) from store_sales;
> The first query result is correct (2450816 in my case). Repeating the query 
> will trigger reading from the LLAP cache and produce a wrong result: 0.
> If one wants to reliably run into this issue, place a 
> Thread.sleep(250) at the beginning of VectorDeserializeOrcWriter#run().
>  
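The aliasing at the heart of this bug can be sketched outside Hive: a cache that stores a mutable column vector by reference will serve corrupted data once a writer thread reuses and resets that vector. The class below is a minimal, self-contained simulation of that failure mode; the cache, vector, and value are illustrative, not actual LLAP code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CacheAliasingDemo {
    public static void main(String[] args) {
        // Simulated cache that stores a "column vector" by reference.
        List<long[]> cache = new ArrayList<>();

        long[] vector = new long[] {2450816L}; // correct min(ss_sold_date_sk)
        cache.add(vector);                     // cached by reference, not copied

        // A writer thread reuses and resets the vector before its original
        // content is ever flushed -- analogous to the reported race.
        Arrays.fill(vector, 0L);

        // The second query reads from the cache and sees the corrupted value.
        System.out.println(cache.get(0)[0]);   // prints 0, not 2450816
    }
}
```

Defensively copying the vector before insertion (cache.add(vector.clone())) makes the sketch print the correct value, which mirrors the general fix of never sharing mutable buffers with a cache.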



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (HIVE-21109) Stats replication for ACID tables.

2019-04-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21109?focusedWorklogId=221109&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221109
 ]

ASF GitHub Bot logged work on HIVE-21109:
-

Author: ASF GitHub Bot
Created on: 01/Apr/19 06:30
Start Date: 01/Apr/19 06:30
Worklog Time Spent: 10m 
  Work Description: ashutosh-bapat commented on pull request #579: 
HIVE-21109 : Support stats replication for ACID tables.
URL: https://github.com/apache/hive/pull/579#discussion_r270728648
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsUpdateTask.java
 ##
 @@ -297,21 +303,34 @@ private ColumnStatisticsDesc getColumnStatsDesc(String 
dbName,
 
   private int persistColumnStats(Hive db) throws HiveException, MetaException, 
IOException {
 ColumnStatistics colStats = constructColumnStatsFromInput();
-ColumnStatisticsDesc colStatsDesc = colStats.getStatsDesc();
-// We do not support stats replication for a transactional table yet. If 
we are converting
-// a non-transactional table to a transactional table during replication, 
we might get
-// column statistics but we shouldn't update those.
-if (work.getColStats() != null &&
-
AcidUtils.isTransactionalTable(getHive().getTable(colStatsDesc.getDbName(),
-  
colStatsDesc.getTableName( {
-  LOG.debug("Skipped updating column stats for table " +
-TableName.getDbTable(colStatsDesc.getDbName(), 
colStatsDesc.getTableName()) +
-" because it is converted to a transactional table during 
replication.");
-  return 0;
-}
-
 SetPartitionsStatsRequest request =
 new SetPartitionsStatsRequest(Collections.singletonList(colStats));
+
+// Set writeId and validWriteId list for replicated statistics.
+if (work.getColStats() != null) {
+  String dbName = colStats.getStatsDesc().getDbName();
+  String tblName = colStats.getStatsDesc().getTableName();
+  Table tbl = db.getTable(dbName, tblName);
+  long writeId = work.getWriteId();
+  // If it's a transactional table on source and target, we will get a 
valid writeId
+  // associated with it. Otherwise it's a non-transactional table on 
source migrated to a
+  // transactional table on target, we need to craft a valid writeId here.
+  if (AcidUtils.isTransactionalTable(tbl)) {
+ValidWriteIdList writeIds;
+if (writeId <= 0) {
 
 Review comment:
  We cannot set the writeId in ColumnStatsUpdateWork because the writeId for 
migration is available only after a transaction is opened for migration, which 
doesn't happen at load time (when the work is created). Going by the gist 
of your suggestion, I have set a flag in the work to indicate that the writeId 
should be the one used for migration.
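The fallback discussed in this review thread can be illustrated standalone: use the writeId carried by the work when it is valid, otherwise fall back to the one recorded in the configuration by the migration open-txn task. The config key, exception message, and helper below are hypothetical simplifications standing in for ReplUtils.getMigrationCurrentTblWriteId and the surrounding code, not the actual Hive API.

```java
import java.util.HashMap;
import java.util.Map;

public class WriteIdFallback {
    // Assumed config key, for illustration only.
    static final String MIGRATION_WRITE_ID_KEY = "repl.migration.current.tbl.write.id";

    static long resolveWriteId(long workWriteId, Map<String, String> conf) {
        if (workWriteId > 0) {
            return workWriteId;           // valid id replicated from the source
        }
        String crafted = conf.get(MIGRATION_WRITE_ID_KEY);
        if (crafted == null) {
            throw new IllegalStateException(
                "Write id is not set in the config by the open-txn task for migration");
        }
        return Long.parseLong(crafted);   // id crafted for the migration txn
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(MIGRATION_WRITE_ID_KEY, "7");
        System.out.println(resolveWriteId(0, conf)); // falls back to crafted id: 7
        System.out.println(resolveWriteId(3, conf)); // valid replicated id wins: 3
    }
}
```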
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 221109)
Time Spent: 8h 40m  (was: 8.5h)

> Stats replication for ACID tables.
> --
>
> Key: HIVE-21109
> URL: https://issues.apache.org/jira/browse/HIVE-21109
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21109.01.patch, HIVE-21109.02.patch, 
> HIVE-21109.03.patch, HIVE-21109.04.patch, HIVE-21109.05.patch, 
> HIVE-21109.06.patch, HIVE-21109.07.patch, HIVE-21109.08.patch
>
>  Time Spent: 8h 40m
>  Remaining Estimate: 0h
>
> Transactional tables require a writeID associated with the stats update. This 
> writeId needs to be in sync with the writeId on the source and hence needs to 
> be replicated from the source.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (HIVE-21109) Stats replication for ACID tables.

2019-04-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21109?focusedWorklogId=221117&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221117
 ]

ASF GitHub Bot logged work on HIVE-21109:
-

Author: ASF GitHub Bot
Created on: 01/Apr/19 06:44
Start Date: 01/Apr/19 06:44
Worklog Time Spent: 10m 
  Work Description: ashutosh-bapat commented on pull request #579: 
HIVE-21109 : Support stats replication for ACID tables.
URL: https://github.com/apache/hive/pull/579#discussion_r270731373
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsUpdateTask.java
 ##
 @@ -297,21 +303,34 @@ private ColumnStatisticsDesc getColumnStatsDesc(String 
dbName,
 
   private int persistColumnStats(Hive db) throws HiveException, MetaException, 
IOException {
 ColumnStatistics colStats = constructColumnStatsFromInput();
-ColumnStatisticsDesc colStatsDesc = colStats.getStatsDesc();
-// We do not support stats replication for a transactional table yet. If 
we are converting
-// a non-transactional table to a transactional table during replication, 
we might get
-// column statistics but we shouldn't update those.
-if (work.getColStats() != null &&
-
AcidUtils.isTransactionalTable(getHive().getTable(colStatsDesc.getDbName(),
-  
colStatsDesc.getTableName( {
-  LOG.debug("Skipped updating column stats for table " +
-TableName.getDbTable(colStatsDesc.getDbName(), 
colStatsDesc.getTableName()) +
-" because it is converted to a transactional table during 
replication.");
-  return 0;
-}
-
 SetPartitionsStatsRequest request =
 new SetPartitionsStatsRequest(Collections.singletonList(colStats));
+
+// Set writeId and validWriteId list for replicated statistics.
+if (work.getColStats() != null) {
+  String dbName = colStats.getStatsDesc().getDbName();
+  String tblName = colStats.getStatsDesc().getTableName();
+  Table tbl = db.getTable(dbName, tblName);
+  long writeId = work.getWriteId();
+  // If it's a transactional table on source and target, we will get a 
valid writeId
+  // associated with it. Otherwise it's a non-transactional table on 
source migrated to a
+  // transactional table on target, we need to craft a valid writeId here.
+  if (AcidUtils.isTransactionalTable(tbl)) {
+ValidWriteIdList writeIds;
+if (writeId <= 0) {
+  Long tmpWriteId = ReplUtils.getMigrationCurrentTblWriteId(conf);
+  if (tmpWriteId == null) {
+throw new HiveException("DDLTask : Write id is not set in the 
config by open txn task for migration");
+  }
+  writeId = tmpWriteId;
+}
+writeIds = new ValidReaderWriteIdList(TableName.getDbTable(dbName, 
tblName), new long[0],
 
 Review comment:
   work.getColStats() returns non-null value only in case of replication flow. 
This block of code is under that condition. So, it executes only in repl flow. 
Added a comment to that effect. Also added a comment per your suggestion.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 221117)
Time Spent: 9h 10m  (was: 9h)

> Stats replication for ACID tables.
> --
>
> Key: HIVE-21109
> URL: https://issues.apache.org/jira/browse/HIVE-21109
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21109.01.patch, HIVE-21109.02.patch, 
> HIVE-21109.03.patch, HIVE-21109.04.patch, HIVE-21109.05.patch, 
> HIVE-21109.06.patch, HIVE-21109.07.patch, HIVE-21109.08.patch
>
>  Time Spent: 9h 10m
>  Remaining Estimate: 0h
>
> Transactional tables require a writeID associated with the stats update. This 
> writeId needs to be in sync with the writeId on the source and hence needs to 
> be replicated from the source.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21556) Useless configuration for old jetty in log4j.properties

2019-04-01 Thread Chen Zhi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Zhi updated HIVE-21556:

Description: 
{{The logger `Mortbay`}}
{code:java}
// code placeholder
{code}
{{in log4j.properties is used to control the logging activities of jetty (6.x). 
However, we upgraded to jetty 9 in HIVE-16049; the package name has changed 
to `org.eclipse.jetty` and we have added a `Jetty` logger to control the new 
version. `Mortbay` is useless. I guess we can remove it.}}

  was:The configuration `org.mortbay` in log4j.properties is used to control 
the logging activities of jetty (6.x). However, we upgraded to jetty 9 in 
HIVE-16049 and the package name has changed to `org.eclipse.jetty`. This 
configuration is useless. I guess we can remove it.


> Useless configuration for old jetty in log4j.properties
> ---
>
> Key: HIVE-21556
> URL: https://issues.apache.org/jira/browse/HIVE-21556
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Chen Zhi
>Priority: Minor
>
> {{The logger `Mortbay`}}
> {code:java}
> // code placeholder
> {code}
> {{in log4j.properties is used to control the logging activities of jetty (6.x). 
> However, we upgraded to jetty 9 in HIVE-16049; the package name has 
> changed to `org.eclipse.jetty` and we have added a `Jetty` logger to control 
> the new version. `Mortbay` is useless. I guess we can remove it.}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21556) Useless configuration for old jetty in log4j.properties

2019-04-01 Thread Chen Zhi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Zhi updated HIVE-21556:

Attachment: HIVE-21556.1.patch

> Useless configuration for old jetty in log4j.properties
> ---
>
> Key: HIVE-21556
> URL: https://issues.apache.org/jira/browse/HIVE-21556
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Chen Zhi
>Priority: Minor
> Attachments: HIVE-21556.1.patch
>
>
>  
> {code:java}
> logger.Mortbay.name = org.mortbay
> logger.Mortbay.level = INFO
> {code}
> The logger `Mortbay` in log4j.properties is used to control the logging 
> activities of jetty (6.x). However, we upgraded to jetty 9 in HIVE-16049; 
> the package name has changed to `org.eclipse.jetty` and we have added a new 
> logger to control jetty. `Mortbay` is useless. I guess we can remove it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21540) Query with join condition having date literal throws SemanticException.

2019-04-01 Thread Zoltan Haindrich (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16806491#comment-16806491
 ] 

Zoltan Haindrich commented on HIVE-21540:
-

+1


> Query with join condition having date literal throws SemanticException.
> ---
>
> Key: HIVE-21540
> URL: https://issues.apache.org/jira/browse/HIVE-21540
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 3.1.0, 3.1.1
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: Analyzer, DateField, pull-request-available
> Attachments: HIVE-21540.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This semantic exception is thrown for the following query. 
> *SemanticException '2019-03-20' encountered with 0 children*
> {code}
> create table date_1 (key int, dd date);
> create table date_2 (key int, dd date);
> select d1.key, d2.dd from(
>   select key, dd as start_dd, current_date as end_dd from date_1) d1
>   join date_2 as d2 on d1.key = d2.key where d2.dd between start_dd and 
> end_dd;
> {code}
> When the WHERE condition below is commented out, the query completes 
> successfully.
> where d2.dd between start_dd and end_dd
> 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HIVE-15546) Optimize Utilities.getInputPaths() so each listStatus of a partition is done in parallel

2019-04-01 Thread t oo (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-15546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16804873#comment-16804873
 ] 

t oo edited comment on HIVE-15546 at 4/1/19 8:01 AM:
-

[~stakiar] Issue still faced with single threading - HIVE-21546


was (Author: toopt4):
Did this ever make release 2.3? I can't see it in 
[https://github.com/apache/hive/blob/rel/release-2.3.0/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java]
 and issue still faced with single threading 
(https://stackoverflow.com/questions/55416703/hiveserver2-on-spark-mapred-fileinputformat-total-input-files-to-process)

> Optimize Utilities.getInputPaths() so each listStatus of a partition is done 
> in parallel
> 
>
> Key: HIVE-15546
> URL: https://issues.apache.org/jira/browse/HIVE-15546
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Fix For: 2.3.0
>
> Attachments: HIVE-15546.1.patch, HIVE-15546.2.patch, 
> HIVE-15546.3.patch, HIVE-15546.4.patch, HIVE-15546.5.patch, HIVE-15546.6.patch
>
>
> When running on blobstores (like S3) where metadata operations (like 
> listStatus) are costly, Utilities.getInputPaths() can add significant 
> overhead when setting up the input paths for an MR / Spark / Tez job.
> The method performs a listStatus on all input paths in order to check if the 
> path is empty. If the path is empty, a dummy file is created for the given 
> partition. This is all done sequentially. This can be really slow when there 
> are a lot of empty partitions. Even when all partitions have input data, this 
> can take a long time.
> We should either:
> (1) Just remove the logic to check if each input path is empty, and handle 
> any edge cases accordingly.
> (2) Multi-thread the listStatus calls
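Option (2) can be sketched with plain JDK APIs: submit one emptiness check per partition directory to a thread pool instead of listing sequentially. This is a minimal sketch of the idea only; the directory layout below is illustrative, and Hive itself would go through FileSystem#listStatus rather than java.nio.

```java
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelEmptyCheck {
    public static void main(String[] args) throws Exception {
        // Create a few partition-like directories; one of them is empty.
        Path base = Files.createTempDirectory("parts");
        Path p1 = Files.createDirectory(base.resolve("p1"));
        Files.createFile(p1.resolve("data"));
        Path p2 = Files.createDirectory(base.resolve("p2")); // empty partition

        // Check each partition for emptiness in parallel, not sequentially.
        ExecutorService pool = Executors.newFixedThreadPool(4);
        Map<Path, Future<Boolean>> checks = new LinkedHashMap<>();
        for (Path p : Arrays.asList(p1, p2)) {
            checks.put(p, pool.submit(() -> {
                try (DirectoryStream<Path> s = Files.newDirectoryStream(p)) {
                    return !s.iterator().hasNext(); // true if partition is empty
                }
            }));
        }
        for (Map.Entry<Path, Future<Boolean>> e : checks.entrySet()) {
            System.out.println(e.getKey().getFileName() + " empty=" + e.getValue().get());
        }
        pool.shutdown();
    }
}
```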



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21404) MSSQL upgrade script alters the wrong column

2019-04-01 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16806516#comment-16806516
 ] 

Hive QA commented on HIVE-21404:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 
30s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 20 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 11m 24s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-16792/dev-support/hive-personality.sh
 |
| git revision | master / f8a73a8 |
| Default Java | 1.8.0_111 |
| whitespace | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16792/yetus/whitespace-tabs.txt
 |
| modules | C: standalone-metastore/metastore-server U: 
standalone-metastore/metastore-server |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16792/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> MSSQL upgrade script alters the wrong column
> 
>
> Key: HIVE-21404
> URL: https://issues.apache.org/jira/browse/HIVE-21404
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.2.0
>Reporter: David Lavati
>Assignee: David Lavati
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.2.0
>
> Attachments: HIVE-21404.1.patch, HIVE-21404.2.patch, 
> HIVE-21404.3.patch, HIVE-21404.4.patch, HIVE-21404.4.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> HIVE-20221 changes PARTITION_PARAMS, so the following command is modifying 
> the wrong table:
> {{ALTER TABLE "SERDE_PARAMS" ALTER COLUMN "PARAM_VALUE" nvarchar(MAX);}}
> https://github.com/apache/hive/blob/d3b036920acde7bb04840697eb13038103b062b4/standalone-metastore/metastore-server/src/main/sql/mssql/upgrade-3.1.0-to-3.2.0.mssql.sql#L21



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-14836) Test the predicate pushing down support for Parquet vectorization read path

2019-04-01 Thread Xinli Shang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16806928#comment-16806928
 ] 

Xinli Shang commented on HIVE-14836:


Hi [~Ferd], is this task done? 

> Test the predicate pushing down support for Parquet vectorization read path
> ---
>
> Key: HIVE-14836
> URL: https://issues.apache.org/jira/browse/HIVE-14836
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-14836.patch
>
>
> We should add more unit tests for the predicate push-down support of the 
> Parquet vectorized read path.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18576) Support to read nested complex type with Parquet in vectorization mode

2019-04-01 Thread Xinli Shang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16806930#comment-16806930
 ] 

Xinli Shang commented on HIVE-18576:


Hi [~jerrychenhf], is this task done? 

> Support to read nested complex type with Parquet in vectorization mode
> --
>
> Key: HIVE-18576
> URL: https://issues.apache.org/jira/browse/HIVE-18576
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Colin Ma
>Assignee: Haifeng Chen
>Priority: Major
>
> Nested complex types are commonly used, eg: Struct, s2 
> List>. Currently, nested complex types can't be parsed in vectorization 
> mode; this ticket targets supporting it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (HIVE-21529) Hive support bootstrap of ACID/MM tables on an existing policy.

2019-04-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21529?focusedWorklogId=221331&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221331
 ]

ASF GitHub Bot logged work on HIVE-21529:
-

Author: ASF GitHub Bot
Created on: 01/Apr/19 15:33
Start Date: 01/Apr/19 15:33
Worklog Time Spent: 10m 
  Work Description: sankarh commented on pull request #581: HIVE-21529 : 
Bootstrap ACID tables as part of incremental dump.
URL: https://github.com/apache/hive/pull/581#discussion_r270896979
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
 ##
 @@ -193,27 +226,39 @@ private Long incrementalDump(Path dumpRoot, DumpMetaData 
dmd, Path cmRoot, Hive
 dmd.setDump(DumpType.INCREMENTAL, work.eventFrom, lastReplId, cmRoot);
 dmd.write();
 
-// If external tables are enabled for replication and
-// - If bootstrap is enabled, then need to combine bootstrap dump of 
external tables.
-// - If metadata-only dump is enabled, then shall skip dumping external 
tables data locations to
-//   _external_tables_info file. If not metadata-only, then dump the data 
locations.
-if (conf.getBoolVar(HiveConf.ConfVars.REPL_INCLUDE_EXTERNAL_TABLES)
-&& (!conf.getBoolVar(HiveConf.ConfVars.REPL_DUMP_METADATA_ONLY)
-|| conf.getBoolVar(HiveConf.ConfVars.REPL_BOOTSTRAP_EXTERNAL_TABLES))) 
{
+// Examine all the tables if required.
+if (shouldExamineTablesToDump()) {
   Path dbRoot = getBootstrapDbRoot(dumpRoot, dbName, true);
+
+  // If we are bootstrapping ACID tables, stop all the concurrent 
transactions and take a
+  // snapshot to dump those tables. Record the last event id in case we 
are performing
+  // bootstrap of ACID tables.
+  String validTxnList = null;
+  long bootstrapLastReplId = 0;
+  if (conf.getBoolVar(HiveConf.ConfVars.REPL_BOOTSTRAP_ACID_TABLES)) {
+validTxnList = getValidTxnListForReplDump(hiveDb);
+bootstrapLastReplId = 
hiveDb.getMSC().getCurrentNotificationEventId().getEventId();
 
 Review comment:
   bootstrapLastReplId should be captured before the open txn of the REPL DUMP query. 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 221331)
Time Spent: 20m  (was: 10m)

> Hive support bootstrap of ACID/MM tables on an existing policy.
> ---
>
> Key: HIVE-21529
> URL: https://issues.apache.org/jira/browse/HIVE-21529
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl, Transactions
>Affects Versions: 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: DR, pull-request-available, replication
> Attachments: HIVE-21529.01.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> If ACID/MM tables are to be enabled (hive.repl.dump.include.acid.tables) on an 
> existing repl policy, then the bootstrap dump of these tables needs to be 
> combined with the ongoing incremental dump. 
>  A one-time config "hive.repl.bootstrap.acid.tables" shall be added to include 
> the bootstrap in the given dump.
> The support for hive.repl.bootstrap.cleanup.type for ACID tables, to clean up 
> partially bootstrapped tables in case of retry, is already in place, thanks to 
> the work done for external tables. We need to test that it actually works.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (HIVE-21529) Hive support bootstrap of ACID/MM tables on an existing policy.

2019-04-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21529?focusedWorklogId=221333&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221333
 ]

ASF GitHub Bot logged work on HIVE-21529:
-

Author: ASF GitHub Bot
Created on: 01/Apr/19 15:33
Start Date: 01/Apr/19 15:33
Worklog Time Spent: 10m 
  Work Description: sankarh commented on pull request #581: HIVE-21529 : 
Bootstrap ACID tables as part of incremental dump.
URL: https://github.com/apache/hive/pull/581#discussion_r270926635
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
 ##
 @@ -193,27 +226,39 @@ private Long incrementalDump(Path dumpRoot, DumpMetaData 
dmd, Path cmRoot, Hive
 dmd.setDump(DumpType.INCREMENTAL, work.eventFrom, lastReplId, cmRoot);
 dmd.write();
 
-// If external tables are enabled for replication and
-// - If bootstrap is enabled, then need to combine bootstrap dump of 
external tables.
-// - If metadata-only dump is enabled, then shall skip dumping external 
tables data locations to
-//   _external_tables_info file. If not metadata-only, then dump the data 
locations.
-if (conf.getBoolVar(HiveConf.ConfVars.REPL_INCLUDE_EXTERNAL_TABLES)
-&& (!conf.getBoolVar(HiveConf.ConfVars.REPL_DUMP_METADATA_ONLY)
-|| conf.getBoolVar(HiveConf.ConfVars.REPL_BOOTSTRAP_EXTERNAL_TABLES))) 
{
+// Examine all the tables if required.
+if (shouldExamineTablesToDump()) {
   Path dbRoot = getBootstrapDbRoot(dumpRoot, dbName, true);
+
+  // If we are bootstrapping ACID tables, stop all the concurrent 
transactions and take a
+  // snapshot to dump those tables. Record the last event id in case we 
are performing
+  // bootstrap of ACID tables.
+  String validTxnList = null;
+  long bootstrapLastReplId = 0;
+  if (conf.getBoolVar(HiveConf.ConfVars.REPL_BOOTSTRAP_ACID_TABLES)) {
+validTxnList = getValidTxnListForReplDump(hiveDb);
+bootstrapLastReplId = 
hiveDb.getMSC().getCurrentNotificationEventId().getEventId();
 
 Review comment:
   There is a very rare corner case where this logic can be a problem even for 
full bootstrap. Driver.java, L663:
       if ((queryState.getHiveOperation() != null)
           && queryState.getHiveOperation().equals(HiveOperation.REPLDUMP)) {
         setLastReplIdForDump(queryState.getConf());
       }
       openTransaction();
   Here we capture the last repl ID just before opening a txn from the REPL DUMP 
execution thread (let's say T1).
   If a concurrent thread (let's say T2) opens a txn after T1 gets the last repl 
ID but before T1's openTransaction(), and T2 writes into the ACID table, then 
REPL DUMP waits for T2 to commit its txn before dumping the ACID table. The 
snapshot of the bootstrap-dumped table thus includes the data written by thread 
T2, but those events will also be part of the subsequent incremental dump.
   When we apply these OpenTxn, allocateWriteId and commitTxn events, they may 
create duplicate data.
   We need some idempotent logic on the replica side to handle this.
   Please check if my theory makes sense.
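The interleaving described in this comment can be modeled deterministically: if the bootstrap snapshot already contains T2's row, and the incremental events past the captured repl ID are replayed without idempotence, the same row is applied twice. The lists below are a toy model of the source table, the event log, and the replica, not Hive code.

```java
import java.util.ArrayList;
import java.util.List;

public class ReplDumpRaceSketch {
    public static void main(String[] args) {
        List<String> table = new ArrayList<>();   // source ACID table
        List<String> events = new ArrayList<>();  // notification event log

        // T1 (REPL DUMP) captures the last repl ID before opening its txn.
        int lastReplId = events.size();

        // T2 sneaks in: opens a txn and writes before T1's openTransaction().
        table.add("row-from-T2");
        events.add("commitTxn(row-from-T2)");

        // T1's bootstrap snapshot waits for T2's commit, so it includes the row.
        List<String> replica = new ArrayList<>(table);

        // The next incremental dump replays events past the captured repl ID,
        // re-applying T2's write: duplicate data without an idempotent apply.
        for (int i = lastReplId; i < events.size(); i++) {
            replica.add("row-from-T2");
        }
        System.out.println(replica.size()); // 2: the same row applied twice
    }
}
```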
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 221333)
Time Spent: 40m  (was: 0.5h)

> Hive support bootstrap of ACID/MM tables on an existing policy.
> ---
>
> Key: HIVE-21529
> URL: https://issues.apache.org/jira/browse/HIVE-21529
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl, Transactions
>Affects Versions: 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: DR, pull-request-available, replication
> Attachments: HIVE-21529.01.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> If ACID/MM tables are to be enabled (hive.repl.dump.include.acid.tables) on an 
> existing repl policy, then the bootstrap dump of these tables needs to be 
> combined with the ongoing incremental dump. 
>  A one-time config "hive.repl.bootstrap.acid.tables" shall be added to include 
> the bootstrap in the given dump.
> The support for hive.repl.bootstrap.cleanup.type for ACID tables, to clean up 
> partially bootstrapped tables in case of retry, is already in place, thanks to 
> the work done for external tables. We need to test that it actually works.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (HIVE-21529) Hive support bootstrap of ACID/MM tables on an existing policy.

2019-04-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21529?focusedWorklogId=221335&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221335
 ]

ASF GitHub Bot logged work on HIVE-21529:
-

Author: ASF GitHub Bot
Created on: 01/Apr/19 15:33
Start Date: 01/Apr/19 15:33
Worklog Time Spent: 10m 
  Work Description: sankarh commented on pull request #581: HIVE-21529 : 
Bootstrap ACID tables as part of incremental dump.
URL: https://github.com/apache/hive/pull/581#discussion_r270897339
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
 ##
 @@ -193,27 +226,39 @@ private Long incrementalDump(Path dumpRoot, DumpMetaData 
dmd, Path cmRoot, Hive
 dmd.setDump(DumpType.INCREMENTAL, work.eventFrom, lastReplId, cmRoot);
 
 Review comment:
   The events dump should be limited to the last repl ID that was captured 
before the open txn.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 221335)
Time Spent: 1h  (was: 50m)

> Hive support bootstrap of ACID/MM tables on an existing policy.
> ---
>
> Key: HIVE-21529
> URL: https://issues.apache.org/jira/browse/HIVE-21529
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl, Transactions
>Affects Versions: 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: DR, pull-request-available, replication
> Attachments: HIVE-21529.01.patch
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> If ACID/MM tables are to be enabled (hive.repl.dump.include.acid.tables) on an 
> existing repl policy, then the bootstrap dump of these tables needs to be 
> combined with the ongoing incremental dump. 
>  A one-time config "hive.repl.bootstrap.acid.tables" shall be added to include 
> the bootstrap in the given dump.
> The support for hive.repl.bootstrap.cleanup.type for ACID tables, to clean up 
> partially bootstrapped tables in case of retry, is already in place, thanks to 
> the work done for external tables. We need to test that it actually works.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (HIVE-21529) Hive support bootstrap of ACID/MM tables on an existing policy.

2019-04-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21529?focusedWorklogId=221332&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221332
 ]

ASF GitHub Bot logged work on HIVE-21529:
-

Author: ASF GitHub Bot
Created on: 01/Apr/19 15:33
Start Date: 01/Apr/19 15:33
Worklog Time Spent: 10m 
  Work Description: sankarh commented on pull request #581: HIVE-21529 : 
Bootstrap ACID tables as part of incremental dump.
URL: https://github.com/apache/hive/pull/581#discussion_r270898935
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
 ##
 @@ -193,27 +226,39 @@ private Long incrementalDump(Path dumpRoot, DumpMetaData 
dmd, Path cmRoot, Hive
 dmd.setDump(DumpType.INCREMENTAL, work.eventFrom, lastReplId, cmRoot);
 dmd.write();
 
-// If external tables are enabled for replication and
-// - If bootstrap is enabled, then need to combine bootstrap dump of 
external tables.
-// - If metadata-only dump is enabled, then shall skip dumping external 
tables data locations to
-//   _external_tables_info file. If not metadata-only, then dump the data 
locations.
-if (conf.getBoolVar(HiveConf.ConfVars.REPL_INCLUDE_EXTERNAL_TABLES)
-&& (!conf.getBoolVar(HiveConf.ConfVars.REPL_DUMP_METADATA_ONLY)
-|| conf.getBoolVar(HiveConf.ConfVars.REPL_BOOTSTRAP_EXTERNAL_TABLES))) 
{
+// Examine all the tables if required.
+if (shouldExamineTablesToDump()) {
   Path dbRoot = getBootstrapDbRoot(dumpRoot, dbName, true);
+
+  // If we are bootstrapping ACID tables, stop all the concurrent 
transactions and take a
+  // snapshot to dump those tables. Record the last event id in case we 
are performing
+  // bootstrap of ACID tables.
+  String validTxnList = null;
+  long bootstrapLastReplId = 0;
+  if (conf.getBoolVar(HiveConf.ConfVars.REPL_BOOTSTRAP_ACID_TABLES)) {
+validTxnList = getValidTxnListForReplDump(hiveDb);
+bootstrapLastReplId = 
hiveDb.getMSC().getCurrentNotificationEventId().getEventId();
+  }
+
   try (Writer writer = new Writer(dumpRoot, conf)) {
 
 Review comment:
   Shall we create the _external_info file only if 
shouldDumpExternalTableLocation?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 221332)
Time Spent: 0.5h  (was: 20m)

> Hive support bootstrap of ACID/MM tables on an existing policy.
> ---
>
> Key: HIVE-21529
> URL: https://issues.apache.org/jira/browse/HIVE-21529
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl, Transactions
>Affects Versions: 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: DR, pull-request-available, replication
> Attachments: HIVE-21529.01.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> If ACID/MM tables are to be enabled (hive.repl.dump.include.acid.tables) on an 
> existing repl policy, we need to combine a bootstrap dump of these tables 
> with the ongoing incremental dump. 
>  We shall add a one-time config "hive.repl.bootstrap.acid.tables" to include 
> the bootstrap in the given dump.
> Support for hive.repl.bootstrap.cleanup.type for ACID tables, to clean up 
> partially bootstrapped tables in case of retry, is already in place, thanks to 
> the work done for external tables. We need to test that it actually works.
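As a rough sketch, kicking off the one-time bootstrap on an existing policy could look like the following. The config names come from the description above; the exact WITH-clause form shown here is illustrative, not taken from this thread.

```sql
-- Illustrative sketch: incremental dump that also bootstraps ACID/MM tables.
-- 'hive.repl.bootstrap.acid.tables' is the proposed one-time switch.
REPL DUMP src_db WITH (
  'hive.repl.dump.include.acid.tables'='true',
  'hive.repl.bootstrap.acid.tables'='true'
);
```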



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (HIVE-21529) Hive support bootstrap of ACID/MM tables on an existing policy.

2019-04-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21529?focusedWorklogId=221334&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221334
 ]

ASF GitHub Bot logged work on HIVE-21529:
-

Author: ASF GitHub Bot
Created on: 01/Apr/19 15:33
Start Date: 01/Apr/19 15:33
Worklog Time Spent: 10m 
  Work Description: sankarh commented on pull request #581: HIVE-21529 : 
Bootstrap ACID tables as part of incremental dump.
URL: https://github.com/apache/hive/pull/581#discussion_r270911338
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/Utils.java
 ##
 @@ -196,6 +196,11 @@ public static boolean shouldReplicate(ReplicationSpec 
replicationSpec, Table tab
 }
 return shouldReplicateExternalTables;
   }
+
+  // Skip dumping events related to ACID tables if bootstrap is enabled on 
it
 
 Review comment:
   shouldReplicate is not checked by events such as AllocateWriteIdEvent and 
CommitTxnEvent.
   CommitTxnEvent cannot be skipped, but we need to remove all AcidWriteEvents 
that are packed along with it.
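The skip rule being discussed can be sketched outside Hive as a tiny predicate (event names and parameters here are illustrative, not Hive's actual event classes):

```python
def should_dump_event(event_type, table_is_acid, bootstrap_acid_enabled):
    """Sketch of the rule: when ACID bootstrap is enabled, drop ACID-table
    write-id/write events from the incremental dump; everything else passes."""
    skippable = {"ALLOC_WRITE_ID", "ACID_WRITE"}  # illustrative names
    if bootstrap_acid_enabled and table_is_acid and event_type in skippable:
        return False
    return True

# Commit events themselves still pass; only the packed ACID writes are dropped.
```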
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 221334)
Time Spent: 50m  (was: 40m)

> Hive support bootstrap of ACID/MM tables on an existing policy.
> ---
>
> Key: HIVE-21529
> URL: https://issues.apache.org/jira/browse/HIVE-21529
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl, Transactions
>Affects Versions: 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: DR, pull-request-available, replication
> Attachments: HIVE-21529.01.patch
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> If ACID/MM tables are to be enabled (hive.repl.dump.include.acid.tables) on an 
> existing repl policy, we need to combine a bootstrap dump of these tables 
> with the ongoing incremental dump. 
>  We shall add a one-time config "hive.repl.bootstrap.acid.tables" to include 
> the bootstrap in the given dump.
> Support for hive.repl.bootstrap.cleanup.type for ACID tables, to clean up 
> partially bootstrapped tables in case of retry, is already in place, thanks to 
> the work done for external tables. We need to test that it actually works.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-16795) Measure Performance for Parquet Vectorization Reader

2019-04-01 Thread Xinli Shang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16806927#comment-16806927
 ] 

Xinli Shang commented on HIVE-16795:


Hi [~Ferd], is this effort done? Any numbers you can share? Sorry if I missed 
other channels for this task, since I just joined recently. 

> Measure Performance for Parquet Vectorization Reader
> 
>
> Key: HIVE-16795
> URL: https://issues.apache.org/jira/browse/HIVE-16795
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ferdinand Xu
>Assignee: Colin Ma
>Priority: Major
>
> We need to measure the performance of the Parquet vectorization reader feature 
> using TPCx-BB or TPC-DS to see how much performance gain we can achieve.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21557) Query based compaction fails with NullPointerException: Non-local session path expected to be non-null

2019-04-01 Thread Peter Vary (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-21557:
--
Attachment: HIVE-21557.02.patch

> Query based compaction fails with NullPointerException: Non-local session 
> path expected to be non-null
> --
>
> Key: HIVE-21557
> URL: https://issues.apache.org/jira/browse/HIVE-21557
> Project: Hive
>  Issue Type: Bug
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
> Attachments: HIVE-21557.02.patch, HIVE-21557.patch
>
>
> {code:java}
> 2019-03-29T13:04:19.282Z hiveserver2-65d5bb4bd8-xx24r hiveserver2 1 
> db896a5e-5215-11e9-87ec-020c4712c37c [mdc@18060 class="compactor.CompactorMR" 
> level="ERROR" thread="hiveserver2-65d5bb4bd8-xx24r-28"] 
> org.apache.hadoop.hive.ql.metadata.HiveException: Failed to run drop table if 
> exists default_tmp_compactor_asd_1553864659196
> at org.apache.hadoop.hive.ql.DriverUtils.runOnDriver(DriverUtils.java:57)
> at org.apache.hadoop.hive.ql.DriverUtils.runOnDriver(DriverUtils.java:34)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.runCrudCompaction(CompactorMR.java:408)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:250)
> at org.apache.hadoop.hive.ql.txn.compactor.Worker.run(Worker.java:194)
> Caused by: java.lang.NullPointerException: Non-local session path expected to 
> be non-null
> at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:228)
> at 
> org.apache.hadoop.hive.ql.session.SessionState.getHDFSSessionPath(SessionState.java:838)
> at org.apache.hadoop.hive.ql.Context.(Context.java:319)
> at org.apache.hadoop.hive.ql.Context.(Context.java:305)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:603)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1881)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2004)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1764)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1753)
> at 
> org.apache.hadoop.hive.ql.DriverUtils.runOnDriver(DriverUtils.java:54){code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21557) Query based compaction fails with NullPointerException: Non-local session path expected to be non-null

2019-04-01 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16806917#comment-16806917
 ] 

Ashutosh Chauhan commented on HIVE-21557:
-

+1

> Query based compaction fails with NullPointerException: Non-local session 
> path expected to be non-null
> --
>
> Key: HIVE-21557
> URL: https://issues.apache.org/jira/browse/HIVE-21557
> Project: Hive
>  Issue Type: Bug
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
> Attachments: HIVE-21557.02.patch, HIVE-21557.patch
>
>
> {code:java}
> 2019-03-29T13:04:19.282Z hiveserver2-65d5bb4bd8-xx24r hiveserver2 1 
> db896a5e-5215-11e9-87ec-020c4712c37c [mdc@18060 class="compactor.CompactorMR" 
> level="ERROR" thread="hiveserver2-65d5bb4bd8-xx24r-28"] 
> org.apache.hadoop.hive.ql.metadata.HiveException: Failed to run drop table if 
> exists default_tmp_compactor_asd_1553864659196
> at org.apache.hadoop.hive.ql.DriverUtils.runOnDriver(DriverUtils.java:57)
> at org.apache.hadoop.hive.ql.DriverUtils.runOnDriver(DriverUtils.java:34)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.runCrudCompaction(CompactorMR.java:408)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:250)
> at org.apache.hadoop.hive.ql.txn.compactor.Worker.run(Worker.java:194)
> Caused by: java.lang.NullPointerException: Non-local session path expected to 
> be non-null
> at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:228)
> at 
> org.apache.hadoop.hive.ql.session.SessionState.getHDFSSessionPath(SessionState.java:838)
> at org.apache.hadoop.hive.ql.Context.(Context.java:319)
> at org.apache.hadoop.hive.ql.Context.(Context.java:305)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:603)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1881)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2004)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1764)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1753)
> at 
> org.apache.hadoop.hive.ql.DriverUtils.runOnDriver(DriverUtils.java:54){code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (HIVE-21109) Stats replication for ACID tables.

2019-04-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21109?focusedWorklogId=221203&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221203
 ]

ASF GitHub Bot logged work on HIVE-21109:
-

Author: ASF GitHub Bot
Created on: 01/Apr/19 09:58
Start Date: 01/Apr/19 09:58
Worklog Time Spent: 10m 
  Work Description: ashutosh-bapat commented on pull request #579: 
HIVE-21109 : Support stats replication for ACID tables.
URL: https://github.com/apache/hive/pull/579#discussion_r270794735
 
 

 ##
 File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestStatsReplicationScenarios.java
 ##
 @@ -301,12 +338,106 @@ private String dumpLoadVerify(List tableNames, 
String lastReplicationId,
 return dumpTuple.lastReplicationId;
   }
 
+  /**
+   * Run a bootstrap that will fail.
+   * @param tuple the location of bootstrap dump
+   */
+  private void failBootstrapLoad(WarehouseInstance.Tuple tuple, int 
failAfterNumTables) throws Throwable {
+// fail setting ckpt directory property for the second table so that we 
test the case when
+// bootstrap load fails after some but not all tables are loaded.
+BehaviourInjection callerVerifier
+= new BehaviourInjection() {
+  int cntTables = 0;
+  @Nullable
+  @Override
+  public Boolean apply(@Nullable CallerArguments args) {
+cntTables++;
+if (args.dbName.equalsIgnoreCase(replicatedDbName) && cntTables > 
failAfterNumTables) {
+  injectionPathCalled = true;
+  LOG.warn("Verifier - DB : " + args.dbName + " TABLE : " + 
args.tblName);
+  return false;
+}
+return true;
+  }
+};
+
+InjectableBehaviourObjectStore.setAlterTableModifier(callerVerifier);
+try {
+  replica.loadFailure(replicatedDbName, tuple.dumpLocation);
+  callerVerifier.assertInjectionsPerformed(true, false);
+} finally {
+  InjectableBehaviourObjectStore.resetAlterTableModifier();
+}
 
 Review comment:
   I don't think we need to be really hard and fast about the exact number of 
tables loaded. All we are testing is whether there was a failure and whether the 
retry loaded the stats successfully. The current set of checks is enough for that.
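Outside Hive, the fail-after-N pattern used by this injection can be sketched as a small callable (names are hypothetical; this is not Hive's BehaviourInjection API):

```python
class FailAfterN:
    """Return False (simulating a failure) once more than `fail_after` calls
    have been seen for the matching name; record that the failure path was hit."""
    def __init__(self, name, fail_after):
        self.name = name
        self.fail_after = fail_after
        self.calls = 0
        self.injection_path_called = False

    def __call__(self, name):
        self.calls += 1
        if name == self.name and self.calls > self.fail_after:
            self.injection_path_called = True
            return False  # signal the caller to abort, like the verifier above
        return True

verifier = FailAfterN("replicated_db", fail_after=2)
results = [verifier(db) for db in ["replicated_db"] * 4]
# first two calls pass, later ones fail
```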
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 221203)
Time Spent: 10h  (was: 9h 50m)

> Stats replication for ACID tables.
> --
>
> Key: HIVE-21109
> URL: https://issues.apache.org/jira/browse/HIVE-21109
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21109.01.patch, HIVE-21109.02.patch, 
> HIVE-21109.03.patch, HIVE-21109.04.patch, HIVE-21109.05.patch, 
> HIVE-21109.06.patch, HIVE-21109.07.patch, HIVE-21109.08.patch
>
>  Time Spent: 10h
>  Remaining Estimate: 0h
>
> Transactional tables require a writeID associated with the stats update. This 
> writeId needs to be in sync with the writeId on the source and hence needs to 
> be replicated from the source.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-21532) RuntimeException due to AccessControlException during creating hive-staging-dir

2019-04-01 Thread Oleksandr Polishchuk (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleksandr Polishchuk reassigned HIVE-21532:
---

Assignee: Oleksiy Sayankin  (was: Oleksandr Polishchuk)

> RuntimeException due to AccessControlException during creating 
> hive-staging-dir
> ---
>
> Key: HIVE-21532
> URL: https://issues.apache.org/jira/browse/HIVE-21532
> Project: Hive
>  Issue Type: Bug
>Reporter: Oleksandr Polishchuk
>Assignee: Oleksiy Sayankin
>Priority: Minor
> Attachments: HIVE-21532.1.patch, HIVE-21532.1.patch, 
> HIVE-21532.2.patch
>
>
> The bug was found in a Hive 2.3 environment.
> Steps leading to the exception:
> 1) Create a user without root permissions on your node.
> 2) The {{hive-site.xml}} file must contain the following properties:
> {code:java}
>  
>     hive.security.authorization.enabled
>   true
>   
>   
>    hive.security.authorization.manager
>  
> org.apache.hadoop.hive.ql.security.authorization.plugin.fallback.FallbackHiveAuthorizerFactory
>   
> {code}
> 3) Open the Hive CLI and run the following query:
> {code:java}
>  insert overwrite local directory '/tmp/test_dir' row format delimited fields 
> terminated by ',' select * from temp.test;
> {code}
> The previous query fails with the following exception:
> {code:java}
> FAILED: RuntimeException Cannot create staging directory 
> 'hdfs:///tmp/test_dir/.hive-staging_hive_2019-03-28_11-51-05_319_5882446299335967521-1':
>  User testuser(user id 3456)  has been denied access to create 
> .hive-staging_hive_2019-03-28_11-51-05_319_5882446299335967521-1
> {code}
> The investigation shows that if we delete the above-mentioned properties from 
> {{hive-site.xml}} and pass {{`queryTmpdir`}} instead of {{`dest_path`}} to 
> {{org.apache.hadoop.hive.ql.Context#getTempDirForPath()}}, as was done in 
> Hive-2.1, everything works fine. The method is currently used in 
> {{org.apache.hadoop.hive.ql.parse.SemanticAnalyzer}} - {{String statsTmpLoc 
> = ctx.getTempDirForPath(dest_path).toString();}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21511) beeline -f report no such file if file is not on local fs

2019-04-01 Thread Zoltan Haindrich (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16806598#comment-16806598
 ] 

Zoltan Haindrich commented on HIVE-21511:
-

+1
patch 1 had a green run; patch 2 only added a new test (which is also passing)

> beeline -f report no such file if file is not on local fs
> -
>
> Key: HIVE-21511
> URL: https://issues.apache.org/jira/browse/HIVE-21511
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Reporter: Bruno Pusztahazi
>Assignee: Bruno Pusztahazi
>Priority: Blocker
>  Labels: patch
> Attachments: HIVE-21511.1.patch, HIVE-21511.2.patch, 
> HIVE-21511.3.patch
>
>   Original Estimate: 0.05h
>  Remaining Estimate: 0.05h
>
> I tested it like this:
> HQL=hdfs://hacluster/tmp/ff.hql
> if hadoop fs -test -f ${HQL}
> then
>    beeline -f ${HQL}
> fi
> The test of ${HQL} succeeds, but beeline reports that ${HQL} is no such file or directory.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21034) Add option to schematool to drop Hive databases

2019-04-01 Thread Zoltan Haindrich (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-21034:

   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

pushed to master. Thank you [~dvoros]!

> Add option to schematool to drop Hive databases
> ---
>
> Key: HIVE-21034
> URL: https://issues.apache.org/jira/browse/HIVE-21034
> Project: Hive
>  Issue Type: Improvement
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21034.1.patch, HIVE-21034.2.patch, 
> HIVE-21034.2.patch, HIVE-21034.3.patch, HIVE-21034.4.patch, 
> HIVE-21034.5.patch, HIVE-21034.5.patch, HIVE-21034.5.patch, HIVE-21034.5.patch
>
>
> An option to remove all Hive managed data could be a useful addition to 
> {{schematool}}.
> I propose to introduce a new flag {{-dropAllDatabases}} that would *drop all 
> databases with CASCADE* to remove all data of managed tables.
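If the flag lands as proposed, usage would presumably look something like this (the dbType and other options are illustrative; only -dropAllDatabases comes from this proposal):

```shell
# DANGER: would drop every Hive database with CASCADE, removing managed data.
# Sketch only -- the flag is proposed in this ticket, not yet released.
schematool -dbType mysql -dropAllDatabases
```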



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (HIVE-21109) Stats replication for ACID tables.

2019-04-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21109?focusedWorklogId=221241&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221241
 ]

ASF GitHub Bot logged work on HIVE-21109:
-

Author: ASF GitHub Bot
Created on: 01/Apr/19 11:13
Start Date: 01/Apr/19 11:13
Worklog Time Spent: 10m 
  Work Description: sankarh commented on pull request #579: HIVE-21109 : 
Support stats replication for ACID tables.
URL: https://github.com/apache/hive/pull/579#discussion_r270818464
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsUpdateTask.java
 ##
 @@ -297,21 +303,34 @@ private ColumnStatisticsDesc getColumnStatsDesc(String 
dbName,
 
   private int persistColumnStats(Hive db) throws HiveException, MetaException, 
IOException {
 ColumnStatistics colStats = constructColumnStatsFromInput();
-ColumnStatisticsDesc colStatsDesc = colStats.getStatsDesc();
-// We do not support stats replication for a transactional table yet. If 
we are converting
-// a non-transactional table to a transactional table during replication, 
we might get
-// column statistics but we shouldn't update those.
-if (work.getColStats() != null &&
-
AcidUtils.isTransactionalTable(getHive().getTable(colStatsDesc.getDbName(),
-  
colStatsDesc.getTableName( {
-  LOG.debug("Skipped updating column stats for table " +
-TableName.getDbTable(colStatsDesc.getDbName(), 
colStatsDesc.getTableName()) +
-" because it is converted to a transactional table during 
replication.");
-  return 0;
-}
-
 SetPartitionsStatsRequest request =
 new SetPartitionsStatsRequest(Collections.singletonList(colStats));
+
+// Set writeId and validWriteId list for replicated statistics.
+if (work.getColStats() != null) {
+  String dbName = colStats.getStatsDesc().getDbName();
+  String tblName = colStats.getStatsDesc().getTableName();
+  Table tbl = db.getTable(dbName, tblName);
+  long writeId = work.getWriteId();
+  // If it's a transactional table on source and target, we will get a 
valid writeId
+  // associated with it. Otherwise it's a non-transactional table on 
source migrated to a
+  // transactional table on target, we need to craft a valid writeId here.
+  if (AcidUtils.isTransactionalTable(tbl)) {
+ValidWriteIdList writeIds;
+if (writeId <= 0) {
+  Long tmpWriteId = ReplUtils.getMigrationCurrentTblWriteId(conf);
+  if (tmpWriteId == null) {
+throw new HiveException("DDLTask : Write id is not set in the 
config by open txn task for migration");
+  }
+  writeId = tmpWriteId;
+}
+writeIds = new ValidReaderWriteIdList(TableName.getDbTable(dbName, 
tblName), new long[0],
 
 Review comment:
   I think this assumption can change in the future if someone uses this task to 
update stats even in a non-repl flow. I suggest adding an explicit check for the 
repl scope.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 221241)
Time Spent: 11h  (was: 10h 50m)

> Stats replication for ACID tables.
> --
>
> Key: HIVE-21109
> URL: https://issues.apache.org/jira/browse/HIVE-21109
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21109.01.patch, HIVE-21109.02.patch, 
> HIVE-21109.03.patch, HIVE-21109.04.patch, HIVE-21109.05.patch, 
> HIVE-21109.06.patch, HIVE-21109.07.patch, HIVE-21109.08.patch
>
>  Time Spent: 11h
>  Remaining Estimate: 0h
>
> Transactional tables require a writeID associated with the stats update. This 
> writeId needs to be in sync with the writeId on the source and hence needs to 
> be replicated from the source.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21532) RuntimeException due to AccessControlException during creating hive-staging-dir

2019-04-01 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16806701#comment-16806701
 ] 

Hive QA commented on HIVE-21532:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12964427/HIVE-21532.3.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 15890 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.metastore.TestObjectStore.testDirectSQLDropParitionsCleanup
 (batchId=230)
org.apache.hadoop.hive.metastore.TestObjectStore.testDirectSQLDropPartitionsCacheCrossSession
 (batchId=230)
org.apache.hadoop.hive.metastore.TestObjectStore.testDirectSqlErrorMetrics 
(batchId=230)
org.apache.hadoop.hive.metastore.TestObjectStore.testEmptyTrustStoreProps 
(batchId=230)
org.apache.hadoop.hive.metastore.TestObjectStore.testMaxEventResponse 
(batchId=230)
org.apache.hadoop.hive.metastore.TestObjectStore.testPartitionOps (batchId=230)
org.apache.hadoop.hive.metastore.TestObjectStore.testQueryCloseOnError 
(batchId=230)
org.apache.hadoop.hive.metastore.TestObjectStore.testRoleOps (batchId=230)
org.apache.hadoop.hive.metastore.TestObjectStore.testTableOps (batchId=230)
org.apache.hadoop.hive.metastore.TestObjectStore.testUseSSLProperty 
(batchId=230)
org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager2.testMultiInsert 
(batchId=327)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16794/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16794/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16794/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12964427 - PreCommit-HIVE-Build

> RuntimeException due to AccessControlException during creating 
> hive-staging-dir
> ---
>
> Key: HIVE-21532
> URL: https://issues.apache.org/jira/browse/HIVE-21532
> Project: Hive
>  Issue Type: Bug
>Reporter: Oleksandr Polishchuk
>Assignee: Oleksiy Sayankin
>Priority: Minor
> Attachments: HIVE-21532.1.patch, HIVE-21532.1.patch, 
> HIVE-21532.2.patch, HIVE-21532.3.patch
>
>
> The bug was found in a Hive 2.3 environment.
> Steps leading to the exception:
> 1) Create a user without root permissions on your node.
> 2) The {{hive-site.xml}} file must contain the following properties:
> {code:java}
>  
>     hive.security.authorization.enabled
>   true
>   
>   
>    hive.security.authorization.manager
>  
> org.apache.hadoop.hive.ql.security.authorization.plugin.fallback.FallbackHiveAuthorizerFactory
>   
> {code}
> 3) Open the Hive CLI and run the following query:
> {code:java}
>  insert overwrite local directory '/tmp/test_dir' row format delimited fields 
> terminated by ',' select * from temp.test;
> {code}
> The previous query fails with the following exception:
> {code:java}
> FAILED: RuntimeException Cannot create staging directory 
> 'hdfs:///tmp/test_dir/.hive-staging_hive_2019-03-28_11-51-05_319_5882446299335967521-1':
>  User testuser(user id 3456)  has been denied access to create 
> .hive-staging_hive_2019-03-28_11-51-05_319_5882446299335967521-1
> {code}
> The investigation shows that if we delete the above-mentioned properties from 
> {{hive-site.xml}} and pass {{`queryTmpdir`}} instead of {{`dest_path`}} to 
> {{org.apache.hadoop.hive.ql.Context#getTempDirForPath()}}, as was done in 
> Hive-2.1, everything works fine. The method is currently used in 
> {{org.apache.hadoop.hive.ql.parse.SemanticAnalyzer}} - {{String statsTmpLoc 
> = ctx.getTempDirForPath(dest_path).toString();}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (HIVE-21109) Stats replication for ACID tables.

2019-04-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21109?focusedWorklogId=221207&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221207
 ]

ASF GitHub Bot logged work on HIVE-21109:
-

Author: ASF GitHub Bot
Created on: 01/Apr/19 10:02
Start Date: 01/Apr/19 10:02
Worklog Time Spent: 10m 
  Work Description: ashutosh-bapat commented on pull request #579: 
HIVE-21109 : Support stats replication for ACID tables.
URL: https://github.com/apache/hive/pull/579#discussion_r270795998
 
 

 ##
 File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestStatsReplicationScenarios.java
 ##
 @@ -301,12 +338,106 @@ private String dumpLoadVerify(List tableNames, 
String lastReplicationId,
 return dumpTuple.lastReplicationId;
   }
 
+  /**
+   * Run a bootstrap that will fail.
+   * @param tuple the location of bootstrap dump
+   */
+  private void failBootstrapLoad(WarehouseInstance.Tuple tuple, int 
failAfterNumTables) throws Throwable {
+// fail setting ckpt directory property for the second table so that we 
test the case when
+// bootstrap load fails after some but not all tables are loaded.
+BehaviourInjection callerVerifier
+= new BehaviourInjection() {
+  int cntTables = 0;
+  @Nullable
+  @Override
+  public Boolean apply(@Nullable CallerArguments args) {
+cntTables++;
+if (args.dbName.equalsIgnoreCase(replicatedDbName) && cntTables > 
failAfterNumTables) {
+  injectionPathCalled = true;
+  LOG.warn("Verifier - DB : " + args.dbName + " TABLE : " + 
args.tblName);
+  return false;
+}
+return true;
+  }
+};
+
+InjectableBehaviourObjectStore.setAlterTableModifier(callerVerifier);
+try {
+  replica.loadFailure(replicatedDbName, tuple.dumpLocation);
+  callerVerifier.assertInjectionsPerformed(true, false);
+} finally {
+  InjectableBehaviourObjectStore.resetAlterTableModifier();
+}
+  }
+
+  private void failIncrementalLoad(WarehouseInstance.Tuple dumpTuple, int 
failAfterNumEvents) throws Throwable {
+// fail add notification when updating table stats after given number of 
such events. Thus we
+// test successful application as well as failed application of this event.
+BehaviourInjection callerVerifier
+= new BehaviourInjection() {
+  int cntEvents = 0;
+  @Override
+  public Boolean apply(NotificationEvent entry) {
+cntEvents++;
+if 
(entry.getEventType().equalsIgnoreCase(EventMessage.EventType.UPDATE_TABLE_COLUMN_STAT.toString())
 &&
+cntEvents > failAfterNumEvents) {
+  injectionPathCalled = true;
+  LOG.warn("Verifier - DB: " + entry.getDbName()
+  + " Table: " + entry.getTableName()
+  + " Event: " + entry.getEventType());
+  return false;
+}
+return true;
+  }
+};
+
+InjectableBehaviourObjectStore.setAddNotificationModifier(callerVerifier);
+try {
+  replica.loadFailure(replicatedDbName, dumpTuple.dumpLocation);
+} finally {
+  InjectableBehaviourObjectStore.resetAddNotificationModifier();
+}
+callerVerifier.assertInjectionsPerformed(true, false);
+
+// fail add notification when updating partition stats for for the second 
time. Thus we test
+// successful application as well as failed application of this event.
+callerVerifier = new BehaviourInjection() {
+  int cntEvents = 1;
+
+  @Override
+  public Boolean apply(NotificationEvent entry) {
+cntEvents++;
+if 
(entry.getEventType().equalsIgnoreCase(EventMessage.EventType.UPDATE_PARTITION_COLUMN_STAT.toString())
 &&
+cntEvents > failAfterNumEvents) {
+  injectionPathCalled = true;
+  LOG.warn("Verifier - DB: " + entry.getDbName()
+  + " Table: " + entry.getTableName()
+  + " Event: " + entry.getEventType());
+  return false;
+}
+return true;
+  }
+};
+
+InjectableBehaviourObjectStore.setAddNotificationModifier(callerVerifier);
+try {
+  replica.loadFailure(replicatedDbName, dumpTuple.dumpLocation);
+} finally {
+  InjectableBehaviourObjectStore.resetAddNotificationModifier();
+}
 
 Review comment:
   Done.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 221207)
Time Spent: 10h 20m  (was: 10h 10m)

> Stats replication for ACID tables.
> --
>
> Key: HIVE-21109
>   

[jira] [Updated] (HIVE-19034) hadoop fs test can check srcipt ok, but beeline -f report no such file

2019-04-01 Thread Zoltan Haindrich (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-19034:

Resolution: Duplicate
Status: Resolved  (was: Patch Available)

duplicate of HIVE-21511 

> hadoop fs test can check srcipt ok, but beeline -f report no such file
> --
>
> Key: HIVE-19034
> URL: https://issues.apache.org/jira/browse/HIVE-19034
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 1.3.0, beeline-cli-branch
> Environment: java version: 1.8.0_112-b15
> hadoop version: 2.7.2
> hive version:1.3.0
> hive JDBS version: 1.3.0
> beeline version: 1.3.0
>Reporter: fengxianghui
>Assignee: Bruno Pusztahazi
>Priority: Blocker
>  Labels: patch, todoc4.0
> Fix For: 1.3.0
>
> Attachments: HIVE-19034.1.patch
>
>   Original Estimate: 0.05h
>  Remaining Estimate: 0.05h
>
> I tested it like this:
> HQL=hdfs://hacluster/tmp/ff.hql
> if hadoop fs -test -f ${HQL}
> then
>    beeline -f ${HQL}
> fi
> The test of ${HQL} succeeds, but beeline reports that ${HQL} is no such file or directory.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21532) RuntimeException due to AccessControlException during creating hive-staging-dir

2019-04-01 Thread Oleksiy Sayankin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16806622#comment-16806622
 ] 

Oleksiy Sayankin commented on HIVE-21532:
-

Rebased the patch against master. Let's see if there are any conflicts.

> RuntimeException due to AccessControlException during creating 
> hive-staging-dir
> ---
>
> Key: HIVE-21532
> URL: https://issues.apache.org/jira/browse/HIVE-21532
> Project: Hive
>  Issue Type: Bug
>Reporter: Oleksandr Polishchuk
>Assignee: Oleksiy Sayankin
>Priority: Minor
> Attachments: HIVE-21532.1.patch, HIVE-21532.1.patch, 
> HIVE-21532.2.patch, HIVE-21532.3.patch
>
>
> The bug was found in a Hive 2.3 environment.
> Steps leading to the exception:
> 1) Create a user without root permissions on your node.
> 2) The {{hive-site.xml}} file has to contain the following properties:
> {code:xml}
> <property>
>   <name>hive.security.authorization.enabled</name>
>   <value>true</value>
> </property>
> <property>
>   <name>hive.security.authorization.manager</name>
>   <value>org.apache.hadoop.hive.ql.security.authorization.plugin.fallback.FallbackHiveAuthorizerFactory</value>
> </property>
> {code}
> 3) Open the Hive CLI and run the following query:
> {code:sql}
> insert overwrite local directory '/tmp/test_dir' row format delimited fields
> terminated by ',' select * from temp.test;
> {code}
> The query fails with the following exception:
> {code}
> FAILED: RuntimeException Cannot create staging directory
> 'hdfs:///tmp/test_dir/.hive-staging_hive_2019-03-28_11-51-05_319_5882446299335967521-1':
>  User testuser (user id 3456) has been denied access to create
> .hive-staging_hive_2019-03-28_11-51-05_319_5882446299335967521-1
> {code}
> The investigation shows that everything works if the properties mentioned
> above are removed from {{hive-site.xml}} and {{queryTmpdir}} is passed
> instead of {{dest_path}} to
> {{org.apache.hadoop.hive.ql.Context#getTempDirForPath()}}, as was done in
> Hive 2.1. The method is currently used in
> {{org.apache.hadoop.hive.ql.parse.SemanticAnalyzer}}: {{String statsTmpLoc
> = ctx.getTempDirForPath(dest_path).toString();}}
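The difference between the two behaviors described in the report can be illustrated with a small sketch (hypothetical paths and helper, not Hive code): staging under the query tmp dir targets a location the session user owns, while staging under the destination is subject to the destination's authorization checks.

```python
import posixpath

def staging_dir(base: str, name: str = ".hive-staging") -> str:
    # Place the staging directory directly under the given base path.
    return posixpath.join(base, name)

dest_path = "hdfs:///tmp/test_dir"          # query output location
query_tmpdir = "hdfs:///tmp/hive/testuser"  # session scratch dir (assumed)

hive_21_style = staging_dir(query_tmpdir)  # writable by the session user
hive_23_style = staging_dir(dest_path)     # may be denied by the authorizer

print(hive_21_style)  # hdfs:///tmp/hive/testuser/.hive-staging
print(hive_23_style)  # hdfs:///tmp/test_dir/.hive-staging
```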





[jira] [Comment Edited] (HIVE-21532) RuntimeException due to AccessControlException during creating hive-staging-dir

2019-04-01 Thread Oleksiy Sayankin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16806622#comment-16806622
 ] 

Oleksiy Sayankin edited comment on HIVE-21532 at 4/1/19 11:34 AM:
--

Rebased the patch against master. Let's see if there are any conflicts.


was (Author: osayankin):
Rebased the patch against master.  Let's see if there any conflicts.

> RuntimeException due to AccessControlException during creating 
> hive-staging-dir
> ---
>
> Key: HIVE-21532
> URL: https://issues.apache.org/jira/browse/HIVE-21532
> Project: Hive
>  Issue Type: Bug
>Reporter: Oleksandr Polishchuk
>Assignee: Oleksiy Sayankin
>Priority: Minor
> Attachments: HIVE-21532.1.patch, HIVE-21532.1.patch, 
> HIVE-21532.2.patch, HIVE-21532.3.patch
>
>
> The bug was found in a Hive 2.3 environment.
> Steps leading to the exception:
> 1) Create a user without root permissions on your node.
> 2) The {{hive-site.xml}} file has to contain the following properties:
> {code:xml}
> <property>
>   <name>hive.security.authorization.enabled</name>
>   <value>true</value>
> </property>
> <property>
>   <name>hive.security.authorization.manager</name>
>   <value>org.apache.hadoop.hive.ql.security.authorization.plugin.fallback.FallbackHiveAuthorizerFactory</value>
> </property>
> {code}
> 3) Open the Hive CLI and run the following query:
> {code:sql}
> insert overwrite local directory '/tmp/test_dir' row format delimited fields
> terminated by ',' select * from temp.test;
> {code}
> The query fails with the following exception:
> {code}
> FAILED: RuntimeException Cannot create staging directory
> 'hdfs:///tmp/test_dir/.hive-staging_hive_2019-03-28_11-51-05_319_5882446299335967521-1':
>  User testuser (user id 3456) has been denied access to create
> .hive-staging_hive_2019-03-28_11-51-05_319_5882446299335967521-1
> {code}
> The investigation shows that everything works if the properties mentioned
> above are removed from {{hive-site.xml}} and {{queryTmpdir}} is passed
> instead of {{dest_path}} to
> {{org.apache.hadoop.hive.ql.Context#getTempDirForPath()}}, as was done in
> Hive 2.1. The method is currently used in
> {{org.apache.hadoop.hive.ql.parse.SemanticAnalyzer}}: {{String statsTmpLoc
> = ctx.getTempDirForPath(dest_path).toString();}}





[jira] [Updated] (HIVE-21557) Query based compaction fails with NullPointerException: Non-local session path expected to be non-null

2019-04-01 Thread Peter Vary (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-21557:
--
Attachment: HIVE-21557.patch

> Query based compaction fails with NullPointerException: Non-local session 
> path expected to be non-null
> --
>
> Key: HIVE-21557
> URL: https://issues.apache.org/jira/browse/HIVE-21557
> Project: Hive
>  Issue Type: Bug
>Reporter: Peter Vary
>Priority: Major
> Attachments: HIVE-21557.patch
>
>
> {code:java}
> 2019-03-29T13:04:19.282Z hiveserver2-65d5bb4bd8-xx24r hiveserver2 1 
> db896a5e-5215-11e9-87ec-020c4712c37c [mdc@18060 class="compactor.CompactorMR" 
> level="ERROR" thread="hiveserver2-65d5bb4bd8-xx24r-28"] 
> org.apache.hadoop.hive.ql.metadata.HiveException: Failed to run drop table if 
> exists default_tmp_compactor_asd_1553864659196
> at org.apache.hadoop.hive.ql.DriverUtils.runOnDriver(DriverUtils.java:57)
> at org.apache.hadoop.hive.ql.DriverUtils.runOnDriver(DriverUtils.java:34)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.runCrudCompaction(CompactorMR.java:408)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:250)
> at org.apache.hadoop.hive.ql.txn.compactor.Worker.run(Worker.java:194)
> Caused by: java.lang.NullPointerException: Non-local session path expected to 
> be non-null
> at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:228)
> at 
> org.apache.hadoop.hive.ql.session.SessionState.getHDFSSessionPath(SessionState.java:838)
> at org.apache.hadoop.hive.ql.Context.<init>(Context.java:319)
> at org.apache.hadoop.hive.ql.Context.<init>(Context.java:305)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:603)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1881)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2004)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1764)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1753)
> at 
> org.apache.hadoop.hive.ql.DriverUtils.runOnDriver(DriverUtils.java:54){code}





[jira] [Updated] (HIVE-21557) Query based compaction fails with NullPointerException: Non-local session path expected to be non-null

2019-04-01 Thread Peter Vary (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-21557:
--
Assignee: Peter Vary
  Status: Patch Available  (was: Open)

> Query based compaction fails with NullPointerException: Non-local session 
> path expected to be non-null
> --
>
> Key: HIVE-21557
> URL: https://issues.apache.org/jira/browse/HIVE-21557
> Project: Hive
>  Issue Type: Bug
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
> Attachments: HIVE-21557.patch
>
>
> {code:java}
> 2019-03-29T13:04:19.282Z hiveserver2-65d5bb4bd8-xx24r hiveserver2 1 
> db896a5e-5215-11e9-87ec-020c4712c37c [mdc@18060 class="compactor.CompactorMR" 
> level="ERROR" thread="hiveserver2-65d5bb4bd8-xx24r-28"] 
> org.apache.hadoop.hive.ql.metadata.HiveException: Failed to run drop table if 
> exists default_tmp_compactor_asd_1553864659196
> at org.apache.hadoop.hive.ql.DriverUtils.runOnDriver(DriverUtils.java:57)
> at org.apache.hadoop.hive.ql.DriverUtils.runOnDriver(DriverUtils.java:34)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.runCrudCompaction(CompactorMR.java:408)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:250)
> at org.apache.hadoop.hive.ql.txn.compactor.Worker.run(Worker.java:194)
> Caused by: java.lang.NullPointerException: Non-local session path expected to 
> be non-null
> at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:228)
> at 
> org.apache.hadoop.hive.ql.session.SessionState.getHDFSSessionPath(SessionState.java:838)
> at org.apache.hadoop.hive.ql.Context.<init>(Context.java:319)
> at org.apache.hadoop.hive.ql.Context.<init>(Context.java:305)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:603)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1881)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2004)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1764)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1753)
> at 
> org.apache.hadoop.hive.ql.DriverUtils.runOnDriver(DriverUtils.java:54){code}





[jira] [Commented] (HIVE-9995) ACID compaction tries to compact a single file

2019-04-01 Thread Peter Vary (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-9995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16806592#comment-16806592
 ] 

Peter Vary commented on HIVE-9995:
--

[~dkuzmenko]: Rebase please :(

> ACID compaction tries to compact a single file
> --
>
> Key: HIVE-9995
> URL: https://issues.apache.org/jira/browse/HIVE-9995
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Denys Kuzmenko
>Priority: Major
> Attachments: HIVE-9995.01.patch, HIVE-9995.02.patch, 
> HIVE-9995.03.patch, HIVE-9995.04.patch, HIVE-9995.05.patch, 
> HIVE-9995.06.patch, HIVE-9995.07.patch, HIVE-9995.08.patch, 
> HIVE-9995.WIP.patch
>
>
> Consider TestWorker.minorWithOpenInMiddle()
> Since there is an open txnId=23, there is no meaningful minor compaction
> work to do. The system still tries to compact a single delta file for the
> 21-22 id range, effectively copying the file onto itself.
> This is (1) inefficient and (2) can potentially affect a reader.
> (from a real cluster)
> Suppose we start with 
> {noformat}
> drwxr-xr-x   - ekoifman staff  0 2016-06-09 16:03 
> /user/hive/warehouse/t/base_016
> -rw-r--r--   1 ekoifman staff602 2016-06-09 16:03 
> /user/hive/warehouse/t/base_016/bucket_0
> drwxr-xr-x   - ekoifman staff  0 2016-06-09 16:07 
> /user/hive/warehouse/t/base_017
> -rw-r--r--   1 ekoifman staff588 2016-06-09 16:07 
> /user/hive/warehouse/t/base_017/bucket_0
> drwxr-xr-x   - ekoifman staff  0 2016-06-09 16:07 
> /user/hive/warehouse/t/delta_017_017_
> -rw-r--r--   1 ekoifman staff514 2016-06-09 16:06 
> /user/hive/warehouse/t/delta_017_017_/bucket_0
> drwxr-xr-x   - ekoifman staff  0 2016-06-09 16:07 
> /user/hive/warehouse/t/delta_018_018_
> -rw-r--r--   1 ekoifman staff612 2016-06-09 16:07 
> /user/hive/warehouse/t/delta_018_018_/bucket_0
> {noformat}
> then do _alter table T compact 'minor';_
> then we end up with 
> {noformat}
> drwxr-xr-x   - ekoifman staff  0 2016-06-09 16:07 
> /user/hive/warehouse/t/base_017
> -rw-r--r--   1 ekoifman staff588 2016-06-09 16:07 
> /user/hive/warehouse/t/base_017/bucket_0
> drwxr-xr-x   - ekoifman staff  0 2016-06-09 16:11 
> /user/hive/warehouse/t/delta_018_018
> -rw-r--r--   1 ekoifman staff500 2016-06-09 16:11 
> /user/hive/warehouse/t/delta_018_018/bucket_0
> drwxr-xr-x   - ekoifman staff  0 2016-06-09 16:07 
> /user/hive/warehouse/t/delta_018_018_
> -rw-r--r--   1 ekoifman staff612 2016-06-09 16:07 
> /user/hive/warehouse/t/delta_018_018_/bucket_0
> {noformat}
> So compaction created a new dir _/user/hive/warehouse/t/delta_018_018_
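A sketch of the guard the report implies (a hypothetical helper, not the actual CompactorMR logic): parse the delta directory names from the listing above and skip minor compaction when fewer than two deltas would be merged.

```python
import re

DELTA_RE = re.compile(r"^delta_(\d+)_(\d+)")

def delta_ranges(dir_names):
    """Extract (min_txn, max_txn) pairs from delta directory names."""
    ranges = []
    for name in dir_names:
        m = DELTA_RE.match(name)
        if m:
            ranges.append((int(m.group(1)), int(m.group(2))))
    return sorted(ranges)

def minor_compaction_useful(dir_names):
    # Merging fewer than two deltas just copies a file onto itself.
    return len(delta_ranges(dir_names)) >= 2

before = ["base_016", "base_017", "delta_017_017_", "delta_018_018_"]
print(minor_compaction_useful(before))                          # True
print(minor_compaction_useful(["base_017", "delta_018_018_"]))  # False
```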





[jira] [Work logged] (HIVE-21109) Stats replication for ACID tables.

2019-04-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21109?focusedWorklogId=221209&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221209
 ]

ASF GitHub Bot logged work on HIVE-21109:
-

Author: ASF GitHub Bot
Created on: 01/Apr/19 10:05
Start Date: 01/Apr/19 10:05
Worklog Time Spent: 10m 
  Work Description: ashutosh-bapat commented on pull request #579: 
HIVE-21109 : Support stats replication for ACID tables.
URL: https://github.com/apache/hive/pull/579#discussion_r270797067
 
 

 ##
 File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestStatsReplicationScenariosMigrationNoAutogather.java
 ##
 @@ -0,0 +1,78 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.parse;
+
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.conf.MetastoreConf;
+import 
org.apache.hadoop.hive.metastore.messaging.json.gzip.GzipJSONMessageEncoder;
+import org.junit.BeforeClass;
+import org.junit.Rule;
+import org.junit.rules.TestName;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.HashMap;
+import java.util.Map;
+
+/**
+ * Tests statistics replication for ACID tables.
+ */
+public class TestStatsReplicationScenariosMigrationNoAutogather extends 
TestStatsReplicationScenarios {
+  @Rule
+  public final TestName testName = new TestName();
+
+  protected static final Logger LOG = 
LoggerFactory.getLogger(TestReplicationScenarios.class);
 
 Review comment:
   Removed.
 



Issue Time Tracking
---

Worklog Id: (was: 221209)
Time Spent: 10.5h  (was: 10h 20m)

> Stats replication for ACID tables.
> --
>
> Key: HIVE-21109
> URL: https://issues.apache.org/jira/browse/HIVE-21109
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21109.01.patch, HIVE-21109.02.patch, 
> HIVE-21109.03.patch, HIVE-21109.04.patch, HIVE-21109.05.patch, 
> HIVE-21109.06.patch, HIVE-21109.07.patch, HIVE-21109.08.patch
>
>  Time Spent: 10.5h
>  Remaining Estimate: 0h
>
> Transactional tables require a writeID associated with the stats update. This 
> writeId needs to be in sync with the writeId on the source and hence needs to 
> be replicated from the source.
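A toy model of the invariant described above (an illustration under assumed names, not the replication code): a stats update on a transactional table is accepted only when it carries the writeId it was computed under, which is why the replica must receive the source's writeId along with the stats.

```python
class TransactionalTable:
    """Minimal sketch: column stats tagged with the writeId they belong to."""

    def __init__(self):
        self.write_id = 0       # last allocated writeId
        self.stats = None
        self.stats_write_id = None

    def allocate_write_id(self) -> int:
        self.write_id += 1
        return self.write_id

    def update_stats(self, stats, write_id: int) -> bool:
        # Reject stats computed under a writeId other than the current one.
        if write_id != self.write_id:
            return False
        self.stats, self.stats_write_id = stats, write_id
        return True

replica = TransactionalTable()
replica.allocate_write_id()                   # replicated writeId = 1
print(replica.update_stats({"rows": 10}, 1))  # True: writeIds in sync
print(replica.update_stats({"rows": 12}, 0))  # False: stale writeId
```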





[jira] [Work logged] (HIVE-21109) Stats replication for ACID tables.

2019-04-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21109?focusedWorklogId=221210&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221210
 ]

ASF GitHub Bot logged work on HIVE-21109:
-

Author: ASF GitHub Bot
Created on: 01/Apr/19 10:05
Start Date: 01/Apr/19 10:05
Worklog Time Spent: 10m 
  Work Description: ashutosh-bapat commented on pull request #579: 
HIVE-21109 : Support stats replication for ACID tables.
URL: https://github.com/apache/hive/pull/579#discussion_r270797151
 
 

 ##
 File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestStatsReplicationScenarios.java
 ##
 @@ -301,12 +338,106 @@ private String dumpLoadVerify(List<String> tableNames, 
String lastReplicationId,
 return dumpTuple.lastReplicationId;
   }
 
+  /**
+   * Run a bootstrap that will fail.
+   * @param tuple the location of bootstrap dump
+   */
+  private void failBootstrapLoad(WarehouseInstance.Tuple tuple, int 
failAfterNumTables) throws Throwable {
+// fail setting ckpt directory property for the second table so that we 
test the case when
+// bootstrap load fails after some but not all tables are loaded.
+BehaviourInjection<CallerArguments, Boolean> callerVerifier
+= new BehaviourInjection<CallerArguments, Boolean>() {
+  int cntTables = 0;
+  @Nullable
+  @Override
+  public Boolean apply(@Nullable CallerArguments args) {
+cntTables++;
 
 Review comment:
   Hmm. Thanks for catching this. Done.
 



Issue Time Tracking
---

Worklog Id: (was: 221210)
Time Spent: 10h 40m  (was: 10.5h)

> Stats replication for ACID tables.
> --
>
> Key: HIVE-21109
> URL: https://issues.apache.org/jira/browse/HIVE-21109
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21109.01.patch, HIVE-21109.02.patch, 
> HIVE-21109.03.patch, HIVE-21109.04.patch, HIVE-21109.05.patch, 
> HIVE-21109.06.patch, HIVE-21109.07.patch, HIVE-21109.08.patch
>
>  Time Spent: 10h 40m
>  Remaining Estimate: 0h
>
> Transactional tables require a writeID associated with the stats update. This 
> writeId needs to be in sync with the writeId on the source and hence needs to 
> be replicated from the source.





[jira] [Commented] (HIVE-21509) LLAP may cache corrupted column vectors and return wrong query result

2019-04-01 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16806649#comment-16806649
 ] 

Hive QA commented on HIVE-21509:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12964421/HIVE-21509.3.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 15886 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16793/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16793/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16793/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12964421 - PreCommit-HIVE-Build

> LLAP may cache corrupted column vectors and return wrong query result
> -
>
> Key: HIVE-21509
> URL: https://issues.apache.org/jira/browse/HIVE-21509
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Adam Szita
>Assignee: Adam Szita
>Priority: Major
> Attachments: HIVE-21509.0.wip.patch, HIVE-21509.1.wip.patch, 
> HIVE-21509.2.patch, HIVE-21509.3.patch
>
>
> In some scenarios, LLAP might store column vectors in cache that are getting 
> reused and reset just before their original content would be written.
> This is a concurrency issue and is therefore flaky. It is not easy to
> reproduce, but the odds of surfacing it can be improved by setting the
> LLAP executor and IO thread counts this way:
>  * set hive.llap.daemon.num.executors=32;
>  * set hive.llap.io.threadpool.size=1;
>  * use TPC-DS input data for the store_sales table, with at least a couple
> hundred thousand rows, in text format:
> {code:java}
> ROW FORMAT SERDE    'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'  
> WITH SERDEPROPERTIES (    'field.delim'='|',    'serialization.format'='|')  
> STORED AS INPUTFORMAT    'org.apache.hadoop.mapred.TextInputFormat'  
> OUTPUTFORMAT    
> 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'{code}
>  * having more splits increases the chance of the issue showing itself, so
> it is worth running _set tez.grouping.min-size=1024; set tez.grouping.max-size=1024;_
>  * run this query on the table: select min(ss_sold_date_sk) from store_sales;
> The first query result is correct (2450816 in my case). Repeating the query 
> will trigger reading from LLAP cache and produce a wrong result: 0.
> If one wants to make sure of running into this issue, place a 
> Thread.sleep(250) at the beginning of VectorDeserializeOrcWriter#run().
>  
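The race described above boils down to caching a mutable buffer by reference: the cached column vector is later reused and reset by an executor before its original content is consumed. A language-neutral illustration (plain Python, not LLAP code) of why a snapshot at insertion time avoids the corruption:

```python
def cache_by_reference(cache, key, vec):
    cache[key] = vec        # unsafe: later mutation of vec stays visible

def cache_by_copy(cache, key, vec):
    cache[key] = list(vec)  # safe: snapshot taken at insertion time

cache = {}
vec = [2450816]             # correct min(ss_sold_date_sk) from the report
cache_by_reference(cache, "ref", vec)
cache_by_copy(cache, "copy", vec)

vec[0] = 0                  # executor resets/reuses the column vector

print(cache["ref"][0])      # 0        (corrupted, like the wrong result)
print(cache["copy"][0])     # 2450816  (still correct)
```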





[jira] [Work logged] (HIVE-21109) Stats replication for ACID tables.

2019-04-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21109?focusedWorklogId=221240&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221240
 ]

ASF GitHub Bot logged work on HIVE-21109:
-

Author: ASF GitHub Bot
Created on: 01/Apr/19 11:10
Start Date: 01/Apr/19 11:10
Worklog Time Spent: 10m 
  Work Description: sankarh commented on pull request #579: HIVE-21109 : 
Support stats replication for ACID tables.
URL: https://github.com/apache/hive/pull/579#discussion_r270817825
 
 

 ##
 File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestStatsReplicationScenariosMigration.java
 ##
 @@ -0,0 +1,78 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.parse;
+
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.conf.MetastoreConf;
+import 
org.apache.hadoop.hive.metastore.messaging.json.gzip.GzipJSONMessageEncoder;
+import org.junit.BeforeClass;
+import org.junit.Rule;
+import org.junit.rules.TestName;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.HashMap;
+import java.util.Map;
+
+/**
+ * Tests statistics replication for ACID tables.
+ */
+public class TestStatsReplicationScenariosMigration extends 
TestStatsReplicationScenarios {
+  @Rule
+  public final TestName testName = new TestName();
+
+  protected static final Logger LOG = 
LoggerFactory.getLogger(TestReplicationScenarios.class);
+
+  @BeforeClass
+  public static void classLevelSetup() throws Exception {
+Map<String, String> overrides = new HashMap<>();
+overrides.put(MetastoreConf.ConfVars.EVENT_MESSAGE_FACTORY.getHiveName(),
+GzipJSONMessageEncoder.class.getCanonicalName());
+
+HashMap<String, String> replicaConfigs = new HashMap<String, String>() {{
+  put("hive.support.concurrency", "true");
+  put("hive.txn.manager", 
"org.apache.hadoop.hive.ql.lockmgr.DbTxnManager");
+  put("hive.metastore.client.capability.check", "false");
+  put("hive.repl.bootstrap.dump.open.txn.timeout", "1s");
+  put("hive.exec.dynamic.partition.mode", "nonstrict");
+  put("hive.strict.checks.bucketing", "false");
+  put("hive.mapred.mode", "nonstrict");
+  put("mapred.input.dir.recursive", "true");
+  put("hive.metastore.disallow.incompatible.col.type.changes", "false");
+  put("hive.strict.managed.tables", "true");
+}};
+replicaConfigs.putAll(overrides);
+
+HashMap<String, String> primaryConfigs = new HashMap<String, String>() {{
+  put("hive.metastore.client.capability.check", "false");
+  put("hive.repl.bootstrap.dump.open.txn.timeout", "1s");
+  put("hive.exec.dynamic.partition.mode", "nonstrict");
+  put("hive.strict.checks.bucketing", "false");
+  put("hive.mapred.mode", "nonstrict");
+  put("mapred.input.dir.recursive", "true");
+  put("hive.metastore.disallow.incompatible.col.type.changes", "false");
+  put("hive.support.concurrency", "false");
+  put("hive.txn.manager", 
"org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager");
+  put("hive.strict.managed.tables", "false");
+}};
+primaryConfigs.putAll(overrides);
+
+internalBeforeClassSetup(primaryConfigs, replicaConfigs,
 
 Review comment:
   In the migration case, we should validate that the stats are associated with
the correct writeId. I think, in our tests, it should point to the last
allocated writeId.
 



Issue Time Tracking
---

Worklog Id: (was: 221240)
Time Spent: 10h 50m  (was: 10h 40m)

> Stats replication for ACID tables.
> --
>
> Key: HIVE-21109
> URL: https://issues.apache.org/jira/browse/HIVE-21109
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: 

[jira] [Commented] (HIVE-21532) RuntimeException due to AccessControlException during creating hive-staging-dir

2019-04-01 Thread Zoltan Haindrich (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16806681#comment-16806681
 ] 

Zoltan Haindrich commented on HIVE-21532:
-

test?
https://issues.apache.org/jira/browse/HIVE-21532?focusedCommentId=16806615&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16806615

> RuntimeException due to AccessControlException during creating 
> hive-staging-dir
> ---
>
> Key: HIVE-21532
> URL: https://issues.apache.org/jira/browse/HIVE-21532
> Project: Hive
>  Issue Type: Bug
>Reporter: Oleksandr Polishchuk
>Assignee: Oleksiy Sayankin
>Priority: Minor
> Attachments: HIVE-21532.1.patch, HIVE-21532.1.patch, 
> HIVE-21532.2.patch, HIVE-21532.3.patch
>
>
> The bug was found in a Hive 2.3 environment.
> Steps leading to the exception:
> 1) Create a user without root permissions on your node.
> 2) The {{hive-site.xml}} file has to contain the following properties:
> {code:xml}
> <property>
>   <name>hive.security.authorization.enabled</name>
>   <value>true</value>
> </property>
> <property>
>   <name>hive.security.authorization.manager</name>
>   <value>org.apache.hadoop.hive.ql.security.authorization.plugin.fallback.FallbackHiveAuthorizerFactory</value>
> </property>
> {code}
> 3) Open the Hive CLI and run the following query:
> {code:sql}
> insert overwrite local directory '/tmp/test_dir' row format delimited fields
> terminated by ',' select * from temp.test;
> {code}
> The query fails with the following exception:
> {code}
> FAILED: RuntimeException Cannot create staging directory
> 'hdfs:///tmp/test_dir/.hive-staging_hive_2019-03-28_11-51-05_319_5882446299335967521-1':
>  User testuser (user id 3456) has been denied access to create
> .hive-staging_hive_2019-03-28_11-51-05_319_5882446299335967521-1
> {code}
> The investigation shows that everything works if the properties mentioned
> above are removed from {{hive-site.xml}} and {{queryTmpdir}} is passed
> instead of {{dest_path}} to
> {{org.apache.hadoop.hive.ql.Context#getTempDirForPath()}}, as was done in
> Hive 2.1. The method is currently used in
> {{org.apache.hadoop.hive.ql.parse.SemanticAnalyzer}}: {{String statsTmpLoc
> = ctx.getTempDirForPath(dest_path).toString();}}





[jira] [Commented] (HIVE-21540) Query with join condition having date literal throws SemanticException.

2019-04-01 Thread Sankar Hariappan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16806591#comment-16806591
 ] 

Sankar Hariappan commented on HIVE-21540:
-

Thanks for the review [~kgyrtkirk]!
I will create another ticket to track TOK_TIMESTAMPLITERAL and 
TOK_TIMESTAMPLOCALTZLITERAL cases.

> Query with join condition having date literal throws SemanticException.
> ---
>
> Key: HIVE-21540
> URL: https://issues.apache.org/jira/browse/HIVE-21540
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 3.1.0, 3.1.1
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: Analyzer, DateField, pull-request-available
> Attachments: HIVE-21540.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This semantic exception is thrown for the following query. 
> *SemanticException '2019-03-20' encountered with 0 children*
> {code}
> create table date_1 (key int, dd date);
> create table date_2 (key int, dd date);
> select d1.key, d2.dd from(
>   select key, dd as start_dd, current_date as end_dd from date_1) d1
>   join date_2 as d2 on d1.key = d2.key where d2.dd between start_dd and 
> end_dd;
> {code}
> When the WHERE condition below is commented out, the query completes 
> successfully.
> where d2.dd between start_dd and end_dd
> 
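For reference, the same query shape is accepted by other SQL engines, which supports treating this as an analyzer bug rather than invalid SQL. A hedged demonstration using SQLite (not Hive; Hive's `current_date` is replaced by SQLite's `date('now')`, and dates are stored as ISO-8601 text):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    create table date_1 (key int, dd text);
    create table date_2 (key int, dd text);
    insert into date_1 values (1, '2019-03-01');
    insert into date_2 values (1, '2019-03-10');
""")

# Same structure as the failing Hive query: a derived date column from a
# subquery used in a BETWEEN predicate of the outer WHERE clause.
rows = con.execute("""
    select d1.key, d2.dd from
      (select key, dd as start_dd, date('now') as end_dd from date_1) d1
      join date_2 as d2 on d1.key = d2.key
      where d2.dd between start_dd and end_dd
""").fetchall()

print(rows)  # [(1, '2019-03-10')]
```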





[jira] [Work logged] (HIVE-21109) Stats replication for ACID tables.

2019-04-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21109?focusedWorklogId=221206&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221206
 ]

ASF GitHub Bot logged work on HIVE-21109:
-

Author: ASF GitHub Bot
Created on: 01/Apr/19 10:00
Start Date: 01/Apr/19 10:00
Worklog Time Spent: 10m 
  Work Description: ashutosh-bapat commented on pull request #579: 
HIVE-21109 : Support stats replication for ACID tables.
URL: https://github.com/apache/hive/pull/579#discussion_r270795354
 
 

 ##
 File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestStatsReplicationScenarios.java
 ##
 @@ -301,12 +338,106 @@ private String dumpLoadVerify(List<String> tableNames, 
String lastReplicationId,
 return dumpTuple.lastReplicationId;
   }
 
+  /**
+   * Run a bootstrap that will fail.
+   * @param tuple the location of bootstrap dump
+   */
+  private void failBootstrapLoad(WarehouseInstance.Tuple tuple, int 
failAfterNumTables) throws Throwable {
+// fail setting ckpt directory property for the second table so that we 
test the case when
+// bootstrap load fails after some but not all tables are loaded.
+BehaviourInjection<CallerArguments, Boolean> callerVerifier
+= new BehaviourInjection<CallerArguments, Boolean>() {
+  int cntTables = 0;
+  @Nullable
+  @Override
+  public Boolean apply(@Nullable CallerArguments args) {
+cntTables++;
+if (args.dbName.equalsIgnoreCase(replicatedDbName) && cntTables > 
failAfterNumTables) {
+  injectionPathCalled = true;
+  LOG.warn("Verifier - DB : " + args.dbName + " TABLE : " + 
args.tblName);
+  return false;
+}
+return true;
+  }
+};
+
+InjectableBehaviourObjectStore.setAlterTableModifier(callerVerifier);
+try {
+  replica.loadFailure(replicatedDbName, tuple.dumpLocation);
+  callerVerifier.assertInjectionsPerformed(true, false);
+} finally {
+  InjectableBehaviourObjectStore.resetAlterTableModifier();
+}
+  }
+
+  private void failIncrementalLoad(WarehouseInstance.Tuple dumpTuple, int 
failAfterNumEvents) throws Throwable {
+// fail add notification when updating table stats after given number of 
such events. Thus we
+// test successful application as well as failed application of this event.
+BehaviourInjection<NotificationEvent, Boolean> callerVerifier
+= new BehaviourInjection<NotificationEvent, Boolean>() {
+  int cntEvents = 0;
+  @Override
+  public Boolean apply(NotificationEvent entry) {
+cntEvents++;
 
 Review comment:
  This code has changed while working on another related comment. Again, we 
don't need to count the exact number of events; we need at least one successful 
event and one unsuccessful one.
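The review point above — asserting that at least one event succeeds and at least one fails, rather than counting exact events — can be sketched with a small self-contained verifier. This is a simplified stand-in for the test's {{BehaviourInjection}} pattern, not the actual Hive test code; all names here are illustrative:

```java
import java.util.function.Function;

/** Minimal sketch of a failure-injection verifier: it lets the first
 *  {@code failAfter} events through, fails every later one, and records
 *  whether both outcomes were actually observed. */
public class InjectionSketch {
  static final class Verifier implements Function<String, Boolean> {
    private final int failAfter;
    private int count = 0;
    boolean sawSuccess = false;
    boolean sawFailure = false;

    Verifier(int failAfter) { this.failAfter = failAfter; }

    @Override
    public Boolean apply(String event) {
      count++;
      if (count > failAfter) {
        sawFailure = true;   // injected failure path
        return false;
      }
      sawSuccess = true;     // normal path
      return true;
    }
  }

  /** Feed {@code totalEvents} events through a verifier that fails after
   *  {@code failAfter} of them; report whether both paths were exercised. */
  static boolean bothPathsExercised(int failAfter, int totalEvents) {
    Verifier v = new Verifier(failAfter);
    for (int i = 0; i < totalEvents; i++) {
      v.apply("event-" + i);
    }
    return v.sawSuccess && v.sawFailure;
  }

  public static void main(String[] args) {
    // With 1 allowed event out of 3, both the success and the failure
    // branch are observed -- which is all the test needs to assert.
    System.out.println(bothPathsExercised(1, 3));  // true
  }
}
```

The test then only asserts the two booleans instead of an exact event count, which keeps it robust when the number of replicated events changes.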
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 221206)
Time Spent: 10h 10m  (was: 10h)

> Stats replication for ACID tables.
> --
>
> Key: HIVE-21109
> URL: https://issues.apache.org/jira/browse/HIVE-21109
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21109.01.patch, HIVE-21109.02.patch, 
> HIVE-21109.03.patch, HIVE-21109.04.patch, HIVE-21109.05.patch, 
> HIVE-21109.06.patch, HIVE-21109.07.patch, HIVE-21109.08.patch
>
>  Time Spent: 10h 10m
>  Remaining Estimate: 0h
>
> Transactional tables require a writeID associated with the stats update. This 
> writeId needs to be in sync with the writeId on the source and hence needs to 
> be replicated from the source.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21509) LLAP may cache corrupted column vectors and return wrong query result

2019-04-01 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16806601#comment-16806601
 ] 

Hive QA commented on HIVE-21509:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
57s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
59s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
38s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
26s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
33s{color} | {color:blue} storage-api in master has 48 extant Findbugs 
warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
49s{color} | {color:blue} llap-server in master has 81 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
30s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
23s{color} | {color:red} llap-server in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
23s{color} | {color:red} llap-server in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 23s{color} 
| {color:red} llap-server in the patch failed. {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
15s{color} | {color:red} llap-server: The patch generated 4 new + 29 unchanged 
- 1 fixed = 33 total (was 30) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
22s{color} | {color:red} llap-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
15s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 17m 15s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-16793/dev-support/hive-personality.sh
 |
| git revision | master / 7bbd93f |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| mvninstall | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16793/yetus/patch-mvninstall-llap-server.txt
 |
| compile | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16793/yetus/patch-compile-llap-server.txt
 |
| javac | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16793/yetus/patch-compile-llap-server.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16793/yetus/diff-checkstyle-llap-server.txt
 |
| findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16793/yetus/patch-findbugs-llap-server.txt
 |
| modules | C: storage-api llap-server U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16793/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> LLAP may cache corrupted column vectors and return wrong query result
> -
>
> Key: HIVE-21509
> URL: https://issues.apache.org/jira/browse/HIVE-21509
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Adam Szita
>Assignee: Adam Szita
>Priority: Major
> Attachments: HIVE-21509.0.wip.patch, HIVE-21509.1.wip.patch, 
> HIVE-21509.2.patch, HIVE-21509.3.patch
>
>
> In some scenarios, LLAP might store column 

[jira] [Updated] (HIVE-21511) beeline -f report no such file if file is not on local fs

2019-04-01 Thread Zoltan Haindrich (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-21511:

Labels: patch todoc4.0  (was: patch)

> beeline -f report no such file if file is not on local fs
> -
>
> Key: HIVE-21511
> URL: https://issues.apache.org/jira/browse/HIVE-21511
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Reporter: Bruno Pusztahazi
>Assignee: Bruno Pusztahazi
>Priority: Blocker
>  Labels: patch, todoc4.0
> Attachments: HIVE-21511.1.patch, HIVE-21511.2.patch, 
> HIVE-21511.3.patch
>
>   Original Estimate: 0.05h
>  Remaining Estimate: 0.05h
>
> I tested like this:
> HQL=hdfs://hacluster/tmp/ff.hql
> if hadoop fs -test -f ${HQL}
> then
>    beeline -f ${HQL}
> fi
> The {{hadoop fs -test}} check on ${HQL} succeeds, but beeline reports "${HQL}: No such file or directory".
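As reported above, {{beeline -f}} resolves the path on the local filesystem. A common workaround is to stage the HDFS script locally first. This is a hedged sketch under that assumption: {{run_hql_from_hdfs}} is an illustrative helper (not part of beeline), the HDFS path is the reporter's example, and error handling is minimal:

```shell
# Stage an HDFS-hosted HQL script locally before running "beeline -f",
# which reads from the local filesystem only.
run_hql_from_hdfs() {
  hql="$1"
  if hadoop fs -test -f "$hql"; then
    local_hql=$(mktemp /tmp/staged.XXXXXX.hql)
    rm -f "$local_hql"                 # copyToLocal wants a fresh target
    hadoop fs -copyToLocal "$hql" "$local_hql" &&
      beeline -f "$local_hql"
    rc=$?
    rm -f "$local_hql"                 # clean up the staged copy
    return $rc
  fi
  echo "no such HDFS file: $hql" >&2
  return 1
}

# Example: run_hql_from_hdfs hdfs://hacluster/tmp/ff.hql
```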



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21532) RuntimeException due to AccessControlException during creating hive-staging-dir

2019-04-01 Thread Zoltan Haindrich (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16806615#comment-16806615
 ] 

Zoltan Haindrich commented on HIVE-21532:
-

Please submit the patch against master as well - it might also be affected (at first 
glance the patch seems to apply).
Could you please write a qtest for this issue? ({{git grep 
FallbackHiveAuthorizerFactory | grep q:}} gives some examples.)

> RuntimeException due to AccessControlException during creating 
> hive-staging-dir
> ---
>
> Key: HIVE-21532
> URL: https://issues.apache.org/jira/browse/HIVE-21532
> Project: Hive
>  Issue Type: Bug
>Reporter: Oleksandr Polishchuk
>Assignee: Oleksiy Sayankin
>Priority: Minor
> Attachments: HIVE-21532.1.patch, HIVE-21532.1.patch, 
> HIVE-21532.2.patch
>
>
> The bug was found in a Hive 2.3 environment.
> Steps leading to the exception:
> 1) Create a user without root permissions on your node.
> 2) The {{hive-site.xml}} file has to contain the following properties:
> {code:xml}
> <property>
>   <name>hive.security.authorization.enabled</name>
>   <value>true</value>
> </property>
> <property>
>   <name>hive.security.authorization.manager</name>
>   <value>org.apache.hadoop.hive.ql.security.authorization.plugin.fallback.FallbackHiveAuthorizerFactory</value>
> </property>
> {code}
> 3) Open the Hive CLI and run the following query:
> {code:sql}
> insert overwrite local directory '/tmp/test_dir' row format delimited fields 
> terminated by ',' select * from temp.test;
> {code}
> The previous query fails with the following exception:
> {code}
> FAILED: RuntimeException Cannot create staging directory 
> 'hdfs:///tmp/test_dir/.hive-staging_hive_2019-03-28_11-51-05_319_5882446299335967521-1':
>  User testuser (user id 3456) has been denied access to create 
> .hive-staging_hive_2019-03-28_11-51-05_319_5882446299335967521-1
> {code}
> The investigation shows that if the above-mentioned properties are deleted from 
> {{hive-site.xml}} and {{queryTmpdir}} is passed instead of {{dest_path}} to 
> {{org.apache.hadoop.hive.ql.Context#getTempDirForPath()}}, as in Hive 2.1, 
> everything works fine. The method is currently used in 
> {{org.apache.hadoop.hive.ql.parse.SemanticAnalyzer}}: {{String statsTmpLoc 
> = ctx.getTempDirForPath(dest_path).toString();}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21532) RuntimeException due to AccessControlException during creating hive-staging-dir

2019-04-01 Thread Oleksiy Sayankin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleksiy Sayankin updated HIVE-21532:

Status: In Progress  (was: Patch Available)

> RuntimeException due to AccessControlException during creating 
> hive-staging-dir
> ---
>
> Key: HIVE-21532
> URL: https://issues.apache.org/jira/browse/HIVE-21532
> Project: Hive
>  Issue Type: Bug
>Reporter: Oleksandr Polishchuk
>Assignee: Oleksiy Sayankin
>Priority: Minor
> Attachments: HIVE-21532.1.patch, HIVE-21532.1.patch, 
> HIVE-21532.2.patch, HIVE-21532.3.patch
>
>
> The bug was found in a Hive 2.3 environment.
> Steps leading to the exception:
> 1) Create a user without root permissions on your node.
> 2) The {{hive-site.xml}} file has to contain the following properties:
> {code:xml}
> <property>
>   <name>hive.security.authorization.enabled</name>
>   <value>true</value>
> </property>
> <property>
>   <name>hive.security.authorization.manager</name>
>   <value>org.apache.hadoop.hive.ql.security.authorization.plugin.fallback.FallbackHiveAuthorizerFactory</value>
> </property>
> {code}
> 3) Open the Hive CLI and run the following query:
> {code:sql}
> insert overwrite local directory '/tmp/test_dir' row format delimited fields 
> terminated by ',' select * from temp.test;
> {code}
> The previous query fails with the following exception:
> {code}
> FAILED: RuntimeException Cannot create staging directory 
> 'hdfs:///tmp/test_dir/.hive-staging_hive_2019-03-28_11-51-05_319_5882446299335967521-1':
>  User testuser (user id 3456) has been denied access to create 
> .hive-staging_hive_2019-03-28_11-51-05_319_5882446299335967521-1
> {code}
> The investigation shows that if the above-mentioned properties are deleted from 
> {{hive-site.xml}} and {{queryTmpdir}} is passed instead of {{dest_path}} to 
> {{org.apache.hadoop.hive.ql.Context#getTempDirForPath()}}, as in Hive 2.1, 
> everything works fine. The method is currently used in 
> {{org.apache.hadoop.hive.ql.parse.SemanticAnalyzer}}: {{String statsTmpLoc 
> = ctx.getTempDirForPath(dest_path).toString();}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21532) RuntimeException due to AccessControlException during creating hive-staging-dir

2019-04-01 Thread Oleksiy Sayankin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleksiy Sayankin updated HIVE-21532:

Attachment: HIVE-21532.3.patch

> RuntimeException due to AccessControlException during creating 
> hive-staging-dir
> ---
>
> Key: HIVE-21532
> URL: https://issues.apache.org/jira/browse/HIVE-21532
> Project: Hive
>  Issue Type: Bug
>Reporter: Oleksandr Polishchuk
>Assignee: Oleksiy Sayankin
>Priority: Minor
> Attachments: HIVE-21532.1.patch, HIVE-21532.1.patch, 
> HIVE-21532.2.patch, HIVE-21532.3.patch
>
>
> The bug was found in a Hive 2.3 environment.
> Steps leading to the exception:
> 1) Create a user without root permissions on your node.
> 2) The {{hive-site.xml}} file has to contain the following properties:
> {code:xml}
> <property>
>   <name>hive.security.authorization.enabled</name>
>   <value>true</value>
> </property>
> <property>
>   <name>hive.security.authorization.manager</name>
>   <value>org.apache.hadoop.hive.ql.security.authorization.plugin.fallback.FallbackHiveAuthorizerFactory</value>
> </property>
> {code}
> 3) Open the Hive CLI and run the following query:
> {code:sql}
> insert overwrite local directory '/tmp/test_dir' row format delimited fields 
> terminated by ',' select * from temp.test;
> {code}
> The previous query fails with the following exception:
> {code}
> FAILED: RuntimeException Cannot create staging directory 
> 'hdfs:///tmp/test_dir/.hive-staging_hive_2019-03-28_11-51-05_319_5882446299335967521-1':
>  User testuser (user id 3456) has been denied access to create 
> .hive-staging_hive_2019-03-28_11-51-05_319_5882446299335967521-1
> {code}
> The investigation shows that if the above-mentioned properties are deleted from 
> {{hive-site.xml}} and {{queryTmpdir}} is passed instead of {{dest_path}} to 
> {{org.apache.hadoop.hive.ql.Context#getTempDirForPath()}}, as in Hive 2.1, 
> everything works fine. The method is currently used in 
> {{org.apache.hadoop.hive.ql.parse.SemanticAnalyzer}}: {{String statsTmpLoc 
> = ctx.getTempDirForPath(dest_path).toString();}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21553) Upgrade derby version in standalone-metastore

2019-04-01 Thread Laszlo Bodor (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16806650#comment-16806650
 ] 

Laszlo Bodor commented on HIVE-21553:
-

Thanks for the review, [~kgyrtkirk]!
As this is just a Derby version change, I haven't tried integration tests with 
other DB vendors, and the basic unit tests should have run with the precommit 
ptest. I hope that's enough.

> Upgrade derby version in standalone-metastore
> -
>
> Key: HIVE-21553
> URL: https://issues.apache.org/jira/browse/HIVE-21553
> Project: Hive
>  Issue Type: Bug
>Reporter: Laszlo Bodor
>Assignee: Laszlo Bodor
>Priority: Major
> Attachments: HIVE-21553.01.patch
>
>
> HIVE-17506 decoupled standalone metastore poms from hive, then HIVE-18586 
> upgraded derby to 10.14.1.0 in hive's top level pom.xml. Maybe it could be 
> useful to synchronize them.
> Current versions
> hive: 10.14.1.0
> standalone-metastore: 10.10.2.0
> This idea came up while investigating some derby lock issues in HIVE-21550.
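One way to keep the two Derby versions in sync is to declare the version once in a shared parent pom and reference it from both modules. This is a sketch only — the property name and inheritance layout shown here are assumptions, not taken from the actual Hive poms:

```xml
<!-- parent pom.xml: declare the Derby version once as a property -->
<properties>
  <derby.version>10.14.1.0</derby.version>
</properties>

<!-- in each module (hive, standalone-metastore): reference the property
     so a single bump updates both -->
<dependency>
  <groupId>org.apache.derby</groupId>
  <artifactId>derby</artifactId>
  <version>${derby.version}</version>
</dependency>
```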



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (HIVE-21109) Stats replication for ACID tables.

2019-04-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21109?focusedWorklogId=221247&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221247
 ]

ASF GitHub Bot logged work on HIVE-21109:
-

Author: ASF GitHub Bot
Created on: 01/Apr/19 11:18
Start Date: 01/Apr/19 11:18
Worklog Time Spent: 10m 
  Work Description: sankarh commented on pull request #579: HIVE-21109 : 
Support stats replication for ACID tables.
URL: https://github.com/apache/hive/pull/579#discussion_r270820010
 
 

 ##
 File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestStatsReplicationScenarios.java
 ##
 @@ -301,12 +338,106 @@ private String dumpLoadVerify(List<String> tableNames, String lastReplicationId,
 return dumpTuple.lastReplicationId;
   }
 
+  /**
+   * Run a bootstrap that will fail.
+   * @param tuple the location of bootstrap dump
+   */
+  private void failBootstrapLoad(WarehouseInstance.Tuple tuple, int 
failAfterNumTables) throws Throwable {
+// fail setting ckpt directory property for the second table so that we 
test the case when
+// bootstrap load fails after some but not all tables are loaded.
+BehaviourInjection<CallerArguments, Boolean> callerVerifier
+= new BehaviourInjection<CallerArguments, Boolean>() {
+  int cntTables = 0;
+  @Nullable
+  @Override
+  public Boolean apply(@Nullable CallerArguments args) {
+cntTables++;
+if (args.dbName.equalsIgnoreCase(replicatedDbName) && cntTables > 
failAfterNumTables) {
+  injectionPathCalled = true;
+  LOG.warn("Verifier - DB : " + args.dbName + " TABLE : " + 
args.tblName);
+  return false;
+}
+return true;
+  }
+};
+
+InjectableBehaviourObjectStore.setAlterTableModifier(callerVerifier);
+try {
+  replica.loadFailure(replicatedDbName, tuple.dumpLocation);
+  callerVerifier.assertInjectionsPerformed(true, false);
+} finally {
+  InjectableBehaviourObjectStore.resetAlterTableModifier();
+}
+  }
+
+  private void failIncrementalLoad(WarehouseInstance.Tuple dumpTuple, int 
failAfterNumEvents) throws Throwable {
+// fail add notification when updating table stats after given number of 
such events. Thus we
+// test successful application as well as failed application of this event.
+BehaviourInjection<NotificationEvent, Boolean> callerVerifier
+= new BehaviourInjection<NotificationEvent, Boolean>() {
+  int cntEvents = 0;
+  @Override
+  public Boolean apply(NotificationEvent entry) {
+cntEvents++;
 
 Review comment:
  OK. Please update the test to fail on the 2nd update-stats event.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 221247)
Time Spent: 11h 40m  (was: 11.5h)

> Stats replication for ACID tables.
> --
>
> Key: HIVE-21109
> URL: https://issues.apache.org/jira/browse/HIVE-21109
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21109.01.patch, HIVE-21109.02.patch, 
> HIVE-21109.03.patch, HIVE-21109.04.patch, HIVE-21109.05.patch, 
> HIVE-21109.06.patch, HIVE-21109.07.patch, HIVE-21109.08.patch
>
>  Time Spent: 11h 40m
>  Remaining Estimate: 0h
>
> Transactional tables require a writeID associated with the stats update. This 
> writeId needs to be in sync with the writeId on the source and hence needs to 
> be replicated from the source.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (HIVE-21109) Stats replication for ACID tables.

2019-04-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21109?focusedWorklogId=221246&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221246
 ]

ASF GitHub Bot logged work on HIVE-21109:
-

Author: ASF GitHub Bot
Created on: 01/Apr/19 11:18
Start Date: 01/Apr/19 11:18
Worklog Time Spent: 10m 
  Work Description: sankarh commented on pull request #579: HIVE-21109 : 
Support stats replication for ACID tables.
URL: https://github.com/apache/hive/pull/579#discussion_r270820010
 
 

 ##
 File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestStatsReplicationScenarios.java
 ##
 @@ -301,12 +338,106 @@ private String dumpLoadVerify(List<String> tableNames, String lastReplicationId,
 return dumpTuple.lastReplicationId;
   }
 
+  /**
+   * Run a bootstrap that will fail.
+   * @param tuple the location of bootstrap dump
+   */
+  private void failBootstrapLoad(WarehouseInstance.Tuple tuple, int 
failAfterNumTables) throws Throwable {
+// fail setting ckpt directory property for the second table so that we 
test the case when
+// bootstrap load fails after some but not all tables are loaded.
+BehaviourInjection<CallerArguments, Boolean> callerVerifier
+= new BehaviourInjection<CallerArguments, Boolean>() {
+  int cntTables = 0;
+  @Nullable
+  @Override
+  public Boolean apply(@Nullable CallerArguments args) {
+cntTables++;
+if (args.dbName.equalsIgnoreCase(replicatedDbName) && cntTables > 
failAfterNumTables) {
+  injectionPathCalled = true;
+  LOG.warn("Verifier - DB : " + args.dbName + " TABLE : " + 
args.tblName);
+  return false;
+}
+return true;
+  }
+};
+
+InjectableBehaviourObjectStore.setAlterTableModifier(callerVerifier);
+try {
+  replica.loadFailure(replicatedDbName, tuple.dumpLocation);
+  callerVerifier.assertInjectionsPerformed(true, false);
+} finally {
+  InjectableBehaviourObjectStore.resetAlterTableModifier();
+}
+  }
+
+  private void failIncrementalLoad(WarehouseInstance.Tuple dumpTuple, int 
failAfterNumEvents) throws Throwable {
+// fail add notification when updating table stats after given number of 
such events. Thus we
+// test successful application as well as failed application of this event.
+BehaviourInjection<NotificationEvent, Boolean> callerVerifier
+= new BehaviourInjection<NotificationEvent, Boolean>() {
+  int cntEvents = 0;
+  @Override
+  public Boolean apply(NotificationEvent entry) {
+cntEvents++;
 
 Review comment:
   OK
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 221246)
Time Spent: 11.5h  (was: 11h 20m)

> Stats replication for ACID tables.
> --
>
> Key: HIVE-21109
> URL: https://issues.apache.org/jira/browse/HIVE-21109
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21109.01.patch, HIVE-21109.02.patch, 
> HIVE-21109.03.patch, HIVE-21109.04.patch, HIVE-21109.05.patch, 
> HIVE-21109.06.patch, HIVE-21109.07.patch, HIVE-21109.08.patch
>
>  Time Spent: 11.5h
>  Remaining Estimate: 0h
>
> Transactional tables require a writeID associated with the stats update. This 
> writeId needs to be in sync with the writeId on the source and hence needs to 
> be replicated from the source.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (HIVE-21109) Stats replication for ACID tables.

2019-04-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21109?focusedWorklogId=221244&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221244
 ]

ASF GitHub Bot logged work on HIVE-21109:
-

Author: ASF GitHub Bot
Created on: 01/Apr/19 11:17
Start Date: 01/Apr/19 11:17
Worklog Time Spent: 10m 
  Work Description: sankarh commented on pull request #579: HIVE-21109 : 
Support stats replication for ACID tables.
URL: https://github.com/apache/hive/pull/579#discussion_r270819437
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
 ##
 @@ -2950,21 +2957,32 @@ public Partition createPartition(Table tbl, Map<String, String> partSpec) throws
  int size = addPartitionDesc.getPartitionCount();
  List<org.apache.hadoop.hive.metastore.api.Partition> in =
  new ArrayList<>(size);
-AcidUtils.TableSnapshot tableSnapshot = AcidUtils.getTableSnapshot(conf, 
tbl, true);
 long writeId;
 String validWriteIdList;
-if (tableSnapshot != null && tableSnapshot.getWriteId() > 0) {
-  writeId = tableSnapshot.getWriteId();
-  validWriteIdList = tableSnapshot.getValidWriteIdList();
+
+// In case of replication, get the writeId from the source and use valid 
write Id list
+// for replication.
+if (addPartitionDesc.getReplicationSpec().isInReplicationScope() &&
+addPartitionDesc.getPartition(0).getWriteId() > 0) {
 
 Review comment:
   I think this assumption that the table type is non-transactional (based on 
writeId=0), while ignoring the failure case, is not right.
   We can explicitly check whether the table is transactional and then do the 
necessary checks. If writeId comes back as 0 for a transactional table, then it 
is an error flow. 
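The suggested check can be sketched as follows. This is a hedged, self-contained illustration of the branching logic only, not Hive's actual {{createPartition}} code; {{resolveWriteId}} and its parameters are placeholder names:

```java
/** Sketch: choose the writeId for a replicated partition, treating
 *  writeId == 0 on a transactional table as an error instead of silently
 *  assuming the table is non-transactional. */
public class WriteIdCheckSketch {
  static long resolveWriteId(boolean isTransactional, boolean inReplicationScope,
                             long sourceWriteId) {
    if (!isTransactional) {
      return 0L;                       // non-ACID tables carry no writeId
    }
    if (inReplicationScope) {
      if (sourceWriteId <= 0) {
        // For an ACID table a missing source writeId is an error flow,
        // not a signal that the table is non-transactional.
        throw new IllegalStateException(
            "transactional table replicated without a writeId");
      }
      return sourceWriteId;            // reuse the writeId from the source
    }
    // Outside replication a fresh writeId would be allocated; a placeholder
    // is returned here since txn allocation is out of scope for the sketch.
    return 1L;
  }

  public static void main(String[] args) {
    System.out.println(resolveWriteId(true, true, 42L));
  }
}
```

Checking the table type first makes the writeId=0 case unambiguous: it can only mean a broken replication flow, never a valid non-transactional table.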
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 221244)
Time Spent: 11h 10m  (was: 11h)

> Stats replication for ACID tables.
> --
>
> Key: HIVE-21109
> URL: https://issues.apache.org/jira/browse/HIVE-21109
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21109.01.patch, HIVE-21109.02.patch, 
> HIVE-21109.03.patch, HIVE-21109.04.patch, HIVE-21109.05.patch, 
> HIVE-21109.06.patch, HIVE-21109.07.patch, HIVE-21109.08.patch
>
>  Time Spent: 11h 10m
>  Remaining Estimate: 0h
>
> Transactional tables require a writeID associated with the stats update. This 
> writeId needs to be in sync with the writeId on the source and hence needs to 
> be replicated from the source.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (HIVE-21109) Stats replication for ACID tables.

2019-04-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21109?focusedWorklogId=221245&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221245
 ]

ASF GitHub Bot logged work on HIVE-21109:
-

Author: ASF GitHub Bot
Created on: 01/Apr/19 11:17
Start Date: 01/Apr/19 11:17
Worklog Time Spent: 10m 
  Work Description: sankarh commented on pull request #579: HIVE-21109 : 
Support stats replication for ACID tables.
URL: https://github.com/apache/hive/pull/579#discussion_r270819769
 
 

 ##
 File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestStatsReplicationScenarios.java
 ##
 @@ -301,12 +338,106 @@ private String dumpLoadVerify(List<String> tableNames, String lastReplicationId,
 return dumpTuple.lastReplicationId;
   }
 
+  /**
+   * Run a bootstrap that will fail.
+   * @param tuple the location of bootstrap dump
+   */
+  private void failBootstrapLoad(WarehouseInstance.Tuple tuple, int 
failAfterNumTables) throws Throwable {
+// fail setting ckpt directory property for the second table so that we 
test the case when
+// bootstrap load fails after some but not all tables are loaded.
+BehaviourInjection<CallerArguments, Boolean> callerVerifier
+= new BehaviourInjection<CallerArguments, Boolean>() {
+  int cntTables = 0;
+  @Nullable
+  @Override
+  public Boolean apply(@Nullable CallerArguments args) {
+cntTables++;
+if (args.dbName.equalsIgnoreCase(replicatedDbName) && cntTables > 
failAfterNumTables) {
+  injectionPathCalled = true;
+  LOG.warn("Verifier - DB : " + args.dbName + " TABLE : " + 
args.tblName);
+  return false;
+}
+return true;
+  }
+};
+
+InjectableBehaviourObjectStore.setAlterTableModifier(callerVerifier);
+try {
+  replica.loadFailure(replicatedDbName, tuple.dumpLocation);
+  callerVerifier.assertInjectionsPerformed(true, false);
+} finally {
+  InjectableBehaviourObjectStore.resetAlterTableModifier();
+}
 
 Review comment:
   Ok
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 221245)
Time Spent: 11h 20m  (was: 11h 10m)

> Stats replication for ACID tables.
> --
>
> Key: HIVE-21109
> URL: https://issues.apache.org/jira/browse/HIVE-21109
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21109.01.patch, HIVE-21109.02.patch, 
> HIVE-21109.03.patch, HIVE-21109.04.patch, HIVE-21109.05.patch, 
> HIVE-21109.06.patch, HIVE-21109.07.patch, HIVE-21109.08.patch
>
>  Time Spent: 11h 20m
>  Remaining Estimate: 0h
>
> Transactional tables require a writeID associated with the stats update. This 
> writeId needs to be in sync with the writeId on the source and hence needs to 
> be replicated from the source.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

