[jira] [Commented] (HIVE-21593) Break up DDLTask - extract Privilege related operations

2019-04-09 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16814065#comment-16814065
 ] 

Hive QA commented on HIVE-21593:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12965310/HIVE-21593.01.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 17 failed/errored test(s), 15896 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_3] 
(batchId=109)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] 
(batchId=109)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_fail_1]
 (batchId=99)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_fail_8]
 (batchId=100)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_grant_group]
 (batchId=99)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_grant_table_allpriv]
 (batchId=100)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_grant_table_dup]
 (batchId=99)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_grant_table_fail1]
 (batchId=100)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_grant_table_fail_nogrant]
 (batchId=99)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_invalid_priv_v2]
 (batchId=100)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_revoke_table_fail1]
 (batchId=99)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_revoke_table_fail2]
 (batchId=99)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_show_grant_otheruser_all]
 (batchId=100)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_show_grant_otheruser_alltabs]
 (batchId=99)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_show_grant_otheruser_wtab]
 (batchId=100)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_show_role_principals_no_admin]
 (batchId=100)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_show_roles_no_admin]
 (batchId=99)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16905/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16905/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16905/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 17 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12965310 - PreCommit-HIVE-Build

> Break up DDLTask - extract Privilege related operations
> ---
>
> Key: HIVE-21593
> URL: https://issues.apache.org/jira/browse/HIVE-21593
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: refactor-ddl
> Fix For: 4.0.0
>
> Attachments: HIVE-21593.01.patch
>
>
> DDLTask is a huge class, more than 5000 lines long. The related DDLWork is 
> also a huge class, with a field for each DDL operation it supports. The 
> goal is to refactor these so that everything is cut into smaller, more 
> manageable classes under the package org.apache.hadoop.hive.ql.exec.ddl:
>  * have a separate class for each operation
>  * have a package for each operation group (database DDL, table DDL, etc.), so 
> the number of classes under a package stays manageable
>  * make all the requests (DDLDesc subclasses) immutable
>  * DDLTask should be agnostic to the actual operations
>  * for now, ignore the fact that some operations handled by DDLTask are not 
> actual DDL operations (lock, unlock, desc...)
> In the interim, while there are two DDLTask and DDLWork classes in the 
> code base, the new ones in the new package are called DDLTask2 and DDLWork2, 
> avoiding fully qualified class names where both the old and the new classes 
> are in use.
> Step #4: extract all the privilege-related operations from the old DDLTask 
> and move them under the new package.
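The refactoring direction above can be sketched as a pair of small classes: an immutable desc per operation plus a self-contained operation class that a generic task dispatches to. All names below are illustrative assumptions, not the actual contents of HIVE-21593.01.patch:

```java
import java.io.Serializable;
import java.util.List;

// An immutable request object: one small desc per DDL operation instead of
// one field per operation on a monolithic DDLWork. Names are hypothetical.
final class ShowGrantDesc implements Serializable {
  private final String principal;
  private final List<String> privileges;

  ShowGrantDesc(String principal, List<String> privileges) {
    this.principal = principal;
    this.privileges = List.copyOf(privileges); // defensive, immutable copy
  }

  String getPrincipal() { return principal; }
  List<String> getPrivileges() { return privileges; }
}

// A generic task can dispatch on the desc type and stay agnostic of the
// concrete operation, as the description asks.
interface DdlOperation<T extends Serializable> {
  int execute(T desc); // 0 = success, mirroring Hive's Task return convention
}

final class ShowGrantOperation implements DdlOperation<ShowGrantDesc> {
  @Override
  public int execute(ShowGrantDesc desc) {
    System.out.println("SHOW GRANT for principal " + desc.getPrincipal());
    return 0;
  }
}
```

One class per operation keeps each file short, and immutability of the desc means it can be handed between compiler and executor without defensive copying.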



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21500) Replicate conversion of managed table to external at source.

2019-04-09 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-21500:

Description: 
Couple of scenarios for Hive2 to Hive3(strict managed tables enabled) 
replication where managed table is converted to external at source. 
*Scenario-1: (ACID/MM table converted to external at target)*
1. Create non-ACID ORC format table.
2. Insert some rows
3. Replicate this create event which creates ACID table at target (due to 
migration rule). Each insert event adds transactional metadata in HMS 
corresponding to the current table.
4. Convert table to external table using ALTER command at source.

*Scenario-2: (External table at target changes table location)*
1. Create non-ACID avro format table.
2. Insert some rows
3. Replicate this create event which creates external table at target (due to 
migration rule). The data path is chosen under default external warehouse 
directory.
4. Convert table to external table using ALTER command at source.

It is unable to convert an ACID table to external table at target. Also, it is 
hard to detect what would be the table type at target when perform this ALTER 
table operation at source.
So, it is decided to disable conversion of managed table at source (Hive2) to 
EXTERNAL or vice-versa.
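The decision in the last paragraph amounts to a table-type guard at the replication source. A minimal sketch under a simplified two-value table type; the class, method, and enum names are assumptions for illustration, not the actual Hive change:

```java
// Simplified sketch: reject ALTER operations that flip a table between
// managed and external when the source is under replication.
enum TableType { MANAGED_TABLE, EXTERNAL_TABLE }

final class ReplicationAlterGuard {
  /**
   * Returns true when the requested type change is allowed on a source
   * under replication. Conversion in either direction
   * (managed -> external or external -> managed) is disabled.
   */
  static boolean allowedUnderReplication(TableType oldType, TableType newType) {
    return oldType == newType;
  }
}
```

The real check would live in Hive's alter-table handling and raise a semantic-analysis error rather than return a boolean; the sketch only captures the rule.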

  was:
A couple of scenarios for Hive2 to Hive3 (strict managed tables enabled) 
replication where a managed table is converted to external at the source. 
*Scenario-1: (ACID/MM table converted to external at target)*
1. Create a non-ACID ORC format table.
2. Insert some rows.
3. Replicate the create event, which creates an ACID table at the target (due 
to the migration rule). Each insert event adds transactional metadata in HMS 
for the current table.
4. Convert the table to an external table using an ALTER command at the source.
5. Replicating this alter event should convert the ACID table to an external 
table and make sure the corresponding metadata is removed.

*Scenario-2: (External table at target changes table location)*
1. Create a non-ACID Avro format table.
2. Insert some rows.
3. Replicate the create event, which creates an external table at the target 
(due to the migration rule). The data path is chosen under the default 
external warehouse directory.
4. Convert the table to an external table using an ALTER command at the source.
5. Replicating this alter event should update the table/partition locations as 
the data moves under the external tables base directory.


> Replicate conversion of managed table to external at source.
> 
>
> Key: HIVE-21500
> URL: https://issues.apache.org/jira/browse/HIVE-21500
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: DR, Replication
>
> A couple of scenarios for Hive2 to Hive3 (strict managed tables enabled) 
> replication where a managed table is converted to external at the source. 
> *Scenario-1: (ACID/MM table converted to external at target)*
> 1. Create a non-ACID ORC format table.
> 2. Insert some rows.
> 3. Replicate the create event, which creates an ACID table at the target 
> (due to the migration rule). Each insert event adds transactional metadata 
> in HMS for the current table.
> 4. Convert the table to an external table using an ALTER command at the source.
> *Scenario-2: (External table at target changes table location)*
> 1. Create a non-ACID Avro format table.
> 2. Insert some rows.
> 3. Replicate the create event, which creates an external table at the target 
> (due to the migration rule). The data path is chosen under the default 
> external warehouse directory.
> 4. Convert the table to an external table using an ALTER command at the source.
> It is not possible to convert an ACID table to an external table at the 
> target. It is also hard to detect what the table type will be at the target 
> when performing this ALTER TABLE operation at the source.
> So, it was decided to disable conversion of a managed table to EXTERNAL (or 
> vice versa) at the source (Hive2).



--


[jira] [Updated] (HIVE-21500) Disable conversion of managed table to external at source.

2019-04-09 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-21500:

Summary: Disable conversion of managed table to external at source.  (was: 
Replicate conversion of managed table to external at source.)

> Disable conversion of managed table to external at source.
> --
>
> Key: HIVE-21500
> URL: https://issues.apache.org/jira/browse/HIVE-21500
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: DR, Replication
>
> A couple of scenarios for Hive2 to Hive3 (strict managed tables enabled) 
> replication where a managed table is converted to external at the source. 
> *Scenario-1: (ACID/MM table converted to external at target)*
> 1. Create a non-ACID ORC format table.
> 2. Insert some rows.
> 3. Replicate the create event, which creates an ACID table at the target 
> (due to the migration rule). Each insert event adds transactional metadata 
> in HMS for the current table.
> 4. Convert the table to an external table using an ALTER command at the source.
> *Scenario-2: (External table at target changes table location)*
> 1. Create a non-ACID Avro format table.
> 2. Insert some rows.
> 3. Replicate the create event, which creates an external table at the target 
> (due to the migration rule). The data path is chosen under the default 
> external warehouse directory.
> 4. Convert the table to an external table using an ALTER command at the source.
> It is not possible to convert an ACID table to an external table at the 
> target. It is also hard to detect what the table type will be at the target 
> when performing this ALTER TABLE operation at the source.
> So, it was decided to disable conversion of a managed table to EXTERNAL (or 
> vice versa) at the source (Hive2).



--


[jira] [Commented] (HIVE-21593) Break up DDLTask - extract Privilege related operations

2019-04-09 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16814038#comment-16814038
 ] 

Hive QA commented on HIVE-21593:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
1s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
35s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
6s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
43s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
3s{color} | {color:blue} ql in master has 2258 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
1s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
6s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
42s{color} | {color:red} ql: The patch generated 10 new + 422 unchanged - 31 
fixed = 432 total (was 453) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 6 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  4m 
18s{color} | {color:red} ql generated 2 new + 2249 unchanged - 9 fixed = 2251 
total (was 2258) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
15s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 25m  7s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:ql |
|  |  Class org.apache.hadoop.hive.ql.ddl.privilege.GrantDesc defines 
non-transient non-serializable instance field privilegeSubject  In 
GrantDesc.java:instance field privilegeSubject  In GrantDesc.java |
|  |  Class org.apache.hadoop.hive.ql.ddl.privilege.RevokeDesc defines 
non-transient non-serializable instance field privilegeSubject  In 
RevokeDesc.java:instance field privilegeSubject  In RevokeDesc.java |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-16905/dev-support/hive-personality.sh
 |
| git revision | master / 928f3d6 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16905/yetus/diff-checkstyle-ql.txt
 |
| whitespace | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16905/yetus/whitespace-eol.txt
 |
| findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16905/yetus/new-findbugs-ql.html
 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16905/yetus/patch-asflicense-problems.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16905/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Break up DDLTask - extract Privilege related operations
> ---
>
> Key: HIVE-21593
> URL: https://issues.apache.org/jira/browse/HIVE-21593
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: refactor-ddl
> Fix For: 4.0.0
>
> Attachments: HIVE-21593.01.patch
>
>
> DDLTask is a huge class, more than 5000 lines long. The related DDLWork is 
> also a huge class, which has a 

[jira] [Commented] (HIVE-20968) Support conversion of managed to external where location set was not owned by hive

2019-04-09 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16814027#comment-16814027
 ] 

Hive QA commented on HIVE-20968:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12965306/HIVE-20968.01.patch

{color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 532 failed/errored test(s), 15783 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[escape_comments] 
(batchId=275)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_stats4] (batchId=66)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_view_delete] 
(batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_partition_format_loc]
 (batchId=58)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_location] 
(batchId=73)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_stats_status]
 (batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_view_as_select] 
(batchId=11)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_view_col_type] 
(batchId=4)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_view_rename] 
(batchId=41)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_8] 
(batchId=11)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_cli_createtab]
 (batchId=31)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_cli_createtab_noauthzapi]
 (batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_owner_actions]
 (batchId=6)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_view_1] 
(batchId=21)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_view_2] 
(batchId=66)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_view_3] 
(batchId=37)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_view_4] 
(batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_view_disable_cbo_1]
 (batchId=77)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_view_disable_cbo_2]
 (batchId=26)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_view_disable_cbo_3]
 (batchId=10)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_view_disable_cbo_4]
 (batchId=13)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_const] (batchId=19)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_subq_exists] 
(batchId=84)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_union_view] 
(batchId=23)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[column_pruning_partitioned_view]
 (batchId=66)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[concat_op] (batchId=81)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_big_view] 
(batchId=85)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_like_tbl_props] 
(batchId=80)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_like_view] 
(batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_or_replace_view] 
(batchId=43)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view] (batchId=44)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view_defaultformats]
 (batchId=84)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view_partitioned] 
(batchId=40)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view_translate] 
(batchId=97)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ctas_char] (batchId=20)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ctas_date] (batchId=1)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ctas_varchar] 
(batchId=18)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cteViews] (batchId=84)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cte_2] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cte_4] (batchId=92)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[database_drop] 
(batchId=65)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[dbtxnmgr_ddl1] 
(batchId=87)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[dbtxnmgr_query5] 
(batchId=28)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[describe_formatted_view_partitioned]
 (batchId=84)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[describe_formatted_view_partitioned_json]
 (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[escape_comments] 
(batchId=83)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[explain_ddl] (batchId=52)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[explain_dependency] 
(batchId=46)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[explain_logical] 
(batchId=71)

[jira] [Commented] (HIVE-20968) Support conversion of managed to external where location set was not owned by hive

2019-04-09 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16814025#comment-16814025
 ] 

Hive QA commented on HIVE-20968:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
41s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
10s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
51s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
19s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  2m 
41s{color} | {color:blue} standalone-metastore/metastore-common in master has 
29 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
36s{color} | {color:blue} common in master has 63 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  1m 
10s{color} | {color:blue} standalone-metastore/metastore-server in master has 
179 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
8s{color} | {color:blue} ql in master has 2258 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
29s{color} | {color:blue} hcatalog/server-extensions in master has 3 extant 
Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
28s{color} | {color:blue} hcatalog/webhcat/java-client in master has 3 extant 
Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
42s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m 
30s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
36s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  3m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
11s{color} | {color:green} The patch metastore-common passed checkstyle {color} 
|
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
16s{color} | {color:green} The patch common passed checkstyle {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
19s{color} | {color:red} standalone-metastore/metastore-server: The patch 
generated 41 new + 97 unchanged - 32 fixed = 138 total (was 129) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
43s{color} | {color:red} ql: The patch generated 2 new + 190 unchanged - 0 
fixed = 192 total (was 190) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
13s{color} | {color:red} hcatalog/server-extensions: The patch generated 9 new 
+ 118 unchanged - 6 fixed = 127 total (was 124) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
12s{color} | {color:green} hcatalog/webhcat/java-client: The patch generated 0 
new + 62 unchanged - 1 fixed = 62 total (was 63) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
23s{color} | {color:red} itests/hive-unit: The patch generated 5 new + 633 
unchanged - 0 fixed = 638 total (was 633) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 11m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m 
25s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | 

[jira] [Commented] (HIVE-20901) running compactor when there is nothing to do produces duplicate data

2019-04-09 Thread Abhishek Somani (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16814022#comment-16814022
 ] 

Abhishek Somani commented on HIVE-20901:


Looks like HIVE-9995 got merged yesterday with a fix for this; I didn't 
realize it was being worked on.

> running compactor when there is nothing to do produces duplicate data
> -
>
> Key: HIVE-20901
> URL: https://issues.apache.org/jira/browse/HIVE-20901
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 4.0.0
>Reporter: Eugene Koifman
>Assignee: Abhishek Somani
>Priority: Major
> Attachments: HIVE-20901.1.patch, HIVE-20901.2.patch
>
>
> Suppose we run minor compaction twice via ALTER TABLE.
> The second compaction request should have nothing to do, but I don't think 
> there is a check for that. It is visible in the context of HIVE-20823, where 
> each compactor run produces a delta with a new visibility suffix, so we end 
> up with something like
> {noformat}
> target/tmp/org.apache.hadoop.hive.ql.TestTxnCommands3-1541810844849/warehouse/t/
> ├── delete_delta_001_002_v019
> │   ├── _orc_acid_version
> │   └── bucket_0
> ├── delete_delta_001_002_v021
> │   ├── _orc_acid_version
> │   └── bucket_0
> ├── delta_001_001_
> │   ├── _orc_acid_version
> │   └── bucket_0
> ├── delta_001_002_v019
> │   ├── _orc_acid_version
> │   └── bucket_0
> ├── delta_001_002_v021
> │   ├── _orc_acid_version
> │   └── bucket_0
> └── delta_002_002_
>     ├── _orc_acid_version
>     └── bucket_0{noformat}
> i.e. two deltas with the same write ID range, which is bad. This probably 
> happens today as well, but the new run produces a delta with the same name 
> and clobbers the previous one, which may interfere with writers.
>  
> Needs investigation.
>  
> -The issue (I think) is that {{AcidUtils.getAcidState()}} then returns both 
> deltas as if they were distinct, effectively duplicating data.-  There 
> is no data duplication: {{getAcidState()}} will not use two deltas with the 
> same {{writeid}} range.
>  
>  
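The missing guard discussed above amounts to a duplicate-range check before launching a compaction. A hedged sketch with illustrative names only; the real fix (per the comment, merged elsewhere) may look quite different:

```java
import java.util.List;

// Sketch only: skip a compaction whose resulting delta would cover a
// write-ID range that an existing delta already covers. Class and method
// names are assumptions, not Hive's actual compactor code.
final class CompactionCheck {
  static final class WriteIdRange {
    final long min;
    final long max;
    WriteIdRange(long min, long max) { this.min = min; this.max = max; }
  }

  /** Returns false when an existing delta already spans the proposed range. */
  static boolean compactionNeeded(List<WriteIdRange> existingDeltas, WriteIdRange proposed) {
    for (WriteIdRange d : existingDeltas) {
      if (d.min == proposed.min && d.max == proposed.max) {
        return false; // a second run over the same range has nothing to do
      }
    }
    return true;
  }
}
```

With such a check, the second `ALTER TABLE ... COMPACT` in the scenario above would become a no-op instead of writing a twin delta with a new visibility suffix.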



--


[jira] [Commented] (HIVE-21291) Restore historical way of handling timestamps in Avro while keeping the new semantics at the same time

2019-04-09 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16814005#comment-16814005
 ] 

Hive QA commented on HIVE-21291:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
51s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
10s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
50s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
 0s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
41s{color} | {color:blue} serde in master has 197 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
9s{color} | {color:blue} ql in master has 2258 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  8m  
2s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
23s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  7m 
55s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 66m 43s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-16903/dev-support/hive-personality.sh
 |
| git revision | master / 928f3d6 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: serde ql . U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16903/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Restore historical way of handling timestamps in Avro while keeping the new 
> semantics at the same time
> --
>
> Key: HIVE-21291
> URL: https://issues.apache.org/jira/browse/HIVE-21291
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Ivanfi
>Assignee: Karen Coppage
>Priority: Major
> Attachments: HIVE-21291.1.patch, HIVE-21291.2.patch, 
> HIVE-21291.3.patch, HIVE-21291.4.patch, HIVE-21291.4.patch
>
>
> This sub-task is for implementing the Avro-specific parts of the following 
> plan:
> h1. Problem
> Historically, the semantics of the TIMESTAMP type in Hive depended on the 
> file format. Timestamps in Avro, Parquet and RCFiles with a binary SerDe had 
> _Instant_ semantics, while timestamps in ORC, textfiles and RCFiles with a 
> text SerDe had _LocalDateTime_ semantics.
> The Hive community wanted to get rid of this inconsistency and have 
> _LocalDateTime_ semantics in Avro, Parquet and RCFiles with a binary SerDe as 
> well. *Hive 3.1 turned off normalization to UTC* to achieve this. While this 
> leads to the desired new semantics, it also leads to incorrect results when 
> new Hive versions read timestamps written by old Hive versions or when old 
> Hive versions or any other component not aware of this change (including 
> legacy Impala and Spark versions) read timestamps written by new Hive 
> versions.

[jira] [Commented] (HIVE-21291) Restore historical way of handling timestamps in Avro while keeping the new semantics at the same time

2019-04-09 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16814004#comment-16814004
 ] 

Hive QA commented on HIVE-21291:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12965307/HIVE-21291.4.patch

{color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 15899 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16903/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16903/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16903/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12965307 - PreCommit-HIVE-Build

> Restore historical way of handling timestamps in Avro while keeping the new 
> semantics at the same time
> --
>
> Key: HIVE-21291
> URL: https://issues.apache.org/jira/browse/HIVE-21291
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Ivanfi
>Assignee: Karen Coppage
>Priority: Major
> Attachments: HIVE-21291.1.patch, HIVE-21291.2.patch, 
> HIVE-21291.3.patch, HIVE-21291.4.patch, HIVE-21291.4.patch
>
>
> This sub-task is for implementing the Avro-specific parts of the following 
> plan:
> h1. Problem
> Historically, the semantics of the TIMESTAMP type in Hive depended on the 
> file format. Timestamps in Avro, Parquet and RCFiles with a binary SerDe had 
> _Instant_ semantics, while timestamps in ORC, textfiles and RCFiles with a 
> text SerDe had _LocalDateTime_ semantics.
> The Hive community wanted to get rid of this inconsistency and have 
> _LocalDateTime_ semantics in Avro, Parquet and RCFiles with a binary SerDe as 
> well. *Hive 3.1 turned off normalization to UTC* to achieve this. While this 
> leads to the desired new semantics, it also leads to incorrect results when 
> new Hive versions read timestamps written by old Hive versions or when old 
> Hive versions or any other component not aware of this change (including 
> legacy Impala and Spark versions) read timestamps written by new Hive 
> versions.
> h1. Solution
> To work around this issue, Hive *should restore the practice of normalizing 
> to UTC* when writing timestamps to Avro, Parquet and RCFiles with a binary 
> SerDe. In itself, this would restore the historical _Instant_ semantics, 
> which is undesirable. In order to achieve the desired _LocalDateTime_ 
> semantics in spite of normalizing to UTC, newer Hive versions should record 
> the session-local local time zone in the file metadata fields serving 
> arbitrary key-value storage purposes.
> When reading back files with this time zone metadata, newer Hive versions (or 
> any other new component aware of this extra metadata) can achieve 
> _LocalDateTime_ semantics by *converting from UTC to the saved time zone 
> (instead of to the local time zone)*. Legacy components that are unaware of 
> the new metadata can read the files without any problem and the timestamps 
> will show the historical Instant behaviour to them.
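
The write/read convention described above can be sketched as follows. This is a 
hypothetical illustration using plain {{java.time}}, not Hive's actual SerDe 
code; the metadata handling (recording and later supplying the writer's zone 
id) is assumed here, since the patch stores it in the file's key-value 
metadata.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;

public class TimestampConversionSketch {
    // Writer: normalize a session-local timestamp to UTC (microseconds since
    // epoch) before storing, and record the session zone id in file metadata.
    static long writeMicros(LocalDateTime sessionLocal, ZoneId sessionZone) {
        Instant instant = sessionLocal.atZone(sessionZone).toInstant();
        return instant.getEpochSecond() * 1_000_000L + instant.getNano() / 1_000L;
    }

    // Reader: convert the stored UTC instant back using the *saved* zone
    // (not the reader's local zone) to recover LocalDateTime semantics.
    static LocalDateTime readMicros(long micros, ZoneId savedZone) {
        Instant instant = Instant.ofEpochSecond(
            Math.floorDiv(micros, 1_000_000L),
            Math.floorMod(micros, 1_000_000L) * 1_000L);
        return LocalDateTime.ofInstant(instant, savedZone);
    }

    public static void main(String[] args) {
        ZoneId writerZone = ZoneId.of("Europe/Budapest");
        LocalDateTime written = LocalDateTime.of(2019, 4, 9, 12, 0);
        long stored = writeMicros(written, writerZone);
        // A new reader that honors the saved zone recovers the wall-clock time.
        System.out.println(readMicros(stored, writerZone).equals(written)); // true
    }
}
```

A legacy reader that ignores the saved zone would instead convert the stored 
UTC instant to its own local zone, which is exactly the historical _Instant_ 
behaviour described above.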



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21570) Convert llap iomem servlets output to json format

2019-04-09 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16813975#comment-16813975
 ] 

Hive QA commented on HIVE-21570:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12965296/HIVE-21570.04.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 15896 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16902/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16902/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16902/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12965296 - PreCommit-HIVE-Build

> Convert llap iomem servlets output to json format
> -
>
> Key: HIVE-21570
> URL: https://issues.apache.org/jira/browse/HIVE-21570
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: Antal Sinkovits
>Assignee: Antal Sinkovits
>Priority: Minor
> Attachments: HIVE-21570.01.patch, HIVE-21570.02.patch, 
> HIVE-21570.03.patch, HIVE-21570.04.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21572) HiveRemoveSqCountCheck rule could be enhanced to capture more patterns

2019-04-09 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-21572:
---
Status: Patch Available  (was: Open)

> HiveRemoveSqCountCheck rule could be enhanced to capture more patterns 
> ---
>
> Key: HIVE-21572
> URL: https://issues.apache.org/jira/browse/HIVE-21572
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-21572.1.patch, HIVE-21572.2.patch, 
> HIVE-21572.3.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21572) HiveRemoveSqCountCheck rule could be enhanced to capture more patterns

2019-04-09 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-21572:
---
Status: Open  (was: Patch Available)

> HiveRemoveSqCountCheck rule could be enhanced to capture more patterns 
> ---
>
> Key: HIVE-21572
> URL: https://issues.apache.org/jira/browse/HIVE-21572
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-21572.1.patch, HIVE-21572.2.patch, 
> HIVE-21572.3.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21572) HiveRemoveSqCountCheck rule could be enhanced to capture more patterns

2019-04-09 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-21572:
---
Attachment: HIVE-21572.3.patch

> HiveRemoveSqCountCheck rule could be enhanced to capture more patterns 
> ---
>
> Key: HIVE-21572
> URL: https://issues.apache.org/jira/browse/HIVE-21572
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-21572.1.patch, HIVE-21572.2.patch, 
> HIVE-21572.3.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21570) Convert llap iomem servlets output to json format

2019-04-09 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16813958#comment-16813958
 ] 

Hive QA commented on HIVE-21570:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m 
15s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
23s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
45s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 9s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
24s{color} | {color:blue} llap-client in master has 26 extant Findbugs 
warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
5s{color} | {color:blue} ql in master has 2258 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
45s{color} | {color:blue} llap-server in master has 81 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
24s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
26s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 9s{color} | {color:green} The patch llap-client passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
41s{color} | {color:green} ql: The patch generated 0 new + 5 unchanged - 7 
fixed = 5 total (was 12) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
15s{color} | {color:red} llap-server: The patch generated 3 new + 193 unchanged 
- 5 fixed = 196 total (was 198) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
53s{color} | {color:red} llap-server generated 1 new + 80 unchanged - 1 fixed = 
81 total (was 81) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
26s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 32m 57s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:llap-server |
|  |  
org.apache.hadoop.hive.llap.cache.BuddyAllocator$Arena.debugDumpShort(List) 
does not release lock on all paths  At BuddyAllocator.java:on all paths  At 
BuddyAllocator.java:[line 1154] |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-16902/dev-support/hive-personality.sh
 |
| git revision | master / 928f3d6 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16902/yetus/diff-checkstyle-llap-server.txt
 |
| findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16902/yetus/new-findbugs-llap-server.html
 |
| modules | C: llap-client ql llap-server U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16902/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Convert llap iomem servlets output to json format
> -
>
> Key: 

[jira] [Commented] (HIVE-20901) running compactor when there is nothing to do produces duplicate data

2019-04-09 Thread Eugene Koifman (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16813942#comment-16813942
 ] 

Eugene Koifman commented on HIVE-20901:
---

I'd suggest that {{msg.append("Skipping minor compaction as");}} should include 
compaction ID and db.table.partition info.

> running compactor when there is nothing to do produces duplicate data
> -
>
> Key: HIVE-20901
> URL: https://issues.apache.org/jira/browse/HIVE-20901
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 4.0.0
>Reporter: Eugene Koifman
>Assignee: Abhishek Somani
>Priority: Major
> Attachments: HIVE-20901.1.patch, HIVE-20901.2.patch
>
>
> Suppose we run minor compaction 2 times via alter table.
> The 2nd compaction request should have nothing to do, but I don't think 
> there is a check for that. It's visible in the context of HIVE-20823, where 
> each compactor run produces a delta with new visibility suffix so we end up 
> with something like
> {noformat}
> target/tmp/org.apache.hadoop.hive.ql.TestTxnCommands3-1541810844849/warehouse/t/
> ├── delete_delta_001_002_v019
> │   ├── _orc_acid_version
> │   └── bucket_0
> ├── delete_delta_001_002_v021
> │   ├── _orc_acid_version
> │   └── bucket_0
> ├── delta_001_001_
> │   ├── _orc_acid_version
> │   └── bucket_0
> ├── delta_001_002_v019
> │   ├── _orc_acid_version
> │   └── bucket_0
> ├── delta_001_002_v021
> │   ├── _orc_acid_version
> │   └── bucket_0
> └── delta_002_002_
>     ├── _orc_acid_version
>     └── bucket_0{noformat}
> i.e. 2 deltas with the same write ID range
> this is bad. It probably happens today as well, but the new run produces a 
> delta with the same name and clobbers the previous one, which may interfere 
> with writers
>  
> need to investigate
>  
> -The issue (I think) is that {{AcidUtils.getAcidState()}} then returns both 
> deltas as if they were distinct and it effectively duplicates data.-  There 
> is no data duplication - {{getAcidState()}} will not use 2 deltas with the 
> same {{writeid}} range
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21538) Beeline: password source though the console reader did not pass to connection param

2019-04-09 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16813933#comment-16813933
 ] 

Hive QA commented on HIVE-21538:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12965263/HIVE-21538.02.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 15896 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16901/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16901/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16901/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12965263 - PreCommit-HIVE-Build

> Beeline: password source though the console reader did not pass to connection 
> param
> ---
>
> Key: HIVE-21538
> URL: https://issues.apache.org/jira/browse/HIVE-21538
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.1.0
> Environment: Hive-3.1 auth set to LDAP
>Reporter: Rajkumar Singh
>Assignee: Rajkumar Singh
>Priority: Major
> Attachments: HIVE-21538.01.patch, HIVE-21538.02.patch, 
> HIVE-21538.patch
>
>
> Beeline: a password sourced through the console reader does not get passed to 
> the connection parameters; this will result in an authentication failure in 
> the case of LDAP authentication.
> {code}
> beeline -n USER -u 
> "jdbc:hive2://host:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2"
>  -p
> Connecting to 
> jdbc:hive2://host:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;user=USER
> Enter password for jdbc:hive2://host:2181/: 
> 19/03/26 19:49:44 [main]: WARN jdbc.HiveConnection: Failed to connect to 
> host:1
> 19/03/26 19:49:44 [main]: ERROR jdbc.Utils: Unable to read HiveServer2 
> configs from ZooKeeper
> Unknown HS2 problem when communicating with Thrift server.
> Error: Could not open client transport for any of the Server URI's in 
> ZooKeeper: Peer indicated failure: PLAIN auth failed: 
> javax.security.sasl.AuthenticationException: Error validating LDAP user 
> [Caused by javax.naming.AuthenticationException: [LDAP: error code 49 - 
> 80090308: LdapErr: DSID-0C0903C8, comment: AcceptSecurityContext error, data 
> 52e, v2580]] (state=08S01,code=0)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21596) HiveMetastoreClient should be able to connect to older metastore servers

2019-04-09 Thread Vihang Karajgaonkar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-21596:
---
Description: 
{{HiveMetastoreClient}} currently depends on the fact that both the client and 
server versions are the same. Additionally, since the server APIs are backwards 
compatible, it is possible for an older client (e.g. a 2.1.0 client) to 
connect to a newer server (e.g. a 3.1.0 server) without any issues. This is 
useful in setups where HMS is deployed in remote mode and clients connect to it 
remotely.

It would be a good improvement if a newer version of {{HiveMetastoreClient}} 
could connect to an older server version. When a newer client is talking to an 
older server, the following things can happen:

1. The client invokes an RPC that does not exist on the older server.
In such a case, thrift will throw an {{Invalid method name}} exception, which 
should automatically be handled by the clients since each API already throws 
TException.

2. The client invokes an RPC using thrift objects that have new fields added.
When a new field is added to a thrift object, the server does not deserialize 
the field in the first place since it does not know about that field id, so 
wire-compatibility already exists. However, the client-side application should 
understand the implications of such behavior. In such cases, it would be 
better for the client to throw an exception after checking the server version, 
which was added in HIVE-21484.

3. If the newer client has re-implemented a certain API, for example using a 
newer thrift API, the client will start seeing the {{Invalid method name}} 
exception since the older server does not have such a method.
This can be handled on the client side by making sure that the newer 
implementation is conditional on the server version, i.e. the client should 
check the server version and invoke the new implementation only if the server 
version supports the newer API. (On a side note, it would be great if the 
metastore also reported which APIs are supported for a given version.)

One real-world use case of such a feature is in Impala, which wants to be able 
to talk to both HMS 2.x and HMS 3.x, but other applications like Spark (or 
third-party applications that want to support multiple HMS versions) may also 
find this useful.

Also, this patch will make a best effort to fix all such cases between Hive 
2.3.0 and newer versions of HMS; being exhaustive should be an ongoing effort. 
We will also need to add support for this in our test infrastructure to spin 
up older HMS server versions and test newer client APIs against them. I will 
create a separate sub-task for that since it may need more plumbing in ptest.
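
The version-gated fallback described in point 3 can be sketched as below. All 
names here ({{VersionGatedClient}}, the example API names) are hypothetical, 
and {{UnsupportedOperationException}} merely stands in for thrift's 
{{TApplicationException}} carrying the {{Invalid method name}} message; this is 
not Hive's actual client code.

```java
import java.util.function.Supplier;

public class VersionGatedClient {
    // Stand-in for the server version fetched via the API added in HIVE-21484.
    private final int serverMajorVersion;

    public VersionGatedClient(int serverMajorVersion) {
        this.serverMajorVersion = serverMajorVersion;
    }

    // Invoke the newer RPC only when the server is known to support it;
    // otherwise, or on "Invalid method name", fall back to the older RPC.
    public <T> T call(int minVersionForNewApi, Supplier<T> newApi, Supplier<T> oldApi) {
        if (serverMajorVersion >= minVersionForNewApi) {
            try {
                return newApi.get();
            } catch (UnsupportedOperationException e) {
                // Models thrift's "Invalid method name" from an older server.
                return oldApi.get();
            }
        }
        return oldApi.get();
    }

    public static void main(String[] args) {
        VersionGatedClient client = new VersionGatedClient(2);
        String result = client.call(3,
            () -> "get_tables_by_type",   // hypothetical newer API, 3.x only
            () -> "get_tables");          // hypothetical older API
        System.out.println(result); // prints "get_tables"
    }
}
```

The explicit version check keeps the common path cheap, while the catch clause 
covers servers whose exact capabilities the client cannot determine up front.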

  was:
{{HiveMetastoreClient}} currently depends on the fact that both the client and 
server versions are the same. Additionally, since the server APIs are backwards 
compatible, it is possible for an older client (e.g. a 2.1.0 client) to 
connect to a newer server (e.g. a 3.1.0 server) without any issues. This is 
useful in setups where HMS is deployed in remote mode and clients connect to it 
remotely.

It would be a good improvement if a newer version of {{HiveMetastoreClient}} 
could connect to an older server version. When a newer client is talking to an 
older server, the following things can happen:

1. The client invokes an RPC that does not exist on the older server.
In such a case, thrift will throw an {{Invalid method name}} exception, which 
should automatically be handled by the clients since each API already throws 
TException.

2. The client invokes an RPC using thrift objects that have new fields added.
When a new field is added to a thrift object, the server does not deserialize 
the field in the first place since it does not know about that field id, so 
wire-compatibility already exists. However, the client-side application should 
understand the implications of such behavior. In such cases, it would be 
better for the client to throw an exception after checking the server version, 
which was added in HIVE-21484.

3. The newer client has re-implemented a certain API, for example using a 
newer, more efficient thrift API, while an older thrift API that provides the 
same functionality also exists. In this case, the new client will start seeing 
the {{Invalid method name}} exception since the older server does not have such 
a method. This can be handled on the client side by making sure that the newer 
implementation is conditional on the server version, falling back to the older 
(maybe less efficient) one when necessary, i.e. the client should check the 
server version and invoke the new implementation only if the server version 
supports the newer API. (On a side note, it would be great if the metastore 
also reported which APIs are supported for a given version.)

One real-world use case of such a feature is in Impala, which wants to be able 
to talk to both HMS 2.x and HMS 3.x, but other applications like Spark (or 
third-party applications that want to support multiple HMS versions) may also 
find this useful.

[jira] [Updated] (HIVE-21596) HiveMetastoreClient should be able to connect to older metastore servers

2019-04-09 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HIVE-21596:
---
Description: 
{{HiveMetastoreClient}} currently depends on the fact that both the client and 
server versions are the same. Additionally, since the server APIs are backwards 
compatible, it is possible for an older client (e.g. a 2.1.0 client) to 
connect to a newer server (e.g. a 3.1.0 server) without any issues. This is 
useful in setups where HMS is deployed in remote mode and clients connect to it 
remotely.

It would be a good improvement if a newer version of {{HiveMetastoreClient}} 
could connect to an older server version. When a newer client is talking to an 
older server, the following things can happen:

1. The client invokes an RPC that does not exist on the older server.
In such a case, thrift will throw an {{Invalid method name}} exception, which 
should automatically be handled by the clients since each API already throws 
TException.

2. The client invokes an RPC using thrift objects that have new fields added.
When a new field is added to a thrift object, the server does not deserialize 
the field in the first place since it does not know about that field id, so 
wire-compatibility already exists. However, the client-side application should 
understand the implications of such behavior. In such cases, it would be 
better for the client to throw an exception after checking the server version, 
which was added in HIVE-21484.

3. The newer client has re-implemented a certain API, for example using a 
newer, more efficient thrift API, while an older thrift API that provides the 
same functionality also exists. In this case, the new client will start seeing 
the {{Invalid method name}} exception since the older server does not have such 
a method. This can be handled on the client side by making sure that the newer 
implementation is conditional on the server version, falling back to the older 
(maybe less efficient) one when necessary, i.e. the client should check the 
server version and invoke the new implementation only if the server version 
supports the newer API. (On a side note, it would be great if the metastore 
also reported which APIs are supported for a given version.)

One real-world use case of such a feature is in Impala, which wants to be able 
to talk to both HMS 2.x and HMS 3.x, but other applications like Spark (or 
third-party applications that want to support multiple HMS versions) may also 
find this useful.

  was:
{{HiveMetastoreClient}} currently depends on the fact that both the client and 
server versions are the same. Additionally, since the server APIs are backwards 
compatible, it is possible for an older client (e.g. a 2.1.0 client) to 
connect to a newer server (e.g. a 3.1.0 server) without any issues. This is 
useful in setups where HMS is deployed in remote mode and clients connect to it 
remotely.

It would be a good improvement if a newer version of {{HiveMetastoreClient}} 
could connect to an older server version. When a newer client is talking to an 
older server, the following things can happen:

1. The client invokes an RPC that does not exist on the older server.
In such a case, thrift will throw an {{Invalid method name}} exception, which 
should automatically be handled by the clients since each API already throws 
TException.

2. The client invokes an RPC using thrift objects that have new fields added.
When a new field is added to a thrift object, the server does not deserialize 
the field in the first place since it does not know about that field id, so 
wire-compatibility already exists. However, the client-side application should 
understand the implications of such behavior. In such cases, it would be 
better for the client to throw an exception after checking the server version, 
which was added in HIVE-21484.

3. If the newer client has re-implemented a certain API, for example using a 
newer thrift API, the client will start seeing the {{Invalid method name}} 
exception since the older server does not have such a method.
This can be handled on the client side by making sure that the newer 
implementation is conditional on the server version, i.e. the client should 
check the server version and invoke the new implementation only if the server 
version supports the newer API. (On a side note, it would be great if the 
metastore also reported which APIs are supported for a given version.)

One real-world use case of such a feature is in Impala, which wants to be able 
to talk to both HMS 2.x and HMS 3.x, but other applications like Spark (or 
third-party applications that want to support multiple HMS versions) may also 
find this useful.


> HiveMetastoreClient should be able to connect to older metastore servers
> 
>
> Key: HIVE-21596

[jira] [Updated] (HIVE-21596) HiveMetastoreClient should be able to connect to older metastore servers

2019-04-09 Thread Vihang Karajgaonkar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-21596:
---
Description: 
{{HiveMetastoreClient}} currently depends on the fact that both the client and 
server versions are the same. Additionally, since the server APIs are backwards 
compatible, it is possible for an older client (e.g. a 2.1.0 client) to 
connect to a newer server (e.g. a 3.1.0 server) without any issues. This is 
useful in setups where HMS is deployed in remote mode and clients connect to it 
remotely.

It would be a good improvement if a newer version of {{HiveMetastoreClient}} 
could connect to an older server version. When a newer client is talking to an 
older server, the following things can happen:

1. The client invokes an RPC that does not exist on the older server.
In such a case, thrift will throw an {{Invalid method name}} exception, which 
should automatically be handled by the clients since each API already throws 
TException.

2. The client invokes an RPC using thrift objects that have new fields added.
When a new field is added to a thrift object, the server does not deserialize 
the field in the first place since it does not know about that field id, so 
wire-compatibility already exists. However, the client-side application should 
understand the implications of such behavior. In such cases, it would be 
better for the client to throw an exception after checking the server version, 
which was added in HIVE-21484.

3. If the newer client has re-implemented a certain API, for example using a 
newer thrift API, the client will start seeing the {{Invalid method name}} 
exception since the older server does not have such a method.
This can be handled on the client side by making sure that the newer 
implementation is conditional on the server version, i.e. the client should 
check the server version and invoke the new implementation only if the server 
version supports the newer API. (On a side note, it would be great if the 
metastore also reported which APIs are supported for a given version.)

One real-world use case of such a feature is in Impala, which wants to be able 
to talk to both HMS 2.x and HMS 3.x, but other applications like Spark (or 
third-party applications that want to support multiple HMS versions) may also 
find this useful.

  was:
{{HiveMetastoreClient}} currently depends on the fact that both the client and 
server versions are the same. Additionally, since the server APIs are backwards 
compatible, it is possible for an older client (e.g. a 2.1.0 client) to 
connect to a newer server (e.g. a 3.1.0 server) without any issues. This is 
useful in setups where HMS is deployed in remote mode and clients connect to it 
remotely.

It would be a good improvement if a newer version of {{HiveMetastoreClient}} 
could connect to an older server version. When a newer client is talking to an 
older server, the following things can happen:

1. The client invokes an RPC that does not exist on the older server.
In such a case, thrift will throw an {{Invalid method name}} exception, which 
should automatically be handled by the clients since each API already throws 
TException.

2. The client invokes an RPC using thrift objects that have new fields added.
When a new field is added to a thrift object, the server does not deserialize 
the field in the first place since it does not know about that field id, so 
wire-compatibility already exists. However, the client-side application should 
understand the implications of such behavior. In such cases, it would be 
better for the client to throw an exception after checking the server version, 
which was added in HIVE-21484.

3. If the newer client has re-implemented a certain API, for example using a 
newer thrift API, the client will start seeing the {{Invalid method name}} 
exception since the older server does not have such a method.
This can be handled on the client side by making sure that the newer 
implementation is conditional on the server version, i.e. the client should 
check the server version and invoke the new implementation only if the server 
version supports the newer API. (On a side note, it would be great if the 
metastore also reported which APIs are supported for a given version.)



> HiveMetastoreClient should be able to connect to older metastore servers
> 
>
> Key: HIVE-21596
> URL: https://issues.apache.org/jira/browse/HIVE-21596
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
>
> {{HiveMetastoreClient}} currently depends on the fact that both the client 
> and server versions are the same. Additionally, since the server APIs are 
> backwards compatible, it is 

[jira] [Updated] (HIVE-21596) HiveMetastoreClient should be able to connect to older metastore servers

2019-04-09 Thread Vihang Karajgaonkar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-21596:
---
Description: 
{{HiveMetastoreClient}} currently depends on the fact that both the client and 
server versions are the same. Additionally, since the server APIs are backwards 
compatible, it is possible for an older client (e.g. a 2.1.0 client) to 
connect to a newer server (e.g. a 3.1.0 server) without any issues. This 
is useful in setups where HMS is deployed in remote mode and clients connect 
to it remotely.

It would be a good improvement if a newer {{HiveMetastoreClient}} could 
connect to an older server version. When a newer client is talking to an 
older server, the following things can happen:

1. The client invokes an RPC that does not exist on the older server.
In such a case, Thrift will throw an {{Invalid method name}} exception, which 
the client can handle automatically since each API already throws 
TException.

2. The client invokes an RPC using Thrift objects that have new fields added.
When a new field is added to a Thrift object, the server simply skips the 
field during deserialization since it does not know that field id, so 
wire compatibility already exists. However, the client-side application should 
understand the implications of this behavior. In such cases, it would be 
better for the client to throw an exception after checking the server version, 
support for which was added in HIVE-21484.

3. If the newer client has re-implemented a certain API (for example, using a 
newer Thrift API), the client will start seeing the {{Invalid method name}} 
exception since the older server does not have such a method.
This can be handled on the client side by making the newer implementation 
conditional on the server version: the client should check the server version 
and invoke the new implementation only if the server version supports the 
newer API. (On a side note, it would be great if the metastore also reported 
which APIs are supported for a given version.)


  was:
{{HiveMetastoreClient}} currently depends on the fact that both the client and 
server versions are the same. Additionally, since the server APIs are backwards 
compatible, it is possible for a older client (eg. 2.1.0 client version) to 
connect to a newer server (eg. 3.1.0 server version) without any issues. This 
is useful in setups where HMS is deployed in a remote mode and clients connect 
to it remotely.

It would be a good improvement if a newer version {{HiveMetastoreClient }} can 
connect to the a newer server version. When a newer client is talking to a 
older server following things can happen:

1. Client invokes a RPC to the older server which doesn't exist.
In such a case, thrift will throw {{Invalid method name}} exception which 
should be automatically be handled by the clients since each API throws 
TException.

2. Client invokes a RPC using thrift objects which has new fields added.
When a new field is added to a thrift object, the server does not deserialize 
the field in the first place since it does not know about that field id. So the 
wire-compatibility exists already. However, the client side application should 
understand the implications of such a behavior. In such cases, it would be 
better for the client to throw exception by checking the server version which 
was added in HIVE-21484

3. If the newer client has re-implemented a certain API, for example, using 
newer thrift API the client will start seeing exception {{Invalid method name}}
This can be handled on the client side by making sure that the newer 
implementation is conditional to the server version. Which means client should 
check the server version and invoke the new implementation only if the server 
version supports the newer API. (On a side note, it would be great if metastore 
also gives information of which APIs are supported for a given version)



> HiveMetastoreClient should be able to connect to older metastore servers
> 
>
> Key: HIVE-21596
> URL: https://issues.apache.org/jira/browse/HIVE-21596
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
>
> {{HiveMetastoreClient}} currently depends on the fact that both the client 
> and server versions are the same. Additionally, since the server APIs are 
> backwards compatible, it is possible for a older client (eg. 2.1.0 client 
> version) to connect to a newer server (eg. 3.1.0 server version) without any 
> issues. This is useful in setups where HMS is deployed in a remote mode and 
> clients connect to it remotely.
> It would be a good improvement if a newer version {{HiveMetastoreClient }} 
> can connect 

[jira] [Assigned] (HIVE-21596) HiveMetastoreClient should be able to connect to older metastore servers

2019-04-09 Thread Vihang Karajgaonkar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar reassigned HIVE-21596:
--


> HiveMetastoreClient should be able to connect to older metastore servers
> 
>
> Key: HIVE-21596
> URL: https://issues.apache.org/jira/browse/HIVE-21596
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
>
> {{HiveMetastoreClient}} currently depends on the fact that both the client 
> and server versions are the same. Additionally, since the server APIs are 
> backwards compatible, it is possible for a older client (eg. 2.1.0 client 
> version) to connect to a newer server (eg. 3.1.0 server version) without any 
> issues. This is useful in setups where HMS is deployed in a remote mode and 
> clients connect to it remotely.
> It would be a good improvement if a newer version {{HiveMetastoreClient }} 
> can connect to the a newer server version. When a newer client is talking to 
> a older server following things can happen:
> 1. Client invokes a RPC to the older server which doesn't exist.
> In such a case, thrift will throw {{Invalid method name}} exception which 
> should be automatically be handled by the clients since each API throws 
> TException.
> 2. Client invokes a RPC using thrift objects which has new fields added.
> When a new field is added to a thrift object, the server does not deserialize 
> the field in the first place since it does not know about that field id. So 
> the wire-compatibility exists already. However, the client side application 
> should understand the implications of such a behavior. In such cases, it 
> would be better for the client to throw exception by checking the server 
> version which was added in HIVE-21484
> 3. If the newer client has re-implemented a certain API, for example, using 
> newer thrift API the client will start seeing exception {{Invalid method 
> name}}
> This can be handled on the client side by making sure that the newer 
> implementation is conditional to the server version. Which means client 
> should check the server version and invoke the new implementation only if the 
> server version supports the newer API. (On a side note, it would be great if 
> metastore also gives information of which APIs are supported for a given 
> version)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21595) HIVE-20556 breaks backwards compatibility

2019-04-09 Thread Vihang Karajgaonkar (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16813904#comment-16813904
 ] 

Vihang Karajgaonkar commented on HIVE-21595:


Looks like HIVE-19820 also changes the order of fields in 
{{AlterPartitionsRequest}}.

> HIVE-20556 breaks backwards compatibility
> -
>
> Key: HIVE-21595
> URL: https://issues.apache.org/jira/browse/HIVE-21595
> Project: Hive
>  Issue Type: Bug
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Blocker
>
> HIVE-20556 exposes a new field in the Table definition. However, it changes 
> the order of the field ids, which breaks backwards wire compatibility. Any 
> older client that connects to HMS will not be able to deserialize Table 
> objects correctly since the field ids differ between the client and server side.





[jira] [Assigned] (HIVE-21595) HIVE-20556 breaks backwards compatibility

2019-04-09 Thread Vihang Karajgaonkar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar reassigned HIVE-21595:
--


> HIVE-20556 breaks backwards compatibility
> -
>
> Key: HIVE-21595
> URL: https://issues.apache.org/jira/browse/HIVE-21595
> Project: Hive
>  Issue Type: Bug
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Blocker
>
> HIVE-20556 exposes a new field in the Table definition. However, it changes 
> the order of the field ids, which breaks backwards wire compatibility. Any 
> older client that connects to HMS will not be able to deserialize Table 
> objects correctly since the field ids differ between the client and server side.





[jira] [Commented] (HIVE-21538) Beeline: password source though the console reader did not pass to connection param

2019-04-09 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16813898#comment-16813898
 ] 

Hive QA commented on HIVE-21538:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
43s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
19s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
12s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
30s{color} | {color:blue} jdbc in master has 16 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
15s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
16s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 12m 38s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-16901/dev-support/hive-personality.sh
 |
| git revision | master / 928f3d6 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: jdbc U: jdbc |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16901/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Beeline: password source though the console reader did not pass to connection 
> param
> ---
>
> Key: HIVE-21538
> URL: https://issues.apache.org/jira/browse/HIVE-21538
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.1.0
> Environment: Hive-3.1 auth set to LDAP
>Reporter: Rajkumar Singh
>Assignee: Rajkumar Singh
>Priority: Major
> Attachments: HIVE-21538.01.patch, HIVE-21538.02.patch, 
> HIVE-21538.patch
>
>
> Beeline: a password sourced through the console reader does not get passed to 
> the connection parameters; this results in an authentication failure in the 
> case of LDAP authentication.
> {code}
> beeline -n USER -u 
> "jdbc:hive2://host:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2"
>  -p
> Connecting to 
> jdbc:hive2://host:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;user=USER
> Enter password for jdbc:hive2://host:2181/: 
> 19/03/26 19:49:44 [main]: WARN jdbc.HiveConnection: Failed to connect to 
> host:1
> 19/03/26 19:49:44 [main]: ERROR jdbc.Utils: Unable to read HiveServer2 
> configs from ZooKeeper
> Unknown HS2 problem when communicating with Thrift server.
> Error: Could not open client transport for any of the Server URI's in 
> ZooKeeper: Peer indicated failure: PLAIN auth failed: 
> javax.security.sasl.AuthenticationException: Error validating LDAP user 
> [Caused by javax.naming.AuthenticationException: [LDAP: error code 49 - 
> 80090308: LdapErr: DSID-0C0903C8, comment: AcceptSecurityContext 

[jira] [Updated] (HIVE-21531) Vectorization: all NULL hashcodes are not computed using Murmur3

2019-04-09 Thread Gopal V (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-21531:
---
Attachment: HIVE-21531.1.patch

> Vectorization: all NULL hashcodes are not computed using Murmur3
> 
>
> Key: HIVE-21531
> URL: https://issues.apache.org/jira/browse/HIVE-21531
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0, 3.1.1
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Critical
> Attachments: HIVE-21531.1.patch, HIVE-21531.1.patch, 
> HIVE-21531.WIP.patch
>
>
> The comments in Vectorized hash computation call out the MurmurHash 
> implementation (the one using 0x5bd1e995), while the non-vectorized codepath 
> calls out the Murmur3 one (using 0xcc9e2d51).
> The comments here are wrong
> {code}
>  /**
>* Batch compute the hash codes for all the serialized keys.
>*
>* NOTE: MAJOR MAJOR ASSUMPTION:
>* We assume that HashCodeUtil.murmurHash produces the same result
>* as MurmurHash.hash with seed = 0 (the method used by 
> ReduceSinkOperator for
>* UNIFORM distribution).
>*/
>   protected void computeSerializedHashCodes() {
> int offset = 0;
> int keyLength;
> byte[] bytes = output.getData();
> for (int i = 0; i < nonNullKeyCount; i++) {
>   keyLength = serializedKeyLengths[i];
>   hashCodes[i] = Murmur3.hash32(bytes, offset, keyLength, 0);
>   offset += keyLength;
> }
>   }
> {code}
> but the wrong comment is followed in the Vector RS operator 
> {code}
>   System.arraycopy(nullKeyOutput.getData(), 0, nullBytes, 0, 
> nullBytesLength);
>   nullKeyHashCode = HashCodeUtil.calculateBytesHashCode(nullBytes, 0, 
> nullBytesLength);
> {code}
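The two hash families really do produce different codes for the same bytes. Below are compact reference-style implementations of Murmur3 x86_32 (the 0xcc9e2d51 constant) and the older MurmurHash2 (the 0x5bd1e995 constant); these are independent sketches, not Hive's Murmur3 or HashCodeUtil classes. If the null-key path uses one family and the non-null path uses the other, the same key need not land in the same reducer:

```java
public class HashMismatchSketch {

    // Reference-style Murmur3 x86_32 (the 0xcc9e2d51 family).
    static int murmur3_32(byte[] d, int off, int len, int seed) {
        int h = seed;
        int end = off + (len & ~3);
        for (int i = off; i < end; i += 4) {
            int k = (d[i] & 0xff) | ((d[i + 1] & 0xff) << 8)
                  | ((d[i + 2] & 0xff) << 16) | (d[i + 3] << 24);
            k *= 0xcc9e2d51; k = Integer.rotateLeft(k, 15); k *= 0x1b873593;
            h ^= k; h = Integer.rotateLeft(h, 13); h = h * 5 + 0xe6546b64;
        }
        int k = 0;
        switch (len & 3) {                       // tail bytes (fallthrough on purpose)
            case 3: k = (d[end + 2] & 0xff) << 16;
            case 2: k |= (d[end + 1] & 0xff) << 8;
            case 1: k |= d[end] & 0xff;
                    k *= 0xcc9e2d51; k = Integer.rotateLeft(k, 15);
                    k *= 0x1b873593; h ^= k;
        }
        h ^= len;                                // finalization mix
        h ^= h >>> 16; h *= 0x85ebca6b; h ^= h >>> 13;
        h *= 0xc2b2ae35; h ^= h >>> 16;
        return h;
    }

    // Reference-style MurmurHash2 (the 0x5bd1e995 family).
    static int murmur2_32(byte[] d, int len, int seed) {
        final int m = 0x5bd1e995;
        int h = seed ^ len;
        int i = 0;
        while (len - i >= 4) {
            int k = (d[i] & 0xff) | ((d[i + 1] & 0xff) << 8)
                  | ((d[i + 2] & 0xff) << 16) | (d[i + 3] << 24);
            k *= m; k ^= k >>> 24; k *= m;
            h *= m; h ^= k;
            i += 4;
        }
        switch (len - i) {                       // tail bytes (fallthrough on purpose)
            case 3: h ^= (d[i + 2] & 0xff) << 16;
            case 2: h ^= (d[i + 1] & 0xff) << 8;
            case 1: h ^= d[i] & 0xff; h *= m;
        }
        h ^= h >>> 13; h *= m; h ^= h >>> 15;
        return h;
    }

    public static void main(String[] args) {
        byte[] key = "null-key-bytes".getBytes();
        int h3 = murmur3_32(key, 0, key.length, 0);
        int h2 = murmur2_32(key, key.length, 0);
        // Different hash codes mean different reducer assignments for the
        // same key under UNIFORM distribution.
        System.out.println("murmur3=" + h3 + " murmur2=" + h2);
    }
}
```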





[jira] [Assigned] (HIVE-13582) E061-07 and E061-12: Quantified Comparison Predicates

2019-04-09 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-13582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg reassigned HIVE-13582:
--

Assignee: Vineet Garg

> E061-07 and E061-12: Quantified Comparison Predicates
> -
>
> Key: HIVE-13582
> URL: https://issues.apache.org/jira/browse/HIVE-13582
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Carter Shanklin
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-13582.1.patch
>
>
> This is a part of the SQL:2011 Analytics Complete Umbrella JIRA HIVE-13554. 
> Quantified comparison predicates (ANY/SOME/ALL) are mandatory in the SQL 
> standard. Hive should support the predicates (E061-07) and you should be able 
> to use these with subqueries (E061-12)





[jira] [Commented] (HIVE-13582) E061-07 and E061-12: Quantified Comparison Predicates

2019-04-09 Thread Vineet Garg (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-13582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16813863#comment-16813863
 ] 

Vineet Garg commented on HIVE-13582:


Attaching initial patch to support ALL and SOME/ANY for uncorrelated 
subqueries. Currently <>ANY and =ALL are disallowed (Not yet sure how to 
transform those).

> E061-07 and E061-12: Quantified Comparison Predicates
> -
>
> Key: HIVE-13582
> URL: https://issues.apache.org/jira/browse/HIVE-13582
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Carter Shanklin
>Priority: Major
> Attachments: HIVE-13582.1.patch
>
>
> This is a part of the SQL:2011 Analytics Complete Umbrella JIRA HIVE-13554. 
> Quantified comparison predicates (ANY/SOME/ALL) are mandatory in the SQL 
> standard. Hive should support the predicates (E061-07) and you should be able 
> to use these with subqueries (E061-12)





[jira] [Updated] (HIVE-13582) E061-07 and E061-12: Quantified Comparison Predicates

2019-04-09 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-13582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-13582:
---
Status: Patch Available  (was: Open)

> E061-07 and E061-12: Quantified Comparison Predicates
> -
>
> Key: HIVE-13582
> URL: https://issues.apache.org/jira/browse/HIVE-13582
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Carter Shanklin
>Priority: Major
> Attachments: HIVE-13582.1.patch
>
>
> This is a part of the SQL:2011 Analytics Complete Umbrella JIRA HIVE-13554. 
> Quantified comparison predicates (ANY/SOME/ALL) are mandatory in the SQL 
> standard. Hive should support the predicates (E061-07) and you should be able 
> to use these with subqueries (E061-12)





[jira] [Updated] (HIVE-13582) E061-07 and E061-12: Quantified Comparison Predicates

2019-04-09 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-13582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-13582:
---
Attachment: HIVE-13582.1.patch

> E061-07 and E061-12: Quantified Comparison Predicates
> -
>
> Key: HIVE-13582
> URL: https://issues.apache.org/jira/browse/HIVE-13582
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Carter Shanklin
>Priority: Major
> Attachments: HIVE-13582.1.patch
>
>
> This is a part of the SQL:2011 Analytics Complete Umbrella JIRA HIVE-13554. 
> Quantified comparison predicates (ANY/SOME/ALL) are mandatory in the SQL 
> standard. Hive should support the predicates (E061-07) and you should be able 
> to use these with subqueries (E061-12)





[jira] [Resolved] (HIVE-9532) add support for quantified predicates

2019-04-09 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-9532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg resolved HIVE-9532.
---
Resolution: Duplicate

> add support for quantified predicates 
> --
>
> Key: HIVE-9532
> URL: https://issues.apache.org/jira/browse/HIVE-9532
> Project: Hive
>  Issue Type: Improvement
>  Components: SQL
>Reporter: N Campbell
>Priority: Major
>
> allow a quantified predicate using ALL/ANY/SOME
> {code}
> select rnum, c1, c2 from tjoin2 where 20 > some (select c1 from tjoin1)
> Error while compiling statement: FAILED: ParseException line 1:50 cannot 
> recognize input near 'select' 'c1' 'from' in function specification
> {code}





[jira] [Updated] (HIVE-21591) Using triggers in non-LLAP mode should not require wm queue

2019-04-09 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-21591:
-
   Resolution: Fixed
Fix Version/s: 3.2.0
   4.0.0
   Status: Resolved  (was: Patch Available)

Committed to master and branch-3. Thanks Daniel for the review!

> Using triggers in non-LLAP mode should not require wm queue
> ---
>
> Key: HIVE-21591
> URL: https://issues.apache.org/jira/browse/HIVE-21591
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Fix For: 4.0.0, 3.2.0
>
> Attachments: HIVE-21591.1.patch
>
>
> Resource plan triggers are supported in non-LLAP (tez container) mode. But 
> fetching of resource plan happens only when 
> hive.server2.tez.interactive.queue is set. For tez container mode, only 
> triggers are applicable, so this queue dependency can be removed. 





[jira] [Updated] (HIVE-21427) Syslog storage handler

2019-04-09 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-21427:
-
Status: Patch Available  (was: Reopened)

> Syslog storage handler
> --
>
> Key: HIVE-21427
> URL: https://issues.apache.org/jira/browse/HIVE-21427
> Project: Hive
>  Issue Type: New Feature
>  Components: Serializers/Deserializers
>Affects Versions: 4.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Prasanth Jayachandran
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21427.1.patch, HIVE-21427.2.patch, 
> HIVE-21427.3.patch, HIVE-21427.4.patch, HIVE-21427.5.patch, 
> HIVE-21427.6.patch, HIVE-21427.7.patch, HIVE-21427.8.patch
>
>
> It would be useful to read syslog-generated log files in Hive. Hive generates 
> logs in the RFC5424 log4j2 layout and stores them as an external table in 
> sys.db. This includes a SyslogSerde that can parse RFC5424-formatted logs and 
> map them to the logs table schema for query processing by Hive. 





[jira] [Commented] (HIVE-21571) SHOW COMPACTIONS shows column names as its first output row

2019-04-09 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16813807#comment-16813807
 ] 

Hive QA commented on HIVE-21571:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12965262/HIVE-21571.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15896 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[dbtxnmgr_showlocks] 
(batchId=86)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16900/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16900/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16900/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12965262 - PreCommit-HIVE-Build

> SHOW COMPACTIONS shows column names as its first output row
> ---
>
> Key: HIVE-21571
> URL: https://issues.apache.org/jira/browse/HIVE-21571
> Project: Hive
>  Issue Type: Bug
>Reporter: Todd Lipcon
>Assignee: Rajkumar Singh
>Priority: Major
> Attachments: HIVE-21571.01.patch, HIVE-21571.patch
>
>
> SHOW COMPACTIONS yields a resultset with nice column names, and then the 
> first row of data is a repetition of those column names. This is somewhat 
> confusing and hard to read.





[jira] [Commented] (HIVE-21571) SHOW COMPACTIONS shows column names as its first output row

2019-04-09 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16813771#comment-16813771
 ] 

Hive QA commented on HIVE-21571:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
33s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
7s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
41s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
5s{color} | {color:blue} ql in master has 2258 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
2s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
15s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 24m 48s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-16900/dev-support/hive-personality.sh
 |
| git revision | master / 7072e0b |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16900/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> SHOW COMPACTIONS shows column names as its first output row
> ---
>
> Key: HIVE-21571
> URL: https://issues.apache.org/jira/browse/HIVE-21571
> Project: Hive
>  Issue Type: Bug
>Reporter: Todd Lipcon
>Assignee: Rajkumar Singh
>Priority: Major
> Attachments: HIVE-21571.01.patch, HIVE-21571.patch
>
>
> SHOW COMPACTIONS yields a resultset with nice column names, and then the 
> first row of data is a repetition of those column names. This is somewhat 
> confusing and hard to read.





[jira] [Commented] (HIVE-21592) OptimizedSql is not shown when the expression contains CONCAT

2019-04-09 Thread Gopal V (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16813755#comment-16813755
 ] 

Gopal V commented on HIVE-21592:


LGTM - +1 tests pending

> OptimizedSql is not shown when the expression contains CONCAT
> -
>
> Key: HIVE-21592
> URL: https://issues.apache.org/jira/browse/HIVE-21592
> Project: Hive
>  Issue Type: Improvement
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-21592.patch
>
>






[jira] [Updated] (HIVE-21594) Warning regarding multiple MOVE triggers is logged even if conflicting triggers are both KILL

2019-04-09 Thread Brian Goerlitz (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brian Goerlitz updated HIVE-21594:
--
Description: 
The following in TriggerValidatorRunnable will log a WARN about conflicting 
MOVE triggers even if the previous trigger violated was a KILL trigger.

 
{code:java}
} else { 
// if multiple MOVE happens, only first move will be chosen
LOG.warn("Conflicting MOVE triggers ({} and {}). Choosing the first MOVE 
trigger: {}", existingTrigger, currentTrigger, existingTrigger.getName());
{code}
 

This logging makes sense if the code broke out of the triggers loop on the 
first encountered KILL trigger violation, but this currently does not happen.

  was:
The following in TriggerValidatorRunnable will log a WARN about conflicting 
MOVE triggers even if the previous trigger violated was a KILL trigger.

 
{code:java}
} else { // if multiple MOVE happens, only first move will be chosen 
LOG.warn("Conflicting MOVE triggers ({} and {}). Choosing the first MOVE 
trigger: {}", existingTrigger, currentTrigger, existingTrigger.getName());
{code}
 

This logging makes sense if the code broke out of the triggers loop on the 
first encountered KILL trigger violation, but this currently does not happen.


> Warning regarding multiple MOVE triggers is logged even if conflicting 
> triggers are both KILL
> -
>
> Key: HIVE-21594
> URL: https://issues.apache.org/jira/browse/HIVE-21594
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Brian Goerlitz
>Priority: Minor
>
> The following in TriggerValidatorRunnable will log a WARN about conflicting 
> MOVE triggers even if the previous trigger violated was a KILL trigger.
>  
> {code:java}
> } else { 
> // if multiple MOVE happens, only first move will be chosen
> LOG.warn("Conflicting MOVE triggers ({} and {}). Choosing the first MOVE 
> trigger: {}", existingTrigger, currentTrigger, existingTrigger.getName());
> {code}
>  
> This logging makes sense if the code broke out of the triggers loop on the 
> first encountered KILL trigger violation, but this currently does not happen.
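
A minimal Java sketch (hypothetical names, not the actual TriggerValidatorRunnable code) of the behavior the reporter implies should exist: return as soon as a KILL trigger is violated, so the MOVE-conflict warning fires only for genuine MOVE/MOVE conflicts.

```java
import java.util.List;

public class TriggerSketch {
    public enum Action { KILL, MOVE }

    public record Trigger(String name, Action action) {}

    // Pick the winning trigger: the first KILL wins outright; otherwise the
    // first MOVE wins, and each later MOVE is a genuine conflict worth a warning.
    public static Trigger choose(List<Trigger> violated) {
        Trigger existing = null;
        for (Trigger current : violated) {
            if (current.action() == Action.KILL) {
                // Break out of the loop: no MOVE-conflict warning after a KILL.
                return current;
            }
            if (existing == null) {
                existing = current;
            } else {
                System.out.printf("Conflicting MOVE triggers (%s and %s). Choosing the first MOVE trigger: %s%n",
                        existing.name(), current.name(), existing.name());
            }
        }
        return existing;
    }

    public static void main(String[] args) {
        Trigger t = choose(List.of(
                new Trigger("kill_long", Action.KILL),
                new Trigger("move_a", Action.MOVE)));
        System.out.println(t.name()); // prints kill_long; no conflict warning logged
    }
}
```

With this short-circuit, a KILL followed by a MOVE never reaches the warning branch, which is the behavior the WARN message currently (incorrectly) assumes.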



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21492) VectorizedParquetRecordReader can't read parquet file generated using thrift/custom tool

2019-04-09 Thread Nitin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813673#comment-16813673
 ] 

Nitin commented on HIVE-21492:
--

[~Ferd] Can you please review the patch?

> VectorizedParquetRecordReader can't read parquet file generated using 
> thrift/custom tool
> ---
>
> Key: HIVE-21492
> URL: https://issues.apache.org/jira/browse/HIVE-21492
> Project: Hive
>  Issue Type: Bug
>Reporter: Ganesha Shreedhara
>Assignee: Ganesha Shreedhara
>Priority: Major
> Attachments: HIVE-21492.patch
>
>
> Taking an example of a parquet table having array of integers as below. 
> {code:java}
> CREATE EXTERNAL TABLE (`list_of_ints` array<int>)
> STORED AS PARQUET 
> LOCATION '{location}';
> {code}
> Parquet file generated using hive will have schema for Type as below:
> {code:java}
> group list_of_ints (LIST) {
>   repeated group bag {
>     optional int32 array;
>   }
> }
> {code}
> Parquet file generated using thrift or any custom tool (using 
> org.apache.parquet.io.api.RecordConsumer)
> may have schema for Type as below:
> {code:java}
> required group list_of_ints (LIST) { repeated int32 list_of_ints_tuple }
> {code}
> VectorizedParquetRecordReader handles only parquet files generated using Hive. 
> It throws the following exception when a parquet file generated using thrift is 
> read, because of the changes done as part of HIVE-18553.
> {code:java}
> Caused by: java.lang.ClassCastException: repeated int32 list_of_ints_tuple is 
> not a group
>  at org.apache.parquet.schema.Type.asGroupType(Type.java:207)
>  at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.getElementType(VectorizedParquetRecordReader.java:479)
>  at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.buildVectorizedParquetReader(VectorizedParquetRecordReader.java:532)
>  at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:440)
>  at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:401)
>  at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:353)
>  at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:92)
>  at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365){code}
>  
>  I have done a small change to handle the case where the child type of group 
> type can be PrimitiveType.
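
A self-contained sketch of the described change, using toy stand-ins for the org.apache.parquet.schema types (the class and field names here are illustrative, not Hive's actual code): check whether the repeated child of the LIST group is already a primitive before casting it to a group.

```java
public class ElementTypeSketch {
    // Toy stand-ins for org.apache.parquet.schema.Type / PrimitiveType / GroupType.
    public static abstract class Type {
        public final String name;
        Type(String name) { this.name = name; }
        public boolean isPrimitive() { return false; }
    }
    public static class PrimitiveType extends Type {
        public PrimitiveType(String name) { super(name); }
        @Override public boolean isPrimitive() { return true; }
    }
    public static class GroupType extends Type {
        public final Type child;
        public GroupType(String name, Type child) { super(name); this.child = child; }
    }

    // The fix in spirit: if the repeated child is already a primitive
    // (thrift/custom-writer schema), use it directly; only hive-style schemas
    // have an inner "bag" group to descend into. Blindly casting the primitive
    // child to a group is what produced the ClassCastException above.
    public static Type getElementType(GroupType list) {
        Type repeated = list.child;
        if (repeated.isPrimitive()) {
            return repeated;                  // thrift: repeated int32 list_of_ints_tuple
        }
        return ((GroupType) repeated).child;  // hive: repeated group bag { optional int32 array }
    }

    public static void main(String[] args) {
        GroupType thriftStyle = new GroupType("list_of_ints",
                new PrimitiveType("list_of_ints_tuple"));
        System.out.println(getElementType(thriftStyle).name); // list_of_ints_tuple
    }
}
```

The same one-branch check, applied to the real parquet `Type` API, is presumably what the attached patch does in `VectorizedParquetRecordReader.getElementType`.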



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21592) OptimizedSql is not shown when the expression contains CONCAT

2019-04-09 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813632#comment-16813632
 ] 

Hive QA commented on HIVE-21592:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12965258/HIVE-21592.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 15896 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_map_ppr] 
(batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_map_ppr_multi_distinct]
 (batchId=55)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_ppr] (batchId=32)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_ppr_multi_distinct]
 (batchId=63)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby_map_ppr] 
(batchId=115)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby_map_ppr_multi_distinct]
 (batchId=134)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby_ppr] 
(batchId=124)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby_ppr_multi_distinct]
 (batchId=138)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16899/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16899/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16899/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12965258 - PreCommit-HIVE-Build

> OptimizedSql is not shown when the expression contains CONCAT
> -
>
> Key: HIVE-21592
> URL: https://issues.apache.org/jira/browse/HIVE-21592
> Project: Hive
>  Issue Type: Improvement
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-21592.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21592) OptimizedSql is not shown when the expression contains CONCAT

2019-04-09 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813595#comment-16813595
 ] 

Hive QA commented on HIVE-21592:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
30s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
7s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
38s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
3s{color} | {color:blue} ql in master has 2258 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
1s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 24m 29s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-16899/dev-support/hive-personality.sh
 |
| git revision | master / 7072e0b |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16899/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> OptimizedSql is not shown when the expression contains CONCAT
> -
>
> Key: HIVE-21592
> URL: https://issues.apache.org/jira/browse/HIVE-21592
> Project: Hive
>  Issue Type: Improvement
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-21592.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21568) HiveRelOptUtil.isRowFilteringPlan should skip Project

2019-04-09 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-21568:
---
Attachment: HIVE-21568.01.patch

> HiveRelOptUtil.isRowFilteringPlan should skip Project
> -
>
> Key: HIVE-21568
> URL: https://issues.apache.org/jira/browse/HIVE-21568
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-21568.01.patch, HIVE-21568.01.patch, 
> HIVE-21568.01.patch
>
>
> The Project operator should not return true in any case; this may trigger 
> additional rewritings in the presence of constraints.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21109) Stats replication for ACID tables.

2019-04-09 Thread Ashutosh Bapat (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Bapat updated HIVE-21109:
--
Attachment: HIVE-21109.10.patch
Status: Patch Available  (was: In Progress)

The patch addresses [~sankarh]'s second set of comments. The PR is updated with a new set 
of commits, with most commits addressing a single comment.

> Stats replication for ACID tables.
> --
>
> Key: HIVE-21109
> URL: https://issues.apache.org/jira/browse/HIVE-21109
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21109.01.patch, HIVE-21109.02.patch, 
> HIVE-21109.03.patch, HIVE-21109.04.patch, HIVE-21109.05.patch, 
> HIVE-21109.06.patch, HIVE-21109.07.patch, HIVE-21109.08.patch, 
> HIVE-21109.09.patch, HIVE-21109.09.patch, HIVE-21109.10.patch
>
>  Time Spent: 12.5h
>  Remaining Estimate: 0h
>
> Transactional tables require a writeID associated with the stats update. This 
> writeId needs to be in sync with the writeId on the source and hence needs to 
> be replicated from the source.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21109) Stats replication for ACID tables.

2019-04-09 Thread Ashutosh Bapat (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Bapat updated HIVE-21109:
--
Status: In Progress  (was: Patch Available)

> Stats replication for ACID tables.
> --
>
> Key: HIVE-21109
> URL: https://issues.apache.org/jira/browse/HIVE-21109
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21109.01.patch, HIVE-21109.02.patch, 
> HIVE-21109.03.patch, HIVE-21109.04.patch, HIVE-21109.05.patch, 
> HIVE-21109.06.patch, HIVE-21109.07.patch, HIVE-21109.08.patch, 
> HIVE-21109.09.patch, HIVE-21109.09.patch
>
>  Time Spent: 12.5h
>  Remaining Estimate: 0h
>
> Transactional tables require a writeID associated with the stats update. This 
> writeId needs to be in sync with the writeId on the source and hence needs to 
> be replicated from the source.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (HIVE-21109) Stats replication for ACID tables.

2019-04-09 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21109?focusedWorklogId=225073&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-225073
 ]

ASF GitHub Bot logged work on HIVE-21109:
-

Author: ASF GitHub Bot
Created on: 09/Apr/19 15:55
Start Date: 09/Apr/19 15:55
Worklog Time Spent: 10m 
  Work Description: ashutosh-bapat commented on pull request #579: 
HIVE-21109 : Support stats replication for ACID tables.
URL: https://github.com/apache/hive/pull/579#discussion_r273563740
 
 

 ##
 File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestStatsReplicationScenariosMigration.java
 ##
 @@ -0,0 +1,78 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.parse;
+
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.conf.MetastoreConf;
+import 
org.apache.hadoop.hive.metastore.messaging.json.gzip.GzipJSONMessageEncoder;
+import org.junit.BeforeClass;
+import org.junit.Rule;
+import org.junit.rules.TestName;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.HashMap;
+import java.util.Map;
+
+/**
+ * Tests statistics replication for ACID tables.
+ */
+public class TestStatsReplicationScenariosMigration extends 
TestStatsReplicationScenarios {
+  @Rule
+  public final TestName testName = new TestName();
+
+  protected static final Logger LOG = 
LoggerFactory.getLogger(TestReplicationScenarios.class);
+
+  @BeforeClass
+  public static void classLevelSetup() throws Exception {
+Map<String, String> overrides = new HashMap<>();
+overrides.put(MetastoreConf.ConfVars.EVENT_MESSAGE_FACTORY.getHiveName(),
+GzipJSONMessageEncoder.class.getCanonicalName());
+
+HashMap<String, String> replicaConfigs = new HashMap<String, String>() {{
+  put("hive.support.concurrency", "true");
+  put("hive.txn.manager", 
"org.apache.hadoop.hive.ql.lockmgr.DbTxnManager");
+  put("hive.metastore.client.capability.check", "false");
+  put("hive.repl.bootstrap.dump.open.txn.timeout", "1s");
+  put("hive.exec.dynamic.partition.mode", "nonstrict");
+  put("hive.strict.checks.bucketing", "false");
+  put("hive.mapred.mode", "nonstrict");
+  put("mapred.input.dir.recursive", "true");
+  put("hive.metastore.disallow.incompatible.col.type.changes", "false");
+  put("hive.strict.managed.tables", "true");
+}};
+replicaConfigs.putAll(overrides);
+
+HashMap<String, String> primaryConfigs = new HashMap<String, String>() {{
+  put("hive.metastore.client.capability.check", "false");
+  put("hive.repl.bootstrap.dump.open.txn.timeout", "1s");
+  put("hive.exec.dynamic.partition.mode", "nonstrict");
+  put("hive.strict.checks.bucketing", "false");
+  put("hive.mapred.mode", "nonstrict");
+  put("mapred.input.dir.recursive", "true");
+  put("hive.metastore.disallow.incompatible.col.type.changes", "false");
+  put("hive.support.concurrency", "false");
+  put("hive.txn.manager", 
"org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager");
+  put("hive.strict.managed.tables", "false");
+}};
+primaryConfigs.putAll(overrides);
+
+internalBeforeClassSetup(primaryConfigs, replicaConfigs,
 
 Review comment:
   As long as the writeId associated with the stats is valid according to the 
given query's valid writeId list, the stats will be used if they are marked 
valid. Usually, when a writeId advances, the stats are marked invalid if the 
operation advancing the writeId renders them inaccurate. In the case of 
migration, even though the writeId advances, the operation may not necessarily 
render the stats inaccurate. In that case, even if the writeId associated with 
the stats is behind the latest allocated one, the stats remain useful as long 
as (1) the writeId appears valid according to the query's writeId list and 
(2) the stats themselves are marked valid.
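
The two conditions above can be sketched as a hypothetical predicate (the names below are illustrative, not the actual metastore API):

```java
import java.util.Set;

public class StatsValiditySketch {
    // Stats are usable iff (1) their writeId is valid for the querying
    // transaction's valid-writeId list and (2) the stats themselves are
    // still marked accurate.
    public static boolean statsUsable(long statsWriteId,
                                      Set<Long> queryValidWriteIds,
                                      boolean statsMarkedValid) {
        return queryValidWriteIds.contains(statsWriteId) && statsMarkedValid;
    }

    public static void main(String[] args) {
        Set<Long> valid = Set.of(1L, 2L, 3L);
        // Migration case: writeId 2 is behind the latest allocated (3) but is
        // still in the valid list, so the stats are used when marked valid.
        System.out.println(statsUsable(2L, valid, true));  // true
        System.out.println(statsUsable(2L, valid, false)); // false
    }
}
```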
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Work logged] (HIVE-21109) Stats replication for ACID tables.

2019-04-09 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21109?focusedWorklogId=225063&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-225063
 ]

ASF GitHub Bot logged work on HIVE-21109:
-

Author: ASF GitHub Bot
Created on: 09/Apr/19 15:47
Start Date: 09/Apr/19 15:47
Worklog Time Spent: 10m 
  Work Description: ashutosh-bapat commented on pull request #579: 
HIVE-21109 : Support stats replication for ACID tables.
URL: https://github.com/apache/hive/pull/579#discussion_r273560079
 
 

 ##
 File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestStatsReplicationScenarios.java
 ##
 @@ -216,16 +233,23 @@ private void 
verifyNoPartitionStatsReplicationForMetadataOnly(String tableName)
 String ndTableName = "ndTable";
 // Partitioned table without data during bootstrap and hence no stats.
 String ndPartTableName = "ndPTable";
+String tblCreateExtra = "";
+
+if (useAcidTables) {
 
 Review comment:
   Done.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 225063)
Time Spent: 12h 20m  (was: 12h 10m)

> Stats replication for ACID tables.
> --
>
> Key: HIVE-21109
> URL: https://issues.apache.org/jira/browse/HIVE-21109
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21109.01.patch, HIVE-21109.02.patch, 
> HIVE-21109.03.patch, HIVE-21109.04.patch, HIVE-21109.05.patch, 
> HIVE-21109.06.patch, HIVE-21109.07.patch, HIVE-21109.08.patch, 
> HIVE-21109.09.patch, HIVE-21109.09.patch
>
>  Time Spent: 12h 20m
>  Remaining Estimate: 0h
>
> Transactional tables require a writeID associated with the stats update. This 
> writeId needs to be in sync with the writeId on the source and hence needs to 
> be replicated from the source.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21591) Using triggers in non-LLAP mode should not require wm queue

2019-04-09 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813405#comment-16813405
 ] 

Hive QA commented on HIVE-21591:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12965245/HIVE-21591.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 15896 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16898/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16898/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16898/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12965245 - PreCommit-HIVE-Build

> Using triggers in non-LLAP mode should not require wm queue
> ---
>
> Key: HIVE-21591
> URL: https://issues.apache.org/jira/browse/HIVE-21591
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-21591.1.patch
>
>
> Resource plan triggers are supported in non-LLAP (tez container) mode. But 
> fetching of resource plan happens only when 
> hive.server2.tez.interactive.queue is set. For tez container mode, only 
> triggers are applicable, so this queue dependency can be removed. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21591) Using triggers in non-LLAP mode should not require wm queue

2019-04-09 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813354#comment-16813354
 ] 

Hive QA commented on HIVE-21591:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
18s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
34s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
36s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
53s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
54s{color} | {color:blue} ql in master has 2258 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
40s{color} | {color:blue} service in master has 48 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
21s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
28s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
32s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
44s{color} | {color:red} ql: The patch generated 1 new + 76 unchanged - 0 fixed 
= 77 total (was 76) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
14s{color} | {color:red} service: The patch generated 4 new + 35 unchanged - 0 
fixed = 39 total (was 35) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
17s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 30m 25s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-16898/dev-support/hive-personality.sh
 |
| git revision | master / 7072e0b |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16898/yetus/diff-checkstyle-ql.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16898/yetus/diff-checkstyle-service.txt
 |
| modules | C: ql service U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16898/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Using triggers in non-LLAP mode should not require wm queue
> ---
>
> Key: HIVE-21591
> URL: https://issues.apache.org/jira/browse/HIVE-21591
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-21591.1.patch
>
>
> Resource plan triggers are supported in non-LLAP (tez container) mode. But 
> fetching of resource plan happens only when 
> hive.server2.tez.interactive.queue is set. For tez container mode, only 
> triggers are applicable, so this queue dependency can be removed. 



--
This message was sent by Atlassian JIRA

[jira] [Work logged] (HIVE-20968) Support conversion of managed to external where location set was not owned by hive

2019-04-09 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20968?focusedWorklogId=224920&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-224920
 ]

ASF GitHub Bot logged work on HIVE-20968:
-

Author: ASF GitHub Bot
Created on: 09/Apr/19 11:26
Start Date: 09/Apr/19 11:26
Worklog Time Spent: 10m 
  Work Description: maheshk114 commented on pull request #588: HIVE-20968 : 
Support conversion of managed to external where location set was not owned by 
hive
URL: https://github.com/apache/hive/pull/588#discussion_r273443677
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSpec.java
 ##
 @@ -437,4 +448,30 @@ public void setNeedDupCopyCheck(boolean 
isFirstIncPending) {
 // Check HIVE-21197 for more detail.
 this.needDupCopyCheck = isFirstIncPending;
   }
+
+  public boolean isPathOwnedByHive() {
+return isPathOwnedByHive;
+  }
+
+  public static boolean isPathOwnedByHive(HiveConf conf, String user) {
 
 Review comment:
   This method is used at the target during truncate; truncate does not use the 
replication scope read from disk.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 224920)
Time Spent: 2h  (was: 1h 50m)

> Support conversion of managed to external where location set was not owned by 
> hive
> --
>
> Key: HIVE-20968
> URL: https://issues.apache.org/jira/browse/HIVE-20968
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: DR, pull-request-available
> Attachments: HIVE-20968.01.patch
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> As per migration rule, if a location is outside the default managed table 
> directory and the location is not owned by "hive" user, then it should be 
> converted to external table after upgrade.
>  So, the same rule is applicable for Hive replication, where the data of the 
> source managed table resides outside the default warehouse directory and 
> is not owned by the "hive" user.
>  During this conversion, the path should be preserved in target as well so 
> that failover works seamlessly.
>  # If the table location is outside the hive warehouse and is not owned by hive, 
> then the table at target will be converted to an external table. But the 
> location cannot be retained; the table will instead be placed relative to the 
> hive external warehouse directory. 
>  #  As the table is not an external table at source, only those data which 
> are added using events will be replicated.
>  # The ownership of the location will be stored in the create table event and 
> will be used to compare it with strict.managed.tables.migration.owner to 
> decide if the flag in replication scope can be set. This flag is used to 
> convert the managed table to external table at target.
> Some of the scenarios need to be blocked if the database is set for 
> replication from a cluster with non strict managed table setting to strict 
> managed table.
> 1. Block alter table / partition set location for database with source of 
> replication set for managed tables
> 2. If the user manually changes the ownership of the location, hive replication 
> may go into a non-recoverable state.
> 3. Block add partition if the location ownership is different than table 
> location for managed tables.
> 4. User needs to set strict.managed.tables.migration.owner along with dump 
> command (default to hive user). This value will be used during dump to decide 
> the ownership which will be used during load to decide the table type. The 
> location owner information can be stored in the events during create table. 
> The flag can be stored in replication spec. Check other such configs used in 
> upgrade tool.
> 5. Replication flow also set additional parameter 
> "external.table.purge"="true" ..only for migration to external table
> 6. Block conversion from managed to external and vice versa. Pass some flag 
> in upgrade flow to allow this conversion during upgrade flow.
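
The conversion rule described above boils down to a small decision function: a managed table's location outside the warehouse that is not owned by the configured owner becomes external on the target. A minimal sketch of that decision (the class, method, and parameter names are illustrative, not Hive's actual API):

```java
// Sketch of the migration rule for deciding the table type at the target.
// Illustrative only: the real logic lives in Hive's replication/upgrade code.
class MigrationRule {
    enum TableType { MANAGED, EXTERNAL }

    static TableType targetTableType(String pathOwner, String configuredOwner,
                                     boolean insideWarehouse) {
        // strict.managed.tables.migration.owner defaults to "hive".
        String owner = (configuredOwner == null) ? "hive" : configuredOwner;
        // Ownership is only decisive for locations outside the warehouse.
        if (!insideWarehouse && pathOwner != null && !owner.equals(pathOwner)) {
            return TableType.EXTERNAL;
        }
        return TableType.MANAGED;
    }
}
```

For example, a table at a path owned by "alice" outside the warehouse would be converted, while the same path owned by "hive" would stay managed.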



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (HIVE-20968) Support conversion of managed to external where location set was not owned by hive

2019-04-09 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20968?focusedWorklogId=224921=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-224921
 ]

ASF GitHub Bot logged work on HIVE-20968:
-

Author: ASF GitHub Bot
Created on: 09/Apr/19 11:26
Start Date: 09/Apr/19 11:26
Worklog Time Spent: 10m 
  Work Description: maheshk114 commented on pull request #588: HIVE-20968 : 
Support conversion of managed to external where location set was not owned by 
hive
URL: https://github.com/apache/hive/pull/588#discussion_r273443856
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSpec.java
 ##
 @@ -437,4 +448,30 @@ public void setNeedDupCopyCheck(boolean isFirstIncPending) {
     // Check HIVE-21197 for more detail.
     this.needDupCopyCheck = isFirstIncPending;
   }
+
+  public boolean isPathOwnedByHive() {
+    return isPathOwnedByHive;
+  }
+
+  public static boolean isPathOwnedByHive(HiveConf conf, String user) {
+    String ownerName = conf.get(HiveConf.ConfVars.STRICT_MANAGED_TABLES_MIGRARTION_OWNER.varname, "hive");
+    return (user == null || ownerName.equals(user));
+  }
+
+  public void setPathOwnedByHive(boolean pathOwnedByHive) {
+    isPathOwnedByHive = pathOwnedByHive;
+  }
+
+  public void setPathOwnedByHive(HiveConf conf, String user) {
 
 Review comment:
   As the member is part of the replication scope, it's better to keep it in 
the replication scope only.
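
For readers following the wrapped diff, the static check above reduces to a small pure function: the path counts as owned by hive when the observed user matches the configured migration owner (defaulting to "hive"), or when no user could be determined. This is an illustrative restatement, not the actual Hive class:

```java
// Restatement of the static isPathOwnedByHive check from the patch.
// Illustrative sketch; the real method reads the owner from HiveConf.
class OwnerCheck {
    static boolean isPathOwnedByHive(String configuredOwner, String observedUser) {
        // strict.managed.tables.migration.owner defaults to "hive".
        String ownerName = (configuredOwner != null) ? configuredOwner : "hive";
        // An unknown user is treated as owned-by-hive (no conversion forced).
        return observedUser == null || ownerName.equals(observedUser);
    }
}
```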
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 224921)
Time Spent: 2h 10m  (was: 2h)

> Support conversion of managed to external where location set was not owned by 
> hive
> --
>
> Key: HIVE-20968
> URL: https://issues.apache.org/jira/browse/HIVE-20968
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: DR, pull-request-available
> Attachments: HIVE-20968.01.patch
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Work logged] (HIVE-20968) Support conversion of managed to external where location set was not owned by hive

2019-04-09 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20968?focusedWorklogId=224919=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-224919
 ]

ASF GitHub Bot logged work on HIVE-20968:
-

Author: ASF GitHub Bot
Created on: 09/Apr/19 11:25
Start Date: 09/Apr/19 11:25
Worklog Time Spent: 10m 
  Work Description: maheshk114 commented on pull request #588: HIVE-20968 : 
Support conversion of managed to external where location set was not owned by 
hive
URL: https://github.com/apache/hive/pull/588#discussion_r273443465
 
 

 ##
 File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/events/AlterTableEvent.java
 ##
 @@ -40,6 +42,11 @@ public AlterTableEvent (Table oldTable, Table newTable, boolean isTruncateOp, bo
     this.newTable = newTable;
     this.isTruncateOp = isTruncateOp;
     this.writeId = writeId;
+    if (newTable.getSd() == null) {
 
 Review comment:
   Not sure; if the location is not changed, then it makes sense to keep it 
null, to avoid writing extra data to the metastore DB.
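
The idea in the comment, keep the owner field null when the location did not change so nothing extra is persisted, can be sketched like this (class and method names are illustrative, not the actual AlterTableEvent API):

```java
// Sketch: only record the location owner in an alter event when the
// location actually changed; otherwise store null to keep the event small.
// Illustrative only; the real event carries Table/StorageDescriptor objects.
class AlterEventSketch {
    static String ownerToRecord(String oldLocation, String newLocation,
                                String newLocationOwner) {
        if (newLocation == null || newLocation.equals(oldLocation)) {
            return null; // location unchanged: persist nothing extra
        }
        return newLocationOwner;
    }
}
```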
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 224919)
Time Spent: 1h 50m  (was: 1h 40m)

> Support conversion of managed to external where location set was not owned by 
> hive
> --
>
> Key: HIVE-20968
> URL: https://issues.apache.org/jira/browse/HIVE-20968
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: DR, pull-request-available
> Attachments: HIVE-20968.01.patch
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (HIVE-20968) Support conversion of managed to external where location set was not owned by hive

2019-04-09 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20968?focusedWorklogId=224918=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-224918
 ]

ASF GitHub Bot logged work on HIVE-20968:
-

Author: ASF GitHub Bot
Created on: 09/Apr/19 11:24
Start Date: 09/Apr/19 11:24
Worklog Time Spent: 10m 
  Work Description: maheshk114 commented on pull request #588: HIVE-20968 : 
Support conversion of managed to external where location set was not owned by 
hive
URL: https://github.com/apache/hive/pull/588#discussion_r273443108
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/TableExport.java
 ##
 @@ -84,6 +84,34 @@ public TableExport(Paths paths, TableSpec tableSpec, ReplicationSpec replication
     this.conf = conf;
     this.paths = paths;
     this.mmCtx = mmCtx;
+    this.replicationSpec.setEventBasedOwnershipCheck(false);
+    setPathOwnedByHive(this.replicationSpec, tableSpec.tableHandle.getDataLocation(), db.getConf());
+  }
+
+  public static void setPathOwnedByHive(ReplicationSpec replicationSpec, Path location, HiveConf conf) {
+    // For incremental load path, this flag should be set using the owner name in the event.
+    if (replicationSpec == null || !replicationSpec.isInReplicationScope() ||
+        replicationSpec.isEventBasedOwnershipCheck()) {
+      return;
+    }
+
+    // If the table path or path of any of the partitions is not owned by hive,
+    // then table location not owned by hive for whole table.
+    if (!replicationSpec.isPathOwnedByHive()) {
+      logger.info("Path is not owned by hive user for table or some partition. No need to check further.");
+      return;
+    }
+
+    try {
+      FileStatus fileStatus = location.getFileSystem(conf).getFileStatus(location);
+      String hiveOwner = conf.get(HiveConf.ConfVars.STRICT_MANAGED_TABLES_MIGRARTION_OWNER.varname, "hive");
+      replicationSpec.setPathOwnedByHive(hiveOwner.equals(fileStatus.getOwner()));
 
 Review comment:
   It is used the same way in the migration tool, so I kept it the same.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 224918)
Time Spent: 1h 40m  (was: 1.5h)

> Support conversion of managed to external where location set was not owned by 
> hive
> --
>
> Key: HIVE-20968
> URL: https://issues.apache.org/jira/browse/HIVE-20968
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: DR, pull-request-available
> Attachments: HIVE-20968.01.patch
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>

[jira] [Work logged] (HIVE-20968) Support conversion of managed to external where location set was not owned by hive

2019-04-09 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20968?focusedWorklogId=224917=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-224917
 ]

ASF GitHub Bot logged work on HIVE-20968:
-

Author: ASF GitHub Bot
Created on: 09/Apr/19 11:23
Start Date: 09/Apr/19 11:23
Worklog Time Spent: 10m 
  Work Description: maheshk114 commented on pull request #588: HIVE-20968 : 
Support conversion of managed to external where location set was not owned by 
hive
URL: https://github.com/apache/hive/pull/588#discussion_r273442880
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/parse/EximUtil.java
 ##
 @@ -300,10 +300,10 @@ public static void createExportDump(FileSystem fs, Path metadataPath, Table tabl
     }
 
     try (JsonWriter writer = new JsonWriter(fs, metadataPath)) {
+      new TableSerializer(tableHandle, partitions, hiveConf).writeTo(writer, replicationSpec);
 
 Review comment:
   The replication spec is modified in the serializer to avoid iterating over 
the partitions again, so the replication spec dump is moved below the 
serialization call.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 224917)
Time Spent: 1.5h  (was: 1h 20m)

> Support conversion of managed to external where location set was not owned by 
> hive
> --
>
> Key: HIVE-20968
> URL: https://issues.apache.org/jira/browse/HIVE-20968
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: DR, pull-request-available
> Attachments: HIVE-20968.01.patch
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21572) HiveRemoveSqCountCheck rule could be enhanced to capture more patterns

2019-04-09 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16813261#comment-16813261
 ] 

Hive QA commented on HIVE-21572:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12965226/HIVE-21572.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 19 failed/errored test(s), 15895 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bool_unknown] 
(batchId=39)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[subquery_scalar_multi_rows]
 (batchId=101)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_scalar] 
(batchId=129)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query44] 
(batchId=277)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query54] 
(batchId=277)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query58] 
(batchId=277)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[cbo_query54] 
(batchId=275)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[cbo_query58] 
(batchId=275)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query44] 
(batchId=275)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query54] 
(batchId=275)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query58] 
(batchId=275)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[cbo_query54]
 (batchId=275)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[cbo_query58]
 (batchId=275)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[query54]
 (batchId=275)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[query58]
 (batchId=275)
org.apache.hadoop.hive.ql.TestTxnCommandsWithSplitUpdateAndVectorization.testBadOnClause
 (batchId=310)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerCustomCreatedDynamicPartitions
 (batchId=263)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerCustomCreatedDynamicPartitionsUnionAll
 (batchId=263)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerHighShuffleBytes
 (batchId=263)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16897/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16897/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16897/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 19 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12965226 - PreCommit-HIVE-Build

> HiveRemoveSqCountCheck rule could be enhanced to capture more patterns 
> ---
>
> Key: HIVE-21572
> URL: https://issues.apache.org/jira/browse/HIVE-21572
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-21572.1.patch, HIVE-21572.2.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21593) Break up DDLTask - extract Privilege related operations

2019-04-09 Thread Miklos Gergely (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Gergely updated HIVE-21593:
--
Status: Patch Available  (was: Open)

> Break up DDLTask - extract Privilege related operations
> ---
>
> Key: HIVE-21593
> URL: https://issues.apache.org/jira/browse/HIVE-21593
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: refactor-ddl
> Fix For: 4.0.0
>
> Attachments: HIVE-21593.01.patch
>
>
> DDLTask is a huge class, more than 5000 lines long. The related DDLWork is 
> also a huge class, which has a field for each DDL operation it supports. The 
> goal is to refactor these in order to have everything cut into more 
> handleable classes under the package  org.apache.hadoop.hive.ql.exec.ddl:
>  * have a separate class for each operation
>  * have a package for each operation group (database ddl, table ddl, etc), so 
> the amount of classes under a package is more manageable
>  * make all the requests (DDLDesc subclasses) immutable
>  * DDLTask should be agnostic to the actual operations
>  * right now let's ignore the issue of having some operations handled by 
> DDLTask which are not actual DDL operations (lock, unlock, desc...)
> In the interim time when there are two DDLTask and DDLWork classes in the 
> code base the new ones in the new package are called DDLTask2 and DDLWork2 
> thus avoiding the usage of fully qualified class names where both the old and 
> the new classes are in use.
> Step #4: extract all the privilege related operations from the old DDLTask, 
> and move them under the new package.
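
The "separate class for each operation, with a DDLTask agnostic to the actual operations" design can be sketched as a registry dispatching on immutable desc objects. The names below are illustrative, not Hive's actual classes:

```java
// Sketch of the refactoring target: one small operation class per DDL
// operation, dispatched by a task that knows nothing about the operations.
// Illustrative only; Hive's real design differs in details.
import java.util.Map;
import java.util.function.Function;

interface DdlOperation { int execute(); }

class ShowGrantDesc { } // an immutable request object in the real design

class ShowGrantOperation implements DdlOperation {
    private final ShowGrantDesc desc;
    ShowGrantOperation(ShowGrantDesc desc) { this.desc = desc; }
    public int execute() { /* privilege-specific logic would go here */ return 0; }
}

class DdlTask2 {
    // The task only maps desc types to operation factories.
    static final Map<Class<?>, Function<Object, DdlOperation>> REGISTRY =
        Map.of(ShowGrantDesc.class, d -> new ShowGrantOperation((ShowGrantDesc) d));

    static int run(Object desc) {
        return REGISTRY.get(desc.getClass()).apply(desc).execute();
    }
}
```

Each operation group (database, table, privilege, ...) then becomes a package of such small classes instead of branches in one 5000-line method.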



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21593) Break up DDLTask - extract Privilege related operations

2019-04-09 Thread Miklos Gergely (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Gergely updated HIVE-21593:
--
Attachment: HIVE-21593.01.patch

> Break up DDLTask - extract Privilege related operations
> ---
>
> Key: HIVE-21593
> URL: https://issues.apache.org/jira/browse/HIVE-21593
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: refactor-ddl
> Fix For: 4.0.0
>
> Attachments: HIVE-21593.01.patch
>
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17938) Enable parallel query compilation in HS2

2019-04-09 Thread Nitin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16813206#comment-16813206
 ] 

Nitin commented on HIVE-17938:
--

[~thejas] Do we have any update on this?

> Enable parallel query compilation in HS2
> 
>
> Key: HIVE-17938
> URL: https://issues.apache.org/jira/browse/HIVE-17938
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
>Priority: Major
> Attachments: HIVE-17938.1.patch, HIVE-17938.2.patch, 
> HIVE-17938.3.patch, HIVE-17938.3.patch
>
>
> This (hive.driver.parallel.compilation) has been enabled in many production 
> environments for a while (Hortonworks customers), and it has been stable.
> Just realized that this is not yet enabled in apache by default. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-21593) Break up DDLTask - extract Privilege related operations

2019-04-09 Thread Miklos Gergely (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Gergely reassigned HIVE-21593:
-


> Break up DDLTask - extract Privilege related operations
> ---
>
> Key: HIVE-21593
> URL: https://issues.apache.org/jira/browse/HIVE-21593
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: refactor-ddl
> Fix For: 4.0.0
>
>
> DDLTask is a huge class, more than 5000 lines long. The related DDLWork is 
> also a huge class, which has a field for each DDL operation it supports. The 
> goal is to refactor these in order to have everything cut into more 
> handleable classes under the package  org.apache.hadoop.hive.ql.exec.ddl:
>  * have a separate class for each operation
>  * have a package for each operation group (database ddl, table ddl, etc), so 
> the amount of classes under a package is more manageable
>  * make all the requests (DDLDesc subclasses) immutable
>  * DDLTask should be agnostic to the actual operations
>  * right now let's ignore the issue of having some operations handled by 
> DDLTask which are not actual DDL operations (lock, unlock, desc...)
> In the interim time when there are two DDLTask and DDLWork classes in the 
> code base the new ones in the new package are called DDLTask2 and DDLWork2 
> thus avoiding the usage of fully qualified class names where both the old and 
> the new classes are in use.
> Step #4: extract all the function related operations from the old DDLTask, 
> and move them under the new package.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21593) Break up DDLTask - extract Privilege related operations

2019-04-09 Thread Miklos Gergely (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Gergely updated HIVE-21593:
--
Description: 
DDLTask is a huge class, more than 5000 lines long. The related DDLWork is also 
a huge class, which has a field for each DDL operation it supports. The goal is 
to refactor these in order to have everything cut into more handleable classes 
under the package  org.apache.hadoop.hive.ql.exec.ddl:
 * have a separate class for each operation
 * have a package for each operation group (database ddl, table ddl, etc), so 
the amount of classes under a package is more manageable
 * make all the requests (DDLDesc subclasses) immutable
 * DDLTask should be agnostic to the actual operations
 * right now let's ignore the issue of having some operations handled by 
DDLTask which are not actual DDL operations (lock, unlock, desc...)

In the interim time when there are two DDLTask and DDLWork classes in the code 
base the new ones in the new package are called DDLTask2 and DDLWork2 thus 
avoiding the usage of fully qualified class names where both the old and the 
new classes are in use.

Step #4: extract all the privilege related operations from the old DDLTask, and 
move them under the new package.

  was:
DDLTask is a huge class, more than 5000 lines long. The related DDLWork is also 
a huge class, which has a field for each DDL operation it supports. The goal is 
to refactor these in order to have everything cut into more handleable classes 
under the package  org.apache.hadoop.hive.ql.exec.ddl:
 * have a separate class for each operation
 * have a package for each operation group (database ddl, table ddl, etc), so 
the amount of classes under a package is more manageable
 * make all the requests (DDLDesc subclasses) immutable
 * DDLTask should be agnostic to the actual operations
 * right now let's ignore the issue of having some operations handled by 
DDLTask which are not actual DDL operations (lock, unlock, desc...)

In the interim time when there are two DDLTask and DDLWork classes in the code 
base the new ones in the new package are called DDLTask2 and DDLWork2 thus 
avoiding the usage of fully qualified class names where both the old and the 
new classes are in use.

Step #4: extract all the function related operations from the old DDLTask, and 
move them under the new package.


> Break up DDLTask - extract Privilege related operations
> ---
>
> Key: HIVE-21593
> URL: https://issues.apache.org/jira/browse/HIVE-21593
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: refactor-ddl
> Fix For: 4.0.0
>
>
> DDLTask is a huge class, more than 5000 lines long. The related DDLWork is 
> also a huge class, which has a field for each DDL operation it supports. The 
> goal is to refactor these so that everything is cut into more manageable 
> classes under the package org.apache.hadoop.hive.ql.exec.ddl:
>  * have a separate class for each operation
>  * have a package for each operation group (database ddl, table ddl, etc), so 
> the number of classes under a package is more manageable
>  * make all the requests (DDLDesc subclasses) immutable
>  * DDLTask should be agnostic to the actual operations
>  * for now, let's ignore the issue of having some operations handled by 
> DDLTask which are not actual DDL operations (lock, unlock, desc...)
> In the interim, while there are two DDLTask and DDLWork classes in the 
> code base, the new ones in the new package are called DDLTask2 and DDLWork2, 
> thus avoiding fully qualified class names where both the old and 
> the new classes are in use.
> Step #4: extract all the privilege related operations from the old DDLTask, 
> and move them under the new package.
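The target design described above, an operation-agnostic task that dispatches to one small class per operation driven by immutable request objects, might be sketched like this (hypothetical names and a plain registry map, not the actual Hive classes):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class DdlDispatch {
    // Hypothetical immutable request, standing in for a DDLDesc subclass.
    static final class GrantDesc {
        final String principal;
        final String privilege;
        GrantDesc(String principal, String privilege) {
            this.principal = principal;
            this.privilege = privilege;
        }
    }

    // Registry mapping each request type to its handler; the dispatching
    // task itself stays agnostic to the concrete operations.
    static final Map<Class<?>, Function<Object, Integer>> HANDLERS = new HashMap<>();

    static {
        HANDLERS.put(GrantDesc.class, desc -> {
            GrantDesc g = (GrantDesc) desc;
            System.out.println("granting " + g.privilege + " to " + g.principal);
            return 0; // 0 = success, mirroring Task return codes
        });
    }

    static int execute(Object desc) {
        Function<Object, Integer> handler = HANDLERS.get(desc.getClass());
        if (handler == null) {
            throw new IllegalArgumentException("no handler for " + desc.getClass());
        }
        return handler.apply(desc);
    }

    public static void main(String[] args) {
        int rc = execute(new GrantDesc("alice", "SELECT"));
        System.out.println("return code: " + rc);
    }
}
```

Adding a new operation then means registering one more request class and handler, rather than growing a monolithic switch inside the task.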



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21572) HiveRemoveSqCountCheck rule could be enhanced to capture more patterns

2019-04-09 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813205#comment-16813205
 ] 

Hive QA commented on HIVE-21572:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
25s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
7s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
42s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 
10s{color} | {color:blue} ql in master has 2258 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
0s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
6s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
38s{color} | {color:red} ql: The patch generated 1 new + 1 unchanged - 0 fixed 
= 2 total (was 1) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 24m 39s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-16897/dev-support/hive-personality.sh
 |
| git revision | master / cfe90d5 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16897/yetus/diff-checkstyle-ql.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16897/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> HiveRemoveSqCountCheck rule could be enhanced to capture more patterns 
> ----------------------------------------------------------------------
>
> Key: HIVE-21572
> URL: https://issues.apache.org/jira/browse/HIVE-21572
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-21572.1.patch, HIVE-21572.2.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (HIVE-20968) Support conversion of managed to external where location set was not owned by hive

2019-04-09 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20968?focusedWorklogId=224871&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-224871
 ]

ASF GitHub Bot logged work on HIVE-20968:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 09/Apr/19 09:53
Start Date: 09/Apr/19 09:53
Worklog Time Spent: 10m 
  Work Description: sankarh commented on pull request #588: HIVE-20968 : 
Support conversion of managed to external where location set was not owned by 
hive
URL: https://github.com/apache/hive/pull/588#discussion_r273406301
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/TableExport.java
 ##
 @@ -84,6 +84,34 @@ public TableExport(Paths paths, TableSpec tableSpec, 
ReplicationSpec replication
 this.conf = conf;
 this.paths = paths;
 this.mmCtx = mmCtx;
+this.replicationSpec.setEventBasedOwnershipCheck(false);
+setPathOwnedByHive(this.replicationSpec, 
tableSpec.tableHandle.getDataLocation(), db.getConf());
+  }
+
+  public static void setPathOwnedByHive(ReplicationSpec replicationSpec, Path 
location, HiveConf conf) {
+// For incremental load path, this flag should be set using the owner name 
in the event.
+if (replicationSpec == null || !replicationSpec.isInReplicationScope() ||
+replicationSpec.isEventBasedOwnershipCheck()) {
+  return;
+}
+
+// If the table path or path of any of the partitions is not owned by hive,
+// then table location not owned by hive for whole table.
+if (!replicationSpec.isPathOwnedByHive()) {
+  logger.info("Path is not owned by hive user for table or some partition. 
No need to check further.");
+  return;
+}
+
+try {
+  FileStatus fileStatus = 
location.getFileSystem(conf).getFileStatus(location);
+  String hiveOwner = 
conf.get(HiveConf.ConfVars.STRICT_MANAGED_TABLES_MIGRARTION_OWNER.varname, 
"hive");
+  
replicationSpec.setPathOwnedByHive(hiveOwner.equals(fileStatus.getOwner()));
 
 Review comment:
   Is user name case sensitive? If not, we need to use equalsIgnoreCase.
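The point above can be illustrated with a minimal standalone sketch (hypothetical class and method names, not the patch code), contrasting equals with equalsIgnoreCase for the owner comparison:

```java
public class OwnerCheck {
    // Hypothetical helper: compares the configured hive owner with the
    // owner reported for the path. equalsIgnoreCase tolerates case
    // differences in user names, which plain equals would reject.
    static boolean sameOwner(String configuredOwner, String pathOwner) {
        return configuredOwner != null && configuredOwner.equalsIgnoreCase(pathOwner);
    }

    public static void main(String[] args) {
        System.out.println("hive".equals("Hive"));     // strict: false
        System.out.println(sameOwner("hive", "Hive")); // relaxed: true
    }
}
```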
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 224871)
Time Spent: 50m  (was: 40m)

> Support conversion of managed to external where location set was not owned by 
> hive
> ----------------------------------------------------------------------------
>
> Key: HIVE-20968
> URL: https://issues.apache.org/jira/browse/HIVE-20968
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: DR, pull-request-available
> Attachments: HIVE-20968.01.patch
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> As per the migration rule, if a location is outside the default managed table 
> directory and the location is not owned by the "hive" user, then the table 
> should be converted to an external table after upgrade.
>  So the same rule is applicable for Hive replication, where the data of the 
> source managed table resides outside the default warehouse directory and is 
> not owned by the "hive" user.
>  During this conversion, the path should be preserved on the target as well, 
> so that failover works seamlessly.
>  # If the table location is outside the hive warehouse and is not owned by 
> hive, then the table at the target will be converted to an external table. 
> But the location cannot be retained; it will be recreated relative to the 
> hive external warehouse directory.
>  # As the table is not an external table at the source, only data added 
> using events will be replicated.
>  # The ownership of the location will be stored in the create table event 
> and compared with strict.managed.tables.migration.owner to decide whether 
> the flag in the replication scope can be set. This flag is used to convert 
> the managed table to an external table at the target.
> Some of the scenarios need to be blocked if the database is set up for 
> replication from a cluster with a non-strict managed table setting to one 
> with strict managed tables:
> 1. Block alter table / partition set location for a database that is a 
> source of replication, for managed tables.
> 2. If the user manually changes the ownership of the location, hive 
> replication may go into a non-recoverable state.
> 3. Block add partition if the location ownership differs from the table 
> location's ownership, for managed tables.
> 4. The user needs to set strict.managed.tables.migration.owner along with 
> the dump command (defaults to the hive user). This value is used during dump 
> to decide the ownership, which is used during load to decide the table type. 
> The location owner information can be stored in the events during create 
> table. The flag can be stored in the replication spec. Check other such 
> configs used in the upgrade tool.
> 5. The replication flow also sets the additional parameter 
> "external.table.purge"="true", only for migration to external tables.
> 6. Block conversion from managed to external and vice versa. Pass some flag 
> in the upgrade flow to allow this conversion during the upgrade flow.

[jira] [Work logged] (HIVE-20968) Support conversion of managed to external where location set was not owned by hive

2019-04-09 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20968?focusedWorklogId=224875&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-224875
 ]

ASF GitHub Bot logged work on HIVE-20968:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 09/Apr/19 09:53
Start Date: 09/Apr/19 09:53
Worklog Time Spent: 10m 
  Work Description: sankarh commented on pull request #588: HIVE-20968 : 
Support conversion of managed to external where location set was not owned by 
hive
URL: https://github.com/apache/hive/pull/588#discussion_r273411629
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSpec.java
 ##
 @@ -437,4 +448,30 @@ public void setNeedDupCopyCheck(boolean 
isFirstIncPending) {
 // Check HIVE-21197 for more detail.
 this.needDupCopyCheck = isFirstIncPending;
   }
+
+  public boolean isPathOwnedByHive() {
+return isPathOwnedByHive;
+  }
+
+  public static boolean isPathOwnedByHive(HiveConf conf, String user) {
+String ownerName = 
conf.get(HiveConf.ConfVars.STRICT_MANAGED_TABLES_MIGRARTION_OWNER.varname, 
"hive");
+return  (user == null || ownerName.equals(user));
+  }
+
+  public void setPathOwnedByHive(boolean pathOwnedByHive) {
+isPathOwnedByHive = pathOwnedByHive;
+  }
+
+  public void setPathOwnedByHive(HiveConf conf, String user) {
 
 Review comment:
   This method can also be moved to ReplUtils, using the 
setPathOwnedByHive(boolean) method to set the flag in ReplicationSpec, just 
to decouple the logic.
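As a standalone sketch of the quoted null-tolerant check (hypothetical class name; the real method reads the owner name from HiveConf with a "hive" default):

```java
public class PathOwner {
    // Mirrors the quoted logic: a null user is treated as owned by hive;
    // otherwise the configured owner name must match exactly.
    static boolean isPathOwnedByHive(String configuredOwner, String user) {
        return user == null || configuredOwner.equals(user);
    }

    public static void main(String[] args) {
        System.out.println(isPathOwnedByHive("hive", null));   // unknown owner defaults to hive
        System.out.println(isPathOwnedByHive("hive", "hive"));
        System.out.println(isPathOwnedByHive("hive", "bob"));
    }
}
```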
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 224875)
Time Spent: 1h 10m  (was: 1h)

> Support conversion of managed to external where location set was not owned by 
> hive
> --
>
> Key: HIVE-20968
> URL: https://issues.apache.org/jira/browse/HIVE-20968
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: DR, pull-request-available
> Attachments: HIVE-20968.01.patch
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> As per the migration rule, if a location is outside the default managed table 
> directory and the location is not owned by the "hive" user, then the table 
> should be converted to an external table after upgrade.
>  So the same rule is applicable for Hive replication, where the data of the 
> source managed table resides outside the default warehouse directory and is 
> not owned by the "hive" user.
>  During this conversion, the path should be preserved on the target as well, 
> so that failover works seamlessly.
>  # If the table location is outside the hive warehouse and is not owned by 
> hive, then the table at the target will be converted to an external table. 
> But the location cannot be retained; it will be recreated relative to the 
> hive external warehouse directory.
>  # As the table is not an external table at the source, only data added 
> using events will be replicated.
>  # The ownership of the location will be stored in the create table event 
> and compared with strict.managed.tables.migration.owner to decide whether 
> the flag in the replication scope can be set. This flag is used to convert 
> the managed table to an external table at the target.
> Some of the scenarios need to be blocked if the database is set up for 
> replication from a cluster with a non-strict managed table setting to one 
> with strict managed tables:
> 1. Block alter table / partition set location for a database that is a 
> source of replication, for managed tables.
> 2. If the user manually changes the ownership of the location, hive 
> replication may go into a non-recoverable state.
> 3. Block add partition if the location ownership differs from the table 
> location's ownership, for managed tables.
> 4. The user needs to set strict.managed.tables.migration.owner along with 
> the dump command (defaults to the hive user). This value is used during dump 
> to decide the ownership, which is used during load to decide the table type. 
> The location owner information can be stored in the events during create 
> table. The flag can be stored in the replication spec. Check other such 
> configs used in the upgrade tool.
> 5. The replication flow also sets the additional parameter 
> "external.table.purge"="true", only for migration to external tables.
> 6. Block conversion from managed to external and vice versa. Pass some flag 
> in the upgrade flow to allow this conversion during the upgrade flow.

[jira] [Work logged] (HIVE-20968) Support conversion of managed to external where location set was not owned by hive

2019-04-09 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20968?focusedWorklogId=224868&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-224868
 ]

ASF GitHub Bot logged work on HIVE-20968:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 09/Apr/19 09:53
Start Date: 09/Apr/19 09:53
Worklog Time Spent: 10m 
  Work Description: sankarh commented on pull request #588: HIVE-20968 : 
Support conversion of managed to external where location set was not owned by 
hive
URL: https://github.com/apache/hive/pull/588#discussion_r273402075
 
 

 ##
 File path: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
 ##
 @@ -515,6 +515,11 @@ private static void populateLlapDaemonVarsSet(Set<String> 
llapDaemonVarsSetLocal
 "This is the base directory on the target/replica warehouse under 
which data for "
 + "external tables is stored. This is relative base path and hence 
prefixed to the source "
 + "external table path on target cluster."),
+
STRICT_MANAGED_TABLES_MIGRARTION_OWNER("strict.managed.tables.migration.owner", 
"hive",
+"This is used by upgrade tool to check if a managed table should be 
converted to external table. If the " +
 
 Review comment:
   "Upgrade tool" is not the right term; it should be the 
HiveStrictManagedMigration tool.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 224868)
Time Spent: 0.5h  (was: 20m)

> Support conversion of managed to external where location set was not owned by 
> hive
> ----------------------------------------------------------------------------
>
> Key: HIVE-20968
> URL: https://issues.apache.org/jira/browse/HIVE-20968
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: DR, pull-request-available
> Attachments: HIVE-20968.01.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> As per the migration rule, if a location is outside the default managed table 
> directory and the location is not owned by the "hive" user, then the table 
> should be converted to an external table after upgrade.
>  So the same rule is applicable for Hive replication, where the data of the 
> source managed table resides outside the default warehouse directory and is 
> not owned by the "hive" user.
>  During this conversion, the path should be preserved on the target as well, 
> so that failover works seamlessly.
>  # If the table location is outside the hive warehouse and is not owned by 
> hive, then the table at the target will be converted to an external table. 
> But the location cannot be retained; it will be recreated relative to the 
> hive external warehouse directory.
>  # As the table is not an external table at the source, only data added 
> using events will be replicated.
>  # The ownership of the location will be stored in the create table event 
> and compared with strict.managed.tables.migration.owner to decide whether 
> the flag in the replication scope can be set. This flag is used to convert 
> the managed table to an external table at the target.
> Some of the scenarios need to be blocked if the database is set up for 
> replication from a cluster with a non-strict managed table setting to one 
> with strict managed tables:
> 1. Block alter table / partition set location for a database that is a 
> source of replication, for managed tables.
> 2. If the user manually changes the ownership of the location, hive 
> replication may go into a non-recoverable state.
> 3. Block add partition if the location ownership differs from the table 
> location's ownership, for managed tables.
> 4. The user needs to set strict.managed.tables.migration.owner along with 
> the dump command (defaults to the hive user). This value is used during dump 
> to decide the ownership, which is used during load to decide the table type. 
> The location owner information can be stored in the events during create 
> table. The flag can be stored in the replication spec. Check other such 
> configs used in the upgrade tool.
> 5. The replication flow also sets the additional parameter 
> "external.table.purge"="true", only for migration to external tables.
> 6. Block conversion from managed to external and vice versa. Pass some flag 
> in the upgrade flow to allow this conversion during the upgrade flow.
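The conversion rule described above can be condensed into a small sketch. This is a hypothetical simplification (made-up names; plain strings stand in for Path, FileStatus, and HiveConf):

```java
public class MigrationRule {
    // Hypothetical sketch: a managed table is migrated to an external table
    // when its location is outside the warehouse directory and the location
    // is not owned by the configured owner (default "hive").
    static boolean convertToExternal(String location, String warehouseDir,
                                     String pathOwner, String configuredOwner) {
        boolean outsideWarehouse = !location.startsWith(warehouseDir);
        boolean ownedByHive = configuredOwner.equalsIgnoreCase(pathOwner);
        return outsideWarehouse && !ownedByHive;
    }

    public static void main(String[] args) {
        // Outside the warehouse, owned by another user: converted.
        System.out.println(convertToExternal("/data/t1", "/warehouse", "bob", "hive"));
        // Outside the warehouse but owned by hive: stays managed.
        System.out.println(convertToExternal("/data/t2", "/warehouse", "hive", "hive"));
    }
}
```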



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (HIVE-20968) Support conversion of managed to external where location set was not owned by hive

2019-04-09 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20968?focusedWorklogId=224870&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-224870
 ]

ASF GitHub Bot logged work on HIVE-20968:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 09/Apr/19 09:53
Start Date: 09/Apr/19 09:53
Worklog Time Spent: 10m 
  Work Description: sankarh commented on pull request #588: HIVE-20968 : 
Support conversion of managed to external where location set was not owned by 
hive
URL: https://github.com/apache/hive/pull/588#discussion_r273404861
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/parse/EximUtil.java
 ##
 @@ -300,10 +300,10 @@ public static void createExportDump(FileSystem fs, Path 
metadataPath, Table tabl
 }
 
 try (JsonWriter writer = new JsonWriter(fs, metadataPath)) {
+  new TableSerializer(tableHandle, partitions, hiveConf).writeTo(writer, 
replicationSpec);
 
 Review comment:
   Why is it moved up?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 224870)
Time Spent: 40m  (was: 0.5h)

> Support conversion of managed to external where location set was not owned by 
> hive
> ----------------------------------------------------------------------------
>
> Key: HIVE-20968
> URL: https://issues.apache.org/jira/browse/HIVE-20968
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: DR, pull-request-available
> Attachments: HIVE-20968.01.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> As per the migration rule, if a location is outside the default managed table 
> directory and the location is not owned by the "hive" user, then the table 
> should be converted to an external table after upgrade.
>  So the same rule is applicable for Hive replication, where the data of the 
> source managed table resides outside the default warehouse directory and is 
> not owned by the "hive" user.
>  During this conversion, the path should be preserved on the target as well, 
> so that failover works seamlessly.
>  # If the table location is outside the hive warehouse and is not owned by 
> hive, then the table at the target will be converted to an external table. 
> But the location cannot be retained; it will be recreated relative to the 
> hive external warehouse directory.
>  # As the table is not an external table at the source, only data added 
> using events will be replicated.
>  # The ownership of the location will be stored in the create table event 
> and compared with strict.managed.tables.migration.owner to decide whether 
> the flag in the replication scope can be set. This flag is used to convert 
> the managed table to an external table at the target.
> Some of the scenarios need to be blocked if the database is set up for 
> replication from a cluster with a non-strict managed table setting to one 
> with strict managed tables:
> 1. Block alter table / partition set location for a database that is a 
> source of replication, for managed tables.
> 2. If the user manually changes the ownership of the location, hive 
> replication may go into a non-recoverable state.
> 3. Block add partition if the location ownership differs from the table 
> location's ownership, for managed tables.
> 4. The user needs to set strict.managed.tables.migration.owner along with 
> the dump command (defaults to the hive user). This value is used during dump 
> to decide the ownership, which is used during load to decide the table type. 
> The location owner information can be stored in the events during create 
> table. The flag can be stored in the replication spec. Check other such 
> configs used in the upgrade tool.
> 5. The replication flow also sets the additional parameter 
> "external.table.purge"="true", only for migration to external tables.
> 6. Block conversion from managed to external and vice versa. Pass some flag 
> in the upgrade flow to allow this conversion during the upgrade flow.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (HIVE-20968) Support conversion of managed to external where location set was not owned by hive

2019-04-09 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20968?focusedWorklogId=224869&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-224869
 ]

ASF GitHub Bot logged work on HIVE-20968:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 09/Apr/19 09:53
Start Date: 09/Apr/19 09:53
Worklog Time Spent: 10m 
  Work Description: sankarh commented on pull request #588: HIVE-20968 : 
Support conversion of managed to external where location set was not owned by 
hive
URL: https://github.com/apache/hive/pull/588#discussion_r273403978
 
 

 ##
 File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationWithTableMigration.java
 ##
 @@ -427,4 +430,67 @@ public void 
testIncrementalLoadMigrationToAcidWithMoveOptimization() throws Thro
 replica.load(replicatedDbName, tuple.dumpLocation, withConfigs);
 verifyLoadExecution(replicatedDbName, tuple.lastReplicationId);
   }
+
+  private boolean isExternal(Table table) {
+return "TRUE".equalsIgnoreCase(table.getParameters().get("EXTERNAL"));
+  }
+
+  @Test
+  public void testMigrarionLocationOwnership() throws Throwable {
+primary.run("use " + primaryDbName)
+.run("create table tbl (fld int)")
 
 Review comment:
   Shall create one of the tables as an ORC table.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 224869)
Time Spent: 40m  (was: 0.5h)

> Support conversion of managed to external where location set was not owned by 
> hive
> ----------------------------------------------------------------------------
>
> Key: HIVE-20968
> URL: https://issues.apache.org/jira/browse/HIVE-20968
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: DR, pull-request-available
> Attachments: HIVE-20968.01.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> As per the migration rule, if a location is outside the default managed table 
> directory and the location is not owned by the "hive" user, then the table 
> should be converted to an external table after upgrade.
>  So the same rule is applicable for Hive replication, where the data of the 
> source managed table resides outside the default warehouse directory and is 
> not owned by the "hive" user.
>  During this conversion, the path should be preserved on the target as well, 
> so that failover works seamlessly.
>  # If the table location is outside the hive warehouse and is not owned by 
> hive, then the table at the target will be converted to an external table. 
> But the location cannot be retained; it will be recreated relative to the 
> hive external warehouse directory.
>  # As the table is not an external table at the source, only data added 
> using events will be replicated.
>  # The ownership of the location will be stored in the create table event 
> and compared with strict.managed.tables.migration.owner to decide whether 
> the flag in the replication scope can be set. This flag is used to convert 
> the managed table to an external table at the target.
> Some of the scenarios need to be blocked if the database is set up for 
> replication from a cluster with a non-strict managed table setting to one 
> with strict managed tables:
> 1. Block alter table / partition set location for a database that is a 
> source of replication, for managed tables.
> 2. If the user manually changes the ownership of the location, hive 
> replication may go into a non-recoverable state.
> 3. Block add partition if the location ownership differs from the table 
> location's ownership, for managed tables.
> 4. The user needs to set strict.managed.tables.migration.owner along with 
> the dump command (defaults to the hive user). This value is used during dump 
> to decide the ownership, which is used during load to decide the table type. 
> The location owner information can be stored in the events during create 
> table. The flag can be stored in the replication spec. Check other such 
> configs used in the upgrade tool.
> 5. The replication flow also sets the additional parameter 
> "external.table.purge"="true", only for migration to external tables.
> 6. Block conversion from managed to external and vice versa. Pass some flag 
> in the upgrade flow to allow this conversion during the upgrade flow.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (HIVE-20968) Support conversion of managed to external where location set was not owned by hive

2019-04-09 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20968?focusedWorklogId=224866&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-224866
 ]

ASF GitHub Bot logged work on HIVE-20968:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 09/Apr/19 09:53
Start Date: 09/Apr/19 09:53
Worklog Time Spent: 10m 
  Work Description: sankarh commented on pull request #588: HIVE-20968 : 
Support conversion of managed to external where location set was not owned by 
hive
URL: https://github.com/apache/hive/pull/588#discussion_r273402372
 
 

 ##
 File path: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
 ##
 @@ -515,6 +515,11 @@ private static void populateLlapDaemonVarsSet(Set<String> 
llapDaemonVarsSetLocal
 "This is the base directory on the target/replica warehouse under 
which data for "
 + "external tables is stored. This is relative base path and hence 
prefixed to the source "
 + "external table path on target cluster."),
+
STRICT_MANAGED_TABLES_MIGRARTION_OWNER("strict.managed.tables.migration.owner", 
"hive",
+"This is used by upgrade tool to check if a managed table should be 
converted to external table. If the " +
+"owner of the table location is not same as this config value, 
then the table is converted to an " +
+"external table. The same is used during replication from a 
cluster with strict managed table set to" +
 
 Review comment:
   Shall quote the config name for strict managed table.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 224866)
Time Spent: 20m  (was: 10m)

> Support conversion of managed to external where location set was not owned by 
> hive
> ----------------------------------------------------------------------------
>
> Key: HIVE-20968
> URL: https://issues.apache.org/jira/browse/HIVE-20968
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: DR, pull-request-available
> Attachments: HIVE-20968.01.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> As per the migration rule, if a location is outside the default managed table 
> directory and is not owned by the "hive" user, then the table should be 
> converted to an external table after upgrade.
>  The same rule applies to Hive replication, where the data of a source 
> managed table resides outside the default warehouse directory and is not 
> owned by the "hive" user.
>  During this conversion, the path should be preserved on the target as well, 
> so that failover works seamlessly.
>  # If the table location is outside the Hive warehouse and is not owned by 
> hive, then the table at the target will be converted to an external table. 
> The original location cannot be retained; instead, a location relative to the 
> Hive external warehouse directory will be used.
>  #  As the table is not an external table at the source, only data added 
> through events will be replicated.
>  # The ownership of the location will be stored in the create table event and 
> compared with strict.managed.tables.migration.owner to decide whether the 
> flag in the replication scope can be set. This flag is used to convert the 
> managed table to an external table at the target.
> Some scenarios need to be blocked if the database is set up for replication 
> from a cluster with a non-strict managed table setting to one with strict 
> managed tables:
> 1. Block alter table / partition set location for databases that are a source 
> of replication, for managed tables.
> 2. If the user manually changes the ownership of the location, Hive 
> replication may go into a non-recoverable state.
> 3. Block add partition if the location ownership differs from the table 
> location's ownership, for managed tables.
> 4. The user needs to set strict.managed.tables.migration.owner along with the 
> dump command (defaulting to the hive user). This value is used during dump to 
> determine the ownership, which in turn is used during load to decide the 
> table type. The location owner information can be stored in the events during 
> create table, and the flag can be stored in the replication spec. Check other 
> such configs used in the upgrade tool.
> 5. The replication flow also sets the additional parameter 
> "external.table.purge"="true", but only for migration to an external table.
> 6. Block conversion from managed to external and vice versa; pass some flag 
> in the upgrade flow to allow this conversion during upgrade.
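The rule described above (convert a managed table to external when its location lies outside the managed warehouse root or is not owned by the expected user) can be sketched as follows. The class and method names are illustrative only, not Hive's actual API, which works on Hadoop FileStatus ownership:

```java
// Hypothetical sketch of the migration rule; names are illustrative.
public class TableTypeDecider {
    /**
     * A managed table is converted to external during upgrade/replication
     * when its location is outside the managed warehouse root, or the
     * location is not owned by the expected user (default "hive").
     */
    public static boolean shouldConvertToExternal(String location,
                                                  String warehouseRoot,
                                                  String locationOwner,
                                                  String expectedOwner) {
        String root = warehouseRoot.endsWith("/") ? warehouseRoot : warehouseRoot + "/";
        boolean insideWarehouse = location.startsWith(root);
        boolean ownedByExpected = expectedOwner.equals(locationOwner);
        return !insideWarehouse || !ownedByExpected;
    }

    public static void main(String[] args) {
        // Outside the warehouse and owned by another user: becomes external.
        System.out.println(shouldConvertToExternal(
                "/data/ext/t1", "/warehouse/managed", "alice", "hive")); // true
        // Inside the warehouse and owned by hive: stays managed.
        System.out.println(shouldConvertToExternal(
                "/warehouse/managed/db/t2", "/warehouse/managed", "hive", "hive")); // false
    }
}
```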

[jira] [Work logged] (HIVE-20968) Support conversion of managed to external where location set was not owned by hive

2019-04-09 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20968?focusedWorklogId=224867&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-224867
 ]

ASF GitHub Bot logged work on HIVE-20968:
-

Author: ASF GitHub Bot
Created on: 09/Apr/19 09:53
Start Date: 09/Apr/19 09:53
Worklog Time Spent: 10m 
  Work Description: sankarh commented on pull request #588: HIVE-20968 : 
Support conversion of managed to external where location set was not owned by 
hive
URL: https://github.com/apache/hive/pull/588#discussion_r273402969
 
 

 ##
 File path: 
hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/messaging/AlterTableMessage.java
 ##
 @@ -42,4 +42,6 @@ public HCatEventMessage checkValid() {
   }
 
   public abstract Long getWriteId();
+
+  public abstract String getLocOwner();
 
 Review comment:
   Shall we use the complete name getDataLocationOwner? It is not too long. 
Same for the other places.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 224867)
Time Spent: 20m  (was: 10m)

> Support conversion of managed to external where location set was not owned by 
> hive
> --
>
> Key: HIVE-20968
> URL: https://issues.apache.org/jira/browse/HIVE-20968
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: DR, pull-request-available
> Attachments: HIVE-20968.01.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (HIVE-20968) Support conversion of managed to external where location set was not owned by hive

2019-04-09 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20968?focusedWorklogId=224872&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-224872
 ]

ASF GitHub Bot logged work on HIVE-20968:
-

Author: ASF GitHub Bot
Created on: 09/Apr/19 09:53
Start Date: 09/Apr/19 09:53
Worklog Time Spent: 10m 
  Work Description: sankarh commented on pull request #588: HIVE-20968 : 
Support conversion of managed to external where location set was not owned by 
hive
URL: https://github.com/apache/hive/pull/588#discussion_r273405034
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java
 ##
 @@ -203,15 +203,15 @@ private void parsePartitionSpec(ASTNode tableNode, LinkedHashMap
   }
 
   private static void upgradeTableDesc(org.apache.hadoop.hive.metastore.api.Table tableObj, MetaData rv,
-   EximUtil.SemanticAnalyzerWrapperContext x)
+   EximUtil.SemanticAnalyzerWrapperContext x, Boolean isPathOwnedByHive)
 
 Review comment:
   Shall we use the primitive boolean type instead of the Boolean class?
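The suggestion above, to prefer the primitive boolean over the Boolean wrapper, guards against a null being passed in and the NullPointerException that auto-unboxing would then throw. A minimal illustrative sketch (the method names are hypothetical, not from the patch):

```java
public class BoxedBooleanPitfall {
    // A boxed parameter admits null; unboxing it in a condition throws NPE.
    static String describeBoxed(Boolean isPathOwnedByHive) {
        try {
            return isPathOwnedByHive ? "owned" : "not owned"; // auto-unboxing here
        } catch (NullPointerException e) {
            return "NPE";
        }
    }

    // A primitive parameter makes the null case unrepresentable at compile time.
    static String describePrimitive(boolean isPathOwnedByHive) {
        return isPathOwnedByHive ? "owned" : "not owned";
    }

    public static void main(String[] args) {
        System.out.println(describeBoxed(null));     // prints "NPE"
        System.out.println(describePrimitive(true)); // prints "owned"
    }
}
```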
 



Issue Time Tracking
---

Worklog Id: (was: 224872)

> Support conversion of managed to external where location set was not owned by 
> hive
> --
>
> Key: HIVE-20968
> URL: https://issues.apache.org/jira/browse/HIVE-20968
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: DR, pull-request-available
> Attachments: HIVE-20968.01.patch
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>





[jira] [Work logged] (HIVE-20968) Support conversion of managed to external where location set was not owned by hive

2019-04-09 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20968?focusedWorklogId=224876&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-224876
 ]

ASF GitHub Bot logged work on HIVE-20968:
-

Author: ASF GitHub Bot
Created on: 09/Apr/19 09:53
Start Date: 09/Apr/19 09:53
Worklog Time Spent: 10m 
  Work Description: sankarh commented on pull request #588: HIVE-20968 : 
Support conversion of managed to external where location set was not owned by 
hive
URL: https://github.com/apache/hive/pull/588#discussion_r273408285
 
 

 ##
 File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/events/AlterTableEvent.java
 ##
 @@ -40,6 +42,11 @@ public AlterTableEvent (Table oldTable, Table newTable, boolean isTruncateOp, bo
 this.newTable = newTable;
 this.isTruncateOp = isTruncateOp;
 this.writeId = writeId;
+if (newTable.getSd() == null) {
 
 Review comment:
   In what case can newTable.getSd() be null?
 



Issue Time Tracking
---

Worklog Id: (was: 224876)
Time Spent: 1h 20m  (was: 1h 10m)

> Support conversion of managed to external where location set was not owned by 
> hive
> --
>
> Key: HIVE-20968
> URL: https://issues.apache.org/jira/browse/HIVE-20968
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: DR, pull-request-available
> Attachments: HIVE-20968.01.patch
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>





[jira] [Work logged] (HIVE-20968) Support conversion of managed to external where location set was not owned by hive

2019-04-09 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20968?focusedWorklogId=224874&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-224874
 ]

ASF GitHub Bot logged work on HIVE-20968:
-

Author: ASF GitHub Bot
Created on: 09/Apr/19 09:53
Start Date: 09/Apr/19 09:53
Worklog Time Spent: 10m 
  Work Description: sankarh commented on pull request #588: HIVE-20968 : 
Support conversion of managed to external where location set was not owned by 
hive
URL: https://github.com/apache/hive/pull/588#discussion_r273411092
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSpec.java
 ##
 @@ -437,4 +448,30 @@ public void setNeedDupCopyCheck(boolean isFirstIncPending) {
 // Check HIVE-21197 for more detail.
 this.needDupCopyCheck = isFirstIncPending;
   }
+
+  public boolean isPathOwnedByHive() {
+return isPathOwnedByHive;
+  }
+
+  public static boolean isPathOwnedByHive(HiveConf conf, String user) {
 
 Review comment:
   Can this method be moved to ReplUtils, as it is a static method? 
ReplicationSpec is shared by both source and target, but this method is needed 
only at the source.
 



Issue Time Tracking
---

Worklog Id: (was: 224874)
Time Spent: 1h 10m  (was: 1h)

> Support conversion of managed to external where location set was not owned by 
> hive
> --
>
> Key: HIVE-20968
> URL: https://issues.apache.org/jira/browse/HIVE-20968
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: DR, pull-request-available
> Attachments: HIVE-20968.01.patch
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>





[jira] [Work logged] (HIVE-20968) Support conversion of managed to external where location set was not owned by hive

2019-04-09 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20968?focusedWorklogId=224873&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-224873
 ]

ASF GitHub Bot logged work on HIVE-20968:
-

Author: ASF GitHub Bot
Created on: 09/Apr/19 09:53
Start Date: 09/Apr/19 09:53
Worklog Time Spent: 10m 
  Work Description: sankarh commented on pull request #588: HIVE-20968 : 
Support conversion of managed to external where location set was not owned by 
hive
URL: https://github.com/apache/hive/pull/588#discussion_r273407992
 
 

 ##
 File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/events/AlterPartitionEvent.java
 ##
 @@ -33,6 +34,7 @@
   private final Table table;
   private final boolean isTruncateOp;
   private Long writeId;
+  private String locOwner;
 
 Review comment:
   Can this field be final?
 



Issue Time Tracking
---

Worklog Id: (was: 224873)
Time Spent: 1h  (was: 50m)

> Support conversion of managed to external where location set was not owned by 
> hive
> --
>
> Key: HIVE-20968
> URL: https://issues.apache.org/jira/browse/HIVE-20968
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: DR, pull-request-available
> Attachments: HIVE-20968.01.patch
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>





[jira] [Updated] (HIVE-21567) Break up DDLTask - extract Function related operations

2019-04-09 Thread Zoltan Haindrich (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-21567:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

pushed to master. Thank you [~mgergely]!

> Break up DDLTask - extract Function related operations
> --
>
> Key: HIVE-21567
> URL: https://issues.apache.org/jira/browse/HIVE-21567
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: refactor-ddl
> Fix For: 4.0.0
>
> Attachments: HIVE-21567.01.patch, HIVE-21567.02.patch, 
> HIVE-21567.03.patch, HIVE-21567.04.patch
>
>
> DDLTask is a huge class, more than 5000 lines long. The related DDLWork is 
> also a huge class, with a field for each DDL operation it supports. The goal 
> is to refactor these so that everything is cut into more manageable classes 
> under the package org.apache.hadoop.hive.ql.exec.ddl:
>  * have a separate class for each operation
>  * have a package for each operation group (database ddl, table ddl, etc.), 
> so the number of classes under a package is more manageable
>  * make all the requests (DDLDesc subclasses) immutable
>  * DDLTask should be agnostic to the actual operations
>  * for now, ignore the issue that some operations handled by DDLTask are not 
> actual DDL operations (lock, unlock, desc...)
> In the interim, while there are two DDLTask and DDLWork classes in the code 
> base, the new ones in the new package are called DDLTask2 and DDLWork2, thus 
> avoiding fully qualified class names where both the old and the new classes 
> are in use.
> Step #4: extract all the function related operations from the old DDLTask 
> and move them under the new package.
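The structure described above can be sketched roughly as follows; the class names are invented for illustration and are not the actual Hive classes:

```java
// Illustrative sketch of the refactoring goals: an immutable Desc (request)
// per operation, one Operation class per Desc, and a task layer that only
// sees the operation interface.
public class DdlSketch {
    // Immutable request object, analogous to a DDLDesc subclass.
    static final class CreateFunctionDesc {
        final String functionName;
        CreateFunctionDesc(String functionName) { this.functionName = functionName; }
    }

    // Common interface so the task stays agnostic to concrete operations.
    interface DdlOperation { String execute(); }

    // One class per operation, grouped into per-topic packages in the real refactor.
    static final class CreateFunctionOperation implements DdlOperation {
        private final CreateFunctionDesc desc;
        CreateFunctionOperation(CreateFunctionDesc desc) { this.desc = desc; }
        public String execute() { return "CREATE FUNCTION " + desc.functionName; }
    }

    public static void main(String[] args) {
        DdlOperation op = new CreateFunctionOperation(new CreateFunctionDesc("my_udf"));
        System.out.println(op.execute()); // prints "CREATE FUNCTION my_udf"
    }
}
```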





[jira] [Updated] (HIVE-21291) Restore historical way of handling timestamps in Avro while keeping the new semantics at the same time

2019-04-09 Thread Karen Coppage (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karen Coppage updated HIVE-21291:
-
Status: Open  (was: Patch Available)

> Restore historical way of handling timestamps in Avro while keeping the new 
> semantics at the same time
> --
>
> Key: HIVE-21291
> URL: https://issues.apache.org/jira/browse/HIVE-21291
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Ivanfi
>Assignee: Karen Coppage
>Priority: Major
> Attachments: HIVE-21291.1.patch, HIVE-21291.2.patch, 
> HIVE-21291.3.patch, HIVE-21291.4.patch, HIVE-21291.4.patch
>
>
> This sub-task is for implementing the Avro-specific parts of the following 
> plan:
> h1. Problem
> Historically, the semantics of the TIMESTAMP type in Hive depended on the 
> file format. Timestamps in Avro, Parquet and RCFiles with a binary SerDe had 
> _Instant_ semantics, while timestamps in ORC, textfiles and RCFiles with a 
> text SerDe had _LocalDateTime_ semantics.
> The Hive community wanted to get rid of this inconsistency and have 
> _LocalDateTime_ semantics in Avro, Parquet and RCFiles with a binary SerDe as 
> well. *Hive 3.1 turned off normalization to UTC* to achieve this. While this 
> leads to the desired new semantics, it also leads to incorrect results when 
> new Hive versions read timestamps written by old Hive versions or when old 
> Hive versions or any other component not aware of this change (including 
> legacy Impala and Spark versions) read timestamps written by new Hive 
> versions.
> h1. Solution
> To work around this issue, Hive *should restore the practice of normalizing 
> to UTC* when writing timestamps to Avro, Parquet and RCFiles with a binary 
> SerDe. In itself, this would restore the historical _Instant_ semantics, 
> which is undesirable. In order to achieve the desired _LocalDateTime_ 
> semantics in spite of normalizing to UTC, newer Hive versions should record 
> the session-local time zone in the file metadata fields serving 
> arbitrary key-value storage purposes.
> When reading back files with this time zone metadata, newer Hive versions (or 
> any other new component aware of this extra metadata) can achieve 
> _LocalDateTime_ semantics by *converting from UTC to the saved time zone 
> (instead of to the local time zone)*. Legacy components that are unaware of 
> the new metadata can read the files without any problem and the timestamps 
> will show the historical Instant behaviour to them.
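The proposed round trip, normalizing to UTC on write while recording the writer's time zone in file metadata, then converting back to that saved zone on read, can be sketched with java.time (the method names are illustrative, not Hive's API):

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;

// Illustrative sketch of the proposal: normalize to UTC on write, record the
// writer's zone as metadata, and convert back on read to recover
// LocalDateTime semantics regardless of the reader's local zone.
public class TimestampRoundTrip {
    // Writer: a wall-clock timestamp in the session zone becomes a UTC instant.
    static Instant write(LocalDateTime wallClock, ZoneId writerZone) {
        return wallClock.atZone(writerZone).toInstant();
    }

    // New reader: convert from UTC to the *saved* zone, not the local zone,
    // recovering the original wall-clock value.
    static LocalDateTime readWithMetadata(Instant stored, ZoneId savedWriterZone) {
        return LocalDateTime.ofInstant(stored, savedWriterZone);
    }

    public static void main(String[] args) {
        ZoneId writer = ZoneId.of("Europe/Budapest");
        LocalDateTime original = LocalDateTime.of(2019, 4, 9, 12, 0);
        Instant stored = write(original, writer);
        // A reader in any zone that honors the metadata sees the same wall clock.
        System.out.println(readWithMetadata(stored, writer).equals(original)); // true
    }
}
```

A legacy reader that ignores the metadata simply converts the stored UTC instant to its own local zone, which is exactly the historical Instant behaviour.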





[jira] [Updated] (HIVE-21291) Restore historical way of handling timestamps in Avro while keeping the new semantics at the same time

2019-04-09 Thread Karen Coppage (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karen Coppage updated HIVE-21291:
-
Attachment: HIVE-21291.4.patch
Status: Patch Available  (was: Open)

> Restore historical way of handling timestamps in Avro while keeping the new 
> semantics at the same time
> --
>
> Key: HIVE-21291
> URL: https://issues.apache.org/jira/browse/HIVE-21291
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Ivanfi
>Assignee: Karen Coppage
>Priority: Major
> Attachments: HIVE-21291.1.patch, HIVE-21291.2.patch, 
> HIVE-21291.3.patch, HIVE-21291.4.patch, HIVE-21291.4.patch
>
>





[jira] [Updated] (HIVE-20968) Support conversion of managed to external where location set was not owned by hive

2019-04-09 Thread mahesh kumar behera (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera updated HIVE-20968:
---
Status: Patch Available  (was: Open)

> Support conversion of managed to external where location set was not owned by 
> hive
> --
>
> Key: HIVE-20968
> URL: https://issues.apache.org/jira/browse/HIVE-20968
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: DR, pull-request-available
> Attachments: HIVE-20968.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> As per the migration rule, if a table location is outside the default managed 
> table directory and the location is not owned by the "hive" user, the table 
> should be converted to an external table after upgrade.
>  The same rule applies to Hive replication, where the data of a source 
> managed table resides outside the default warehouse directory and is not 
> owned by the "hive" user.
>  During this conversion, the path should be preserved at the target as well 
> so that failover works seamlessly.
>  # If the table location is outside the Hive warehouse and is not owned by 
> hive, the table at the target will be converted to an external table. The 
> original location cannot be retained; it will be recreated relative to the 
> Hive external warehouse directory.
>  # Because the table is not an external table at the source, only data added 
> through events will be replicated.
>  # The ownership of the location will be stored in the create table event and 
> compared with strict.managed.tables.migration.owner to decide whether the 
> flag in the replication scope can be set. This flag is used to convert the 
> managed table to an external table at the target.
> Some scenarios need to be blocked if the database is replicated from a 
> cluster with the non-strict managed-table setting to one with strict managed 
> tables:
> 1. Block alter table / alter partition set location for a database that is a 
> source of replication, for managed tables.
> 2. If the user manually changes the ownership of the location, Hive 
> replication may reach a non-recoverable state.
> 3. Block add partition if the location ownership differs from the table 
> location ownership, for managed tables.
> 4. The user needs to set strict.managed.tables.migration.owner along with the 
> dump command (it defaults to the hive user). This value is used during dump 
> to determine the ownership, which in turn is used during load to decide the 
> table type. The location owner information can be stored in the create table 
> events, and the flag can be stored in the replication spec. Check other such 
> configs used in the upgrade tool.
> 5. The replication flow also sets the additional parameter 
> "external.table.purge"="true", but only for migration to an external table.
> 6. Block conversion from managed to external and vice versa; pass a flag in 
> the upgrade flow to allow this conversion during upgrade.
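The migration rule above boils down to a small decision. A minimal Python sketch of that decision (illustrative only, not Hive's actual migration code; the function and parameter names are hypothetical):

```python
def decide_table_type(location, warehouse_dir, location_owner,
                      migration_owner="hive"):
    """Decide the post-upgrade type of a managed table.

    A managed table stays managed if its data lives under the default
    managed warehouse directory, or if its location is owned by the
    expected owner (strict.managed.tables.migration.owner, defaulting
    to the "hive" user). Otherwise it is converted to an external
    table during upgrade/replication.
    """
    inside_warehouse = location.rstrip("/").startswith(warehouse_dir.rstrip("/"))
    if inside_warehouse or location_owner == migration_owner:
        return "MANAGED"
    return "EXTERNAL"
```

For example, a table at /data/external/t owned by a non-hive user would come out as EXTERNAL, while the same path owned by hive would stay MANAGED.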



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20968) Support conversion of managed to external where location set was not owned by hive

2019-04-09 Thread mahesh kumar behera (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera updated HIVE-20968:
---
Attachment: HIVE-20968.01.patch

> Support conversion of managed to external where location set was not owned by 
> hive
> --
>
> Key: HIVE-20968
> URL: https://issues.apache.org/jira/browse/HIVE-20968
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: DR, pull-request-available
> Attachments: HIVE-20968.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> As per the migration rule, if a table location is outside the default managed 
> table directory and the location is not owned by the "hive" user, the table 
> should be converted to an external table after upgrade.
>  The same rule applies to Hive replication, where the data of a source 
> managed table resides outside the default warehouse directory and is not 
> owned by the "hive" user.
>  During this conversion, the path should be preserved at the target as well 
> so that failover works seamlessly.
>  # If the table location is outside the Hive warehouse and is not owned by 
> hive, the table at the target will be converted to an external table. The 
> original location cannot be retained; it will be recreated relative to the 
> Hive external warehouse directory.
>  # Because the table is not an external table at the source, only data added 
> through events will be replicated.
>  # The ownership of the location will be stored in the create table event and 
> compared with strict.managed.tables.migration.owner to decide whether the 
> flag in the replication scope can be set. This flag is used to convert the 
> managed table to an external table at the target.
> Some scenarios need to be blocked if the database is replicated from a 
> cluster with the non-strict managed-table setting to one with strict managed 
> tables:
> 1. Block alter table / alter partition set location for a database that is a 
> source of replication, for managed tables.
> 2. If the user manually changes the ownership of the location, Hive 
> replication may reach a non-recoverable state.
> 3. Block add partition if the location ownership differs from the table 
> location ownership, for managed tables.
> 4. The user needs to set strict.managed.tables.migration.owner along with the 
> dump command (it defaults to the hive user). This value is used during dump 
> to determine the ownership, which in turn is used during load to decide the 
> table type. The location owner information can be stored in the create table 
> events, and the flag can be stored in the replication spec. Check other such 
> configs used in the upgrade tool.
> 5. The replication flow also sets the additional parameter 
> "external.table.purge"="true", but only for migration to an external table.
> 6. Block conversion from managed to external and vice versa; pass a flag in 
> the upgrade flow to allow this conversion during upgrade.





[jira] [Updated] (HIVE-9995) ACID compaction tries to compact a single file

2019-04-09 Thread Adam Szita (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-9995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated HIVE-9995:
-
   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

> ACID compaction tries to compact a single file
> --
>
> Key: HIVE-9995
> URL: https://issues.apache.org/jira/browse/HIVE-9995
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Denys Kuzmenko
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-9995.01.patch, HIVE-9995.02.patch, 
> HIVE-9995.03.patch, HIVE-9995.04.patch, HIVE-9995.05.patch, 
> HIVE-9995.06.patch, HIVE-9995.07.patch, HIVE-9995.08.patch, 
> HIVE-9995.09.patch, HIVE-9995.10.patch, HIVE-9995.WIP.patch
>
>
> Consider TestWorker.minorWithOpenInMiddle().
> Since there is an open txnId=23, there is no meaningful minor compaction 
> work to do. The system still tries to compact a single delta file for the 
> 21-22 id range, effectively copying the file onto itself.
> This is (1) inefficient and (2) can potentially affect a reader.
> (from a real cluster)
> Suppose we start with 
> {noformat}
> drwxr-xr-x   - ekoifman staff  0 2016-06-09 16:03 
> /user/hive/warehouse/t/base_016
> -rw-r--r--   1 ekoifman staff602 2016-06-09 16:03 
> /user/hive/warehouse/t/base_016/bucket_0
> drwxr-xr-x   - ekoifman staff  0 2016-06-09 16:07 
> /user/hive/warehouse/t/base_017
> -rw-r--r--   1 ekoifman staff588 2016-06-09 16:07 
> /user/hive/warehouse/t/base_017/bucket_0
> drwxr-xr-x   - ekoifman staff  0 2016-06-09 16:07 
> /user/hive/warehouse/t/delta_017_017_
> -rw-r--r--   1 ekoifman staff514 2016-06-09 16:06 
> /user/hive/warehouse/t/delta_017_017_/bucket_0
> drwxr-xr-x   - ekoifman staff  0 2016-06-09 16:07 
> /user/hive/warehouse/t/delta_018_018_
> -rw-r--r--   1 ekoifman staff612 2016-06-09 16:07 
> /user/hive/warehouse/t/delta_018_018_/bucket_0
> {noformat}
> then do _alter table T compact 'minor';_
> then we end up with 
> {noformat}
> drwxr-xr-x   - ekoifman staff  0 2016-06-09 16:07 
> /user/hive/warehouse/t/base_017
> -rw-r--r--   1 ekoifman staff588 2016-06-09 16:07 
> /user/hive/warehouse/t/base_017/bucket_0
> drwxr-xr-x   - ekoifman staff  0 2016-06-09 16:11 
> /user/hive/warehouse/t/delta_018_018
> -rw-r--r--   1 ekoifman staff500 2016-06-09 16:11 
> /user/hive/warehouse/t/delta_018_018/bucket_0
> drwxr-xr-x   - ekoifman staff  0 2016-06-09 16:07 
> /user/hive/warehouse/t/delta_018_018_
> -rw-r--r--   1 ekoifman staff612 2016-06-09 16:07 
> /user/hive/warehouse/t/delta_018_018_/bucket_0
> {noformat}
> So compaction created a new dir _/user/hive/warehouse/t/delta_018_018_
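The wasteful case can be avoided with a simple guard: only run a minor compaction when there is more than one delta to merge. A minimal Python sketch of that idea (illustrative only, not the actual Worker code; the function name is hypothetical):

```python
def has_minor_compaction_work(delta_dirs):
    """A minor compaction merges delta files into fewer files.
    With a single delta (or none) there is nothing to merge:
    compacting it would just copy the file onto itself, which is
    wasteful and can disturb concurrent readers.
    """
    return len(delta_dirs) > 1
```

In the listing above, the 21-22 id range contains a single delta, so this guard would skip the compaction instead of rewriting the file.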





[jira] [Commented] (HIVE-9995) ACID compaction tries to compact a single file

2019-04-09 Thread Adam Szita (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-9995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16813150#comment-16813150
 ] 

Adam Szita commented on HIVE-9995:
--

Committed to master. Thanks Denys!

> ACID compaction tries to compact a single file
> --
>
> Key: HIVE-9995
> URL: https://issues.apache.org/jira/browse/HIVE-9995
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Denys Kuzmenko
>Priority: Major
> Attachments: HIVE-9995.01.patch, HIVE-9995.02.patch, 
> HIVE-9995.03.patch, HIVE-9995.04.patch, HIVE-9995.05.patch, 
> HIVE-9995.06.patch, HIVE-9995.07.patch, HIVE-9995.08.patch, 
> HIVE-9995.09.patch, HIVE-9995.10.patch, HIVE-9995.WIP.patch
>
>
> Consider TestWorker.minorWithOpenInMiddle().
> Since there is an open txnId=23, there is no meaningful minor compaction 
> work to do. The system still tries to compact a single delta file for the 
> 21-22 id range, effectively copying the file onto itself.
> This is (1) inefficient and (2) can potentially affect a reader.
> (from a real cluster)
> Suppose we start with 
> {noformat}
> drwxr-xr-x   - ekoifman staff  0 2016-06-09 16:03 
> /user/hive/warehouse/t/base_016
> -rw-r--r--   1 ekoifman staff602 2016-06-09 16:03 
> /user/hive/warehouse/t/base_016/bucket_0
> drwxr-xr-x   - ekoifman staff  0 2016-06-09 16:07 
> /user/hive/warehouse/t/base_017
> -rw-r--r--   1 ekoifman staff588 2016-06-09 16:07 
> /user/hive/warehouse/t/base_017/bucket_0
> drwxr-xr-x   - ekoifman staff  0 2016-06-09 16:07 
> /user/hive/warehouse/t/delta_017_017_
> -rw-r--r--   1 ekoifman staff514 2016-06-09 16:06 
> /user/hive/warehouse/t/delta_017_017_/bucket_0
> drwxr-xr-x   - ekoifman staff  0 2016-06-09 16:07 
> /user/hive/warehouse/t/delta_018_018_
> -rw-r--r--   1 ekoifman staff612 2016-06-09 16:07 
> /user/hive/warehouse/t/delta_018_018_/bucket_0
> {noformat}
> then do _alter table T compact 'minor';_
> then we end up with 
> {noformat}
> drwxr-xr-x   - ekoifman staff  0 2016-06-09 16:07 
> /user/hive/warehouse/t/base_017
> -rw-r--r--   1 ekoifman staff588 2016-06-09 16:07 
> /user/hive/warehouse/t/base_017/bucket_0
> drwxr-xr-x   - ekoifman staff  0 2016-06-09 16:11 
> /user/hive/warehouse/t/delta_018_018
> -rw-r--r--   1 ekoifman staff500 2016-06-09 16:11 
> /user/hive/warehouse/t/delta_018_018/bucket_0
> drwxr-xr-x   - ekoifman staff  0 2016-06-09 16:07 
> /user/hive/warehouse/t/delta_018_018_
> -rw-r--r--   1 ekoifman staff612 2016-06-09 16:07 
> /user/hive/warehouse/t/delta_018_018_/bucket_0
> {noformat}
> So compaction created a new dir _/user/hive/warehouse/t/delta_018_018_





[jira] [Updated] (HIVE-21427) Syslog storage handler

2019-04-09 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-21427:
-
Attachment: HIVE-21427.8.patch

> Syslog storage handler
> --
>
> Key: HIVE-21427
> URL: https://issues.apache.org/jira/browse/HIVE-21427
> Project: Hive
>  Issue Type: New Feature
>  Components: Serializers/Deserializers
>Affects Versions: 4.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Prasanth Jayachandran
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21427.1.patch, HIVE-21427.2.patch, 
> HIVE-21427.3.patch, HIVE-21427.4.patch, HIVE-21427.5.patch, 
> HIVE-21427.6.patch, HIVE-21427.7.patch, HIVE-21427.8.patch
>
>
> It would be useful to read syslog-generated log files in Hive. Hive 
> generates logs in the RFC 5424 log4j2 layout and stores them as an external 
> table in sys.db. This patch includes a SyslogSerde that can parse RFC 
> 5424-formatted logs and map them to the logs table schema for query 
> processing by Hive.





[jira] [Commented] (HIVE-21427) Syslog storage handler

2019-04-09 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16813108#comment-16813108
 ] 

Prasanth Jayachandran commented on HIVE-21427:
--

Renamed syslog* qfiles to rfc5424*.

> Syslog storage handler
> --
>
> Key: HIVE-21427
> URL: https://issues.apache.org/jira/browse/HIVE-21427
> Project: Hive
>  Issue Type: New Feature
>  Components: Serializers/Deserializers
>Affects Versions: 4.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Prasanth Jayachandran
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21427.1.patch, HIVE-21427.2.patch, 
> HIVE-21427.3.patch, HIVE-21427.4.patch, HIVE-21427.5.patch, 
> HIVE-21427.6.patch, HIVE-21427.7.patch, HIVE-21427.8.patch
>
>
> It would be useful to read syslog-generated log files in Hive. Hive 
> generates logs in the RFC 5424 log4j2 layout and stores them as an external 
> table in sys.db. This patch includes a SyslogSerde that can parse RFC 
> 5424-formatted logs and map them to the logs table schema for query 
> processing by Hive.





[jira] [Work logged] (HIVE-21231) HiveJoinAddNotNullRule support for range predicates

2019-04-09 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21231?focusedWorklogId=224806=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-224806
 ]

ASF GitHub Bot logged work on HIVE-21231:
-

Author: ASF GitHub Bot
Created on: 09/Apr/19 08:05
Start Date: 09/Apr/19 08:05
Worklog Time Spent: 10m 
  Work Description: miklosgergely commented on pull request #580: 
HIVE-21231 HiveJoinAddNotNullRule support for range predicates
URL: https://github.com/apache/hive/pull/580
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 224806)
Time Spent: 20m  (was: 10m)

> HiveJoinAddNotNullRule support for range predicates
> ---
>
> Key: HIVE-21231
> URL: https://issues.apache.org/jira/browse/HIVE-21231
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: newbie, pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21231.01.patch, HIVE-21231.02.patch, 
> HIVE-21231.03.patch, HIVE-21231.04.patch, HIVE-21231.05.patch, 
> HIVE-21231.06.patch, HIVE-21231.07.patch, HIVE-21231.08.patch, 
> HIVE-21231.09.patch, HIVE-21231.10.patch, HIVE-21231.11.patch, 
> HIVE-21231.12.patch, HIVE-21231.13.patch, HIVE-21231.14.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> For instance, given the following query:
> {code:sql}
> SELECT t0.col0, t0.col1
> FROM
>   (
> SELECT col0, col1 FROM tab
>   ) AS t0
>   INNER JOIN
>   (
> SELECT col0, col1 FROM tab
>   ) AS t1
> ON t0.col0 < t1.col0 AND t0.col1 > t1.col1
> {code}
> we could still infer that col0 and col1 cannot be null for any of the inputs. 
> Currently we do not.
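The inference being asked for is that any column compared with <, >, <=, >=, or = in the join condition cannot be NULL in a matching row, because a NULL operand makes the comparison evaluate to NULL and the row is filtered out. An illustrative Python sketch of that rule (not the actual Calcite/HiveJoinAddNotNullRule implementation; the representation of conjuncts is hypothetical):

```python
# Comparisons that evaluate to NULL (and thus reject the row)
# whenever either operand is NULL, so both operands must be
# non-null for the join condition to hold.
STRONG_COMPARISONS = {"=", "<", ">", "<=", ">=", "<>"}

def infer_not_null_columns(conjuncts):
    """Given a join condition as a list of (op, left_col, right_col)
    conjuncts, return the set of columns that cannot be NULL in any
    row that satisfies the condition."""
    not_null = set()
    for op, left, right in conjuncts:
        if op in STRONG_COMPARISONS:
            not_null.add(left)
            not_null.add(right)
    return not_null
```

For the query above, the condition t0.col0 < t1.col0 AND t0.col1 > t1.col1 would yield IS NOT NULL filters on col0 and col1 for both join inputs.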





[jira] [Commented] (HIVE-21568) HiveRelOptUtil.isRowFilteringPlan should skip Project

2019-04-09 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16813102#comment-16813102
 ] 

Hive QA commented on HIVE-21568:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12965209/HIVE-21568.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 15880 tests 
executed
*Failed tests:*
{noformat}
TestMiniLlapCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=155)

[intersect_all.q,unionDistinct_1.q,table_nonprintable.q,orc_llap_counters1.q,mm_cttas.q,whroot_external1.q,global_limit.q,cte_2.q,rcfile_createas1.q,dynamic_partition_pruning_2.q,intersect_merge.q,results_cache_diff_fs.q,cttl.q,parallel_colstats.q,load_hdfs_file_with_space_in_the_name.q]
org.apache.hadoop.hive.metastore.TestObjectStore.catalogs (batchId=230)
org.apache.hadoop.hive.metastore.TestObjectStore.testDatabaseOps (batchId=230)
org.apache.hadoop.hive.metastore.TestObjectStore.testDeprecatedConfigIsOverwritten
 (batchId=230)
org.apache.hadoop.hive.metastore.TestObjectStore.testDirectSQLDropParitionsCleanup
 (batchId=230)
org.apache.hadoop.hive.metastore.TestObjectStore.testDirectSQLDropPartitionsCacheCrossSession
 (batchId=230)
org.apache.hadoop.hive.metastore.TestObjectStore.testDirectSqlErrorMetrics 
(batchId=230)
org.apache.hadoop.hive.metastore.TestObjectStore.testEmptyTrustStoreProps 
(batchId=230)
org.apache.hadoop.hive.metastore.TestObjectStore.testMasterKeyOps (batchId=230)
org.apache.hadoop.hive.metastore.TestObjectStore.testMaxEventResponse 
(batchId=230)
org.apache.hadoop.hive.metastore.TestObjectStore.testPartitionOps (batchId=230)
org.apache.hadoop.hive.metastore.TestObjectStore.testQueryCloseOnError 
(batchId=230)
org.apache.hadoop.hive.metastore.TestObjectStore.testRoleOps (batchId=230)
org.apache.hadoop.hive.metastore.TestObjectStore.testTableOps (batchId=230)
org.apache.hadoop.hive.metastore.TestObjectStore.testUseSSLProperty 
(batchId=230)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16896/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16896/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16896/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 15 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12965209 - PreCommit-HIVE-Build

> HiveRelOptUtil.isRowFilteringPlan should skip Project
> -
>
> Key: HIVE-21568
> URL: https://issues.apache.org/jira/browse/HIVE-21568
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-21568.01.patch, HIVE-21568.01.patch
>
>
> Project operator should not return true in any case, this may trigger 
> additional rewritings in presence of constraints.





[jira] [Commented] (HIVE-21427) Syslog storage handler

2019-04-09 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16813093#comment-16813093
 ] 

Prasanth Jayachandran commented on HIVE-21427:
--

The PTest framework seems to be using rsyslog, which moves files whose names 
start with "syslog":
{code:java}
+ find ./ -type f -name 'syslog*'
+ xargs -I '{}' sh -c 'mkdir -p 
/home/hiveptest/104.197.208.125-hiveptest-2/logs/logs/syslogs; mv {} 
/home/hiveptest/104.197.208.125-hiveptest-2/logs/logs/syslogs'{code}
This could explain why the qfile tests are missing. I will rename the qfiles 
and see if that helps. 

> Syslog storage handler
> --
>
> Key: HIVE-21427
> URL: https://issues.apache.org/jira/browse/HIVE-21427
> Project: Hive
>  Issue Type: New Feature
>  Components: Serializers/Deserializers
>Affects Versions: 4.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Prasanth Jayachandran
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21427.1.patch, HIVE-21427.2.patch, 
> HIVE-21427.3.patch, HIVE-21427.4.patch, HIVE-21427.5.patch, 
> HIVE-21427.6.patch, HIVE-21427.7.patch
>
>
> It would be useful to read syslog-generated log files in Hive. Hive 
> generates logs in the RFC 5424 log4j2 layout and stores them as an external 
> table in sys.db. This patch includes a SyslogSerde that can parse RFC 
> 5424-formatted logs and map them to the logs table schema for query 
> processing by Hive.





[jira] [Updated] (HIVE-21570) Convert llap iomem servlets output to json format

2019-04-09 Thread Antal Sinkovits (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antal Sinkovits updated HIVE-21570:
---
Attachment: HIVE-21570.04.patch

> Convert llap iomem servlets output to json format
> -
>
> Key: HIVE-21570
> URL: https://issues.apache.org/jira/browse/HIVE-21570
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: Antal Sinkovits
>Assignee: Antal Sinkovits
>Priority: Minor
> Attachments: HIVE-21570.01.patch, HIVE-21570.02.patch, 
> HIVE-21570.03.patch, HIVE-21570.04.patch
>
>






[jira] [Commented] (HIVE-21568) HiveRelOptUtil.isRowFilteringPlan should skip Project

2019-04-09 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16813071#comment-16813071
 ] 

Hive QA commented on HIVE-21568:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
33s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
5s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
40s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
7s{color} | {color:blue} ql in master has 2258 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
58s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
5s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
40s{color} | {color:red} ql: The patch generated 1 new + 15 unchanged - 0 fixed 
= 16 total (was 15) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 24m 35s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-16896/dev-support/hive-personality.sh
 |
| git revision | master / 877757c |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16896/yetus/diff-checkstyle-ql.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16896/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> HiveRelOptUtil.isRowFilteringPlan should skip Project
> -
>
> Key: HIVE-21568
> URL: https://issues.apache.org/jira/browse/HIVE-21568
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-21568.01.patch, HIVE-21568.01.patch
>
>
> Project operator should not return true in any case, this may trigger 
> additional rewritings in presence of constraints.





[jira] [Resolved] (HIVE-21185) insert overwrite directory ... stored as nontextfile raise exception with merge files open

2019-04-09 Thread chengkun jia (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chengkun jia resolved HIVE-21185.
-
   Resolution: Duplicate
Fix Version/s: 3.0.0

> insert overwrite directory ... stored as nontextfile raise exception with 
> merge files open
> --
>
> Key: HIVE-21185
> URL: https://issues.apache.org/jira/browse/HIVE-21185
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 2.1.1, 2.3.0
>Reporter: chengkun jia
>Priority: Major
> Fix For: 3.0.0
>
>
> reproduce:
>  
> {code:java}
> # init table with small files
> create table multiple_small_files (id int);
> insert into multiple_small_files values (1);
> insert into multiple_small_files values (1);
> insert into multiple_small_files values (1);
> insert into multiple_small_files values (1);
> insert into multiple_small_files values (1);
> insert into multiple_small_files values (1);
> insert into multiple_small_files values (1);
> insert into multiple_small_files values (1);
> # open small file merge
> set hive.merge.mapfiles=true;
> set hive.merge.mapredfiles=true;
> insert overwrite directory '/path/to/hdfs' stored as avro
> select * from multiple_small_files;
> {code}
> this will produce exception like:
> {code:java}
> Messages for this Task:Error: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing writable 
> Objavro.schema�{"type":"record","name":"baseRecord","fields":[{"name":"_col0","type":["null","int"],"default":null}]}�$$N���e(���
>                                                              �$$N���e(��� 
> at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:169) at 
> org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at 
> org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453) at 
> org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) at 
> org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) at 
> java.security.AccessController.doPrivileged(Native Method) at 
> javax.security.auth.Subject.doAs(Subject.java:422) at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>  at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)Caused by: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing writable 
> Objavro.schema�{"type":"record","name":"baseRecord","fields":[{"name":"_col0","type":["null","int"],"default":null}]}�$$N���e(���
>                                      �$$N���e(��� at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497) at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:160) ... 8 
> moreCaused by: org.apache.hadoop.hive.serde2.avro.AvroSerdeException: 
> Expecting a AvroGenericRecordWritable at 
> org.apache.hadoop.hive.serde2.avro.AvroDeserializer.deserialize(AvroDeserializer.java:139)
>  at 
> org.apache.hadoop.hive.serde2.avro.AvroSerDe.deserialize(AvroSerDe.java:216) 
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.readRow(MapOperator.java:128)
>  at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.access$200(MapOperator.java:92)
>  at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:488) 
> ... 9 moreFAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask
> {code}
>  
> This issue affects not only the Avro file format but all non-text storage 
> formats. The root cause is that Hive picks up the wrong input format in the 
> file merge stage.
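The fix implied by the root cause is that the merge stage must read the intermediate files with an input format matching the format the query wrote them in, rather than defaulting to text. An illustrative Python sketch of that mapping (the mapping itself is a simplified assumption for illustration, not Hive's actual merge-task planning code):

```python
def merge_stage_input_format(output_format):
    """Pick the input format for the file-merge stage based on the
    storage format the query wrote. Defaulting to the text input
    format breaks every non-text output (Avro, ORC, Parquet, ...),
    e.g. the AvroSerdeException "Expecting a AvroGenericRecordWritable"
    seen above.
    """
    format_map = {
        "avro": "AvroContainerInputFormat",
        "orc": "OrcInputFormat",
        "parquet": "MapredParquetInputFormat",
    }
    # Fall back to plain text only when the output really is text.
    return format_map.get(output_format, "TextInputFormat")
```

Under this sketch, the "stored as avro" reproduction above would have its merge stage read with the Avro input format instead of the text one.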





[jira] [Comment Edited] (HIVE-21185) insert overwrite directory ... stored as nontextfile raise exception with merge files open

2019-04-09 Thread chengkun jia (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16813048#comment-16813048
 ] 

chengkun jia edited comment on HIVE-21185 at 4/9/19 6:52 AM:
-

I think this issue is resolved by 
https://issues.apache.org/jira/browse/HIVE-18833 and released in Hive 2.3.4 and 
3.0.0.

That's what I wanted.

 


was (Author: lfyzjck):
I think this issue is resolved in 
https://issues.apache.org/jira/browse/HIVE-18833

that's what i just wanted.

 

> insert overwrite directory ... stored as nontextfile raise exception with 
> merge files open
> --
>
> Key: HIVE-21185
> URL: https://issues.apache.org/jira/browse/HIVE-21185
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 2.1.1, 2.3.0
>Reporter: chengkun jia
>Priority: Major
>
> reproduce:
>  
> {code:java}
> # init table with small files
> create table multiple_small_files (id int);
> insert into multiple_small_files values (1);
> insert into multiple_small_files values (1);
> insert into multiple_small_files values (1);
> insert into multiple_small_files values (1);
> insert into multiple_small_files values (1);
> insert into multiple_small_files values (1);
> insert into multiple_small_files values (1);
> insert into multiple_small_files values (1);
> # open small file merge
> set hive.merge.mapfiles=true;
> set hive.merge.mapredfiles=true;
> insert overwrite directory '/path/to/hdfs' stored as avro
> select * from multiple_small_files;
> {code}
> this will produce exception like:
> {code:java}
> Messages for this Task:Error: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing writable 
> Objavro.schema�{"type":"record","name":"baseRecord","fields":[{"name":"_col0","type":["null","int"],"default":null}]}�$$N���e(���
>                                                              �$$N���e(��� 
> at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:169) at 
> org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at 
> org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453) at 
> org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) at 
> org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) at 
> java.security.AccessController.doPrivileged(Native Method) at 
> javax.security.auth.Subject.doAs(Subject.java:422) at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>  at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)Caused by: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing writable 
> Objavro.schema�{"type":"record","name":"baseRecord","fields":[{"name":"_col0","type":["null","int"],"default":null}]}�$$N���e(���
>                                      �$$N���e(��� at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497) at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:160) ... 8 
> moreCaused by: org.apache.hadoop.hive.serde2.avro.AvroSerdeException: 
> Expecting a AvroGenericRecordWritable at 
> org.apache.hadoop.hive.serde2.avro.AvroDeserializer.deserialize(AvroDeserializer.java:139)
>  at 
> org.apache.hadoop.hive.serde2.avro.AvroSerDe.deserialize(AvroSerDe.java:216) 
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.readRow(MapOperator.java:128)
>  at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.access$200(MapOperator.java:92)
>  at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:488) 
> ... 9 moreFAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask
> {code}
>  
> This issue affects not only the Avro file format but all non-text storage 
> formats. The root cause is that Hive picks up the wrong input format in the 
> file merge stage.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21185) insert overwrite directory ... stored as nontextfile raise exception with merge files open

2019-04-09 Thread chengkun jia (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16813048#comment-16813048
 ] 

chengkun jia commented on HIVE-21185:
-

I think this issue is resolved by
https://issues.apache.org/jira/browse/HIVE-18833

That's just what I wanted.

 

> insert overwrite directory ... stored as nontextfile raise exception with 
> merge files open
> --
>
> Key: HIVE-21185
> URL: https://issues.apache.org/jira/browse/HIVE-21185
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 2.1.1, 2.3.0
>Reporter: chengkun jia
>Priority: Major
>
> Steps to reproduce:
>  
> {code:java}
> -- init a table backed by several small files
> create table multiple_small_files (id int);
> insert into multiple_small_files values (1);
> insert into multiple_small_files values (1);
> insert into multiple_small_files values (1);
> insert into multiple_small_files values (1);
> insert into multiple_small_files values (1);
> insert into multiple_small_files values (1);
> insert into multiple_small_files values (1);
> insert into multiple_small_files values (1);
> -- enable small-file merging
> set hive.merge.mapfiles=true;
> set hive.merge.mapredfiles=true;
> insert overwrite directory '/path/to/hdfs' stored as avro
> select * from multiple_small_files;
> {code}
> This produces an exception like:
> {code:java}
> Messages for this Task:
> Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable Objavro.schema�{"type":"record","name":"baseRecord","fields":[{"name":"_col0","type":["null","int"],"default":null}]}�$$N���e(���
> 	at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:169)
> 	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
> 	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
> 	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:422)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
> 	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable Objavro.schema�{"type":"record","name":"baseRecord","fields":[{"name":"_col0","type":["null","int"],"default":null}]}�$$N���e(���
> 	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497)
> 	at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:160)
> 	... 8 more
> Caused by: org.apache.hadoop.hive.serde2.avro.AvroSerdeException: Expecting a AvroGenericRecordWritable
> 	at org.apache.hadoop.hive.serde2.avro.AvroDeserializer.deserialize(AvroDeserializer.java:139)
> 	at org.apache.hadoop.hive.serde2.avro.AvroSerDe.deserialize(AvroSerDe.java:216)
> 	at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.readRow(MapOperator.java:128)
> 	at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.access$200(MapOperator.java:92)
> 	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:488)
> 	... 9 more
> FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
> {code}
>  
> This issue affects not only the Avro file format but all non-text-file storage
> formats. The root cause is that Hive picks the wrong input format in the file merge stage.
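The failure mode in the stack trace above can be sketched in a few lines. The classes and the error string below mirror the trace, but everything else is a hypothetical stand-in, not Hive's actual implementation: the Avro deserializer accepts only its own record type, so when the merge stage reads the files with the wrong input format and hands over a different kind of writable, the type guard fires.

```java
// Hypothetical stand-ins for Hadoop/Hive types -- a simplified sketch,
// not Hive's actual code.
interface Writable {}                                  // cf. org.apache.hadoop.io.Writable
class AvroGenericRecordWritable implements Writable {} // the only type the Avro path accepts
class TextRowWritable implements Writable {}           // what a wrongly chosen input format yields

public class MergeStageSketch {
    // Mirrors the guard behind "Expecting a AvroGenericRecordWritable":
    // returns "ok" for the expected writable type, else the error message.
    static String tryDeserialize(Writable writable) {
        if (!(writable instanceof AvroGenericRecordWritable)) {
            return "Expecting a AvroGenericRecordWritable";
        }
        return "ok"; // real code would decode the Avro record here
    }

    public static void main(String[] args) {
        System.out.println(tryDeserialize(new AvroGenericRecordWritable())); // ok
        System.out.println(tryDeserialize(new TextRowWritable()));
    }
}
```

This is why only the merge stage fails: the initial map stage writes the Avro files correctly, and the mismatch surfaces only when the merge task re-reads them with the wrong format.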



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)