[jira] [Updated] (HIVE-21167) Bucketing: Bucketing version 1 is incorrectly partitioning data

2019-02-22 Thread Gopal V (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-21167:
---
Fix Version/s: 4.0.0

> Bucketing: Bucketing version 1 is incorrectly partitioning data
> ---
>
> Key: HIVE-21167
> URL: https://issues.apache.org/jira/browse/HIVE-21167
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.1
>Reporter: Vaibhav Gumashta
>Assignee: Deepak Jaiswal
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21167.1.patch, HIVE-21167.2.patch, 
> HIVE-21167.3.patch, HIVE-21167.4.patch
>
>
> Using murmur hash for bucketing columns was introduced in HIVE-18910, 
> following which {{'bucketing_version'='1'}} stands for the old behaviour 
> (where for example integer columns were partitioned based on mod values). 
> It looks like we now have a bug in the old bucketing scheme. I could reproduce 
> it by modifying the existing schema with an alter table add columns and then 
> inserting new data. Repro:
> {code}
> 0: jdbc:hive2://localhost:10010> create transactional table acid_ptn_bucket1 
> (a int, b int) partitioned by(ds string) clustered by (a) into 2 buckets 
> stored as ORC TBLPROPERTIES('bucketing_version'='1', 'transactional'='true', 
> 'transactional_properties'='default');
> No rows affected (0.418 seconds)
> 0: jdbc:hive2://localhost:10010> insert into acid_ptn_bucket1 partition (ds) 
> values(1,2,'today'),(1,3,'today'),(1,4,'yesterday'),(2,2,'yesterday'),(2,3,'today'),(2,4,'today');
> 6 rows affected (3.695 seconds)
> {code}
> Data from ORC file (data as expected):
> {code}
> /apps/hive/warehouse/acid_ptn_bucket1/ds=today/delta_001_001_/bucket_0
> {"operation": 0, "originalTransaction": 1, "bucket": 536870912, "rowId": 0, 
> "currentTransaction": 1, "row": {"a": 2, "b": 4}}
> {"operation": 0, "originalTransaction": 1, "bucket": 536870912, "rowId": 1, 
> "currentTransaction": 1, "row": {"a": 2, "b": 3}}
> /apps/hive/warehouse/acid_ptn_bucket1/ds=today/delta_001_001_/bucket_1
> {"operation": 0, "originalTransaction": 1, "bucket": 536936448, "rowId": 0, 
> "currentTransaction": 1, "row": {"a": 1, "b": 3}}
> {"operation": 0, "originalTransaction": 1, "bucket": 536936448, "rowId": 1, 
> "currentTransaction": 1, "row": {"a": 1, "b": 2}}
> {code}
> Modifying table schema and inserting new data:
> {code}
> 0: jdbc:hive2://localhost:10010> alter table acid_ptn_bucket1 add columns(c 
> int);
> No rows affected (0.541 seconds)
> 0: jdbc:hive2://localhost:10010> insert into acid_ptn_bucket1 partition (ds) 
> values(3,2,1000,'yesterday'),(3,3,1001,'today'),(3,4,1002,'yesterday'),(4,2,1003,'today'),
>  (4,3,1004,'yesterday'),(4,4,1005,'today');
> 6 rows affected (3.699 seconds)
> {code}
> Data from ORC file (wrong partitioning):
> {code}
> /apps/hive/warehouse/acid_ptn_bucket1/ds=today/delta_003_003_/bucket_0
> {"operation": 0, "originalTransaction": 3, "bucket": 536870912, "rowId": 0, 
> "currentTransaction": 3, "row": {"a": 3, "b": 3, "c": 1001}}
> /apps/hive/warehouse/acid_ptn_bucket1/ds=today/delta_003_003_/bucket_1
> {"operation": 0, "originalTransaction": 3, "bucket": 536936448, "rowId": 0, 
> "currentTransaction": 3, "row": {"a": 4, "b": 4, "c": 1005}}
> {"operation": 0, "originalTransaction": 3, "bucket": 536936448, "rowId": 1, 
> "currentTransaction": 3, "row": {"a": 4, "b": 2, "c": 1003}}
> {code}
> As seen above, the expected behaviour is that new rows with column 'a' equal 
> to 3 go to bucket_1 and rows with 'a' equal to 4 go to bucket_0, but the 
> new rows were written to the opposite buckets.
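
The mismatch described above can be checked with a minimal sketch. It assumes that bucketing version 1 uses the integer value itself as the hash, masks off the sign bit, and takes the result modulo the bucket count, and that the ACID "bucket" field keeps the bucket id in the 12 bits starting at bit 16. Both helper names are hypothetical and are not part of Hive.

```python
# Sketch only: the hashing rule and the bit layout below are assumptions,
# chosen to be consistent with the ORC dumps quoted in the issue.
INT_MAX = 0x7FFFFFFF  # same value as Java's Integer.MAX_VALUE


def expected_bucket_v1(a: int, num_buckets: int) -> int:
    """Hypothetical helper: expected bucket for an int column under version 1."""
    return (a & INT_MAX) % num_buckets


def decode_bucket_id(bucket_property: int) -> int:
    """Hypothetical helper: extract the bucket id from the ACID 'bucket' field."""
    return (bucket_property >> 16) & 0xFFF


# Rows from the second insert into ds=today: (a, 'bucket' field in the ORC dump).
observed = [(3, 536870912), (4, 536936448), (4, 536936448)]

for a, bucket_property in observed:
    expected = expected_bucket_v1(a, 2)
    actual = decode_bucket_id(bucket_property)
    status = "ok" if expected == actual else "MISMATCH"
    print(f"a={a} expected bucket {expected}, written to bucket {actual}: {status}")
```

Under these assumptions the rows from the first insert (a=1 and a=2) decode to the expected buckets, while every row from the second insert lands in the other bucket.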



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21307) Need to set GzipJSONMessageEncoder as default config for EVENT_MESSAGE_FACTORY.

2019-02-22 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775779#comment-16775779
 ] 

Hive QA commented on HIVE-21307:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12959869/HIVE-21307.02.patch

{color:green}SUCCESS:{color} +1 due to 8 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15811 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning]
 (batchId=190)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16211/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16211/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16211/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12959869 - PreCommit-HIVE-Build

> Need to set GzipJSONMessageEncoder as default config for 
> EVENT_MESSAGE_FACTORY.
> ---
>
> Key: HIVE-21307
> URL: https://issues.apache.org/jira/browse/HIVE-21307
> Project: Hive
>  Issue Type: Bug
>  Components: Configuration, repl
>Affects Versions: 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: DR, Replication, pull-request-available
> Attachments: HIVE-21307.01.patch, HIVE-21307.02.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, we use JsonMessageEncoder as the default message factory for 
> Notification events. Some of these events are very large and cause OOM 
> issues in the RDBMS, so GzipJSONMessageEncoder needs to be enabled as the 
> default message factory to optimise memory usage.
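
As a rough illustration of why a gzip-based encoder helps, here is a minimal sketch. It assumes the encoder compresses the JSON text and stores it base64-encoded so that it still fits in a text column. The event payload and the function names are hypothetical and do not reflect the actual notification message format.

```python
import base64
import gzip
import json

# Hypothetical event payload: many partitions, each listing many files.
event = {
    "eventType": "ADD_PARTITION",
    "db": "default",
    "table": "t",
    "partitions": [
        {
            "ds": f"2019-02-{day:02d}",
            "files": [
                f"/warehouse/t/ds=2019-02-{day:02d}/part-{i:05d}" for i in range(50)
            ],
        }
        for day in range(1, 29)
    ],
}


def encode_plain(msg: dict) -> str:
    """Plain JSON text, as a JSON-only encoder would store it."""
    return json.dumps(msg)


def encode_gzip(msg: dict) -> str:
    """Gzip the JSON text, then base64-encode it (assumed storage format)."""
    compressed = gzip.compress(json.dumps(msg).encode("utf-8"))
    return base64.b64encode(compressed).decode("ascii")


def decode_gzip(text: str) -> dict:
    """Reverse of encode_gzip."""
    return json.loads(gzip.decompress(base64.b64decode(text)).decode("utf-8"))


print("plain JSON length:", len(encode_plain(event)))
print("gzip+base64 length:", len(encode_gzip(event)))
```

Repetitive payloads such as long file lists compress well, which is where the saving comes from. Small messages may see little benefit because of the gzip and base64 overhead.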





[jira] [Commented] (HIVE-21307) Need to set GzipJSONMessageEncoder as default config for EVENT_MESSAGE_FACTORY.

2019-02-22 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775771#comment-16775771
 ] 

Hive QA commented on HIVE-21307:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
46s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
37s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
17s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
11s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  2m 
36s{color} | {color:blue} standalone-metastore/metastore-common in master has 
29 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
34s{color} | {color:blue} common in master has 65 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
27s{color} | {color:blue} hcatalog/webhcat/java-client in master has 3 extant 
Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
44s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  
1s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
29s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
12s{color} | {color:green} The patch metastore-common passed checkstyle {color} 
|
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
16s{color} | {color:green} The patch common passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} hcatalog/webhcat/java-client: The patch generated 0 
new + 108 unchanged - 1 fixed = 108 total (was 109) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
11s{color} | {color:green} itests/hcatalog-unit: The patch generated 0 new + 26 
unchanged - 1 fixed = 26 total (was 27) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
18s{color} | {color:green} The patch hive-unit passed checkstyle {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  
7s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 34m  8s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-16211/dev-support/hive-personality.sh
 |
| git revision | master / 69a7fc5 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: standalone-metastore/metastore-common common 
hcatalog/webhcat/java-client itests/hcatalog-unit itests/hive-unit U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16211/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> 

[jira] [Updated] (HIVE-21307) Need to set GzipJSONMessageEncoder as default config for EVENT_MESSAGE_FACTORY.

2019-02-22 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-21307:

Status: Patch Available  (was: Open)

Reattached 02.patch, as the test failure was unrelated.

> Need to set GzipJSONMessageEncoder as default config for 
> EVENT_MESSAGE_FACTORY.
> ---
>
> Key: HIVE-21307
> URL: https://issues.apache.org/jira/browse/HIVE-21307
> Project: Hive
>  Issue Type: Bug
>  Components: Configuration, repl
>Affects Versions: 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: DR, Replication, pull-request-available
> Attachments: HIVE-21307.01.patch, HIVE-21307.02.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, we use JsonMessageEncoder as the default message factory for 
> Notification events. Some of these events are very large and cause OOM 
> issues in the RDBMS, so GzipJSONMessageEncoder needs to be enabled as the 
> default message factory to optimise memory usage.





[jira] [Updated] (HIVE-21307) Need to set GzipJSONMessageEncoder as default config for EVENT_MESSAGE_FACTORY.

2019-02-22 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-21307:

Attachment: (was: HIVE-21307.02.patch)

> Need to set GzipJSONMessageEncoder as default config for 
> EVENT_MESSAGE_FACTORY.
> ---
>
> Key: HIVE-21307
> URL: https://issues.apache.org/jira/browse/HIVE-21307
> Project: Hive
>  Issue Type: Bug
>  Components: Configuration, repl
>Affects Versions: 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: DR, Replication, pull-request-available
> Attachments: HIVE-21307.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, we use JsonMessageEncoder as the default message factory for 
> Notification events. Some of these events are very large and cause OOM 
> issues in the RDBMS, so GzipJSONMessageEncoder needs to be enabled as the 
> default message factory to optimise memory usage.





[jira] [Updated] (HIVE-21307) Need to set GzipJSONMessageEncoder as default config for EVENT_MESSAGE_FACTORY.

2019-02-22 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-21307:

Attachment: HIVE-21307.02.patch

> Need to set GzipJSONMessageEncoder as default config for 
> EVENT_MESSAGE_FACTORY.
> ---
>
> Key: HIVE-21307
> URL: https://issues.apache.org/jira/browse/HIVE-21307
> Project: Hive
>  Issue Type: Bug
>  Components: Configuration, repl
>Affects Versions: 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: DR, Replication, pull-request-available
> Attachments: HIVE-21307.01.patch, HIVE-21307.02.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, we use JsonMessageEncoder as the default message factory for 
> Notification events. Some of these events are very large and cause OOM 
> issues in the RDBMS, so GzipJSONMessageEncoder needs to be enabled as the 
> default message factory to optimise memory usage.





[jira] [Updated] (HIVE-21307) Need to set GzipJSONMessageEncoder as default config for EVENT_MESSAGE_FACTORY.

2019-02-22 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-21307:

Status: Open  (was: Patch Available)

> Need to set GzipJSONMessageEncoder as default config for 
> EVENT_MESSAGE_FACTORY.
> ---
>
> Key: HIVE-21307
> URL: https://issues.apache.org/jira/browse/HIVE-21307
> Project: Hive
>  Issue Type: Bug
>  Components: Configuration, repl
>Affects Versions: 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: DR, Replication, pull-request-available
> Attachments: HIVE-21307.01.patch, HIVE-21307.02.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, we use JsonMessageEncoder as the default message factory for 
> Notification events. Some of these events are very large and cause OOM 
> issues in the RDBMS, so GzipJSONMessageEncoder needs to be enabled as the 
> default message factory to optimise memory usage.





[jira] [Commented] (HIVE-21279) Avoid moving/rename operation in FileSink op for SELECT queries

2019-02-22 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775754#comment-16775754
 ] 

Hive QA commented on HIVE-21279:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12959857/HIVE-21279.6.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 15811 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[results_cache_diff_fs]
 (batchId=155)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[results_cache_1]
 (batchId=177)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[results_cache_2]
 (batchId=182)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[results_cache_capacity]
 (batchId=160)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[results_cache_invalidation2]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[results_cache_invalidation]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[results_cache_lifetime]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[results_cache_quoted_identifiers]
 (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[results_cache_temptable]
 (batchId=183)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[results_cache_transactional]
 (batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[results_cache_with_masking]
 (batchId=178)
org.apache.hadoop.hive.ql.TestAcidOnTez.testMapJoinOnMR (batchId=241)
org.apache.hive.hcatalog.streaming.TestStreaming.testStreamBucketingMatchesRegularBucketing
 (batchId=216)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16210/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16210/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16210/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12959857 - PreCommit-HIVE-Build

> Avoid moving/rename operation in FileSink op for SELECT queries
> ---
>
> Key: HIVE-21279
> URL: https://issues.apache.org/jira/browse/HIVE-21279
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21279.1.patch, HIVE-21279.2.patch, 
> HIVE-21279.3.patch, HIVE-21279.4.patch, HIVE-21279.5.patch, HIVE-21279.6.patch
>
>
> Currently, at the end of a job, the FileSink operator moves/renames the temp 
> directory to another directory from which FetchTask fetches results. This is 
> done to avoid fetching potentially partial/invalid files written by 
> failed/runaway tasks. This operation is expensive on cloud storage. It could 
> be avoided if FetchTask were passed the set of files to read instead of the 
> whole directory.
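
The idea in the description can be sketched as follows. This is an illustration only, not the patch's implementation: committed task attempts report the files they wrote, and the reader opens only those files, so stray output from failed attempts in the same directory is never read and no rename is needed. All names here are hypothetical.

```python
import os
import tempfile


def write_task_output(directory: str, name: str, rows: list) -> str:
    """Write one task attempt's rows to a file and return its path."""
    path = os.path.join(directory, name)
    with open(path, "w") as f:
        f.writelines(row + "\n" for row in rows)
    return path


def fetch(paths: list) -> list:
    """Read rows only from the explicitly listed files, in order."""
    rows = []
    for path in paths:
        with open(path) as f:
            rows.extend(line.rstrip("\n") for line in f)
    return rows


with tempfile.TemporaryDirectory() as tmp:
    # Files from attempts that committed successfully.
    committed = [
        write_task_output(tmp, "task_0_attempt_0", ["r1", "r2"]),
        write_task_output(tmp, "task_1_attempt_1", ["r3"]),
    ]
    # Partial output from a failed attempt; it is never added to the list.
    write_task_output(tmp, "task_1_attempt_0", ["partial"])

    result = fetch(committed)
    print(result)
```

Reading the whole directory would also pick up the partial file, which is the case the move/rename step guards against today.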





[jira] [Commented] (HIVE-21279) Avoid moving/rename operation in FileSink op for SELECT queries

2019-02-22 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775740#comment-16775740
 ] 

Hive QA commented on HIVE-21279:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 
58s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
7s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
50s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 
20s{color} | {color:blue} ql in master has 2261 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
59s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
50s{color} | {color:red} ql: The patch generated 44 new + 787 unchanged - 2 
fixed = 831 total (was 789) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
15s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 26m 56s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-16210/dev-support/hive-personality.sh
 |
| git revision | master / 69a7fc5 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16210/yetus/diff-checkstyle-ql.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16210/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Avoid moving/rename operation in FileSink op for SELECT queries
> ---
>
> Key: HIVE-21279
> URL: https://issues.apache.org/jira/browse/HIVE-21279
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21279.1.patch, HIVE-21279.2.patch, 
> HIVE-21279.3.patch, HIVE-21279.4.patch, HIVE-21279.5.patch, HIVE-21279.6.patch
>
>
> Currently, at the end of a job, the FileSink operator moves/renames the temp 
> directory to another directory from which FetchTask fetches results. This is 
> done to avoid fetching potentially partial/invalid files written by 
> failed/runaway tasks. This operation is expensive on cloud storage. It could 
> be avoided if FetchTask were passed the set of files to read instead of the 
> whole directory.





[jira] [Commented] (HIVE-21240) JSON SerDe Re-Write

2019-02-22 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775726#comment-16775726
 ] 

Hive QA commented on HIVE-21240:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12959849/HIVE-21240.9.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 15820 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniDruidKafkaCliDriver.testCliDriver[druidkafkamini_delimited]
 (batchId=275)
org.apache.hadoop.hive.cli.TestMiniHiveKafkaCliDriver.testCliDriver[kafka_storage_handler]
 (batchId=275)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerCustomCreatedDynamicPartitions
 (batchId=264)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerCustomCreatedDynamicPartitionsUnionAll
 (batchId=264)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerCustomNonExistent
 (batchId=264)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerHighBytesRead 
(batchId=264)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerHighShuffleBytes
 (batchId=264)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerSlowQueryElapsedTime
 (batchId=264)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerSlowQueryExecutionTime
 (batchId=264)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16209/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16209/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16209/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12959849 - PreCommit-HIVE-Build

> JSON SerDe Re-Write
> ---
>
> Key: HIVE-21240
> URL: https://issues.apache.org/jira/browse/HIVE-21240
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 4.0.0, 3.1.1
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21240.1.patch, HIVE-21240.1.patch, 
> HIVE-21240.2.patch, HIVE-21240.3.patch, HIVE-21240.4.patch, 
> HIVE-21240.5.patch, HIVE-21240.6.patch, HIVE-21240.7.patch, 
> HIVE-21240.9.patch, HIVE-24240.8.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The JSON SerDe has a few issues; I will link them to this JIRA.
> * Use Jackson Tree parser instead of manually parsing
> * Added support for base-64 encoded data (the expected format when using JSON)
> * Added support to skip blank lines (returns all columns as null values)
> * Current JSON parser accepts, but does not apply, custom timestamp formats 
> in most cases
> * Added some unit tests
> * Added cache for column-name to column-index searches, currently O(n) for 
> each row processed, for each column in the row
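
The last item in the list above, the column-name to column-index cache, can be sketched like this. It is a minimal illustration under the assumption that lookups are case-insensitive; the class and method names are hypothetical and do not come from the patch.

```python
class ColumnIndexCache:
    """Hypothetical cache from JSON field name to column index.

    The first lookup of a name does a linear scan of the columns; later
    lookups of the same name are served from a dictionary.
    """

    def __init__(self, column_names):
        # Assumption: column matching is case-insensitive.
        self._columns = [name.lower() for name in column_names]
        self._cache = {}

    def index_of(self, field_name: str) -> int:
        key = field_name.lower()
        if key not in self._cache:
            try:
                index = self._columns.index(key)  # O(n), first lookup only
            except ValueError:
                index = -1  # field does not map to any column
            self._cache[key] = index
        return self._cache[key]


cache = ColumnIndexCache(["a", "b", "c"])
for field in ["b", "B", "c", "unknown"]:
    print(field, cache.index_of(field))
```

With the cache, the per-row cost of resolving each field is a dictionary lookup rather than a scan over all columns.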





[jira] [Commented] (HIVE-21240) JSON SerDe Re-Write

2019-02-22 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775714#comment-16775714
 ] 

Hive QA commented on HIVE-21240:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
1s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
45s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
12s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
58s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 7s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
40s{color} | {color:blue} serde in master has 197 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
8s{color} | {color:blue} ql in master has 2261 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
37s{color} | {color:blue} hcatalog/core in master has 29 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
33s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
29s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
 7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} serde: The patch generated 0 new + 4 unchanged - 25 
fixed = 4 total (was 29) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
40s{color} | {color:green} ql: The patch generated 0 new + 6 unchanged - 5 
fixed = 6 total (was 11) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
13s{color} | {color:green} The patch core passed checkstyle {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
49s{color} | {color:green} serde generated 0 new + 193 unchanged - 4 fixed = 
193 total (was 197) {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
19s{color} | {color:green} ql in the patch passed. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
46s{color} | {color:green} core in the patch passed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
29s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 33m  9s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-16209/dev-support/hive-personality.sh |
| git revision | master / a33d35f |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: serde ql hcatalog/core U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-16209/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> JSON SerDe Re-Write
> ---
>
> Key: HIVE-21240
> URL: https://issues.apache.org/jira/browse/HIVE-21240
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 4.0.0, 3.1.1
>Reporter: BELUGA BEHR
>

[jira] [Updated] (HIVE-21167) Bucketing: Bucketing version 1 is incorrectly partitioning data

2019-02-22 Thread Deepak Jaiswal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-21167:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to master.

 

Thanks for reviews [~vgarg] and [~jdere]

> Bucketing: Bucketing version 1 is incorrectly partitioning data
> ---
>
> Key: HIVE-21167
> URL: https://issues.apache.org/jira/browse/HIVE-21167
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.1
>Reporter: Vaibhav Gumashta
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-21167.1.patch, HIVE-21167.2.patch, 
> HIVE-21167.3.patch, HIVE-21167.4.patch
>
>
> Murmur hash for bucketing columns was introduced in HIVE-18910, after 
> which {{'bucketing_version'='1'}} stands for the old behaviour (where, for 
> example, integer columns were partitioned based on mod values). It looks 
> like the old bucketing scheme now has a bug. I could reproduce it by 
> modifying the existing schema with an alter table add columns and then 
> inserting new data. Repro:
> {code}
> 0: jdbc:hive2://localhost:10010> create transactional table acid_ptn_bucket1 
> (a int, b int) partitioned by(ds string) clustered by (a) into 2 buckets 
> stored as ORC TBLPROPERTIES('bucketing_version'='1', 'transactional'='true', 
> 'transactional_properties'='default');
> No rows affected (0.418 seconds)
> 0: jdbc:hive2://localhost:10010> insert into acid_ptn_bucket1 partition (ds) 
> values(1,2,'today'),(1,3,'today'),(1,4,'yesterday'),(2,2,'yesterday'),(2,3,'today'),(2,4,'today');
> 6 rows affected (3.695 seconds)
> {code}
> Data from ORC file (data as expected):
> {code}
> /apps/hive/warehouse/acid_ptn_bucket1/ds=today/delta_001_001_/bucket_0
> {"operation": 0, "originalTransaction": 1, "bucket": 536870912, "rowId": 0, 
> "currentTransaction": 1, "row": {"a": 2, "b": 4}}
> {"operation": 0, "originalTransaction": 1, "bucket": 536870912, "rowId": 1, 
> "currentTransaction": 1, "row": {"a": 2, "b": 3}}
> /apps/hive/warehouse/acid_ptn_bucket1/ds=today/delta_001_001_/bucket_1
> {"operation": 0, "originalTransaction": 1, "bucket": 536936448, "rowId": 0, 
> "currentTransaction": 1, "row": {"a": 1, "b": 3}}
> {"operation": 0, "originalTransaction": 1, "bucket": 536936448, "rowId": 1, 
> "currentTransaction": 1, "row": {"a": 1, "b": 2}}
> {code}
> Modifying table schema and inserting new data:
> {code}
> 0: jdbc:hive2://localhost:10010> alter table acid_ptn_bucket1 add columns(c 
> int);
> No rows affected (0.541 seconds)
> 0: jdbc:hive2://localhost:10010> insert into acid_ptn_bucket1 partition (ds) 
> values(3,2,1000,'yesterday'),(3,3,1001,'today'),(3,4,1002,'yesterday'),(4,2,1003,'today'),
>  (4,3,1004,'yesterday'),(4,4,1005,'today');
> 6 rows affected (3.699 seconds)
> {code}
> Data from ORC file (wrong partitioning):
> {code}
> /apps/hive/warehouse/acid_ptn_bucket1/ds=today/delta_003_003_/bucket_0
> {"operation": 0, "originalTransaction": 3, "bucket": 536870912, "rowId": 0, 
> "currentTransaction": 3, "row": {"a": 3, "b": 3, "c": 1001}}
> /apps/hive/warehouse/acid_ptn_bucket1/ds=today/delta_003_003_/bucket_1
> {"operation": 0, "originalTransaction": 3, "bucket": 536936448, "rowId": 0, 
> "currentTransaction": 3, "row": {"a": 4, "b": 4, "c": 1005}}
> {"operation": 0, "originalTransaction": 3, "bucket": 536936448, "rowId": 1, 
> "currentTransaction": 3, "row": {"a": 4, "b": 2, "c": 1003}}
> {code}
> As seen above, the expected behaviour is that new data with column 'a' being 
> 3 should go to bucket1 and column 'a' being 4 should go to bucket0, but the 
> partitioning is wrong.
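
The expectation stated in the report can be checked with a small sketch. It assumes that, under bucketing version 1, the hash of an int key is the value itself and the bucket is that hash (sign bit masked off) modulo the bucket count, and that the ACID "bucket" field uses a layout with the codec version in the top 3 bits, the bucket id in the 12 bits starting at bit 16, and the statement id in the low 12 bits. The function names are hypothetical and this is not Hive code.

```python
INT_MAX = 0x7FFFFFFF  # mask that clears the sign bit of a 32-bit hash


def expected_bucket_v1(key: int, num_buckets: int) -> int:
    """Bucket for an int key under the old scheme (assumed: hash == value)."""
    return (key & INT_MAX) % num_buckets


def decode_bucket_field(encoded: int) -> dict:
    """Split the ACID 'bucket' field, assuming the layout described above."""
    return {
        "codec_version": (encoded >> 29) & 0x7,
        "bucket_id": (encoded >> 16) & 0xFFF,
        "statement_id": encoded & 0xFFF,
    }


# Rows from the ds=today partition written after the schema change:
# (value of column a, encoded "bucket" field found in the ORC file).
observed = [(3, 536870912), (4, 536936448), (4, 536936448)]

for a, encoded in observed:
    actual = decode_bucket_field(encoded)["bucket_id"]
    expected = expected_bucket_v1(a, 2)
    print(f"a={a}: written to bucket {actual}, expected bucket {expected}")
```

Under these assumptions every row listed for delta_003_003_ sits in the opposite bucket from the one the mod rule gives, while the rows written before the schema change match it.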



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21279) Avoid moving/rename operation in FileSink op for SELECT queries

2019-02-22 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-21279:
---
Attachment: HIVE-21279.6.patch

> Avoid moving/rename operation in FileSink op for SELECT queries
> ---
>
> Key: HIVE-21279
> URL: https://issues.apache.org/jira/browse/HIVE-21279
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21279.1.patch, HIVE-21279.2.patch, 
> HIVE-21279.3.patch, HIVE-21279.4.patch, HIVE-21279.5.patch, HIVE-21279.6.patch
>
>
> Currently, at the end of a job, the FileSink operator moves/renames the temp 
> directory to another directory from which FetchTask fetches the result. This 
> is done to avoid fetching potentially partial/invalid files written by 
> failed/runaway tasks. This operation is expensive on cloud storage. It could 
> be avoided if FetchTask were passed the set of files to read instead of the 
> whole directory.
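
The idea in the description can be illustrated with a toy sketch. It is not Hive code: the helper names and file names are hypothetical, and a local temp directory stands in for the job's output location. The reader is handed an explicit list of committed files, so a stray file left by a failed attempt is never read and no rename is needed.

```python
import os
import tempfile


def write_output(directory, file_name, rows):
    """Write one task attempt's rows and return the file path."""
    path = os.path.join(directory, file_name)
    with open(path, "w") as handle:
        handle.write("\n".join(rows) + "\n")
    return path


def fetch(paths):
    """Read rows only from the explicitly listed files, in order."""
    rows = []
    for path in paths:
        with open(path) as handle:
            rows.extend(line.rstrip("\n") for line in handle)
    return rows


def demo():
    with tempfile.TemporaryDirectory() as tmp_dir:
        # Files from attempts that finished successfully go into the manifest.
        manifest = [
            write_output(tmp_dir, "task0_attempt0", ["r1", "r2"]),
            write_output(tmp_dir, "task1_attempt1", ["r3"]),
        ]
        # Left behind by a failed attempt: in the directory, but never listed.
        write_output(tmp_dir, "task1_attempt0", ["partial"])
        return fetch(manifest), sorted(os.listdir(tmp_dir))


rows, files_on_disk = demo()
print(rows)
print(files_on_disk)
```

A directory listing would have picked up all three files. Reading from the manifest skips the partial one without moving anything.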





[jira] [Commented] (HIVE-21167) Bucketing: Bucketing version 1 is incorrectly partitioning data

2019-02-22 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775694#comment-16775694
 ] 

Hive QA commented on HIVE-21167:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12959833/HIVE-21167.4.patch

SUCCESS: +1 due to 1 test(s) being added or modified.

SUCCESS: +1 due to 15811 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/16208/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16208/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16208/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12959833 - PreCommit-HIVE-Build



[jira] [Updated] (HIVE-21279) Avoid moving/rename operation in FileSink op for SELECT queries

2019-02-22 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-21279:
---
Status: Patch Available  (was: Open)



[jira] [Updated] (HIVE-21279) Avoid moving/rename operation in FileSink op for SELECT queries

2019-02-22 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-21279:
---
Status: Open  (was: Patch Available)



[jira] [Commented] (HIVE-21204) Instrumentation for read/write locks in LLAP

2019-02-22 Thread slim bouguerra (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775687#comment-16775687
 ] 

slim bouguerra commented on HIVE-21204:
---

[~odraese] seems like your last patch didn't run the build.

+1 after green run.

> Instrumentation for read/write locks in LLAP
> 
>
> Key: HIVE-21204
> URL: https://issues.apache.org/jira/browse/HIVE-21204
> Project: Hive
>  Issue Type: Improvement
>  Components: llap
>Reporter: Oliver Draese
>Assignee: Oliver Draese
>Priority: Major
> Attachments: HIVE-21204.1.patch, HIVE-21204.2.patch, HIVE-21204.patch
>
>
> LLAP has several R/W locks for serialization of updates to query tracker, 
> file data, 
> Instrumentation is added to monitor the
>  * total amount of R/W locks within a particular category
>  * average + max wait/suspension time to get the R/W lock
> A category includes all lock instances for particular areas (i.e. category is 
> FileData and all R/W locks that are used in FileData instances are accounted 
> within the one category).
> The monitoring/accounting is done via Hadoop Metrics 2, making them 
> accessible via JMX. In addition, a new "locking" GET endpoint is added to the 
> LLAP daemon's REST interface. It produces output like the following example:
> {
>   "statsCollection": "enabled",
>   "lockStats": [
>     { "type": "R/W Lock Stats",
>       "label": "FileData",
>       "totalLockWaitTimeMillis": 0,
>       "readLock": {
>          "count": 0,
>          "avgWaitTimeNanos": 0,
>          "maxWaitTimeNanos": 0
>       },
>       "writeLock": {
>          "count": 0,
>          "avgWaitTimeNanos": 0,
>          "maxWaitTimeNanos": 0
>       }
>     },
>     { "type": "R/W Lock Stats",
>       "label": "QueryTracker",
>       "totalLockWaitTimeMillis": 0,
>       "readLock": {
>          "count": 0,
>          "avgWaitTimeNanos": 0,
>          "maxWaitTimeNanos": 0
>       },
>       "writeLock": {
>          "count": 0,
>          "avgWaitTimeNanos": 0,
>          "maxWaitTimeNanos": 0
>       }
>     } ]
> }
> To avoid the overhead of lock instrumentation, lock metrics collection is 
> disabled by default and can be enabled via the following configuration 
> parameter:
>   {{hive.llap.lockmetrics.collect = true}}
>   
>  
>  
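
As a rough illustration of how the endpoint's output might be consumed, the sketch below parses a payload shaped like the example in the description and reduces it to one summary per lock category. The field names come from that example; the function name and the summary keys are hypothetical, and the payload is embedded as a string instead of being fetched from a daemon.

```python
import json

# Payload shaped like the example output quoted in the description.
SAMPLE = """
{
  "statsCollection": "enabled",
  "lockStats": [
    { "type": "R/W Lock Stats", "label": "FileData",
      "totalLockWaitTimeMillis": 0,
      "readLock":  {"count": 0, "avgWaitTimeNanos": 0, "maxWaitTimeNanos": 0},
      "writeLock": {"count": 0, "avgWaitTimeNanos": 0, "maxWaitTimeNanos": 0} },
    { "type": "R/W Lock Stats", "label": "QueryTracker",
      "totalLockWaitTimeMillis": 0,
      "readLock":  {"count": 0, "avgWaitTimeNanos": 0, "maxWaitTimeNanos": 0},
      "writeLock": {"count": 0, "avgWaitTimeNanos": 0, "maxWaitTimeNanos": 0} }
  ]
}
"""


def summarize_lock_stats(payload: str) -> dict:
    """Return {label: summary} for each category, or {} if collection is off."""
    doc = json.loads(payload)
    if doc.get("statsCollection") != "enabled":
        return {}
    summary = {}
    for entry in doc.get("lockStats", []):
        read, write = entry["readLock"], entry["writeLock"]
        summary[entry["label"]] = {
            "total_wait_ms": entry["totalLockWaitTimeMillis"],
            "lock_count": read["count"] + write["count"],
            "max_wait_ns": max(read["maxWaitTimeNanos"], write["maxWaitTimeNanos"]),
        }
    return summary


for label, stats in summarize_lock_stats(SAMPLE).items():
    print(label, stats)
```

Returning an empty result when "statsCollection" is not "enabled" mirrors the note that collection is disabled by default.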





[jira] [Commented] (HIVE-21167) Bucketing: Bucketing version 1 is incorrectly partitioning data

2019-02-22 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775675#comment-16775675
 ] 

Hive QA commented on HIVE-21167:


| (/) *+1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
|| || || || master Compile Tests ||
| +1 | mvninstall | 9m 38s | master passed |
| +1 | compile | 1m 7s | master passed |
| +1 | checkstyle | 0m 42s | master passed |
| 0 | findbugs | 4m 6s | ql in master has 2261 extant Findbugs warnings. |
| +1 | javadoc | 1m 1s | master passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 30s | the patch passed |
| +1 | compile | 1m 8s | the patch passed |
| +1 | javac | 1m 8s | the patch passed |
| +1 | checkstyle | 0m 40s | ql: The patch generated 0 new + 37 unchanged - 2 fixed = 37 total (was 39) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 4m 17s | the patch passed |
| +1 | javadoc | 1m 3s | the patch passed |
|| || || || Other Tests ||
| +1 | asflicense | 0m 15s | The patch does not generate ASF License warnings. |
| | | 26m 12s | |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-16208/dev-support/hive-personality.sh |
| git revision | master / a33d35f |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-16208/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (HIVE-21001) Upgrade to calcite-1.18

2019-02-22 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775663#comment-16775663
 ] 

Hive QA commented on HIVE-21001:


| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
|| || || || master Compile Tests ||
| 0 | mvndep | 1m 38s | Maven dependency ordering for branch |
| +1 | mvninstall | 7m 12s | master passed |
| +1 | compile | 8m 20s | master passed |
| +1 | checkstyle | 3m 18s | master passed |
| 0 | findbugs | 4m 9s | ql in master has 2261 extant Findbugs warnings. |
| 0 | findbugs | 0m 31s | accumulo-handler in master has 21 extant Findbugs warnings. |
| 0 | findbugs | 0m 32s | hbase-handler in master has 15 extant Findbugs warnings. |
| +1 | javadoc | 8m 43s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 40s | Maven dependency ordering for patch |
| +1 | mvninstall | 9m 21s | the patch passed |
| +1 | compile | 8m 21s | the patch passed |
| +1 | javac | 8m 21s | the patch passed |
| -1 | checkstyle | 0m 44s | ql: The patch generated 5 new + 290 unchanged - 29 fixed = 295 total (was 319) |
| -1 | checkstyle | 2m 12s | root: The patch generated 5 new + 290 unchanged - 29 fixed = 295 total (was 319) |
| -1 | whitespace | 0m 0s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
| -1 | whitespace | 0m 0s | The patch 4 line(s) with tabs. |
| +1 | xml | 0m 2s | The patch has no ill-formed XML file. |
| +1 | findbugs | 5m 34s | the patch passed |
| +1 | javadoc | 12m 39s | the patch passed |
|| || || || Other Tests ||
| +1 | asflicense | 0m 54s | The patch does not generate ASF License warnings. |
| | | 77m 31s | |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc xml compile findbugs checkstyle |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-16207/dev-support/hive-personality.sh |
| git revision | master / a33d35f |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-16207/yetus/diff-checkstyle-ql.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-16207/yetus/diff-checkstyle-root.txt |
| whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-16207/yetus/whitespace-eol.txt |
| whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-16207/yetus/whitespace-tabs.txt |
| modules | C: ql accumulo-handler hbase-handler . U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-16207/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Upgrade to calcite-1.18

[jira] [Commented] (HIVE-21001) Upgrade to calcite-1.18

2019-02-22 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775657#comment-16775657
 ] 

Hive QA commented on HIVE-21001:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12959828/HIVE-21001.39.patch

SUCCESS: +1 due to 3 test(s) being added or modified.

ERROR: -1 due to 9 failed/errored test(s), 15811 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ambiguitycheck] (batchId=79)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[orc_ppd_char] (batchId=11)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[orc_ppd_varchar] (batchId=178)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_in] (batchId=176)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[auto_sortmerge_join_16] (batchId=136)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_in] (batchId=140)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query93] (batchId=277)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[cbo_query93] (batchId=275)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query93] (batchId=275)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/16207/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16207/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16207/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12959828 - PreCommit-HIVE-Build

> Upgrade to calcite-1.18
> ---
>
> Key: HIVE-21001
> URL: https://issues.apache.org/jira/browse/HIVE-21001
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-21001.01.patch, HIVE-21001.01.patch, 
> HIVE-21001.02.patch, HIVE-21001.03.patch, HIVE-21001.04.patch, 
> HIVE-21001.05.patch, HIVE-21001.06.patch, HIVE-21001.06.patch, 
> HIVE-21001.07.patch, HIVE-21001.08.patch, HIVE-21001.08.patch, 
> HIVE-21001.08.patch, HIVE-21001.09.patch, HIVE-21001.09.patch, 
> HIVE-21001.09.patch, HIVE-21001.10.patch, HIVE-21001.11.patch, 
> HIVE-21001.12.patch, HIVE-21001.13.patch, HIVE-21001.15.patch, 
> HIVE-21001.16.patch, HIVE-21001.17.patch, HIVE-21001.18.patch, 
> HIVE-21001.18.patch, HIVE-21001.19.patch, HIVE-21001.20.patch, 
> HIVE-21001.21.patch, HIVE-21001.22.patch, HIVE-21001.22.patch, 
> HIVE-21001.22.patch, HIVE-21001.23.patch, HIVE-21001.24.patch, 
> HIVE-21001.26.patch, HIVE-21001.26.patch, HIVE-21001.26.patch, 
> HIVE-21001.26.patch, HIVE-21001.26.patch, HIVE-21001.27.patch, 
> HIVE-21001.28.patch, HIVE-21001.29.patch, HIVE-21001.29.patch, 
> HIVE-21001.30.patch, HIVE-21001.31.patch, HIVE-21001.32.patch, 
> HIVE-21001.34.patch, HIVE-21001.35.patch, HIVE-21001.36.patch, 
> HIVE-21001.37.patch, HIVE-21001.38.patch, HIVE-21001.39.patch
>
>
> XLEAR LIBRARY CACHE 





[jira] [Commented] (HIVE-20602) hive3 crashes after 1min

2019-02-22 Thread t oo (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775656#comment-16775656
 ] 

t oo commented on HIVE-20602:
-

The workaround is to set hive.metastore.event.db.notification.api.auth to false.

> hive3 crashes after 1min
> 
>
> Key: HIVE-20602
> URL: https://issues.apache.org/jira/browse/HIVE-20602
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Metastore, Standalone Metastore
>Affects Versions: 3.0.0
>Reporter: t oo
>Priority: Blocker
>
> Running hiveserver2 process (v3.0.0 of hive) on ec2 (not emr), the process 
> starts up and for the first 1min everything is ok (I can make beeline 
> connection, create/repair/select external hive tables) but then the 
> hiveserver2 process crashes. If I restart the process and even do nothing the 
> hiveserver2 process crashes after 1min. When checking the logs I see messages 
> like 'number of connections to metastore: 1','number of connections to 
> metastore: 2','number of connections to metastore: 3' then 'could not bind to 
> port 1 port already in use' then end of the logs.
> I ran some experiments on a few different ec2 instances: with hive v2.3.2 
> the hiveserver2 process never crashes, but with hive v3.0.0 it consistently 
> crashes after a minute.
> Metastore db is mysql rds, hive metastore process never crashed. I can see 
> the external hive table ddls are persisted in the mysql (ie DBS, TBLS tables).





[jira] [Commented] (HIVE-21167) Bucketing: Bucketing version 1 is incorrectly partitioning data

2019-02-22 Thread Jason Dere (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775652#comment-16775652
 ] 

Jason Dere commented on HIVE-21167:
---

+1

> Bucketing: Bucketing version 1 is incorrectly partitioning data
> ---
>
> Key: HIVE-21167
> URL: https://issues.apache.org/jira/browse/HIVE-21167
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.1
>Reporter: Vaibhav Gumashta
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-21167.1.patch, HIVE-21167.2.patch, 
> HIVE-21167.3.patch, HIVE-21167.4.patch
>
>
> Using murmur hash for bucketing columns was introduced in HIVE-18910, 
> following which {{'bucketing_version'='1'}} stands for the old behaviour 
> (where for example integer columns were partitioned based on mod values). 
> Looks like we have a bug in the old bucketing scheme now. I could repro it 
> when modified the existing schema using an alter table add column and adding 
> new data. Repro:
> {code}
> 0: jdbc:hive2://localhost:10010> create transactional table acid_ptn_bucket1 
> (a int, b int) partitioned by(ds string) clustered by (a) into 2 buckets 
> stored as ORC TBLPROPERTIES('bucketing_version'='1', 'transactional'='true', 
> 'transactional_properties'='default');
> No rows affected (0.418 seconds)
> 0: jdbc:hive2://localhost:10010> insert into acid_ptn_bucket1 partition (ds) 
> values(1,2,'today'),(1,3,'today'),(1,4,'yesterday'),(2,2,'yesterday'),(2,3,'today'),(2,4,'today');
> 6 rows affected (3.695 seconds)
> {code}
> Data from ORC file (data as expected):
> {code}
> /apps/hive/warehouse/acid_ptn_bucket1/ds=today/delta_001_001_/bucket_0
> {"operation": 0, "originalTransaction": 1, "bucket": 536870912, "rowId": 0, 
> "currentTransaction": 1, "row": {"a": 2, "b": 4}}
> {"operation": 0, "originalTransaction": 1, "bucket": 536870912, "rowId": 1, 
> "currentTransaction": 1, "row": {"a": 2, "b": 3}}
> /apps/hive/warehouse/acid_ptn_bucket1/ds=today/delta_001_001_/bucket_1
> {"operation": 0, "originalTransaction": 1, "bucket": 536936448, "rowId": 0, 
> "currentTransaction": 1, "row": {"a": 1, "b": 3}}
> {"operation": 0, "originalTransaction": 1, "bucket": 536936448, "rowId": 1, 
> "currentTransaction": 1, "row": {"a": 1, "b": 2}}
> {code}
> Modifying table schema and inserting new data:
> {code}
> 0: jdbc:hive2://localhost:10010> alter table acid_ptn_bucket1 add columns(c 
> int);
> No rows affected (0.541 seconds)
> 0: jdbc:hive2://localhost:10010> insert into acid_ptn_bucket1 partition (ds) 
> values(3,2,1000,'yesterday'),(3,3,1001,'today'),(3,4,1002,'yesterday'),(4,2,1003,'today'),
>  (4,3,1004,'yesterday'),(4,4,1005,'today');
> 6 rows affected (3.699 seconds)
> {code}
> Data from ORC file (wrong partitioning):
> {code}
> /apps/hive/warehouse/acid_ptn_bucket1/ds=today/delta_003_003_/bucket_0
> {"operation": 0, "originalTransaction": 3, "bucket": 536870912, "rowId": 0, 
> "currentTransaction": 3, "row": {"a": 3, "b": 3, "c": 1001}}
> /apps/hive/warehouse/acid_ptn_bucket1/ds=today/delta_003_003_/bucket_1
> {"operation": 0, "originalTransaction": 3, "bucket": 536936448, "rowId": 0, 
> "currentTransaction": 3, "row": {"a": 4, "b": 4, "c": 1005}}
> {"operation": 0, "originalTransaction": 3, "bucket": 536936448, "rowId": 1, 
> "currentTransaction": 3, "row": {"a": 4, "b": 2, "c": 1003}}
> {code}
> As seen above, the expected behaviour is that new rows with column 'a' equal 
> to 3 go to bucket_1 and rows with 'a' equal to 4 go to bucket_0, but they 
> were written to the opposite buckets.
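
For reference, the expected placement under the old scheme can be checked with a small sketch. It assumes that bucketing version 1 hashes an int column to its own value and takes the non-negative hash modulo the bucket count, and that the ORC "bucket" field keeps the bucket id in bits 16-27. Both are assumptions about Hive internals, and the class and method names are hypothetical.

```java
// Sketch only: class and method names are hypothetical, not Hive APIs.
final class BucketingV1Sketch {

    // Assumed version-1 rule: an int column hashes to its own value,
    // and the bucket is the non-negative hash modulo the bucket count.
    static int bucketFor(int value, int numBuckets) {
        return (value & Integer.MAX_VALUE) % numBuckets;
    }

    // Assumed layout of the ACID "bucket" field: bits 16-27 hold the bucket id.
    static int bucketIdFromField(int bucketField) {
        return (bucketField >>> 16) & 0xFFF;
    }

    public static void main(String[] args) {
        // Values of column 'a' used in the repro, with 2 buckets.
        for (int a = 1; a <= 4; a++) {
            System.out.println("a=" + a + " -> bucket_" + bucketFor(a, 2));
        }
        // "bucket" field values copied from the ORC dumps in the report.
        System.out.println("536870912 -> bucket id " + bucketIdFromField(536870912));
        System.out.println("536936448 -> bucket id " + bucketIdFromField(536936448));
    }
}
```

Under these assumptions, odd values of 'a' map to bucket 1 and even values to bucket 0, which matches the first insert in the report and contradicts the second.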





[jira] [Commented] (HIVE-21306) Upgrade HttpComponents to the latest versions similar to what Hadoop has done.

2019-02-22 Thread Thejas M Nair (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775636#comment-16775636
 ] 

Thejas M Nair commented on HIVE-21306:
--

+1


> Upgrade HttpComponents to the latest versions similar to what Hadoop has done.
> --
>
> Key: HIVE-21306
> URL: https://issues.apache.org/jira/browse/HIVE-21306
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 4.0.0
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21306.01.patch
>
>   Original Estimate: 24h
>  Time Spent: 10m
>  Remaining Estimate: 23h 50m
>
> The use of HTTPClient 4.5.2 breaks the use of SPNEGO over TLS.
> It mistakenly adds HTTPS instead of HTTP to the principal when running over 
> SSL, which breaks the authentication.
> This was upgraded recently in Hadoop and needs to be done for Hive as well.
> See HADOOP-16076, where we upgraded from 4.5.2 and 4.4.4 to 4.5.6 and 4.4.10.
> 
> 4.5.2
> 4.4.4
> + 4.5.6
> + 4.4.10





[jira] [Updated] (HIVE-21240) JSON SerDe Re-Write

2019-02-22 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-21240:
---
Attachment: HIVE-21240.9.patch

> JSON SerDe Re-Write
> ---
>
> Key: HIVE-21240
> URL: https://issues.apache.org/jira/browse/HIVE-21240
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 4.0.0, 3.1.1
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21240.1.patch, HIVE-21240.1.patch, 
> HIVE-21240.2.patch, HIVE-21240.3.patch, HIVE-21240.4.patch, 
> HIVE-21240.5.patch, HIVE-21240.6.patch, HIVE-21240.7.patch, 
> HIVE-21240.9.patch, HIVE-24240.8.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The JSON SerDe has a few issues, I will link them to this JIRA.
> * Use Jackson Tree parser instead of manually parsing
> * Added support for base-64 encoded data (the expected format when using JSON)
> * Added support to skip blank lines (returns all columns as null values)
> * Current JSON parser accepts, but does not apply, custom timestamp formats 
> in most cases
> * Added some unit tests
> * Added cache for column-name to column-index searches, currently O\(n\) for 
> each row processed, for each column in the row
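
The last item, caching column-name to column-index lookups, can be sketched as below. This is not the patch's code: `ColumnIndexCache` and its methods are hypothetical names, and the case-insensitive matching is an assumption made for the example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not taken from the patch.
final class ColumnIndexCache {
    private final Map<String, Integer> indexByName = new HashMap<>();

    // Build the map once per table schema: O(n) in the number of columns.
    ColumnIndexCache(List<String> columnNames) {
        for (int i = 0; i < columnNames.size(); i++) {
            // putIfAbsent keeps the first index if a name repeats.
            indexByName.putIfAbsent(normalize(columnNames.get(i)), i);
        }
    }

    // Expected O(1) per lookup instead of a linear scan; -1 means unknown field.
    int indexOf(String fieldName) {
        return indexByName.getOrDefault(normalize(fieldName), -1);
    }

    // Assumption: column names are matched case-insensitively.
    private static String normalize(String name) {
        return name.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        ColumnIndexCache cache = new ColumnIndexCache(Arrays.asList("id", "name", "ts"));
        System.out.println(cache.indexOf("name"));
        System.out.println(cache.indexOf("missing"));
    }
}
```

The map is built once per schema, so the per-row cost of resolving each JSON field drops from a scan over all columns to a single hash lookup.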





[jira] [Updated] (HIVE-21240) JSON SerDe Re-Write

2019-02-22 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-21240:
---
Status: Patch Available  (was: Open)

Added patch to fix JSON writer when using derived column names (_c0, _c1, etc.)

OK.  So, the Kafka_Handler Q-Test fails locally on trunk as well, so please 
ignore that UT failure.  If Jenkins comes back clean, please consider accepting 
[^HIVE-21240.9.patch] for inclusion into the project.

 

Reads with this SerDe are a bit quicker and writes a bit slower.  I'm not exactly 
sure what makes the reads faster, but the slower writes are expected, as the 
writer uses the Jackson library more fully, whereas the current 
implementation uses its own very lightweight writing mechanism.

> JSON SerDe Re-Write
> ---
>
> Key: HIVE-21240
> URL: https://issues.apache.org/jira/browse/HIVE-21240
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 3.1.1, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21240.1.patch, HIVE-21240.1.patch, 
> HIVE-21240.2.patch, HIVE-21240.3.patch, HIVE-21240.4.patch, 
> HIVE-21240.5.patch, HIVE-21240.6.patch, HIVE-21240.7.patch, 
> HIVE-21240.9.patch, HIVE-24240.8.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The JSON SerDe has a few issues, I will link them to this JIRA.
> * Use Jackson Tree parser instead of manually parsing
> * Added support for base-64 encoded data (the expected format when using JSON)
> * Added support to skip blank lines (returns all columns as null values)
> * Current JSON parser accepts, but does not apply, custom timestamp formats 
> in most cases
> * Added some unit tests
> * Added cache for column-name to column-index searches, currently O\(n\) for 
> each row processed, for each column in the row





[jira] [Updated] (HIVE-21240) JSON SerDe Re-Write

2019-02-22 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-21240:
---
Attachment: (was: HIVE-24240.8.patch)

> JSON SerDe Re-Write
> ---
>
> Key: HIVE-21240
> URL: https://issues.apache.org/jira/browse/HIVE-21240
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 4.0.0, 3.1.1
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21240.1.patch, HIVE-21240.1.patch, 
> HIVE-21240.2.patch, HIVE-21240.3.patch, HIVE-21240.4.patch, 
> HIVE-21240.5.patch, HIVE-21240.6.patch, HIVE-21240.7.patch, HIVE-24240.8.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The JSON SerDe has a few issues, I will link them to this JIRA.
> * Use Jackson Tree parser instead of manually parsing
> * Added support for base-64 encoded data (the expected format when using JSON)
> * Added support to skip blank lines (returns all columns as null values)
> * Current JSON parser accepts, but does not apply, custom timestamp formats 
> in most cases
> * Added some unit tests
> * Added cache for column-name to column-index searches, currently O\(n\) for 
> each row processed, for each column in the row





[jira] [Updated] (HIVE-21240) JSON SerDe Re-Write

2019-02-22 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-21240:
---
Attachment: (was: HIVE-24240.8.patch)

> JSON SerDe Re-Write
> ---
>
> Key: HIVE-21240
> URL: https://issues.apache.org/jira/browse/HIVE-21240
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 4.0.0, 3.1.1
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21240.1.patch, HIVE-21240.1.patch, 
> HIVE-21240.2.patch, HIVE-21240.3.patch, HIVE-21240.4.patch, 
> HIVE-21240.5.patch, HIVE-21240.6.patch, HIVE-21240.7.patch, HIVE-24240.8.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The JSON SerDe has a few issues, I will link them to this JIRA.
> * Use Jackson Tree parser instead of manually parsing
> * Added support for base-64 encoded data (the expected format when using JSON)
> * Added support to skip blank lines (returns all columns as null values)
> * Current JSON parser accepts, but does not apply, custom timestamp formats 
> in most cases
> * Added some unit tests
> * Added cache for column-name to column-index searches, currently O\(n\) for 
> each row processed, for each column in the row





[jira] [Updated] (HIVE-21240) JSON SerDe Re-Write

2019-02-22 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-21240:
---
Status: Open  (was: Patch Available)

> JSON SerDe Re-Write
> ---
>
> Key: HIVE-21240
> URL: https://issues.apache.org/jira/browse/HIVE-21240
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 3.1.1, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21240.1.patch, HIVE-21240.1.patch, 
> HIVE-21240.2.patch, HIVE-21240.3.patch, HIVE-21240.4.patch, 
> HIVE-21240.5.patch, HIVE-21240.6.patch, HIVE-21240.7.patch, 
> HIVE-21240.8.patch, HIVE-21240.8.patch, HIVE-24240.8.patch, 
> HIVE-24240.8.patch, HIVE-24240.8.patch, HIVE-24240.8.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The JSON SerDe has a few issues, I will link them to this JIRA.
> * Use Jackson Tree parser instead of manually parsing
> * Added support for base-64 encoded data (the expected format when using JSON)
> * Added support to skip blank lines (returns all columns as null values)
> * Current JSON parser accepts, but does not apply, custom timestamp formats 
> in most cases
> * Added some unit tests
> * Added cache for column-name to column-index searches, currently O\(n\) for 
> each row processed, for each column in the row





[jira] [Updated] (HIVE-21240) JSON SerDe Re-Write

2019-02-22 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-21240:
---
Attachment: (was: HIVE-24240.8.patch)

> JSON SerDe Re-Write
> ---
>
> Key: HIVE-21240
> URL: https://issues.apache.org/jira/browse/HIVE-21240
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 4.0.0, 3.1.1
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21240.1.patch, HIVE-21240.1.patch, 
> HIVE-21240.2.patch, HIVE-21240.3.patch, HIVE-21240.4.patch, 
> HIVE-21240.5.patch, HIVE-21240.6.patch, HIVE-21240.7.patch, HIVE-24240.8.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The JSON SerDe has a few issues, I will link them to this JIRA.
> * Use Jackson Tree parser instead of manually parsing
> * Added support for base-64 encoded data (the expected format when using JSON)
> * Added support to skip blank lines (returns all columns as null values)
> * Current JSON parser accepts, but does not apply, custom timestamp formats 
> in most cases
> * Added some unit tests
> * Added cache for column-name to column-index searches, currently O\(n\) for 
> each row processed, for each column in the row





[jira] [Updated] (HIVE-21240) JSON SerDe Re-Write

2019-02-22 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-21240:
---
Attachment: (was: HIVE-21240.8.patch)

> JSON SerDe Re-Write
> ---
>
> Key: HIVE-21240
> URL: https://issues.apache.org/jira/browse/HIVE-21240
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 4.0.0, 3.1.1
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21240.1.patch, HIVE-21240.1.patch, 
> HIVE-21240.2.patch, HIVE-21240.3.patch, HIVE-21240.4.patch, 
> HIVE-21240.5.patch, HIVE-21240.6.patch, HIVE-21240.7.patch, HIVE-24240.8.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The JSON SerDe has a few issues, I will link them to this JIRA.
> * Use Jackson Tree parser instead of manually parsing
> * Added support for base-64 encoded data (the expected format when using JSON)
> * Added support to skip blank lines (returns all columns as null values)
> * Current JSON parser accepts, but does not apply, custom timestamp formats 
> in most cases
> * Added some unit tests
> * Added cache for column-name to column-index searches, currently O\(n\) for 
> each row processed, for each column in the row





[jira] [Updated] (HIVE-21240) JSON SerDe Re-Write

2019-02-22 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-21240:
---
Attachment: (was: HIVE-21240.8.patch)

> JSON SerDe Re-Write
> ---
>
> Key: HIVE-21240
> URL: https://issues.apache.org/jira/browse/HIVE-21240
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 4.0.0, 3.1.1
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21240.1.patch, HIVE-21240.1.patch, 
> HIVE-21240.2.patch, HIVE-21240.3.patch, HIVE-21240.4.patch, 
> HIVE-21240.5.patch, HIVE-21240.6.patch, HIVE-21240.7.patch, HIVE-24240.8.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The JSON SerDe has a few issues, I will link them to this JIRA.
> * Use Jackson Tree parser instead of manually parsing
> * Added support for base-64 encoded data (the expected format when using JSON)
> * Added support to skip blank lines (returns all columns as null values)
> * Current JSON parser accepts, but does not apply, custom timestamp formats 
> in most cases
> * Added some unit tests
> * Added cache for column-name to column-index searches, currently O\(n\) for 
> each row processed, for each column in the row





[jira] [Commented] (HIVE-21292) Break up DDLTask 1 - extract Database related operations

2019-02-22 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775617#comment-16775617
 ] 

Hive QA commented on HIVE-21292:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12959793/HIVE-21292.06.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 15811 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16206/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16206/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16206/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12959793 - PreCommit-HIVE-Build

> Break up DDLTask 1 - extract Database related operations
> 
>
> Key: HIVE-21292
> URL: https://issues.apache.org/jira/browse/HIVE-21292
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21292.01.patch, HIVE-21292.02.patch, 
> HIVE-21292.03.patch, HIVE-21292.04.patch, HIVE-21292.05.patch, 
> HIVE-21292.06.patch
>
>  Time Spent: 7h
>  Remaining Estimate: 0h
>
> DDLTask is a huge class, more than 5000 lines long. The related DDLWork is 
> also a huge class, with a field for each DDL operation it supports. The 
> goal is to refactor these into more manageable classes under the package 
> org.apache.hadoop.hive.ql.exec.ddl:
>  * have a separate class for each operation
>  * have a package for each operation group (database ddl, table ddl, etc), so 
> the number of classes under a package is more manageable
>  * make all the requests (DDLDesc subclasses) immutable
>  * DDLTask should be agnostic to the actual operations
>  * for now, ignore the issue of some operations handled by DDLTask not being 
> actual DDL operations (lock, unlock, desc...)
> In the interim, while there are two DDLTask and DDLWork classes in the code 
> base, the new ones in the new package are called DDLTask2 and DDLWork2, which 
> avoids fully qualified class names where both the old and the new classes 
> are in use.
> Step #1: extract all the database related operations from the old DDLTask, 
> and move them under the new package. Also create the new internal framework.
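
The target structure described in the issue (one class per operation, immutable request objects, and a task that is agnostic to the concrete operation) could look roughly like the sketch below. Every name in it is hypothetical and chosen for illustration; it is not the code in the attached patches.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical names throughout; not the classes from the patch.
// The task only maps a request type to its operation and knows no DDL details.
final class DdlTaskSketch {
    private final Map<Class<? extends DdlRequest>, Function<DdlRequest, DdlOperation>> registry =
        new HashMap<>();

    <T extends DdlRequest> void register(Class<T> type, Function<T, DdlOperation> factory) {
        registry.put(type, request -> factory.apply(type.cast(request)));
    }

    int run(DdlRequest request) {
        Function<DdlRequest, DdlOperation> factory = registry.get(request.getClass());
        if (factory == null) {
            throw new IllegalArgumentException(
                "No operation registered for " + request.getClass().getName());
        }
        return factory.apply(request).execute();
    }

    public static void main(String[] args) {
        DdlTaskSketch task = new DdlTaskSketch();
        task.register(CreateDatabaseRequest.class, CreateDatabaseOperation::new);
        System.out.println(task.run(new CreateDatabaseRequest("sales")));
    }
}

interface DdlRequest {}

interface DdlOperation {
    int execute();
}

// Immutable request: final class, final field, no setters.
final class CreateDatabaseRequest implements DdlRequest {
    private final String name;

    CreateDatabaseRequest(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }
}

// One class per operation.
final class CreateDatabaseOperation implements DdlOperation {
    private final CreateDatabaseRequest request;

    CreateDatabaseOperation(CreateDatabaseRequest request) {
        this.request = request;
    }

    @Override
    public int execute() {
        System.out.println("creating database " + request.getName());
        return 0; // 0 stands for success in this sketch
    }
}
```

Adding a new operation in this layout means adding one request class, one operation class, and one `register` call, without touching the task itself.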





[jira] [Commented] (HIVE-21292) Break up DDLTask 1 - extract Database related operations

2019-02-22 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775581#comment-16775581
 ] 

Hive QA commented on HIVE-21292:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 44s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 21s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 14s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 18s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  4s{color} | {color:blue} ql in master has 2261 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 37s{color} | {color:blue} hcatalog/core in master has 29 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 46s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 41s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 29s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 46s{color} | {color:green} ql: The patch generated 0 new + 507 unchanged - 25 fixed = 507 total (was 532) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 14s{color} | {color:green} hcatalog/core: The patch generated 0 new + 40 unchanged - 2 fixed = 40 total (was 42) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 18s{color} | {color:green} The patch hive-unit passed checkstyle {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 20s{color} | {color:green} ql generated 0 new + 2260 unchanged - 1 fixed = 2260 total (was 2261) {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 44s{color} | {color:green} core in the patch passed. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 51s{color} | {color:green} hive-unit in the patch passed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 41s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 14s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 35m 21s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-16206/dev-support/hive-personality.sh |
| git revision | master / a33d35f |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: ql hcatalog/core itests/hive-unit U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-16206/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Break up DDLTask 1 - extract Database related operations
> 
>
> Key: HIVE-21292
> URL: https://issues.apache.org/jira/browse/HIVE-21292
> Project: Hive
>  Issue Type: Improvement
>   

[jira] [Commented] (HIVE-21181) Hive pre-upgrade tool not working with HDFS HA, tries connecting to nameservice as it was a NameNode

2019-02-22 Thread Jason Dere (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775561#comment-16775561
 ] 

Jason Dere commented on HIVE-21181:
---

Sounds like one solution to this issue is to make sure the HDFS conf dir is in 
the classpath when running the pre-upgrade tool.
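
A quick way to test that theory is to check whether the HDFS client configuration files are visible as classpath resources in the JVM that runs the tool. The sketch below is a hypothetical diagnostic, not part of the pre-upgrade tool. It assumes the Hadoop client picks up core-site.xml and hdfs-site.xml from the classpath, so that a missing hdfs-site.xml would leave the HA nameservice undefined and make the client treat "mytestcluster" as a plain host name.

```java
import java.net.URL;

// Hypothetical diagnostic, not part of the pre-upgrade tool.
final class HdfsConfOnClasspathCheck {

    // Returns the location of a resource on the classpath, or null if it is absent.
    static URL locate(String resourceName) {
        return Thread.currentThread().getContextClassLoader().getResource(resourceName);
    }

    public static void main(String[] args) {
        // File names assumed to be the ones the Hadoop client reads by default.
        for (String name : new String[] {"core-site.xml", "hdfs-site.xml"}) {
            URL url = locate(name);
            System.out.println(name + " -> " + (url == null ? "NOT on classpath" : url.toString()));
        }
    }
}
```

Run it with the same classpath as the pre-upgrade tool. If hdfs-site.xml is reported as missing, adding the HDFS conf dir to the classpath is the first thing to try.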

> Hive pre-upgrade tool not working with HDFS HA, tries connecting to 
> nameservice as it was a NameNode
> 
>
> Key: HIVE-21181
> URL: https://issues.apache.org/jira/browse/HIVE-21181
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
> Environment: Centos 7.4.1708
> kernel 3.10.0-693.11.6.el7.x86_64
> Ambari 2.6.2.2
> HDP-2.6.5.0-292
> Hive 1.2.1000
> HDFS 2.7.3
>Reporter: Attila Csaba Marosi
>Priority: Major
> Attachments: core-site.xml, hdfs-site.xml
>
>
> While preparing the HDP-2.6.5 -> HDP-3.1 upgrade of a production cluster, we 
> noticed issues with the hive-pre-upgrade tool. When we tried running it, we 
> got this exception:
> {{Found Acid table: default.hello_acid
> 2019-01-28 15:54:20,331 ERROR [main] acid.PreUpgradeTool (PreUpgradeTool.java:main(152)) - PreUpgradeTool failed
> java.lang.IllegalArgumentException: java.net.UnknownHostException: mytestcluster
> at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:439)
> at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:321)
> at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:176)
> at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:696)
> at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:636)
> at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:160)
> at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2796)
> at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:99)
> at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2830)
> at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2812)
> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:390)
> at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
> at org.apache.hadoop.hive.upgrade.acid.PreUpgradeTool.needsCompaction(PreUpgradeTool.java:417)
> at org.apache.hadoop.hive.upgrade.acid.PreUpgradeTool.getCompactionCommands(PreUpgradeTool.java:384)
> at org.apache.hadoop.hive.upgrade.acid.PreUpgradeTool.getCompactionCommands(PreUpgradeTool.java:374)
> at org.apache.hadoop.hive.upgrade.acid.PreUpgradeTool.prepareAcidUpgradeInternal(PreUpgradeTool.java:235)
> at org.apache.hadoop.hive.upgrade.acid.PreUpgradeTool.main(PreUpgradeTool.java:149)
> Caused by: java.net.UnknownHostException: mytestcluster
> ... 17 more}}
> We ran it on a kerberized test cluster built from the same blueprint as the 
> production clusters, with HDP-2.6.5.0-292, Hive 1.2.1000, HDFS 2.7.3, with 
> HDFS HA and without Hive HA.
> We enabled Hive ACID and created the same example ACID table as shown in 
> https://hortonworks.com/tutorial/using-hive-acid-transactions-to-insert-update-and-delete-data/
> We followed the steps described at 
> https://docs.hortonworks.com/HDPDocuments/Ambari-2.7.3.0/bk_ambari-upgrade-major/content/prepare_hive_for_upgrade.html
>  , ran kinit, and used the "-Djavax.security.auth.useSubjectCredsOnly=false" 
> parameter.
> Without the ACID table there is no issue.
> I'm attaching the hdfs-site.xml and core-site.xml.





[jira] [Commented] (HIVE-21293) Fix ambiguity in grammar warnings at compilation time (II)

2019-02-22 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1677#comment-1677
 ] 

Hive QA commented on HIVE-21293:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12959758/HIVE-21293.01.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 15811 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16205/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16205/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16205/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12959758 - PreCommit-HIVE-Build

> Fix ambiguity in grammar warnings at compilation time (II)
> --
>
> Key: HIVE-21293
> URL: https://issues.apache.org/jira/browse/HIVE-21293
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 4.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Laszlo Bodor
>Priority: Major
> Attachments: HIVE-21293.01.patch
>
>
> These are the warnings at compilation time:
> {code}
> warning(200): IdentifiersParser.g:424:5:
> Decision can match input such as "KW_UNKNOWN" using multiple alternatives: 1, 
> 10
> As a result, alternative(s) 10 were disabled for that input
> {code}
> This means that multiple parser rules can match certain query text, possibly 
> leading to unexpected errors at parsing time.





[jira] [Updated] (HIVE-21301) Show tables statement to include views and materialized views

2019-02-22 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-21301:
---
   Resolution: Fixed
Fix Version/s: 3.2.0
   4.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master, branch-3. Thanks for reviewing [~ashutoshc]

> Show tables statement to include views and materialized views
> -
>
> Key: HIVE-21301
> URL: https://issues.apache.org/jira/browse/HIVE-21301
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: TODOC3.2, pull-request-available
> Fix For: 4.0.0, 3.2.0
>
> Attachments: HIVE-21301.01.patch, HIVE-21301.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> HIVE-19974 introduced a backwards-incompatible change: the {{SHOW TABLES}} 
> statement shows only managed/external tables in the system.
> This issue restores the old behavior, with {{SHOW TABLES}} showing all 
> queryable entities, including views and materialized views.
> To provide information about table types, a {{SHOW EXTENDED TABLES}} 
> statement is introduced, which includes an additional column with the table 
> type for each of the tables listed.
> In addition, the show tables statements can be filtered with a {{WHERE 
> `table_type` = 'ANY_TYPE'}} clause.





[jira] [Issue Comment Deleted] (HIVE-21240) JSON SerDe Re-Write

2019-02-22 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-21240:
---
Comment: was deleted

(was: Though, I am getting a failure in some scenarios that are not picked up 
in the UTs.  I need to investigate them further.)

> JSON SerDe Re-Write
> ---
>
> Key: HIVE-21240
> URL: https://issues.apache.org/jira/browse/HIVE-21240
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 4.0.0, 3.1.1
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21240.1.patch, HIVE-21240.1.patch, 
> HIVE-21240.2.patch, HIVE-21240.3.patch, HIVE-21240.4.patch, 
> HIVE-21240.5.patch, HIVE-21240.6.patch, HIVE-21240.7.patch, 
> HIVE-21240.8.patch, HIVE-21240.8.patch, HIVE-24240.8.patch, 
> HIVE-24240.8.patch, HIVE-24240.8.patch, HIVE-24240.8.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The JSON SerDe has a few issues; I will link them to this JIRA.
> * Use Jackson Tree parser instead of manually parsing
> * Added support for base-64 encoded data (the expected format when using JSON)
> * Added support to skip blank lines (returns all columns as null values)
> * Current JSON parser accepts, but does not apply, custom timestamp formats 
> in most cases
> * Added some unit tests
> * Added cache for column-name to column-index searches, currently O(n) for 
> each row processed, for each column in the row
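
The last bullet above (replacing a per-row, per-column O(n) name search with a cache) can be illustrated with a minimal sketch. This is not the patch's code: the function and class names are hypothetical, and case-insensitive matching of column names is an assumption made for the example.

```python
from typing import Dict, List, Optional


def find_index_linear(column_names: List[str], name: str) -> Optional[int]:
    """O(n) lookup: scan the column list on every call."""
    for i, candidate in enumerate(column_names):
        if candidate.lower() == name.lower():
            return i
    return None


class ColumnIndexCache:
    """Build the name -> index map once; later lookups are O(1) on average."""

    def __init__(self, column_names: List[str]) -> None:
        self._index: Dict[str, int] = {}
        for i, column in enumerate(column_names):
            # Keep the first occurrence so results match the linear scan.
            self._index.setdefault(column.lower(), i)

    def find(self, name: str) -> Optional[int]:
        # Assumption: lookups are case-insensitive, like the scan above.
        return self._index.get(name.lower())
```

With the map built once per table schema, each field name in each JSON row costs a single dictionary lookup instead of a scan over all columns.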





[jira] [Issue Comment Deleted] (HIVE-21240) JSON SerDe Re-Write

2019-02-22 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-21240:
---
Comment: was deleted

(was: OK, I figured out the issue.  I am running this SerDe in CDH 6.1 (based 
on Hive 2.2) and it fails with a version-mismatch issue when handling dates.

 

This patch contains a JsonSerDe which is faster (read) and more feature rich 
than the existing JsonSerde.  Please accept the latest patch for inclusion into 
the project.)

> JSON SerDe Re-Write
> ---
>
> Key: HIVE-21240
> URL: https://issues.apache.org/jira/browse/HIVE-21240
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 4.0.0, 3.1.1
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21240.1.patch, HIVE-21240.1.patch, 
> HIVE-21240.2.patch, HIVE-21240.3.patch, HIVE-21240.4.patch, 
> HIVE-21240.5.patch, HIVE-21240.6.patch, HIVE-21240.7.patch, 
> HIVE-21240.8.patch, HIVE-21240.8.patch, HIVE-24240.8.patch, 
> HIVE-24240.8.patch, HIVE-24240.8.patch, HIVE-24240.8.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The JSON SerDe has a few issues; I will link them to this JIRA.
> * Use Jackson Tree parser instead of manually parsing
> * Added support for base-64 encoded data (the expected format when using JSON)
> * Added support to skip blank lines (returns all columns as null values)
> * Current JSON parser accepts, but does not apply, custom timestamp formats 
> in most cases
> * Added some unit tests
> * Added cache for column-name to column-index searches, currently O(n) for 
> each row processed, for each column in the row





[jira] [Commented] (HIVE-21305) LLAP: Option to skip cache for ETL queries

2019-02-22 Thread Gopal V (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775544#comment-16775544
 ] 

Gopal V commented on HIVE-21305:


bq. We decide if the query inserts into a table then we do not add entries to 
the cache, but we still use the existing cache elements?

The cache does the read through, so the cache is in charge of reading data into 
itself - the items are not read and then placed into the cache.

bq. We might be better off caching the small tables but skipping the big ones.

Once you decide not to cache in a scenario, the smallest tables are the least 
worth caching - the improvement in performance is going to be smaller as the 
tables get smaller.

A more granular decision might be helpful, but this is a "too obvious" general 
ticket & not the final version (we will learn as we implement and deploy).

The original customer case was for the SerDeEncodedDataReader (which burns CPU 
to do intermediate transforms, not just caching data), not the ORC cache.

And the real issue was displacement as well (the existing "hot data"  getting 
displaced by this) - in the most common scenario, the text data is never going 
to be read again.
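
The displacement concern in the comment above can be shown with a toy read-through cache. This is only a sketch under stated assumptions: the class, the `populate` flag, and the loader callback are hypothetical and do not reflect LLAP's actual cache or IO interfaces.

```python
from typing import Callable, Dict, List


class ReadThroughCache:
    """Toy read-through cache: callers ask the cache, and the cache loads data."""

    def __init__(self, loader: Callable[[str], bytes]) -> None:
        self._loader = loader
        self._entries: Dict[str, bytes] = {}
        self.loads = 0  # number of times the backing store was read

    def read(self, key: str, populate: bool = True) -> bytes:
        if key in self._entries:
            # Existing hot entries are still served, even to ETL reads.
            return self._entries[key]
        data = self._loader(key)
        self.loads += 1
        if populate:
            # Hypothetical switch: an ETL read passes populate=False so that
            # one-off data does not take space needed by hot entries.
            self._entries[key] = data
        return data

    def cached_keys(self) -> List[str]:
        return sorted(self._entries)
```

In this sketch an ETL read with `populate=False` still benefits from existing hits but never admits text data that is unlikely to be read again.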

> LLAP: Option to skip cache for ETL queries
> --
>
> Key: HIVE-21305
> URL: https://issues.apache.org/jira/browse/HIVE-21305
> Project: Hive
>  Issue Type: Improvement
>  Components: llap
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Priority: Major
>
> To prevent ETL queries from polluting the cache, it would be good to detect 
> such queries at compile time and optionally skip LLAP IO for them. 
> org.apache.hadoop.hive.ql.parse.QBParseInfo.hasInsertTables() is the simplest 
> way to catch ETL queries.





[jira] [Comment Edited] (HIVE-21240) JSON SerDe Re-Write

2019-02-22 Thread BELUGA BEHR (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775543#comment-16775543
 ] 

BELUGA BEHR edited comment on HIVE-21240 at 2/22/19 7:41 PM:
-

OK, I figured out the issue.  I am running this SerDe in CDH 6.1 (based on Hive 
2.2) and it fails with a version-mismatch issue when handling dates.

 

This patch contains a JsonSerDe which is faster (read) and more feature rich 
than the existing JsonSerde.  Please accept the latest patch for inclusion into 
the project.


was (Author: belugabehr):
OK, I figured out the issue.  I am running this SerDe in CDH 6.1 and it fails 
with a version-mismatch issue when handling dates.

 

This patch contains a JsonSerDe which is faster (read) and more feature rich 
than the existing JsonSerde.  Please accept the latest patch for inclusion into 
the project.

> JSON SerDe Re-Write
> ---
>
> Key: HIVE-21240
> URL: https://issues.apache.org/jira/browse/HIVE-21240
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 4.0.0, 3.1.1
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21240.1.patch, HIVE-21240.1.patch, 
> HIVE-21240.2.patch, HIVE-21240.3.patch, HIVE-21240.4.patch, 
> HIVE-21240.5.patch, HIVE-21240.6.patch, HIVE-21240.7.patch, 
> HIVE-21240.8.patch, HIVE-21240.8.patch, HIVE-24240.8.patch, 
> HIVE-24240.8.patch, HIVE-24240.8.patch, HIVE-24240.8.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The JSON SerDe has a few issues; I will link them to this JIRA.
> * Use Jackson Tree parser instead of manually parsing
> * Added support for base-64 encoded data (the expected format when using JSON)
> * Added support to skip blank lines (returns all columns as null values)
> * Current JSON parser accepts, but does not apply, custom timestamp formats 
> in most cases
> * Added some unit tests
> * Added cache for column-name to column-index searches, currently O(n) for 
> each row processed, for each column in the row





[jira] [Commented] (HIVE-21240) JSON SerDe Re-Write

2019-02-22 Thread BELUGA BEHR (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775543#comment-16775543
 ] 

BELUGA BEHR commented on HIVE-21240:


OK, I figured out the issue.  I am running this SerDe in CDH 6.1 and it fails 
with a version-mismatch issue when handling dates.

 

This patch contains a JsonSerDe which is faster (read) and more feature rich 
than the existing JsonSerde.  Please accept the latest patch for inclusion into 
the project.

> JSON SerDe Re-Write
> ---
>
> Key: HIVE-21240
> URL: https://issues.apache.org/jira/browse/HIVE-21240
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 4.0.0, 3.1.1
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21240.1.patch, HIVE-21240.1.patch, 
> HIVE-21240.2.patch, HIVE-21240.3.patch, HIVE-21240.4.patch, 
> HIVE-21240.5.patch, HIVE-21240.6.patch, HIVE-21240.7.patch, 
> HIVE-21240.8.patch, HIVE-21240.8.patch, HIVE-24240.8.patch, 
> HIVE-24240.8.patch, HIVE-24240.8.patch, HIVE-24240.8.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The JSON SerDe has a few issues; I will link them to this JIRA.
> * Use Jackson Tree parser instead of manually parsing
> * Added support for base-64 encoded data (the expected format when using JSON)
> * Added support to skip blank lines (returns all columns as null values)
> * Current JSON parser accepts, but does not apply, custom timestamp formats 
> in most cases
> * Added some unit tests
> * Added cache for column-name to column-index searches, currently O(n) for 
> each row processed, for each column in the row





[jira] [Commented] (HIVE-21301) Show tables statement to include views and materialized views

2019-02-22 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775533#comment-16775533
 ] 

Ashutosh Chauhan commented on HIVE-21301:
-

+1

> Show tables statement to include views and materialized views
> -
>
> Key: HIVE-21301
> URL: https://issues.apache.org/jira/browse/HIVE-21301
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: TODOC3.2, pull-request-available
> Attachments: HIVE-21301.01.patch, HIVE-21301.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> HIVE-19974 introduced a backwards-incompatible change, with the {{SHOW TABLES}} 
> statement showing only managed/external tables in the system.
> This issue will restore the old behavior, with {{SHOW TABLES}} showing all 
> queryable entities, including views and materialized views.
> Instead, to provide information about table types, a {{SHOW EXTENDED TABLES}} 
> statement is introduced, which includes an additional column with the table 
> type for each of the tables listed.
> In addition, the ability to filter the show tables statements with a {{WHERE 
> `table_type` = 'ANY_TYPE'}} clause is introduced.





[jira] [Commented] (HIVE-21293) Fix ambiguity in grammar warnings at compilation time (II)

2019-02-22 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775531#comment-16775531
 ] 

Hive QA commented on HIVE-21293:


| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
|| || || || master Compile Tests ||
| +1 | mvninstall | 8m 46s | master passed |
| +1 | compile | 1m 5s | master passed |
| +1 | checkstyle | 0m 52s | master passed |
| 0 | findbugs | 4m 4s | ql in master has 2261 extant Findbugs warnings. |
| +1 | javadoc | 0m 59s | master passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 31s | the patch passed |
| +1 | compile | 1m 9s | the patch passed |
| +1 | javac | 1m 9s | the patch passed |
| -1 | checkstyle | 0m 42s | ql: The patch generated 92 new + 0 unchanged - 1019 fixed = 92 total (was 1019) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| -1 | findbugs | 4m 17s | ql generated 8 new + 2260 unchanged - 1 fixed = 2268 total (was 2261) |
| +1 | javadoc | 1m 0s | the patch passed |
|| || || || Other Tests ||
| +1 | asflicense | 0m 14s | The patch does not generate ASF License warnings. |
| | | 25m 10s | |

|| Reason || Tests ||
| FindBugs | module:ql |
|  |  Dead store to LA13_127 in 
org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser$DFA13.specialStateTransition(int,
 IntStream)  At 
HiveParser_SelectClauseParser.java:org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser$DFA13.specialStateTransition(int,
 IntStream)  At HiveParser_SelectClauseParser.java:[line 4696] |
|  |  Dead store to LA13_128 in 
org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser$DFA13.specialStateTransition(int,
 IntStream)  At 
HiveParser_SelectClauseParser.java:org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser$DFA13.specialStateTransition(int,
 IntStream)  At HiveParser_SelectClauseParser.java:[line 4709] |
|  |  Dead store to LA13_130 in 
org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser$DFA13.specialStateTransition(int,
 IntStream)  At 
HiveParser_SelectClauseParser.java:org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser$DFA13.specialStateTransition(int,
 IntStream)  At HiveParser_SelectClauseParser.java:[line 4722] |
|  |  Dead store to LA13_131 in 
org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser$DFA13.specialStateTransition(int,
 IntStream)  At 
HiveParser_SelectClauseParser.java:org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser$DFA13.specialStateTransition(int,
 IntStream)  At HiveParser_SelectClauseParser.java:[line 4735] |
|  |  Dead store to LA19_89 in 
org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser$DFA19.specialStateTransition(int,
 IntStream)  At 
HiveParser_SelectClauseParser.java:org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser$DFA19.specialStateTransition(int,
 IntStream)  At HiveParser_SelectClauseParser.java:[line 5472] |
|  |  Dead store to LA19_90 in 
org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser$DFA19.specialStateTransition(int,
 IntStream)  At 
HiveParser_SelectClauseParser.java:org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser$DFA19.specialStateTransition(int,
 IntStream)  At HiveParser_SelectClauseParser.java:[line 5485] |
|  |  Dead store to LA19_92 in 
org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser$DFA19.specialStateTransition(int,
 IntStream)  At 

[jira] [Commented] (HIVE-21307) Need to set GzipJSONMessageEncoder as default config for EVENT_MESSAGE_FACTORY.

2019-02-22 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775501#comment-16775501
 ] 

Hive QA commented on HIVE-21307:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12959756/HIVE-21307.02.patch

{color:green}SUCCESS:{color} +1 due to 8 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15810 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.ql.TestTxnCommandsWithSplitUpdateAndVectorization.testMergeOnTezEdges
 (batchId=311)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16204/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16204/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16204/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12959756 - PreCommit-HIVE-Build

> Need to set GzipJSONMessageEncoder as default config for 
> EVENT_MESSAGE_FACTORY.
> ---
>
> Key: HIVE-21307
> URL: https://issues.apache.org/jira/browse/HIVE-21307
> Project: Hive
>  Issue Type: Bug
>  Components: Configuration, repl
>Affects Versions: 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: DR, Replication, pull-request-available
> Attachments: HIVE-21307.01.patch, HIVE-21307.02.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, we use JsonMessageEncoder as the default message factory for 
> Notification events. Some of the events are very large and cause OOM issues 
> in the RDBMS, so GzipJSONMessageEncoder needs to be enabled as the default 
> message factory to optimise memory usage.
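
To illustrate why a gzip-based encoder helps with large notification events, here is a minimal sketch that compresses a JSON message before storing it as text. The function names and the event shape are hypothetical, and the base64 step is an assumption made so the compressed bytes fit in a text column; this is not Hive's GzipJSONMessageEncoder.

```python
import base64
import gzip
import json


def encode_plain(event: dict) -> str:
    """Serialize the event as compact JSON text."""
    return json.dumps(event, separators=(",", ":"))


def encode_gzip(event: dict) -> str:
    """Gzip the JSON text, then base64-encode it for storage as a string."""
    raw = encode_plain(event).encode("utf-8")
    return base64.b64encode(gzip.compress(raw)).decode("ascii")


def decode_gzip(message: str) -> dict:
    """Reverse encode_gzip: base64-decode, gunzip, parse JSON."""
    raw = gzip.decompress(base64.b64decode(message))
    return json.loads(raw.decode("utf-8"))
```

Events that list many similar file paths are highly repetitive, so the compressed form is typically much smaller than the plain JSON even after the base64 overhead.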





[jira] [Updated] (HIVE-21167) Bucketing: Bucketing version 1 is incorrectly partitioning data

2019-02-22 Thread Deepak Jaiswal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-21167:
--
Attachment: HIVE-21167.4.patch

> Bucketing: Bucketing version 1 is incorrectly partitioning data
> ---
>
> Key: HIVE-21167
> URL: https://issues.apache.org/jira/browse/HIVE-21167
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.1
>Reporter: Vaibhav Gumashta
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-21167.1.patch, HIVE-21167.2.patch, 
> HIVE-21167.3.patch, HIVE-21167.4.patch
>
>
> Using murmur hash for bucketing columns was introduced in HIVE-18910, 
> following which {{'bucketing_version'='1'}} stands for the old behaviour 
> (where for example integer columns were partitioned based on mod values). 
> It looks like we now have a bug in the old bucketing scheme. I could repro it 
> by modifying the existing schema with an alter table add column and then 
> adding new data. Repro:
> {code}
> 0: jdbc:hive2://localhost:10010> create transactional table acid_ptn_bucket1 
> (a int, b int) partitioned by(ds string) clustered by (a) into 2 buckets 
> stored as ORC TBLPROPERTIES('bucketing_version'='1', 'transactional'='true', 
> 'transactional_properties'='default');
> No rows affected (0.418 seconds)
> 0: jdbc:hive2://localhost:10010> insert into acid_ptn_bucket1 partition (ds) 
> values(1,2,'today'),(1,3,'today'),(1,4,'yesterday'),(2,2,'yesterday'),(2,3,'today'),(2,4,'today');
> 6 rows affected (3.695 seconds)
> {code}
> Data from ORC file (data as expected):
> {code}
> /apps/hive/warehouse/acid_ptn_bucket1/ds=today/delta_001_001_/bucket_0
> {"operation": 0, "originalTransaction": 1, "bucket": 536870912, "rowId": 0, 
> "currentTransaction": 1, "row": {"a": 2, "b": 4}}
> {"operation": 0, "originalTransaction": 1, "bucket": 536870912, "rowId": 1, 
> "currentTransaction": 1, "row": {"a": 2, "b": 3}}
> /apps/hive/warehouse/acid_ptn_bucket1/ds=today/delta_001_001_/bucket_1
> {"operation": 0, "originalTransaction": 1, "bucket": 536936448, "rowId": 0, 
> "currentTransaction": 1, "row": {"a": 1, "b": 3}}
> {"operation": 0, "originalTransaction": 1, "bucket": 536936448, "rowId": 1, 
> "currentTransaction": 1, "row": {"a": 1, "b": 2}}
> {code}
> Modifying table schema and inserting new data:
> {code}
> 0: jdbc:hive2://localhost:10010> alter table acid_ptn_bucket1 add columns(c 
> int);
> No rows affected (0.541 seconds)
> 0: jdbc:hive2://localhost:10010> insert into acid_ptn_bucket1 partition (ds) 
> values(3,2,1000,'yesterday'),(3,3,1001,'today'),(3,4,1002,'yesterday'),(4,2,1003,'today'),
>  (4,3,1004,'yesterday'),(4,4,1005,'today');
> 6 rows affected (3.699 seconds)
> {code}
> Data from ORC file (wrong partitioning):
> {code}
> /apps/hive/warehouse/acid_ptn_bucket1/ds=today/delta_003_003_/bucket_0
> {"operation": 0, "originalTransaction": 3, "bucket": 536870912, "rowId": 0, 
> "currentTransaction": 3, "row": {"a": 3, "b": 3, "c": 1001}}
> /apps/hive/warehouse/acid_ptn_bucket1/ds=today/delta_003_003_/bucket_1
> {"operation": 0, "originalTransaction": 3, "bucket": 536936448, "rowId": 0, 
> "currentTransaction": 3, "row": {"a": 4, "b": 4, "c": 1005}}
> {"operation": 0, "originalTransaction": 3, "bucket": 536936448, "rowId": 1, 
> "currentTransaction": 3, "row": {"a": 4, "b": 2, "c": 1003}}
> {code}
> As seen above, the expected behaviour is that new data with column 'a' being 
> 3 should go to bucket1 and column 'a' being 4 should go to bucket0, but the 
> partitioning is wrong.
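
The expected placement described above can be checked with a small sketch. Both helpers are hypothetical and rest on labeled assumptions: that the version-1 hash of an int column is the value itself, and that the bucket ID sits in 12 bits starting at bit 16 of the ACID "bucket" field, which is consistent with the two values shown in the ORC dumps.

```python
def legacy_bucket_for_int(value: int, num_buckets: int) -> int:
    # Assumption: for an int column the version-1 hash is the value itself,
    # masked to be non-negative before taking the modulus.
    return (value & 0x7FFFFFFF) % num_buckets


def bucket_id_from_acid_field(bucket_field: int) -> int:
    # Assumption: the bucket ID occupies 12 bits starting at bit 16 of the
    # ACID "bucket" field (536870912 -> 0, 536936448 -> 1 in the report).
    return (bucket_field >> 16) & 0xFFF
```

Under these assumptions, rows with a = 1 or 3 belong in bucket 1 and rows with a = 2 or 4 in bucket 0, which matches the first insert but not the files written after the alter table.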





[jira] [Commented] (HIVE-21307) Need to set GzipJSONMessageEncoder as default config for EVENT_MESSAGE_FACTORY.

2019-02-22 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775468#comment-16775468
 ] 

Hive QA commented on HIVE-21307:


| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
|| || || || master Compile Tests ||
| 0 | mvndep | 1m 47s | Maven dependency ordering for branch |
| +1 | mvninstall | 7m 12s | master passed |
| +1 | compile | 2m 16s | master passed |
| +1 | checkstyle | 1m 10s | master passed |
| 0 | findbugs | 2m 37s | standalone-metastore/metastore-common in master has 29 extant Findbugs warnings. |
| 0 | findbugs | 0m 33s | common in master has 65 extant Findbugs warnings. |
| 0 | findbugs | 0m 29s | hcatalog/webhcat/java-client in master has 3 extant Findbugs warnings. |
| 0 | findbugs | 0m 43s | itests/hive-unit in master has 2 extant Findbugs warnings. |
| +1 | javadoc | 2m 2s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 28s | Maven dependency ordering for patch |
| +1 | mvninstall | 2m 21s | the patch passed |
| +1 | compile | 2m 17s | the patch passed |
| +1 | javac | 2m 17s | the patch passed |
| +1 | checkstyle | 0m 11s | The patch metastore-common passed checkstyle |
| +1 | checkstyle | 0m 17s | The patch common passed checkstyle |
| +1 | checkstyle | 0m 13s | hcatalog/webhcat/java-client: The patch generated 0 new + 108 unchanged - 1 fixed = 108 total (was 109) |
| +1 | checkstyle | 0m 11s | itests/hcatalog-unit: The patch generated 0 new + 26 unchanged - 1 fixed = 26 total (was 27) |
| +1 | checkstyle | 0m 17s | The patch hive-unit passed checkstyle |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 4m 53s | the patch passed |
| +1 | javadoc | 2m 5s | the patch passed |
|| || || || Other Tests ||
| +1 | asflicense | 0m 14s | The patch does not generate ASF License warnings. |
| | | 33m 35s | |

|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-16204/dev-support/hive-personality.sh |
| git revision | master / 49fe5fc |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: standalone-metastore/metastore-common common hcatalog/webhcat/java-client itests/hcatalog-unit itests/hive-unit U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-16204/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> 

[jira] [Commented] (HIVE-21181) Hive pre-upgrade tool not working with HDFS HA, tries connecting to nameservice as it was a NameNode

2019-02-22 Thread Jason Dere (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775458#comment-16775458
 ] 

Jason Dere commented on HIVE-21181:
---

It looks like the pre-upgrade tool does not recognize that the HDFS path is 
using NameNode HA. Are these conf files from the host running the Metastore? 
Does listing files from the default FS work (hdfs://mytestcluster)?

> Hive pre-upgrade tool not working with HDFS HA, tries connecting to 
> nameservice as it was a NameNode
> 
>
> Key: HIVE-21181
> URL: https://issues.apache.org/jira/browse/HIVE-21181
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
> Environment: Centos 7.4.1708
> kernel 3.10.0-693.11.6.el7.x86_64
> Ambari 2.6.2.2
> HDP-2.6.5.0-292
> Hive 1.2.1000
> HDFS 2.7.3
>Reporter: Attila Csaba Marosi
>Priority: Major
> Attachments: core-site.xml, hdfs-site.xml
>
>
> While preparing production clusters for HDP-2.6.5 -> HDP-3.1 upgrades, we 
> noticed issues with the hive-pre-upgrade tool. When we tried running it, we 
> got the following exception:
> {{Found Acid table: default.hello_acid
> 2019-01-28 15:54:20,331 ERROR [main] acid.PreUpgradeTool 
> (PreUpgradeTool.java:main(152)) - PreUpgradeTool failed
> java.lang.IllegalArgumentException: java.net.UnknownHostException: 
> mytestcluster
> at 
> org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:439)
> at 
> org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:321)
> at 
> org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:176)
> at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:696)
> at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:636)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:160)
> at 
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2796)
> at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:99)
> at 
> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2830)
> at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2812)
> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:390)
> at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
> at 
> org.apache.hadoop.hive.upgrade.acid.PreUpgradeTool.needsCompaction(PreUpgradeTool.java:417)
> at 
> org.apache.hadoop.hive.upgrade.acid.PreUpgradeTool.getCompactionCommands(PreUpgradeTool.java:384)
> at 
> org.apache.hadoop.hive.upgrade.acid.PreUpgradeTool.getCompactionCommands(PreUpgradeTool.java:374)
> at 
> org.apache.hadoop.hive.upgrade.acid.PreUpgradeTool.prepareAcidUpgradeInternal(PreUpgradeTool.java:235)
> at 
> org.apache.hadoop.hive.upgrade.acid.PreUpgradeTool.main(PreUpgradeTool.java:149)
> Caused by: java.net.UnknownHostException: mytestcluster
> ... 17 more}}
> We tried running it on a kerberized test cluster built from the same 
> blueprint as the production clusters, with HDP-2.6.5.0-292, Hive 1.2.1000, 
> HDFS 2.7.3, with HDFS HA and without Hive HA.
> We enabled Hive ACID and created the same example ACID table as shown in 
> https://hortonworks.com/tutorial/using-hive-acid-transactions-to-insert-update-and-delete-data/
> We followed the steps described at 
> https://docs.hortonworks.com/HDPDocuments/Ambari-2.7.3.0/bk_ambari-upgrade-major/content/prepare_hive_for_upgrade.html
> , ran kinit, and used the "-Djavax.security.auth.useSubjectCredsOnly=false" 
> parameter.
> Without the ACID table there is no issue.
> I'm attaching the hdfs-site.xml and core-site.xml.





[jira] [Updated] (HIVE-21308) Negative forms of variables are not supported in HPL/SQL

2019-02-22 Thread Daniel Dai (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-21308:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

+1. Patch pushed to master. Thanks Baoning!

> Negative forms of variables are not supported in HPL/SQL
> 
>
> Key: HIVE-21308
> URL: https://issues.apache.org/jira/browse/HIVE-21308
> Project: Hive
>  Issue Type: Bug
>  Components: hpl/sql
>Reporter: Baoning He
>Assignee: Baoning He
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21308.1.patch
>
>
> In the following HPL/SQL program:
> declare num = 1; print -num;
> The expected result is '-1', but it prints '1'.





[jira] [Commented] (HIVE-20079) Populate more accurate rawDataSize for parquet format

2019-02-22 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775431#comment-16775431
 ] 

Hive QA commented on HIVE-20079:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12959742/HIVE-20079.6.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 15811 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16203/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16203/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16203/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12959742 - PreCommit-HIVE-Build

> Populate more accurate rawDataSize for parquet format
> -
>
> Key: HIVE-20079
> URL: https://issues.apache.org/jira/browse/HIVE-20079
> Project: Hive
>  Issue Type: Improvement
>  Components: File Formats
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Antal Sinkovits
>Priority: Major
> Attachments: HIVE-20079.1.patch, HIVE-20079.2.patch, 
> HIVE-20079.3.patch, HIVE-20079.4.patch, HIVE-20079.5.patch, HIVE-20079.6.patch
>
>
> Run the following queries and you will see that rawDataSize for the table is 
> incorrectly reported as 4 (the number of fields). We need to populate the 
> correct data size so that data can be split properly.
> {noformat}
> SET hive.stats.autogather=true;
> CREATE TABLE parquet_stats (id int,str string) STORED AS PARQUET;
> INSERT INTO parquet_stats values(0, 'this is string 0'), (1, 'string 1');
> DESC FORMATTED parquet_stats;
> {noformat}
> {noformat}
> Table Parameters:
>   COLUMN_STATS_ACCURATE   true
>   numFiles 1
>   numRows 2
>   rawDataSize 4
>   totalSize   373
>   transient_lastDdlTime   1530660523
> {noformat}





[jira] [Updated] (HIVE-21001) Upgrade to calcite-1.18

2019-02-22 Thread Zoltan Haindrich (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-21001:

Attachment: HIVE-21001.39.patch

> Upgrade to calcite-1.18
> ---
>
> Key: HIVE-21001
> URL: https://issues.apache.org/jira/browse/HIVE-21001
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-21001.01.patch, HIVE-21001.01.patch, 
> HIVE-21001.02.patch, HIVE-21001.03.patch, HIVE-21001.04.patch, 
> HIVE-21001.05.patch, HIVE-21001.06.patch, HIVE-21001.06.patch, 
> HIVE-21001.07.patch, HIVE-21001.08.patch, HIVE-21001.08.patch, 
> HIVE-21001.08.patch, HIVE-21001.09.patch, HIVE-21001.09.patch, 
> HIVE-21001.09.patch, HIVE-21001.10.patch, HIVE-21001.11.patch, 
> HIVE-21001.12.patch, HIVE-21001.13.patch, HIVE-21001.15.patch, 
> HIVE-21001.16.patch, HIVE-21001.17.patch, HIVE-21001.18.patch, 
> HIVE-21001.18.patch, HIVE-21001.19.patch, HIVE-21001.20.patch, 
> HIVE-21001.21.patch, HIVE-21001.22.patch, HIVE-21001.22.patch, 
> HIVE-21001.22.patch, HIVE-21001.23.patch, HIVE-21001.24.patch, 
> HIVE-21001.26.patch, HIVE-21001.26.patch, HIVE-21001.26.patch, 
> HIVE-21001.26.patch, HIVE-21001.26.patch, HIVE-21001.27.patch, 
> HIVE-21001.28.patch, HIVE-21001.29.patch, HIVE-21001.29.patch, 
> HIVE-21001.30.patch, HIVE-21001.31.patch, HIVE-21001.32.patch, 
> HIVE-21001.34.patch, HIVE-21001.35.patch, HIVE-21001.36.patch, 
> HIVE-21001.37.patch, HIVE-21001.38.patch, HIVE-21001.39.patch
>
>
> XLEAR LIBRARY CACHE 





[jira] [Commented] (HIVE-21310) Hashcode of a varchar column is incorrect if its folded

2019-02-22 Thread Zoltan Haindrich (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775385#comment-16775385
 ] 

Zoltan Haindrich commented on HIVE-21310:
-

During folding, {{WritableHiveVarcharObjectInspector}} is changed to some plain 
string.

> Hashcode of a varchar column is incorrect if its folded
> ---
>
> Key: HIVE-21310
> URL: https://issues.apache.org/jira/browse/HIVE-21310
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Priority: Major
>
> {code:sql}
> create table t (a varchar(10));
> insert into t values('bee'),('xxx');
> -- select  t0.v,t1.v from
> select   assert_true(t0.v = t1.v) from
> (select hash(a) as v from t where a='bee') as t0 
> join(select hash(a) as v from t where a='bee' or a='xbee') as t1 on 
> (true);
> {code}
> the assertion fails because: {{97410 != 127201}}
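A note on the two numbers in the failing assertion: they are consistent with the same 31-based polynomial hash over the bytes of 'bee', started once from 0 and once from 1. The sketch below only reproduces that arithmetic; the class and method names are hypothetical, and the issue itself does not say which Hive code paths produce each value.

```java
import java.nio.charset.StandardCharsets;

public class FoldedHashSketch {

    /**
     * Polynomial hash (multiplier 31) over the UTF-8 bytes of s,
     * starting from the given seed. Hypothetical helper for illustration.
     */
    public static int hashBytes(String s, int seed) {
        int hash = seed;
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            hash = 31 * hash + b;
        }
        return hash;
    }

    public static void main(String[] args) {
        // Seed 0 matches java.lang.String.hashCode() for ASCII input.
        System.out.println(hashBytes("bee", 0)); // 97410
        // Seed 1 adds 31^3 = 29791 for a three-byte input.
        System.out.println(hashBytes("bee", 1)); // 127201
    }
}
```

If this reading is right, the two sides of the join hash the same value through two different representations (a plain string on one side, a varchar writable on the other), which would fit the comment about the object inspector changing during folding.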





[jira] [Commented] (HIVE-20079) Populate more accurate rawDataSize for parquet format

2019-02-22 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775378#comment-16775378
 ] 

Hive QA commented on HIVE-20079:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
34s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
6s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
42s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
58s{color} | {color:blue} ql in master has 2261 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
0s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
41s{color} | {color:green} ql: The patch generated 0 new + 14 unchanged - 5 
fixed = 14 total (was 19) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
15s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 24m 43s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-16203/dev-support/hive-personality.sh
 |
| git revision | master / c45751f |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16203/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Populate more accurate rawDataSize for parquet format
> -
>
> Key: HIVE-20079
> URL: https://issues.apache.org/jira/browse/HIVE-20079
> Project: Hive
>  Issue Type: Improvement
>  Components: File Formats
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Antal Sinkovits
>Priority: Major
> Attachments: HIVE-20079.1.patch, HIVE-20079.2.patch, 
> HIVE-20079.3.patch, HIVE-20079.4.patch, HIVE-20079.5.patch, HIVE-20079.6.patch
>
>
> Run the following queries and you will see that rawDataSize for the table is 
> incorrectly reported as 4 (the number of fields). We need to populate the 
> correct data size so that data can be split properly.
> {noformat}
> SET hive.stats.autogather=true;
> CREATE TABLE parquet_stats (id int,str string) STORED AS PARQUET;
> INSERT INTO parquet_stats values(0, 'this is string 0'), (1, 'string 1');
> DESC FORMATTED parquet_stats;
> {noformat}
> {noformat}
> Table Parameters:
>   COLUMN_STATS_ACCURATE   true
>   numFiles 1
>   numRows 2
>   rawDataSize 4
>   totalSize   373
>   transient_lastDdlTime   1530660523
> {noformat}





[jira] [Commented] (HIVE-21293) Fix ambiguity in grammar warnings at compilation time (II)

2019-02-22 Thread Jesus Camacho Rodriguez (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775374#comment-16775374
 ] 

Jesus Camacho Rodriguez commented on HIVE-21293:


[~abstractdog], is there a way to rewrite the rule to fix the issue without 
making UNKNOWN a reserved keyword? We discussed this with [~ashutoshc], and he 
had concerns because it would be backwards incompatible.

> Fix ambiguity in grammar warnings at compilation time (II)
> --
>
> Key: HIVE-21293
> URL: https://issues.apache.org/jira/browse/HIVE-21293
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 4.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Laszlo Bodor
>Priority: Major
> Attachments: HIVE-21293.01.patch
>
>
> These are the warnings at compilation time:
> {code}
> warning(200): IdentifiersParser.g:424:5:
> Decision can match input such as "KW_UNKNOWN" using multiple alternatives: 1, 
> 10
> As a result, alternative(s) 10 were disabled for that input
> {code}
> This means that multiple parser rules can match certain query text, possibly 
> leading to unexpected errors at parsing time.





[jira] [Commented] (HIVE-21288) Runtime rowcount calculation is incorrect in vectorized executions

2019-02-22 Thread Jesus Camacho Rodriguez (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775360#comment-16775360
 ] 

Jesus Camacho Rodriguez commented on HIVE-21288:


+1

> Runtime rowcount calculation is incorrect in vectorized executions
> --
>
> Key: HIVE-21288
> URL: https://issues.apache.org/jira/browse/HIVE-21288
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-21288.01.patch, HIVE-21288.02.patch, 
> HIVE-21288.02.patch
>
>
> Before HIVE-18908 there was a baseForward (non-vectorized) and a vectorForward 
> (vectorized), and both of them accounted for the number of rows 
> correctly. After HIVE-18908 this changed to counting vectorized batches in 
> the case of vectorized execution.
> [relevant part of 
> Operator.java|https://github.com/apache/hive/commit/a37827ecd557c7f7d69f3b2ccdbf6535908b1461#diff-dd93eb584eb10a8f68b906a98edaae77L946]
> [counters are dropping to 1|
> https://github.com/apache/hive/commit/a37827ecd557c7f7d69f3b2ccdbf6535908b1461#diff-e70a9d33150346fe8b9b7d719d677b97L356]
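The difference between counting batches and counting rows can be sketched as follows. This is a minimal illustration and not the Hive Operator code: `Batch` is a hypothetical stand-in for a vectorized row batch that only records how many rows it carries.

```java
import java.util.Arrays;
import java.util.List;

public class RowCountSketch {

    /** Hypothetical stand-in for a vectorized batch; only its row count matters here. */
    public static final class Batch {
        final int size;

        public Batch(int size) {
            this.size = size;
        }
    }

    /** Adds 1 per forwarded batch, which under-reports rows in vectorized execution. */
    public static long countBatches(List<Batch> batches) {
        return batches.size();
    }

    /** Adds the number of rows carried by each forwarded batch. */
    public static long countRows(List<Batch> batches) {
        long rows = 0;
        for (Batch b : batches) {
            rows += b.size;
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Batch> batches = Arrays.asList(new Batch(1024), new Batch(1024), new Batch(300));
        System.out.println(countBatches(batches)); // 3
        System.out.println(countRows(batches));    // 2348
    }
}
```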





[jira] [Commented] (HIVE-21240) JSON SerDe Re-Write

2019-02-22 Thread BELUGA BEHR (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775343#comment-16775343
 ] 

BELUGA BEHR commented on HIVE-21240:


However, I am seeing failures in some scenarios that the unit tests do not 
cover. I need to investigate them further.

> JSON SerDe Re-Write
> ---
>
> Key: HIVE-21240
> URL: https://issues.apache.org/jira/browse/HIVE-21240
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 4.0.0, 3.1.1
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21240.1.patch, HIVE-21240.1.patch, 
> HIVE-21240.2.patch, HIVE-21240.3.patch, HIVE-21240.4.patch, 
> HIVE-21240.5.patch, HIVE-21240.6.patch, HIVE-21240.7.patch, 
> HIVE-21240.8.patch, HIVE-21240.8.patch, HIVE-24240.8.patch, 
> HIVE-24240.8.patch, HIVE-24240.8.patch, HIVE-24240.8.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The JSON SerDe has a few issues; I will link them to this JIRA.
> * Use Jackson Tree parser instead of manually parsing
> * Added support for base-64 encoded data (the expected format when using JSON)
> * Added support to skip blank lines (returns all columns as null values)
> * Current JSON parser accepts, but does not apply, custom timestamp formats 
> in most cases
> * Added some unit tests
> * Added cache for column-name to column-index searches, currently O\(n\) for 
> each row processed, for each column in the row
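The last bullet, caching column-name to column-index lookups, could look roughly like the sketch below. It is an illustration under assumptions, not the patch itself: the class name, the exact-match comparison, and the -1 result for unknown names are all hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ColumnIndexCacheSketch {

    private final List<String> columnNames;
    // Filled lazily: each distinct field name is scanned for at most once.
    private final Map<String, Integer> indexCache = new HashMap<>();

    public ColumnIndexCacheSketch(List<String> columnNames) {
        this.columnNames = columnNames;
    }

    /** Linear scan over the column list: O(n) per lookup. Returns -1 if absent. */
    public int linearLookup(String name) {
        for (int i = 0; i < columnNames.size(); i++) {
            if (columnNames.get(i).equals(name)) {
                return i;
            }
        }
        return -1;
    }

    /** Cached lookup: scans on first use, then answers from the map. */
    public int cachedLookup(String name) {
        return indexCache.computeIfAbsent(name, this::linearLookup);
    }

    public static void main(String[] args) {
        ColumnIndexCacheSketch cache =
            new ColumnIndexCacheSketch(Arrays.asList("id", "str", "ts"));
        System.out.println(cache.cachedLookup("str")); // 1 (scanned)
        System.out.println(cache.cachedLookup("str")); // 1 (from cache)
    }
}
```

With the cache, the per-row cost of resolving a field name no longer grows with the number of columns once each name has been seen.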





[jira] [Commented] (HIVE-21294) Vectorization: 1-reducer Shuffle can skip the object hash functions

2019-02-22 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775335#comment-16775335
 ] 

Hive QA commented on HIVE-21294:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12959738/HIVE-21294.2.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16202/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16202/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16202/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Tests exited with: Exception: Patch URL 
https://issues.apache.org/jira/secure/attachment/12959738/HIVE-21294.2.patch 
was found in seen patch url's cache and a test was probably run already on it. 
Aborting...
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12959738 - PreCommit-HIVE-Build

> Vectorization: 1-reducer Shuffle can skip the object hash functions
> ---
>
> Key: HIVE-21294
> URL: https://issues.apache.org/jira/browse/HIVE-21294
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Reporter: Gopal V
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21294.2.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> VectorReduceSinkObjectHashOperator can skip the object hashing entirely if 
> the reducer count = 1.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21288) Runtime rowcount calculation is incorrect in vectorized executions

2019-02-22 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775333#comment-16775333
 ] 

Hive QA commented on HIVE-21288:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12959740/HIVE-21288.02.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 15810 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16201/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16201/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16201/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12959740 - PreCommit-HIVE-Build

> Runtime rowcount calculation is incorrect in vectorized executions
> --
>
> Key: HIVE-21288
> URL: https://issues.apache.org/jira/browse/HIVE-21288
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-21288.01.patch, HIVE-21288.02.patch, 
> HIVE-21288.02.patch
>
>
> Before HIVE-18908 there was a baseForward (non-vectorized) and a vectorForward 
> (vectorized), and both of them accounted for the number of rows 
> correctly. After HIVE-18908 this changed to counting vectorized batches in 
> the case of vectorized execution.
> [relevant part of 
> Operator.java|https://github.com/apache/hive/commit/a37827ecd557c7f7d69f3b2ccdbf6535908b1461#diff-dd93eb584eb10a8f68b906a98edaae77L946]
> [counters are dropping to 1|
> https://github.com/apache/hive/commit/a37827ecd557c7f7d69f3b2ccdbf6535908b1461#diff-e70a9d33150346fe8b9b7d719d677b97L356]





[jira] [Commented] (HIVE-21240) JSON SerDe Re-Write

2019-02-22 Thread BELUGA BEHR (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775331#comment-16775331
 ] 

BELUGA BEHR commented on HIVE-21240:


Read Performance

195 million JSON records (String, int, float, Date)

JSON-Trunk: 160s
JSON-21240: 147s


> JSON SerDe Re-Write
> ---
>
> Key: HIVE-21240
> URL: https://issues.apache.org/jira/browse/HIVE-21240
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 4.0.0, 3.1.1
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21240.1.patch, HIVE-21240.1.patch, 
> HIVE-21240.2.patch, HIVE-21240.3.patch, HIVE-21240.4.patch, 
> HIVE-21240.5.patch, HIVE-21240.6.patch, HIVE-21240.7.patch, 
> HIVE-21240.8.patch, HIVE-21240.8.patch, HIVE-24240.8.patch, 
> HIVE-24240.8.patch, HIVE-24240.8.patch, HIVE-24240.8.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The JSON SerDe has a few issues; I will link them to this JIRA.
> * Use Jackson Tree parser instead of manually parsing
> * Added support for base-64 encoded data (the expected format when using JSON)
> * Added support to skip blank lines (returns all columns as null values)
> * Current JSON parser accepts, but does not apply, custom timestamp formats 
> in most cases
> * Added some unit tests
> * Added cache for column-name to column-index searches, currently O\(n\) for 
> each row processed, for each column in the row





[jira] [Work logged] (HIVE-21292) Break up DDLTask 1 - extract Database related operations

2019-02-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21292?focusedWorklogId=202695=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-202695
 ]

ASF GitHub Bot logged work on HIVE-21292:
-

Author: ASF GitHub Bot
Created on: 22/Feb/19 15:40
Start Date: 22/Feb/19 15:40
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on pull request #543: HIVE-21292: 
Break up DDLTask 1 - extract Database related operations
URL: https://github.com/apache/hive/pull/543#discussion_r259393882
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/ddl/database/AlterDatabaseDesc.java
 ##
 @@ -16,112 +16,100 @@
  * limitations under the License.
  */
 
-package org.apache.hadoop.hive.ql.plan;
+package org.apache.hadoop.hive.ql.exec.ddl.database;
 
 import java.io.Serializable;
 import java.util.Map;
 
+import org.apache.hadoop.hive.ql.exec.ddl.DDLDesc;
+import org.apache.hadoop.hive.ql.exec.ddl.DDLTask2;
 import org.apache.hadoop.hive.ql.parse.ReplicationSpec;
+import org.apache.hadoop.hive.ql.plan.Explain;
 import org.apache.hadoop.hive.ql.plan.Explain.Level;
+import org.apache.hadoop.hive.ql.plan.PrincipalDesc;
 
 /**
- * AlterDatabaseDesc.
- *
+ * DDL task description for ALTER DATABASE commands.
  */
 @Explain(displayName = "Alter Database", explainLevels = { Level.USER, 
Level.DEFAULT, Level.EXTENDED })
 public class AlterDatabaseDesc extends DDLDesc implements Serializable {
-
   private static final long serialVersionUID = 1L;
 
-  // Only altering the database property and owner is currently supported
-  public static enum ALTER_DB_TYPES {
-ALTER_PROPERTY, ALTER_OWNER, ALTER_LOCATION
-  };
-
-  ALTER_DB_TYPES alterType;
-  String databaseName;
-  Map dbProperties;
-  PrincipalDesc ownerPrincipal;
-  ReplicationSpec replicationSpec;
-  String location;
+  static {
+DDLTask2.registerOperator(AlterDatabaseDesc.class, 
AlterDatabaseOperation.class);
+  }
 
   /**
-   * For serialization only.
+   * Supported type of alter db commands.
+   * Only altering the database property and owner is currently supported
*/
-  public AlterDatabaseDesc() {
-  }
+  public enum AlterDbType {
+ALTER_PROPERTY, ALTER_OWNER, ALTER_LOCATION
+  };
 
-  public AlterDatabaseDesc(String databaseName, Map dbProps,
 
 Review comment:
   I see; so you've already considered and rejected this approach :)
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 202695)
Time Spent: 7h  (was: 6h 50m)

> Break up DDLTask 1 - extract Database related operations
> 
>
> Key: HIVE-21292
> URL: https://issues.apache.org/jira/browse/HIVE-21292
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21292.01.patch, HIVE-21292.02.patch, 
> HIVE-21292.03.patch, HIVE-21292.04.patch, HIVE-21292.05.patch, 
> HIVE-21292.06.patch
>
>  Time Spent: 7h
>  Remaining Estimate: 0h
>
> DDLTask is a huge class, more than 5000 lines long. The related DDLWork is 
> also a huge class, which has a field for each DDL operation it supports. The 
> goal is to refactor these into smaller, more manageable classes under the 
> package org.apache.hadoop.hive.ql.exec.ddl:
>  * have a separate class for each operation
>  * have a package for each operation group (database ddl, table ddl, etc.), so 
> the number of classes under a package is more manageable
>  * make all the requests (DDLDesc subclasses) immutable
>  * DDLTask should be agnostic to the actual operations
>  * right now let's ignore the issue of having some operations handled by 
> DDLTask which are not actual DDL operations (lock, unlock, desc...)
> In the interim time when there are two DDLTask and DDLWork classes in the 
> code base the new ones in the new package are called DDLTask2 and DDLWork2 
> thus avoiding the usage of fully qualified class names where both the old and 
> the new classes are in use.
> Step #1: extract all the database related operations from the old DDLTask, 
> and move them under the new package. Also create the new internal framework.
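The quoted diff registers each request class with its operation class via DDLTask2.registerOperator(AlterDatabaseDesc.class, AlterDatabaseOperation.class). One way such a registry could keep the task agnostic to the actual operations is sketched below. It is an assumption-laden illustration, not the code in the pull request: every name here (`DdlRegistrySketch`, `Desc`, `Operation`, `AlterDbDesc`, `AlterDbOperation`) is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class DdlRegistrySketch {

    /** Hypothetical immutable request type. */
    public interface Desc {}

    /** Hypothetical operation that knows how to execute one kind of request. */
    public interface Operation {
        String execute(Desc desc);
    }

    private static final Map<Class<? extends Desc>, Class<? extends Operation>> OPERATIONS =
        new HashMap<>();

    /** Associates a request class with the operation class that handles it. */
    public static void register(Class<? extends Desc> desc, Class<? extends Operation> op) {
        OPERATIONS.put(desc, op);
    }

    /** Generic dispatch: the caller never names a concrete operation. */
    public static String run(Desc desc) {
        Class<? extends Operation> opClass = OPERATIONS.get(desc.getClass());
        if (opClass == null) {
            throw new IllegalArgumentException("No operation for " + desc.getClass().getName());
        }
        try {
            return opClass.getDeclaredConstructor().newInstance().execute(desc);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + opClass.getName(), e);
        }
    }

    public static final class AlterDbDesc implements Desc {
        final String databaseName;

        public AlterDbDesc(String databaseName) {
            this.databaseName = databaseName;
        }
    }

    public static final class AlterDbOperation implements Operation {
        @Override
        public String execute(Desc desc) {
            return "alter database " + ((AlterDbDesc) desc).databaseName;
        }
    }

    public static void main(String[] args) {
        register(AlterDbDesc.class, AlterDbOperation.class);
        System.out.println(run(new AlterDbDesc("db1"))); // alter database db1
    }
}
```

The sketch registers explicitly in `main`. Registering from a static initializer, as the diff does, only takes effect once the request class has been initialized.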





[jira] [Commented] (HIVE-21288) Runtime rowcount calculation is incorrect in vectorized executions

2019-02-22 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775292#comment-16775292
 ] 

Hive QA commented on HIVE-21288:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
42s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
7s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
38s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
2s{color} | {color:blue} ql in master has 2261 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
59s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 24m 52s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-16201/dev-support/hive-personality.sh
 |
| git revision | master / c45751f |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16201/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Runtime rowcount calculation is incorrect in vectorized executions
> --
>
> Key: HIVE-21288
> URL: https://issues.apache.org/jira/browse/HIVE-21288
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-21288.01.patch, HIVE-21288.02.patch, 
> HIVE-21288.02.patch
>
>
> Before HIVE-18908 there was a baseForward (non-vectorized) and a vectorForward 
> (vectorized), and both of them accounted for the number of rows 
> correctly. After HIVE-18908 this changed to counting vectorized batches in 
> the case of vectorized execution.
> [relevant part of 
> Operator.java|https://github.com/apache/hive/commit/a37827ecd557c7f7d69f3b2ccdbf6535908b1461#diff-dd93eb584eb10a8f68b906a98edaae77L946]
> [counters are dropping to 1|
> https://github.com/apache/hive/commit/a37827ecd557c7f7d69f3b2ccdbf6535908b1461#diff-e70a9d33150346fe8b9b7d719d677b97L356]





[jira] [Work logged] (HIVE-21292) Break up DDLTask 1 - extract Database related operations

2019-02-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21292?focusedWorklogId=202694=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-202694
 ]

ASF GitHub Bot logged work on HIVE-21292:
-

Author: ASF GitHub Bot
Created on: 22/Feb/19 15:37
Start Date: 22/Feb/19 15:37
Worklog Time Spent: 10m 
  Work Description: miklosgergely commented on pull request #543: 
HIVE-21292: Break up DDLTask 1 - extract Database related operations
URL: https://github.com/apache/hive/pull/543#discussion_r259392471
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/ddl/database/AlterDatabaseDesc.java
 ##
 @@ -16,112 +16,100 @@
  * limitations under the License.
  */
 
-package org.apache.hadoop.hive.ql.plan;
+package org.apache.hadoop.hive.ql.exec.ddl.database;
 
 import java.io.Serializable;
 import java.util.Map;
 
+import org.apache.hadoop.hive.ql.exec.ddl.DDLDesc;
+import org.apache.hadoop.hive.ql.exec.ddl.DDLTask2;
 import org.apache.hadoop.hive.ql.parse.ReplicationSpec;
+import org.apache.hadoop.hive.ql.plan.Explain;
 import org.apache.hadoop.hive.ql.plan.Explain.Level;
+import org.apache.hadoop.hive.ql.plan.PrincipalDesc;
 
 /**
- * AlterDatabaseDesc.
- *
+ * DDL task description for ALTER DATABASE commands.
  */
 @Explain(displayName = "Alter Database", explainLevels = { Level.USER, 
Level.DEFAULT, Level.EXTENDED })
 public class AlterDatabaseDesc extends DDLDesc implements Serializable {
-
   private static final long serialVersionUID = 1L;
 
-  // Only altering the database property and owner is currently supported
-  public static enum ALTER_DB_TYPES {
-ALTER_PROPERTY, ALTER_OWNER, ALTER_LOCATION
-  };
-
-  ALTER_DB_TYPES alterType;
-  String databaseName;
-  Map dbProperties;
-  PrincipalDesc ownerPrincipal;
-  ReplicationSpec replicationSpec;
-  String location;
+  static {
+DDLTask2.registerOperator(AlterDatabaseDesc.class, 
AlterDatabaseOperation.class);
+  }
 
   /**
-   * For serialization only.
+   * Supported type of alter db commands.
+   * Only altering the database property and owner is currently supported
*/
-  public AlterDatabaseDesc() {
-  }
+  public enum AlterDbType {
+ALTER_PROPERTY, ALTER_OWNER, ALTER_LOCATION
+  };
 
-  public AlterDatabaseDesc(String databaseName, Map dbProps,
 
 Review comment:
   Basically AlterDatabaseOperation (or, more generally, every XXXOperation) 
could be a task as well, and every XXXDesc could have its own XXXWork class. 
But that would mean a lot of repetitive code, so having a common Task and Work 
class for them just reuses the same boilerplate code. Having a separate SA 
is a good idea; I think once DDLTask is cut to pieces, the next step should be 
to cut up DDLSemanticAnalyzer too!
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 202694)
Time Spent: 6h 50m  (was: 6h 40m)

> Break up DDLTask 1 - extract Database related operations
> 
>
> Key: HIVE-21292
> URL: https://issues.apache.org/jira/browse/HIVE-21292
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21292.01.patch, HIVE-21292.02.patch, 
> HIVE-21292.03.patch, HIVE-21292.04.patch, HIVE-21292.05.patch, 
> HIVE-21292.06.patch
>
>  Time Spent: 6h 50m
>  Remaining Estimate: 0h
>
> DDLTask is a huge class, more than 5000 lines long. The related DDLWork is 
> also a huge class, which has a field for each DDL operation it supports. The 
> goal is to refactor these into smaller, more manageable classes under the 
> package org.apache.hadoop.hive.ql.exec.ddl:
>  * have a separate class for each operation
>  * have a package for each operation group (database ddl, table ddl, etc.), so 
> the number of classes under a package is more manageable
>  * make all the requests (DDLDesc subclasses) immutable
>  * DDLTask should be agnostic to the actual operations
>  * right now let's ignore the issue of having some operations handled by 
> DDLTask which are not actual DDL operations (lock, unlock, desc...)
> In the interim time when there are two DDLTask and DDLWork classes in the 
> code base the new ones in the new package are called DDLTask2 and DDLWork2 
> thus avoiding the usage of fully qualified class names where both the old and 
> the 

[jira] [Commented] (HIVE-21001) Upgrade to calcite-1.18

2019-02-22 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16775261#comment-16775261
 ] 

Hive QA commented on HIVE-21001:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 37s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m  0s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 12s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 13s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  8s{color} | {color:blue} ql in master has 2261 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 33s{color} | {color:blue} accumulo-handler in master has 21 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 35s{color} | {color:blue} hbase-handler in master has 15 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  9m 24s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 39s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  8m 30s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 42s{color} | {color:red} ql: The patch generated 5 new + 290 unchanged - 29 fixed = 295 total (was 319) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  2m 11s{color} | {color:red} root: The patch generated 5 new + 290 unchanged - 29 fixed = 295 total (was 319) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 12m 11s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 28s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 77m 16s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  xml  compile  findbugs  checkstyle  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-16200/dev-support/hive-personality.sh |
| git revision | master / c45751f |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-16200/yetus/diff-checkstyle-ql.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-16200/yetus/diff-checkstyle-root.txt |
| whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-16200/yetus/whitespace-eol.txt |
| modules | C: ql accumulo-handler hbase-handler . U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-16200/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Upgrade to calcite-1.18
> ---
>
> Key: HIVE-21001
> URL: https://issues.apache.org/jira/browse/HIVE-21001
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
> 

[jira] [Commented] (HIVE-21001) Upgrade to calcite-1.18

2019-02-22 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16775244#comment-16775244
 ] 

Hive QA commented on HIVE-21001:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12959734/HIVE-21001.38.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 121 failed/errored test(s), 15810 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ambiguitycheck] (batchId=79)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[except_all] (batchId=50)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[filter_cond_pushdown] (batchId=65)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_grouping_sets_grouping] (batchId=4)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_rollup_empty] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[orc_ppd_char] (batchId=11)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_15] (batchId=94)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_4] (batchId=47)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_6] (batchId=44)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_div0] (batchId=83)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[row__id] (batchId=84)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[timestamp_ints_casts] (batchId=1)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_to_unix_timestamp] (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[union_remove_6_subq] (batchId=42)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_math_funcs] (batchId=25)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_15] (batchId=70)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_4] (batchId=23)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_6] (batchId=29)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_case] (batchId=62)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_casts] (batchId=89)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_math_funcs] (batchId=22)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_string_funcs] (batchId=63)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_timestamp_ints_casts] (batchId=53)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[view_cbo] (batchId=73)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[except_distinct] (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_sortmerge_join_16] (batchId=174)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[default_constraint] (batchId=169)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainuser_1] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[groupby_rollup_empty] (batchId=174)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[lineage3] (batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[materialized_view_create_rewrite_3] (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[materialized_view_create_rewrite_4] (batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[materialized_view_create_rewrite_5] (batchId=160)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[materialized_view_create_rewrite_rebuild_dummy] (batchId=166)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[materialized_view_create_rewrite_time_window] (batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[materialized_view_rewrite_10] (batchId=177)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[materialized_view_rewrite_4] (batchId=160)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[materialized_view_rewrite_6] (batchId=178)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[materialized_view_rewrite_7] (batchId=160)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[materialized_view_rewrite_8] (batchId=174)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[materialized_view_rewrite_no_join_opt] (batchId=184)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[orc_ppd_varchar] (batchId=178)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_in] (batchId=176)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_multi] (batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_notin] (batchId=178)

[jira] [Work logged] (HIVE-21292) Break up DDLTask 1 - extract Database related operations

2019-02-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21292?focusedWorklogId=202660&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-202660
 ]

ASF GitHub Bot logged work on HIVE-21292:
-

Author: ASF GitHub Bot
Created on: 22/Feb/19 14:30
Start Date: 22/Feb/19 14:30
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on pull request #543: HIVE-21292: 
Break up DDLTask 1 - extract Database related operations
URL: https://github.com/apache/hive/pull/543#discussion_r259363654
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/ddl/database/AlterDatabaseDesc.java
 ##
 @@ -16,112 +16,100 @@
  * limitations under the License.
  */
 
-package org.apache.hadoop.hive.ql.plan;
+package org.apache.hadoop.hive.ql.exec.ddl.database;
 
 import java.io.Serializable;
 import java.util.Map;
 
+import org.apache.hadoop.hive.ql.exec.ddl.DDLDesc;
+import org.apache.hadoop.hive.ql.exec.ddl.DDLTask2;
 import org.apache.hadoop.hive.ql.parse.ReplicationSpec;
+import org.apache.hadoop.hive.ql.plan.Explain;
 import org.apache.hadoop.hive.ql.plan.Explain.Level;
+import org.apache.hadoop.hive.ql.plan.PrincipalDesc;
 
 /**
- * AlterDatabaseDesc.
- *
+ * DDL task description for ALTER DATABASE commands.
  */
 @Explain(displayName = "Alter Database", explainLevels = { Level.USER, 
Level.DEFAULT, Level.EXTENDED })
 public class AlterDatabaseDesc extends DDLDesc implements Serializable {
-
   private static final long serialVersionUID = 1L;
 
-  // Only altering the database property and owner is currently supported
-  public static enum ALTER_DB_TYPES {
-ALTER_PROPERTY, ALTER_OWNER, ALTER_LOCATION
-  };
-
-  ALTER_DB_TYPES alterType;
-  String databaseName;
-  Map dbProperties;
-  PrincipalDesc ownerPrincipal;
-  ReplicationSpec replicationSpec;
-  String location;
+  static {
+DDLTask2.registerOperator(AlterDatabaseDesc.class, 
AlterDatabaseOperation.class);
+  }
 
   /**
-   * For serialization only.
+   * Supported type of alter db commands.
+   * Only altering the database property and owner is currently supported
*/
-  public AlterDatabaseDesc() {
-  }
+  public enum AlterDbType {
+ALTER_PROPERTY, ALTER_OWNER, ALTER_LOCATION
+  };
 
-  public AlterDatabaseDesc(String databaseName, Map dbProps,
 
 Review comment:
   I don't feel a few hundred lines is much :) I thought that keeping closely related stuff 
together might be beneficial during future feature additions, etc.,
   but it was just an idea; it will be fine without it as well.
   
   Right now I'm wondering what would happen if, instead of adding an 
extra layer of (`AlterDatabaseOperation`, `AlterDatabaseDesc`), we went further: how insane 
would it be to have `AlterDatabaseTask`, `Work` and `SA`?
   
   
 



Issue Time Tracking
---

Worklog Id: (was: 202660)
Time Spent: 6h 40m  (was: 6.5h)

> Break up DDLTask 1 - extract Database related operations
> 
>
> Key: HIVE-21292
> URL: https://issues.apache.org/jira/browse/HIVE-21292
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21292.01.patch, HIVE-21292.02.patch, 
> HIVE-21292.03.patch, HIVE-21292.04.patch, HIVE-21292.05.patch, 
> HIVE-21292.06.patch
>
>  Time Spent: 6h 40m
>  Remaining Estimate: 0h
>
> DDLTask is a huge class, more than 5000 lines long. The related DDLWork is 
> also a huge class, which has a field for each DDL operation it supports. The 
> goal is to refactor these in order to have everything cut into more 
> handleable classes under the package  org.apache.hadoop.hive.ql.exec.ddl:
>  * have a separate class for each operation
>  * have a package for each operation group (database ddl, table ddl, etc), so 
> the amount of classes under a package is more manageable
>  * make all the requests (DDLDesc subclasses) immutable
>  * DDLTask should be agnostic to the actual operations
>  * right now let's ignore the issue of having some operations handled by 
> DDLTask which are not actual DDL operations (lock, unlock, desc...)
> In the interim time when there are two DDLTask and DDLWork classes in the 
> code base the new ones in the new package are called DDLTask2 and DDLWork2 
> thus avoiding the usage of fully qualified class names where both the old and 
> the new classes are in 

[jira] [Work logged] (HIVE-21292) Break up DDLTask 1 - extract Database related operations

2019-02-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21292?focusedWorklogId=202651&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-202651
 ]

ASF GitHub Bot logged work on HIVE-21292:
-

Author: ASF GitHub Bot
Created on: 22/Feb/19 14:17
Start Date: 22/Feb/19 14:17
Worklog Time Spent: 10m 
  Work Description: miklosgergely commented on pull request #543: 
HIVE-21292: Break up DDLTask 1 - extract Database related operations
URL: https://github.com/apache/hive/pull/543#discussion_r259358745
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/ddl/DDLTask2.java
 ##
 @@ -0,0 +1,107 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.exec.ddl;
+
+import java.io.Serializable;
+import java.util.HashMap;
+import java.util.Map;
+
+import org.apache.hadoop.hive.ql.CompilationOpContext;
+import org.apache.hadoop.hive.ql.DriverContext;
+import org.apache.hadoop.hive.ql.QueryPlan;
+import org.apache.hadoop.hive.ql.QueryState;
+import org.apache.hadoop.hive.ql.exec.Task;
+import org.apache.hadoop.hive.ql.metadata.Hive;
+import org.apache.hadoop.hive.ql.parse.ExplainConfiguration.AnalyzeState;
+import org.apache.hadoop.hive.ql.plan.api.StageType;
+
+/**
+ * DDLTask implementation.
+**/
+public class DDLTask2 extends Task implements Serializable {
+  private static final long serialVersionUID = 1L;
+
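+  private static final Map<Class<? extends DDLDesc>, Class<? extends DDLOperation<? extends DDLDesc>>> DESC_TO_OPARATION =
+  new HashMap<>();
+  public static void registerOperator(Class<? extends DDLDesc> descClass,
+  Class<? extends DDLOperation<? extends DDLDesc>> operationClass) {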
+DESC_TO_OPARATION.put(descClass, operationClass);
+  }
+
+  @Override
+  public boolean requireLock() {
+return this.work != null && this.work.getNeedLock();
+  }
+
+  @Override
+  public void initialize(QueryState queryState, QueryPlan queryPlan, 
DriverContext ctx,
+  CompilationOpContext opContext) {
+super.initialize(queryState, queryPlan, ctx, opContext);
+  }
+
+  @Override
+  public int execute(DriverContext driverContext) {
+if (driverContext.getCtx().getExplainAnalyze() == AnalyzeState.RUNNING) {
+  return 0;
+}
+
+try {
+  Hive db = Hive.get(conf);
+  DDLDesc ddlDesc = work.getDDLDesc();
+
+  if (DESC_TO_OPARATION.containsKey(ddlDesc.getClass())) {
+DDLOperation ddlOperation = 
DESC_TO_OPARATION.get(ddlDesc.getClass()).newInstance();
+ddlOperation.init(db, conf, driverContext, ddlDesc);
 
 Review comment:
   Context created, will be in the next version.
 



Issue Time Tracking
---

Worklog Id: (was: 202651)
Time Spent: 5h 50m  (was: 5h 40m)

> Break up DDLTask 1 - extract Database related operations
> 
>
> Key: HIVE-21292
> URL: https://issues.apache.org/jira/browse/HIVE-21292
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21292.01.patch, HIVE-21292.02.patch, 
> HIVE-21292.03.patch, HIVE-21292.04.patch, HIVE-21292.05.patch, 
> HIVE-21292.06.patch
>
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> DDLTask is a huge class, more than 5000 lines long. The related DDLWork is 
> also a huge class, which has a field for each DDL operation it supports. The 
> goal is to refactor these in order to have everything cut into more 
> handleable classes under the package  org.apache.hadoop.hive.ql.exec.ddl:
>  * have a separate class for each operation
>  * have a package for each operation group (database ddl, table ddl, etc), so 
> the amount of classes 

[jira] [Work logged] (HIVE-21292) Break up DDLTask 1 - extract Database related operations

2019-02-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21292?focusedWorklogId=202658&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-202658
 ]

ASF GitHub Bot logged work on HIVE-21292:
-

Author: ASF GitHub Bot
Created on: 22/Feb/19 14:24
Start Date: 22/Feb/19 14:24
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on pull request #543: HIVE-21292: 
Break up DDLTask 1 - extract Database related operations
URL: https://github.com/apache/hive/pull/543#discussion_r259361340
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/ddl/database/SwitchDatabaseDesc.java
 ##
 @@ -16,37 +16,34 @@
  * limitations under the License.
  */
 
-package org.apache.hadoop.hive.ql.plan;
+package org.apache.hadoop.hive.ql.exec.ddl.database;
 
 import java.io.Serializable;
-import org.apache.hadoop.hive.ql.plan.Explain.Level;
 
+import org.apache.hadoop.hive.ql.exec.ddl.DDLDesc;
+import org.apache.hadoop.hive.ql.exec.ddl.DDLTask2;
+import org.apache.hadoop.hive.ql.plan.Explain;
+import org.apache.hadoop.hive.ql.plan.Explain.Level;
 
 /**
- * SwitchDatabaseDesc.
- *
+ * DDL task description for SWITCH DATABASE commands.
 
 Review comment:
   of course; it makes sense to do it separately
 



Issue Time Tracking
---

Worklog Id: (was: 202658)
Time Spent: 6.5h  (was: 6h 20m)

> Break up DDLTask 1 - extract Database related operations
> 
>
> Key: HIVE-21292
> URL: https://issues.apache.org/jira/browse/HIVE-21292
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21292.01.patch, HIVE-21292.02.patch, 
> HIVE-21292.03.patch, HIVE-21292.04.patch, HIVE-21292.05.patch, 
> HIVE-21292.06.patch
>
>  Time Spent: 6.5h
>  Remaining Estimate: 0h
>
> DDLTask is a huge class, more than 5000 lines long. The related DDLWork is 
> also a huge class, which has a field for each DDL operation it supports. The 
> goal is to refactor these in order to have everything cut into more 
> handleable classes under the package  org.apache.hadoop.hive.ql.exec.ddl:
>  * have a separate class for each operation
>  * have a package for each operation group (database ddl, table ddl, etc), so 
> the amount of classes under a package is more manageable
>  * make all the requests (DDLDesc subclasses) immutable
>  * DDLTask should be agnostic to the actual operations
>  * right now let's ignore the issue of having some operations handled by 
> DDLTask which are not actual DDL operations (lock, unlock, desc...)
> In the interim time when there are two DDLTask and DDLWork classes in the 
> code base the new ones in the new package are called DDLTask2 and DDLWork2 
> thus avoiding the usage of fully qualified class names where both the old and 
> the new classes are in use.
> Step #1: extract all the database related operations from the old DDLTask, 
> and move them under the new package. Also create the new internal framework.
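
The goals listed in the description above (one class per operation, immutable request objects, and a task that is agnostic to the concrete operations) can be illustrated with a minimal sketch. Every name below (SketchDesc, SketchOperation, AlterDbPropsDesc, AlterDbPropsOperation, SketchTask) is hypothetical and is not a Hive class. The sketch also swaps in plain factory functions for the lookup, which is a simplification and not the Class-to-Class registry used in the patch under review.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// All names here are hypothetical; they only mirror the roles described above.

/** Marker for an immutable request object (the role a DDLDesc subclass plays). */
interface SketchDesc { }

/** One unit of work per statement type (the role of an operation class). */
interface SketchOperation {
  int execute();
}

/** Immutable request: final fields, no setters, defensive copy of the map. */
final class AlterDbPropsDesc implements SketchDesc {
  private final String databaseName;
  private final Map<String, String> properties;

  AlterDbPropsDesc(String databaseName, Map<String, String> properties) {
    this.databaseName = databaseName;
    this.properties = Collections.unmodifiableMap(new HashMap<>(properties));
  }

  String getDatabaseName() { return databaseName; }
  Map<String, String> getProperties() { return properties; }
}

/** The operation that handles exactly one request type. */
final class AlterDbPropsOperation implements SketchOperation {
  private final AlterDbPropsDesc desc;

  AlterDbPropsOperation(AlterDbPropsDesc desc) { this.desc = desc; }

  @Override
  public int execute() {
    System.out.println("altering " + desc.getDatabaseName() + " with " + desc.getProperties());
    return 0; // 0 signals success in this sketch
  }
}

/** The task only looks up a factory; it has no per-operation branches. */
final class SketchTask {
  private static final Map<Class<? extends SketchDesc>, Function<SketchDesc, SketchOperation>>
      REGISTRY = new HashMap<>();

  static <D extends SketchDesc> void register(Class<D> descClass,
      Function<D, SketchOperation> factory) {
    // The cast is safe because run() looks the factory up by the request's own class.
    REGISTRY.put(descClass, d -> factory.apply(descClass.cast(d)));
  }

  static int run(SketchDesc desc) {
    Function<SketchDesc, SketchOperation> factory = REGISTRY.get(desc.getClass());
    if (factory == null) {
      throw new IllegalArgumentException("Unknown request type: " + desc.getClass().getName());
    }
    return factory.apply(desc).execute();
  }
}

final class DdlDispatchSketch {
  public static void main(String[] args) {
    SketchTask.register(AlterDbPropsDesc.class, AlterDbPropsOperation::new);
    Map<String, String> props = new HashMap<>();
    props.put("owner", "hive");
    SketchTask.run(new AlterDbPropsDesc("sales", props));
  }
}
```

Adding a new statement type in this sketch means adding one request class, one operation class and one register call; SketchTask itself stays unchanged.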



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (HIVE-21292) Break up DDLTask 1 - extract Database related operations

2019-02-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21292?focusedWorklogId=202657&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-202657
 ]

ASF GitHub Bot logged work on HIVE-21292:
-

Author: ASF GitHub Bot
Created on: 22/Feb/19 14:22
Start Date: 22/Feb/19 14:22
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on pull request #543: HIVE-21292: 
Break up DDLTask 1 - extract Database related operations
URL: https://github.com/apache/hive/pull/543#discussion_r259360497
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/ddl/database/ShowDatabasesOperation.java
 ##
 @@ -0,0 +1,58 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.exec.ddl.database;
+
+import java.io.DataOutputStream;
+import java.util.List;
+
+import org.apache.hadoop.hive.ql.ErrorMsg;
+import org.apache.hadoop.hive.ql.exec.ddl.DDLOperation;
+import org.apache.hadoop.hive.ql.metadata.HiveException;
+import org.apache.hadoop.io.IOUtils;
+
+/**
+ * Operation process of locking a database.
+ */
+public class ShowDatabasesOperation extends DDLOperation {
+  @Override
+  public int execute() throws HiveException {
+// get the databases for the desired pattern - populate the output stream
+List databases = null;
+if (ddlDesc.getPattern() != null) {
+  LOG.debug("pattern: {}", ddlDesc.getPattern());
+  databases = db.getDatabasesByPattern(ddlDesc.getPattern());
+} else {
+  databases = db.getAllDatabases();
+}
+
+LOG.info("Found {} database(s) matching the SHOW DATABASES statement.", 
databases.size());
+
+// write the results in the file
+DataOutputStream outStream = getOutputStream(ddlDesc.getResFile());
+try {
+  formatter.showDatabases(outStream, databases);
 
 Review comment:
   that could be done later; and there are 2 brands of formatters...so it's not 
that simple
 



Issue Time Tracking
---

Worklog Id: (was: 202657)
Time Spent: 6h 20m  (was: 6h 10m)

> Break up DDLTask 1 - extract Database related operations
> 
>
> Key: HIVE-21292
> URL: https://issues.apache.org/jira/browse/HIVE-21292
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21292.01.patch, HIVE-21292.02.patch, 
> HIVE-21292.03.patch, HIVE-21292.04.patch, HIVE-21292.05.patch, 
> HIVE-21292.06.patch
>
>  Time Spent: 6h 20m
>  Remaining Estimate: 0h
>
> DDLTask is a huge class, more than 5000 lines long. The related DDLWork is 
> also a huge class, which has a field for each DDL operation it supports. The 
> goal is to refactor these in order to have everything cut into more 
> handleable classes under the package  org.apache.hadoop.hive.ql.exec.ddl:
>  * have a separate class for each operation
>  * have a package for each operation group (database ddl, table ddl, etc), so 
> the amount of classes under a package is more manageable
>  * make all the requests (DDLDesc subclasses) immutable
>  * DDLTask should be agnostic to the actual operations
>  * right now let's ignore the issue of having some operations handled by 
> DDLTask which are not actual DDL operations (lock, unlock, desc...)
> In the interim time when there are two DDLTask and DDLWork classes in the 
> code base the new ones in the new package are called DDLTask2 and DDLWork2 
> thus avoiding the usage of fully qualified class names where both the old and 
> the new classes are in use.
> Step #1: extract all 

[jira] [Work logged] (HIVE-21292) Break up DDLTask 1 - extract Database related operations

2019-02-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21292?focusedWorklogId=202654&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-202654
 ]

ASF GitHub Bot logged work on HIVE-21292:
-

Author: ASF GitHub Bot
Created on: 22/Feb/19 14:18
Start Date: 22/Feb/19 14:18
Worklog Time Spent: 10m 
  Work Description: miklosgergely commented on pull request #543: 
HIVE-21292: Break up DDLTask 1 - extract Database related operations
URL: https://github.com/apache/hive/pull/543#discussion_r259359243
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/ddl/DDLTask2.java
 ##
 @@ -0,0 +1,107 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.exec.ddl;
+
+import java.io.Serializable;
+import java.util.HashMap;
+import java.util.Map;
+
+import org.apache.hadoop.hive.ql.CompilationOpContext;
+import org.apache.hadoop.hive.ql.DriverContext;
+import org.apache.hadoop.hive.ql.QueryPlan;
+import org.apache.hadoop.hive.ql.QueryState;
+import org.apache.hadoop.hive.ql.exec.Task;
+import org.apache.hadoop.hive.ql.metadata.Hive;
+import org.apache.hadoop.hive.ql.parse.ExplainConfiguration.AnalyzeState;
+import org.apache.hadoop.hive.ql.plan.api.StageType;
+
+/**
+ * DDLTask implementation.
+**/
+public class DDLTask2 extends Task implements Serializable {
+  private static final long serialVersionUID = 1L;
+
+  private static final Map, Class>> DESC_TO_OPARATION =
+  new HashMap<>();
+  public static void registerOperator(Class descClass,
+  Class> operationClass) {
+DESC_TO_OPARATION.put(descClass, operationClass);
+  }
+
+  @Override
+  public boolean requireLock() {
+return this.work != null && this.work.getNeedLock();
+  }
+
+  @Override
+  public void initialize(QueryState queryState, QueryPlan queryPlan, 
DriverContext ctx,
+  CompilationOpContext opContext) {
+super.initialize(queryState, queryPlan, ctx, opContext);
+  }
+
+  @Override
+  public int execute(DriverContext driverContext) {
+if (driverContext.getCtx().getExplainAnalyze() == AnalyzeState.RUNNING) {
+  return 0;
+}
+
+try {
+  Hive db = Hive.get(conf);
+  DDLDesc ddlDesc = work.getDDLDesc();
+
+  if (DESC_TO_OPARATION.containsKey(ddlDesc.getClass())) {
+DDLOperation ddlOperation = 
DESC_TO_OPARATION.get(ddlDesc.getClass()).newInstance();
 
 Review comment:
   Switched to use constructor. Looks better indeed.
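
   For readers following the thread, a minimal sketch of what "use constructor" can look like with reflection. All names here (OpContext, ShowDbDesc, ShowDbOperation, ReflectiveFactory) are hypothetical stand-ins and not the classes from this pull request. The idea is to look up a two-argument constructor and call it, instead of calling Class.newInstance() (deprecated since Java 9) and then a separate init(...) method.

```java
import java.lang.reflect.Constructor;

// Hypothetical stand-ins; none of these are Hive classes.
final class OpContext {
  final String user;
  OpContext(String user) { this.user = user; }
}

final class ShowDbDesc {
  final String pattern;
  ShowDbDesc(String pattern) { this.pattern = pattern; }
}

/** Dependencies arrive through the constructor, so the fields can be final. */
final class ShowDbOperation {
  private final OpContext context;
  private final ShowDbDesc desc;

  public ShowDbOperation(OpContext context, ShowDbDesc desc) {
    this.context = context;
    this.desc = desc;
  }

  String describe() {
    return context.user + " lists databases like " + desc.pattern;
  }
}

final class ReflectiveFactory {
  /**
   * Finds the (OpContext, desc) constructor of the operation class and invokes it.
   * Reflection failures are rethrown unchecked to keep call sites short.
   */
  static <T> T create(Class<T> operationClass, OpContext context, Object desc) {
    try {
      Constructor<T> constructor =
          operationClass.getDeclaredConstructor(OpContext.class, desc.getClass());
      return constructor.newInstance(context, desc);
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Cannot create " + operationClass.getName(), e);
    }
  }
}

final class ConstructorSketch {
  public static void main(String[] args) {
    ShowDbOperation op = ReflectiveFactory.create(
        ShowDbOperation.class, new OpContext("alice"), new ShowDbDesc("sales*"));
    System.out.println(op.describe());
  }
}
```

   One consequence of this shape is that an operation can never be observed half-initialized, which is not guaranteed when construction and init(...) are two separate steps.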
 



Issue Time Tracking
---

Worklog Id: (was: 202654)
Time Spent: 6h 10m  (was: 6h)

> Break up DDLTask 1 - extract Database related operations
> 
>
> Key: HIVE-21292
> URL: https://issues.apache.org/jira/browse/HIVE-21292
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21292.01.patch, HIVE-21292.02.patch, 
> HIVE-21292.03.patch, HIVE-21292.04.patch, HIVE-21292.05.patch, 
> HIVE-21292.06.patch
>
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
> DDLTask is a huge class, more than 5000 lines long. The related DDLWork is 
> also a huge class, which has a field for each DDL operation it supports. The 
> goal is to refactor these in order to have everything cut into more 
> handleable classes under the package  org.apache.hadoop.hive.ql.exec.ddl:
>  * have a separate class for each operation
>  * have a package for each operation group (database ddl, table ddl, etc), so 
> the amount of classes under a package is more manageable
>  * make all the requests 

[jira] [Work logged] (HIVE-21292) Break up DDLTask 1 - extract Database related operations

2019-02-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21292?focusedWorklogId=202652=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-202652
 ]

ASF GitHub Bot logged work on HIVE-21292:
-

Author: ASF GitHub Bot
Created on: 22/Feb/19 14:17
Start Date: 22/Feb/19 14:17
Worklog Time Spent: 10m 
  Work Description: miklosgergely commented on pull request #543: 
HIVE-21292: Break up DDLTask 1 - extract Database related operations
URL: https://github.com/apache/hive/pull/543#discussion_r259358917
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/ddl/DDLOperation.java
 ##
 @@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.exec.ddl;
+
+import java.io.DataOutputStream;
+
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.ql.DriverContext;
+import org.apache.hadoop.hive.ql.metadata.Hive;
+import org.apache.hadoop.hive.ql.metadata.HiveException;
+import org.apache.hadoop.hive.ql.metadata.formatting.MetaDataFormatUtils;
+import org.apache.hadoop.hive.ql.metadata.formatting.MetaDataFormatter;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Abstract ancestor class of all DDL Operation classes.
+ */
+public abstract class DDLOperation {
+  protected static final Logger LOG = LoggerFactory.getLogger("hive.ql.exec.DDLTask");
+
+  protected Hive db;
 
 Review comment:
   Made them final, added via constructor. Looks better indeed.
 



Issue Time Tracking
---

Worklog Id: (was: 202652)
Time Spent: 6h  (was: 5h 50m)

> Break up DDLTask 1 - extract Database related operations
> 
>
> Key: HIVE-21292
> URL: https://issues.apache.org/jira/browse/HIVE-21292
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21292.01.patch, HIVE-21292.02.patch, 
> HIVE-21292.03.patch, HIVE-21292.04.patch, HIVE-21292.05.patch, 
> HIVE-21292.06.patch
>
>  Time Spent: 6h
>  Remaining Estimate: 0h
>
> DDLTask is a huge class, more than 5000 lines long. The related DDLWork is 
> also a huge class, which has a field for each DDL operation it supports. The 
> goal is to refactor these so that everything is cut into more 
> manageable classes under the package org.apache.hadoop.hive.ql.exec.ddl:
>  * have a separate class for each operation
>  * have a package for each operation group (database ddl, table ddl, etc), so 
> the number of classes under a package is more manageable
>  * make all the requests (DDLDesc subclasses) immutable
>  * DDLTask should be agnostic to the actual operations
>  * for now, let's ignore the issue of having some operations handled by 
> DDLTask which are not actual DDL operations (lock, unlock, desc...)
> In the interim, while there are two DDLTask and DDLWork classes in the 
> code base, the new ones in the new package are called DDLTask2 and DDLWork2, 
> which avoids the use of fully qualified class names where both the old and 
> the new classes are in use.
> Step #1: extract all the database related operations from the old DDLTask, 
> and move them under the new package. Also create the new internal framework.
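
The goals listed in the description (one class per operation, immutable requests, a task that is agnostic to the concrete operations) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the patch: `SketchDesc`, `SketchOperation`, `ShowDatabasesSketchDesc`, `ShowDatabasesSketchOperation` and `SketchRegistry` are hypothetical names, and the real DDLDesc / DDLOperation classes may be shaped differently.

```java
import java.lang.reflect.Constructor;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for DDLDesc: an immutable request object.
interface SketchDesc {}

final class ShowDatabasesSketchDesc implements SketchDesc {
  private final String pattern;

  ShowDatabasesSketchDesc(String pattern) {
    this.pattern = pattern;
  }

  String getPattern() {
    return pattern;
  }
}

// Hypothetical stand-in for DDLOperation: the request arrives via the constructor and is final.
abstract class SketchOperation<T extends SketchDesc> {
  protected final T desc;

  protected SketchOperation(T desc) {
    this.desc = desc;
  }

  abstract int execute();
}

final class ShowDatabasesSketchOperation extends SketchOperation<ShowDatabasesSketchDesc> {
  ShowDatabasesSketchOperation(ShowDatabasesSketchDesc desc) {
    super(desc);
  }

  @Override
  int execute() {
    // In this sketch 0 signals success and 1 signals a rejected request.
    return desc.getPattern().isEmpty() ? 1 : 0;
  }
}

// The dispatcher only knows the mapping from request class to operation class.
final class SketchRegistry {
  private static final Map<Class<? extends SketchDesc>, Class<? extends SketchOperation<?>>> OPERATIONS =
      new HashMap<>();

  static <T extends SketchDesc> void register(
      Class<T> descClass, Class<? extends SketchOperation<T>> operationClass) {
    OPERATIONS.put(descClass, operationClass);
  }

  static int run(SketchDesc desc) {
    Class<? extends SketchOperation<?>> operationClass = OPERATIONS.get(desc.getClass());
    if (operationClass == null) {
      throw new IllegalArgumentException("No operation registered for " + desc.getClass().getName());
    }
    try {
      // Look up the constructor that takes the concrete request type and invoke it.
      Constructor<? extends SketchOperation<?>> constructor =
          operationClass.getDeclaredConstructor(desc.getClass());
      return constructor.newInstance(desc).execute();
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Could not create " + operationClass.getName(), e);
    }
  }
}
```

With this shape, adding an operation means adding one request class, one operation class and one `register` call; the dispatcher itself does not change.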



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21292) Break up DDLTask 1 - extract Database related operations

2019-02-22 Thread Miklos Gergely (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Gergely updated HIVE-21292:
--
Attachment: HIVE-21292.06.patch

> Break up DDLTask 1 - extract Database related operations
> 
>
> Key: HIVE-21292
> URL: https://issues.apache.org/jira/browse/HIVE-21292
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21292.01.patch, HIVE-21292.02.patch, 
> HIVE-21292.03.patch, HIVE-21292.04.patch, HIVE-21292.05.patch, 
> HIVE-21292.06.patch
>
>  Time Spent: 5h 40m
>  Remaining Estimate: 0h
>
> DDLTask is a huge class, more than 5000 lines long. The related DDLWork is 
> also a huge class, which has a field for each DDL operation it supports. The 
> goal is to refactor these in order to have everything cut into more 
> handleable classes under the package  org.apache.hadoop.hive.ql.exec.ddl:
>  * have a separate class for each operation
>  * have a package for each operation group (database ddl, table ddl, etc), so 
> the amount of classes under a package is more manageable
>  * make all the requests (DDLDesc subclasses) immutable
>  * DDLTask should be agnostic to the actual operations
>  * right now let's ignore the issue of having some operations handled by 
> DDLTask which are not actual DDL operations (lock, unlock, desc...)
> In the interim time when there are two DDLTask and DDLWork classes in the 
> code base the new ones in the new package are called DDLTask2 and DDLWork2 
> thus avoiding the usage of fully qualified class names where both the old and 
> the new classes are in use.
> Step #1: extract all the database related operations from the old DDLTask, 
> and move them under the new package. Also create the new internal framework.





[jira] [Updated] (HIVE-21292) Break up DDLTask 1 - extract Database related operations

2019-02-22 Thread Miklos Gergely (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Gergely updated HIVE-21292:
--
Status: Patch Available  (was: Open)



[jira] [Updated] (HIVE-21292) Break up DDLTask 1 - extract Database related operations

2019-02-22 Thread Miklos Gergely (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Gergely updated HIVE-21292:
--
Status: Open  (was: Patch Available)



[jira] [Commented] (HIVE-21297) Replace all occurences of new Long, Boolean, Double etc with the corresponding .valueOf

2019-02-22 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775136#comment-16775136
 ] 

Hive QA commented on HIVE-21297:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12959732/HIVE-21297.02.patch

{color:green}SUCCESS:{color} +1 due to 13 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 15810 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16199/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16199/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16199/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12959732 - PreCommit-HIVE-Build

> Replace all occurences of new Long, Boolean, Double etc with the 
> corresponding .valueOf
> ---
>
> Key: HIVE-21297
> URL: https://issues.apache.org/jira/browse/HIVE-21297
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ivan Suller
>Assignee: Ivan Suller
>Priority: Trivial
> Fix For: 4.0.0
>
> Attachments: HIVE-21297.01.patch, HIVE-21297.02.patch
>
>
> Calling new Long(...), new Boolean(...), etc. always allocates a new 
> object, while Long.valueOf(...) and Boolean.valueOf(...) can return cached 
> instances (and actually do in most if not all JVMs), thus reducing GC 
> overhead. I already had two similar tickets (HIVE-21228, HIVE-21199); this 
> one finishes the job.
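
The caching behaviour the description relies on can be observed with a small sketch. `ValueOfSketch` is a hypothetical class written for illustration, not code from the patch. It uses `Integer` and `Boolean` because the Java API documents caching for `Integer.valueOf` in the range -128 to 127 and for `Boolean.valueOf`; whether other wrapper types or larger values are cached depends on the JVM.

```java
final class ValueOfSketch {
  // True when two valueOf calls for the same int hand back one shared object.
  static boolean sharesInstance(int value) {
    Integer first = Integer.valueOf(value);
    Integer second = Integer.valueOf(value);
    // Reference comparison on purpose: equals() would hide whether an allocation happened.
    return first == second;
  }

  // Boolean.valueOf returns one of the two canonical constants.
  static boolean sharesBooleanConstant(boolean value) {
    return Boolean.valueOf(value) == (value ? Boolean.TRUE : Boolean.FALSE);
  }
}
```

Outside the documented range `sharesInstance` may return either result, which is why production code should still compare wrapper values with `equals()`.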





[jira] [Work logged] (HIVE-21292) Break up DDLTask 1 - extract Database related operations

2019-02-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21292?focusedWorklogId=202632=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-202632
 ]

ASF GitHub Bot logged work on HIVE-21292:
-

Author: ASF GitHub Bot
Created on: 22/Feb/19 13:14
Start Date: 22/Feb/19 13:14
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on pull request #543: HIVE-21292: 
Break up DDLTask 1 - extract Database related operations
URL: https://github.com/apache/hive/pull/543#discussion_r259336043
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/ddl/DDLTask2.java
 ##
 @@ -0,0 +1,107 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.exec.ddl;
+
+import java.io.Serializable;
+import java.util.HashMap;
+import java.util.Map;
+
+import org.apache.hadoop.hive.ql.CompilationOpContext;
+import org.apache.hadoop.hive.ql.DriverContext;
+import org.apache.hadoop.hive.ql.QueryPlan;
+import org.apache.hadoop.hive.ql.QueryState;
+import org.apache.hadoop.hive.ql.exec.Task;
+import org.apache.hadoop.hive.ql.metadata.Hive;
+import org.apache.hadoop.hive.ql.parse.ExplainConfiguration.AnalyzeState;
+import org.apache.hadoop.hive.ql.plan.api.StageType;
+
+/**
+ * DDLTask implementation.
+**/
+public class DDLTask2 extends Task implements Serializable {
+  private static final long serialVersionUID = 1L;
+
+  private static final Map, Class>> DESC_TO_OPARATION =
+  new HashMap<>();
+  public static void registerOperator(Class descClass,
+  Class> operationClass) {
+DESC_TO_OPARATION.put(descClass, operationClass);
+  }
+
+  @Override
+  public boolean requireLock() {
+return this.work != null && this.work.getNeedLock();
+  }
+
+  @Override
+  public void initialize(QueryState queryState, QueryPlan queryPlan, DriverContext ctx,
+  CompilationOpContext opContext) {
+super.initialize(queryState, queryPlan, ctx, opContext);
+  }
+
+  @Override
+  public int execute(DriverContext driverContext) {
+if (driverContext.getCtx().getExplainAnalyze() == AnalyzeState.RUNNING) {
+  return 0;
+}
+
+try {
+  Hive db = Hive.get(conf);
+  DDLDesc ddlDesc = work.getDDLDesc();
+
+  if (DESC_TO_OPARATION.containsKey(ddlDesc.getClass())) {
+DDLOperation ddlOperation = DESC_TO_OPARATION.get(ddlDesc.getClass()).newInstance();
 
 Review comment:
   If it's not the concrete operation's responsibility, then the `init` method 
should be final.
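
The point of the comment can be illustrated with a small sketch. It assumes a hypothetical base class `InitSketchOperation` with a single `String` context; the actual `init` signature in the pull request is not shown in this message.

```java
import java.util.Objects;

// Hypothetical base class: wiring is fixed here, only execute() is left to subclasses.
abstract class InitSketchOperation {
  private String context;

  // final: a concrete operation cannot override or skip the shared wiring step.
  final void init(String context) {
    this.context = Objects.requireNonNull(context, "context");
  }

  final String context() {
    return context;
  }

  abstract int execute();
}

final class NoopSketchOperation extends InitSketchOperation {
  @Override
  int execute() {
    // Uses the state prepared by init(); 0 signals success in this sketch.
    return context().isEmpty() ? 1 : 0;
  }
}
```

Passing the same state through a constructor into final fields, as the later replies in this thread describe, removes the separate `init` step altogether.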
 



Issue Time Tracking
---

Worklog Id: (was: 202632)
Time Spent: 5h 40m  (was: 5.5h)

> Break up DDLTask 1 - extract Database related operations
> 
>
> Key: HIVE-21292
> URL: https://issues.apache.org/jira/browse/HIVE-21292
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21292.01.patch, HIVE-21292.02.patch, 
> HIVE-21292.03.patch, HIVE-21292.04.patch, HIVE-21292.05.patch
>
>  Time Spent: 5h 40m
>  Remaining Estimate: 0h
>
> DDLTask is a huge class, more than 5000 lines long. The related DDLWork is 
> also a huge class, which has a field for each DDL operation it supports. The 
> goal is to refactor these in order to have everything cut into more 
> handleable classes under the package  org.apache.hadoop.hive.ql.exec.ddl:
>  * have a separate class for each operation
>  * have a package for each operation group (database ddl, table ddl, etc), so 
> the amount of classes under a package is more manageable
>  * make all 

[jira] [Commented] (HIVE-21297) Replace all occurences of new Long, Boolean, Double etc with the corresponding .valueOf

2019-02-22 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775120#comment-16775120
 ] 

Hive QA commented on HIVE-21297:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 52s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 41s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 10s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 14s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 35s{color} | {color:blue} common in master has 65 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 36s{color} | {color:blue} serde in master has 197 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  1m  8s{color} | {color:blue} standalone-metastore/metastore-server in master has 181 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  2s{color} | {color:blue} ql in master has 2261 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 31s{color} | {color:blue} beeline in master has 45 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 34s{color} | {color:blue} hcatalog/webhcat/svr in master has 96 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 43s{color} | {color:blue} hplsql in master has 174 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 31s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 28s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m  8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  3m  8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 12s{color} | {color:green} The patch common passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 15s{color} | {color:green} The patch serde passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 16s{color} | {color:green} The patch metastore-server passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 42s{color} | {color:green} ql: The patch generated 0 new + 501 unchanged - 22 fixed = 501 total (was 523) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 12s{color} | {color:green} beeline: The patch generated 0 new + 47 unchanged - 4 fixed = 47 total (was 51) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 12s{color} | {color:green} The patch svr passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 18s{color} | {color:green} hplsql: The patch generated 0 new + 805 unchanged - 29 fixed = 805 total (was 834) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 39s{color} | {color:green} common generated 0 new + 63 unchanged - 2 fixed = 63 total (was 65) {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 47s{color} | {color:green} serde in the patch passed. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 17s{color} | {color:green} standalone-metastore/metastore-server generated 0 new + 179 unchanged - 2 fixed = 179 total (was 181) {color} |
| {color:green}+1{color} | {color:green} 

[jira] [Work logged] (HIVE-21292) Break up DDLTask 1 - extract Database related operations

2019-02-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21292?focusedWorklogId=202618=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-202618
 ]

ASF GitHub Bot logged work on HIVE-21292:
-

Author: ASF GitHub Bot
Created on: 22/Feb/19 12:49
Start Date: 22/Feb/19 12:49
Worklog Time Spent: 10m 
  Work Description: miklosgergely commented on pull request #543: 
HIVE-21292: Break up DDLTask 1 - extract Database related operations
URL: https://github.com/apache/hive/pull/543#discussion_r259328924
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java
 ##
 @@ -2571,11 +2567,10 @@ private void analyzeDescDatabase(ASTNode ast) throws SemanticException {
   throw new SemanticException("Unexpected Tokens at DESCRIBE DATABASE");
 }
 
-DescDatabaseDesc descDbDesc = new DescDatabaseDesc(ctx.getResFile(),
-dbName, isExtended);
+DescDatabaseDesc descDbDesc = new DescDatabaseDesc(ctx.getResFile(), dbName, isExtended);
 inputs.add(new ReadEntity(getDatabase(dbName)));
-rootTasks.add(TaskFactory.get(new DDLWork(getInputs(), getOutputs(), descDbDesc)));
-setFetchTask(createFetchTask(descDbDesc.getSchema()));
+rootTasks.add(TaskFactory.get(new DDLWork2(getInputs(), getOutputs(), descDbDesc)));
+setFetchTask(createFetchTask(DESC_DATABASE_SCHEMA));
 
 Review comment:
   will do
 



Issue Time Tracking
---

Worklog Id: (was: 202618)
Time Spent: 5.5h  (was: 5h 20m)

> Break up DDLTask 1 - extract Database related operations
> 
>
> Key: HIVE-21292
> URL: https://issues.apache.org/jira/browse/HIVE-21292
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21292.01.patch, HIVE-21292.02.patch, 
> HIVE-21292.03.patch, HIVE-21292.04.patch, HIVE-21292.05.patch
>
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> DDLTask is a huge class, more than 5000 lines long. The related DDLWork is 
> also a huge class, which has a field for each DDL operation it supports. The 
> goal is to refactor these in order to have everything cut into more 
> handleable classes under the package  org.apache.hadoop.hive.ql.exec.ddl:
>  * have a separate class for each operation
>  * have a package for each operation group (database ddl, table ddl, etc), so 
> the amount of classes under a package is more manageable
>  * make all the requests (DDLDesc subclasses) immutable
>  * DDLTask should be agnostic to the actual operations
>  * right now let's ignore the issue of having some operations handled by 
> DDLTask which are not actual DDL operations (lock, unlock, desc...)
> In the interim time when there are two DDLTask and DDLWork classes in the 
> code base the new ones in the new package are called DDLTask2 and DDLWork2 
> thus avoiding the usage of fully qualified class names where both the old and 
> the new classes are in use.
> Step #1: extract all the database related operations from the old DDLTask, 
> and move them under the new package. Also create the new internal framework.





[jira] [Work logged] (HIVE-21292) Break up DDLTask 1 - extract Database related operations

2019-02-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21292?focusedWorklogId=202614=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-202614
 ]

ASF GitHub Bot logged work on HIVE-21292:
-

Author: ASF GitHub Bot
Created on: 22/Feb/19 12:44
Start Date: 22/Feb/19 12:44
Worklog Time Spent: 10m 
  Work Description: miklosgergely commented on pull request #543: 
HIVE-21292: Break up DDLTask 1 - extract Database related operations
URL: https://github.com/apache/hive/pull/543#discussion_r259327538
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/ddl/database/DescDatabaseOperation.java
 ##
 @@ -0,0 +1,73 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.exec.ddl.database;
+
+import java.io.DataOutputStream;
+import java.util.Map;
+import java.util.TreeMap;
+
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.api.Database;
+import org.apache.hadoop.hive.metastore.api.PrincipalType;
+import org.apache.hadoop.hive.ql.ErrorMsg;
+import org.apache.hadoop.hive.ql.exec.ddl.DDLOperation;
+import org.apache.hadoop.hive.ql.metadata.HiveException;
+import org.apache.hadoop.io.IOUtils;
+
+/**
+ * Operation process of describing a database.
+ */
+public class DescDatabaseOperation extends DDLOperation {
+  @Override
+  public int execute() throws HiveException {
+DataOutputStream outStream = getOutputStream(ddlDesc.getResFile());
 
 Review comment:
   will fix
 



Issue Time Tracking
---

Worklog Id: (was: 202614)
Time Spent: 5h 10m  (was: 5h)

> Break up DDLTask 1 - extract Database related operations
> 
>
> Key: HIVE-21292
> URL: https://issues.apache.org/jira/browse/HIVE-21292
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21292.01.patch, HIVE-21292.02.patch, 
> HIVE-21292.03.patch, HIVE-21292.04.patch, HIVE-21292.05.patch
>
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> DDLTask is a huge class, more than 5000 lines long. The related DDLWork is 
> also a huge class, which has a field for each DDL operation it supports. The 
> goal is to refactor these in order to have everything cut into more 
> handleable classes under the package  org.apache.hadoop.hive.ql.exec.ddl:
>  * have a separate class for each operation
>  * have a package for each operation group (database ddl, table ddl, etc), so 
> the amount of classes under a package is more manageable
>  * make all the requests (DDLDesc subclasses) immutable
>  * DDLTask should be agnostic to the actual operations
>  * right now let's ignore the issue of having some operations handled by 
> DDLTask which are not actual DDL operations (lock, unlock, desc...)
> In the interim time when there are two DDLTask and DDLWork classes in the 
> code base the new ones in the new package are called DDLTask2 and DDLWork2 
> thus avoiding the usage of fully qualified class names where both the old and 
> the new classes are in use.
> Step #1: extract all the database related operations from the old DDLTask, 
> and move them under the new package. Also create the new internal framework.





[jira] [Work logged] (HIVE-21292) Break up DDLTask 1 - extract Database related operations

2019-02-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21292?focusedWorklogId=202611=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-202611
 ]

ASF GitHub Bot logged work on HIVE-21292:
-

Author: ASF GitHub Bot
Created on: 22/Feb/19 12:43
Start Date: 22/Feb/19 12:43
Worklog Time Spent: 10m 
  Work Description: miklosgergely commented on pull request #543: 
HIVE-21292: Break up DDLTask 1 - extract Database related operations
URL: https://github.com/apache/hive/pull/543#discussion_r259327077
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/ddl/database/DescDatabaseOperation.java
 ##
 @@ -0,0 +1,73 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.exec.ddl.database;
+
+import java.io.DataOutputStream;
+import java.util.Map;
+import java.util.TreeMap;
+
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.api.Database;
+import org.apache.hadoop.hive.metastore.api.PrincipalType;
+import org.apache.hadoop.hive.ql.ErrorMsg;
+import org.apache.hadoop.hive.ql.exec.ddl.DDLOperation;
+import org.apache.hadoop.hive.ql.metadata.HiveException;
+import org.apache.hadoop.io.IOUtils;
+
+/**
+ * Operation process of describing a database.
+ */
+public class DescDatabaseOperation extends DDLOperation {
+  @Override
+  public int execute() throws HiveException {
+DataOutputStream outStream = getOutputStream(ddlDesc.getResFile());
+try {
+  Database database = db.getDatabase(ddlDesc.getDatabaseName());
+  if (database == null) {
+throw new HiveException(ErrorMsg.DATABASE_NOT_EXISTS, 
ddlDesc.getDatabaseName());
+  }
+
+  Map<String, String> params = null;
+  if (ddlDesc.isExt()) {
+params = database.getParameters();
+  }
+
+  // If this is a q-test, let's order the params map (lexicographically) by
+  // key. This is to get consistent param ordering between Java7 and Java8.
+  if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_IN_TEST) && params 
!= null) {
+params = new TreeMap<>(params);
 
 Review comment:
   This may be valid, but it is outside the scope of this change. Let's break 
DDLTask up first; we can create a separate JIRA for this later.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 202611)
Time Spent: 4h 50m  (was: 4h 40m)

> Break up DDLTask 1 - extract Database related operations
> 
>
> Key: HIVE-21292
> URL: https://issues.apache.org/jira/browse/HIVE-21292
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21292.01.patch, HIVE-21292.02.patch, 
> HIVE-21292.03.patch, HIVE-21292.04.patch, HIVE-21292.05.patch
>
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> DDLTask is a huge class, more than 5000 lines long. The related DDLWork is 
> also a huge class, which has a field for each DDL operation it supports. The 
> goal is to refactor these so that everything is split into more manageable 
> classes under the package org.apache.hadoop.hive.ql.exec.ddl:
>  * have a separate class for each operation
>  * have a package for each operation group (database ddl, table ddl, etc), so 
> the number of classes under a package is more manageable
>  * make all the requests (DDLDesc subclasses) immutable
>  * DDLTask should be agnostic to the actual operations
>  * right now let's ignore the issue of having some operations handled by 
> DDLTask which are not actual DDL operations (lock, unlock, desc...)
> 

[jira] [Work logged] (HIVE-21292) Break up DDLTask 1 - extract Database related operations

2019-02-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21292?focusedWorklogId=202612&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-202612
 ]

ASF GitHub Bot logged work on HIVE-21292:
-

Author: ASF GitHub Bot
Created on: 22/Feb/19 12:43
Start Date: 22/Feb/19 12:43
Worklog Time Spent: 10m 
  Work Description: miklosgergely commented on pull request #543: 
HIVE-21292: Break up DDLTask 1 - extract Database related operations
URL: https://github.com/apache/hive/pull/543#discussion_r259327195
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/ddl/database/DescDatabaseOperation.java
 ##
 @@ -0,0 +1,73 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.exec.ddl.database;
+
+import java.io.DataOutputStream;
+import java.util.Map;
+import java.util.TreeMap;
+
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.api.Database;
+import org.apache.hadoop.hive.metastore.api.PrincipalType;
+import org.apache.hadoop.hive.ql.ErrorMsg;
+import org.apache.hadoop.hive.ql.exec.ddl.DDLOperation;
+import org.apache.hadoop.hive.ql.metadata.HiveException;
+import org.apache.hadoop.io.IOUtils;
+
+/**
+ * Operation process of describing a database.
+ */
+public class DescDatabaseOperation extends DDLOperation {
+  @Override
+  public int execute() throws HiveException {
+DataOutputStream outStream = getOutputStream(ddlDesc.getResFile());
+try {
+  Database database = db.getDatabase(ddlDesc.getDatabaseName());
+  if (database == null) {
+throw new HiveException(ErrorMsg.DATABASE_NOT_EXISTS, 
ddlDesc.getDatabaseName());
+  }
+
+  Map<String, String> params = null;
+  if (ddlDesc.isExt()) {
+params = database.getParameters();
+  }
+
+  // If this is a q-test, let's order the params map (lexicographically) by
+  // key. This is to get consistent param ordering between Java7 and Java8.
+  if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_IN_TEST) && params 
!= null) {
+params = new TreeMap<>(params);
+  }
+
+  String location = database.getLocationUri();
+  if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_IN_TEST)) {
+location = "location/in/test";
+  }
+
+  PrincipalType ownerType = database.getOwnerType();
+  formatter.showDatabaseDescription(outStream, database.getName(), 
database.getDescription(), location,
+  database.getOwnerName(), (null == ownerType) ? null : 
ownerType.name(), params);
 
 Review comment:
   This may be valid, but it is outside the scope of this change. Let's break 
DDLTask up first; we can create a separate JIRA for this later.
 



Issue Time Tracking
---

Worklog Id: (was: 202612)
Time Spent: 5h  (was: 4h 50m)

> Break up DDLTask 1 - extract Database related operations
> 
>
> Key: HIVE-21292
> URL: https://issues.apache.org/jira/browse/HIVE-21292
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21292.01.patch, HIVE-21292.02.patch, 
> HIVE-21292.03.patch, HIVE-21292.04.patch, HIVE-21292.05.patch
>
>  Time Spent: 5h
>  Remaining Estimate: 0h
>
> DDLTask is a huge class, more than 5000 lines long. The related DDLWork is 
> also a huge class, which has a field for each DDL operation it supports. The 
> goal is to refactor these so that everything is split into more manageable 
> classes under the package org.apache.hadoop.hive.ql.exec.ddl:
>  * have 

[jira] [Work logged] (HIVE-21292) Break up DDLTask 1 - extract Database related operations

2019-02-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21292?focusedWorklogId=202616&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-202616
 ]

ASF GitHub Bot logged work on HIVE-21292:
-

Author: ASF GitHub Bot
Created on: 22/Feb/19 12:47
Start Date: 22/Feb/19 12:47
Worklog Time Spent: 10m 
  Work Description: miklosgergely commented on pull request #543: 
HIVE-21292: Break up DDLTask 1 - extract Database related operations
URL: https://github.com/apache/hive/pull/543#discussion_r259328210
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/ddl/database/ShowDatabasesOperation.java
 ##
 @@ -0,0 +1,58 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.exec.ddl.database;
+
+import java.io.DataOutputStream;
+import java.util.List;
+
+import org.apache.hadoop.hive.ql.ErrorMsg;
+import org.apache.hadoop.hive.ql.exec.ddl.DDLOperation;
+import org.apache.hadoop.hive.ql.metadata.HiveException;
+import org.apache.hadoop.io.IOUtils;
+
+/**
+ * Operation process of showing the databases.
+ */
+public class ShowDatabasesOperation extends DDLOperation {
+  @Override
+  public int execute() throws HiveException {
+// get the databases for the desired pattern - populate the output stream
+List<String> databases = null;
+if (ddlDesc.getPattern() != null) {
+  LOG.debug("pattern: {}", ddlDesc.getPattern());
+  databases = db.getDatabasesByPattern(ddlDesc.getPattern());
+} else {
+  databases = db.getAllDatabases();
+}
+
+LOG.info("Found {} database(s) matching the SHOW DATABASES statement.", 
databases.size());
+
+// write the results in the file
+DataOutputStream outStream = getOutputStream(ddlDesc.getResFile());
+try {
+  formatter.showDatabases(outStream, databases);
 
 Review comment:
   As far as I can see, only one formatter is used for all purposes in the old 
DDLTask class. What do you mean by finding a new home?
 



Issue Time Tracking
---

Worklog Id: (was: 202616)
Time Spent: 5h 20m  (was: 5h 10m)

> Break up DDLTask 1 - extract Database related operations
> 
>
> Key: HIVE-21292
> URL: https://issues.apache.org/jira/browse/HIVE-21292
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21292.01.patch, HIVE-21292.02.patch, 
> HIVE-21292.03.patch, HIVE-21292.04.patch, HIVE-21292.05.patch
>
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> DDLTask is a huge class, more than 5000 lines long. The related DDLWork is 
> also a huge class, which has a field for each DDL operation it supports. The 
> goal is to refactor these so that everything is split into more manageable 
> classes under the package org.apache.hadoop.hive.ql.exec.ddl:
>  * have a separate class for each operation
>  * have a package for each operation group (database ddl, table ddl, etc), so 
> the number of classes under a package is more manageable
>  * make all the requests (DDLDesc subclasses) immutable
>  * DDLTask should be agnostic to the actual operations
>  * right now let's ignore the issue of having some operations handled by 
> DDLTask which are not actual DDL operations (lock, unlock, desc...)
> In the interim, while there are two DDLTask and DDLWork classes in the 
> code base, the new ones in the new package are called DDLTask2 and DDLWork2, 
> which avoids fully qualified class names where both the old and the new 
> classes are in use.
> Step #1: 

[jira] [Work logged] (HIVE-21292) Break up DDLTask 1 - extract Database related operations

2019-02-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21292?focusedWorklogId=202610&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-202610
 ]

ASF GitHub Bot logged work on HIVE-21292:
-

Author: ASF GitHub Bot
Created on: 22/Feb/19 12:42
Start Date: 22/Feb/19 12:42
Worklog Time Spent: 10m 
  Work Description: miklosgergely commented on pull request #543: 
HIVE-21292: Break up DDLTask 1 - extract Database related operations
URL: https://github.com/apache/hive/pull/543#discussion_r259326980
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/ddl/database/DescDatabaseOperation.java
 ##
 @@ -0,0 +1,73 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.exec.ddl.database;
+
+import java.io.DataOutputStream;
+import java.util.Map;
+import java.util.TreeMap;
+
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.api.Database;
+import org.apache.hadoop.hive.metastore.api.PrincipalType;
+import org.apache.hadoop.hive.ql.ErrorMsg;
+import org.apache.hadoop.hive.ql.exec.ddl.DDLOperation;
+import org.apache.hadoop.hive.ql.metadata.HiveException;
+import org.apache.hadoop.io.IOUtils;
+
+/**
+ * Operation process of describing a database.
+ */
+public class DescDatabaseOperation extends DDLOperation {
+  @Override
+  public int execute() throws HiveException {
+DataOutputStream outStream = getOutputStream(ddlDesc.getResFile());
+try {
+  Database database = db.getDatabase(ddlDesc.getDatabaseName());
+  if (database == null) {
+throw new HiveException(ErrorMsg.DATABASE_NOT_EXISTS, 
ddlDesc.getDatabaseName());
+  }
+
+  Map<String, String> params = null;
+  if (ddlDesc.isExt()) {
+params = database.getParameters();
+  }
+
+  // If this is a q-test, let's order the params map (lexicographically) by
+  // key. This is to get consistent param ordering between Java7 and Java8.
+  if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_IN_TEST) && params 
!= null) {
+params = new TreeMap<>(params);
+  }
+
+  String location = database.getLocationUri();
+  if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_IN_TEST)) {
+location = "location/in/test";
 
 Review comment:
   This may be valid, but it is outside the scope of this change. Let's break 
DDLTask up first; we can create a separate JIRA for this later.
 



Issue Time Tracking
---

Worklog Id: (was: 202610)
Time Spent: 4h 40m  (was: 4.5h)

> Break up DDLTask 1 - extract Database related operations
> 
>
> Key: HIVE-21292
> URL: https://issues.apache.org/jira/browse/HIVE-21292
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21292.01.patch, HIVE-21292.02.patch, 
> HIVE-21292.03.patch, HIVE-21292.04.patch, HIVE-21292.05.patch
>
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> DDLTask is a huge class, more than 5000 lines long. The related DDLWork is 
> also a huge class, which has a field for each DDL operation it supports. The 
> goal is to refactor these so that everything is split into more manageable 
> classes under the package org.apache.hadoop.hive.ql.exec.ddl:
>  * have a separate class for each operation
>  * have a package for each operation group (database ddl, table ddl, etc), so 
> the number of classes under a package is more manageable
>  * make all the requests (DDLDesc subclasses) immutable
>  * DDLTask should be agnostic 

[jira] [Work logged] (HIVE-21292) Break up DDLTask 1 - extract Database related operations

2019-02-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21292?focusedWorklogId=202609&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-202609
 ]

ASF GitHub Bot logged work on HIVE-21292:
-

Author: ASF GitHub Bot
Created on: 22/Feb/19 12:41
Start Date: 22/Feb/19 12:41
Worklog Time Spent: 10m 
  Work Description: miklosgergely commented on pull request #543: 
HIVE-21292: Break up DDLTask 1 - extract Database related operations
URL: https://github.com/apache/hive/pull/543#discussion_r259326675
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/ddl/DDLTask2.java
 ##
 @@ -0,0 +1,107 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.exec.ddl;
+
+import java.io.Serializable;
+import java.util.HashMap;
+import java.util.Map;
+
+import org.apache.hadoop.hive.ql.CompilationOpContext;
+import org.apache.hadoop.hive.ql.DriverContext;
+import org.apache.hadoop.hive.ql.QueryPlan;
+import org.apache.hadoop.hive.ql.QueryState;
+import org.apache.hadoop.hive.ql.exec.Task;
+import org.apache.hadoop.hive.ql.metadata.Hive;
+import org.apache.hadoop.hive.ql.parse.ExplainConfiguration.AnalyzeState;
+import org.apache.hadoop.hive.ql.plan.api.StageType;
+
+/**
+ * DDLTask implementation.
+**/
+public class DDLTask2 extends Task implements Serializable {
+  private static final long serialVersionUID = 1L;
+
+  private static final Map, Class>> DESC_TO_OPARATION =
+  new HashMap<>();
+  public static void registerOperator(Class descClass,
+  Class> operationClass) {
+DESC_TO_OPARATION.put(descClass, operationClass);
+  }
+
+  @Override
+  public boolean requireLock() {
+return this.work != null && this.work.getNeedLock();
+  }
+
+  @Override
+  public void initialize(QueryState queryState, QueryPlan queryPlan, 
DriverContext ctx,
+  CompilationOpContext opContext) {
+super.initialize(queryState, queryPlan, ctx, opContext);
+  }
+
+  @Override
+  public int execute(DriverContext driverContext) {
+if (driverContext.getCtx().getExplainAnalyze() == AnalyzeState.RUNNING) {
+  return 0;
+}
+
+try {
+  Hive db = Hive.get(conf);
+  DDLDesc ddlDesc = work.getDDLDesc();
+
+  if (DESC_TO_OPARATION.containsKey(ddlDesc.getClass())) {
+DDLOperation ddlOperation = 
DESC_TO_OPARATION.get(ddlDesc.getClass()).newInstance();
+ddlOperation.init(db, conf, driverContext, ddlDesc);
 
 Review comment:
   conf and db are definitely not redundant; they are used by many operations. 
We may consider creating a context object, though.
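
A context object of the kind mentioned in this comment could look roughly like the following. This is a minimal sketch under stated assumptions: Context, its fields and the init(Context) signature are hypothetical and do not appear in the patch, and a plain Map and String stand in for HiveConf and the Hive db handle.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; none of these types exist in the patch under review. */
final class OperationContextSketch {
  /** Immutable bundle of the shared objects an operation may need. */
  static final class Context {
    private final Map<String, String> conf; // stand-in for HiveConf
    private final String db;                // stand-in for the Hive db handle

    Context(Map<String, String> conf, String db) {
      // Defensive copy, so later changes to the caller's map are not visible here.
      this.conf = Collections.unmodifiableMap(new HashMap<>(conf));
      this.db = db;
    }

    Map<String, String> getConf() {
      return conf;
    }

    String getDb() {
      return db;
    }
  }

  /** Operations receive one context instead of separate db and conf arguments. */
  abstract static class Operation {
    protected Context context;

    void init(Context context) {
      this.context = context;
    }

    abstract int execute();
  }

  /** Example operation: returns 0 when the (assumed) "in.test" flag is "true", 1 otherwise. */
  static final class ExampleOperation extends Operation {
    @Override
    int execute() {
      return "true".equals(context.getConf().get("in.test")) ? 0 : 1;
    }
  }
}
```

The argument list of init then stays stable when a new shared object is needed; only the context class grows.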
 



Issue Time Tracking
---

Worklog Id: (was: 202609)
Time Spent: 4.5h  (was: 4h 20m)

> Break up DDLTask 1 - extract Database related operations
> 
>
> Key: HIVE-21292
> URL: https://issues.apache.org/jira/browse/HIVE-21292
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21292.01.patch, HIVE-21292.02.patch, 
> HIVE-21292.03.patch, HIVE-21292.04.patch, HIVE-21292.05.patch
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> DDLTask is a huge class, more than 5000 lines long. The related DDLWork is 
> also a huge class, which has a field for each DDL operation it supports. The 
> goal is to refactor these so that everything is split into more manageable 
> classes under the package org.apache.hadoop.hive.ql.exec.ddl:
>  * have a separate class for each operation
>  * have a package for each operation group (database ddl, 

[jira] [Work logged] (HIVE-21292) Break up DDLTask 1 - extract Database related operations

2019-02-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21292?focusedWorklogId=202608&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-202608
 ]

ASF GitHub Bot logged work on HIVE-21292:
-

Author: ASF GitHub Bot
Created on: 22/Feb/19 12:40
Start Date: 22/Feb/19 12:40
Worklog Time Spent: 10m 
  Work Description: miklosgergely commented on pull request #543: 
HIVE-21292: Break up DDLTask 1 - extract Database related operations
URL: https://github.com/apache/hive/pull/543#discussion_r259326436
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/ddl/DDLTask2.java
 ##
 @@ -0,0 +1,107 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.exec.ddl;
+
+import java.io.Serializable;
+import java.util.HashMap;
+import java.util.Map;
+
+import org.apache.hadoop.hive.ql.CompilationOpContext;
+import org.apache.hadoop.hive.ql.DriverContext;
+import org.apache.hadoop.hive.ql.QueryPlan;
+import org.apache.hadoop.hive.ql.QueryState;
+import org.apache.hadoop.hive.ql.exec.Task;
+import org.apache.hadoop.hive.ql.metadata.Hive;
+import org.apache.hadoop.hive.ql.parse.ExplainConfiguration.AnalyzeState;
+import org.apache.hadoop.hive.ql.plan.api.StageType;
+
+/**
+ * DDLTask implementation.
+**/
+public class DDLTask2 extends Task implements Serializable {
+  private static final long serialVersionUID = 1L;
+
+  private static final Map, Class>> DESC_TO_OPARATION =
+  new HashMap<>();
+  public static void registerOperator(Class descClass,
+  Class> operationClass) {
+DESC_TO_OPARATION.put(descClass, operationClass);
+  }
+
+  @Override
+  public boolean requireLock() {
+return this.work != null && this.work.getNeedLock();
+  }
+
+  @Override
+  public void initialize(QueryState queryState, QueryPlan queryPlan, 
DriverContext ctx,
+  CompilationOpContext opContext) {
+super.initialize(queryState, queryPlan, ctx, opContext);
+  }
+
+  @Override
+  public int execute(DriverContext driverContext) {
+if (driverContext.getCtx().getExplainAnalyze() == AnalyzeState.RUNNING) {
+  return 0;
+}
+
+try {
+  Hive db = Hive.get(conf);
+  DDLDesc ddlDesc = work.getDDLDesc();
+
+  if (DESC_TO_OPARATION.containsKey(ddlDesc.getClass())) {
+DDLOperation ddlOperation = 
DESC_TO_OPARATION.get(ddlDesc.getClass()).newInstance();
 
 Review comment:
   Using a constructor would mean that every concrete Operation class has to 
declare a constructor just to pass these arguments to the superclass, which 
would add a lot of extra code that does nothing specific.
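
The trade-off described here can be sketched as follows. This is a simplified illustration, not the patch: Operation, DescribeOperation and run are hypothetical names, and a single String argument stands in for the db, conf, driverContext and ddlDesc arguments. The sketch uses getDeclaredConstructor().newInstance() because Class.newInstance() has been deprecated since Java 9.

```java
/** Hypothetical sketch of no-arg construction followed by init(). */
final class InitSketch {
  abstract static class Operation {
    protected String databaseName;

    // Shared state is set once here, so subclasses need no constructor of their own.
    void init(String databaseName) {
      this.databaseName = databaseName;
    }

    abstract String execute();
  }

  /** Declares no constructor; the implicit no-arg one is enough. */
  static final class DescribeOperation extends Operation {
    @Override
    String execute() {
      return "describe " + databaseName;
    }
  }

  /** Creates the operation reflectively, initializes it, then runs it. */
  static String run(Class<? extends Operation> operationClass, String databaseName) {
    try {
      Operation operation = operationClass.getDeclaredConstructor().newInstance();
      operation.init(databaseName);
      return operation.execute();
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Cannot instantiate " + operationClass.getName(), e);
    }
  }
}
```

The cost of this design is that the fields cannot be final and an operation is unusable until init has been called; a constructor-based variant would avoid that at the price of one pass-through constructor per subclass.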
 



Issue Time Tracking
---

Worklog Id: (was: 202608)
Time Spent: 4h 20m  (was: 4h 10m)

> Break up DDLTask 1 - extract Database related operations
> 
>
> Key: HIVE-21292
> URL: https://issues.apache.org/jira/browse/HIVE-21292
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21292.01.patch, HIVE-21292.02.patch, 
> HIVE-21292.03.patch, HIVE-21292.04.patch, HIVE-21292.05.patch
>
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> DDLTask is a huge class, more than 5000 lines long. The related DDLWork is 
> also a huge class, which has a field for each DDL operation it supports. The 
> goal is to refactor these so that everything is split into more manageable 
> classes under the package org.apache.hadoop.hive.ql.exec.ddl:
>  * have a separate class for each operation
>  * have a package for each 

[jira] [Work logged] (HIVE-21292) Break up DDLTask 1 - extract Database related operations

2019-02-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21292?focusedWorklogId=202604&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-202604
 ]

ASF GitHub Bot logged work on HIVE-21292:
-

Author: ASF GitHub Bot
Created on: 22/Feb/19 12:38
Start Date: 22/Feb/19 12:38
Worklog Time Spent: 10m 
  Work Description: miklosgergely commented on pull request #543: 
HIVE-21292: Break up DDLTask 1 - extract Database related operations
URL: https://github.com/apache/hive/pull/543#discussion_r259325961
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/ddl/database/AlterDatabaseDesc.java
 ##
 @@ -16,112 +16,100 @@
  * limitations under the License.
  */
 
-package org.apache.hadoop.hive.ql.plan;
+package org.apache.hadoop.hive.ql.exec.ddl.database;
 
 import java.io.Serializable;
 import java.util.Map;
 
+import org.apache.hadoop.hive.ql.exec.ddl.DDLDesc;
+import org.apache.hadoop.hive.ql.exec.ddl.DDLTask2;
 import org.apache.hadoop.hive.ql.parse.ReplicationSpec;
+import org.apache.hadoop.hive.ql.plan.Explain;
 import org.apache.hadoop.hive.ql.plan.Explain.Level;
+import org.apache.hadoop.hive.ql.plan.PrincipalDesc;
 
 /**
- * AlterDatabaseDesc.
- *
+ * DDL task description for ALTER DATABASE commands.
  */
 @Explain(displayName = "Alter Database", explainLevels = { Level.USER, 
Level.DEFAULT, Level.EXTENDED })
 public class AlterDatabaseDesc extends DDLDesc implements Serializable {
-
   private static final long serialVersionUID = 1L;
 
-  // Only altering the database property and owner is currently supported
-  public static enum ALTER_DB_TYPES {
-ALTER_PROPERTY, ALTER_OWNER, ALTER_LOCATION
-  };
-
-  ALTER_DB_TYPES alterType;
-  String databaseName;
-  Map dbProperties;
-  PrincipalDesc ownerPrincipal;
-  ReplicationSpec replicationSpec;
-  String location;
+  static {
+DDLTask2.registerOperator(AlterDatabaseDesc.class, 
AlterDatabaseOperation.class);
+  }
 
   /**
-   * For serialization only.
+   * Supported type of alter db commands.
+   * Only altering the database property and owner is currently supported
*/
-  public AlterDatabaseDesc() {
-  }
+  public enum AlterDbType {
+ALTER_PROPERTY, ALTER_OWNER, ALTER_LOCATION
+  };
 
-  public AlterDatabaseDesc(String databaseName, Map dbProps,
 
 Review comment:
   This would result in longer classes, with everything indented one level 
deeper. Later I'd also consider moving the related parts of 
DDLSemanticAnalyzer into a third class next to these classes; putting that 
into the same class as well would make the container classes even longer. I 
believe having separate classes for Desc, Operation and Analyzer is cleaner.
 



Issue Time Tracking
---

Worklog Id: (was: 202604)
Time Spent: 4h 10m  (was: 4h)

> Break up DDLTask 1 - extract Database related operations
> 
>
> Key: HIVE-21292
> URL: https://issues.apache.org/jira/browse/HIVE-21292
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21292.01.patch, HIVE-21292.02.patch, 
> HIVE-21292.03.patch, HIVE-21292.04.patch, HIVE-21292.05.patch
>
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> DDLTask is a huge class, more than 5000 lines long. The related DDLWork is 
> also a huge class, which has a field for each DDL operation it supports. The 
> goal is to refactor these so that everything is split into more manageable 
> classes under the package org.apache.hadoop.hive.ql.exec.ddl:
>  * have a separate class for each operation
>  * have a package for each operation group (database ddl, table ddl, etc), so 
> the number of classes under a package is more manageable
>  * make all the requests (DDLDesc subclasses) immutable
>  * DDLTask should be agnostic to the actual operations
>  * right now let's ignore the issue of having some operations handled by 
> DDLTask which are not actual DDL operations (lock, unlock, desc...)
> In the interim, while there are two DDLTask and DDLWork classes in the 
> code base, the new ones in the new package are called DDLTask2 and DDLWork2, 
> which avoids fully qualified class names where both the old and the new 
> classes are in use.
> Step #1: extract all the database related operations 

[jira] [Work logged] (HIVE-21292) Break up DDLTask 1 - extract Database related operations

2019-02-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21292?focusedWorklogId=202602&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-202602
 ]

ASF GitHub Bot logged work on HIVE-21292:
-

Author: ASF GitHub Bot
Created on: 22/Feb/19 12:36
Start Date: 22/Feb/19 12:36
Worklog Time Spent: 10m 
  Work Description: miklosgergely commented on pull request #543: 
HIVE-21292: Break up DDLTask 1 - extract Database related operations
URL: https://github.com/apache/hive/pull/543#discussion_r259324982
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/ddl/database/SwitchDatabaseDesc.java
 ##
 @@ -16,37 +16,34 @@
  * limitations under the License.
  */
 
-package org.apache.hadoop.hive.ql.plan;
+package org.apache.hadoop.hive.ql.exec.ddl.database;
 
 import java.io.Serializable;
-import org.apache.hadoop.hive.ql.plan.Explain.Level;
 
+import org.apache.hadoop.hive.ql.exec.ddl.DDLDesc;
+import org.apache.hadoop.hive.ql.exec.ddl.DDLTask2;
+import org.apache.hadoop.hive.ql.plan.Explain;
+import org.apache.hadoop.hive.ql.plan.Explain.Level;
 
 /**
- * SwitchDatabaseDesc.
- *
+ * DDL task description for SWITCH DATABASE commands.
 
 Review comment:
   You are right! Let's modify the comment for now, then change it everywhere 
to USE database in a separate JIRA, ok?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 202602)
Time Spent: 4h  (was: 3h 50m)

> Break up DDLTask 1 - extract Database related operations
> 
>
> Key: HIVE-21292
> URL: https://issues.apache.org/jira/browse/HIVE-21292
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21292.01.patch, HIVE-21292.02.patch, 
> HIVE-21292.03.patch, HIVE-21292.04.patch, HIVE-21292.05.patch
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> DDLTask is a huge class, more than 5000 lines long. The related DDLWork is 
> also a huge class, which has a field for each DDL operation it supports. The 
> goal is to refactor these so that everything is cut into more manageable 
> classes under the package org.apache.hadoop.hive.ql.exec.ddl:
>  * have a separate class for each operation
>  * have a package for each operation group (database ddl, table ddl, etc), so 
> the number of classes under a package is more manageable
>  * make all the requests (DDLDesc subclasses) immutable
>  * DDLTask should be agnostic to the actual operations
>  * right now let's ignore the issue of having some operations handled by 
> DDLTask which are not actual DDL operations (lock, unlock, desc...)
> In the interim, while there are two DDLTask and DDLWork classes in the 
> code base, the new ones in the new package are called DDLTask2 and DDLWork2, 
> thus avoiding the usage of fully qualified class names where both the old and 
> the new classes are in use.
> Step #1: extract all the database related operations from the old DDLTask, 
> and move them under the new package. Also create the new internal framework.
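
The design goals listed in the description (one class per operation, immutable request objects, a task that is agnostic to the concrete operations) can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: every name ending in `Sketch` is hypothetical, and the `lastCreated` field exists only so the example has an observable effect.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the common ancestor of all request objects.
abstract class DescSketch { }

// Hypothetical stand-in for the common ancestor of all operations.
abstract class OperationSketch<T extends DescSketch> {
  protected T desc;

  @SuppressWarnings("unchecked")
  void init(DescSketch desc) {
    this.desc = (T) desc;
  }

  // Returns 0 on success, non-zero on failure.
  abstract int execute();
}

// Immutable request: the only field is final and set in the constructor.
final class CreateDbDescSketch extends DescSketch {
  private final String name;

  CreateDbDescSketch(String name) {
    this.name = name;
  }

  String getName() {
    return name;
  }
}

// One class per operation.
final class CreateDbOperationSketch extends OperationSketch<CreateDbDescSketch> {
  static String lastCreated; // demo-only side effect

  @Override
  int execute() {
    lastCreated = desc.getName();
    return 0;
  }
}

// The task knows only the registry, never the concrete operation classes.
final class TaskSketch {
  private static final Map<Class<? extends DescSketch>, Class<? extends OperationSketch<?>>> REGISTRY =
      new HashMap<>();

  static void register(Class<? extends DescSketch> descClass,
      Class<? extends OperationSketch<?>> operationClass) {
    REGISTRY.put(descClass, operationClass);
  }

  static int execute(DescSketch desc) {
    Class<? extends OperationSketch<?>> operationClass = REGISTRY.get(desc.getClass());
    if (operationClass == null) {
      return 1; // unknown request type
    }
    try {
      OperationSketch<?> operation = operationClass.getDeclaredConstructor().newInstance();
      operation.init(desc);
      return operation.execute();
    } catch (ReflectiveOperationException e) {
      return 1; // the operation could not be instantiated
    }
  }
}
```

With this shape, adding a new operation means adding a request class, an operation class, and one `register` call; the task itself does not change.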



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (HIVE-21292) Break up DDLTask 1 - extract Database related operations

2019-02-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21292?focusedWorklogId=202600&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-202600
 ]

ASF GitHub Bot logged work on HIVE-21292:
-

Author: ASF GitHub Bot
Created on: 22/Feb/19 12:33
Start Date: 22/Feb/19 12:33
Worklog Time Spent: 10m 
  Work Description: miklosgergely commented on pull request #543: 
HIVE-21292: Break up DDLTask 1 - extract Database related operations
URL: https://github.com/apache/hive/pull/543#discussion_r259324164
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/ddl/DDLOperation.java
 ##
 @@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.exec.ddl;
+
+import java.io.DataOutputStream;
+
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.ql.DriverContext;
+import org.apache.hadoop.hive.ql.metadata.Hive;
+import org.apache.hadoop.hive.ql.metadata.HiveException;
+import org.apache.hadoop.hive.ql.metadata.formatting.MetaDataFormatUtils;
+import org.apache.hadoop.hive.ql.metadata.formatting.MetaDataFormatter;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Abstract ancestor class of all DDL Operation classes.
+ */
+public abstract class DDLOperation<T extends DDLDesc> {
+  protected static final Logger LOG = 
LoggerFactory.getLogger("hive.ql.exec.DDLTask");
+
+  protected Hive db;
 
 Review comment:
   I have considered it, but as they are populated via the init method they 
cannot be final. Passing them via the constructor would help, but then every 
operation class would need a constructor with the same arguments just to 
pass these variables along.
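
The trade-off described in this reply can be illustrated with a short sketch. It assumes a hypothetical context object that bundles the collaborators, which is not something the patch or the review proposes; all names ending in `Sketch` are invented for illustration. Constructor injection makes the shared fields final, at the cost of a pass-through constructor in each concrete operation.

```java
// Hypothetical bundle of the collaborators that init(...) receives in the patch.
final class OperationContextSketch {
  private final String db;   // stands in for the Hive handle
  private final String conf; // stands in for HiveConf

  OperationContextSketch(String db, String conf) {
    this.db = db;
    this.conf = conf;
  }

  String getDb() {
    return db;
  }

  String getConf() {
    return conf;
  }
}

// Constructor injection allows the shared fields to be final.
abstract class CtorOperationSketch<T> {
  protected final OperationContextSketch context;
  protected final T desc;

  protected CtorOperationSketch(OperationContextSketch context, T desc) {
    this.context = context;
    this.desc = desc;
  }

  abstract int execute();
}

// Every concrete operation has to repeat this pass-through constructor.
final class DropDbCtorSketch extends CtorOperationSketch<String> {
  DropDbCtorSketch(OperationContextSketch context, String desc) {
    super(context, desc);
  }

  @Override
  int execute() {
    // 0 signals success, 1 signals failure (here: an empty database name).
    return desc.isEmpty() ? 1 : 0;
  }
}
```

Bundling the collaborators into one object keeps the repeated constructor down to two parameters, but it does not remove the repetition itself, which is the cost the reply points at.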
 



Issue Time Tracking
---

Worklog Id: (was: 202600)
Time Spent: 3h 50m  (was: 3h 40m)


[jira] [Work logged] (HIVE-21292) Break up DDLTask 1 - extract Database related operations

2019-02-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21292?focusedWorklogId=202588&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-202588
 ]

ASF GitHub Bot logged work on HIVE-21292:
-

Author: ASF GitHub Bot
Created on: 22/Feb/19 12:18
Start Date: 22/Feb/19 12:18
Worklog Time Spent: 10m 
  Work Description: miklosgergely commented on pull request #543: 
HIVE-21292: Break up DDLTask 1 - extract Database related operations
URL: https://github.com/apache/hive/pull/543#discussion_r259318829
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/ddl/database/CreateDatabaseOperation.java
 ##
 @@ -0,0 +1,70 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.exec.ddl.database;
+
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.api.AlreadyExistsException;
+import org.apache.hadoop.hive.metastore.api.Database;
+import org.apache.hadoop.hive.metastore.api.PrincipalType;
+import org.apache.hadoop.hive.ql.ErrorMsg;
+import org.apache.hadoop.hive.ql.exec.Utilities;
+import org.apache.hadoop.hive.ql.exec.ddl.DDLOperation;
+import org.apache.hadoop.hive.ql.metadata.HiveException;
+import org.apache.hadoop.hive.ql.session.SessionState;
+
+/**
+ * Operation process of creating a database.
+ */
+public class CreateDatabaseOperation extends DDLOperation {
+  private static final String DATABASE_PATH_SUFFIX = ".db";
+
+  @Override
+  public int execute() throws HiveException {
+Database database = new Database();
+database.setName(ddlDesc.getName());
+database.setDescription(ddlDesc.getComment());
+database.setLocationUri(ddlDesc.getLocationUri());
+database.setParameters(ddlDesc.getDatabaseProperties());
+database.setOwnerName(SessionState.getUserFromAuthenticator());
+database.setOwnerType(PrincipalType.USER);
+
+try {
+  makeLocationQualified(database);
+  db.createDatabase(database, ddlDesc.getIfNotExists());
+} catch (AlreadyExistsException ex) {
+  //it would be better if AlreadyExistsException had an errorCode field
+  throw new HiveException(ex, ErrorMsg.DATABASE_ALREADY_EXISTS, 
ddlDesc.getName());
+}
+
+return 0;
+  }
+
+  private void makeLocationQualified(Database database) throws HiveException {
 
 Review comment:
   Valid, but not in the scope of this change. Let's cut it into pieces first, 
then modify it later in a different JIRA.
 



Issue Time Tracking
---

Worklog Id: (was: 202588)
Time Spent: 3h 40m  (was: 3.5h)


[jira] [Work logged] (HIVE-21292) Break up DDLTask 1 - extract Database related operations

2019-02-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21292?focusedWorklogId=202584&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-202584
 ]

ASF GitHub Bot logged work on HIVE-21292:
-

Author: ASF GitHub Bot
Created on: 22/Feb/19 12:15
Start Date: 22/Feb/19 12:15
Worklog Time Spent: 10m 
  Work Description: miklosgergely commented on pull request #543: 
HIVE-21292: Break up DDLTask 1 - extract Database related operations
URL: https://github.com/apache/hive/pull/543#discussion_r259317832
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/ddl/DDLDesc.java
 ##
 @@ -0,0 +1,27 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.exec.ddl;
+
+import java.io.Serializable;
+
+/**
+ * Abstract ancestor of all DDL operation descriptors.
+ */
+public abstract class DDLDesc implements Serializable {
 
 Review comment:
   True, will modify.
 



Issue Time Tracking
---

Worklog Id: (was: 202584)
Time Spent: 3.5h  (was: 3h 20m)



[jira] [Work logged] (HIVE-21292) Break up DDLTask 1 - extract Database related operations

2019-02-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21292?focusedWorklogId=202583&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-202583
 ]

ASF GitHub Bot logged work on HIVE-21292:
-

Author: ASF GitHub Bot
Created on: 22/Feb/19 12:14
Start Date: 22/Feb/19 12:14
Worklog Time Spent: 10m 
  Work Description: miklosgergely commented on pull request #543: 
HIVE-21292: Break up DDLTask 1 - extract Database related operations
URL: https://github.com/apache/hive/pull/543#discussion_r259317746
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/ddl/DDLOperation.java
 ##
 @@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.exec.ddl;
+
+import java.io.DataOutputStream;
+
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.ql.DriverContext;
+import org.apache.hadoop.hive.ql.metadata.Hive;
+import org.apache.hadoop.hive.ql.metadata.HiveException;
+import org.apache.hadoop.hive.ql.metadata.formatting.MetaDataFormatUtils;
+import org.apache.hadoop.hive.ql.metadata.formatting.MetaDataFormatter;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Abstract ancestor class of all DDL Operation classes.
+ */
+public abstract class DDLOperation<T extends DDLDesc> {
+  protected static final Logger LOG = 
LoggerFactory.getLogger("hive.ql.exec.DDLTask");
+
+  protected Hive db;
+  protected HiveConf conf;
+  protected DriverContext driverContext;
+  protected T ddlDesc;
+  protected MetaDataFormatter formatter;
+
+  @SuppressWarnings("unchecked")
+  public void init(Hive db, HiveConf conf, DriverContext driverContext, 
DDLDesc ddlDesc) {
+this.db = db;
+this.conf = conf;
+this.driverContext = driverContext;
+this.ddlDesc = (T)ddlDesc;
 
 Review comment:
   Will fix. Where can I obtain this Hive formatter?
 



Issue Time Tracking
---

Worklog Id: (was: 202583)
Time Spent: 3h 20m  (was: 3h 10m)


[jira] [Work logged] (HIVE-21292) Break up DDLTask 1 - extract Database related operations

2019-02-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21292?focusedWorklogId=202582&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-202582
 ]

ASF GitHub Bot logged work on HIVE-21292:
-

Author: ASF GitHub Bot
Created on: 22/Feb/19 12:14
Start Date: 22/Feb/19 12:14
Worklog Time Spent: 10m 
  Work Description: miklosgergely commented on pull request #543: 
HIVE-21292: Break up DDLTask 1 - extract Database related operations
URL: https://github.com/apache/hive/pull/543#discussion_r259317522
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/ddl/DDLTask2.java
 ##
 @@ -0,0 +1,107 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.exec.ddl;
+
+import java.io.Serializable;
+import java.util.HashMap;
+import java.util.Map;
+
+import org.apache.hadoop.hive.ql.CompilationOpContext;
+import org.apache.hadoop.hive.ql.DriverContext;
+import org.apache.hadoop.hive.ql.QueryPlan;
+import org.apache.hadoop.hive.ql.QueryState;
+import org.apache.hadoop.hive.ql.exec.Task;
+import org.apache.hadoop.hive.ql.metadata.Hive;
+import org.apache.hadoop.hive.ql.parse.ExplainConfiguration.AnalyzeState;
+import org.apache.hadoop.hive.ql.plan.api.StageType;
+
+/**
+ * DDLTask implementation.
+**/
+public class DDLTask2 extends Task implements Serializable {
+  private static final long serialVersionUID = 1L;
+
+  private static final Map<Class<? extends DDLDesc>, Class<? extends DDLOperation<?>>> DESC_TO_OPARATION =
+  new HashMap<>();
+  public static void registerOperator(Class<? extends DDLDesc> descClass,
+  Class<? extends DDLOperation<?>> operationClass) {
+DESC_TO_OPARATION.put(descClass, operationClass);
+  }
+
+  @Override
+  public boolean requireLock() {
+return this.work != null && this.work.getNeedLock();
+  }
+
+  @Override
+  public void initialize(QueryState queryState, QueryPlan queryPlan, 
DriverContext ctx,
+  CompilationOpContext opContext) {
+super.initialize(queryState, queryPlan, ctx, opContext);
+  }
+
+  @Override
+  public int execute(DriverContext driverContext) {
+if (driverContext.getCtx().getExplainAnalyze() == AnalyzeState.RUNNING) {
+  return 0;
+}
+
+try {
+  Hive db = Hive.get(conf);
+  DDLDesc ddlDesc = work.getDDLDesc();
+
+  if (DESC_TO_OPARATION.containsKey(ddlDesc.getClass())) {
+DDLOperation ddlOperation = 
DESC_TO_OPARATION.get(ddlDesc.getClass()).newInstance();
+ddlOperation.init(db, conf, driverContext, ddlDesc);
+return ddlOperation.execute();
+  } else {
+throw new IllegalArgumentException("Unknown DDL request: " + 
ddlDesc.getClass());
+  }
+} catch (Throwable e) {
+  failed(e);
+  return 1;
 
 Review comment:
   This is a valid suggestion, but not in the scope of this refactoring. First, 
let's cut it into pieces; later we can create a JIRA for more informative 
error codes.
 



Issue Time Tracking
---

Worklog Id: (was: 202582)
Time Spent: 3h 10m  (was: 3h)


[jira] [Work logged] (HIVE-21292) Break up DDLTask 1 - extract Database related operations

2019-02-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21292?focusedWorklogId=202581&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-202581
 ]

ASF GitHub Bot logged work on HIVE-21292:
-

Author: ASF GitHub Bot
Created on: 22/Feb/19 12:12
Start Date: 22/Feb/19 12:12
Worklog Time Spent: 10m 
  Work Description: miklosgergely commented on pull request #543: 
HIVE-21292: Break up DDLTask 1 - extract Database related operations
URL: https://github.com/apache/hive/pull/543#discussion_r259317265
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/ddl/database/AlterDatabaseDesc.java
 ##
 @@ -16,112 +16,100 @@
  * limitations under the License.
  */
 
-package org.apache.hadoop.hive.ql.plan;
+package org.apache.hadoop.hive.ql.exec.ddl.database;
 
 import java.io.Serializable;
 import java.util.Map;
 
+import org.apache.hadoop.hive.ql.exec.ddl.DDLDesc;
+import org.apache.hadoop.hive.ql.exec.ddl.DDLTask2;
 import org.apache.hadoop.hive.ql.parse.ReplicationSpec;
+import org.apache.hadoop.hive.ql.plan.Explain;
 import org.apache.hadoop.hive.ql.plan.Explain.Level;
+import org.apache.hadoop.hive.ql.plan.PrincipalDesc;
 
 /**
- * AlterDatabaseDesc.
- *
+ * DDL task description for ALTER DATABASE commands.
  */
 @Explain(displayName = "Alter Database", explainLevels = { Level.USER, 
Level.DEFAULT, Level.EXTENDED })
 public class AlterDatabaseDesc extends DDLDesc implements Serializable {
-
   private static final long serialVersionUID = 1L;
 
-  // Only altering the database property and owner is currently supported
-  public static enum ALTER_DB_TYPES {
-ALTER_PROPERTY, ALTER_OWNER, ALTER_LOCATION
-  };
-
-  ALTER_DB_TYPES alterType;
-  String databaseName;
-  Map dbProperties;
-  PrincipalDesc ownerPrincipal;
-  ReplicationSpec replicationSpec;
-  String location;
+  static {
+DDLTask2.registerOperator(AlterDatabaseDesc.class, 
AlterDatabaseOperation.class);
 
 Review comment:
   Yes, but it is not a problem: the operation does not need to be registered 
as long as no XXXDesc object has been created.
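
This reply appears to rely on Java's class-initialization rule: a static initializer runs when the class is first initialized, for example on the first instance creation, and not merely because the class exists on the classpath. A minimal sketch of that behaviour, with hypothetical names that are not part of the patch:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical registry, standing in for the map kept by the task.
final class RegistrySketch {
  static final List<String> REGISTERED = new ArrayList<>();

  private RegistrySketch() { }
}

// Hypothetical descriptor that registers itself in a static initializer.
class LazyDescSketch {
  static {
    // Runs once, when the class is first initialized.
    RegistrySketch.REGISTERED.add("LazyDescSketch");
  }
}
```

Until the first `new LazyDescSketch()` the registry stays empty, and afterwards it holds the entry. Since a task can only be handed a descriptor that has already been constructed, the registration is in place by the time it is looked up.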
 



Issue Time Tracking
---

Worklog Id: (was: 202581)
Time Spent: 3h  (was: 2h 50m)



[jira] [Work logged] (HIVE-21292) Break up DDLTask 1 - extract Database related operations

2019-02-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21292?focusedWorklogId=202579&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-202579
 ]

ASF GitHub Bot logged work on HIVE-21292:
-

Author: ASF GitHub Bot
Created on: 22/Feb/19 12:10
Start Date: 22/Feb/19 12:10
Worklog Time Spent: 10m 
  Work Description: miklosgergely commented on pull request #543: 
HIVE-21292: Break up DDLTask 1 - extract Database related operations
URL: https://github.com/apache/hive/pull/543#discussion_r259316684
 
 

 ##
 File path: 
hcatalog/core/src/main/java/org/apache/hive/hcatalog/cli/SemanticAnalysis/HCatSemanticAnalyzerBase.java
 ##
 @@ -122,6 +128,13 @@ protected void 
authorizeDDLWork(HiveSemanticAnalyzerHookContext context,
   Hive hive, DDLWork work) throws HiveException {
   }
 
+  /**
+   * Authorized the given DDLWork2. It is only for the interim time while 
DDLTask and DDLWork are being refactored.
+   */
+  protected void authorizeDDLWork2(HiveSemanticAnalyzerHookContext context,
 
 Review comment:
   In the long term, yes. For now it must stay, as some work is done by the 
original DDLWork and some by DDLWork2. When all the refactorings are done, 
there is going to be only one authorizeDDLWork.
 



Issue Time Tracking
---

Worklog Id: (was: 202579)
Time Spent: 2h 40m  (was: 2.5h)

> Break up DDLTask 1 - extract Database related operations
> 
>
> Key: HIVE-21292
> URL: https://issues.apache.org/jira/browse/HIVE-21292
> Project: Hive
> Issue Type: Improvement
> Components: Hive
> Affects Versions: 3.1.1
> Reporter: Miklos Gergely
> Assignee: Miklos Gergely
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21292.01.patch, HIVE-21292.02.patch, 
> HIVE-21292.03.patch, HIVE-21292.04.patch, HIVE-21292.05.patch
>
> Time Spent: 2h 40m
> Remaining Estimate: 0h
>
> DDLTask is a huge class, more than 5000 lines long. The related DDLWork is 
> also a huge class, with a field for each DDL operation it supports. The 
> goal is to refactor these so that everything is split into more manageable 
> classes under the package org.apache.hadoop.hive.ql.exec.ddl:
>  * have a separate class for each operation
>  * have a package for each operation group (database ddl, table ddl, etc.), so 
> the number of classes in a package is more manageable
>  * make all the requests (DDLDesc subclasses) immutable
>  * DDLTask should be agnostic to the actual operations
>  * for now, ignore the issue that some operations handled by DDLTask are not 
> actual DDL operations (lock, unlock, desc...)
>
> In the interim, while there are two DDLTask and DDLWork classes in the code 
> base, the new ones in the new package are called DDLTask2 and DDLWork2, which 
> avoids fully qualified class names where both the old and the new classes 
> are in use.
>
> Step #1: extract all the database-related operations from the old DDLTask 
> and move them under the new package. Also create the new internal framework.
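
The goals listed in the description (one class per operation, immutable request objects, a task that is agnostic to concrete operations) can be sketched roughly as follows. This is only an illustrative sketch, not the code from the patch: the names DdlDesc, DdlOperation, DdlDispatcher, CreateDatabaseDesc and DropDatabaseDesc are hypothetical stand-ins, and the registry-based dispatch is an assumption about one way to keep the task independent of the operations.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical stand-in for an immutable DDL request (the role DDLDesc plays in the description). */
interface DdlDesc {
}

/** One immutable request class per operation: all fields are final and set in the constructor. */
final class CreateDatabaseDesc implements DdlDesc {
  private final String name;

  CreateDatabaseDesc(String name) {
    this.name = name;
  }

  String getName() {
    return name;
  }
}

/** A second, equally small request class for another operation. */
final class DropDatabaseDesc implements DdlDesc {
  private final String name;

  DropDatabaseDesc(String name) {
    this.name = name;
  }

  String getName() {
    return name;
  }
}

/** One operation object per request; returns 0 on success, following the int return of execute(). */
interface DdlOperation {
  int execute();
}

/** Hypothetical dispatcher: it only maps request types to factories and knows no concrete operation. */
final class DdlDispatcher {
  private final Map<Class<? extends DdlDesc>, Function<DdlDesc, DdlOperation>> registry = new HashMap<>();

  /** Registers the factory that builds the operation for one request type. */
  <T extends DdlDesc> void register(Class<T> descClass, Function<T, DdlOperation> factory) {
    registry.put(descClass, desc -> factory.apply(descClass.cast(desc)));
  }

  /** Looks up the operation for the request's exact class and runs it. */
  int run(DdlDesc desc) {
    Function<DdlDesc, DdlOperation> factory = registry.get(desc.getClass());
    if (factory == null) {
      throw new IllegalArgumentException("No operation registered for " + desc.getClass().getName());
    }
    return factory.apply(desc).execute();
  }
}
```

With this shape, adding a new operation means adding one request class, one operation, and one register call; the dispatcher itself does not change.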



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (HIVE-21292) Break up DDLTask 1 - extract Database related operations

2019-02-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21292?focusedWorklogId=202580&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-202580
 ]

ASF GitHub Bot logged work on HIVE-21292:
-

Author: ASF GitHub Bot
Created on: 22/Feb/19 12:10
Start Date: 22/Feb/19 12:10
Worklog Time Spent: 10m 
  Work Description: miklosgergely commented on pull request #543: 
HIVE-21292: Break up DDLTask 1 - extract Database related operations
URL: https://github.com/apache/hive/pull/543#discussion_r259316748
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/ddl/DDLWork2.java
 ##
 @@ -0,0 +1,73 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.exec.ddl;
+
+import org.apache.hadoop.hive.ql.hooks.ReadEntity;
+import org.apache.hadoop.hive.ql.hooks.WriteEntity;
+
+import java.io.Serializable;
+
+import java.util.Set;
+
+/**
+ * A DDL operation.
+ */
+public class DDLWork2 implements Serializable {
+  private static final long serialVersionUID = 1L;
+
+  private DDLDesc ddlDesc;
 
 Review comment:
   sure!
 



Issue Time Tracking
---

Worklog Id: (was: 202580)
Time Spent: 2h 50m  (was: 2h 40m)



[jira] [Work logged] (HIVE-21292) Break up DDLTask 1 - extract Database related operations

2019-02-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21292?focusedWorklogId=202576&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-202576
 ]

ASF GitHub Bot logged work on HIVE-21292:
-

Author: ASF GitHub Bot
Created on: 22/Feb/19 12:09
Start Date: 22/Feb/19 12:09
Worklog Time Spent: 10m 
  Work Description: miklosgergely commented on pull request #543: 
HIVE-21292: Break up DDLTask 1 - extract Database related operations
URL: https://github.com/apache/hive/pull/543#discussion_r259316348
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/ddl/DDLOperation.java
 ##
 @@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.exec.ddl;
+
+import java.io.DataOutputStream;
+
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.ql.DriverContext;
+import org.apache.hadoop.hive.ql.metadata.Hive;
+import org.apache.hadoop.hive.ql.metadata.HiveException;
+import org.apache.hadoop.hive.ql.metadata.formatting.MetaDataFormatUtils;
+import org.apache.hadoop.hive.ql.metadata.formatting.MetaDataFormatter;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Abstract ancestor class of all DDL Operation classes.
+ */
+public abstract class DDLOperation<T extends DDLDesc> {
+  protected static final Logger LOG = LoggerFactory.getLogger("hive.ql.exec.DDLTask");
+
+  protected Hive db;
+  protected HiveConf conf;
+  protected DriverContext driverContext;
+  protected T ddlDesc;
+  protected MetaDataFormatter formatter;
+
+  @SuppressWarnings("unchecked")
+  public void init(Hive db, HiveConf conf, DriverContext driverContext, DDLDesc ddlDesc) {
+    this.db = db;
+    this.conf = conf;
+    this.driverContext = driverContext;
+    this.ddlDesc = (T)ddlDesc;
+    this.formatter = MetaDataFormatUtils.getFormatter(conf);
+  }
+
+  public abstract int execute() throws HiveException;
+
+  protected DataOutputStream getOutputStream(String resFile) throws HiveException {
 
 Review comment:
   ok, I'll remove it
 



Issue Time Tracking
---

Worklog Id: (was: 202576)
Time Spent: 2.5h  (was: 2h 20m)


[jira] [Work logged] (HIVE-21197) Hive replication can add duplicate data during migration to a target with hive.strict.managed.tables enabled

2019-02-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21197?focusedWorklogId=202572&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-202572
 ]

ASF GitHub Bot logged work on HIVE-21197:
-

Author: ASF GitHub Bot
Created on: 22/Feb/19 12:00
Start Date: 22/Feb/19 12:00
Worklog Time Spent: 10m 
  Work Description: maheshk114 commented on pull request #541: HIVE-21197 : 
Hive Replication can add duplicate data during migration to a target with 
hive.strict.managed.tables enabled
URL: https://github.com/apache/hive/pull/541#discussion_r259310998
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/plan/ReplSetFirstIncLoadFlagDesc.java
 ##
 @@ -0,0 +1,64 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.plan;
+import org.apache.hadoop.hive.ql.plan.Explain.Level;
+
+import java.io.Serializable;
+
+/**
+ * ReplSetFirstIncLoadFlagDesc.
+ *
+ */
+@Explain(displayName = "Set First Incr Load Flag", explainLevels = { Level.USER, Level.DEFAULT, Level.EXTENDED })
+public class ReplSetFirstIncLoadFlagDesc extends DDLDesc implements Serializable {
+
+  private static final long serialVersionUID = 1L;
+  String databaseName;
+  String tableName;
+
+  /**
+   * For serialization only.
+   */
+  public ReplSetFirstIncLoadFlagDesc() {
+  }
+
+  public ReplSetFirstIncLoadFlagDesc(String databaseName, String tableName) {
 
 Review comment:
   done
 



Issue Time Tracking
---

Worklog Id: (was: 202572)
Time Spent: 6h 10m  (was: 6h)

> Hive replication can add duplicate data during migration to a target with 
> hive.strict.managed.tables enabled
> 
>
> Key: HIVE-21197
> URL: https://issues.apache.org/jira/browse/HIVE-21197
> Project: Hive
> Issue Type: Task
> Components: repl
> Reporter: mahesh kumar behera
> Assignee: mahesh kumar behera
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-21197.01.patch, HIVE-21197.02.patch
>
> Time Spent: 6h 10m
> Remaining Estimate: 0h
>
> During the bootstrap phase, the files copied to the target may include files 
> created by events that are not part of the bootstrap. This is because 
> bootstrap first gets the last event id and then the file list. If events are 
> added in between, bootstrap also includes the files created by those events. 
> The same files are copied again during the first incremental replication 
> just after the bootstrap. In the normal scenario, the duplicate copy does not 
> cause any issue, as Hive allows the use of the target database only after the 
> first incremental. In the case of migration, however, the files at source and 
> target are copied to different locations (based on the write id at the target), 
> so this may lead to duplicate data at the target. This can be avoided by 
> checking for duplicate files at load time. The check is needed only for the 
> first incremental, and the search can be limited to the bootstrap directory 
> (with write id 1). If the file is already present, the copy is skipped.
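
The load-time check proposed in the description could look roughly like the sketch below. It is a simplified illustration rather than the patch itself: the class DuplicateCopyFilter and its method are hypothetical, files are represented only by their names, and the set of names already present in the bootstrap directory is assumed to have been listed beforehand.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical helper that decides which files the first incremental load still has to copy. */
final class DuplicateCopyFilter {
  private DuplicateCopyFilter() {
  }

  /**
   * Returns the incoming file names that should be copied.
   *
   * @param incoming         file names referenced by the incremental events
   * @param bootstrapFiles   file names already present in the bootstrap directory (write id 1)
   * @param firstIncremental true only for the first incremental load after bootstrap
   */
  static List<String> filesToCopy(List<String> incoming, Set<String> bootstrapFiles, boolean firstIncremental) {
    List<String> result = new ArrayList<>();
    for (String name : incoming) {
      // Only the first incremental load can overlap with bootstrap, so later loads skip the check.
      if (firstIncremental && bootstrapFiles.contains(name)) {
        continue;
      }
      result.add(name);
    }
    return result;
  }
}
```

Restricting the lookup to the first incremental load keeps the extra cost to a single pass over the bootstrap directory listing.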





[jira] [Work logged] (HIVE-21197) Hive replication can add duplicate data during migration to a target with hive.strict.managed.tables enabled

2019-02-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21197?focusedWorklogId=202571&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-202571
 ]

ASF GitHub Bot logged work on HIVE-21197:
-

Author: ASF GitHub Bot
Created on: 22/Feb/19 11:59
Start Date: 22/Feb/19 11:59
Worklog Time Spent: 10m 
  Work Description: maheshk114 commented on pull request #541: HIVE-21197 : 
Hive Replication can add duplicate data during migration to a target with 
hive.strict.managed.tables enabled
URL: https://github.com/apache/hive/pull/541#discussion_r259314021
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/ReplCopyTask.java
 ##
 @@ -133,8 +161,11 @@ protected int execute(DriverContext driverContext) {
 return 6;
   }
   long writeId = Long.parseLong(writeIdString);
-  toPath = new Path(toPath, AcidUtils.baseOrDeltaSubdir(work.getDeleteDestIfExist(), writeId, writeId,
-  driverContext.getCtx().getHiveTxnManager().getStmtIdAndIncrement()));
+  // Set stmt id 0 for bootstrap load as the directory needs to be searched during incremental load to avoid any
+  // duplicate copy from the source. Check HIVE-21197 for more detail.
+  int stmtId = (writeId == ReplUtils.REPL_BOOTSTRAP_MIGRATION_BASE_WRITE_ID) ? 0 :
 
 Review comment:
   done
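
To illustrate why the diff above pins the statement id to 0 for the bootstrap load: with a fixed statement id, the bootstrap directory name depends only on the write id, so the incremental load can compute it and search it. The sketch below is an assumption-laden illustration, not Hive code: the class BootstrapDeltaDir is hypothetical, the constant value 1 is taken from the issue description ("with write id 1"), and the zero-padded delta_<writeId>_<writeId>_<stmtId> naming is assumed here rather than taken from the patch.

```java
/** Hypothetical helper showing how a fixed statement id yields a predictable directory name. */
final class BootstrapDeltaDir {
  /** Assumed bootstrap write id, based on "with write id 1" in the issue description. */
  static final long BOOTSTRAP_WRITE_ID = 1L;

  private BootstrapDeltaDir() {
  }

  /** Uses statement id 0 for the bootstrap write id, otherwise the next id handed out by the caller. */
  static int chooseStmtId(long writeId, int nextStmtId) {
    return writeId == BOOTSTRAP_WRITE_ID ? 0 : nextStmtId;
  }

  /** Builds a delta directory name; the padding widths are an assumption of this sketch. */
  static String deltaDirName(long writeId, int stmtId) {
    return String.format("delta_%07d_%07d_%04d", writeId, writeId, stmtId);
  }
}
```

Under these assumptions the bootstrap directory is always delta_0000001_0000001_0000, regardless of how many statements ran before the load.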
 



Issue Time Tracking
---

Worklog Id: (was: 202571)
Time Spent: 6h  (was: 5h 50m)



[jira] [Commented] (HIVE-21293) Fix ambiguity in grammar warnings at compilation time (II)

2019-02-22 Thread Laszlo Bodor (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16775063#comment-16775063
 ] 

Laszlo Bodor commented on HIVE-21293:
-

[~jcamachorodriguez]: uploaded 01.patch. I modified IdentifiersParser to make 
unknown a keyword, since that solved the problem easily and, according to the 
standard, it is a reserved keyword. cc: [~kgyrtkirk]
The rest of the patch is a cleanup of the TestSQL11ReservedKeyWordsNegative 
monster.

> Fix ambiguity in grammar warnings at compilation time (II)
> --
>
> Key: HIVE-21293
> URL: https://issues.apache.org/jira/browse/HIVE-21293
> Project: Hive
> Issue Type: Bug
> Components: Parser
> Affects Versions: 4.0.0
> Reporter: Jesus Camacho Rodriguez
> Assignee: Laszlo Bodor
> Priority: Major
> Attachments: HIVE-21293.01.patch
>
>
> These are the warnings at compilation time:
> {code}
> warning(200): IdentifiersParser.g:424:5:
> Decision can match input such as "KW_UNKNOWN" using multiple alternatives: 1, 
> 10
> As a result, alternative(s) 10 were disabled for that input
> {code}
> This means that multiple parser rules can match certain query text, possibly 
> leading to unexpected errors at parsing time.





[jira] [Updated] (HIVE-21293) Fix ambiguity in grammar warnings at compilation time (II)

2019-02-22 Thread Laszlo Bodor (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor updated HIVE-21293:

Status: Patch Available  (was: Open)



[jira] [Updated] (HIVE-21293) Fix ambiguity in grammar warnings at compilation time (II)

2019-02-22 Thread Laszlo Bodor (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor updated HIVE-21293:

Attachment: HIVE-21293.01.patch



[jira] [Work logged] (HIVE-21197) Hive replication can add duplicate data during migration to a target with hive.strict.managed.tables enabled

2019-02-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21197?focusedWorklogId=202566&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-202566
 ]

ASF GitHub Bot logged work on HIVE-21197:
-

Author: ASF GitHub Bot
Created on: 22/Feb/19 11:54
Start Date: 22/Feb/19 11:54
Worklog Time Spent: 10m 
  Work Description: maheshk114 commented on pull request #541: HIVE-21197 : 
Hive Replication can add duplicate data during migration to a target with 
hive.strict.managed.tables enabled
URL: https://github.com/apache/hive/pull/541#discussion_r259312676
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/LoadDatabase.java
 ##
 @@ -158,6 +159,15 @@ private boolean isDbEmpty(String dbName) throws HiveException {
 // Add the checkpoint key to the Database binding it to current dump directory.
 // So, if retry using same dump, we shall skip Database object update.
 parameters.put(ReplUtils.REPL_CHECKPOINT_KEY, dumpDirectory);
+
+if (needSetIncFlag) {
 
 Review comment:
   In the alter case, the flag need not be set.
 



Issue Time Tracking
---

Worklog Id: (was: 202566)
Time Spent: 5h 50m  (was: 5h 40m)


