[jira] [Work started] (HIVE-24609) Fix ArrayIndexOutOfBoundsException when execute full outer join
[ https://issues.apache.org/jira/browse/HIVE-24609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-24609 started by jufeng li. > Fix ArrayIndexOutOfBoundsException when execute full outer join > --- > > Key: HIVE-24609 > URL: https://issues.apache.org/jira/browse/HIVE-24609 > Project: Hive > Issue Type: Bug > Components: Query Planning >Affects Versions: 3.1.0 >Reporter: jufeng li >Assignee: jufeng li >Priority: Blocker > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 10m > Remaining Estimate: 0h > > Here is my Hive SQL: > {code:java} > select > .. > from A > full outer join B on A.id = B.id > {code} > > It cannot be executed; I get an ArrayIndexOutOfBoundsException. Debugging > HiveServer2, I found that in some situations compiling SQL that contains a full outer > join throws an ArrayIndexOutOfBoundsException. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24609) Fix ArrayIndexOutOfBoundsException when execute full outer join
[ https://issues.apache.org/jira/browse/HIVE-24609?focusedWorklogId=533313&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-533313 ] ASF GitHub Bot logged work on HIVE-24609: - Author: ASF GitHub Bot Created on: 09/Jan/21 01:39 Start Date: 09/Jan/21 01:39 Worklog Time Spent: 10m Work Description: lijufeng2016 opened a new pull request #1844: URL: https://github.com/apache/hive/pull/1844 What changes were proposed in this pull request? Add a bounds check ("judgement") for the position. Why are the changes needed? To avoid an ArrayIndexOutOfBoundsException. Does this PR introduce any user-facing change? No. How was this patch tested? It's OK. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 533313) Remaining Estimate: 0h Time Spent: 10m
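The PR description above says only that it adds "a judgement for position". As a hedged illustration of that general technique — guarding a positional array access with a bounds check so an ArrayIndexOutOfBoundsException cannot escape — here is a minimal Java sketch. The class and method names are invented for this example and are not Hive's actual join code:

```java
/** Illustrative only: guard a positional array lookup with a bounds check. */
public class PositionGuard {

    /**
     * Returns the element at {@code pos}, or {@code fallback} when the position
     * is outside the array, instead of throwing ArrayIndexOutOfBoundsException.
     */
    public static int getOrDefault(int[] values, int pos, int fallback) {
        if (values == null || pos < 0 || pos >= values.length) {
            return fallback;   // the "judgement for position": refuse bad indexes
        }
        return values[pos];
    }

    public static void main(String[] args) {
        int[] joinKeys = {10, 20, 30};
        System.out.println(getOrDefault(joinKeys, 1, -1));  // in range: 20
        System.out.println(getOrDefault(joinKeys, 7, -1));  // out of range: -1, no exception
    }
}
```

Whether Hive's actual fix returns a default, skips the row, or raises a clearer error is not stated in the PR summary; this only shows the shape of a defensive position check.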
[jira] [Updated] (HIVE-24609) Fix ArrayIndexOutOfBoundsException when execute full outer join
[ https://issues.apache.org/jira/browse/HIVE-24609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-24609: -- Labels: pull-request-available (was: )
[jira] [Updated] (HIVE-24609) Fix ArrayIndexOutOfBoundsException when execute full outer join
[ https://issues.apache.org/jira/browse/HIVE-24609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jufeng li updated HIVE-24609: - Target Version/s: 4.0.0
[jira] [Updated] (HIVE-24609) Fix ArrayIndexOutOfBoundsException when execute full outer join
[ https://issues.apache.org/jira/browse/HIVE-24609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jufeng li updated HIVE-24609: - Description: here is my hive-sql: {code:java} select .. from A full outer join B on A.id = B.id {code} It can not be execute,I got an ArrayIndexOutOfBoundsException.Then I debug HiveServer2,found when compile sql in some situation and contains full outer join,there is an ArrayIndexOutOfBoundsException. was: here is my hive-sql: ```sql select .. from A full outer join B on A.id = B.id ``` It can not be execute,I got an ArrayIndexOutOfBoundsException.Then I debug HiveServer2,found when compile sql in some situation and contains full outer join,there is an
[jira] [Updated] (HIVE-24609) Fix ArrayIndexOutOfBoundsException when execute full outer join
[ https://issues.apache.org/jira/browse/HIVE-24609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jufeng li updated HIVE-24609: - Description: here is my hive-sql: ```sql select .. from A full outer join B on A.id = B.id ``` It can not be execute,I got an ArrayIndexOutOfBoundsException.Then I debug HiveServer2,found when compile sql in some situation and contains full outer join,there is an was: here is my hive-sql: ```sql select .. from A full outer join B on A.id = B.id ``` It can not be execute,I got
[jira] [Work logged] (HIVE-24109) Load partitions in batches for managed tables in the bootstrap phase
[ https://issues.apache.org/jira/browse/HIVE-24109?focusedWorklogId=533301=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-533301 ] ASF GitHub Bot logged work on HIVE-24109: - Author: ASF GitHub Bot Created on: 09/Jan/21 01:13 Start Date: 09/Jan/21 01:13 Worklog Time Spent: 10m Work Description: github-actions[bot] closed pull request #1529: URL: https://github.com/apache/hive/pull/1529 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 533301) Time Spent: 1h 50m (was: 1h 40m) > Load partitions in batches for managed tables in the bootstrap phase > > > Key: HIVE-24109 > URL: https://issues.apache.org/jira/browse/HIVE-24109 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > Labels: pull-request-available > Attachments: HIVE-24109.01.patch, HIVE-24109.02.patch, > HIVE-24109.03.patch, HIVE-24109.04.patch, Replication Performance > Improvements.pdf > > Time Spent: 1h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
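HIVE-24109's goal above — loading partitions in batches rather than all at once during the bootstrap phase — reduces to chunking a large list. A minimal generic sketch of that chunking step, with illustrative names rather than the actual Hive replication code:

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: split a large partition list into fixed-size batches. */
public class BatchLoader {

    /** Returns views of {@code items} in consecutive batches of at most {@code batchSize}. */
    public static <T> List<List<T>> toBatches(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            // subList is a view; copy it if the source list may change underneath
            batches.add(items.subList(i, Math.min(i + batchSize, items.size())));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> partitions = new ArrayList<>();
        for (int i = 0; i < 10; i++) partitions.add(i);
        System.out.println(toBatches(partitions, 4).size()); // 3 batches: 4 + 4 + 2
    }
}
```

Each batch would then be loaded (and committed) independently, bounding memory and metastore-call size per step.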
[jira] [Work logged] (HIVE-16352) Ability to skip or repair out of sync blocks with HIVE at runtime
[ https://issues.apache.org/jira/browse/HIVE-16352?focusedWorklogId=533302&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-533302 ] ASF GitHub Bot logged work on HIVE-16352: - Author: ASF GitHub Bot Created on: 09/Jan/21 01:13 Start Date: 09/Jan/21 01:13 Worklog Time Spent: 10m Work Description: github-actions[bot] commented on pull request #1436: URL: https://github.com/apache/hive/pull/1436#issuecomment-757069854 This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Feel free to reach out on the d...@hive.apache.org list if the patch is in need of reviews. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 533302) Time Spent: 1h 20m (was: 1h 10m) > Ability to skip or repair out of sync blocks with HIVE at runtime > - > > Key: HIVE-16352 > URL: https://issues.apache.org/jira/browse/HIVE-16352 > Project: Hive > Issue Type: New Feature > Components: Avro, File Formats, Reader >Affects Versions: 3.1.2 >Reporter: Navdeep Poonia >Assignee: gabrywu >Priority: Major > Labels: pull-request-available > Time Spent: 1h 20m > Remaining Estimate: 0h > > When a file is corrupted, Hive raises the error java.io.IOException: Invalid > sync!. > Can we have some functionality to skip or repair such blocks at runtime, to > make Avro more resilient to data corruption? > Error: java.io.IOException: java.io.IOException: java.io.IOException: While > processing file > s3n:///navdeepp/warehouse/avro_test/354dc34474404f4bbc0d8013fc8e6e4b_42. > java.io.IOException: Invalid sync!
> at > org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121) > at > org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77) > at > org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:334) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24273) grouping key is case sensitive
[ https://issues.apache.org/jira/browse/HIVE-24273?focusedWorklogId=533300&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-533300 ] ASF GitHub Bot logged work on HIVE-24273: - Author: ASF GitHub Bot Created on: 09/Jan/21 01:13 Start Date: 09/Jan/21 01:13 Worklog Time Spent: 10m Work Description: github-actions[bot] commented on pull request #1579: URL: https://github.com/apache/hive/pull/1579#issuecomment-757069838 This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Feel free to reach out on the d...@hive.apache.org list if the patch is in need of reviews. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 533300) Time Spent: 0.5h (was: 20m) > grouping key is case sensitive > --- > > Key: HIVE-24273 > URL: https://issues.apache.org/jira/browse/HIVE-24273 > Project: Hive > Issue Type: Bug >Affects Versions: 3.1.0, 4.0.0 >Reporter: zhaolong >Assignee: zhaolong >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: 0001-fix-HIVE-24273-grouping-key-is-case-sensitive.patch > > Time Spent: 0.5h > Remaining Estimate: 0h > > The grouping key is case sensitive; the following steps reproduce it: > 1. create table testaa(name string, age int); > 2. select GROUPING(name) from testaa group by name; -- This message was sent by Atlassian Jira (v8.3.4#803005)
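The HIVE-24273 report above suggests the group-by key is matched against the GROUPING() argument case-sensitively. As a hedged sketch of the general remedy — resolving column names case-insensitively, since Hive identifiers are case-insensitive — here is a minimal example with invented names, not Hive's actual planner code:

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative only: look up a group-by key ignoring identifier case. */
public class GroupingKeyResolver {

    /**
     * Returns the index of {@code column} among {@code groupByKeys},
     * comparing case-insensitively, or -1 if it is absent.
     */
    public static int indexOf(List<String> groupByKeys, String column) {
        for (int i = 0; i < groupByKeys.size(); i++) {
            // equalsIgnoreCase instead of equals: "Name" and "name" are the same identifier
            if (groupByKeys.get(i).equalsIgnoreCase(column)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("name", "age");
        System.out.println(indexOf(keys, "NAME")); // 0: matches despite different case
    }
}
```

A case-sensitive `equals` here would return -1 for "NAME", which is the kind of failed lookup the bug report describes.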
[jira] [Work logged] (HIVE-24212) Refactor to take advantage of list* optimisations in cloud storage connectors
[ https://issues.apache.org/jira/browse/HIVE-24212?focusedWorklogId=533299=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-533299 ] ASF GitHub Bot logged work on HIVE-24212: - Author: ASF GitHub Bot Created on: 09/Jan/21 01:13 Start Date: 09/Jan/21 01:13 Worklog Time Spent: 10m Work Description: github-actions[bot] commented on pull request #1538: URL: https://github.com/apache/hive/pull/1538#issuecomment-757069843 This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Feel free to reach out on the d...@hive.apache.org list if the patch is in need of reviews. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 533299) Time Spent: 20m (was: 10m) > Refactor to take advantage of list* optimisations in cloud storage connectors > - > > Key: HIVE-24212 > URL: https://issues.apache.org/jira/browse/HIVE-24212 > Project: Hive > Issue Type: Improvement >Reporter: Rajesh Balamohan >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > https://issues.apache.org/jira/browse/HADOOP-17022, > https://issues.apache.org/jira/browse/HADOOP-17281, > https://issues.apache.org/jira/browse/HADOOP-16830 etc help in reducing > number of roundtrips to remote systems in cloud storage. > Creating this ticket to do minor refactoring to take advantage of the above > optimizations. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24327) AtlasServer entity may not be present during first Atlas metadata dump
[ https://issues.apache.org/jira/browse/HIVE-24327?focusedWorklogId=533297=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-533297 ] ASF GitHub Bot logged work on HIVE-24327: - Author: ASF GitHub Bot Created on: 09/Jan/21 01:13 Start Date: 09/Jan/21 01:13 Worklog Time Spent: 10m Work Description: github-actions[bot] commented on pull request #1623: URL: https://github.com/apache/hive/pull/1623#issuecomment-757069832 This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Feel free to reach out on the d...@hive.apache.org list if the patch is in need of reviews. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 533297) Time Spent: 50m (was: 40m) > AtlasServer entity may not be present during first Atlas metadata dump > -- > > Key: HIVE-24327 > URL: https://issues.apache.org/jira/browse/HIVE-24327 > Project: Hive > Issue Type: Bug >Reporter: Pravin Sinha >Assignee: Pravin Sinha >Priority: Major > Labels: pull-request-available > Attachments: HIVE-24327.01.patch, HIVE-24327.02.patch > > Time Spent: 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24330) Automate setting permissions on cmRoot directories.
[ https://issues.apache.org/jira/browse/HIVE-24330?focusedWorklogId=533298=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-533298 ] ASF GitHub Bot logged work on HIVE-24330: - Author: ASF GitHub Bot Created on: 09/Jan/21 01:13 Start Date: 09/Jan/21 01:13 Worklog Time Spent: 10m Work Description: github-actions[bot] commented on pull request #1630: URL: https://github.com/apache/hive/pull/1630#issuecomment-757069827 This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Feel free to reach out on the d...@hive.apache.org list if the patch is in need of reviews. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 533298) Time Spent: 1h (was: 50m) > Automate setting permissions on cmRoot directories. > --- > > Key: HIVE-24330 > URL: https://issues.apache.org/jira/browse/HIVE-24330 > Project: Hive > Issue Type: Bug >Reporter: Arko Sharma >Assignee: Arko Sharma >Priority: Major > Labels: pull-request-available > Attachments: HIVE-24330.01.patch, HIVE-24330.02.patch, > HIVE-24330.03.patch, HIVE-24330.04.patch, HIVE-24330.05.patch, > HIVE-24330.06.patch > > Time Spent: 1h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24609) Fix ArrayIndexOutOfBoundsException when execute full outer join
[ https://issues.apache.org/jira/browse/HIVE-24609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jufeng li updated HIVE-24609: - Description: here is my hive-sql: ```sql select .. from A full outer join B on A.id = B.id ``` It can not be execute,I got
[jira] [Assigned] (HIVE-24609) Fix ArrayIndexOutOfBoundsException when execute full outer join
[ https://issues.apache.org/jira/browse/HIVE-24609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jufeng li reassigned HIVE-24609:
[jira] [Work logged] (HIVE-24559) Fix some spelling issues
[ https://issues.apache.org/jira/browse/HIVE-24559?focusedWorklogId=533280&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-533280 ] ASF GitHub Bot logged work on HIVE-24559: - Author: ASF GitHub Bot Created on: 08/Jan/21 23:27 Start Date: 08/Jan/21 23:27 Worklog Time Spent: 10m Work Description: sunchao merged pull request #1818: URL: https://github.com/apache/hive/pull/1818 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 533280) Time Spent: 1h 40m (was: 1.5h) > Fix some spelling issues > > > Key: HIVE-24559 > URL: https://issues.apache.org/jira/browse/HIVE-24559 > Project: Hive > Issue Type: Improvement >Reporter: RickyMa >Assignee: RickyMa >Priority: Trivial > Labels: pull-request-available > Time Spent: 1h 40m > Remaining Estimate: 0h > > There are some minor typos: > [https://github.com/apache/hive/pull/1805/files] and > [https://github.com/apache/hive/blob/branch-2.3/metastore/src/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java#L858] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-24559) Fix some spelling issues
[ https://issues.apache.org/jira/browse/HIVE-24559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun resolved HIVE-24559. - Fix Version/s: 4.0.0 2.3.8 Resolution: Fixed
[jira] [Assigned] (HIVE-24559) Fix some spelling issues
[ https://issues.apache.org/jira/browse/HIVE-24559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun reassigned HIVE-24559: --- Assignee: RickyMa
[jira] [Updated] (HIVE-24608) Switch back to get_table in HMS client for Hive 2.3.x
[ https://issues.apache.org/jira/browse/HIVE-24608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated HIVE-24608: Summary: Switch back to get_table in HMS client for Hive 2.3.x (was: Switch back to get_table in HMS client) > Switch back to get_table in HMS client for Hive 2.3.x > - > > Key: HIVE-24608 > URL: https://issues.apache.org/jira/browse/HIVE-24608 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 2.3.7 >Reporter: Chao Sun >Priority: Major > > HIVE-15062 introduced a backward-incompatible change by replacing > {{get_table}} with {{get_table_req}}. As a consequence, when an HMS client with > version > 2.3 talks to an HMS with version < 2.3, it gets an error similar to > the following: > {code} > AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: Unable > to fetch table testpartitiondata. Invalid method name: 'get_table_req'; > {code} > Looking at HIVE-15062, {{get_table_req}} was introduced to add a client-side > check for capabilities. However, in branch-2.3 the check is a no-op, since > there are no capabilities yet (the field is assigned null). Therefore, this JIRA > proposes switching back to {{get_table}} in branch-2.3 to fix the > compatibility issue. -- This message was sent by Atlassian Jira (v8.3.4#803005)
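HIVE-24608 above proposes simply reverting branch-2.3 to {{get_table}}. For illustration only, a generic client-side alternative for this kind of Thrift incompatibility is to fall back to the legacy method when the server rejects an unknown one. The sketch below uses plain Suppliers and a stand-in exception rather than the real Thrift client; it is not what the JIRA implements:

```java
import java.util.function.Supplier;

/** Illustrative only: fall back to an older RPC when the newer one is unknown. */
public class CompatLookup {

    /** Stand-in for the server-side "Invalid method name: '...'" rejection. */
    public static class UnknownMethodException extends RuntimeException {
        public UnknownMethodException(String method) {
            super("Invalid method name: '" + method + "'");
        }
    }

    /** Tries the new-style call first; on an unknown-method error, retries the legacy call. */
    public static <T> T callWithFallback(Supplier<T> newCall, Supplier<T> oldCall) {
        try {
            return newCall.get();
        } catch (UnknownMethodException e) {
            return oldCall.get();   // older server: retry with the legacy method
        }
    }

    public static void main(String[] args) {
        String table = callWithFallback(
            () -> { throw new UnknownMethodException("get_table_req"); }, // old HMS rejects
            () -> "testpartitiondata");                                    // legacy path works
        System.out.println(table); // testpartitiondata
    }
}
```

A real implementation would have to map the Thrift-level error to this decision; the JIRA avoids that complexity entirely by reverting the call in branch-2.3.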
[jira] [Work logged] (HIVE-24484) Upgrade Hadoop to 3.3.0
[ https://issues.apache.org/jira/browse/HIVE-24484?focusedWorklogId=533225&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-533225 ] ASF GitHub Bot logged work on HIVE-24484: - Author: ASF GitHub Bot Created on: 08/Jan/21 21:09 Start Date: 08/Jan/21 21:09 Worklog Time Spent: 10m Work Description: vyaslav commented on pull request #1742: URL: https://github.com/apache/hive/pull/1742#issuecomment-756999283 > @wangyum Stuck. > > There are two big issues here: > > 1. Hive integration tests fire up Druid, Kafka, HDFS, LLAP, etc. all in the same JVM and their 3rd party dependencies are all over the place. Using a higher version of a dependency breaks one product, but using a lower version breaks the other. To make this work well, there probably needs to be a way to launch each service in their own JVM class loader. In lieu of that, I've been trying to move the ball closer to the goal post and getting dependencies closer together. > > [apache/druid#10683](https://github.com/apache/druid/pull/10683) > [HIVE-24542](https://issues.apache.org/jira/browse/HIVE-24542) > > 1. In HDFS 3.3.0, Hadoop team introduced `ProtobufRpcEngine2` in addition to `ProtobufRpcEngine` (sigh). Some of the Hive LLAP stuff is using this Hadoop Protobuf RPC engine (`ProtobufRpcEngine`). There's some `static` logic in the protocol engines that prohibits loading both RPC engines into the same JVM at the same time, I'm not sure why. HDFS was migrated to `ProtobufRpcEngine2`. So, again, in the integration tests, when the HDFS mini cluster is loaded, version 2 of the RPC engine is loaded into the JVM. When LLAP is later loaded, it fails to start because version 1 cannot be registered at the same time. Regarding the 1st, I faced the same issues in my PR for the upgrade to 3.1.3 - https://github.com/apache/hive/pull/1638 But regarding the 2nd, I'm curious whether it would be hard to replace `ProtobufRpcEngine` with `ProtobufRpcEngine2` in Hive itself.
As I understand they have upgraded from PB2 to PB3 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 533225) Time Spent: 1h (was: 50m) > Upgrade Hadoop to 3.3.0 > --- > > Key: HIVE-24484 > URL: https://issues.apache.org/jira/browse/HIVE-24484 > Project: Hive > Issue Type: Improvement >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work started] (HIVE-24509) Move show specific codes under DDL and cut MetaDataFormatter classes to pieces
[ https://issues.apache.org/jira/browse/HIVE-24509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-24509 started by Miklos Gergely. - > Move show specific codes under DDL and cut MetaDataFormatter classes to pieces > -- > > Key: HIVE-24509 > URL: https://issues.apache.org/jira/browse/HIVE-24509 > Project: Hive > Issue Type: Sub-task > Components: Hive >Reporter: Miklos Gergely >Assignee: Miklos Gergely >Priority: Major > Labels: pull-request-available > Time Spent: 8h 20m > Remaining Estimate: 0h > > A lot of show-specific code lives under the > org.apache.hadoop.hive.ql.metadata.formatting package and is used only by > these commands. Also, the two MetaDataFormatters (JsonMetaDataFormatter, > TextMetaDataFormatter) try to do everything, while containing a lot > of code duplication. Their functionality should be moved under the > directories of the appropriate show commands. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-24509) Move show specific codes under DDL and cut MetaDataFormatter classes to pieces
[ https://issues.apache.org/jira/browse/HIVE-24509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Miklos Gergely resolved HIVE-24509. --- Resolution: Fixed Pushed to master, thank you [~belugabehr]!
[jira] [Work logged] (HIVE-24509) Move show specific codes under DDL and cut MetaDataFormatter classes to pieces
[ https://issues.apache.org/jira/browse/HIVE-24509?focusedWorklogId=533217&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-533217 ] ASF GitHub Bot logged work on HIVE-24509: - Author: ASF GitHub Bot Created on: 08/Jan/21 20:52 Start Date: 08/Jan/21 20:52 Worklog Time Spent: 10m Work Description: miklosgergely merged pull request #1756: URL: https://github.com/apache/hive/pull/1756 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 533217) Time Spent: 8h 20m (was: 8h 10m)
[jira] [Commented] (HIVE-24608) Switch back to get_table in HMS client
[ https://issues.apache.org/jira/browse/HIVE-24608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17261527#comment-17261527 ] Chao Sun commented on HIVE-24608: - cc [~sershe], [~thejas]
[jira] [Work started] (HIVE-24603) ALTER TABLE RENAME is not modifying the location of managed table
[ https://issues.apache.org/jira/browse/HIVE-24603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-24603 started by Sai Hemanth Gantasala. > ALTER TABLE RENAME is not modifying the location of managed table > - > > Key: HIVE-24603 > URL: https://issues.apache.org/jira/browse/HIVE-24603 > Project: Hive > Issue Type: Bug > Components: Standalone Metastore >Reporter: Sai Hemanth Gantasala >Assignee: Sai Hemanth Gantasala >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > The location of the managed table is not changing when the table is renamed. > This causes correctness issues as well, like the following - > create table abc (id int); > insert into abc values (1); > alter table abc rename to def; > create table abc (id int); // This should be empty > insert into abc values (2); > select * from abc; // now returns 1 and 2 (i.e., the old results as well) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24607) Add JUnit annotation for running tests only if ports are available
[ https://issues.apache.org/jira/browse/HIVE-24607?focusedWorklogId=533157=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-533157 ] ASF GitHub Bot logged work on HIVE-24607: - Author: ASF GitHub Bot Created on: 08/Jan/21 18:10 Start Date: 08/Jan/21 18:10 Worklog Time Spent: 10m Work Description: miklosgergely commented on pull request #1843: URL: https://github.com/apache/hive/pull/1843#issuecomment-756914601 How are you planning to use this annotation? I mean are there going to be tests which will be skipped if the ports are not available? That would mean that patches that are breaking some tests may be merged in the event the test that is broken is just skipped due to the used port! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 533157) Time Spent: 20m (was: 10m) > Add JUnit annotation for running tests only if ports are available > -- > > Key: HIVE-24607 > URL: https://issues.apache.org/jira/browse/HIVE-24607 > Project: Hive > Issue Type: Improvement > Components: Testing Infrastructure >Reporter: Stamatis Zampetakis >Assignee: Stamatis Zampetakis >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > Some unit tests tend to rely on some specific ports assuming that they are > available. Moreover, in some cases it is necessary to create explicitly a > socket bound to some specific port. > The goal of this Jira is to add a JUnit annotation that will run a test only > if the requested ports are available (skip it otherwise). > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24607) Add JUnit annotation for running tests only if ports are available
[ https://issues.apache.org/jira/browse/HIVE-24607?focusedWorklogId=533151=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-533151 ] ASF GitHub Bot logged work on HIVE-24607: - Author: ASF GitHub Bot Created on: 08/Jan/21 17:51 Start Date: 08/Jan/21 17:51 Worklog Time Spent: 10m Work Description: zabetak opened a new pull request #1843: URL: https://github.com/apache/hive/pull/1843 ### What changes were proposed in this pull request? New JUnit annotation for running/skipping tests when ports are available/taken ### Why are the changes needed? Avoid unexpected failures in tests when ports required in tests are taken. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? `mvn test -pl testutils -Dtest=TestEnabledIfPortsAvailableCondition` [WARNING] Tests run: 3, Failures: 0, Errors: 0, Skipped: 1 ``` nc -l 2001 & mvn test -pl testutils -Dtest=TestEnabledIfPortsAvailableCondition ``` [WARNING] Tests run: 3, Failures: 0, Errors: 0, Skipped: 2 ``` nc -l 5050 & mvn test -pl testutils -Dtest=TestEnabledIfPortsAvailableCondition ``` [WARNING] Tests run: 3, Failures: 0, Errors: 0, Skipped: 3 ``` nc -l 2000 & mvn test -pl testutils -Dtest=TestEnabledIfPortsAvailableCondition ``` [WARNING] Tests run: 3, Failures: 0, Errors: 0, Skipped: 3 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 533151) Remaining Estimate: 0h Time Spent: 10m > Add JUnit annotation for running tests only if ports are available > -- > > Key: HIVE-24607 > URL: https://issues.apache.org/jira/browse/HIVE-24607 > Project: Hive > Issue Type: Improvement > Components: Testing Infrastructure >Reporter: Stamatis Zampetakis >Assignee: Stamatis Zampetakis >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Some unit tests tend to rely on some specific ports assuming that they are > available. Moreover, in some cases it is necessary to create explicitly a > socket bound to some specific port. > The goal of this Jira is to add a JUnit annotation that will run a test only > if the requested ports are available (skip it otherwise). > -- This message was sent by Atlassian Jira (v8.3.4#803005)
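The PR above adds a JUnit annotation that runs a test only when the ports it needs are free, and the manual verification occupies ports with `nc -l` to show tests being skipped. The annotation and condition class names are not shown in this excerpt, so the following is only a sketch, under the assumption that the condition probes availability by trying to bind a server socket to the port (the `PortProbe` name is hypothetical, not from the PR):

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;

// Sketch of the probe such a JUnit condition would likely perform: a port is
// "available" if we can bind a listening socket to it on the loopback address.
// PortProbe is an illustrative name, not a class from the Hive PR.
public class PortProbe {
    public static boolean isPortAvailable(int port) {
        try (ServerSocket socket = new ServerSocket(port, 1, InetAddress.getLoopbackAddress())) {
            return true;  // bind succeeded, so nothing is listening on this port
        } catch (IOException e) {
            return false; // bind failed: the port is taken (or binding is not permitted)
        }
    }

    public static void main(String[] args) throws IOException {
        // Occupy an ephemeral port, then verify the probe reports it as taken,
        // mirroring what `nc -l <port>` does in the PR's manual test runs.
        try (ServerSocket taken = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            System.out.println("available=" + isPortAvailable(taken.getLocalPort())); // false
        }
    }
}
```

A JUnit 5 `ExecutionCondition` built on such a probe would return a "disabled" result when any requested port is taken, which is exactly the skip behavior questioned in the review comment above.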
[jira] [Updated] (HIVE-24607) Add JUnit annotation for running tests only if ports are available
[ https://issues.apache.org/jira/browse/HIVE-24607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-24607: -- Labels: pull-request-available (was: ) > Add JUnit annotation for running tests only if ports are available > -- > > Key: HIVE-24607 > URL: https://issues.apache.org/jira/browse/HIVE-24607 > Project: Hive > Issue Type: Improvement > Components: Testing Infrastructure >Reporter: Stamatis Zampetakis >Assignee: Stamatis Zampetakis >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Some unit tests tend to rely on some specific ports assuming that they are > available. Moreover, in some cases it is necessary to create explicitly a > socket bound to some specific port. > The goal of this Jira is to add a JUnit annotation that will run a test only > if the requested ports are available (skip it otherwise). > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-24607) Add JUnit annotation for running tests only if ports are available
[ https://issues.apache.org/jira/browse/HIVE-24607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stamatis Zampetakis reassigned HIVE-24607: -- > Add JUnit annotation for running tests only if ports are available > -- > > Key: HIVE-24607 > URL: https://issues.apache.org/jira/browse/HIVE-24607 > Project: Hive > Issue Type: Improvement > Components: Testing Infrastructure >Reporter: Stamatis Zampetakis >Assignee: Stamatis Zampetakis >Priority: Major > > Some unit tests tend to rely on some specific ports assuming that they are > available. Moreover, in some cases it is necessary to create explicitly a > socket bound to some specific port. > The goal of this Jira is to add a JUnit annotation that will run a test only > if the requested ports are available (skip it otherwise). > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24588) Run tests using specific log4j2 configuration conveniently
[ https://issues.apache.org/jira/browse/HIVE-24588?focusedWorklogId=533090=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-533090 ] ASF GitHub Bot logged work on HIVE-24588: - Author: ASF GitHub Bot Created on: 08/Jan/21 15:57 Start Date: 08/Jan/21 15:57 Worklog Time Spent: 10m Work Description: zabetak opened a new pull request #1842: URL: https://github.com/apache/hive/pull/1842 ### What changes were proposed in this pull request? Add new Junit Jupiter annotation/extension. ### Why are the changes needed? Run easily unit tests using a specific log4j configuration with minimal side-effects. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? `mvn test -pl testutils -Dtest=TestLog4jConfigExtension` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 533090) Remaining Estimate: 0h Time Spent: 10m > Run tests using specific log4j2 configuration conveniently > -- > > Key: HIVE-24588 > URL: https://issues.apache.org/jira/browse/HIVE-24588 > Project: Hive > Issue Type: Improvement >Reporter: Stamatis Zampetakis >Assignee: Stamatis Zampetakis >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > In order to reproduce a problem (e.g., HIVE-24569) or validate that a log4j2 > configuration is working as expected it is necessary to run a test and > explicitly specify which configuration should be used. Moreover, after the > end of the test in question it is desirable to restore the old logging > configuration that was used before launching the test to avoid affecting the > overall logging output. > The goal of this issue is to introduce a convenient & declarative way of > running tests with log4j2 configurations based on Jupiter extensions and > annotations. 
The test could look like the example below: > {code:java} > @Test > @Log4jConfig("test-log4j2.properties") > void testUseExplicitConfig() { > // Do something and assert > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24588) Run tests using specific log4j2 configuration conveniently
[ https://issues.apache.org/jira/browse/HIVE-24588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-24588: -- Labels: pull-request-available (was: ) > Run tests using specific log4j2 configuration conveniently > -- > > Key: HIVE-24588 > URL: https://issues.apache.org/jira/browse/HIVE-24588 > Project: Hive > Issue Type: Improvement >Reporter: Stamatis Zampetakis >Assignee: Stamatis Zampetakis >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > In order to reproduce a problem (e.g., HIVE-24569) or validate that a log4j2 > configuration is working as expected it is necessary to run a test and > explicitly specify which configuration should be used. Moreover, after the > end of the test in question it is desirable to restore the old logging > configuration that was used before launching the test to avoid affecting the > overall logging output. > The goal of this issue is to introduce a convenient & declarative way of > running tests with log4j2 configurations based on Jupiter extensions and > annotations. The test could look like the example below: > {code:java} > @Test > @Log4jConfig("test-log4j2.properties") > void testUseExplicitConfig() { > // Do something and assert > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (HIVE-22318) Java.io.exception:Two readers for
[ https://issues.apache.org/jira/browse/HIVE-22318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17261355#comment-17261355 ] Shubham Sharma edited comment on HIVE-22318 at 1/8/21, 3:26 PM: [~kecheung] Faced this issue today with one of our user, here is the workaround to fix this issue: * Connect with beeline * Run below property in session: {code:java} set hive.fetch.task.conversion=none{code} * Now you'll be able to run select statements over the mentioned table. * Run below statement to create a backup for the table {code:java} create table as select * from ;{code} * Once you have the backup ready, logout from session and check the backup without setting any property (check count and table consistency from data quality perspective) {code:java} select * from ;{code} * Now you can drop problem table and replace with backup table {code:java} drop table ; alter table rename to ;{code} *Note:* To avoid this issue in future, create the table with a bucketing column in DDL was (Author: shubh_init): [~kecheung] Faced this issue today with one of our user, here is the workaround to fix this issue: * Connect with beeline * Run below property in session: {code:java} set hive.fetch.task.conversion=none{code} * Now you'll be able to run select statements over the mentioned table. 
* Run below statement to create a backup for the table {code:java} create table as select * from ;{code} * Once you have the backup ready, logout from session and check the backup without setting any property {code:java} select * from ;{code} * Now you can drop problem table and replace with backup table {code:java} drop table ; alter table rename to ;{code} *Note:* To avoid this issue in future, create the table with a bucketing column in DDL > Java.io.exception:Two readers for > - > > Key: HIVE-22318 > URL: https://issues.apache.org/jira/browse/HIVE-22318 > Project: Hive > Issue Type: Bug > Components: Hive, HiveServer2 >Affects Versions: 3.1.0 >Reporter: max_c >Priority: Major > Attachments: hiveserver2 for exception.log > > > I create a ACID table with ORC format: > > {noformat} > CREATE TABLE `some.TableA`( > >) > ROW FORMAT SERDE >'org.apache.hadoop.hive.ql.io.orc.OrcSerde' > STORED AS INPUTFORMAT >'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' > OUTPUTFORMAT >'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' > TBLPROPERTIES ( >'bucketing_version'='2', >'orc.compress'='SNAPPY', >'transactional'='true', >'transactional_properties'='default'){noformat} > After executing merge into operation: > {noformat} > MERGE INTO some.TableA AS a USING (SELECT vend_no FROM some.TableB UNION ALL > SELECT vend_no FROM some.TableC) AS b ON a.vend_no=b.vend_no WHEN MATCHED > THEN DELETE > {noformat} > the problem happend(when selecting the TableA, the exception happens too): > {noformat} > java.io.IOException: java.io.IOException: Two readers for {originalWriteId: > 4, bucket: 536870912(1.0.0), row: 2434, currentWriteId 25}: new > [key={originalWriteId: 4, bucket: 536870912(1.0.0), row: 2434, currentWriteId > 25}, nextRecord={2, 4, 536870912, 2434, 25, null}, reader=Hive ORC > Reader(hdfs://hdpprod/warehouse/tablespace/managed/hive/some.db/tableA/delete_delta_015_026/bucket_1, > 9223372036854775807)], old [key={originalWriteId: 4, bucket: > 536870912(1.0.0), row: 2434, 
currentWriteId 25}, nextRecord={2, 4, 536870912, > 2434, 25, null}, reader=Hive ORC > Reader(hdfs://hdpprod/warehouse/tablespace/managed/hive/some.db/tableA/delete_delta_015_026/bucket_0{noformat} > Through orc_tools I scanned all the > files (bucket_0, bucket_1, bucket_2) under delete_delta and found that all > rows of the files are the same. I think this will cause the same > key (RecordIdentifier) when scanning bucket_1 after bucket_0, but I > don't know why all the rows are the same in these bucket files. > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-22318) Java.io.exception:Two readers for
[ https://issues.apache.org/jira/browse/HIVE-22318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17261355#comment-17261355 ] Shubham Sharma commented on HIVE-22318: --- [~kecheung] Faced this issue today with one of our user, here is the workaround to fix this issue: # Connect with beeline # Run below property in session: {code:java} set hive.fetch.task.conversion=none{code} # Now you'll be able to run select statements over the mentioned table. # Run below statement to create a backup for the table {code:java} create table as select * from ;{code} # Once you have the backup ready, logout from session and check the backup without setting any property {code:java} select * from ;{code} # Now you can drop problem table and replace with backup table {code:java} drop table ; alter table rename to ;{code} # To avoid this issue in future, create the backup table with a bucketing column in DDL > Java.io.exception:Two readers for > - > > Key: HIVE-22318 > URL: https://issues.apache.org/jira/browse/HIVE-22318 > Project: Hive > Issue Type: Bug > Components: Hive, HiveServer2 >Affects Versions: 3.1.0 >Reporter: max_c >Priority: Major > Attachments: hiveserver2 for exception.log > > > I create a ACID table with ORC format: > > {noformat} > CREATE TABLE `some.TableA`( > >) > ROW FORMAT SERDE >'org.apache.hadoop.hive.ql.io.orc.OrcSerde' > STORED AS INPUTFORMAT >'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' > OUTPUTFORMAT >'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' > TBLPROPERTIES ( >'bucketing_version'='2', >'orc.compress'='SNAPPY', >'transactional'='true', >'transactional_properties'='default'){noformat} > After executing merge into operation: > {noformat} > MERGE INTO some.TableA AS a USING (SELECT vend_no FROM some.TableB UNION ALL > SELECT vend_no FROM some.TableC) AS b ON a.vend_no=b.vend_no WHEN MATCHED > THEN DELETE > {noformat} > the problem happend(when selecting the TableA, the exception happens too): > {noformat} > 
java.io.IOException: java.io.IOException: Two readers for {originalWriteId: > 4, bucket: 536870912(1.0.0), row: 2434, currentWriteId 25}: new > [key={originalWriteId: 4, bucket: 536870912(1.0.0), row: 2434, currentWriteId > 25}, nextRecord={2, 4, 536870912, 2434, 25, null}, reader=Hive ORC > Reader(hdfs://hdpprod/warehouse/tablespace/managed/hive/some.db/tableA/delete_delta_015_026/bucket_1, > 9223372036854775807)], old [key={originalWriteId: 4, bucket: > 536870912(1.0.0), row: 2434, currentWriteId 25}, nextRecord={2, 4, 536870912, > 2434, 25, null}, reader=Hive ORC > Reader(hdfs://hdpprod/warehouse/tablespace/managed/hive/some.db/tableA/delete_delta_015_026/bucket_0{noformat} > Through orc_tools I scanned all the > files (bucket_0, bucket_1, bucket_2) under delete_delta and found that all > rows of the files are the same. I think this will cause the same > key (RecordIdentifier) when scanning bucket_1 after bucket_0, but I > don't know why all the rows are the same in these bucket files. > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
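The "Two readers" exception in HIVE-22318 fires because the ACID record merger assumes each (writeId, bucket, rowId) key exists in exactly one delete-delta file; when two bucket files contain identical rows, the same key is reached from two readers. The following is a toy model of that check, not Hive's actual OrcRawRecordMerger — keys are plain longs standing in for the full ROW__ID triple, and all names are illustrative:

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

// Toy model of a k-way merge over sorted key streams that fails, as Hive's
// merger does, when the same key arrives from two different readers.
public class TwoReadersDemo {
    static void mergeKeys(List<long[]> readers) throws IOException {
        // Each queue entry is {key, readerIndex, positionInReader}.
        PriorityQueue<long[]> queue = new PriorityQueue<>((a, b) -> Long.compare(a[0], b[0]));
        for (int r = 0; r < readers.size(); r++) {
            if (readers.get(r).length > 0) queue.add(new long[]{readers.get(r)[0], r, 0});
        }
        long prevKey = Long.MIN_VALUE;
        int prevReader = -1;
        while (!queue.isEmpty()) {
            long[] top = queue.poll();
            if (top[0] == prevKey && (int) top[1] != prevReader) {
                // Same key seen from two readers: the invariant is broken.
                throw new IOException("Two readers for key " + top[0]);
            }
            prevKey = top[0];
            prevReader = (int) top[1];
            long[] reader = readers.get((int) top[1]);
            int next = (int) top[2] + 1;
            if (next < reader.length) queue.add(new long[]{reader[next], (int) top[1], next});
        }
    }

    public static void main(String[] args) throws IOException {
        // Distinct keys across readers: merges fine.
        mergeKeys(Arrays.asList(new long[]{1, 3}, new long[]{2, 4}));
        // Identical delete-delta contents, as reported in this issue: fails.
        try {
            mergeKeys(Arrays.asList(new long[]{1, 2}, new long[]{1, 2}));
        } catch (IOException e) {
            System.out.println(e.getMessage()); // Two readers for key 1
        }
    }
}
```

This matches the reporter's observation: since all rows in bucket_0 and bucket_1 are identical, the first shared key triggers the failure as soon as the second reader is consulted.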
[jira] [Work logged] (HIVE-24581) Remove AcidUtils call from OrcInputformat for non transactional tables
[ https://issues.apache.org/jira/browse/HIVE-24581?focusedWorklogId=533046=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-533046 ] ASF GitHub Bot logged work on HIVE-24581: - Author: ASF GitHub Bot Created on: 08/Jan/21 14:42 Start Date: 08/Jan/21 14:42 Worklog Time Spent: 10m Work Description: pvargacl commented on a change in pull request #1826: URL: https://github.com/apache/hive/pull/1826#discussion_r553982610 ## File path: ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java ## @@ -1866,92 +1870,6 @@ private static boolean isDirUsable(Path child, long visibilityTxnId, List return true; } - public static class HdfsFileStatusWithoutId implements HdfsFileStatusWithId { -private final FileStatus fs; - -public HdfsFileStatusWithoutId(FileStatus fs) { - this.fs = fs; -} - -@Override -public FileStatus getFileStatus() { - return fs; -} - -@Override -public Long getFileId() { - return null; -} - } - - /** - * Find the original files (non-ACID layout) recursively under the partition directory. 
- * @param fs the file system - * @param dir the directory to add - * @return the list of original files - * @throws IOException - */ - public static List findOriginals(FileSystem fs, Path dir, Ref useFileIds, - boolean ignoreEmptyFiles, boolean recursive) throws IOException { -List originals = new ArrayList<>(); -List childrenWithId = tryListLocatedHdfsStatus(useFileIds, fs, dir, hiddenFileFilter); -if (childrenWithId != null) { - for (HdfsFileStatusWithId child : childrenWithId) { -if (child.getFileStatus().isDirectory()) { - if (recursive) { -originals.addAll(findOriginals(fs, child.getFileStatus().getPath(), useFileIds, -ignoreEmptyFiles, true)); - } -} else { - if (!ignoreEmptyFiles || child.getFileStatus().getLen() > 0) { -originals.add(child); - } -} - } -} else { - List children = HdfsUtils.listLocatedStatus(fs, dir, hiddenFileFilter); - for (FileStatus child : children) { -if (child.isDirectory()) { - if (recursive) { -originals.addAll(findOriginals(fs, child.getPath(), useFileIds, ignoreEmptyFiles, true)); - } -} else { - if (!ignoreEmptyFiles || child.getLen() > 0) { Review comment: The findoriginals was called with ignoreEmpty = true always, so i removed it. If I understand correctly this parameter could be removed from every other acidutils call. It was there to handle MR, which created empty files for every bucket. https://issues.apache.org/jira/browse/HIVE-13040?focusedCommentId=15159223=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-15159223 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 533046) Time Spent: 0.5h (was: 20m) > Remove AcidUtils call from OrcInputformat for non transactional tables > -- > > Key: HIVE-24581 > URL: https://issues.apache.org/jira/browse/HIVE-24581 > Project: Hive > Issue Type: Improvement >Reporter: Peter Varga >Assignee: Peter Varga >Priority: Minor > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > Currently the split generation in OrcInputformat is tightly coupled with acid > and AcidUtils.getAcidState is called even if the table is not transactional. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24602) Retry compaction after configured time
[ https://issues.apache.org/jira/browse/HIVE-24602?focusedWorklogId=533038=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-533038 ] ASF GitHub Bot logged work on HIVE-24602: - Author: ASF GitHub Bot Created on: 08/Jan/21 14:23 Start Date: 08/Jan/21 14:23 Worklog Time Spent: 10m Work Description: pvargacl commented on a change in pull request #1839: URL: https://github.com/apache/hive/pull/1839#discussion_r553971662 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java ## @@ -1006,16 +1007,23 @@ public boolean checkFailedCompactions(CompactionInfo ci) throws MetaException { rs = pStmt.executeQuery(); int numFailed = 0; int numTotal = 0; +long lastEnqueueTime = -1; int failedThreshold = MetastoreConf.getIntVar(conf, ConfVars.COMPACTOR_INITIATOR_FAILED_THRESHOLD); while(rs.next() && ++numTotal <= failedThreshold) { + long enqueueTime = rs.getLong(2); + if (enqueueTime > lastEnqueueTime) { +lastEnqueueTime = enqueueTime; + } if(rs.getString(1).charAt(0) == FAILED_STATE) { numFailed++; } else { numFailed--; } } -return numFailed == failedThreshold; +// If the last attempt was too long ago, ignore the failed treshold and try compaction again +long retryTime = MetastoreConf.getTimeVar(conf, ConfVars.COMPACTOR_INITIATOR_FAILED_RETRY_TIME, TimeUnit.MILLISECONDS); Review comment: Still, this does not initiate anything in the conf, it is already there in memory. Just run some test on my machine the average time for a getTimeVar call is 0.48 microsecond This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 533038) Time Spent: 1.5h (was: 1h 20m) > Retry compaction after configured time > -- > > Key: HIVE-24602 > URL: https://issues.apache.org/jira/browse/HIVE-24602 > Project: Hive > Issue Type: Improvement >Reporter: Peter Varga >Assignee: Peter Varga >Priority: Major > Labels: pull-request-available > Time Spent: 1.5h > Remaining Estimate: 0h > > Currently if compaction fails two consecutive times it will stop compaction > forever for the given partition / table unless someone manually intervenes. > See COMPACTOR_INITIATOR_FAILED_THRESHOLD. > The Initiator should retry again after a configurable amount of time. -- This message was sent by Atlassian Jira (v8.3.4#803005)
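The diff under review changes `checkFailedCompactions` so that hitting COMPACTOR_INITIATOR_FAILED_THRESHOLD no longer blocks compaction forever: if the most recent attempt is older than the configured retry time, the Initiator may enqueue compaction again. A minimal sketch of that decision rule, with hypothetical names and the threshold comparison simplified (the actual patch reads these values from MetastoreConf and the COMPLETED_COMPACTIONS history):

```java
// Illustrative model of the retry rule discussed above, not Hive's code.
public class CompactionRetryPolicy {
    static boolean shouldSkipCompaction(int numFailed, int failedThreshold,
                                        long lastEnqueueTimeMs, long retryTimeMs, long nowMs) {
        // Too many consecutive failures would normally stop initiation...
        boolean failedTooOften = numFailed >= failedThreshold;
        // ...unless the newest attempt is older than the retry window.
        // A retry time of 0 disables the retry behavior entirely.
        boolean retryWindowElapsed = retryTimeMs > 0 && lastEnqueueTimeMs + retryTimeMs < nowMs;
        return failedTooOften && !retryWindowElapsed;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        long hour = 3_600_000L;
        // Under the threshold: never skipped.
        System.out.println(shouldSkipCompaction(1, 2, now, hour, now));            // false
        // At the threshold with a recent failure: skipped.
        System.out.println(shouldSkipCompaction(2, 2, now, hour, now));            // true
        // At the threshold, but the last attempt was 2h ago with a 1h retry time.
        System.out.println(shouldSkipCompaction(2, 2, now - 2 * hour, hour, now)); // false
    }
}
```

The review thread's performance concern is about the `MetastoreConf.getTimeVar` lookup inside this check; as noted above, the value is already in memory, so the lookup costs well under a microsecond per call.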
[jira] [Work logged] (HIVE-24602) Retry compaction after configured time
[ https://issues.apache.org/jira/browse/HIVE-24602?focusedWorklogId=533033&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-533033 ] ASF GitHub Bot logged work on HIVE-24602: - Author: ASF GitHub Bot Created on: 08/Jan/21 14:03 Start Date: 08/Jan/21 14:03 Worklog Time Spent: 10m Work Description: klcopp commented on a change in pull request #1839: URL: https://github.com/apache/hive/pull/1839#discussion_r553960649

## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java
##
@@ -1006,16 +1007,23 @@ public boolean checkFailedCompactions(CompactionInfo ci) throws MetaException {
       rs = pStmt.executeQuery();
       int numFailed = 0;
       int numTotal = 0;
+      long lastEnqueueTime = -1;
       int failedThreshold = MetastoreConf.getIntVar(conf, ConfVars.COMPACTOR_INITIATOR_FAILED_THRESHOLD);
       while(rs.next() && ++numTotal <= failedThreshold) {
+        long enqueueTime = rs.getLong(2);
+        if (enqueueTime > lastEnqueueTime) {
+          lastEnqueueTime = enqueueTime;
+        }
         if(rs.getString(1).charAt(0) == FAILED_STATE) {
           numFailed++;
         } else {
           numFailed--;
         }
       }
-      return numFailed == failedThreshold;
+      // If the last attempt was too long ago, ignore the failed threshold and try compaction again
+      long retryTime = MetastoreConf.getTimeVar(conf, ConfVars.COMPACTOR_INITIATOR_FAILED_RETRY_TIME, TimeUnit.MILLISECONDS);

Review comment: I'm not 100% convinced about the time-critical part, since effectively, because of mutexing, only one Initiator thread runs on the whole system. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 533033) Time Spent: 1h 20m (was: 1h 10m) > Retry compaction after configured time > -- > > Key: HIVE-24602 > URL: https://issues.apache.org/jira/browse/HIVE-24602 > Project: Hive > Issue Type: Improvement >Reporter: Peter Varga >Assignee: Peter Varga >Priority: Major > Labels: pull-request-available > Time Spent: 1h 20m > Remaining Estimate: 0h > > Currently if compaction fails two consecutive times it will stop compaction > forever for the given partition / table unless someone manually intervenes. > See COMPACTOR_INITIATOR_FAILED_THRESHOLD. > The Initiator should retry again after a configurable amount of time. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-24606) Multi-stage materialized CTEs can lose intermediate data
[ https://issues.apache.org/jira/browse/HIVE-24606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] okumin reassigned HIVE-24606: - > Multi-stage materialized CTEs can lose intermediate data > > > Key: HIVE-24606 > URL: https://issues.apache.org/jira/browse/HIVE-24606 > Project: Hive > Issue Type: Bug > Components: Query Planning >Affects Versions: 3.1.2, 2.3.7, 4.0.0 >Reporter: okumin >Assignee: okumin >Priority: Major > > With complex multi-stage CTEs, Hive can start a later stage before its > previous stage finishes. > That's because `SemanticAnalyzer#toRealRootTasks` can fail to resolve > dependencies between multi-stage materialized CTEs when a non-materialized CTE > cuts in. > > [https://github.com/apache/hive/blob/425e1ff7c054f87c4db87e77d004282d529599ae/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java#L1414] > > For example, when submitting this query, > {code:sql}
> SET hive.optimize.cte.materialize.threshold=2;
> SET hive.optimize.cte.materialize.full.aggregate.only=false;
> WITH x AS ( SELECT 'x' AS id ), -- not materialized
> a1 AS ( SELECT 'a1' AS id ), -- materialized by a2 and the root
> a2 AS ( SELECT 'a2 <- ' || id AS id FROM a1) -- materialized by the root
> SELECT * FROM a1
> UNION ALL
> SELECT * FROM x
> UNION ALL
> SELECT * FROM a2
> UNION ALL
> SELECT * FROM a2;
> {code}
> `toRealRootTask` will traverse the CTEs in order of `a1`, `x`, and `a2`. It > means the dependency between `a1` and `a2` will be ignored and `a2` can start > without waiting for `a1`. As a result, the above query returns the following > result.
> {code:java}
> +-----+
> | id  |
> +-----+
> | a1  |
> | x   |
> +-----+
> {code}
> For your information, I ran this test with revision = > 425e1ff7c054f87c4db87e77d004282d529599ae. -- This message was sent by Atlassian Jira (v8.3.4#803005)
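The fix needed here is essentially a topological ordering of the materialized CTE tasks. As an illustration (not Hive's SemanticAnalyzer code; all names are invented), a post-order DFS over the dependency graph keeps `a1` before `a2` even though the non-materialized `x` sits between them in declaration order:

```java
import java.util.*;

// Illustrative sketch: a post-order DFS over the CTE dependency graph yields
// an execution order in which every materialized CTE runs only after the CTEs
// it reads from, regardless of where non-materialized CTEs (which produce no
// task) appear in declaration order.
public final class CteOrderSketch {
    public static List<String> executionOrder(Map<String, List<String>> deps,
                                              Set<String> materialized) {
        List<String> order = new ArrayList<>();
        Set<String> visited = new HashSet<>();
        for (String cte : deps.keySet()) {
            visit(cte, deps, materialized, visited, order);
        }
        return order;
    }

    private static void visit(String cte, Map<String, List<String>> deps,
                              Set<String> materialized, Set<String> visited,
                              List<String> order) {
        if (!visited.add(cte)) return;
        for (String dep : deps.getOrDefault(cte, List.of())) {
            visit(dep, deps, materialized, visited, order);  // dependencies first
        }
        if (materialized.contains(cte)) {
            order.add(cte);  // only materialized CTEs become real tasks
        }
    }
}
```

For the query in the description (`x` depends on nothing and is not materialized, `a2` depends on `a1`), this ordering schedules `a1` strictly before `a2`, which is what the reported traversal fails to guarantee.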
[jira] [Work logged] (HIVE-23553) Upgrade ORC version to 1.6.6
[ https://issues.apache.org/jira/browse/HIVE-23553?focusedWorklogId=533018&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-533018 ] ASF GitHub Bot logged work on HIVE-23553: - Author: ASF GitHub Bot Created on: 08/Jan/21 13:38 Start Date: 08/Jan/21 13:38 Worklog Time Spent: 10m Work Description: kgyrtkirk commented on pull request #1823: URL: https://github.com/apache/hive/pull/1823#issuecomment-756759959 The thing is that before I added that repo, I pressed the test button, which reported the error - because I think it also does the same for blackout detection - it would probably not have worked... I've added the repsy repo to the artifactory and added it to the virtual repo precommit uses. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 533018) Time Spent: 2h 20m (was: 2h 10m) > Upgrade ORC version to 1.6.6 > > > Key: HIVE-23553 > URL: https://issues.apache.org/jira/browse/HIVE-23553 > Project: Hive > Issue Type: Improvement >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Time Spent: 2h 20m > Remaining Estimate: 0h > > Apache Hive is currently on 1.5.X version and in order to take advantage of > the latest ORC improvements such as column encryption we have to bump to > 1.6.X. > https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12343288==12318320=Create_token=A5KQ-2QAV-T4JA-FDED_4ae78f19321c7fb1e7f337fba1dd90af751d8810_lin > Even though the ORC reader could work out of the box, HIVE LLAP is heavily > dependent on internal ORC APIs e.g., to retrieve and store File Footers, > Tails, streams – un/compress RG data etc. As there were many internal changes > from 1.5 to 1.6 (Input stream offsets, relative BufferChunks etc.) 
the > upgrade is not straightforward. > This Umbrella Jira tracks this upgrade effort. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-24565) Implement standard trim function
[ https://issues.apache.org/jira/browse/HIVE-24565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Kasa resolved HIVE-24565. --- Resolution: Fixed Pushed to master. Thanks [~jcamachorodriguez], [~kgyrtkirk] for review. > Implement standard trim function > > > Key: HIVE-24565 > URL: https://issues.apache.org/jira/browse/HIVE-24565 > Project: Hive > Issue Type: Improvement > Components: Parser, UDF >Reporter: Krisztian Kasa >Assignee: Krisztian Kasa >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > {code}
> <trim function> ::=
>   TRIM <left paren> <trim operands> <right paren>
> <trim operands> ::=
>   [ [ <trim specification> ] [ <trim character> ] FROM ] <trim source>
> <trim source> ::=
>   <character value expression>
> <trim specification> ::=
>   LEADING
>   | TRAILING
>   | BOTH
> <trim character> ::=
>   <character value expression>
> {code}
> Example
> {code}
> SELECT TRIM(LEADING '0' FROM '000123');
> 123
> {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
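The grammar above reads: remove the trim character (space when omitted) from the leading end, the trailing end, or both ends of the source string. A minimal Java sketch of these semantics follows; this is not Hive's GenericUDF implementation, only an illustration of the standard behavior:

```java
// Illustrative sketch of standard SQL TRIM semantics
// (LEADING / TRAILING / BOTH with an explicit trim character).
public final class SqlTrimSketch {
    public enum Spec { LEADING, TRAILING, BOTH }

    public static String trim(Spec spec, char trimChar, String src) {
        int begin = 0;
        int end = src.length();
        if (spec == Spec.LEADING || spec == Spec.BOTH) {
            while (begin < end && src.charAt(begin) == trimChar) begin++;   // strip from the front
        }
        if (spec == Spec.TRAILING || spec == Spec.BOTH) {
            while (end > begin && src.charAt(end - 1) == trimChar) end--;   // strip from the back
        }
        return src.substring(begin, end);
    }
}
```

This reproduces the Jira's example: trimming LEADING '0' from '000123' yields '123', while TRAILING '0' would leave it unchanged.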
[jira] [Work logged] (HIVE-24565) Implement standard trim function
[ https://issues.apache.org/jira/browse/HIVE-24565?focusedWorklogId=533016&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-533016 ] ASF GitHub Bot logged work on HIVE-24565: - Author: ASF GitHub Bot Created on: 08/Jan/21 13:37 Start Date: 08/Jan/21 13:37 Worklog Time Spent: 10m Work Description: kasakrisz merged pull request #1810: URL: https://github.com/apache/hive/pull/1810 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 533016) Time Spent: 1h (was: 50m) > Implement standard trim function > > > Key: HIVE-24565 > URL: https://issues.apache.org/jira/browse/HIVE-24565 > Project: Hive > Issue Type: Improvement > Components: Parser, UDF >Reporter: Krisztian Kasa >Assignee: Krisztian Kasa >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > {code}
> <trim function> ::=
>   TRIM <left paren> <trim operands> <right paren>
> <trim operands> ::=
>   [ [ <trim specification> ] [ <trim character> ] FROM ] <trim source>
> <trim source> ::=
>   <character value expression>
> <trim specification> ::=
>   LEADING
>   | TRAILING
>   | BOTH
> <trim character> ::=
>   <character value expression>
> {code}
> Example
> {code}
> SELECT TRIM(LEADING '0' FROM '000123');
> 123
> {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24510) Vectorize compute_bit_vector
[ https://issues.apache.org/jira/browse/HIVE-24510?focusedWorklogId=533014&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-533014 ] ASF GitHub Bot logged work on HIVE-24510: - Author: ASF GitHub Bot Created on: 08/Jan/21 13:35 Start Date: 08/Jan/21 13:35 Worklog Time Spent: 10m Work Description: abstractdog edited a comment on pull request #1824: URL: https://github.com/apache/hive/pull/1824#issuecomment-756758376

> I made a quick fix to allow that in early versions of this patch. Then I decided to not pursue it because I did not see the need for allowing constant argument in runtime.
>
> > you can still do something like:
> > if compute_bit_vector: -> handle constant parameter
>
> We do exactly that. Not in vectorizer but earlier in `ColumnStatsSemanticAnalyzer.java`. I am reluctant to implement extra functionality or add special cases unless it is necessary. Note that compute_bit_vector is a newly added UDF in 4.0. So there is no backward compatibility concern either.
>
> Do you see any other benefit than preserving the earlier q.out outputs?

no, I'm concerned only about the qout changes

you're right, if compute_bit_vector is a relatively new thing then we can also ignore backward compatibility problems and go on with compute_bit_vector_hll

I would personally keep pursuing a smaller patch as having "compute_bit_vector_hll" has no benefits either, but it's up to you, I think if the default hll algo won't be changed in the near future for stats, we can go with the updated qouts :)

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 533014) Time Spent: 3h (was: 2h 50m) > Vectorize compute_bit_vector > > > Key: HIVE-24510 > URL: https://issues.apache.org/jira/browse/HIVE-24510 > Project: Hive > Issue Type: Improvement >Reporter: Mustafa İman >Assignee: Mustafa İman >Priority: Major > Labels: pull-request-available > Time Spent: 3h > Remaining Estimate: 0h > > After https://issues.apache.org/jira/browse/HIVE-23530 , almost all compute > stats functions are vectorizable. Only function that is not vectorizable is > "compute_bit_vector" for ndv statistics computation. This causes "create > table as select" and "insert overwrite select" queries to run in > non-vectorized mode. > Even a very naive implementation of vectorized compute_bit_vector gives about > 50% performance improvement on simple "insert overwrite select" queries. That > is because entire mapper or reducer can run in vectorized mode. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
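For intuition about what an ndv "bit vector" computes: each value's hash contributes one bit, indexed by the number of trailing zeros in the hash, and the position of the first unset bit yields a Flajolet-Martin style distinct-count estimate. The sketch below is only a conceptual stand-in; Hive's compute_bit_vector is backed by its own HLL/FM implementations, not this code:

```java
// Naive Flajolet-Martin style ndv sketch (conceptual only, not Hive's UDF).
public final class NdvBitVectorSketch {
    private long bits;  // bit i set => some value's hash had i trailing zeros

    public void add(Object value) {
        int h = scramble(value.hashCode());
        int idx = (h == 0) ? 63 : Integer.numberOfTrailingZeros(h);
        bits |= 1L << idx;  // duplicates set the same bit, so they don't change the vector
    }

    /** FM estimate: 2^r / 0.77351, where r is the index of the first unset bit. */
    public long estimateNdv() {
        int r = Long.numberOfTrailingZeros(~bits);
        return (long) ((1L << r) / 0.77351);
    }

    public long bitVector() { return bits; }

    // Cheap avalanche step so similar inputs get dissimilar trailing-zero counts.
    private static int scramble(int h) {
        h ^= (h >>> 16);
        h *= 0x85EBCA6B;
        h ^= (h >>> 13);
        return h;
    }
}
```

The key property that makes this aggregation vectorizable and mergeable is that `add` is an idempotent, commutative OR over a fixed-size word: feeding a value twice leaves the bit vector unchanged, and partial vectors from different mappers can simply be OR-ed together.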
[jira] [Work logged] (HIVE-24510) Vectorize compute_bit_vector
[ https://issues.apache.org/jira/browse/HIVE-24510?focusedWorklogId=533013=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-533013 ] ASF GitHub Bot logged work on HIVE-24510: - Author: ASF GitHub Bot Created on: 08/Jan/21 13:35 Start Date: 08/Jan/21 13:35 Worklog Time Spent: 10m Work Description: abstractdog commented on pull request #1824: URL: https://github.com/apache/hive/pull/1824#issuecomment-756758376 > I made a quick fix to allow that in early versions of this patch. Then I decided to not pursue it because I did not see the need for allowing constant argument in runtime. > > > you can still do something like: > > if compute_bit_vector: -> handle constant parameter > > We do exactly that. Not in vectorizer but earlier in `ColumnStatsSemanticAnalyzer.java `. I am reluctant to implement extra functionality or add special cases unless it is necessary. Note that compute_bit_vector is a newly added UDF in 4.0. So there is no backward compatibility concern either. > Do you see any other benefit than preserving the earlier q.out outputs? no, I'm concerned only about the qout changes you're right, if compute_bit_vector is a relatively new thing then we can also ignore backward compatibility problems and go on with compute_bit_vector_hll I would personally keep pursuing a smaller patch as having "compute_bit_vector_hll" has no benefits either, but it's up to you, I think if the default hll algo won't be changed in the near future, we can go with the updated qouts :) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 533013) Time Spent: 2h 50m (was: 2h 40m) > Vectorize compute_bit_vector > > > Key: HIVE-24510 > URL: https://issues.apache.org/jira/browse/HIVE-24510 > Project: Hive > Issue Type: Improvement >Reporter: Mustafa İman >Assignee: Mustafa İman >Priority: Major > Labels: pull-request-available > Time Spent: 2h 50m > Remaining Estimate: 0h > > After https://issues.apache.org/jira/browse/HIVE-23530 , almost all compute > stats functions are vectorizable. Only function that is not vectorizable is > "compute_bit_vector" for ndv statistics computation. This causes "create > table as select" and "insert overwrite select" queries to run in > non-vectorized mode. > Even a very naive implementation of vectorized compute_bit_vector gives about > 50% performance improvement on simple "insert overwrite select" queries. That > is because entire mapper or reducer can run in vectorized mode. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24581) Remove AcidUtils call from OrcInputformat for non transactional tables
[ https://issues.apache.org/jira/browse/HIVE-24581?focusedWorklogId=533010&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-533010 ] ASF GitHub Bot logged work on HIVE-24581: - Author: ASF GitHub Bot Created on: 08/Jan/21 13:31 Start Date: 08/Jan/21 13:31 Worklog Time Spent: 10m Work Description: kuczoram commented on a change in pull request #1826: URL: https://github.com/apache/hive/pull/1826#discussion_r553944336

## File path: ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java
##
@@ -1866,92 +1870,6 @@ private static boolean isDirUsable(Path child, long visibilityTxnId, List
     return true;
   }

-  public static class HdfsFileStatusWithoutId implements HdfsFileStatusWithId {
-    private final FileStatus fs;
-
-    public HdfsFileStatusWithoutId(FileStatus fs) {
-      this.fs = fs;
-    }
-
-    @Override
-    public FileStatus getFileStatus() {
-      return fs;
-    }
-
-    @Override
-    public Long getFileId() {
-      return null;
-    }
-  }
-
-  /**
-   * Find the original files (non-ACID layout) recursively under the partition directory.
-   * @param fs the file system
-   * @param dir the directory to add
-   * @return the list of original files
-   * @throws IOException
-   */
-  public static List<HdfsFileStatusWithId> findOriginals(FileSystem fs, Path dir, Ref<Boolean> useFileIds,
-      boolean ignoreEmptyFiles, boolean recursive) throws IOException {
-    List<HdfsFileStatusWithId> originals = new ArrayList<>();
-    List<HdfsFileStatusWithId> childrenWithId = tryListLocatedHdfsStatus(useFileIds, fs, dir, hiddenFileFilter);
-    if (childrenWithId != null) {
-      for (HdfsFileStatusWithId child : childrenWithId) {
-        if (child.getFileStatus().isDirectory()) {
-          if (recursive) {
-            originals.addAll(findOriginals(fs, child.getFileStatus().getPath(), useFileIds,
-                ignoreEmptyFiles, true));
-          }
-        } else {
-          if (!ignoreEmptyFiles || child.getFileStatus().getLen() > 0) {
-            originals.add(child);
-          }
-        }
-      }
-    } else {
-      List<FileStatus> children = HdfsUtils.listLocatedStatus(fs, dir, hiddenFileFilter);
-      for (FileStatus child : children) {
-        if (child.isDirectory()) {
-          if (recursive) {
-            originals.addAll(findOriginals(fs, child.getPath(), useFileIds, ignoreEmptyFiles, true));
-          }
-        } else {
-          if (!ignoreEmptyFiles || child.getLen() > 0) {

Review comment: As far as I can see, the ignoreEmptyFiles parameter is removed and is not present in the newly added methods. Is this parameter not used any more? (Just want to make sure that removing it won't have any side effect.) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 533010) Time Spent: 20m (was: 10m) > Remove AcidUtils call from OrcInputformat for non transactional tables > -- > > Key: HIVE-24581 > URL: https://issues.apache.org/jira/browse/HIVE-24581 > Project: Hive > Issue Type: Improvement >Reporter: Peter Varga >Assignee: Peter Varga >Priority: Minor > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > Currently the split generation in OrcInputformat is tightly coupled with acid > and AcidUtils.getAcidState is called even if the table is not transactional. -- This message was sent by Atlassian Jira (v8.3.4#803005)
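The findOriginals walk being removed boils down to a recursive directory listing that skips hidden entries and, optionally, empty files. A stand-alone java.nio sketch of that shape follows; it is not the Hive implementation (no file IDs, and the hidden-name rule is simplified to names starting with '.' or '_'):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.*;
import java.util.*;

// Stand-alone sketch of a findOriginals-style walk (not Hive's AcidUtils code).
public final class OriginalsListingSketch {
    /** Recursively list data files, skipping hidden entries and optionally empty files. */
    public static List<Path> findOriginals(Path dir, boolean ignoreEmptyFiles, boolean recursive) {
        List<Path> originals = new ArrayList<>();
        try (DirectoryStream<Path> children = Files.newDirectoryStream(dir)) {
            for (Path child : children) {
                String name = child.getFileName().toString();
                if (name.startsWith(".") || name.startsWith("_")) continue;  // hiddenFileFilter analogue
                if (Files.isDirectory(child)) {
                    if (recursive) originals.addAll(findOriginals(child, ignoreEmptyFiles, true));
                } else if (!ignoreEmptyFiles || Files.size(child) > 0) {
                    originals.add(child);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return originals;
    }

    /** Builds a throwaway layout and returns {allFiles, nonEmptyFiles, topLevelOnly}. */
    public static int[] demo() {
        try {
            Path root = Files.createTempDirectory("originals-sketch");
            Path part = Files.createDirectory(root.resolve("part"));
            Files.write(part.resolve("000000_0"), new byte[]{1, 2, 3});
            Files.createFile(part.resolve("000001_0"));   // empty file
            Files.createFile(root.resolve("_SUCCESS"));   // hidden marker, always skipped
            return new int[] {
                findOriginals(root, false, true).size(),  // both data files
                findOriginals(root, true, true).size(),   // empty file skipped
                findOriginals(root, false, false).size()  // no recursion into part/
            };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The `ignoreEmptyFiles` flag questioned in the review is the `Files.size(child) > 0` guard here: dropping it means empty originals start showing up in the listing, which is exactly the side effect the reviewer is asking about.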
[jira] [Commented] (HIVE-24590) Operation Logging still leaks the log4j Appenders
[ https://issues.apache.org/jira/browse/HIVE-24590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17261298#comment-17261298 ] Stamatis Zampetakis commented on HIVE-24590: I may have misunderstood what IdlePurgePolicy does. In my mind, I was thinking that if the appender is removed due to inactivity and then at some point in time there is a request to write to the same route then a new appender (pointing to the same file) could open without problem. > Operation Logging still leaks the log4j Appenders > - > > Key: HIVE-24590 > URL: https://issues.apache.org/jira/browse/HIVE-24590 > Project: Hive > Issue Type: Bug > Components: Logging >Reporter: Eugene Chung >Assignee: Stamatis Zampetakis >Priority: Major > Attachments: Screen Shot 2021-01-06 at 18.42.05.png, Screen Shot > 2021-01-06 at 18.42.24.png, Screen Shot 2021-01-06 at 18.42.55.png, Screen > Shot 2021-01-06 at 21.38.32.png, Screen Shot 2021-01-06 at 21.47.28.png, > Screen Shot 2021-01-08 at 21.01.40.png, add_debug_log_and_trace.patch > > > I'm using Hive 3.1.2 with options below. > * hive.server2.logging.operation.enabled=true > * hive.server2.logging.operation.level=VERBOSE > * hive.async.log.enabled=false > I already know the ticket, https://issues.apache.org/jira/browse/HIVE-17128 > but HS2 still leaks log4j RandomAccessFileManager. > !Screen Shot 2021-01-06 at 18.42.05.png|width=756,height=197! > I checked the operation log file which is not closed/deleted properly. > !Screen Shot 2021-01-06 at 18.42.24.png|width=603,height=272! > Then there's the log, > {code:java} > client.TezClient: Shutting down Tez Session, sessionName= {code} > !Screen Shot 2021-01-06 at 18.42.55.png|width=1372,height=26! -- This message was sent by Atlassian Jira (v8.3.4#803005)
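The purge-policy idea discussed above amounts to tracking a last-access time per routed appender and closing any route idle beyond a timeout; a later write to the same route can then simply open a fresh appender for the same file. The toy class below only illustrates that bookkeeping; log4j-core ships a real IdlePurgePolicy, and this is not it:

```java
import java.util.*;

// Toy illustration of an idle purge policy for routed appenders
// (conceptual only; log4j's IdlePurgePolicy is the real implementation).
public final class IdlePurgeSketch {
    private final Map<String, Long> lastAccess = new HashMap<>();
    private final long idleTimeoutMs;
    private final List<String> closed = new ArrayList<>();

    public IdlePurgeSketch(long idleTimeoutMs) { this.idleTimeoutMs = idleTimeoutMs; }

    /** Record that a route (e.g. one operation-log file) was just written to. */
    public void touch(String routeKey, long nowMs) { lastAccess.put(routeKey, nowMs); }

    /** Close every route that has been idle longer than the timeout. */
    public void purge(long nowMs) {
        Iterator<Map.Entry<String, Long>> it = lastAccess.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> e = it.next();
            if (nowMs - e.getValue() > idleTimeoutMs) {
                closed.add(e.getKey());  // a real policy would stop the appender here
                it.remove();
            }
        }
    }

    public List<String> closedRoutes() { return closed; }
    public int openRoutes() { return lastAccess.size(); }
}
```

This also frames the concern raised in the thread: a long-running query's route is only safe from purging if every write refreshes its last-access time, so a query that produces no operation-log output for longer than the timeout would see its appender closed and reopened.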
[jira] [Commented] (HIVE-24590) Operation Logging still leaks the log4j Appenders
[ https://issues.apache.org/jira/browse/HIVE-24590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17261280#comment-17261280 ] Eugene Chung commented on HIVE-24590: - > are the log messages directed to the proper files or not? I observed the situation. Sometimes logs for HDFS delegation token generated by other accounts are shown in my operation log. (I am using kerberized Hadoop & Hive.) > Operation Logging still leaks the log4j Appenders > - > > Key: HIVE-24590 > URL: https://issues.apache.org/jira/browse/HIVE-24590 > Project: Hive > Issue Type: Bug > Components: Logging >Reporter: Eugene Chung >Assignee: Stamatis Zampetakis >Priority: Major > Attachments: Screen Shot 2021-01-06 at 18.42.05.png, Screen Shot > 2021-01-06 at 18.42.24.png, Screen Shot 2021-01-06 at 18.42.55.png, Screen > Shot 2021-01-06 at 21.38.32.png, Screen Shot 2021-01-06 at 21.47.28.png, > Screen Shot 2021-01-08 at 21.01.40.png, add_debug_log_and_trace.patch > > > I'm using Hive 3.1.2 with options below. > * hive.server2.logging.operation.enabled=true > * hive.server2.logging.operation.level=VERBOSE > * hive.async.log.enabled=false > I already know the ticket, https://issues.apache.org/jira/browse/HIVE-17128 > but HS2 still leaks log4j RandomAccessFileManager. > !Screen Shot 2021-01-06 at 18.42.05.png|width=756,height=197! > I checked the operation log file which is not closed/deleted properly. > !Screen Shot 2021-01-06 at 18.42.24.png|width=603,height=272! > Then there's the log, > {code:java} > client.TezClient: Shutting down Tez Session, sessionName= {code} > !Screen Shot 2021-01-06 at 18.42.55.png|width=1372,height=26! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24510) Vectorize compute_bit_vector
[ https://issues.apache.org/jira/browse/HIVE-24510?focusedWorklogId=532991=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-532991 ] ASF GitHub Bot logged work on HIVE-24510: - Author: ASF GitHub Bot Created on: 08/Jan/21 12:15 Start Date: 08/Jan/21 12:15 Worklog Time Spent: 10m Work Description: mustafaiman edited a comment on pull request #1824: URL: https://github.com/apache/hive/pull/1824#issuecomment-756725412 I made a quick fix to allow that in early versions of this patch. Then I decided to not pursue it because I did not see the need for allowing constant argument in runtime. > you can still do something like: > if compute_bit_vector: -> handle constant parameter We do exactly that. Not in vectorizer but earlier in `ColumnStatsSemanticAnalyzer.java `. I am reluctant to implement extra functionality or add special cases unless it is necessary. Note that compute_bit_vector is a newly added UDF in 4.0. So there is no backward compatibility concern either. Do you see any other benefit than preserving the earlier q.out outputs? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 532991) Time Spent: 2h 40m (was: 2.5h) > Vectorize compute_bit_vector > > > Key: HIVE-24510 > URL: https://issues.apache.org/jira/browse/HIVE-24510 > Project: Hive > Issue Type: Improvement >Reporter: Mustafa İman >Assignee: Mustafa İman >Priority: Major > Labels: pull-request-available > Time Spent: 2h 40m > Remaining Estimate: 0h > > After https://issues.apache.org/jira/browse/HIVE-23530 , almost all compute > stats functions are vectorizable. Only function that is not vectorizable is > "compute_bit_vector" for ndv statistics computation. 
This causes "create > table as select" and "insert overwrite select" queries to run in > non-vectorized mode. > Even a very naive implementation of vectorized compute_bit_vector gives about > 50% performance improvement on simple "insert overwrite select" queries. That > is because entire mapper or reducer can run in vectorized mode. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24510) Vectorize compute_bit_vector
[ https://issues.apache.org/jira/browse/HIVE-24510?focusedWorklogId=532990=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-532990 ] ASF GitHub Bot logged work on HIVE-24510: - Author: ASF GitHub Bot Created on: 08/Jan/21 12:15 Start Date: 08/Jan/21 12:15 Worklog Time Spent: 10m Work Description: mustafaiman commented on pull request #1824: URL: https://github.com/apache/hive/pull/1824#issuecomment-756725412 I made a quick fix to allow that in early versions of this patch. Then I decided to not pursue it because I did not see the need for allowing constant argument in runtime. > you can still do something like: > if compute_bit_vector: -> handle constant parameter We do exactly that. Not in vectorizer but earlier in `ColumnStatsSemanticAnalyzer.java `. I am reluctant to implement extra functionality or add special cases unless it is necessary. Note that compute_bit_vector is a newly added UDF in 4.0. So there is no backward compatibility concern either. Do you see any other benefit than preserving the earlier q.out outputs? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 532990) Time Spent: 2.5h (was: 2h 20m) > Vectorize compute_bit_vector > > > Key: HIVE-24510 > URL: https://issues.apache.org/jira/browse/HIVE-24510 > Project: Hive > Issue Type: Improvement >Reporter: Mustafa İman >Assignee: Mustafa İman >Priority: Major > Labels: pull-request-available > Time Spent: 2.5h > Remaining Estimate: 0h > > After https://issues.apache.org/jira/browse/HIVE-23530 , almost all compute > stats functions are vectorizable. Only function that is not vectorizable is > "compute_bit_vector" for ndv statistics computation. 
This causes "create > table as select" and "insert overwrite select" queries to run in > non-vectorized mode. > Even a very naive implementation of vectorized compute_bit_vector gives about > 50% performance improvement on simple "insert overwrite select" queries. That > is because entire mapper or reducer can run in vectorized mode. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (HIVE-24590) Operation Logging still leaks the log4j Appenders
[ https://issues.apache.org/jira/browse/HIVE-24590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17261266#comment-17261266 ] Eugene Chung edited comment on HIVE-24590 at 1/8/21, 12:11 PM: --- !Screen Shot 2021-01-08 at 21.01.40.png|width=1486,height=34! I have a question. Couldn't operation logs be idle for an hour, as in the screenshot, or even for days if the query runs very long? My concern is that choosing the timeout for the purge policy could be difficult. was (Author: euigeun_chung): !Screen Shot 2021-01-08 at 21.01.40.png|width=1486,height=34! I have a question. Couldn't operation logs be idle for an hour like screenshot or days if the query time is very long? > Operation Logging still leaks the log4j Appenders > - > > Key: HIVE-24590 > URL: https://issues.apache.org/jira/browse/HIVE-24590 > Project: Hive > Issue Type: Bug > Components: Logging >Reporter: Eugene Chung >Assignee: Stamatis Zampetakis >Priority: Major > Attachments: Screen Shot 2021-01-06 at 18.42.05.png, Screen Shot > 2021-01-06 at 18.42.24.png, Screen Shot 2021-01-06 at 18.42.55.png, Screen > Shot 2021-01-06 at 21.38.32.png, Screen Shot 2021-01-06 at 21.47.28.png, > Screen Shot 2021-01-08 at 21.01.40.png, add_debug_log_and_trace.patch > > > I'm using Hive 3.1.2 with options below. > * hive.server2.logging.operation.enabled=true > * hive.server2.logging.operation.level=VERBOSE > * hive.async.log.enabled=false > I already know the ticket, https://issues.apache.org/jira/browse/HIVE-17128 > but HS2 still leaks log4j RandomAccessFileManager. > !Screen Shot 2021-01-06 at 18.42.05.png|width=756,height=197! > I checked the operation log file which is not closed/deleted properly. > !Screen Shot 2021-01-06 at 18.42.24.png|width=603,height=272! > Then there's the log, > {code:java} > client.TezClient: Shutting down Tez Session, sessionName= {code} > !Screen Shot 2021-01-06 at 18.42.55.png|width=1372,height=26! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-24590) Operation Logging still leaks the log4j Appenders
[ https://issues.apache.org/jira/browse/HIVE-24590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17261266#comment-17261266 ] Eugene Chung commented on HIVE-24590: - !Screen Shot 2021-01-08 at 21.01.40.png|width=1486,height=34! I have a question. Couldn't operation logs be idle for an hour like screenshot or days if the query time is very long? > Operation Logging still leaks the log4j Appenders > - > > Key: HIVE-24590 > URL: https://issues.apache.org/jira/browse/HIVE-24590 > Project: Hive > Issue Type: Bug > Components: Logging >Reporter: Eugene Chung >Assignee: Stamatis Zampetakis >Priority: Major > Attachments: Screen Shot 2021-01-06 at 18.42.05.png, Screen Shot > 2021-01-06 at 18.42.24.png, Screen Shot 2021-01-06 at 18.42.55.png, Screen > Shot 2021-01-06 at 21.38.32.png, Screen Shot 2021-01-06 at 21.47.28.png, > Screen Shot 2021-01-08 at 21.01.40.png, add_debug_log_and_trace.patch > > > I'm using Hive 3.1.2 with options below. > * hive.server2.logging.operation.enabled=true > * hive.server2.logging.operation.level=VERBOSE > * hive.async.log.enabled=false > I already know the ticket, https://issues.apache.org/jira/browse/HIVE-17128 > but HS2 still leaks log4j RandomAccessFileManager. > !Screen Shot 2021-01-06 at 18.42.05.png|width=756,height=197! > I checked the operation log file which is not closed/deleted properly. > !Screen Shot 2021-01-06 at 18.42.24.png|width=603,height=272! > Then there's the log, > {code:java} > client.TezClient: Shutting down Tez Session, sessionName= {code} > !Screen Shot 2021-01-06 at 18.42.55.png|width=1372,height=26! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24590) Operation Logging still leaks the log4j Appenders
[ https://issues.apache.org/jira/browse/HIVE-24590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Chung updated HIVE-24590: Attachment: Screen Shot 2021-01-08 at 21.01.40.png > Operation Logging still leaks the log4j Appenders > - > > Key: HIVE-24590 > URL: https://issues.apache.org/jira/browse/HIVE-24590 > Project: Hive > Issue Type: Bug > Components: Logging >Reporter: Eugene Chung >Assignee: Stamatis Zampetakis >Priority: Major > Attachments: Screen Shot 2021-01-06 at 18.42.05.png, Screen Shot > 2021-01-06 at 18.42.24.png, Screen Shot 2021-01-06 at 18.42.55.png, Screen > Shot 2021-01-06 at 21.38.32.png, Screen Shot 2021-01-06 at 21.47.28.png, > Screen Shot 2021-01-08 at 21.01.40.png, add_debug_log_and_trace.patch > > > I'm using Hive 3.1.2 with options below. > * hive.server2.logging.operation.enabled=true > * hive.server2.logging.operation.level=VERBOSE > * hive.async.log.enabled=false > I already know the ticket, https://issues.apache.org/jira/browse/HIVE-17128 > but HS2 still leaks log4j RandomAccessFileManager. > !Screen Shot 2021-01-06 at 18.42.05.png|width=756,height=197! > I checked the operation log file which is not closed/deleted properly. > !Screen Shot 2021-01-06 at 18.42.24.png|width=603,height=272! > Then there's the log, > {code:java} > client.TezClient: Shutting down Tez Session, sessionName= {code} > !Screen Shot 2021-01-06 at 18.42.55.png|width=1372,height=26! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-24590) Operation Logging still leaks the log4j Appenders
[ https://issues.apache.org/jira/browse/HIVE-24590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17261251#comment-17261251 ] Stamatis Zampetakis commented on HIVE-24590: Thanks for looking into this [~euigeun_chung]. The main problem is the leak of appenders and file descriptors and I think this can be solved by adopting an appropriate purge policy as I wrote earlier. If this happens then we may remove LogUtils.unregisterLoggingContext() altogether. If the leak is solved then the next question is: are the log messages directed to the proper files or not? Depending on the answer we may need to look on how to clear/set the log4j context. > Operation Logging still leaks the log4j Appenders > - > > Key: HIVE-24590 > URL: https://issues.apache.org/jira/browse/HIVE-24590 > Project: Hive > Issue Type: Bug > Components: Logging >Reporter: Eugene Chung >Assignee: Stamatis Zampetakis >Priority: Major > Attachments: Screen Shot 2021-01-06 at 18.42.05.png, Screen Shot > 2021-01-06 at 18.42.24.png, Screen Shot 2021-01-06 at 18.42.55.png, Screen > Shot 2021-01-06 at 21.38.32.png, Screen Shot 2021-01-06 at 21.47.28.png, > add_debug_log_and_trace.patch > > > I'm using Hive 3.1.2 with options below. > * hive.server2.logging.operation.enabled=true > * hive.server2.logging.operation.level=VERBOSE > * hive.async.log.enabled=false > I already know the ticket, https://issues.apache.org/jira/browse/HIVE-17128 > but HS2 still leaks log4j RandomAccessFileManager. > !Screen Shot 2021-01-06 at 18.42.05.png|width=756,height=197! > I checked the operation log file which is not closed/deleted properly. > !Screen Shot 2021-01-06 at 18.42.24.png|width=603,height=272! > Then there's the log, > {code:java} > client.TezClient: Shutting down Tez Session, sessionName= {code} > !Screen Shot 2021-01-06 at 18.42.55.png|width=1372,height=26! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (HIVE-24590) Operation Logging still leaks the log4j Appenders
[ https://issues.apache.org/jira/browse/HIVE-24590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17261235#comment-17261235 ]

Eugene Chung edited comment on HIVE-24590 at 1/8/21, 11:29 AM:

I've been digging into this for days and have found that the MDC is not cleared. Even when I call MDC.clear() in org.apache.hive.service.cli.session.HiveSessionImpl.close(), the MDC context set by LogUtils.registerLoggingContext() still exists in org.apache.hadoop.hive.ql.log.HushableRandomAccessFileAppender.createAppender(). Following the code related to the slf4j MDC, it actually calls Log4jMDCAdapter and ultimately accesses the log4j ThreadContext, which is basically stack-based. I don't know exactly how the log4j ThreadContext works yet, but the log4j MDC stacks at HiveSessionImpl.close() and at HushableRandomAccessFileAppender.createAppender() seem to be different. When I call the log4j ThreadContext.clearAll() instead of MDC.clear() (= ThreadContext.clearMap()), HushableRandomAccessFileAppender.createAppender() is no longer called when the session is closed.
[jira] [Commented] (HIVE-24590) Operation Logging still leaks the log4j Appenders
[ https://issues.apache.org/jira/browse/HIVE-24590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17261248#comment-17261248 ]

Eugene Chung commented on HIVE-24590:

I think:
* we can set disableThreadContextStack to true if the MDC does not need to be stack-based ([https://logging.apache.org/log4j/2.x/manual/thread-context.html]);
* we can call ThreadContext.clearAll() instead of MDC.clear() in LogUtils.unregisterLoggingContext().
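The map-vs-stack distinction behind these two suggestions can be illustrated with a small self-contained sketch. This is a simplified model, not the real log4j classes: log4j's ThreadContext keeps both a per-thread map and a per-thread stack, and slf4j's MDC.clear() corresponds to clearMap() only, so the stack survives it.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Simplified model of log4j's ThreadContext: a per-thread map plus a
// per-thread stack. slf4j's MDC.clear() empties only the map, which is
// why state can still be visible afterwards via the stack.
public class ThreadContextModel {
    private static final ThreadLocal<Map<String, String>> map =
        ThreadLocal.withInitial(HashMap::new);
    private static final ThreadLocal<Deque<String>> stack =
        ThreadLocal.withInitial(ArrayDeque::new);

    static void put(String key, String value) { map.get().put(key, value); }
    static void push(String message) { stack.get().push(message); }

    // What slf4j MDC.clear() amounts to: only the map is emptied.
    static void clearMap() { map.get().clear(); }

    // What log4j ThreadContext.clearAll() does: map and stack both emptied.
    static void clearAll() { clearMap(); stack.get().clear(); }

    public static void main(String[] args) {
        put("queryId", "q1");                 // hypothetical context entry
        push("operation-log-context");        // hypothetical stack entry
        clearMap();
        System.out.println("after clearMap, stack empty: " + stack.get().isEmpty());
        clearAll();
        System.out.println("after clearAll, stack empty: " + stack.get().isEmpty());
    }
}
```

In this model, clearMap() leaves "operation-log-context" on the stack, while clearAll() removes it, matching the observation that only ThreadContext.clearAll() stops the stale context from reappearing.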
[jira] [Commented] (HIVE-24590) Operation Logging still leaks the log4j Appenders
[ https://issues.apache.org/jira/browse/HIVE-24590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17261235#comment-17261235 ]

Eugene Chung commented on HIVE-24590:

I've been digging into this for days and have found that the MDC is not cleared. Even when I call MDC.clear() in org.apache.hive.service.cli.session.HiveSessionImpl.close(), the MDC context set by LogUtils.registerLoggingContext() still exists in org.apache.hadoop.hive.ql.log.HushableRandomAccessFileAppender.createAppender(). Following the code related to the slf4j MDC, it actually calls Log4jMDCAdapter and finally uses the log4j ThreadContext, which is basically stack-based. I don't know exactly how the log4j ThreadContext works yet, but the log4j MDC stacks at HiveSessionImpl.close() and at HushableRandomAccessFileAppender.createAppender() seem to be different. When I call the log4j ThreadContext.clearAll() instead of MDC.clear() (= ThreadContext.clearMap()), HushableRandomAccessFileAppender.createAppender() is no longer called when the session is closed.
[jira] [Work logged] (HIVE-15820) comment at the head of beeline -e
[ https://issues.apache.org/jira/browse/HIVE-15820?focusedWorklogId=532906=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-532906 ]

ASF GitHub Bot logged work on HIVE-15820:

Author: ASF GitHub Bot
Created on: 08/Jan/21 08:44
Start Date: 08/Jan/21 08:44
Worklog Time Spent: 10m
Work Description: kgyrtkirk merged pull request #1814:
URL: https://github.com/apache/hive/pull/1814

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
-------------------
Worklog Id: (was: 532906)
Time Spent: 1h (was: 50m)

> comment at the head of beeline -e
> ---------------------------------
>
>                 Key: HIVE-15820
>                 URL: https://issues.apache.org/jira/browse/HIVE-15820
>             Project: Hive
>          Issue Type: Bug
>          Components: Beeline
>    Affects Versions: 1.2.1, 2.1.1
>            Reporter: muxin
>            Assignee: Robbie Zhang
>            Priority: Major
>              Labels: patch, pull-request-available
>         Attachments: HIVE-15820.patch
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> $ beeline -u jdbc:hive2://localhost:1 -n test -e "
> > --asdfasdfasdfasdf
> > select * from test_table;
> > "
> The expected result of the above command is all rows of test_table (the same as when run in beeline interactive mode), but it does not output anything.
> The cause is that the -e option reads the commands as one string, and the method dispatch(String line) first calls isComment(String line), which uses
> 'lineTrimmed.startsWith("#") || lineTrimmed.startsWith("--")'
> to treat the whole command string as a comment.
> Two ways can be considered to fix this problem:
> 1. in the method initArgs(String[] args), split the command by '\n' into a command list before dispatch when cl.getOptionValues('e') != null
> 2. in the method dispatch(String line), remove comments using this:
> {code:java}
> static String removeComments(String line) {
>   if (line == null || line.isEmpty()) {
>     return line;
>   }
>   StringBuilder builder = new StringBuilder();
>   int escape = -1;
>   for (int index = 0; index < line.length(); index++) {
>     if (index < line.length() - 1 && line.charAt(index) == line.charAt(index + 1)) {
>       if (escape == -1 && line.charAt(index) == '-') {
>         // find \n as the end of comment
>         index = line.indexOf('\n', index + 1);
>         // there is no sql after this comment, so just break out
>         if (-1 == index) {
>           break;
>         }
>       }
>     }
>     char letter = line.charAt(index);
>     if (letter == escape) {
>       escape = -1; // Turn escape off.
>     } else if (escape == -1 && (letter == '\'' || letter == '"')) {
>       escape = letter; // Turn escape on.
>     }
>     builder.append(letter);
>   }
>   return builder.toString();
> }
> {code}
> The second way can be a general solution to remove all comments starting with '--' in a SQL statement.
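The behavior of the proposed removeComments() can be checked with a standalone sketch. The class name and test inputs below are illustrative; the method body is taken from the description above:

```java
// Standalone demo of the removeComments() proposal from the issue
// description: a leading "--" comment is dropped up to its terminating
// newline, while "--" inside a quoted literal is preserved.
public class RemoveCommentsDemo {

    static String removeComments(String line) {
        if (line == null || line.isEmpty()) {
            return line;
        }
        StringBuilder builder = new StringBuilder();
        int escape = -1;
        for (int index = 0; index < line.length(); index++) {
            if (index < line.length() - 1 && line.charAt(index) == line.charAt(index + 1)) {
                if (escape == -1 && line.charAt(index) == '-') {
                    // Find \n as the end of the comment.
                    index = line.indexOf('\n', index + 1);
                    // There is no SQL after this comment, so just break out.
                    if (-1 == index) {
                        break;
                    }
                }
            }
            char letter = line.charAt(index);
            if (letter == escape) {
                escape = -1; // Turn escape off.
            } else if (escape == -1 && (letter == '\'' || letter == '"')) {
                escape = letter; // Turn escape on.
            }
            builder.append(letter);
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        // The failing case from this issue: a comment line before the statement.
        System.out.println(removeComments("--asdfasdfasdfasdf\nselect * from test_table;").trim());
        // "--" inside a string literal must survive.
        System.out.println(removeComments("select '--not a comment' from t;"));
    }
}
```

The first call strips the whole comment up to the newline, so the statement that follows is no longer swallowed by isComment(); the second shows why the quote-tracking escape state is needed.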
[jira] [Updated] (HIVE-15820) comment at the head of beeline -e
[ https://issues.apache.org/jira/browse/HIVE-15820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zoltan Haindrich updated HIVE-15820:
    Fix Version/s: 4.0.0
       Resolution: Fixed
           Status: Resolved (was: Patch Available)

merged into master. Thank you [~robbiezhang] for fixing this and Vihang for reviewing the changes!