[jira] [Updated] (HIVE-25146) JMH tests for Multi HT and parallel load

2022-03-14 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis updated HIVE-25146:
--
Fix Version/s: 4.0.0

> JMH tests for Multi HT and parallel load
> 
>
> Key: HIVE-25146
> URL: https://issues.apache.org/jira/browse/HIVE-25146
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> JMH tests for parallel HT load. Configuration parameters include 
> LOAD_THREADS_NUM, ROWS_NUM and JOIN_TYPE.
> A single thread simulates the default load behaviour; runs with ROWS_NUM < 1M 
> default to a single thread for simplicity.
> A thread count >= 2 evaluates the benefit of loading the HT in parallel.
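
The thread-selection rule in this description can be sketched in plain Java. 
This is only an illustration under stated assumptions: the class 
LoadThreadPolicy, the method effectiveLoadThreads and the constant name are 
hypothetical and not taken from the Hive benchmark; only the 1M-row threshold 
comes from the description.

```java
/** Hypothetical helper; the names are illustrative, not from the Hive code base. */
final class LoadThreadPolicy {
    /** Row count below which the load stays single-threaded (1M, per the description). */
    static final long PARALLEL_ROWS_THRESHOLD = 1_000_000L;

    private LoadThreadPolicy() {}

    /**
     * Returns the number of loader threads to use for a run.
     * Small inputs always use one thread, regardless of the requested value.
     */
    static int effectiveLoadThreads(int requestedThreads, long rowsNum) {
        if (requestedThreads < 1) {
            throw new IllegalArgumentException("requestedThreads must be >= 1");
        }
        return rowsNum < PARALLEL_ROWS_THRESHOLD ? 1 : requestedThreads;
    }
}
```

With this rule, a run requesting 4 threads over 500,000 rows still loads with 
one thread, while the same request over 2,000,000 rows uses all 4.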



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Resolved] (HIVE-25146) JMH tests for Multi HT and parallel load

2022-03-14 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis resolved HIVE-25146.
---
Resolution: Fixed

> JMH tests for Multi HT and parallel load
> 
>
> Key: HIVE-25146
> URL: https://issues.apache.org/jira/browse/HIVE-25146
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> JMH tests for parallel HT load. Configuration parameters include 
> LOAD_THREADS_NUM, ROWS_NUM and JOIN_TYPE.
> A single thread simulates the default load behaviour; runs with ROWS_NUM < 1M 
> default to a single thread for simplicity.
> A thread count >= 2 evaluates the benefit of loading the HT in parallel.





[jira] [Updated] (HIVE-25146) JMH tests for Multi HT and parallel load

2022-03-14 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis updated HIVE-25146:
--
Description: 
JMH tests for parallel HT load. Configuration parameters include 
LOAD_THREADS_NUM, ROWS_NUM and JOIN_TYPE.
A single thread simulates the default load behaviour; runs with ROWS_NUM < 1M 
default to a single thread for simplicity.
A thread count >= 2 evaluates the benefit of loading the HT in parallel.

  was:As the title suggests, add some benchmarks for Parallel HT construction 
feature


> JMH tests for Multi HT and parallel load
> 
>
> Key: HIVE-25146
> URL: https://issues.apache.org/jira/browse/HIVE-25146
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> JMH tests for parallel HT load. Configuration parameters include 
> LOAD_THREADS_NUM, ROWS_NUM and JOIN_TYPE.
> A single thread simulates the default load behaviour; runs with ROWS_NUM < 1M 
> default to a single thread for simplicity.
> A thread count >= 2 evaluates the benefit of loading the HT in parallel.





[jira] [Work started] (HIVE-25146) JMH tests for Multi HT and parallel load

2022-02-20 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-25146 started by Panagiotis Garefalakis.
-
> JMH tests for Multi HT and parallel load
> 
>
> Key: HIVE-25146
> URL: https://issues.apache.org/jira/browse/HIVE-25146
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> As the title suggests, add some benchmarks for Parallel HT construction 
> feature





[jira] [Updated] (HIVE-25149) Support parallel load for Fast HT implementations

2022-02-18 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis updated HIVE-25149:
--
Fix Version/s: 4.0.0

> Support parallel load for Fast HT implementations
> -
>
> Key: HIVE-25149
> URL: https://issues.apache.org/jira/browse/HIVE-25149
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-25583) Support parallel load for HastTables - Interfaces

2022-02-18 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis updated HIVE-25583:
--
Fix Version/s: 4.0.0

> Support parallel load for HastTables - Interfaces
> -
>
> Key: HIVE-25583
> URL: https://issues.apache.org/jira/browse/HIVE-25583
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Support parallel load for HastTables - Interfaces
> * Introducing VectorMapJoinFastHashTableContainerBase class that implements 
> VectorMapJoinHashTable
> * Each VectorMapJoinFastStringHashMapContainer is a singleton that contains 
> an array of HashTables (1 or more)
> * VectorMapJoinFastTableContainer now initializes 
> VectorMapJoinFastHashTableContainers instead of HTs directly
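
The container idea in the bullets above (one container object that owns an 
array of one or more hash tables) can be sketched as follows. This is a 
simplified stand-in, not the Hive classes: PartitionedHashTableContainer and 
its methods are hypothetical names, java.util.HashMap replaces the vectorized 
fast hash tables, and routing a key by its hash code is an assumption about 
how keys could be spread across the tables.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical container that spreads entries over one or more hash tables. */
final class PartitionedHashTableContainer {
    private final List<Map<String, Long>> tables = new ArrayList<>();

    PartitionedHashTableContainer(int numTables) {
        if (numTables < 1) {
            throw new IllegalArgumentException("numTables must be >= 1");
        }
        for (int i = 0; i < numTables; i++) {
            tables.add(new HashMap<>());
        }
    }

    /** Assumed routing rule: the key's hash code selects the table. */
    private int partitionFor(String key) {
        return Math.floorMod(key.hashCode(), tables.size());
    }

    void put(String key, long value) {
        tables.get(partitionFor(key)).put(key, value);
    }

    /** Returns the stored value, or null when the key is absent. */
    Long get(String key) {
        return tables.get(partitionFor(key)).get(key);
    }

    /** Total number of entries across all tables. */
    int size() {
        int total = 0;
        for (Map<String, Long> table : tables) {
            total += table.size();
        }
        return total;
    }

    int numTables() {
        return tables.size();
    }
}
```

Because a key always maps to the same table, lookups only touch one table. If 
each loader thread were given exclusive ownership of one table, the tables 
would not need to be shared during the load; that ownership scheme is an 
assumption here, and this sketch itself is not thread-safe.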





[jira] [Resolved] (HIVE-25149) Support parallel load for Fast HT implementations

2022-02-18 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis resolved HIVE-25149.
---
Resolution: Fixed

> Support parallel load for Fast HT implementations
> -
>
> Key: HIVE-25149
> URL: https://issues.apache.org/jira/browse/HIVE-25149
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>






[jira] [Commented] (HIVE-25149) Support parallel load for Fast HT implementations

2022-02-18 Thread Panagiotis Garefalakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17494785#comment-17494785
 ] 

Panagiotis Garefalakis commented on HIVE-25149:
---

Resolved as part of [https://github.com/apache/hive/pull/3029]

Thanks [~rameshkumar] for the review!

> Support parallel load for Fast HT implementations
> -
>
> Key: HIVE-25149
> URL: https://issues.apache.org/jira/browse/HIVE-25149
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h
>  Remaining Estimate: 0h
>






[jira] [Work started] (HIVE-25583) Support parallel load for HastTables - Interfaces

2022-02-10 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-25583 started by Panagiotis Garefalakis.
-
> Support parallel load for HastTables - Interfaces
> -
>
> Key: HIVE-25583
> URL: https://issues.apache.org/jira/browse/HIVE-25583
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Support parallel load for HastTables - Interfaces
> * Introducing VectorMapJoinFastHashTableContainerBase class that implements 
> VectorMapJoinHashTable
> * Each VectorMapJoinFastStringHashMapContainer is a singleton that contains 
> an array of HashTables (1 or more)
> * VectorMapJoinFastTableContainer now initializes 
> VectorMapJoinFastHashTableContainers instead of HTs directly





[jira] [Commented] (HIVE-25583) Support parallel load for HastTables - Interfaces

2022-02-10 Thread Panagiotis Garefalakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17490416#comment-17490416
 ] 

Panagiotis Garefalakis commented on HIVE-25583:
---

Resolved via [https://github.com/apache/hive/pull/2999] 
Thanks [~rameshkumar]  for the review! 

> Support parallel load for HastTables - Interfaces
> -
>
> Key: HIVE-25583
> URL: https://issues.apache.org/jira/browse/HIVE-25583
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Support parallel load for HastTables - Interfaces
> * Introducing VectorMapJoinFastHashTableContainerBase class that implements 
> VectorMapJoinHashTable
> * Each VectorMapJoinFastStringHashMapContainer is a singleton that contains 
> an array of HashTables (1 or more)
> * VectorMapJoinFastTableContainer now initializes 
> VectorMapJoinFastHashTableContainers instead of HTs directly





[jira] [Resolved] (HIVE-25583) Support parallel load for HastTables - Interfaces

2022-02-10 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis resolved HIVE-25583.
---
Resolution: Fixed

> Support parallel load for HastTables - Interfaces
> -
>
> Key: HIVE-25583
> URL: https://issues.apache.org/jira/browse/HIVE-25583
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Support parallel load for HastTables - Interfaces
> * Introducing VectorMapJoinFastHashTableContainerBase class that implements 
> VectorMapJoinHashTable
> * Each VectorMapJoinFastStringHashMapContainer is a singleton that contains 
> an array of HashTables (1 or more)
> * VectorMapJoinFastTableContainer now initializes 
> VectorMapJoinFastHashTableContainers instead of HTs directly





[jira] [Updated] (HIVE-25583) Support parallel load for HastTables - Interfaces

2022-02-03 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis updated HIVE-25583:
--
Description: 
Support parallel load for HashTables - Interfaces
* Introducing VectorMapJoinFastHashTableContainerBase class that implements 
VectorMapJoinHashTable
* Each VectorMapJoinFastStringHashMapContainer is a singleton that contains an 
array of HashTables (1 or more)
* VectorMapJoinFastTableContainer now initializes 
VectorMapJoinFastHashTableContainers instead of HTs directly

> Support parallel load for HastTables - Interfaces
> -
>
> Key: HIVE-25583
> URL: https://issues.apache.org/jira/browse/HIVE-25583
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>
> Support parallel load for HastTables - Interfaces
> * Introducing VectorMapJoinFastHashTableContainerBase class that implements 
> VectorMapJoinHashTable
> * Each VectorMapJoinFastStringHashMapContainer is a singleton that contains 
> an array of HashTables (1 or more)
> * VectorMapJoinFastTableContainer now initializes 
> VectorMapJoinFastHashTableContainers instead of HTs directly





[jira] [Assigned] (HIVE-25583) Support parallel load for HastTables - Interfaces

2022-02-03 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis reassigned HIVE-25583:
-

Assignee: Panagiotis Garefalakis  (was: Ramesh Kumar Thangarajan)

> Support parallel load for HastTables - Interfaces
> -
>
> Key: HIVE-25583
> URL: https://issues.apache.org/jira/browse/HIVE-25583
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Panagiotis Garefalakis
>Priority: Major
>
> Support parallel load for HastTables - Interfaces
> * Introducing VectorMapJoinFastHashTableContainerBase class that implements 
> VectorMapJoinHashTable
> * Each VectorMapJoinFastStringHashMapContainer is a singleton that contains 
> an array of HashTables (1 or more)
> * VectorMapJoinFastTableContainer now initializes 
> VectorMapJoinFastHashTableContainers instead of HTs directly





[jira] [Resolved] (HIVE-25828) Remove unused import and method in ParseUtils

2022-01-20 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis resolved HIVE-25828.
---
Resolution: Fixed

> Remove unused import and method in ParseUtils
> -
>
> Key: HIVE-25828
> URL: https://issues.apache.org/jira/browse/HIVE-25828
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: zhangbutao
>Assignee: zhangbutao
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> (1) Remove unused import
> (2) Remove unused method _sameTree(ASTNode node, ASTNode otherNode)_





[jira] [Commented] (HIVE-25828) Remove unused import and method in ParseUtils

2022-01-20 Thread Panagiotis Garefalakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17479290#comment-17479290
 ] 

Panagiotis Garefalakis commented on HIVE-25828:
---

Resolved via [https://github.com/apache/hive/pull/2900]
Thanks [~zhangbutao] for the patch!

> Remove unused import and method in ParseUtils
> -
>
> Key: HIVE-25828
> URL: https://issues.apache.org/jira/browse/HIVE-25828
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: zhangbutao
>Assignee: zhangbutao
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> (1) Remove unused import
> (2) Remove unused method _sameTree(ASTNode node, ASTNode otherNode)_





[jira] [Updated] (HIVE-25828) Remove unused import and method in ParseUtils

2022-01-20 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis updated HIVE-25828:
--
Affects Version/s: 4.0.0

> Remove unused import and method in ParseUtils
> -
>
> Key: HIVE-25828
> URL: https://issues.apache.org/jira/browse/HIVE-25828
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: zhangbutao
>Assignee: zhangbutao
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> (1) Remove unused import
> (2) Remove unused method _sameTree(ASTNode node, ASTNode otherNode)_





[jira] [Assigned] (HIVE-25145) Improve Multi-HashTable EstimatedMemorySize

2022-01-20 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis reassigned HIVE-25145:
-

Assignee: Panagiotis Garefalakis

> Improve Multi-HashTable EstimatedMemorySize
> ---
>
> Key: HIVE-25145
> URL: https://issues.apache.org/jira/browse/HIVE-25145
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>
> When a Multi HashTable is used for parallel HT loading, we calculate the 
> estimatedMemorySize as the sum of the estimates of all HTs.
> However, each of those HTs already adds some constants to its memory estimate, 
> e.g., a constant 16KB for keyBinarySortableDeserializeRead.
> This ticket aims to improve the memory estimation for Multi HT.
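
The over-counting described above can be made concrete with a small sketch. 
Everything here is an assumption for illustration: the class and method names 
are hypothetical, each HT estimate is modelled as "variable bytes + one 16KB 
constant", and the adjusted formula assumes the constant part only needs to be 
counted once for the whole Multi HT. The ticket does not state which 
adjustment was chosen.

```java
/** Hypothetical model of the Multi HT memory estimate; not the Hive implementation. */
final class MultiHashTableMemoryEstimate {
    /** Constant overhead each HT adds to its own estimate (16KB, per the description). */
    static final long CONSTANT_BYTES = 16L * 1024;

    private MultiHashTableMemoryEstimate() {}

    /** Sum of per-HT estimates: the constant is counted once per HT. */
    static long naiveEstimate(long[] perTableVariableBytes) {
        long total = 0;
        for (long variable : perTableVariableBytes) {
            total += variable + CONSTANT_BYTES;
        }
        return total;
    }

    /** Assumed adjustment: sum the variable parts and count the constant once. */
    static long adjustedEstimate(long[] perTableVariableBytes) {
        long total = CONSTANT_BYTES;
        for (long variable : perTableVariableBytes) {
            total += variable;
        }
        return total;
    }
}
```

For three HTs with variable parts of 1000, 2000 and 3000 bytes, the naive sum 
is 55152 bytes and the adjusted value is 22384 bytes; the gap grows by 16KB 
for every additional HT.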





[jira] [Assigned] (HIVE-25148) Support parallel load for Optimized HT implementations

2022-01-20 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis reassigned HIVE-25148:
-

Assignee: Panagiotis Garefalakis

> Support parallel load for Optimized HT implementations
> --
>
> Key: HIVE-25148
> URL: https://issues.apache.org/jira/browse/HIVE-25148
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>






[jira] [Assigned] (HIVE-25146) JMH tests for Multi HT and parallel load

2022-01-20 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis reassigned HIVE-25146:
-

Assignee: Panagiotis Garefalakis

> JMH tests for Multi HT and parallel load
> 
>
> Key: HIVE-25146
> URL: https://issues.apache.org/jira/browse/HIVE-25146
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>
> As the title suggests, add some benchmarks for Parallel HT construction 
> feature





[jira] [Assigned] (HIVE-25149) Support parallel load for Fast HT implementations

2022-01-20 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis reassigned HIVE-25149:
-

Assignee: Panagiotis Garefalakis

> Support parallel load for Fast HT implementations
> -
>
> Key: HIVE-25149
> URL: https://issues.apache.org/jira/browse/HIVE-25149
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h
>  Remaining Estimate: 0h
>






[jira] [Assigned] (HIVE-25736) Close ORC readers

2022-01-10 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis reassigned HIVE-25736:
-

Assignee: Peter Vary

> Close ORC readers
> -
>
> Key: HIVE-25736
> URL: https://issues.apache.org/jira/browse/HIVE-25736
> Project: Hive
>  Issue Type: Bug
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> After ORC-498, ORC readers should be closed explicitly. One such case 
> was HIVE-25683, but there are several places where the ORC readers are still 
> not closed. 
> We should go through the code and make sure that the readers are closed.
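
The usual Java idiom for "make sure the reader is closed" is 
try-with-resources. The sketch below uses a hypothetical TrackingReader as a 
stand-in so that it runs without the ORC library; it is not the ORC Reader 
API, and the row count it returns is a made-up value.

```java
/** Illustrates closing a reader on every code path; TrackingReader is a stand-in. */
final class ReaderCloseSketch {

    /** Hypothetical reader that records whether close() was called. */
    static final class TrackingReader implements AutoCloseable {
        private boolean closed;

        long rowCount() {
            if (closed) {
                throw new IllegalStateException("reader already closed");
            }
            return 42L; // made-up value for the sketch
        }

        @Override
        public void close() {
            closed = true;
        }

        boolean isClosed() {
            return closed;
        }
    }

    private ReaderCloseSketch() {}

    /** Reads the row count and closes the reader, even if rowCount() throws. */
    static long countRows(TrackingReader reader) {
        try (TrackingReader r = reader) {
            return r.rowCount();
        }
    }
}
```

The same shape applies to any reader type that implements AutoCloseable or 
Closeable: the resource is released when the block exits, whether normally or 
through an exception.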





[jira] [Updated] (HIVE-25541) JsonSerDe: TBLPROPERTY treating nested json as String

2021-12-21 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis updated HIVE-25541:
--
Description: 
The native JsonSerDe 'org.apache.hive.hcatalog.data.JsonSerDe' currently does not 
support loading nested JSON into a string type directly. It requires 
declaring the column as a complex type (struct, map, array) to unpack nested JSON 
data.

Even though the data field is not a valid JSON string type, there is value in 
treating it as a plain string instead of throwing an exception as we currently do.

{code:java}
create table json_table(data string, messageid string, publish_time bigint, 
attributes string);

{"data":{"H":{"event":"track_active","platform":"Android"},"B":{"device_type":"Phone","uuid":"[36ffec24-f6a4-4f5d-aa39-72e5513d2cae,11883bee-a7aa-4010-8a66-6c3c63a73f16]"}},"messageId":"2475185636801962","publish_time":1622514629783,"attributes":{"region":"IN"}}
{code}

This JIRA introduces an extra table property that allows complex JSON values to 
be stringified instead of forcing the user to define the complete nested structure.

  was:
The native Jsonserde 'org.apache.hive.hcatalog.data.JsonSerDe' currently does not 
support loading nested json directly into a string type. It requires declaring 
the column as a complex type (struct, map, array) to unpack the nested json data.

Even if the data field is not a valid JSON string type, it can be treated as a 
plain string instead of throwing an exception as we currently do.
{code:java}
create table json_table(data string, messageid string, publish_time bigint, 
attributes string);

{"data":{"H":{"event":"track_active","platform":"Android"},"B":{"device_type":"Phone","uuid":"[36ffec24-f6a4-4f5d-aa39-72e5513d2cae,11883bee-a7aa-4010-8a66-6c3c63a73f16]"}},"messageId":"2475185636801962","publish_time":1622514629783,"attributes":{"region":"IN"}}
{code}

This JIRA introduces an extra table property that allows complex JSON values to 
be stringified, instead of forcing the user to define the complete nested 
structure


> JsonSerDe: TBLPROPERTY treating nested json as String
> -
>
> Key: HIVE-25541
> URL: https://issues.apache.org/jira/browse/HIVE-25541
> Project: Hive
>  Issue Type: Bug
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> The native JsonSerDe 'org.apache.hive.hcatalog.data.JsonSerDe' currently does not 
> support loading nested JSON into a string type directly. It requires 
> declaring the column as a complex type (struct, map, array) to unpack nested 
> JSON data.
> Even though the data field is not a valid JSON string type, there is value in 
> treating it as a plain string instead of throwing an exception as we currently 
> do.
> {code:java}
> create table json_table(data string, messageid string, publish_time bigint, 
> attributes string);
> {"data":{"H":{"event":"track_active","platform":"Android"},"B":{"device_type":"Phone","uuid":"[36ffec24-f6a4-4f5d-aa39-72e5513d2cae,11883bee-a7aa-4010-8a66-6c3c63a73f16]"}},"messageId":"2475185636801962","publish_time":1622514629783,"attributes":{"region":"IN"}}
> {code}
> This JIRA introduces an extra table property that allows complex JSON values 
> to be stringified instead of forcing the user to define the complete nested 
> structure.
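
To illustrate what "stringify" means for a nested value, the sketch below 
returns the raw text of a nested JSON object or array exactly as it appears in 
the input, so it could be stored in a string column. It is a hand-rolled 
bracket matcher written for this note under stated assumptions: the class 
RawJsonValue and the method rawValueAt are hypothetical, the input is assumed 
to be well-formed JSON, and this is not how JsonSerDe is implemented.

```java
/** Hypothetical helper that copies a nested JSON value verbatim as a string. */
final class RawJsonValue {
    private RawJsonValue() {}

    /**
     * Returns the raw text of the object or array that starts at index start.
     * Brackets inside string literals are ignored, including after escapes.
     */
    static String rawValueAt(String json, int start) {
        char first = json.charAt(start);
        if (first != '{' && first != '[') {
            throw new IllegalArgumentException("value at " + start + " is not an object or array");
        }
        int depth = 0;
        boolean inString = false;
        for (int i = start; i < json.length(); i++) {
            char c = json.charAt(i);
            if (inString) {
                if (c == '\\') {
                    i++; // skip the escaped character
                } else if (c == '"') {
                    inString = false;
                }
            } else if (c == '"') {
                inString = true;
            } else if (c == '{' || c == '[') {
                depth++;
            } else if (c == '}' || c == ']') {
                depth--;
                if (depth == 0) {
                    return json.substring(start, i + 1);
                }
            }
        }
        throw new IllegalArgumentException("unbalanced JSON value");
    }
}
```

Applied to the attributes field of the sample row, the helper would yield the 
text {"region":"IN"} as a plain string rather than requiring a struct or map 
column.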





[jira] [Updated] (HIVE-25497) Bump ORC to 1.7.1

2021-12-07 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis updated HIVE-25497:
--
Summary: Bump ORC to 1.7.1  (was: Bump ORC to 1.7.0)

> Bump ORC to 1.7.1
> -
>
> Key: HIVE-25497
> URL: https://issues.apache.org/jira/browse/HIVE-25497
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: William Hyun
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>






[jira] [Assigned] (HIVE-25497) Bump ORC to 1.7.1

2021-12-07 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis reassigned HIVE-25497:
-

Assignee: Panagiotis Garefalakis

> Bump ORC to 1.7.1
> -
>
> Key: HIVE-25497
> URL: https://issues.apache.org/jira/browse/HIVE-25497
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: William Hyun
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>






[jira] [Commented] (HIVE-25765) skip.header.line.count property skips rows of each block in FetchOperator when file size is larger

2021-12-03 Thread Panagiotis Garefalakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17453204#comment-17453204
 ] 

Panagiotis Garefalakis commented on HIVE-25765:
---

Hey [~ganeshas]  – thanks for reporting this! 
Is this bug also visible in the latest master branch?

> skip.header.line.count property skips rows of each block in FetchOperator 
> when file size is larger
> --
>
> Key: HIVE-25765
> URL: https://issues.apache.org/jira/browse/HIVE-25765
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.2
>Reporter: Ganesha Shreedhara
>Assignee: Ganesha Shreedhara
>Priority: Major
>  Labels: pull-request-available
> Attachments: data.txt.gz
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When the _skip.header.line.count_ property is set in the table properties, simple 
> select queries that get converted into a FetchTask skip rows in each block 
> instead of skipping only the header lines of each file. This happens when the 
> file is large enough to be read in multiple blocks. The issue does not occur 
> when the select query is converted into a map-only job by setting 
> _hive.fetch.task.conversion_ to _none_, because the header lines are then skipped 
> only for the first block because of [this 
> check|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/HiveContextAwareRecordReader.java#L330].
>  We should have a similar check in FetchOperator to avoid this issue. 
>  
> *Steps to reproduce:* 
> {code:java}
> -- Create table on top of the data file (uncompressed size: ~239M) attached 
> in this ticket
> CREATE EXTERNAL TABLE test_table(
>   col1 string,
>   col2 string,
>   col3 string,
>   col4 string,
>   col5 string,
>   col6 string,
>   col7 string,
>   col8 string,
>   col9 string,
>   col10 string,
>   col11 string,
>   col12 string)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.mapred.TextInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
> LOCATION
>   'location_of_data_file'
> TBLPROPERTIES ('skip.header.line.count'='1');
> -- Counting number of rows gives correct result with only one header line 
> skipped
> select count(*) from test_table;
> 3145727
> -- Select query skips more rows and the result depends upon the number of 
> blocks configured in underlying filesystem. 3 rows are skipped when the file 
> is read in 3 blocks. 
> select * from test_table;
> .
> .
> Fetched 3145724 rows
>  {code}
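
The intended behaviour ("skip header lines once per file, not once per block") 
can be sketched as a small decision function. This is an illustration only: 
HeaderSkipPolicy and headerLinesToSkip are hypothetical names, and the rule 
"only the split that starts at byte offset 0 skips headers" is an assumption 
about what such a check amounts to, not a copy of the Hive code.

```java
/** Hypothetical policy: only the first split of a file skips header lines. */
final class HeaderSkipPolicy {
    private HeaderSkipPolicy() {}

    /**
     * Returns how many leading lines a reader should skip for a split.
     *
     * @param splitStartOffset byte offset of the split within the file
     * @param headerCount      value of skip.header.line.count for the table
     */
    static int headerLinesToSkip(long splitStartOffset, int headerCount) {
        if (headerCount < 0) {
            throw new IllegalArgumentException("headerCount must be >= 0");
        }
        // Later splits start in the middle of the file and contain no header.
        return splitStartOffset == 0 ? headerCount : 0;
    }

    /** Total lines skipped across all splits of one file under this policy. */
    static int totalSkipped(long[] splitStartOffsets, int headerCount) {
        int skipped = 0;
        for (long offset : splitStartOffsets) {
            skipped += headerLinesToSkip(offset, headerCount);
        }
        return skipped;
    }
}
```

Under this rule a file read in three splits with a one-line header loses 
exactly one line in total, whereas skipping per split would lose one line in 
every split.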





[jira] [Comment Edited] (HIVE-25765) skip.header.line.count property skips rows of each block in FetchOperator when file size is larger

2021-12-03 Thread Panagiotis Garefalakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17453204#comment-17453204
 ] 

Panagiotis Garefalakis edited comment on HIVE-25765 at 12/3/21, 8:51 PM:
-

Hey [~ganeshas]  – thanks for reporting this! 
Is this also reproducible in the latest master branch?


was (Author: pgaref):
Hey [~ganeshas]  – thanks for reporting this! 
Is this bug also visible in the latest master branch?

> skip.header.line.count property skips rows of each block in FetchOperator 
> when file size is larger
> --
>
> Key: HIVE-25765
> URL: https://issues.apache.org/jira/browse/HIVE-25765
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.2
>Reporter: Ganesha Shreedhara
>Assignee: Ganesha Shreedhara
>Priority: Major
>  Labels: pull-request-available
> Attachments: data.txt.gz
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When the _skip.header.line.count_ property is set in the table properties, simple 
> select queries that get converted into a FetchTask skip rows in each block 
> instead of skipping only the header lines of each file. This happens when the 
> file is large enough to be read in multiple blocks. The issue does not occur 
> when the select query is converted into a map-only job by setting 
> _hive.fetch.task.conversion_ to _none_, because the header lines are then skipped 
> only for the first block because of [this 
> check|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/HiveContextAwareRecordReader.java#L330].
>  We should have a similar check in FetchOperator to avoid this issue. 
>  
> *Steps to reproduce:* 
> {code:java}
> -- Create table on top of the data file (uncompressed size: ~239M) attached 
> in this ticket
> CREATE EXTERNAL TABLE test_table(
>   col1 string,
>   col2 string,
>   col3 string,
>   col4 string,
>   col5 string,
>   col6 string,
>   col7 string,
>   col8 string,
>   col9 string,
>   col10 string,
>   col11 string,
>   col12 string)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.mapred.TextInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
> LOCATION
>   'location_of_data_file'
> TBLPROPERTIES ('skip.header.line.count'='1');
> -- Counting number of rows gives correct result with only one header line 
> skipped
> select count(*) from test_table;
> 3145727
> -- Select query skips more rows and the result depends upon the number of 
> blocks configured in underlying filesystem. 3 rows are skipped when the file 
> is read in 3 blocks. 
> select * from test_table;
> .
> .
> Fetched 3145724 rows
>  {code}





[jira] [Commented] (HIVE-25697) Upgrade commons-compress to 1.21

2021-11-15 Thread Panagiotis Garefalakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17444055#comment-17444055
 ] 

Panagiotis Garefalakis commented on HIVE-25697:
---

Thanks for taking care of this [~kgyrtkirk] and for the patch [~rameshkumar]!

> Upgrade commons-compress to 1.21
> 
>
> Key: HIVE-25697
> URL: https://issues.apache.org/jira/browse/HIVE-25697
> Project: Hive
>  Issue Type: Task
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Upgrade commons-compress to 1.21 due to CVEs



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HIVE-25541) JsonSerDe: TBLPROPERTY treating nested json as String

2021-10-21 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis updated HIVE-25541:
--
Fix Version/s: 4.0.0

> JsonSerDe: TBLPROPERTY treating nested json as String
> -
>
> Key: HIVE-25541
> URL: https://issues.apache.org/jira/browse/HIVE-25541
> Project: Hive
>  Issue Type: Bug
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> The native JsonSerDe 'org.apache.hive.hcatalog.data.JsonSerDe' currently does not 
> support loading nested JSON into a string type directly. It requires 
> declaring the column as a complex type (struct, map, array) to unpack nested 
> JSON data.
> Even though the data field is not a valid JSON String type, there is value in 
> treating it as a plain String instead of throwing an exception, as we currently 
> do.
> {code:java}
> create table json_table(data string, messageid string, publish_time bigint, 
> attributes string);
> {"data":{"H":{"event":"track_active","platform":"Android"},"B":{"device_type":"Phone","uuid":"[36ffec24-f6a4-4f5d-aa39-72e5513d2cae,11883bee-a7aa-4010-8a66-6c3c63a73f16]"}},"messageId":"2475185636801962","publish_time":1622514629783,"attributes":{"region":"IN"}}
> {code}
> This JIRA introduces an extra table property that allows complex JSON values 
> to be stringified instead of forcing the user to define the complete nested 
> structure.
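
The intended behaviour can be sketched outside Hive as follows. This is a minimal illustration, not the JsonSerDe implementation: the function `deserialize`, the schema dictionary, and the shortened sample row are all hypothetical, and matching keys case-insensitively is an assumption based on the sample using `messageId` for the column `messageid`. Nested objects and arrays that land in a column declared as `string` are serialized back to JSON text instead of raising an error.

```python
import json


def deserialize(row_json, schema):
    """Map one JSON object onto a {column: type} schema.

    Nested values (objects or arrays) bound to a string column are
    re-serialized as JSON text rather than rejected.
    """
    record = json.loads(row_json)
    # Assumption: keys are matched to column names case-insensitively.
    lowered = {key.lower(): value for key, value in record.items()}
    out = {}
    for column, column_type in schema.items():
        value = lowered.get(column)
        if column_type == "string" and isinstance(value, (dict, list)):
            value = json.dumps(value, separators=(",", ":"))
        out[column] = value
    return out


# Hypothetical, shortened version of the sample row from the description.
sample_row = (
    '{"data":{"H":{"event":"track_active","platform":"Android"}},'
    '"messageId":"2475185636801962","publish_time":1622514629783,'
    '"attributes":{"region":"IN"}}'
)
schema = {
    "data": "string",
    "messageid": "string",
    "publish_time": "bigint",
    "attributes": "string",
}
row = deserialize(sample_row, schema)
```

With this approach the `data` and `attributes` columns hold the raw nested JSON as text, so a consumer can still parse them later if the structure is needed.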



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-25541) JsonSerDe: TBLPROPERTY treating nested json as String

2021-10-21 Thread Panagiotis Garefalakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17432323#comment-17432323
 ] 

Panagiotis Garefalakis commented on HIVE-25541:
---

Resolved via https://github.com/apache/hive/pull/2664

> JsonSerDe: TBLPROPERTY treating nested json as String
> -
>
> Key: HIVE-25541
> URL: https://issues.apache.org/jira/browse/HIVE-25541
> Project: Hive
>  Issue Type: Bug
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> The native JsonSerDe 'org.apache.hive.hcatalog.data.JsonSerDe' currently does not 
> support loading nested JSON into a string type directly. It requires 
> declaring the column as a complex type (struct, map, array) to unpack nested 
> JSON data.
> Even though the data field is not a valid JSON String type, there is value in 
> treating it as a plain String instead of throwing an exception, as we currently 
> do.
> {code:java}
> create table json_table(data string, messageid string, publish_time bigint, 
> attributes string);
> {"data":{"H":{"event":"track_active","platform":"Android"},"B":{"device_type":"Phone","uuid":"[36ffec24-f6a4-4f5d-aa39-72e5513d2cae,11883bee-a7aa-4010-8a66-6c3c63a73f16]"}},"messageId":"2475185636801962","publish_time":1622514629783,"attributes":{"region":"IN"}}
> {code}
> This JIRA introduces an extra table property that allows complex JSON values 
> to be stringified instead of forcing the user to define the complete nested 
> structure.





[jira] [Resolved] (HIVE-25541) JsonSerDe: TBLPROPERTY treating nested json as String

2021-10-21 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis resolved HIVE-25541.
---
Resolution: Fixed

> JsonSerDe: TBLPROPERTY treating nested json as String
> -
>
> Key: HIVE-25541
> URL: https://issues.apache.org/jira/browse/HIVE-25541
> Project: Hive
>  Issue Type: Bug
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> The native JsonSerDe 'org.apache.hive.hcatalog.data.JsonSerDe' currently does not 
> support loading nested JSON into a string type directly. It requires 
> declaring the column as a complex type (struct, map, array) to unpack nested 
> JSON data.
> Even though the data field is not a valid JSON String type, there is value in 
> treating it as a plain String instead of throwing an exception, as we currently 
> do.
> {code:java}
> create table json_table(data string, messageid string, publish_time bigint, 
> attributes string);
> {"data":{"H":{"event":"track_active","platform":"Android"},"B":{"device_type":"Phone","uuid":"[36ffec24-f6a4-4f5d-aa39-72e5513d2cae,11883bee-a7aa-4010-8a66-6c3c63a73f16]"}},"messageId":"2475185636801962","publish_time":1622514629783,"attributes":{"region":"IN"}}
> {code}
> This JIRA introduces an extra table property that allows complex JSON values 
> to be stringified instead of forcing the user to define the complete nested 
> structure.





[jira] [Resolved] (HIVE-25505) Incorrect results with header. skip.header.line.count if first line is blank

2021-10-20 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis resolved HIVE-25505.
---
Resolution: Fixed

> Incorrect results with header. skip.header.line.count if first line is blank
> 
>
> Key: HIVE-25505
> URL: https://issues.apache.org/jira/browse/HIVE-25505
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Steve Carlin
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> A table with skip.header.line.count=1 does not skip the first line if 
> it is blank, except in a fetch task.
> To reproduce, create a CSV table and set skip.header.line.count=1 in 
> table properties.
> In the table location, create a single file, with a blank (empty) first line, 
> and say 2 further lines.
> If you do a select * on it, you see 2 rows (correct)
>  If you do select count(*) on it, you get 3 (incorrect)
> {code:java}
> CREATE EXTERNAL TABLE `testcase1`(id int, name string) ROW FORMAT SERDE 
> 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
>   LOCATION '${system:test.tmp.dir}/testcase1'
>   TBLPROPERTIES ("skip.header.line.count"="1");
> SET hive.fetch.task.conversion = more;
> select * from testcase1;
> select count(*) from testcase1;
> set hive.fetch.task.conversion=none;
> select * from testcase1;
> select count(*) from testcase1;
> Test file:
> 1,2019-12-31
> 2,2019-12-31
> 3,2019-12-31
> Should both yield (with the above test file):
>  A masked pattern was here 
> 1 2019-12-31
> 2 2019-12-31
> 3 2019-12-31
> 3
> {code}
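
The expected semantics can be stated as a short sketch. This is an illustration only, not the Hive reader: `skip_header` shows the behaviour the report asks for (header lines are counted as physical lines, blank or not), while `skip_header_faulty` is a hypothetical variant written purely for contrast, reproducing the symptom of a blank first line not being treated as a header. The file contents are made up for the example.

```python
def skip_header(lines, header_count):
    """Drop the first `header_count` physical lines, whether blank or not."""
    return lines[header_count:]


def skip_header_faulty(lines, header_count):
    """Hypothetical faulty variant: a blank first line is not treated as a header."""
    if lines and lines[0] == "":
        return lines
    return lines[header_count:]


# A file with a blank (empty) first line followed by 2 further lines.
file_lines = ["", "1,2019-12-31", "2,2019-12-31"]

rows = skip_header(file_lines, 1)
faulty_rows = skip_header_faulty(file_lines, 1)
```

Under the expected semantics both `select *` and `select count(*)` would see 2 rows for this file; the faulty variant yields 3, which matches the incorrect count described in the report.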





[jira] [Updated] (HIVE-25505) Incorrect results with header. skip.header.line.count if first line is blank

2021-10-20 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis updated HIVE-25505:
--
Fix Version/s: 4.0.0

> Incorrect results with header. skip.header.line.count if first line is blank
> 
>
> Key: HIVE-25505
> URL: https://issues.apache.org/jira/browse/HIVE-25505
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Steve Carlin
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> A table with skip.header.line.count=1 does not skip the first line if 
> it is blank, except in a fetch task.
> To reproduce, create a CSV table and set skip.header.line.count=1 in 
> table properties.
> In the table location, create a single file, with a blank (empty) first line, 
> and say 2 further lines.
> If you do a select * on it, you see 2 rows (correct)
>  If you do select count(*) on it, you get 3 (incorrect)
> {code:java}
> CREATE EXTERNAL TABLE `testcase1`(id int, name string) ROW FORMAT SERDE 
> 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
>   LOCATION '${system:test.tmp.dir}/testcase1'
>   TBLPROPERTIES ("skip.header.line.count"="1");
> SET hive.fetch.task.conversion = more;
> select * from testcase1;
> select count(*) from testcase1;
> set hive.fetch.task.conversion=none;
> select * from testcase1;
> select count(*) from testcase1;
> Test file:
> 1,2019-12-31
> 2,2019-12-31
> 3,2019-12-31
> Should both yield (with the above test file):
>  A masked pattern was here 
> 1 2019-12-31
> 2 2019-12-31
> 3 2019-12-31
> 3
> {code}





[jira] [Commented] (HIVE-25505) Incorrect results with header. skip.header.line.count if first line is blank

2021-10-20 Thread Panagiotis Garefalakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17431159#comment-17431159
 ] 

Panagiotis Garefalakis commented on HIVE-25505:
---

Resolved via [https://github.com/apache/hive/pull/2717] 
Thanks [~abstractdog] for the review!

> Incorrect results with header. skip.header.line.count if first line is blank
> 
>
> Key: HIVE-25505
> URL: https://issues.apache.org/jira/browse/HIVE-25505
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Steve Carlin
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> A table with skip.header.line.count=1 does not skip the first line if 
> it is blank, except in a fetch task.
> To reproduce, create a CSV table and set skip.header.line.count=1 in 
> table properties.
> In the table location, create a single file, with a blank (empty) first line, 
> and say 2 further lines.
> If you do a select * on it, you see 2 rows (correct)
>  If you do select count(*) on it, you get 3 (incorrect)
> {code:java}
> CREATE EXTERNAL TABLE `testcase1`(id int, name string) ROW FORMAT SERDE 
> 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
>   LOCATION '${system:test.tmp.dir}/testcase1'
>   TBLPROPERTIES ("skip.header.line.count"="1");
> SET hive.fetch.task.conversion = more;
> select * from testcase1;
> select count(*) from testcase1;
> set hive.fetch.task.conversion=none;
> select * from testcase1;
> select count(*) from testcase1;
> Test file:
> 1,2019-12-31
> 2,2019-12-31
> 3,2019-12-31
> Should both yield (with the above test file):
>  A masked pattern was here 
> 1 2019-12-31
> 2 2019-12-31
> 3 2019-12-31
> 3
> {code}





[jira] [Comment Edited] (HIVE-25505) Incorrect results with header. skip.header.line.count if first line is blank

2021-10-08 Thread Panagiotis Garefalakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17426212#comment-17426212
 ] 

Panagiotis Garefalakis edited comment on HIVE-25505 at 10/8/21, 8:39 PM:
-

I have a repro for this – updating description and assigning to myself


was (Author: pgaref):
I have a repro for this -- updating description and assigned to myself


> Incorrect results with header. skip.header.line.count if first line is blank
> 
>
> Key: HIVE-25505
> URL: https://issues.apache.org/jira/browse/HIVE-25505
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Steve Carlin
>Assignee: Panagiotis Garefalakis
>Priority: Major
>
> A table with skip.header.line.count=1 does not skip the first line if 
> it is blank, except in a fetch task.
> To reproduce, create a CSV table and set skip.header.line.count=1 in 
> table properties.
> In the table location, create a single file, with a blank (empty) first line, 
> and say 2 further lines.
> If you do a select * on it, you see 2 rows (correct)
>  If you do select count(*) on it, you get 3 (incorrect)
> {code:java}
> CREATE EXTERNAL TABLE `testcase1`(id int, name string) ROW FORMAT SERDE 
> 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
>   LOCATION '${system:test.tmp.dir}/testcase1'
>   TBLPROPERTIES ("skip.header.line.count"="1");
> SET hive.fetch.task.conversion = more;
> select * from testcase1;
> select count(*) from testcase1;
> set hive.fetch.task.conversion=none;
> select * from testcase1;
> select count(*) from testcase1;
> Test file:
> 1,2019-12-31
> 2,2019-12-31
> 3,2019-12-31
> Should both yield (with the above test file):
>  A masked pattern was here 
> 1 2019-12-31
> 2 2019-12-31
> 3 2019-12-31
> 3
> {code}





[jira] [Resolved] (HIVE-25521) Data corruption when concatenating files with different compressions in same table/partition

2021-10-08 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis resolved HIVE-25521.
---
Resolution: Fixed

> Data corruption when concatenating files with different compressions in same 
> table/partition
> 
>
> Key: HIVE-25521
> URL: https://issues.apache.org/jira/browse/HIVE-25521
> Project: Hive
>  Issue Type: Bug
>Reporter: Harish JP
>Assignee: Harish JP
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Currently, if files with different compressions are in the same directory, 
> concatenate can fail and cause data corruption. This happens because a file can 
> be moved by one task as an incompatible file, after which the other tasks fail.
>  
> This Jira addresses the issue by processing a file only in the task 
> where offset 0 is processed and ignoring the file in all other tasks.
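
The ownership rule described above can be sketched as follows. This is a minimal illustration under assumptions, not the Hive merge task: the `Split` tuple, the function names, and the file names are hypothetical. Each task receives one split of a file, and only the task whose split starts at offset 0 is allowed to act on that file, so an incompatible file is moved by exactly one task.

```python
from collections import namedtuple

# Hypothetical description of the byte range a task was assigned.
Split = namedtuple("Split", ["path", "offset", "length"])


def owns_file(split):
    """A task handles a file only if its split starts at offset 0."""
    return split.offset == 0


def plan(splits):
    """Return {path: task_id} for the single task that handles each file."""
    owners = {}
    for task_id, split in enumerate(splits):
        if owns_file(split):
            owners[split.path] = task_id
    return owners


splits = [
    Split("part-0", 0, 128),
    Split("part-0", 128, 128),
    Split("part-1", 0, 64),
]
owners = plan(splits)
```

Because every file has exactly one split starting at offset 0, each file gets exactly one owner and the remaining tasks simply ignore it.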





[jira] [Commented] (HIVE-25521) Data corruption when concatenating files with different compressions in same table/partition

2021-10-08 Thread Panagiotis Garefalakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17426356#comment-17426356
 ] 

Panagiotis Garefalakis commented on HIVE-25521:
---

Resolved via https://github.com/apache/hive/pull/2639

> Data corruption when concatenating files with different compressions in same 
> table/partition
> 
>
> Key: HIVE-25521
> URL: https://issues.apache.org/jira/browse/HIVE-25521
> Project: Hive
>  Issue Type: Bug
>Reporter: Harish JP
>Assignee: Harish JP
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Currently, if files with different compressions are in the same directory, 
> concatenate can fail and cause data corruption. This happens because a file can 
> be moved by one task as an incompatible file, after which the other tasks fail.
>  
> This Jira addresses the issue by processing a file only in the task 
> where offset 0 is processed and ignoring the file in all other tasks.





[jira] [Updated] (HIVE-25521) Data corruption when concatenating files with different compressions in same table/partition

2021-10-08 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis updated HIVE-25521:
--
Fix Version/s: 4.0.0

> Data corruption when concatenating files with different compressions in same 
> table/partition
> 
>
> Key: HIVE-25521
> URL: https://issues.apache.org/jira/browse/HIVE-25521
> Project: Hive
>  Issue Type: Bug
>Reporter: Harish JP
>Assignee: Harish JP
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Currently, if files with different compressions are in the same directory, 
> concatenate can fail and cause data corruption. This happens because a file can 
> be moved by one task as an incompatible file, after which the other tasks fail.
>  
> This Jira addresses the issue by processing a file only in the task 
> where offset 0 is processed and ignoring the file in all other tasks.





[jira] [Updated] (HIVE-25505) Incorrect results with header. skip.header.line.count if first line is blank

2021-10-08 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis updated HIVE-25505:
--
Description: 
A table with skip.header.line.count=1 does not skip the first line if 
it is blank, except in a fetch task.

To reproduce, create a CSV table and set skip.header.line.count=1 in 
table properties.

In the table location, create a single file, with a blank (empty) first line, 
and say 2 further lines.

If you do a select * on it, you see 2 rows (correct)
 If you do select count(*) on it, you get 3 (incorrect)
{code:java}
CREATE EXTERNAL TABLE `testcase1`(id int, name string) ROW FORMAT SERDE 
'org.apache.hadoop.hive.serde2.OpenCSVSerde'
  LOCATION '${system:test.tmp.dir}/testcase1'
  TBLPROPERTIES ("skip.header.line.count"="1");

SET hive.fetch.task.conversion = more;
select * from testcase1;
select count(*) from testcase1;


set hive.fetch.task.conversion=none;
select * from testcase1;
select count(*) from testcase1;

Test file:

1,2019-12-31
2,2019-12-31
3,2019-12-31



Should both yield (with the above test file):
 A masked pattern was here 
1   2019-12-31
2   2019-12-31
3   2019-12-31

3

{code}

  was:
A table with skip.header.line.count=1 does not skip the first line if 
it is blank, except in a fetch task.

To reproduce, create a CSV table and set skip.header.line.count=1 in 
table properties.

In the table location, create a single file, with a blank (empty) first line, 
and say 2 further lines.

If you do a select * on it, you see 2 rows (correct)
If you do select count(\*) on it, you get 3 (incorrect)




{code:java}
// Some comments here
public String getFoo()
{
return foo;
}
{code}



> Incorrect results with header. skip.header.line.count if first line is blank
> 
>
> Key: HIVE-25505
> URL: https://issues.apache.org/jira/browse/HIVE-25505
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Steve Carlin
>Assignee: Panagiotis Garefalakis
>Priority: Major
>
> A table with skip.header.line.count=1 does not skip the first line if 
> it is blank, except in a fetch task.
> To reproduce, create a CSV table and set skip.header.line.count=1 in 
> table properties.
> In the table location, create a single file, with a blank (empty) first line, 
> and say 2 further lines.
> If you do a select * on it, you see 2 rows (correct)
>  If you do select count(*) on it, you get 3 (incorrect)
> {code:java}
> CREATE EXTERNAL TABLE `testcase1`(id int, name string) ROW FORMAT SERDE 
> 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
>   LOCATION '${system:test.tmp.dir}/testcase1'
>   TBLPROPERTIES ("skip.header.line.count"="1");
> SET hive.fetch.task.conversion = more;
> select * from testcase1;
> select count(*) from testcase1;
> set hive.fetch.task.conversion=none;
> select * from testcase1;
> select count(*) from testcase1;
> Test file:
> 1,2019-12-31
> 2,2019-12-31
> 3,2019-12-31
> Should both yield (with the above test file):
>  A masked pattern was here 
> 1 2019-12-31
> 2 2019-12-31
> 3 2019-12-31
> 3
> {code}





[jira] [Assigned] (HIVE-25505) Incorrect results with header. skip.header.line.count if first line is blank

2021-10-08 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis reassigned HIVE-25505:
-

Assignee: Panagiotis Garefalakis

> Incorrect results with header. skip.header.line.count if first line is blank
> 
>
> Key: HIVE-25505
> URL: https://issues.apache.org/jira/browse/HIVE-25505
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Steve Carlin
>Assignee: Panagiotis Garefalakis
>Priority: Major
>
> A table with skip.header.line.count=1 does not skip the first line if 
> it is blank, except in a fetch task.
> To reproduce, create a CSV table and set skip.header.line.count=1 in 
> table properties.
> In the table location, create a single file, with a blank (empty) first line, 
> and say 2 further lines.
> If you do a select * on it, you see 2 rows (correct)
> If you do select count(\*) on it, you get 3 (incorrect)
> {code:java}
> // Some comments here
> public String getFoo()
> {
> return foo;
> }
> {code}





[jira] [Updated] (HIVE-25505) Incorrect results with header. skip.header.line.count if first line is blank

2021-10-08 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis updated HIVE-25505:
--
Description: 
A table with skip.header.line.count=1 does not skip the first line if 
it is blank, except in a fetch task.

To reproduce, create a CSV table and set skip.header.line.count=1 in 
table properties.

In the table location, create a single file, with a blank (empty) first line, 
and say 2 further lines.

If you do a select * on it, you see 2 rows (correct)
If you do select count(\*) on it, you get 3 (incorrect)




{code:java}
// Some comments here
public String getFoo()
{
return foo;
}
{code}


  was:
A table with skip.header.line.count=1 does not skip the first line if 
it is blank, except in a fetch task.

To reproduce, create a CSV table and set skip.header.line.count=1 in 
table properties.

In the table location, create a single file, with a blank (empty) first line, 
and say 2 further lines.

If you do a select * on it, you see 2 rows (correct)
If you do select count(\*) on it, you get 3 (incorrect)


> Incorrect results with header. skip.header.line.count if first line is blank
> 
>
> Key: HIVE-25505
> URL: https://issues.apache.org/jira/browse/HIVE-25505
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Steve Carlin
>Priority: Major
>
> A table with skip.header.line.count=1 does not skip the first line if 
> it is blank, except in a fetch task.
> To reproduce, create a CSV table and set skip.header.line.count=1 in 
> table properties.
> In the table location, create a single file, with a blank (empty) first line, 
> and say 2 further lines.
> If you do a select * on it, you see 2 rows (correct)
> If you do select count(\*) on it, you get 3 (incorrect)
> {code:java}
> // Some comments here
> public String getFoo()
> {
> return foo;
> }
> {code}





[jira] [Commented] (HIVE-25505) Incorrect results with header. skip.header.line.count if first line is blank

2021-10-08 Thread Panagiotis Garefalakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17426212#comment-17426212
 ] 

Panagiotis Garefalakis commented on HIVE-25505:
---

I have a repro for this -- updating description and assigned to myself


> Incorrect results with header. skip.header.line.count if first line is blank
> 
>
> Key: HIVE-25505
> URL: https://issues.apache.org/jira/browse/HIVE-25505
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Steve Carlin
>Priority: Major
>
> A table with skip.header.line.count=1 does not skip the first line if 
> it is blank, except in a fetch task.
> To reproduce, create a CSV table and set skip.header.line.count=1 in 
> table properties.
> In the table location, create a single file, with a blank (empty) first line, 
> and say 2 further lines.
> If you do a select * on it, you see 2 rows (correct)
> If you do select count(\*) on it, you get 3 (incorrect)





[jira] [Updated] (HIVE-25362) LLAP: ensure tasks with locality have a chance to adjust delay

2021-10-08 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis updated HIVE-25362:
--
Fix Version/s: 4.0.0

> LLAP: ensure tasks with locality have a chance to adjust delay
> --
>
> Key: HIVE-25362
> URL: https://issues.apache.org/jira/browse/HIVE-25362
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> HIVE-24914 introduced a short-circuit optimization when all nodes are busy, 
> returning DELAYED_RESOURCES and resetting the locality delay for a given task.
> However, this may prevent tasks from adjusting their locality delay and being 
> added to the DelayQueue, sometimes leading to missed locality chances when all 
> LLAP resources are fully utilized.
> To address the issue, we should handle the two cases separately.





[jira] [Resolved] (HIVE-25362) LLAP: ensure tasks with locality have a chance to adjust delay

2021-10-08 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis resolved HIVE-25362.
---
Resolution: Fixed

> LLAP: ensure tasks with locality have a chance to adjust delay
> --
>
> Key: HIVE-25362
> URL: https://issues.apache.org/jira/browse/HIVE-25362
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> HIVE-24914 introduced a short-circuit optimization when all nodes are busy, 
> returning DELAYED_RESOURCES and resetting the locality delay for a given task.
> However, this may prevent tasks from adjusting their locality delay and being 
> added to the DelayQueue, sometimes leading to missed locality chances when all 
> LLAP resources are fully utilized.
> To address the issue, we should handle the two cases separately.





[jira] [Commented] (HIVE-25362) LLAP: ensure tasks with locality have a chance to adjust delay

2021-10-08 Thread Panagiotis Garefalakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17426197#comment-17426197
 ] 

Panagiotis Garefalakis commented on HIVE-25362:
---

Resolved via https://github.com/apache/hive/pull/2513

> LLAP: ensure tasks with locality have a chance to adjust delay
> --
>
> Key: HIVE-25362
> URL: https://issues.apache.org/jira/browse/HIVE-25362
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> HIVE-24914 introduced a short-circuit optimization when all nodes are busy, 
> returning DELAYED_RESOURCES and resetting the locality delay for a given task.
> However, this may prevent tasks from adjusting their locality delay and being 
> added to the DelayQueue, sometimes leading to missed locality chances when all 
> LLAP resources are fully utilized.
> To address the issue, we should handle the two cases separately.





[jira] [Commented] (HIVE-25599) Addendum HIVE-25570 Hive should send full URL path for authorization for the command insert overwrite location

2021-10-08 Thread Panagiotis Garefalakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17426052#comment-17426052
 ] 

Panagiotis Garefalakis commented on HIVE-25599:
---

Already resolved via 
https://github.com/apache/hive/commit/988be055289becbfc37b17264edafeca3edefbec

> Addendum HIVE-25570 Hive should send full URL path for authorization for the 
> command insert overwrite location
> --
>
> Key: HIVE-25599
> URL: https://issues.apache.org/jira/browse/HIVE-25599
> Project: Hive
>  Issue Type: Bug
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (HIVE-25599) Addendum HIVE-25570 Hive should send full URL path for authorization for the command insert overwrite location

2021-10-08 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis resolved HIVE-25599.
---
Resolution: Duplicate

> Addendum HIVE-25570 Hive should send full URL path for authorization for the 
> command insert overwrite location
> --
>
> Key: HIVE-25599
> URL: https://issues.apache.org/jira/browse/HIVE-25599
> Project: Hive
>  Issue Type: Bug
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-25599) Addendum HIVE-25570 Hive should send full URL path for authorization for the command insert overwrite location

2021-10-07 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis updated HIVE-25599:
--
Summary: Addendum HIVE-25570 Hive should send full URL path for 
authorization for the command insert overwrite location  (was: Addendum of 
HIVE-25570 Hive should send full URL path for authorization for the command 
insert overwrite location)

> Addendum HIVE-25570 Hive should send full URL path for authorization for the 
> command insert overwrite location
> --
>
> Key: HIVE-25599
> URL: https://issues.apache.org/jira/browse/HIVE-25599
> Project: Hive
>  Issue Type: Bug
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>






[jira] [Assigned] (HIVE-25599) Addendum of HIVE-25570 Hive should send full URL path for authorization for the command insert overwrite location

2021-10-07 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis reassigned HIVE-25599:
-


> Addendum of HIVE-25570 Hive should send full URL path for authorization for 
> the command insert overwrite location
> -
>
> Key: HIVE-25599
> URL: https://issues.apache.org/jira/browse/HIVE-25599
> Project: Hive
>  Issue Type: Bug
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>






[jira] [Updated] (HIVE-25520) Enable concatenate for external table.

2021-10-06 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis updated HIVE-25520:
--
Fix Version/s: 4.0.0

> Enable concatenate for external table.
> --
>
> Key: HIVE-25520
> URL: https://issues.apache.org/jira/browse/HIVE-25520
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Harish JP
>Assignee: Harish JP
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Concatenate for external tables is disabled; enable it under a flag.





[jira] [Commented] (HIVE-25520) Enable concatenate for external table.

2021-10-06 Thread Panagiotis Garefalakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17425134#comment-17425134
 ] 

Panagiotis Garefalakis commented on HIVE-25520:
---

Resolved via https://github.com/apache/hive/pull/2640

> Enable concatenate for external table.
> --
>
> Key: HIVE-25520
> URL: https://issues.apache.org/jira/browse/HIVE-25520
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Harish JP
>Assignee: Harish JP
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Concatenate for external tables is disabled; enable it under a flag.





[jira] [Resolved] (HIVE-25520) Enable concatenate for external table.

2021-10-06 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis resolved HIVE-25520.
---
Resolution: Fixed

> Enable concatenate for external table.
> --
>
> Key: HIVE-25520
> URL: https://issues.apache.org/jira/browse/HIVE-25520
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Harish JP
>Assignee: Harish JP
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Concatenate for external tables is disabled; enable it under a flag.





[jira] [Updated] (HIVE-25541) JsonSerDe: TBLPROPERTY treating nested json as String

2021-09-20 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis updated HIVE-25541:
--
Description: 
Native Jsonserde 'org.apache.hive.hcatalog.data.JsonSerDe' currently does not 
support loading nested json into a string type directly. It requires 
declaring the column as a complex type (struct, map, array) to unpack nested json 
data.

Even though the data field is not a valid JSON String type there is value in 
treating it as a plain String instead of throwing an exception as we currently do.

{code:java}
create table json_table(data string, messageid string, publish_time bigint, 
attributes string);

{"data":{"H":{"event":"track_active","platform":"Android"},"B":{"device_type":"Phone","uuid":"[36ffec24-f6a4-4f5d-aa39-72e5513d2cae,11883bee-a7aa-4010-8a66-6c3c63a73f16]"}},"messageId":"2475185636801962","publish_time":1622514629783,"attributes":{"region":"IN"}}
{code}

This JIRA introduces an extra Table Property that allows complex JSON values 
to be stringified instead of forcing the user to define the complete nested structure.

  was:
Native Jsonserde 'org.apache.hive.hcatalog.data.JsonSerDe' currently does not 
support loading nested json into a string type directly. It requires 
declaring the column as a complex type (struct, map, array) to unpack nested json 
data.

Even though the data field is not a valid JSON String type there is value in 
treating it as a plain String instead of throwing an exception as we currently do.

{code:java}
create table json_table(data string, messageid string, publish_time bigint, 
attributes string);

{"data":{"H":{"event":"track_active","platform":"Android"},"B":{"device_type":"Phone","uuid":"[36ffec24-f6a4-4f5d-aa39-72e5513d2cae,11883bee-a7aa-4010-8a66-6c3c63a73f16]"}},"messageId":"2475185636801962","publish_time":1622514629783,"attributes":{"region":"IN"}}
{code}

This JIRA introduces an extra Table property that allows complex JSON values 
to be stringified instead of forcing the user to define the complete nested structure.


> JsonSerDe: TBLPROPERTY treating nested json as String
> -
>
> Key: HIVE-25541
> URL: https://issues.apache.org/jira/browse/HIVE-25541
> Project: Hive
>  Issue Type: Bug
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>
> Native Jsonserde 'org.apache.hive.hcatalog.data.JsonSerDe' currently does not 
> support loading nested json into a string type directly. It requires 
> declaring the column as a complex type (struct, map, array) to unpack nested 
> json data.
> Even though the data field is not a valid JSON String type there is value in 
> treating it as a plain String instead of throwing an exception as we currently 
> do.
> {code:java}
> create table json_table(data string, messageid string, publish_time bigint, 
> attributes string);
> {"data":{"H":{"event":"track_active","platform":"Android"},"B":{"device_type":"Phone","uuid":"[36ffec24-f6a4-4f5d-aa39-72e5513d2cae,11883bee-a7aa-4010-8a66-6c3c63a73f16]"}},"messageId":"2475185636801962","publish_time":1622514629783,"attributes":{"region":"IN"}}
> {code}
> This JIRA introduces an extra Table Property that allows complex JSON values 
> to be stringified instead of forcing the user to define the complete nested 
> structure.
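
To illustrate what "treating nested json as String" means, here is a minimal sketch in plain Java. It is not the JsonSerDe implementation: the class RawJsonField and the method rawValue are hypothetical names, the input is assumed to be well-formed JSON, and keys are assumed to contain no escape sequences. The sketch returns the raw text of a top-level field, so a nested object comes back as its own JSON string.

```java
// Hypothetical helper, for illustration only; not part of Hive.
final class RawJsonField {

  /** Returns the raw text of a top-level field's value, or null if the key is absent. */
  static String rawValue(String json, String key) {
    int depth = 0;
    int i = 0;
    while (i < json.length()) {
      char c = json.charAt(i);
      if (c == '"') {
        int end = skipString(json, i); // index just past the closing quote
        if (depth == 1 && json.substring(i + 1, end - 1).equals(key)) {
          int j = skipWhitespace(json, end);
          // Only a key is followed by ':'; a string value is followed by ',' or '}'.
          if (j < json.length() && json.charAt(j) == ':') {
            int start = skipWhitespace(json, j + 1);
            return json.substring(start, valueEnd(json, start));
          }
        }
        i = end;
      } else {
        if (c == '{' || c == '[') {
          depth++;
        } else if (c == '}' || c == ']') {
          depth--;
        }
        i++;
      }
    }
    return null;
  }

  /** Given the index of an opening quote, returns the index just past the closing quote. */
  private static int skipString(String s, int open) {
    int k = open + 1;
    while (k < s.length()) {
      char ch = s.charAt(k);
      if (ch == '\\') {
        k += 2; // skip the escaped character
      } else if (ch == '"') {
        return k + 1;
      } else {
        k++;
      }
    }
    throw new IllegalArgumentException("unterminated string");
  }

  private static int skipWhitespace(String s, int from) {
    int k = from;
    while (k < s.length() && Character.isWhitespace(s.charAt(k))) {
      k++;
    }
    return k;
  }

  /** Returns the index just past the value that starts at {@code start}. */
  private static int valueEnd(String s, int start) {
    char first = s.charAt(start);
    if (first == '"') {
      return skipString(s, start);
    }
    if (first == '{' || first == '[') {
      int depth = 0;
      int k = start;
      while (k < s.length()) {
        char ch = s.charAt(k);
        if (ch == '"') {
          k = skipString(s, k);
          continue;
        }
        if (ch == '{' || ch == '[') {
          depth++;
        } else if (ch == '}' || ch == ']') {
          depth--;
          if (depth == 0) {
            return k + 1;
          }
        }
        k++;
      }
      throw new IllegalArgumentException("unbalanced JSON value");
    }
    // Number, true, false or null: runs until a delimiter or whitespace.
    int k = start;
    while (k < s.length() && ",}] \t\r\n".indexOf(s.charAt(k)) < 0) {
      k++;
    }
    return k;
  }
}
```

With the sample row above, rawValue(row, "data") would yield the whole nested object as text, which is the kind of value a string column could hold without the user declaring the nested structure.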





[jira] [Assigned] (HIVE-25541) JsonSerDe: TBLPROPERTY treating nested json as String

2021-09-20 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis reassigned HIVE-25541:
-


> JsonSerDe: TBLPROPERTY treating nested json as String
> -
>
> Key: HIVE-25541
> URL: https://issues.apache.org/jira/browse/HIVE-25541
> Project: Hive
>  Issue Type: Bug
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>
> Native Jsonserde 'org.apache.hive.hcatalog.data.JsonSerDe' currently does not 
> support loading nested json into a string type directly. It requires 
> declaring the column as a complex type (struct, map, array) to unpack nested 
> json data.
> Even though the data field is not a valid JSON String type there is value in 
> treating it as a plain String instead of throwing an exception as we currently 
> do.
> {code:java}
> create table json_table(data string, messageid string, publish_time bigint, 
> attributes string);
> {"data":{"H":{"event":"track_active","platform":"Android"},"B":{"device_type":"Phone","uuid":"[36ffec24-f6a4-4f5d-aa39-72e5513d2cae,11883bee-a7aa-4010-8a66-6c3c63a73f16]"}},"messageId":"2475185636801962","publish_time":1622514629783,"attributes":{"region":"IN"}}
> {code}
> This JIRA introduces an extra Table property that allows complex JSON values 
> to be stringified instead of forcing the user to define the complete nested 
> structure.





[jira] [Updated] (HIVE-25527) LLAP Scheduler task exits with fatal error if the executor node is down.

2021-09-15 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis updated HIVE-25527:
--
Parent: HIVE-24913
Issue Type: Sub-task  (was: Bug)

> LLAP Scheduler task exits with fatal error if the executor node is down.
> 
>
> Key: HIVE-25527
> URL: https://issues.apache.org/jira/browse/HIVE-25527
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> In case the executor host has gone down, activeInstances will be updated with 
> null. So we need to check for empty/null values before accessing it.
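
A minimal sketch of the kind of guard described above, assuming the scheduler holds the active instances in a collection that may be null or empty once a host goes down. ActiveInstancesGuard and safeInstances are hypothetical names, not the actual LLAP scheduler code.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, for illustration only; not part of Hive.
final class ActiveInstancesGuard {

  /** Returns a snapshot of the instances, or an empty list when none are known. */
  static <T> List<T> safeInstances(Collection<T> activeInstances) {
    if (activeInstances == null || activeInstances.isEmpty()) {
      // The executor host is gone: give callers nothing to iterate instead of failing.
      return Collections.emptyList();
    }
    return new ArrayList<>(activeInstances);
  }
}
```

Callers can then iterate over the result unconditionally and treat an empty list as "no executor available right now".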





[jira] [Commented] (HIVE-24316) Upgrade ORC from 1.5.6 to 1.5.8 in branch-3.1

2021-08-24 Thread Panagiotis Garefalakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17403937#comment-17403937
 ] 

Panagiotis Garefalakis commented on HIVE-24316:
---

Hey [~glapark] thanks for bringing this up -- taking a look at 
MemoryManagerImpl, it looks like checkMemory() is the new method that determines if 
the scale has changed, and since ORC-361 removed getTotalMemoryPool() calls from 
multiple places we are losing the effect of controlling the memory pool.

The intention behind LlapAwareMemoryManager was to have memory per executor 
instead of the entire heap since multiple writers are involved. An idea could 
be to restore the getTotalMemoryPool() calls where needed.
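
To make the "memory per executor instead of the entire heap" point concrete, here is a small sketch of the arithmetic. It does not reproduce ORC's MemoryManagerImpl or Hive's LlapAwareMemoryManager: ExecutorMemoryPool, perExecutorPool and allocationScale are hypothetical names, and an even split of the heap across executors is an assumption made for illustration.

```java
// Hypothetical helper, for illustration only; not part of Hive or ORC.
final class ExecutorMemoryPool {

  /** Pool for one executor: a fraction of that executor's share, not of the whole heap. */
  static long perExecutorPool(long heapBytes, int executors, double poolFraction) {
    if (executors <= 0) {
      throw new IllegalArgumentException("executors must be positive");
    }
    return (long) (poolFraction * (heapBytes / executors));
  }

  /** Factor by which writers are scaled down once their combined allocation exceeds the pool. */
  static double allocationScale(long poolBytes, long totalAllocationBytes) {
    return totalAllocationBytes <= poolBytes
        ? 1.0
        : (double) poolBytes / totalAllocationBytes;
  }
}
```

If the pool size is read from the whole heap instead of the per-executor value, the scale stays at 1.0 for longer, which is the loss of control the comment describes.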

> Upgrade ORC from 1.5.6 to 1.5.8 in branch-3.1
> -
>
> Key: HIVE-24316
> URL: https://issues.apache.org/jira/browse/HIVE-24316
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 3.1.3
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.1.3
>
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> This will bring eleven bug fixes.
>  * ORC 1.5.7: [https://issues.apache.org/jira/projects/ORC/versions/12345702]
>  * ORC 1.5.8: [https://issues.apache.org/jira/projects/ORC/versions/12346462]





[jira] [Resolved] (HIVE-25415) Disable auto-assign reviewer on forks

2021-08-03 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis resolved HIVE-25415.
---
Resolution: Fixed

> Disable auto-assign reviewer on forks
> -
>
> Key: HIVE-25415
> URL: https://issues.apache.org/jira/browse/HIVE-25415
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Reporter: Josh Soref
>Assignee: Josh Soref
>Priority: Trivial
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> {code:java}
> Run shufo/auto-assign-reviewer-by-files@v1.1.1
> {
>   '**/*.thrift': [ 'kgyrtkirk', 'klcopp' ],
>   '**/*.g': [ 'kgyrtkirk' ],
>   '**/package.jdo': [ 'kgyrtkirk' ],
>   '**/schq/**': [ 'kgyrtkirk' ],
>   '**/*Scheduled*': [ 'kgyrtkirk' ],
>   '**/*[sS]ketches*': [ 'kgyrtkirk' ],
>   Jenkinsfile: [ 'kgyrtkirk' ],
>   '.github/**': [ 'kgyrtkirk' ],
>   '**/ddl/**': [ 'miklosgergely' ],
>   '**/ql/*@(Driver|Compiler|Executor)*.java': [ 'miklosgergely' ],
>   '**/schematool/**': [ 'miklosgergely' ],
>   '**/metatool/**': [ 'miklosgergely' ],
>   '**/tez/**/*.java': [ 'abstractdog' ],
>   '**/*Tez*java': [ 'abstractdog' ],
>   '**/*TopNKey*java': [ 'kasakrisz' ],
>   '**/*CardinalityPreserving*java': [ 'kasakrisz' ],
>   '**/*Llap*java': [ 'pgaref' ]
> }
> beeline/src/test/org/apache/hive/beeline/schematool/TestHiveSchemaTool.java matches **/schematool/**
> finished!
> (node:1453) UnhandledPromiseRejectionWarning: HttpError: Reviews may only be requested from collaborators. One or more of the users or teams you specified is not a collaborator of the check-spelling/hive repository.
>     at /home/runner/work/_actions/shufo/auto-assign-reviewer-by-files/v1.1.1/dist/index.js:1:301912
>     at processTicksAndRejections (internal/process/task_queues.js:93:5)
>     at async assignReviewers (/home/runner/work/_actions/shufo/auto-assign-reviewer-by-files/v1.1.1/dist/index.js:1:39056)
> (node:1453) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 1)
> (node:1453) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
> Complete job 0s
> {code}
>  
>  





[jira] [Updated] (HIVE-25415) Disable auto-assign reviewer on forks

2021-08-03 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis updated HIVE-25415:
--
Fix Version/s: 4.0.0

> Disable auto-assign reviewer on forks
> -
>
> Key: HIVE-25415
> URL: https://issues.apache.org/jira/browse/HIVE-25415
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Reporter: Josh Soref
>Assignee: Josh Soref
>Priority: Trivial
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> {code:java}
> Run shufo/auto-assign-reviewer-by-files@v1.1.1
> {
>   '**/*.thrift': [ 'kgyrtkirk', 'klcopp' ],
>   '**/*.g': [ 'kgyrtkirk' ],
>   '**/package.jdo': [ 'kgyrtkirk' ],
>   '**/schq/**': [ 'kgyrtkirk' ],
>   '**/*Scheduled*': [ 'kgyrtkirk' ],
>   '**/*[sS]ketches*': [ 'kgyrtkirk' ],
>   Jenkinsfile: [ 'kgyrtkirk' ],
>   '.github/**': [ 'kgyrtkirk' ],
>   '**/ddl/**': [ 'miklosgergely' ],
>   '**/ql/*@(Driver|Compiler|Executor)*.java': [ 'miklosgergely' ],
>   '**/schematool/**': [ 'miklosgergely' ],
>   '**/metatool/**': [ 'miklosgergely' ],
>   '**/tez/**/*.java': [ 'abstractdog' ],
>   '**/*Tez*java': [ 'abstractdog' ],
>   '**/*TopNKey*java': [ 'kasakrisz' ],
>   '**/*CardinalityPreserving*java': [ 'kasakrisz' ],
>   '**/*Llap*java': [ 'pgaref' ]
> }
> beeline/src/test/org/apache/hive/beeline/schematool/TestHiveSchemaTool.java matches **/schematool/**
> finished!
> (node:1453) UnhandledPromiseRejectionWarning: HttpError: Reviews may only be requested from collaborators. One or more of the users or teams you specified is not a collaborator of the check-spelling/hive repository.
>     at /home/runner/work/_actions/shufo/auto-assign-reviewer-by-files/v1.1.1/dist/index.js:1:301912
>     at processTicksAndRejections (internal/process/task_queues.js:93:5)
>     at async assignReviewers (/home/runner/work/_actions/shufo/auto-assign-reviewer-by-files/v1.1.1/dist/index.js:1:39056)
> (node:1453) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 1)
> (node:1453) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
> Complete job 0s
> {code}
>  
>  





[jira] [Commented] (HIVE-25415) Disable auto-assign reviewer on forks

2021-08-03 Thread Panagiotis Garefalakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17392092#comment-17392092
 ] 

Panagiotis Garefalakis commented on HIVE-25415:
---

Resolved via: https://github.com/apache/hive/pull/2554

> Disable auto-assign reviewer on forks
> -
>
> Key: HIVE-25415
> URL: https://issues.apache.org/jira/browse/HIVE-25415
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Reporter: Josh Soref
>Assignee: Josh Soref
>Priority: Trivial
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> {code:java}
> Run shufo/auto-assign-reviewer-by-files@v1.1.1
> {
>   '**/*.thrift': [ 'kgyrtkirk', 'klcopp' ],
>   '**/*.g': [ 'kgyrtkirk' ],
>   '**/package.jdo': [ 'kgyrtkirk' ],
>   '**/schq/**': [ 'kgyrtkirk' ],
>   '**/*Scheduled*': [ 'kgyrtkirk' ],
>   '**/*[sS]ketches*': [ 'kgyrtkirk' ],
>   Jenkinsfile: [ 'kgyrtkirk' ],
>   '.github/**': [ 'kgyrtkirk' ],
>   '**/ddl/**': [ 'miklosgergely' ],
>   '**/ql/*@(Driver|Compiler|Executor)*.java': [ 'miklosgergely' ],
>   '**/schematool/**': [ 'miklosgergely' ],
>   '**/metatool/**': [ 'miklosgergely' ],
>   '**/tez/**/*.java': [ 'abstractdog' ],
>   '**/*Tez*java': [ 'abstractdog' ],
>   '**/*TopNKey*java': [ 'kasakrisz' ],
>   '**/*CardinalityPreserving*java': [ 'kasakrisz' ],
>   '**/*Llap*java': [ 'pgaref' ]
> }
> beeline/src/test/org/apache/hive/beeline/schematool/TestHiveSchemaTool.java matches **/schematool/**
> finished!
> (node:1453) UnhandledPromiseRejectionWarning: HttpError: Reviews may only be requested from collaborators. One or more of the users or teams you specified is not a collaborator of the check-spelling/hive repository.
>     at /home/runner/work/_actions/shufo/auto-assign-reviewer-by-files/v1.1.1/dist/index.js:1:301912
>     at processTicksAndRejections (internal/process/task_queues.js:93:5)
>     at async assignReviewers (/home/runner/work/_actions/shufo/auto-assign-reviewer-by-files/v1.1.1/dist/index.js:1:39056)
> (node:1453) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 1)
> (node:1453) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
> Complete job 0s
> {code}
>  
>  





[jira] [Updated] (HIVE-25415) Disable auto-assign reviewer on forks

2021-08-03 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis updated HIVE-25415:
--
Summary: Disable auto-assign reviewer on forks  (was: auto-assign breaks on 
forks)

> Disable auto-assign reviewer on forks
> -
>
> Key: HIVE-25415
> URL: https://issues.apache.org/jira/browse/HIVE-25415
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Reporter: Josh Soref
>Assignee: Josh Soref
>Priority: Trivial
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> {code:java}
> Run shufo/auto-assign-reviewer-by-files@v1.1.1
> {
>   '**/*.thrift': [ 'kgyrtkirk', 'klcopp' ],
>   '**/*.g': [ 'kgyrtkirk' ],
>   '**/package.jdo': [ 'kgyrtkirk' ],
>   '**/schq/**': [ 'kgyrtkirk' ],
>   '**/*Scheduled*': [ 'kgyrtkirk' ],
>   '**/*[sS]ketches*': [ 'kgyrtkirk' ],
>   Jenkinsfile: [ 'kgyrtkirk' ],
>   '.github/**': [ 'kgyrtkirk' ],
>   '**/ddl/**': [ 'miklosgergely' ],
>   '**/ql/*@(Driver|Compiler|Executor)*.java': [ 'miklosgergely' ],
>   '**/schematool/**': [ 'miklosgergely' ],
>   '**/metatool/**': [ 'miklosgergely' ],
>   '**/tez/**/*.java': [ 'abstractdog' ],
>   '**/*Tez*java': [ 'abstractdog' ],
>   '**/*TopNKey*java': [ 'kasakrisz' ],
>   '**/*CardinalityPreserving*java': [ 'kasakrisz' ],
>   '**/*Llap*java': [ 'pgaref' ]
> }
> beeline/src/test/org/apache/hive/beeline/schematool/TestHiveSchemaTool.java matches **/schematool/**
> finished!
> (node:1453) UnhandledPromiseRejectionWarning: HttpError: Reviews may only be requested from collaborators. One or more of the users or teams you specified is not a collaborator of the check-spelling/hive repository.
>     at /home/runner/work/_actions/shufo/auto-assign-reviewer-by-files/v1.1.1/dist/index.js:1:301912
>     at processTicksAndRejections (internal/process/task_queues.js:93:5)
>     at async assignReviewers (/home/runner/work/_actions/shufo/auto-assign-reviewer-by-files/v1.1.1/dist/index.js:1:39056)
> (node:1453) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 1)
> (node:1453) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
> Complete job 0s
> {code}
>  
>  





[jira] [Assigned] (HIVE-25415) auto-assign breaks on forks

2021-08-03 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis reassigned HIVE-25415:
-

Assignee: Josh Soref

> auto-assign breaks on forks
> ---
>
> Key: HIVE-25415
> URL: https://issues.apache.org/jira/browse/HIVE-25415
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Reporter: Josh Soref
>Assignee: Josh Soref
>Priority: Trivial
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> {code:java}
> Run shufo/auto-assign-reviewer-by-files@v1.1.1
> {
>   '**/*.thrift': [ 'kgyrtkirk', 'klcopp' ],
>   '**/*.g': [ 'kgyrtkirk' ],
>   '**/package.jdo': [ 'kgyrtkirk' ],
>   '**/schq/**': [ 'kgyrtkirk' ],
>   '**/*Scheduled*': [ 'kgyrtkirk' ],
>   '**/*[sS]ketches*': [ 'kgyrtkirk' ],
>   Jenkinsfile: [ 'kgyrtkirk' ],
>   '.github/**': [ 'kgyrtkirk' ],
>   '**/ddl/**': [ 'miklosgergely' ],
>   '**/ql/*@(Driver|Compiler|Executor)*.java': [ 'miklosgergely' ],
>   '**/schematool/**': [ 'miklosgergely' ],
>   '**/metatool/**': [ 'miklosgergely' ],
>   '**/tez/**/*.java': [ 'abstractdog' ],
>   '**/*Tez*java': [ 'abstractdog' ],
>   '**/*TopNKey*java': [ 'kasakrisz' ],
>   '**/*CardinalityPreserving*java': [ 'kasakrisz' ],
>   '**/*Llap*java': [ 'pgaref' ]
> }
> beeline/src/test/org/apache/hive/beeline/schematool/TestHiveSchemaTool.java matches **/schematool/**
> finished!
> (node:1453) UnhandledPromiseRejectionWarning: HttpError: Reviews may only be requested from collaborators. One or more of the users or teams you specified is not a collaborator of the check-spelling/hive repository.
>     at /home/runner/work/_actions/shufo/auto-assign-reviewer-by-files/v1.1.1/dist/index.js:1:301912
>     at processTicksAndRejections (internal/process/task_queues.js:93:5)
>     at async assignReviewers (/home/runner/work/_actions/shufo/auto-assign-reviewer-by-files/v1.1.1/dist/index.js:1:39056)
> (node:1453) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 1)
> (node:1453) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
> Complete job 0s
> {code}
>  
>  





[jira] [Updated] (HIVE-25398) Converted external tables should be able to configure purge behaviour

2021-07-28 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis updated HIVE-25398:
--
Component/s: Standalone Metastore

> Converted external tables should be able to configure purge behaviour
> -
>
> Key: HIVE-25398
> URL: https://issues.apache.org/jira/browse/HIVE-25398
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>
> Creating non-ACID MANAGED tables is not allowed in Hive, which instead 
> converts these tables to External: 
> https://issues.apache.org/jira/browse/HIVE-22158
> During table translation both TRANSLATED_TO_EXTERNAL and 
> 'external.table.purge' are set to True. However, the second parameter may 
> already be set in the table properties by the User. This ticket adds an 
> extra check to maintain that property if set.
> PS: A cleaner solution would be to create these Tables as External directly, 
> but the User may be taking advantage of the translation and expecting the 
> data NOT to be purged!
> Example:
> {code:java}
> -- Non-ACID table will be translated to EXTERNAL
> create table c(c int) LOCATION 'etp_1' 
> TBLPROPERTIES('transactional'='false','external.table.purge'='false');
> insert into c values(1);
> -- Maintain the purge=false property set above
> desc formatted c;
> select count(*) from c;
> drop table c;
> -- Create table in same location, data should still be there
> create table c(c int) LOCATION 'etp_1' 
> TBLPROPERTIES('transactional'='false','external.table.purge'='false');
> desc formatted c;
> select count(*) from c;
> {code}
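
The extra check can be sketched as follows, assuming table properties are held in a string map. PurgeTranslation and translateToExternal are hypothetical names; only the two property keys come from the description above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, for illustration only; not the metastore code.
final class PurgeTranslation {

  /** Returns a copy of the properties as they would look after translation to EXTERNAL. */
  static Map<String, String> translateToExternal(Map<String, String> tableParams) {
    Map<String, String> translated = new HashMap<>(tableParams);
    translated.put("TRANSLATED_TO_EXTERNAL", "TRUE");
    // Keep a user-supplied purge setting; default to TRUE only when it is absent.
    translated.putIfAbsent("external.table.purge", "TRUE");
    return translated;
  }
}
```

With this shape, a table created with 'external.table.purge'='false' keeps that value after translation, so dropping it leaves the data in place.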





[jira] [Assigned] (HIVE-25398) Converted external tables should be able to configure purge behaviour

2021-07-28 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis reassigned HIVE-25398:
-


> Converted external tables should be able to configure purge behaviour
> -
>
> Key: HIVE-25398
> URL: https://issues.apache.org/jira/browse/HIVE-25398
> Project: Hive
>  Issue Type: Bug
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>
> Creating non-ACID MANAGED tables is not allowed in Hive, which instead 
> converts these tables to External: 
> https://issues.apache.org/jira/browse/HIVE-22158
> During table translation both TRANSLATED_TO_EXTERNAL and 
> 'external.table.purge' are set to True. However, the second parameter may 
> already be set in the table properties by the User. This ticket adds an 
> extra check to maintain that property if set.
> PS: A cleaner solution would be to create these Tables as External directly, 
> but the User may be taking advantage of the translation and expecting the 
> data NOT to be purged!
> Example:
> {code:java}
> -- Non-ACID table will be translated to EXTERNAL
> create table c(c int) LOCATION 'etp_1' 
> TBLPROPERTIES('transactional'='false','external.table.purge'='false');
> insert into c values(1);
> -- Maintain the purge=false property set above
> desc formatted c;
> select count(*) from c;
> drop table c;
> -- Create table in same location, data should still be there
> create table c(c int) LOCATION 'etp_1' 
> TBLPROPERTIES('transactional'='false','external.table.purge'='false');
> desc formatted c;
> select count(*) from c;
> {code}





[jira] [Resolved] (HIVE-25190) BytesColumnVector fails when the aggregate size is > 1gb

2021-07-26 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis resolved HIVE-25190.
---
Resolution: Fixed

> BytesColumnVector fails when the aggregate size is > 1gb
> 
>
> Key: HIVE-25190
> URL: https://issues.apache.org/jira/browse/HIVE-25190
> Project: Hive
>  Issue Type: Bug
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> Currently, BytesColumnVector will allocate a buffer for small values (< 1mb), 
> but fail with:
> {code:java}
> new RuntimeException("Overflow of newLength. smallBuffer.length="
> + smallBuffer.length + ", nextElemLength=" + nextElemLength);
> {code}
> if the aggregate size of the buffer crosses over 1gb. 
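
One way a 1gb threshold can arise is int overflow when a buffer length is doubled. The sketch below assumes the buffer grows by doubling an int length; it is not the BytesColumnVector code and not the fix that was committed. BufferGrowth, naiveDouble and safeNewLength are hypothetical names.

```java
// Hypothetical helper, for illustration only; not part of hive-storage-api.
final class BufferGrowth {

  // Conservative upper bound for a Java array length (an assumption, not a JVM guarantee).
  static final int MAX_ARRAY = Integer.MAX_VALUE - 8;

  /** Naive doubling: the result wraps to a negative number once length reaches 2^30. */
  static int naiveDouble(int length) {
    return length * 2;
  }

  /** Doubles in long arithmetic and clamps, so growth past 1gb does not overflow. */
  static int safeNewLength(int currentLength, int requiredLength) {
    if (requiredLength < 0 || requiredLength > MAX_ARRAY) {
      throw new IllegalArgumentException("required length out of range: " + requiredLength);
    }
    long candidate = Math.max(1L, currentLength);
    while (candidate < requiredLength) {
      candidate *= 2;
    }
    return (int) Math.min(candidate, (long) MAX_ARRAY);
  }
}
```

For a 1gb buffer (2^30 bytes), naiveDouble returns a negative length, while safeNewLength clamps to the largest allowed array size.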





[jira] [Commented] (HIVE-25190) BytesColumnVector fails when the aggregate size is > 1gb

2021-07-26 Thread Panagiotis Garefalakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387281#comment-17387281
 ] 

Panagiotis Garefalakis commented on HIVE-25190:
---

Thanks [~dongjoon] -- I was hesitating to close as we need a new storage-api 
version (as the fix) -- should be 2.8.1

> BytesColumnVector fails when the aggregate size is > 1gb
> 
>
> Key: HIVE-25190
> URL: https://issues.apache.org/jira/browse/HIVE-25190
> Project: Hive
>  Issue Type: Bug
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> Currently, BytesColumnVector will allocate a buffer for small values (< 1mb), 
> but fail with:
> {code:java}
> new RuntimeException("Overflow of newLength. smallBuffer.length="
> + smallBuffer.length + ", nextElemLength=" + nextElemLength);
> {code}
> if the aggregate size of the buffer crosses over 1gb. 





[jira] [Assigned] (HIVE-25386) hive-storage-api should not have guava compile dependency

2021-07-26 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis reassigned HIVE-25386:
-

Assignee: Dongjoon Hyun

> hive-storage-api should not have guava compile dependency
> -
>
> Key: HIVE-25386
> URL: https://issues.apache.org/jira/browse/HIVE-25386
> Project: Hive
>  Issue Type: Bug
>  Components: storage-api
>Affects Versions: 4.0.0
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Blocker
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> https://mvnrepository.com/artifact/org.apache.hive/hive-storage-api/2.8.0





[jira] [Updated] (HIVE-24458) Allow access to SArgs without converting to disjunctive normal form

2021-07-23 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis updated HIVE-24458:
--
Fix Version/s: (was: storage-2.7.3)

> Allow access to SArgs without converting to disjunctive normal form
> ---
>
> Key: HIVE-24458
> URL: https://issues.apache.org/jira/browse/HIVE-24458
> Project: Hive
>  Issue Type: Improvement
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> For some use cases, it is useful to have access to the SArg expression in a 
> non-normalized form. Currently, the SArg only provides the fully normalized 
> expression.





[jira] [Work started] (HIVE-25362) LLAP: ensure tasks with locality have a chance to adjust delay

2021-07-21 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-25362 started by Panagiotis Garefalakis.
-
> LLAP: ensure tasks with locality have a chance to adjust delay
> --
>
> Key: HIVE-25362
> URL: https://issues.apache.org/jira/browse/HIVE-25362
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> HIVE-24914 introduced a short-circuit optimization that, when all nodes are 
> busy, returns DELAYED_RESOURCES and resets the locality delay for a given task.
> However, this may prevent tasks from adjusting their locality delay and being 
> added to the DelayQueue, sometimes leading to missed locality chances when all 
> LLAP resources are fully utilized.
> To address the issue we should handle the two cases separately.
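
For readers unfamiliar with the mechanism, here is a minimal sketch of how a task can wait out a delay in a `java.util.concurrent.DelayQueue`. It is not the LLAP scheduler's code: `LocalityDelaySketch`, `DelayedTask`, `pollReady`, and the task names are hypothetical. A task is only handed out by `poll()` once its delay has expired, so a task that is never placed in the queue never gets that later chance.

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

public class LocalityDelaySketch {

  /** Hypothetical stand-in for a task waiting out its locality delay. */
  static final class DelayedTask implements Delayed {
    final String name;
    final long readyAtMillis; // wall-clock time at which the delay expires

    DelayedTask(String name, long readyAtMillis) {
      this.name = name;
      this.readyAtMillis = readyAtMillis;
    }

    @Override
    public long getDelay(TimeUnit unit) {
      // Zero or negative means the delay has expired.
      return unit.convert(readyAtMillis - System.currentTimeMillis(), TimeUnit.MILLISECONDS);
    }

    @Override
    public int compareTo(Delayed other) {
      return Long.compare(getDelay(TimeUnit.MILLISECONDS), other.getDelay(TimeUnit.MILLISECONDS));
    }
  }

  /** Returns the name of the next task whose delay has expired, or null if none has. */
  static String pollReady(DelayQueue<DelayedTask> queue) {
    DelayedTask task = queue.poll(); // non-blocking; null while all delays are pending
    return task == null ? null : task.name;
  }

  public static void main(String[] args) {
    DelayQueue<DelayedTask> queue = new DelayQueue<>();
    long now = System.currentTimeMillis();
    queue.add(new DelayedTask("still-waiting-for-locality", now + 60_000));
    queue.add(new DelayedTask("locality-delay-expired", now - 1_000));
    System.out.println(pollReady(queue)); // the expired task
    System.out.println(pollReady(queue)); // null: the other task is still delayed
  }
}
```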





[jira] [Updated] (HIVE-25362) LLAP: ensure tasks with locality have a chance to adjust delay

2021-07-21 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis updated HIVE-25362:
--
Component/s: llap

> LLAP: ensure tasks with locality have a chance to adjust delay
> --
>
> Key: HIVE-25362
> URL: https://issues.apache.org/jira/browse/HIVE-25362
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> HIVE-24914 introduced a short-circuit optimization that, when all nodes are 
> busy, returns DELAYED_RESOURCES and resets the locality delay for a given task.
> However, this may prevent tasks from adjusting their locality delay and being 
> added to the DelayQueue, sometimes leading to missed locality chances when all 
> LLAP resources are fully utilized.
> To address the issue we should handle the two cases separately.





[jira] [Updated] (HIVE-25362) LLAP: ensure tasks with locality have a chance to adjust delay

2021-07-21 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis updated HIVE-25362:
--
Description: 
HIVE-24914 introduced a short-circuit optimization that, when all nodes are 
busy, returns DELAYED_RESOURCES and resets the locality delay for a given task.

However, this may prevent tasks from adjusting their locality delay and being 
added to the DelayQueue, sometimes leading to missed locality chances when all 
LLAP resources are fully utilized.
To address the issue we should handle the two cases separately.

  was:
HIVE-24914 introduced a short-circuit optimization that, when all nodes are 
busy, returns DELAYED_RESOURCES and resets the locality delay for a given task.

However, this may prevent tasks from being added to the DelayQueue, leading to 
worse locality when all LLAP resources are fully utilized.
To address the issue we should handle the two cases separately.


> LLAP: ensure tasks with locality have a chance to adjust delay
> --
>
> Key: HIVE-25362
> URL: https://issues.apache.org/jira/browse/HIVE-25362
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>
> HIVE-24914 introduced a short-circuit optimization that, when all nodes are 
> busy, returns DELAYED_RESOURCES and resets the locality delay for a given task.
> However, this may prevent tasks from adjusting their locality delay and being 
> added to the DelayQueue, sometimes leading to missed locality chances when all 
> LLAP resources are fully utilized.
> To address the issue we should handle the two cases separately.





[jira] [Updated] (HIVE-25362) LLAP: ensure tasks with locality have a chance to adjust delay

2021-07-21 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis updated HIVE-25362:
--
Summary: LLAP: ensure tasks with locality have a chance to adjust delay  
(was: LLAP: ensure tasks with locality have a chance to adjust localityDelay)

> LLAP: ensure tasks with locality have a chance to adjust delay
> --
>
> Key: HIVE-25362
> URL: https://issues.apache.org/jira/browse/HIVE-25362
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>
> HIVE-24914 introduced a short-circuit optimization that, when all nodes are 
> busy, returns DELAYED_RESOURCES and resets the locality delay for a given task.
> However, this may prevent tasks from being added to the DelayQueue, leading to 
> worse locality when all LLAP resources are fully utilized.
> To address the issue we should handle the two cases separately.





[jira] [Updated] (HIVE-25362) LLAP: ensure tasks with locality have a chance to adjust localityDelay

2021-07-21 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis updated HIVE-25362:
--
Summary: LLAP: ensure tasks with locality have a chance to adjust 
localityDelay  (was: LLAP: ensure tasks with locality are added to DelayQueue)

> LLAP: ensure tasks with locality have a chance to adjust localityDelay
> --
>
> Key: HIVE-25362
> URL: https://issues.apache.org/jira/browse/HIVE-25362
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>
> HIVE-24914 introduced a short-circuit optimization that, when all nodes are 
> busy, returns DELAYED_RESOURCES and resets the locality delay for a given task.
> However, this may prevent tasks from being added to the DelayQueue, leading to 
> worse locality when all LLAP resources are fully utilized.
> To address the issue we should handle the two cases separately.





[jira] [Updated] (HIVE-25362) LLAP: ensure tasks with locality are added to DelayQueue

2021-07-21 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis updated HIVE-25362:
--
Parent: HIVE-24913
Issue Type: Sub-task  (was: Bug)

> LLAP: ensure tasks with locality are added to DelayQueue
> 
>
> Key: HIVE-25362
> URL: https://issues.apache.org/jira/browse/HIVE-25362
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>
> HIVE-24914 introduced a short-circuit optimization that, when all nodes are 
> busy, returns DELAYED_RESOURCES and resets the locality delay for a given task.
> However, this may prevent tasks from being added to the DelayQueue, leading to 
> worse locality when all LLAP resources are fully utilized.
> To address the issue we should handle the two cases separately.





[jira] [Assigned] (HIVE-25362) LLAP: ensure tasks with locality are added to DelayQueue

2021-07-21 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis reassigned HIVE-25362:
-


> LLAP: ensure tasks with locality are added to DelayQueue
> 
>
> Key: HIVE-25362
> URL: https://issues.apache.org/jira/browse/HIVE-25362
> Project: Hive
>  Issue Type: Bug
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>
> HIVE-24914 introduced a short-circuit optimization that, when all nodes are 
> busy, returns DELAYED_RESOURCES and resets the locality delay for a given task.
> However, this may prevent tasks from being added to the DelayQueue, leading to 
> worse locality when all LLAP resources are fully utilized.
> To address the issue we should handle the two cases separately.





[jira] [Commented] (HIVE-21489) EXPLAIN command throws ClassCastException in Hive

2021-07-13 Thread Panagiotis Garefalakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-21489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379963#comment-17379963
 ] 

Panagiotis Garefalakis commented on HIVE-21489:
---

Resolved via https://github.com/apache/hive/pull/2373 
Thanks [~rameshkumar] for the patch! 

> EXPLAIN command throws ClassCastException in Hive
> -
>
> Key: HIVE-21489
> URL: https://issues.apache.org/jira/browse/HIVE-21489
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.3.4, 3.1.2
>Reporter: Ping Lu
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.1.3, 4.0.0
>
> Attachments: HIVE-21489.1.patch, HIVE-21489.2.patch
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> I'm trying to run commands like explain select * from src in hive-2.3.4, but 
> it fails with the ClassCastException: 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer cannot be cast to 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer
> Steps to reproduce:
> 1) hive.execution.engine is the default value mr
> 2) hive.security.authorization.enabled is set to true, and 
> hive.security.authorization.manager is set to 
> org.apache.hadoop.hive.ql.security.authorization.DefaultHiveAuthorizationProvider
> 3) start hivecli and run the command: explain select * from src
> I debugged the code and found that HIVE-18778 causes the above 
> ClassCastException. If I set hive.in.test to true, the explain command 
> executes successfully.
> Now I have one question: since hive.in.test can't be modified at runtime, how 
> can I run the explain command with the default authorization in hive-2.3.4?





[jira] [Updated] (HIVE-21489) EXPLAIN command throws ClassCastException in Hive

2021-07-13 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis updated HIVE-21489:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> EXPLAIN command throws ClassCastException in Hive
> -
>
> Key: HIVE-21489
> URL: https://issues.apache.org/jira/browse/HIVE-21489
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.3.4, 3.1.2
>Reporter: Ping Lu
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.1.3, 4.0.0
>
> Attachments: HIVE-21489.1.patch, HIVE-21489.2.patch
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> I'm trying to run commands like explain select * from src in hive-2.3.4, but 
> it fails with the ClassCastException: 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer cannot be cast to 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer
> Steps to reproduce:
> 1) hive.execution.engine is the default value mr
> 2) hive.security.authorization.enabled is set to true, and 
> hive.security.authorization.manager is set to 
> org.apache.hadoop.hive.ql.security.authorization.DefaultHiveAuthorizationProvider
> 3) start hivecli and run the command: explain select * from src
> I debugged the code and found that HIVE-18778 causes the above 
> ClassCastException. If I set hive.in.test to true, the explain command 
> executes successfully.
> Now I have one question: since hive.in.test can't be modified at runtime, how 
> can I run the explain command with the default authorization in hive-2.3.4?





[jira] [Updated] (HIVE-21489) EXPLAIN command throws ClassCastException in Hive

2021-07-13 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis updated HIVE-21489:
--
Affects Version/s: 3.1.2

> EXPLAIN command throws ClassCastException in Hive
> -
>
> Key: HIVE-21489
> URL: https://issues.apache.org/jira/browse/HIVE-21489
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.3.4, 3.1.2
>Reporter: Ping Lu
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21489.1.patch, HIVE-21489.2.patch
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> I'm trying to run commands like explain select * from src in hive-2.3.4, but 
> it fails with the ClassCastException: 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer cannot be cast to 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer
> Steps to reproduce:
> 1) hive.execution.engine is the default value mr
> 2) hive.security.authorization.enabled is set to true, and 
> hive.security.authorization.manager is set to 
> org.apache.hadoop.hive.ql.security.authorization.DefaultHiveAuthorizationProvider
> 3) start hivecli and run the command: explain select * from src
> I debugged the code and found that HIVE-18778 causes the above 
> ClassCastException. If I set hive.in.test to true, the explain command 
> executes successfully.
> Now I have one question: since hive.in.test can't be modified at runtime, how 
> can I run the explain command with the default authorization in hive-2.3.4?





[jira] [Updated] (HIVE-21489) EXPLAIN command throws ClassCastException in Hive

2021-07-13 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis updated HIVE-21489:
--
Fix Version/s: 4.0.0
   3.1.3

> EXPLAIN command throws ClassCastException in Hive
> -
>
> Key: HIVE-21489
> URL: https://issues.apache.org/jira/browse/HIVE-21489
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.3.4, 3.1.2
>Reporter: Ping Lu
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.1.3, 4.0.0
>
> Attachments: HIVE-21489.1.patch, HIVE-21489.2.patch
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> I'm trying to run commands like explain select * from src in hive-2.3.4, but 
> it fails with the ClassCastException: 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer cannot be cast to 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer
> Steps to reproduce:
> 1) hive.execution.engine is the default value mr
> 2) hive.security.authorization.enabled is set to true, and 
> hive.security.authorization.manager is set to 
> org.apache.hadoop.hive.ql.security.authorization.DefaultHiveAuthorizationProvider
> 3) start hivecli and run the command: explain select * from src
> I debugged the code and found that HIVE-18778 causes the above 
> ClassCastException. If I set hive.in.test to true, the explain command 
> executes successfully.
> Now I have one question: since hive.in.test can't be modified at runtime, how 
> can I run the explain command with the default authorization in hive-2.3.4?





[jira] [Assigned] (HIVE-25242) Query performs extremely slow with hive.vectorized.adaptor.usage.mode = chosen

2021-06-22 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis reassigned HIVE-25242:
-

Assignee: Attila Magyar

>  Query performs extremely slow with hive.vectorized.adaptor.usage.mode = 
> chosen
> ---
>
> Key: HIVE-25242
> URL: https://issues.apache.org/jira/browse/HIVE-25242
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 4.0.0
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> If hive.vectorized.adaptor.usage.mode is set to chosen, only certain UDFs are 
> vectorized through the vectorized adaptor.
> Queries like this one perform very slowly because concat is not chosen 
> to be vectorized.
> {code:java}
> select count(*) from tbl where to_date(concat(year, '-', month, '-', day)) 
> between to_date('2018-12-01') and to_date('2021-03-01');
> {code}
> The patch whitelists the concat UDF so that it uses the vectorized adaptor in 
> chosen mode.





[jira] [Assigned] (HIVE-25248) Fix TestLlapTaskSchedulerService#testForcedLocalityMultiplePreemptionsSameHost1

2021-06-15 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis reassigned HIVE-25248:
-

Assignee: Panagiotis Garefalakis

> Fix 
> TestLlapTaskSchedulerService#testForcedLocalityMultiplePreemptionsSameHost1
> ---
>
> Key: HIVE-25248
> URL: https://issues.apache.org/jira/browse/HIVE-25248
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Panagiotis Garefalakis
>Priority: Major
>
> This test has recently been failing intermittently:
> http://ci.hive.apache.org/job/hive-flaky-check/233/testReport/org.apache.hadoop.hive.llap.tezplugins/TestLlapTaskSchedulerService/testForcedLocalityMultiplePreemptionsSameHost1/





[jira] [Updated] (HIVE-24458) Allow access to SArgs without converting to disjunctive normal form

2021-06-08 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis updated HIVE-24458:
--
Fix Version/s: storage-2.7.3

> Allow access to SArgs without converting to disjunctive normal form
> ---
>
> Key: HIVE-24458
> URL: https://issues.apache.org/jira/browse/HIVE-24458
> Project: Hive
>  Issue Type: Improvement
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0, storage-2.7.3
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> For some use cases, it is useful to have access to the SArg expression in a 
> non-normalized form. Currently, the SArg only provides the fully normalized 
> expression.





[jira] [Updated] (HIVE-25117) Vector PTF ClassCastException with Decimal64

2021-06-07 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis updated HIVE-25117:
--
Affects Version/s: 4.0.0

> Vector PTF ClassCastException with Decimal64
> 
>
> Key: HIVE-25117
> URL: https://issues.apache.org/jira/browse/HIVE-25117
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Panagiotis Garefalakis
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
> Attachments: vector_ptf_classcast_exception.q
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Only reproduces when there is at least 1 buffered batch, so needed 2 rows 
> with 1 row/batch:
> {code:java}
> set hive.vectorized.testing.reducer.batch.size=1;
> {code}
> {code:java}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.LongColumnVector
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorizedBatchUtil.copyNonSelectedColumnVector(VectorizedBatchUtil.java:664)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFGroupBatches.forwardBufferedBatches(VectorPTFGroupBatches.java:228)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFGroupBatches.fillGroupResultsAndForward(VectorPTFGroupBatches.java:318)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFOperator.process(VectorPTFOperator.java:403)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:919)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:497)
> {code}
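
The stack trace above boils down to an unchecked downcast between two sibling vector types. The sketch below reproduces that situation with stand-in classes; it is not Hive code and not the actual fix. The nested classes only mirror, as an assumption, the relevant part of the hive-storage-api hierarchy, where Decimal64ColumnVector is long-backed and DecimalColumnVector is not. `VectorCastSketch`, `copyKind`, and `uncheckedCastFails` are hypothetical names.

```java
public class VectorCastSketch {

  // Stand-ins for the real vector classes (assumed hierarchy, illustration only).
  static class ColumnVector {}
  static class LongColumnVector extends ColumnVector {}
  static class Decimal64ColumnVector extends LongColumnVector {}
  static class DecimalColumnVector extends ColumnVector {}

  /** Picks a copy strategy by checking the runtime type instead of casting blindly. */
  static String copyKind(ColumnVector source) {
    if (source instanceof LongColumnVector) {
      return "long-backed copy"; // covers LongColumnVector and Decimal64ColumnVector
    }
    if (source instanceof DecimalColumnVector) {
      return "decimal copy";
    }
    return "unsupported";
  }

  /** Returns true if casting the source to LongColumnVector throws ClassCastException. */
  static boolean uncheckedCastFails(ColumnVector source) {
    try {
      LongColumnVector ignored = (LongColumnVector) source;
      return false;
    } catch (ClassCastException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(uncheckedCastFails(new DecimalColumnVector()));   // true
    System.out.println(uncheckedCastFails(new Decimal64ColumnVector())); // false
    System.out.println(copyKind(new DecimalColumnVector()));             // decimal copy
  }
}
```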





[jira] [Resolved] (HIVE-25117) Vector PTF ClassCastException with Decimal64

2021-06-07 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis resolved HIVE-25117.
---
Resolution: Fixed

> Vector PTF ClassCastException with Decimal64
> 
>
> Key: HIVE-25117
> URL: https://issues.apache.org/jira/browse/HIVE-25117
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Panagiotis Garefalakis
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: vector_ptf_classcast_exception.q
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Only reproduces when there is at least 1 buffered batch, so needed 2 rows 
> with 1 row/batch:
> {code:java}
> set hive.vectorized.testing.reducer.batch.size=1;
> {code}
> {code:java}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.LongColumnVector
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorizedBatchUtil.copyNonSelectedColumnVector(VectorizedBatchUtil.java:664)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFGroupBatches.forwardBufferedBatches(VectorPTFGroupBatches.java:228)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFGroupBatches.fillGroupResultsAndForward(VectorPTFGroupBatches.java:318)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFOperator.process(VectorPTFOperator.java:403)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:919)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:497)
> {code}





[jira] [Updated] (HIVE-25117) Vector PTF ClassCastException with Decimal64

2021-06-07 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis updated HIVE-25117:
--
Fix Version/s: 4.0.0

> Vector PTF ClassCastException with Decimal64
> 
>
> Key: HIVE-25117
> URL: https://issues.apache.org/jira/browse/HIVE-25117
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Panagiotis Garefalakis
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: vector_ptf_classcast_exception.q
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Only reproduces when there is at least 1 buffered batch, so needed 2 rows 
> with 1 row/batch:
> {code:java}
> set hive.vectorized.testing.reducer.batch.size=1;
> {code}
> {code:java}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.LongColumnVector
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorizedBatchUtil.copyNonSelectedColumnVector(VectorizedBatchUtil.java:664)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFGroupBatches.forwardBufferedBatches(VectorPTFGroupBatches.java:228)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFGroupBatches.fillGroupResultsAndForward(VectorPTFGroupBatches.java:318)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFOperator.process(VectorPTFOperator.java:403)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:919)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:497)
> {code}





[jira] [Commented] (HIVE-25117) Vector PTF ClassCastException with Decimal64

2021-06-07 Thread Panagiotis Garefalakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17358477#comment-17358477
 ] 

Panagiotis Garefalakis commented on HIVE-25117:
---

Resolved via https://github.com/apache/hive/pull/2286 
Thanks [~rameshkumar] for the patch! 

> Vector PTF ClassCastException with Decimal64
> 
>
> Key: HIVE-25117
> URL: https://issues.apache.org/jira/browse/HIVE-25117
> Project: Hive
>  Issue Type: Bug
>Reporter: Panagiotis Garefalakis
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
> Attachments: vector_ptf_classcast_exception.q
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Only reproduces when there is at least 1 buffered batch, so needed 2 rows 
> with 1 row/batch:
> {code:java}
> set hive.vectorized.testing.reducer.batch.size=1;
> {code}
> {code:java}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.LongColumnVector
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorizedBatchUtil.copyNonSelectedColumnVector(VectorizedBatchUtil.java:664)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFGroupBatches.forwardBufferedBatches(VectorPTFGroupBatches.java:228)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFGroupBatches.fillGroupResultsAndForward(VectorPTFGroupBatches.java:318)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFOperator.process(VectorPTFOperator.java:403)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:919)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:497)
> {code}





[jira] [Comment Edited] (HIVE-25180) Update netty to 4.1.60.Final

2021-06-07 Thread Panagiotis Garefalakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17358449#comment-17358449
 ] 

Panagiotis Garefalakis edited comment on HIVE-25180 at 6/7/21, 8:50 AM:


Resolved via https://github.com/apache/hive/pull/2345 thanks [~Csaba] and 
[~kgyrtkirk] for the review!


was (Author: pgaref):
Resolved via https://github.com/apache/hive/pull/2345 thanks [~Csaba]

> Update netty to 4.1.60.Final
> 
>
> Key: HIVE-25180
> URL: https://issues.apache.org/jira/browse/HIVE-25180
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: Zoltan Haindrich
>Assignee: Csaba Juhász
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (HIVE-25180) Update netty to 4.1.60.Final

2021-06-07 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis resolved HIVE-25180.
---
Resolution: Fixed

> Update netty to 4.1.60.Final
> 
>
> Key: HIVE-25180
> URL: https://issues.apache.org/jira/browse/HIVE-25180
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: Zoltan Haindrich
>Assignee: Csaba Juhász
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Commented] (HIVE-25180) Update netty to 4.1.60.Final

2021-06-07 Thread Panagiotis Garefalakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17358449#comment-17358449
 ] 

Panagiotis Garefalakis commented on HIVE-25180:
---

Resolved via https://github.com/apache/hive/pull/2345 thanks [~Csaba]

> Update netty to 4.1.60.Final
> 
>
> Key: HIVE-25180
> URL: https://issues.apache.org/jira/browse/HIVE-25180
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: Zoltan Haindrich
>Assignee: Csaba Juhász
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-25180) Update netty to 4.1.60.Final

2021-06-07 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis updated HIVE-25180:
--
Affects Version/s: 4.0.0

> Update netty to 4.1.60.Final
> 
>
> Key: HIVE-25180
> URL: https://issues.apache.org/jira/browse/HIVE-25180
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: Zoltan Haindrich
>Assignee: Csaba Juhász
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-25180) Update netty to 4.1.60.Final

2021-06-07 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis updated HIVE-25180:
--
Fix Version/s: 4.0.0

> Update netty to 4.1.60.Final
> 
>
> Key: HIVE-25180
> URL: https://issues.apache.org/jira/browse/HIVE-25180
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: Zoltan Haindrich
>Assignee: Csaba Juhász
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Assigned] (HIVE-25180) Update netty to 4.1.60.Final

2021-06-07 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis reassigned HIVE-25180:
-

Assignee: Csaba Juhász

> Update netty to 4.1.60.Final
> 
>
> Key: HIVE-25180
> URL: https://issues.apache.org/jira/browse/HIVE-25180
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Csaba Juhász
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-25202) Support decimal64 operations for PTF operators

2021-06-04 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis updated HIVE-25202:
--
Affects Version/s: 4.0.0

> Support decimal64 operations for PTF operators
> --
>
> Key: HIVE-25202
> URL: https://issues.apache.org/jira/browse/HIVE-25202
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>
> After decimal64 vectorization support was added for multiple operators, PTF 
> operators were found to break the decimal64 chain when they occur between 
> two such operators. As a result, they introduce an unnecessary cast to 
> decimal. To prevent this, PTF operators should handle decimal64 data types 
> too.





[jira] [Assigned] (HIVE-24037) Parallelize hash table constructions in map joins

2021-06-04 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis reassigned HIVE-24037:
-

Assignee: (was: Ramesh Kumar Thangarajan)

> Parallelize hash table constructions in map joins
> -
>
> Key: HIVE-24037
> URL: https://issues.apache.org/jira/browse/HIVE-24037
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Parallelize hash table constructions in map joins





[jira] [Resolved] (HIVE-25163) UnsupportedTemporalTypeException when starting llap

2021-06-04 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis resolved HIVE-25163.
---
Resolution: Fixed

> UnsupportedTemporalTypeException when starting llap
> ---
>
> Key: HIVE-25163
> URL: https://issues.apache.org/jira/browse/HIVE-25163
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 4.0.0
>Reporter: Istvan Toth
>Assignee: Istvan Toth
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When trying to start the LLAP service I get
> {noformat}
> java.time.temporal.UnsupportedTemporalTypeException: Unsupported field: Year
>   at java.time.Instant.getLong(Instant.java:603)
>   at 
> java.time.format.DateTimePrintContext$1.getLong(DateTimePrintContext.java:205)
>   at 
> java.time.format.DateTimePrintContext.getValue(DateTimePrintContext.java:298)
>   at 
> java.time.format.DateTimeFormatterBuilder$NumberPrinterParser.format(DateTimeFormatterBuilder.java:2551)
>   at 
> java.time.format.DateTimeFormatterBuilder$CompositePrinterParser.format(DateTimeFormatterBuilder.java:2190)
>   at 
> java.time.format.DateTimeFormatter.formatTo(DateTimeFormatter.java:1746)
>   at 
> java.time.format.DateTimeFormatter.format(DateTimeFormatter.java:1720)
>   at 
> org.apache.hadoop.hive.llap.cli.service.LlapServiceDriver.startLlap(LlapServiceDriver.java:301)
>   at 
> org.apache.hadoop.hive.llap.cli.service.LlapServiceDriver.run(LlapServiceDriver.java:133)
>   at 
> org.apache.hadoop.hive.llap.cli.service.LlapServiceDriver.main(LlapServiceDriver.java:386)
> {noformat}
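
For readers who have not hit this exception before, here is a minimal sketch of how a trace like the one above can arise, assuming the formatter has no time zone attached. The choice of `DateTimeFormatter.ISO_LOCAL_DATE` and the class and method names are illustrative assumptions, not the code in `LlapServiceDriver` or the fix that was merged. An `Instant` carries no calendar fields, so a formatter that asks for the year fails unless a zone is supplied, for example via `withZone`.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.temporal.UnsupportedTemporalTypeException;

// Hypothetical demo class; not part of Hive.
class InstantFormatSketch {

    // Assumption: a date formatter without a zone, used only for illustration.
    static final DateTimeFormatter NO_ZONE = DateTimeFormatter.ISO_LOCAL_DATE;

    // Formatting an Instant directly fails: Instant does not support the Year field.
    static String formatWithoutZone(Instant instant) {
        return NO_ZONE.format(instant);
    }

    // Attaching a zone lets the formatter convert the Instant to a zoned date-time first.
    static String formatWithZone(Instant instant) {
        return NO_ZONE.withZone(ZoneOffset.UTC).format(instant);
    }

    public static void main(String[] args) {
        try {
            formatWithoutZone(Instant.EPOCH);
        } catch (UnsupportedTemporalTypeException e) {
            System.out.println("Failed without zone: " + e.getMessage());
        }
        System.out.println("With zone: " + formatWithZone(Instant.EPOCH));
    }
}
```

Whether UTC or the system default zone is the right choice depends on how the formatted value is consumed; the sketch uses UTC only to keep the output deterministic.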





[jira] [Commented] (HIVE-25163) UnsupportedTemporalTypeException when starting llap

2021-06-04 Thread Panagiotis Garefalakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17357250#comment-17357250
 ] 

Panagiotis Garefalakis commented on HIVE-25163:
---

Resolved via https://github.com/apache/hive/pull/2322 
Thanks for the patch [~stoty] ! 

> UnsupportedTemporalTypeException when starting llap
> ---
>
> Key: HIVE-25163
> URL: https://issues.apache.org/jira/browse/HIVE-25163
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 4.0.0
>Reporter: Istvan Toth
>Assignee: Istvan Toth
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When trying to start the LLAP service I get
> {noformat}
> java.time.temporal.UnsupportedTemporalTypeException: Unsupported field: Year
>   at java.time.Instant.getLong(Instant.java:603)
>   at 
> java.time.format.DateTimePrintContext$1.getLong(DateTimePrintContext.java:205)
>   at 
> java.time.format.DateTimePrintContext.getValue(DateTimePrintContext.java:298)
>   at 
> java.time.format.DateTimeFormatterBuilder$NumberPrinterParser.format(DateTimeFormatterBuilder.java:2551)
>   at 
> java.time.format.DateTimeFormatterBuilder$CompositePrinterParser.format(DateTimeFormatterBuilder.java:2190)
>   at 
> java.time.format.DateTimeFormatter.formatTo(DateTimeFormatter.java:1746)
>   at 
> java.time.format.DateTimeFormatter.format(DateTimeFormatter.java:1720)
>   at 
> org.apache.hadoop.hive.llap.cli.service.LlapServiceDriver.startLlap(LlapServiceDriver.java:301)
>   at 
> org.apache.hadoop.hive.llap.cli.service.LlapServiceDriver.run(LlapServiceDriver.java:133)
>   at 
> org.apache.hadoop.hive.llap.cli.service.LlapServiceDriver.main(LlapServiceDriver.java:386)
> {noformat}





[jira] [Updated] (HIVE-25163) UnsupportedTemporalTypeException when starting llap

2021-06-04 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis updated HIVE-25163:
--
Fix Version/s: 4.0.0

> UnsupportedTemporalTypeException when starting llap
> ---
>
> Key: HIVE-25163
> URL: https://issues.apache.org/jira/browse/HIVE-25163
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 4.0.0
>Reporter: Istvan Toth
>Assignee: Istvan Toth
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When trying to start the LLAP service I get
> {noformat}
> java.time.temporal.UnsupportedTemporalTypeException: Unsupported field: Year
>   at java.time.Instant.getLong(Instant.java:603)
>   at 
> java.time.format.DateTimePrintContext$1.getLong(DateTimePrintContext.java:205)
>   at 
> java.time.format.DateTimePrintContext.getValue(DateTimePrintContext.java:298)
>   at 
> java.time.format.DateTimeFormatterBuilder$NumberPrinterParser.format(DateTimeFormatterBuilder.java:2551)
>   at 
> java.time.format.DateTimeFormatterBuilder$CompositePrinterParser.format(DateTimeFormatterBuilder.java:2190)
>   at 
> java.time.format.DateTimeFormatter.formatTo(DateTimeFormatter.java:1746)
>   at 
> java.time.format.DateTimeFormatter.format(DateTimeFormatter.java:1720)
>   at 
> org.apache.hadoop.hive.llap.cli.service.LlapServiceDriver.startLlap(LlapServiceDriver.java:301)
>   at 
> org.apache.hadoop.hive.llap.cli.service.LlapServiceDriver.run(LlapServiceDriver.java:133)
>   at 
> org.apache.hadoop.hive.llap.cli.service.LlapServiceDriver.main(LlapServiceDriver.java:386)
> {noformat}





[jira] [Commented] (HIVE-25169) using coalesce via vector,source column type is int and target column type is bigint,the result of target is zero

2021-05-28 Thread Panagiotis Garefalakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17353306#comment-17353306
 ] 

Panagiotis Garefalakis commented on HIVE-25169:
---

Hey [~junnan.yang], thanks for reporting this! Would it make sense to backport 
the ticket that resolved this from master?
On a general note, it would be much easier to review this as a GitHub PR with a 
test case.

Cheers


> using coalesce via vector,source column type is int and target column type is 
> bigint,the result of target is zero
> -
>
> Key: HIVE-25169
> URL: https://issues.apache.org/jira/browse/HIVE-25169
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 3.1.2
>Reporter: junnan.yang
>Priority: Major
> Attachments: HIVE-25169.01.patch
>
>
> sourceTable:
>     product_id int;
> ###
> targetTable:
>     product_id bigint;
> ##
> sql: 
>     insert overwrite table targetTable
>     select 
>     ..
>      coalesce(product_id,-1),
>     ..
>     from sourceTable;
> ##
> explain sql :
>      UDFToLong(COALESCE(product_id,-1)) (type: bigint)
> ##
> result :
>      the product_id column in targetTable is zero, which is the wrong result
>  
>  





[jira] [Commented] (HIVE-25155) Bump ORC to 1.6.8

2021-05-27 Thread Panagiotis Garefalakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17352606#comment-17352606
 ] 

Panagiotis Garefalakis commented on HIVE-25155:
---

Resolved via https://github.com/apache/hive/pull/2313

> Bump ORC to 1.6.8
> -
>
> Key: HIVE-25155
> URL: https://issues.apache.org/jira/browse/HIVE-25155
> Project: Hive
>  Issue Type: Improvement
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Trivial
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
>  https://orc.apache.org/news/2021/05/21/ORC-1.6.8/





[jira] [Resolved] (HIVE-25155) Bump ORC to 1.6.8

2021-05-27 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis resolved HIVE-25155.
---
Resolution: Fixed

> Bump ORC to 1.6.8
> -
>
> Key: HIVE-25155
> URL: https://issues.apache.org/jira/browse/HIVE-25155
> Project: Hive
>  Issue Type: Improvement
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Trivial
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
>  https://orc.apache.org/news/2021/05/21/ORC-1.6.8/





[jira] [Updated] (HIVE-25155) Bump ORC to 1.6.8

2021-05-27 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis updated HIVE-25155:
--
Fix Version/s: 4.0.0

> Bump ORC to 1.6.8
> -
>
> Key: HIVE-25155
> URL: https://issues.apache.org/jira/browse/HIVE-25155
> Project: Hive
>  Issue Type: Improvement
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Trivial
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
>  https://orc.apache.org/news/2021/05/21/ORC-1.6.8/





[jira] [Updated] (HIVE-25148) Support parallel load for Optimized HT implementations

2021-05-27 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis updated HIVE-25148:
--
Summary: Support parallel load for Optimized HT implementations  (was: 
Support parallel load for Fast HT implementations)

> Support parallel load for Optimized HT implementations
> --
>
> Key: HIVE-25148
> URL: https://issues.apache.org/jira/browse/HIVE-25148
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Panagiotis Garefalakis
>Priority: Major
>





