[jira] [Updated] (HIVE-28019) Fix query type information in proto files for load queries

2024-04-22 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-28019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-28019:

Summary: Fix query type information in proto files for load queries  (was: 
Fix query type information in proto files for load and explain queries)

> Fix query type information in proto files for load queries
> --
>
> Key: HIVE-28019
> URL: https://issues.apache.org/jira/browse/HIVE-28019
> Project: Hive
>  Issue Type: Task
>  Components: HiveServer2
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
>
> Certain query types like LOAD, export, import and explain queries did not 
> produce the right Hive operation type



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HIVE-28019) Fix query type information in proto files for load and explain queries

2024-04-15 Thread Ramesh Kumar Thangarajan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-28019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17837401#comment-17837401
 ] 

Ramesh Kumar Thangarajan commented on HIVE-28019:
-

Hi [~zabetak] First of all, thank you very much for the review on this. :)

I agree that HiveOperation was introduced for authorization and that maybe we 
should not change it to represent the query type. But I still believe we should 
make the change for PREHOOK: type:, POSTHOOK: type:, and the 
HiveProtoLoggingHook.

I feel that the change to HiveOperation.Explain for the explain queries is 
needed mostly because we use the HiveOperation to print in the preexecute and 
postexecute actions.

[https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/hooks/PreExecutePrinter.java#L69]

At present we report the type information for the queries in preexec and 
postexec as below:

PREHOOK: type: QUERY

POSTHOOK: type: QUERY

I think this is the query type information that is reported along with other 
information on the query. If that is the case, I feel we should not report any 
other type for explain queries. If this change is a loss of information, 
shouldn't the users' usage of the type be considered wrong? 

Although we could skip this and fix only the HiveProtoLoggingHook to report the 
right query type, I feel we would then report two different pieces of 
information for the same query in different places. Keeping them synchronized 
will also help us test all types of queries completely.

Please let me know if you think my points make sense. I will change the patch 
so that it does not touch the commandType and instead creates a field to 
represent explain queries, and uses that field to report the correct query type 
in HiveProtoLoggingHook and in PREHOOK: type: and POSTHOOK: type:.
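
The proposal in the last paragraph could look roughly like the sketch below. It is only an illustration under stated assumptions: QueryTypeSketch, Operation, QueryInfo and reportedType are hypothetical names, not Hive classes, and the enum is a stand-in for the authorization-oriented operation type that would stay untouched.

```java
// Hypothetical sketch: none of these names are Hive's actual classes or fields.
final class QueryTypeSketch {
    /** Stand-in for the authorization-oriented operation type; it is not modified. */
    enum Operation { QUERY, LOAD, EXPORT, IMPORT }

    /** Hypothetical per-query state: the existing command type plus a separate explain flag. */
    static final class QueryInfo {
        final Operation commandType;
        final boolean explain;

        QueryInfo(Operation commandType, boolean explain) {
            this.commandType = commandType;
            this.explain = explain;
        }
    }

    /** Single source for the type string, so every reporting site prints the same value. */
    static String reportedType(QueryInfo info) {
        return info.explain ? "EXPLAIN" : info.commandType.name();
    }

    public static void main(String[] args) {
        // An "explain select ..." keeps commandType QUERY but is reported as EXPLAIN.
        QueryInfo explainSelect = new QueryInfo(Operation.QUERY, true);
        System.out.println("PREHOOK: type: " + reportedType(explainSelect));
        System.out.println("POSTHOOK: type: " + reportedType(explainSelect));
    }
}
```

Because both the hook printers and the proto logger would call the same helper, the two places cannot drift apart.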

> Fix query type information in proto files for load and explain queries
> --
>
> Key: HIVE-28019
> URL: https://issues.apache.org/jira/browse/HIVE-28019
> Project: Hive
>  Issue Type: Task
>  Components: HiveServer2
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
>
> Certain query types like LOAD, export, import and explain queries did not 
> produce the right Hive operation type





[jira] [Updated] (HIVE-28129) Execute statement doesnot report the correct query string information

2024-04-02 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-28129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-28129:

Fix Version/s: 4.0.0

> Execute statement doesnot report the correct query string information
> -
>
> Key: HIVE-28129
> URL: https://issues.apache.org/jira/browse/HIVE-28129
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Execute statement does not report the correct query type information.
> It inherits the sql statement type of the subsequent queries.





[jira] [Resolved] (HIVE-28129) Execute statement doesnot report the correct query string information

2024-04-02 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-28129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan resolved HIVE-28129.
-
Resolution: Fixed

> Execute statement doesnot report the correct query string information
> -
>
> Key: HIVE-28129
> URL: https://issues.apache.org/jira/browse/HIVE-28129
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Execute statement does not report the correct query type information.
> It inherits the sql statement type of the subsequent queries.





[jira] [Updated] (HIVE-28171) Active queries API does not provide query information if they are still pending

2024-04-02 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-28171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-28171:

Summary: Active queries API does not provide query information if they are 
still pending  (was: Active queries API does not provide information on the 
queries if they are still pending)

> Active queries API does not provide query information if they are still 
> pending
> ---
>
> Key: HIVE-28171
> URL: https://issues.apache.org/jira/browse/HIVE-28171
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>
> Active queries API does not provide information on the queries if they are 
> still pending, so we never know which query is pending when we call this 
> API endpoint.





[jira] [Created] (HIVE-28171) Active queries API does not provide information on the queries if they are still pending

2024-04-02 Thread Ramesh Kumar Thangarajan (Jira)
Ramesh Kumar Thangarajan created HIVE-28171:
---

 Summary: Active queries API does not provide information on the 
queries if they are still pending
 Key: HIVE-28171
 URL: https://issues.apache.org/jira/browse/HIVE-28171
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Ramesh Kumar Thangarajan
Assignee: Ramesh Kumar Thangarajan


Active queries API does not provide information on the queries if they are 
still pending, so we never know which query is pending when we call this 
API endpoint.





[jira] [Updated] (HIVE-28170) Implement drop stats

2024-04-02 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-28170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-28170:

Description: We need a way to drop the stats associated with a table or 
partition and its columns. This can help a lot in migration or replication, 
where the stats data takes a long time to copy. In particular, when the table 
is partitioned, there is a stats row for each (table, partition, column) 
combination, so the number of rows becomes very large when there are many 
partitions.

> Implement drop stats
> 
>
> Key: HIVE-28170
> URL: https://issues.apache.org/jira/browse/HIVE-28170
> Project: Hive
>  Issue Type: Task
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>
> We need a way to drop the stats associated with a table or partition and its 
> columns. This can help a lot in migration or replication, where the stats data 
> takes a long time to copy. In particular, when the table is partitioned, there 
> is a stats row for each (table, partition, column) combination, so the number 
> of rows becomes very large when there are many partitions.





[jira] [Created] (HIVE-28170) Implement drop stats

2024-04-02 Thread Ramesh Kumar Thangarajan (Jira)
Ramesh Kumar Thangarajan created HIVE-28170:
---

 Summary: Implement drop stats
 Key: HIVE-28170
 URL: https://issues.apache.org/jira/browse/HIVE-28170
 Project: Hive
  Issue Type: Task
Reporter: Ramesh Kumar Thangarajan
Assignee: Ramesh Kumar Thangarajan








[jira] [Resolved] (HIVE-28128) explain reoptimization doesnot report the correct querytype information

2024-03-25 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-28128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan resolved HIVE-28128.
-
Fix Version/s: 4.1.0
   Resolution: Done

> explain reoptimization doesnot report the correct querytype information
> ---
>
> Key: HIVE-28128
> URL: https://issues.apache.org/jira/browse/HIVE-28128
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Fix For: 4.1.0
>
>
> explain reoptimization does not report the correct query type information and 
> sometimes results in QUERY as the query type instead of explain





[jira] [Commented] (HIVE-28128) explain reoptimization doesnot report the correct querytype information

2024-03-25 Thread Ramesh Kumar Thangarajan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-28128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17830641#comment-17830641
 ] 

Ramesh Kumar Thangarajan commented on HIVE-28128:
-

Addressed as part of the Jira https://issues.apache.org/jira/browse/HIVE-28019

Resolving this.

> explain reoptimization doesnot report the correct querytype information
> ---
>
> Key: HIVE-28128
> URL: https://issues.apache.org/jira/browse/HIVE-28128
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>
> explain reoptimization does not report the correct query type information and 
> sometimes results in QUERY as the query type instead of explain





[jira] [Updated] (HIVE-28129) Execute statement doesnot report the correct query string information

2024-03-22 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-28129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-28129:

Issue Type: Bug  (was: Task)

> Execute statement doesnot report the correct query string information
> -
>
> Key: HIVE-28129
> URL: https://issues.apache.org/jira/browse/HIVE-28129
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
>
> Execute statement does not report the correct query type information.
> It inherits the sql statement type of the subsequent queries.





[jira] [Updated] (HIVE-28128) explain reoptimization doesnot report the correct querytype information

2024-03-22 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-28128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-28128:

Issue Type: Bug  (was: Task)

> explain reoptimization doesnot report the correct querytype information
> ---
>
> Key: HIVE-28128
> URL: https://issues.apache.org/jira/browse/HIVE-28128
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>
> explain reoptimization does not report the correct query type information and 
> sometimes results in QUERY as the query type instead of explain





[jira] [Resolved] (HIVE-27751) Log Query Compilation summary in an accumulated way

2024-03-22 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan resolved HIVE-27751.
-
Fix Version/s: 4.1.0
   Resolution: Fixed

> Log Query Compilation summary in an accumulated way
> ---
>
> Key: HIVE-27751
> URL: https://issues.apache.org/jira/browse/HIVE-27751
> Project: Hive
>  Issue Type: Task
>  Components: Hive
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.1.0
>
>
> Query Compilation summary is very useful for reading and collecting all the 
> measures of compile time in a single place. It is also useful for debugging a 
> performance issue in the query compilation phase and for reporting and 
> comparing results across runs.
> To test this, set the config hive.compile.print.summary to true in any q file 
> and run the test to see the Query Compilation Summary in the logs. One example 
> of the output is below. The order of operations is maintained when printing 
> the summary:
> {code:java}
> Query Compilation Summary
> --
> waitCompile                                                        0 ms
> parse                                                              4 ms
> getTableConstraints - HS2-cache                                   69 ms
> optimizer - Calcite: Plan generation                             257 ms
> optimizer - Calcite: Prejoin ordering transformation              20 ms
> optimizer - Calcite: Postjoin ordering transformation             24 ms
> optimizer                                                        705 ms
> optimizer - HiveOpConverterPostProc                                0 ms
> optimizer - Generator                                             24 ms
> optimizer - PartitionColumnsSeparator                              1 ms
> optimizer - SyntheticJoinPredicate                                 2 ms
> optimizer - SimplePredicatePushDown                                8 ms
> optimizer - RedundantDynamicPruningConditionsRemoval               0 ms
> optimizer - SortedDynPartitionTimeGranularityOptimizer             2 ms
> optimizer - PartitionPruner                                        3 ms
> optimizer - PartitionConditionRemover                              2 ms
> optimizer - GroupByOptimizer                                       2 ms
> optimizer - ColumnPruner                                          10 ms
> optimizer - CountDistinctRewriteProc                               1 ms
> optimizer - SamplePruner                                           1 ms
> optimizer - MapJoinProcessor                                       2 ms
> optimizer - BucketingSortingReduceSinkOptimizer                    2 ms
> optimizer - UnionProcessor                                         2 ms
> optimizer - JoinReorder                                            0 ms
> optimizer - FixedBucketPruningOptimizer                            2 ms
> optimizer - BucketVersionPopulator                                 2 ms
> optimizer - NonBlockingOpDeDupProc                                 1 ms
> optimizer - IdentityProjectRemover                                 0 ms
> optimizer - LimitPushdownOptimizer                                 2 ms
> optimizer - OrderlessLimitPushDownOptimizer                        1 ms
> optimizer - StatsOptimizer

[jira] [Work started] (HIVE-28129) Execute statement doesnot report the correct query string information

2024-03-22 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-28129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-28129 started by Ramesh Kumar Thangarajan.
---
> Execute statement doesnot report the correct query string information
> -
>
> Key: HIVE-28129
> URL: https://issues.apache.org/jira/browse/HIVE-28129
> Project: Hive
>  Issue Type: Task
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>
> Execute statement does not report the correct query type information.
> It inherits the sql statement type of the subsequent queries.





[jira] [Updated] (HIVE-28129) Execute statement doesnot report the correct query string information

2024-03-22 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-28129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-28129:

Summary: Execute statement doesnot report the correct query string 
information  (was: Prepare statement doesnot report the correct query type 
information)

> Execute statement doesnot report the correct query string information
> -
>
> Key: HIVE-28129
> URL: https://issues.apache.org/jira/browse/HIVE-28129
> Project: Hive
>  Issue Type: Task
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>
> Prepare statement does not report the correct query type information.
> It inherits the sql statement type of the subsequent queries.





[jira] [Updated] (HIVE-28129) Execute statement doesnot report the correct query string information

2024-03-22 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-28129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-28129:

Description: 
Execute statement does not report the correct query type information.

It inherits the sql statement type of the subsequent queries.

  was:
Prepare statement does not report the correct query type information.

It inherits the sql statement type of the subsequent queries.


> Execute statement doesnot report the correct query string information
> -
>
> Key: HIVE-28129
> URL: https://issues.apache.org/jira/browse/HIVE-28129
> Project: Hive
>  Issue Type: Task
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>
> Execute statement does not report the correct query type information.
> It inherits the sql statement type of the subsequent queries.





[jira] [Created] (HIVE-28129) Prepare statement doesnot report the correct query type information

2024-03-19 Thread Ramesh Kumar Thangarajan (Jira)
Ramesh Kumar Thangarajan created HIVE-28129:
---

 Summary: Prepare statement doesnot report the correct query type 
information
 Key: HIVE-28129
 URL: https://issues.apache.org/jira/browse/HIVE-28129
 Project: Hive
  Issue Type: Task
Reporter: Ramesh Kumar Thangarajan
Assignee: Ramesh Kumar Thangarajan


Prepare statement does not report the correct query type information.

It inherits the sql statement type of the subsequent queries.





[jira] [Created] (HIVE-28128) explain reoptimization doesnot report the correct querytype information

2024-03-19 Thread Ramesh Kumar Thangarajan (Jira)
Ramesh Kumar Thangarajan created HIVE-28128:
---

 Summary: explain reoptimization doesnot report the correct 
querytype information
 Key: HIVE-28128
 URL: https://issues.apache.org/jira/browse/HIVE-28128
 Project: Hive
  Issue Type: Task
Reporter: Ramesh Kumar Thangarajan
Assignee: Ramesh Kumar Thangarajan


explain reoptimization does not report the correct query type information and 
sometimes results in QUERY as the query type instead of explain





[jira] [Updated] (HIVE-28019) Fix query type information in proto files for load and explain queries

2024-02-20 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-28019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-28019:

Summary: Fix query type information in proto files for load and explain 
queries  (was: Fix query type information in proto files for load, export, 
import and explain queries)

> Fix query type information in proto files for load and explain queries
> --
>
> Key: HIVE-28019
> URL: https://issues.apache.org/jira/browse/HIVE-28019
> Project: Hive
>  Issue Type: Task
>  Components: HiveServer2
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
>
> Certain query types like LOAD, export, import and explain queries did not 
> produce the right Hive operation type





[jira] [Updated] (HIVE-28019) Fix query type information in proto files for load, export, import and explain queries

2024-01-23 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-28019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-28019:

Summary: Fix query type information in proto files for load, export, import 
and explain queries  (was: Wrong query type information in proto files for 
load, export, import and explain queries)

> Fix query type information in proto files for load, export, import and 
> explain queries
> --
>
> Key: HIVE-28019
> URL: https://issues.apache.org/jira/browse/HIVE-28019
> Project: Hive
>  Issue Type: Task
>  Components: HiveServer2
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>
> Certain query types like LOAD, export, import and explain queries did not 
> produce the right Hive operation type





[jira] [Created] (HIVE-28019) Wrong query type information in proto files for load, export, import and explain queries

2024-01-23 Thread Ramesh Kumar Thangarajan (Jira)
Ramesh Kumar Thangarajan created HIVE-28019:
---

 Summary: Wrong query type information in proto files for load, 
export, import and explain queries
 Key: HIVE-28019
 URL: https://issues.apache.org/jira/browse/HIVE-28019
 Project: Hive
  Issue Type: Task
  Components: HiveServer2
Reporter: Ramesh Kumar Thangarajan
Assignee: Ramesh Kumar Thangarajan


Certain query types like LOAD, export, import and explain queries did not 
produce the right Hive operation type





[jira] [Commented] (HIVE-27751) Log Query Compilation summary in an accumulated way

2024-01-22 Thread Ramesh Kumar Thangarajan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809764#comment-17809764
 ] 

Ramesh Kumar Thangarajan commented on HIVE-27751:
-

Hi [~zabetak] 

Thank you very much for reviewing this. I have updated the description with the 
sample output. 

Usually the debug logs are spread across multiple places, and we do not have an 
easy way to get the details from users when they run into performance issues. 
The main idea of this PR is to also print the information in the command line 
output. This is done only if the config is turned on. That is what I meant by 
accumulated: we get all the details related to Query Compilation in one single 
place, and it is visible to the user as part of the query output.
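
As a rough illustration of what "accumulated" means here, the sketch below collects per-step timings in one place, keeps them in the order the steps first ran, and renders them together. It is a minimal sketch under assumptions: CompileSummarySketch, record and render are hypothetical names and this is not the code from the patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of an accumulated, ordered timing summary; not Hive's implementation.
final class CompileSummarySketch {
    // LinkedHashMap keeps the order in which steps were first recorded.
    private final Map<String, Long> millisByStep = new LinkedHashMap<>();

    /** Adds the elapsed time to the step, summing repeated measurements of the same step. */
    void record(String step, long millis) {
        millisByStep.merge(step, millis, Long::sum);
    }

    /** Renders all recorded steps in one block, in first-recorded order. */
    String render() {
        StringBuilder out = new StringBuilder("Query Compilation Summary\n");
        for (Map.Entry<String, Long> entry : millisByStep.entrySet()) {
            out.append(String.format("%-40s %6d ms\n", entry.getKey(), entry.getValue()));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        CompileSummarySketch summary = new CompileSummarySketch();
        summary.record("waitCompile", 0);
        summary.record("parse", 4);
        summary.record("optimizer", 705);
        System.out.print(summary.render());
    }
}
```

In the real change the printing is gated by the config mentioned above; the sketch leaves that check out.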

I have also addressed your comments. Can you let me know what you think about 
the latest patch?

> Log Query Compilation summary in an accumulated way
> ---
>
> Key: HIVE-27751
> URL: https://issues.apache.org/jira/browse/HIVE-27751
> Project: Hive
>  Issue Type: Task
>  Components: Hive
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
>
> Query Compilation summary is very useful for reading and collecting all the 
> measures of compile time in a single place. It is also useful for debugging a 
> performance issue in the query compilation phase and for reporting and 
> comparing results across runs.
> To test this, set the config hive.compile.print.summary to true in any q file 
> and run the test to see the Query Compilation Summary in the logs. One example 
> of the output is below. The order of operations is maintained when printing 
> the summary:
> {code:java}
> Query Compilation Summary
> --
> waitCompile                                                        0 ms
> parse                                                              4 ms
> getTableConstraints - HS2-cache                                   69 ms
> optimizer - Calcite: Plan generation                             257 ms
> optimizer - Calcite: Prejoin ordering transformation              20 ms
> optimizer - Calcite: Postjoin ordering transformation             24 ms
> optimizer                                                        705 ms
> optimizer - HiveOpConverterPostProc                                0 ms
> optimizer - Generator                                             24 ms
> optimizer - PartitionColumnsSeparator                              1 ms
> optimizer - SyntheticJoinPredicate                                 2 ms
> optimizer - SimplePredicatePushDown                                8 ms
> optimizer - RedundantDynamicPruningConditionsRemoval               0 ms
> optimizer - SortedDynPartitionTimeGranularityOptimizer             2 ms
> optimizer - PartitionPruner                                        3 ms
> optimizer - PartitionConditionRemover                              2 ms
> optimizer - GroupByOptimizer                                       2 ms
> optimizer - ColumnPruner                                          10 ms
> optimizer - CountDistinctRewriteProc                               1 ms
> optimizer - SamplePruner                                           1 ms
> optimizer - MapJoinProcessor                                       2 ms
> optimizer - BucketingSortingReduceSinkOptimizer                    2 ms
> optimizer - UnionProcessor                                         2 ms
> optimizer - JoinReorder                                            0 ms
> optimizer - FixedBucketPruningOptimizer                            2 ms
> optimizer - 

[jira] [Updated] (HIVE-27751) Log Query Compilation summary in an accumulated way

2024-01-22 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-27751:

Description: 
Query Compilation summary is very useful for reading and collecting all the 
measures of compile time in a single place. It is also useful for debugging a 
performance issue in the query compilation phase and for reporting and 
comparing results across runs.

To test this, set the config hive.compile.print.summary to true in any q file 
and run the test to see the Query Compilation Summary in the logs. One example 
of the output is below. The order of operations is maintained when printing 
the summary:
{code:java}
Query Compilation Summary
--
waitCompile                                                        0 ms
parse                                                              4 ms
getTableConstraints - HS2-cache                                   69 ms
optimizer - Calcite: Plan generation                             257 ms
optimizer - Calcite: Prejoin ordering transformation              20 ms
optimizer - Calcite: Postjoin ordering transformation             24 ms
optimizer                                                        705 ms
optimizer - HiveOpConverterPostProc                                0 ms
optimizer - Generator                                             24 ms
optimizer - PartitionColumnsSeparator                              1 ms
optimizer - SyntheticJoinPredicate                                 2 ms
optimizer - SimplePredicatePushDown                                8 ms
optimizer - RedundantDynamicPruningConditionsRemoval               0 ms
optimizer - SortedDynPartitionTimeGranularityOptimizer             2 ms
optimizer - PartitionPruner                                        3 ms
optimizer - PartitionConditionRemover                              2 ms
optimizer - GroupByOptimizer                                       2 ms
optimizer - ColumnPruner                                          10 ms
optimizer - CountDistinctRewriteProc                               1 ms
optimizer - SamplePruner                                           1 ms
optimizer - MapJoinProcessor                                       2 ms
optimizer - BucketingSortingReduceSinkOptimizer                    2 ms
optimizer - UnionProcessor                                         2 ms
optimizer - JoinReorder                                            0 ms
optimizer - FixedBucketPruningOptimizer                            2 ms
optimizer - BucketVersionPopulator                                 2 ms
optimizer - NonBlockingOpDeDupProc                                 1 ms
optimizer - IdentityProjectRemover                                 0 ms
optimizer - LimitPushdownOptimizer                                 2 ms
optimizer - OrderlessLimitPushDownOptimizer                        1 ms
optimizer - StatsOptimizer                                         0 ms
optimizer - SimpleFetchOptimizer                                   0 ms
TezCompiler - Run top n key optimization                           2 ms
TezCompiler - Setup dynamic partition pruning                      3 ms
optimizer - Merge single column semi-join reducers to composite    0 ms
partition-retrieving                                               1 ms
TezCompiler - Setup stats in the operator plan  

[jira] [Updated] (HIVE-27751) Log Query Compilation summary in an accumulated way

2024-01-22 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-27751:

Description: 
Query Compilation summary is very useful for reading and collecting all the 
measures of compile time in a single place. It is also useful in debugging a 
performance issue in the query compilation phase and also to report and compare 
with various runs

 

After the 

  was:Query Compilation summary is very useful for reading and collecting all 
the measures of compile time in a single place. It is also useful in debugging 
a performance issue in the query compilation phase and also to report and 
compare with various runs


> Log Query Compilation summary in an accumulated way
> ---
>
> Key: HIVE-27751
> URL: https://issues.apache.org/jira/browse/HIVE-27751
> Project: Hive
>  Issue Type: Task
>  Components: Hive
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
>
> Query Compilation summary is very useful for reading and collecting all the 
> measures of compile time in a single place. It is also useful in debugging a 
> performance issue in the query compilation phase and also to report and 
> compare with various runs
>  
> After the 





[jira] [Work started] (HIVE-27843) Add QueryOperation to Hive proto logger for post execution hook information

2023-11-27 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-27843 started by Ramesh Kumar Thangarajan.
---
> Add QueryOperation to Hive proto logger for post execution hook information
> ---
>
> Key: HIVE-27843
> URL: https://issues.apache.org/jira/browse/HIVE-27843
> Project: Hive
>  Issue Type: Task
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
>
> Currently the query operation type is missing in the proto logger
> Add QueryOperation to Hive proto logger for post execution hook information





[jira] [Resolved] (HIVE-27843) Add QueryOperation to Hive proto logger for post execution hook information

2023-11-27 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan resolved HIVE-27843.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

> Add QueryOperation to Hive proto logger for post execution hook information
> ---
>
> Key: HIVE-27843
> URL: https://issues.apache.org/jira/browse/HIVE-27843
> Project: Hive
>  Issue Type: Task
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Currently the query operation type is missing in the proto logger
> Add QueryOperation to Hive proto logger for post execution hook information





[jira] [Resolved] (HIVE-27687) Logger variable should be static final as its creation takes more time in query compilation

2023-11-22 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan resolved HIVE-27687.
-
Resolution: Fixed

> Logger variable should be static final as its creation takes more time in 
> query compilation
> ---
>
> Key: HIVE-27687
> URL: https://issues.apache.org/jira/browse/HIVE-27687
> Project: Hive
>  Issue Type: Task
>  Components: Hive
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: Screenshot 2023-09-12 at 5.03.31 PM.png
>
>
> In query compilation, LoggerFactory.getLogger() seems to take up considerable 
> time. Some of the serde classes use a non-static Logger variable, which forces 
> a getLogger() call every time an instance of the class is created.
> Making the Logger variable static final avoids this code path on every serde 
> class construction.
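
The difference described above can be sketched as follows. This is an illustration only: java.util.logging.Logger stands in for the SLF4J logger so the example needs no dependencies, and LoggerFieldSketch, PerInstanceLogger and SharedLogger are hypothetical names rather than Hive serde classes.

```java
import java.util.logging.Logger;

// Hypothetical sketch; java.util.logging is used as a stand-in for SLF4J.
final class LoggerFieldSketch {
    /** Before: an instance field, so every construction performs a logger lookup. */
    static final class PerInstanceLogger {
        private final Logger log = Logger.getLogger(PerInstanceLogger.class.getName());

        Logger logger() {
            return log;
        }
    }

    /** After: a static final field, looked up once when the class is initialized. */
    static final class SharedLogger {
        private static final Logger LOG = Logger.getLogger(SharedLogger.class.getName());

        Logger logger() {
            return LOG;
        }
    }

    public static void main(String[] args) {
        // Constructing many SharedLogger instances never repeats the lookup.
        SharedLogger first = new SharedLogger();
        SharedLogger second = new SharedLogger();
        System.out.println(first.logger() == second.logger());
    }
}
```

With the static final field, the lookup cost is paid once per class instead of once per object, which matters for classes that are constructed many times during compilation.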





[jira] [Commented] (HIVE-27687) Logger variable should be static final as its creation takes more time in query compilation

2023-11-22 Thread Ramesh Kumar Thangarajan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17788845#comment-17788845
 ] 

Ramesh Kumar Thangarajan commented on HIVE-27687:
-

[~zabetak] Thanks, marked it.

> Logger variable should be static final as its creation takes more time in 
> query compilation
> ---
>
> Key: HIVE-27687
> URL: https://issues.apache.org/jira/browse/HIVE-27687
> Project: Hive
>  Issue Type: Task
>  Components: Hive
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: Screenshot 2023-09-12 at 5.03.31 PM.png
>
>
> During query compilation, LoggerFactory.getLogger() seems to take up a lot of 
> time. Some of the serde classes declare the Logger as a non-static field, 
> which forces a getLogger() call every time the class is instantiated.
> Making the Logger field static final avoids this code path on every serde 
> class construction.





[jira] [Closed] (HIVE-27687) Logger variable should be static final as its creation takes more time in query compilation

2023-11-22 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan closed HIVE-27687.
---

> Logger variable should be static final as its creation takes more time in 
> query compilation
> ---
>
> Key: HIVE-27687
> URL: https://issues.apache.org/jira/browse/HIVE-27687
> Project: Hive
>  Issue Type: Task
>  Components: Hive
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: Screenshot 2023-09-12 at 5.03.31 PM.png
>
>
> During query compilation, LoggerFactory.getLogger() seems to take up a lot of 
> time. Some of the serde classes declare the Logger as a non-static field, 
> which forces a getLogger() call every time the class is instantiated.
> Making the Logger field static final avoids this code path on every serde 
> class construction.





[jira] [Updated] (HIVE-27687) Logger variable should be static final as its creation takes more time in query compilation

2023-11-22 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-27687:

Fix Version/s: 4.0.0

> Logger variable should be static final as its creation takes more time in 
> query compilation
> ---
>
> Key: HIVE-27687
> URL: https://issues.apache.org/jira/browse/HIVE-27687
> Project: Hive
>  Issue Type: Task
>  Components: Hive
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: Screenshot 2023-09-12 at 5.03.31 PM.png
>
>
> During query compilation, LoggerFactory.getLogger() seems to take up a lot of 
> time. Some of the serde classes declare the Logger as a non-static field, 
> which forces a getLogger() call every time the class is instantiated.
> Making the Logger field static final avoids this code path on every serde 
> class construction.





[jira] [Reopened] (HIVE-27687) Logger variable should be static final as its creation takes more time in query compilation

2023-11-22 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan reopened HIVE-27687:
-

> Logger variable should be static final as its creation takes more time in 
> query compilation
> ---
>
> Key: HIVE-27687
> URL: https://issues.apache.org/jira/browse/HIVE-27687
> Project: Hive
>  Issue Type: Task
>  Components: Hive
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
> Attachments: Screenshot 2023-09-12 at 5.03.31 PM.png
>
>
> During query compilation, LoggerFactory.getLogger() seems to take up a lot of 
> time. Some of the serde classes declare the Logger as a non-static field, 
> which forces a getLogger() call every time the class is instantiated.
> Making the Logger field static final avoids this code path on every serde 
> class construction.





[jira] [Commented] (HIVE-27876) Incorrect query results on tables with ClusterBy & SortBy

2023-11-21 Thread Ramesh Kumar Thangarajan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17788476#comment-17788476
 ] 

Ramesh Kumar Thangarajan commented on HIVE-27876:
-

[~kkasa] I was looking into fixing this, but I have two questions:
1. Should we expect the data to be bucketed and sorted globally after the 
inserts? If all 4 rows are inserted into the table in a single statement like 
the one below, then I guess the optimization works fine.
insert into test_bucket values (1, 'user1', 'dept1'), ( 2, 'user2' , 'dept2'), 
(1, 'user1', 'dept1'), ( 2, 'user2' , 'dept2');
2. A map-side group by is still useful even if the data is only sorted locally 
within a bucket; it is a problem only when we remove the ReduceSinkOperator. In 
that case, can we simply skip removing the ReduceSinkOperator as part of the 
optimization? Would we still get any real improvement from this optimization 
(even after skipping the removal of the ReduceSinkOperator)?
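
To make the second question concrete, here is a small, self-contained Java model of the repro. It is only a sketch, not Hive operator code: PartialAggregationDemo and its methods are made-up names, and it assumes that each of the two INSERT statements leaves its own file, so every (age, name) key occurs once per file. Under that assumption, filtering the per-file partial counts without a merge step gives the empty result from the repro, while merging the partial counts first gives the expected rows.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative model only; none of these names exist in Hive.
final class PartialAggregationDemo {
    // Map-side step: count rows per key within a single file.
    static Map<String, Long> partialCounts(List<String> fileRows) {
        Map<String, Long> counts = new TreeMap<>();
        for (String key : fileRows) {
            counts.merge(key, 1L, Long::sum);
        }
        return counts;
    }

    // HAVING count(*) > 1 applied to one set of counts.
    static List<String> having(Map<String, Long> counts) {
        List<String> out = new ArrayList<>();
        counts.forEach((key, count) -> {
            if (count > 1) {
                out.add(key + "=" + count);
            }
        });
        return out;
    }

    // No shuffle: the filter sees each file's partial counts in isolation.
    static List<String> withoutReduce(List<List<String>> files) {
        List<String> out = new ArrayList<>();
        for (List<String> file : files) {
            out.addAll(having(partialCounts(file)));
        }
        return out;
    }

    // With a shuffle: partial counts are merged across files before filtering.
    static List<String> withReduce(List<List<String>> files) {
        Map<String, Long> merged = new TreeMap<>();
        for (List<String> file : files) {
            partialCounts(file).forEach((key, count) -> merged.merge(key, count, Long::sum));
        }
        return having(merged);
    }

    public static void main(String[] args) {
        // One file per INSERT statement (assumption), same two keys in each.
        List<List<String>> files = Arrays.asList(
            Arrays.asList("1,user1", "2,user2"),
            Arrays.asList("1,user1", "2,user2"));
        System.out.println("without reduce: " + withoutReduce(files));
        System.out.println("with reduce:    " + withReduce(files));
    }
}
```

If this model matches what happens in the plan, keeping the ReduceSinkOperator would restore correctness while the map-side aggregation still shrinks the data sent to the reducers.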

 

> Incorrect query results on tables with ClusterBy & SortBy
> -
>
> Key: HIVE-27876
> URL: https://issues.apache.org/jira/browse/HIVE-27876
> Project: Hive
>  Issue Type: Bug
>Reporter: Naresh P R
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>
> Repro:
>  
> {code:java}
> create external table test_bucket(age int, name string, dept string) 
> clustered by (age, name) sorted by (age asc, name asc) into 2 buckets stored 
> as orc;
> insert into test_bucket values (1, 'user1', 'dept1'), ( 2, 'user2' , 'dept2');
> insert into test_bucket values (1, 'user1', 'dept1'), ( 2, 'user2' , 'dept2');
> //empty wrong results
> select age, name, count(*) from test_bucket group by  age, name having 
> count(*) > 1; 
> +--+---+--+
> | age  | name  | _c2  |
> +--+---+--+
> +--+---+--+
> // Workaround
> set hive.map.aggr=false;
> select age, name, count(*) from test_bucket group by  age, name having 
> count(*) > 1; 
> +--++--+
> | age  |  name  | _c2  |
> +--++--+
> | 1    | user1  | 2    |
> | 2    | user2  | 2    |
> +--++--+ {code}
>  
>  





[jira] [Assigned] (HIVE-27876) Incorrect query results on tables with ClusterBy & SortBy

2023-11-21 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan reassigned HIVE-27876:
---

Assignee: Ramesh Kumar Thangarajan

> Incorrect query results on tables with ClusterBy & SortBy
> -
>
> Key: HIVE-27876
> URL: https://issues.apache.org/jira/browse/HIVE-27876
> Project: Hive
>  Issue Type: Bug
>Reporter: Naresh P R
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>
> Repro:
>  
> {code:java}
> create external table test_bucket(age int, name string, dept string) 
> clustered by (age, name) sorted by (age asc, name asc) into 2 buckets stored 
> as orc;
> insert into test_bucket values (1, 'user1', 'dept1'), ( 2, 'user2' , 'dept2');
> insert into test_bucket values (1, 'user1', 'dept1'), ( 2, 'user2' , 'dept2');
> //empty wrong results
> select age, name, count(*) from test_bucket group by  age, name having 
> count(*) > 1; 
> +--+---+--+
> | age  | name  | _c2  |
> +--+---+--+
> +--+---+--+
> // Workaround
> set hive.map.aggr=false;
> select age, name, count(*) from test_bucket group by  age, name having 
> count(*) > 1; 
> +--++--+
> | age  |  name  | _c2  |
> +--++--+
> | 1    | user1  | 2    |
> | 2    | user2  | 2    |
> +--++--+ {code}
>  
>  





[jira] [Resolved] (HIVE-27687) Logger variable should be static final as its creation takes more time in query compilation

2023-11-20 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan resolved HIVE-27687.
-
Resolution: Fixed

> Logger variable should be static final as its creation takes more time in 
> query compilation
> ---
>
> Key: HIVE-27687
> URL: https://issues.apache.org/jira/browse/HIVE-27687
> Project: Hive
>  Issue Type: Task
>  Components: Hive
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
> Attachments: Screenshot 2023-09-12 at 5.03.31 PM.png
>
>
> During query compilation, LoggerFactory.getLogger() seems to take up a lot of 
> time. Some of the serde classes declare the Logger as a non-static field, 
> which forces a getLogger() call every time the class is instantiated.
> Making the Logger field static final avoids this code path on every serde 
> class construction.





[jira] [Created] (HIVE-27843) Add QueryOperation to Hive proto logger for post execution hook information

2023-11-01 Thread Ramesh Kumar Thangarajan (Jira)
Ramesh Kumar Thangarajan created HIVE-27843:
---

 Summary: Add QueryOperation to Hive proto logger for post 
execution hook information
 Key: HIVE-27843
 URL: https://issues.apache.org/jira/browse/HIVE-27843
 Project: Hive
  Issue Type: Task
Reporter: Ramesh Kumar Thangarajan
Assignee: Ramesh Kumar Thangarajan


Currently the query operation type is missing in the proto logger

Add QueryOperation to Hive proto logger for post execution hook information





[jira] [Updated] (HIVE-27773) get_valid_write_ids is being called multiple times for a single query

2023-10-05 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-27773:

Description: 
The logs below suggest that the result of get_valid_write_ids is not cached 
within a single query for a single table. It is called multiple times across 
different phases of query compilation. We should verify whether we can safely 
cache and reuse the results. That way we could avoid around 40-50 ms of the 
678 ms compilation time.

 
{code:java}
2023-09-19T02:55:06,940 INFO [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 
0.0.0.0/50501] ql.Driver: Compiling 
command(queryId=rameshkumar_20230919025506_b005cc57-1717-4798-b8da-b502aa7ca3d6):
2023-09-19T02:55:06,967 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 
0.0.0.0/50501] metrics.PerfLogger: 
2023-09-19T02:55:06,979 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 
0.0.0.0/50501] metrics.PerfLogger: 
2023-09-19T02:55:06,980 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 
0.0.0.0/50501] metrics.PerfLogger: 
2023-09-19T02:55:06,986 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 
0.0.0.0/50501] metrics.PerfLogger: 
2023-09-19T02:55:06,988 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 
0.0.0.0/50501] metrics.PerfLogger: 
2023-09-19T02:55:06,995 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 
0.0.0.0/50501] metrics.PerfLogger: 
2023-09-19T02:55:06,997 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 
0.0.0.0/50501] metrics.PerfLogger: 
2023-09-19T02:55:07,007 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 
0.0.0.0/50501] metrics.PerfLogger: 
2023-09-19T02:55:07,009 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 
0.0.0.0/50501] metrics.PerfLogger: 
2023-09-19T02:55:07,017 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 
0.0.0.0/50501] metrics.PerfLogger: 
2023-09-19T02:55:07,018 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 
0.0.0.0/50501] metrics.PerfLogger: 
2023-09-19T02:55:07,026 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 
0.0.0.0/50501] metrics.PerfLogger: 
2023-09-19T02:55:07,059 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 
0.0.0.0/50501] metrics.PerfLogger: 
2023-09-19T02:55:07,068 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 
0.0.0.0/50501] metrics.PerfLogger: 
2023-09-19T02:55:07,618 INFO [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 
0.0.0.0/50501] ql.Driver: Completed compiling 
command(queryId=rameshkumar_20230919025506_b005cc57-1717-4798-b8da-b502aa7ca3d6);
 Time taken: 0.678 seconds{code}

  was:
Looking at the below logs suggest that the get_valid_write_ids is not cached 
for a single query for a single table. It is being called multiple times across 
different phases in the compilation of the query. We should verify if we can 
safely cache and re use the results. That way we can avoid at the 40-50 ms out 
of 678ms compilation time.

 
{code:java}
2023-09-19T02:55:06,940 INFO [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 
0.0.0.0/50501] ql.Driver: Compiling 
command(queryId=rameshkumar_20230919025506_b005cc57-1717-4798-b8da-b502aa7ca3d6):
2023-09-19T02:55:06,967 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 
0.0.0.0/50501] metrics.PerfLogger: 
2023-09-19T02:55:06,979 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 
0.0.0.0/50501] metrics.PerfLogger: 
2023-09-19T02:55:06,980 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 
0.0.0.0/50501] metrics.PerfLogger: 
2023-09-19T02:55:06,986 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 
0.0.0.0/50501] metrics.PerfLogger: 
2023-09-19T02:55:06,988 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 
0.0.0.0/50501] metrics.PerfLogger: 
2023-09-19T02:55:06,995 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 
0.0.0.0/50501] metrics.PerfLogger: 
2023-09-19T02:55:06,997 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 
0.0.0.0/50501] metrics.PerfLogger: 
2023-09-19T02:55:07,007 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 
0.0.0.0/50501] metrics.PerfLogger: 
2023-09-19T02:55:07,009 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 
0.0.0.0/50501] metrics.PerfLogger: 
2023-09-19T02:55:07,017 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 
0.0.0.0/50501] metrics.PerfLogger: 
2023-09-19T02:55:07,018 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 
0.0.0.0/50501] metrics.PerfLogger: 
2023-09-19T02:55:07,026 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 
0.0.0.0/50501] metrics.PerfLogger: 
2023-09-19T02:55:07,059 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 
0.0.0.0/50501] metrics.PerfLogger: 
2023-09-19T02:55:07,068 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 
0.0.0.0/50501] metrics.PerfLogger: 
2023-09-19T02:55:07,618 INFO [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 
0.0.0.0/50501] ql.Driver: Completed 

[jira] [Work started] (HIVE-27773) get_valid_write_ids is being called multiple times for a single query

2023-10-05 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-27773 started by Ramesh Kumar Thangarajan.
---
> get_valid_write_ids is being called multiple times for a single query
> -
>
> Key: HIVE-27773
> URL: https://issues.apache.org/jira/browse/HIVE-27773
> Project: Hive
>  Issue Type: Task
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>
> The logs below suggest that the result of get_valid_write_ids is not cached 
> within a single query for a single table. It is called multiple times across 
> different phases of query compilation. We should verify whether we can safely 
> cache and reuse the results. That way we could avoid around 40-50 ms of the 
> 678 ms compilation time.
>  
> {code:java}
> 2023-09-19T02:55:06,940 INFO [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener 
> at 0.0.0.0/50501] ql.Driver: Compiling 
> command(queryId=rameshkumar_20230919025506_b005cc57-1717-4798-b8da-b502aa7ca3d6):
> 2023-09-19T02:55:06,967 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener 
> at 0.0.0.0/50501] metrics.PerfLogger: <PERFLOG method=get_valid_write_ids 
> from=org.apache.hadoop.hive.metastore.RetryingHMSHandler>
> 2023-09-19T02:55:06,979 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener 
> at 0.0.0.0/50501] metrics.PerfLogger: </PERFLOG method=get_valid_write_ids 
> start=1695117306967 end=1695117306979 duration=12 
> from=org.apache.hadoop.hive.metastore.RetryingHMSHandler retryCount=0 
> error=false>
> 2023-09-19T02:55:06,980 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener 
> at 0.0.0.0/50501] metrics.PerfLogger: <PERFLOG method=get_valid_write_ids 
> from=org.apache.hadoop.hive.metastore.RetryingHMSHandler>
> 2023-09-19T02:55:06,986 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener 
> at 0.0.0.0/50501] metrics.PerfLogger: </PERFLOG method=get_valid_write_ids 
> start=1695117306980 end=1695117306986 duration=6 
> from=org.apache.hadoop.hive.metastore.RetryingHMSHandler retryCount=0 
> error=false>
> 2023-09-19T02:55:06,988 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener 
> at 0.0.0.0/50501] metrics.PerfLogger: <PERFLOG method=get_valid_write_ids 
> from=org.apache.hadoop.hive.metastore.RetryingHMSHandler>
> 2023-09-19T02:55:06,995 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener 
> at 0.0.0.0/50501] metrics.PerfLogger: </PERFLOG method=get_valid_write_ids 
> start=1695117306988 end=1695117306995 duration=7 
> from=org.apache.hadoop.hive.metastore.RetryingHMSHandler retryCount=0 
> error=false>
> 2023-09-19T02:55:06,997 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener 
> at 0.0.0.0/50501] metrics.PerfLogger: <PERFLOG method=get_valid_write_ids 
> from=org.apache.hadoop.hive.metastore.RetryingHMSHandler>
> 2023-09-19T02:55:07,007 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener 
> at 0.0.0.0/50501] metrics.PerfLogger: </PERFLOG method=get_valid_write_ids 
> start=1695117306997 end=1695117307007 duration=10 
> from=org.apache.hadoop.hive.metastore.RetryingHMSHandler retryCount=0 
> error=false>
> 2023-09-19T02:55:07,009 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener 
> at 0.0.0.0/50501] metrics.PerfLogger: <PERFLOG method=get_valid_write_ids 
> from=org.apache.hadoop.hive.metastore.RetryingHMSHandler>
> 2023-09-19T02:55:07,017 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener 
> at 0.0.0.0/50501] metrics.PerfLogger: </PERFLOG method=get_valid_write_ids 
> start=1695117307009 end=1695117307017 duration=8 
> from=org.apache.hadoop.hive.metastore.RetryingHMSHandler retryCount=0 
> error=false>
> 2023-09-19T02:55:07,018 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener 
> at 0.0.0.0/50501] metrics.PerfLogger: <PERFLOG method=get_valid_write_ids 
> from=org.apache.hadoop.hive.metastore.RetryingHMSHandler>
> 2023-09-19T02:55:07,026 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener 
> at 0.0.0.0/50501] metrics.PerfLogger: </PERFLOG method=get_valid_write_ids 
> start=1695117307018 end=1695117307026 duration=8 
> from=org.apache.hadoop.hive.metastore.RetryingHMSHandler retryCount=0 
> error=false>
> 2023-09-19T02:55:07,059 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener 
> at 0.0.0.0/50501] metrics.PerfLogger: <PERFLOG method=get_valid_write_ids 
> from=org.apache.hadoop.hive.metastore.RetryingHMSHandler>
> 2023-09-19T02:55:07,068 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener 
> at 0.0.0.0/50501] metrics.PerfLogger: </PERFLOG method=get_valid_write_ids 
> start=1695117307059 end=1695117307068 duration=9 
> from=org.apache.hadoop.hive.metastore.RetryingHMSHandler retryCount=0 
> error=false>
> 2023-09-19T02:55:07,618 INFO [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener 
> at 0.0.0.0/50501] ql.Driver: Completed compiling 
> command(queryId=rameshkumar_20230919025506_b005cc57-1717-4798-b8da-b502aa7ca3d6);
>  Time taken: 0.678 seconds{code}
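
As a rough illustration of the caching idea in the description, the sketch below memoizes the lookup per table for the lifetime of one query. PerQueryWriteIdCache, WriteIdCacheDemo and their methods are hypothetical names, not Hive APIs, and the write-id list is modeled as a plain String. The durations are the seven values visible in the log above, under the assumption that all seven calls are for the same table.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical per-query cache; one instance would live for a single compilation.
final class PerQueryWriteIdCache {
    private final Map<String, String> byTable = new HashMap<>();
    private final Function<String, String> metastoreCall;
    private int remoteCalls = 0;

    PerQueryWriteIdCache(Function<String, String> metastoreCall) {
        this.metastoreCall = metastoreCall;
    }

    // Returns the write-id list for the table, calling the metastore at most once.
    String getValidWriteIds(String fullTableName) {
        return byTable.computeIfAbsent(fullTableName, name -> {
            remoteCalls++;
            return metastoreCall.apply(name);
        });
    }

    int remoteCalls() {
        return remoteCalls;
    }
}

final class WriteIdCacheDemo {
    // Time saved if only the first call in the list goes to the metastore.
    static int savedMillis(int[] durationsMs) {
        int saved = 0;
        for (int i = 1; i < durationsMs.length; i++) {
            saved += durationsMs[i];
        }
        return saved;
    }

    public static void main(String[] args) {
        // The lambda stands in for the real metastore call (assumption).
        PerQueryWriteIdCache cache =
            new PerQueryWriteIdCache(table -> "writeIds(" + table + ")");
        for (int phase = 0; phase < 7; phase++) {
            cache.getValidWriteIds("default.t1");
        }
        int[] loggedDurationsMs = {12, 6, 7, 10, 8, 8, 9};
        System.out.println("remote calls: " + cache.remoteCalls());
        System.out.println("ms saved: " + savedMillis(loggedDurationsMs));
    }
}
```

With the seven logged durations, skipping all but the first call would save 6 + 7 + 10 + 8 + 8 + 9 = 48 ms, which is consistent with the 40-50 ms estimate. Whether reuse is actually safe is the open question the issue asks to verify; this sketch does not address it.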





[jira] [Created] (HIVE-27773) get_valid_write_ids is being called multiple times for a single query

2023-10-05 Thread Ramesh Kumar Thangarajan (Jira)
Ramesh Kumar Thangarajan created HIVE-27773:
---

 Summary: get_valid_write_ids is being called multiple times for a 
single query
 Key: HIVE-27773
 URL: https://issues.apache.org/jira/browse/HIVE-27773
 Project: Hive
  Issue Type: Task
Reporter: Ramesh Kumar Thangarajan
Assignee: Ramesh Kumar Thangarajan


The logs below suggest that the result of get_valid_write_ids is not cached 
within a single query for a single table. It is called multiple times across 
different phases of query compilation. We should verify whether we can safely 
cache and reuse the results. That way we could avoid around 40-50 ms of the 
678 ms compilation time.

 
{code:java}
2023-09-19T02:55:06,940 INFO [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 
0.0.0.0/50501] ql.Driver: Compiling 
command(queryId=rameshkumar_20230919025506_b005cc57-1717-4798-b8da-b502aa7ca3d6):
2023-09-19T02:55:06,967 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 
0.0.0.0/50501] metrics.PerfLogger: 
2023-09-19T02:55:06,979 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 
0.0.0.0/50501] metrics.PerfLogger: 
2023-09-19T02:55:06,980 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 
0.0.0.0/50501] metrics.PerfLogger: 
2023-09-19T02:55:06,986 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 
0.0.0.0/50501] metrics.PerfLogger: 
2023-09-19T02:55:06,988 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 
0.0.0.0/50501] metrics.PerfLogger: 
2023-09-19T02:55:06,995 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 
0.0.0.0/50501] metrics.PerfLogger: 
2023-09-19T02:55:06,997 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 
0.0.0.0/50501] metrics.PerfLogger: 
2023-09-19T02:55:07,007 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 
0.0.0.0/50501] metrics.PerfLogger: 
2023-09-19T02:55:07,009 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 
0.0.0.0/50501] metrics.PerfLogger: 
2023-09-19T02:55:07,017 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 
0.0.0.0/50501] metrics.PerfLogger: 
2023-09-19T02:55:07,018 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 
0.0.0.0/50501] metrics.PerfLogger: 
2023-09-19T02:55:07,026 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 
0.0.0.0/50501] metrics.PerfLogger: 
2023-09-19T02:55:07,059 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 
0.0.0.0/50501] metrics.PerfLogger: 
2023-09-19T02:55:07,068 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 
0.0.0.0/50501] metrics.PerfLogger: 
2023-09-19T02:55:07,618 INFO [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 
0.0.0.0/50501] ql.Driver: Completed compiling 
command(queryId=rameshkumar_20230919025506_b005cc57-1717-4798-b8da-b502aa7ca3d6);
 Time taken: 0.678 seconds{code}





[jira] [Created] (HIVE-27751) Log Query Compilation summary in an accumulated way

2023-09-28 Thread Ramesh Kumar Thangarajan (Jira)
Ramesh Kumar Thangarajan created HIVE-27751:
---

 Summary: Log Query Compilation summary in an accumulated way
 Key: HIVE-27751
 URL: https://issues.apache.org/jira/browse/HIVE-27751
 Project: Hive
  Issue Type: Task
  Components: Hive
Reporter: Ramesh Kumar Thangarajan
Assignee: Ramesh Kumar Thangarajan


A query compilation summary is useful for collecting all compile-time 
measurements in a single place. It also helps when debugging a performance 
issue in the query compilation phase and when reporting and comparing results 
across runs.





[jira] [Updated] (HIVE-27687) Logger variable should be static final as its creation takes more time in query compilation

2023-09-12 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-27687:

Attachment: Screenshot 2023-09-12 at 5.03.31 PM.png

> Logger variable should be static final as its creation takes more time in 
> query compilation
> ---
>
> Key: HIVE-27687
> URL: https://issues.apache.org/jira/browse/HIVE-27687
> Project: Hive
>  Issue Type: Task
>  Components: Hive
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: Screenshot 2023-09-12 at 5.03.31 PM.png
>
>
> During query compilation, LoggerFactory.getLogger() seems to take up a lot of 
> time. Some of the serde classes declare the Logger as a non-static field, 
> which forces a getLogger() call every time the class is instantiated.
> Making the Logger field static final avoids this code path on every serde 
> class construction.





[jira] [Created] (HIVE-27687) Logger variable should be static final as its creation takes more time in query compilation

2023-09-12 Thread Ramesh Kumar Thangarajan (Jira)
Ramesh Kumar Thangarajan created HIVE-27687:
---

 Summary: Logger variable should be static final as its creation 
takes more time in query compilation
 Key: HIVE-27687
 URL: https://issues.apache.org/jira/browse/HIVE-27687
 Project: Hive
  Issue Type: Task
  Components: Hive
Reporter: Ramesh Kumar Thangarajan
Assignee: Ramesh Kumar Thangarajan


During query compilation, LoggerFactory.getLogger() seems to take up a lot of 
time. Some of the serde classes declare the Logger as a non-static field, which 
forces a getLogger() call every time the class is instantiated.

Making the Logger field static final avoids this code path on every serde 
class construction.
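
The difference between the two declarations can be sketched as follows. This is a minimal illustration, not the actual Hive patch: CountingLoggerFactory is a hypothetical stand-in for LoggerFactory built on java.util.logging so the example needs no extra dependencies, and the two serde classes are made-up names. The counter only makes the number of lookups visible.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;
import java.util.logging.Logger;

// Hypothetical stand-in for LoggerFactory: counts how often a lookup happens.
final class CountingLoggerFactory {
    static final AtomicInteger LOOKUPS = new AtomicInteger();

    static Logger getLogger(Class<?> clazz) {
        LOOKUPS.incrementAndGet();
        return Logger.getLogger(clazz.getName());
    }
}

// Before: an instance field, so every construction performs a lookup.
final class InstanceLoggerSerDe {
    private final Logger log = CountingLoggerFactory.getLogger(InstanceLoggerSerDe.class);
}

// After: a static final field, resolved once when the class is initialized.
final class StaticLoggerSerDe {
    private static final Logger LOG = CountingLoggerFactory.getLogger(StaticLoggerSerDe.class);
}

final class LoggerFieldDemo {
    // Constructs `instances` objects and returns how many logger lookups that triggered.
    static int lookupsFor(Supplier<Object> constructor, int instances) {
        int before = CountingLoggerFactory.LOOKUPS.get();
        for (int i = 0; i < instances; i++) {
            constructor.get();
        }
        return CountingLoggerFactory.LOOKUPS.get() - before;
    }

    public static void main(String[] args) {
        // One lookup per object for the instance field; at most one in total for the static field.
        System.out.println("instance field: " + lookupsFor(InstanceLoggerSerDe::new, 1000));
        System.out.println("static final:   " + lookupsFor(StaticLoggerSerDe::new, 1000));
    }
}
```

The static variant pays for the lookup once per class rather than once per object, which is the saving the description is after when serde classes are constructed repeatedly during compilation.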





[jira] [Assigned] (HIVE-23127) Replace listPartitionsByExpr with GetPartitionsWithSpecs in Partition pruner

2023-07-25 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan reassigned HIVE-23127:
---

Assignee: Ramesh Kumar Thangarajan  (was: Vineet Garg)

> Replace listPartitionsByExpr with GetPartitionsWithSpecs in Partition pruner
> 
>
> Key: HIVE-23127
> URL: https://issues.apache.org/jira/browse/HIVE-23127
> Project: Hive
>  Issue Type: Task
>  Components: HiveServer2
>Reporter: Vineet Garg
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-23127.1.patch, HIVE-23127.2.patch
>
>
> GetPartitionsWithSpecs reduces data transfer by deduplicating storage 
> descriptors.





[jira] [Resolved] (HIVE-27293) Vectorization: Incorrect results with nvl for ORC table

2023-06-09 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan resolved HIVE-27293.
-
Resolution: Fixed

> Vectorization: Incorrect results with nvl for ORC table
> ---
>
> Key: HIVE-27293
> URL: https://issues.apache.org/jira/browse/HIVE-27293
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0-alpha-2
>Reporter: Riju Trivedi
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
> Attachments: esource.txt, vectorization_nvl.q
>
>
> Attached repro.q file and data file used to reproduce the issue.
> {code:java}
> Insert overwrite table etarget
> select mt.*, floor(rand() * 1) as bdata_no from (select nvl(np.client_id,' 
> '),nvl(np.id_enddate,cast(0 as decimal(10,0))),nvl(np.client_gender,' 
> '),nvl(np.birthday,cast(0 as decimal(10,0))),nvl(np.nationality,' 
> '),nvl(np.address_zipcode,' '),nvl(np.income,cast(0 as 
> decimal(15,2))),nvl(np.address,' '),nvl(np.part_date,cast(0 as int)) from 
> (select * from esource where part_date = 20230414) np) mt;
>  {code}
> Outcome:
> {code:java}
> select client_id,birthday,income from etarget; 
> 15678   0  0.00
> 67891  19313  -1.00
> 12345  0  0.00{code}
> Expected Result :
> {code:java}
> select client_id,birthday,income from etarget; 
> 12345 19613 -1.00
> 67891 19313 -1.00 
> 15678 0  0.00{code}
> Disabling hive.vectorized.use.vectorized.input.format produces correct output.





[jira] [Assigned] (HIVE-27293) Vectorization: Incorrect results with nvl for ORC table

2023-05-30 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan reassigned HIVE-27293:
---

Assignee: Ramesh Kumar Thangarajan

> Vectorization: Incorrect results with nvl for ORC table
> ---
>
> Key: HIVE-27293
> URL: https://issues.apache.org/jira/browse/HIVE-27293
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0-alpha-2
>Reporter: Riju Trivedi
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: esource.txt, vectorization_nvl.q
>
>
> Attached repro.q file and data file used to reproduce the issue.
> {code:java}
> Insert overwrite table etarget
> select mt.*, floor(rand() * 1) as bdata_no from (select nvl(np.client_id,' 
> '),nvl(np.id_enddate,cast(0 as decimal(10,0))),nvl(np.client_gender,' 
> '),nvl(np.birthday,cast(0 as decimal(10,0))),nvl(np.nationality,' 
> '),nvl(np.address_zipcode,' '),nvl(np.income,cast(0 as 
> decimal(15,2))),nvl(np.address,' '),nvl(np.part_date,cast(0 as int)) from 
> (select * from esource where part_date = 20230414) np) mt;
>  {code}
> Outcome:
> {code:java}
> select client_id,birthday,income from etarget; 
> 15678   0  0.00
> 67891  19313  -1.00
> 12345  0  0.00{code}
> Expected Result :
> {code:java}
> select client_id,birthday,income from etarget; 
> 12345 19613 -1.00
> 67891 19313 -1.00 
> 15678 0  0.00{code}
> Disabling hive.vectorized.use.vectorized.input.format produces correct output.





[jira] [Assigned] (HIVE-27244) Iceberg: Implement LOAD data for unpartitioned table via Append API

2023-04-11 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan reassigned HIVE-27244:
---

Assignee: Ramesh Kumar Thangarajan

> Iceberg: Implement LOAD data for unpartitioned table via Append API
> ---
>
> Key: HIVE-27244
> URL: https://issues.apache.org/jira/browse/HIVE-27244
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ayush Saxena
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>
> Use the Append API for the Iceberg LOAD DATA command, same as in the migration use case.





[jira] [Resolved] (HIVE-26989) Fix predicate pushdown for Timestamp with TZ

2023-01-26 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan resolved HIVE-26989.
-
Resolution: Fixed

> Fix predicate pushdown for Timestamp with TZ
> 
>
> Key: HIVE-26989
> URL: https://issues.apache.org/jira/browse/HIVE-26989
> Project: Hive
>  Issue Type: Task
>  Components: Hive, Iceberg integration
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Running a query that filters on {{TIMESTAMP WITH LOCAL TIME ZONE}} 
> returns the correct results, but the predicate is not pushed down to Iceberg.





[jira] [Assigned] (HIVE-26989) Fix predicate pushdown for Timestamp with TZ

2023-01-25 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan reassigned HIVE-26989:
---


> Fix predicate pushdown for Timestamp with TZ
> 
>
> Key: HIVE-26989
> URL: https://issues.apache.org/jira/browse/HIVE-26989
> Project: Hive
>  Issue Type: Task
>  Components: Hive, Iceberg integration
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>
> Running a query which is filtering for {{TIMESTAMP WITH LOCAL TIME ZONE}} 
> returns the correct results but the predicate is not pushed to Iceberg.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HIVE-26837) CTLT with hive.create.as.external.legacy as true creates managed table instead of external table

2023-01-05 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan resolved HIVE-26837.
-
Resolution: Fixed

> CTLT with hive.create.as.external.legacy as true creates managed table 
> instead of external table
> 
>
> Key: HIVE-26837
> URL: https://issues.apache.org/jira/browse/HIVE-26837
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> When CTLT is used with the config hive.create.as.external.legacy=true, it 
> still creates managed table by default. Use below to reproduce.
> create external table test_ext(empno int, name string) partitioned by(dept 
> string) stored as orc;
> desc formatted test_ext;
> set hive.create.as.external.legacy=true;
> create table test_external like test_ext;
> desc formatted test_external;



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-26837) CTLT with hive.create.as.external.legacy as true creates managed table instead of external table

2022-12-13 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-26837:

Component/s: HiveServer2

> CTLT with hive.create.as.external.legacy as true creates managed table 
> instead of external table
> 
>
> Key: HIVE-26837
> URL: https://issues.apache.org/jira/browse/HIVE-26837
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>
> When CTLT is used with the config hive.create.as.external.legacy=true, it 
> still creates managed table by default. Use below to reproduce.
> create external table test_ext(empno int, name string) partitioned by(dept 
> string) stored as orc;
> desc formatted test_ext;
> set hive.create.as.external.legacy=true;
> create table test_external like test_ext;
> desc formatted test_external;



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-26837) CTLT with hive.create.as.external.legacy as true creates managed table instead of external table

2022-12-12 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-26837:

Description: 
When CTLT is used with the config hive.create.as.external.legacy=true, it still 
creates managed table by default. Use below to reproduce.

create external table test_ext(empno int, name string) partitioned by(dept 
string) stored as orc;
desc formatted test_ext;

set hive.create.as.external.legacy=true;

create table test_external like test_ext;
desc formatted test_external;

  was:
When CTLT is used with the config hive.create.as.external.legacy=true, it still 
creates managed table by default. Use below to reproduce.

create table test_mm(empno int, name string) partitioned by(dept string) stored 
as orc tblproperties('transactional'='true', 
'transactional_properties'='default');
desc formatted test_mm;

set hive.create.as.external.legacy=true;

create table test_external like test_mm;
desc formatted test_external;


> CTLT with hive.create.as.external.legacy as true creates managed table 
> instead of external table
> 
>
> Key: HIVE-26837
> URL: https://issues.apache.org/jira/browse/HIVE-26837
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>
> When CTLT is used with the config hive.create.as.external.legacy=true, it 
> still creates managed table by default. Use below to reproduce.
> create external table test_ext(empno int, name string) partitioned by(dept 
> string) stored as orc;
> desc formatted test_ext;
> set hive.create.as.external.legacy=true;
> create table test_external like test_ext;
> desc formatted test_external;



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HIVE-26837) CTLT with hive.create.as.external.legacy as true creates managed table instead of external table

2022-12-12 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan reassigned HIVE-26837:
---


> CTLT with hive.create.as.external.legacy as true creates managed table 
> instead of external table
> 
>
> Key: HIVE-26837
> URL: https://issues.apache.org/jira/browse/HIVE-26837
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>
> When CTLT is used with the config hive.create.as.external.legacy=true, it 
> still creates managed table by default
> create table test_mm(empno int, name string) partitioned by(dept string) 
> stored as orc tblproperties('transactional'='true', 
> 'transactional_properties'='default');
> desc formatted test_mm;
> set hive.create.as.external.legacy=true;
> create table test_external like test_mm;
> desc formatted test_external;



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-26837) CTLT with hive.create.as.external.legacy as true creates managed table instead of external table

2022-12-12 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-26837:

Description: 
When CTLT is used with the config hive.create.as.external.legacy=true, it still 
creates managed table by default. Use below to reproduce.

create table test_mm(empno int, name string) partitioned by(dept string) stored 
as orc tblproperties('transactional'='true', 
'transactional_properties'='default');
desc formatted test_mm;

set hive.create.as.external.legacy=true;

create table test_external like test_mm;
desc formatted test_external;

  was:
When CTLT is used with the config hive.create.as.external.legacy=true, it still 
creates managed table by default

create table test_mm(empno int, name string) partitioned by(dept string) stored 
as orc tblproperties('transactional'='true', 
'transactional_properties'='default');
desc formatted test_mm;

set hive.create.as.external.legacy=true;

create table test_external like test_mm;
desc formatted test_external;


> CTLT with hive.create.as.external.legacy as true creates managed table 
> instead of external table
> 
>
> Key: HIVE-26837
> URL: https://issues.apache.org/jira/browse/HIVE-26837
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>
> When CTLT is used with the config hive.create.as.external.legacy=true, it 
> still creates managed table by default. Use below to reproduce.
> create table test_mm(empno int, name string) partitioned by(dept string) 
> stored as orc tblproperties('transactional'='true', 
> 'transactional_properties'='default');
> desc formatted test_mm;
> set hive.create.as.external.legacy=true;
> create table test_external like test_mm;
> desc formatted test_external;



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HIVE-26208) Exception in Vectorization with Decimal64 to Decimal casting

2022-06-20 Thread Ramesh Kumar Thangarajan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17556496#comment-17556496
 ] 

Ramesh Kumar Thangarajan commented on HIVE-26208:
-

Thanks [~scarlin] for the patch; I just merged it.

> Exception in Vectorization with Decimal64 to Decimal casting
> 
>
> Key: HIVE-26208
> URL: https://issues.apache.org/jira/browse/HIVE-26208
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Reporter: Steve Carlin
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The following query fails:
>  
> {code:java}
> select count(*)
> from
>   int_txt
>   where
>          (( 1.0 * i) / ( 1.0 * i)) > 1.2;
> {code}
> with the following exception:
>  
> {code:java}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.Decimal64ColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector
>         at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.gen.DecimalColDivideDecimalColumn.evaluate(DecimalColDivideDecimalColumn.java:59)
>  ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT]
>         at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:334)
>  ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT]
>         at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.gen.FilterDecimalColGreaterDecimalScalar.evaluate(FilterDecimalColGreaterDecimalScalar.java:62)
>  ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT]
>         at 
> org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.process(VectorFilterOperator.java:125)
>  ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT]
>         at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:919) 
> ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT]
>         at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:171)
>  ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT]
>         at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.deliverVectorizedRowBatch(VectorMapOperator.java:809)
>  ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT]
>         at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:900)
>  ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT]       
>         ... 19 more
> {code}
>  
>  
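
The exception quoted above can be reproduced in isolation. The sketch below is not Hive code: ColVec, LongVec, Decimal64Vec and DecimalVec are hypothetical stand-ins, and it assumes only what the exception message implies, namely that the decimal64 vector class is not a subtype of the full decimal vector class. It shows why an unchecked downcast fails and how a type check keeps the two representations apart.

```java
import java.math.BigDecimal;

// Hypothetical stand-ins for a column vector hierarchy; these are NOT the real Hive classes.
abstract class ColVec {}
class LongVec extends ColVec { long[] vector = new long[1]; }
class Decimal64Vec extends LongVec { short scale = 1; }            // decimal kept as a scaled long
class DecimalVec extends ColVec { BigDecimal[] vector = new BigDecimal[1]; }

final class CastSketch {
    /** Unchecked downcast, as an expression written only for full decimals would do. */
    static String blindCast(ColVec input) {
        try {
            DecimalVec d = (DecimalVec) input;   // fails when input is a Decimal64Vec
            return "ok";
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    /** Type-checked dispatch: each representation gets its own code path. */
    static String guarded(ColVec input) {
        if (input instanceof DecimalVec) {
            return "decimal path";
        }
        if (input instanceof Decimal64Vec) {
            return "decimal64 path";
        }
        return "unsupported";
    }
}
```

Calling CastSketch.blindCast(new Decimal64Vec()) takes the failing branch, while CastSketch.guarded(new Decimal64Vec()) selects the decimal64 path. Whether the actual fix adds a conversion step or selects a different expression is not stated in this thread.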



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Resolved] (HIVE-26269) Class cast exception when vectorization is enabled for certain case when cases

2022-06-15 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan resolved HIVE-26269.
-
Resolution: Fixed

> Class cast exception when vectorization is enabled for certain case when cases
> --
>
> Key: HIVE-26269
> URL: https://issues.apache.org/jira/browse/HIVE-26269
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Class cast exception when vectorization is enabled for certain case when cases



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Resolved] (HIVE-26239) Shutdown Hash table load executor service threads when they are interrupted

2022-05-31 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan resolved HIVE-26239.
-
Resolution: Fixed

> Shutdown Hash table load executor service threads when they are interrupted
> ---
>
> Key: HIVE-26239
> URL: https://issues.apache.org/jira/browse/HIVE-26239
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Assigned] (HIVE-26269) Class cast exception when vectorization is enabled for certain case when cases

2022-05-26 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan reassigned HIVE-26269:
---


> Class cast exception when vectorization is enabled for certain case when cases
> --
>
> Key: HIVE-26269
> URL: https://issues.apache.org/jira/browse/HIVE-26269
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>
> Class cast exception when vectorization is enabled for certain case when cases



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Assigned] (HIVE-26241) Support Geospatial datatypes

2022-05-18 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan reassigned HIVE-26241:
---


> Support Geospatial datatypes
> 
>
> Key: HIVE-26241
> URL: https://issues.apache.org/jira/browse/HIVE-26241
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (HIVE-26239) Shutdown Hash table load executor service threads when they are interrupted

2022-05-18 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-26239:

Issue Type: Bug  (was: Task)

> Shutdown Hash table load executor service threads when they are interrupted
> ---
>
> Key: HIVE-26239
> URL: https://issues.apache.org/jira/browse/HIVE-26239
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Assigned] (HIVE-26239) Shutdown Hash table load executor service threads when they are interrupted

2022-05-18 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan reassigned HIVE-26239:
---


> Shutdown Hash table load executor service threads when they are interrupted
> ---
>
> Key: HIVE-26239
> URL: https://issues.apache.org/jira/browse/HIVE-26239
> Project: Hive
>  Issue Type: Task
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Resolved] (HIVE-26219) Encapsulate the API change for FileUtils.isActionPermittedForFileHierarchy

2022-05-13 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan resolved HIVE-26219.
-
Resolution: Fixed

> Encapsulate the API change for FileUtils.isActionPermittedForFileHierarchy
> --
>
> Key: HIVE-26219
> URL: https://issues.apache.org/jira/browse/HIVE-26219
> Project: Hive
>  Issue Type: Task
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
>  Encapsulate the API change for FileUtils.isActionPermittedForFileHierarchy 
> so other services like Ranger can still use the old API
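
One common way to keep an old API usable after a signature change is to retain the old overload and have it delegate to the new one. The sketch below illustrates that pattern under stated assumptions: PermissionChecks, its method names, its parameters and the placeholder permission rule are all hypothetical and are not the real FileUtils.isActionPermittedForFileHierarchy signature.

```java
// Hypothetical class and signatures, for illustration only; not the real Hive FileUtils API.
final class PermissionChecks {

    /** New entry point with an additional option (the 'recurse' flag is an assumption). */
    static boolean isActionPermitted(String path, String user, boolean recurse) {
        // Placeholder rule: a user may act on paths under their own warehouse directory.
        String home = "/warehouse/" + user + "/";
        if (!path.startsWith(home)) {
            return false;
        }
        // Without recursion only direct children of the home directory are allowed.
        return recurse || path.indexOf('/', home.length()) < 0;
    }

    /** Old signature, kept so existing callers still compile; it delegates with the previous default. */
    static boolean isActionPermitted(String path, String user) {
        return isActionPermitted(path, user, true);
    }
}
```

A caller compiled against the two-argument form, such as PermissionChecks.isActionPermitted("/warehouse/alice/t1", "alice"), keeps working unchanged, while new callers can pass the extra flag.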



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Assigned] (HIVE-26219) Encapsulate the API change so other services like Ranger can still use the old API

2022-05-10 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan reassigned HIVE-26219:
---


> Encapsulate the API change so other services like Ranger can still use the 
> old API
> --
>
> Key: HIVE-26219
> URL: https://issues.apache.org/jira/browse/HIVE-26219
> Project: Hive
>  Issue Type: Task
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (HIVE-26219) Encapsulate the API change for FileUtils.isActionPermittedForFileHierarchy

2022-05-10 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-26219:

Description:  Encapsulate the API change for 
FileUtils.isActionPermittedForFileHierarchy so other services like Ranger can 
still use the old API  (was:  Encapsulate the API change for so other services 
like Ranger can still use the old API)

> Encapsulate the API change for FileUtils.isActionPermittedForFileHierarchy
> --
>
> Key: HIVE-26219
> URL: https://issues.apache.org/jira/browse/HIVE-26219
> Project: Hive
>  Issue Type: Task
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>
>  Encapsulate the API change for FileUtils.isActionPermittedForFileHierarchy 
> so other services like Ranger can still use the old API



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (HIVE-26219) Encapsulate the API change for FileUtils.isActionPermittedForFileHierarchy

2022-05-10 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-26219:

Description:  Encapsulate the API change for so other services like Ranger 
can still use the old API

> Encapsulate the API change for FileUtils.isActionPermittedForFileHierarchy
> --
>
> Key: HIVE-26219
> URL: https://issues.apache.org/jira/browse/HIVE-26219
> Project: Hive
>  Issue Type: Task
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>
>  Encapsulate the API change for so other services like Ranger can still use 
> the old API



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (HIVE-26219) Encapsulate the API change for FileUtils.isActionPermittedForFileHierarchy

2022-05-10 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-26219:

Summary: Encapsulate the API change for 
FileUtils.isActionPermittedForFileHierarchy  (was: Encapsulate the API change 
so other services like Ranger can still use the old API)

> Encapsulate the API change for FileUtils.isActionPermittedForFileHierarchy
> --
>
> Key: HIVE-26219
> URL: https://issues.apache.org/jira/browse/HIVE-26219
> Project: Hive
>  Issue Type: Task
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Resolved] (HIVE-26199) Reduce FileSystem init during user impersonation

2022-05-10 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan resolved HIVE-26199.
-
Resolution: Fixed

> Reduce FileSystem init during user impersonation
> 
>
> Key: HIVE-26199
> URL: https://issues.apache.org/jira/browse/HIVE-26199
> Project: Hive
>  Issue Type: Task
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Assigned] (HIVE-26199) Reduce FileSystem init during user impersonation

2022-05-02 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan reassigned HIVE-26199:
---


> Reduce FileSystem init during user impersonation
> 
>
> Key: HIVE-26199
> URL: https://issues.apache.org/jira/browse/HIVE-26199
> Project: Hive
>  Issue Type: Task
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (HIVE-25697) Upgrade commons-compress to 1.21

2021-11-14 Thread Ramesh Kumar Thangarajan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17443543#comment-17443543
 ] 

Ramesh Kumar Thangarajan commented on HIVE-25697:
-

[~pgaref]  Can you please help review this?

> Upgrade commons-compress to 1.21
> 
>
> Key: HIVE-25697
> URL: https://issues.apache.org/jira/browse/HIVE-25697
> Project: Hive
>  Issue Type: Task
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Upgrade commons-compress to 1.21 due to CVEs



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Assigned] (HIVE-25697) Upgrade commons-compress to 1.21

2021-11-12 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan reassigned HIVE-25697:
---


> Upgrade commons-compress to 1.21
> 
>
> Key: HIVE-25697
> URL: https://issues.apache.org/jira/browse/HIVE-25697
> Project: Hive
>  Issue Type: Task
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>
> Upgrade commons-compress to 1.21 due to CVEs



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HIVE-25583) Support parallel load for HastTables - Interfaces

2021-09-30 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-25583:

Parent: HIVE-24037
Issue Type: Sub-task  (was: Task)

> Support parallel load for HastTables - Interfaces
> -
>
> Key: HIVE-25583
> URL: https://issues.apache.org/jira/browse/HIVE-25583
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-25583) Support parallel load for HastTables - Interfaces

2021-09-30 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan reassigned HIVE-25583:
---


> Support parallel load for HastTables - Interfaces
> -
>
> Key: HIVE-25583
> URL: https://issues.apache.org/jira/browse/HIVE-25583
> Project: Hive
>  Issue Type: Task
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-21489) EXPLAIN command throws ClassCastException in Hive

2021-06-08 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan reassigned HIVE-21489:
---

Assignee: Ramesh Kumar Thangarajan  (was: Daniel Dai)

> EXPLAIN command throws ClassCastException in Hive
> -
>
> Key: HIVE-21489
> URL: https://issues.apache.org/jira/browse/HIVE-21489
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.3.4
>Reporter: Ping Lu
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-21489.1.patch, HIVE-21489.2.patch
>
>
> I'm trying to run commands like explain select * from src in hive-2.3.4, but 
> they fail with the ClassCastException: 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer cannot be cast to 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer
> Steps to reproduce:
> 1) hive.execution.engine is the default value mr
> 2) hive.security.authorization.enabled is set to true, and 
> hive.security.authorization.manager is set to 
> org.apache.hadoop.hive.ql.security.authorization.DefaultHiveAuthorizationProvider
> 3) start hivecli and run the command: explain select * from src
> I debugged the code and found that the change in HIVE-18778 causes the above 
> ClassCastException. If I set hive.in.test to true, the explain command 
> executes successfully.
> Now I have one question: since hive.in.test cannot be modified at runtime, how 
> can I run the explain command with default authorization in hive-2.3.4?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-21489) EXPLAIN command throws ClassCastException in Hive

2021-06-07 Thread Ramesh Kumar Thangarajan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-21489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17358879#comment-17358879
 ] 

Ramesh Kumar Thangarajan commented on HIVE-21489:
-

[~daijy] This issue can still be reproduced with storage-based 
authorization, and it would be better if we fixed it. The above patch seems 
reasonable to me. 

cc [~hashutosh] [~thejas]

> EXPLAIN command throws ClassCastException in Hive
> -
>
> Key: HIVE-21489
> URL: https://issues.apache.org/jira/browse/HIVE-21489
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.3.4
>Reporter: Ping Lu
>Assignee: Daniel Dai
>Priority: Major
> Attachments: HIVE-21489.1.patch, HIVE-21489.2.patch
>
>
> I'm trying to run commands like explain select * from src in hive-2.3.4, but 
> they fail with the ClassCastException: 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer cannot be cast to 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer
> Steps to reproduce:
> 1) hive.execution.engine is the default value mr
> 2) hive.security.authorization.enabled is set to true, and 
> hive.security.authorization.manager is set to 
> org.apache.hadoop.hive.ql.security.authorization.DefaultHiveAuthorizationProvider
> 3) start hivecli and run the command: explain select * from src
> I debugged the code and found that the change in HIVE-18778 causes the above 
> ClassCastException. If I set hive.in.test to true, the explain command 
> executes successfully.
> Now I have one question: since hive.in.test cannot be modified at runtime, how 
> can I run the explain command with default authorization in hive-2.3.4?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-25210) oracle.sql.CLOB cannot be cast to java.lang.String in PARTITION_PARAMS table

2021-06-07 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan resolved HIVE-25210.
-
Resolution: Not A Problem

> oracle.sql.CLOB cannot be cast to java.lang.String in PARTITION_PARAMS table
> 
>
> Key: HIVE-25210
> URL: https://issues.apache.org/jira/browse/HIVE-25210
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>
> oracle.sql.CLOB cannot be cast to java.lang.String in PARTITION_PARAMS table



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-25210) oracle.sql.CLOB cannot be cast to java.lang.String in PARTITION_PARAMS table

2021-06-07 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan reassigned HIVE-25210:
---


> oracle.sql.CLOB cannot be cast to java.lang.String in PARTITION_PARAMS table
> 
>
> Key: HIVE-25210
> URL: https://issues.apache.org/jira/browse/HIVE-25210
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>
> oracle.sql.CLOB cannot be cast to java.lang.String in PARTITION_PARAMS table



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-25202) Support decimal64 operations for PTF operators

2021-06-04 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan reassigned HIVE-25202:
---


> Support decimal64 operations for PTF operators
> --
>
> Key: HIVE-25202
> URL: https://issues.apache.org/jira/browse/HIVE-25202
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>
> After decimal64 vectorization support was added for multiple operators, PTF 
> operators were found to break the decimal64 chain when they occur between two 
> such operators, which introduces an unnecessary cast to decimal. To prevent 
> this, PTF operators should handle decimal64 data types as well.
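
For context, a decimal64 value is a decimal of limited precision stored as a scaled long, so arithmetic can stay in primitive long operations as long as every operator in the chain understands that representation. The sketch below is a minimal illustration under that assumption; Decimal64Sketch and its methods are hypothetical names and not Hive code. It contrasts staying in long arithmetic with the conversion to a full decimal that becomes necessary once an operator cannot consume decimal64.

```java
import java.math.BigDecimal;

// Hypothetical helper, for illustration only; not part of Hive.
final class Decimal64Sketch {

    /** Adds two decimal64 values that share the same scale, using plain long arithmetic. */
    static long add(long unscaledA, long unscaledB) {
        return Math.addExact(unscaledA, unscaledB);   // throws on overflow instead of wrapping
    }

    /** Converts a decimal64 value to a full decimal; this is the extra cast the issue wants to avoid. */
    static BigDecimal toDecimal(long unscaled, int scale) {
        return BigDecimal.valueOf(unscaled, scale);
    }
}
```

With scale 2, the values 12.34 and 0.66 are held as 1234 and 66. Decimal64Sketch.add(1234L, 66L) yields 1300 without any object allocation, and only Decimal64Sketch.toDecimal(1300L, 2) materializes the full decimal 13.00.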



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-25117) Vector PTF ClassCastException with Decimal64

2021-05-18 Thread Ramesh Kumar Thangarajan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17347297#comment-17347297
 ] 

Ramesh Kumar Thangarajan commented on HIVE-25117:
-

[~pgaref] Can you please help review this?

> Vector PTF ClassCastException with Decimal64
> 
>
> Key: HIVE-25117
> URL: https://issues.apache.org/jira/browse/HIVE-25117
> Project: Hive
>  Issue Type: Bug
>Reporter: Panagiotis Garefalakis
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
> Attachments: vector_ptf_classcast_exception.q
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This reproduces only when there is at least 1 buffered batch, so it needs 2 
> rows with 1 row/batch:
> {code:java}
> set hive.vectorized.testing.reducer.batch.size=1;
> {code}
> {code:java}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.LongColumnVector
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorizedBatchUtil.copyNonSelectedColumnVector(VectorizedBatchUtil.java:664)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFGroupBatches.forwardBufferedBatches(VectorPTFGroupBatches.java:228)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFGroupBatches.fillGroupResultsAndForward(VectorPTFGroupBatches.java:318)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFOperator.process(VectorPTFOperator.java:403)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:919)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:497)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-25117) Vector PTF ClassCastException with Decimal64

2021-05-14 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan reassigned HIVE-25117:
---

Assignee: Ramesh Kumar Thangarajan

> Vector PTF ClassCastException with Decimal64
> 
>
> Key: HIVE-25117
> URL: https://issues.apache.org/jira/browse/HIVE-25117
> Project: Hive
>  Issue Type: Bug
>Reporter: Panagiotis Garefalakis
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: vector_ptf_classcast_exception.q
>
>
> Only reproduces when there is at least 1 buffered batch, so needed 2 rows 
> with 1 row/batch:
> {code:java}
> set hive.vectorized.testing.reducer.batch.size=1;
> {code}
> {code:java}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.LongColumnVector
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorizedBatchUtil.copyNonSelectedColumnVector(VectorizedBatchUtil.java:664)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFGroupBatches.forwardBufferedBatches(VectorPTFGroupBatches.java:228)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFGroupBatches.fillGroupResultsAndForward(VectorPTFGroupBatches.java:318)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFOperator.process(VectorPTFOperator.java:403)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:919)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:497)
> {code}





[jira] [Resolved] (HIVE-24900) Failed compaction does not cleanup the directories

2021-05-04 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan resolved HIVE-24900.
-
Resolution: Fixed

> Failed compaction does not cleanup the directories
> --
>
> Key: HIVE-24900
> URL: https://issues.apache.org/jira/browse/HIVE-24900
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Failed compaction does not cleanup the directories





[jira] [Assigned] (HIVE-24927) Alter table rename moves temporary folders as part of the rename

2021-03-23 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan reassigned HIVE-24927:
---


> Alter table rename moves temporary folders as part of the rename
> 
>
> Key: HIVE-24927
> URL: https://issues.apache.org/jira/browse/HIVE-24927
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>
> Alter table rename moves temporary folders as part of the rename





[jira] [Assigned] (HIVE-24900) Failed compaction does not cleanup the directories

2021-03-17 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan reassigned HIVE-24900:
---


> Failed compaction does not cleanup the directories
> --
>
> Key: HIVE-24900
> URL: https://issues.apache.org/jira/browse/HIVE-24900
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>
> Failed compaction does not cleanup the directories





[jira] [Updated] (HIVE-24195) Avoid reallocation of the arrays in the lateral view explode of complex types

2020-09-24 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-24195:

Description: Avoid reallocation of the arrays in the lateral view explode 
of complex types

> Avoid reallocation of the arrays in the lateral view explode of complex types
> -
>
> Key: HIVE-24195
> URL: https://issues.apache.org/jira/browse/HIVE-24195
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>
> Avoid reallocation of the arrays in the lateral view explode of complex types





[jira] [Assigned] (HIVE-24195) Avoid reallocation of the arrays in the lateral view explode of complex types

2020-09-24 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan reassigned HIVE-24195:
---


> Avoid reallocation of the arrays in the lateral view explode of complex types
> -
>
> Key: HIVE-24195
> URL: https://issues.apache.org/jira/browse/HIVE-24195
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>






[jira] [Commented] (HIVE-24070) ObjectStore.cleanWriteNotificationEvents OutOfMemory on large number of pending events

2020-09-14 Thread Ramesh Kumar Thangarajan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17195269#comment-17195269
 ] 

Ramesh Kumar Thangarajan commented on HIVE-24070:
-

Yes, I will close this Jira, and let's work on HIVE-22290.

> ObjectStore.cleanWriteNotificationEvents OutOfMemory on large number of 
> pending events
> --
>
> Key: HIVE-24070
> URL: https://issues.apache.org/jira/browse/HIVE-24070
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Fix For: 4.0.0
>
>
> If there are a large number of events that haven't been cleaned up for some 
> reason, then ObjectStore.cleanWriteNotificationEvents() can run out of memory 
> while it loads all the events to be deleted.
>  It should fetch events in batches.
> Similar to https://issues.apache.org/jira/browse/HIVE-19430
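
The batching asked for in the description above can be sketched roughly as follows. This is only an illustration under assumptions, not the ObjectStore code: `BatchedCleaner`, `deleteInBatches`, and the in-memory queue standing in for the pending-event table are hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Hypothetical stand-in for an event store; this is not Hive's ObjectStore. */
final class BatchedCleaner {

    /**
     * Deletes every pending event while holding at most batchSize events
     * in memory at a time. Returns the total number of events deleted.
     */
    static int deleteInBatches(Queue<Long> pendingEventIds, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int deleted = 0;
        List<Long> batch = new ArrayList<>(batchSize);
        while (!pendingEventIds.isEmpty()) {
            batch.clear();
            // Fetch one bounded batch instead of loading all events at once.
            while (batch.size() < batchSize && !pendingEventIds.isEmpty()) {
                batch.add(pendingEventIds.poll());
            }
            // A real store would delete this batch and commit here.
            deleted += batch.size();
        }
        return deleted;
    }
}
```

With a bounded batch, peak memory depends on the batch size rather than on how many events have piled up.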





[jira] [Resolved] (HIVE-24070) ObjectStore.cleanWriteNotificationEvents OutOfMemory on large number of pending events

2020-09-14 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan resolved HIVE-24070.
-
Resolution: Duplicate

> ObjectStore.cleanWriteNotificationEvents OutOfMemory on large number of 
> pending events
> --
>
> Key: HIVE-24070
> URL: https://issues.apache.org/jira/browse/HIVE-24070
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Fix For: 4.0.0
>
>
> If there are a large number of events that haven't been cleaned up for some 
> reason, then ObjectStore.cleanWriteNotificationEvents() can run out of memory 
> while it loads all the events to be deleted.
>  It should fetch events in batches.
> Similar to https://issues.apache.org/jira/browse/HIVE-19430





[jira] [Assigned] (HIVE-24145) Fix preemption issues in reducers and file sink operators

2020-09-10 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan reassigned HIVE-24145:
---


> Fix preemption issues in reducers and file sink operators
> -
>
> Key: HIVE-24145
> URL: https://issues.apache.org/jira/browse/HIVE-24145
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>
> There are two issues caused by preemption:
>  # Reducers are getting reordered as part of optimizations, which causes 
> more preemptions to happen
>  # Preemption in the middle of writing can leave the file unclosed and 
> lead to errors when we read the file later





[jira] [Assigned] (HIVE-24071) Continue cleaning the NotificationEvents till we have data greater than TTL

2020-08-25 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan reassigned HIVE-24071:
---


> Continue cleaning the NotificationEvents till we have data greater than TTL
> ---
>
> Key: HIVE-24071
> URL: https://issues.apache.org/jira/browse/HIVE-24071
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Fix For: 4.0.0
>
>
> Continue cleaning the NotificationEvents till we have data greater than TTL.
> Currently we only clean the notification events once every 2 hours and also 
> strict 1 every time. We should continue deleting until we clear up all 
> the notification events greater than TTL.
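
The "continue deleting until done" behaviour described above might look roughly like this. It is a sketch under assumptions rather than the metastore implementation: `TtlCleaner`, `cleanOnce`, and `cleanUntilDone` are hypothetical names, and the deque of timestamps (oldest first) stands in for the notification event table.

```java
import java.util.Deque;

/** Hypothetical illustration; not the Hive metastore cleaner. */
final class TtlCleaner {

    /**
     * One bounded pass: removes up to batchSize events older than the TTL.
     * Assumes eventTimesMillis is ordered oldest first.
     */
    static int cleanOnce(Deque<Long> eventTimesMillis, long nowMillis,
                         long ttlMillis, int batchSize) {
        int removed = 0;
        while (removed < batchSize
                && !eventTimesMillis.isEmpty()
                && nowMillis - eventTimesMillis.peekFirst() > ttlMillis) {
            eventTimesMillis.pollFirst();
            removed++;
        }
        return removed;
    }

    /** Repeats bounded passes until a pass removes nothing; returns the total removed. */
    static int cleanUntilDone(Deque<Long> eventTimesMillis, long nowMillis,
                              long ttlMillis, int batchSize) {
        int total = 0;
        int removed;
        do {
            removed = cleanOnce(eventTimesMillis, nowMillis, ttlMillis, batchSize);
            total += removed;
        } while (removed > 0);
        return total;
    }
}
```

The loop ends on the first pass that removes nothing, so events still within the TTL are never touched.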





[jira] [Commented] (HIVE-24070) ObjectStore.cleanWriteNotificationEvents OutOfMemory on large number of pending events

2020-08-25 Thread Ramesh Kumar Thangarajan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17183905#comment-17183905
 ] 

Ramesh Kumar Thangarajan commented on HIVE-24070:
-

[~anishek] Do we already have a Jira for that? Otherwise I can create one.

> ObjectStore.cleanWriteNotificationEvents OutOfMemory on large number of 
> pending events
> --
>
> Key: HIVE-24070
> URL: https://issues.apache.org/jira/browse/HIVE-24070
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Fix For: 4.0.0
>
>
> If there are a large number of events that haven't been cleaned up for some 
> reason, then ObjectStore.cleanWriteNotificationEvents() can run out of memory 
> while it loads all the events to be deleted.
>  It should fetch events in batches.
> Similar to https://issues.apache.org/jira/browse/HIVE-19430





[jira] [Commented] (HIVE-24070) ObjectStore.cleanWriteNotificationEvents OutOfMemory on large number of pending events

2020-08-25 Thread Ramesh Kumar Thangarajan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17183900#comment-17183900
 ] 

Ramesh Kumar Thangarajan commented on HIVE-24070:
-

The issue you mentioned is still present and needs to be addressed, though it 
might not cause the service to stop. The OOM currently causes the service to 
stop. We might have to create a separate Jira to address that issue.

> ObjectStore.cleanWriteNotificationEvents OutOfMemory on large number of 
> pending events
> --
>
> Key: HIVE-24070
> URL: https://issues.apache.org/jira/browse/HIVE-24070
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Fix For: 4.0.0
>
>
> If there are a large number of events that haven't been cleaned up for some 
> reason, then ObjectStore.cleanWriteNotificationEvents() can run out of memory 
> while it loads all the events to be deleted.
>  It should fetch events in batches.
> Similar to https://issues.apache.org/jira/browse/HIVE-19430





[jira] [Commented] (HIVE-24070) ObjectStore.cleanWriteNotificationEvents OutOfMemory on large number of pending events

2020-08-25 Thread Ramesh Kumar Thangarajan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17183898#comment-17183898
 ] 

Ramesh Kumar Thangarajan commented on HIVE-24070:
-

Jira HIVE-19430 only solves the OOM issue with cleanNotificationEvents(). We 
still reach OOM through cleanWriteNotificationEvents() at 
[https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L10429]

> ObjectStore.cleanWriteNotificationEvents OutOfMemory on large number of 
> pending events
> --
>
> Key: HIVE-24070
> URL: https://issues.apache.org/jira/browse/HIVE-24070
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Fix For: 4.0.0
>
>
> If there are a large number of events that haven't been cleaned up for some 
> reason, then ObjectStore.cleanWriteNotificationEvents() can run out of memory 
> while it loads all the events to be deleted.
>  It should fetch events in batches.
> Similar to https://issues.apache.org/jira/browse/HIVE-19430





[jira] [Updated] (HIVE-24070) ObjectStore.cleanWriteNotificationEvents OutOfMemory on large number of pending events

2020-08-25 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-24070:

Description: 
If there are a large number of events that haven't been cleaned up for some 
reason, then ObjectStore.cleanWriteNotificationEvents() can run out of memory 
while it loads all the events to be deleted.
 It should fetch events in batches.

Similar to https://issues.apache.org/jira/browse/HIVE-19430

  was:
If there are a large number of events that haven't been cleaned up for some 
reason, then ObjectStore.cleanWriteNotificationEvents() can run out of memory 
while it loads all the events to be deleted.
It should fetch events in batches.


> ObjectStore.cleanWriteNotificationEvents OutOfMemory on large number of 
> pending events
> --
>
> Key: HIVE-24070
> URL: https://issues.apache.org/jira/browse/HIVE-24070
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Fix For: 4.0.0
>
>
> If there are a large number of events that haven't been cleaned up for some 
> reason, then ObjectStore.cleanWriteNotificationEvents() can run out of memory 
> while it loads all the events to be deleted.
>  It should fetch events in batches.
> Similar to https://issues.apache.org/jira/browse/HIVE-19430





[jira] [Assigned] (HIVE-24070) ObjectStore.cleanWriteNotificationEvents OutOfMemory on large number of pending events

2020-08-25 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan reassigned HIVE-24070:
---


> ObjectStore.cleanWriteNotificationEvents OutOfMemory on large number of 
> pending events
> --
>
> Key: HIVE-24070
> URL: https://issues.apache.org/jira/browse/HIVE-24070
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Fix For: 4.0.0
>
>
> If there are a large number of events that haven't been cleaned up for some 
> reason, then ObjectStore.cleanWriteNotificationEvents() can run out of memory 
> while it loads all the events to be deleted.
> It should fetch events in batches.





[jira] [Assigned] (HIVE-24037) Parallelize hash table constructions in map joins

2020-08-13 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan reassigned HIVE-24037:
---


> Parallelize hash table constructions in map joins
> -
>
> Key: HIVE-24037
> URL: https://issues.apache.org/jira/browse/HIVE-24037
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>
> Parallelize hash table constructions in map joins





[jira] [Assigned] (HIVE-22934) Hive server interactive log counters to error stream

2020-07-07 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan reassigned HIVE-22934:
---

Assignee: Ramesh Kumar Thangarajan  (was: Antal Sinkovits)

> Hive server interactive log counters to error stream
> 
>
> Key: HIVE-22934
> URL: https://issues.apache.org/jira/browse/HIVE-22934
> Project: Hive
>  Issue Type: Bug
>Reporter: Slim Bouguerra
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22934.01.patch, HIVE-22934.02.patch, 
> HIVE-22934.03.patch, HIVE-22934.04.patch, HIVE-22934.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Hive server is logging the console output to the system error stream.
> This needs to be fixed because:
> First, we do not roll the file.
> Second, writing to such a file is done sequentially and can lead to 
> throttling/poor performance.
> {code}
> -rw-r--r--  1 hive hadoop 9.5G Feb 26 17:22 hive-server2-interactive.err
> {code}





[jira] [Commented] (HIVE-23665) Rewrite last_value to first_value to enable streaming results

2020-06-26 Thread Ramesh Kumar Thangarajan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17146476#comment-17146476
 ] 

Ramesh Kumar Thangarajan commented on HIVE-23665:
-

[~jcamachorodriguez] [~vgarg] Can you please review the attached PR and let me 
know your thoughts?

> Rewrite last_value to first_value to enable streaming results
> -
>
> Key: HIVE-23665
> URL: https://issues.apache.org/jira/browse/HIVE-23665
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23665.1.patch, HIVE-23665.2.patch, 
> HIVE-23665.3.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Rewrite last_value to first_value to enable streaming results
> last_value cannot be streamed because the intermediate results need to be 
> buffered to determine the window result till we get the last row in the 
> window. But if we can rewrite to first_value we can stream the results, 
> although the order of results will not be guaranteed (also not important)
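
To illustrate why the rewrite enables streaming, here is a small sketch, assuming the window covers the whole partition and that rows arrive grouped by partition key in the reversed sort order. It is not the Hive rewrite itself; `StreamingFirstValue`, `Row`, and `firstValues` are hypothetical names for the example. Under those assumptions the first row seen for a partition already holds what last_value would have returned under the original order, so each output can be produced as soon as its row arrives.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration; not Hive's window function code. */
final class StreamingFirstValue {

    /** One input row: a non-null partition key and a value. */
    static final class Row {
        final String key;
        final int value;

        Row(String key, int value) {
            this.key = key;
            this.value = value;
        }
    }

    /**
     * Computes first_value for each row without buffering a partition.
     * Assumes rows of the same partition are adjacent. Results are
     * collected into a list only to keep the example easy to inspect;
     * a streaming operator would forward each result immediately.
     */
    static List<Integer> firstValues(Iterable<Row> rows) {
        List<Integer> out = new ArrayList<>();
        String currentKey = null;
        boolean started = false;
        int first = 0;
        for (Row r : rows) {
            if (!started || !r.key.equals(currentKey)) {
                // New partition: its first row fixes the result for all its rows.
                currentKey = r.key;
                first = r.value;
                started = true;
            }
            out.add(first);
        }
        return out;
    }
}
```

By contrast, a last_value over the same frame cannot emit anything for a partition until its final row has been read.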





[jira] [Commented] (HIVE-23666) checkHashModeEfficiency is skipped when a groupby operator doesn't have a grouping set

2020-06-12 Thread Ramesh Kumar Thangarajan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17134624#comment-17134624
 ] 

Ramesh Kumar Thangarajan commented on HIVE-23666:
-

I have created the PR at [https://github.com/apache/hive/pull/1103]. Can you 
please help me review the patch?

> checkHashModeEfficiency is skipped when a groupby operator doesn't have a 
> grouping set
> --
>
> Key: HIVE-23666
> URL: https://issues.apache.org/jira/browse/HIVE-23666
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23666.1.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> checkHashModeEfficiency is skipped when a groupby operator doesn't have a 
> grouping set





[jira] [Updated] (HIVE-23665) Rewrite last_value to first_value to enable streaming results

2020-06-12 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-23665:

Attachment: (was: HIVE-23665.3.patch)

> Rewrite last_value to first_value to enable streaming results
> -
>
> Key: HIVE-23665
> URL: https://issues.apache.org/jira/browse/HIVE-23665
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-23665.1.patch, HIVE-23665.2.patch, 
> HIVE-23665.3.patch
>
>
> Rewrite last_value to first_value to enable streaming results
> last_value cannot be streamed because the intermediate results need to be 
> buffered to determine the window result till we get the last row in the 
> window. But if we can rewrite to first_value we can stream the results, 
> although the order of results will not be guaranteed (also not important)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-23665) Rewrite last_value to first_value to enable streaming results

2020-06-12 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-23665:

Attachment: HIVE-23665.3.patch
Status: Patch Available  (was: Open)

> Rewrite last_value to first_value to enable streaming results
> -
>
> Key: HIVE-23665
> URL: https://issues.apache.org/jira/browse/HIVE-23665
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-23665.1.patch, HIVE-23665.2.patch, 
> HIVE-23665.3.patch
>
>
> Rewrite last_value to first_value to enable streaming results
> last_value cannot be streamed because the intermediate results need to be 
> buffered to determine the window result till we get the last row in the 
> window. But if we can rewrite to first_value we can stream the results, 
> although the order of results will not be guaranteed (also not important)





[jira] [Updated] (HIVE-23665) Rewrite last_value to first_value to enable streaming results

2020-06-12 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-23665:

Status: Open  (was: Patch Available)

> Rewrite last_value to first_value to enable streaming results
> -
>
> Key: HIVE-23665
> URL: https://issues.apache.org/jira/browse/HIVE-23665
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-23665.1.patch, HIVE-23665.2.patch, 
> HIVE-23665.3.patch
>
>
> Rewrite last_value to first_value to enable streaming results
> last_value cannot be streamed because the intermediate results need to be 
> buffered to determine the window result till we get the last row in the 
> window. But if we can rewrite to first_value we can stream the results, 
> although the order of results will not be guaranteed (also not important)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-23665) Rewrite last_value to first_value to enable streaming results

2020-06-12 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-23665:

Status: Open  (was: Patch Available)

> Rewrite last_value to first_value to enable streaming results
> -
>
> Key: HIVE-23665
> URL: https://issues.apache.org/jira/browse/HIVE-23665
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-23665.1.patch, HIVE-23665.2.patch
>
>
> Rewrite last_value to first_value to enable streaming results
> last_value cannot be streamed because the intermediate results need to be 
> buffered to determine the window result till we get the last row in the 
> window. But if we can rewrite to first_value we can stream the results, 
> although the order of results will not be guaranteed (also not important)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-23665) Rewrite last_value to first_value to enable streaming results

2020-06-12 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-23665:

Attachment: HIVE-23665.3.patch
Status: Patch Available  (was: Open)

> Rewrite last_value to first_value to enable streaming results
> -
>
> Key: HIVE-23665
> URL: https://issues.apache.org/jira/browse/HIVE-23665
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-23665.1.patch, HIVE-23665.2.patch, 
> HIVE-23665.3.patch
>
>
> Rewrite last_value to first_value to enable streaming results
> last_value cannot be streamed because the intermediate results need to be 
> buffered to determine the window result till we get the last row in the 
> window. But if we can rewrite to first_value we can stream the results, 
> although the order of results will not be guaranteed (also not important)




