[jira] [Updated] (HIVE-28019) Fix query type information in proto files for load queries
[ https://issues.apache.org/jira/browse/HIVE-28019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan updated HIVE-28019: Summary: Fix query type information in proto files for load queries (was: Fix query type information in proto files for load and explain queries) > Fix query type information in proto files for load queries > -- > > Key: HIVE-28019 > URL: https://issues.apache.org/jira/browse/HIVE-28019 > Project: Hive > Issue Type: Task > Components: HiveServer2 >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Labels: pull-request-available > > Certain query types like LOAD, export, import and explain queries did not > produce the right Hive operation type -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (HIVE-28019) Fix query type information in proto files for load and explain queries
[ https://issues.apache.org/jira/browse/HIVE-28019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17837401#comment-17837401 ] Ramesh Kumar Thangarajan commented on HIVE-28019: - Hi [~zabetak] First of all, thank you very much for the review on this. :) I agree that HiveOperation was introduced for authorization and maybe we should not change it to represent the query type. But I still believe we should make the change for PREHOOK: type: and POSTHOOK: type: and also for the HiveProtoLoggingHook. I feel that the change to HiveOperation.Explain for explain queries is needed mostly because we use the HiveOperation to print in the preexecute and postexecute actions. [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/hooks/PreExecutePrinter.java#L69] At present we report the type information for the queries in preexec and postexec as below: PREHOOK: type: QUERY POSTHOOK: type: QUERY I think this is the query type information that is reported along with other information on the query. If that is the case, I feel we should not report a different type for explain queries. If this change is a loss of information, wouldn't that mean the users' usage of the type is wrong? Although we could skip this and fix only the HiveProtoLoggingHook to report the right query type, I feel we would then report two different pieces of information for the same query in different places. Keeping them synchronized will also help us test all types of queries completely. Please let me know if you think my points make sense. I will change the patch so that it does not touch the commandType and instead creates a field to represent explain queries, and uses that field to report the correct query type in HiveProtoLoggingHook and in PREHOOK: type: and POSTHOOK: type:. 
> Fix query type information in proto files for load and explain queries > -- > > Key: HIVE-28019 > URL: https://issues.apache.org/jira/browse/HIVE-28019 > Project: Hive > Issue Type: Task > Components: HiveServer2 >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Labels: pull-request-available > > Certain query types like LOAD, export, import and explain queries did not > produce the right Hive operation type -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-28129) Execute statement doesnot report the correct query string information
[ https://issues.apache.org/jira/browse/HIVE-28129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan updated HIVE-28129: Fix Version/s: 4.0.0 > Execute statement doesnot report the correct query string information > - > > Key: HIVE-28129 > URL: https://issues.apache.org/jira/browse/HIVE-28129 > Project: Hive > Issue Type: Bug >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Execute statement does not report the correct query type information. > It inherits the sql statement type of the subsequent queries. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (HIVE-28129) Execute statement doesnot report the correct query string information
[ https://issues.apache.org/jira/browse/HIVE-28129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan resolved HIVE-28129. - Resolution: Fixed > Execute statement doesnot report the correct query string information > - > > Key: HIVE-28129 > URL: https://issues.apache.org/jira/browse/HIVE-28129 > Project: Hive > Issue Type: Bug >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Execute statement does not report the correct query type information. > It inherits the sql statement type of the subsequent queries. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-28171) Active queries API does not provide query information if they are still pending
[ https://issues.apache.org/jira/browse/HIVE-28171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan updated HIVE-28171: Summary: Active queries API does not provide query information if they are still pending (was: Active queries API does not provide information on the queries if they are still pending) > Active queries API does not provide query information if they are still > pending > --- > > Key: HIVE-28171 > URL: https://issues.apache.org/jira/browse/HIVE-28171 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > > Active queries API does not provide information on the queries if they are > still pending. So we will never know which query is pending when we query this > API endpoint. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HIVE-28171) Active queries API does not provide information on the queries if they are still pending
Ramesh Kumar Thangarajan created HIVE-28171: --- Summary: Active queries API does not provide information on the queries if they are still pending Key: HIVE-28171 URL: https://issues.apache.org/jira/browse/HIVE-28171 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Ramesh Kumar Thangarajan Assignee: Ramesh Kumar Thangarajan Active queries API does not provide information on the queries if they are still pending. So we will never know which query is pending when we query this API endpoint. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-28170) Implement drop stats
[ https://issues.apache.org/jira/browse/HIVE-28170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan updated HIVE-28170: Description: We need a way to drop the stats associated with the table/partition and its columns. This can help a lot in migration or replication where the stats data take huge time to copy. Particularly when the table is partitioned, we have stats rows for each table, partition, column combination, which can get huge when the number of partitions is huge. > Implement drop stats > > > Key: HIVE-28170 > URL: https://issues.apache.org/jira/browse/HIVE-28170 > Project: Hive > Issue Type: Task >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > > We need a way to drop the stats associated with the table/partition and its > columns. This can help a lot in migration or replication where the stats data > take huge time to copy. Particularly when the table is partitioned, we have > stats rows for each table, partition, column combination, which can get huge > when the number of partitions is huge. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HIVE-28170) Implement drop stats
Ramesh Kumar Thangarajan created HIVE-28170: --- Summary: Implement drop stats Key: HIVE-28170 URL: https://issues.apache.org/jira/browse/HIVE-28170 Project: Hive Issue Type: Task Reporter: Ramesh Kumar Thangarajan Assignee: Ramesh Kumar Thangarajan -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (HIVE-28128) explain reoptimization doesnot report the correct querytype information
[ https://issues.apache.org/jira/browse/HIVE-28128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan resolved HIVE-28128. - Fix Version/s: 4.1.0 Resolution: Done > explain reoptimization doesnot report the correct querytype information > --- > > Key: HIVE-28128 > URL: https://issues.apache.org/jira/browse/HIVE-28128 > Project: Hive > Issue Type: Bug >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Fix For: 4.1.0 > > > explain reoptimization doesnot report the correct querytype information and > sometimes result in QUERY as the query type instead of explain -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (HIVE-28128) explain reoptimization doesnot report the correct querytype information
[ https://issues.apache.org/jira/browse/HIVE-28128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17830641#comment-17830641 ] Ramesh Kumar Thangarajan commented on HIVE-28128: - Addressed as part of the Jira https://issues.apache.org/jira/browse/HIVE-28019 Resolving this. > explain reoptimization doesnot report the correct querytype information > --- > > Key: HIVE-28128 > URL: https://issues.apache.org/jira/browse/HIVE-28128 > Project: Hive > Issue Type: Bug >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > > explain reoptimization doesnot report the correct querytype information and > sometimes result in QUERY as the query type instead of explain -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-28129) Execute statement doesnot report the correct query string information
[ https://issues.apache.org/jira/browse/HIVE-28129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan updated HIVE-28129: Issue Type: Bug (was: Task) > Execute statement doesnot report the correct query string information > - > > Key: HIVE-28129 > URL: https://issues.apache.org/jira/browse/HIVE-28129 > Project: Hive > Issue Type: Bug >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Labels: pull-request-available > > Execute statement does not report the correct query type information. > It inherits the sql statement type of the subsequent queries. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-28128) explain reoptimization doesnot report the correct querytype information
[ https://issues.apache.org/jira/browse/HIVE-28128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan updated HIVE-28128: Issue Type: Bug (was: Task) > explain reoptimization doesnot report the correct querytype information > --- > > Key: HIVE-28128 > URL: https://issues.apache.org/jira/browse/HIVE-28128 > Project: Hive > Issue Type: Bug >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > > explain reoptimization doesnot report the correct querytype information and > sometimes result in QUERY as the query type instead of explain -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (HIVE-27751) Log Query Compilation summary in an accumulated way
[ https://issues.apache.org/jira/browse/HIVE-27751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan resolved HIVE-27751. - Fix Version/s: 4.1.0 Resolution: Fixed > Log Query Compilation summary in an accumulated way > --- > > Key: HIVE-27751 > URL: https://issues.apache.org/jira/browse/HIVE-27751 > Project: Hive > Issue Type: Task > Components: Hive >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Labels: pull-request-available > Fix For: 4.1.0 > > > Query Compilation summary is very useful for reading and collecting all the > measures of compile time in a single place. It is also useful in debugging a > performance issue in the query compilation phase and also to report and > compare with various runs > In order to run test this. Please set the config hive.compile.print.summary > to true in any q file and run the test to see the Query Compilation Summary > in the logs. One example of the output is below. 
The order of operations are > maintained while print the summary too: > {code:java} > Query Compilation Summary > -- > waitCompile >0 ms > parse >4 ms > getTableConstraints - HS2-cache > 69 ms > optimizer - Calcite: Plan generation > 257 ms > optimizer - Calcite: Prejoin ordering transformation > 20 ms > optimizer - Calcite: Postjoin ordering transformation > 24 ms > optimizer > 705 ms > optimizer - HiveOpConverterPostProc >0 ms > optimizer - Generator > 24 ms > optimizer - PartitionColumnsSeparator >1 ms > optimizer - SyntheticJoinPredicate >2 ms > optimizer - SimplePredicatePushDown >8 ms > optimizer - RedundantDynamicPruningConditionsRemoval >0 ms > optimizer - SortedDynPartitionTimeGranularityOptimizer >2 ms > optimizer - PartitionPruner >3 ms > optimizer - PartitionConditionRemover >2 ms > optimizer - GroupByOptimizer >2 ms > optimizer - ColumnPruner > 10 ms > optimizer - CountDistinctRewriteProc >1 ms > optimizer - SamplePruner >1 ms > optimizer - MapJoinProcessor >2 ms > optimizer - BucketingSortingReduceSinkOptimizer >2 ms > optimizer - UnionProcessor >2 ms > optimizer - JoinReorder >0 ms > optimizer - FixedBucketPruningOptimizer >2 ms > optimizer - BucketVersionPopulator >2 ms > optimizer - NonBlockingOpDeDupProc >1 ms > optimizer - IdentityProjectRemover >0 ms > optimizer - LimitPushdownOptimizer >2 ms > optimizer - OrderlessLimitPushDownOptimizer >1 ms > optimizer - StatsOptimizer >
[jira] [Work started] (HIVE-28129) Execute statement doesnot report the correct query string information
[ https://issues.apache.org/jira/browse/HIVE-28129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-28129 started by Ramesh Kumar Thangarajan. --- > Execute statement doesnot report the correct query string information > - > > Key: HIVE-28129 > URL: https://issues.apache.org/jira/browse/HIVE-28129 > Project: Hive > Issue Type: Task >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > > Execute statement does not report the correct query type information. > It inherits the sql statement type of the subsequent queries. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-28129) Execute statement doesnot report the correct query string information
[ https://issues.apache.org/jira/browse/HIVE-28129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan updated HIVE-28129: Summary: Execute statement doesnot report the correct query string information (was: Prepare statement doesnot report the correct query type information) > Execute statement doesnot report the correct query string information > - > > Key: HIVE-28129 > URL: https://issues.apache.org/jira/browse/HIVE-28129 > Project: Hive > Issue Type: Task >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > > Prepare statement doesnot report the correct query type information. > It inherits the sql statement type of the subsequent queries. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-28129) Execute statement doesnot report the correct query string information
[ https://issues.apache.org/jira/browse/HIVE-28129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan updated HIVE-28129: Description: Execute statement does not report the correct query type information. It inherits the sql statement type of the subsequent queries. was: Prepare statement doesnot report the correct query type information. It inherits the sql statement type of the subsequent queries. > Execute statement doesnot report the correct query string information > - > > Key: HIVE-28129 > URL: https://issues.apache.org/jira/browse/HIVE-28129 > Project: Hive > Issue Type: Task >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > > Execute statement does not report the correct query type information. > It inherits the sql statement type of the subsequent queries. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HIVE-28129) Prepare statement doesnot report the correct query type information
Ramesh Kumar Thangarajan created HIVE-28129: --- Summary: Prepare statement doesnot report the correct query type information Key: HIVE-28129 URL: https://issues.apache.org/jira/browse/HIVE-28129 Project: Hive Issue Type: Task Reporter: Ramesh Kumar Thangarajan Assignee: Ramesh Kumar Thangarajan Prepare statement doesnot report the correct query type information. It inherits the sql statement type of the subsequent queries. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HIVE-28128) explain reoptimization doesnot report the correct querytype information
Ramesh Kumar Thangarajan created HIVE-28128: --- Summary: explain reoptimization doesnot report the correct querytype information Key: HIVE-28128 URL: https://issues.apache.org/jira/browse/HIVE-28128 Project: Hive Issue Type: Task Reporter: Ramesh Kumar Thangarajan Assignee: Ramesh Kumar Thangarajan explain reoptimization doesnot report the correct querytype information and sometimes result in QUERY as the query type instead of explain -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-28019) Fix query type information in proto files for load and explain queries
[ https://issues.apache.org/jira/browse/HIVE-28019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan updated HIVE-28019: Summary: Fix query type information in proto files for load and explain queries (was: Fix query type information in proto files for load, export, import and explain queries) > Fix query type information in proto files for load and explain queries > -- > > Key: HIVE-28019 > URL: https://issues.apache.org/jira/browse/HIVE-28019 > Project: Hive > Issue Type: Task > Components: HiveServer2 >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Labels: pull-request-available > > Certain query types like LOAD, export, import and explain queries did not > produce the right Hive operation type -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-28019) Fix query type information in proto files for load, export, import and explain queries
[ https://issues.apache.org/jira/browse/HIVE-28019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan updated HIVE-28019: Summary: Fix query type information in proto files for load, export, import and explain queries (was: Wrong query type information in proto files for load, export, import and explain queries) > Fix query type information in proto files for load, export, import and > explain queries > -- > > Key: HIVE-28019 > URL: https://issues.apache.org/jira/browse/HIVE-28019 > Project: Hive > Issue Type: Task > Components: HiveServer2 >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > > Certain query types like LOAD, export, import and explain queries did not > produce the right Hive operation type -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HIVE-28019) Wrong query type information in proto files for load, export, import and explain queries
Ramesh Kumar Thangarajan created HIVE-28019: --- Summary: Wrong query type information in proto files for load, export, import and explain queries Key: HIVE-28019 URL: https://issues.apache.org/jira/browse/HIVE-28019 Project: Hive Issue Type: Task Components: HiveServer2 Reporter: Ramesh Kumar Thangarajan Assignee: Ramesh Kumar Thangarajan Certain query types like LOAD, export, import and explain queries did not produce the right Hive operation type -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (HIVE-27751) Log Query Compilation summary in an accumulated way
[ https://issues.apache.org/jira/browse/HIVE-27751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809764#comment-17809764 ] Ramesh Kumar Thangarajan commented on HIVE-27751: - Hi [~zabetak] Thank you very much for reviewing this. I have updated the description with the sample output. Usually the debug logs are all spread across multiple places and we do not have a easy way to get the details from user when they run into performance issues. As part of this PR, main idea is to output the information in the command line output too. This will be done only if the config is turned on. That is what I meant by accumulated as we get all the details related to Query Compilation at one single place and its visible to the user as part of the query output. Also I have addressed your comments, can you let me know what you think about the latest patch? > Log Query Compilation summary in an accumulated way > --- > > Key: HIVE-27751 > URL: https://issues.apache.org/jira/browse/HIVE-27751 > Project: Hive > Issue Type: Task > Components: Hive >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Labels: pull-request-available > > Query Compilation summary is very useful for reading and collecting all the > measures of compile time in a single place. It is also useful in debugging a > performance issue in the query compilation phase and also to report and > compare with various runs > In order to run test this. Please set the config hive.compile.print.summary > to true in any q file and run the test to see the Query Compilation Summary > in the logs. One example of the output is below. 
The order of operations are > maintained while print the summary too: > {code:java} > Query Compilation Summary > -- > waitCompile >0 ms > parse >4 ms > getTableConstraints - HS2-cache > 69 ms > optimizer - Calcite: Plan generation > 257 ms > optimizer - Calcite: Prejoin ordering transformation > 20 ms > optimizer - Calcite: Postjoin ordering transformation > 24 ms > optimizer > 705 ms > optimizer - HiveOpConverterPostProc >0 ms > optimizer - Generator > 24 ms > optimizer - PartitionColumnsSeparator >1 ms > optimizer - SyntheticJoinPredicate >2 ms > optimizer - SimplePredicatePushDown >8 ms > optimizer - RedundantDynamicPruningConditionsRemoval >0 ms > optimizer - SortedDynPartitionTimeGranularityOptimizer >2 ms > optimizer - PartitionPruner >3 ms > optimizer - PartitionConditionRemover >2 ms > optimizer - GroupByOptimizer >2 ms > optimizer - ColumnPruner > 10 ms > optimizer - CountDistinctRewriteProc >1 ms > optimizer - SamplePruner >1 ms > optimizer - MapJoinProcessor >2 ms > optimizer - BucketingSortingReduceSinkOptimizer >2 ms > optimizer - UnionProcessor >2 ms > optimizer - JoinReorder >0 ms > optimizer - FixedBucketPruningOptimizer >2 ms > optimizer -
[jira] [Updated] (HIVE-27751) Log Query Compilation summary in an accumulated way
[ https://issues.apache.org/jira/browse/HIVE-27751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan updated HIVE-27751: Description:
Query Compilation summary is very useful for reading and collecting all the measures of compile time in a single place. It is also useful in debugging a performance issue in the query compilation phase and in reporting and comparing results across runs.
To test this, set the config hive.compile.print.summary to true in any q file and run the test to see the Query Compilation Summary in the logs. One example of the output is below. The order of operations is maintained when printing the summary too:
{code:java}
Query Compilation Summary
--
waitCompile  0 ms
parse  4 ms
getTableConstraints - HS2-cache  69 ms
optimizer - Calcite: Plan generation  257 ms
optimizer - Calcite: Prejoin ordering transformation  20 ms
optimizer - Calcite: Postjoin ordering transformation  24 ms
optimizer  705 ms
optimizer - HiveOpConverterPostProc  0 ms
optimizer - Generator  24 ms
optimizer - PartitionColumnsSeparator  1 ms
optimizer - SyntheticJoinPredicate  2 ms
optimizer - SimplePredicatePushDown  8 ms
optimizer - RedundantDynamicPruningConditionsRemoval  0 ms
optimizer - SortedDynPartitionTimeGranularityOptimizer  2 ms
optimizer - PartitionPruner  3 ms
optimizer - PartitionConditionRemover  2 ms
optimizer - GroupByOptimizer  2 ms
optimizer - ColumnPruner  10 ms
optimizer - CountDistinctRewriteProc  1 ms
optimizer - SamplePruner  1 ms
optimizer - MapJoinProcessor  2 ms
optimizer - BucketingSortingReduceSinkOptimizer  2 ms
optimizer - UnionProcessor  2 ms
optimizer - JoinReorder  0 ms
optimizer - FixedBucketPruningOptimizer  2 ms
optimizer - BucketVersionPopulator  2 ms
optimizer - NonBlockingOpDeDupProc  1 ms
optimizer - IdentityProjectRemover  0 ms
optimizer - LimitPushdownOptimizer  2 ms
optimizer - OrderlessLimitPushDownOptimizer  1 ms
optimizer - StatsOptimizer  0 ms
optimizer - SimpleFetchOptimizer  0 ms
TezCompiler - Run top n key optimization  2 ms
TezCompiler - Setup dynamic partition pruning  3 ms
optimizer - Merge single column semi-join reducers to composite  0 ms
partition-retrieving  1 ms
TezCompiler - Setup stats in the operator plan
[jira] [Updated] (HIVE-27751) Log Query Compilation summary in an accumulated way
[ https://issues.apache.org/jira/browse/HIVE-27751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan updated HIVE-27751: Description: Query Compilation summary is very useful for reading and collecting all the measures of compile time in a single place. It is also useful in debugging a performance issue in the query compilation phase and also to report and compare with various runs After the was:Query Compilation summary is very useful for reading and collecting all the measures of compile time in a single place. It is also useful in debugging a performance issue in the query compilation phase and also to report and compare with various runs > Log Query Compilation summary in an accumulated way > --- > > Key: HIVE-27751 > URL: https://issues.apache.org/jira/browse/HIVE-27751 > Project: Hive > Issue Type: Task > Components: Hive >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Labels: pull-request-available > > Query Compilation summary is very useful for reading and collecting all the > measures of compile time in a single place. It is also useful in debugging a > performance issue in the query compilation phase and also to report and > compare with various runs > > After the -- This message was sent by Atlassian Jira (v8.20.10#820010)
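As a rough illustration of what collecting compile-time measures "in a single place" for HIVE-27751 could look like, here is a minimal sketch. It is not Hive's implementation: the class name CompileSummarySketch and its methods are hypothetical, and the only assumptions taken from the issue are that phases carry a name and a duration in milliseconds and that the summary keeps the order in which phases were first recorded. An insertion-ordered map gives that ordering for free.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not Hive code: accumulates per-phase compile times
// and renders them in the order the phases were first recorded.
final class CompileSummarySketch {
    // LinkedHashMap iterates in insertion order, so the summary keeps
    // the order of operations without any extra bookkeeping.
    private final Map<String, Long> millisByPhase = new LinkedHashMap<>();

    // Adds a measurement; repeated phases are summed under one entry.
    void record(String phase, long millis) {
        millisByPhase.merge(phase, millis, Long::sum);
    }

    // Renders one line per phase, under a title line.
    String render() {
        StringBuilder sb = new StringBuilder("Query Compilation Summary\n");
        for (Map.Entry<String, Long> e : millisByPhase.entrySet()) {
            sb.append(String.format("%-60s %6d ms\n", e.getKey(), e.getValue()));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        CompileSummarySketch summary = new CompileSummarySketch();
        summary.record("waitCompile", 0);
        summary.record("parse", 4);
        summary.record("optimizer", 705);
        System.out.print(summary.render());
    }
}
```

Summing repeated phases under one key is a design choice of this sketch; whether Hive sums or lists repeated phases separately is not stated in the issue.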
[jira] [Work started] (HIVE-27843) Add QueryOperation to Hive proto logger for post execution hook information
[ https://issues.apache.org/jira/browse/HIVE-27843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-27843 started by Ramesh Kumar Thangarajan. --- > Add QueryOperation to Hive proto logger for post execution hook information > --- > > Key: HIVE-27843 > URL: https://issues.apache.org/jira/browse/HIVE-27843 > Project: Hive > Issue Type: Task >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Labels: pull-request-available > > Currently the query operation type is missing in the proto logger > Add QueryOperation to Hive proto logger for post execution hook information -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (HIVE-27843) Add QueryOperation to Hive proto logger for post execution hook information
[ https://issues.apache.org/jira/browse/HIVE-27843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan resolved HIVE-27843. - Fix Version/s: 4.0.0 Resolution: Fixed > Add QueryOperation to Hive proto logger for post execution hook information > --- > > Key: HIVE-27843 > URL: https://issues.apache.org/jira/browse/HIVE-27843 > Project: Hive > Issue Type: Task >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Currently the query operation type is missing in the proto logger > Add QueryOperation to Hive proto logger for post execution hook information -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (HIVE-27687) Logger variable should be static final as its creation takes more time in query compilation
[ https://issues.apache.org/jira/browse/HIVE-27687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan resolved HIVE-27687. - Resolution: Fixed > Logger variable should be static final as its creation takes more time in > query compilation > --- > > Key: HIVE-27687 > URL: https://issues.apache.org/jira/browse/HIVE-27687 > Project: Hive > Issue Type: Task > Components: Hive >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: Screenshot 2023-09-12 at 5.03.31 PM.png > > > In query compilation, > LoggerFactory.getLogger() seems to take up more time. Some of the serde > classes use non static variable for Logger that forces the getLogger() call > for each of the class creation. > Making Logger variable static final will avoid this code path for every serde > class construction. -- This message was sent by Atlassian Jira (v8.20.10#820010)
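The effect described in HIVE-27687 can be sketched as follows. This is an illustration under stated assumptions, not the Hive patch: it uses java.util.logging.Logger from the JDK as a stand-in for the SLF4J LoggerFactory.getLogger() call named in the issue, and the classes PerInstanceSerde and SharedSerde, as well as the counting helper lookup(), are hypothetical. The counter only makes visible how often the logger lookup runs in each variant.

```java
import java.util.logging.Logger;

// Hypothetical sketch, not Hive code: contrasts an instance Logger field
// with a static final one by counting logger lookups.
final class LoggerFieldSketch {
    // Number of times the stand-in for LoggerFactory.getLogger() has run.
    static int lookups = 0;

    static Logger lookup(Class<?> c) {
        lookups++;
        return Logger.getLogger(c.getName());
    }

    // Before: the instance field forces a lookup on every construction.
    static class PerInstanceSerde {
        private final Logger log = lookup(PerInstanceSerde.class);
    }

    // After: the static final field is initialized once, at class initialization.
    static class SharedSerde {
        private static final Logger LOG = lookup(SharedSerde.class);
    }

    // Returns how many lookups were triggered by creating n instances.
    static int perInstanceLookups(int n) {
        int before = lookups;
        for (int i = 0; i < n; i++) {
            new PerInstanceSerde();
        }
        return lookups - before;
    }

    static int sharedLookups(int n) {
        int before = lookups;
        for (int i = 0; i < n; i++) {
            new SharedSerde();
        }
        return lookups - before;
    }

    public static void main(String[] args) {
        System.out.println("instance field: " + perInstanceLookups(1000) + " lookups");
        System.out.println("static final field: " + sharedLookups(1000) + " lookups");
    }
}
```

With the instance field, the lookup count grows with the number of objects created; with the static final field it is paid at most once per class, which is the code path the issue wants to avoid on every serde construction.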
[jira] [Commented] (HIVE-27687) Logger variable should be static final as its creation takes more time in query compilation
[ https://issues.apache.org/jira/browse/HIVE-27687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17788845#comment-17788845 ] Ramesh Kumar Thangarajan commented on HIVE-27687: - [~zabetak] Thanks, marked it. > Logger variable should be static final as its creation takes more time in > query compilation > --- > > Key: HIVE-27687 > URL: https://issues.apache.org/jira/browse/HIVE-27687 > Project: Hive > Issue Type: Task > Components: Hive >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: Screenshot 2023-09-12 at 5.03.31 PM.png > > > In query compilation, > LoggerFactory.getLogger() seems to take up more time. Some of the serde > classes use non static variable for Logger that forces the getLogger() call > for each of the class creation. > Making Logger variable static final will avoid this code path for every serde > class construction. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (HIVE-27687) Logger variable should be static final as its creation takes more time in query compilation
[ https://issues.apache.org/jira/browse/HIVE-27687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan closed HIVE-27687. --- > Logger variable should be static final as its creation takes more time in > query compilation > --- > > Key: HIVE-27687 > URL: https://issues.apache.org/jira/browse/HIVE-27687 > Project: Hive > Issue Type: Task > Components: Hive >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: Screenshot 2023-09-12 at 5.03.31 PM.png > > > In query compilation, > LoggerFactory.getLogger() seems to take up more time. Some of the serde > classes use non static variable for Logger that forces the getLogger() call > for each of the class creation. > Making Logger variable static final will avoid this code path for every serde > class construction. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-27687) Logger variable should be static final as its creation takes more time in query compilation
[ https://issues.apache.org/jira/browse/HIVE-27687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan updated HIVE-27687: Fix Version/s: 4.0.0 > Logger variable should be static final as its creation takes more time in > query compilation > --- > > Key: HIVE-27687 > URL: https://issues.apache.org/jira/browse/HIVE-27687 > Project: Hive > Issue Type: Task > Components: Hive >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: Screenshot 2023-09-12 at 5.03.31 PM.png > > > In query compilation, > LoggerFactory.getLogger() seems to take up more time. Some of the serde > classes use non static variable for Logger that forces the getLogger() call > for each of the class creation. > Making Logger variable static final will avoid this code path for every serde > class construction. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Reopened] (HIVE-27687) Logger variable should be static final as its creation takes more time in query compilation
[ https://issues.apache.org/jira/browse/HIVE-27687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan reopened HIVE-27687: - > Logger variable should be static final as its creation takes more time in > query compilation > --- > > Key: HIVE-27687 > URL: https://issues.apache.org/jira/browse/HIVE-27687 > Project: Hive > Issue Type: Task > Components: Hive >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Labels: pull-request-available > Attachments: Screenshot 2023-09-12 at 5.03.31 PM.png > > > In query compilation, > LoggerFactory.getLogger() seems to take up more time. Some of the serde > classes use non static variable for Logger that forces the getLogger() call > for each of the class creation. > Making Logger variable static final will avoid this code path for every serde > class construction. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (HIVE-27876) Incorrect query results on tables with ClusterBy & SortBy
[ https://issues.apache.org/jira/browse/HIVE-27876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17788476#comment-17788476 ]

Ramesh Kumar Thangarajan commented on HIVE-27876:
-------------------------------------------------

[~kkasa] I was looking into fixing this, but I have two questions to think about:

1. Should we expect the data to be bucketed and sorted globally after the inserts? If all 4 rows are inserted into the table in a single statement like the one below, then I guess the optimization works fine.

insert into test_bucket values (1, 'user1', 'dept1'), (2, 'user2', 'dept2'), (1, 'user1', 'dept1'), (2, 'user2', 'dept2');

2. Map-side group by is still useful even if the data is only sorted locally within a bucket; it becomes a problem only when we remove the ReduceSinkOperator. In that case, can we just skip removing the ReduceSinkOperator as part of the optimization? Would the optimization still give any real improvement if the ReduceSinkOperator is kept?

> Incorrect query results on tables with ClusterBy & SortBy
> ---------------------------------------------------------
>
> Key: HIVE-27876
> URL: https://issues.apache.org/jira/browse/HIVE-27876
> Project: Hive
> Issue Type: Bug
> Reporter: Naresh P R
> Assignee: Ramesh Kumar Thangarajan
> Priority: Major
>
> Repro:
>
> {code:java}
> create external table test_bucket(age int, name string, dept string)
> clustered by (age, name) sorted by (age asc, name asc) into 2 buckets stored as orc;
> insert into test_bucket values (1, 'user1', 'dept1'), (2, 'user2', 'dept2');
> insert into test_bucket values (1, 'user1', 'dept1'), (2, 'user2', 'dept2');
> //empty wrong results
> select age, name, count(*) from test_bucket group by age, name having count(*) > 1;
> +------+-------+------+
> | age  | name  | _c2  |
> +------+-------+------+
> +------+-------+------+
> // Workaround
> set hive.map.aggr=false;
> select age, name, count(*) from test_bucket group by age, name having count(*) > 1;
> +------+--------+------+
> | age  | name   | _c2  |
> +------+--------+------+
> | 1    | user1  | 2    |
> | 2    | user2  | 2    |
> +------+--------+------+
> {code}
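One way to picture the second question is the simplified model below. It is only a sketch under stated assumptions, not Hive's operator code: each insert statement is treated as one locally sorted file, partialCounts plays the role of the map-side group by, and the "merge" step stands in for what the ReduceSinkOperator and the reduce-side group by would do. All class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class MapSideAggDemo {
    // Partial aggregation over one file: count rows per (age, name) key.
    static Map<String, Long> partialCounts(List<String> fileRows) {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (String key : fileRows) {
            counts.merge(key, 1L, Long::sum);
        }
        return counts;
    }

    // HAVING count(*) > 1 applied directly to each file's partial result,
    // i.e. with no merge step between the files.
    static List<String> havingWithoutMerge(List<List<String>> files) {
        List<String> out = new ArrayList<>();
        for (List<String> file : files) {
            partialCounts(file).forEach((k, c) -> {
                if (c > 1) {
                    out.add(k + "=" + c);
                }
            });
        }
        return out;
    }

    // Same filter, but partial counts are merged across files first.
    static List<String> havingWithMerge(List<List<String>> files) {
        Map<String, Long> merged = new LinkedHashMap<>();
        for (List<String> file : files) {
            partialCounts(file).forEach((k, c) -> merged.merge(k, c, Long::sum));
        }
        List<String> out = new ArrayList<>();
        merged.forEach((k, c) -> {
            if (c > 1) {
                out.add(k + "=" + c);
            }
        });
        return out;
    }

    // The two insert statements from the repro, modeled as two files.
    static List<List<String>> reproFiles() {
        List<String> firstInsert = List.of("1,user1", "2,user2");
        List<String> secondInsert = List.of("1,user1", "2,user2");
        return List.of(firstInsert, secondInsert);
    }

    public static void main(String[] args) {
        System.out.println(havingWithoutMerge(reproFiles())); // prints []
        System.out.println(havingWithMerge(reproFiles()));    // prints [1,user1=2, 2,user2=2]
    }
}
```

In this model the per-file counts are all 1, so filtering them without a merge returns nothing, which matches the empty result in the repro; merging first gives the two expected rows. Whether the actual plan behaves exactly this way would need to be confirmed from the EXPLAIN output.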
[jira] [Assigned] (HIVE-27876) Incorrect query results on tables with ClusterBy & SortBy
[ https://issues.apache.org/jira/browse/HIVE-27876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan reassigned HIVE-27876: --- Assignee: Ramesh Kumar Thangarajan > Incorrect query results on tables with ClusterBy & SortBy > - > > Key: HIVE-27876 > URL: https://issues.apache.org/jira/browse/HIVE-27876 > Project: Hive > Issue Type: Bug >Reporter: Naresh P R >Assignee: Ramesh Kumar Thangarajan >Priority: Major > > Repro: > > {code:java} > create external table test_bucket(age int, name string, dept string) > clustered by (age, name) sorted by (age asc, name asc) into 2 buckets stored > as orc; > insert into test_bucket values (1, 'user1', 'dept1'), ( 2, 'user2' , 'dept2'); > insert into test_bucket values (1, 'user1', 'dept1'), ( 2, 'user2' , 'dept2'); > //empty wrong results > select age, name, count(*) from test_bucket group by age, name having > count(*) > 1; > +--+---+--+ > | age | name | _c2 | > +--+---+--+ > +--+---+--+ > // Workaround > set hive.map.aggr=false; > select age, name, count(*) from test_bucket group by age, name having > count(*) > 1; > +--++--+ > | age | name | _c2 | > +--++--+ > | 1 | user1 | 2 | > | 2 | user2 | 2 | > +--++--+ {code} > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (HIVE-27687) Logger variable should be static final as its creation takes more time in query compilation
[ https://issues.apache.org/jira/browse/HIVE-27687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan resolved HIVE-27687. - Resolution: Fixed > Logger variable should be static final as its creation takes more time in > query compilation > --- > > Key: HIVE-27687 > URL: https://issues.apache.org/jira/browse/HIVE-27687 > Project: Hive > Issue Type: Task > Components: Hive >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Labels: pull-request-available > Attachments: Screenshot 2023-09-12 at 5.03.31 PM.png > > > In query compilation, > LoggerFactory.getLogger() seems to take up more time. Some of the serde > classes use non static variable for Logger that forces the getLogger() call > for each of the class creation. > Making Logger variable static final will avoid this code path for every serde > class construction. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HIVE-27843) Add QueryOperation to Hive proto logger for post execution hook information
Ramesh Kumar Thangarajan created HIVE-27843:
-------------------------------------------

Summary: Add QueryOperation to Hive proto logger for post execution hook information
Key: HIVE-27843
URL: https://issues.apache.org/jira/browse/HIVE-27843
Project: Hive
Issue Type: Task
Reporter: Ramesh Kumar Thangarajan
Assignee: Ramesh Kumar Thangarajan

Currently the query operation type is missing from the proto logger. Add QueryOperation to the Hive proto logger for post-execution hook information.
[jira] [Updated] (HIVE-27773) get_valid_write_ids is being called multiple times for a single query
[ https://issues.apache.org/jira/browse/HIVE-27773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan updated HIVE-27773: Description: Looking at the below logs suggest that the get_valid_write_ids is not cached for a single query for a single table. It is being called multiple times across different phases in the compilation of the query. We should verify if we can safely cache and re use the results. That way we can avoid around 40-50 ms out of 678ms compilation time. {code:java} 2023-09-19T02:55:06,940 INFO [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] ql.Driver: Compiling command(queryId=rameshkumar_20230919025506_b005cc57-1717-4798-b8da-b502aa7ca3d6): 2023-09-19T02:55:06,967 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: 2023-09-19T02:55:06,979 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: 2023-09-19T02:55:06,980 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: 2023-09-19T02:55:06,986 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: 2023-09-19T02:55:06,988 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: 2023-09-19T02:55:06,995 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: 2023-09-19T02:55:06,997 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: 2023-09-19T02:55:07,007 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: 2023-09-19T02:55:07,009 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: 2023-09-19T02:55:07,017 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: 2023-09-19T02:55:07,018 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: 
2023-09-19T02:55:07,026 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: 2023-09-19T02:55:07,059 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: 2023-09-19T02:55:07,068 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: 2023-09-19T02:55:07,618 INFO [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] ql.Driver: Completed compiling command(queryId=rameshkumar_20230919025506_b005cc57-1717-4798-b8da-b502aa7ca3d6); Time taken: 0.678 seconds{code} was: Looking at the below logs suggest that the get_valid_write_ids is not cached for a single query for a single table. It is being called multiple times across different phases in the compilation of the query. We should verify if we can safely cache and re use the results. That way we can avoid at the 40-50 ms out of 678ms compilation time. {code:java} 2023-09-19T02:55:06,940 INFO [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] ql.Driver: Compiling command(queryId=rameshkumar_20230919025506_b005cc57-1717-4798-b8da-b502aa7ca3d6): 2023-09-19T02:55:06,967 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: 2023-09-19T02:55:06,979 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: 2023-09-19T02:55:06,980 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: 2023-09-19T02:55:06,986 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: 2023-09-19T02:55:06,988 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: 2023-09-19T02:55:06,995 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: 2023-09-19T02:55:06,997 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: 2023-09-19T02:55:07,007 DEBUG 
[fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: 2023-09-19T02:55:07,009 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: 2023-09-19T02:55:07,017 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: 2023-09-19T02:55:07,018 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: 2023-09-19T02:55:07,026 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: 2023-09-19T02:55:07,059 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: 2023-09-19T02:55:07,068 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: 2023-09-19T02:55:07,618 INFO [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] ql.Driver: Completed
[jira] [Work started] (HIVE-27773) get_valid_write_ids is being called multiple times for a single query
[ https://issues.apache.org/jira/browse/HIVE-27773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on HIVE-27773 started by Ramesh Kumar Thangarajan.
-------------------------------------------------------
> get_valid_write_ids is being called multiple times for a single query
> ---------------------------------------------------------------------
>
> Key: HIVE-27773
> URL: https://issues.apache.org/jira/browse/HIVE-27773
> Project: Hive
> Issue Type: Task
> Reporter: Ramesh Kumar Thangarajan
> Assignee: Ramesh Kumar Thangarajan
> Priority: Major
>
> The logs below suggest that the result of get_valid_write_ids is not cached within a single query for a single table. It is called multiple times across different phases of query compilation. We should verify whether we can safely cache and reuse the results. That way we can avoid around 40-50 ms out of the 678 ms compilation time.
>
> {code:java}
> 2023-09-19T02:55:06,940 INFO [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] ql.Driver: Compiling command(queryId=rameshkumar_20230919025506_b005cc57-1717-4798-b8da-b502aa7ca3d6):
> 2023-09-19T02:55:06,967 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: from=org.apache.hadoop.hive.metastore.RetryingHMSHandler>
> 2023-09-19T02:55:06,979 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: start=1695117306967 end=1695117306979 duration=12 from=org.apache.hadoop.hive.metastore.RetryingHMSHandler retryCount=0 error=false>
> 2023-09-19T02:55:06,980 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: from=org.apache.hadoop.hive.metastore.RetryingHMSHandler>
> 2023-09-19T02:55:06,986 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: start=1695117306980 end=1695117306986 duration=6 from=org.apache.hadoop.hive.metastore.RetryingHMSHandler retryCount=0 error=false>
> 2023-09-19T02:55:06,988 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: from=org.apache.hadoop.hive.metastore.RetryingHMSHandler>
> 2023-09-19T02:55:06,995 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: start=1695117306988 end=1695117306995 duration=7 from=org.apache.hadoop.hive.metastore.RetryingHMSHandler retryCount=0 error=false>
> 2023-09-19T02:55:06,997 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: from=org.apache.hadoop.hive.metastore.RetryingHMSHandler>
> 2023-09-19T02:55:07,007 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: start=1695117306997 end=1695117307007 duration=10 from=org.apache.hadoop.hive.metastore.RetryingHMSHandler retryCount=0 error=false>
> 2023-09-19T02:55:07,009 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: from=org.apache.hadoop.hive.metastore.RetryingHMSHandler>
> 2023-09-19T02:55:07,017 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: start=1695117307009 end=1695117307017 duration=8 from=org.apache.hadoop.hive.metastore.RetryingHMSHandler retryCount=0 error=false>
> 2023-09-19T02:55:07,018 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: from=org.apache.hadoop.hive.metastore.RetryingHMSHandler>
> 2023-09-19T02:55:07,026 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: start=1695117307018 end=1695117307026 duration=8 from=org.apache.hadoop.hive.metastore.RetryingHMSHandler retryCount=0 error=false>
> 2023-09-19T02:55:07,059 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: from=org.apache.hadoop.hive.metastore.RetryingHMSHandler>
> 2023-09-19T02:55:07,068 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: start=1695117307059 end=1695117307068 duration=9 from=org.apache.hadoop.hive.metastore.RetryingHMSHandler retryCount=0 error=false>
> 2023-09-19T02:55:07,618 INFO [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] ql.Driver: Completed compiling command(queryId=rameshkumar_20230919025506_b005cc57-1717-4798-b8da-b502aa7ca3d6); Time taken: 0.678 seconds
> {code}
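The caching idea in HIVE-27773 could look roughly like the sketch below. It is a minimal illustration, not Hive code: QueryScopedWriteIdCache is a hypothetical class, the metastore call is replaced by a plain function that counts invocations, and the write-id list is represented by a placeholder string.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

final class WriteIdCacheDemo {
    // Hypothetical cache that lives for the duration of one query compilation
    // and is keyed by the fully qualified table name.
    static final class QueryScopedWriteIdCache {
        private final Map<String, String> cache = new HashMap<>();
        private final Function<String, String> metastoreLookup;

        QueryScopedWriteIdCache(Function<String, String> metastoreLookup) {
            this.metastoreLookup = metastoreLookup;
        }

        // Only the first request for a table reaches the metastore lookup.
        String getValidWriteIds(String table) {
            return cache.computeIfAbsent(table, metastoreLookup);
        }
    }

    // Simulates several compilation phases asking for the same table and
    // returns how many times the (stand-in) metastore was actually called.
    static int metastoreCallsFor(int lookups) {
        AtomicInteger calls = new AtomicInteger();
        QueryScopedWriteIdCache cache = new QueryScopedWriteIdCache(table -> {
            calls.incrementAndGet();
            return table + ":placeholder-write-id-list"; // placeholder value, not a real format
        });
        for (int i = 0; i < lookups; i++) {
            cache.getValidWriteIds("default.t1");
        }
        return calls.get();
    }

    public static void main(String[] args) {
        System.out.println("metastore calls for 5 lookups: " + metastoreCallsFor(5));
    }
}
```

The sketch deliberately scopes the cache to a single query. As the issue says, it still has to be verified that reusing the result across compilation phases is safe; the sketch does not address that question.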
[jira] [Created] (HIVE-27773) get_valid_write_ids is being called multiple times for a single query
Ramesh Kumar Thangarajan created HIVE-27773: --- Summary: get_valid_write_ids is being called multiple times for a single query Key: HIVE-27773 URL: https://issues.apache.org/jira/browse/HIVE-27773 Project: Hive Issue Type: Task Reporter: Ramesh Kumar Thangarajan Assignee: Ramesh Kumar Thangarajan Looking at the below logs suggest that the get_valid_write_ids is not cached for a single query for a single table. It is being called multiple times across different phases in the compilation of the query. We should verify if we can safely cache and re use the results. That way we can avoid at the 40-50 ms out of 678ms compilation time. {code:java} 2023-09-19T02:55:06,940 INFO [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] ql.Driver: Compiling command(queryId=rameshkumar_20230919025506_b005cc57-1717-4798-b8da-b502aa7ca3d6): 2023-09-19T02:55:06,967 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: 2023-09-19T02:55:06,979 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: 2023-09-19T02:55:06,980 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: 2023-09-19T02:55:06,986 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: 2023-09-19T02:55:06,988 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: 2023-09-19T02:55:06,995 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: 2023-09-19T02:55:06,997 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: 2023-09-19T02:55:07,007 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: 2023-09-19T02:55:07,009 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: 2023-09-19T02:55:07,017 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: 
2023-09-19T02:55:07,018 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: 2023-09-19T02:55:07,026 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: 2023-09-19T02:55:07,059 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: 2023-09-19T02:55:07,068 DEBUG [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] metrics.PerfLogger: 2023-09-19T02:55:07,618 INFO [fa0fa087-7e2c-45b8-bd27-b94fbbe23e49 Listener at 0.0.0.0/50501] ql.Driver: Completed compiling command(queryId=rameshkumar_20230919025506_b005cc57-1717-4798-b8da-b502aa7ca3d6); Time taken: 0.678 seconds{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HIVE-27751) Log Query Compilation summary in an accumulated way
Ramesh Kumar Thangarajan created HIVE-27751:
-------------------------------------------

Summary: Log Query Compilation summary in an accumulated way
Key: HIVE-27751
URL: https://issues.apache.org/jira/browse/HIVE-27751
Project: Hive
Issue Type: Task
Components: Hive
Reporter: Ramesh Kumar Thangarajan
Assignee: Ramesh Kumar Thangarajan

A query compilation summary is very useful for collecting and reading all compile-time measurements in a single place. It also helps when debugging performance issues in the query compilation phase and when reporting and comparing results across runs.
[jira] [Updated] (HIVE-27687) Logger variable should be static final as its creation takes more time in query compilation
[ https://issues.apache.org/jira/browse/HIVE-27687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan updated HIVE-27687: Attachment: Screenshot 2023-09-12 at 5.03.31 PM.png > Logger variable should be static final as its creation takes more time in > query compilation > --- > > Key: HIVE-27687 > URL: https://issues.apache.org/jira/browse/HIVE-27687 > Project: Hive > Issue Type: Task > Components: Hive >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Attachments: Screenshot 2023-09-12 at 5.03.31 PM.png > > > In query compilation, > LoggerFactory.getLogger() seems to take up more time. Some of the serde > classes use non static variable for Logger that forces the getLogger() call > for each of the class creation. > Making Logger variable static final will avoid this code path for every serde > class construction. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HIVE-27687) Logger variable should be static final as its creation takes more time in query compilation
Ramesh Kumar Thangarajan created HIVE-27687:
-------------------------------------------

Summary: Logger variable should be static final as its creation takes more time in query compilation
Key: HIVE-27687
URL: https://issues.apache.org/jira/browse/HIVE-27687
Project: Hive
Issue Type: Task
Components: Hive
Reporter: Ramesh Kumar Thangarajan
Assignee: Ramesh Kumar Thangarajan

During query compilation, LoggerFactory.getLogger() seems to take up a noticeable amount of time. Some of the serde classes declare the Logger as a non-static variable, which forces a getLogger() call every time an instance of the class is created. Making the Logger variable static final avoids this code path on every serde class construction.
[jira] [Assigned] (HIVE-23127) Replace listPartitionsByExpr with GetPartitionsWithSpecs in Partition pruner
[ https://issues.apache.org/jira/browse/HIVE-23127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan reassigned HIVE-23127: --- Assignee: Ramesh Kumar Thangarajan (was: Vineet Garg) > Replace listPartitionsByExpr with GetPartitionsWithSpecs in Partition pruner > > > Key: HIVE-23127 > URL: https://issues.apache.org/jira/browse/HIVE-23127 > Project: Hive > Issue Type: Task > Components: HiveServer2 >Reporter: Vineet Garg >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Attachments: HIVE-23127.1.patch, HIVE-23127.2.patch > > > GetPartitionsWithSpecs reduces data transfer by deduplicating storage > descriptor -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (HIVE-27293) Vectorization: Incorrect results with nvl for ORC table
[ https://issues.apache.org/jira/browse/HIVE-27293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ramesh Kumar Thangarajan resolved HIVE-27293.
--------------------------------------------
Resolution: Fixed

> Vectorization: Incorrect results with nvl for ORC table
> -------------------------------------------------------
>
> Key: HIVE-27293
> URL: https://issues.apache.org/jira/browse/HIVE-27293
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 4.0.0-alpha-2
> Reporter: Riju Trivedi
> Assignee: Ramesh Kumar Thangarajan
> Priority: Major
> Labels: pull-request-available
> Attachments: esource.txt, vectorization_nvl.q
>
> Attached repro.q file and data file used to reproduce the issue.
> {code:java}
> Insert overwrite table etarget
> select mt.*, floor(rand() * 1) as bdata_no from (select
>   nvl(np.client_id,' '),nvl(np.id_enddate,cast(0 as decimal(10,0))),
>   nvl(np.client_gender,' '),nvl(np.birthday,cast(0 as decimal(10,0))),
>   nvl(np.nationality,' '),nvl(np.address_zipcode,' '),
>   nvl(np.income,cast(0 as decimal(15,2))),nvl(np.address,' '),
>   nvl(np.part_date,cast(0 as int))
>   from (select * from esource where part_date = 20230414) np) mt;
> {code}
> Outcome:
> {code:java}
> select client_id,birthday,income from etarget;
> 15678 0 0.00
> 67891 19313 -1.00
> 12345 0 0.00
> {code}
> Expected Result:
> {code:java}
> select client_id,birthday,income from etarget;
> 12345 19613 -1.00
> 67891 19313 -1.00
> 15678 0 0.00
> {code}
> Disabling hive.vectorized.use.vectorized.input.format produces correct output.
[jira] [Assigned] (HIVE-27293) Vectorization: Incorrect results with nvl for ORC table
[ https://issues.apache.org/jira/browse/HIVE-27293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan reassigned HIVE-27293: --- Assignee: Ramesh Kumar Thangarajan > Vectorization: Incorrect results with nvl for ORC table > --- > > Key: HIVE-27293 > URL: https://issues.apache.org/jira/browse/HIVE-27293 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 4.0.0-alpha-2 >Reporter: Riju Trivedi >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Attachments: esource.txt, vectorization_nvl.q > > > Attached repro.q file and data file used to reproduce the issue. > {code:java} > Insert overwrite table etarget > select mt.*, floor(rand() * 1) as bdata_no from (select nvl(np.client_id,' > '),nvl(np.id_enddate,cast(0 as decimal(10,0))),nvl(np.client_gender,' > '),nvl(np.birthday,cast(0 as decimal(10,0))),nvl(np.nationality,' > '),nvl(np.address_zipcode,' '),nvl(np.income,cast(0 as > decimal(15,2))),nvl(np.address,' '),nvl(np.part_date,cast(0 as int)) from > (select * from esource where part_date = 20230414) np) mt; > {code} > Outcome: > {code:java} > select client_id,birthday,income from etarget; > 15678 0 0.00 > 67891 19313 -1.00 > 12345 0 0.00{code} > Expected Result : > {code:java} > select client_id,birthday,income from etarget; > 12345 19613 -1.00 > 67891 19313 -1.00 > 15678 0 0.00{code} > Disabling hive.vectorized.use.vectorized.input.format produces correct output. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (HIVE-27244) Iceberg: Implement LOAD data for unpartitioned table via Append API
[ https://issues.apache.org/jira/browse/HIVE-27244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan reassigned HIVE-27244: --- Assignee: Ramesh Kumar Thangarajan > Iceberg: Implement LOAD data for unpartitioned table via Append API > --- > > Key: HIVE-27244 > URL: https://issues.apache.org/jira/browse/HIVE-27244 > Project: Hive > Issue Type: Sub-task >Reporter: Ayush Saxena >Assignee: Ramesh Kumar Thangarajan >Priority: Major > > Use Append API for Iceberg Load data command, Same as migration use case -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (HIVE-26989) Fix predicate pushdown for Timestamp with TZ
[ https://issues.apache.org/jira/browse/HIVE-26989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan resolved HIVE-26989. - Resolution: Fixed > Fix predicate pushdown for Timestamp with TZ > > > Key: HIVE-26989 > URL: https://issues.apache.org/jira/browse/HIVE-26989 > Project: Hive > Issue Type: Task > Components: Hive, Iceberg integration >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > Running a query which is filtering for {{TIMESTAMP WITH LOCAL TIME ZONE}} > returns the correct results but the predicate is not pushed to Iceberg. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (HIVE-26989) Fix predicate pushdown for Timestamp with TZ
[ https://issues.apache.org/jira/browse/HIVE-26989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan reassigned HIVE-26989: --- > Fix predicate pushdown for Timestamp with TZ > > > Key: HIVE-26989 > URL: https://issues.apache.org/jira/browse/HIVE-26989 > Project: Hive > Issue Type: Task > Components: Hive, Iceberg integration >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > > Running a query which is filtering for {{TIMESTAMP WITH LOCAL TIME ZONE}} > returns the correct results but the predicate is not pushed to Iceberg. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (HIVE-26837) CTLT with hive.create.as.external.legacy as true creates managed table instead of external table
[ https://issues.apache.org/jira/browse/HIVE-26837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ramesh Kumar Thangarajan resolved HIVE-26837.
--------------------------------------------
Resolution: Fixed

> CTLT with hive.create.as.external.legacy as true creates managed table instead of external table
> ------------------------------------------------------------------------------------------------
>
> Key: HIVE-26837
> URL: https://issues.apache.org/jira/browse/HIVE-26837
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2
> Reporter: Ramesh Kumar Thangarajan
> Assignee: Ramesh Kumar Thangarajan
> Priority: Major
> Labels: pull-request-available
> Time Spent: 2h 20m
> Remaining Estimate: 0h
>
> When CTLT (CREATE TABLE ... LIKE) is used with the config hive.create.as.external.legacy=true, it still creates a managed table by default. Use the statements below to reproduce.
> create external table test_ext(empno int, name string) partitioned by(dept string) stored as orc;
> desc formatted test_ext;
> set hive.create.as.external.legacy=true;
> create table test_external like test_ext;
> desc formatted test_external;
[jira] [Updated] (HIVE-26837) CTLT with hive.create.as.external.legacy as true creates managed table instead of external table
[ https://issues.apache.org/jira/browse/HIVE-26837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan updated HIVE-26837: Component/s: HiveServer2 > CTLT with hive.create.as.external.legacy as true creates managed table > instead of external table > > > Key: HIVE-26837 > URL: https://issues.apache.org/jira/browse/HIVE-26837 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > > When CTLT is used with the config hive.create.as.external.legacy=true, it > still creates managed table by default. Use below to reproduce. > create external table test_ext(empno int, name string) partitioned by(dept > string) stored as orc; > desc formatted test_ext; > set hive.create.as.external.legacy=true; > create table test_external like test_ext; > desc formatted test_external; -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-26837) CTLT with hive.create.as.external.legacy as true creates managed table instead of external table
[ https://issues.apache.org/jira/browse/HIVE-26837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan updated HIVE-26837: Description: When CTLT is used with the config hive.create.as.external.legacy=true, it still creates managed table by default. Use below to reproduce. create external table test_ext(empno int, name string) partitioned by(dept string) stored as orc; desc formatted test_ext; set hive.create.as.external.legacy=true; create table test_external like test_ext; desc formatted test_external; was: When CTLT is used with the config hive.create.as.external.legacy=true, it still creates managed table by default. Use below to reproduce. create table test_mm(empno int, name string) partitioned by(dept string) stored as orc tblproperties('transactional'='true', 'transactional_properties'='default'); desc formatted test_mm; set hive.create.as.external.legacy=true; create table test_external like test_mm; desc formatted test_external; > CTLT with hive.create.as.external.legacy as true creates managed table > instead of external table > > > Key: HIVE-26837 > URL: https://issues.apache.org/jira/browse/HIVE-26837 > Project: Hive > Issue Type: Bug >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > > When CTLT is used with the config hive.create.as.external.legacy=true, it > still creates managed table by default. Use below to reproduce. > create external table test_ext(empno int, name string) partitioned by(dept > string) stored as orc; > desc formatted test_ext; > set hive.create.as.external.legacy=true; > create table test_external like test_ext; > desc formatted test_external; -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (HIVE-26837) CTLT with hive.create.as.external.legacy as true creates managed table instead of external table
[ https://issues.apache.org/jira/browse/HIVE-26837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan reassigned HIVE-26837: --- > CTLT with hive.create.as.external.legacy as true creates managed table > instead of external table > > > Key: HIVE-26837 > URL: https://issues.apache.org/jira/browse/HIVE-26837 > Project: Hive > Issue Type: Bug >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > > When CTLT is used with the config hive.create.as.external.legacy=true, it > still creates managed table by default > create table test_mm(empno int, name string) partitioned by(dept string) > stored as orc tblproperties('transactional'='true', > 'transactional_properties'='default'); > desc formatted test_mm; > set hive.create.as.external.legacy=true; > create table test_external like test_mm; > desc formatted test_external; -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-26837) CTLT with hive.create.as.external.legacy as true creates managed table instead of external table
[ https://issues.apache.org/jira/browse/HIVE-26837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan updated HIVE-26837: Description: When CTLT is used with the config hive.create.as.external.legacy=true, it still creates managed table by default. Use below to reproduce. create table test_mm(empno int, name string) partitioned by(dept string) stored as orc tblproperties('transactional'='true', 'transactional_properties'='default'); desc formatted test_mm; set hive.create.as.external.legacy=true; create table test_external like test_mm; desc formatted test_external; was: When CTLT is used with the config hive.create.as.external.legacy=true, it still creates managed table by default create table test_mm(empno int, name string) partitioned by(dept string) stored as orc tblproperties('transactional'='true', 'transactional_properties'='default'); desc formatted test_mm; set hive.create.as.external.legacy=true; create table test_external like test_mm; desc formatted test_external; > CTLT with hive.create.as.external.legacy as true creates managed table > instead of external table > > > Key: HIVE-26837 > URL: https://issues.apache.org/jira/browse/HIVE-26837 > Project: Hive > Issue Type: Bug >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > > When CTLT is used with the config hive.create.as.external.legacy=true, it > still creates managed table by default. Use below to reproduce. > create table test_mm(empno int, name string) partitioned by(dept string) > stored as orc tblproperties('transactional'='true', > 'transactional_properties'='default'); > desc formatted test_mm; > set hive.create.as.external.legacy=true; > create table test_external like test_mm; > desc formatted test_external; -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (HIVE-26208) Exception in Vectorization with Decimal64 to Decimal casting
[ https://issues.apache.org/jira/browse/HIVE-26208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17556496#comment-17556496 ] Ramesh Kumar Thangarajan commented on HIVE-26208: - Thanks [~scarlin] for the patch, I just merged the patch. > Exception in Vectorization with Decimal64 to Decimal casting > > > Key: HIVE-26208 > URL: https://issues.apache.org/jira/browse/HIVE-26208 > Project: Hive > Issue Type: Bug > Components: Vectorization >Reporter: Steve Carlin >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > The following query fails: > > {code:java} > select count(*) > from > int_txt > where > (( 1.0 * i) / ( 1.0 * i)) > 1.2; > {code} > with the following exception: > > {code:java} > Caused by: java.lang.ClassCastException: > org.apache.hadoop.hive.ql.exec.vector.Decimal64ColumnVector cannot be cast to > org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector > at > org.apache.hadoop.hive.ql.exec.vector.expressions.gen.DecimalColDivideDecimalColumn.evaluate(DecimalColDivideDecimalColumn.java:59) > ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:334) > ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.vector.expressions.gen.FilterDecimalColGreaterDecimalScalar.evaluate(FilterDecimalColGreaterDecimalScalar.java:62) > ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.process(VectorFilterOperator.java:125) > ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:919) > ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:171) > ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.deliverVectorizedRowBatch(VectorMapOperator.java:809) > ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:900) > ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] > ... 19 more > {code} > > -- This message was sent by Atlassian Jira (v8.20.7#820007)
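On the ClassCastException reported in HIVE-26208: a decimal64 value is a decimal kept as an unscaled 64-bit long plus a scale, so a vector of such values cannot simply be cast to a vector of full decimals. The sketch below shows the general idea of checking the runtime type and converting instead of casting blindly. It is only an illustration and not the merged patch: ColVec, Decimal64Vec, DecimalVec and asDecimal are hypothetical stand-ins, not Hive's vector classes.

```java
import java.math.BigDecimal;

class Decimal64Casts {
    // Stand-in types for illustration only; these are NOT Hive's classes.
    abstract static class ColVec {}

    static class Decimal64Vec extends ColVec {
        final long[] unscaled; // each entry is value * 10^scale
        final int scale;
        Decimal64Vec(long[] unscaled, int scale) {
            this.unscaled = unscaled;
            this.scale = scale;
        }
    }

    static class DecimalVec extends ColVec {
        final BigDecimal[] values;
        DecimalVec(BigDecimal[] values) {
            this.values = values;
        }
    }

    // Returns the input as a DecimalVec, converting decimal64 data when needed
    // rather than assuming the runtime type.
    static DecimalVec asDecimal(ColVec in) {
        if (in instanceof DecimalVec) {
            return (DecimalVec) in;
        }
        if (in instanceof Decimal64Vec) {
            Decimal64Vec d64 = (Decimal64Vec) in;
            BigDecimal[] out = new BigDecimal[d64.unscaled.length];
            for (int i = 0; i < out.length; i++) {
                // BigDecimal.valueOf(unscaledValue, scale) rebuilds the decimal.
                out[i] = BigDecimal.valueOf(d64.unscaled[i], d64.scale);
            }
            return new DecimalVec(out);
        }
        throw new IllegalArgumentException("Unsupported vector type: " + in.getClass().getName());
    }

    public static void main(String[] args) {
        DecimalVec v = asDecimal(new Decimal64Vec(new long[] {120L, 250L}, 2));
        System.out.println(v.values[0] + " " + v.values[1]); // prints "1.20 2.50"
    }
}
```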
[jira] [Resolved] (HIVE-26269) Class cast exception when vectorization is enabled for certain case when cases
[ https://issues.apache.org/jira/browse/HIVE-26269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan resolved HIVE-26269. - Resolution: Fixed > Class cast exception when vectorization is enabled for certain case when cases > -- > > Key: HIVE-26269 > URL: https://issues.apache.org/jira/browse/HIVE-26269 > Project: Hive > Issue Type: Bug >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Labels: pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > > Class cast exception when vectorization is enabled for certain case when cases -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Resolved] (HIVE-26239) Shutdown Hash table load executor service threads when they are interrupted
[ https://issues.apache.org/jira/browse/HIVE-26239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan resolved HIVE-26239. - Resolution: Fixed > Shutdown Hash table load executor service threads when they are interrupted > --- > > Key: HIVE-26239 > URL: https://issues.apache.org/jira/browse/HIVE-26239 > Project: Hive > Issue Type: Bug >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Labels: pull-request-available > Time Spent: 2h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
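HIVE-26239 carries no description, so the following is only a generic sketch of what its title describes: when the thread waiting on an executor service is interrupted, it stops the worker threads instead of leaving them running. InterruptAwareLoader and loadAll are hypothetical names, and the sleep is a placeholder for real hash table loading work.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class InterruptAwareLoader {
    // Submits `tasks` long-running jobs and waits for all of them.
    // Returns false if the waiting thread was interrupted; in that case the
    // pool is shut down so no worker thread is left behind.
    static boolean loadAll(ExecutorService pool, int tasks) {
        CountDownLatch done = new CountDownLatch(tasks);
        for (int i = 0; i < tasks; i++) {
            pool.submit(() -> {
                try {
                    Thread.sleep(60_000); // placeholder for loading one partition
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // keep the interrupt visible
                } finally {
                    done.countDown();
                }
            });
        }
        try {
            done.await();
            pool.shutdown();
            return true;
        } catch (InterruptedException e) {
            pool.shutdownNow();                 // interrupts running workers, drops queued tasks
            Thread.currentThread().interrupt(); // restore the caller's interrupt status
            return false;
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        Thread.currentThread().interrupt(); // simulate an interrupt arriving during the load
        boolean finished = loadAll(pool, 4);
        System.out.println("finished=" + finished + " shutdown=" + pool.isShutdown());
    }
}
```

Restoring the interrupt status after shutdownNow() lets callers further up the stack still observe that the thread was interrupted.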
[jira] [Assigned] (HIVE-26269) Class cast exception when vectorization is enabled for certain case when cases
[ https://issues.apache.org/jira/browse/HIVE-26269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan reassigned HIVE-26269: --- > Class cast exception when vectorization is enabled for certain case when cases > -- > > Key: HIVE-26269 > URL: https://issues.apache.org/jira/browse/HIVE-26269 > Project: Hive > Issue Type: Bug >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > > Class cast exception when vectorization is enabled for certain case when cases -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Assigned] (HIVE-26241) Support Geospatial datatypes
[ https://issues.apache.org/jira/browse/HIVE-26241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan reassigned HIVE-26241: --- > Support Geospatial datatypes > > > Key: HIVE-26241 > URL: https://issues.apache.org/jira/browse/HIVE-26241 > Project: Hive > Issue Type: Sub-task >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Updated] (HIVE-26239) Shutdown Hash table load executor service threads when they are interrupted
[ https://issues.apache.org/jira/browse/HIVE-26239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan updated HIVE-26239: Issue Type: Bug (was: Task) > Shutdown Hash table load executor service threads when they are interrupted > --- > > Key: HIVE-26239 > URL: https://issues.apache.org/jira/browse/HIVE-26239 > Project: Hive > Issue Type: Bug >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Assigned] (HIVE-26239) Shutdown Hash table load executor service threads when they are interrupted
[ https://issues.apache.org/jira/browse/HIVE-26239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan reassigned HIVE-26239: --- > Shutdown Hash table load executor service threads when they are interrupted > --- > > Key: HIVE-26239 > URL: https://issues.apache.org/jira/browse/HIVE-26239 > Project: Hive > Issue Type: Task >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Resolved] (HIVE-26219) Encapsulate the API change for FileUtils.isActionPermittedForFileHierarchy
[ https://issues.apache.org/jira/browse/HIVE-26219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan resolved HIVE-26219. - Resolution: Fixed > Encapsulate the API change for FileUtils.isActionPermittedForFileHierarchy > -- > > Key: HIVE-26219 > URL: https://issues.apache.org/jira/browse/HIVE-26219 > Project: Hive > Issue Type: Task >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > Encapsulate the API change for FileUtils.isActionPermittedForFileHierarchy > so other services like Ranger can still use the old API -- This message was sent by Atlassian Jira (v8.20.7#820007)
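One common way to encapsulate an API change so that existing callers keep working is to retain the old method signature as a thin wrapper around the new one. The sketch below illustrates that pattern only; it does not reproduce the real signature of FileUtils.isActionPermittedForFileHierarchy. The class PermissionChecks, the method isPermitted, the extra recurse parameter and the toy permission rule are all assumptions made for the example.

```java
class PermissionChecks {
    // Hypothetical "new" API: takes an extra flag that the old API lacked.
    // Toy rule: a user may act on their home directory and, when recurse is
    // true, on anything beneath it.
    static boolean isPermitted(String path, String user, boolean recurse) {
        String home = "/home/" + user;
        if (path.equals(home)) {
            return true;
        }
        return recurse && path.startsWith(home + "/");
    }

    // Old API kept unchanged for existing callers. It delegates to the new
    // method with the value that preserves the previous behaviour.
    static boolean isPermitted(String path, String user) {
        return isPermitted(path, user, true);
    }

    public static void main(String[] args) {
        System.out.println(isPermitted("/home/alice/t1", "alice"));        // prints "true"
        System.out.println(isPermitted("/home/alice/t1", "alice", false)); // prints "false"
    }
}
```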
[jira] [Assigned] (HIVE-26219) Encapsulate the API change so other services like Ranger can still use the old API
[ https://issues.apache.org/jira/browse/HIVE-26219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan reassigned HIVE-26219: --- > Encapsulate the API change so other services like Ranger can still use the > old API > -- > > Key: HIVE-26219 > URL: https://issues.apache.org/jira/browse/HIVE-26219 > Project: Hive > Issue Type: Task >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Updated] (HIVE-26219) Encapsulate the API change for FileUtils.isActionPermittedForFileHierarchy
[ https://issues.apache.org/jira/browse/HIVE-26219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan updated HIVE-26219: Description: Encapsulate the API change for FileUtils.isActionPermittedForFileHierarchy so other services like Ranger can still use the old API (was: Encapsulate the API change for so other services like Ranger can still use the old API) > Encapsulate the API change for FileUtils.isActionPermittedForFileHierarchy > -- > > Key: HIVE-26219 > URL: https://issues.apache.org/jira/browse/HIVE-26219 > Project: Hive > Issue Type: Task >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > > Encapsulate the API change for FileUtils.isActionPermittedForFileHierarchy > so other services like Ranger can still use the old API -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Updated] (HIVE-26219) Encapsulate the API change for FileUtils.isActionPermittedForFileHierarchy
[ https://issues.apache.org/jira/browse/HIVE-26219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan updated HIVE-26219: Description: Encapsulate the API change for so other services like Ranger can still use the old API > Encapsulate the API change for FileUtils.isActionPermittedForFileHierarchy > -- > > Key: HIVE-26219 > URL: https://issues.apache.org/jira/browse/HIVE-26219 > Project: Hive > Issue Type: Task >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > > Encapsulate the API change for so other services like Ranger can still use > the old API -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Updated] (HIVE-26219) Encapsulate the API change for FileUtils.isActionPermittedForFileHierarchy
[ https://issues.apache.org/jira/browse/HIVE-26219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan updated HIVE-26219: Summary: Encapsulate the API change for FileUtils.isActionPermittedForFileHierarchy (was: Encapsulate the API change so other services like Ranger can still use the old API) > Encapsulate the API change for FileUtils.isActionPermittedForFileHierarchy > -- > > Key: HIVE-26219 > URL: https://issues.apache.org/jira/browse/HIVE-26219 > Project: Hive > Issue Type: Task >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Resolved] (HIVE-26199) Reduce FileSystem init during user impersonation
[ https://issues.apache.org/jira/browse/HIVE-26199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan resolved HIVE-26199. - Resolution: Fixed > Reduce FileSystem init during user impersonation > > > Key: HIVE-26199 > URL: https://issues.apache.org/jira/browse/HIVE-26199 > Project: Hive > Issue Type: Task >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Assigned] (HIVE-26199) Reduce FileSystem init during user impersonation
[ https://issues.apache.org/jira/browse/HIVE-26199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan reassigned HIVE-26199: --- > Reduce FileSystem init during user impersonation > > > Key: HIVE-26199 > URL: https://issues.apache.org/jira/browse/HIVE-26199 > Project: Hive > Issue Type: Task >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Commented] (HIVE-25697) Upgrade commons-compress to 1.21
[ https://issues.apache.org/jira/browse/HIVE-25697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17443543#comment-17443543 ] Ramesh Kumar Thangarajan commented on HIVE-25697: - [~pgaref] Can you please help review this? > Upgrade commons-compress to 1.21 > > > Key: HIVE-25697 > URL: https://issues.apache.org/jira/browse/HIVE-25697 > Project: Hive > Issue Type: Task >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Upgrade commons-compress to 1.21 due to CVEs -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Assigned] (HIVE-25697) Upgrade commons-compress to 1.21
[ https://issues.apache.org/jira/browse/HIVE-25697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan reassigned HIVE-25697: --- > Upgrade commons-compress to 1.21 > > > Key: HIVE-25697 > URL: https://issues.apache.org/jira/browse/HIVE-25697 > Project: Hive > Issue Type: Task >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > > Upgrade commons-compress to 1.21 due to CVEs -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HIVE-25583) Support parallel load for HastTables - Interfaces
[ https://issues.apache.org/jira/browse/HIVE-25583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan updated HIVE-25583: Parent: HIVE-24037 Issue Type: Sub-task (was: Task) > Support parallel load for HastTables - Interfaces > - > > Key: HIVE-25583 > URL: https://issues.apache.org/jira/browse/HIVE-25583 > Project: Hive > Issue Type: Sub-task >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-25583) Support parallel load for HastTables - Interfaces
[ https://issues.apache.org/jira/browse/HIVE-25583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan reassigned HIVE-25583: --- > Support parallel load for HastTables - Interfaces > - > > Key: HIVE-25583 > URL: https://issues.apache.org/jira/browse/HIVE-25583 > Project: Hive > Issue Type: Task >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-21489) EXPLAIN command throws ClassCastException in Hive
[ https://issues.apache.org/jira/browse/HIVE-21489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan reassigned HIVE-21489: --- Assignee: Ramesh Kumar Thangarajan (was: Daniel Dai) > EXPLAIN command throws ClassCastException in Hive > - > > Key: HIVE-21489 > URL: https://issues.apache.org/jira/browse/HIVE-21489 > Project: Hive > Issue Type: Bug >Affects Versions: 2.3.4 >Reporter: Ping Lu >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Attachments: HIVE-21489.1.patch, HIVE-21489.2.patch > > > I'm trying to run commands like explain select * from src in hive-2.3.4, but > it fails with the ClassCastException: > org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer cannot be cast to > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer > Steps to reproduce: > 1) hive.execution.engine is the default value mr > 2) hive.security.authorization.enabled is set to true, and > hive.security.authorization.manager is set to > org.apache.hadoop.hive.ql.security.authorization.DefaultHiveAuthorizationProvider > 3) start hivecli to run command: explain select * from src > I debugged the code and found that HIVE-18778 causes the above > ClassCastException. If I set hive.in.test to true, the explain command can be > successfully executed. > Now I have one question: since hive.in.test can't be modified at runtime, how > can I run the explain command using default authorization in hive-2.3.4? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-21489) EXPLAIN command throws ClassCastException in Hive
[ https://issues.apache.org/jira/browse/HIVE-21489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17358879#comment-17358879 ] Ramesh Kumar Thangarajan commented on HIVE-21489: - [~daijy] This issue can still be reproduced with storage-based authorization, and it would be better if we could fix it. The above patch seems reasonable to me. cc [~hashutosh] [~thejas] > EXPLAIN command throws ClassCastException in Hive > - > > Key: HIVE-21489 > URL: https://issues.apache.org/jira/browse/HIVE-21489 > Project: Hive > Issue Type: Bug >Affects Versions: 2.3.4 >Reporter: Ping Lu >Assignee: Daniel Dai >Priority: Major > Attachments: HIVE-21489.1.patch, HIVE-21489.2.patch > > > I'm trying to run commands like explain select * from src in hive-2.3.4, but > it fails with the ClassCastException: > org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer cannot be cast to > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer > Steps to reproduce: > 1) hive.execution.engine is the default value mr > 2) hive.security.authorization.enabled is set to true, and > hive.security.authorization.manager is set to > org.apache.hadoop.hive.ql.security.authorization.DefaultHiveAuthorizationProvider > 3) start hivecli to run command: explain select * from src > I debugged the code and found that HIVE-18778 causes the above > ClassCastException. If I set hive.in.test to true, the explain command can be > successfully executed. > Now I have one question: since hive.in.test can't be modified at runtime, how > can I run the explain command using default authorization in hive-2.3.4? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-25210) oracle.sql.CLOB cannot be cast to java.lang.String in PARTITION_PARAMS table
[ https://issues.apache.org/jira/browse/HIVE-25210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan resolved HIVE-25210. - Resolution: Not A Problem > oracle.sql.CLOB cannot be cast to java.lang.String in PARTITION_PARAMS table > > > Key: HIVE-25210 > URL: https://issues.apache.org/jira/browse/HIVE-25210 > Project: Hive > Issue Type: Bug >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > > oracle.sql.CLOB cannot be cast to java.lang.String in PARTITION_PARAMS table -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-25210) oracle.sql.CLOB cannot be cast to java.lang.String in PARTITION_PARAMS table
[ https://issues.apache.org/jira/browse/HIVE-25210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan reassigned HIVE-25210: --- > oracle.sql.CLOB cannot be cast to java.lang.String in PARTITION_PARAMS table > > > Key: HIVE-25210 > URL: https://issues.apache.org/jira/browse/HIVE-25210 > Project: Hive > Issue Type: Bug >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > > oracle.sql.CLOB cannot be cast to java.lang.String in PARTITION_PARAMS table -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-25202) Support decimal64 operations for PTF operators
[ https://issues.apache.org/jira/browse/HIVE-25202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan reassigned HIVE-25202: --- > Support decimal64 operations for PTF operators > -- > > Key: HIVE-25202 > URL: https://issues.apache.org/jira/browse/HIVE-25202 > Project: Hive > Issue Type: Bug >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > > After decimal64 vectorization support was added for multiple operators, PTF > operators were found to break the decimal64 chain when they occur between > two such operators. As a result they introduce an unnecessary cast to > decimal. To prevent this, PTF operators will also support decimal64 data > types. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-25117) Vector PTF ClassCastException with Decimal64
[ https://issues.apache.org/jira/browse/HIVE-25117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17347297#comment-17347297 ] Ramesh Kumar Thangarajan commented on HIVE-25117: - [~pgaref] Can you please help review this? > Vector PTF ClassCastException with Decimal64 > > > Key: HIVE-25117 > URL: https://issues.apache.org/jira/browse/HIVE-25117 > Project: Hive > Issue Type: Bug >Reporter: Panagiotis Garefalakis >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Labels: pull-request-available > Attachments: vector_ptf_classcast_exception.q > > Time Spent: 10m > Remaining Estimate: 0h > > Only reproduces when there is at least 1 buffered batch, so needed 2 rows > with 1 row/batch: > {code:java} > set hive.vectorized.testing.reducer.batch.size=1; > {code} > {code:java} > Caused by: java.lang.ClassCastException: > org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector cannot be cast to > org.apache.hadoop.hive.ql.exec.vector.LongColumnVector > at > org.apache.hadoop.hive.ql.exec.vector.VectorizedBatchUtil.copyNonSelectedColumnVector(VectorizedBatchUtil.java:664) > at > org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFGroupBatches.forwardBufferedBatches(VectorPTFGroupBatches.java:228) > at > org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFGroupBatches.fillGroupResultsAndForward(VectorPTFGroupBatches.java:318) > at > org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFOperator.process(VectorPTFOperator.java:403) > at > org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:919) > at > org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158) > at > org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:497) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-25117) Vector PTF ClassCastException with Decimal64
[ https://issues.apache.org/jira/browse/HIVE-25117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan reassigned HIVE-25117: --- Assignee: Ramesh Kumar Thangarajan > Vector PTF ClassCastException with Decimal64 > > > Key: HIVE-25117 > URL: https://issues.apache.org/jira/browse/HIVE-25117 > Project: Hive > Issue Type: Bug >Reporter: Panagiotis Garefalakis >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Attachments: vector_ptf_classcast_exception.q > > > Only reproduces when there is at least 1 buffered batch, so needed 2 rows > with 1 row/batch: > {code:java} > set hive.vectorized.testing.reducer.batch.size=1; > {code} > {code:java} > Caused by: java.lang.ClassCastException: > org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector cannot be cast to > org.apache.hadoop.hive.ql.exec.vector.LongColumnVector > at > org.apache.hadoop.hive.ql.exec.vector.VectorizedBatchUtil.copyNonSelectedColumnVector(VectorizedBatchUtil.java:664) > at > org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFGroupBatches.forwardBufferedBatches(VectorPTFGroupBatches.java:228) > at > org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFGroupBatches.fillGroupResultsAndForward(VectorPTFGroupBatches.java:318) > at > org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFOperator.process(VectorPTFOperator.java:403) > at > org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:919) > at > org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158) > at > org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:497) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-24900) Failed compaction does not cleanup the directories
[ https://issues.apache.org/jira/browse/HIVE-24900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan resolved HIVE-24900. - Resolution: Fixed > Failed compaction does not cleanup the directories > -- > > Key: HIVE-24900 > URL: https://issues.apache.org/jira/browse/HIVE-24900 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Labels: pull-request-available > Time Spent: 2.5h > Remaining Estimate: 0h > > Failed compaction does not cleanup the directories -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-24927) Alter table rename moves temporary folders as part of the rename
[ https://issues.apache.org/jira/browse/HIVE-24927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan reassigned HIVE-24927: --- > Alter table rename moves temporary folders as part of the rename > > > Key: HIVE-24927 > URL: https://issues.apache.org/jira/browse/HIVE-24927 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > > Alter table rename moves temporary folders as part of the rename -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-24900) Failed compaction does not cleanup the directories
[ https://issues.apache.org/jira/browse/HIVE-24900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan reassigned HIVE-24900: --- > Failed compaction does not cleanup the directories > -- > > Key: HIVE-24900 > URL: https://issues.apache.org/jira/browse/HIVE-24900 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > > Failed compaction does not cleanup the directories -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24195) Avoid reallocation of the arrays in the lateral view explode of complex types
[ https://issues.apache.org/jira/browse/HIVE-24195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan updated HIVE-24195: Description: Avoid reallocation of the arrays in the lateral view explode of complex types > Avoid reallocation of the arrays in the lateral view explode of complex types > - > > Key: HIVE-24195 > URL: https://issues.apache.org/jira/browse/HIVE-24195 > Project: Hive > Issue Type: Bug >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > > Avoid reallocation of the arrays in the lateral view explode of complex types -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-24195) Avoid reallocation of the arrays in the lateral view explode of complex types
[ https://issues.apache.org/jira/browse/HIVE-24195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan reassigned HIVE-24195: --- > Avoid reallocation of the arrays in the lateral view explode of complex types > - > > Key: HIVE-24195 > URL: https://issues.apache.org/jira/browse/HIVE-24195 > Project: Hive > Issue Type: Bug >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-24070) ObjectStore.cleanWriteNotificationEvents OutOfMemory on large number of pending events
[ https://issues.apache.org/jira/browse/HIVE-24070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17195269#comment-17195269 ] Ramesh Kumar Thangarajan commented on HIVE-24070: - Yes, I will close this Jira, and let's work on HIVE-22290. > ObjectStore.cleanWriteNotificationEvents OutOfMemory on large number of > pending events > -- > > Key: HIVE-24070 > URL: https://issues.apache.org/jira/browse/HIVE-24070 > Project: Hive > Issue Type: Bug > Components: repl >Affects Versions: 4.0.0 >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Fix For: 4.0.0 > > > If there are large number of events that haven't been cleaned up for some > reason, then ObjectStore.cleanWriteNotificationEvents() can run out of memory > while it loads all the events to be deleted. > It should fetch events in batches. > Similar to https://issues.apache.org/jira/browse/HIVE-19430 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-24070) ObjectStore.cleanWriteNotificationEvents OutOfMemory on large number of pending events
[ https://issues.apache.org/jira/browse/HIVE-24070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan resolved HIVE-24070. - Resolution: Duplicate > ObjectStore.cleanWriteNotificationEvents OutOfMemory on large number of > pending events > -- > > Key: HIVE-24070 > URL: https://issues.apache.org/jira/browse/HIVE-24070 > Project: Hive > Issue Type: Bug > Components: repl >Affects Versions: 4.0.0 >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Fix For: 4.0.0 > > > If there are large number of events that haven't been cleaned up for some > reason, then ObjectStore.cleanWriteNotificationEvents() can run out of memory > while it loads all the events to be deleted. > It should fetch events in batches. > Similar to https://issues.apache.org/jira/browse/HIVE-19430 -- This message was sent by Atlassian Jira (v8.3.4#803005)
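The batching suggested in HIVE-24070 can be illustrated with a small in-memory sketch: rather than loading every pending event before deleting, the cleaner fetches and deletes a bounded batch at a time, so the number of events held in memory never exceeds the batch size. This is not the ObjectStore code. BatchedCleanup, cleanInBatches and the list standing in for the backing table are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class BatchedCleanup {
    // Removes every element of `pending` in batches of at most `batchSize`.
    // Returns the largest number of events that was loaded at any one time.
    static int cleanInBatches(List<Long> pending, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int maxLoaded = 0;
        while (!pending.isEmpty()) {
            int n = Math.min(batchSize, pending.size());
            // "Fetch" one batch: only these n events are materialised.
            List<Long> batch = new ArrayList<>(pending.subList(0, n));
            maxLoaded = Math.max(maxLoaded, batch.size());
            // "Delete" the fetched batch from the backing store.
            pending.subList(0, n).clear();
        }
        return maxLoaded;
    }

    public static void main(String[] args) {
        List<Long> pending = new ArrayList<>();
        for (long id = 1; id <= 25; id++) {
            pending.add(id);
        }
        int maxLoaded = cleanInBatches(pending, 10);
        System.out.println(maxLoaded + " " + pending.size()); // prints "10 0"
    }
}
```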
[jira] [Assigned] (HIVE-24145) Fix preemption issues in reducers and file sink operators
[ https://issues.apache.org/jira/browse/HIVE-24145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan reassigned HIVE-24145: --- > Fix preemption issues in reducers and file sink operators > - > > Key: HIVE-24145 > URL: https://issues.apache.org/jira/browse/HIVE-24145 > Project: Hive > Issue Type: Bug >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > > There are two issues caused by preemption: > # Reducers are getting reordered as part of optimizations, which causes > more preemptions to happen > # Preemption in the middle of writing can leave the file unclosed and > lead to errors when we read the file later -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-24071) Continue cleaning the NotificationEvents till we have data greater than TTL
[ https://issues.apache.org/jira/browse/HIVE-24071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan reassigned HIVE-24071: --- > Continue cleaning the NotificationEvents till we have data greater than TTL > --- > > Key: HIVE-24071 > URL: https://issues.apache.org/jira/browse/HIVE-24071 > Project: Hive > Issue Type: Bug > Components: repl >Affects Versions: 4.0.0 >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Fix For: 4.0.0 > > > Continue cleaning the NotificationEvents till we have data greater than TTL. > Currently we only clean the notification events once every 2 hours and also > strict 1 every time. We should continue deleting until we clear up all > the notification events greater than TTL. -- This message was sent by Atlassian Jira (v8.3.4#803005)
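The behaviour requested in HIVE-24071 can be sketched as a loop that repeats the capped cleanup pass until a pass removes fewer events than the cap. This is only an illustration under stated assumptions: `TtlDrain`, `cleanOnce`, `cleanUntilDrained` and the in-memory deque of event timestamps (oldest first) are hypothetical stand-ins, not Hive's metastore code, and the cap is just a parameter here.

```java
import java.util.Deque;

/** Hypothetical model of TTL-based cleanup; not the Hive metastore implementation. */
final class TtlDrain {
    private TtlDrain() {}

    /** One cleanup pass: removes at most cap events older than cutoff, oldest first. */
    static int cleanOnce(Deque<Long> eventTimes, long cutoff, int cap) {
        int removed = 0;
        while (removed < cap && !eventTimes.isEmpty() && eventTimes.peekFirst() < cutoff) {
            eventTimes.pollFirst();
            removed++;
        }
        return removed;
    }

    /** Repeats passes until one removes fewer than cap events, so no expired event is left. */
    static int cleanUntilDrained(Deque<Long> eventTimes, long cutoff, int cap) {
        if (cap <= 0) {
            throw new IllegalArgumentException("cap must be positive");
        }
        int total = 0;
        int removed;
        do {
            removed = cleanOnce(eventTimes, cutoff, cap);
            total += removed;
        } while (removed == cap); // a full pass means more expired events may remain
        return total;
    }
}
```

A single `cleanOnce` call models the current behaviour (one capped pass per scheduled run); `cleanUntilDrained` models the proposed one.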
[jira] [Commented] (HIVE-24070) ObjectStore.cleanWriteNotificationEvents OutOfMemory on large number of pending events
[ https://issues.apache.org/jira/browse/HIVE-24070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17183905#comment-17183905 ] Ramesh Kumar Thangarajan commented on HIVE-24070: - [~anishek] Do we already have a Jira for that? Otherwise I can create one. > ObjectStore.cleanWriteNotificationEvents OutOfMemory on large number of > pending events > -- > > Key: HIVE-24070 > URL: https://issues.apache.org/jira/browse/HIVE-24070 > Project: Hive > Issue Type: Bug > Components: repl >Affects Versions: 4.0.0 >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Fix For: 4.0.0 > > > If there are a large number of events that haven't been cleaned up for some > reason, then ObjectStore.cleanWriteNotificationEvents() can run out of memory > while it loads all the events to be deleted. > It should fetch events in batches. > Similar to https://issues.apache.org/jira/browse/HIVE-19430 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-24070) ObjectStore.cleanWriteNotificationEvents OutOfMemory on large number of pending events
[ https://issues.apache.org/jira/browse/HIVE-24070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17183900#comment-17183900 ] Ramesh Kumar Thangarajan commented on HIVE-24070: - The issue you mentioned is still present and needs to be addressed, though it might not cause the service to stop. The OOM currently causes the service to stop. We might have to create a separate Jira to address that issue. > ObjectStore.cleanWriteNotificationEvents OutOfMemory on large number of > pending events > -- > > Key: HIVE-24070 > URL: https://issues.apache.org/jira/browse/HIVE-24070 > Project: Hive > Issue Type: Bug > Components: repl >Affects Versions: 4.0.0 >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Fix For: 4.0.0 > > > If there are a large number of events that haven't been cleaned up for some > reason, then ObjectStore.cleanWriteNotificationEvents() can run out of memory > while it loads all the events to be deleted. > It should fetch events in batches. > Similar to https://issues.apache.org/jira/browse/HIVE-19430 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-24070) ObjectStore.cleanWriteNotificationEvents OutOfMemory on large number of pending events
[ https://issues.apache.org/jira/browse/HIVE-24070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17183898#comment-17183898 ] Ramesh Kumar Thangarajan commented on HIVE-24070: - Jira HIVE-19430 only solves the OOM issue with cleanNotificationEvents(). We still reach OOM through cleanWriteNotificationEvents() at [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L10429] > ObjectStore.cleanWriteNotificationEvents OutOfMemory on large number of > pending events > -- > > Key: HIVE-24070 > URL: https://issues.apache.org/jira/browse/HIVE-24070 > Project: Hive > Issue Type: Bug > Components: repl >Affects Versions: 4.0.0 >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Fix For: 4.0.0 > > > If there are a large number of events that haven't been cleaned up for some > reason, then ObjectStore.cleanWriteNotificationEvents() can run out of memory > while it loads all the events to be deleted. > It should fetch events in batches. > Similar to https://issues.apache.org/jira/browse/HIVE-19430 -- This message was sent by Atlassian Jira (v8.3.4#803005)
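The "fetch events in batches" idea from the HIVE-24070 description might look roughly like the sketch below. It is a minimal model, not the ObjectStore code: `BatchedEventCleaner`, `fetchExpiredIds`, `cleanExpired` and the map from event id to event time are all hypothetical names introduced for illustration. The point is only that at most `batchSize` expired events are held in memory at any time, instead of the whole backlog.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory stand-in for the event table; not Hive's ObjectStore. */
final class BatchedEventCleaner {
    private BatchedEventCleaner() {}

    /** Loads the ids of at most limit events whose event time is before cutoff. */
    static List<Long> fetchExpiredIds(Map<Long, Long> store, long cutoff, int limit) {
        List<Long> ids = new ArrayList<>();
        for (Map.Entry<Long, Long> e : store.entrySet()) {
            if (ids.size() == limit) {
                break;
            }
            if (e.getValue() < cutoff) {
                ids.add(e.getKey());
            }
        }
        return ids;
    }

    /** Deletes all expired events, one bounded batch at a time; returns the number deleted. */
    static int cleanExpired(Map<Long, Long> store, long cutoff, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int deleted = 0;
        List<Long> batch;
        while (!(batch = fetchExpiredIds(store, cutoff, batchSize)).isEmpty()) {
            for (Long id : batch) {
                store.remove(id); // the fetch has finished iterating, so removal is safe
            }
            deleted += batch.size();
        }
        return deleted;
    }
}
```

In a real store each batch would presumably be its own bounded query and delete; that part is left out here.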
[jira] [Updated] (HIVE-24070) ObjectStore.cleanWriteNotificationEvents OutOfMemory on large number of pending events
[ https://issues.apache.org/jira/browse/HIVE-24070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan updated HIVE-24070: Description: If there are a large number of events that haven't been cleaned up for some reason, then ObjectStore.cleanWriteNotificationEvents() can run out of memory while it loads all the events to be deleted. It should fetch events in batches. Similar to https://issues.apache.org/jira/browse/HIVE-19430 was: If there are a large number of events that haven't been cleaned up for some reason, then ObjectStore.cleanWriteNotificationEvents() can run out of memory while it loads all the events to be deleted. It should fetch events in batches. > ObjectStore.cleanWriteNotificationEvents OutOfMemory on large number of > pending events > -- > > Key: HIVE-24070 > URL: https://issues.apache.org/jira/browse/HIVE-24070 > Project: Hive > Issue Type: Bug > Components: repl >Affects Versions: 4.0.0 >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Fix For: 4.0.0 > > > If there are a large number of events that haven't been cleaned up for some > reason, then ObjectStore.cleanWriteNotificationEvents() can run out of memory > while it loads all the events to be deleted. > It should fetch events in batches. > Similar to https://issues.apache.org/jira/browse/HIVE-19430 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-24070) ObjectStore.cleanWriteNotificationEvents OutOfMemory on large number of pending events
[ https://issues.apache.org/jira/browse/HIVE-24070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan reassigned HIVE-24070: --- > ObjectStore.cleanWriteNotificationEvents OutOfMemory on large number of > pending events > -- > > Key: HIVE-24070 > URL: https://issues.apache.org/jira/browse/HIVE-24070 > Project: Hive > Issue Type: Bug > Components: repl >Affects Versions: 4.0.0 >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Fix For: 4.0.0 > > > If there are a large number of events that haven't been cleaned up for some > reason, then ObjectStore.cleanWriteNotificationEvents() can run out of memory > while it loads all the events to be deleted. > It should fetch events in batches. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-24037) Parallelize hash table constructions in map joins
[ https://issues.apache.org/jira/browse/HIVE-24037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan reassigned HIVE-24037: --- > Parallelize hash table constructions in map joins > - > > Key: HIVE-24037 > URL: https://issues.apache.org/jira/browse/HIVE-24037 > Project: Hive > Issue Type: Improvement >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > > Parallelize hash table constructions in map joins -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-22934) Hive server interactive log counters to error stream
[ https://issues.apache.org/jira/browse/HIVE-22934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan reassigned HIVE-22934: --- Assignee: Ramesh Kumar Thangarajan (was: Antal Sinkovits) > Hive server interactive log counters to error stream > > > Key: HIVE-22934 > URL: https://issues.apache.org/jira/browse/HIVE-22934 > Project: Hive > Issue Type: Bug >Reporter: Slim Bouguerra >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Labels: pull-request-available > Attachments: HIVE-22934.01.patch, HIVE-22934.02.patch, > HIVE-22934.03.patch, HIVE-22934.04.patch, HIVE-22934.patch > > Time Spent: 10m > Remaining Estimate: 0h > > Hive server is logging the console output to the system error stream. > This needs to be fixed for two reasons. > First, we do not roll the file. > Second, writing to such a file is done sequentially and can lead to throttling and poor > performance. > {code} > -rw-r--r-- 1 hive hadoop 9.5G Feb 26 17:22 hive-server2-interactive.err > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-23665) Rewrite last_value to first_value to enable streaming results
[ https://issues.apache.org/jira/browse/HIVE-23665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17146476#comment-17146476 ] Ramesh Kumar Thangarajan commented on HIVE-23665: - [~jcamachorodriguez] [~vgarg] Can you please review the attached PR and let me know your thoughts? > Rewrite last_value to first_value to enable streaming results > - > > Key: HIVE-23665 > URL: https://issues.apache.org/jira/browse/HIVE-23665 > Project: Hive > Issue Type: Bug >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Labels: pull-request-available > Attachments: HIVE-23665.1.patch, HIVE-23665.2.patch, > HIVE-23665.3.patch > > Time Spent: 0.5h > Remaining Estimate: 0h > > Rewrite last_value to first_value to enable streaming results > last_value cannot be streamed because the intermediate results need to be > buffered to determine the window result till we get the last row in the > window. But if we can rewrite to first_value we can stream the results, > although the order of results will not be guaranteed (also not important) -- This message was sent by Atlassian Jira (v8.3.4#803005)
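The equivalence behind the HIVE-23665 rewrite can be illustrated with a small sketch. It assumes a window frame that spans the whole partition and distinct ordering keys; `LastValueRewrite`, `lastValueAsc` and `firstValueDesc` are hypothetical names, rows are modelled as `{orderKey, value}` pairs, and none of this is Hive's windowing code. Under those assumptions, last_value over ascending order equals first_value over descending order, and the latter is known as soon as the first row has been seen.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical illustration of the rewrite; rows are {orderKey, value} pairs, non-empty input assumed. */
final class LastValueRewrite {
    private LastValueRewrite() {}

    /** last_value(value) ordered by key ascending: the answer is only known at the last row. */
    static int lastValueAsc(List<int[]> rows) {
        List<int[]> sorted = new ArrayList<>(rows);
        sorted.sort(Comparator.comparingInt((int[] r) -> r[0]));
        return sorted.get(sorted.size() - 1)[1];
    }

    /** first_value(value) ordered by key descending: the answer is known at the first row. */
    static int firstValueDesc(List<int[]> rows) {
        List<int[]> sorted = new ArrayList<>(rows);
        sorted.sort(Comparator.comparingInt((int[] r) -> r[0]).reversed());
        return sorted.get(0)[1];
    }
}
```

With tied ordering keys the two forms may pick different rows, which is one reason the sketch assumes distinct keys.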
[jira] [Commented] (HIVE-23666) checkHashModeEfficiency is skipped when a groupby operator doesn't have a grouping set
[ https://issues.apache.org/jira/browse/HIVE-23666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17134624#comment-17134624 ] Ramesh Kumar Thangarajan commented on HIVE-23666: - I have created the PR at [https://github.com/apache/hive/pull/1103]. Can you please help me review the patch? > checkHashModeEfficiency is skipped when a groupby operator doesn't have a > grouping set > -- > > Key: HIVE-23666 > URL: https://issues.apache.org/jira/browse/HIVE-23666 > Project: Hive > Issue Type: Bug >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Labels: pull-request-available > Attachments: HIVE-23666.1.patch > > Time Spent: 10m > Remaining Estimate: 0h > > checkHashModeEfficiency is skipped when a groupby operator doesn't have a > grouping set -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23665) Rewrite last_value to first_value to enable streaming results
[ https://issues.apache.org/jira/browse/HIVE-23665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan updated HIVE-23665: Attachment: (was: HIVE-23665.3.patch) > Rewrite last_value to first_value to enable streaming results > - > > Key: HIVE-23665 > URL: https://issues.apache.org/jira/browse/HIVE-23665 > Project: Hive > Issue Type: Bug >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Attachments: HIVE-23665.1.patch, HIVE-23665.2.patch, > HIVE-23665.3.patch > > > Rewrite last_value to first_value to enable streaming results > last_value cannot be streamed because the intermediate results need to be > buffered to determine the window result till we get the last row in the > window. But if we can rewrite to first_value we can stream the results, > although the order of results will not be guaranteed (also not important) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23665) Rewrite last_value to first_value to enable streaming results
[ https://issues.apache.org/jira/browse/HIVE-23665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan updated HIVE-23665: Attachment: HIVE-23665.3.patch Status: Patch Available (was: Open) > Rewrite last_value to first_value to enable streaming results > - > > Key: HIVE-23665 > URL: https://issues.apache.org/jira/browse/HIVE-23665 > Project: Hive > Issue Type: Bug >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Attachments: HIVE-23665.1.patch, HIVE-23665.2.patch, > HIVE-23665.3.patch > > > Rewrite last_value to first_value to enable streaming results > last_value cannot be streamed because the intermediate results need to be > buffered to determine the window result till we get the last row in the > window. But if we can rewrite to first_value we can stream the results, > although the order of results will not be guaranteed (also not important) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23665) Rewrite last_value to first_value to enable streaming results
[ https://issues.apache.org/jira/browse/HIVE-23665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan updated HIVE-23665: Status: Open (was: Patch Available) > Rewrite last_value to first_value to enable streaming results > - > > Key: HIVE-23665 > URL: https://issues.apache.org/jira/browse/HIVE-23665 > Project: Hive > Issue Type: Bug >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Attachments: HIVE-23665.1.patch, HIVE-23665.2.patch, > HIVE-23665.3.patch > > > Rewrite last_value to first_value to enable streaming results > last_value cannot be streamed because the intermediate results need to be > buffered to determine the window result till we get the last row in the > window. But if we can rewrite to first_value we can stream the results, > although the order of results will not be guaranteed (also not important) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23665) Rewrite last_value to first_value to enable streaming results
[ https://issues.apache.org/jira/browse/HIVE-23665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan updated HIVE-23665: Status: Open (was: Patch Available) > Rewrite last_value to first_value to enable streaming results > - > > Key: HIVE-23665 > URL: https://issues.apache.org/jira/browse/HIVE-23665 > Project: Hive > Issue Type: Bug >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Attachments: HIVE-23665.1.patch, HIVE-23665.2.patch > > > Rewrite last_value to first_value to enable streaming results > last_value cannot be streamed because the intermediate results need to be > buffered to determine the window result till we get the last row in the > window. But if we can rewrite to first_value we can stream the results, > although the order of results will not be guaranteed (also not important) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23665) Rewrite last_value to first_value to enable streaming results
[ https://issues.apache.org/jira/browse/HIVE-23665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramesh Kumar Thangarajan updated HIVE-23665: Attachment: HIVE-23665.3.patch Status: Patch Available (was: Open) > Rewrite last_value to first_value to enable streaming results > - > > Key: HIVE-23665 > URL: https://issues.apache.org/jira/browse/HIVE-23665 > Project: Hive > Issue Type: Bug >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Attachments: HIVE-23665.1.patch, HIVE-23665.2.patch, > HIVE-23665.3.patch > > > Rewrite last_value to first_value to enable streaming results > last_value cannot be streamed because the intermediate results need to be > buffered to determine the window result till we get the last row in the > window. But if we can rewrite to first_value we can stream the results, > although the order of results will not be guaranteed (also not important) -- This message was sent by Atlassian Jira (v8.3.4#803005)