[jira] [Commented] (HIVE-27512) CalciteSemanticException.UnsupportedFeature enum to capital

2023-11-20 Thread Mahesh Raju Somalaraju (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17788299#comment-17788299
 ] 

Mahesh Raju Somalaraju commented on HIVE-27512:
---

[~abstractdog] I have assigned this Jira to myself and will raise a PR.

> CalciteSemanticException.UnsupportedFeature enum to capital
> ---
>
> Key: HIVE-27512
> URL: https://issues.apache.org/jira/browse/HIVE-27512
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: Mahesh Raju Somalaraju
>Priority: Major
>  Labels: newbie
>
> https://github.com/apache/hive/blob/3bc62cbc2d42c22dfd55f78ad7b41ec84a71380f/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/CalciteSemanticException.java#L32-L39
> {code}
>   public enum UnsupportedFeature {
>     Distinct_without_an_aggreggation, Duplicates_in_RR,
>     Filter_expression_with_non_boolean_return_type,
>     Having_clause_without_any_groupby, Invalid_column_reference, Invalid_decimal,
>     Less_than_equal_greater_than, Others, Same_name_in_multiple_expressions,
>     Schema_less_table, Select_alias_in_having_clause, Select_transform, Subquery,
>     Table_sample_clauses, UDTF, Union_type, Unique_join,
>     HighPrecissionTimestamp // CALCITE-1690
>   };
> {code}
> this just hurts my eyes, I expect it as DISTINCT_WITHOUT_AN_AGGREGATION ...
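
For illustration, a minimal sketch of what the capitalized enum could look like; the UPPER_SNAKE_CASE names below (including fixes for the "aggreggation" and "HighPrecission" typos) are an assumption about the eventual patch, not the merged change:

{code}
  public enum UnsupportedFeature {
    DISTINCT_WITHOUT_AN_AGGREGATION, DUPLICATES_IN_RR,
    FILTER_EXPRESSION_WITH_NON_BOOLEAN_RETURN_TYPE,
    HAVING_CLAUSE_WITHOUT_ANY_GROUPBY, INVALID_COLUMN_REFERENCE, INVALID_DECIMAL,
    LESS_THAN_EQUAL_GREATER_THAN, OTHERS, SAME_NAME_IN_MULTIPLE_EXPRESSIONS,
    SCHEMA_LESS_TABLE, SELECT_ALIAS_IN_HAVING_CLAUSE, SELECT_TRANSFORM, SUBQUERY,
    TABLE_SAMPLE_CLAUSES, UDTF, UNION_TYPE, UNIQUE_JOIN,
    HIGH_PRECISION_TIMESTAMP // CALCITE-1690
  }
{code}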



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HIVE-27512) CalciteSemanticException.UnsupportedFeature enum to capital

2023-11-20 Thread Mahesh Raju Somalaraju (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahesh Raju Somalaraju reassigned HIVE-27512:
-

Assignee: Mahesh Raju Somalaraju

> CalciteSemanticException.UnsupportedFeature enum to capital
> ---
>
> Key: HIVE-27512
> URL: https://issues.apache.org/jira/browse/HIVE-27512
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: Mahesh Raju Somalaraju
>Priority: Major
>  Labels: newbie
>
> https://github.com/apache/hive/blob/3bc62cbc2d42c22dfd55f78ad7b41ec84a71380f/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/CalciteSemanticException.java#L32-L39
> {code}
>   public enum UnsupportedFeature {
>     Distinct_without_an_aggreggation, Duplicates_in_RR,
>     Filter_expression_with_non_boolean_return_type,
>     Having_clause_without_any_groupby, Invalid_column_reference, Invalid_decimal,
>     Less_than_equal_greater_than, Others, Same_name_in_multiple_expressions,
>     Schema_less_table, Select_alias_in_having_clause, Select_transform, Subquery,
>     Table_sample_clauses, UDTF, Union_type, Unique_join,
>     HighPrecissionTimestamp // CALCITE-1690
>   };
> {code}
> this just hurts my eyes, I expect it as DISTINCT_WITHOUT_AN_AGGREGATION ...



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HIVE-27703) Remove PowerMock from itests/hive-jmh and upgrade mockito to 4.11

2023-11-20 Thread Mahesh Raju Somalaraju (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahesh Raju Somalaraju resolved HIVE-27703.
---
Resolution: Duplicate

Handled as part of https://issues.apache.org/jira/browse/HIVE-27736

> Remove PowerMock from itests/hive-jmh and upgrade mockito to 4.11
> -
>
> Key: HIVE-27703
> URL: https://issues.apache.org/jira/browse/HIVE-27703
> Project: Hive
>  Issue Type: Task
>  Components: HiveServer2
>Reporter: Zsolt Miskolczi
>Priority: Major
>  Labels: newbie, starter
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27894) Enhance HMS Handler Logs for all 'get_partition' functions.

2023-11-20 Thread Shivangi Jha (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shivangi Jha updated HIVE-27894:

Description: The HMSHandler 
(standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java)
 class encompasses various functions pertaining to partition information, yet 
its current implementation lacks comprehensive logging of substantial partition 
data. Enhancing this aspect would significantly contribute to improved log 
readability and facilitate more effective debugging processes.  (was: The 
HMSHandler (src/main/java/org/apache/hadoop/hive/metastore/HMSHandler) class 
encompasses various functions pertaining to partition information, yet its 
current implementation lacks comprehensive logging of substantial partition 
data. Enhancing this aspect would significantly contribute to improved log 
readability and facilitate more effective debugging processes.)

> Enhance HMS Handler Logs for all 'get_partition' functions.
> ---
>
> Key: HIVE-27894
> URL: https://issues.apache.org/jira/browse/HIVE-27894
> Project: Hive
>  Issue Type: Improvement
>Reporter: Shivangi Jha
>Assignee: Shivangi Jha
>Priority: Major
>
> The HMSHandler 
> (standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java)
>  class encompasses various functions pertaining to partition information, yet 
> its current implementation lacks comprehensive logging of substantial 
> partition data. Enhancing this aspect would significantly contribute to 
> improved log readability and facilitate more effective debugging processes.
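
As a hedged illustration of the requested enhancement (the class, method, and message format below are assumptions, not the actual patch), the improvement amounts to adding parameterized debug logging around the get_partition* calls:

{code}
import java.util.List;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Hypothetical helper showing the style of logging proposed for HMSHandler.
class PartitionCallLogger {
  private static final Logger LOG = LoggerFactory.getLogger(PartitionCallLogger.class);

  void logGetPartition(String dbName, String tableName, List<String> partVals) {
    // Parameterized logging: no string concatenation cost when DEBUG is off.
    LOG.debug("get_partition: db={}, table={}, partVals={}", dbName, tableName, partVals);
  }
}
{code}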



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-27894) Enhance HMS Handler Logs for all 'get_partition' functions.

2023-11-20 Thread Shivangi Jha (Jira)
Shivangi Jha created HIVE-27894:
---

 Summary: Enhance HMS Handler Logs for all 'get_partition' 
functions.
 Key: HIVE-27894
 URL: https://issues.apache.org/jira/browse/HIVE-27894
 Project: Hive
  Issue Type: Improvement
Reporter: Shivangi Jha


The HMSHandler (src/main/java/org/apache/hadoop/hive/metastore/HMSHandler) 
class encompasses various functions pertaining to partition information, yet 
its current implementation lacks comprehensive logging of substantial partition 
data. Enhancing this aspect would significantly contribute to improved log 
readability and facilitate more effective debugging processes.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27871) Fix some formatting problems in YarnQueueHelper

2023-11-20 Thread Mahesh Raju Somalaraju (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahesh Raju Somalaraju updated HIVE-27871:
--
Status: Patch Available  (was: Open)

> Fix some formatting problems in YarnQueueHelper
> ---
>
> Key: HIVE-27871
> URL: https://issues.apache.org/jira/browse/HIVE-27871
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: Mahesh Raju Somalaraju
>Priority: Major
>  Labels: newbie, pull-request-available
>
> https://github.com/apache/hive/blob/cbc5d2d7d650f90882c5c4ad0026a94d2e586acb/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/YarnQueueHelper.java#L54-L57
> {code}
>   private static String webapp_conf_key = YarnConfiguration.RM_WEBAPP_ADDRESS;
>   private static String webapp_ssl_conf_key = YarnConfiguration.RM_WEBAPP_HTTPS_ADDRESS;
>   private static String yarn_HA_enabled = YarnConfiguration.RM_HA_ENABLED;
>   private static String yarn_HA_rmids = YarnConfiguration.RM_HA_IDS;
> {code}
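
A minimal sketch of the cleaned-up declarations, assuming the fix is simply applying the conventional Java constant style (the final patch may differ):

{code}
  private static final String WEBAPP_CONF_KEY = YarnConfiguration.RM_WEBAPP_ADDRESS;
  private static final String WEBAPP_SSL_CONF_KEY = YarnConfiguration.RM_WEBAPP_HTTPS_ADDRESS;
  private static final String YARN_HA_ENABLED = YarnConfiguration.RM_HA_ENABLED;
  private static final String YARN_HA_RMIDS = YarnConfiguration.RM_HA_IDS;
{code}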



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HIVE-27687) Logger variable should be static final as its creation takes more time in query compilation

2023-11-20 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan resolved HIVE-27687.
-
Resolution: Fixed

> Logger variable should be static final as its creation takes more time in 
> query compilation
> ---
>
> Key: HIVE-27687
> URL: https://issues.apache.org/jira/browse/HIVE-27687
> Project: Hive
>  Issue Type: Task
>  Components: Hive
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
> Attachments: Screenshot 2023-09-12 at 5.03.31 PM.png
>
>
> In query compilation, 
> LoggerFactory.getLogger() seems to take up a noticeable amount of time. Some 
> of the serde classes use a non-static field for the Logger, which forces a 
> getLogger() call for every instance creation.
> Making the Logger field static final avoids this code path for every serde 
> object construction.
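
To illustrate the pattern (ExampleSerDe is a hypothetical class, not necessarily one of the serdes actually changed):

{code}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class ExampleSerDe {
  // Before: a non-static field forces a LoggerFactory.getLogger() call for
  // every ExampleSerDe instance that gets constructed.
  //   private final Logger log = LoggerFactory.getLogger(ExampleSerDe.class);

  // After: the lookup happens once, when the class is loaded.
  private static final Logger LOG = LoggerFactory.getLogger(ExampleSerDe.class);
}
{code}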



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HIVE-27876) Incorrect query results on tables with ClusterBy & SortBy

2023-11-20 Thread Krisztian Kasa (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17788031#comment-17788031
 ] 

Krisztian Kasa commented on HIVE-27876:
---

The query in the description has a plan:
{code}
POSTHOOK: query: explain
select age, name, count(*) from test_bucket group by  age, name having count(*) 
> 1
POSTHOOK: type: QUERY
POSTHOOK: Input: default@test_bucket
#### A masked pattern was here ####
STAGE DEPENDENCIES:
  Stage-0 is a root stage

STAGE PLANS:
  Stage: Stage-0
Fetch Operator
  limit: -1
  Processor Tree:
TableScan
  alias: test_bucket
  Select Operator
expressions: age (type: int), name (type: string)
outputColumnNames: age, name
Group By Operator
  aggregations: count()
  keys: age (type: int), name (type: string)
  mode: final
  outputColumnNames: _col0, _col1, _col2
  Filter Operator
predicate: (_col2 > 1L) (type: boolean)
ListSink
{code}

In this case 2 bucket files are created. Both are sorted, but only at file 
level. The records are fetched in this order by the FetchOperator:
{code}
1   user1   dept1
2   user2   dept2
1   user1   dept1
2   user2   dept2
{code}
Because the data is not sorted globally, the group by operator treats every 
{{age, name}} key as a distinct value, hence {{count( * )}} is 1 for all key 
values, and the Filter operator then filters out all records.

A possible workaround is to turn off the map-side group by optimization:
https://github.com/apache/hive/blob/feda35389dc28c8c9bf3c8a3d39de53ba90e41c0/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L2019-L2022
{code}
set hive.map.groupby.sorted=false;
{code}

> Incorrect query results on tables with ClusterBy & SortBy
> -
>
> Key: HIVE-27876
> URL: https://issues.apache.org/jira/browse/HIVE-27876
> Project: Hive
>  Issue Type: Bug
>Reporter: Naresh P R
>Priority: Major
>
> Repro:
>  
> {code:java}
> create external table test_bucket(age int, name string, dept string) 
> clustered by (age, name) sorted by (age asc, name asc) into 2 buckets stored 
> as orc;
> insert into test_bucket values (1, 'user1', 'dept1'), ( 2, 'user2' , 'dept2');
> insert into test_bucket values (1, 'user1', 'dept1'), ( 2, 'user2' , 'dept2');
> //empty wrong results
> select age, name, count(*) from test_bucket group by  age, name having 
> count(*) > 1; 
> +--+---+--+
> | age  | name  | _c2  |
> +--+---+--+
> +--+---+--+
> // Workaround
> set hive.map.aggr=false;
> select age, name, count(*) from test_bucket group by  age, name having 
> count(*) > 1; 
> +--++--+
> | age  |  name  | _c2  |
> +--++--+
> | 1    | user1  | 2    |
> | 2    | user2  | 2    |
> +--++--+ {code}
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27884) LLAP: Reuse FileSystem objects from cache across different tasks in the same LLAP daemon + review deleteOnExit usage

2023-11-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-27884:
--
Labels: pull-request-available  (was: )

> LLAP: Reuse FileSystem objects from cache across different tasks in the same 
> LLAP daemon + review deleteOnExit usage
> 
>
> Key: HIVE-27884
> URL: https://issues.apache.org/jira/browse/HIVE-27884
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>  Labels: pull-request-available
>
> Originally, when the task runner was added in HIVE-10028 
> ([here|https://github.com/apache/hive/blob/23f40bd88043db3cb4efe3a763cbfd5c01a81d2f/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/TaskRunnerCallable.java#L202]),
> the FileSystem.closeAllForUGI call was commented out for some reason; then, 
> in the scope of HIVE-9898, it was simply added back, 
> [here|https://github.com/apache/hive/commit/91c46a44dd9fbb68d01f22e93c4ce0931a4598e0#diff-270dbe6639879ca543ae21c44a239af6145390726d45fee832be809894bfc88eR236]
> A FileSystem.close call basically does the 
> [following|https://github.com/apache/hadoop/blob/0c10bab7bb77aa4ea3ca26c899ab28131561e052/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java#L2700-L2710]:
> 1. deletes all paths that were marked as delete-on-exit
> 2. removes the instance from the cache
> I saw that we 
> [call|https://github.com/apache/hive/blob/eb6f0b0c57dd55335927b7dde08cd47f4d00e74d/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/TaskRunnerCallable.java#L302]
> {code}
> FileSystem.closeAllForUGI
> {code}
> at the end of every task attempt, so we almost completely disable Hadoop's 
> filesystem cache during a long-running LLAP daemon's lifecycle.
> Some investigations on Azure showed that creating a filesystem can be quite 
> expensive, as it involves the recreation of a whole object hierarchy like:
> {code}
> AzureBlobFileSystem -> AzureBlobFileSystemStore -> AbfsClient -> 
> TokenProvider(MsiTokenProvider)
> {code}
> which ends up pinging Azure's token auth endpoint, leading to e.g. an HTTP 
> 429 response.
> We need to check whether we can remove this closeAllForUGI in LLAP, and 
> additionally check and remove all deleteOnExit calls that belong to Hadoop 
> FileSystem objects (this doesn't necessarily apply to java.io.File.deleteOnExit 
> calls):
> {code}
> grep -iRH "deleteOnExit" --include="*.java" | grep -v "test"
> ...
> ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java:// 
> in recent hadoop versions, use deleteOnExit to clean tmp files.
> ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java:
> autoDelete = fs.deleteOnExit(fsp.outPaths[filesIdx]);
> ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/util/PathInfo.java:
> fileSystem.deleteOnExit(dir);
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/RowContainer.java: 
>  parentDir.deleteOnExit();
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/RowContainer.java: 
>  tmpFile.deleteOnExit();
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/KeyValueContainer.java:
> parentDir.deleteOnExit();
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/KeyValueContainer.java:
> tmpFile.deleteOnExit();
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/ObjectContainer.java:  
>   tmpFile.deleteOnExit();
> ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java:
> autoDelete = fs.deleteOnExit(outPath);
> {code}
> I believe deleteOnExit is fine if we don't want to bother with removing temp 
> files; however, these deletions should probably move to a more Hive-specific 
> scope if we really want to reuse cached filesystems safely.
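
To make the autoDelete pattern above concrete, here is a simplified sketch (an assumed shape, not the exact Hive code) of the current pattern next to the plain-delete alternative discussed in the comments:

{code}
import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

class TmpPathCleanup {
  // Current pattern: deleteOnExit returns true only if the path exists *now*,
  // and that snapshot later decides whether we delete explicitly.
  static void currentPattern(FileSystem fs, Path outPath) throws IOException {
    boolean autoDelete = fs.deleteOnExit(outPath);
    // ... work happens, then on close:
    if (!autoDelete) {
      fs.delete(outPath, true);
    }
  }

  // Alternative: always delete explicitly; FileSystem.delete is a no-op
  // (returns false) when the path is already gone, so no existence check
  // or delete-on-exit bookkeeping is needed.
  static void simplerCleanup(FileSystem fs, Path outPath) throws IOException {
    fs.delete(outPath, true);
  }
}
{code}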



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27884) LLAP: Reuse FileSystem objects from cache across different tasks in the same LLAP daemon + review deleteOnExit usage

2023-11-20 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-27884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HIVE-27884:

Summary: LLAP: Reuse FileSystem objects from cache across different tasks 
in the same LLAP daemon + review deleteOnExit usage  (was: LLAP: Reuse 
FileSystem objects from cache across different tasks in the same LLAP daemon)

> LLAP: Reuse FileSystem objects from cache across different tasks in the same 
> LLAP daemon + review deleteOnExit usage
> 
>
> Key: HIVE-27884
> URL: https://issues.apache.org/jira/browse/HIVE-27884
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>
> Originally, when the task runner was added in HIVE-10028 
> ([here|https://github.com/apache/hive/blob/23f40bd88043db3cb4efe3a763cbfd5c01a81d2f/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/TaskRunnerCallable.java#L202]),
> the FileSystem.closeAllForUGI call was commented out for some reason; then, 
> in the scope of HIVE-9898, it was simply added back, 
> [here|https://github.com/apache/hive/commit/91c46a44dd9fbb68d01f22e93c4ce0931a4598e0#diff-270dbe6639879ca543ae21c44a239af6145390726d45fee832be809894bfc88eR236]
> A FileSystem.close call basically does the 
> [following|https://github.com/apache/hadoop/blob/0c10bab7bb77aa4ea3ca26c899ab28131561e052/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java#L2700-L2710]:
> 1. deletes all paths that were marked as delete-on-exit
> 2. removes the instance from the cache
> I saw that we 
> [call|https://github.com/apache/hive/blob/eb6f0b0c57dd55335927b7dde08cd47f4d00e74d/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/TaskRunnerCallable.java#L302]
> {code}
> FileSystem.closeAllForUGI
> {code}
> at the end of every task attempt, so we almost completely disable Hadoop's 
> filesystem cache during a long-running LLAP daemon's lifecycle.
> Some investigations on Azure showed that creating a filesystem can be quite 
> expensive, as it involves the recreation of a whole object hierarchy like:
> {code}
> AzureBlobFileSystem -> AzureBlobFileSystemStore -> AbfsClient -> 
> TokenProvider(MsiTokenProvider)
> {code}
> which ends up pinging Azure's token auth endpoint, leading to e.g. an HTTP 
> 429 response.
> We need to check whether we can remove this closeAllForUGI in LLAP, and 
> additionally check and remove all deleteOnExit calls that belong to Hadoop 
> FileSystem objects (this doesn't necessarily apply to java.io.File.deleteOnExit 
> calls):
> {code}
> grep -iRH "deleteOnExit" --include="*.java" | grep -v "test"
> ...
> ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java:// 
> in recent hadoop versions, use deleteOnExit to clean tmp files.
> ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java:
> autoDelete = fs.deleteOnExit(fsp.outPaths[filesIdx]);
> ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/util/PathInfo.java:
> fileSystem.deleteOnExit(dir);
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/RowContainer.java: 
>  parentDir.deleteOnExit();
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/RowContainer.java: 
>  tmpFile.deleteOnExit();
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/KeyValueContainer.java:
> parentDir.deleteOnExit();
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/KeyValueContainer.java:
> tmpFile.deleteOnExit();
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/ObjectContainer.java:  
>   tmpFile.deleteOnExit();
> ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java:
> autoDelete = fs.deleteOnExit(outPath);
> {code}
> I believe deleteOnExit is fine if we don't want to bother with removing temp 
> files; however, these deletions should probably move to a more Hive-specific 
> scope if we really want to reuse cached filesystems safely.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HIVE-27888) Backport of HIVE-22429, HIVE-14898, HIVE-22231, HIVE-20507, HIVE-24786 to branch-3

2023-11-20 Thread Sankar Hariappan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan resolved HIVE-27888.
-
Fix Version/s: 3.2.0
   Resolution: Fixed

> Backport of HIVE-22429, HIVE-14898, HIVE-22231, HIVE-20507, HIVE-24786 to 
> branch-3
> --
>
> Key: HIVE-27888
> URL: https://issues.apache.org/jira/browse/HIVE-27888
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.2.0
>Reporter: Aman Raj
>Assignee: Aman Raj
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.2.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27888) Backport of HIVE-22429, HIVE-14898, HIVE-22231, HIVE-20507, HIVE-24786 to branch-3

2023-11-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-27888:
--
Labels: pull-request-available  (was: )

> Backport of HIVE-22429, HIVE-14898, HIVE-22231, HIVE-20507, HIVE-24786 to 
> branch-3
> --
>
> Key: HIVE-27888
> URL: https://issues.apache.org/jira/browse/HIVE-27888
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.2.0
>Reporter: Aman Raj
>Assignee: Aman Raj
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (HIVE-27884) LLAP: Reuse FileSystem objects from cache across different tasks in the same LLAP daemon

2023-11-20 Thread Jira


[ 
https://issues.apache.org/jira/browse/HIVE-27884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17787941#comment-17787941
 ] 

László Bodor edited comment on HIVE-27884 at 11/20/23 10:59 AM:


minor note: it's time to reconsider using the autoDelete = fs.deleteOnExit() 
pattern

we tend to do something like:
1. autoDelete = fs.deleteOnExit
2. fs.deleteOnExit returns true if the file/dir exists and false if it doesn't
3. we later delete the file if !autoDelete - so only if it didn't exist when 
fs.deleteOnExit was called - instead of simply calling fs.delete on that file 
(ignoring whether it exists now or not; delete() will take care of it, right?)

my point is that checking the return value of deleteOnExit doesn't make any 
sense as a basis for deciding whether to delete files later :)

the conditional fs.delete is currently called in Operator.closeOp, which is 
supposed to run regardless of the task attempt's outcome
in the worst case, if a fatal JVM error happens and we cannot rely on 
Operator.closeOp, we cannot rely on Hadoop's FileSystem.deleteOnExit either


was (Author: abstractdog):
minor note: it's time to reconsider using the autoDelete = fs.deleteOnExit() 
pattern

we tend to do something like:
1. autoDelete = fs.deleteOnExit
2. fs.deleteOnExit returns true if the file/dir exists and false if it doesn't
3. we later delete the file if !autoDelete - so only if it didn't exist when 
fs.deleteOnExit was called - instead of simply calling fs.delete on that file 
(ignoring whether it exists now or not; delete() will take care of it, right?)

my point is that checking the return value of deleteOnExit doesn't make any 
sense as a basis for deciding whether to delete files later :)



> LLAP: Reuse FileSystem objects from cache across different tasks in the same 
> LLAP daemon
> 
>
> Key: HIVE-27884
> URL: https://issues.apache.org/jira/browse/HIVE-27884
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>
> Originally, when the task runner was added in HIVE-10028 
> ([here|https://github.com/apache/hive/blob/23f40bd88043db3cb4efe3a763cbfd5c01a81d2f/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/TaskRunnerCallable.java#L202]),
> the FileSystem.closeAllForUGI call was commented out for some reason; then, 
> in the scope of HIVE-9898, it was simply added back, 
> [here|https://github.com/apache/hive/commit/91c46a44dd9fbb68d01f22e93c4ce0931a4598e0#diff-270dbe6639879ca543ae21c44a239af6145390726d45fee832be809894bfc88eR236]
> A FileSystem.close call basically does the 
> [following|https://github.com/apache/hadoop/blob/0c10bab7bb77aa4ea3ca26c899ab28131561e052/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java#L2700-L2710]:
> 1. deletes all paths that were marked as delete-on-exit
> 2. removes the instance from the cache
> I saw that we 
> [call|https://github.com/apache/hive/blob/eb6f0b0c57dd55335927b7dde08cd47f4d00e74d/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/TaskRunnerCallable.java#L302]
> {code}
> FileSystem.closeAllForUGI
> {code}
> at the end of every task attempt, so we almost completely disable Hadoop's 
> filesystem cache during a long-running LLAP daemon's lifecycle.
> Some investigations on Azure showed that creating a filesystem can be quite 
> expensive, as it involves the recreation of a whole object hierarchy like:
> {code}
> AzureBlobFileSystem -> AzureBlobFileSystemStore -> AbfsClient -> 
> TokenProvider(MsiTokenProvider)
> {code}
> which ends up pinging Azure's token auth endpoint, leading to e.g. an HTTP 
> 429 response.
> We need to check whether we can remove this closeAllForUGI in LLAP, and 
> additionally check and remove all deleteOnExit calls that belong to Hadoop 
> FileSystem objects (this doesn't necessarily apply to java.io.File.deleteOnExit 
> calls):
> {code}
> grep -iRH "deleteOnExit" --include="*.java" | grep -v "test"
> ...
> ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java:// 
> in recent hadoop versions, use deleteOnExit to clean tmp files.
> ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java:
> autoDelete = fs.deleteOnExit(fsp.outPaths[filesIdx]);
> ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/util/PathInfo.java:
> fileSystem.deleteOnExit(dir);
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/RowContainer.java: 
>  parentDir.deleteOnExit();
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/RowContainer.java: 
>  tmpFile.deleteOnExit();
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/KeyValueContainer.java:
> parentDir.deleteOnExit();
> 

[jira] [Comment Edited] (HIVE-27884) LLAP: Reuse FileSystem objects from cache across different tasks in the same LLAP daemon

2023-11-20 Thread Jira


[ 
https://issues.apache.org/jira/browse/HIVE-27884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17787941#comment-17787941
 ] 

László Bodor edited comment on HIVE-27884 at 11/20/23 10:41 AM:


minor note: it's time to reconsider using the autoDelete = fs.deleteOnExit() 
pattern

we tend to do something like:
1. autoDelete = fs.deleteOnExit
2. fs.deleteOnExit returns true if the file/dir exists and false if it doesn't
3. we later delete the file if !autoDelete - so only if it didn't exist when 
fs.deleteOnExit was called - instead of simply calling fs.delete on that file 
(ignoring whether it exists now or not; delete() will take care of it, right?)

my point is that checking the return value of deleteOnExit doesn't make any 
sense as a basis for deciding whether to delete files later :)




was (Author: abstractdog):
minor note: it's time to reconsider using the autoDelete = fs.deleteOnExit() 
pattern

we tend to do something like:
1. autoDelete = fs.deleteOnExit
2. fs.deleteOnExit returns true if the file/dir exists and false if it doesn't
3. we later delete the file if !autoDelete - so only if it didn't exist when 
fs.deleteOnExit was called - instead of simply calling fs.delete on that file 
(ignoring whether it exists now or not)

my point is that checking the return value of deleteOnExit doesn't make any 
sense as a basis for deciding whether to delete files later :)



> LLAP: Reuse FileSystem objects from cache across different tasks in the same 
> LLAP daemon
> 
>
> Key: HIVE-27884
> URL: https://issues.apache.org/jira/browse/HIVE-27884
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>
> Originally, when the task runner was added in HIVE-10028 
> ([here|https://github.com/apache/hive/blob/23f40bd88043db3cb4efe3a763cbfd5c01a81d2f/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/TaskRunnerCallable.java#L202]),
> the FileSystem.closeAllForUGI call was commented out for some reason; then, 
> in the scope of HIVE-9898, it was simply added back, 
> [here|https://github.com/apache/hive/commit/91c46a44dd9fbb68d01f22e93c4ce0931a4598e0#diff-270dbe6639879ca543ae21c44a239af6145390726d45fee832be809894bfc88eR236]
> A FileSystem.close call basically does the 
> [following|https://github.com/apache/hadoop/blob/0c10bab7bb77aa4ea3ca26c899ab28131561e052/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java#L2700-L2710]:
> 1. deletes all paths that were marked as delete-on-exit
> 2. removes the instance from the cache
> I saw that we 
> [call|https://github.com/apache/hive/blob/eb6f0b0c57dd55335927b7dde08cd47f4d00e74d/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/TaskRunnerCallable.java#L302]
> {code}
> FileSystem.closeAllForUGI
> {code}
> at the end of every task attempt, so we almost completely disable Hadoop's 
> filesystem cache during a long-running LLAP daemon's lifecycle.
> Some investigations on Azure showed that creating a filesystem can be quite 
> expensive, as it involves the recreation of a whole object hierarchy like:
> {code}
> AzureBlobFileSystem -> AzureBlobFileSystemStore -> AbfsClient -> 
> TokenProvider(MsiTokenProvider)
> {code}
> which ends up pinging Azure's token auth endpoint, leading to e.g. an HTTP 
> 429 response.
> We need to check whether we can remove this closeAllForUGI in LLAP, and 
> additionally check and remove all deleteOnExit calls that belong to Hadoop 
> FileSystem objects (this doesn't necessarily apply to java.io.File.deleteOnExit 
> calls):
> {code}
> grep -iRH "deleteOnExit" --include="*.java" | grep -v "test"
> ...
> ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java:// 
> in recent hadoop versions, use deleteOnExit to clean tmp files.
> ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java:
> autoDelete = fs.deleteOnExit(fsp.outPaths[filesIdx]);
> ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/util/PathInfo.java:
> fileSystem.deleteOnExit(dir);
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/RowContainer.java: 
>  parentDir.deleteOnExit();
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/RowContainer.java: 
>  tmpFile.deleteOnExit();
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/KeyValueContainer.java:
> parentDir.deleteOnExit();
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/KeyValueContainer.java:
> tmpFile.deleteOnExit();
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/ObjectContainer.java:  
>   tmpFile.deleteOnExit();
> ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java:
> autoDelete = 

[jira] [Commented] (HIVE-27884) LLAP: Reuse FileSystem objects from cache across different tasks in the same LLAP daemon

2023-11-20 Thread Jira


[ 
https://issues.apache.org/jira/browse/HIVE-27884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17787941#comment-17787941
 ] 

László Bodor commented on HIVE-27884:
-

minor note: it's time to reconsider using the autoDelete = fs.deleteOnExit() 
pattern

we tend to do something like:
1. autoDelete = fs.deleteOnExit
2. fs.deleteOnExit returns true if the file/dir exists and false if it doesn't
3. we later delete the file if !autoDelete - so only if it didn't exist when 
fs.deleteOnExit was called - instead of simply calling fs.delete on that file 
(ignoring whether it exists now or not)

my point is that checking the return value of deleteOnExit doesn't make any 
sense as a basis for deciding whether to delete files later :)



> LLAP: Reuse FileSystem objects from cache across different tasks in the same 
> LLAP daemon
> 
>
> Key: HIVE-27884
> URL: https://issues.apache.org/jira/browse/HIVE-27884
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>
> Originally, when the task runner was added in HIVE-10028 
> ([here|https://github.com/apache/hive/blob/23f40bd88043db3cb4efe3a763cbfd5c01a81d2f/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/TaskRunnerCallable.java#L202]),
> the FileSystem.closeAllForUGI call was commented out for some reason; then, 
> in the scope of HIVE-9898, it was simply added back, 
> [here|https://github.com/apache/hive/commit/91c46a44dd9fbb68d01f22e93c4ce0931a4598e0#diff-270dbe6639879ca543ae21c44a239af6145390726d45fee832be809894bfc88eR236]
> A FileSystem.close call basically does the 
> [following|https://github.com/apache/hadoop/blob/0c10bab7bb77aa4ea3ca26c899ab28131561e052/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java#L2700-L2710]:
> 1. deletes all paths that were marked as delete-on-exit
> 2. removes the instance from the cache
> I saw that we 
> [call|https://github.com/apache/hive/blob/eb6f0b0c57dd55335927b7dde08cd47f4d00e74d/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/TaskRunnerCallable.java#L302]
> {code}
> FileSystem.closeAllForUGI
> {code}
> at the end of every task attempt, so we almost completely disable Hadoop's 
> filesystem cache during a long-running LLAP daemon's lifecycle.
> Some investigations on Azure showed that creating a filesystem can be quite 
> expensive, as it involves the recreation of a whole object hierarchy like:
> {code}
> AzureBlobFileSystem -> AzureBlobFileSystemStore -> AbfsClient -> 
> TokenProvider(MsiTokenProvider)
> {code}
> which ends up pinging Azure's token auth endpoint, leading to e.g. an HTTP 
> 429 response.
> We need to check whether we can remove this closeAllForUGI in LLAP, and 
> additionally check and remove all deleteOnExit calls that belong to Hadoop 
> FileSystem objects (this doesn't necessarily apply to java.io.File.deleteOnExit 
> calls):
> {code}
> grep -iRH "deleteOnExit" --include="*.java" | grep -v "test"
> ...
> ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java:// 
> in recent hadoop versions, use deleteOnExit to clean tmp files.
> ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java:
> autoDelete = fs.deleteOnExit(fsp.outPaths[filesIdx]);
> ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/util/PathInfo.java:
> fileSystem.deleteOnExit(dir);
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/RowContainer.java: 
>  parentDir.deleteOnExit();
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/RowContainer.java: 
>  tmpFile.deleteOnExit();
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/KeyValueContainer.java:
> parentDir.deleteOnExit();
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/KeyValueContainer.java:
> tmpFile.deleteOnExit();
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/ObjectContainer.java:  
>   tmpFile.deleteOnExit();
> ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java:
> autoDelete = fs.deleteOnExit(outPath);
> {code}
> I believe deleteOnExit is fine if we don't want to bother with removing temp 
> files; however, these deletions should probably move to a more Hive-specific 
> scope if we really want to reuse cached filesystems safely.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work started] (HIVE-27884) LLAP: Reuse FileSystem objects from cache across different tasks in the same LLAP daemon

2023-11-20 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-27884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-27884 started by László Bodor.
---
> LLAP: Reuse FileSystem objects from cache across different tasks in the same 
> LLAP daemon
> 
>
> Key: HIVE-27884
> URL: https://issues.apache.org/jira/browse/HIVE-27884
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>
> Originally, when the task runner was added in HIVE-10028 
> ([here|https://github.com/apache/hive/blob/23f40bd88043db3cb4efe3a763cbfd5c01a81d2f/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/TaskRunnerCallable.java#L202]),
> the FileSystem.closeAllForUGI call was commented out for some reason; then, 
> in the scope of HIVE-9898, it was simply added back, 
> [here|https://github.com/apache/hive/commit/91c46a44dd9fbb68d01f22e93c4ce0931a4598e0#diff-270dbe6639879ca543ae21c44a239af6145390726d45fee832be809894bfc88eR236]
> A FileSystem.close call basically does the 
> [following|https://github.com/apache/hadoop/blob/0c10bab7bb77aa4ea3ca26c899ab28131561e052/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java#L2700-L2710]:
> 1. deletes all paths that were marked as delete-on-exit
> 2. removes the instance from the cache
> I saw that we 
> [call|https://github.com/apache/hive/blob/eb6f0b0c57dd55335927b7dde08cd47f4d00e74d/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/TaskRunnerCallable.java#L302]
> {code}
> FileSystem.closeAllForUGI
> {code}
> at the end of every task attempt, so we almost completely disable Hadoop's 
> filesystem cache during a long-running LLAP daemon's lifecycle.
> Some investigations on Azure showed that creating a filesystem can be quite 
> expensive, as it involves the recreation of a whole object hierarchy like:
> {code}
> AzureBlobFileSystem -> AzureBlobFileSystemStore -> AbfsClient -> 
> TokenProvider(MsiTokenProvider)
> {code}
> which ends up pinging Azure's token auth endpoint, leading to e.g. an HTTP 
> 429 response.
> We need to check whether we can remove this closeAllForUGI in LLAP, and 
> additionally check and remove all deleteOnExit calls that belong to Hadoop 
> FileSystem objects (this doesn't necessarily apply to java.io.File.deleteOnExit 
> calls):
> {code}
> grep -iRH "deleteOnExit" --include="*.java" | grep -v "test"
> ...
> ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java:// 
> in recent hadoop versions, use deleteOnExit to clean tmp files.
> ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java:
> autoDelete = fs.deleteOnExit(fsp.outPaths[filesIdx]);
> ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/util/PathInfo.java:
> fileSystem.deleteOnExit(dir);
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/RowContainer.java: 
>  parentDir.deleteOnExit();
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/RowContainer.java: 
>  tmpFile.deleteOnExit();
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/KeyValueContainer.java:
> parentDir.deleteOnExit();
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/KeyValueContainer.java:
> tmpFile.deleteOnExit();
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/ObjectContainer.java:  
>   tmpFile.deleteOnExit();
> ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java:
> autoDelete = fs.deleteOnExit(outPath);
> {code}
> I believe deleteOnExit is fine if we don't want to bother with removing temp 
> files; however, these deletions should probably move to a more Hive-specific 
> scope if we really want to reuse cached filesystems safely.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HIVE-27884) LLAP: Reuse FileSystem objects from cache across different tasks in the same LLAP daemon

2023-11-20 Thread Jira


[ 
https://issues.apache.org/jira/browse/HIVE-27884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17787907#comment-17787907
 ] 

László Bodor commented on HIVE-27884:
-

I just checked all the occurrences of *deleteOnExit*, and considering that:
 # only Hadoop's FileSystem.deleteOnExit matters in this scope 
(java.io.File.deleteOnExit doesn't)
 # only classes that run in the LLAP daemon matter (basically Operators and 
runtime components, not scratch dir / session dir / SessionState etc.)

these are the only ones I have to take care of:

{code}
ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java:
autoDelete = fs.deleteOnExit(fsp.outPaths[filesIdx]);
ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java:  
  autoDelete = fs.deleteOnExit(outPath);
ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/truncate/ColumnTruncateMapper.java:
  autoDelete = fs.deleteOnExit(outPath);
{code}

> LLAP: Reuse FileSystem objects from cache across different tasks in the same 
> LLAP daemon
> 
>
> Key: HIVE-27884
> URL: https://issues.apache.org/jira/browse/HIVE-27884
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>
> Originally, when the task runner was added to HIVE-10028 
> ([here|https://github.com/apache/hive/blob/23f40bd88043db3cb4efe3a763cbfd5c01a81d2f/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/TaskRunnerCallable.java#L202]),
>  the FileSystem.closeAllForUGI was commented out for some reasons, and then, 
> in the scope of HIVE-9898 it was simply added back, 
> [here|https://github.com/apache/hive/commit/91c46a44dd9fbb68d01f22e93c4ce0931a4598e0#diff-270dbe6639879ca543ae21c44a239af6145390726d45fee832be809894bfc88eR236]
> A FileSystem.close call basically does the 
> [following|https://github.com/apache/hadoop/blob/0c10bab7bb77aa4ea3ca26c899ab28131561e052/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java#L2700-L2710]:
> 1. delete all paths that were marked as delete-on-exit.
> 2. removes the instance from the cache
> I saw that we 
> [call|https://github.com/apache/hive/blob/eb6f0b0c57dd55335927b7dde08cd47f4d00e74d/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/TaskRunnerCallable.java#L302]
>  
> {code}
> FileSystem.closeAllForUGI
> {code}
> at the end of all task attempts, so we almost completely disable hadoop's 
> filesystem cache during a long-running LLAP daemon lifecycle
> some investigations on azure showed that creating a filesystem can be quite 
> expensive, as it involves the recreation of a whole object hierarchy like:
> {code}
> AzureBlobFileSystem -> AzureBlobFileSystemStore --> AbfsClient -> 
> TokenProvider(MsiTokenProvider)
> {code}
> which ends up pinging the token auth endpoint of azure, leading to e.g. a 
> HTTP response 429
> We need to check whether we can remove this closeAllForUGI in LLAP, 
> additionally check and remove all deleteOnExit calls that belong to hadoop 
> FileSystem objects (doesn't necessarily apply to java.io.File.deleteOnExit 
> calls):
> {code}
> grep -iRH "deleteOnExit" --include="*.java" | grep -v "test"
> ...
> ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java:// 
> in recent hadoop versions, use deleteOnExit to clean tmp files.
> ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java:
> autoDelete = fs.deleteOnExit(fsp.outPaths[filesIdx]);
> ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/util/PathInfo.java:
> fileSystem.deleteOnExit(dir);
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/RowContainer.java: 
>  parentDir.deleteOnExit();
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/RowContainer.java: 
>  tmpFile.deleteOnExit();
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/KeyValueContainer.java:
> parentDir.deleteOnExit();
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/KeyValueContainer.java:
> tmpFile.deleteOnExit();
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/ObjectContainer.java:  
>   tmpFile.deleteOnExit();
> ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java:
> autoDelete = fs.deleteOnExit(outPath);
> {code}
> I believe deleteOnExit is fine if we don't want to bother with removing temp 
> files, however, these deletions might want to go to a more hive-specific 
> scope if we want to really reuse cached filesystems in a safe manner.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-26986) A DAG created by OperatorGraph is not equal to the Tez DAG.

2023-11-20 Thread Denys Kuzmenko (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denys Kuzmenko updated HIVE-26986:
--
Target Version/s:   (was: 4.0.0)

> A DAG created by OperatorGraph is not equal to the Tez DAG.
> ---
>
> Key: HIVE-26986
> URL: https://issues.apache.org/jira/browse/HIVE-26986
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 4.0.0-alpha-2
>Reporter: Seonggon Namgung
>Assignee: Seonggon Namgung
>Priority: Major
>  Labels: pull-request-available
> Attachments: Query71 OperatorGraph.png, Query71 TezDAG.png
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> A DAG created by OperatorGraph is not equal to the corresponding DAG that is 
> submitted to Tez.
> Because of this problem, ParallelEdgeFixer reports a pair of normal edges as 
> a parallel edge.
> We observed this problem by comparing the OperatorGraph and the Tez DAG when 
> running TPC-DS query 71 on a 1TB ORC-format managed table.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-26986) A DAG created by OperatorGraph is not equal to the Tez DAG.

2023-11-20 Thread Denys Kuzmenko (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denys Kuzmenko updated HIVE-26986:
--
Labels: pull-request-available  (was: hive-4.0.0-must 
pull-request-available)

> A DAG created by OperatorGraph is not equal to the Tez DAG.
> ---
>
> Key: HIVE-26986
> URL: https://issues.apache.org/jira/browse/HIVE-26986
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 4.0.0-alpha-2
>Reporter: Seonggon Namgung
>Assignee: Seonggon Namgung
>Priority: Major
>  Labels: pull-request-available
> Attachments: Query71 OperatorGraph.png, Query71 TezDAG.png
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> A DAG created by OperatorGraph is not equal to the corresponding DAG that is 
> submitted to Tez.
> Because of this problem, ParallelEdgeFixer reports a pair of normal edges as 
> a parallel edge.
> We observed this problem by comparing the OperatorGraph and the Tez DAG when 
> running TPC-DS query 71 on a 1TB ORC-format managed table.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27893) NoSuchElementException issue when batchSize is set to 0 in PartitionIterable

2023-11-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-27893:
--
Labels: pull-request-available  (was: )

> NoSuchElementException issue when batchSize is set to 0 in PartitionIterable
> 
>
> Key: HIVE-27893
> URL: https://issues.apache.org/jira/browse/HIVE-27893
> Project: Hive
>  Issue Type: Bug
>Reporter: Vikram Ahuja
>Assignee: Vikram Ahuja
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-27893) NoSuchElementException when batchSize is set to 0 in PartitionIterable

2023-11-20 Thread Vikram Ahuja (Jira)
Vikram Ahuja created HIVE-27893:
---

 Summary: NoSuchElementException when batchSize is set to 0 in 
PartitionIterable
 Key: HIVE-27893
 URL: https://issues.apache.org/jira/browse/HIVE-27893
 Project: Hive
  Issue Type: Bug
Reporter: Vikram Ahuja
Assignee: Vikram Ahuja






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27893) NoSuchElementException issue when batchSize is set to 0 in PartitionIterable

2023-11-20 Thread Vikram Ahuja (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Ahuja updated HIVE-27893:

Summary: NoSuchElementException issue when batchSize is set to 0 in 
PartitionIterable  (was: NoSuchElementException when batchSize is set to 0 in 
PartitionIterable)

> NoSuchElementException issue when batchSize is set to 0 in PartitionIterable
> 
>
> Key: HIVE-27893
> URL: https://issues.apache.org/jira/browse/HIVE-27893
> Project: Hive
>  Issue Type: Bug
>Reporter: Vikram Ahuja
>Assignee: Vikram Ahuja
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HIVE-27884) LLAP: Reuse FileSystem objects from cache across different tasks in the same LLAP daemon

2023-11-20 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-27884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor reassigned HIVE-27884:
---

Assignee: László Bodor

> LLAP: Reuse FileSystem objects from cache across different tasks in the same 
> LLAP daemon
> 
>
> Key: HIVE-27884
> URL: https://issues.apache.org/jira/browse/HIVE-27884
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>
> Originally, when the task runner was added in HIVE-10028 
> ([here|https://github.com/apache/hive/blob/23f40bd88043db3cb4efe3a763cbfd5c01a81d2f/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/TaskRunnerCallable.java#L202]),
> the FileSystem.closeAllForUGI call was commented out for some reason; then, 
> in the scope of HIVE-9898, it was simply added back, 
> [here|https://github.com/apache/hive/commit/91c46a44dd9fbb68d01f22e93c4ce0931a4598e0#diff-270dbe6639879ca543ae21c44a239af6145390726d45fee832be809894bfc88eR236]
> A FileSystem.close call basically does the 
> [following|https://github.com/apache/hadoop/blob/0c10bab7bb77aa4ea3ca26c899ab28131561e052/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java#L2700-L2710]:
> 1. deletes all paths that were marked as delete-on-exit
> 2. removes the instance from the cache
> I saw that we 
> [call|https://github.com/apache/hive/blob/eb6f0b0c57dd55335927b7dde08cd47f4d00e74d/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/TaskRunnerCallable.java#L302]
> {code}
> FileSystem.closeAllForUGI
> {code}
> at the end of every task attempt, so we almost completely disable Hadoop's 
> filesystem cache during a long-running LLAP daemon's lifecycle.
> Some investigations on Azure showed that creating a filesystem can be quite 
> expensive, as it involves the recreation of a whole object hierarchy like:
> {code}
> AzureBlobFileSystem -> AzureBlobFileSystemStore -> AbfsClient -> 
> TokenProvider(MsiTokenProvider)
> {code}
> which ends up pinging Azure's token auth endpoint, leading to e.g. an HTTP 
> 429 response.
> We need to check whether we can remove this closeAllForUGI in LLAP, and 
> additionally check and remove all deleteOnExit calls that belong to Hadoop 
> FileSystem objects (this doesn't necessarily apply to java.io.File.deleteOnExit 
> calls):
> {code}
> grep -iRH "deleteOnExit" --include="*.java" | grep -v "test"
> ...
> ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java:// 
> in recent hadoop versions, use deleteOnExit to clean tmp files.
> ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java:
> autoDelete = fs.deleteOnExit(fsp.outPaths[filesIdx]);
> ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/util/PathInfo.java:
> fileSystem.deleteOnExit(dir);
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/RowContainer.java: 
>  parentDir.deleteOnExit();
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/RowContainer.java: 
>  tmpFile.deleteOnExit();
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/KeyValueContainer.java:
> parentDir.deleteOnExit();
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/KeyValueContainer.java:
> tmpFile.deleteOnExit();
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/ObjectContainer.java:  
>   tmpFile.deleteOnExit();
> ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java:
> autoDelete = fs.deleteOnExit(outPath);
> {code}
> I believe deleteOnExit is fine if we don't want to bother with removing temp 
> files; however, these deletions should probably move to a more Hive-specific 
> scope if we really want to reuse cached filesystems safely.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27881) Introduce hive javaagents to implement pluggable instrumentation

2023-11-20 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-27881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HIVE-27881:

Summary: Introduce hive javaagents to implement pluggable instrumentation  
(was: Introduce hive-agents module to implement javaagents for trusted 
instrumentation)

> Introduce hive javaagents to implement pluggable instrumentation
> 
>
> Key: HIVE-27881
> URL: https://issues.apache.org/jira/browse/HIVE-27881
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Priority: Major
>
> From time to time we face issues where only some runtime magic would help us 
> investigate the problem, like agents or an aspect-oriented approach.
> I can recall the following jiras:
> HIVE-25806: a socket leak that was finally investigated with 
> https://github.com/jenkinsci/lib-file-leak-detector/
> HIVE-26985: an idea about tracking Hive objects, which generated an argument 
> about how to achieve that
> HIVE-27875: a socket leak again, which turned out to be already solved by 
> HIVE-25736 upstream; I had just missed this patch downstream
> Basically, using an agent means 2 things:
> 1) having the agent jar on the local filesystem wherever hive components run
> 2) adding a javaagent clause to the JVM options
> 2) should be possible anytime, that's how we configure our products, right? 
> but 1) is simply not possible in containerized environments: even if I can 
> create an image and convince a customer to use it, that's a security concern: 
> why would they use an unknown/unofficial image contaminated by an unknown 
> agent (like lib-file-leak-detector above)?
> Using agents is a good way to instrument our code on demand, and it's crucial 
> to make it easily pluggable; otherwise we're going to face performance 
> problems (guess what happens if you watch and instrument every single socket 
> and save their traces by default in your product :) )
> I think a set of instrumentation functionalities can be added to a hive 
> module, which then leads to a hive-agents.jar that by default can be included 
> in any hive component's JVM args. The javaagent command-line args will then 
> drive which instrumentation we really want to turn on, like:
> {code}
>  -javaagent:/lib/hive-agents-x.y.jar=socket-leak-detector,config-detector
> {code}
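
As a hedged sketch of how such a pluggable agent entry point could look (the package, class, and transformer names are assumptions, not an existing Hive module), the premain can dispatch on the comma-separated javaagent argument:

{code}
package org.apache.hive.agents; // hypothetical module

import java.lang.instrument.Instrumentation;

public final class HiveAgent {
  // Invoked by the JVM for -javaagent:hive-agents-x.y.jar=feature1,feature2
  public static void premain(String agentArgs, Instrumentation inst) {
    if (agentArgs == null || agentArgs.isEmpty()) {
      return; // the jar is wired in, but no instrumentation was requested
    }
    for (String feature : agentArgs.split(",")) {
      switch (feature.trim()) {
        case "socket-leak-detector":
          // e.g. inst.addTransformer(new SocketLeakTransformer());
          break;
        case "config-detector":
          // e.g. inst.addTransformer(new ConfigAccessTransformer());
          break;
        default:
          System.err.println("hive-agents: unknown feature: " + feature);
      }
    }
  }
}
{code}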



--
This message was sent by Atlassian Jira
(v8.20.10#820010)