[jira] [Work logged] (HIVE-24188) CTLT from MM to External fails because table txn properties are not skipped
[ https://issues.apache.org/jira/browse/HIVE-24188?focusedWorklogId=487942&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487942 ] ASF GitHub Bot logged work on HIVE-24188: - Author: ASF GitHub Bot Created on: 22/Sep/20 05:01 Start Date: 22/Sep/20 05:01 Worklog Time Spent: 10m Work Description: nareshpr opened a new pull request #1516: URL: https://github.com/apache/hive/pull/1516 What changes were proposed in this pull request? Included a check to skip TXN tblproperties when creating an external table from an MM table. Why are the changes needed? CTLT from MM to External is failing. Does this PR introduce any user-facing change? No. How was this patch tested? Using the repro SQL; included it as part of the test case. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 487942) Remaining Estimate: 0h Time Spent: 10m > CTLT from MM to External fails because table txn properties are not skipped > --- > > Key: HIVE-24188 > URL: https://issues.apache.org/jira/browse/HIVE-24188 > Project: Hive > Issue Type: Bug >Reporter: Naresh P R >Assignee: Naresh P R >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Repro steps > > {code:java} > set hive.support.concurrency=true; > set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager; > create table test_mm(age int, name string) partitioned by(dept string) stored > as orc tblproperties('transactional'='true', > 'transactional_properties'='default'); > create external table test_external like test_mm LOCATION > '${system:test.tmp.dir}/create_like_mm_to_external'; > {code} > Fails with below exception > {code:java} > Error: Error while processing statement: FAILED: Execution Error, return code > 1 from org.apache.hadoop.hive.ql.exec.DDLTask. 
> MetaException(message:default.test_external cannot be declared transactional > because it's an external table) (state=08S01,code=1){code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
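The gist of the fix can be sketched standalone (the helper name and signature here are hypothetical; the actual change lives in Hive's CREATE TABLE LIKE handling):

```java
import java.util.HashMap;
import java.util.Map;

public class CtltPropsSketch {
    // Hypothetical helper: when CREATE TABLE LIKE targets an EXTERNAL table,
    // drop the source MM table's txn properties so the metastore does not
    // reject the new table with "cannot be declared transactional".
    static Map<String, String> propsForNewTable(Map<String, String> sourceProps,
                                                boolean newTableIsExternal) {
        Map<String, String> props = new HashMap<>(sourceProps);
        if (newTableIsExternal) {
            props.remove("transactional");
            props.remove("transactional_properties");
        }
        return props;
    }

    public static void main(String[] args) {
        Map<String, String> src = new HashMap<>();
        src.put("transactional", "true");
        src.put("transactional_properties", "default");
        src.put("orc.compress", "ZLIB");

        Map<String, String> external = propsForNewTable(src, true);
        System.out.println(external.containsKey("transactional")); // false
        System.out.println(external.get("orc.compress"));          // ZLIB
    }
}
```

Non-txn properties (storage format, compression, etc.) still carry over; only the transactional flags are filtered.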
[jira] [Updated] (HIVE-24188) CTLT from MM to External fails because table txn properties are not skipped
[ https://issues.apache.org/jira/browse/HIVE-24188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-24188: -- Labels: pull-request-available (was: ) > CTLT from MM to External fails because table txn properties are not skipped > --- > > Key: HIVE-24188 > URL: https://issues.apache.org/jira/browse/HIVE-24188 > Project: Hive > Issue Type: Bug >Reporter: Naresh P R >Assignee: Naresh P R >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24187) Handle _files creation for HA config with same nameservice name on source and destination
[ https://issues.apache.org/jira/browse/HIVE-24187?focusedWorklogId=487930&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487930 ] ASF GitHub Bot logged work on HIVE-24187: - Author: ASF GitHub Bot Created on: 22/Sep/20 04:25 Start Date: 22/Sep/20 04:25 Worklog Time Spent: 10m Work Description: aasha commented on a change in pull request #1515: URL: https://github.com/apache/hive/pull/1515#discussion_r492467429 ## File path: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosAcrossInstances.java ## @@ -1963,4 +2062,12 @@ private void setupUDFJarOnHDFS(Path identityUdfLocalPath, Path identityUdfHdfsPa FileSystem fs = primary.miniDFSCluster.getFileSystem(); fs.copyFromLocalFile(identityUdfLocalPath, identityUdfHdfsPath); } + + private List getHdfsNamespaceClause() { Review comment: replace with nameservice This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 487930) Time Spent: 40m (was: 0.5h) > Handle _files creation for HA config with same nameservice name on source and > destination > - > > Key: HIVE-24187 > URL: https://issues.apache.org/jira/browse/HIVE-24187 > Project: Hive > Issue Type: Improvement >Reporter: Pravin Sinha >Assignee: Pravin Sinha >Priority: Major > Labels: pull-request-available > Attachments: HIVE-24187.01.patch > > Time Spent: 40m > Remaining Estimate: 0h > > Currently HA is supported only for different nameservices on Source and > Destination. We need to add support for the same nameservice on Source and > Destination. > The local nameservice will be passed correctly to the repl command. > The remote nameservice will be an arbitrary name, with corresponding configs for > it. 
> Example: > Clusters originally configured with ns for hdfs: > src: ns1 > target : ns1 > We can denote the remote nameservice with some arbitrary name, for example: nsRemote. > This is how the command will see the ns w.r.t source and target: > Repl Dump : src: ns1, target: nsRemote > Repl Load: src: nsRemote, target: ns1 > Entries in the _files (for managed table data loc) will be made with nsRemote > instead of ns1 (for src). > Example: > hdfs://nsRemote/whLoc/dbName.db/table1:checksum:subDir:hdfs://nsRemote/cmroot > The same way, the list of external table data locations will also be modified using > nsRemote instead of ns1 (for src). > New configs can control the behavior: > *hive.repl.ha.datapath.replace.remote.nameservice = * > *hive.repl.ha.datapath.replace.remote.nameservice.name = * > Based on the above configs, the nameservice replacement can be done. > This will also require that 'hive.repl.rootdir' is passed accordingly during > dump and load: > Repl dump: > ||Repl Operation||Repl Command|| > |*Staging on source cluster*| > |Repl Dump|repl dump dbName with('hive.repl.rootdir'='hdfs://ns1/stagingLoc')| > |Repl Load|repl load dbName into dbName > with('hive.repl.rootdir'='hdfs://nsRemote/stagingLoc')| > |*Staging on target cluster*| > |Repl Dump|repl dump dbName > with('hive.repl.rootdir'='hdfs://nsRemote/stagingLoc')| > |Repl Load|repl load dbName into dbName > with('hive.repl.rootdir'='hdfs://ns1/stagingLoc')| -- This message was sent by Atlassian Jira (v8.3.4#803005)
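The nameservice substitution described above can be sketched minimally (illustrative only; the real change touches ReplChangeManager and the _files writers, and the method name here is hypothetical):

```java
public class NameServiceSketch {
    // Swap the nameservice authority of an HDFS URI, as the proposed
    // hive.repl.ha.datapath.replace.remote.nameservice.name config would do
    // for _files entries and external-table data locations.
    static String replaceNameService(String hdfsUri, String remoteNs) {
        return hdfsUri.replaceFirst("^hdfs://[^/]+", "hdfs://" + remoteNs);
    }

    public static void main(String[] args) {
        // ns1 on the source is rewritten to the agreed remote alias nsRemote
        System.out.println(
            replaceNameService("hdfs://ns1/whLoc/dbName.db/table1", "nsRemote"));
        // hdfs://nsRemote/whLoc/dbName.db/table1
    }
}
```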
[jira] [Assigned] (HIVE-24188) CTLT from MM to External fails because table txn properties are not skipped
[ https://issues.apache.org/jira/browse/HIVE-24188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naresh P R reassigned HIVE-24188: - > CTLT from MM to External fails because table txn properties are not skipped > --- > > Key: HIVE-24188 > URL: https://issues.apache.org/jira/browse/HIVE-24188 > Project: Hive > Issue Type: Bug >Reporter: Naresh P R >Assignee: Naresh P R >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23637) Fix FindBug issues in hive-cli
[ https://issues.apache.org/jira/browse/HIVE-23637?focusedWorklogId=487667&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487667 ] ASF GitHub Bot logged work on HIVE-23637: - Author: ASF GitHub Bot Created on: 22/Sep/20 03:25 Start Date: 22/Sep/20 03:25 Worklog Time Spent: 10m Work Description: github-actions[bot] closed pull request #1162: URL: https://github.com/apache/hive/pull/1162 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 487667) Time Spent: 40m (was: 0.5h) > Fix FindBug issues in hive-cli > -- > > Key: HIVE-23637 > URL: https://issues.apache.org/jira/browse/HIVE-23637 > Project: Hive > Issue Type: Sub-task >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Attachments: spotbugsXml.xml > > Time Spent: 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24163) Dynamic Partitioning Insert fail for MM table fail during MoveTask
[ https://issues.apache.org/jira/browse/HIVE-24163?focusedWorklogId=487658&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487658 ] ASF GitHub Bot logged work on HIVE-24163: - Author: ASF GitHub Bot Created on: 22/Sep/20 03:25 Start Date: 22/Sep/20 03:25 Worklog Time Spent: 10m Work Description: kuczoram commented on pull request #1507: URL: https://github.com/apache/hive/pull/1507#issuecomment-696209500 The file listing in the Utilities.getFullDPSpecs method was not correct for MM tables and for ACID tables when direct insert was on. This method returned all partitions from these tables, not just the ones affected by the current query. Because of this, the lineage information for inserts with dynamic partitioning into such tables was not correct. I compared the lineage information with inserts into external tables: for external tables, only the partitions affected by the query are present. This is because for external tables the data is first written into the staging dir, and when listing the partitions this directory is checked and it contains only the newly inserted data. But for MM tables and ACID direct insert, the staging dir is missing, so the table directory is checked and everything in it gets listed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 487658) Time Spent: 0.5h (was: 20m) > Dynamic Partitioning Insert fail for MM table fail during MoveTask > -- > > Key: HIVE-24163 > URL: https://issues.apache.org/jira/browse/HIVE-24163 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Rajkumar Singh >Assignee: Marta Kuczora >Priority: Major > Labels: pull-request-available > Fix For: 3.1.2 > > Time Spent: 0.5h > Remaining Estimate: 0h > > -- DDLs and Query > {code:java} > create table `class` (name varchar(8), sex varchar(1), age double precision, > height double precision, weight double precision); > insert into table class values ('RAJ','MALE',28,12,12); > CREATE TABLE `PART1` (`id` DOUBLE,`N` DOUBLE,`Name` VARCHAR(8),`Sex` > VARCHAR(1)) PARTITIONED BY(Weight string, Age > string, Height string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001' > LINES TERMINATED BY '\012' STORED AS TEXTFILE; > INSERT INTO TABLE `part1` PARTITION (`Weight`,`Age`,`Height`) SELECT 0, 0, > `Name`,`Sex`,`Weight`,`Age`,`Height` FROM `class`; > {code} > It fails during the MoveTask execution: > {code:java} > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: partition > hdfs://hostname:8020/warehouse/tablespace/managed/hive/part1/.hive-staging_hive_2020-09-02_13-29-58_765_4475282758764123921-1/-ext-1/tmpstats-0_FS_3 > is not a directory! 
> at > org.apache.hadoop.hive.ql.metadata.Hive.getValidPartitionsInPath(Hive.java:2769) > ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] > at > org.apache.hadoop.hive.ql.metadata.Hive.loadDynamicPartitions(Hive.java:2837) > ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] > at > org.apache.hadoop.hive.ql.exec.MoveTask.handleDynParts(MoveTask.java:562) > ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] > at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:440) > ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213) > ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) > ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] > at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:359) > ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] > at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330) > ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] > at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246) > ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] > at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109) > ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:721) > ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] > at
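The staging-dir point in kuczoram's comment can be illustrated with a toy version of the partition-listing step (names hypothetical; the real logic is in Utilities.getFullDPSpecs and Hive.loadDynamicPartitions):

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class DpListingSketch {
    // Toy sketch: the partition specs touched by a query are recovered from
    // the directories the query wrote. Listing a per-query staging dir yields
    // only the affected partitions; listing the whole table dir (what
    // happened for MM / ACID direct insert) yields every partition present.
    static Set<String> affectedPartitions(List<String> writtenFiles) {
        Set<String> specs = new LinkedHashSet<>();
        for (String p : writtenFiles) {
            specs.add(p.substring(0, p.lastIndexOf('/'))); // drop the file name
        }
        return specs;
    }

    public static void main(String[] args) {
        // Two files written under one dynamic partition -> one partition spec
        List<String> staging = List.of(
            "weight=12/age=28/height=12/000000_0",
            "weight=12/age=28/height=12/000001_0");
        System.out.println(affectedPartitions(staging));
        // [weight=12/age=28/height=12]
    }
}
```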
[jira] [Work logged] (HIVE-24179) Memory leak in HS2 DbTxnManager when compiling SHOW LOCKS statement
[ https://issues.apache.org/jira/browse/HIVE-24179?focusedWorklogId=487718&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487718 ] ASF GitHub Bot logged work on HIVE-24179: - Author: ASF GitHub Bot Created on: 22/Sep/20 03:29 Start Date: 22/Sep/20 03:29 Worklog Time Spent: 10m Work Description: deniskuzZ commented on a change in pull request #1509: URL: https://github.com/apache/hive/pull/1509#discussion_r491922128 ## File path: ql/src/java/org/apache/hadoop/hive/ql/ddl/table/lock/show/ShowDbLocksAnalyzer.java ## @@ -47,14 +47,16 @@ public void analyzeInternal(ASTNode root) throws SemanticException { String dbName = stripQuotes(root.getChild(0).getText()); boolean isExtended = (root.getChildCount() > 1); -HiveTxnManager txnManager = null; +boolean useNewLocksFormat; try { - txnManager = TxnManagerFactory.getTxnManagerFactory().getTxnManager(conf); + HiveTxnManager txnManager = TxnManagerFactory.getTxnManagerFactory().getTxnManager(conf); Review comment: Do we create txnManager instance here just to get the value of useNewLocksFormat flag? 
## File path: ql/src/java/org/apache/hadoop/hive/ql/ddl/table/lock/show/ShowDbLocksAnalyzer.java ## @@ -47,14 +47,16 @@ public void analyzeInternal(ASTNode root) throws SemanticException { String dbName = stripQuotes(root.getChild(0).getText()); boolean isExtended = (root.getChildCount() > 1); -HiveTxnManager txnManager = null; +boolean useNewLocksFormat; try { - txnManager = TxnManagerFactory.getTxnManagerFactory().getTxnManager(conf); + HiveTxnManager txnManager = TxnManagerFactory.getTxnManagerFactory().getTxnManager(conf); Review comment: TxnManager is initialized before query compilation and has a session scope. The correct way to access it is via SessionState. Looks like ShowDbLocksAnalyzer was always creating a new instance of TxnManager. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 487718) Time Spent: 1.5h (was: 1h 20m) > Memory leak in HS2 DbTxnManager when compiling SHOW LOCKS statement > --- > > Key: HIVE-24179 > URL: https://issues.apache.org/jira/browse/HIVE-24179 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Stamatis Zampetakis >Assignee: Stamatis Zampetakis >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: summary.png > > Time Spent: 1.5h > Remaining Estimate: 0h > > The problem can be reproduced by executing repeatedly a SHOW LOCK statement > and monitoring the heap memory of HS2. For a small heap (e.g., 2g) it only > takes a few minutes before the server crashes with OutOfMemory error such as > the one shown below. > {noformat} > java.lang.OutOfMemoryError: GC overhead limit exceeded > at
[jira] [Work logged] (HIVE-23900) Replace Base64 in exec Package
[ https://issues.apache.org/jira/browse/HIVE-23900?focusedWorklogId=487908&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487908 ] ASF GitHub Bot logged work on HIVE-23900: - Author: ASF GitHub Bot Created on: 22/Sep/20 03:45 Start Date: 22/Sep/20 03:45 Worklog Time Spent: 10m Work Description: github-actions[bot] commented on pull request #1297: URL: https://github.com/apache/hive/pull/1297#issuecomment-696455652 This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Feel free to reach out on the d...@hive.apache.org list if the patch is in need of reviews. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 487908) Time Spent: 0.5h (was: 20m) > Replace Base64 in exec Package > -- > > Key: HIVE-23900 > URL: https://issues.apache.org/jira/browse/HIVE-23900 > Project: Hive > Issue Type: Sub-task >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Minor > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
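The ticket body gives no detail, but the sub-task title suggests migrating from a third-party Base64 helper to the JDK's built-in java.util.Base64 (available since Java 8); a minimal sketch of the target API:

```java
import java.util.Base64;

public class Base64Sketch {
    public static void main(String[] args) {
        // Encode and round-trip with the JDK codec; no external dependency.
        String encoded = Base64.getEncoder().encodeToString("hive".getBytes());
        System.out.println(encoded);                                   // aGl2ZQ==
        System.out.println(new String(Base64.getDecoder().decode(encoded))); // hive
    }
}
```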
[jira] [Work logged] (HIVE-24187) Handle _files creation for HA config with same nameservice name on source and destination
[ https://issues.apache.org/jira/browse/HIVE-24187?focusedWorklogId=487928=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487928 ] ASF GitHub Bot logged work on HIVE-24187: - Author: ASF GitHub Bot Created on: 22/Sep/20 04:00 Start Date: 22/Sep/20 04:00 Worklog Time Spent: 10m Work Description: aasha commented on a change in pull request #1515: URL: https://github.com/apache/hive/pull/1515#discussion_r492461348 ## File path: standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/ReplChangeManager.java ## @@ -424,6 +424,20 @@ public String encodeFileUri(String fileUriStr, String fileChecksum, String encod return encodedUri; } + public static String encodeFileUri(String fileUriStr, String fileChecksum, String cmroot, String encodedSubDir) { +String encodedUri = fileUriStr; +if ((fileChecksum != null) && (cmroot != null)) { + encodedUri = encodedUri + URI_FRAGMENT_SEPARATOR + fileChecksum + URI_FRAGMENT_SEPARATOR + cmroot; +} else { + encodedUri = encodedUri + URI_FRAGMENT_SEPARATOR + URI_FRAGMENT_SEPARATOR; Review comment: why do we have 2 URI_FRAGMENT_SEPARATOR ## File path: standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/ReplChangeManager.java ## @@ -424,6 +424,20 @@ public String encodeFileUri(String fileUriStr, String fileChecksum, String encod return encodedUri; } + public static String encodeFileUri(String fileUriStr, String fileChecksum, String cmroot, String encodedSubDir) { +String encodedUri = fileUriStr; +if ((fileChecksum != null) && (cmroot != null)) { + encodedUri = encodedUri + URI_FRAGMENT_SEPARATOR + fileChecksum + URI_FRAGMENT_SEPARATOR + cmroot; +} else { + encodedUri = encodedUri + URI_FRAGMENT_SEPARATOR + URI_FRAGMENT_SEPARATOR; +} +encodedUri = encodedUri + URI_FRAGMENT_SEPARATOR + ((encodedSubDir != null) ? encodedSubDir : ""); +if (LOG.isDebugEnabled()) { Review comment: Do we need this check? 
## File path: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java ## @@ -522,6 +522,14 @@ private static void populateLlapDaemonVarsSet(Set llapDaemonVarsSetLocal REPLCMINTERVAL("hive.repl.cm.interval","3600s", new TimeValidator(TimeUnit.SECONDS), "Inteval for cmroot cleanup thread."), + REPL_HA_DATAPATH_REPLACE_REMOTE_NAMESERVICE("hive.repl.ha.datapath.replace.remote.nameservice", false, +"When HDFS is HA enabled and both source and target clusters are configured with same nameservice names," + +"enable this flag and provide a "), Review comment: sentence is incomplete ## File path: standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/ReplChangeManager.java ## @@ -424,6 +424,20 @@ public String encodeFileUri(String fileUriStr, String fileChecksum, String encod return encodedUri; } + public static String encodeFileUri(String fileUriStr, String fileChecksum, String cmroot, String encodedSubDir) { +String encodedUri = fileUriStr; +if ((fileChecksum != null) && (cmroot != null)) { Review comment: empty check not needed? ## File path: ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/Utils.java ## @@ -72,6 +76,40 @@ public static void writeOutput(List> listValues, Path outputFile, H writeOutput(listValues, outputFile, hiveConf, false); } + /** + * Given a ReplChangeManger's encoded uri, replaces the namespace and returns the modified encoded uri. + */ + public static String replaceNameSpaceInEncodedURI(String cmEncodedURI, HiveConf hiveConf) throws SemanticException { Review comment: replace name service? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 487928) Time Spent: 0.5h (was: 20m) > Handle _files creation for HA config with same nameservice name on source and > destination > - > > Key: HIVE-24187 > URL: https://issues.apache.org/jira/browse/HIVE-24187 > Project: Hive > Issue Type: Improvement >Reporter: Pravin Sinha >Assignee: Pravin Sinha >Priority: Major > Labels: pull-request-available > Attachments: HIVE-24187.01.patch > > Time Spent: 0.5h > Remaining Estimate: 0h > > Current HA is supported only
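On the two-separator question in the review above: the encoded fields are positional, so an absent checksum/cmroot pair still contributes its separators and a later split finds subDir at the same index either way. A standalone sketch of the quoted encodeFileUri logic (assuming URI_FRAGMENT_SEPARATOR is "#"):

```java
public class EncodeUriSketch {
    // Assumed value of ReplChangeManager.URI_FRAGMENT_SEPARATOR
    static final String SEP = "#";

    // Mirrors the quoted encodeFileUri: emit uri#checksum#cmroot#subDir,
    // keeping empty slots when checksum/cmroot are absent so the layout
    // stays positional and parseable.
    static String encode(String uri, String checksum, String cmroot, String subDir) {
        String encoded = uri;
        if (checksum != null && cmroot != null) {
            encoded = encoded + SEP + checksum + SEP + cmroot;
        } else {
            encoded = encoded + SEP + SEP;
        }
        return encoded + SEP + (subDir != null ? subDir : "");
    }

    public static void main(String[] args) {
        System.out.println(encode("hdfs://ns/t1", "abc123", "hdfs://ns/cmroot", "sub"));
        // hdfs://ns/t1#abc123#hdfs://ns/cmroot#sub
        System.out.println(encode("hdfs://ns/t1", null, null, "sub"));
        // hdfs://ns/t1###sub  -> split("#") still finds "sub" at index 3
    }
}
```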
[jira] [Work logged] (HIVE-23793) Review of QueryInfo Class
[ https://issues.apache.org/jira/browse/HIVE-23793?focusedWorklogId=487648&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487648 ] ASF GitHub Bot logged work on HIVE-23793: - Author: ASF GitHub Bot Created on: 22/Sep/20 03:24 Start Date: 22/Sep/20 03:24 Worklog Time Spent: 10m Work Description: github-actions[bot] closed pull request #1197: URL: https://github.com/apache/hive/pull/1197 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 487648) Time Spent: 2h 50m (was: 2h 40m) > Review of QueryInfo Class > - > > Key: HIVE-23793 > URL: https://issues.apache.org/jira/browse/HIVE-23793 > Project: Hive > Issue Type: Improvement >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Minor > Labels: pull-request-available > Time Spent: 2h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-16490) Hive should not use private HDFS APIs for encryption
[ https://issues.apache.org/jira/browse/HIVE-16490?focusedWorklogId=487725=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487725 ] ASF GitHub Bot logged work on HIVE-16490: - Author: ASF GitHub Bot Created on: 22/Sep/20 03:30 Start Date: 22/Sep/20 03:30 Worklog Time Spent: 10m Work Description: github-actions[bot] commented on pull request #1279: URL: https://github.com/apache/hive/pull/1279#issuecomment-695860130 This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Feel free to reach out on the d...@hive.apache.org list if the patch is in need of reviews. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 487725) Time Spent: 1h 40m (was: 1.5h) > Hive should not use private HDFS APIs for encryption > > > Key: HIVE-16490 > URL: https://issues.apache.org/jira/browse/HIVE-16490 > Project: Hive > Issue Type: Improvement > Components: Encryption >Affects Versions: 2.2.0 >Reporter: Andrew Wang >Assignee: Naveen Gangam >Priority: Critical > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 1h 40m > Remaining Estimate: 0h > > When compiling against bleeding edge versions of Hive and Hadoop, we > discovered that HIVE-16047 references a private HDFS API, DFSClient, to get > at various encryption related information. The private API was recently > changed by HADOOP-14104, which broke Hive compilation. > It'd be better to instead use publicly supported APIs. HDFS-11687 has been > filed to add whatever encryption APIs are needed by Hive. This JIRA is to > move Hive over to these new APIs. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24179) Memory leak in HS2 DbTxnManager when compiling SHOW LOCKS statement
[ https://issues.apache.org/jira/browse/HIVE-24179?focusedWorklogId=487707=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487707 ] ASF GitHub Bot logged work on HIVE-24179: - Author: ASF GitHub Bot Created on: 22/Sep/20 03:29 Start Date: 22/Sep/20 03:29 Worklog Time Spent: 10m Work Description: zabetak commented on a change in pull request #1509: URL: https://github.com/apache/hive/pull/1509#discussion_r492225545 ## File path: ql/src/java/org/apache/hadoop/hive/ql/ddl/table/lock/show/ShowDbLocksAnalyzer.java ## @@ -47,14 +47,16 @@ public void analyzeInternal(ASTNode root) throws SemanticException { String dbName = stripQuotes(root.getChild(0).getText()); boolean isExtended = (root.getChildCount() > 1); -HiveTxnManager txnManager = null; +boolean useNewLocksFormat; try { - txnManager = TxnManagerFactory.getTxnManagerFactory().getTxnManager(conf); + HiveTxnManager txnManager = TxnManagerFactory.getTxnManagerFactory().getTxnManager(conf); Review comment: That's correct. Normally, we should (but I am not 100% sure) have a TxnManager at this point so there is no need to create a new one just to obtain the flag. I pushed commit https://github.com/apache/hive/pull/1509/commits/297882ee80d52689a9cc1c68da9f7580918439bb to try this out. ## File path: ql/src/java/org/apache/hadoop/hive/ql/ddl/table/lock/show/ShowDbLocksAnalyzer.java ## @@ -47,14 +47,16 @@ public void analyzeInternal(ASTNode root) throws SemanticException { String dbName = stripQuotes(root.getChild(0).getText()); boolean isExtended = (root.getChildCount() > 1); -HiveTxnManager txnManager = null; +boolean useNewLocksFormat; try { - txnManager = TxnManagerFactory.getTxnManagerFactory().getTxnManager(conf); + HiveTxnManager txnManager = TxnManagerFactory.getTxnManagerFactory().getTxnManager(conf); Review comment: Did you check the last commit? Do you have something else in mind? This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 487707) Time Spent: 1h 20m (was: 1h 10m) > Memory leak in HS2 DbTxnManager when compiling SHOW LOCKS statement > --- > > Key: HIVE-24179 > URL: https://issues.apache.org/jira/browse/HIVE-24179 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Stamatis Zampetakis >Assignee: Stamatis Zampetakis >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: summary.png > > Time Spent: 1h 20m > Remaining Estimate: 0h > > The problem can be reproduced by executing repeatedly a SHOW LOCK statement > and monitoring the heap memory of HS2. For a small heap (e.g., 2g) it only > takes a few minutes before the server crashes with OutOfMemory error such as > the one shown below. > {noformat} > java.lang.OutOfMemoryError: GC overhead limit exceeded > at java.util.Arrays.copyOf(Arrays.java:3332) > at > java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124) > at > java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448) > at java.lang.StringBuilder.append(StringBuilder.java:136) > at > org.apache.maven.surefire.booter.ForkedChannelEncoder.encodeMessage(ForkedChannelEncoder.j > at > org.apache.maven.surefire.booter.ForkedChannelEncoder.setOutErr(ForkedChannelEncoder.java: > at > org.apache.maven.surefire.booter.ForkedChannelEncoder.stdErr(ForkedChannelEncoder.java:166 > at > org.apache.maven.surefire.booter.ForkingRunListener.writeTestOutput(ForkingRunListener.jav > at > org.apache.maven.surefire.report.ConsoleOutputCapture$ForwardingPrintStream.write(ConsoleO > at > org.apache.logging.log4j.core.util.CloseShieldOutputStream.write(CloseShieldOutputStream.j > at > org.apache.logging.log4j.core.appender.OutputStreamManager.writeToDestination(OutputStream > at > 
org.apache.logging.log4j.core.appender.OutputStreamManager.flushBuffer(OutputStreamManager > at > org.apache.logging.log4j.core.appender.OutputStreamManager.flush(OutputStreamManager.java: > at > org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.directEncodeEvent(Abst > at >
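The leak pattern discussed in this review thread can be caricatured with toy types (these are stand-ins, not Hive's HiveTxnManager/SessionState): creating a fresh manager per compiled SHOW LOCKS statement accumulates session-held state, while reusing the session-scoped instance does not.

```java
import java.util.ArrayList;
import java.util.List;

class ToyTxnManager { final byte[] heavyState = new byte[1024]; }

class ToySession {
    ToyTxnManager sessionTxnMgr;                       // session-scoped instance
    final List<ToyTxnManager> retained = new ArrayList<>();

    // Leaky pattern: every compiled statement builds a fresh manager that the
    // session keeps a reference to, so the heap grows with each query.
    ToyTxnManager newManagerPerStatement() {
        ToyTxnManager m = new ToyTxnManager();
        retained.add(m);
        return m;
    }

    // Fixed pattern: reuse the single session-scoped manager.
    ToyTxnManager sessionManager() {
        if (sessionTxnMgr == null) {
            sessionTxnMgr = new ToyTxnManager();
            retained.add(sessionTxnMgr);
        }
        return sessionTxnMgr;
    }
}

public class TxnScopeSketch {
    public static void main(String[] args) {
        ToySession leaky = new ToySession();
        for (int i = 0; i < 1000; i++) leaky.newManagerPerStatement();
        ToySession fixed = new ToySession();
        for (int i = 0; i < 1000; i++) fixed.sessionManager();
        System.out.println(leaky.retained.size()); // 1000
        System.out.println(fixed.retained.size()); // 1
    }
}
```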
[jira] [Work logged] (HIVE-23640) Fix FindBug issues in hive-druid-handler
[ https://issues.apache.org/jira/browse/HIVE-23640?focusedWorklogId=487527&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487527 ]

ASF GitHub Bot logged work on HIVE-23640:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 22/Sep/20 03:13
Start Date: 22/Sep/20 03:13
Worklog Time Spent: 10m
Work Description: github-actions[bot] closed pull request #1164:
URL: https://github.com/apache/hive/pull/1164

Issue Time Tracking
-------------------
Worklog Id: (was: 487527)
Time Spent: 40m (was: 0.5h)

> Fix FindBug issues in hive-druid-handler
> ----------------------------------------
>
> Key: HIVE-23640
> URL: https://issues.apache.org/jira/browse/HIVE-23640
> Project: Hive
> Issue Type: Sub-task
> Reporter: Panagiotis Garefalakis
> Assignee: Panagiotis Garefalakis
> Priority: Major
> Labels: pull-request-available
> Attachments: spotbugsXml.xml
>
> Time Spent: 40m
> Remaining Estimate: 0h

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Work logged] (HIVE-23882) Compiler should skip MJ keyExpr for probe optimization
[ https://issues.apache.org/jira/browse/HIVE-23882?focusedWorklogId=487755&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487755 ]

ASF GitHub Bot logged work on HIVE-23882:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 22/Sep/20 03:33
Start Date: 22/Sep/20 03:33
Worklog Time Spent: 10m
Work Description: github-actions[bot] commented on pull request #1286:
URL: https://github.com/apache/hive/pull/1286#issuecomment-695860126

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Feel free to reach out on the d...@hive.apache.org list if the patch is in need of reviews.

Issue Time Tracking
-------------------
Worklog Id: (was: 487755)
Time Spent: 40m (was: 0.5h)

> Compiler should skip MJ keyExpr for probe optimization
> ------------------------------------------------------
>
> Key: HIVE-23882
> URL: https://issues.apache.org/jira/browse/HIVE-23882
> Project: Hive
> Issue Type: Sub-task
> Reporter: Panagiotis Garefalakis
> Assignee: Panagiotis Garefalakis
> Priority: Major
> Labels: pull-request-available
>
> Time Spent: 40m
> Remaining Estimate: 0h
>
> In probe we cannot currently support key expressions (on the big-table side),
> as ORC CVs probe the small-table HT directly (there is no expression
> evaluation at that level). TezCompiler should take this into account when
> picking MJs to push probe details.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Work logged] (HIVE-24009) Support partition pruning and other physical transformations for EXECUTE statement
[ https://issues.apache.org/jira/browse/HIVE-24009?focusedWorklogId=487618&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487618 ]

ASF GitHub Bot logged work on HIVE-24009:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 22/Sep/20 03:21
Start Date: 22/Sep/20 03:21
Worklog Time Spent: 10m
Work Description: vineetgarg02 commented on a change in pull request #1472:
URL: https://github.com/apache/hive/pull/1472#discussion_r492224561

## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java
@@ -63,19 +63,19 @@
   private VectorizationContext taskVectorizationContext;
-  protected transient JobConf jc;
-  private transient boolean inputFileChanged = false;
+  protected JobConf jc;

Review comment: I actually tried not keeping these fields, but I ran into all sorts of issues, such as failures to serialize/deserialize or the plan being generated without metadata. I am not sure whether we need to keep all of these fields or can choose selectively; in the interest of time I kept almost all of them. If Gopal or Rajesh thinks this may cause a performance issue, I can open a follow-up to investigate and choose fields selectively.

## File path: ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
@@ -387,6 +387,12 @@
   protected volatile boolean disableJoinMerge = false;
   protected final boolean defaultJoinMerge;
+  /*
+   * This is used by prepare/execute statements.
+   * Prepare/Execute requires operators to be copied and cached.
+   */
+  protected Map topOpsCopy = null;

Review comment: The shape of the original operator tree is changed when it goes through physical transformations and task generation (I don't know why, though); as a result that operator tree cannot be used later to regenerate tasks or re-run physical transformations. Therefore we make a copy and cache it after the operator tree is generated. I will leave a comment.

## File path: ql/src/test/results/clientpositive/llap/constprog_dpp.q.out
@@ -84,12 +84,13 @@
 Stage-0
   Select Operator [SEL_40] (rows=1 width=4)
     Output:["_col0"]
   TableScan [TS_24] (rows=1 width=4)
-    Output:["id"]
+    default@tb2,tb2,Tbl:COMPLETE,Col:NONE,Output:["id"]

Review comment: Yeah, I think this is likely a side effect of some of the changes w.r.t. serialization/deserialization. That said, it is a positive side effect, since we now have more information in the explain plan.

## File path: ql/src/test/results/clientpositive/llap/constprog_dpp.q.out
@@ -84,12 +84,13 @@
   TableScan [TS_24] (rows=1 width=4)
-    Output:["id"]
+    default@tb2,tb2,Tbl:COMPLETE,Col:NONE,Output:["id"]
 <-Map 6 [CONTAINS] vectorized, llap
   Reduce Output Operator [RS_45]
     Limit [LIM_44] (rows=1 width=2)
       Number of rows:1
       Select Operator [SEL_43] (rows=1 width=0)
         Output:["_col0"]
       TableScan [TS_29] (rows=1 width=0)
+        default@tb2,tb2,Tbl:PARTIAL,Col:COMPLETE

Review comment: I confirmed that this is expected. I compared this plan against master (with explain.user set to false) and there is no difference in the plan.

Issue Time Tracking
-------------------
Worklog Id: (was: 487618)
Time Spent: 1h 20m (was: 1h 10m)

> Support partition pruning and other physical transformations for EXECUTE
> statement
> ------------------------------------------------------------------------
>
> Key: HIVE-24009
> URL: https://issues.apache.org/jira/browse/HIVE-24009
> Project: Hive
> Issue Type: Sub-task
> Reporter: Vineet Garg
> Assignee: Vineet Garg
> Priority: Major
> Labels: pull-request-available
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> Current partition pruning (compile time) isn't kicked in for EXECUTE
> statements.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Work logged] (HIVE-24009) Support partition pruning and other physical transformations for EXECUTE statement
[ https://issues.apache.org/jira/browse/HIVE-24009?focusedWorklogId=487584&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487584 ]

ASF GitHub Bot logged work on HIVE-24009:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 22/Sep/20 03:18
Start Date: 22/Sep/20 03:18
Worklog Time Spent: 10m
Work Description: jcamachor commented on a change in pull request #1472:
URL: https://github.com/apache/hive/pull/1472#discussion_r491751273

## File path: ql/src/java/org/apache/hadoop/hive/ql/parse/ExecuteStatementAnalyzer.java
@@ -253,14 +191,17 @@ public void analyzeInternal(ASTNode root) throws SemanticException {
     String queryName = getQueryName(root);
     if (ss.getPreparePlans().containsKey(queryName)) {
       // retrieve cached plan from session state
-      BaseSemanticAnalyzer cachedPlan = ss.getPreparePlans().get(queryName);
+      SemanticAnalyzer cachedPlan = ss.getPreparePlans().get(queryName);
       // make copy of the plan
-      createTaskCopy(cachedPlan);
+      //createTaskCopy(cachedPlan);

Review comment: Can remove line commented out.

## File path: ql/src/java/org/apache/hadoop/hive/ql/parse/PrepareStatementAnalyzer.java
@@ -54,6 +58,21 @@ private void savePlan(String queryName) throws SemanticException {
     ss.getPreparePlans().put(queryName, this);
   }
+  private T makeCopy(final Object task, Class objClass) {
+    ByteArrayOutputStream baos = new ByteArrayOutputStream();

Review comment: Can we leave a comment on this method to understand what it is trying to do?

## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java
@@ -63,19 +63,19 @@
   private VectorizationContext taskVectorizationContext;
-  protected transient JobConf jc;
-  private transient boolean inputFileChanged = false;
+  protected JobConf jc;

Review comment: Do we need to keep all these fields for the plan cache in the operator, table, etc.? I am wondering about the implications of keeping them when the operator plan is serialized (i.e., whether that could have a performance impact). @t3rmin4t0r , @rbalamohan , could you comment on this?

## File path: ql/src/java/org/apache/hadoop/hive/ql/parse/ExecuteStatementAnalyzer.java
@@ -286,6 +227,24 @@ public void analyzeInternal(ASTNode root) throws SemanticException {
       this.acidFileSinks.addAll(cachedPlan.getAcidFileSinks());
       this.initCtx(cachedPlan.getCtx());
       this.ctx.setCboInfo(cachedPlan.getCboInfo());
+      this.setLoadFileWork(cachedPlan.getLoadFileWork());
+      this.setLoadTableWork(cachedPlan.getLoadTableWork());
+
+      this.setQB(cachedPlan.getQB());
+
+      ParseContext pctxt = this.getParseContext();
+      // partition pruner
+      Transform ppr = new PartitionPruner();
+      ppr.transform(pctxt);
+
+      //pctxt.setQueryProperties(this.queryProperties);
+      if (!ctx.getExplainLogical()) {
+        TaskCompiler compiler = TaskCompilerFactory.getCompiler(conf, pctxt);
+        compiler.init(queryState, console, db);
+        compiler.compile(pctxt, rootTasks, inputs, outputs);
+        fetchTask = pctxt.getFetchTask();
+        //fetchTask = makeCopy(cachedPlan.getFetchTask(), cachedPlan.getFetchTask().getClass());

Review comment: This comment too.

## File path: ql/src/java/org/apache/hadoop/hive/ql/parse/ExecuteStatementAnalyzer.java
@@ -286,6 +227,24 @@ public void analyzeInternal(ASTNode root) throws SemanticException {
       this.acidFileSinks.addAll(cachedPlan.getAcidFileSinks());
       this.initCtx(cachedPlan.getCtx());
       this.ctx.setCboInfo(cachedPlan.getCboInfo());
+      this.setLoadFileWork(cachedPlan.getLoadFileWork());
+      this.setLoadTableWork(cachedPlan.getLoadTableWork());
+
+      this.setQB(cachedPlan.getQB());
+
+      ParseContext pctxt = this.getParseContext();
+      // partition pruner
+      Transform ppr = new PartitionPruner();
+      ppr.transform(pctxt);
+
+      //pctxt.setQueryProperties(this.queryProperties);

Review comment: Same, can be removed?

## File path: ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
@@ -387,6 +387,12 @@
   protected volatile boolean disableJoinMerge = false;
   protected final boolean defaultJoinMerge;
+  /*
+   * This is used by prepare/execute statement
+   * Prepare/Execute requires operators to be copied and cached
+   */
+  protected Map topOpsCopy = null;

Review comment: Why do you need to keep a copy instead of using the original operators? Could you leave a comment on that?

## File path: ql/src/test/results/clientpositive/llap/constprog_dpp.q.out
@@ -84,12 +84,13 @@
 Stage-0
   Select Operator [SEL_40] (rows=1 width=4)
     Output:["_col0"]
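The `makeCopy` helper questioned in this review serializes a plan object to a byte array and reads it back to produce an independent copy. A minimal, self-contained sketch of that round-trip — using plain `java.io` serialization; Hive's own plan-serialization utilities may differ, and the class and variable names below are illustrative — could look like:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Sketch of a serialization-based deep copy, as makeCopy(...) appears to do:
// write the cached object out and read it back so later mutations (physical
// transformations, task generation) cannot corrupt the cached original.
public class DeepCopy {
  @SuppressWarnings("unchecked")
  static <T extends Serializable> T makeCopy(T obj) throws IOException, ClassNotFoundException {
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    try (ObjectOutputStream oos = new ObjectOutputStream(baos)) {
      oos.writeObject(obj); // serialize the cached plan object
    }
    ByteArrayInputStream bais = new ByteArrayInputStream(baos.toByteArray());
    try (ObjectInputStream ois = new ObjectInputStream(bais)) {
      return (T) ois.readObject(); // deserialize into an independent copy
    }
  }

  public static void main(String[] args) throws Exception {
    java.util.ArrayList<String> plan =
        new java.util.ArrayList<>(java.util.List.of("TS", "SEL", "FS"));
    java.util.ArrayList<String> copy = makeCopy(plan);
    copy.set(0, "MAPJOIN");        // mutate only the copy
    System.out.println(plan.get(0)); // prints TS -- the cached original is untouched
  }
}
```

This is also why the reviewed fields lost their `transient` modifier: `transient` fields are skipped by Java serialization, so a copy made this way would come back without them.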
[jira] [Work logged] (HIVE-24187) Handle _files creation for HA config with same nameservice name on source and destination
[ https://issues.apache.org/jira/browse/HIVE-24187?focusedWorklogId=487796&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487796 ]

ASF GitHub Bot logged work on HIVE-24187:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 22/Sep/20 03:36
Start Date: 22/Sep/20 03:36
Worklog Time Spent: 10m
Work Description: pkumarsinha opened a new pull request #1515:
URL: https://github.com/apache/hive/pull/1515

…e name on source and destination

### What changes were proposed in this pull request?
### Why are the changes needed?
### Does this PR introduce _any_ user-facing change?
### How was this patch tested?

Issue Time Tracking
-------------------
Worklog Id: (was: 487796)
Time Spent: 20m (was: 10m)

> Handle _files creation for HA config with same nameservice name on source and
> destination
> -----------------------------------------------------------------------------
>
> Key: HIVE-24187
> URL: https://issues.apache.org/jira/browse/HIVE-24187
> Project: Hive
> Issue Type: Improvement
> Reporter: Pravin Sinha
> Assignee: Pravin Sinha
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-24187.01.patch
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Currently HA is supported only for different nameservices on source and
> destination. We need to add support for the same nameservice on source and
> destination.
> The local nameservice will be passed correctly to the repl command. The
> remote nameservice will be a random name, with corresponding configs for it.
> Example: clusters originally configured with ns for hdfs:
> src: ns1
> target: ns1
> We can denote the remote ns with some random name, say for example nsRemote.
> This is how the command will see the ns w.r.t. source and target:
> Repl Dump: src: ns1, target: nsRemote
> Repl Load: src: nsRemote, target: ns1
> Entries in the _files (for managed table data loc) will be made with nsRemote
> instead of ns1 (for src). Example:
> hdfs://nsRemote/whLoc/dbName.db/table1:checksum:subDir:hdfs://nsRemote/cmroot
> In the same way, the list of external table data locations will also be
> modified using nsRemote instead of ns1 (for src).
> New configs can control the behavior:
> *hive.repl.ha.datapath.replace.remote.nameservice = *
> *hive.repl.ha.datapath.replace.remote.nameservice.name = *
> Based on the above configs, replacement of the nameservice can be done.
> This will also require that 'hive.repl.rootdir' is passed accordingly during
> dump and load:
> ||Repl Operation||Repl Command||
> |*Staging on source cluster*| |
> |Repl Dump|repl dump dbName with('hive.repl.rootdir'='hdfs://ns1/stagingLoc')|
> |Repl Load|repl load dbName into dbName with('hive.repl.rootdir'='hdfs://nsRemote/stagingLoc')|
> |*Staging on target cluster*| |
> |Repl Dump|repl dump dbName with('hive.repl.rootdir'='hdfs://nsRemote/stagingLoc')|
> |Repl Load|repl load dbName into dbName with('hive.repl.rootdir'='hdfs://ns1/stagingLoc')|

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
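The ns1 → nsRemote substitution the ticket describes for _files entries and external-table locations can be sketched as a URI rewrite. The method name and guard logic below are assumptions for illustration; only the ns1 → nsRemote substitution itself comes from the ticket:

```java
import java.net.URI;
import java.net.URISyntaxException;

// Sketch: rewrite an HDFS path so its nameservice (the URI authority) is
// replaced by the configured remote alias, leaving non-matching paths alone.
public class NameserviceRewrite {
  static String replaceNameservice(String path, String localNs, String remoteNs)
      throws URISyntaxException {
    URI uri = new URI(path);
    if (!"hdfs".equals(uri.getScheme()) || !localNs.equals(uri.getAuthority())) {
      return path; // not an HDFS path on the local nameservice: leave untouched
    }
    // Rebuild the URI with the remote nameservice as the authority.
    return new URI(uri.getScheme(), remoteNs, uri.getPath(),
        uri.getQuery(), uri.getFragment()).toString();
  }

  public static void main(String[] args) throws URISyntaxException {
    System.out.println(replaceNameservice(
        "hdfs://ns1/whLoc/dbName.db/table1", "ns1", "nsRemote"));
    // prints hdfs://nsRemote/whLoc/dbName.db/table1
  }
}
```

In the actual feature the pair (localNs, remoteNs) would come from the two `hive.repl.ha.datapath.replace.*` configs mentioned above.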
[jira] [Work logged] (HIVE-24185) Upgrade snappy-java to 1.1.7.5
[ https://issues.apache.org/jira/browse/HIVE-24185?focusedWorklogId=487781&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487781 ]

ASF GitHub Bot logged work on HIVE-24185:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 22/Sep/20 03:35
Start Date: 22/Sep/20 03:35
Worklog Time Spent: 10m
Work Description: pgaref opened a new pull request #1513:
URL: https://github.com/apache/hive/pull/1513

Change-Id: I6d314e48f96006f549974d1907a0d6de563d7250

### What changes were proposed in this pull request?
### Why are the changes needed?
### Does this PR introduce _any_ user-facing change?
### How was this patch tested?

Issue Time Tracking
-------------------
Worklog Id: (was: 487781)
Time Spent: 20m (was: 10m)

> Upgrade snappy-java to 1.1.7.5
> ------------------------------
>
> Key: HIVE-24185
> URL: https://issues.apache.org/jira/browse/HIVE-24185
> Project: Hive
> Issue Type: Bug
> Affects Versions: 4.0.0
> Reporter: Panagiotis Garefalakis
> Assignee: Panagiotis Garefalakis
> Priority: Trivial
> Labels: pull-request-available
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Bump version to take advantage of perf improvements, glibc compatibility, etc.
> https://github.com/xerial/snappy-java/blob/master/Milestone.md#snappy-java-117-2017-11-30

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Work logged] (HIVE-23639) Fix FindBug issues in hive-contrib
[ https://issues.apache.org/jira/browse/HIVE-23639?focusedWorklogId=487872&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487872 ]

ASF GitHub Bot logged work on HIVE-23639:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 22/Sep/20 03:42
Start Date: 22/Sep/20 03:42
Worklog Time Spent: 10m
Work Description: github-actions[bot] closed pull request #1163:
URL: https://github.com/apache/hive/pull/1163

Issue Time Tracking
-------------------
Worklog Id: (was: 487872)
Time Spent: 40m (was: 0.5h)

> Fix FindBug issues in hive-contrib
> ----------------------------------
>
> Key: HIVE-23639
> URL: https://issues.apache.org/jira/browse/HIVE-23639
> Project: Hive
> Issue Type: Sub-task
> Reporter: Panagiotis Garefalakis
> Assignee: Panagiotis Garefalakis
> Priority: Major
> Labels: pull-request-available
> Attachments: spotbugsXml.xml
>
> Time Spent: 40m
> Remaining Estimate: 0h

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Work logged] (HIVE-23870) Optimise multiple text conversions in WritableHiveCharObjectInspector.getPrimitiveJavaObject / HiveCharWritable
[ https://issues.apache.org/jira/browse/HIVE-23870?focusedWorklogId=487852&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487852 ]

ASF GitHub Bot logged work on HIVE-23870:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 22/Sep/20 03:40
Start Date: 22/Sep/20 03:40
Worklog Time Spent: 10m
Work Description: github-actions[bot] commented on pull request #1282:
URL: https://github.com/apache/hive/pull/1282#issuecomment-696455663

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Feel free to reach out on the d...@hive.apache.org list if the patch is in need of reviews.

Issue Time Tracking
-------------------
Worklog Id: (was: 487852)
Time Spent: 1h (was: 50m)

> Optimise multiple text conversions in
> WritableHiveCharObjectInspector.getPrimitiveJavaObject / HiveCharWritable
> --------------------------------------------------------------------------
>
> Key: HIVE-23870
> URL: https://issues.apache.org/jira/browse/HIVE-23870
> Project: Hive
> Issue Type: Improvement
> Reporter: Rajesh Balamohan
> Assignee: Rajesh Balamohan
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
> Attachments: image-2020-07-17-11-31-38-241.png
>
> Time Spent: 1h
> Remaining Estimate: 0h
>
> Observed this when creating a materialized view.
> https://github.com/apache/hive/blob/master/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/WritableHiveCharObjectInspector.java#L85
> The same content is converted to Text multiple times.
> !image-2020-07-17-11-31-38-241.png|width=1048,height=936!

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Work logged] (HIVE-24184) Re-order methods in Driver
[ https://issues.apache.org/jira/browse/HIVE-24184?focusedWorklogId=487543&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487543 ]

ASF GitHub Bot logged work on HIVE-24184:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 22/Sep/20 03:14
Start Date: 22/Sep/20 03:14
Worklog Time Spent: 10m
Work Description: miklosgergely opened a new pull request #1512:
URL: https://github.com/apache/hive/pull/1512

### What changes were proposed in this pull request?
Driver is still a huge class, with a lot of methods. They do not reflect the order of the process done by the Driver (compilation, execution, result providing, closing). Also, the constructors are not at the beginning of the class. All of this makes the class harder to read; re-ordering the methods would make it easier.

### Why are the changes needed?
To make the Driver class cleaner, thus easier to read/understand.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
All the tests are still passing.

Issue Time Tracking
-------------------
Worklog Id: (was: 487543)
Time Spent: 20m (was: 10m)

> Re-order methods in Driver
> --------------------------
>
> Key: HIVE-24184
> URL: https://issues.apache.org/jira/browse/HIVE-24184
> Project: Hive
> Issue Type: Sub-task
> Components: Hive
> Reporter: Miklos Gergely
> Assignee: Miklos Gergely
> Priority: Major
> Labels: pull-request-available
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Driver is still a huge class, with a lot of methods. They do not reflect the
> order of the process done by the Driver (compilation, execution, result
> providing, closing). Also, the constructors are not at the beginning of the
> class. All of this makes the class harder to read; re-ordering the methods
> would make it easier.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Work logged] (HIVE-24159) Kafka storage handler broken in secure environment pt2: short-circuit on non-secure environment
[ https://issues.apache.org/jira/browse/HIVE-24159?focusedWorklogId=487458&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487458 ]

ASF GitHub Bot logged work on HIVE-24159:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 22/Sep/20 03:03
Start Date: 22/Sep/20 03:03
Worklog Time Spent: 10m
Work Description: abstractdog merged pull request #1495:
URL: https://github.com/apache/hive/pull/1495

Issue Time Tracking
-------------------
Worklog Id: (was: 487458)
Time Spent: 50m (was: 40m)

> Kafka storage handler broken in secure environment pt2: short-circuit on
> non-secure environment
> ------------------------------------------------------------------------
>
> Key: HIVE-24159
> URL: https://issues.apache.org/jira/browse/HIVE-24159
> Project: Hive
> Issue Type: Improvement
> Reporter: László Bodor
> Assignee: László Bodor
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
> Time Spent: 50m
> Remaining Estimate: 0h
>
> As kafka_storage_handler.q was disabled by HIVE-23985, I hadn't realized
> upstream that the kafka qtest fails. Instead of setting up a kerberized
> environment in qtest (which doesn't seem to be a usual use case, e.g. I
> haven't seen hive.server2.authentication.kerberos.principal used in *.q
> files), I managed to fix the test with a simple
> UserGroupInformation.isSecurityEnabled() check, which can also be useful for
> every non-secure environment.
> For reference, the exception was:
> {code}
> 2020-09-14T03:30:01,217 ERROR [a42ef4c6-190c-47a6-86ad-8bf13b8a2dc1 main] tez.TezTask: Failed to execute tez graph.
> org.apache.kafka.common.KafkaException: Failed to create new KafkaAdminClient
>     at org.apache.kafka.clients.admin.KafkaAdminClient.createInternal(KafkaAdminClient.java:451) ~[kafka-clients-2.4.1.7.1.4.0-SNAPSHOT.jar:?]
>     at org.apache.kafka.clients.admin.Admin.create(Admin.java:59) ~[kafka-clients-2.4.1.7.1.4.0-SNAPSHOT.jar:?]
>     at org.apache.kafka.clients.admin.AdminClient.create(AdminClient.java:39) ~[kafka-clients-2.4.1.7.1.4.0-SNAPSHOT.jar:?]
>     at org.apache.hadoop.hive.ql.exec.tez.DagUtils.getKafkaDelegationTokenForBrokers(DagUtils.java:333) ~[hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.exec.tez.DagUtils.getKafkaCredentials(DagUtils.java:301) ~[hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.exec.tez.DagUtils.addCredentials(DagUtils.java:282) ~[hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.exec.tez.TezTask.build(TezTask.java:516) ~[hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:223) [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213) [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:357) [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>     at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330) [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>     at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246) [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>     at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109) [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>     at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:721) [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:488) [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:482) [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>     at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:166) [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:232) [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>     at
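The fix described above guards Kafka delegation-token acquisition behind `UserGroupInformation.isSecurityEnabled()`. A dependency-free sketch of that decision follows; the helper name and second parameter are hypothetical, and the real check lives in Hive's `DagUtils`, which this does not reproduce:

```java
// Sketch of the short-circuit: on a non-secure cluster there is no Kerberos
// context, so the KafkaAdminClient creation that failed in the trace above
// should be skipped entirely instead of attempted.
public class KafkaTokenGuard {
  static boolean shouldFetchKafkaDelegationToken(boolean securityEnabled,
                                                 boolean dagUsesKafka) {
    // Both conditions must hold before we pay the cost (and risk) of
    // contacting the brokers for a delegation token.
    return securityEnabled && dagUsesKafka;
  }

  public static void main(String[] args) {
    System.out.println(shouldFetchKafkaDelegationToken(false, true)); // prints false
    System.out.println(shouldFetchKafkaDelegationToken(true, true));  // prints true
  }
}
```

In Hive itself the first argument would be `UserGroupInformation.isSecurityEnabled()`, which is why the qtest now passes without a kerberized environment.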
[jira] [Work logged] (HIVE-23728) Run metastore verification tests during precommit
[ https://issues.apache.org/jira/browse/HIVE-23728?focusedWorklogId=487463&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487463 ]

ASF GitHub Bot logged work on HIVE-23728:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 22/Sep/20 03:03
Start Date: 22/Sep/20 03:03
Worklog Time Spent: 10m
Work Description: github-actions[bot] commented on pull request #1154:
URL: https://github.com/apache/hive/pull/1154#issuecomment-696455704

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Feel free to reach out on the d...@hive.apache.org list if the patch is in need of reviews.

Issue Time Tracking
-------------------
Worklog Id: (was: 487463)
Time Spent: 0.5h (was: 20m)

> Run metastore verification tests during precommit
> -------------------------------------------------
>
> Key: HIVE-23728
> URL: https://issues.apache.org/jira/browse/HIVE-23728
> Project: Hive
> Issue Type: Sub-task
> Reporter: Zoltan Haindrich
> Assignee: Zoltan Haindrich
> Priority: Major
> Labels: pull-request-available
> Time Spent: 0.5h
> Remaining Estimate: 0h

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Work logged] (HIVE-23838) KafkaRecordIteratorTest is flaky
[ https://issues.apache.org/jira/browse/HIVE-23838?focusedWorklogId=487460&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487460 ]

ASF GitHub Bot logged work on HIVE-23838:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 22/Sep/20 03:03
Start Date: 22/Sep/20 03:03
Worklog Time Spent: 10m
Work Description: github-actions[bot] closed pull request #1245:
URL: https://github.com/apache/hive/pull/1245

Issue Time Tracking
-------------------
Worklog Id: (was: 487460)
Time Spent: 2h (was: 1h 50m)

> KafkaRecordIteratorTest is flaky
> --------------------------------
>
> Key: HIVE-23838
> URL: https://issues.apache.org/jira/browse/HIVE-23838
> Project: Hive
> Issue Type: Bug
> Components: kafka integration
> Affects Versions: 4.0.0
> Reporter: Karen Coppage
> Assignee: Karen Coppage
> Priority: Major
> Labels: pull-request-available
> Time Spent: 2h
> Remaining Estimate: 0h
>
> Failed on the [4th run of flaky test
> checker|http://ci.hive.apache.org/job/hive-flaky-check/69/] with
> org.apache.kafka.common.errors.TimeoutException: Timeout expired after
> 1milliseconds while awaiting InitProducerId

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Work logged] (HIVE-23877) Hive on Spark incorrect partition pruning ANALYZE TABLE
[ https://issues.apache.org/jira/browse/HIVE-23877?focusedWorklogId=487455&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487455 ]

ASF GitHub Bot logged work on HIVE-23877:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 22/Sep/20 03:02
Start Date: 22/Sep/20 03:02
Worklog Time Spent: 10m
Work Description: github-actions[bot] commented on pull request #1278:
URL: https://github.com/apache/hive/pull/1278#issuecomment-696455677

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Feel free to reach out on the d...@hive.apache.org list if the patch is in need of reviews.

Issue Time Tracking
-------------------
Worklog Id: (was: 487455)
Time Spent: 50m (was: 40m)

> Hive on Spark incorrect partition pruning ANALYZE TABLE
> -------------------------------------------------------
>
> Key: HIVE-23877
> URL: https://issues.apache.org/jira/browse/HIVE-23877
> Project: Hive
> Issue Type: Bug
> Reporter: Han
> Assignee: Han
> Priority: Major
> Labels: pull-request-available
> Time Spent: 50m
> Remaining Estimate: 0h
>
> Partitions are pruned based on the partition specification in the ANALYZE
> TABLE command and cached in TableSpec.
> When compiling, it is unnecessary to use PartitionPruner.prune() to get the
> partitions again. PartitionPruner cannot prune partitions for the ANALYZE
> TABLE command, so it will get all partitions.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Work logged] (HIVE-24172) Fix TestMmCompactorOnMr
[ https://issues.apache.org/jira/browse/HIVE-24172?focusedWorklogId=487421&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487421 ]

ASF GitHub Bot logged work on HIVE-24172:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 22/Sep/20 02:59
Start Date: 22/Sep/20 02:59
Worklog Time Spent: 10m
Work Description: klcopp opened a new pull request #1514:
URL: https://github.com/apache/hive/pull/1514

Setting the execution engine to MR on the driver field (driver.getConf().setBoolVar(...)) only affects the queries run in setup and teardown; compaction runs using the conf field. So the execution engine needed to be set to MR in conf so that compaction would pick it up.

Issue Time Tracking
-------------------
Worklog Id: (was: 487421)
Time Spent: 20m (was: 10m)

> Fix TestMmCompactorOnMr
> -----------------------
>
> Key: HIVE-24172
> URL: https://issues.apache.org/jira/browse/HIVE-24172
> Project: Hive
> Issue Type: Bug
> Reporter: Zoltan Haindrich
> Assignee: Karen Coppage
> Priority: Major
> Labels: pull-request-available
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Test is unstable:
> http://ci.hive.apache.org/job/hive-flaky-check/112/

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Updated] (HIVE-22098) Data loss occurs when multiple tables are joined with different bucket_version
[ https://issues.apache.org/jira/browse/HIVE-22098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GuangMing Lu updated HIVE-22098: Attachment: join_test.sql > Data loss occurs when multiple tables are joined with different bucket_version > > > Key: HIVE-22098 > URL: https://issues.apache.org/jira/browse/HIVE-22098 > Project: Hive > Issue Type: Bug > Components: Operators >Affects Versions: 3.1.0, 3.1.2 >Reporter: GuangMing Lu >Assignee: yongtaoliao >Priority: Blocker > Labels: data-loss, wrongresults > Attachments: HIVE-22098.1.patch, image-2019-08-12-18-45-15-771.png, > join_test.sql, table_a_data.orc, table_b_data.orc, table_c_data.orc > > > When tables with different bucketVersion values are joined and the number of reducers > is greater than 2, the result is incorrect (*data loss*). > *Scenario 1*: A three-table join. The intermediate result of joining table_a and > table_b is recorded as tmp_a_b. When tmp_a_b is joined with the third table (which, > being created after hive-3.0.0, defaults to bucket_version=2), tmp_a_b is initialized > with bucketVersion=-1, so the join's ReduceSinkOperator also sees bucketVersion=-1. In > the init method, the hash algorithm applied to the join columns is selected according > to bucketVersion: if bucketVersion = 2 and the operation is not ACID, the new hash > algorithm is used; otherwise the old one is. Because the two sides use inconsistent > hash algorithms, rows are allocated to different partitions, and at the Reducer stage > rows with the same key cannot be paired, resulting in data loss. 
> *Scenario 2*: Create two test tables: create table > table_bucketversion_1(col_1 string, col_2 string) TBLPROPERTIES > ('bucketing_version'='1'); create table table_bucketversion_2(col_1 string, col_2 > string) TBLPROPERTIES ('bucketing_version'='2'); > When table_bucketversion_1 is joined with table_bucketversion_2, part of the result > data is lost because the bucketing_version values differ. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
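The mechanism described above can be sketched in a few lines. The two hash functions below are stand-ins, not Hive's actual bucketing-version-1 and -2 implementations; the point is only that if one side of a join routes rows with one hash and the other side with a different one, equal keys can land on different reducers and never pair up.

```java
public class BucketVersionMismatch {
    // Stand-in for the "old" (bucketing_version=1) hash.
    static int hashV1(String key) { return key.hashCode(); }

    // Stand-in for the "new" (bucketing_version=2) hash; any different
    // function exhibits the problem.
    static int hashV2(String key) {
        int h = key.hashCode();
        h ^= (h >>> 16);
        h *= 0x85ebca6b;
        return h ^ (h >>> 13);
    }

    // Typical hash-to-reducer routing: clamp the sign bit, then take a modulus.
    static int reducerFor(int hash, int numReducers) {
        return (hash & Integer.MAX_VALUE) % numReducers;
    }

    public static void main(String[] args) {
        int reducers = 3; // the report notes the loss needs > 2 reducers
        String key = "dept=sales";
        int left = reducerFor(hashV1(key), reducers);  // side hashed as version 1
        int right = reducerFor(hashV2(key), reducers); // side hashed as version 2
        System.out.println("left reducer=" + left + ", right reducer=" + right);
        // Whenever left != right, the matching rows never meet in the same
        // reducer, so the join silently drops them.
    }
}
```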
[jira] [Updated] (HIVE-22098) Data loss occurs when multiple tables are joined with different bucket_version
[ https://issues.apache.org/jira/browse/HIVE-22098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GuangMing Lu updated HIVE-22098: Attachment: (was: join_test.sql) > Data loss occurs when multiple tables are joined with different bucket_version > > > Key: HIVE-22098 > URL: https://issues.apache.org/jira/browse/HIVE-22098 > Project: Hive > Issue Type: Bug > Components: Operators >Affects Versions: 3.1.0, 3.1.2 >Reporter: GuangMing Lu >Assignee: yongtaoliao >Priority: Blocker > Labels: data-loss, wrongresults > Attachments: HIVE-22098.1.patch, image-2019-08-12-18-45-15-771.png, > join_test.sql, table_a_data.orc, table_b_data.orc, table_c_data.orc > > > When tables with different bucketVersion values are joined and the number of reducers > is greater than 2, the result is incorrect (*data loss*). > *Scenario 1*: A three-table join. The intermediate result of joining table_a and > table_b is recorded as tmp_a_b. When tmp_a_b is joined with the third table (which, > being created after hive-3.0.0, defaults to bucket_version=2), tmp_a_b is initialized > with bucketVersion=-1, so the join's ReduceSinkOperator also sees bucketVersion=-1. In > the init method, the hash algorithm applied to the join columns is selected according > to bucketVersion: if bucketVersion = 2 and the operation is not ACID, the new hash > algorithm is used; otherwise the old one is. Because the two sides use inconsistent > hash algorithms, rows are allocated to different partitions, and at the Reducer stage > rows with the same key cannot be paired, resulting in data loss. 
> *Scenario 2*: Create two test tables: create table > table_bucketversion_1(col_1 string, col_2 string) TBLPROPERTIES > ('bucketing_version'='1'); create table table_bucketversion_2(col_1 string, col_2 > string) TBLPROPERTIES ('bucketing_version'='2'); > When table_bucketversion_1 is joined with table_bucketversion_2, part of the result > data is lost because the bucketing_version values differ. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23900) Replace Base64 in exec Package
[ https://issues.apache.org/jira/browse/HIVE-23900?focusedWorklogId=487366=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487366 ] ASF GitHub Bot logged work on HIVE-23900: - Author: ASF GitHub Bot Created on: 22/Sep/20 00:48 Start Date: 22/Sep/20 00:48 Worklog Time Spent: 10m Work Description: github-actions[bot] commented on pull request #1297: URL: https://github.com/apache/hive/pull/1297#issuecomment-696455652 This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Feel free to reach out on the d...@hive.apache.org list if the patch is in need of reviews. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 487366) Time Spent: 20m (was: 10m) > Replace Base64 in exec Package > -- > > Key: HIVE-23900 > URL: https://issues.apache.org/jira/browse/HIVE-23900 > Project: Hive > Issue Type: Sub-task >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Minor > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23728) Run metastore verification tests during precommit
[ https://issues.apache.org/jira/browse/HIVE-23728?focusedWorklogId=487367=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487367 ] ASF GitHub Bot logged work on HIVE-23728: - Author: ASF GitHub Bot Created on: 22/Sep/20 00:48 Start Date: 22/Sep/20 00:48 Worklog Time Spent: 10m Work Description: github-actions[bot] commented on pull request #1154: URL: https://github.com/apache/hive/pull/1154#issuecomment-696455704 This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Feel free to reach out on the d...@hive.apache.org list if the patch is in need of reviews. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 487367) Time Spent: 20m (was: 10m) > Run metastore verification tests during precommit > - > > Key: HIVE-23728 > URL: https://issues.apache.org/jira/browse/HIVE-23728 > Project: Hive > Issue Type: Sub-task >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23877) Hive on Spark incorrect partition pruning ANALYZE TABLE
[ https://issues.apache.org/jira/browse/HIVE-23877?focusedWorklogId=487365&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487365 ] ASF GitHub Bot logged work on HIVE-23877: - Author: ASF GitHub Bot Created on: 22/Sep/20 00:48 Start Date: 22/Sep/20 00:48 Worklog Time Spent: 10m Work Description: github-actions[bot] commented on pull request #1278: URL: https://github.com/apache/hive/pull/1278#issuecomment-696455677 This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Feel free to reach out on the d...@hive.apache.org list if the patch is in need of reviews. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 487365) Time Spent: 40m (was: 0.5h) > Hive on Spark incorrect partition pruning ANALYZE TABLE > --- > > Key: HIVE-23877 > URL: https://issues.apache.org/jira/browse/HIVE-23877 > Project: Hive > Issue Type: Bug >Reporter: Han >Assignee: Han >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > Partitions are pruned based on the partition specification in the ANALYZE TABLE > command and cached in TableSpec. > When compiling, it's unnecessary to use PartitionPruner.prune() to get the > partitions again. Moreover, PartitionPruner cannot prune partitions for the ANALYZE > TABLE command, so it returns all partitions. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23870) Optimise multiple text conversions in WritableHiveCharObjectInspector.getPrimitiveJavaObject / HiveCharWritable
[ https://issues.apache.org/jira/browse/HIVE-23870?focusedWorklogId=487364=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487364 ] ASF GitHub Bot logged work on HIVE-23870: - Author: ASF GitHub Bot Created on: 22/Sep/20 00:48 Start Date: 22/Sep/20 00:48 Worklog Time Spent: 10m Work Description: github-actions[bot] commented on pull request #1282: URL: https://github.com/apache/hive/pull/1282#issuecomment-696455663 This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Feel free to reach out on the d...@hive.apache.org list if the patch is in need of reviews. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 487364) Time Spent: 50m (was: 40m) > Optimise multiple text conversions in > WritableHiveCharObjectInspector.getPrimitiveJavaObject / HiveCharWritable > --- > > Key: HIVE-23870 > URL: https://issues.apache.org/jira/browse/HIVE-23870 > Project: Hive > Issue Type: Improvement >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: image-2020-07-17-11-31-38-241.png > > Time Spent: 50m > Remaining Estimate: 0h > > Observed this when creating materialized view. > [https://github.com/apache/hive/blob/master/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/WritableHiveCharObjectInspector.java#L85] > Same content is converted to Text multiple times. > !image-2020-07-17-11-31-38-241.png|width=1048,height=936! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23640) Fix FindBug issues in hive-druid-handler
[ https://issues.apache.org/jira/browse/HIVE-23640?focusedWorklogId=487370=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487370 ] ASF GitHub Bot logged work on HIVE-23640: - Author: ASF GitHub Bot Created on: 22/Sep/20 00:48 Start Date: 22/Sep/20 00:48 Worklog Time Spent: 10m Work Description: github-actions[bot] closed pull request #1164: URL: https://github.com/apache/hive/pull/1164 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 487370) Time Spent: 0.5h (was: 20m) > Fix FindBug issues in hive-druid-handler > > > Key: HIVE-23640 > URL: https://issues.apache.org/jira/browse/HIVE-23640 > Project: Hive > Issue Type: Sub-task >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Attachments: spotbugsXml.xml > > Time Spent: 0.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23637) Fix FindBug issues in hive-cli
[ https://issues.apache.org/jira/browse/HIVE-23637?focusedWorklogId=487368=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487368 ] ASF GitHub Bot logged work on HIVE-23637: - Author: ASF GitHub Bot Created on: 22/Sep/20 00:48 Start Date: 22/Sep/20 00:48 Worklog Time Spent: 10m Work Description: github-actions[bot] closed pull request #1162: URL: https://github.com/apache/hive/pull/1162 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 487368) Time Spent: 0.5h (was: 20m) > Fix FindBug issues in hive-cli > -- > > Key: HIVE-23637 > URL: https://issues.apache.org/jira/browse/HIVE-23637 > Project: Hive > Issue Type: Sub-task >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Attachments: spotbugsXml.xml > > Time Spent: 0.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23639) Fix FindBug issues in hive-contrib
[ https://issues.apache.org/jira/browse/HIVE-23639?focusedWorklogId=487369=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487369 ] ASF GitHub Bot logged work on HIVE-23639: - Author: ASF GitHub Bot Created on: 22/Sep/20 00:48 Start Date: 22/Sep/20 00:48 Worklog Time Spent: 10m Work Description: github-actions[bot] closed pull request #1163: URL: https://github.com/apache/hive/pull/1163 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 487369) Time Spent: 0.5h (was: 20m) > Fix FindBug issues in hive-contrib > -- > > Key: HIVE-23639 > URL: https://issues.apache.org/jira/browse/HIVE-23639 > Project: Hive > Issue Type: Sub-task >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Attachments: spotbugsXml.xml > > Time Spent: 0.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24187) Handle _files creation for HA config with same nameservice name on source and destination
[ https://issues.apache.org/jira/browse/HIVE-24187?focusedWorklogId=487337=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487337 ] ASF GitHub Bot logged work on HIVE-24187: - Author: ASF GitHub Bot Created on: 21/Sep/20 23:42 Start Date: 21/Sep/20 23:42 Worklog Time Spent: 10m Work Description: pkumarsinha opened a new pull request #1515: URL: https://github.com/apache/hive/pull/1515 …e name on source and destination ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 487337) Remaining Estimate: 0h Time Spent: 10m > Handle _files creation for HA config with same nameservice name on source and > destination > - > > Key: HIVE-24187 > URL: https://issues.apache.org/jira/browse/HIVE-24187 > Project: Hive > Issue Type: Improvement >Reporter: Pravin Sinha >Assignee: Pravin Sinha >Priority: Major > Attachments: HIVE-24187.01.patch > > Time Spent: 10m > Remaining Estimate: 0h > > Current HA is supported only for different nameservices on Source and > Destination. We need to add support of same nameservice on Source and > Destination. > Local nameservice will be passed correctly to the repl command. > Remote nameservice will be a random name and corresponding configs for the > same. > Example: > Clusters originally configured with ns for hdfs: > src: ns1 > target : ns1 > We can denote remote name with some random name, say for example: nsRemote. 
> This is how the command will see the ns w.r.t source and target: > Repl Dump : src: ns1, target: nsRemote > Repl Load: src: nsRemote, target: ns1 > Entries in the _files (for managed table data locations) will be made with nsRemote > instead of ns1 (for src). > Example: > hdfs://nsRemote/whLoc/dbName.db/table1:checksum:subDir:hdfs://nsRemote/cmroot > In the same way, the list of external table data locations will also be rewritten > using nsRemote instead of ns1 (for src). > New configs can control the behavior: > *hive.repl.ha.datapath.replace.remote.nameservice = * > *hive.repl.ha.datapath.replace.remote.nameservice.name = * > Based on the above configs, the replacement of the nameservice can be done. > This will also require that 'hive.repl.rootdir' is passed accordingly during > dump and load:
> ||Repl Operation||Repl Command||
> |*Staging on source cluster*|
> |Repl Dump|repl dump dbName with('hive.repl.rootdir'='hdfs://ns1/stagingLoc')|
> |Repl Load|repl load dbName into dbName with('hive.repl.rootdir'='hdfs://nsRemote/stagingLoc')|
> |*Staging on target cluster*|
> |Repl Dump|repl dump dbName with('hive.repl.rootdir'='hdfs://nsRemote/stagingLoc')|
> |Repl Load|repl load dbName into dbName with('hive.repl.rootdir'='hdfs://ns1/stagingLoc')|
-- This message was sent by Atlassian Jira (v8.3.4#803005)
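The rewrite described above amounts to swapping the nameservice authority of hdfs paths before they are written into _files. A minimal sketch under stated assumptions: the helper name and its behavior are hypothetical illustrations of the configs quoted in the issue, not Hive's actual replication code.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class NameserviceRewrite {
    // Rewrite hdfs://<localNs>/... to hdfs://<remoteNs>/..., leaving every
    // other path untouched. Best-effort: unparseable paths pass through as-is.
    static String replaceNameservice(String path, String localNs, String remoteNs) {
        try {
            URI uri = new URI(path);
            if (!"hdfs".equals(uri.getScheme()) || !localNs.equals(uri.getAuthority())) {
                return path; // only rewrite hdfs paths on the local nameservice
            }
            return new URI(uri.getScheme(), remoteNs, uri.getPath(),
                           uri.getQuery(), uri.getFragment()).toString();
        } catch (URISyntaxException e) {
            return path;
        }
    }

    public static void main(String[] args) {
        // e.g. hive.repl.ha.datapath.replace.remote.nameservice.name = nsRemote
        String rewritten =
            replaceNameservice("hdfs://ns1/whLoc/dbName.db/table1", "ns1", "nsRemote");
        System.out.println(rewritten); // hdfs://nsRemote/whLoc/dbName.db/table1
    }
}
```

Applied to each entry before it is written, this yields exactly the `hdfs://nsRemote/...` forms shown in the issue's _files example.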
[jira] [Updated] (HIVE-24187) Handle _files creation for HA config with same nameservice name on source and destination
[ https://issues.apache.org/jira/browse/HIVE-24187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-24187: -- Labels: pull-request-available (was: ) > Handle _files creation for HA config with same nameservice name on source and > destination > - > > Key: HIVE-24187 > URL: https://issues.apache.org/jira/browse/HIVE-24187 > Project: Hive > Issue Type: Improvement >Reporter: Pravin Sinha >Assignee: Pravin Sinha >Priority: Major > Labels: pull-request-available > Attachments: HIVE-24187.01.patch > > Time Spent: 10m > Remaining Estimate: 0h > > Current HA is supported only for different nameservices on Source and > Destination. We need to add support of same nameservice on Source and > Destination. > Local nameservice will be passed correctly to the repl command. > Remote nameservice will be a random name and corresponding configs for the > same. > Example: > Clusters originally configured with ns for hdfs: > src: ns1 > target : ns1 > We can denote remote name with some random name, say for example: nsRemote. > This is how the command will see the ns w.r.t source and target: > Repl Dump : src: ns1, target: nsRemote > Repl Load: src: nsRemote, target: ns1 > Entries in the _files(for managed table data loc) will be made with nsRemote > in stead of ns1(for src). > Example: > hdfs://nsRemote/whLoc/dbName.db/table1:checksum:subDir:hdfs://nsRemote/cmroot > Same way list of external table data locations will also be modified using > nsRemote in stead of ns1(for src). > New configs can control the behavior: > *hive.repl.ha.datapath.replace.remote.nameservice = * > *hive.repl.ha.datapath.replace.remote.nameservice.name = * > Based on the above configs replacement of nameservice can be done. 
> This will also require that 'hive.repl.rootdir' is passed accordingly during > dump and load: > Repl dump: > ||Repl Operation||Repl Command|| > |*Staging on source cluster*| > |Repl Dump|repl dump dbName with('hive.repl.rootdir'='hdfs://ns1/stagingLoc')| > |Repl Load|repl load dbName into dbName > with('hive.repl.rootdir'='hdfs://nsRemote/stagingLoc')| > |*Staging on target cluster*| > |Repl Dump|repl dump dbName > with('hive.repl.rootdir'='hdfs://nsRemote/stagingLoc')| > |Repl Load|repl load dbName into dbName > with('hive.repl.rootdir'='hdfs://ns1/stagingLoc')| -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24187) Handle _files creation for HA config with same nameservice name on source and destination
[ https://issues.apache.org/jira/browse/HIVE-24187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pravin Sinha updated HIVE-24187: Status: Patch Available (was: Open) > Handle _files creation for HA config with same nameservice name on source and > destination > - > > Key: HIVE-24187 > URL: https://issues.apache.org/jira/browse/HIVE-24187 > Project: Hive > Issue Type: Improvement >Reporter: Pravin Sinha >Assignee: Pravin Sinha >Priority: Major > Attachments: HIVE-24187.01.patch > > > Current HA is supported only for different nameservices on Source and > Destination. We need to add support of same nameservice on Source and > Destination. > Local nameservice will be passed correctly to the repl command. > Remote nameservice will be a random name and corresponding configs for the > same. > Example: > Clusters originally configured with ns for hdfs: > src: ns1 > target : ns1 > We can denote remote name with some random name, say for example: nsRemote. > This is how the command will see the ns w.r.t source and target: > Repl Dump : src: ns1, target: nsRemote > Repl Load: src: nsRemote, target: ns1 > Entries in the _files(for managed table data loc) will be made with nsRemote > in stead of ns1(for src). > Example: > hdfs://nsRemote/whLoc/dbName.db/table1:checksum:subDir:hdfs://nsRemote/cmroot > Same way list of external table data locations will also be modified using > nsRemote in stead of ns1(for src). > New configs can control the behavior: > *hive.repl.ha.datapath.replace.remote.nameservice = * > *hive.repl.ha.datapath.replace.remote.nameservice.name = * > Based on the above configs replacement of nameservice can be done. 
> This will also require that 'hive.repl.rootdir' is passed accordingly during > dump and load: > Repl dump: > ||Repl Operation||Repl Command|| > |*Staging on source cluster*| > |Repl Dump|repl dump dbName with('hive.repl.rootdir'='hdfs://ns1/stagingLoc')| > |Repl Load|repl load dbName into dbName > with('hive.repl.rootdir'='hdfs://nsRemote/stagingLoc')| > |*Staging on target cluster*| > |Repl Dump|repl dump dbName > with('hive.repl.rootdir'='hdfs://nsRemote/stagingLoc')| > |Repl Load|repl load dbName into dbName > with('hive.repl.rootdir'='hdfs://ns1/stagingLoc')| -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24187) Handle _files creation for HA config with same nameservice name on source and destination
[ https://issues.apache.org/jira/browse/HIVE-24187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pravin Sinha updated HIVE-24187: Attachment: HIVE-24187.01.patch > Handle _files creation for HA config with same nameservice name on source and > destination > - > > Key: HIVE-24187 > URL: https://issues.apache.org/jira/browse/HIVE-24187 > Project: Hive > Issue Type: Improvement >Reporter: Pravin Sinha >Assignee: Pravin Sinha >Priority: Major > Attachments: HIVE-24187.01.patch > > > Current HA is supported only for different nameservices on Source and > Destination. We need to add support of same nameservice on Source and > Destination. > Local nameservice will be passed correctly to the repl command. > Remote nameservice will be a random name and corresponding configs for the > same. > Example: > Clusters originally configured with ns for hdfs: > src: ns1 > target : ns1 > We can denote remote name with some random name, say for example: nsRemote. > This is how the command will see the ns w.r.t source and target: > Repl Dump : src: ns1, target: nsRemote > Repl Load: src: nsRemote, target: ns1 > Entries in the _files(for managed table data loc) will be made with nsRemote > in stead of ns1(for src). > Example: > hdfs://nsRemote/whLoc/dbName.db/table1:checksum:subDir:hdfs://nsRemote/cmroot > Same way list of external table data locations will also be modified using > nsRemote in stead of ns1(for src). > New configs can control the behavior: > *hive.repl.ha.datapath.replace.remote.nameservice = * > *hive.repl.ha.datapath.replace.remote.nameservice.name = * > Based on the above configs replacement of nameservice can be done. 
> This will also require that 'hive.repl.rootdir' is passed accordingly during > dump and load: > Repl dump: > ||Repl Operation||Repl Command|| > |*Staging on source cluster*| > |Repl Dump|repl dump dbName with('hive.repl.rootdir'='hdfs://ns1/stagingLoc')| > |Repl Load|repl load dbName into dbName > with('hive.repl.rootdir'='hdfs://nsRemote/stagingLoc')| > |*Staging on target cluster*| > |Repl Dump|repl dump dbName > with('hive.repl.rootdir'='hdfs://nsRemote/stagingLoc')| > |Repl Load|repl load dbName into dbName > with('hive.repl.rootdir'='hdfs://ns1/stagingLoc')| -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24187) Handle _files creation for HA config with same nameservice name on source and destination
[ https://issues.apache.org/jira/browse/HIVE-24187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pravin Sinha updated HIVE-24187: Summary: Handle _files creation for HA config with same nameservice name on source and destination (was: Handle _files creation for HA config with same nameservice on source and destination) > Handle _files creation for HA config with same nameservice name on source and > destination > - > > Key: HIVE-24187 > URL: https://issues.apache.org/jira/browse/HIVE-24187 > Project: Hive > Issue Type: Improvement >Reporter: Pravin Sinha >Assignee: Pravin Sinha >Priority: Major > > Current HA is supported only for different nameservices on Source and > Destination. We need to add support of same nameservice on Source and > Destination. > Local nameservice will be passed correctly to the repl command. > Remote nameservice will be a random name and corresponding configs for the > same. > Example: > Clusters originally configured with ns for hdfs: > src: ns1 > target : ns1 > We can denote remote name with some random name, say for example: nsRemote. > This is how the command will see the ns w.r.t source and target: > Repl Dump : src: ns1, target: nsRemote > Repl Load: src: nsRemote, target: ns1 > Entries in the _files(for managed table data loc) will be made with nsRemote > in stead of ns1(for src). > Example: > hdfs://nsRemote/whLoc/dbName.db/table1:checksum:subDir:hdfs://nsRemote/cmroot > Same way list of external table data locations will also be modified using > nsRemote in stead of ns1(for src). > New configs can control the behavior: > *hive.repl.ha.datapath.replace.remote.nameservice = * > *hive.repl.ha.datapath.replace.remote.nameservice.name = * > Based on the above configs replacement of nameservice can be done. 
> This will also require that 'hive.repl.rootdir' is passed accordingly during > dump and load: > Repl dump: > ||Repl Operation||Repl Command|| > |*Staging on source cluster*| > |Repl Dump|repl dump dbName with('hive.repl.rootdir'='hdfs://ns1/stagingLoc')| > |Repl Load|repl load dbName into dbName > with('hive.repl.rootdir'='hdfs://nsRemote/stagingLoc')| > |*Staging on target cluster*| > |Repl Dump|repl dump dbName > with('hive.repl.rootdir'='hdfs://nsRemote/stagingLoc')| > |Repl Load|repl load dbName into dbName > with('hive.repl.rootdir'='hdfs://ns1/stagingLoc')| -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-24187) Handle _files creation for HA config with same nameservice on source and destination
[ https://issues.apache.org/jira/browse/HIVE-24187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pravin Sinha reassigned HIVE-24187: --- > Handle _files creation for HA config with same nameservice on source and > destination > > > Key: HIVE-24187 > URL: https://issues.apache.org/jira/browse/HIVE-24187 > Project: Hive > Issue Type: Improvement >Reporter: Pravin Sinha >Assignee: Pravin Sinha >Priority: Major > > Current HA is supported only for different nameservices on Source and > Destination. We need to add support of same nameservice on Source and > Destination. > Local nameservice will be passed correctly to the repl command. > Remote nameservice will be a random name and corresponding configs for the > same. > Example: > Clusters originally configured with ns for hdfs: > src: ns1 > target : ns1 > We can denote remote name with some random name, say for example: nsRemote. > This is how the command will see the ns w.r.t source and target: > Repl Dump : src: ns1, target: nsRemote > Repl Load: src: nsRemote, target: ns1 > Entries in the _files(for managed table data loc) will be made with nsRemote > in stead of ns1(for src). > Example: > hdfs://nsRemote/whLoc/dbName.db/table1:checksum:subDir:hdfs://nsRemote/cmroot > Same way list of external table data locations will also be modified using > nsRemote in stead of ns1(for src). > New configs can control the behavior: > *hive.repl.ha.datapath.replace.remote.nameservice = * > *hive.repl.ha.datapath.replace.remote.nameservice.name = * > Based on the above configs replacement of nameservice can be done. 
> This will also require that 'hive.repl.rootdir' is passed accordingly during > dump and load: > Repl dump: > ||Repl Operation||Repl Command|| > |*Staging on source cluster*| > |Repl Dump|repl dump dbName with('hive.repl.rootdir'='hdfs://ns1/stagingLoc')| > |Repl Load|repl load dbName into dbName > with('hive.repl.rootdir'='hdfs://nsRemote/stagingLoc')| > |*Staging on target cluster*| > |Repl Dump|repl dump dbName > with('hive.repl.rootdir'='hdfs://nsRemote/stagingLoc')| > |Repl Load|repl load dbName into dbName > with('hive.repl.rootdir'='hdfs://ns1/stagingLoc')| -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24179) Memory leak in HS2 DbTxnManager when compiling SHOW LOCKS statement
[ https://issues.apache.org/jira/browse/HIVE-24179?focusedWorklogId=487298&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487298 ] ASF GitHub Bot logged work on HIVE-24179: - Author: ASF GitHub Bot Created on: 21/Sep/20 21:47 Start Date: 21/Sep/20 21:47 Worklog Time Spent: 10m Work Description: zabetak commented on a change in pull request #1509: URL: https://github.com/apache/hive/pull/1509#discussion_r492364790
## File path: ql/src/java/org/apache/hadoop/hive/ql/ddl/table/lock/show/ShowDbLocksAnalyzer.java
@@ -47,14 +47,16 @@ public void analyzeInternal(ASTNode root) throws SemanticException {
     String dbName = stripQuotes(root.getChild(0).getText());
     boolean isExtended = (root.getChildCount() > 1);
-    HiveTxnManager txnManager = null;
+    boolean useNewLocksFormat;
     try {
-      txnManager = TxnManagerFactory.getTxnManagerFactory().getTxnManager(conf);
+      HiveTxnManager txnManager = TxnManagerFactory.getTxnManagerFactory().getTxnManager(conf);
Review comment: Did you check the last commit? Do you have something else in mind? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 487298) Time Spent: 1h 10m (was: 1h) > Memory leak in HS2 DbTxnManager when compiling SHOW LOCKS statement > --- > > Key: HIVE-24179 > URL: https://issues.apache.org/jira/browse/HIVE-24179 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Stamatis Zampetakis >Assignee: Stamatis Zampetakis >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: summary.png > > Time Spent: 1h 10m > Remaining Estimate: 0h > > The problem can be reproduced by repeatedly executing a SHOW LOCKS statement > and monitoring the heap memory of HS2. 
For a small heap (e.g., 2g) it only > takes a few minutes before the server crashes with OutOfMemory error such as > the one shown below. > {noformat} > java.lang.OutOfMemoryError: GC overhead limit exceeded > at java.util.Arrays.copyOf(Arrays.java:3332) > at > java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124) > at > java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448) > at java.lang.StringBuilder.append(StringBuilder.java:136) > at > org.apache.maven.surefire.booter.ForkedChannelEncoder.encodeMessage(ForkedChannelEncoder.j > at > org.apache.maven.surefire.booter.ForkedChannelEncoder.setOutErr(ForkedChannelEncoder.java: > at > org.apache.maven.surefire.booter.ForkedChannelEncoder.stdErr(ForkedChannelEncoder.java:166 > at > org.apache.maven.surefire.booter.ForkingRunListener.writeTestOutput(ForkingRunListener.jav > at > org.apache.maven.surefire.report.ConsoleOutputCapture$ForwardingPrintStream.write(ConsoleO > at > org.apache.logging.log4j.core.util.CloseShieldOutputStream.write(CloseShieldOutputStream.j > at > org.apache.logging.log4j.core.appender.OutputStreamManager.writeToDestination(OutputStream > at > org.apache.logging.log4j.core.appender.OutputStreamManager.flushBuffer(OutputStreamManager > at > org.apache.logging.log4j.core.appender.OutputStreamManager.flush(OutputStreamManager.java: > at > org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.directEncodeEvent(Abst > at > org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.tryAppend(AbstractOutp > at > org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.append(AbstractOutputS > at > org.apache.logging.log4j.core.config.AppenderControl.tryCallAppender(AppenderControl.java: > at > org.apache.logging.log4j.core.config.AppenderControl.callAppender0(AppenderControl.java:12 > at > org.apache.logging.log4j.core.config.AppenderControl.callAppenderPreventRecursion(Appender > at > 
org.apache.logging.log4j.core.config.AppenderControl.callAppender(AppenderControl.java:84) > at > org.apache.logging.log4j.core.config.LoggerConfig.callAppenders(LoggerConfig.java:543) > at > org.apache.logging.log4j.core.config.LoggerConfig.processLogEvent(LoggerConfig.java:502) > at > org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:485) > at >
[jira] [Resolved] (HIVE-23271) Can't start Hive Interactive Server in HDP 3.1.4 Cluster
[ https://issues.apache.org/jira/browse/HIVE-23271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankur Tagra resolved HIVE-23271. Resolution: Fixed > Can't start Hive Interactive Server in HDP 3.1.4 Cluster > > > Key: HIVE-23271 > URL: https://issues.apache.org/jira/browse/HIVE-23271 > Project: Hive > Issue Type: Bug > Components: Configuration >Affects Versions: 3.1.0 > Environment: All nodes have CentOS 7 > Cluster HDP 3.1.4 >Reporter: Gerardo Adrián Aguirre Vivar >Assignee: Gerardo Adrián Aguirre Vivar >Priority: Major > > Hive interactive server is not working. The installation guide has been > followed > ([https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.4/performance-tuning/content/hive_prepare_to_tune_performance.html]) > but when the server tries to start, an error appears. > > LOGS: > > 2020-04-22T16:43:48,271 INFO [main]: impl.YarnClientImpl > (YarnClientImpl.java:submitApplication(306)) - Submitted application > application_1587555843754_0015 > 2020-04-22T16:43:48,275 INFO [main]: client.TezClient > (TezClient.java:start(404)) - The url to track the Tez Session: > http://:8088/proxy/application_1587555843754_0015/ > 2020-04-22T16:43:53,435 INFO [main]: client.TezClient > (TezClient.java:getAppMasterStatus(881)) - *{color:#0747a6}Failed to retrieve > AM Status via proxy{color}* > com.google.protobuf.ServiceException: java.io.EOFException: End of File > Exception between local host is: > "/10.22.39.12"; destination host is: > "":33889; : java.io.EOFException; For more details see: > http://wiki.apache.org/hadoop/EOFException > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:242) > ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?] > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116) > ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?] > at com.sun.proxy.$Proxy75.getAMStatus(Unknown Source) ~[?:?] 
> at org.apache.tez.client.TezClient.getAppMasterStatus(TezClient.java:874) > [tez-api-0.9.1.3.1.4.0-315.jar:0.9.1.3.1.4.0-315] > at org.apache.tez.client.TezClient.waitTillReady(TezClient.java:1011) > [tez-api-0.9.1.3.1.4.0-315.jar:0.9.1.3.1.4.0-315] > at org.apache.tez.client.TezClient.waitTillReady(TezClient.java:982) > [tez-api-0.9.1.3.1.4.0-315.jar:0.9.1.3.1.4.0-315] > at > org.apache.hadoop.hive.ql.exec.tez.TezSessionState.startSessionAndContainers(TezSessionState.java:536) > [hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] > at > org.apache.hadoop.hive.ql.exec.tez.TezSessionState.openInternal(TezSessionState.java:451) > [hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] > at > org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolSession.openInternal(TezSessionPoolSession.java:124) > [hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] > at > org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(TezSessionState.java:373) > [hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] > at > org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(TezSessionState.java:236) > [hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] > at > org.apache.hadoop.hive.ql.exec.tez.TezSessionPool.startInitialSession(TezSessionPool.java:354) > [hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] > at > org.apache.hadoop.hive.ql.exec.tez.TezSessionPool.startUnderInitLock(TezSessionPool.java:166) > [hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] > at > org.apache.hadoop.hive.ql.exec.tez.TezSessionPool.start(TezSessionPool.java:123) > [hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] > at > org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.startPool(TezSessionPoolManager.java:112) > [hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] > at > org.apache.hive.service.server.HiveServer2.initAndStartTezSessionPoolManager(HiveServer2.java:855) > [hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] > at > 
org.apache.hive.service.server.HiveServer2.startOrReconnectTezSessions(HiveServer2.java:828) > [hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] > at org.apache.hive.service.server.HiveServer2.start(HiveServer2.java:752) > [hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] > at > org.apache.hive.service.server.HiveServer2.startHiveServer2(HiveServer2.java:1078) > [hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] > at > org.apache.hive.service.server.HiveServer2.access$1700(HiveServer2.java:136) > [hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] > at > org.apache.hive.service.server.HiveServer2$StartOptionExecutor.execute(HiveServer2.java:1346) > [hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] > at org.apache.hive.service.server.HiveServer2.main(HiveServer2.java:1190) > [hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] > at
[jira] [Reopened] (HIVE-23271) Can't start Hive Interactive Server in HDP 3.1.4 Cluster
[ https://issues.apache.org/jira/browse/HIVE-23271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankur Tagra reopened HIVE-23271: > Can't start Hive Interactive Server in HDP 3.1.4 Cluster > > > Key: HIVE-23271 > URL: https://issues.apache.org/jira/browse/HIVE-23271 > Project: Hive > Issue Type: Bug > Components: Configuration >Affects Versions: 3.1.0 >Reporter: Gerardo Adrián Aguirre Vivar >Assignee: Gerardo Adrián Aguirre Vivar >Priority: Major
[jira] [Work logged] (HIVE-24179) Memory leak in HS2 DbTxnManager when compiling SHOW LOCKS statement
[ https://issues.apache.org/jira/browse/HIVE-24179?focusedWorklogId=487172=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487172 ] ASF GitHub Bot logged work on HIVE-24179: - Author: ASF GitHub Bot Created on: 21/Sep/20 18:29 Start Date: 21/Sep/20 18:29 Worklog Time Spent: 10m Work Description: deniskuzZ commented on a change in pull request #1509: URL: https://github.com/apache/hive/pull/1509#discussion_r492262911 ## File path: ql/src/java/org/apache/hadoop/hive/ql/ddl/table/lock/show/ShowDbLocksAnalyzer.java ## @@ -47,14 +47,16 @@ public void analyzeInternal(ASTNode root) throws SemanticException { String dbName = stripQuotes(root.getChild(0).getText()); boolean isExtended = (root.getChildCount() > 1); -HiveTxnManager txnManager = null; +boolean useNewLocksFormat; try { - txnManager = TxnManagerFactory.getTxnManagerFactory().getTxnManager(conf); + HiveTxnManager txnManager = TxnManagerFactory.getTxnManagerFactory().getTxnManager(conf); Review comment: TxnManager is initialized before query compilation and has session scope. The correct way to access it is via SessionState. It looks like ShowDbLocksAnalyzer was always creating a new instance of TxnManager. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 487172) Time Spent: 1h (was: 50m)
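The review thread above converges on reusing the session-scoped transaction manager instead of instantiating one per compiled statement. A minimal Java sketch of the two patterns, using hypothetical stand-ins for HiveTxnManager and SessionState (none of these names are Hive's actual API):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for HiveTxnManager; NOT Hive's real class.
class TxnManager {
    // Simulates per-manager state that is only released by an explicit close().
    final byte[] heavyState = new byte[1024];
}

// Hypothetical stand-in for SessionState.
class Session {
    private static final List<TxnManager> retained = new ArrayList<>();
    private TxnManager sessionTxnManager;

    // Leaky pattern: a fresh manager per compiled statement, never closed.
    TxnManager newManagerPerStatement() {
        TxnManager m = new TxnManager();
        retained.add(m); // every compilation retains more state -> eventual OOM
        return m;
    }

    // Fixed pattern: one manager with session scope, reused by every statement.
    TxnManager sessionScopedManager() {
        if (sessionTxnManager == null) {
            sessionTxnManager = new TxnManager();
        }
        return sessionTxnManager;
    }

    static int retainedCount() {
        return retained.size();
    }
}
```

With the session-scoped accessor, compiling SHOW LOCKS a thousand times touches one manager instance; the per-statement pattern accumulates a thousand, which matches the heap growth described in the issue.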
[jira] [Work logged] (HIVE-24179) Memory leak in HS2 DbTxnManager when compiling SHOW LOCKS statement
[ https://issues.apache.org/jira/browse/HIVE-24179?focusedWorklogId=487170=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487170 ] ASF GitHub Bot logged work on HIVE-24179: - Author: ASF GitHub Bot Created on: 21/Sep/20 18:28 Start Date: 21/Sep/20 18:28 Worklog Time Spent: 10m Work Description: deniskuzZ commented on a change in pull request #1509: URL: https://github.com/apache/hive/pull/1509#discussion_r492262911 ## File path: ql/src/java/org/apache/hadoop/hive/ql/ddl/table/lock/show/ShowDbLocksAnalyzer.java ## @@ -47,14 +47,16 @@ public void analyzeInternal(ASTNode root) throws SemanticException { String dbName = stripQuotes(root.getChild(0).getText()); boolean isExtended = (root.getChildCount() > 1); -HiveTxnManager txnManager = null; +boolean useNewLocksFormat; try { - txnManager = TxnManagerFactory.getTxnManagerFactory().getTxnManager(conf); + HiveTxnManager txnManager = TxnManagerFactory.getTxnManagerFactory().getTxnManager(conf); Review comment: TxnManager is initialised before query compilation and has a session scope. Correct way to access it is via SessionState. Looks like ShowDbLocksAnalyzer was always creating new instance of TxnManager. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 487170) Time Spent: 50m (was: 40m)
[jira] [Work logged] (HIVE-24009) Support partition pruning and other physical transformations for EXECUTE statement
[ https://issues.apache.org/jira/browse/HIVE-24009?focusedWorklogId=487125=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487125 ] ASF GitHub Bot logged work on HIVE-24009: - Author: ASF GitHub Bot Created on: 21/Sep/20 17:48 Start Date: 21/Sep/20 17:48 Worklog Time Spent: 10m Work Description: vineetgarg02 commented on a change in pull request #1472: URL: https://github.com/apache/hive/pull/1472#discussion_r492240115 ## File path: ql/src/test/results/clientpositive/llap/constprog_dpp.q.out ## @@ -84,12 +84,13 @@ Stage-0 Select Operator [SEL_40] (rows=1 width=4) Output:["_col0"] TableScan [TS_24] (rows=1 width=4) -Output:["id"] +default@tb2,tb2,Tbl:COMPLETE,Col:NONE,Output:["id"] <-Map 6 [CONTAINS] vectorized, llap Reduce Output Operator [RS_45] Limit [LIM_44] (rows=1 width=2) Number of rows:1 Select Operator [SEL_43] (rows=1 width=0) Output:["_col0"] TableScan [TS_29] (rows=1 width=0) +default@tb2,tb2,Tbl:PARTIAL,Col:COMPLETE Review comment: I confirmed that this is expected. I compared this plan against master (with explain.user set to false) and there is no difference in the plan. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 487125) Time Spent: 1h (was: 50m) > Support partition pruning and other physical transformations for EXECUTE > statement > --- > > Key: HIVE-24009 > URL: https://issues.apache.org/jira/browse/HIVE-24009 > Project: Hive > Issue Type: Sub-task >Reporter: Vineet Garg >Assignee: Vineet Garg >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > Current partition pruning (compile time) isn't kicked in for EXECUTE > statements. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24009) Support partition pruning and other physical transformations for EXECUTE statement
[ https://issues.apache.org/jira/browse/HIVE-24009?focusedWorklogId=487109=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487109 ] ASF GitHub Bot logged work on HIVE-24009: - Author: ASF GitHub Bot Created on: 21/Sep/20 17:30 Start Date: 21/Sep/20 17:30 Worklog Time Spent: 10m Work Description: vineetgarg02 commented on a change in pull request #1472: URL: https://github.com/apache/hive/pull/1472#discussion_r492229758 ## File path: ql/src/test/results/clientpositive/llap/constprog_dpp.q.out ## @@ -84,12 +84,13 @@ Stage-0 Select Operator [SEL_40] (rows=1 width=4) Output:["_col0"] TableScan [TS_24] (rows=1 width=4) -Output:["id"] +default@tb2,tb2,Tbl:COMPLETE,Col:NONE,Output:["id"] Review comment: Yeah, I think this is likely a side effect of some changes w.r.t. serialization/de-serialization, although it is a positive one, now that we have more information in the explain plan. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 487109) Time Spent: 50m (was: 40m)
[jira] [Commented] (HIVE-24122) When CBO is enable, CAST(STR as Bigint)IS NOT NULL result is wrong
[ https://issues.apache.org/jira/browse/HIVE-24122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17199545#comment-17199545 ] Stamatis Zampetakis commented on HIVE-24122: Great, one problem less to deal with :) > When CBO is enable, CAST(STR as Bigint)IS NOT NULL result is wrong > --- > > Key: HIVE-24122 > URL: https://issues.apache.org/jira/browse/HIVE-24122 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 3.1.0, 3.1.2 >Reporter: GuangMing Lu >Assignee: GuangMing Lu >Priority: Major > Fix For: 4.0.0 > > > {code:java} > create database testdb; > CREATE TABLE IF NOT EXISTS testdb.z_tab > ( > SEARCHWORD STRING, > COUNT_NUM BIGINT, > WORDS STRING > ) > ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' > STORED AS TEXTFILE; > insert into table testdb.z_tab > values('hivetest',111,'aaa'),('hivetest2',111,'bbb'); > set hive.cbo.enable=true; > SELECT CAST(searchword as bigint) IS NOT NULL FROM testdb.z_tab; > SELECT CAST(searchword as bigint) IS NULL FROM testdb.z_tab; > {code} > The SQL results for both queries are the same, as follows: > {noformat} > +---+ > | _c0 | > +---+ > | true | > | true | > +---+{noformat} > SELECT CAST(searchword as bigint) IS NOT NULL FROM testdb.z_tab; execute > result is wrong > -- This message was sent by Atlassian Jira (v8.3.4#803005)
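Outside Hive, the expected semantics are easy to pin down: casting a non-numeric string to BIGINT yields NULL, so `IS NOT NULL` must evaluate to false for such rows. A small Java sketch of that contract (the helper below is illustrative, not Hive's implementation):

```java
// Illustrative helper mimicking SQL CAST(str AS BIGINT); not Hive's implementation.
class BigintCast {
    // Returns null when the string is not a valid 64-bit integer, as SQL CAST does.
    static Long castToBigint(String s) {
        if (s == null) {
            return null;
        }
        try {
            return Long.parseLong(s.trim());
        } catch (NumberFormatException e) {
            return null; // a malformed value casts to NULL, it does not raise an error
        }
    }

    // Models the predicate CAST(str AS BIGINT) IS NOT NULL
    static boolean isNotNullAfterCast(String s) {
        return castToBigint(s) != null;
    }
}
```

For the repro rows 'hivetest' and 'hivetest2', IS NULL should return true and IS NOT NULL should return false; the reported bug is that, with CBO enabled, both queries returned true.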
[jira] [Work logged] (HIVE-24009) Support partition pruning and other physical transformations for EXECUTE statement
[ https://issues.apache.org/jira/browse/HIVE-24009?focusedWorklogId=487107=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487107 ] ASF GitHub Bot logged work on HIVE-24009: - Author: ASF GitHub Bot Created on: 21/Sep/20 17:29 Start Date: 21/Sep/20 17:29 Worklog Time Spent: 10m Work Description: vineetgarg02 commented on a change in pull request #1472: URL: https://github.com/apache/hive/pull/1472#discussion_r492229067 ## File path: ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java ## @@ -387,6 +387,12 @@ protected volatile boolean disableJoinMerge = false; protected final boolean defaultJoinMerge; + /* + * This is used by prepare/execute statement + * Prepare/Execute requires operators to be copied and cached + */ + protected Map topOpsCopy = null; Review comment: The shape of the original operator tree is changed by physical transformations and task generation (I don't know why, though); as a result, that tree cannot be used later to regenerate tasks or re-run physical transformations. Therefore we make a copy and cache it after the operator tree is generated. I will leave a comment. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 487107) Time Spent: 40m (was: 0.5h)
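The copy-and-cache strategy described in the comment is a defensive deep copy taken before later phases mutate the tree. A hypothetical sketch of the pattern (the Operator and PreparePlanCache types below are illustrative stand-ins, not Hive's classes):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative stand-in for a plan operator tree node.
class Operator {
    String name;
    List<Operator> children = new ArrayList<>();

    Operator(String name) { this.name = name; }

    // Deep copy so later mutation of the original leaves the cached tree intact.
    Operator deepCopy() {
        Operator copy = new Operator(name);
        for (Operator child : children) {
            copy.children.add(child.deepCopy());
        }
        return copy;
    }
}

// Illustrative cache used by a PREPARE/EXECUTE flow.
class PreparePlanCache {
    private Operator cachedTree;

    // Called right after semantic analysis, before physical transformations mutate the tree.
    void cache(Operator root) {
        cachedTree = root.deepCopy();
    }

    // EXECUTE re-runs transformations against a fresh copy of the cached tree.
    Operator treeForExecute() {
        return cachedTree.deepCopy();
    }
}
```

Because `treeForExecute()` also hands out a copy, each EXECUTE can run partition pruning and task generation without corrupting the cached plan for the next invocation.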
[jira] [Work logged] (HIVE-24179) Memory leak in HS2 DbTxnManager when compiling SHOW LOCKS statement
[ https://issues.apache.org/jira/browse/HIVE-24179?focusedWorklogId=487103=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487103 ] ASF GitHub Bot logged work on HIVE-24179: - Author: ASF GitHub Bot Created on: 21/Sep/20 17:23 Start Date: 21/Sep/20 17:23 Worklog Time Spent: 10m Work Description: zabetak commented on a change in pull request #1509: URL: https://github.com/apache/hive/pull/1509#discussion_r492225545 ## File path: ql/src/java/org/apache/hadoop/hive/ql/ddl/table/lock/show/ShowDbLocksAnalyzer.java ## @@ -47,14 +47,16 @@ public void analyzeInternal(ASTNode root) throws SemanticException { String dbName = stripQuotes(root.getChild(0).getText()); boolean isExtended = (root.getChildCount() > 1); -HiveTxnManager txnManager = null; +boolean useNewLocksFormat; try { - txnManager = TxnManagerFactory.getTxnManagerFactory().getTxnManager(conf); + HiveTxnManager txnManager = TxnManagerFactory.getTxnManagerFactory().getTxnManager(conf); Review comment: That's correct. Normally, we should (but I am not 100% sure) have a TxnManager at this point so there is no need to create a new one just to obtain the flag. I pushed commit https://github.com/apache/hive/pull/1509/commits/297882ee80d52689a9cc1c68da9f7580918439bb to try this out. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 487103) Time Spent: 40m (was: 0.5h)
[jira] [Work logged] (HIVE-24009) Support partition pruning and other physical transformations for EXECUTE statement
[ https://issues.apache.org/jira/browse/HIVE-24009?focusedWorklogId=487101&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487101 ] ASF GitHub Bot logged work on HIVE-24009: - Author: ASF GitHub Bot Created on: 21/Sep/20 17:22 Start Date: 21/Sep/20 17:22 Worklog Time Spent: 10m Work Description: vineetgarg02 commented on a change in pull request #1472: URL: https://github.com/apache/hive/pull/1472#discussion_r492224561 ## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java ## @@ -63,19 +63,19 @@ private VectorizationContext taskVectorizationContext; - protected transient JobConf jc; - private transient boolean inputFileChanged = false; + protected JobConf jc; Review comment: I actually tried not keeping these fields, but I ran into all sorts of issues, such as being unable to serialize/deserialize, or the plan being generated without metadata. I am not sure whether we need to keep all of these fields or can choose them selectively; I kept almost all of them in the interest of time. If Gopal or Rajesh thinks this may cause a performance issue, I can open a follow-up to investigate and choose fields selectively. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 487101) Time Spent: 0.5h (was: 20m) > Support partition pruning and other physical transformations for EXECUTE > statement > --- > > Key: HIVE-24009 > URL: https://issues.apache.org/jira/browse/HIVE-24009 > Project: Hive > Issue Type: Sub-task >Reporter: Vineet Garg >Assignee: Vineet Garg >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > Compile-time partition pruning currently does not kick in for EXECUTE > statements. -- This message was sent by Atlassian Jira (v8.3.4#803005)
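The serialization concern discussed in the review comment above can be illustrated with a minimal, self-contained sketch using plain java.io serialization. The class here is illustrative only (it is not Hive's TableScanOperator, which uses Kryo-based plan serialization): a field marked transient simply does not survive a serialize/deserialize round trip, which is why dropping the transient modifier may be needed for state that must travel with the plan.

```java
import java.io.*;

public class TransientDemo {
    static class Op implements Serializable {
        String conf = "jobconf";          // serialized with the operator
        transient String cache = "local"; // dropped during serialization
    }

    static Op roundTrip(Op op) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(op);
        }
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()))) {
            return (Op) ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Op copy = roundTrip(new Op());
        System.out.println(copy.conf);  // jobconf
        System.out.println(copy.cache); // null -- the transient state did not survive
    }
}
```

This is the tradeoff behind keeping "almost all" fields non-transient: anything still transient comes back uninitialized after the plan is shipped, while anything serialized adds to plan size and serialization cost.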
[jira] [Commented] (HIVE-24163) Dynamic Partitioning Insert fail for MM table fail during MoveTask
[ https://issues.apache.org/jira/browse/HIVE-24163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17199477#comment-17199477 ] Marta Kuczora commented on HIVE-24163: -- There was a typo in the direct insert path. But it turned out that there are more issues. The file listing in the Utilities.getFullDPSpecs method was not correct for MM tables, and for ACID tables when direct insert was on: the method returned all partitions of these tables, not just the ones affected by the current query. Because of this, the lineage information for dynamic-partition inserts into such tables was incorrect. I compared it with the lineage information for inserts into external tables, where only the partitions affected by the query are present. For external tables the data is first written into the staging dir, and when the partitions are listed this directory is checked, so it contains only the newly inserted data. But for MM tables and ACID direct insert there is no staging dir, so the table directory is checked instead and everything in it is listed. 
> Dynamic Partitioning Insert fail for MM table fail during MoveTask > -- > > Key: HIVE-24163 > URL: https://issues.apache.org/jira/browse/HIVE-24163 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Rajkumar Singh >Assignee: Marta Kuczora >Priority: Major > Labels: pull-request-available > Fix For: 3.1.2 > > Time Spent: 20m > Remaining Estimate: 0h > > -- DDLs and Query > {code:java} > create table `class` (name varchar(8), sex varchar(1), age double precision, > height double precision, weight double precision); > insert into table class values ('RAJ','MALE',28,12,12); > CREATE TABLE `PART1` (`id` DOUBLE,`N` DOUBLE,`Name` VARCHAR(8),`Sex` > VARCHAR(1)) PARTITIONED BY(Weight string, Age > string, Height string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001' > LINES TERMINATED BY '\012' STORED AS TEXTFILE; > INSERT INTO TABLE `part1` PARTITION (`Weight`,`Age`,`Height`) SELECT 0, 0, > `Name`,`Sex`,`Weight`,`Age`,`Height` FROM `class`; > {code} > it fail during the MoveTask execution: > {code:java} > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: partition > hdfs://hostname:8020/warehouse/tablespace/managed/hive/part1/.hive-staging_hive_2020-09-02_13-29-58_765_4475282758764123921-1/-ext-1/tmpstats-0_FS_3 > is not a directory! 
> at > org.apache.hadoop.hive.ql.metadata.Hive.getValidPartitionsInPath(Hive.java:2769) > ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] > at > org.apache.hadoop.hive.ql.metadata.Hive.loadDynamicPartitions(Hive.java:2837) > ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] > at > org.apache.hadoop.hive.ql.exec.MoveTask.handleDynParts(MoveTask.java:562) > ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] > at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:440) > ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213) > ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) > ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] > at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:359) > ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] > at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330) > ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] > at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246) > ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] > at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109) > ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:721) > ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:488) > ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:482) > ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] > at > org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:166) > ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] > at > org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:225) > 
~[hive-service-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] > {code} > The reason is that the task writes the fsstat files while closing the > FileSinkOperator, and HS2 then ran the MoveTask to move data into the
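A rough sketch of the listing difference described in the comment above, with no Hive dependencies: the directory contents are hard-coded stand-ins and the method name is hypothetical (it is not Hive's Utilities.getFullDPSpecs). It shows why listing the whole table directory over-reports the partitions affected by a query, while listing a staging directory does not.

```java
import java.util.*;
import java.util.stream.*;

public class DpListingSketch {
    // Keep only entries that look like dynamic-partition dirs (key=value),
    // mimicking how a partition spec would be derived from a path listing.
    static List<String> affectedPartitions(List<String> dirEntries) {
        return dirEntries.stream()
                .filter(e -> e.contains("="))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // External table: the staging dir holds only what this query wrote.
        List<String> stagingDir = Arrays.asList("dept=sales");
        // MM/ACID direct insert: no staging dir, so the whole table dir is
        // listed, including partitions created by earlier inserts.
        List<String> tableDir = Arrays.asList("dept=hr", "dept=eng", "dept=sales");

        System.out.println(affectedPartitions(stagingDir)); // [dept=sales]
        System.out.println(affectedPartitions(tableDir));   // [dept=hr, dept=eng, dept=sales]
    }
}
```

The second listing is what produces the incorrect lineage: every partition in the table appears to have been touched by the query.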
[jira] [Work logged] (HIVE-24163) Dynamic Partitioning Insert fail for MM table fail during MoveTask
[ https://issues.apache.org/jira/browse/HIVE-24163?focusedWorklogId=487020&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487020 ] ASF GitHub Bot logged work on HIVE-24163: - Author: ASF GitHub Bot Created on: 21/Sep/20 15:58 Start Date: 21/Sep/20 15:58 Worklog Time Spent: 10m Work Description: kuczoram commented on pull request #1507: URL: https://github.com/apache/hive/pull/1507#issuecomment-696209500 The file listing in the Utilities.getFullDPSpecs method was not correct for MM tables, and for ACID tables when direct insert was on: the method returned all partitions of these tables, not just the ones affected by the current query. Because of this, the lineage information for dynamic-partition inserts into such tables was incorrect. I compared it with the lineage information for inserts into external tables, where only the partitions affected by the query are present. For external tables the data is first written into the staging dir, and when the partitions are listed this directory is checked, so it contains only the newly inserted data. But for MM tables and ACID direct insert there is no staging dir, so the table directory is checked instead and everything in it is listed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 487020) Time Spent: 20m (was: 10m) > Dynamic Partitioning Insert fail for MM table fail during MoveTask > -- > > Key: HIVE-24163 > URL: https://issues.apache.org/jira/browse/HIVE-24163 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Rajkumar Singh >Assignee: Marta Kuczora >Priority: Major > Labels: pull-request-available > Fix For: 3.1.2 > > Time Spent: 20m > Remaining Estimate: 0h > > -- DDLs and Query > {code:java} > create table `class` (name varchar(8), sex varchar(1), age double precision, > height double precision, weight double precision); > insert into table class values ('RAJ','MALE',28,12,12); > CREATE TABLE `PART1` (`id` DOUBLE,`N` DOUBLE,`Name` VARCHAR(8),`Sex` > VARCHAR(1)) PARTITIONED BY(Weight string, Age > string, Height string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001' > LINES TERMINATED BY '\012' STORED AS TEXTFILE; > INSERT INTO TABLE `part1` PARTITION (`Weight`,`Age`,`Height`) SELECT 0, 0, > `Name`,`Sex`,`Weight`,`Age`,`Height` FROM `class`; > {code} > it fail during the MoveTask execution: > {code:java} > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: partition > hdfs://hostname:8020/warehouse/tablespace/managed/hive/part1/.hive-staging_hive_2020-09-02_13-29-58_765_4475282758764123921-1/-ext-1/tmpstats-0_FS_3 > is not a directory! 
> at > org.apache.hadoop.hive.ql.metadata.Hive.getValidPartitionsInPath(Hive.java:2769) > ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] > at > org.apache.hadoop.hive.ql.metadata.Hive.loadDynamicPartitions(Hive.java:2837) > ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] > at > org.apache.hadoop.hive.ql.exec.MoveTask.handleDynParts(MoveTask.java:562) > ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] > at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:440) > ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213) > ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) > ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] > at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:359) > ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] > at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330) > ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] > at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246) > ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] > at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109) > ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:721) > ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] > at
[jira] [Commented] (HIVE-24060) When the CBO is false, NPE is thrown by an EXCEPT or INTERSECT execution
[ https://issues.apache.org/jira/browse/HIVE-24060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17199373#comment-17199373 ] GuangMing Lu commented on HIVE-24060: - Hey [~dengzh] That is indeed the case, but this works in hive-1.2.1, which leads to incompatibility problems for some users; should we consider addressing it? > When the CBO is false, NPE is thrown by an EXCEPT or INTERSECT execution > > > Key: HIVE-24060 > URL: https://issues.apache.org/jira/browse/HIVE-24060 > Project: Hive > Issue Type: Bug > Components: CBO, Hive >Affects Versions: 3.1.0, 3.1.2 >Reporter: GuangMing Lu >Priority: Major > > {code:java} > set hive.cbo.enable=false; > create table testtable(idx string, namex string) stored as orc; > insert into testtable values('123', 'aaa'), ('234', 'bbb'); > explain select a.idx from (select idx,namex from testtable intersect select > idx,namex from testtable) a > {code} > The execution throws a NullPointerException: > {code:java} > 2020-08-24 15:12:24,261 | WARN | HiveServer2-Handler-Pool: Thread-345 | > Error executing statement: | > org.apache.hive.service.cli.thrift.ThriftCLIService.executeNewStatement(ThriftCLIService.java:1155) > org.apache.hive.service.cli.HiveSQLException: Error while compiling > statement: FAILED: NullPointerException null > at > org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:341) > ~[hive-service-3.1.0.jar:3.1.0] > at > org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:215) > ~[hive-service-3.1.0.jar:3.1.0] > at > org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:316) > ~[hive-service-3.1.0.jar:3.1.0] > at > org.apache.hive.service.cli.operation.Operation.run(Operation.java:253) > ~[hive-service-3.1.0.jar:3.1.0] > at > org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:684) > ~[hive-service-3.1.0.jar:3.1.0] > at > 
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:670) > ~[hive-service-3.1.0.jar:3.1.0] > at > org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:342) > ~[hive-service-3.1.0.jar:3.1.0] > at > org.apache.hive.service.cli.thrift.ThriftCLIService.executeNewStatement(ThriftCLIService.java:1144) > ~[hive-service-3.1.0.jar:3.1.0] > at > org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:1280) > ~[hive-service-3.1.0.jar:3.1.0] > at > org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1557) > ~[hive-service-rpc-3.1.0.jar:3.1.0] > at > org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1542) > ~[hive-service-rpc-3.1.0.jar:3.1.0] > at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) > ~[libthrift-0.9.3.jar:0.9.3] > at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) > ~[libthrift-0.9.3.jar:0.9.3] > at > org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:648) > ~[hive-standalone-metastore-3.1.0.jar:3.1.0] > at > org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) > ~[libthrift-0.9.3.jar:0.9.3] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > ~[?:1.8.0_201] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > ~[?:1.8.0_201] > at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201] > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:4367) > ~[hive-exec-3.1.0.jar:3.1.0] > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:4346) > ~[hive-exec-3.1.0.jar:3.1.0] > at > 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:10576) > ~[hive-exec-3.1.0.jar:3.1.0] > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:10515) > ~[hive-exec-3.1.0.jar:3.1.0] > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11434) > ~[hive-exec-3.1.0.jar:3.1.0] > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11291) > ~[hive-exec-3.1.0.jar:3.1.0] > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11318) > ~[hive-exec-3.1.0.jar:3.1.0] > at >
[jira] [Updated] (HIVE-24186) The aggregate class operation fails when the CBO is false
[ https://issues.apache.org/jira/browse/HIVE-24186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GuangMing Lu updated HIVE-24186: Affects Version/s: 3.1.2 > The aggregate class operation fails when the CBO is false > - > > Key: HIVE-24186 > URL: https://issues.apache.org/jira/browse/HIVE-24186 > Project: Hive > Issue Type: Bug > Components: CBO, SQL >Affects Versions: 3.1.0, 3.1.2 >Reporter: GuangMing Lu >Priority: Major > > {code:java} > create table table_1 > ( > idx string, > namex string > ) stored as orc; > create table table_2 > ( > sid string, > sname string > )stored as orc; > set hive.cbo.enable=false; > explain > insert into table table_1(idx , namex) > select t.sid idx, '123' namex > from table_2 t > group by t.sid > order by 1,2; > {code} > Executing the above SQL will report an error, errors as follows: > {code:java} > org.apache.hive.service.cli.HiveSQLException: Error while compiling > statement: FAILED: SemanticException [Error 10004]: Line 4:7 Invalid table > alias or column reference 't': (possible column names are: _col0, _col1) > at > org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:341) > ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT] > at > org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:215) > ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT] > at > org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:316) > ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT] > at > org.apache.hive.service.cli.operation.Operation.run(Operation.java:253) > ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT] > at > org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:684) > ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT] > at > 
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:670) > ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT] > at sun.reflect.GeneratedMethodAccessor151.invoke(Unknown Source) > ~[?:?] > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > ~[?:1.8.0_242] > at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_242] > at > org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78) > ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT] > at > org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36) > ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT] > at > org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63) > ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT] > at java.security.AccessController.doPrivileged(Native Method) > ~[?:1.8.0_242] > at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_242] > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1737) > ~[hadoop-common-3.1.1-hw-ei-302001-SNAPSHOT.jar:?] > at > org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59) > ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT] > at com.sun.proxy.$Proxy66.executeStatementAsync(Unknown Source) ~[?:?] 
> at > org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:342) > ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT] > at > org.apache.hive.service.cli.thrift.ThriftCLIService.executeNewStatement(ThriftCLIService.java:1144) > ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT] > at > org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:1280) > ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT] > at > org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1557) > > ~[hive-service-rpc-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT] > at > org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1542) > > ~[hive-service-rpc-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT] > at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) > ~[hive-exec-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT] > at
[jira] [Updated] (HIVE-24186) The aggregate class operation fails when the CBO is false
[ https://issues.apache.org/jira/browse/HIVE-24186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GuangMing Lu updated HIVE-24186: Fix Version/s: (was: 3.1.2) (was: 3.1.0) > The aggregate class operation fails when the CBO is false > - > > Key: HIVE-24186 > URL: https://issues.apache.org/jira/browse/HIVE-24186 > Project: Hive > Issue Type: Bug > Components: CBO, SQL >Affects Versions: 3.1.0 >Reporter: GuangMing Lu >Priority: Major > > {code:java} > create table table_1 > ( > idx string, > namex string > ) stored as orc; > create table table_2 > ( > sid string, > sname string > )stored as orc; > set hive.cbo.enable=false; > explain > insert into table table_1(idx , namex) > select t.sid idx, '123' namex > from table_2 t > group by t.sid > order by 1,2; > {code} > Executing the above SQL will report an error, errors as follows: > {code:java} > org.apache.hive.service.cli.HiveSQLException: Error while compiling > statement: FAILED: SemanticException [Error 10004]: Line 4:7 Invalid table > alias or column reference 't': (possible column names are: _col0, _col1) > at > org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:341) > ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT] > at > org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:215) > ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT] > at > org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:316) > ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT] > at > org.apache.hive.service.cli.operation.Operation.run(Operation.java:253) > ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT] > at > org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:684) > ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT] > at > 
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:670) > ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT] > at sun.reflect.GeneratedMethodAccessor151.invoke(Unknown Source) > ~[?:?] > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > ~[?:1.8.0_242] > at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_242] > at > org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78) > ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT] > at > org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36) > ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT] > at > org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63) > ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT] > at java.security.AccessController.doPrivileged(Native Method) > ~[?:1.8.0_242] > at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_242] > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1737) > ~[hadoop-common-3.1.1-hw-ei-302001-SNAPSHOT.jar:?] > at > org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59) > ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT] > at com.sun.proxy.$Proxy66.executeStatementAsync(Unknown Source) ~[?:?] 
> at > org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:342) > ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT] > at > org.apache.hive.service.cli.thrift.ThriftCLIService.executeNewStatement(ThriftCLIService.java:1144) > ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT] > at > org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:1280) > ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT] > at > org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1557) > > ~[hive-service-rpc-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT] > at > org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1542) > > ~[hive-service-rpc-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT] > at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) > ~[hive-exec-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT] > at
[jira] [Reopened] (HIVE-24122) When CBO is enable, CAST(STR as Bigint)IS NOT NULL result is wrong
[ https://issues.apache.org/jira/browse/HIVE-24122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis reopened HIVE-24122: --- > When CBO is enable, CAST(STR as Bigint)IS NOT NULL result is wrong > --- > > Key: HIVE-24122 > URL: https://issues.apache.org/jira/browse/HIVE-24122 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 3.1.0, 3.1.2 >Reporter: GuangMing Lu >Assignee: GuangMing Lu >Priority: Major > Fix For: 4.0.0 > > > {code:java} > create database testdb; > CREATE TABLE IF NOT EXISTS testdb.z_tab > ( > SEARCHWORD STRING, > COUNT_NUM BIGINT, > WORDS STRING > ) > ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' > STORED AS TEXTFILE; > insert into table testdb.z_tab > values('hivetest',111,'aaa'),('hivetest2',111,'bbb'); > set hive.cbo.enable=true; > SELECT CAST(searchword as bigint) IS NOT NULL FROM testdb.z_tab; > SELECT CAST(searchword as bigint) IS NULL FROM testdb.z_tab; > {code} > The SQL results for both queries are the same, as follows: > {noformat} > +---+ > | _c0 | > +---+ > | true | > | true | > +---+{noformat} > SELECT CAST(searchword as bigint) IS NOT NULL FROM testdb.z_tab; execute > result is wrong > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-24122) When CBO is enable, CAST(STR as Bigint)IS NOT NULL result is wrong
[ https://issues.apache.org/jira/browse/HIVE-24122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis resolved HIVE-24122. --- Resolution: Not A Problem > When CBO is enable, CAST(STR as Bigint)IS NOT NULL result is wrong > --- > > Key: HIVE-24122 > URL: https://issues.apache.org/jira/browse/HIVE-24122 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 3.1.0, 3.1.2 >Reporter: GuangMing Lu >Assignee: GuangMing Lu >Priority: Major > Fix For: 4.0.0 > > > {code:java} > create database testdb; > CREATE TABLE IF NOT EXISTS testdb.z_tab > ( > SEARCHWORD STRING, > COUNT_NUM BIGINT, > WORDS STRING > ) > ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' > STORED AS TEXTFILE; > insert into table testdb.z_tab > values('hivetest',111,'aaa'),('hivetest2',111,'bbb'); > set hive.cbo.enable=true; > SELECT CAST(searchword as bigint) IS NOT NULL FROM testdb.z_tab; > SELECT CAST(searchword as bigint) IS NULL FROM testdb.z_tab; > {code} > The SQL results for both queries are the same, as follows: > {noformat} > +---+ > | _c0 | > +---+ > | true | > | true | > +---+{noformat} > SELECT CAST(searchword as bigint) IS NOT NULL FROM testdb.z_tab; execute > result is wrong > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-24122) When CBO is enable, CAST(STR as Bigint)IS NOT NULL result is wrong
[ https://issues.apache.org/jira/browse/HIVE-24122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GuangMing Lu reassigned HIVE-24122: --- Assignee: GuangMing Lu > When CBO is enable, CAST(STR as Bigint)IS NOT NULL result is wrong > --- > > Key: HIVE-24122 > URL: https://issues.apache.org/jira/browse/HIVE-24122 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 3.1.0, 3.1.2 >Reporter: GuangMing Lu >Assignee: GuangMing Lu >Priority: Major > Fix For: 4.0.0 > > > {code:java} > create database testdb; > CREATE TABLE IF NOT EXISTS testdb.z_tab > ( > SEARCHWORD STRING, > COUNT_NUM BIGINT, > WORDS STRING > ) > ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' > STORED AS TEXTFILE; > insert into table testdb.z_tab > values('hivetest',111,'aaa'),('hivetest2',111,'bbb'); > set hive.cbo.enable=true; > SELECT CAST(searchword as bigint) IS NOT NULL FROM testdb.z_tab; > SELECT CAST(searchword as bigint) IS NULL FROM testdb.z_tab; > {code} > The SQL results for both queries are the same, as follows: > {noformat} > +---+ > | _c0 | > +---+ > | true | > | true | > +---+{noformat} > SELECT CAST(searchword as bigint) IS NOT NULL FROM testdb.z_tab; execute > result is wrong > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-24122) When CBO is enable, CAST(STR as Bigint)IS NOT NULL result is wrong
[ https://issues.apache.org/jira/browse/HIVE-24122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GuangMing Lu resolved HIVE-24122. - Fix Version/s: 4.0.0 Resolution: Fixed > When CBO is enable, CAST(STR as Bigint)IS NOT NULL result is wrong > --- > > Key: HIVE-24122 > URL: https://issues.apache.org/jira/browse/HIVE-24122 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 3.1.0, 3.1.2 >Reporter: GuangMing Lu >Priority: Major > Fix For: 4.0.0 > > > {code:java} > create database testdb; > CREATE TABLE IF NOT EXISTS testdb.z_tab > ( > SEARCHWORD STRING, > COUNT_NUM BIGINT, > WORDS STRING > ) > ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' > STORED AS TEXTFILE; > insert into table testdb.z_tab > values('hivetest',111,'aaa'),('hivetest2',111,'bbb'); > set hive.cbo.enable=true; > SELECT CAST(searchword as bigint) IS NOT NULL FROM testdb.z_tab; > SELECT CAST(searchword as bigint) IS NULL FROM testdb.z_tab; > {code} > The SQL results for both queries are the same, as follows: > {noformat} > +---+ > | _c0 | > +---+ > | true | > | true | > +---+{noformat} > SELECT CAST(searchword as bigint) IS NOT NULL FROM testdb.z_tab; execute > result is wrong > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (HIVE-24122) When CBO is enable, CAST(STR as Bigint)IS NOT NULL result is wrong
[ https://issues.apache.org/jira/browse/HIVE-24122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17199362#comment-17199362 ] GuangMing Lu edited comment on HIVE-24122 at 9/21/20, 12:44 PM: Hey [~zabetak], thanks for reminding me. I tested on master and it is OK; the reason is that master uses calcite-1.21. After analysis, the problem was fixed in Calcite 1.19 or above. was (Author: luguangming): Hey [~zabetak], thanks for reminding me. I tested on master and it is OK; the reason is that master uses calcite-1.21. After analysis, the problem was fixed in Calcite 1.19 or above > When CBO is enable, CAST(STR as Bigint)IS NOT NULL result is wrong > --- > > Key: HIVE-24122 > URL: https://issues.apache.org/jira/browse/HIVE-24122 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 3.1.0, 3.1.2 >Reporter: GuangMing Lu >Priority: Major > > {code:java} > create database testdb; > CREATE TABLE IF NOT EXISTS testdb.z_tab > ( > SEARCHWORD STRING, > COUNT_NUM BIGINT, > WORDS STRING > ) > ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' > STORED AS TEXTFILE; > insert into table testdb.z_tab > values('hivetest',111,'aaa'),('hivetest2',111,'bbb'); > set hive.cbo.enable=true; > SELECT CAST(searchword as bigint) IS NOT NULL FROM testdb.z_tab; > SELECT CAST(searchword as bigint) IS NULL FROM testdb.z_tab; > {code} > The SQL results for both queries are the same, as follows: > {noformat} > +---+ > | _c0 | > +---+ > | true | > | true | > +---+{noformat} > SELECT CAST(searchword as bigint) IS NOT NULL FROM testdb.z_tab; execute > result is wrong > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-24122) When CBO is enable, CAST(STR as Bigint)IS NOT NULL result is wrong
[ https://issues.apache.org/jira/browse/HIVE-24122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17199362#comment-17199362 ] GuangMing Lu commented on HIVE-24122: - Hey [~zabetak], thanks for the reminder. I tested on master and it is OK; master uses calcite-1.21. After analysis, the problem was fixed in Calcite 1.19 and above. > When CBO is enable, CAST(STR as Bigint)IS NOT NULL result is wrong > --- > > Key: HIVE-24122 > URL: https://issues.apache.org/jira/browse/HIVE-24122 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 3.1.0, 3.1.2 >Reporter: GuangMing Lu >Priority: Major -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-24159) Kafka storage handler broken in secure environment pt2: short-circuit on non-secure environment
[ https://issues.apache.org/jira/browse/HIVE-24159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17199355#comment-17199355 ] László Bodor commented on HIVE-24159: - PR merged, thanks [~ashutoshc] for the review! > Kafka storage handler broken in secure environment pt2: short-circuit on > non-secure environment > --- > > Key: HIVE-24159 > URL: https://issues.apache.org/jira/browse/HIVE-24159 > Project: Hive > Issue Type: Improvement >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > As kafka_storage_handler.q was disabled by HIVE-23985, I hadn't realized > upstream that the kafka qtest fails. Instead of setting up a kerberized > environment in qtest (which doesn't seem to be a usual use case, e.g. I haven't > seen hive.server2.authentication.kerberos.principal used in *.q files), I > managed to fix the test with a simple > UserGroupInformation.isSecurityEnabled() check, which can also be useful for > every non-secure environment. > For reference, the exception was: > {code} > 2020-09-14T03:30:01,217 ERROR [a42ef4c6-190c-47a6-86ad-8bf13b8a2dc1 main] > tez.TezTask: Failed to execute tez graph. > org.apache.kafka.common.KafkaException: Failed to create new KafkaAdminClient > at > org.apache.kafka.clients.admin.KafkaAdminClient.createInternal(KafkaAdminClient.java:451) > ~[kafka-clients-2.4.1.7.1.4.0-SNAPSHOT.jar:?] > at org.apache.kafka.clients.admin.Admin.create(Admin.java:59) > ~[kafka-clients-2.4.1.7.1.4.0-SNAPSHOT.jar:?] > at > org.apache.kafka.clients.admin.AdminClient.create(AdminClient.java:39) > ~[kafka-clients-2.4.1.7.1.4.0-SNAPSHOT.jar:?] 
> at > org.apache.hadoop.hive.ql.exec.tez.DagUtils.getKafkaDelegationTokenForBrokers(DagUtils.java:333) > ~[hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.tez.DagUtils.getKafkaCredentials(DagUtils.java:301) > ~[hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.tez.DagUtils.addCredentials(DagUtils.java:282) > ~[hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT] > at org.apache.hadoop.hive.ql.exec.tez.TezTask.build(TezTask.java:516) > ~[hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT] > at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:223) > [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT] > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213) > [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) > [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT] > at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:357) > [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?] > at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330) > [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?] > at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246) > [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?] > at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109) > [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?] > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:721) > [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?] > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:488) > [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?] > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:482) > [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?] 
> at > org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:166) > [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:232) > [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT] > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247) > [hive-cli-3.1.3000.7.1.4.0-SNAPSHOT.jar:?] > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:193) > [hive-cli-3.1.3000.7.1.4.0-SNAPSHOT.jar:?] > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:412) > [hive-cli-3.1.3000.7.1.4.0-SNAPSHOT.jar:?] > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:343) > [hive-cli-3.1.3000.7.1.4.0-SNAPSHOT.jar:?] > at > org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1465) > [classes/:?] > at > org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:1438) > [classes/:?] > at >
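The fix described above amounts to guarding Kafka delegation-token acquisition with a UserGroupInformation.isSecurityEnabled() check, so non-secure clusters never reach the KafkaAdminClient path that throws. A hedged sketch of that short-circuit shape; KafkaTokenFetcher and addKafkaCredentials are stand-in names for illustration, not Hive's actual DagUtils API:

```java
public class KafkaCredentials {
    /** Stand-in for the token-fetching call that fails on non-secure clusters. */
    public interface KafkaTokenFetcher {
        void fetchDelegationToken();
    }

    /**
     * Mirrors the short-circuit: when security is disabled
     * (UserGroupInformation.isSecurityEnabled() == false in Hive),
     * skip token acquisition entirely instead of failing inside the
     * Kafka admin client. Returns true only if a token was fetched.
     */
    public static boolean addKafkaCredentials(boolean securityEnabled,
                                              KafkaTokenFetcher fetcher) {
        if (!securityEnabled) {
            return false; // short-circuit: nothing to do without Kerberos
        }
        fetcher.fetchDelegationToken();
        return true;
    }
}
```

With this guard in place, the qtest no longer needs a kerberized environment: the fetcher is simply never invoked when security is off.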
[jira] [Updated] (HIVE-24159) Kafka storage handler broken in secure environment pt2: short-circuit on non-secure environment
[ https://issues.apache.org/jira/browse/HIVE-24159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Bodor updated HIVE-24159: Fix Version/s: 4.0.0 > Kafka storage handler broken in secure environment pt2: short-circuit on > non-secure environment > --- > > Key: HIVE-24159 > URL: https://issues.apache.org/jira/browse/HIVE-24159 > Project: Hive > Issue Type: Improvement >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 40m > Remaining Estimate: 0h
[jira] [Resolved] (HIVE-24159) Kafka storage handler broken in secure environment pt2: short-circuit on non-secure environment
[ https://issues.apache.org/jira/browse/HIVE-24159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Bodor resolved HIVE-24159. - Resolution: Fixed > Kafka storage handler broken in secure environment pt2: short-circuit on > non-secure environment > --- > > Key: HIVE-24159 > URL: https://issues.apache.org/jira/browse/HIVE-24159 > Project: Hive > Issue Type: Improvement >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 40m > Remaining Estimate: 0h
[jira] [Work logged] (HIVE-24159) Kafka storage handler broken in secure environment pt2: short-circuit on non-secure environment
[ https://issues.apache.org/jira/browse/HIVE-24159?focusedWorklogId=486885&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-486885 ] ASF GitHub Bot logged work on HIVE-24159: - Author: ASF GitHub Bot Created on: 21/Sep/20 12:16 Start Date: 21/Sep/20 12:16 Worklog Time Spent: 10m Work Description: abstractdog merged pull request #1495: URL: https://github.com/apache/hive/pull/1495 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 486885) Time Spent: 40m (was: 0.5h) > Kafka storage handler broken in secure environment pt2: short-circuit on > non-secure environment > --- > > Key: HIVE-24159 > URL: https://issues.apache.org/jira/browse/HIVE-24159 > Project: Hive > Issue Type: Improvement >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h
[jira] [Work logged] (HIVE-24172) Fix TestMmCompactorOnMr
[ https://issues.apache.org/jira/browse/HIVE-24172?focusedWorklogId=486882=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-486882 ] ASF GitHub Bot logged work on HIVE-24172: - Author: ASF GitHub Bot Created on: 21/Sep/20 12:14 Start Date: 21/Sep/20 12:14 Worklog Time Spent: 10m Work Description: klcopp opened a new pull request #1514: URL: https://github.com/apache/hive/pull/1514 Setting the execution engine as MR in the driver field (driver.getConf().setBoolVar(...)) only affects queries in setup and teardown. Compaction runs using the conf field. So the execution engine needed to be set to MR in conf so that compaction would pick it up. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 486882) Remaining Estimate: 0h Time Spent: 10m > Fix TestMmCompactorOnMr > --- > > Key: HIVE-24172 > URL: https://issues.apache.org/jira/browse/HIVE-24172 > Project: Hive > Issue Type: Bug >Reporter: Zoltan Haindrich >Assignee: Karen Coppage >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > test is unstable; > http://ci.hive.apache.org/job/hive-flaky-check/112/ -- This message was sent by Atlassian Jira (v8.3.4#803005)
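The root cause in the PR description above is a two-config-objects bug: the test set the execution engine on driver.getConf(), but compaction reads a different conf instance, so the setting never took effect. A minimal, self-contained illustration with a stand-in Conf class (all names here are ours, not Hive's HiveConf):

```java
import java.util.HashMap;
import java.util.Map;

public class TwoConfsBug {
    /** Tiny stand-in for a Hive configuration: just a key/value map. */
    public static class Conf {
        private final Map<String, String> values = new HashMap<>();
        public void set(String k, String v) { values.put(k, v); }
        public String get(String k, String def) { return values.getOrDefault(k, def); }
    }

    /** Compaction only ever sees the conf instance it was handed. */
    public static String runCompaction(Conf conf) {
        return conf.get("hive.execution.engine", "tez");
    }

    public static void main(String[] args) {
        Conf testConf = new Conf();    // the conf compaction actually uses
        Conf driverConf = new Conf();  // what driver.getConf() returns

        driverConf.set("hive.execution.engine", "mr"); // the original, ineffective change
        assert runCompaction(testConf).equals("tez");  // compaction still runs on tez

        testConf.set("hive.execution.engine", "mr");   // the fix: set it on the right conf
        assert runCompaction(testConf).equals("mr");
    }
}
```

The fix is therefore a one-liner in the right place: flip the flag on the conf object compaction reads, not on the driver's copy.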
[jira] [Updated] (HIVE-24172) Fix TestMmCompactorOnMr
[ https://issues.apache.org/jira/browse/HIVE-24172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-24172: -- Labels: pull-request-available (was: ) > Fix TestMmCompactorOnMr > --- > > Key: HIVE-24172 > URL: https://issues.apache.org/jira/browse/HIVE-24172 > Project: Hive > Issue Type: Bug >Reporter: Zoltan Haindrich >Assignee: Karen Coppage >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > test is unstable; > http://ci.hive.apache.org/job/hive-flaky-check/112/ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24185) Upgrade snappy-java to 1.1.7.5
[ https://issues.apache.org/jira/browse/HIVE-24185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-24185: -- Labels: pull-request-available (was: ) > Upgrade snappy-java to 1.1.7.5 > -- > > Key: HIVE-24185 > URL: https://issues.apache.org/jira/browse/HIVE-24185 > Project: Hive > Issue Type: Bug >Affects Versions: 4.0.0 >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Trivial > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Bump version to take advantage of perf improvements, glibc compatibility etc. > https://github.com/xerial/snappy-java/blob/master/Milestone.md#snappy-java-117-2017-11-30 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24185) Upgrade snappy-java to 1.1.7.5
[ https://issues.apache.org/jira/browse/HIVE-24185?focusedWorklogId=486880=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-486880 ] ASF GitHub Bot logged work on HIVE-24185: - Author: ASF GitHub Bot Created on: 21/Sep/20 11:58 Start Date: 21/Sep/20 11:58 Worklog Time Spent: 10m Work Description: pgaref opened a new pull request #1513: URL: https://github.com/apache/hive/pull/1513 Change-Id: I6d314e48f96006f549974d1907a0d6de563d7250 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 486880) Remaining Estimate: 0h Time Spent: 10m > Upgrade snappy-java to 1.1.7.5 > -- > > Key: HIVE-24185 > URL: https://issues.apache.org/jira/browse/HIVE-24185 > Project: Hive > Issue Type: Bug >Affects Versions: 4.0.0 >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Trivial > Time Spent: 10m > Remaining Estimate: 0h > > Bump version to take advantage of perf improvements, glibc compatibility etc. > https://github.com/xerial/snappy-java/blob/master/Milestone.md#snappy-java-117-2017-11-30 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-24185) Upgrade snappy-java to 1.1.7.5
[ https://issues.apache.org/jira/browse/HIVE-24185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis reassigned HIVE-24185: - > Upgrade snappy-java to 1.1.7.5 > -- > > Key: HIVE-24185 > URL: https://issues.apache.org/jira/browse/HIVE-24185 > Project: Hive > Issue Type: Bug >Affects Versions: 4.0.0 >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Trivial > > Bump version to take advantage of perf improvements, glibc compatibility etc. > https://github.com/xerial/snappy-java/blob/master/Milestone.md#snappy-java-117-2017-11-30 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-24172) Fix TestMmCompactorOnMr
[ https://issues.apache.org/jira/browse/HIVE-24172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17199335#comment-17199335 ] Karen Coppage commented on HIVE-24172: -- http://ci.hive.apache.org/job/hive-flaky-check/115/ > Fix TestMmCompactorOnMr > --- > > Key: HIVE-24172 > URL: https://issues.apache.org/jira/browse/HIVE-24172 > Project: Hive > Issue Type: Bug >Reporter: Zoltan Haindrich >Assignee: Karen Coppage >Priority: Major > > test is unstable; > http://ci.hive.apache.org/job/hive-flaky-check/112/ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24179) Memory leak in HS2 DbTxnManager when compiling SHOW LOCKS statement
[ https://issues.apache.org/jira/browse/HIVE-24179?focusedWorklogId=486867=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-486867 ] ASF GitHub Bot logged work on HIVE-24179: - Author: ASF GitHub Bot Created on: 21/Sep/20 11:24 Start Date: 21/Sep/20 11:24 Worklog Time Spent: 10m Work Description: deniskuzZ commented on a change in pull request #1509: URL: https://github.com/apache/hive/pull/1509#discussion_r491922128 ## File path: ql/src/java/org/apache/hadoop/hive/ql/ddl/table/lock/show/ShowDbLocksAnalyzer.java ## @@ -47,14 +47,16 @@ public void analyzeInternal(ASTNode root) throws SemanticException { String dbName = stripQuotes(root.getChild(0).getText()); boolean isExtended = (root.getChildCount() > 1); -HiveTxnManager txnManager = null; +boolean useNewLocksFormat; try { - txnManager = TxnManagerFactory.getTxnManagerFactory().getTxnManager(conf); + HiveTxnManager txnManager = TxnManagerFactory.getTxnManagerFactory().getTxnManager(conf); Review comment: Do we create txnManager instance here just to get the value of useNewLocksFormat flag? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 486867) Time Spent: 0.5h (was: 20m) > Memory leak in HS2 DbTxnManager when compiling SHOW LOCKS statement > --- > > Key: HIVE-24179 > URL: https://issues.apache.org/jira/browse/HIVE-24179 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Stamatis Zampetakis >Assignee: Stamatis Zampetakis >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: summary.png > > Time Spent: 0.5h > Remaining Estimate: 0h > > The problem can be reproduced by executing repeatedly a SHOW LOCK statement > and monitoring the heap memory of HS2. 
For a small heap (e.g., 2g) it only > takes a few minutes before the server crashes with OutOfMemory error such as > the one shown below. > {noformat} > java.lang.OutOfMemoryError: GC overhead limit exceeded > at java.util.Arrays.copyOf(Arrays.java:3332) > at > java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124) > at > java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448) > at java.lang.StringBuilder.append(StringBuilder.java:136) > at > org.apache.maven.surefire.booter.ForkedChannelEncoder.encodeMessage(ForkedChannelEncoder.j > at > org.apache.maven.surefire.booter.ForkedChannelEncoder.setOutErr(ForkedChannelEncoder.java: > at > org.apache.maven.surefire.booter.ForkedChannelEncoder.stdErr(ForkedChannelEncoder.java:166 > at > org.apache.maven.surefire.booter.ForkingRunListener.writeTestOutput(ForkingRunListener.jav > at > org.apache.maven.surefire.report.ConsoleOutputCapture$ForwardingPrintStream.write(ConsoleO > at > org.apache.logging.log4j.core.util.CloseShieldOutputStream.write(CloseShieldOutputStream.j > at > org.apache.logging.log4j.core.appender.OutputStreamManager.writeToDestination(OutputStream > at > org.apache.logging.log4j.core.appender.OutputStreamManager.flushBuffer(OutputStreamManager > at > org.apache.logging.log4j.core.appender.OutputStreamManager.flush(OutputStreamManager.java: > at > org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.directEncodeEvent(Abst > at > org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.tryAppend(AbstractOutp > at > org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.append(AbstractOutputS > at > org.apache.logging.log4j.core.config.AppenderControl.tryCallAppender(AppenderControl.java: > at > org.apache.logging.log4j.core.config.AppenderControl.callAppender0(AppenderControl.java:12 > at > org.apache.logging.log4j.core.config.AppenderControl.callAppenderPreventRecursion(Appender > at > 
org.apache.logging.log4j.core.config.AppenderControl.callAppender(AppenderControl.java:84) > at > org.apache.logging.log4j.core.config.LoggerConfig.callAppenders(LoggerConfig.java:543) > at > org.apache.logging.log4j.core.config.LoggerConfig.processLogEvent(LoggerConfig.java:502) > at > org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:485) > at >
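The review question above ("Do we create txnManager instance here just to get the value of useNewLocksFormat flag?") points at the leak pattern: each SHOW LOCKS compilation builds a manager that registers itself in a long-lived structure and is never released. A self-contained illustration; ShowLocksLeak, its REGISTRY, and both compile methods are stand-ins for DbTxnManager and HS2 internals, not Hive code:

```java
import java.util.ArrayList;
import java.util.List;

public class ShowLocksLeak {
    /** Stand-in for a long-lived server-side structure (e.g. a heartbeat service). */
    public static final List<TxnManager> REGISTRY = new ArrayList<>();

    /** Stand-in for DbTxnManager: registers itself globally on construction. */
    public static class TxnManager {
        public final boolean useNewLocksFormat;
        public TxnManager(boolean flag) {
            this.useNewLocksFormat = flag;
            REGISTRY.add(this); // the leak: nothing ever removes this entry
        }
    }

    // Leaky pattern: instantiate a full manager per compile just to read one flag.
    public static boolean compileShowLocksLeaky(boolean confFlag) {
        return new TxnManager(confFlag).useNewLocksFormat;
    }

    // Direction suggested by the review: derive the flag from configuration
    // directly, without creating a manager instance at all.
    public static boolean compileShowLocksFixed(boolean confFlag) {
        return confFlag;
    }
}
```

Repeated leaky compiles grow the registry without bound, which matches the observed OutOfMemory after minutes of repeated SHOW LOCKS statements; the fixed path leaves it untouched.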
[jira] [Comment Edited] (HIVE-21964) jdbc handler class cast exception
[ https://issues.apache.org/jira/browse/HIVE-21964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17199318#comment-17199318 ] chenruotao edited comment on HIVE-21964 at 9/21/20, 10:35 AM: -- I have the same problem, though not with Decimal: the date and timestamp types throw the same exception while the job is running. I stopped using the types provided by Hive (org.apache.hadoop.hive.common.type) and it worked. The code looks like this: case DATE: rowVal = Date.valueOf(rowVal.toString()); break; case TIMESTAMP: rowVal = Timestamp.valueOf(rowVal.toString()); was (Author: chenruotao): I have the same problem, though not with Decimal: the date and timestamp types throw the same exception while the job is running. I stopped using the types provided by Hive (org.apache.hadoop.hive.common.type) and it worked. 
the code like this : case DATE: // if (rowVal instanceof java.sql.Date){ // LocalDate localDate = ((java.sql.Date) rowVal).toLocalDate(); //rowVal=Date.of(localDate.getYear(),localDate.getMonthValue(),localDate.getDayOfMonth(); // }else{ // rowVal = Date.valueOf (rowVal.toString()); // } rowVal = Date.valueOf (rowVal.toString()); break; case TIMESTAMP: // if (rowVal instanceof java.sql.Timestamp){ // LocalDateTime localDateTime = ((java.sql.Timestamp) rowVal).toLocalDateTime(); //rowVal=Timestamp.ofEpochSecond(localDateTime.toEpochSecond(UTC),localDateTime.geNano()); // }else{ // rowVal = Timestamp.valueOf (rowVal.toString()); // } rowVal = Timestamp.valueOf (rowVal.toString()); > jdbc handler class cast exception > - > > Key: HIVE-21964 > URL: https://issues.apache.org/jira/browse/HIVE-21964 > Project: Hive > Issue Type: Improvement > Components: JDBC >Affects Versions: 3.1.1 >Reporter: Aloys Zhang >Priority: Major > > Using hive jdbc handler to query external mysql data source with type decimal > type, it throws class cast Exception : > > {code:java} > 2019-07-08T11:11:50,424 ERROR [7787918f-3111-4706-a3b3-0097fa1bc117 main] > CliDriver: Failed with exception > java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: > java.lang.ClassCastException: java.math.BigDecimal cannot be cast to > org.apache.hadoop.hive.common.type.HiveDecimal > java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: > java.lang.ClassCastException: java.math.BigDecimal cannot be cast to > org.apache.hadoop.hive.common.type.HiveDecimal > at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:162) > at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2691) > at > org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:229) > at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188) > at 
org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402) > at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at org.apache.hadoop.util.RunJar.run(RunJar.java:226) > at org.apache.hadoop.util.RunJar.main(RunJar.java:141) > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: > java.lang.ClassCastException: java.math.BigDecimal cannot be cast to > org.apache.hadoop.hive.common.type.HiveDecimal > at > org.apache.hadoop.hive.ql.exec.ListSinkOperator.process(ListSinkOperator.java:98) > at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:995) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:941) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:928) > at > org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95) > at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:995) > at
[jira] [Commented] (HIVE-21964) jdbc handler class cast exception
[ https://issues.apache.org/jira/browse/HIVE-21964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17199318#comment-17199318 ]

chenruotao commented on HIVE-21964:
---

I have the same problem, but with Date and Timestamp rather than Decimal: both types throw the same exception while the job is running. So I stopped using the types provided by Hive (org.apache.hadoop.hive.common.type), and it worked. The code looks like this:

{code:java}
case DATE:
  // if (rowVal instanceof java.sql.Date) {
  //   LocalDate localDate = ((java.sql.Date) rowVal).toLocalDate();
  //   rowVal = Date.of(localDate.getYear(), localDate.getMonthValue(), localDate.getDayOfMonth());
  // } else {
  //   rowVal = Date.valueOf(rowVal.toString());
  // }
  rowVal = Date.valueOf(rowVal.toString());
  break;
case TIMESTAMP:
  // if (rowVal instanceof java.sql.Timestamp) {
  //   LocalDateTime localDateTime = ((java.sql.Timestamp) rowVal).toLocalDateTime();
  //   rowVal = Timestamp.ofEpochSecond(localDateTime.toEpochSecond(UTC), localDateTime.getNano());
  // } else {
  //   rowVal = Timestamp.valueOf(rowVal.toString());
  // }
  rowVal = Timestamp.valueOf(rowVal.toString());
{code}

> jdbc handler class cast exception
> -
>
> Key: HIVE-21964
> URL: https://issues.apache.org/jira/browse/HIVE-21964
> Project: Hive
> Issue Type: Improvement
> Components: JDBC
> Affects Versions: 3.1.1
> Reporter: Aloys Zhang
> Priority: Major
>
> Using the Hive JDBC handler to query an external MySQL data source with a decimal-typed column throws a class cast exception:
>
> {code:java}
> 2019-07-08T11:11:50,424 ERROR [7787918f-3111-4706-a3b3-0097fa1bc117 main] CliDriver: Failed with exception java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassCastException: java.math.BigDecimal cannot be cast to org.apache.hadoop.hive.common.type.HiveDecimal
> java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassCastException: java.math.BigDecimal cannot be cast to org.apache.hadoop.hive.common.type.HiveDecimal
> at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:162)
> at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2691)
> at org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:229)
> at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188)
> at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402)
> at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:226)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:141)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassCastException: java.math.BigDecimal cannot be cast to org.apache.hadoop.hive.common.type.HiveDecimal
> at org.apache.hadoop.hive.ql.exec.ListSinkOperator.process(ListSinkOperator.java:98)
> at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:995)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:941)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:928)
> at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
> at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:995)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:941)
> at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)
> at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:519)
> at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:511)
> at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146)
> ... 14 more
> Caused by: java.lang.ClassCastException: java.math.BigDecimal cannot be cast to org.apache.hadoop.hive.common.type.HiveDecimal
> at org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaHiveDecimalObjectInspector.getPrimitiveJavaObject(JavaHiveDecimalObjectInspector.java:55)
> at org.apache.hadoop.hive.serde2.lazy.LazyUtils.writePrimitiveUTF8(LazyUtils.java:329)
> at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:292)
> at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serializeField(LazySimpleSerDe.java:247)
> at
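The string round-trip the comment relies on can be sketched without Hive on the classpath. This is an illustrative sketch only: the class and method names below are hypothetical, and java.time types stand in for org.apache.hadoop.hive.common.type. The point is that java.sql.Date.toString() and java.sql.Timestamp.toString() yield formats the target type can parse back, avoiding the direct cast that triggers the ClassCastException.

```java
import java.sql.Timestamp;
import java.time.LocalDate;
import java.time.LocalDateTime;

// Sketch of the workaround described above, using only JDK types.
public class JdbcDateWorkaround {

    // Mirrors the DATE branch: java.sql.Date.toString() yields "yyyy-MM-dd",
    // which the target date type can parse back.
    public static LocalDate convertDate(Object rowVal) {
        return LocalDate.parse(rowVal.toString());
    }

    // Mirrors the TIMESTAMP branch: java.sql.Timestamp.toString() yields
    // "yyyy-MM-dd HH:mm:ss.f...", which Timestamp.valueOf() parses back.
    public static LocalDateTime convertTimestamp(Object rowVal) {
        return Timestamp.valueOf(rowVal.toString()).toLocalDateTime();
    }

    public static void main(String[] args) {
        java.sql.Date d = java.sql.Date.valueOf("2020-09-21");
        System.out.println(convertDate(d));        // 2020-09-21
        Timestamp ts = Timestamp.valueOf("2020-09-21 10:35:00");
        System.out.println(convertTimestamp(ts));  // 2020-09-21T10:35
    }
}
```

The real fix would convert inside the JDBC storage handler's accessor, but the round-trip shown here is the same mechanism the commented-out branches attempt more directly.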
[jira] [Work logged] (HIVE-24179) Memory leak in HS2 DbTxnManager when compiling SHOW LOCKS statement
[ https://issues.apache.org/jira/browse/HIVE-24179?focusedWorklogId=486849=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-486849 ]

ASF GitHub Bot logged work on HIVE-24179:
-
Author: ASF GitHub Bot
Created on: 21/Sep/20 09:58
Start Date: 21/Sep/20 09:58
Worklog Time Spent: 10m

Work Description: deniskuzZ commented on a change in pull request #1509:
URL: https://github.com/apache/hive/pull/1509#discussion_r491922128

## File path: ql/src/java/org/apache/hadoop/hive/ql/ddl/table/lock/show/ShowDbLocksAnalyzer.java
##
@@ -47,14 +47,16 @@ public void analyzeInternal(ASTNode root) throws SemanticException {
   String dbName = stripQuotes(root.getChild(0).getText());
   boolean isExtended = (root.getChildCount() > 1);
-  HiveTxnManager txnManager = null;
+  boolean useNewLocksFormat;
   try {
-    txnManager = TxnManagerFactory.getTxnManagerFactory().getTxnManager(conf);
+    HiveTxnManager txnManager = TxnManagerFactory.getTxnManagerFactory().getTxnManager(conf);

Review comment: Do we create a txnManager instance here just to get the value of the useNewLocksFormat flag?

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 486849)
Time Spent: 20m (was: 10m)

> Memory leak in HS2 DbTxnManager when compiling SHOW LOCKS statement
> ---
>
> Key: HIVE-24179
> URL: https://issues.apache.org/jira/browse/HIVE-24179
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2
> Reporter: Stamatis Zampetakis
> Assignee: Stamatis Zampetakis
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: summary.png
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> The problem can be reproduced by repeatedly executing a SHOW LOCKS statement and monitoring the heap memory of HS2.
> For a small heap (e.g., 2g) it only takes a few minutes before the server crashes with an OutOfMemoryError such as the one shown below.
> {noformat}
> java.lang.OutOfMemoryError: GC overhead limit exceeded
> at java.util.Arrays.copyOf(Arrays.java:3332)
> at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
> at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448)
> at java.lang.StringBuilder.append(StringBuilder.java:136)
> at org.apache.maven.surefire.booter.ForkedChannelEncoder.encodeMessage(ForkedChannelEncoder.j
> at org.apache.maven.surefire.booter.ForkedChannelEncoder.setOutErr(ForkedChannelEncoder.java:
> at org.apache.maven.surefire.booter.ForkedChannelEncoder.stdErr(ForkedChannelEncoder.java:166
> at org.apache.maven.surefire.booter.ForkingRunListener.writeTestOutput(ForkingRunListener.jav
> at org.apache.maven.surefire.report.ConsoleOutputCapture$ForwardingPrintStream.write(ConsoleO
> at org.apache.logging.log4j.core.util.CloseShieldOutputStream.write(CloseShieldOutputStream.j
> at org.apache.logging.log4j.core.appender.OutputStreamManager.writeToDestination(OutputStream
> at org.apache.logging.log4j.core.appender.OutputStreamManager.flushBuffer(OutputStreamManager
> at org.apache.logging.log4j.core.appender.OutputStreamManager.flush(OutputStreamManager.java:
> at org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.directEncodeEvent(Abst
> at org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.tryAppend(AbstractOutp
> at org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.append(AbstractOutputS
> at org.apache.logging.log4j.core.config.AppenderControl.tryCallAppender(AppenderControl.java:
> at org.apache.logging.log4j.core.config.AppenderControl.callAppender0(AppenderControl.java:12
> at org.apache.logging.log4j.core.config.AppenderControl.callAppenderPreventRecursion(Appender
> at org.apache.logging.log4j.core.config.AppenderControl.callAppender(AppenderControl.java:84)
> at org.apache.logging.log4j.core.config.LoggerConfig.callAppenders(LoggerConfig.java:543)
> at org.apache.logging.log4j.core.config.LoggerConfig.processLogEvent(LoggerConfig.java:502)
> at org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:485)
> at
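The growth profile described in this issue can be reproduced with a small, self-contained sketch. This is not Hive's actual code: LeakSketch, GLOBAL_REGISTRY, and compileStatement are hypothetical names. It only illustrates the general pattern behind such leaks: an object registered in a long-lived collection at statement-compile time and never deregistered accumulates once per statement, so repeated compilations steadily consume the heap.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative leak pattern (hypothetical names, not Hive code): per-statement
// state is added to a long-lived registry; without a matching removal, each
// compilation pins one more object until the heap is exhausted.
public class LeakSketch {
    public static final List<Object> GLOBAL_REGISTRY = new ArrayList<>();

    public static void compileStatement(boolean deregister) {
        Object txnState = new byte[1024];   // stand-in for per-statement txn state
        GLOBAL_REGISTRY.add(txnState);      // registered at compile time
        if (deregister) {
            GLOBAL_REGISTRY.remove(txnState);  // the fix: release state on close
        }
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) compileStatement(false);
        System.out.println("leaked entries: " + GLOBAL_REGISTRY.size()); // grows per statement
        GLOBAL_REGISTRY.clear();
        for (int i = 0; i < 1000; i++) compileStatement(true);
        System.out.println("with cleanup: " + GLOBAL_REGISTRY.size());   // stays empty
    }
}
```

A heap-dump analyzer (such as the summary.png attached to the issue) exposes exactly this shape: many identical objects all retained by one long-lived collection.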
[jira] [Resolved] (HIVE-17462) hive_1.2.1 hiveserver2 memory leak
[ https://issues.apache.org/jira/browse/HIVE-17462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

kongxianghe resolved HIVE-17462.

Fix Version/s: 1.2.1
Resolution: Won't Fix

Won't fix.

> hive_1.2.1 hiveserver2 memory leak
>
>
> Key: HIVE-17462
> URL: https://issues.apache.org/jira/browse/HIVE-17462
> Project: Hive
> Issue Type: Bug
> Affects Versions: 1.2.1
> Environment: hive version 1.2.1
> Reporter: gehaijiang
> Assignee: kongxianghe
> Priority: Major
> Fix For: 1.2.1
>
>
> hiveserver2 memory leak
> Hive uses third-party UDFs (vs-1.0.2-SNAPSHOT.jar, alogdata-1.0.3-SNAPSHOT-jar-with-dependencies.jar, and so on)
> lr-x-- 1 data data 64 Sep 5 18:37 964 -> /tmp/9e38cc04-5693-474b-9c7d-bfdd978bcbb4_resources/vs-1.0.2-SNAPSHOT.jar (deleted)
> lr-x-- 1 data data 64 Sep 6 10:41 965 -> /tmp/188bbf2a-d8a5-48a7-81fc-b807f9ff201d_resources/alogdata-1.0.3-SNAPSHOT-jar-with-dependencies.jar (deleted)
> lr-x-- 1 data data 64 Sep 6 17:41 97 -> /home/data/programs/hadoop-2.7.1/share/hadoop/hdfs/lib/jsr305-3.0.0.jar
> lrwx-- 1 data data 64 Sep 5 18:37 975 -> socket:[1318353317]
> lr-x-- 1 data data 64 Sep 6 02:38 977 -> /tmp/64e309dc-352f-4ba4-b871-1aa78fe05945_resources/alogdata-1.0.3-SNAPSHOT-jar-with-dependencies.jar (deleted)
> lr-x-- 1 data data 64 Sep 6 17:41 98 -> /home/data/programs/hadoop-2.7.1/share/hadoop/hdfs/lib/xml-apis-1.3.04.jar
> lrwx-- 1 data data 64 Sep 6 08:40 983 -> socket:[1299459344]
> lr-x-- 1 data data 64 Sep 5 19:37 987 -> /tmp/c3054987-c9c6-468a-8b5c-6e20b1972e0b_resources/alogdata-1.0.3-SNAPSHOT-jar-with-dependencies.jar (deleted)
> lr-x-- 1 data data 64 Sep 6 17:41 99 -> /home/data/programs/hadoop-2.7.1/share/hadoop/hdfs/lib/guava-11.0.2.jar
> lr-x-- 1 data data 64 Sep 6 08:40 994 -> /tmp/fc5c44b3-9bd8-4a32-a39a-66cd44032fee_resources/alogdata-1.0.3-SNAPSHOT-jar-with-dependencies.jar (deleted)
> lr-x-- 1 data data 64 Sep 6 06:39 996 -> /tmp/3b3c2bd6-0a0e-4599-b757-4a048a968457_resources/alogdata-1.0.3-SNAPSHOT-jar-with-dependencies.jar (deleted)
> lr-x-- 1 data data 64 Sep 5 17:36 999 -> /tmp/6ad76494-cdda-430b-b7d0-2213731655a8_resources/alogdata-1.0.3-SNAPSHOT-jar-with-dependencies.jar (deleted)
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 20084 data 20 0 13.6g 11g 533m S 62.3 9.2 6619:16 java /home/data/programs/jdk/jdk-current/bin/java-Djava.net.preferIPv4Stack=true-Dhadoop.log.dir=/home/data/hadoop/logs-Dhadoop.log.file=hadoop.log-Dhadoop.home.dir=/home/data/programs/hadoop-2.7.1-Dhadoop.id.str=data-Dhadoop.root.logger=INFO,DRFA-Djava.library.path=/home/data/programs/hadoop-2.7.1/lib/native-Dhadoop.policy.file=hadoop-policy.xml-Djava.net.preferIPv4Stack=true-XX:+UseConcMarkSweepGC-Xms8g-Xmx8g-Dhadoop.security.logger=INFO,NullAppenderorg.apache.hadoop.util.RunJar/home/data/programs/hive-current/lib/hive-service-1.2.1.jarorg.apache.hive.service.server.HiveServer2--hiveconfhive.log.file=hiveserver2.log

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-17462) hive_1.2.1 hiveserver2 memory leak
[ https://issues.apache.org/jira/browse/HIVE-17462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

kongxianghe reassigned HIVE-17462:
--
Assignee: kongxianghe

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (HIVE-17462) hive_1.2.1 hiveserver2 memory leak
[ https://issues.apache.org/jira/browse/HIVE-17462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17199204#comment-17199204 ]

kongxianghe edited comment on HIVE-17462 at 9/21/20, 7:13 AM:
--

This was fixed by https://issues.apache.org/jira/browse/HIVE-10453. If you are using hdp-1.2.1000.2.6.1.0 or certain other versions, that patch may not have been included. Decompile your hive-exec-1.2.1xxx.jar and inspect SessionState.class:

{code}
public void close() throws IOException {
  registry.clear();
  if (txnMgr != null) txnMgr.closeTxnManager();
  JavaUtils.closeClassLoadersTo(conf.getClassLoader(), parentLoader);
  File resourceDir =
@@ -1493,7 +1493,7 @@ public class SessionState {
      sparkSession = null;
    }
  }
  // this line might be missing!!
  registry.closeCUDFLoaders();
  dropSessionPaths(conf);
}
{code}

In HDP 2.6.1's hive-1.2.1 the registry.closeCUDFLoaders() call is missing, and that might cause this problem.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
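Why the missing registry.closeCUDFLoaders() call matters can be sketched with plain JDK classes. This is an assumption-laden illustration, not Hive's code: UdfLoaderCleanup is a hypothetical name, and URLClassLoader stands in for the per-session UDF loaders that closeCUDFLoaders() is supposed to close. A URLClassLoader keeps its jar files open until close() is invoked, which is consistent with the deleted-but-still-open /tmp/..._resources jars shown in the report.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;

// Sketch of the cleanup a closeCUDFLoaders()-style method must perform.
public class UdfLoaderCleanup {

    // One loader per session-level UDF jar; the loader pins the jar open.
    public static URLClassLoader openLoader(URL... jars) {
        return new URLClassLoader(jars, UdfLoaderCleanup.class.getClassLoader());
    }

    // URLClassLoader.close() (JDK 7+) releases the underlying jar file handles;
    // skipping it leaves the descriptors open for the life of the server process.
    public static void closeLoader(URLClassLoader loader) throws IOException {
        loader.close();
    }

    public static void main(String[] args) throws IOException {
        URLClassLoader loader = openLoader();        // no jars needed for the sketch
        System.out.println(loader.getURLs().length); // 0
        closeLoader(loader);                         // without this, the handles leak
    }
}
```

In a long-lived process like HiveServer2, every session that registers UDF jars without closing its loaders on session close accumulates open file handles and classloader metadata, matching the symptoms above.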
[jira] [Commented] (HIVE-17462) hive_1.2.1 hiveserver2 memory leak
[ https://issues.apache.org/jira/browse/HIVE-17462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17199204#comment-17199204 ]

kongxianghe commented on HIVE-17462:
--

This was fixed by https://issues.apache.org/jira/browse/HIVE-10453. If you are using hdp-1.2.1000.2.6.1.0 or certain other versions, that patch may not have been included. Decompile your hive-exec-1.2.1xxx.jar and inspect SessionState.class:

{code}
public void close() throws IOException {
  registry.clear();
  if (txnMgr != null) txnMgr.closeTxnManager();
  JavaUtils.closeClassLoadersTo(conf.getClassLoader(), parentLoader);
  File resourceDir =
@@ -1493,7 +1493,7 @@ public class SessionState {
      sparkSession = null;
    }
  }
  // this line might be missing!!
  registry.closeCUDFLoaders();
  dropSessionPaths(conf);
}
{code}

In HDP 2.6.1's hive-1.2.1 the registry.closeCUDFLoaders() call is missing, and that might cause this problem.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24170) Add UDF resources explicitly to the classpath while handling drop function event during load.
[ https://issues.apache.org/jira/browse/HIVE-24170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anishek Agarwal updated HIVE-24170:
---
Resolution: Fixed
Status: Resolved (was: Patch Available)

Merged to master. Thanks for the patch [~pkumarsinha] and the review [~aasha].

> Add UDF resources explicitly to the classpath while handling drop function
> event during load.
> -
>
> Key: HIVE-24170
> URL: https://issues.apache.org/jira/browse/HIVE-24170
> Project: Hive
> Issue Type: Bug
> Reporter: Pravin Sinha
> Assignee: Pravin Sinha
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-24170.01.patch
>
> Time Spent: 10m
> Remaining Estimate: 0h
>

-- This message was sent by Atlassian Jira (v8.3.4#803005)