[jira] [Work logged] (HIVE-24227) sys.replication_metrics table shows incorrect status for failed policies
[ https://issues.apache.org/jira/browse/HIVE-24227?focusedWorklogId=528574&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-528574 ]

ASF GitHub Bot logged work on HIVE-24227:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 27/Dec/20 01:03
Start Date: 27/Dec/20 01:03
Worklog Time Spent: 10m
Work Description: github-actions[bot] closed pull request #1550:
URL: https://github.com/apache/hive/pull/1550

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

Worklog Id: (was: 528574)
Time Spent: 3h (was: 2h 50m)

> sys.replication_metrics table shows incorrect status for failed policies
> ------------------------------------------------------------------------
>
> Key: HIVE-24227
> URL: https://issues.apache.org/jira/browse/HIVE-24227
> Project: Hive
> Issue Type: Bug
> Reporter: Arko Sharma
> Assignee: Arko Sharma
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-24227.04.patch, HIVE-24227.05.patch, HIVE-24227.06.patch, HIVE-24227.07.patch, HIVE-24227.08.patch
>
> Time Spent: 3h
> Remaining Estimate: 0h

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Work logged] (HIVE-24227) sys.replication_metrics table shows incorrect status for failed policies
[ https://issues.apache.org/jira/browse/HIVE-24227?focusedWorklogId=526412&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-526412 ]

ASF GitHub Bot logged work on HIVE-24227:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 20/Dec/20 00:56
Start Date: 20/Dec/20 00:56
Worklog Time Spent: 10m
Work Description: github-actions[bot] commented on pull request #1550:
URL: https://github.com/apache/hive/pull/1550#issuecomment-748546567

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
Feel free to reach out on the d...@hive.apache.org list if the patch is in need of reviews.

Issue Time Tracking
-------------------

Worklog Id: (was: 526412)
Time Spent: 2h 50m (was: 2h 40m)
[jira] [Work logged] (HIVE-24227) sys.replication_metrics table shows incorrect status for failed policies
[ https://issues.apache.org/jira/browse/HIVE-24227?focusedWorklogId=502514&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-502514 ]

ASF GitHub Bot logged work on HIVE-24227:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 20/Oct/20 05:23
Start Date: 20/Oct/20 05:23
Worklog Time Spent: 10m
Work Description: aasha commented on a change in pull request #1550:
URL: https://github.com/apache/hive/pull/1550#discussion_r507575660

## File path: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosExternalTables.java ##
@@ -1324,5 +1331,18 @@ private String relativeExtInfoPath(String dbName) {
       return File.separator + dbName.toLowerCase() + File.separator + FILE_NAME;
     }
   }
-
+
+  private Path getNonRecoverablePath(Path dumpDir, String dbName) throws IOException {

Review comment:
This needs to be a utility method. It is part of other tests as well.

## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/AckTask.java ##
@@ -45,9 +46,10 @@ public int execute() {
       Path ackPath = work.getAckFilePath();
       Utils.create(ackPath, conf);
       LOG.info("Created ack file : {} ", ackPath);
-    } catch (SemanticException e) {
+    } catch (Exception e) {
       setException(e);
-      return ErrorMsg.getErrorMsg(e.getMessage()).getErrorCode();
+      return ReplUtils.handleException(true, e, work.getAckFilePath().getParent().getParent().toString(),

Review comment:
Can work.getAckFilePath().getParent().getParent() be null?

Issue Time Tracking
-------------------

Worklog Id: (was: 502514)
Time Spent: 2h 40m (was: 2.5h)
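Editorial note on the null question above: a chained getParent().getParent() can yield null once the walk passes the root, and calling toString() on that result would then throw. The sketch below shows one null-safe way to walk a fixed number of levels up. It is only an illustration: java.nio.file.Path stands in for Hadoop's Path, and the class and method names (AncestorPaths, ancestor) are hypothetical, not part of Hive.

```java
import java.nio.file.Path;
import java.util.Optional;

/** Illustrative helper; java.nio.file.Path is a stand-in for Hadoop's Path. */
final class AncestorPaths {
    private AncestorPaths() {}

    /**
     * Walks `levels` parents up from `start`.
     * Returns Optional.empty() if `start` is null or the walk runs past the root,
     * instead of letting a later call dereference null.
     */
    static Optional<Path> ancestor(Path start, int levels) {
        Path current = start;
        for (int i = 0; i < levels && current != null; i++) {
            current = current.getParent(); // null once there is no parent left
        }
        return Optional.ofNullable(current);
    }
}
```

A caller could then decide explicitly what to report when the grandparent is missing, for example by using orElse with a fallback value, rather than failing inside the exception handler itself.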
[jira] [Work logged] (HIVE-24227) sys.replication_metrics table shows incorrect status for failed policies
[ https://issues.apache.org/jira/browse/HIVE-24227?focusedWorklogId=495373&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-495373 ]

ASF GitHub Bot logged work on HIVE-24227:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 05/Oct/20 14:53
Start Date: 05/Oct/20 14:53
Worklog Time Spent: 10m
Work Description: ArkoSharma commented on a change in pull request #1550:
URL: https://github.com/apache/hive/pull/1550#discussion_r499659807

## File path: ql/src/java/org/apache/hadoop/hive/ql/ddl/DDLTask.java ##
@@ -82,8 +89,32 @@ public int execute() {
         throw new IllegalArgumentException("Unknown DDL request: " + ddlDesc.getClass());
       }
     } catch (Throwable e) {
+      LOG.error("DDLTask failed", e);
+      int errorCode = ErrorMsg.getErrorMsg(e.getMessage()).getErrorCode();
+      try {
+        ReplicationMetricCollector metricCollector = work.getMetricCollector();
+        if (errorCode > 4) {
+          //in case of replication related task, dumpDirectory should not be null
+          if(work.dumpDirectory != null) {
+            Path nonRecoverableMarker = new Path(work.dumpDirectory, ReplAck.NON_RECOVERABLE_MARKER.toString());
+            org.apache.hadoop.hive.ql.parse.repl.dump.Utils.writeStackTrace(e, nonRecoverableMarker, conf);
+            if(metricCollector != null){
+              metricCollector.reportStageEnd(getName(), Status.FAILED_ADMIN, nonRecoverableMarker.toString());
+            }
+          }
+          if(metricCollector != null){

Review comment:
In replication flows, dumpDirectory and metricCollector should both be non-null. This line covers the corner case where metricCollector has been configured but dumpDirectory has not. It is still a replication case, since only replication tasks can initialise and pass a metricCollector, so we should at least report the FAILED_ADMIN state (with a null non-recoverable path).

Issue Time Tracking
-------------------

Worklog Id: (was: 495373)
Time Spent: 2.5h (was: 2h 20m)
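Editorial note: the decision described in this comment (an error code above 4 means FAILED_ADMIN, with a marker path only when a dump directory is known, and plain FAILED otherwise) can be sketched as one small helper of the kind the reviewer asks for elsewhere in this thread. Everything below is a hypothetical stand-in, not Hive's code: StageStatus, FailureReport, FailureClassifier and the marker file name are assumptions for illustration, and only the threshold of 4 is taken from the quoted diff.

```java
import java.util.Optional;

/** Stand-in for the two failure states mentioned in the review thread. */
enum StageStatus { FAILED, FAILED_ADMIN }

/** Result of classifying a failure: the status plus an optional marker path. */
final class FailureReport {
    final StageStatus status;
    final Optional<String> markerPath;

    FailureReport(StageStatus status, Optional<String> markerPath) {
        this.status = status;
        this.markerPath = markerPath;
    }
}

final class FailureClassifier {
    // Threshold copied from the quoted diff (errorCode > 4).
    static final int ADMIN_ERROR_THRESHOLD = 4;
    // Assumed placeholder name for the marker file; not confirmed by the thread.
    static final String MARKER_NAME = "_non_recoverable";

    private FailureClassifier() {}

    /**
     * Chooses exactly one status per failure, so a caller reports once.
     * A null dumpDirectory still yields FAILED_ADMIN, just without a marker path.
     */
    static FailureReport classify(int errorCode, String dumpDirectory) {
        if (errorCode <= ADMIN_ERROR_THRESHOLD) {
            return new FailureReport(StageStatus.FAILED, Optional.empty());
        }
        Optional<String> marker = dumpDirectory == null
            ? Optional.empty()
            : Optional.of(dumpDirectory + "/" + MARKER_NAME);
        return new FailureReport(StageStatus.FAILED_ADMIN, marker);
    }
}
```

Returning a single report object keeps the corner case (collector present, dump directory absent) in one place, so each task would only need to forward the result to its metric collector when one is configured.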
[jira] [Work logged] (HIVE-24227) sys.replication_metrics table shows incorrect status for failed policies
[ https://issues.apache.org/jira/browse/HIVE-24227?focusedWorklogId=495367&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-495367 ]

ASF GitHub Bot logged work on HIVE-24227:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 05/Oct/20 14:46
Start Date: 05/Oct/20 14:46
Worklog Time Spent: 10m
Work Description: ArkoSharma commented on a change in pull request #1550:
URL: https://github.com/apache/hive/pull/1550#discussion_r499654760

## File path: ql/src/java/org/apache/hadoop/hive/ql/parse/repl/load/message/AlterDatabaseHandler.java ##
@@ -77,9 +79,22 @@
         alterDbDesc = new AlterDatabaseSetOwnerDesc(actualDbName, new PrincipalDesc(newDb.getOwnerName(), newDb.getOwnerType()), context.eventOnlyReplicationSpec());
       }
+      Path metricPath = null;
+      ReplicationMetricCollector metricCollector = null;
+      try{
+        metricPath = ReplUtils.getMetricPath(context, context.hiveConf);

Review comment:
hiveConf has default (package-private) access in Context, so it can't be accessed by ReplUtils.

Issue Time Tracking
-------------------

Worklog Id: (was: 495367)
Time Spent: 2h 20m (was: 2h 10m)
[jira] [Work logged] (HIVE-24227) sys.replication_metrics table shows incorrect status for failed policies
[ https://issues.apache.org/jira/browse/HIVE-24227?focusedWorklogId=495355&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-495355 ]

ASF GitHub Bot logged work on HIVE-24227:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 05/Oct/20 14:25
Start Date: 05/Oct/20 14:25
Worklog Time Spent: 10m
Work Description: ArkoSharma commented on a change in pull request #1550:
URL: https://github.com/apache/hive/pull/1550#discussion_r499638418

## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/DirCopyTask.java ##
@@ -140,7 +142,23 @@ public int execute() {
         }
       });
     } catch (Exception e) {
-      throw new SecurityException(ErrorMsg.REPL_RETRY_EXHAUSTED.format(e.getMessage()), e);

Review comment:
This check is being done in the following lines.

Issue Time Tracking
-------------------

Worklog Id: (was: 495355)
Time Spent: 2h 10m (was: 2h)
[jira] [Work logged] (HIVE-24227) sys.replication_metrics table shows incorrect status for failed policies
[ https://issues.apache.org/jira/browse/HIVE-24227?focusedWorklogId=495283&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-495283 ]

ASF GitHub Bot logged work on HIVE-24227:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 05/Oct/20 11:50
Start Date: 05/Oct/20 11:50
Worklog Time Spent: 10m
Work Description: aasha commented on a change in pull request #1550:
URL: https://github.com/apache/hive/pull/1550#discussion_r499539993

## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/util/ReplUtils.java ##
@@ -280,6 +299,49 @@ public static PathFilter getBootstrapDirectoryFilter(final FileSystem fs) {
     };
   }
+  public static Path getMetricPath(MessageHandler.Context context, HiveConf hiveConf) throws Exception{
+    DumpType dumpType;
+    Path metricPath = null;
+    String dumpMetaFile = DumpMetaData.getDmdFileName();
+    FileSystem fs = null;
+    if(context.dmd != null) {
+      dumpType = context.dmd.getDumpType();
+      fs = context.dmd.getDumpFilePath().getFileSystem(hiveConf);
+      metricPath = context.dmd.getDumpFilePath().getParent();
+    }
+    else {
+      dumpType = null;
+      if(context.location != null){
+        metricPath = (new Path(context.location)).getParent();
+        fs = (new Path(context.location)).getFileSystem(hiveConf);
+      }
+    }
+    //traverse to hiveDumpRoot required by metric-collector
+    while (metricPath != null && fs != null && dumpType != DumpType.BOOTSTRAP && dumpType != DumpType.INCREMENTAL) {
+      metricPath = metricPath.getParent();
+      if (fs.exists(new Path(metricPath, dumpMetaFile))) {
+        dumpType = (new DumpMetaData(metricPath, hiveConf)).getDumpType();
+      }
+    }
+    return metricPath;
+  }
+
+  public static ReplicationMetricCollector getMetricCollector(MessageHandler.Context context, String dbName,
+      Path metricPath, HiveConf hiveConf) throws Exception {
+    if (metricPath != null) {
+      DumpType dumpType = (new DumpMetaData(metricPath, hiveConf)).getDumpType();
+      //for using this, dumpType should be either INCREMENTAL or BOOTSTRAP.
+      if (dumpType == DumpType.BOOTSTRAP) {
+        return new BootstrapLoadMetricCollector(dbName, metricPath.toString(),
+            context.dmd.getDumpExecutionId(), hiveConf);

Review comment:
You can pass just the DumpExecutionId. There is no need to pass the entire context.

Issue Time Tracking
-------------------

Worklog Id: (was: 495283)
Time Spent: 1h 50m (was: 1h 40m)
[jira] [Work logged] (HIVE-24227) sys.replication_metrics table shows incorrect status for failed policies
[ https://issues.apache.org/jira/browse/HIVE-24227?focusedWorklogId=495284&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-495284 ]

ASF GitHub Bot logged work on HIVE-24227:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 05/Oct/20 11:50
Start Date: 05/Oct/20 11:50
Worklog Time Spent: 10m
Work Description: aasha commented on a change in pull request #1550:
URL: https://github.com/apache/hive/pull/1550#discussion_r499540414

## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/util/ReplUtils.java ##
@@ -280,6 +299,49 @@ public static PathFilter getBootstrapDirectoryFilter(final FileSystem fs) {
     };
   }
+  public static Path getMetricPath(MessageHandler.Context context, HiveConf hiveConf) throws Exception{
+    DumpType dumpType;
+    Path metricPath = null;
+    String dumpMetaFile = DumpMetaData.getDmdFileName();
+    FileSystem fs = null;
+    if(context.dmd != null) {
+      dumpType = context.dmd.getDumpType();
+      fs = context.dmd.getDumpFilePath().getFileSystem(hiveConf);
+      metricPath = context.dmd.getDumpFilePath().getParent();
+    }
+    else {
+      dumpType = null;
+      if(context.location != null){
+        metricPath = (new Path(context.location)).getParent();
+        fs = (new Path(context.location)).getFileSystem(hiveConf);
+      }
+    }
+    //traverse to hiveDumpRoot required by metric-collector
+    while (metricPath != null && fs != null && dumpType != DumpType.BOOTSTRAP && dumpType != DumpType.INCREMENTAL) {

Review comment:
this may be error prone.

Issue Time Tracking
-------------------

Worklog Id: (was: 495284)
Time Spent: 2h (was: 1h 50m)
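Editorial note: one reading of "error prone" here is that the quoted loop moves to the parent before checking it, so a walk that reaches the root would hand a null path to the existence check. The sketch below shows a bounded upward search that checks each candidate before moving on and stops cleanly at the root. It is an assumption-laden illustration, not the fix adopted in the patch: java.nio.file.Path replaces Hadoop's Path, the existence check is passed in as a predicate, and DumpRootFinder, findAncestor and maxLevels are hypothetical names.

```java
import java.nio.file.Path;
import java.util.Optional;
import java.util.function.Predicate;

/** Illustrative upward search; java.nio.file.Path stands in for Hadoop's Path. */
final class DumpRootFinder {
    private DumpRootFinder() {}

    /**
     * Starting at `start`, tests each directory with `isDumpRoot` and climbs to the
     * parent on a miss. Gives up after `maxLevels` climbs or when the root has been
     * passed, so the loop can neither run forever nor dereference a null parent.
     */
    static Optional<Path> findAncestor(Path start, int maxLevels, Predicate<Path> isDumpRoot) {
        Path current = start;
        for (int i = 0; i <= maxLevels && current != null; i++) {
            if (isDumpRoot.test(current)) {
                return Optional.of(current);
            }
            current = current.getParent();
        }
        return Optional.empty();
    }
}
```

In a real setting the predicate would be the check for the dump metadata file in that directory; keeping it as a parameter makes the traversal itself easy to unit test without a file system.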
[jira] [Work logged] (HIVE-24227) sys.replication_metrics table shows incorrect status for failed policies
[ https://issues.apache.org/jira/browse/HIVE-24227?focusedWorklogId=495281&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-495281 ]

ASF GitHub Bot logged work on HIVE-24227:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 05/Oct/20 11:41
Start Date: 05/Oct/20 11:41
Worklog Time Spent: 10m
Work Description: aasha commented on a change in pull request #1550:
URL: https://github.com/apache/hive/pull/1550#discussion_r499535302

## File path: ql/src/java/org/apache/hadoop/hive/ql/plan/ReplTxnWork.java ##
@@ -92,6 +120,18 @@ public ReplTxnWork(String dbName, String tableName, List<String> partNames,
     this.operation = type;
   }
+  public ReplTxnWork(String dbName, String tableName, List<String> partNames,

Review comment:
Have two constructors: one with the dumpDirectory and metricCollector, and one without. That way you don't need to change existing code.

Issue Time Tracking
-------------------

Worklog Id: (was: 495281)
Time Spent: 1h 40m (was: 1.5h)
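Editorial note: the two-constructor suggestion is ordinary constructor chaining, where the shorter constructor delegates to the longer one with a default for the new argument, so existing call sites compile unchanged. The sketch below is a minimal illustration under that assumption. TxnWorkSketch and its fields are hypothetical and far smaller than the real ReplTxnWork, and the metric collector is left out to keep the example self-contained.

```java
import java.util.List;
import java.util.Optional;

/** Hypothetical, cut-down work object used only to illustrate constructor chaining. */
final class TxnWorkSketch {
    private final String dbName;
    private final String tableName;
    private final List<String> partNames;
    private final String dumpDirectory; // null for non-replication callers

    /** Existing-style constructor: callers that know nothing about replication keep using it. */
    TxnWorkSketch(String dbName, String tableName, List<String> partNames) {
        this(dbName, tableName, partNames, null);
    }

    /** New constructor for replication callers that also supply a dump directory. */
    TxnWorkSketch(String dbName, String tableName, List<String> partNames, String dumpDirectory) {
        this.dbName = dbName;
        this.tableName = tableName;
        this.partNames = partNames;
        this.dumpDirectory = dumpDirectory;
    }

    String getDbName() { return dbName; }

    String getTableName() { return tableName; }

    List<String> getPartNames() { return partNames; }

    /** Empty when the object was built through the shorter constructor. */
    Optional<String> getDumpDirectory() { return Optional.ofNullable(dumpDirectory); }
}
```

The same delegation pattern applies to the overloaded-method request made elsewhere in this thread: the short overload forwards to the long one, and no caller has to pass null explicitly.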
[jira] [Work logged] (HIVE-24227) sys.replication_metrics table shows incorrect status for failed policies
[ https://issues.apache.org/jira/browse/HIVE-24227?focusedWorklogId=495280&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-495280 ]

ASF GitHub Bot logged work on HIVE-24227:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 05/Oct/20 11:38
Start Date: 05/Oct/20 11:38
Worklog Time Spent: 10m
Work Description: aasha commented on a change in pull request #1550:
URL: https://github.com/apache/hive/pull/1550#discussion_r499533749

## File path: ql/src/java/org/apache/hadoop/hive/ql/parse/repl/load/message/AlterDatabaseHandler.java ##
@@ -77,9 +79,22 @@
         alterDbDesc = new AlterDatabaseSetOwnerDesc(actualDbName, new PrincipalDesc(newDb.getOwnerName(), newDb.getOwnerType()), context.eventOnlyReplicationSpec());
       }
+      Path metricPath = null;
+      ReplicationMetricCollector metricCollector = null;
+      try{
+        metricPath = ReplUtils.getMetricPath(context, context.hiveConf);

Review comment:
You are already passing the context; hiveConf is part of that.

Issue Time Tracking
-------------------

Worklog Id: (was: 495280)
Time Spent: 1.5h (was: 1h 20m)
[jira] [Work logged] (HIVE-24227) sys.replication_metrics table shows incorrect status for failed policies
[ https://issues.apache.org/jira/browse/HIVE-24227?focusedWorklogId=495277&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-495277 ]

ASF GitHub Bot logged work on HIVE-24227:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 05/Oct/20 11:30
Start Date: 05/Oct/20 11:30
Worklog Time Spent: 10m
Work Description: aasha commented on a change in pull request #1550:
URL: https://github.com/apache/hive/pull/1550#discussion_r499529820

## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsUpdateTask.java ##
@@ -355,8 +360,32 @@ public int execute() {
     } catch (Exception e) {
       setException(e);
       LOG.info("Failed to persist stats in metastore", e);
+      int errorCode = ErrorMsg.getErrorMsg(e.getMessage()).getErrorCode();
+      try {
+        ReplicationMetricCollector metricCollector = work.getMetricCollector();
+        if (errorCode > 4) {

Review comment:
Same applies to all the tasks

Issue Time Tracking
-------------------

Worklog Id: (was: 495277)
Time Spent: 1h 20m (was: 1h 10m)
[jira] [Work logged] (HIVE-24227) sys.replication_metrics table shows incorrect status for failed policies
[ https://issues.apache.org/jira/browse/HIVE-24227?focusedWorklogId=495276&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-495276 ]

ASF GitHub Bot logged work on HIVE-24227:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 05/Oct/20 11:30
Start Date: 05/Oct/20 11:30
Worklog Time Spent: 10m
Work Description: aasha commented on a change in pull request #1550:
URL: https://github.com/apache/hive/pull/1550#discussion_r499529679

## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/util/ReplUtils.java ##
@@ -242,14 +250,25 @@ public static String getNonEmpty(String configParam, HiveConf hiveConf, String e
     return taskList;
   }
+  public static List<Task<?>> addTasksForLoadingColStats(ColumnStatistics colStats,
+      HiveConf conf,
+      UpdatedMetaDataTracker updatedMetadata,
+      org.apache.hadoop.hive.metastore.api.Table tableObj,
+      long writeId) throws IOException, TException{
+    return addTasksForLoadingColStats(colStats, conf, updatedMetadata, tableObj,

Review comment:
Same applies to other places as well

Issue Time Tracking
-------------------

Worklog Id: (was: 495276)
Time Spent: 1h 10m (was: 1h)
[jira] [Work logged] (HIVE-24227) sys.replication_metrics table shows incorrect status for failed policies
[ https://issues.apache.org/jira/browse/HIVE-24227?focusedWorklogId=495272&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-495272 ]

ASF GitHub Bot logged work on HIVE-24227:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 05/Oct/20 11:25
Start Date: 05/Oct/20 11:25
Worklog Time Spent: 10m
Work Description: aasha commented on a change in pull request #1550:
URL: https://github.com/apache/hive/pull/1550#discussion_r499526784

## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/util/ReplUtils.java ##
@@ -242,14 +250,25 @@ public static String getNonEmpty(String configParam, HiveConf hiveConf, String e
     return taskList;
   }
+  public static List<Task<?>> addTasksForLoadingColStats(ColumnStatistics colStats,
+      HiveConf conf,
+      UpdatedMetaDataTracker updatedMetadata,
+      org.apache.hadoop.hive.metastore.api.Table tableObj,
+      long writeId) throws IOException, TException{
+    return addTasksForLoadingColStats(colStats, conf, updatedMetadata, tableObj,

Review comment:
Create an overloaded method. There is no need to pass null.

Issue Time Tracking
-------------------

Worklog Id: (was: 495272)
Time Spent: 1h (was: 50m)
[jira] [Work logged] (HIVE-24227) sys.replication_metrics table shows incorrect status for failed policies
[ https://issues.apache.org/jira/browse/HIVE-24227?focusedWorklogId=495270&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-495270 ]

ASF GitHub Bot logged work on HIVE-24227:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 05/Oct/20 11:17
Start Date: 05/Oct/20 11:17
Worklog Time Spent: 10m
Work Description: aasha commented on a change in pull request #1550:
URL: https://github.com/apache/hive/pull/1550#discussion_r499522906

## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/DirCopyTask.java ##
@@ -140,7 +142,23 @@ public int execute() {
         }
       });
     } catch (Exception e) {
-      throw new SecurityException(ErrorMsg.REPL_RETRY_EXHAUSTED.format(e.getMessage()), e);

Review comment:
We need to check why this task was throwing the exception in the first place.

Issue Time Tracking
-------------------

Worklog Id: (was: 495270)
Time Spent: 50m (was: 40m)
[jira] [Work logged] (HIVE-24227) sys.replication_metrics table shows incorrect status for failed policies
[ https://issues.apache.org/jira/browse/HIVE-24227?focusedWorklogId=495268&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-495268 ]

ASF GitHub Bot logged work on HIVE-24227:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 05/Oct/20 11:15
Start Date: 05/Oct/20 11:15
Worklog Time Spent: 10m
Work Description: aasha commented on a change in pull request #1550:
URL: https://github.com/apache/hive/pull/1550#discussion_r499521480

## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/CopyTask.java ##
@@ -103,7 +108,32 @@ protected int copyOnePath(Path fromPath, Path toPath) {
     } catch (Exception e) {
       console.printError("Failed with exception " + e.getMessage(), "\n" + StringUtils.stringifyException(e));
-      return (1);
+      LOG.error("CopyTask failed", e);

Review comment:
exception is not set at the task level

Issue Time Tracking
-------------------

Worklog Id: (was: 495268)
Time Spent: 40m (was: 0.5h)
[jira] [Work logged] (HIVE-24227) sys.replication_metrics table shows incorrect status for failed policies
[ https://issues.apache.org/jira/browse/HIVE-24227?focusedWorklogId=495267&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-495267 ]

ASF GitHub Bot logged work on HIVE-24227:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 05/Oct/20 11:14
Start Date: 05/Oct/20 11:14
Worklog Time Spent: 10m
Work Description: aasha commented on a change in pull request #1550:
URL: https://github.com/apache/hive/pull/1550#discussion_r499521480

## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/CopyTask.java ##
@@ -103,7 +108,32 @@ protected int copyOnePath(Path fromPath, Path toPath) {
     } catch (Exception e) {
       console.printError("Failed with exception " + e.getMessage(), "\n" + StringUtils.stringifyException(e));
-      return (1);
+      LOG.error("CopyTask failed", e);

Review comment:
exception is not set

Issue Time Tracking
-------------------

Worklog Id: (was: 495267)
Time Spent: 0.5h (was: 20m)
[jira] [Work logged] (HIVE-24227) sys.replication_metrics table shows incorrect status for failed policies
[ https://issues.apache.org/jira/browse/HIVE-24227?focusedWorklogId=495266&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-495266 ]

ASF GitHub Bot logged work on HIVE-24227:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 05/Oct/20 11:13
Start Date: 05/Oct/20 11:13
Worklog Time Spent: 10m
Work Description: aasha commented on a change in pull request #1550:
URL: https://github.com/apache/hive/pull/1550#discussion_r499519404

## File path: ql/src/java/org/apache/hadoop/hive/ql/ddl/DDLTask.java ##
@@ -82,8 +89,32 @@ public int execute() {
         throw new IllegalArgumentException("Unknown DDL request: " + ddlDesc.getClass());
       }
     } catch (Throwable e) {
+      LOG.error("DDLTask failed", e);
+      int errorCode = ErrorMsg.getErrorMsg(e.getMessage()).getErrorCode();
+      try {
+        ReplicationMetricCollector metricCollector = work.getMetricCollector();
+        if (errorCode > 4) {
+          //in case of replication related task, dumpDirectory should not be null
+          if(work.dumpDirectory != null) {
+            Path nonRecoverableMarker = new Path(work.dumpDirectory, ReplAck.NON_RECOVERABLE_MARKER.toString());
+            org.apache.hadoop.hive.ql.parse.repl.dump.Utils.writeStackTrace(e, nonRecoverableMarker, conf);
+            if(metricCollector != null){
+              metricCollector.reportStageEnd(getName(), Status.FAILED_ADMIN, nonRecoverableMarker.toString());
+            }
+          }
+          if(metricCollector != null){

Review comment:
This is needed only in the replication case.

## File path: ql/src/java/org/apache/hadoop/hive/ql/ddl/DDLTask.java ##
@@ -82,8 +89,32 @@ public int execute() {
         throw new IllegalArgumentException("Unknown DDL request: " + ddlDesc.getClass());
       }
     } catch (Throwable e) {
+      LOG.error("DDLTask failed", e);
+      int errorCode = ErrorMsg.getErrorMsg(e.getMessage()).getErrorCode();
+      try {
+        ReplicationMetricCollector metricCollector = work.getMetricCollector();
+        if (errorCode > 4) {
+          //in case of replication related task, dumpDirectory should not be null
+          if(work.dumpDirectory != null) {
+            Path nonRecoverableMarker = new Path(work.dumpDirectory, ReplAck.NON_RECOVERABLE_MARKER.toString());
+            org.apache.hadoop.hive.ql.parse.repl.dump.Utils.writeStackTrace(e, nonRecoverableMarker, conf);
+            if(metricCollector != null){
+              metricCollector.reportStageEnd(getName(), Status.FAILED_ADMIN, nonRecoverableMarker.toString());
+            }
+          }
+          if(metricCollector != null){
+            metricCollector.reportStageEnd(getName(), Status.FAILED_ADMIN, null);
+          }
+        } else {
+          if(metricCollector != null){
+            work.getMetricCollector().reportStageEnd(getName(), Status.FAILED);

Review comment:
Use metricCollector directly.

## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsUpdateTask.java ##
@@ -355,8 +360,32 @@ public int execute() {
     } catch (Exception e) {
       setException(e);
       LOG.info("Failed to persist stats in metastore", e);
+      int errorCode = ErrorMsg.getErrorMsg(e.getMessage()).getErrorCode();
+      try {
+        ReplicationMetricCollector metricCollector = work.getMetricCollector();
+        if (errorCode > 4) {

Review comment:
All this code can be part of a util method.

## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java ##
@@ -464,14 +468,63 @@ public int execute() {
           console.printInfo("\n", StringUtils.stringifyException(he),false);
         }
       }
-      setException(he);
+      LOG.error("MoveTask failed", he);
+      errorCode = ErrorMsg.getErrorMsg(he.getMessage()).getErrorCode();
+      try {
+        ReplicationMetricCollector metricCollector = work.getMetricCollector();

Review comment:
util method

## File path: ql/src/java/org/apache/hadoop/hive/ql/ddl/DDLTask.java ##
@@ -82,8 +89,32 @@ public int execute() {
         throw new IllegalArgumentException("Unknown DDL request: " + ddlDesc.getClass());
       }
     } catch (Throwable e) {
+      LOG.error("DDLTask failed", e);

Review comment:
Print the DDL operation too. DDLTask can be called for different operations.

## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsUpdateTask.java ##
@@ -355,8 +360,32 @@ public int execute() {
     } catch (Exception e) {
       setException(e);
       LOG.info("Failed to persist stats in metastore", e);
+      int errorCode = ErrorMsg.getErrorMsg(e.getMessage()).getErrorCode();
+      try {
+        ReplicationMetricCollector metricCollector = work.getMetricCollector();
+        if (errorCode > 4) {
+          //in case of repl
[jira] [Work logged] (HIVE-24227) sys.replication_metrics table shows incorrect status for failed policies
[ https://issues.apache.org/jira/browse/HIVE-24227?focusedWorklogId=495178&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-495178 ]

ASF GitHub Bot logged work on HIVE-24227:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 05/Oct/20 06:08
Start Date: 05/Oct/20 06:08
Worklog Time Spent: 10m
Work Description: ArkoSharma opened a new pull request #1550:
URL: https://github.com/apache/hive/pull/1550

Issue Time Tracking
-------------------

Worklog Id: (was: 495178)
Remaining Estimate: 0h
Time Spent: 10m