[
https://issues.apache.org/jira/browse/HIVE-24145?focusedWorklogId=482691&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-482691
]
ASF GitHub Bot logged work on HIVE-24145:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 12/Sep/20 20:12
Start Date: 12/Sep/20 20:12
Worklog Time Spent: 10m
Work Description: rbalamohan commented on a change in pull request #1485:
URL: https://github.com/apache/hive/pull/1485#discussion_r486786544
##########
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java
##########
@@ -216,29 +216,47 @@ public FSPaths(Path specPath, boolean isMmTable, boolean isDirectInsert, boolean
}
public void closeWriters(boolean abort) throws HiveException {
+ Exception exception = null;
for (int idx = 0; idx < outWriters.length; idx++) {
if (outWriters[idx] != null) {
try {
outWriters[idx].close(abort);
updateProgress();
} catch (IOException e) {
- throw new HiveException(e);
+ exception = e;
+ LOG.error("Error closing " + outWriters[idx].toString(), e);
+ // continue closing others
}
}
}
- try {
+ for (int i = 0; i < updaters.length; i++) {
+ if (updaters[i] != null) {
+ SerDeStats stats = updaters[i].getStats();
+ // Ignore 0 row files except in case of insert overwrite
+ if (isDirectInsert && (stats.getRowCount() > 0 || isInsertOverwrite)) {
+ outPathsCommitted[i] = updaters[i].getUpdatedFilePath();
+ }
+ try {
+ updaters[i].close(abort);
+ } catch (IOException e) {
+ exception = e;
+ LOG.error("Error closing " + updaters[i].toString(), e);
+ // continue closing others
+ }
+ }
+ }
+ // Made an attempt to close all writers.
+ if (exception != null) {
for (int i = 0; i < updaters.length; i++) {
if (updaters[i] != null) {
- SerDeStats stats = updaters[i].getStats();
- // Ignore 0 row files except in case of insert overwrite
- if (isDirectInsert && (stats.getRowCount() > 0 || isInsertOverwrite)) {
- outPathsCommitted[i] = updaters[i].getUpdatedFilePath();
+ try {
+ fs.delete(updaters[i].getUpdatedFilePath(), true);
+ } catch (IOException e) {
+ e.printStackTrace();
Review comment:
LOG?
##########
File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/SortedDynPartitionOptimizer.java
##########
@@ -284,6 +285,11 @@ public Object process(Node nd, Stack<Node> stack, NodeProcessorCtx procCtx,
// Create ReduceSink operator
ReduceSinkOperator rsOp = getReduceSinkOp(partitionPositions, sortPositions, sortOrder, sortNullOrder,
allRSCols, bucketColumns, numBuckets, fsParent, fsOp.getConf().getWriteType());
+ // we have to make sure not to reorder the child operators as it might cause weird behavior in the tasks at
+ // the same level. when there is auto stats gather at the same level as another operation then it might
+ // cause unnecessary preemption. Maintaining the order here to avoid such preemption and possible errors
Review comment:
Plz add TEZ-3296 as ref if possible.
----------------------------------------------------------------
Issue Time Tracking
-------------------
Worklog Id: (was: 482691)
Time Spent: 40m (was: 0.5h)
> Fix preemption issues in reducers and file sink operators
> ---------------------------------------------------------
>
> Key: HIVE-24145
> URL: https://issues.apache.org/jira/browse/HIVE-24145
> Project: Hive
> Issue Type: Bug
> Reporter: Ramesh Kumar Thangarajan
> Assignee: Ramesh Kumar Thangarajan
> Priority: Major
> Labels: pull-request-available
> Time Spent: 40m
> Remaining Estimate: 0h
>
> There are two issues because of preemption:
> # Reducers are getting reordered as part of optimizations, which causes
> more preemption to happen
> # Preemption in the middle of writing can cause the file to not close and
> lead to errors when we read the file later
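
The second issue is the one the closeWriters change in the diff above addresses: keep closing the remaining writers even when one close() fails, and report the failure only after every writer has been attempted. A minimal, self-contained sketch of that pattern follows. It is not the Hive code: the class CloseAllSketch and the method closeAll are hypothetical names, plain java.io.Closeable stands in for the Hive writer types, and it chains later failures with Throwable.addSuppressed, whereas the patch keeps the most recent exception and logs each one.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of FileSinkOperator.
public class CloseAllSketch {

  /**
   * Attempts to close every non-null resource, even if an earlier close fails.
   * Returns the first IOException seen (later ones attached as suppressed),
   * or null if every close succeeded.
   */
  public static IOException closeAll(List<? extends Closeable> resources) {
    IOException first = null;
    for (Closeable resource : resources) {
      if (resource == null) {
        continue; // mirrors the null checks on outWriters[idx] in the diff
      }
      try {
        resource.close();
      } catch (IOException e) {
        if (first == null) {
          first = e;               // remember the first failure
        } else {
          first.addSuppressed(e);  // keep later failures without losing the first
        }
        // continue closing the others
      }
    }
    return first;
  }

  public static void main(String[] args) {
    List<String> closed = new ArrayList<>();
    List<Closeable> writers = new ArrayList<>();
    writers.add(() -> closed.add("a"));
    writers.add(() -> { throw new IOException("b failed"); });
    writers.add(() -> closed.add("c"));

    IOException failure = closeAll(writers);
    // "c" is still closed even though "b" threw.
    System.out.println("closed=" + closed + ", failure="
        + (failure == null ? "none" : failure.getMessage()));
  }
}
```

A caller would typically rethrow the returned exception (in Hive, presumably wrapped in a HiveException) once any cleanup of partially written files has been attempted.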