akashrn5 commented on a change in pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#discussion_r510102591
##########
File path:
integration/hive/src/main/java/org/apache/carbondata/hive/MapredCarbonOutputFormat.java
##########
@@ -92,6 +95,14 @@ public void checkOutputSpecs(FileSystem fileSystem, JobConf jobConf) throws IOEx
     }
     String tablePath = FileFactory.getCarbonFile(carbonLoadModel.getTablePath()).getAbsolutePath();
     TaskAttemptID taskAttemptID = TaskAttemptID.forName(jc.get("mapred.task.id"));
+    // taskAttemptID will be null when the insert job is fired from presto. Presto send the JobConf
+    // and since presto does not use the MR framework for execution, the mapred.task.id will be
+    // null, so prepare a new ID.
+    if (taskAttemptID == null) {
+      SimpleDateFormat formatter = new SimpleDateFormat("yyyyMMddHHmm");
+      String jobTrackerId = formatter.format(new Date());
+      taskAttemptID = new TaskAttemptID(jobTrackerId, 0, TaskType.MAP, 0, 0);
Review comment:
       > Ok, if this task number is used in the file name, then in the case of
       non-transactional concurrent writes two files can end up with the same file
       name, leading to many issues. So I suggested UUID. You can check again.
       I set the task ID on the load model only if mapred.task.id is present and
       taskAttemptID is not null; if it is null, I don't set a task ID on the load
       model. When we call super.getRecordWriter, CarbonTableOutputFormat will set
       the load model's task number based on DEFAULT_TASK_NO. Please have a look;
       transactional tables also shouldn't be a problem.
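       The collision risk being discussed can be sketched in isolation. The
       following is a minimal, self-contained illustration (the class name
       `TaskIdSketch` and the string-based ID format are hypothetical stand-ins
       for Hadoop's `TaskAttemptID`, since only the naming scheme matters here):
       a minute-granularity timestamp with fixed job/task/attempt numbers gives
       two concurrent writers identical IDs, while a UUID keeps them distinct.

       ```java
       import java.text.SimpleDateFormat;
       import java.util.Date;
       import java.util.UUID;

       public class TaskIdSketch {
         // Mirrors the pattern in the diff: a "yyyyMMddHHmm" timestamp as the
         // jobTrackerId, with job/task/attempt numbers hard-coded to zero.
         // Two writers started within the same minute derive the same ID.
         static String timestampId(Date startTime) {
           String jobTrackerId = new SimpleDateFormat("yyyyMMddHHmm").format(startTime);
           return "attempt_" + jobTrackerId + "_0000_m_000000_0";
         }

         // The reviewer's alternative: a random UUID as the jobTrackerId makes
         // each writer's ID unique regardless of start time.
         static String uuidId() {
           return "attempt_" + UUID.randomUUID() + "_0000_m_000000_0";
         }

         public static void main(String[] args) {
           // Simulate two writers that start in the same minute.
           Date sameMinute = new Date();
           // Collision: both derive identical IDs from the shared timestamp.
           System.out.println(timestampId(sameMinute).equals(timestampId(sameMinute)));
           // No collision: each UUID-based ID is distinct.
           System.out.println(uuidId().equals(uuidId()));
         }
       }
       ```

       Run as-is this prints `true` then `false`, showing why a UUID (or a
       per-table default task number, as in the reply above) avoids two
       concurrent non-transactional writes producing the same file name.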
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]