akashrn5 commented on a change in pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#discussion_r508236769
##########
File path:
integration/hive/src/main/java/org/apache/carbondata/hive/MapredCarbonOutputFormat.java
##########
@@ -92,6 +95,14 @@ public void checkOutputSpecs(FileSystem fileSystem, JobConf jobConf) throws IOEx
 }
 String tablePath = FileFactory.getCarbonFile(carbonLoadModel.getTablePath()).getAbsolutePath();
 TaskAttemptID taskAttemptID = TaskAttemptID.forName(jc.get("mapred.task.id"));
+ // taskAttemptID will be null when the insert job is fired from presto. Presto send the JobConf
+ // and since presto does not use the MR framework for execution, the mapred.task.id will be
+ // null, so prepare a new ID.
+ if (taskAttemptID == null) {
+ SimpleDateFormat formatter = new SimpleDateFormat("yyyyMMddHHmm");
+ String jobTrackerId = formatter.format(new Date());
+ taskAttemptID = new TaskAttemptID(jobTrackerId, 0, TaskType.MAP, 0, 0);
Review comment:
Here `taskAttemptID` is a `TaskAttemptID` object. Since a new task is
created for every writer, there should be no problem. We get the JobConf
from Presto and prepare the task attempt ID only to initialize and close
the writer, so it should be fine, I guess. What do you think?
With respect to the ORC writer: ORC uses a different `FileOutputFormat`,
the one from the `mapred` package, while we use the `mapreduce` package.
In `mapred` the task context is not used, so they are not using this.
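
To make the fallback concrete, here is a rough stdlib-only sketch of what the generated ID would look like. It does not call Hadoop: `TaskAttemptIdFallback` and `resolve` are hypothetical names, and the string layout only imitates what `TaskAttemptID.toString()` prints, so treat it as an illustration under those assumptions rather than the actual code path.

```java
import java.text.SimpleDateFormat;
import java.util.Date;

// Hypothetical helper, not part of CarbonData or Hadoop. It only imitates the
// string layout of a Hadoop task attempt ID:
// attempt_<jobTrackerId>_<jobId>_<m|r>_<taskId>_<attemptId>
class TaskAttemptIdFallback {

  static String resolve(String configuredId, Date now) {
    if (configuredId != null) {
      // MR framework case: mapred.task.id is set, so use it unchanged.
      return configuredId;
    }
    // Presto case: nothing is configured, so derive a job tracker ID from the
    // given time, using the same "yyyyMMddHHmm" pattern as the patch.
    String jobTrackerId = new SimpleDateFormat("yyyyMMddHHmm").format(now);
    // Job 0, map task 0, attempt 0, matching the constructor arguments in the diff.
    return String.format("attempt_%s_%04d_m_%06d_%d", jobTrackerId, 0, 0, 0);
  }

  public static void main(String[] args) {
    // ID supplied by the framework is passed through as-is.
    System.out.println(resolve("attempt_200707121733_0003_m_000005_0", new Date()));
    // No ID supplied: a timestamp-based one is generated.
    System.out.println(resolve(null, new Date()));
  }
}
```

Note the pattern has minute granularity, so two IDs generated in the same minute are identical strings. That is acceptable only under the assumption stated above, that the ID is used just to initialize and close a single writer.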