[
https://issues.apache.org/jira/browse/TEZ-2315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sergey Shelukhin updated TEZ-2315:
----------------------------------
Description:
In JobSubmitter::submitJobInternal, MR does:
{noformat}
private void checkSpecs(Job job) throws ClassNotFoundException,
    InterruptedException, IOException {
  JobConf jConf = (JobConf)job.getConfiguration();
  // Check the output specification
  if (jConf.getNumReduceTasks() == 0 ?
      jConf.getUseNewMapper() : jConf.getUseNewReducer()) {
    org.apache.hadoop.mapreduce.OutputFormat<?, ?> output =
      ReflectionUtils.newInstance(job.getOutputFormatClass(),
        job.getConfiguration());
    output.checkOutputSpecs(job);
  } else {
    jConf.getOutputFormat().checkOutputSpecs(jtFs, jConf);
  }
}
{noformat}
Note that if the OutputFormat does not exist, it is created via reflection
specifically for this call.
Tez should also call this method. In Hive, via HiveOutputFormatImpl, this method is
called on FileSinkOperator, which in turn calls it on custom formats. This is
necessary for some of them because they set configuration there.
was:
In JobSubmitter::submitJobInternal, MR does:
{noformat}
private void checkSpecs(Job job) throws ClassNotFoundException,
    InterruptedException, IOException {
  JobConf jConf = (JobConf)job.getConfiguration();
  // Check the output specification
  if (jConf.getNumReduceTasks() == 0 ?
      jConf.getUseNewMapper() : jConf.getUseNewReducer()) {
    org.apache.hadoop.mapreduce.OutputFormat<?, ?> output =
      ReflectionUtils.newInstance(job.getOutputFormatClass(),
        job.getConfiguration());
    output.checkOutputSpecs(job);
  } else {
    jConf.getOutputFormat().checkOutputSpecs(jtFs, jConf);
  }
}
{noformat}
Note that if inputformat does not exist, it is created via reflection
specifically for this call.
Tez should also call this. In Hive, via HiveOutputFormatImpl, this method is
called on FileSinkOperator, which calls it on custom formats. This is necessary
for some of them because they set configuration there.
> TEZ does not call checkOutputSpecs on OutputFormat
> --------------------------------------------------
>
> Key: TEZ-2315
> URL: https://issues.apache.org/jira/browse/TEZ-2315
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Siddharth Seth
>
> In JobSubmitter::submitJobInternal, MR does:
> {noformat}
> private void checkSpecs(Job job) throws ClassNotFoundException,
>     InterruptedException, IOException {
>   JobConf jConf = (JobConf)job.getConfiguration();
>   // Check the output specification
>   if (jConf.getNumReduceTasks() == 0 ?
>       jConf.getUseNewMapper() : jConf.getUseNewReducer()) {
>     org.apache.hadoop.mapreduce.OutputFormat<?, ?> output =
>       ReflectionUtils.newInstance(job.getOutputFormatClass(),
>         job.getConfiguration());
>     output.checkOutputSpecs(job);
>   } else {
>     jConf.getOutputFormat().checkOutputSpecs(jtFs, jConf);
>   }
> }
> {noformat}
> Note that if the OutputFormat does not exist, it is created via reflection
> specifically for this call.
> Tez should also call this method. In Hive, via HiveOutputFormatImpl, this method is
> called on FileSinkOperator, which in turn calls it on custom formats. This is
> necessary for some of them because they set configuration there.
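
To illustrate why skipping this call matters, here is a minimal, self-contained Java sketch. It does not use the Hadoop, Hive, or Tez APIs: OutputFormatLike, ConfiguringFormat, the Map-based configuration, and the keys output.dir and format.initialized are all hypothetical stand-ins. It only mimics the pattern described in this issue, where the format is instantiated via reflection for the check and sets configuration inside checkOutputSpecs.

```java
import java.util.HashMap;
import java.util.Map;

public class CheckOutputSpecsSketch {

    /** Hypothetical stand-in for an output format; NOT the Hadoop interface. */
    public interface OutputFormatLike {
        void checkOutputSpecs(Map<String, String> conf) throws Exception;
    }

    /**
     * Hypothetical custom format that validates its settings and also
     * writes configuration as a side effect of the check.
     */
    public static class ConfiguringFormat implements OutputFormatLike {
        public ConfiguringFormat() {}

        @Override
        public void checkOutputSpecs(Map<String, String> conf) {
            if (!conf.containsKey("output.dir")) {
                throw new IllegalStateException("output.dir is not set");
            }
            // Side effect that later stages would depend on.
            conf.put("format.initialized", "true");
        }
    }

    /**
     * Submission-time check: create the format via reflection only for
     * this call, then let it validate (and possibly adjust) the configuration.
     */
    public static void checkSpecs(Class<? extends OutputFormatLike> formatClass,
                                  Map<String, String> conf) throws Exception {
        OutputFormatLike format = formatClass.getDeclaredConstructor().newInstance();
        format.checkOutputSpecs(conf);
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> conf = new HashMap<>();
        conf.put("output.dir", "/tmp/out");
        checkSpecs(ConfiguringFormat.class, conf);
        System.out.println(conf.get("format.initialized")); // prints "true"
    }
}
```

In this sketch, if the submitting framework never invokes checkSpecs, format.initialized is never set, which is the kind of missing configuration the description above attributes to custom formats when the call is skipped.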