Sergey Shelukhin created TEZ-2315:
-------------------------------------

             Summary: TEZ does not call checkOutputSpecs on OutputFormat
                 Key: TEZ-2315
                 URL: https://issues.apache.org/jira/browse/TEZ-2315
             Project: Apache Tez
          Issue Type: Bug
            Reporter: Sergey Shelukhin


In JobSubmitter::submitJobInternal, MR does:
{noformat}

  private void checkSpecs(Job job) throws ClassNotFoundException, 
      InterruptedException, IOException {
    JobConf jConf = (JobConf)job.getConfiguration();
    // Check the output specification
    if (jConf.getNumReduceTasks() == 0 ? 
        jConf.getUseNewMapper() : jConf.getUseNewReducer()) {
      org.apache.hadoop.mapreduce.OutputFormat<?, ?> output =
        ReflectionUtils.newInstance(job.getOutputFormatClass(),
          job.getConfiguration());
      output.checkOutputSpecs(job);
    } else {
      jConf.getOutputFormat().checkOutputSpecs(jtFs, jConf);
    }
  }
{noformat}

Note that if the OutputFormat instance does not already exist, it is created 
via reflection specifically for this call. 

Tez should also call this. In Hive, via HiveOutputFormatImpl, this method is 
called on FileSinkOperator, which in turn calls it on custom formats. This is 
necessary for some of them because they set configuration there.
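
A minimal sketch of the pattern being requested, not actual Tez or Hadoop code: before submitting, look up the configured output format, create it via reflection, and invoke its checkOutputSpecs hook so the format can validate and set configuration. Everything here is a hypothetical stand-in so that the example compiles without Hadoop on the classpath: SketchOutputFormat models only the checkOutputSpecs method of OutputFormat, a plain Map stands in for the job configuration, and the "sketch.*" keys are invented for illustration.

```java
import java.io.IOException;
import java.util.Map;

// Hypothetical stand-in for an output format; only the
// checkOutputSpecs hook is modeled.
interface SketchOutputFormat {
    void checkOutputSpecs(Map<String, String> conf) throws IOException;
}

// Hypothetical custom format that, like the Hive formats mentioned above,
// both validates the job and sets configuration inside checkOutputSpecs.
class ConfiguringOutputFormat implements SketchOutputFormat {
    @Override
    public void checkOutputSpecs(Map<String, String> conf) throws IOException {
        if (conf.get(OutputSpecChecker.OUTPUT_DIR_KEY) == null) {
            throw new IOException("Output directory not set");
        }
        // Side effect that is lost if the framework never calls this method.
        conf.put("sketch.format.initialized", "true");
    }
}

final class OutputSpecChecker {
    // Invented configuration keys, used only by this sketch.
    static final String FORMAT_CLASS_KEY = "sketch.outputformat.class";
    static final String OUTPUT_DIR_KEY = "sketch.output.dir";

    private OutputSpecChecker() {}

    // Submit-time check: instantiate the configured format via reflection
    // specifically for this call, then let it inspect the configuration.
    static void checkSpecs(Map<String, String> conf)
            throws ReflectiveOperationException, IOException {
        String className = conf.get(FORMAT_CLASS_KEY);
        if (className == null) {
            return; // no output format configured, nothing to check
        }
        SketchOutputFormat output = Class.forName(className)
                .asSubclass(SketchOutputFormat.class)
                .getDeclaredConstructor()
                .newInstance();
        output.checkOutputSpecs(conf);
    }
}
```

In this sketch, calling OutputSpecChecker.checkSpecs(conf) with both keys set leaves "sketch.format.initialized" in the configuration; skipping the call leaves it absent, which is the kind of missing setup described above.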



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
