Generalize the FileFormat Interface in Hive
-------------------------------------------

                 Key: HIVE-360
                 URL: https://issues.apache.org/jira/browse/HIVE-360
             Project: Hadoop Hive
          Issue Type: Improvement
            Reporter: Zheng Shao


Currently the FileFormat support in Hive is not generalized: we use "if ... 
else" to support TextFileFormat and SequenceFileFormat. There is no way to 
support a third format without changing the "if ... else" structure. We should 
define an interface for the FileFormat that Hive needs.

The OutputFileFormat interface that Hive requires will contain one more method 
than the Hadoop OutputFileFormat: a method that creates a file with a specific 
name.
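
As a rough illustration of that extra method, here is a minimal sketch. It does 
not depend on Hadoop, and every name in it (RowWriter, NamedFileOutputFormat, 
createWriter, InMemoryTextOutputFormat) is hypothetical and is not taken from 
the Hive or Hadoop code base. The "files" are kept in memory only so that the 
example runs on its own.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical minimal writer; it only stands in for a Hadoop-style record writer. */
interface RowWriter {
    void write(String row);
    void close();
}

/**
 * Hypothetical name for the proposed interface. The one extra capability
 * is that the caller chooses the name of the file to be written.
 */
interface NamedFileOutputFormat {
    RowWriter createWriter(String fileName);
}

/** Toy implementation that keeps "files" in memory so the sketch runs without Hadoop. */
class InMemoryTextOutputFormat implements NamedFileOutputFormat {
    final Map<String, List<String>> files = new LinkedHashMap<>();

    @Override
    public RowWriter createWriter(String fileName) {
        final List<String> rows = new ArrayList<>();
        files.put(fileName, rows); // register the "file" under the requested name
        return new RowWriter() {
            @Override public void write(String row) { rows.add(row); }
            @Override public void close() { /* nothing to flush in memory */ }
        };
    }
}
```

With an interface of this shape, the caller would only need to compute the 
target file name and pass it in; it would not need to know which concrete 
format sits behind the interface.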

Hive.g:409 (Hive.g already supports the custom file format, but 
DDLSemanticAnalyzer.java does not recognize it yet):
{code}
KW_STORED KW_AS KW_INPUTFORMAT inFmt=StringLiteral KW_OUTPUTFORMAT outFmt=StringLiteral
{code}

Please add the handling of TOK_TABLEFILEFORMAT here:
DDLSemanticAnalyzer.java:223
{code}
        case HiveParser.TOK_TBLSEQUENCEFILE:
        ...
{code}

Please add the handling of the custom outputFormat here by adding a new 
interface (and casting the user-provided file format to that interface), 
instead of using "if ... else".
FileSinkOperator.java:129-174:
{code}
      if(outputFormat instanceof IgnoreKeyTextOutputFormat) {
        finalPath = new Path(Utilities.toTempPath(conf.getDirName()),
                             Utilities.getTaskId(hconf) +
                             Utilities.getFileExtension(jc, isCompressed));
      ...
{code}
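
The request above (cast to an interface instead of branching on concrete 
classes) could look roughly like the following sketch. All names here 
(PathAwareOutputFormat, finalFileName, TextLikeFormat, SequenceLikeFormat, 
FormatDispatch) are hypothetical, and the file-naming rules are toy 
assumptions chosen for illustration. They are not the actual behavior of 
FileSinkOperator.java.

```java
/** Hypothetical interface: each format decides its own final file name. */
interface PathAwareOutputFormat {
    String finalFileName(String taskId, boolean isCompressed);
}

/** Toy text-like format; the ".gz"/".txt" extensions are assumptions for this sketch. */
class TextLikeFormat implements PathAwareOutputFormat {
    @Override
    public String finalFileName(String taskId, boolean isCompressed) {
        return taskId + (isCompressed ? ".gz" : ".txt");
    }
}

/** Toy sequence-file-like format; assumed here to use the task id with no extension. */
class SequenceLikeFormat implements PathAwareOutputFormat {
    @Override
    public String finalFileName(String taskId, boolean isCompressed) {
        return taskId;
    }
}

final class FormatDispatch {
    private FormatDispatch() {}

    /**
     * Casts a user-provided format object to the interface once, and
     * rejects objects that do not implement it. No per-class branches.
     */
    static PathAwareOutputFormat asPathAware(Object userFormat) {
        if (!(userFormat instanceof PathAwareOutputFormat)) {
            throw new IllegalArgumentException(
                "Output format does not implement PathAwareOutputFormat: " + userFormat);
        }
        return (PathAwareOutputFormat) userFormat;
    }
}
```

The only type check left is the single check against the interface, so a 
third format can be supported by implementing the interface, without 
touching the calling code.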

