Hi All,
GenericFileOutputOperator, which is in the Malhar repository, works with
only a few file systems. GenericFileOutputOperator extends
AbstractFileOutputOperator.
Reason: The openStream() method in AbstractFileOutputOperator calls the
append operation, but not all file systems support append. FTP and S3 are
two file systems that do not support the append() operation.
If GenericFileOutputOperator is used with a file system that does not
support append(), and the operator goes down and comes back up, the file
system throws a "Not Supported" exception when the operator tries to reopen
the file it was writing.
Solution: The following method needs to be called instead of fs.append():
  protected FSDataOutputStream openStreamForNonAppendFS(Path filepath)
      throws IOException
  {
    // Move the existing file aside so a fresh file can take its name.
    Path appendTmpFile = new Path(filepath + "_APPENDING");
    rename(filepath, appendTmpFile);
    FSDataOutputStream fsOut = fs.create(filepath);
    // Copy the old contents into the new file; close the input when done.
    try (FSDataInputStream fsIn = fs.open(appendTmpFile)) {
      IOUtils.copy(fsIn, fsOut);
    }
    flush(fsOut);
    // Non-recursive delete of the temporary copy.
    fs.delete(appendTmpFile, false);
    return fsOut;
  }
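To make the rename-and-copy idea easy to try out without a cluster, here is
a minimal sketch that uses java.nio.file on the local disk as a stand-in for
the Hadoop FileSystem. The class name NonAppendReopen and the method
reopenWithoutAppend() are hypothetical and exist only for this illustration;
it is not the Malhar implementation.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class NonAppendReopen {
  // Reopens 'file' for further writes without an append call: the existing
  // bytes are moved aside, copied into a fresh file, and the still-open
  // stream is returned so the caller can keep writing after them.
  public static OutputStream reopenWithoutAppend(Path file) throws IOException {
    Path tmp = file.resolveSibling(file.getFileName() + "_APPENDING");
    Files.move(file, tmp, StandardCopyOption.REPLACE_EXISTING);
    OutputStream out = Files.newOutputStream(file); // creates a new, empty file
    try {
      Files.copy(tmp, out); // copy the old contents into the new file
      out.flush();
    } catch (IOException e) {
      out.close();
      throw e;
    }
    Files.delete(tmp);
    return out;
  }

  public static void main(String[] args) throws IOException {
    Path file = Files.createTempFile("part", ".txt");
    Files.write(file, "before-failure\n".getBytes());
    try (OutputStream out = reopenWithoutAppend(file)) {
      out.write("after-recovery\n".getBytes());
    }
    // The file now holds both lines, in order.
    System.out.print(new String(Files.readAllBytes(file)));
  }
}
```

Note that the cost of reopening grows with the size of the existing file,
since every byte is copied again. That may matter for large files on remote
stores.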
Below are the options to fix this issue.
(1) Fix it in AbstractFileOutputOperator: catch the "Not Supported"
exception and then call the openStreamForNonAppendFS() method.
(2) Fix it in GenericFileOutputOperator (same approach as (1)).
(3) Create a new operator that extends AbstractFileOutputOperator and
overrides the openStream() method. This new operator would be used only
with file systems that do not support the append operation.
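For discussion, the catch-and-fall-back behaviour behind options (1) and (2)
could look roughly like the sketch below. It runs against an in-memory fake
rather than Hadoop: MiniFs, FakeFs and AppendFallbackSketch are hypothetical
names made up for this example, and the assumption that an unsupported
append surfaces as an IOException or UnsupportedOperationException should be
checked against each real file system.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.HashMap;
import java.util.Map;

class AppendFallbackSketch {
  // Hypothetical, minimal stand-in for the file system calls used below.
  interface MiniFs {
    OutputStream append(String path) throws IOException;
    OutputStream create(String path) throws IOException;
    byte[] read(String path) throws IOException;
    void rename(String from, String to) throws IOException;
    void delete(String path) throws IOException;
  }

  // In-memory fake; append() can be switched off to mimic FTP or S3.
  static class FakeFs implements MiniFs {
    private final Map<String, ByteArrayOutputStream> files = new HashMap<>();
    private final boolean appendSupported;

    FakeFs(boolean appendSupported) { this.appendSupported = appendSupported; }

    public OutputStream append(String path) throws IOException {
      if (!appendSupported) {
        throw new IOException("Not Supported");
      }
      return files.get(path);
    }
    public OutputStream create(String path) {
      ByteArrayOutputStream data = new ByteArrayOutputStream();
      files.put(path, data);
      return data;
    }
    public byte[] read(String path) { return files.get(path).toByteArray(); }
    public void rename(String from, String to) { files.put(to, files.remove(from)); }
    public void delete(String path) { files.remove(path); }
  }

  // Try append first; if it is rejected, rebuild the file via rename and copy.
  static OutputStream openStream(MiniFs fs, String path) throws IOException {
    try {
      return fs.append(path);
    } catch (IOException | UnsupportedOperationException e) {
      // Assumption: any failure here means "append not supported". A real
      // fix would want to narrow this so genuine I/O errors are not masked.
      String tmp = path + "_APPENDING";
      fs.rename(path, tmp);
      OutputStream out = fs.create(path);
      out.write(fs.read(tmp));
      out.flush();
      fs.delete(tmp);
      return out;
    }
  }

  public static void main(String[] args) throws IOException {
    for (boolean appendSupported : new boolean[] {true, false}) {
      FakeFs fs = new FakeFs(appendSupported);
      fs.create("out.txt").write("old;".getBytes());
      openStream(fs, "out.txt").write("new".getBytes());
      System.out.println(new String(fs.read("out.txt"))); // prints "old;new"
    }
  }
}
```

Option (3) would skip the try/catch and go straight to the rename-and-copy
path, which avoids guessing how each file system reports the missing
feature but leaves the choice of operator to the user.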
Please share your thoughts and vote on the above approaches.
Regards,
Chaitanya