I sent a link to you.

The file system is HDFS.

Versions:
HDP HDP-2.2.4.2-2
HDFS 2.6.0.2.2
MapReduce2 2.6.0.2.2
YARN 2.6.0.2.2
Hive 0.14.0.2.2
Tez 0.5.2.2.2

It was a Tez query that caused the exception, but I doubt that’s relevant.


Grant Overby
Software Engineer
Cisco.com <http://www.cisco.com/>
grove...@cisco.com
Mobile: 865 724 4910






This email may contain confidential and privileged material for the sole use of 
the intended recipient. Any review, use, distribution or disclosure by others 
is strictly prohibited. If you are not the intended recipient (or authorized to 
receive for the recipient), please contact the sender by reply email and delete 
all copies of this message.

Please click 
here<http://www.cisco.com/web/about/doing_business/legal/cri/index.html> for 
Company Registration Information.





From: Owen O'Malley <omal...@apache.org>
Reply-To: user@hive.apache.org
Date: Friday, May 22, 2015 at 12:51 PM
To: user@hive.apache.org
Cc: "Bhavana Kamichetty (bkamiche)" <bkami...@cisco.com>
Subject: Re: Malformed Orc file Invalid postscript length 0

Bhavana,
   Could you send me (omal...@apache.org) the incorrect ORC file? Which file
system were you using? HDFS? Which versions of Hadoop and Hive?

Thanks,
   Owen

On Fri, May 22, 2015 at 9:37 AM, Grant Overby (groverby) <grove...@cisco.com> wrote:

I’m getting the following exception when Hive executes a query on an external 
table. It seems the postscript isn’t written even though .close() is called and 
returns normally. Any thoughts?



java.io.IOException: Malformed ORC file hdfs://twig06.twigs:8020/warehouse/completed/events/connection_events/dt=1432229400/1432229419251-bb46892c-939f-45ca-b867-da3675d0ca72.orc. Invalid postscript length 0
    at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.ensureOrcFooter(ReaderImpl.java:230)
    at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.extractMetaInfoFromFooter(ReaderImpl.java:370)
    at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.<init>(ReaderImpl.java:311)
    at org.apache.hadoop.hive.ql.io.orc.OrcFile.createReader(OrcFile.java:228)
    at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getReader(OrcInputFormat.java:1130)
    at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:1039)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:246)

These ORC files are written manually using an ORC writer:

// SNAPPY and V_0_12 are assumed to be static imports of
// CompressionKind.SNAPPY and OrcFile.Version.V_0_12.
Path tmpPath = new Path(tmpPathName);
Configuration writerConf = new Configuration();
OrcFile.WriterOptions writerOptions = OrcFile.writerOptions(writerConf);
writerOptions.bufferSize(256 * 1024);         // 256 KiB compression buffer
writerOptions.compress(SNAPPY);
writerOptions.fileSystem(fileSystem);
writerOptions.inspector(new FlatTableObjectInspector(dbName + "." + tableName, fields));
writerOptions.rowIndexStride(10_000);         // one row index entry per 10,000 rows
writerOptions.blockPadding(true);             // pad stripes to HDFS block boundaries
writerOptions.stripeSize(122 * 1024 * 1024);  // 122 MiB stripes
writerOptions.version(V_0_12);                // write the 0.12 file format
writer = OrcFile.createWriter(tmpPath, writerOptions);

writer.close() is executed, and the ORC file is moved from a tmp dir to the
external table partition’s dir only if writer.close() returns normally.


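private void closeWriter() {
    if (writer != null) {
        try {
            // close() is expected to flush the last stripe, footer, and postscript.
            writer.close();
            Path tmpPath = new Path(tmpPathName);
            if (fileSystem.exists(tmpPath) && fileSystem.getFileStatus(tmpPath).getLen() > 0) {
                // Publish the non-empty file into the partition directory.
                Path completedPath = new Path(completedPathName);
                fileSystem.setPermission(tmpPath, PERMISSION_664);
                // rename() reports failure by returning false, so check it.
                if (!fileSystem.rename(tmpPath, completedPath)) {
                    throw new IOException("Could not rename " + tmpPath + " to " + completedPath);
                }
                HiveOperations.getInstance().registerExternalizedPartition(dbName, tableName, partition);
            } else if (fileSystem.exists(tmpPath)) {
                // Discard an empty temp file.
                fileSystem.delete(tmpPath, false);
            }
        } catch (IOException e) {
            throw Throwables.propagate(e);
        } finally {
            writer = null;
        }
    }
}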

I expect writer.close() to write the postscript, but it seems not to have.
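
A minimal sketch of how the tail of a suspect file could be checked, assuming
a local copy of the file (for example one fetched with hdfs dfs -get). It uses
plain java.io rather than the Hadoop FileSystem API, and OrcTailCheck and its
method names are hypothetical, not part of Hive. It relies on the ORC layout in
which the last byte of the file holds the postscript length, and on my
understanding that the reader rejects any length too small to hold the "ORC"
magic plus one more byte, which is the condition behind "Invalid postscript
length 0".

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper, not part of Hive: sanity-checks the tail of a local ORC file. */
final class OrcTailCheck {
    private static final byte[] MAGIC = "ORC".getBytes(StandardCharsets.US_ASCII);

    /** Returns the postscript length stored in the last byte, or -1 for an empty file. */
    static int postscriptLength(String localPath) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(localPath, "r")) {
            long size = file.length();
            if (size == 0) {
                return -1;
            }
            file.seek(size - 1);
            return file.readUnsignedByte();
        }
    }

    /**
     * True if the postscript length could hold the magic plus at least one
     * more byte and fits inside the file. This checks the length only; it
     * does not parse the postscript itself.
     */
    static boolean hasPlausiblePostscript(String localPath) throws IOException {
        int psLen = postscriptLength(localPath);
        long size = new File(localPath).length();
        return psLen >= MAGIC.length + 1 && psLen + 1L <= size;
    }

    public static void main(String[] args) throws IOException {
        for (String path : args) {
            System.out.println(path + ": postscript length " + postscriptLength(path)
                + (hasPlausiblePostscript(path) ? " (plausible)" : " (would be rejected)"));
        }
    }
}
```

A file that passes the getLen() > 0 guard in closeWriter() can still fail this
check, so running it against the temp file before the rename would show whether
the postscript was already missing at that point.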

http://grepcode.com/file/repo1.maven.org/maven2/org.apache.hive/hive-exec/0.14.0/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java#WriterImpl.close%28%29


Thoughts?
Am I doing something wrong? Bug?
Fix?
