Cheolsoo Park created PIG-4003:
----------------------------------

             Summary: Error is thrown by JobStats.getOutputSize() when storing to a Hive table
                 Key: PIG-4003
                 URL: https://issues.apache.org/jira/browse/PIG-4003
             Project: Pig
          Issue Type: Bug
            Reporter: Cheolsoo Park
            Assignee: Cheolsoo Park
             Fix For: 0.14.0


Here is an example of the stack trace printed to the console. Technically, this
is a warning and does not make the job fail, but it is certainly not
user-friendly.
{code}
4/06/09 16:20:28 WARN pigstats.JobStats: unable to find the output file
java.io.FileNotFoundException: File hdfs://10.61.10.185:9000/user/cheolsoop/prodhive.benchmark.unittest_vhs_bitrate_asn_sum_stg_test2 does not exist.
        at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:654)
        at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:102)
        at org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:712)
        at org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:708)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:708)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.FileBasedOutputSizeReader.getOutputSize(FileBasedOutputSizeReader.java:65)
        at org.apache.pig.tools.pigstats.JobStats.getOutputSize(JobStats.java:352)
{code}
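The issue is that FileBasedOutputSizeReader misinterprets the Hive table name
as an HDFS path. Its supports() method accepts the store location, so
getOutputSize() then tries to list a path that does not exist: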
{code}
@Override
public boolean supports(POStore sto, Configuration conf) {
    return UriUtil.isHDFSFileOrLocalOrS3N(getLocationUri(sto), conf);
}
{code}
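To illustrate how a bare table name can pass a scheme-based check, here is a minimal, self-contained sketch. It is not Pig's UriUtil; the class SchemeCheckSketch, its methods, the table name, and the namenode address are all hypothetical. It assumes that a location with no URI scheme is treated as a path on the default file system, which would be consistent with the stack trace above, where the table name was resolved under the user's home directory.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for illustration only; this is NOT Pig's UriUtil.
class SchemeCheckSketch {
    private static final List<String> FILE_SCHEMES = Arrays.asList("hdfs", "file", "s3n");

    // Assumed behavior: a location without a scheme is taken to be a path
    // on the default file system, so it is accepted like an hdfs:// URI.
    static boolean looksLikeFilePath(String location) {
        String scheme = URI.create(location).getScheme();
        return scheme == null || FILE_SCHEMES.contains(scheme.toLowerCase());
    }

    // Resolves a relative location against a home directory URI,
    // similar to how a relative path ends up under /user/<name>/.
    static URI resolveAgainstHome(URI home, String location) {
        return home.resolve(location);
    }

    public static void main(String[] args) {
        String table = "somedb.some_table"; // hypothetical Hive table name
        URI home = URI.create("hdfs://namenode:9000/user/someuser/"); // hypothetical

        // The table name has no scheme, so the check accepts it as a file path.
        System.out.println(looksLikeFilePath(table));
        // The resolved "path" points at a file that was never written.
        System.out.println(resolveAgainstHome(home, table));
    }
}
```

Under these assumptions, a check that also rejects locations not backed by an actual file (or that is skipped for non-file store functions) would avoid the spurious warning.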



--
This message was sent by Atlassian JIRA
(v6.2#6252)
