AccessControlException thrown by estimateNumberOfReducers if data includes 
unreadable subdirectories
-----------------------------------------------------------------------------------------------------

                 Key: PIG-2386
                 URL: https://issues.apache.org/jira/browse/PIG-2386
             Project: Pig
          Issue Type: Bug
          Components: impl
    Affects Versions: 0.9.1
            Reporter: Adam Portley
            Priority: Minor


Pig estimates the number of reducers from the total size of the input data. 
The code that calculates the input size throws an exception if the input 
contains any unreadable subdirectories (for example, subsets of the data with 
restricted read permissions):

Caused by: org.apache.hadoop.security.AccessControlException: org.apache.hadoop.security.AccessControlException: Permission denied: user=<removed>, access=READ_EXECUTE, inode="secure":owner:secure:rwxr-x---
         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
         at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
         at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
         at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:95)
         at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:57)
         at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:669)
         at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:280)
         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getPathLength(JobControlCompiler.java:791)
         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getPathLength(JobControlCompiler.java:794)
         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getTotalInputFileSize(JobControlCompiler.java:779)
         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.estimateNumberOfReducers(JobControlCompiler.java:739)
         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(JobControlCompiler.java:587)
         ... 12 more


Pig should catch this exception and ignore unreadable directories when 
calculating the input size. 

Until then, users can work around the issue by setting the number of reducers 
explicitly with default_parallel or PARALLEL.
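
The proposed behavior can be sketched as follows. This is not the Pig code: it
is a minimal, self-contained model, assuming a hypothetical in-memory Node
class in place of the Hadoop FileSystem API, and using
java.nio.file.AccessDeniedException as a stand-in for Hadoop's
AccessControlException. The names InputSizeSketch, Node, and list() are
invented for illustration; only getPathLength mirrors a method name from the
stack trace. The point it shows is that the recursive size calculation catches
the permission error per directory and counts that directory as zero bytes
instead of failing the whole estimate.

```java
import java.nio.file.AccessDeniedException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class InputSizeSketch {

    /** Hypothetical stand-in for a file or directory under the input path. */
    static class Node {
        final String name;
        final long length;        // size in bytes; 0 for directories
        final boolean directory;
        final boolean readable;   // false models restricted read permissions
        final List<Node> children = new ArrayList<>();

        Node(String name, long length, boolean directory, boolean readable) {
            this.name = name;
            this.length = length;
            this.directory = directory;
            this.readable = readable;
        }

        static Node file(String name, long length) {
            return new Node(name, length, false, true);
        }

        static Node dir(String name, boolean readable, Node... kids) {
            Node d = new Node(name, 0L, true, readable);
            d.children.addAll(Arrays.asList(kids));
            return d;
        }

        /** Models a directory listing: fails if the directory is unreadable. */
        List<Node> list() throws AccessDeniedException {
            if (!readable) {
                throw new AccessDeniedException(name);
            }
            return children;
        }
    }

    /** Recursively sums file sizes, skipping directories that cannot be listed. */
    static long getPathLength(Node node) {
        if (!node.directory) {
            return node.length;
        }
        long total = 0L;
        try {
            for (Node child : node.list()) {
                total += getPathLength(child);
            }
        } catch (AccessDeniedException e) {
            // Ignore the unreadable directory rather than aborting the estimate.
            System.err.println("Skipping unreadable directory: " + e.getFile());
        }
        return total;
    }

    public static void main(String[] args) {
        Node root = Node.dir("input", true,
                Node.file("part-00000", 100L),
                Node.dir("secure", false, Node.file("part-00001", 500L)),
                Node.dir("public", true, Node.file("part-00002", 200L)));
        System.out.println(getPathLength(root)); // prints 300
    }
}
```

A consequence of this choice is that the estimate undercounts the input when
unreadable data exists, which seems acceptable here since the job could not
read that data anyway.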




        
