Madhavi Nadig created PIG-4386:
----------------------------------

             Summary: How many files can be submitted to a pig job at once?
                 Key: PIG-4386
                 URL: https://issues.apache.org/jira/browse/PIG-4386
             Project: Pig
          Issue Type: Bug
    Affects Versions: 0.13.1
         Environment: {code}
$pig --version
Apache Pig version 0.13.1-mapr-1410 (rexported) 
compiled Nov 05 2014, 10:16:28
{code}

            Reporter: Madhavi Nadig


Pig fails mysteriously when I specify the root of a large directory tree as the 
LOAD input in my script. The exception that it throws offers no insight into 
what's happening. The same script works perfectly when there are fewer files.

It's a very simple script as you can see below:

{code}
SET pig.noSplitCombination true;
raw_record = LOAD '/data/directory/tree/root' USING PigStorage(',');
filtered = FILTER raw_record BY $1 == 251068;
filtered_data = FOREACH filtered GENERATE (chararray)$0, (chararray)$1, (chararray)$2;
STORE filtered_data INTO '/data/output/directory/' USING PigStorage();
{code}
Here's the error message I see:
{code}
ERROR 2244: Job scope-594 failed, hadoop does not return any error message
org.apache.pig.backend.executionengine.ExecException: ERROR 2244: Job scope-594 failed, hadoop does not return any error message
    at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:178)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:232)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:203)
    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
    at org.apache.pig.Main.run(Main.java:608)
    at org.apache.pig.Main.main(Main.java:156)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
{code}
How many files can Pig process at once?




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
