Hi,
I am trying to load a large gzip file and process it using Pig. The file
size is about 200 MB (gzipped).
Every time I run the following script, I get out-of-memory errors.
The hadoop-site.xml is attached. The Pig and Hadoop JobTracker logs
are attached as well.
$ pig
>>>
x1 = LOAD 'file:///mnt/transaction_ar20090907_1102_126.CSV.gz' using
PigStorage('\u0002');
y1 = LIMIT x1 10;
dump y1;
>>>
Environment:
hadoop-0.20.0
pig-0.3.0 [ patched with Pig-660-4 to work with hadoop-0.20.0 ]
ec2 [ c1.medium ] [with 4 slaves]
How do I know, for a given Hadoop job, whether I have enough
instances/RAM? What is the best way to assess the RAM/CPU/machine
footprint for a given job?
Thanks,
Irfan
ERROR 6016: Out of memory.
org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias y1
at org.apache.pig.PigServer.openIterator(PigServer.java:469)
at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:522)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:190)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:164)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:140)
at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:75)
at org.apache.pig.Main.main(Main.java:350)
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 6016: Out of memory.
at java.util.Arrays.copyOf(Arrays.java:2760)
at java.util.Arrays.copyOf(Arrays.java:2734)
at java.util.ArrayList.ensureCapacity(ArrayList.java:167)
at java.util.ArrayList.add(ArrayList.java:351)
at org.apache.pig.builtin.PigStorage.readField(PigStorage.java:286)
at org.apache.pig.builtin.PigStorage.getNext(PigStorage.java:117)
at org.apache.pig.backend.executionengine.PigSlice.next(PigSlice.java:104)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.SliceWrapper$1.next(SliceWrapper.java:162)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.SliceWrapper$1.next(SliceWrapper.java:138)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:191)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:175)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:356)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
Caused by: java.lang.OutOfMemoryError
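The trace above fails inside PigStorage.readField while an ArrayList is growing, which might mean that a single record holds a very large number of fields, for example if the file does not use '\n' as its record terminator. That is only a guess from the stack trace. A minimal Python sketch for checking the record shape outside of Pig follows. The function name summarize_records and its parameters are hypothetical, and the '\n' record delimiter is an assumption; the \u0002 field delimiter is taken from the script.

```python
import gzip


def summarize_records(path, field_delim=b"\x02", record_delim=b"\n",
                      chunk_size=1 << 20):
    """Stream a gzip file and report how its records are shaped.

    Reads the decompressed data in fixed-size chunks, so memory use stays
    small even if the file turns out to contain one enormous record.
    Both delimiters are assumed to be single bytes.
    """
    records = 0        # number of records seen
    longest = 0        # longest record, in bytes (delimiter excluded)
    max_fields = 0     # largest number of fields in any one record
    cur_len = 0        # bytes seen so far in the current record
    cur_delims = 0     # field delimiters seen so far in the current record

    with gzip.open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            parts = chunk.split(record_delim)
            for i, part in enumerate(parts):
                if i > 0:
                    # A record delimiter was crossed: close the current record.
                    records += 1
                    longest = max(longest, cur_len)
                    max_fields = max(max_fields, cur_delims + 1)
                    cur_len = 0
                    cur_delims = 0
                cur_len += len(part)
                cur_delims += part.count(field_delim)

    if cur_len > 0:
        # Trailing data without a final record delimiter counts as a record.
        records += 1
        longest = max(longest, cur_len)
        max_fields = max(max_fields, cur_delims + 1)

    return {
        "records": records,
        "longest_record_bytes": longest,
        "max_fields": max_fields,
    }
```

For the file in the script this could be called as summarize_records('/mnt/transaction_ar20090907_1102_126.CSV.gz'). A very small record count together with a huge max_fields or longest_record_bytes would point at the data layout rather than at the cluster size.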
2009-09-08 10:06:13,551 WARN org.apache.hadoop.conf.Configuration: DEPRECATED: hadoop-site.xml found in the classpath. Usage of hadoop-site.xml is deprecated. Instead use core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of core-default.xml, mapred-default.xml and hdfs-default.xml respectively
2009-09-08 10:06:13,581 INFO org.apache.hadoop.mapred.JobTracker: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting JobTracker
STARTUP_MSG: host = domU-12-31-39-07-50-C2.compute-1.internal/10.209.83.48
STARTUP_MSG: args = []
STARTUP_MSG: version = 0.20.0
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.20 -r 763504; compiled by 'ndaley' on Thu Apr 9 05:18:40 UTC 2009
************************************************************/
2009-09-08 10:06:13,815 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics with hostName=JobTracker, port=50002
2009-09-08 10:06:13,908 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2009-09-08 10:06:14,027 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50030
2009-09-08 10:06:14,027 INFO org.mortbay.log: jetty-6.1.14
2009-09-08 10:06:16,821 INFO org.mortbay.log: Started [email protected]:50030
2009-09-08 10:06:16,822 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
2009-09-08 10:06:16,872 INFO org.apache.hadoop.mapred.JobTracker: JobTracker up at: 50002
2009-09-08 10:06:16,872 INFO org.apache.hadoop.mapred.JobTracker: JobTracker webserver: 50030
2009-09-08 10:06:17,045 INFO org.apache.hadoop.mapred.JobTracker: Cleaning up the system directory
2009-09-08 10:06:17,274 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2009-09-08 10:06:17,275 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 50002: starting
2009-09-08 10:06:17,275 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 50002: starting
2009-09-08 10:06:17,275 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 50002: starting
2009-09-08 10:06:17,276 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 50002: starting
2009-09-08 10:06:17,276 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 50002: starting
2009-09-08 10:06:17,276 INFO org.apache.hadoop.ipc.Server: IPC Server handler 4 on 50002: starting
2009-09-08 10:06:17,276 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 50002: starting
2009-09-08 10:06:17,276 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 50002: starting
2009-09-08 10:06:17,276 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 50002: starting
2009-09-08 10:06:17,276 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 50002: starting
2009-09-08 10:06:17,276 INFO org.apache.hadoop.mapred.JobTracker: Starting RUNNING
2009-09-08 10:06:17,277 INFO org.apache.hadoop.ipc.Server: IPC Server handler 9 on 50002: starting
2009-09-08 10:06:17,716 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/domU-12-31-39-07-50-C2.compute-1.internal
2009-09-08 10:13:03,008 INFO org.apache.hadoop.mapred.EagerTaskInitializationListener: Initializing job_200909081006_0001
2009-09-08 10:13:03,013 INFO org.apache.hadoop.mapred.JobHistory: Nothing to recover! Generating a new filename domU-12-31-39-07-50-C2.compute-1.internal_1252418773830_job_200909081006_0001_root_Job4376148819153439144.jar for job job_200909081006_0001
2009-09-08 10:13:03,026 INFO org.apache.hadoop.mapred.JobHistory: domU-12-31-39-07-50-C2.compute-1.internal_1252418773830_job_200909081006_0001_root_Job4376148819153439144.jar doesnt exist! Using domU-12-31-39-07-50-C2.compute-1.internal_1252418773830_job_200909081006_0001_root_Job4376148819153439144.jar.recover for recovery.
2009-09-08 10:13:04,432 INFO org.apache.hadoop.mapred.JobHistory: domU-12-31-39-07-50-C2.compute-1.internal_1252418773830_job_200909081006_0001_root_Job4376148819153439144.jar doesnt exist! Using domU-12-31-39-07-50-C2.compute-1.internal_1252418773830_job_200909081006_0001_root_Job4376148819153439144.jar.recover as the master history file for user.
2009-09-08 10:13:05,194 INFO org.apache.hadoop.mapred.JobInProgress: Input size for job job_200909081006_0001 = 151
2009-09-08 10:13:05,194 INFO org.apache.hadoop.mapred.JobInProgress: Split info for job:job_200909081006_0001 with 1 splits:
2009-09-08 10:13:05,195 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/localhost
2009-09-08 10:13:05,195 INFO org.apache.hadoop.mapred.JobInProgress: tip:task_200909081006_0001_m_000000 has split on node:/default-rack/localhost
2009-09-08 10:13:06,253 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'attempt_200909081006_0001_m_000002_0' to tip task_200909081006_0001_m_000002, for tracker 'tracker_domU-12-31-39-07-50-C2.compute-1.internal:localhost.localdomain/127.0.0.1:57318'
2009-09-08 10:13:09,306 INFO org.apache.hadoop.mapred.JobInProgress: Task 'attempt_200909081006_0001_m_000002_0' has completed task_200909081006_0001_m_000002 successfully.
2009-09-08 10:13:09,310 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'attempt_200909081006_0001_m_000000_0' to tip task_200909081006_0001_m_000000, for tracker 'tracker_domU-12-31-39-07-50-C2.compute-1.internal:localhost.localdomain/127.0.0.1:57318'
2009-09-08 10:13:09,312 INFO org.apache.hadoop.mapred.JobInProgress: Choosing rack-local task task_200909081006_0001_m_000000
2009-09-08 10:13:15,350 INFO org.apache.hadoop.mapred.JobInProgress: Task 'attempt_200909081006_0001_m_000000_0' has completed task_200909081006_0001_m_000000 successfully.
2009-09-08 10:13:15,353 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'attempt_200909081006_0001_m_000001_0' to tip task_200909081006_0001_m_000001, for tracker 'tracker_domU-12-31-39-07-50-C2.compute-1.internal:localhost.localdomain/127.0.0.1:57318'
2009-09-08 10:13:18,366 INFO org.apache.hadoop.mapred.JobInProgress: Task 'attempt_200909081006_0001_m_000001_0' has completed task_200909081006_0001_m_000001 successfully.
2009-09-08 10:13:18,367 INFO org.apache.hadoop.mapred.JobInProgress: Job job_200909081006_0001 has completed successfully.
2009-09-08 10:13:18,442 INFO org.apache.hadoop.mapred.JobHistory: Recovered job history filename for job job_200909081006_0001 is domU-12-31-39-07-50-C2.compute-1.internal_1252418773830_job_200909081006_0001_root_Job4376148819153439144.jar
2009-09-08 10:13:18,544 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_200909081006_0001_m_000000_0' from 'tracker_domU-12-31-39-07-50-C2.compute-1.internal:localhost.localdomain/127.0.0.1:57318'
2009-09-08 10:13:18,544 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_200909081006_0001_m_000001_0' from 'tracker_domU-12-31-39-07-50-C2.compute-1.internal:localhost.localdomain/127.0.0.1:57318'
2009-09-08 10:13:18,544 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_200909081006_0001_m_000002_0' from 'tracker_domU-12-31-39-07-50-C2.compute-1.internal:localhost.localdomain/127.0.0.1:57318'
2009-09-08 10:13:27,843 INFO org.apache.hadoop.mapred.EagerTaskInitializationListener: Initializing job_200909081006_0002
2009-09-08 10:13:27,845 INFO org.apache.hadoop.mapred.JobHistory: Nothing to recover! Generating a new filename domU-12-31-39-07-50-C2.compute-1.internal_1252418773830_job_200909081006_0002_root_Job5240765340979271360.jar for job job_200909081006_0002
2009-09-08 10:13:27,846 INFO org.apache.hadoop.mapred.JobHistory: domU-12-31-39-07-50-C2.compute-1.internal_1252418773830_job_200909081006_0002_root_Job5240765340979271360.jar doesnt exist! Using domU-12-31-39-07-50-C2.compute-1.internal_1252418773830_job_200909081006_0002_root_Job5240765340979271360.jar.recover for recovery.
2009-09-08 10:13:27,920 INFO org.apache.hadoop.mapred.JobHistory: domU-12-31-39-07-50-C2.compute-1.internal_1252418773830_job_200909081006_0002_root_Job5240765340979271360.jar doesnt exist! Using domU-12-31-39-07-50-C2.compute-1.internal_1252418773830_job_200909081006_0002_root_Job5240765340979271360.jar.recover as the master history file for user.
2009-09-08 10:13:28,481 INFO org.apache.hadoop.mapred.JobInProgress: Input size for job job_200909081006_0002 = 151
2009-09-08 10:13:28,481 INFO org.apache.hadoop.mapred.JobInProgress: Split info for job:job_200909081006_0002 with 1 splits:
2009-09-08 10:13:28,482 INFO org.apache.hadoop.mapred.JobInProgress: tip:task_200909081006_0002_m_000000 has split on node:/default-rack/localhost
2009-09-08 10:13:30,555 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'attempt_200909081006_0002_m_000002_0' to tip task_200909081006_0002_m_000002, for tracker 'tracker_domU-12-31-39-07-50-C2.compute-1.internal:localhost.localdomain/127.0.0.1:57318'
2009-09-08 10:13:33,562 INFO org.apache.hadoop.mapred.JobInProgress: Task 'attempt_200909081006_0002_m_000002_0' has completed task_200909081006_0002_m_000002 successfully.
2009-09-08 10:13:33,563 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'attempt_200909081006_0002_m_000000_0' to tip task_200909081006_0002_m_000000, for tracker 'tracker_domU-12-31-39-07-50-C2.compute-1.internal:localhost.localdomain/127.0.0.1:57318'
2009-09-08 10:13:33,563 INFO org.apache.hadoop.mapred.JobInProgress: Choosing rack-local task task_200909081006_0002_m_000000
2009-09-08 10:13:36,604 INFO org.apache.hadoop.mapred.JobInProgress: Task 'attempt_200909081006_0002_m_000000_0' has completed task_200909081006_0002_m_000000 successfully.
2009-09-08 10:13:36,605 INFO org.apache.hadoop.mapred.ResourceEstimator: completedMapsUpdates:1 completedMapsInputSize:152 completedMapsOutputSize:409
2009-09-08 10:13:36,610 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'attempt_200909081006_0002_r_000000_0' to tip task_200909081006_0002_r_000000, for tracker 'tracker_domU-12-31-39-07-50-C2.compute-1.internal:localhost.localdomain/127.0.0.1:57318'
2009-09-08 10:13:48,631 INFO org.apache.hadoop.mapred.JobInProgress: Task 'attempt_200909081006_0002_r_000000_0' has completed task_200909081006_0002_r_000000 successfully.
2009-09-08 10:13:48,634 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'attempt_200909081006_0002_m_000001_0' to tip task_200909081006_0002_m_000001, for tracker 'tracker_domU-12-31-39-07-50-C2.compute-1.internal:localhost.localdomain/127.0.0.1:57318'
2009-09-08 10:13:51,638 INFO org.apache.hadoop.mapred.JobInProgress: Task 'attempt_200909081006_0002_m_000001_0' has completed task_200909081006_0002_m_000001 successfully.
2009-09-08 10:13:51,638 INFO org.apache.hadoop.mapred.JobInProgress: Job job_200909081006_0002 has completed successfully.
2009-09-08 10:13:51,754 INFO org.apache.hadoop.mapred.JobHistory: Recovered job history filename for job job_200909081006_0002 is domU-12-31-39-07-50-C2.compute-1.internal_1252418773830_job_200909081006_0002_root_Job5240765340979271360.jar
2009-09-08 10:13:51,843 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_200909081006_0002_m_000000_0' from 'tracker_domU-12-31-39-07-50-C2.compute-1.internal:localhost.localdomain/127.0.0.1:57318'
2009-09-08 10:13:51,843 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_200909081006_0002_m_000001_0' from 'tracker_domU-12-31-39-07-50-C2.compute-1.internal:localhost.localdomain/127.0.0.1:57318'
2009-09-08 10:13:51,843 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_200909081006_0002_m_000002_0' from 'tracker_domU-12-31-39-07-50-C2.compute-1.internal:localhost.localdomain/127.0.0.1:57318'
2009-09-08 10:13:51,843 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_200909081006_0002_r_000000_0' from 'tracker_domU-12-31-39-07-50-C2.compute-1.internal:localhost.localdomain/127.0.0.1:57318'
2009-09-08 10:13:58,252 INFO org.apache.hadoop.mapred.EagerTaskInitializationListener: Initializing job_200909081006_0003
2009-09-08 10:13:58,255 INFO org.apache.hadoop.mapred.JobHistory: Nothing to recover! Generating a new filename domU-12-31-39-07-50-C2.compute-1.internal_1252418773830_job_200909081006_0003_root_PigLatin%3ADefaultJobName for job job_200909081006_0003
2009-09-08 10:13:58,258 INFO org.apache.hadoop.mapred.JobHistory: domU-12-31-39-07-50-C2.compute-1.internal_1252418773830_job_200909081006_0003_root_PigLatin%3ADefaultJobName doesnt exist! Using domU-12-31-39-07-50-C2.compute-1.internal_1252418773830_job_200909081006_0003_root_PigLatin%3ADefaultJobName.recover for recovery.
2009-09-08 10:13:58,337 INFO org.apache.hadoop.mapred.JobHistory: domU-12-31-39-07-50-C2.compute-1.internal_1252418773830_job_200909081006_0003_root_PigLatin%3ADefaultJobName doesnt exist! Using domU-12-31-39-07-50-C2.compute-1.internal_1252418773830_job_200909081006_0003_root_PigLatin%3ADefaultJobName.recover as the master history file for user.
2009-09-08 10:13:58,890 INFO org.apache.hadoop.mapred.JobInProgress: Input size for job job_200909081006_0003 = 151
2009-09-08 10:13:58,890 INFO org.apache.hadoop.mapred.JobInProgress: Split info for job:job_200909081006_0003 with 1 splits:
2009-09-08 10:13:58,890 INFO org.apache.hadoop.mapred.JobInProgress: tip:task_200909081006_0003_m_000000 has split on node:/default-rack/localhost
2009-09-08 10:14:00,850 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'attempt_200909081006_0003_m_000002_0' to tip task_200909081006_0003_m_000002, for tracker 'tracker_domU-12-31-39-07-50-C2.compute-1.internal:localhost.localdomain/127.0.0.1:57318'
2009-09-08 10:14:03,854 INFO org.apache.hadoop.mapred.JobInProgress: Task 'attempt_200909081006_0003_m_000002_0' has completed task_200909081006_0003_m_000002 successfully.
2009-09-08 10:14:03,855 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'attempt_200909081006_0003_m_000000_0' to tip task_200909081006_0003_m_000000, for tracker 'tracker_domU-12-31-39-07-50-C2.compute-1.internal:localhost.localdomain/127.0.0.1:57318'
2009-09-08 10:14:03,855 INFO org.apache.hadoop.mapred.JobInProgress: Choosing rack-local task task_200909081006_0003_m_000000
2009-09-08 10:14:06,860 INFO org.apache.hadoop.mapred.JobInProgress: Task 'attempt_200909081006_0003_m_000000_0' has completed task_200909081006_0003_m_000000 successfully.
2009-09-08 10:14:06,860 INFO org.apache.hadoop.mapred.ResourceEstimator: completedMapsUpdates:1 completedMapsInputSize:152 completedMapsOutputSize:409
2009-09-08 10:14:06,862 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'attempt_200909081006_0003_r_000000_0' to tip task_200909081006_0003_r_000000, for tracker 'tracker_domU-12-31-39-07-50-C2.compute-1.internal:localhost.localdomain/127.0.0.1:57318'
2009-09-08 10:14:21,888 INFO org.apache.hadoop.mapred.JobInProgress: Task 'attempt_200909081006_0003_r_000000_0' has completed task_200909081006_0003_r_000000 successfully.
2009-09-08 10:14:21,889 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'attempt_200909081006_0003_m_000001_0' to tip task_200909081006_0003_m_000001, for tracker 'tracker_domU-12-31-39-07-50-C2.compute-1.internal:localhost.localdomain/127.0.0.1:57318'
2009-09-08 10:14:24,892 INFO org.apache.hadoop.mapred.JobInProgress: Task 'attempt_200909081006_0003_m_000001_0' has completed task_200909081006_0003_m_000001 successfully.
2009-09-08 10:14:24,893 INFO org.apache.hadoop.mapred.JobInProgress: Job job_200909081006_0003 has completed successfully.
2009-09-08 10:14:24,979 INFO org.apache.hadoop.mapred.JobHistory: Recovered job history filename for job job_200909081006_0003 is domU-12-31-39-07-50-C2.compute-1.internal_1252418773830_job_200909081006_0003_root_PigLatin%3ADefaultJobName
2009-09-08 10:14:25,052 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_200909081006_0003_m_000000_0' from 'tracker_domU-12-31-39-07-50-C2.compute-1.internal:localhost.localdomain/127.0.0.1:57318'
2009-09-08 10:14:25,052 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_200909081006_0003_m_000001_0' from 'tracker_domU-12-31-39-07-50-C2.compute-1.internal:localhost.localdomain/127.0.0.1:57318'
2009-09-08 10:14:25,052 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_200909081006_0003_m_000002_0' from 'tracker_domU-12-31-39-07-50-C2.compute-1.internal:localhost.localdomain/127.0.0.1:57318'
2009-09-08 10:14:25,052 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_200909081006_0003_r_000000_0' from 'tracker_domU-12-31-39-07-50-C2.compute-1.internal:localhost.localdomain/127.0.0.1:57318'
2009-09-08 10:15:13,466 INFO org.apache.hadoop.mapred.EagerTaskInitializationListener: Initializing job_200909081006_0004
2009-09-08 10:15:13,470 INFO org.apache.hadoop.mapred.JobHistory: Nothing to recover! Generating a new filename domU-12-31-39-07-50-C2.compute-1.internal_1252418773830_job_200909081006_0004_root_Job3181357076430015345.jar for job job_200909081006_0004
2009-09-08 10:15:13,471 INFO org.apache.hadoop.mapred.JobHistory: domU-12-31-39-07-50-C2.compute-1.internal_1252418773830_job_200909081006_0004_root_Job3181357076430015345.jar doesnt exist! Using domU-12-31-39-07-50-C2.compute-1.internal_1252418773830_job_200909081006_0004_root_Job3181357076430015345.jar.recover for recovery.
2009-09-08 10:15:13,587 INFO org.apache.hadoop.mapred.JobHistory: domU-12-31-39-07-50-C2.compute-1.internal_1252418773830_job_200909081006_0004_root_Job3181357076430015345.jar doesnt exist! Using domU-12-31-39-07-50-C2.compute-1.internal_1252418773830_job_200909081006_0004_root_Job3181357076430015345.jar.recover as the master history file for user.
2009-09-08 10:15:14,203 INFO org.apache.hadoop.mapred.JobInProgress: Input size for job job_200909081006_0004 = 260104276
2009-09-08 10:15:14,203 INFO org.apache.hadoop.mapred.JobInProgress: Split info for job:job_200909081006_0004 with 1 splits:
2009-09-08 10:15:14,203 INFO org.apache.hadoop.mapred.JobInProgress: tip:task_200909081006_0004_m_000000 has split on node:/default-rack/localhost
2009-09-08 10:15:16,080 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'attempt_200909081006_0004_m_000002_0' to tip task_200909081006_0004_m_000002, for tracker 'tracker_domU-12-31-39-07-50-C2.compute-1.internal:localhost.localdomain/127.0.0.1:57318'
2009-09-08 10:15:19,096 INFO org.apache.hadoop.mapred.JobInProgress: Task 'attempt_200909081006_0004_m_000002_0' has completed task_200909081006_0004_m_000002 successfully.
2009-09-08 10:15:19,097 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'attempt_200909081006_0004_m_000000_0' to tip task_200909081006_0004_m_000000, for tracker 'tracker_domU-12-31-39-07-50-C2.compute-1.internal:localhost.localdomain/127.0.0.1:57318'
2009-09-08 10:15:19,097 INFO org.apache.hadoop.mapred.JobInProgress: Choosing rack-local task task_200909081006_0004_m_000000
2009-09-08 10:15:43,146 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_200909081006_0004_m_000000_0: java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2760)
at java.util.Arrays.copyOf(Arrays.java:2734)
at java.util.ArrayList.ensureCapacity(ArrayList.java:167)
at java.util.ArrayList.add(ArrayList.java:351)
at org.apache.pig.builtin.PigStorage.readField(PigStorage.java:286)
at org.apache.pig.builtin.PigStorage.getNext(PigStorage.java:117)
at org.apache.pig.backend.executionengine.PigSlice.next(PigSlice.java:104)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.SliceWrapper$1.next(SliceWrapper.java:162)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.SliceWrapper$1.next(SliceWrapper.java:138)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:191)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:175)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:356)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
at org.apache.hadoop.mapred.Child.main(Child.java:170)
2009-09-08 10:15:46,151 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'attempt_200909081006_0004_m_000000_1' to tip task_200909081006_0004_m_000000, for tracker 'tracker_domU-12-31-39-07-50-C2.compute-1.internal:localhost.localdomain/127.0.0.1:57318'
2009-09-08 10:15:46,151 INFO org.apache.hadoop.mapred.JobInProgress: Choosing rack-local task task_200909081006_0004_m_000000
2009-09-08 10:15:46,151 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_200909081006_0004_m_000000_0' from 'tracker_domU-12-31-39-07-50-C2.compute-1.internal:localhost.localdomain/127.0.0.1:57318'
2009-09-08 10:16:10,341 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_200909081006_0004_m_000000_1: java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2760)
at java.util.Arrays.copyOf(Arrays.java:2734)
at java.util.ArrayList.ensureCapacity(ArrayList.java:167)
at java.util.ArrayList.add(ArrayList.java:351)
at org.apache.pig.builtin.PigStorage.readField(PigStorage.java:286)
at org.apache.pig.builtin.PigStorage.getNext(PigStorage.java:117)
at org.apache.pig.backend.executionengine.PigSlice.next(PigSlice.java:104)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.SliceWrapper$1.next(SliceWrapper.java:162)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.SliceWrapper$1.next(SliceWrapper.java:138)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:191)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:175)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:356)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
at org.apache.hadoop.mapred.Child.main(Child.java:170)
2009-09-08 10:16:13,345 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'attempt_200909081006_0004_m_000000_2' to tip task_200909081006_0004_m_000000, for tracker 'tracker_domU-12-31-39-07-50-C2.compute-1.internal:localhost.localdomain/127.0.0.1:57318'
2009-09-08 10:16:13,345 INFO org.apache.hadoop.mapred.JobInProgress: Choosing rack-local task task_200909081006_0004_m_000000
2009-09-08 10:16:13,345 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_200909081006_0004_m_000000_1' from 'tracker_domU-12-31-39-07-50-C2.compute-1.internal:localhost.localdomain/127.0.0.1:57318'
2009-09-08 10:16:37,468 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_200909081006_0004_m_000000_2: java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2760)
at java.util.Arrays.copyOf(Arrays.java:2734)
at java.util.ArrayList.ensureCapacity(ArrayList.java:167)
at java.util.ArrayList.add(ArrayList.java:351)
at org.apache.pig.builtin.PigStorage.readField(PigStorage.java:286)
at org.apache.pig.builtin.PigStorage.getNext(PigStorage.java:117)
at org.apache.pig.backend.executionengine.PigSlice.next(PigSlice.java:104)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.SliceWrapper$1.next(SliceWrapper.java:162)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.SliceWrapper$1.next(SliceWrapper.java:138)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:191)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:175)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:356)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
at org.apache.hadoop.mapred.Child.main(Child.java:170)
2009-09-08 10:16:40,472 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'attempt_200909081006_0004_m_000000_3' to tip task_200909081006_0004_m_000000, for tracker 'tracker_domU-12-31-39-07-50-C2.compute-1.internal:localhost.localdomain/127.0.0.1:57318'
2009-09-08 10:16:40,472 INFO org.apache.hadoop.mapred.JobInProgress: Choosing rack-local task task_200909081006_0004_m_000000
2009-09-08 10:16:40,472 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_200909081006_0004_m_000000_2' from 'tracker_domU-12-31-39-07-50-C2.compute-1.internal:localhost.localdomain/127.0.0.1:57318'
2009-09-08 10:17:04,550 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_200909081006_0004_m_000000_3: java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2760)
at java.util.Arrays.copyOf(Arrays.java:2734)
at java.util.ArrayList.ensureCapacity(ArrayList.java:167)
at java.util.ArrayList.add(ArrayList.java:351)
at org.apache.pig.builtin.PigStorage.readField(PigStorage.java:286)
at org.apache.pig.builtin.PigStorage.getNext(PigStorage.java:117)
at org.apache.pig.backend.executionengine.PigSlice.next(PigSlice.java:104)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.SliceWrapper$1.next(SliceWrapper.java:162)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.SliceWrapper$1.next(SliceWrapper.java:138)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:191)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:175)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:356)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
at org.apache.hadoop.mapred.Child.main(Child.java:170)
2009-09-08 10:17:07,554 INFO org.apache.hadoop.mapred.TaskInProgress: TaskInProgress task_200909081006_0004_m_000000 has failed 4 times.
2009-09-08 10:17:07,555 INFO org.apache.hadoop.mapred.JobInProgress: TaskTracker at 'domU-12-31-39-07-50-C2.compute-1.internal' turned 'flaky'
2009-09-08 10:17:07,555 INFO org.apache.hadoop.mapred.JobInProgress: Aborting job job_200909081006_0004
2009-09-08 10:17:07,555 INFO org.apache.hadoop.mapred.JobInProgress: Killing job 'job_200909081006_0004'
2009-09-08 10:17:07,555 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'attempt_200909081006_0004_m_000001_0' to tip task_200909081006_0004_m_000001, for tracker 'tracker_domU-12-31-39-07-50-C2.compute-1.internal:localhost.localdomain/127.0.0.1:57318'
2009-09-08 10:17:07,556 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_200909081006_0004_m_000000_3' from 'tracker_domU-12-31-39-07-50-C2.compute-1.internal:localhost.localdomain/127.0.0.1:57318'
2009-09-08 10:17:10,561 INFO org.apache.hadoop.mapred.JobInProgress: Task 'attempt_200909081006_0004_m_000001_0' has completed task_200909081006_0004_m_000001 successfully.
2009-09-08 10:17:10,606 INFO org.apache.hadoop.mapred.JobHistory: Recovered job history filename for job job_200909081006_0004 is domU-12-31-39-07-50-C2.compute-1.internal_1252418773830_job_200909081006_0004_root_Job3181357076430015345.jar
2009-09-08 10:17:10,660 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_200909081006_0004_m_000000_0' from 'tracker_domU-12-31-39-07-50-C2.compute-1.internal:localhost.localdomain/127.0.0.1:57318'
2009-09-08 10:17:10,660 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_200909081006_0004_m_000000_1' from 'tracker_domU-12-31-39-07-50-C2.compute-1.internal:localhost.localdomain/127.0.0.1:57318'
2009-09-08 10:17:10,660 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_200909081006_0004_m_000000_2' from 'tracker_domU-12-31-39-07-50-C2.compute-1.internal:localhost.localdomain/127.0.0.1:57318'
2009-09-08 10:17:10,660 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_200909081006_0004_m_000000_3' from 'tracker_domU-12-31-39-07-50-C2.compute-1.internal:localhost.localdomain/127.0.0.1:57318'
2009-09-08 10:17:10,660 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_200909081006_0004_m_000001_0' from 'tracker_domU-12-31-39-07-50-C2.compute-1.internal:localhost.localdomain/127.0.0.1:57318'
2009-09-08 10:17:10,660 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_200909081006_0004_m_000002_0' from 'tracker_domU-12-31-39-07-50-C2.compute-1.internal:localhost.localdomain/127.0.0.1:57318'
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/mnt/hadoop</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://domU-12-31-39-07-50-C2.compute-1.internal:50001</value>
</property>
<property>
<name>mapred.job.tracker</name>
<value>hdfs://domU-12-31-39-07-50-C2.compute-1.internal:50002</value>
</property>
<property>
<name>tasktracker.http.threads</name>
<value>80</value>
</property>
<property>
<name>mapred.reduce.parallel.copies</name>
<value>1</value>
</property>
<property>
<name>mapred.reduce.tasks</name>
<value>1</value>
</property>
<property>
<name>mapred.tasktracker.map.tasks.maximum</name>
<value>3</value>
</property>
<property>
<name>mapred.tasktracker.reduce.tasks.maximum</name>
<value>3</value>
</property>
<property>
<name>mapred.output.compress</name>
<value>true</value>
</property>
<property>
<name>mapred.output.compression.type</name>
<value>BLOCK</value>
</property>
<property>
<name>dfs.client.block.write.retries</name>
<value>3</value>
</property>
<property>
<name>mapred.child.java.opts</name>
<value>-Xmx550m</value>
</property>
<property>
<name>io.compression.codecs</name>
<value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec</value>
<description>A list of the compression codec classes that can be used for compression/decompression.</description>
</property>
<property>
<name>fs.s3.awsAccessKeyId</name>
<value>xxxx</value>
</property>
<property>
<name>fs.s3.awsSecretAccessKey</name>
<value>xxxx</value>
</property>
<property>
<name>fs.s3n.awsAccessKeyId</name>
<value>xxxx</value>
</property>
<property>
<name>fs.s3n.awsSecretAccessKey</name>
<value>xxxx</value>
</property>
</configuration>
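As a rough sanity check of the slot and heap settings in the configuration above, the worst-case task heap per node can be computed and compared with the instance RAM. The sketch below is only back-of-envelope arithmetic. The 1.7 GB figure for c1.medium is an assumption (it is not stated anywhere in this post), the function names are hypothetical, and daemon and OS overhead is ignored. It concerns node-level oversubscription, which may or may not be related to the per-task "Java heap space" error in the logs.

```python
def worst_case_task_heap_mb(map_slots, reduce_slots, child_heap_mb):
    """Heap that all task JVMs on one node could claim if every slot is busy."""
    return (map_slots + reduce_slots) * child_heap_mb


def fits_in_ram(node_ram_mb, map_slots, reduce_slots, child_heap_mb):
    """True if the worst-case task heap alone stays within the node's RAM."""
    return worst_case_task_heap_mb(map_slots, reduce_slots, child_heap_mb) <= node_ram_mb


# Values taken from the hadoop-site.xml above.
MAP_SLOTS = 3          # mapred.tasktracker.map.tasks.maximum
REDUCE_SLOTS = 3       # mapred.tasktracker.reduce.tasks.maximum
CHILD_HEAP_MB = 550    # mapred.child.java.opts = -Xmx550m

# Assumption, not from the post: a c1.medium instance has roughly 1.7 GB RAM.
NODE_RAM_MB = 1700

total_mb = worst_case_task_heap_mb(MAP_SLOTS, REDUCE_SLOTS, CHILD_HEAP_MB)
print("worst-case task heap per node (MB):", total_mb)  # 3300
print("fits in assumed node RAM:",
      fits_in_ram(NODE_RAM_MB, MAP_SLOTS, REDUCE_SLOTS, CHILD_HEAP_MB))  # False
```

Under these assumptions the six slots could ask for about twice the RAM of one node, so lowering the slot counts or the child heap may be worth considering independently of the failing map task.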