I'm having a weird sort of problem. When I run Pig on my cluster
(unfortunately also administered by me), it works fine from grunt, but
not as a script. This morning I started a job in grunt that proceeded
OK until I lost the connection and grunt died. (screen is not installed
on the cluster, and I don't have root.) So I decided to run the script
like so:
nohup pig <script.pig>
I never get anything past:
2009-11-27 15:43:04,081 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
The job I ran in grunt would occasionally update to 2%, 8%, 20%, etc.
When I run it in batch mode, that never happens.
However, I can see it's doing something, because I can tail the
JobTracker log:
2009-11-27 19:16:03,093 INFO org.apache.hadoop.mapred.ResourceEstimator: completedMapsUpdates:12935 completedMapsInputSize:13563343495 completedMapsOutputSize:20135178306
2009-11-27 19:16:03,094 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'attempt_200911270748_0004_m_013052_0' to tip task_200911270748_0004_m_013052, for tracker 'tracker_tuson64:localhost.localdomain/127.0.0.1:53589'
2009-11-27 19:16:03,094 INFO org.apache.hadoop.mapred.JobInProgress: Choosing data-local task task_200911270748_0004_m_013052
I mean, it says it has completed map 12935. It actually still hasn't
produced any result directories, so that seems weird. Even weirder,
when pig is running in batch mode, I can't run anything else on hadoop!
For example, now I can't run my usual hadoop program, which normally
works fine. This little test, which should run in about 1 second, never
even starts:
[le...@tuson118 dna]$ hadoop jar /home/leek2/hadoopstuff/dna/dnamapper.jar org.myorg.DNAMapper /dna/kmer.txt /dna/kmerres5 5
Setting kmer size 5
09/11/27 19:21:17 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
09/11/27 19:21:17 INFO input.FileInputFormat: Total input paths to process : 1
09/11/27 19:21:18 INFO mapred.JobClient: Running job: job_200911270748_0008
09/11/27 19:21:19 INFO mapred.JobClient: map 0% reduce 0%
What is going on here? How can I figure out if anything is happening?
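
The only check I can think of is to watch whether the
completedMapsUpdates counter in those ResourceEstimator lines keeps
growing. A rough Python sketch of that, assuming each log entry sits on
a single line in the real JobTracker log (my mail client wrapped the
ones above) and that the field names are exactly as quoted;
parse_progress is just a name I made up:

```python
import re

# Pattern for the ResourceEstimator lines quoted above. The field names come
# from that excerpt; the one-entry-per-line layout is an assumption.
PATTERN = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d+ INFO "
    r"org\.apache\.hadoop\.mapred\.ResourceEstimator:\s+"
    r"completedMapsUpdates:(\d+)\s+"
    r"completedMapsInputSize:(\d+)\s+"
    r"completedMapsOutputSize:(\d+)"
)


def parse_progress(lines):
    """Yield (timestamp, completed_maps, input_bytes, output_bytes) tuples.

    Lines that are not ResourceEstimator entries are skipped. This is a
    generator, so it also works on a stream that is still being written.
    """
    for line in lines:
        match = PATTERN.match(line.strip())
        if match:
            yield (
                match.group(1),
                int(match.group(2)),
                int(match.group(3)),
                int(match.group(4)),
            )
```

Feeding it the tailed log (for instance by looping over sys.stdin and
printing each tuple) would show whether the map count is still going up
even though pig keeps reporting 0%.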
Jim