I am running a mapper job which generates a large number of output records
for every input record.
about 32,000,000,000 output records from about 150 mappers - each record
about 200 bytes
The job is failing with timeouts.
When I alter the code to do exactly what it did previously but only output
1 in 100 output records it runs to completion with no
difficulty.
I believe I am saturating some local resource on the mapper but this gets
WAY beyond my knowledge of what is going on internally
Any bright ideas?
--
Steven M. Lewis PhD
4221 105th Ave NE
Kirkland, WA 98033
206-384-1340 (cell)
Skype lordjoe_com