Pig 8/9 setting reduce count to 1 for cross grid input
------------------------------------------------------
Key: PIG-2252
URL: https://issues.apache.org/jira/browse/PIG-2252
Project: Pig
Issue Type: Bug
Affects Versions: 0.9.0, 0.8.1
Reporter: David Capwell
I have a Pig script that reads data from another grid, and I noticed that
only one reducer was ever used. The Pig log shows:
2011-08-28 23:56:08,907 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
- BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=0
2011-08-28 23:56:08,907 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
- Neither PARALLEL nor default parallelism is set for this job. Setting number
of reducers to 1
When I copy the same file to the local grid Pig is pointing to and run the
same script, it spins up 127 reducers, and the log shows:
2011-08-29 01:09:21,435 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
- BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=127391724005
2011-08-29 01:09:21,436 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
- Neither PARALLEL nor default parallelism is set for this job. Setting number
of reducers to 127
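The two logs are consistent with a simple size-based heuristic: Pig estimates the reducer count from totalInputFileSize divided by BytesPerReducer, clamped between 1 and maxReducers. For cross-grid input the file size cannot be determined and is reported as 0, so the estimate collapses to 1. A minimal sketch of that heuristic, assuming truncating division (which matches both log outputs above; the method and class names are illustrative, not the actual JobControlCompiler internals):

```java
// Hedged sketch of Pig's reducer-estimation heuristic; names are illustrative.
public class ReducerEstimate {
    static long estimateReducers(long totalInputFileSize,
                                 long bytesPerReducer,
                                 long maxReducers) {
        // Truncating division: 127391724005 / 1000000000 = 127, matching the log.
        long reducers = totalInputFileSize / bytesPerReducer;
        // A size of 0 (cross-grid input) collapses the estimate to 1.
        reducers = Math.max(1L, reducers);
        // Cap at the configured maximum (999 in the logs above).
        return Math.min(maxReducers, reducers);
    }

    public static void main(String[] args) {
        // Cross-grid input: size could not be determined, reported as 0.
        System.out.println(estimateReducers(0L, 1000000000L, 999L));            // 1
        // Same data on the local grid: 127391724005 bytes.
        System.out.println(estimateReducers(127391724005L, 1000000000L, 999L)); // 127
    }
}
```

The bug, then, is not in the clamping itself but in feeding it a totalInputFileSize of 0 when the input lives on a remote grid.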