[
https://issues.apache.org/jira/browse/PIG-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093325#comment-13093325
]
David Capwell commented on PIG-2252:
------------------------------------
When I do an ls on the path, it shows multiple directories and each directory
contains multiple partition files. I can get the size for these files.
> Pig 8/9 setting reduce count to 1 for cross grid input
> ------------------------------------------------------
>
> Key: PIG-2252
> URL: https://issues.apache.org/jira/browse/PIG-2252
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.8.1, 0.9.0
> Reporter: David Capwell
>
> I have a pig script that was reading data from another grid and I noticed
> that only one reducer was ever used, and the pig logs shows this:
> 2011-08-28 23:56:08,907 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
> - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=0
> 2011-08-28 23:56:08,907 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
> - Neither PARALLEL nor default parallelism is set for this job. Setting
> number of reducers to 1
> When I copy the same file to local grid pig is pointing to and run the same
> script, it then spins off 127 reducers and this is the logs:
> 2011-08-29 01:09:21,435 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
> - BytesPerReducer= 1000000000 maxReducers=999 totalInputFileSize=127391724005
> 2011-08-29 01:09:21,436 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
> - Neither PARALLEL nor default parallelism is set for this job. Setting
> number of reducers to 127
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira