I have a few questions about running Pig scripts / MapReduce jobs.

1. I know that Pig creates *logical, physical, and then execution plans* before it actually starts executing the MapReduce job, and I am able to view the logical/physical plans with `explain <alias_name>`. But how do I view the execution plan (which, I suppose, lists the different map/reduce tasks planned)? In the course of Pig execution I see that many jobs (map/reduce pairs) are created, and I want to understand what each of these jobs does. (The first sketch after this list shows how I invoke `explain`.)
2. Is there any definitive guide I can use to understand the plans that are created? What `explain` spits out is difficult to understand.
3. I am able to change the number of map tasks by changing the number of input file blocks. Do I have control over the number of reduce tasks as well? How do I set the number of reducers? (The second sketch below shows what I have pieced together so far.)
4. What is the default heap size on the mapper/reducer nodes, and which job parameters reflect it? Will I be able to change the heap with the `-Xmx 1024m` option? My jobs used to fail when I set the heap this way; maybe there are restrictions on what values can be supplied?
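For question 1, here is roughly how I invoke `explain` today (the aliases and input path are made up for illustration):

```
-- toy script, just to show where I call EXPLAIN
raw  = LOAD 'input/data.txt' AS (line:chararray);
grpd = GROUP raw BY line;
cnts = FOREACH grpd GENERATE group, COUNT(raw);
EXPLAIN cnts;  -- shows the logical/physical plans; I don't see the per-job MapReduce breakdown
```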
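For question 3, this is what I have pieced together so far; a sketch only, and I am not sure these are the right knobs:

```
-- script-wide default number of reducers (guessing at SET default_parallel)
SET default_parallel 10;

-- or per operator, with the PARALLEL clause
grpd = GROUP raw BY line PARALLEL 10;
```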
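For question 4, this is how I have been trying to raise the heap from inside the Pig script; `mapred.child.java.opts` is my guess at the relevant Hadoop parameter:

```
-- pass the JVM heap option through to the map/reduce child tasks
SET mapred.child.java.opts '-Xmx1024m';
```

Thanks much!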