One thing I am not sure about is whether "mapred.tasktracker.map.tasks.maximum"
is a client-side setting. If it is, you can create a file named
"pig-cluster-hadoop-site.xml" and put the directory containing it on the
classpath. pig-cluster-hadoop-site.xml holds additional Hadoop settings
specific to Pig. It has the same format as the other Hadoop config files.
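
For illustration, here is a minimal sketch of what such a file might look
like. It assumes the property is honored as a client-side setting, which is
not confirmed, and the value 2 is only an arbitrary example:

```xml
<?xml version="1.0"?>
<!-- pig-cluster-hadoop-site.xml: extra Hadoop settings picked up by Pig. -->
<!-- Assumption: the property below takes effect when set on the client. -->
<configuration>
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <!-- Example value only; choose the per-tasktracker limit you need. -->
    <value>2</value>
  </property>
</configuration>
```

The directory holding this file, not the file itself, is what goes on the
classpath.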

Daniel

On Fri, Jul 8, 2011 at 10:08 AM, Dylan Scott <[email protected]> wrote:

> I have a Hadoop job running through Pig for which I would like to limit the
> number of concurrently running mappers per task tracker. The
> mapred.tasktracker.map.tasks.maximum property seems to be just what I want
> to modify, but unfortunately I cannot modify it in mapred-site.xml as this
> configuration is shared by many different jobs, most of which don't need to
> be limited in the same way.
> I'm wondering what the best way to set this option would be. I noticed that
> using the Configuration returned by UDFContext.getJobConf() will not work,
> as it is a copy of the configuration and so writing the property here will
> not get passed back to the system. I'm given access to the Job object in my
> store func's setStoreLocation method; would setting the property on this
> Job's configuration get passed back to the system? If not, is there a good
> way to set a property like this from within a Pig UDF?
>
