Hi, I have a cluster running YARN, and MapReduce jobs run as expected when they are executed from one of the nodes. However, when I run Pig scripts from a remote client, Pig connects to HDFS and HBase but runs its MapReduce jobs using the LocalJobRunner. The jobs finish successfully, but they aren't going through YARN. I have placed all the cluster configuration files in the Pig configuration directory, and those must be correct, since otherwise Pig wouldn't be able to connect to my cluster's HDFS and HBase.
I have even put "mapreduce.framework.name=yarn" in the pig.properties file. Any ideas on how to get jobs submitted from a remote client to run in distributed mode on the Hadoop cluster? -Kevin
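P.S. For reference, here is roughly what I have in pig.properties; the hostnames and ports are placeholders for my actual cluster addresses:

```
# pig.properties (hostnames are placeholders)
fs.defaultFS=hdfs://namenode-host:8020
mapreduce.framework.name=yarn
yarn.resourcemanager.address=resourcemanager-host:8032
```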