Hello Pig-Users,
I have been looking for a flexible way to specify the hostnames of the
remote MapReduce cluster that I want to connect to. I've looked through the
PigServer API and source code but haven't found a way to do this. It seems
that when PigServer is instantiated in mapreduce mode, it only looks on the
classpath for core-site.xml and mapred-site.xml.
I've found it very convenient to use code like the following for managing
data in HDFS using the Hadoop API:
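    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    Configuration config = new Configuration();
    config.set("fs.default.name", hadoopClusterHdfsUrl);  // HDFS URL chosen at runtime
    FileSystem fs = FileSystem.get(config);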
Is there a way to do something similar in embedded Pig? Being able to pass
these properties in at runtime would be more flexible than depending on
external configuration files.
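To make the question concrete, here is a minimal sketch of what I have in
mind. It only builds a java.util.Properties object at runtime. The class name
ClusterProps, the helper forCluster, and the host names are hypothetical, and
the commented-out PigServer line assumes that the (ExecType, Properties)
constructor honours Hadoop settings passed this way, which I have not
verified.

```java
import java.util.Properties;

// Hypothetical helper: collects cluster connection settings at runtime
// instead of relying on core-site.xml / mapred-site.xml on the classpath.
class ClusterProps {
    static Properties forCluster(String hdfsUrl, String jobTracker) {
        Properties props = new Properties();
        props.setProperty("fs.default.name", hdfsUrl);       // NameNode URL
        props.setProperty("mapred.job.tracker", jobTracker); // JobTracker host:port
        return props;
    }

    public static void main(String[] args) {
        // Placeholder host names, not a real cluster.
        Properties props = forCluster("hdfs://namenode.example.com:8020",
                                      "jobtracker.example.com:8021");
        // Assumed usage (requires Pig on the classpath, not verified here):
        // PigServer pig = new PigServer(ExecType.MAPREDUCE, props);
        System.out.println(props.getProperty("fs.default.name"));
    }
}
```

If something along these lines works, the same Properties object could be
built from command-line arguments or an application config rather than from
files on the classpath.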
Thanks,
- Andrew