On Feb 14, 2012, at 6:17 AM, lulynn_2008 wrote:

> Hi,
> I am doing e2e testing to pig-0.9.1. Here is my reference:
> https://cwiki.apache.org/confluence/display/PIG/HowToTest
> Please give your suggestions to following questions:
> 1. The tests needs a cluster which includes a Name Node/Job Tracker and three
> for Data Node/Task Trackers. Where is the cluster information saved? Are they
> saved in hadoop conf files?

Pig uses the standard Hadoop configuration files (mapred-site.xml, core-site.xml, hdfs-site.xml) to find cluster information. When you call ant test-e2e, set harness.hadoop.home to the directory where you installed Hadoop (the run will fail if you don't), and these files will be picked up automatically.
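
Since the run fails when the Hadoop directory is not set correctly, a quick check beforehand can save a round trip. The following is only a rough sketch: the function name check_hadoop_conf is made up for illustration, and it assumes the three configuration files live in a conf/ subdirectory of the Hadoop install, which may not match your layout.

```shell
# Hypothetical helper, not part of Pig or the test harness.
# Assumption: the Hadoop config files sit under "<hadoop dir>/conf".
check_hadoop_conf() {
    hadoop_dir="$1"
    for f in core-site.xml hdfs-site.xml mapred-site.xml; do
        # Report the first missing file on stderr and signal failure.
        if [ ! -f "$hadoop_dir/conf/$f" ]; then
            echo "missing: $hadoop_dir/conf/$f" >&2
            return 1
        fi
    done
    return 0
}
```

If the check passes for, say, /path/to/hadoop (a placeholder), the same directory can then be passed as an ant property: ant -Dharness.hadoop.home=/path/to/hadoop test-e2e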

> 2. I assume the tests need a hadoop cluster environment, and just install
> pig(low version and pig-0.9.1) and run the test cmds in Name Node.Name
> Node/Job Tracker to generate test data. Please correct me if I was wrong.

Any machine that has access to the cluster and has Hadoop installed with the same configuration files as your cluster will work. It need not be the NN/JT specifically, though that machine will work fine too.

> 3. Is there any data transfer between nodes in clusters during generating
> test data and testing? If yes, when happened?

Yes, the harness generates data on the machine it's run on and then does a copyFromLocal to load it into HDFS.

>
> Thank you.

Alan.
