I'm glad to report that I got the pig gradle test to work.
I then ran it as unix user 'tom' (the user from the VMware VM used in Hadoop for Dummies).
Problem: no home directory. Fixed that.
Next problem:
/tmp/hadoop-yarn/staging has perms 700 (and is not addressed by
/usr/lib/hadoop/libexec/init-hdfs.sh; should it be?)
Fix:
% hdfs dfs -chmod -R 1777 /tmp/hadoop-yarn/staging
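As a side note, mode 1777 is world-writable plus the sticky bit, i.e. the same drwxrwxrwt you can see on /tmp in the listing further down in this thread. A quick local check of that mapping (plain Python, nothing Hadoop-specific):

```python
import stat

# 0o1777 = rwx for user/group/other, plus the sticky bit
# (the sticky bit shows up as the trailing 't' in ls output)
mode = stat.S_IFDIR | 0o1777
print(stat.filemode(mode))  # drwxrwxrwt
```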
Next problem: pig jobs sit in the "ACCEPTED" state and never move to RUNNING, with,
wonderfully, no hints in the logs. Solution: up the memory settings in
mapred-site.xml:
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>2400</value>
</property>
<property>
  <name>mapreduce.map.java.opts</name>
  <value>-Xmx2048m</value>
</property>
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>4400</value>
</property>
<property>
  <name>mapreduce.reduce.java.opts</name>
  <value>-Xmx4096m</value>
</property>
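For what it's worth, the constraint here (as I understand it, not from any official doc) is that each -Xmx heap must be smaller than its YARN container size, and common advice leaves roughly 20% headroom for non-heap memory, which the values above only loosely follow. A small sanity check over those numbers:

```python
# Check that each JVM heap (-Xmx) fits inside its YARN container.
# The values mirror the mapred-site.xml snippet above; the "leave
# headroom" guidance is a rule of thumb, not an official requirement.
settings = {
    "map":    {"memory_mb": 2400, "xmx_mb": 2048},
    "reduce": {"memory_mb": 4400, "xmx_mb": 4096},
}

for task, s in settings.items():
    assert s["xmx_mb"] < s["memory_mb"], f"{task}: heap exceeds container"
    ratio = s["xmx_mb"] / s["memory_mb"]
    print(f"{task}: heap is {ratio:.0%} of the container")
```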
I guess the defaults are just too low…
Next problem: the test aborts due to a test-resource cleanup issue. Solved with
the following change to the test:
[tom@localhost pig]$ git diff -- TestPigSmoke.groovy
diff --git a/bigtop-tests/smoke-tests/pig/TestPigSmoke.groovy
b/bigtop-tests/smoke-tests/pig/TestPigSmoke.groovy
index 9902267..9511626 100644
--- a/bigtop-tests/smoke-tests/pig/TestPigSmoke.groovy
+++ b/bigtop-tests/smoke-tests/pig/TestPigSmoke.groovy
@@ -41,6 +41,7 @@ class TestPigSmoke {
   @AfterClass
   public static void tearDown() {
     sh.exec("hadoop fs -rmr -skipTrash pigsmoketest");
+    sh.exec("hadoop fs -rmr -skipTrash pig-output-wordcount");
   }
   @BeforeClass
So, not casting stones… but when are the smoke tests run? It seems like it would
be optimal for them to be run on a bigtop distro prior to a bigtop release,
on each of the supported OSes (sounds like a lot of work… is there something
like that?). For users stepping into bigtop for the first time
(raises hand…), it would be nice if the docs said: install bigtop, then add a
non-privileged user, up the memory parameters, and (??), etc. Now run all
the tests and ensure they pass.
Tim
From: Jay Vyas <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Wednesday, September 24, 2014 10:36 AM
To: "[email protected]" <[email protected]>
Subject: Re: smoke tests in 0.7.0
Hi Tim. Great to hear you're making progress..
You're on the right track, but I forgot the details. But yes: you'll have to run
some simple commands as user hdfs to set up permissions for "root".
You can try running your tests as user "hdfs". That is a good hammer to use,
since hdfs is the superuser on Hadoop systems that use HDFS as the file system.
In other systems, like Gluster, we usually have root as the superuser.
Directory perms are always a pain in Hadoop setup. If you have any suggestions to
make it more user friendly, maybe create a JIRA. On this route, we have done
BIGTOP-1200, which encodes all the info in a JSON file so that any FileSystem
can use the Bigtop provisioner. I can discuss that with you later on if you
want (send me a private message).
I haven't merged that to replace init-hdfs, but it is functionally equivalent
and can be found in the code base (see JIRAs BIGTOP-952 and BIGTOP-1200 for
details).
On Sep 24, 2014, at 12:50 PM, Tim Harsch <[email protected]> wrote:
Thanks, that was helpful. So, I looked closely at the TestPigSmoke test and
tried repeating its steps manually, which really helped. I was able to track
the issue down to a perms problem when running as user root. See this:
[root@localhost pig]# hadoop fs -ls /
Found 6 items
drwxrwxrwx - hdfs supergroup 0 2014-09-24 00:32 /benchmarks
drwxr-xr-x - hbase hbase 0 2014-09-24 00:32 /hbase
drwxr-xr-x - solr solr 0 2014-09-24 00:32 /solr
drwxrwxrwt - hdfs supergroup 0 2014-09-24 18:33 /tmp
drwxr-xr-x - hdfs supergroup 0 2014-09-24 00:33 /user
drwxr-xr-x - hdfs supergroup 0 2014-09-24 00:32 /var
[root@localhost pig]# hadoop fs -ls /tmp
Found 2 items
drwxrwxrwx - mapred mapred 0 2014-09-24 00:37 /tmp/hadoop-yarn
drwxr-xr-x - root supergroup 0 2014-09-24 01:29 /tmp/temp-1450563950
[root@localhost pig]# hadoop fs -ls /tmp/hadoop-yarn
Found 1 items
drwxrwx--- - mapred mapred 0 2014-09-24 00:37 /tmp/hadoop-yarn/staging
[root@localhost pig]# hadoop fs -ls /tmp/hadoop-yarn/staging
ls: Permission denied: user=root, access=READ_EXECUTE,
inode="/tmp/hadoop-yarn/staging":mapred:mapred:drwxrwx---
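To spell out why root gets rejected here: with mode drwxrwx---, a user who is neither the owner (mapred) nor in the mapred group falls through to the empty "other" bits, and root is not special to HDFS (the superuser is whoever runs the NameNode, typically hdfs). A toy POSIX-style version of that check, just for illustration (real HDFS adds superuser and ACL logic):

```python
def basic_perm_check(mode, owner, group, user, user_groups, want_bits):
    """Toy POSIX-style permission check: pick the owner, group, or
    other bits depending on who is asking, then test the wanted bits."""
    if user == owner:
        perm = (mode >> 6) & 0o7
    elif group in user_groups:
        perm = (mode >> 3) & 0o7
    else:
        perm = mode & 0o7
    return perm & want_bits == want_bits

# /tmp/hadoop-yarn/staging is mapred:mapred mode 0770;
# listing it needs READ_EXECUTE, i.e. r-x = 0o5
print(basic_perm_check(0o770, "mapred", "mapred", "root", {"root"}, 0o5))    # False
print(basic_perm_check(0o770, "mapred", "mapred", "mapred", {"mapred"}, 0o5))  # True
```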
OK, makes sense. But I'm a little confused.. I thought all the directories
would be set up correctly by the script /usr/lib/hadoop/libexec/init-hdfs.sh,
which, as you can tell from the above output, I did run. From the docs I've
read, the assumption is that after running /usr/lib/hadoop/libexec/init-hdfs.sh
all tests should pass… but perhaps I missed an instruction somewhere.
Tim
From: jay vyas <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Wednesday, September 24, 2014 5:46 AM
To: "[email protected]" <[email protected]>
Subject: Re: smoke tests in 0.7.0
Subject: Re: smoke tests in 0.7.0
Thanks, Tim. It could be related to permissions on the DFS... depending on the
user you are running the job as.
Can you paste the error you got? In general the errors should be easy to track
down in smoke-tests (you can just hack some print statements into the groovy
script under pig/).
Also, the stack trace should give you some information.