Thanks for the clear explanation, Harsh!

Jason
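(Concretely, the hadoop.tmp.dir / java.io.tmpdir change Harsh describes in his reply quoted below would look roughly like this. This is only a sketch: the paths under $HOME are example values, and the directory must exist first, e.g. mkdir -p $HOME/tmp.)

In core-site.xml:

  <property>
    <name>hadoop.tmp.dir</name>
    <value>${user.home}/tmp/hadoop</value>
  </property>

In conf/hadoop-env.sh:

  # Point JVM temporary files at a directory under $HOME instead of /tmp.
  export HADOOP_OPTS="-Djava.io.tmpdir=${HOME}/tmp ${HADOOP_OPTS}"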
On Fri, Sep 14, 2012 at 10:08 PM, Harsh J <[email protected]> wrote:
> Jason,
>
> So far you've made sure your HDFS data is placed somewhere it won't
> get wiped. That much is sufficient for going ahead with running
> HBase.
>
> For the rest of the files that are going to /tmp, you will need to
> tweak the "hadoop.tmp.dir" config to point it elsewhere, and also
> change HADOOP_OPTS in hadoop-env.sh to include
> -Djava.io.tmpdir=$HOME/tmp, which moves temporary-file requests off
> to a new path under your $HOME.
>
> However, doing this is not absolutely necessary. For HDFS, all that
> really matters is the name and data directories, which you have
> already moved to a persistent location.
>
> On Sat, Sep 15, 2012 at 12:33 AM, Jason Huang <[email protected]> wrote:
>> Thanks.
>>
>> This makes sense - checking hdfs-default.xml I found the same
>> properties, named dfs.name.dir and dfs.data.dir.
>>
>> The namenode format no longer touches the default tmp folders taken
>> from hdfs-default.xml.
>>
>> However, after formatting the name node, Hadoop automatically
>> created another folder:
>> /tmp/hsperfdata_jasonhuang
>>
>> Does anyone know what that directory is for?
>>
>> And after I started Hadoop (running ./start-all.sh), another folder,
>> /tmp/hadoop-jasonhuang, was created, together with a few files:
>> /tmp/hadoop-jasonhuang-datanode.pid
>> /tmp/hadoop-jasonhuang-jobtracker.pid
>> /tmp/hadoop-jasonhuang-namenode.pid
>> /tmp/hadoop-jasonhuang-secondarynamenode.pid
>> /tmp/hadoop-jasonhuang-tasktracker.pid
>>
>> Are those files generated in the correct location?
>>
>> I've looked at the logs for both the name node and the master node
>> and there seem to be no errors. However, I am not sure whether these
>> files are generated in the right place. I am installing HBase on top
>> of this and want to make sure Hadoop is working correctly before
>> going further.
>>
>> Thanks!
>>
>> Jason
>>
>> On Fri, Sep 14, 2012 at 1:36 PM, Harsh J <[email protected]> wrote:
>>> If you are using 1.0.3, then those config names are wrong. You need
>>> dfs.name.dir and dfs.data.dir instead. The configs you have are for
>>> the 2.x-based releases.
>>>
>>> Also, I'd make the values look like ${user.home}/hdfs/name, etc.,
>>> for a slightly more portable/templatey config :)
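(Spelled out, Harsh's two suggestions combine into hdfs-site.xml entries roughly like the following for 1.0.3. A sketch: Hadoop's config substitution fills in ${user.home} from the JVM system property of the same name.)

  <property>
    <name>dfs.name.dir</name>
    <value>${user.home}/hdfs/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>${user.home}/hdfs/data</value>
  </property>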
>>>
>>> On Fri, Sep 14, 2012 at 8:31 PM, Jason Huang <[email protected]> wrote:
>>>> Hello,
>>>>
>>>> I am trying to set up Hadoop 1.0.3 on my MacBook Pro in
>>>> pseudo-distributed mode.
>>>>
>>>> After downloading, installing, and setting up the config files, I
>>>> ran the following namenode format command, as suggested in the
>>>> user guide:
>>>>
>>>> $ bin/hadoop namenode -format
>>>>
>>>> Here is the output:
>>>> ************************************************************/
>>>> 12/09/14 10:46:42 INFO util.GSet: VM type = 32-bit
>>>> 12/09/14 10:46:42 INFO util.GSet: 2% max memory = 39.6925 MB
>>>> 12/09/14 10:46:42 INFO util.GSet: capacity = 2^23 = 8388608 entries
>>>> 12/09/14 10:46:42 INFO util.GSet: recommended=8388608, actual=8388608
>>>> 12/09/14 10:46:42 INFO namenode.FSNamesystem: fsOwner=jasonhuang
>>>> 12/09/14 10:46:42 INFO namenode.FSNamesystem: supergroup=supergroup
>>>> 12/09/14 10:46:42 INFO namenode.FSNamesystem: isPermissionEnabled=true
>>>> 12/09/14 10:46:42 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
>>>> 12/09/14 10:46:42 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
>>>> 12/09/14 10:46:42 INFO namenode.NameNode: Caching file names occuring more than 10 times
>>>> 12/09/14 10:46:42 INFO common.Storage: Image file of size 116 saved in 0 seconds.
>>>> 12/09/14 10:46:42 INFO common.Storage: Storage directory /tmp/hadoop-jasonhuang/dfs/name has been successfully formatted.
>>>> 12/09/14 10:46:42 INFO namenode.NameNode: SHUTDOWN_MSG:
>>>> /************************************************************
>>>>
>>>> It appears that the storage directory is
>>>> /tmp/hadoop-jasonhuang/dfs/name.
>>>>
>>>> However, in my config file I've assigned a different directory
>>>> (see hdfs-site.xml below):
>>>> <configuration>
>>>>   <property>
>>>>     <name>dfs.replication</name>
>>>>     <value>1</value>
>>>>   </property>
>>>>   <property>
>>>>     <name>dfs.namenode.name.dir</name>
>>>>     <value>/Users/jasonhuang/hdfs/name</value>
>>>>   </property>
>>>>   <property>
>>>>     <name>dfs.datanode.data.dir</name>
>>>>     <value>/Users/jasonhuang/hdfs/data</value>
>>>>   </property>
>>>> </configuration>
>>>>
>>>> Does anyone know why the hdfs-site.xml might not be respected?
>>>>
>>>> Also, after formatting the name node, I searched for the fsimage
>>>> file in my local file directories (from the root dir) and here is
>>>> what I found:
>>>> $ sudo find / -name fsimage
>>>> /private/tmp/hadoop-jasonhuang/dfs/name/current/fsimage
>>>> /private/tmp/hadoop-jasonhuang/dfs/name/image/fsimage
>>>>
>>>> I don't understand why the name node format picked (and created)
>>>> these two directories...
>>>>
>>>> Any thoughts?
>>>>
>>>> Thanks!
>>>>
>>>> Jason
>>>
>>>
>>>
>>> --
>>> Harsh J
>
>
>
> --
> Harsh J
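(Two footnotes on the questions left open above. The /tmp/hsperfdata_jasonhuang directory is created by the HotSpot JVM itself for its performance counters, which is what tools like jps read; it is not a Hadoop file and can be ignored. The *.pid files land in /tmp because that is the default pid directory for the 1.x daemon scripts; if they should also survive /tmp cleanup, hadoop-env.sh has a setting for it. A sketch, with the path as an example value:)

  # Keep daemon pid files out of /tmp.
  export HADOOP_PID_DIR=${HOME}/hadoop/pids

(The /private/tmp prefix in the find output is just OS X behavior: /tmp is a symlink to /private/tmp, so both paths refer to the same directory.)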
