[Hadoop Wiki] Update of "FAQ" by SomeOtherAccount

Apache Wiki Thu, 30 Jun 2011 12:39:35 -0700

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change 
notification.


The "FAQ" page has been changed by SomeOtherAccount:
http://wiki.apache.org/hadoop/FAQ?action=diff&rev1=105&rev2=106

Comment:
Adding NFS question

   * `mapred.child.java.opts = -Xmx1024m`
  
  == What kind of hardware scales best for Hadoop? ==
- The short answer is dual processor/dual core machines with 4-8GB of RAM using 
ECC memory. Machines should be moderately high-end commodity machines to be 
most cost-effective and typically cost 1/2 - 2/3 the cost of normal production 
application servers but are not desktop-class machines. This cost tends to be 
$2-5K. For a more detailed discussion, see MachineScaling page.
+ The short answer is dual processor/dual core machines with 4-8GB of RAM using 
ECC memory, depending upon workflow needs. Machines should be moderately 
high-end commodity machines to be most cost-effective and typically cost 1/2 - 
2/3 the cost of normal production application servers but are not desktop-class 
machines. This cost tends to be $2-5K. For a more detailed discussion, see 
the MachineScaling page.
  
  == I have a new node I want to add to a running Hadoop cluster; how do I 
start services on just one node? ==
  This also applies to the case where a machine has crashed and rebooted, etc., 
and you need to get it to rejoin the cluster. You do not need to shut down 
and/or restart the entire cluster in this case.
@@ -86, +86 @@

   * general is for people interested in the administrivia of Hadoop (e.g., new 
release discussion).
   * -user mailing lists are for people using the various components of the 
framework.  For example, if you are writing a job and have a question on the 
MapReduce API, a posting to mapreduce-user would be appropriate.
   * -dev mailing lists are for people who are changing the source code of the 
framework.  For example, if you are implementing a new file system and want to 
know about the FileSystem API, hdfs-dev would be the appropriate mailing list.
+ 
+ == What does "NFS: Cannot create lock on (some dir)" mean? ==
+ 
+ This is not actually a problem with Hadoop; it indicates a problem with the 
setup of the environment in which Hadoop is operating.
+ 
+ 
+ Usually, this error means that the NFS server to which the process is writing 
does not support file system locks.  NFS prior to v4 requires a locking service 
daemon to run (typically rpc.lockd) in order to provide this functionality.  
NFSv4 has file system locks built into the protocol.
+ 
+ In some (rarer) instances, it might represent a problem with certain Linux 
kernels that did not implement the flock() system call properly.
+ 
+ It is highly recommended that the only NFS connection in a Hadoop setup be 
the place where the NameNode writes a secondary or tertiary copy of the fsimage 
and edits log.  All other uses of NFS are discouraged, for the sake of optimal 
performance.
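
One way to check whether a directory on an NFS mount supports file locks is to try taking a lock on a scratch file inside it. The following is a minimal diagnostic sketch, not part of Hadoop: the function name `supports_locks` is hypothetical, and it assumes a Unix-like system with Python available on the node. It tries both `flock()` and POSIX (`fcntl`) locks, since a mount may handle the two differently.

```python
import fcntl
import sys
import tempfile


def supports_locks(directory):
    """Report whether flock() and POSIX (fcntl) locks can be taken in `directory`.

    Hypothetical helper for diagnosis only; Hadoop does not ship this.
    """
    results = {}
    # Create a scratch file inside the directory under test.
    # NamedTemporaryFile removes it again when the block exits.
    with tempfile.NamedTemporaryFile(dir=directory) as handle:
        fd = handle.fileno()
        for name, lock in (("flock", fcntl.flock), ("fcntl", fcntl.lockf)):
            try:
                # Non-blocking exclusive lock: fails with OSError if the
                # file system (or its lock daemon) cannot grant it.
                lock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
                lock(fd, fcntl.LOCK_UN)
                results[name] = True
            except OSError:
                results[name] = False
    return results


if __name__ == "__main__":
    # Pass the NFS-mounted directory as the first argument;
    # defaults to the local temp directory.
    target = sys.argv[1] if len(sys.argv) > 1 else tempfile.gettempdir()
    print(target, supports_locks(target))
```

If either entry comes back `False` for the NFS-mounted directory while the same check succeeds on a local disk, the lock service on the NFS server (or client) is the likely place to look.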
  
  = MapReduce =
  == Do I have to write my job in Java? ==
