Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.
The "FAQ" page has been changed by GautamGopalakrishnan:
https://wiki.apache.org/hadoop/FAQ?action=diff&rev1=112&rev2=113

Comment:
Added 3.17 (Operation category READ/WRITE is not supported in state standby)

== Why is the 'hadoop.tmp.dir' config default user.name dependent? ==
We need a directory that a user can write to without interfering with other users. If we didn't include the username, then different users would share the same tmp directory. This can cause authorization problems if a user's default umask doesn't permit write by others. It can also result in users stomping on each other, e.g. when they are playing with HDFS and re-format their filesystem.

== Does Hadoop require SSH? ==
Hadoop-provided scripts (e.g., start-mapred.sh and start-dfs.sh) use ssh in order to start and stop the various daemons, as do some other utilities. The Hadoop framework in itself does not '''require''' ssh. Daemons (e.g. TaskTracker and DataNode) can also be started manually on each node without the scripts' help.

== What mailing lists are available for more help? ==
A description of all the mailing lists is on the http://hadoop.apache.org/mailing_lists.html page. In general:

 * general is for people interested in the administrivia of Hadoop (e.g., new release discussion).
@@ -96, +90 @@
 * -dev mailing lists are for people who are changing the source code of the framework. For example, if you are implementing a new file system and want to know about the FileSystem API, hdfs-dev would be the appropriate mailing list.

== What does "NFS: Cannot create lock on (some dir)" mean? ==
This actually is not a problem with Hadoop, but a problem with the setup of the environment in which it is operating. Usually, this error means that the NFS server to which the process is writing does not support file system locks. NFS prior to v4 requires a locking service daemon to run (typically rpc.lockd) in order to provide this functionality. NFSv4 has file system locks built into the protocol.

@@ -293, +285 @@
Hadoop currently does not have a method by which to do this automatically. To do this manually:

 1. Shutdown the DataNode involved.
 1. Use the UNIX mv command to move the individual block replica and meta pairs from one directory to another on the selected host (see the sketch after this list). On releases which have HDFS-6482 (Apache Hadoop 2.6.0+) you also need to ensure that the subdir-named directory structure remains exactly the same when moving the blocks across the disks. For example, if the block replica and its meta pair were under '''/data/1'''/dfs/dn/current/BP-1788246909-172.23.1.202-1412278461680/current/finalized'''/subdir0/subdir1'''/, and you wanted to move them to the /data/5/ disk, then they MUST be moved into the same subdirectory structure underneath it, i.e. '''/data/5'''/dfs/dn/current/BP-1788246909-172.23.1.202-1412278461680/current/finalized'''/subdir0/subdir1/'''. If this is not maintained, the DN will no longer be able to locate the replicas after the move.
 1. Restart the DataNode.
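A minimal sketch of step 2 as shell commands, run on the DataNode host while the DataNode is stopped. The block pool and subdir paths are the ones from the example above; the block file names ({{{blk_1073741825}}} and its {{{.meta}}} companion) are hypothetical, so substitute the actual {{{blk_*}}} files found on your disk:

{{{
# Hypothetical sketch: source and target must share the identical
# path below the finalized/ directory (required with HDFS-6482).
SRC=/data/1/dfs/dn/current/BP-1788246909-172.23.1.202-1412278461680/current/finalized/subdir0/subdir1
DST=/data/5/dfs/dn/current/BP-1788246909-172.23.1.202-1412278461680/current/finalized/subdir0/subdir1

mkdir -p "$DST"                                  # recreate the same subdir structure on the new disk

mv "$SRC/blk_1073741825"           "$DST/"       # the block replica itself (made-up name)
mv "$SRC/blk_1073741825_1001.meta" "$DST/"       # its checksum meta file must move with it
}}}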
== What does "file could only be replicated to 0 nodes, instead of 1" mean? ==
The NameNode does not have any available !DataNodes. This can be caused by a wide variety of reasons. Check the DataNode logs, the NameNode logs, network connectivity, and so on. Please see the page: CouldOnlyBeReplicatedTo

@@ -303, +295 @@
No. This is why it is very important to configure [[http://hadoop.apache.org/hdfs/docs/current/hdfs-default.html#dfs.namenode.name.dir|dfs.namenode.name.dir]] to write to two filesystems on different physical hosts, use the SecondaryNameNode, etc.

== I got a warning on the NameNode web UI "WARNING : There are about 32 missing blocks. Please check the log or run fsck." What does it mean? ==
This means that 32 blocks in your HDFS installation don't have a single replica on any of the live !DataNodes.<<BR>>
Block replica files can be found on a DataNode in the storage directories specified by the configuration parameter [[http://hadoop.apache.org/hdfs/docs/current/hdfs-default.html#dfs.datanode.data.dir|dfs.datanode.data.dir]]. If the parameter is not set in the DataNode's {{{hdfs-site.xml}}}, then the default location {{{/tmp}}} will be used. This default is intended for testing only. In a production system it is an easy way to lose actual data, as the local OS may enforce recycling policies on {{{/tmp}}}. Thus the parameter must be overridden.<<BR>>
If [[http://hadoop.apache.org/hdfs/docs/current/hdfs-default.html#dfs.datanode.data.dir|dfs.datanode.data.dir]] correctly specifies storage directories on all !DataNodes, then you might have real data loss, which can be the result of faulty hardware or software bugs. If the file(s) containing the missing blocks represent transient data or can be recovered from an external source, then the easiest way out is to remove (and potentially restore) them. Run [[http://hadoop.apache.org/common/docs/current/hdfs_user_guide.html#fsck|fsck]] to determine which files have missing blocks. If you would like (highly appreciated) to further investigate the cause of the data loss, you can dig into the NameNode and DataNode logs. From the logs one can track the entire life cycle of a particular block and its replicas.
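As an illustration, a minimal diagnostic session might look like the following, using the {{{hdfs}}} CLI of Hadoop 2.x (on older releases the equivalents are {{{hadoop fsck}}} and {{{hadoop fs}}}); {{{/path/to/suspect/file}}} is a placeholder:

{{{
# List the files that currently have missing or corrupt blocks
hdfs fsck / -list-corruptfileblocks

# Inspect one affected file in detail: its blocks and their locations
hdfs fsck /path/to/suspect/file -files -blocks -locations

# If the data is transient or restorable from an external source,
# the easiest way out is to remove the damaged file
hdfs dfs -rm /path/to/suspect/file
}}}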
== If a block size of 64MB is used and a file is written that uses less than 64MB, will 64MB of disk space be consumed? ==
Short answer: No.

Longer answer: Since HDFS does not do raw disk block storage, there are two block sizes in use when writing a file in HDFS: the HDFS block size and the underlying file system's block size. HDFS will create files up to the size of the HDFS block size, as well as a meta file that contains CRC32 checksums for that block. The underlying file system stores that file as increments of its block size on the actual raw disk, just as it would any other file.

== What does the message "Operation category READ/WRITE is not supported in state standby" mean? ==
In an HA-enabled cluster, DFS clients cannot know in advance which namenode is active at a given time. So when a client contacts a namenode and it happens to be the standby, the READ or WRITE operation will be refused and this message is logged. The client will then automatically contact the other namenode and try the operation again. As long as there is one active and one standby namenode in the cluster, this message can be safely ignored.

If an application is configured to always contact only one namenode, this message indicates that the application is failing to perform any read/write operation. In such situations, the application would need to be modified to contact the other namenode.
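To see which namenode is currently active, you can query the HA state from the command line, for instance as below. The service IDs {{{nn1}}} and {{{nn2}}} are assumptions; use the IDs listed under {{{dfs.ha.namenodes.<nameservice>}}} in your configuration:

{{{
# Prints "active" or "standby" for each configured namenode
# (nn1/nn2 are example IDs -- adjust to your dfs.ha.namenodes setting)
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
}}}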
= Platform Specific =
== General ==

=== Problems building the C/C++ Code ===
While most of Hadoop is built using Java, a larger and growing portion is being rewritten in C and C++. As a result, the code portability between platforms is going down. Part of the problem is the lack of access to platforms other than Linux, and our tendency to use specific BSD, GNU, or System V functionality in places where the POSIX usage is non-existent, difficult, or non-performant. That said, if the native compiled code cannot be built, the biggest losses will be the performance of the system and the security features present in newer releases of Hadoop. The other Hadoop features usually have Java analogs that work, albeit more slowly than their C cousins. The exception to this is security, which absolutely requires compiled code.
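On releases that ship the {{{checknative}}} command (Apache Hadoop 2.x and later), you can verify whether the native compiled code was actually found at runtime, e.g.:

{{{
# Reports whether libhadoop.so and the optional native libraries
# (zlib, snappy, etc.) were loaded; "false" entries mean Hadoop falls
# back to the slower Java implementations where those exist
hadoop checknative -a
}}}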
