Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.
The "FAQ" page has been changed by GautamGopalakrishnan:
https://wiki.apache.org/hadoop/FAQ?action=diff&rev1=112&rev2=113

Comment:
Added 3.17 (Operation category READ/WRITE is not supported in state standby)

== Why is the 'hadoop.tmp.dir' config default user.name dependent? ==
We need a directory that a user can write to without interfering with other users. If we didn't include the username, then different users would share the same tmp directory. This can cause authorization problems if a user's default umask doesn't permit write by others. It can also result in users stomping on each other, e.g. when they are playing with HDFS and re-format their filesystem.

== Does Hadoop require SSH? ==
Hadoop-provided scripts (e.g., start-mapred.sh and start-dfs.sh) use ssh in order to start and stop the various daemons, as do some other utilities. The Hadoop framework in itself does not '''require''' ssh. Daemons (e.g. TaskTracker and DataNode) can also be started manually on each node without the scripts' help.

== What mailing lists are available for more help? ==
A description of all the mailing lists is on the http://hadoop.apache.org/mailing_lists.html page. In general:

 * general is for people interested in the administrivia of Hadoop (e.g., new release discussion).
@@ -96, +90 @@
 * -dev mailing lists are for people who are changing the source code of the framework. For example, if you are implementing a new file system and want to know about the FileSystem API, hdfs-dev would be the appropriate mailing list.

== What does "NFS: Cannot create lock on (some dir)" mean? ==
This actually is not a problem with Hadoop, but a problem with the setup of the environment in which it is operating. Usually, this error means that the NFS server to which the process is writing does not support file system locks. NFS prior to v4 requires a locking service daemon to run (typically rpc.lockd) in order to provide this functionality. NFSv4 has file system locks built into the protocol.

@@ -293, +285 @@
Hadoop currently does not have a method by which to do this automatically. To do this manually:

 1. Shutdown the DataNode involved.
 1. Use the UNIX mv command to move the individual block replica and meta pairs from one directory to another on the selected host (see the sketch after this list). On releases which have HDFS-6482 (Apache Hadoop 2.6.0+) you also need to ensure that the subdir-named directory structure remains exactly the same when moving the blocks across the disks. For example, if the block replica and its meta pair were under '''/data/1'''/dfs/dn/current/BP-1788246909-172.23.1.202-1412278461680/current/finalized'''/subdir0/subdir1'''/, and you wanted to move them to the /data/5/ disk, then they MUST be moved into the same subdirectory structure underneath it, i.e. '''/data/5'''/dfs/dn/current/BP-1788246909-172.23.1.202-1412278461680/current/finalized'''/subdir0/subdir1/'''. If this is not maintained, the DN will no longer be able to locate the replicas after the move.
 1. Restart the DataNode.
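A minimal sketch of step 2 as shell commands, run on the DataNode host while the DataNode is stopped. The block pool and subdir paths are the ones from the example above; the block file names ({{{blk_1073741825}}} and its {{{.meta}}} companion) are hypothetical, so substitute the actual {{{blk_*}}} files found on your disk:

{{{
# Hypothetical sketch: source and target must share the identical
# path below the finalized/ directory (required with HDFS-6482).
SRC=/data/1/dfs/dn/current/BP-1788246909-172.23.1.202-1412278461680/current/finalized/subdir0/subdir1
DST=/data/5/dfs/dn/current/BP-1788246909-172.23.1.202-1412278461680/current/finalized/subdir0/subdir1

mkdir -p "$DST"                                  # recreate the same subdir structure on the new disk

mv "$SRC/blk_1073741825"           "$DST/"       # the block replica itself (made-up name)
mv "$SRC/blk_1073741825_1001.meta" "$DST/"       # its checksum meta file must move with it
}}}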
== What does "file could only be replicated to 0 nodes, instead of 1" mean? ==
The NameNode does not have any available !DataNodes. This can be caused by a wide variety of reasons. Check the DataNode logs, the NameNode logs, network connectivity, and so on. Please see the page: CouldOnlyBeReplicatedTo

@@ -303, +295 @@
No. This is why it is very important to configure [[http://hadoop.apache.org/hdfs/docs/current/hdfs-default.html#dfs.namenode.name.dir|dfs.namenode.name.dir]] to write to two filesystems on different physical hosts, use the SecondaryNameNode, etc.

== I got a warning on the NameNode web UI "WARNING : There are about 32 missing blocks. Please check the log or run fsck." What does it mean? ==
This means that 32 blocks in your HDFS installation don't have a single replica on any of the live !DataNodes.<<BR>>
Block replica files can be found on a DataNode in the storage directories specified by the configuration parameter [[http://hadoop.apache.org/hdfs/docs/current/hdfs-default.html#dfs.datanode.data.dir|dfs.datanode.data.dir]]. If the parameter is not set in the DataNode's {{{hdfs-site.xml}}}, then the default location {{{/tmp}}} will be used. This default is intended for testing only. In a production system it is an easy way to lose actual data, as the local OS may enforce recycling policies on {{{/tmp}}}. Thus the parameter must be overridden.<<BR>>
If [[http://hadoop.apache.org/hdfs/docs/current/hdfs-default.html#dfs.datanode.data.dir|dfs.datanode.data.dir]] correctly specifies storage directories on all !DataNodes, then you might have real data loss, which can be the result of faulty hardware or software bugs. If the file(s) containing the missing blocks represent transient data or can be recovered from an external source, then the easiest way out is to remove (and potentially restore) them. Run [[http://hadoop.apache.org/common/docs/current/hdfs_user_guide.html#fsck|fsck]] to determine which files have missing blocks. If you would like (highly appreciated) to further investigate the cause of the data loss, you can dig into the NameNode and DataNode logs. From the logs one can track the entire life cycle of a particular block and its replicas.
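As an illustration, a minimal diagnostic session might look like the following, using the {{{hdfs}}} CLI of Hadoop 2.x (on older releases the equivalents are {{{hadoop fsck}}} and {{{hadoop fs}}}); {{{/path/to/suspect/file}}} is a placeholder:

{{{
# List the files that currently have missing or corrupt blocks
hdfs fsck / -list-corruptfileblocks

# Inspect one affected file in detail: its blocks and their locations
hdfs fsck /path/to/suspect/file -files -blocks -locations

# If the data is transient or restorable from an external source,
# the easiest way out is to remove the damaged file
hdfs dfs -rm /path/to/suspect/file
}}}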
== If a block size of 64MB is used and a file is written that uses less than 64MB, will 64MB of disk space be consumed? ==
Short answer: No.

Longer answer: Since HDFS does not do raw disk block storage, there are two block sizes in use when writing a file in HDFS: the HDFS block size and the underlying file system's block size. HDFS will create files up to the size of the HDFS block size, as well as a meta file that contains CRC32 checksums for that block. The underlying file system stores that file as increments of its block size on the actual raw disk, just as it would any other file.

== What does the message "Operation category READ/WRITE is not supported in state standby" mean? ==
In an HA-enabled cluster, DFS clients cannot know in advance which namenode is active at a given time. So when a client contacts a namenode and it happens to be the standby, the READ or WRITE operation will be refused and this message is logged. The client will then automatically contact the other namenode and try the operation again. As long as there is one active and one standby namenode in the cluster, this message can be safely ignored.

If an application is configured to always contact only one namenode, this message indicates that the application is failing to perform any read/write operation. In such situations, the application would need to be modified to contact the other namenode.
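To see which namenode is currently active, you can query the HA state from the command line, for instance as below. The service IDs {{{nn1}}} and {{{nn2}}} are assumptions; use the IDs listed under {{{dfs.ha.namenodes.<nameservice>}}} in your configuration:

{{{
# Prints "active" or "standby" for each configured namenode
# (nn1/nn2 are example IDs -- adjust to your dfs.ha.namenodes setting)
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
}}}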
= Platform Specific =
== General ==

=== Problems building the C/C++ Code ===
While most of Hadoop is built using Java, a larger and growing portion is being rewritten in C and C++. As a result, the code portability between platforms is going down. Part of the problem is the lack of access to platforms other than Linux, and our tendency to use specific BSD, GNU, or System V functionality in places where the POSIX usage is non-existent, difficult, or non-performant. That said, if the native compiled code cannot be built, the biggest losses will be the performance of the system and the security features present in newer releases of Hadoop. The other Hadoop features usually have Java analogs that work, albeit more slowly than their C cousins. The exception to this is security, which absolutely requires compiled code.
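On releases that ship the {{{checknative}}} command (Apache Hadoop 2.x and later), you can verify whether the native compiled code was actually found at runtime, e.g.:

{{{
# Reports whether libhadoop.so and the optional native libraries
# (zlib, snappy, etc.) were loaded; "false" entries mean Hadoop falls
# back to the slower Java implementations where those exist
hadoop checknative -a
}}}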
