[
https://issues.apache.org/jira/browse/HADOOP-5189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Luca Telloli updated HADOOP-5189:
---------------------------------
Attachment: HADOOP-5189-trunk-preview.patch
I'm posting a new preview version that addresses two features:
- Logging on multiple devices
- Writing IDs to ZooKeeper (that is, files are no longer used to store this
information)
I additionally moved EditLogFileOutputStream and EditLogFileInputStream out of
the FSEditLog class.
A sample configuration is the following:
<property>
  <name>dfs.name.dir</name>
  <value>/tmp/localhdfs</value>
</property>
<property>
  <name>dfs.name.edits.dir</name>
  <value>/tmp/hdfsedits</value>
</property>
<property>
  <name>hdfs.editlog</name>
  <value>FILE,BOOKKEEPER</value>
</property>
NOTE: hdfs.editlog is a new property that must be specified for this
patch to work.
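
To illustrate how a comma-separated value such as FILE,BOOKKEEPER could be
turned into an ordered list of logging systems, here is a minimal sketch. The
class EditLogTypes, the enum LogType and the method parse are hypothetical
names chosen for this example; they are not taken from the patch.

```java
import java.util.ArrayList;
import java.util.List;

public class EditLogTypes {
    // Hypothetical enum; the constants mirror the sample value "FILE,BOOKKEEPER".
    public enum LogType { FILE, BOOKKEEPER }

    // Split a comma-separated property value into log types, preserving order.
    // Unknown names make LogType.valueOf throw IllegalArgumentException.
    public static List<LogType> parse(String value) {
        List<LogType> types = new ArrayList<LogType>();
        for (String token : value.split(",")) {
            String name = token.trim();
            if (!name.isEmpty()) {
                types.add(LogType.valueOf(name));
            }
        }
        return types;
    }

    public static void main(String[] args) {
        // The value would normally come from the hdfs.editlog property.
        System.out.println(parse("FILE,BOOKKEEPER"));
    }
}
```

Keeping the order of the entries matters under the semantics described in this
message, since reads are served by the first logging system.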
RUNNING ZOOKEEPER AND BOOKKEEPER EASILY
To run ZooKeeper and BookKeeper in one shot, the BookKeeper jar contains a
class named org.apache.bookkeeper.util.LocalBookKeeper, which can run a
ZooKeeper instance along with a user-specified number of BookKeepers.
An example command is the following:
java -cp lib/log4j-1.2.15.jar:lib/junit-3.8.1.jar:lib/zookeeper-dev.jar:lib/zookeeper-dev-bookkeeper.jar \
  org.apache.bookkeeper.util.LocalBookKeeper
LOGGING ON MULTIPLE DEVICES
The initial semantics are very simple:
- when writing an operation, write sequentially to every configured logging
system
- when reading operations (during startup or a checkpoint), read from the
first logging system; at the moment this is the first storage directory, so
reads are still file-based
There's no fall-back mechanism implemented yet if the first logging system
fails (the idea would be to go with the next one and exclude the failed one
from the array of streams).
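
The semantics above, together with the fall-back idea that is not implemented
yet, can be sketched roughly as follows. This is an illustration only:
MultiLogSketch, EditLog and MemoryLog are hypothetical names, and the in-memory
log stands in for the real file and BookKeeper streams.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class MultiLogSketch {
    // Hypothetical stand-in for one logging system (file, BookKeeper, ...).
    public interface EditLog {
        void write(String op) throws IOException;
        List<String> readAll();
    }

    // In-memory log used only for this example; "broken" simulates a failure.
    public static class MemoryLog implements EditLog {
        private final List<String> ops = new ArrayList<String>();
        private final boolean broken;
        public MemoryLog(boolean broken) { this.broken = broken; }
        public void write(String op) throws IOException {
            if (broken) throw new IOException("log unavailable");
            ops.add(op);
        }
        public List<String> readAll() { return new ArrayList<String>(ops); }
    }

    private final List<EditLog> logs = new ArrayList<EditLog>();

    public MultiLogSketch(EditLog... configured) {
        logs.addAll(Arrays.asList(configured));
    }

    // Write the operation to every log in order. A log that fails is
    // excluded from the list (the fall-back idea, assumed here).
    public void write(String op) {
        Iterator<EditLog> it = logs.iterator();
        while (it.hasNext()) {
            try {
                it.next().write(op);
            } catch (IOException e) {
                it.remove();
            }
        }
    }

    // Read from the first remaining log only.
    public List<String> read() {
        if (logs.isEmpty()) throw new IllegalStateException("no usable log");
        return logs.get(0).readAll();
    }

    public int activeLogs() { return logs.size(); }

    public static void main(String[] args) {
        MultiLogSketch m = new MultiLogSketch(new MemoryLog(true), new MemoryLog(false));
        m.write("OP_ADD");
        System.out.println(m.activeLogs() + " active, first log holds " + m.read());
    }
}
```

Dropping a failed log on write means a later read automatically falls through
to the next logging system, at the cost of never retrying the failed one.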
The current loadFSEdits(StorageDirectory) should eventually change to a
loadFSEdits() where no storage directory is needed. Maybe a
loadFSEdits(EditLogInputStream) would be even better.
DRAWBACKS
Currently, storage directories can be of three types: IMAGE, EDITS and
IMAGE_AND_EDITS, with the last one being the default. With this patch I
exclude the IMAGE_AND_EDITS type, so users are forced to use dfs.name.dir
and dfs.name.edits.dir to specify one directory for IMAGE and one for
EDITS when using file logging.
> Integration with BookKeeper logging system
> ------------------------------------------
>
> Key: HADOOP-5189
> URL: https://issues.apache.org/jira/browse/HADOOP-5189
> Project: Hadoop Core
> Issue Type: New Feature
> Affects Versions: 0.19.0
> Reporter: Luca Telloli
> Attachments: create.png, HADOOP-5189-trunk-preview.patch,
> HADOOP-5189-trunk-preview.patch, HADOOP-5189.patch, HADOOP-5189.patch
>
>
> BookKeeper is a system to reliably log streams of records
> (https://issues.apache.org/jira/browse/ZOOKEEPER-276). The NameNode is a
> natural target for such a system because it is the metadata repository of
> the entire file system for HDFS.