[
https://issues.apache.org/jira/browse/HADOOP-5189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12697979#action_12697979
]
Konstantin Shvachko commented on HADOOP-5189:
---------------------------------------------
# I've got a compile error:
{code}
[javac] hadoop/src/hdfs/org/apache/hadoop/hdfs/server/namenode/BackupStorage.java:29: cannot find symbol
[javac] symbol  : class EditLogFileInputStream
[javac] location: class org.apache.hadoop.hdfs.server.namenode.FSEditLog
[javac] import org.apache.hadoop.hdfs.server.namenode.FSEditLog.EditLogFileInputStream;
{code}
# It is better to create a separate jira for factoring out
{{EditLogFileOutputStream}} and {{EditLogFileInputStream}} from {{FSEditLog}}.
This makes sense with or without BookKeeper, and it will keep the refactoring
from obscuring the changes you actually make to the code.
# Why do you need a new method {{setStorageDirectories()}} with a Boolean
parameter that is not used anywhere inside the method?
# We will need some automation that adds the ZooKeeper and BookKeeper jars to
the project and keeps them synchronized with new releases.
Can it be done with Ivy?
# I agree that the edits input part of the code is not generalized for input
streams other than {{EditLogFileInputStream}}. This is because there were no
alternatives until now. We should work on it.
The drawback of the approach you implement, besides requiring separate image
and edits directories, which you mention, is that you do not have a way to
retrieve the latest checkpoint time from BookKeeper. This is critical for
choosing the latest version of the journal, and you can only get the latest
checkpoint time from the local file (StorageDirectory) that corresponds to the
stream. The StorageDirectory may be out of sync with the real state of the
BookKeeper data.
Suppose that you use one file output stream and one BKOutputStream.
Suppose the BookKeeper output stream dies, the name-node keeps writing to the
file output stream for another hour or so, and then gets restarted.
If the name-node is configured to read from the BookKeeper input stream, then it
will get an outdated state of the namespace, because the current state is in
the local file, not in BK.
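The scenario above can be sketched as a small self-contained model. This is only an illustration, not code from the patch: the {{JournalSelection}} and {{Journal}} types, the {{editCount}} field, and both selection methods are hypothetical and do not exist in Hadoop or BookKeeper. It also assumes that no checkpoint happened between the BookKeeper stream dying and the restart, so both local StorageDirectories record the same checkpoint time.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative model only; none of these types exist in Hadoop or BookKeeper. */
class JournalSelection {

    /** Hypothetical snapshot of one edits journal at name-node startup. */
    static final class Journal {
        final String name;
        final long checkpointTime; // as recorded in the local StorageDirectory
        final long editCount;      // edits actually stored in the stream (assumed knowable here)

        Journal(String name, long checkpointTime, long editCount) {
            this.name = name;
            this.checkpointTime = checkpointTime;
            this.editCount = editCount;
        }
    }

    /** Returns the first journal with the greatest recorded checkpoint time. */
    static Journal latestByCheckpointTime(List<Journal> journals) {
        Journal best = null;
        for (Journal j : journals) {
            if (best == null || j.checkpointTime > best.checkpointTime) {
                best = j;
            }
        }
        return best;
    }

    /** Returns the first journal that actually holds the most edits. */
    static Journal latestByContent(List<Journal> journals) {
        Journal best = null;
        for (Journal j : journals) {
            if (best == null || j.editCount > best.editCount) {
                best = j;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        long lastCheckpoint = 1000L;
        // The BookKeeper stream died after 100 edits; the file stream kept going to 500.
        // Assumption: no checkpoint in between, so both record the same checkpoint time.
        Journal bk = new Journal("bookkeeper", lastCheckpoint, 100L);
        Journal file = new Journal("file", lastCheckpoint, 500L);
        List<Journal> journals = Arrays.asList(bk, file);

        System.out.println("by checkpoint time: " + latestByCheckpointTime(journals).name);
        System.out.println("by content: " + latestByContent(journals).name);
    }
}
```

In this model the locally recorded checkpoint time cannot tell the two journals apart, so a name-node configured to prefer the BookKeeper stream would load the shorter, outdated journal. Detecting the staleness needs some piece of state that is stored with the stream itself.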
In general I am very glad that this is moving in the right direction and that
we will eventually have a framework that allows plugging in different logging
systems and intermixing them if necessary.
> Integration with BookKeeper logging system
> ------------------------------------------
>
> Key: HADOOP-5189
> URL: https://issues.apache.org/jira/browse/HADOOP-5189
> Project: Hadoop Core
> Issue Type: New Feature
> Affects Versions: 0.19.0
> Reporter: Luca Telloli
> Attachments: create.png, HADOOP-5189-trunk-preview.patch,
> HADOOP-5189-trunk-preview.patch, HADOOP-5189.patch, HADOOP-5189.patch
>
>
> BookKeeper is a system to reliably log streams of records
> (https://issues.apache.org/jira/browse/ZOOKEEPER-276). The NameNode is a
> natural target for such a system for being the metadata repository of the
> entire file system for HDFS.