[ 
https://issues.apache.org/jira/browse/HDFS-397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luca Telloli updated HDFS-397:
------------------------------

    Attachment: HDFS-397.patch

Raghu, I acknowledge that there might be a problem with the current code, 
although unit tests for BackupNode pass correctly, and I implemented a few 
changes that should fix this issue. The problematic lines, as you correctly 
pointed out, are: 

{noformat}
@@ -295,7 +295,7 @@
-    editLog.divertFileStreams(STORAGE_JSPOOL_DIR + "/" + STORAGE_JSPOOL_FILE);
+    editLog.divertFileStreams();

@@ -363,7 +363,7 @@
-    editLog.revertFileStreams(STORAGE_JSPOOL_DIR + "/" + STORAGE_JSPOOL_FILE);
+    editLog.purgeEditLog();
{noformat}

>From what I understand, the BackupStorage wants to write into a subdirectory 
>of a storage directory that is not STORAGE_DIR_CURRENT, but STORAGE_SPOOL_DIR. 

Given the original code, and reading current trunk, what will happen in 
divertFileStreams is the following: 
- the code iterates over all current storage directories of type EDITS and open 
streams of type FILE 
- check if the streams matches the directory 
- then for each stream {
- closes the stream 
- creates a new stream, that in this case would point to 
STORAGE_SPOOL_DIR/STORAGE_SPOOL_FILE
- replaces the old stream with the new stream in the list of streams
} 

On revertFileStream, what happens is the following: 
- the code iterates over all current storage directories of type EDITS and open 
streams of type FILE 
- check if the streams matches the directory 
- then for each stream {
- closes STORAGE_SPOOL_FILE/edits.new 
- renames the parameter source to CURRENT/edits
} 

I'm not sure why the file in the STORAGE_SPOOL_DIR gets renamed to the edits 
file in the STORAGE_DIR_CURRENT, but I made a couple of modifications to my 
code that will implement that same mechanism. 

I understand though that the mechanism of writing into a subdirectory is due to 
the fact that, even if BackupNode and Namenode run on different machine, they 
might be writing on a shared storage directory, for instance on NFS, which I 
guess is a concrete example of when the test could fail, but not directly 
implementable in the test itself. 

My modifications include adding a JournalType parameter to the 
divertFileStreams() and purgeEditLog() methods, as follows: 
{noformat}
-    editLog.divertFileStreams(STORAGE_JSPOOL_DIR + "/" + STORAGE_JSPOOL_FILE);
+    editLog.divertFileStreams(JournalType.BACKUP);
{noformat}
and
{noformat}
-    editLog.revertFileStreams(STORAGE_JSPOOL_DIR + "/" + STORAGE_JSPOOL_FILE);
+    editLog.purgeEditLog(JournalType.BACKUP);
{noformat}

which is then used in the EditLogFileOutputStream class to take the correct 
decisions. The final strings in BackupStorage have been made protected so I can 
read them from the EditLogFileOutputStream class. 

So I'd say that this is still a reorganization of code, if minor I'll let you 
to decide, and there's no change in semantics implied or expected. Unit tests 
pass fine on Linux. 

{quote}
It is very unrealistic to expect tests to catch every random bug and typo in a 
patch (for anything I have ever worked on or known.
{quote}
I agree with you, but that's not what I said. 


> Incorporate storage directories into EditLogFileInput/Output streams
> --------------------------------------------------------------------
>
>                 Key: HDFS-397
>                 URL: https://issues.apache.org/jira/browse/HDFS-397
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Luca Telloli
>            Assignee: Luca Telloli
>         Attachments: HADOOP-6001.patch, HADOOP-6001.patch, HADOOP-6001.patch, 
> HDFS-397.patch, HDFS-397.patch
>
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to