[jira] [Commented] (BOOKKEEPER-346) Detect IOExceptions in LedgerCache and bookie should look at next ledger dir(if any)

Sijie Guo (JIRA) Tue, 11 Sep 2012 03:33:15 -0700

    [ 
https://issues.apache.org/jira/browse/BOOKKEEPER-346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452889#comment-13452889
 ]


Sijie Guo commented on BOOKKEEPER-346:
--------------------------------------

{code}
+    /**
+     * Copies current file contents upto specified size to the target file and
+     * deletes the current file.
+     * If size not known then pass size as Long.MAX_VALUE to copy complete file.
+     */
+    public void moveToNewLocation(File newFile, long size) throws IOException {
+        checkOpen(false);
+        if (size == Long.MAX_VALUE) {
+            size = fc.size();
+        }
+        if (!newFile.exists()) {
+            checkParents(newFile);
+            newFile.createNewFile();
+        }
+        FileChannel newFc = new RandomAccessFile(newFile, "rw").getChannel();
+        try {
+            long written = 0;
+            while (written < size) {
+                long count = fc.transferTo(written, size, newFc);
+                if (count <= 0) {
+                    throw new IOException("Copying to new location failed");
+                }
+                written += count;
+            }
+            if (written <= 0 && size > 0) {
+                throw new IOException("Copying to new location failed");
+            }
+        } finally {
+            newFc.force(true);
+        }
+        fc.close();
+        fc = newFc;
+        if (!delete()) {
+            LOG.warn("Failed to delete the previous index file " + lf);
+        }
+        lf = newFile;
+    }
+
{code}

First, the operation is not synchronized. It could be called from different 
threads (the flush thread and the threads serving requests).
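
To illustrate the first point, here is a minimal sketch of how the move could be guarded by the same lock as the other accessors. The class name SynchronizedMoveSketch and its read/getFile/close methods are hypothetical stand-ins, not the actual class in the patch. The sketch also passes the remaining byte count to transferTo, since its second argument is a maximum number of bytes to transfer rather than an end offset.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical stand-in for the class that owns the index file. Every method
// that touches 'fc' or 'lf' takes the same monitor, so a move never
// interleaves with a read issued from another thread.
class SynchronizedMoveSketch {
    private File lf;
    private FileChannel fc;

    SynchronizedMoveSketch(File file) throws IOException {
        this.lf = file;
        this.fc = new RandomAccessFile(file, "rw").getChannel();
    }

    synchronized int read(ByteBuffer dst, long position) throws IOException {
        return fc.read(dst, position);
    }

    synchronized File getFile() {
        return lf;
    }

    synchronized void close() throws IOException {
        fc.close();
    }

    synchronized void moveToNewLocation(File newFile, long size) throws IOException {
        // Long.MAX_VALUE (or any oversized value) means "copy the whole file".
        long toCopy = Math.min(size, fc.size());
        FileChannel newFc = new RandomAccessFile(newFile, "rw").getChannel();
        try {
            long written = 0;
            while (written < toCopy) {
                // Third argument is the target; second is the remaining byte count.
                long count = fc.transferTo(written, toCopy - written, newFc);
                if (count <= 0) {
                    throw new IOException("Copying to new location failed");
                }
                written += count;
            }
            newFc.force(true);
        } catch (IOException e) {
            newFc.close();
            throw e;
        }
        // Swap the channel and the file reference while still holding the lock.
        File oldFile = lf;
        fc.close();
        fc = newFc;
        lf = newFile;
        if (!oldFile.delete()) {
            System.err.println("Failed to delete the previous index file " + oldFile);
        }
    }
}
```

This only addresses the threading concern; a crash between the copy and the delete still leaves two files behind.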

Second, this is not a transactional operation. If the bookie server fails 
before deleting the old index file, two index files will exist after the 
bookie server restarts. That would introduce the issue described in 
BOOKKEEPER-374.

One possible solution is to record a transaction for the movement, say 
'file1' moved to 'file2'. If the bookie fails before deleting the original 
file, both 'file1' and 'file2' exist in the system. When the bookie server 
restarts, it replays the journal to ensure 'file1' is copied to 'file2' and 
'file1' is deleted before it starts serving requests.

We could leverage the Journal to record such a transaction.
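
A rough sketch of the record-then-replay idea follows. It uses a standalone text file as the journal purely for illustration; it is not the bookie's Journal class, and the names JournaledMoveSketch, logIntent, completeMove and replay, as well as the tab-separated MOVE/DONE record format, are all assumptions. Paths are assumed not to contain tab characters.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical intent log for file moves: a MOVE record is written before the
// copy starts and a DONE record after the source has been deleted.
class JournaledMoveSketch {
    private final Path journal;

    JournaledMoveSketch(Path journal) {
        this.journal = journal;
    }

    // Step 1: durably record the intent before touching either file.
    void logIntent(Path src, Path dst) throws IOException {
        append("MOVE\t" + src + "\t" + dst);
    }

    // Step 2: copy, delete the source, then mark the move as finished.
    // A missing source means the delete already happened, so only DONE is logged.
    // A real implementation would also force the copied data to disk before
    // deleting the source; Files.copy alone does not guarantee that.
    void completeMove(Path src, Path dst) throws IOException {
        if (Files.exists(src)) {
            // REPLACE_EXISTING overwrites a partial copy left by a crash.
            Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING);
            Files.delete(src);
        }
        append("DONE\t" + src + "\t" + dst);
    }

    void move(Path src, Path dst) throws IOException {
        logIntent(src, dst);
        completeMove(src, dst);
    }

    // Called on restart, before serving requests: finish every MOVE that has
    // no matching DONE record.
    void replay() throws IOException {
        if (!Files.exists(journal)) {
            return;
        }
        Map<String, Path[]> pending = new LinkedHashMap<>();
        for (String line : Files.readAllLines(journal, StandardCharsets.UTF_8)) {
            String[] f = line.split("\t");
            if (f.length != 3) {
                continue; // ignore a torn or malformed record
            }
            String key = f[1] + "\t" + f[2];
            if (f[0].equals("MOVE")) {
                pending.put(key, new Path[] {Paths.get(f[1]), Paths.get(f[2])});
            } else if (f[0].equals("DONE")) {
                pending.remove(key);
            }
        }
        for (Path[] m : pending.values()) {
            completeMove(m[0], m[1]);
        }
    }

    private void append(String record) throws IOException {
        Files.write(journal, (record + "\n").getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE, StandardOpenOption.WRITE,
                StandardOpenOption.APPEND, StandardOpenOption.SYNC);
    }
}
```

With this shape, a crash after logIntent but before the delete leaves a MOVE record with no DONE, and replay() redoes the copy from 'file1' and removes it, so only 'file2' remains when the server starts serving.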

Since this is not a small change, I would like to hear others' comments on 
it.


                
> Detect IOExceptions in LedgerCache and bookie should look at next ledger 
> dir(if any)
> ------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-346
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-346
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>    Affects Versions: 4.1.0
>            Reporter: Rakesh R
>            Assignee: Vinay
>             Fix For: 4.2.0
>
>         Attachments: BOOKKEEPER-346.patch, BOOKKEEPER-346.patch, 
> BOOKKEEPER-346.patch, BOOKKEEPER-346.patch
>
>
> This jira is to detect IOExceptions in "LedgerCache" and iterate over all the 
> configured ledger dir(s).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
