[jira] [Commented] (HBASE-4282) Potential data loss in retries of WAL close introduced in HBASE-4222

stack (Commented) (JIRA) Thu, 06 Oct 2011 14:11:56 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-4282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13122297#comment-13122297
 ]


stack commented on HBASE-4282:
------------------------------

On v3, the txids are pretty useless, at least out in the logs?  No harm in 
logging them, I suppose, but is there anything I can infer given a txid?

Why this change?


{code}
-            if (unflushedEntries.get() <= syncedTillHere) {
-              Thread.sleep(this.optionalFlushInterval);
-            }
+            Thread.sleep(this.optionalFlushInterval);
{code}
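
To make the question concrete, here is a minimal sketch of how the two variants differ. It assumes the surrounding loop calls sync() right after the sleep; that loop is not shown in the hunk above, so treat it as an assumption. The class and method names are hypothetical and are not part of the patch.

```java
// Hypothetical sketch, not HBase code: models only the sleep decision
// from the hunk above, under the assumption that a sync follows it.
public class SyncerSleepSketch {

    /** Removed variant: sleep only when every appended entry is already synced. */
    static long oldSleepMillis(long unflushedEntries, long syncedTillHere,
                               long optionalFlushInterval) {
        return unflushedEntries <= syncedTillHere ? optionalFlushInterval : 0L;
    }

    /** Added variant: always sleep the full interval, whatever is pending. */
    static long newSleepMillis(long unflushedEntries, long syncedTillHere,
                               long optionalFlushInterval) {
        return optionalFlushInterval;
    }

    public static void main(String[] args) {
        long interval = 1000L;
        // Five entries appended, three synced: two entries are still pending.
        System.out.println("old variant sleeps " + oldSleepMillis(5, 3, interval) + " ms");
        System.out.println("new variant sleeps " + newSleepMillis(5, 3, interval) + " ms");
        // Nothing pending: both variants wait for the full interval.
        System.out.println("old variant sleeps " + oldSleepMillis(3, 3, interval) + " ms");
        System.out.println("new variant sleeps " + newSleepMillis(3, 3, interval) + " ms");
    }
}
```

Under that assumption, the only behavioral difference is the pending case: the removed check skips the sleep when entries are waiting, while the new line always waits the full interval.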


Swap these lines on commit, so the mini cluster shuts down before the test 
dir is cleaned up?

{code}
+    TEST_UTIL.cleanupTestDir();
+    TEST_UTIL.shutdownMiniCluster();
{code}


This is a good thing to assert:

{code}
+    assertTrue("Need HDFS-826 for this test", log.canGetCurReplicas());
{code}

A similar assertion over in TestLogRolling found an issue in 205 RC1.

Nice test.

> Potential data loss in retries of WAL close introduced in HBASE-4222
> --------------------------------------------------------------------
>
>                 Key: HBASE-4282
>                 URL: https://issues.apache.org/jira/browse/HBASE-4282
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0, 0.94.0, 0.90.5
>            Reporter: Gary Helmling
>            Assignee: Gary Helmling
>            Priority: Blocker
>             Fix For: 0.92.0, 0.90.5
>
>         Attachments: HBASE-4282_0.90_2.patch, HBASE-4282_trunk_2.patch, 
> HBASE-4282_trunk_3.patch, HBASE-4282_trunk_prelim.patch
>
>
> The ability to ride over WAL close errors on log rolling added in HBASE-4222 
> could lead to missing HLog entries if:
> * A table has DEFERRED_LOG_FLUSH=true
> * There are unflushed WALEdit entries for that table in the current 
> SequenceFile writer buffer
> Since the writes were already acknowledged to the client, just ignoring the 
> close error to allow for another log roll doesn't seem like the right thing 
> to do here.
> We could easily flag this state and only ride over the close error if there 
> aren't unflushed entries.  This would bring the above condition back to the 
> previous behavior of aborting the region server.  However, aborting the 
> region server in this state is still guaranteeing data loss.  Is there 
> anything we can do better in this case?  
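
The "flag this state" idea in the description above could be sketched roughly as follows. This is only an illustration under stated assumptions: the class, enum, and method names are hypothetical, and the sketch does not reflect how any attached patch actually implements the check.

```java
// Hypothetical sketch, not HBase code: decide how to react to a WAL
// close error depending on whether acknowledged edits are still buffered.
public class CloseErrorPolicySketch {

    enum Action { RIDE_OVER, ABORT }

    /**
     * Ride over the close error only when no unflushed entries remain;
     * otherwise fall back to the previous behavior of aborting.
     */
    static Action onCloseError(boolean hasUnflushedEntries) {
        return hasUnflushedEntries ? Action.ABORT : Action.RIDE_OVER;
    }

    public static void main(String[] args) {
        // Deferred-flush edits still sit in the writer buffer.
        System.out.println(onCloseError(true));
        // Everything was already flushed, so skipping the close loses nothing.
        System.out.println(onCloseError(false));
    }
}
```

As the description notes, the ABORT branch still loses the buffered edits; the sketch only captures the decision, not an answer to the open question.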

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
