[ 
https://issues.apache.org/jira/browse/HBASE-4282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Helmling updated HBASE-4282:
---------------------------------

    Fix Version/s: 0.94.0
          Summary: RegionServer should abort when WAL close encounters an error 
with unflushed edits  (was: Potential data loss in retries of WAL close 
introduced in HBASE-4222)
     Hadoop Flags: Reviewed

Updated the title to make clear that this change restores the prior behavior: 
the region server aborts when riding over the close error would lose data.
                
> RegionServer should abort when WAL close encounters an error with unflushed 
> edits
> ---------------------------------------------------------------------------------
>
>                 Key: HBASE-4282
>                 URL: https://issues.apache.org/jira/browse/HBASE-4282
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0, 0.94.0, 0.90.5
>            Reporter: Gary Helmling
>            Assignee: Gary Helmling
>            Priority: Blocker
>             Fix For: 0.92.0, 0.94.0, 0.90.5
>
>         Attachments: HBASE-4282_0.90_2.patch, HBASE-4282_0.90_final.patch, 
> HBASE-4282_0.92_final.patch, HBASE-4282_trunk_2.patch, 
> HBASE-4282_trunk_3.patch, HBASE-4282_trunk_final.patch, 
> HBASE-4282_trunk_prelim.patch
>
>
> The ability to ride over WAL close errors on log rolling added in HBASE-4222 
> could lead to missing HLog entries if:
> * A table has DEFERRED_LOG_FLUSH=true
> * There are unflushed WALEdit entries for that table in the current 
> SequenceFile writer buffer
> Since the writes were already acknowledged to the client, just ignoring the 
> close error to allow for another log roll doesn't seem like the right thing 
> to do here.
> We could easily flag this state and only ride over the close error if there 
> aren't unflushed entries.  This would bring the above condition back to the 
> previous behavior of aborting the region server.  However, aborting the 
> region server in this state still guarantees data loss.  Is there 
> anything better we can do in this case?  
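
The "flag this state" idea in the description can be sketched roughly as 
follows. This is only an illustration under stated assumptions, not the 
attached patch: WalCloseSketch, its nested Writer interface, and the 
rollWriter method are hypothetical names invented here and are not HBase 
APIs. The sketch counts appends since the last successful sync and rides 
over a close failure only when that count is zero.

```java
import java.io.IOException;

// Hypothetical sketch: none of these names come from HBase itself.
// Single-threaded for brevity; a real log would need synchronization.
final class WalCloseSketch {

    /** Hypothetical stand-in for the underlying log file writer. */
    interface Writer {
        void append(String edit) throws IOException;
        void sync() throws IOException;
        void close() throws IOException;
    }

    private Writer writer;
    // Edits appended (and possibly acknowledged) since the last successful sync.
    private int unflushedEntries = 0;

    WalCloseSketch(Writer initial) {
        this.writer = initial;
    }

    void append(String edit) throws IOException {
        writer.append(edit);
        unflushedEntries++;
    }

    void sync() throws IOException {
        writer.sync();
        unflushedEntries = 0; // everything appended so far is now durable
    }

    /**
     * Rolls to the next writer. Returns true if a close error was ridden
     * over. Rethrows when unflushed entries exist, so that the caller can
     * abort instead of silently dropping acknowledged edits.
     */
    boolean rollWriter(Writer next) throws IOException {
        boolean rodeOverError = false;
        try {
            writer.close();
        } catch (IOException e) {
            if (unflushedEntries > 0) {
                throw new IOException("WAL close failed with "
                        + unflushedEntries + " unflushed entries", e);
            }
            rodeOverError = true; // nothing buffered, safe to continue
        }
        writer = next;
        unflushedEntries = 0;
        return rodeOverError;
    }
}
```

In this sketch the caller would treat the exception from rollWriter as the 
signal to abort the region server, which matches the previous behavior 
described above. It does not address the open question of whether the 
buffered edits can be recovered.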

