[
https://issues.apache.org/jira/browse/HBASE-2238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jean-Daniel Cryans updated HBASE-2238:
--------------------------------------
Summary: Review all transitions -- compactions, splits, region opens,
log roll/splitting -- for crash-proofness and atomicity (was: Review all
transitions -- compactions, splits, region opens, log splitting -- for
crash-proofness and atomicity)
Fix Version/s: 0.21.0
Priority: Blocker (was: Major)
I'm upgrading this to a blocker for 0.21: any GC pause that kills a RS which then rolls
its log after waking up and still takes some edits can result in data loss.
> Review all transitions -- compactions, splits, region opens, log
> roll/splitting -- for crash-proofness and atomicity
> ---------------------------------------------------------------------------------------------------------------------
>
> Key: HBASE-2238
> URL: https://issues.apache.org/jira/browse/HBASE-2238
> Project: HBase
> Issue Type: Bug
> Reporter: stack
> Priority: Blocker
> Fix For: 0.21.0
>
>
> This issue is about reviewing state transitions in hbase to ensure we're
> sufficiently hardened against crashes. I see this as an umbrella issue
> under which we'd look at compactions, splits, log splits, region opens --
> what else is there? We'd look at each in turn to see how we survive a crash
> at any point during the transition. For example, we think compactions are
> idempotent, but we need to prove it. Splits are certainly not, at least not
> at the moment (witness disabled parents with daughters missing, or only one
> of them available).
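> As a purely illustrative sketch (the class and method names below are made
> up, not the actual HBase code path), the property we'd want to prove for
> compactions is the usual write-to-tmp-then-rename commit, where a crash
> before the rename leaves the store untouched and a crash after it leaves
> only cleanup to redo:
> {code}
> import java.io.IOException;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
>
> public class CompactionCommitSketch {
>   // Commit a compacted file that was written under a tmp dir by renaming
>   // it into the store dir, then removing the files it replaces.
>   public static void commit(FileSystem fs, Path tmpCompactedFile,
>       Path storeDir, Path[] replacedFiles) throws IOException {
>     Path dst = new Path(storeDir, tmpCompactedFile.getName());
>     // Crash before this rename: the store is untouched and the whole
>     // compaction can simply be redone (idempotent).
>     if (!fs.rename(tmpCompactedFile, dst)) {
>       throw new IOException("rename failed: " + tmpCompactedFile);
>     }
>     // Crash after the rename but before the deletes: old and new files
>     // coexist, so recovery on region open must finish the cleanup.
>     for (Path old : replacedFiles) {
>       fs.delete(old, false);
>     }
>   }
> }
> {code}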
> Part of this issue would be writing tests that aim to break transitions.
> In light of the above, here is a recent off-list note from Todd Lipcon (and
> "Another"):
> {code}
> I thought a bit more last night about the discussion we were having
> regarding various HBase components doing operations on the HDFS data,
> and ensuring that in various racy scenarios we don't have two
> region servers or masters overlapping.
> I came to the conclusion that ZK data can't be used to actually have
> effective locks on HDFS directories, since we can never know that we
> still have a ZK lock when we do an operation. Thus the operations
> themselves have to be idempotent, or recoverable in the case of
> multiple nodes trying to do the same thing. Or, we have to use HDFS
> itself as a locking mechanism - this is what we discussed: using write
> leases essentially as locks.
> Since I didn't really trust myself, I ran my thoughts by "Another"
> and he concurs (see
> below). Figured this is food for thought for designing HBase data
> management to be completely safe/correct.
> ...
> ---------- Forwarded message ----------
> From: Another <[email protected]>
> Date: Wed, Feb 17, 2010 at 10:50 AM
> Subject: locks
> To: Todd Lipcon <[email protected]>
> Short answer is no, you're right.
> Because HDFS and ZK are partitioned (in the sense that there's no
> communication between them) and there may be an unknown delay between
> acquiring the lock and performing the operation on HDFS, you have no
> way of knowing that you still own the lock, like you say.
> If the lock cannot be revoked while you have it (no timeouts) then you
> can atomically check that you still have the lock and do the operation
> on HDFS, because checking is a no-op. Designing a system with no lock
> revocation in the face of failures is an exercise for the reader :)
> The right way is for HDFS and ZK to communicate to construct an atomic
> operation. ZK could give a token to the client which it also gives to
> HDFS, and HDFS uses that token to do admission control. There's
> probably some neat theorem about causality and the impossibility of
> doing distributed locking without a sufficiently strong atomic
> primitive here.
> Another
> {code}
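> The token idea at the end of the note is essentially a fencing token. Here
> is a purely illustrative sketch of the storage-side admission control it
> describes (nothing like this class exists in HDFS; the names are invented),
> assuming the lock service hands out a monotonically increasing token with
> each grant and every mutation carries the token of the lock it was issued
> under:
> {code}
> import java.io.IOException;
>
> public class FencedStoreSketch {
>   private long highestTokenSeen = -1;
>
>   // Check-and-apply happen under the same monitor, which is exactly the
>   // atomicity a client-side "do I still hold the ZK lock?" check can't get.
>   public synchronized void mutate(long fencingToken, Runnable mutation)
>       throws IOException {
>     if (fencingToken < highestTokenSeen) {
>       // A newer lock holder has already written; the caller lost its lock
>       // somewhere between acquiring it and issuing this operation.
>       throw new IOException("stale fencing token " + fencingToken);
>     }
>     highestTokenSeen = fencingToken;
>     mutation.run();
>   }
> }
> {code}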