On Mon, Nov 22, 2010 at 1:58 PM, David Jeske <dav...@gmail.com> wrote:
> On Mon, Nov 22, 2010 at 11:52 AM, Todd Lipcon <t...@lipcon.org> wrote:
>
>> Not quite. The replica synchronization code is pretty messy, but basically
>> it will take the longest replica that may have been synced, not a quorum.
>>
>> i.e. the guarantee is that "if you successfully sync() data, it will be
>> present after replica synchronization". Unsynced data *may* be present after
>> replica synchronization.
>>
>> But keep in mind that recovery is blocking in most cases, i.e. if the RS is
>> writing to a pipeline and waiting on acks, and one of the nodes in the
>> pipeline dies, then it will recover the pipeline (without the dead node) and
>> continue syncing to the remaining two nodes. The client is still blocked at
>> this point.
>>
>
> I see. So it sounds like my statement #1 was wrong. Will the RS ever
> time out the write and fail in the face of not being able to push it to HDFS?
>
> Is it correct to say:
>
> Once a write is issued to HBase, it will either catastrophically fail (i.e.
> disconnect), in which case the write will either have failed or succeeded,
> and if it succeeded, future reads will always show that write? As opposed to
> Cassandra, which in all configurations where reads allow a subset of all
> nodes, can "fail" a write while having the write show a temporary period of
> inconsistency (depending on who you talk to) followed by the write either
> applying or not applying depending on whether or not it actually wrote to a
> single node during the "failure to meet the write consistency request"?
>

Yes, this seems accurate to me.

>
> Does Cassandra have any return result which distinguishes between these two
> states:
>
> 1 - your data was not written to any nodes (true failure)
> 2 - your data was written to at least 1 node, but not enough to meet your
> write-consistency count
>
> ?
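
Going back to the replica synchronization rule quoted at the top of this message, a toy model may make the guarantee easier to see. This is only a sketch under simplifying assumptions: the function names `acked_length` and `recover` are made up for illustration and are not HDFS or HBase APIs, replicas are modeled as plain lists of edits, and a sync() is assumed to succeed only once every node in the pipeline holds the data.

```python
# Toy model only: names and structure are illustrative, not HDFS or HBase APIs.

def acked_length(pipeline):
    """Assumption: a sync() succeeds only once every replica in the pipeline
    holds the data, so the synced prefix is as long as the shortest replica."""
    return min(len(replica) for replica in pipeline)


def recover(replicas):
    """Recovery as described in the quoted text: keep the longest surviving
    replica rather than taking a quorum."""
    return max(replicas, key=len)


# Three replicas of one log block. The first two also received a trailing
# edit that was never acknowledged by the third node (i.e. unsynced).
replicas = [
    ["edit1", "edit2", "edit3"],
    ["edit1", "edit2", "edit3"],
    ["edit1", "edit2"],
]
synced = acked_length(replicas)  # number of edits covered by a successful sync()

# One node dies; recovery runs over the survivors.
survivors = replicas[1:]
recovered = recover(survivors)

# Everything that was synced is still present after recovery.
assert recovered[:synced] == ["edit1", "edit2"]
```

In this model the unsynced "edit3" happens to survive because a replica containing it is still alive; had only the shortest replica survived, it would be lost, which matches "unsynced data *may* be present".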