> > Will the write call to HBase block until the record written is fully
> > replicated ?
> no. data isn't written to disk immediately

Not so black and white.

Full replication in HDFS != writes to disk. Full replication means an
acknowledgement that there are replicas at all DataNodes in the write
pipeline, and with rack-aware placement that includes at least one
non-rack-local replica. In practice this is good enough to give HDFS 5 or
6 nines of data availability; Hortonworks had a blog post about that
recently.

In our production deployment we have patched our DataNodes to call
fsync() when a block write completes. This provides some marginal
improvement over the default for the case where power is suddenly lost to
the whole datacenter, but marginal is the key word here.
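
For illustration, a minimal sketch of the client-side semantics described
above, using the Syncable calls (hflush()/hsync()) that newer Hadoop
releases expose; the path is a placeholder and a running cluster is
assumed:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class PipelineAckExample {
        public static void main(String[] args) throws Exception {
            // Assumes fs.defaultFS points at a running HDFS cluster.
            FileSystem fs = FileSystem.get(new Configuration());
            FSDataOutputStream out = fs.create(new Path("/tmp/ack-example"));
            out.write("an edit".getBytes("UTF-8"));

            // hflush() returns once every DataNode in the pipeline has
            // acknowledged the bytes -- "fully replicated" in the sense
            // above -- but the data may still be in OS buffers, not on disk.
            out.hflush();

            // hsync() additionally asks the DataNodes to fsync() to disk,
            // roughly what our DataNode patch does at block completion.
            out.hsync();

            out.close();
        }
    }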

 
Best regards,


- Andy


Problems worthy of attack prove their worth by hitting back. - Piet Hein (via 
Tom White)


>________________________________
>From: Joseph Boyd <[email protected]>
>To: [email protected]
>Sent: Wednesday, August 31, 2011 4:04 AM
>Subject: Re: HBase and Cassandra on StackOverflow
>
>On Tue, Aug 30, 2011 at 12:22 PM, Sam Seigal <[email protected]> wrote:
>>
>> Will the write call to HBase block until the record written is fully
>> replicated ?
>
>no. data isn't written to disk immediately
>
>> If not (since it is happening at the block level), then isn't there a
>> window where, if a region server goes down, the data might not be
>> available anywhere else until it comes back up?
>
>the data would be in the write ahead log.
>
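>A minimal sketch of that write path from the client side, using the
>0.90-era HBase client API (table, family, and values are placeholders):
>
>    import org.apache.hadoop.conf.Configuration;
>    import org.apache.hadoop.hbase.HBaseConfiguration;
>    import org.apache.hadoop.hbase.client.HTable;
>    import org.apache.hadoop.hbase.client.Put;
>    import org.apache.hadoop.hbase.util.Bytes;
>
>    public class WalPutExample {
>        public static void main(String[] args) throws Exception {
>            Configuration conf = HBaseConfiguration.create();
>            HTable table = new HTable(conf, "mytable");
>            Put put = new Put(Bytes.toBytes("row1"));
>            put.add(Bytes.toBytes("cf"), Bytes.toBytes("col"),
>                    Bytes.toBytes("value"));
>            // The edit is appended to the region server's WAL in HDFS
>            // before put() returns, so if the server dies the edit is
>            // replayed from the log, not lost with the in-memory MemStore.
>            put.setWriteToWAL(true); // the default; false trades
>                                     // durability for speed
>            table.put(put);
>            table.close();
>        }
>    }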
>
>...joe
>
>
>> On Tue, Aug 30, 2011 at 9:17 AM, Andrew Purtell <[email protected]> wrote:
>>
>> > > Is the replication strategy for HBase completely reliant on HDFS' block
>> > > replication pipelining ?
>> >
>> > Yes.
>> >
>> > > Is this replication process asynchronous ?
>> >
>> >
>> > No.
>> >
>> > Best regards,
>> >
>> >
>> >        - Andy
>> >
>> > Problems worthy of attack prove their worth by hitting back. - Piet Hein
>> > (via Tom White)
>> >
>> >
>> > >________________________________
>> > >From: Sam Seigal <[email protected]>
>> > >To: [email protected]; Andrew Purtell <[email protected]>
>> > >Cc: "[email protected]" <[email protected]>
>> > >Sent: Tuesday, August 30, 2011 7:35 PM
>> > >Subject: Re: HBase and Cassandra on StackOverflow
>> > >
>> > >A question inline:
>> > >
>> > >On Tue, Aug 30, 2011 at 2:47 AM, Andrew Purtell <[email protected]> wrote:
>> > >
>> > >> Hi Chris,
>> > >>
>> > >> Appreciate your answer on the post.
>> > >>
>> > >> Personally speaking, however, the endless Cassandra vs. HBase
>> > >> discussion is tiresome, and rarely do blog posts or emails in this
>> > >> regard shed any light. Often, Cassandra proponents misstate their
>> > >> case out of ignorance of HBase or due to commercial or personal
>> > >> agendas. It is difficult to find clear-eyed analysis among the
>> > >> partisans. I'm not sure it will make any difference posting a
>> > >> rebuttal to some random thing jbellis says. Better to focus on
>> > >> improving HBase than to play whack-a-mole.
>> > >>
>> > >>
>> > >> Regarding some of the specific points in that post:
>> > >>
>> > >> HBase is proven in production deployments larger than the largest
>> > >> publicly reported Cassandra cluster, ~1K versus 400 or 700 or
>> > >> somesuch. But basically this is the same order of magnitude, with
>> > >> HBase having a slight edge. I don't see a meaningful difference
>> > >> here. Stating otherwise is false.
>> > >>
>> > >> HBase supports replication between clusters (i.e. data centers). I
>> > >> believe, but admit I'm not super familiar with the Cassandra option
>> > >> here, that the main difference is that HBase provides a simple
>> > >> mechanism and the user must build a replication architecture that
>> > >> works for them, while Cassandra attempts to hide some of that
>> > >> complexity. I do not know if they succeed there, but large-scale
>> > >> cross-data-center replication is rarely one-size-fits-all, so I
>> > >> doubt it.
>> > >>
>> > >> Cassandra does not have strong consistency in the sense that HBase
>> > >> provides. It can provide strong consistency, but at the cost of
>> > >> failing any read if there is insufficient quorum. HBase/HDFS does
>> > >> not have that limitation. On the other hand, HBase has its own and
>> > >> different scenarios where data may not be immediately available. The
>> > >> differences between the systems are nuanced, and which to use
>> > >> depends on the use case requirements.
>> > >>
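>> > >> To make the quorum trade-off concrete, a small worked sketch (the
>> > >> replica count is an assumption, not a quote from either system):
>> > >>
>> > >>     public class QuorumMath {
>> > >>         public static void main(String[] args) {
>> > >>             int n = 3;              // replicas per key (assumed)
>> > >>             int quorum = n / 2 + 1; // = 2 when n = 3
>> > >>             // Strong consistency needs R + W > N, e.g. W = 2 and
>> > >>             // R = 2. If two of the three replicas are unreachable,
>> > >>             // only one is left, so a quorum read or write must
>> > >>             // fail rather than risk returning stale data.
>> > >>             System.out.println("quorum(" + n + ") = " + quorum);
>> > >>         }
>> > >>     }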
>> > >>
>> > >I have a question regarding this point. Is the replication strategy
>> > >for HBase completely reliant on HDFS' block replication pipelining?
>> > >Is this replication process asynchronous? If it is, then is there not
>> > >a window where, if a machine dies and the replication pipeline for a
>> > >particular block has not started yet, that block will be unavailable
>> > >until the machine comes back up? Sorry if I am missing something
>> > >important here.
>> > >
>> > >
>> > >> Cassandra's RandomPartitioner / hash-based partitioning means
>> > >> efficient MapReduce or table scanning is not possible, whereas
>> > >> HBase's distributed ordered tree is naturally efficient for such use
>> > >> cases, which I believe explains why Hadoop users often prefer it.
>> > >> This may or may not be a problem for any given use case. Using an
>> > >> ordered partitioner with Cassandra used to require frequent manual
>> > >> rebalancing to avoid blowing up nodes; I don't know if more recent
>> > >> versions still have this mis-feature.
>> > >>
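>> > >> As a sketch of why the ordered layout matters, a range scan with
>> > >> the 0.90-era HBase client touches only the regions covering the key
>> > >> range (table and row keys here are placeholders):
>> > >>
>> > >>     import org.apache.hadoop.conf.Configuration;
>> > >>     import org.apache.hadoop.hbase.HBaseConfiguration;
>> > >>     import org.apache.hadoop.hbase.client.HTable;
>> > >>     import org.apache.hadoop.hbase.client.Result;
>> > >>     import org.apache.hadoop.hbase.client.ResultScanner;
>> > >>     import org.apache.hadoop.hbase.client.Scan;
>> > >>     import org.apache.hadoop.hbase.util.Bytes;
>> > >>
>> > >>     public class RangeScanExample {
>> > >>         public static void main(String[] args) throws Exception {
>> > >>             Configuration conf = HBaseConfiguration.create();
>> > >>             HTable table = new HTable(conf, "mytable");
>> > >>             // Rows come back in key order, read only from regions
>> > >>             // overlapping [user100, user200); a hash partitioner
>> > >>             // would have to ask every node for the same query.
>> > >>             Scan scan = new Scan(Bytes.toBytes("user100"),
>> > >>                                  Bytes.toBytes("user200"));
>> > >>             ResultScanner scanner = table.getScanner(scan);
>> > >>             for (Result r : scanner) {
>> > >>                 System.out.println(Bytes.toString(r.getRow()));
>> > >>             }
>> > >>             scanner.close();
>> > >>             table.close();
>> > >>         }
>> > >>     }
>> > >>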
>> > >> Cassandra is no less complex than HBase. All of this complexity is
>> > >> "hidden" in the sense that with Hadoop/HBase the layering is obvious
>> > >> -- HDFS, HBase, etc. -- but the Cassandra internals are no less
>> > >> layered. An impartial analysis of implementation and algorithms will
>> > >> reveal that Cassandra's theory of operation in its full detail is
>> > >> substantially more complex. Compare the BigTable and Dynamo papers
>> > >> and this is clear. There are actually more opportunities for
>> > >> something to go wrong with Cassandra.
>> > >>
>> > >> While we are looking at codebases, it should be noted that HBase has
>> > >> substantially more unit tests.
>> > >>
>> > >> With Cassandra, all RPC is via Thrift with various wrappers, so
>> > >> actually all Cassandra clients are second-class in the sense that
>> > >> jbellis means when he states "Non-Java clients are not second-class
>> > >> citizens".
>> > >>
>> > >> The master-slave versus peer-to-peer argument is larger than
>> > >> Cassandra vs. HBase, and not nearly as one-sided as claimed. The
>> > >> famous (infamous?) global failure of Amazon's S3 in 2008, a fully
>> > >> peer-to-peer system, due to a single flipped bit in a gossip message
>> > >> demonstrates how in peer-to-peer systems every node can be a single
>> > >> point of failure. There is no obvious winner; instead, there is a
>> > >> series of trade-offs. Claiming otherwise is intellectually
>> > >> dishonest. Master-slave architectures seem easier to operate and
>> > >> reason about, in my experience. Of course, I'm partial there.
>> > >>
>> > >> I have just scratched the surface.
>> > >>
>> > >>
>> > >> Best regards,
>> > >>
>> > >>
>> > >>        - Andy
>> > >>
>> > >> Problems worthy of attack prove their worth by hitting back. - Piet Hein
>> > >> (via Tom White)
>> > >>
>> > >>
>> > >> >________________________________
>> > >> >From: Chris Tarnas <[email protected]>
>> > >> >To: [email protected]
>> > >> >Sent: Tuesday, August 30, 2011 2:02 PM
>> > >> >Subject: HBase and Cassandra on StackOverflow
>> > >> >
>> > >> >Someone with better knowledge than I might be interested in
>> > >> >helping answer this question over at StackOverflow:
>> > >> >
>> > >> >http://stackoverflow.com/questions/7237271/large-scale-data-processing-hbase-cassandra
>> > >> >
>> > >> >-chris
>> > >> >
>> > >> >
>> > >>
>> > >
>> > >
>> > >
>> >
>
>
>
