I think HDFS-200 does call fflush, so it will get to the OS buffers...
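
To make the "OS buffers" distinction concrete, here is a minimal local-file sketch in Python. It is only an analogy, not HDFS or HBase code: the helper names append_record and read_records are made up for illustration, and a plain local file stands in for the WAL. flush() plays the role of getting data into the OS cache, and os.fsync() plays the role of the sync-to-physical-media step discussed below.

```python
import os
import tempfile


def append_record(path, record, durable):
    """Append one record to a log file (hypothetical helper, local-file analogy only)."""
    with open(path, "ab") as f:
        f.write(record + b"\n")
        # flush() drains Python's user-space buffer into the OS page cache.
        # The data is now visible to other readers, but a power loss
        # could still lose it because it may not be on the device yet.
        f.flush()
        if durable:
            # fsync() asks the kernel to write the cached data for this
            # file out to the storage device before returning.
            os.fsync(f.fileno())


def read_records(path):
    """Return all records currently readable from the log file."""
    with open(path, "rb") as f:
        return f.read().splitlines()


if __name__ == "__main__":
    log_path = os.path.join(tempfile.mkdtemp(), "wal.log")
    append_record(log_path, b"put row1", durable=False)  # OS cache only
    append_record(log_path, b"put row2", durable=True)   # forced to the device
    print(read_records(log_path))
```

Both records are readable immediately after the calls return; the difference only shows up if the machine loses power before the kernel writes the cached data out on its own.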

-ryan

On Sat, May 8, 2010 at 8:28 PM, Todd Lipcon <t...@cloudera.com> wrote:
> I think the point they were trying to make in the YCSB paper is that,
> even with the WAL and hflush(), the data does not get synced to disk.
> hflush() only ensures that the WAL data has made it to three
> datanodes' OS caches, but doesn't actually guarantee anything is on
> physical media.
> I agree it's not clearly articulated, but that's what's happening. The
> API that will cause fsync() on the DNs is called hsync() and has not
> been written yet.
> -Todd
> On Sat, May 8, 2010 at 6:12 PM, Ryan Rawson <ryano...@gmail.com> wrote:
>>
>> Yes that section is very misleading. What is actually happening is like so:
>>
>> - Every time you write to HBase, the data is written to a Write Ahead Log.
>> - If there is a regionserver failure, the log is replayed to recover the data.
>> - Due to an HDFS bug, the data in the most recent log file, which is
>> rotated at 64MB by default, is lost.
>>
>> The good news is that serious effort is being undertaken to push
>> a version of HDFS without this bug. Hopefully within a week people
>> will be able to download a version of HDFS and not run into this
>> situation.
>>
>> -ryan
>>
>> On Sat, May 8, 2010 at 6:10 PM, MauMau <maumau...@gmail.com> wrote:
>> > Thanks Amandeep and Ryan,
>> >
>> > I have confirmed that, unlike Cassandra, HBase does not do in-memory
>> > replication. So the paragraph below from Yahoo's report is partly incorrect:
>> >
>> > Cassandra, sharded MySQL and PNUTS, all updates were
>> > synched to disk before returning to the client. HBase does
>> > not sync to disk, but relies on in-memory replication across
>> > multiple servers for durability; this increases write throughput
>> > and reduces latency, but can result in data loss on failure.
>> >
>> > Maumau
>> >
>> >
>> > ----- Original Message ----- From: "Ryan Rawson" <ryano...@gmail.com>
>> > To: <hbase-user@hadoop.apache.org>
>> > Sent: Sunday, May 09, 2010 7:10 AM
>> > Subject: Re: Does HBase do in-memory replication of rows?
>> >
>> >
>> > For more architectural details of HBase, check out the Bigtable paper;
>> > it's fairly detailed, short and accessible.
>> >
>> > On Sat, May 8, 2010 at 2:39 PM, Amandeep Khurana <ama...@gmail.com> wrote:
>> >>
>> >> HBase does not do in-memory replication. Your data goes into a region,
>> >> which has only one instance. Writes go to the write ahead log first,
>> >> which is written to disk. However, since HDFS doesn't yet have fully
>> >> working flush functionality, there is a chance of losing a chunk of
>> >> data. The next release of HBase will guarantee data durability, since
>> >> by then the flush functionality should be fully working.
>> >>
>> >> Regarding replication - the difference between Cassandra and HBase is
>> >> that when you do a write in Cassandra, it doesn't return until it has
>> >> written to W nodes, where W is configurable. In the case of HBase,
>> >> replication is taken care of by the filesystem (HDFS). When the region
>> >> is flushed to disk, HDFS replicates the HFiles (in which the data for
>> >> the regions is stored). For more details on how this works, read the
>> >> Bigtable paper and
>> >> http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html.
>> >
>> >
>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>
