There are many implications related to this.  The core trade-off as I see it is 
between storage and read performance.

With the current setup, after we read blocks from HDFS into memory, we can 
usher KeyValues straight from the on-disk format to the client without any 
further allocation or copies.  This is a highly desirable property.
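To make the duplication concrete, here's a rough Python sketch of the KeyValue 
layout as I understand it (field widths are illustrative of the idea; don't 
treat this as the authoritative on-disk spec):

```python
import struct

def serialize_keyvalue(row: bytes, family: bytes, qualifier: bytes,
                       timestamp: int, value: bytes) -> bytes:
    """Sketch of the KeyValue layout: the full row key and family name
    are repeated in every single cell, which is why long row keys and
    verbose family names inflate storage."""
    key = (struct.pack(">H", len(row)) + row +       # 2-byte row length + row
           struct.pack("B", len(family)) + family +  # 1-byte family length + family
           qualifier +                               # qualifier (length inferred)
           struct.pack(">q", timestamp) +            # 8-byte timestamp
           b"\x04")                                  # 1-byte type (e.g. Put)
    # 4-byte key length + 4-byte value length header, then key, then value
    return struct.pack(">II", len(key), len(value)) + key + value

cell = serialize_keyvalue(b"row-0001", b"cf", b"col", 1270000000000, b"v")
print(len(cell))  # the 1-byte value costs far more than 1 byte on disk
```

Because a cell is just one contiguous byte span in this layout, we can hand it 
to the client as-is; strip the row/family out of each cell and we'd have to 
rebuild all of this at read time.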

If we were to only keep what was absolutely necessary (could not be inferred or 
explicitly tracked in some way), then we would have to do a lot of work at read 
time to regenerate client-friendly data.

I'm not sure exactly what you mean by storing the row length at the beginning 
of each row.  Families are certainly the easiest of these optimizations to 
make, but they would change read behavior significantly.  It has been talked 
about, and there's probably a JIRA hanging around somewhere.

In the end, the HDFS/HBase philosophy is that disk/storage is cheap so we 
should do what we can (within reason) for read performance.

Much of this is mitigated by the use of compression.  Currently we only 
utilize block compression (gzip by default, LZO preferred).  BigTable uses a 
special prefix compression which is ideal for this duplication issue; maybe 
one day we could do that too.
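For the curious, here's a toy Python sketch of the prefix-compression idea 
(not HBase code, just the concept): since keys in a block are sorted, each key 
only needs to store how many leading bytes it shares with the previous key, 
plus the unshared suffix.

```python
def prefix_compress(sorted_keys):
    """Encode each key as (shared-prefix length, unshared suffix)
    relative to the previous key.  Repeated rows and family names in
    sorted order collapse to almost nothing."""
    out = []
    prev = b""
    for key in sorted_keys:
        shared = 0
        for a, b in zip(prev, key):
            if a != b:
                break
            shared += 1
        out.append((shared, key[shared:]))
        prev = key
    return out

keys = [b"row-0001/cf:col1", b"row-0001/cf:col2", b"row-0002/cf:col1"]
print(prefix_compress(keys))
```

After the first key, only the few differing suffix bytes get stored, so the 
per-cell row/family duplication largely disappears on disk while the read 
path can still reconstruct full keys with a cheap sequential decode.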

JG

> -----Original Message-----
> From: Matt Corgan [mailto:mcor...@hotpads.com]
> Sent: Wednesday, March 31, 2010 7:06 PM
> To: hbase-user@hadoop.apache.org
> Cc: a...@cloudera.com; jlhu...@cs.nctu.edu.tw; kevin_h...@tsmc.com
> Subject: Re: Data size
> 
> Out of curiosity, why is it necessary to store the family and row with
> every cell?  Aren't all the contents of a family confined to the same
> file, and couldn't a row length be stored at the beginning of each row
> or in a block index?  Is this true for values in the caches and
> memstore as well?
> 
> It could have drastic implications for storing rows with many small
> values but with long keys, long column names, and innocently verbose
> column family names.
> 
> Matt
> 
> 2010/3/31 alex kamil <alex.ka...@gmail.com>
> 
> > i would also suggest to check the dfs.replication setting in HDFS
> > (in conf/hdfs-site.xml)
> >
> > A-K
> >
> > 2010/3/31 Jean-Daniel Cryans <jdcry...@apache.org>
> >
> > > HBase is column-oriented; every cell is stored with the row,
> > > family, qualifier and timestamp, so every piece of data will bring
> > > a larger disk usage. Without any knowledge of your keys, I can't
> > > comment much more.
> > >
> > > Then HDFS keeps a trash, so every compacted file will end up
> > > there... if you just did the import, there will be a lot of these.
> > >
> > > Finally if you imported the data more than once, hbase keeps 3
> > > versions by default.
> > >
> > > So in short, is it reasonable? Answer: it depends!
> > >
> > > J-D
> > >
> > > 2010/3/31  <y_823...@tsmc.com>:
> > > > Hi,
> > > >
> > > > We've dumped Oracle data to files, then put these files into
> > > > different HBase tables.
> > > > The size of these files is 35G; we saw HDFS usage go up to 562G
> > > > after putting it into HBase.
> > > > Is that reasonable?
> > > > Thanks
> > > >
> > > >
> > > >
> > > > Fleming Chiu(邱宏明)
> > > > 707-6128
> > > > y_823...@tsmc.com
> > > > Meat-free Mondays to save the planet (Meat Free Monday Taiwan)
> > > >
> > > >
> > > >
> > >
