Hey guys,

You can't just turn off versioning. It's not an optional feature; it's a core part of how the storage architecture works. I can suggest both the Bigtable paper and this blog entry: http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html
to get a sense of what versions are for and why you can't "turn them off".

-ryan

On Thu, May 6, 2010 at 10:47 PM, Kevin Apte <technicalarchitect2...@gmail.com> wrote:
> If compression is used overhead of versioning is not significant. Many
> people want versioning of data for many reasons- including auditing and
> compliance. In some database systems, analyzing data is effective only if
> performed on the same version.
>
> I agree that if there is no need, versioning should be turned off.
>
> Kevin
>
> On Fri, May 7, 2010 at 10:49 AM, Takayuki Tsunakawa <
> tsunakawa.ta...@jp.fujitsu.com> wrote:
>
>> Hello, Kevin-san
>>
>> Yes, Hadoop DFS maintains three copies of the same data (version) at
>> the file system level. What I'm wondering about is the necessity of
>> different versions of cells by HBase at the database level.
>> Amazon SimpleDB, Microsoft Azure Table, and Google App Engine
>> Datastore do not provide versioning. So I felt that many people do not
>> have to use versioning and the default maximum versions of HBase had
>> better be
>>
>> Regards
>> Takayuki
>>
>> ----- Original Message -----
>> From: "Kevin Apte" <technicalarchitect2...@gmail.com>
>> To: <hbase-user@hadoop.apache.org>
>> Sent: Friday, May 07, 2010 1:51 PM
>> Subject: Re: How is column timestamp useful?
>>
>> > Hadoop philosophy is to deploy on low cost disks and keep 3 copies of data
>> > for redundancy. This ensures that the costs are very low- perhaps 5 to 10
>> > times lower than what large Enterprises are paying for expensive SAN
>> > configurations.
>> >
>> > This does not mean one needs to waste storage- If you store files
>> > compressed using gZip, multiple versions of a row may compress very well.
>> >
>> > Kevin
>> >
>> > On Fri, May 7, 2010 at 10:14 AM, tsuna <tsuna...@gmail.com> wrote:
>> >
>> >> In addition to what Ryan said, even if the default maximum number of
>> >> versions for a cell is 3 doesn't mean that you end up wasting space.
>> >> If you only ever write one version, that's what you end up paying for.
>> >>
>> >> --
>> >> Benoit "tsuna" Sigoure
>> >> Software Engineer @ www.StumbleUpon.com
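
To make tsuna's point concrete, here is a rough toy model of timestamped cell versions with a cap on how many are retained. It is only a sketch: `VersionedStore`, `put`, `get`, and `stored_versions` are hypothetical names invented for illustration and are not the HBase client API, and trimming eagerly on every write is a simplification made for this example, not a description of how HBase itself cleans up old versions.

```python
from collections import defaultdict


class VersionedStore:
    """Toy in-memory model of timestamped cell versions (NOT the HBase API)."""

    def __init__(self, max_versions=3):
        # Cap on retained versions per cell; 3 mirrors the default mentioned in the thread.
        self.max_versions = max_versions
        # (row, column) -> list of (timestamp, value), newest first
        self._cells = defaultdict(list)

    def put(self, row, column, value, timestamp):
        """Store a new version, then drop anything beyond max_versions."""
        versions = self._cells[(row, column)]
        versions.append((timestamp, value))
        versions.sort(key=lambda tv: tv[0], reverse=True)
        del versions[self.max_versions:]

    def get(self, row, column, versions=1):
        """Return up to `versions` newest (timestamp, value) pairs."""
        return list(self._cells.get((row, column), [])[:versions])

    def stored_versions(self, row, column):
        """Number of versions actually held for this cell."""
        return len(self._cells.get((row, column), []))


store = VersionedStore(max_versions=3)

# A cell written once holds exactly one version, whatever the cap is.
store.put("row1", "cf:col", "only-value", timestamp=1)

# A cell written five times keeps only the three newest versions.
for ts in range(1, 6):
    store.put("row2", "cf:col", f"value-{ts}", timestamp=ts)
```

In this model the cap is an upper bound rather than a reservation: `row1` stores one version and `row2` stores three, so a table that is only ever written once per cell pays for one version per cell.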