From some of MapR's presentations, I've gathered that they implement B-Trees instead of LSMs on top of their file system, which allows random writes. They also claim to convert random mutation requests to the B-Tree leaves into sequential writes. They also talk about mini-WALs for doing this, so there might be mini-LSMs going on. Not sure.
In any case, agreed: if there are LSMs, there are compactions. The LSM vs. B-Tree tradeoffs are well understood.

Enis

On Tue, Feb 19, 2013 at 12:12 AM, lars hofhansl <[email protected]> wrote:

> If you store data in LSM trees you need compactions.
> The advantage is that your data files are immutable.
> MapR has a mutable file system and they probably store their data in
> something more akin to B-Trees...?
> Or maybe they somehow avoid the expensive merge sorting of many small
> files. It seems that it has to be one or the other.
>
> (Maybe somebody from MapR reads this and can explain how it actually
> works.)
>
> Compactions let you trade random IO for sequential IO (just to state the
> obvious). It seems that you can't have it both ways.
>
> -- Lars
>
> ________________________________
> From: Otis Gospodnetic <[email protected]>
> To: [email protected]
> Sent: Monday, February 18, 2013 7:30 PM
> Subject: HBase without compactions?
>
> Hello,
>
> It's kind of funny, we run SPM, which includes SPM for HBase (performance
> monitoring service/tool for HBase essentially) and we currently store all
> performance metrics in HBase.
>
> I see a ton of HBase development activity, which is great, but it just
> occurred to me that I don't think I recall seeing anything about getting
> rid of compactions. Yet, compactions are the one thing that I know hurts us
> the most, and the one thing that MapR somehow got rid of in their
> implementation.
>
> Have there been any discussions, attempts, or thoughts about finding a way
> to avoid compactions?
>
> Thanks,
> Otis
> --
> HBASE Performance Monitoring - http://sematext.com/spm/index.html
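
To make the "merge sorting of many small files" point concrete, here is a minimal sketch of what a compaction does conceptually: a k-way merge of immutable, sorted runs into one sorted run, where the newest value for each key wins. This is not HBase's or MapR's actual code. The `compact` function, the `TOMBSTONE` marker, and the in-memory list representation of a run are all assumptions made for illustration.

```python
import heapq

# Hypothetical delete marker; real stores use their own tombstone encoding.
TOMBSTONE = None


def compact(runs):
    """Merge sorted runs into one sorted run, keeping the newest value per key.

    `runs` is ordered newest first. Each run is a list of (key, value) pairs
    sorted by key, with each key appearing at most once per run (assumption).
    """
    # Tag every entry with the age of its run, so that for equal keys the
    # entry from the newest run (lowest age) comes out of the merge first.
    tagged = (
        [(key, age, value) for key, value in run]
        for age, run in enumerate(runs)
    )

    merged = []
    last_key = object()  # sentinel that compares unequal to every real key
    # heapq.merge reads every input sequentially; this is the sequential IO
    # that compactions buy in exchange for rewriting the data.
    for key, age, value in heapq.merge(*tagged):
        if key == last_key:
            continue  # an older version of a key we already handled
        last_key = key
        if value is not TOMBSTONE:  # drop deleted keys, as a full merge can
            merged.append((key, value))
    return merged


newer = [("a", 1), ("c", TOMBSTONE)]
older = [("a", 0), ("b", 2), ("c", 3)]
result = compact([newer, older])
```

The inputs are never modified, which mirrors the immutability Lars mentions; the cost is that every surviving entry is rewritten into the output run.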
