Hi,

our machines have 24 GB of RAM (for 8 cores), of which HBase gets 6 GB. The map tasks each get 768 MB of memory.
Currently we're using CDH3b3. We'll definitely implement my idea of distributing the columns of each wide row across multiple rows, similar to what Friso suggested. A comment from somebody who has really wide rows would be interesting, though.

Thanks,
fnord

2010/11/22 Todd Lipcon <[email protected]>

> Hi,
>
> Which version are you using?
>
> During the 0.89 development series we got a bunch of new work in trunk
> (mostly thanks to Facebook and TrendMicro) for wide rows. Maybe one of the
> FB guys can comment, but I believe they have some very wide rows in their
> application.
>
> Thanks
> -Todd
>
> On Mon, Nov 22, 2010 at 2:01 AM, fnord 99 <[email protected]> wrote:
>
> > Hi all,
> >
> > I recently filled an hbase table with many millions of columns in each row
> > (!). The problem that now occurred was that I always get a Heap Space Error
> > from the JVM with a subsequent shutdown of all regionservers in which the
> > error occurs. Since the error isn't thrown in any of my own classes, I think
> > that the problem is the following:
> >
> > * a row is always completely read into memory upon access (at least all
> >   column families that I'm interested in)
> > * the Result object holds the complete family-qualifier-value pairs in a
> >   KeyValue[]
> > * this is sometimes too much to be handled by the physical memory each map
> >   can get, therefore a heap space error is thrown
> >
> > My question is now: is there any lazy fetching technique implemented within
> > the single key-values within one row? In my opinion it should be but I
> > couldn't find anything in the source code or wiki that hints to that.
> >
> > Any ideas on how to go around this problem? I had the idea to rebuild the
> > table schema to store more data in the row key and less data in the column
> > families which would make the tables "thinner" and "longer". It would work
> > in the current setup, however, it wouldn't solve the original problem...
> >
> > Thanks already in advance for any input on that,
> >
> > fnord999
> >
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
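
For anyone reading along, here is a minimal sketch of the "thinner and longer" schema idea from the thread: each cell of a wide logical row is assigned to one of a fixed number of narrower physical rows by hashing its column qualifier. This is plain Python with no HBase client involved. The function names, the `#` separator, and the default of 16 buckets are assumptions made for illustration, not anything taken from HBase or from our actual schema.

```python
import zlib


def _suffix_width(num_buckets: int) -> int:
    # Zero-pad bucket numbers so all bucket rows of one logical row sort together.
    return len(str(num_buckets - 1))


def bucketed_row_key(row_key: str, qualifier: str, num_buckets: int = 16) -> str:
    """Return the physical row key under which this cell would be stored.

    The bucket is derived from the qualifier, so the same cell always maps
    to the same physical row. Hypothetical helper, not an HBase API.
    """
    if num_buckets <= 0:
        raise ValueError("num_buckets must be positive")
    # crc32 is stable across processes, unlike Python's built-in hash().
    bucket = zlib.crc32(qualifier.encode("utf-8")) % num_buckets
    return f"{row_key}#{bucket:0{_suffix_width(num_buckets)}d}"


def all_bucket_keys(row_key: str, num_buckets: int = 16) -> list:
    """Return every physical row key that together makes up one logical row."""
    if num_buckets <= 0:
        raise ValueError("num_buckets must be positive")
    width = _suffix_width(num_buckets)
    return [f"{row_key}#{b:0{width}d}" for b in range(num_buckets)]
```

Reading a whole logical row then means fetching the keys from `all_bucket_keys`, so each fetch only has to hold roughly one bucket's worth of cells in memory. As noted in the thread, this only works around the heap problem; it does not make reads within a single row lazy. The sketch also assumes the original row keys never contain the `#` separator.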
