I would highly recommend reading the HBase book before deciding which configuration settings to change. The right values depend on the kind of load you expect and the kind of data you will be storing. For example, I use only a subset of the list you mentioned. Other settings that I change are hbase.hregion.max.filesize, hbase.rpc.timeout, hbase.regionserver.handler.count, hbase.hregion.majorcompaction, zookeeper.session.timeout, and hbase.regionserver.codecs. My GC settings also differ from the defaults.
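To make the list above concrete, here is a minimal sketch of how those overrides could be written out as an hbase-site.xml fragment. The property names are the ones I listed; every value is a placeholder I made up for illustration, not a recommendation and not what I actually run, and the units in the comments are as I understand them, so check them against the HBase book. The helper name render_hbase_site is hypothetical.

```python
import xml.etree.ElementTree as ET

# Property names come from the list above. The values are ILLUSTRATIVE
# PLACEHOLDERS (assumptions), not tuned or recommended settings.
EXAMPLE_OVERRIDES = {
    "hbase.hregion.max.filesize": "10737418240",   # bytes (10 GiB placeholder)
    "hbase.rpc.timeout": "120000",                 # milliseconds (placeholder)
    "hbase.regionserver.handler.count": "50",      # handler threads (placeholder)
    "hbase.hregion.majorcompaction": "604800000",  # milliseconds, 7 days (placeholder)
    "zookeeper.session.timeout": "90000",          # milliseconds (placeholder)
    "hbase.regionserver.codecs": "snappy",         # assumes this codec is installed
}


def render_hbase_site(overrides):
    """Render a dict of property name -> value in the Hadoop-style
    <configuration><property><name/><value/></property></configuration> layout."""
    root = ET.Element("configuration")
    for name, value in overrides.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    ET.indent(root)  # pretty-print; needs Python 3.9 or newer
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(render_hbase_site(EXAMPLE_OVERRIDES))
```

Keeping the overrides in one small dict makes it easy to see exactly which settings deviate from the defaults, which is the part worth reviewing against your own load and data.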
Yes, HBase can handle hundreds of thousands of columns. However, I have had a lot of problems when the row size and the individual column values were large (tens of MB in my case). I solved that by removing some storage redundancy that I had. Beyond that, I don't know which settings would really have helped me, so others can comment on this.

Thanks

On Mon, Mar 5, 2012 at 11:39 PM, Qian Ye <[email protected]> wrote:
> Hi all:
>
> I'm a newbie to HBase. Here are two questions about hbase in production
> environment. I would very appreciate it if anyone could give a help.
>
> 1. Which hbase configuration recommended to be set, rather than use the
> default, when using in production environment? So far, I knew that these
> parameters should be set as need,
> hbase.regionserver.handler.count
> hbase.client.write.buffer
> hbase.hregion.memstore.block.multiplier
> hbase.server.thread.wakefrequency
> hbase.regionserver.lease.period
> hbase.hstore.blockingStoreFiles
>
> 2. Can hbase handle a column family with hundreds of columns? Some columns
> may contain values whose size can reach about 20MB?
>
> Thanks for ur help
>
> --
> With Regards!
>
> Ye, Qian
>
