Yes, I have changed the block cache percentage to 0.35.

-Vibhav

On Mon, Apr 1, 2013 at 10:20 PM, Ted Yu <[email protected]> wrote:

> I was aware of that discussion which was about MAX_FILESIZE and BLOCKSIZE
>
> My suggestion was about block cache percentage.
>
> Cheers
>
>
> On Mon, Apr 1, 2013 at 4:57 AM, Vibhav Mundra <[email protected]> wrote:
>
> > I have used the following site:
> > http://grokbase.com/t/hbase/user/11bat80x7m/row-get-very-slow
> >
> > to lessen the value of block cache.
> >
> > -Vibhav
> >
> >
> > On Mon, Apr 1, 2013 at 4:23 PM, Ted <[email protected]> wrote:
> >
> > > Can you increase block cache size ?
> > >
> > > What version of hbase are you using ?
> > >
> > > Thanks
> > >
> > > On Apr 1, 2013, at 3:47 AM, Vibhav Mundra <[email protected]> wrote:
> > >
> > > > The typical size of each of my row is less than 1KB.
> > > >
> > > > Regarding the memory, I have used 8GB for Hbase regionservers and 4 GB for
> > > > datanodes and I dont see them completely used. So I ruled out the GC aspect.
> > > >
> > > > In case u still believe that GC is an issue, I will upload the gc logs.
> > > >
> > > > -Vibhav
> > > >
> > > >
> > > > On Mon, Apr 1, 2013 at 3:46 PM, ramkrishna vasudevan <
> > > > [email protected]> wrote:
> > > >
> > > >> Hi
> > > >>
> > > >> How big is your row? Are they wider rows and what would be the size of
> > > >> every cell?
> > > >> How many read threads are getting used?
> > > >>
> > > >>
> > > >> Were you able to take a thread dump when this was happening? Have you seen
> > > >> the GC log?
> > > >> May be need some more info before we can think of the problem.
> > > >>
> > > >> Regards
> > > >> Ram
> > > >>
> > > >>
> > > >> On Mon, Apr 1, 2013 at 3:39 PM, Vibhav Mundra <[email protected]> wrote:
> > > >>
> > > >>> Hi All,
> > > >>>
> > > >>> I am trying to use Hbase for real-time data retrieval with a timeout of 50
> > > >>> ms.
> > > >>>
> > > >>> I am using 2 machines as datanode and regionservers,
> > > >>> and one machine as a master for hadoop and Hbase.
> > > >>>
> > > >>> But I am able to fire only 3000 queries per sec and 10% of them are timing
> > > >>> out.
> > > >>> The database has 60 million rows.
> > > >>>
> > > >>> Are these figure okie, or I am missing something.
> > > >>> I have used the scanner caching to be equal to one, because for each time
> > > >>> we are fetching a single row only.
> > > >>>
> > > >>> Here are the various configurations:
> > > >>>
> > > >>> *Our schema*
> > > >>> {NAME => 'mytable', FAMILIES => [{NAME => 'cf', DATA_BLOCK_ENCODING =>
> > > >>> 'NONE', BLOOMFILTER => 'ROWCOL', REPLICATION_SCOPE => '0', COMPRESSION =>
> > > >>> 'GZ', VERSIONS => '1', TTL => '2147483647', MIN_VERSIONS => '0',
> > > >>> KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '8192', ENCODE_ON_DISK => 'true',
> > > >>> IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}
> > > >>>
> > > >>> *Configuration*
> > > >>> 1 Machine having both hbase and hadoop master
> > > >>> 2 machines having both region server node and datanode
> > > >>> total 285 region servers
> > > >>>
> > > >>> *Machine Level Optimizations:*
> > > >>> a)No of file descriptors is 1000000(ulimit -n gives 1000000)
> > > >>> b)Increase the read-ahead value to 4096
> > > >>> c)Added noatime,nodiratime to the disks
> > > >>>
> > > >>> *Hadoop Optimizations:*
> > > >>> dfs.datanode.max.xcievers = 4096
> > > >>> dfs.block.size = 33554432
> > > >>> dfs.datanode.handler.count = 256
> > > >>> io.file.buffer.size = 65536
> > > >>> hadoop data is split on 4 directories, so that different disks are being
> > > >>> accessed
> > > >>>
> > > >>> *Hbase Optimizations*:
> > > >>>
> > > >>> hbase.client.scanner.caching=1 #We have specifcally added this, as we
> > > >>> return always one row.
> > > >>> hbase.regionserver.handler.count=3200
> > > >>> hfile.block.cache.size=0.35
> > > >>> hbase.hregion.memstore.mslab.enabled=true
> > > >>> hfile.min.blocksize.size=16384
> > > >>> hfile.min.blocksize.size=4
> > > >>> hbase.hstore.blockingStoreFiles=200
> > > >>> hbase.regionserver.optionallogflushinterval=60000
> > > >>> hbase.hregion.majorcompaction=0
> > > >>> hbase.hstore.compaction.max=100
> > > >>> hbase.hstore.compactionThreshold=100
> > > >>>
> > > >>> *Hbase-GC*
> > > >>> -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+CMSParallelRemarkEnabled
> > > >>> -XX:SurvivorRatio=20 -XX:ParallelGCThreads=16
> > > >>> *Hadoop-GC*
> > > >>> -XX:+UseConcMarkSweepGC -XX:+UseParNewGC
> > > >>>
> > > >>> -Vibhav

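For a rough sense of scale, the figures quoted in this thread can be put together as in the sketch below. It only does arithmetic on numbers stated in the mails (8GB region server heap, block cache fraction 0.35, 2 region server machines, 60 million rows of less than 1KB each, 3000 queries per sec with 10% timing out). The variable names are made up for illustration, and treating 1KB as the row size is an assumption that gives an upper bound on raw data size; it ignores key overhead, compression and replication.

```python
# Back-of-the-envelope numbers taken from the thread; names are illustrative.
heap_gb = 8.0                 # region server heap, as stated in the thread
block_cache_fraction = 0.35   # hfile.block.cache.size
region_server_machines = 2
rows = 60_000_000
max_row_kb = 1.0              # assumption: "less than 1KB" used as an upper bound
queries_per_sec = 3000
timeout_fraction = 0.10

# Block cache available per region server and across both machines.
cache_per_server_gb = heap_gb * block_cache_fraction
total_cache_gb = cache_per_server_gb * region_server_machines

# Upper bound on raw row data, ignoring key overhead and compression.
data_upper_bound_gb = rows * max_row_kb / (1024 * 1024)

# Queries missing the 50 ms timeout each second.
timed_out_per_sec = queries_per_sec * timeout_fraction

print(f"block cache per region server: {cache_per_server_gb:.1f} GB")
print(f"block cache across machines:   {total_cache_gb:.1f} GB")
print(f"raw data upper bound:          {data_upper_bound_gb:.1f} GB")
print(f"timed-out queries per sec:     {timed_out_per_sec:.0f}")
```

Under these assumptions the combined block cache is small next to the upper bound on the data, which is one reason the block cache percentage came up in the replies. How much of the data is actually hot is not stated in the thread, so this says nothing definite about the hit rate.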