Our row keys do not contain time. By time-based scans, I mean an MR job over the HBase table where the Scan object has no startRow or endRow but has a startTime and endTime.
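To make the scan shape concrete, here is a toy in-memory model of a scan that has only a time range and no row bounds. It is a sketch, not HBase code: the class name, method name, and sample rows are hypothetical, and it assumes the half-open interval [startTime, endTime) that HBase documents for Scan.setTimeRange. Because the row key carries no time component, the model has to walk every row in key order and test each timestamp.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy in-memory model of a time-range-only scan. This is NOT HBase code. */
public class TimeRangeScanModel {

    /**
     * Returns the row keys whose timestamp falls in [startTime, endTime).
     * Rows are visited in key order, as a scan with no start/stop row would,
     * so every row is examined regardless of its timestamp.
     */
    public static List<String> scanByTime(TreeMap<String, Long> rows,
                                          long startTime, long endTime) {
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, Long> e : rows.entrySet()) {
            long ts = e.getValue();
            // Assumption: start is inclusive, end is exclusive.
            if (ts >= startTime && ts < endTime) {
                matches.add(e.getKey());
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Hypothetical rows: hashed-looking keys mapped to write timestamps.
        TreeMap<String, Long> rows = new TreeMap<>();
        rows.put("0a-row", 1_000L);
        rows.put("7f-row", 1_600L);
        rows.put("c3-row", 1_300L);
        rows.put("e9-row", 400L);
        // Prints [0a-row, c3-row]: 1600 is excluded by the open upper bound.
        System.out.println(scanByTime(rows, 1_000L, 1_600L));
    }
}
```

The matching rows are scattered across the key space, which is the point of a hashed key prefix: the time range cannot be turned into a key range.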
Our row key format is <MD5 of UUID>+UUID, so we expect good distribution. We pre-split the table up front to prevent any initial hotspotting.

~Rahul.

________________________________
From: anil gupta <[email protected]>
To: [email protected]; Rahul Ravindran <[email protected]>
Sent: Tuesday, June 4, 2013 9:31 PM
Subject: Re: Scan + Gets are disk bound

On Tue, Jun 4, 2013 at 11:48 AM, Rahul Ravindran <[email protected]> wrote:

>Hi,
>
>We are relatively new to HBase, and we are hitting a roadblock on our scan
>performance. I searched through the email archives and applied a bunch of the
>recommendations there, but they did not improve much. So, I am hoping I am
>missing something which you could guide me towards. Thanks in advance.
>
>We are currently writing data and reading in an almost continuous mode (stream
>of data written into an HBase table and then we run a time-based MR on top of
>this table). We were recently backed up and about 1.5 TB of data was loaded
>into the table, and we began performing time-based scan MRs in 10-minute time
>intervals (startTime and endTime interval is 10 minutes). Most of the 10-minute
>intervals had about 100 GB of data to process.
>
>Our workflow was to primarily eliminate duplicates from this table. We have
>maxVersions = 5 for the table. We use TableInputFormat to perform the
>time-based scan to ensure data locality. In the mapper, we check if there
>exists a previous version of the row in a time period earlier than the
>timestamp of the input row. If not, we emit that row.
>
>We looked at https://issues.apache.org/jira/browse/HBASE-4683 and hence turned
>off block cache for this table with the expectation that the block index and
>bloom filter will be cached in the block cache. We expect duplicates to be
>rare and hence hope for most of these checks to be fulfilled by the bloom
>filter. Unfortunately, we notice very slow performance on account of being
>disk bound. Looking at jstack, we notice that most of the time, we appear to
>be hitting disk for the block index. We performed a major compaction and
>retried and performance improved some, but not by much. We are processing data
>at about 2 MB per second.
>
>We are using CDH 4.2.1 HBase 0.94.2 and HDFS 2.0.0 running with 8
>datanodes/regionservers (each with 32 cores, 4x1TB disks and 60 GB RAM).

Anil: You don't have the right balance between disk, CPU and RAM. You have plenty of CPU and RAM but very few disks. Usually, it's better to have a disk/CPU-core ratio near 0.6-0.8. Yours is around 0.13. This seems to be the biggest reason for your problem.

>HBase is running with 30 GB heap size, memstore values being capped at 3 GB
>and flush thresholds being 0.15 and 0.2. Blockcache is at 0.5 of total heap
>size (15 GB). We are using SNAPPY for our tables.
>
>A couple of questions:
>    * Is the performance of the time-based scan bad after a major
>compaction?

Anil: In general, time-based scans (I am assuming you have built your row key on a timestamp) are not good for HBase because of region hot-spotting. Have you tried setting the scanner caching to a higher number?

>    * What can we do to help alleviate being disk bound? The typical
>answer of adding more RAM does not seem to have helped, or we are missing some
>other config

Anil: Try adding more disks to your machines.
>
>Below are some of the metrics from a Regionserver webUI:
>
>requestsPerSecond=5895, numberOfOnlineRegions=60, numberOfStores=60,
>numberOfStorefiles=209, storefileIndexSizeMB=6, rootIndexSizeKB=7131,
>totalStaticIndexSizeKB=415995, totalStaticBloomSizeKB=2514675,
>memstoreSizeMB=0, mbInMemoryWithoutWAL=0, numberOfPutsWithoutWAL=0,
>readRequestsCount=30589690, writeRequestsCount=0, compactionQueueSize=0,
>flushQueueSize=0, usedHeapMB=2688, maxHeapMB=30672, blockCacheSizeMB=1604.86,
>blockCacheFreeMB=13731.24, blockCacheCount=11817, blockCacheHitCount=27592222,
>blockCacheMissCount=25373411, blockCacheEvictedCount=7112,
>blockCacheHitRatio=52%, blockCacheHitCachingRatio=72%,
>hdfsBlocksLocalityIndex=91, slowHLogAppendCount=0,
>fsReadLatencyHistogramMean=15409428.56, fsReadLatencyHistogramCount=1559927,
>fsReadLatencyHistogramMedian=230609.5, fsReadLatencyHistogram75th=280094.75,
>fsReadLatencyHistogram95th=9574280.4, fsReadLatencyHistogram99th=100981301.2,
>fsReadLatencyHistogram999th=511591146.03,
>fsPreadLatencyHistogramMean=3895616.6, fsPreadLatencyHistogramCount=420000,
>fsPreadLatencyHistogramMedian=954552, fsPreadLatencyHistogram75th=8723662.5,
>fsPreadLatencyHistogram95th=11159637.65,
>fsPreadLatencyHistogram99th=37763281.57,
>fsPreadLatencyHistogram999th=273192813.91,
>fsWriteLatencyHistogramMean=6124343.91, fsWriteLatencyHistogramCount=1140000,
>fsWriteLatencyHistogramMedian=374379, fsWriteLatencyHistogram75th=431395.75,
>fsWriteLatencyHistogram95th=576853.8, fsWriteLatencyHistogram99th=1034159.75,
>fsWriteLatencyHistogram999th=5687910.29
>
>key size: 20 bytes
>
>Table description:
>{NAME => 'foo', FAMILIES => [{NAME => 'f', DATA_BLOCK_ENCODING => 'NONE',
>BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', COMPRESSION => 'SNAPPY',
>VERSIONS => '5', TTL => '2592000', MIN_VERSIONS => '0',
>KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536',
>ENCODE_ON_DISK => 'true', IN_MEMORY => 'false', BLOCKCACHE => 'false'}]}
>ENABLED: true

--
Thanks & Regards,
Anil Gupta
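
For reference, the disk-to-core arithmetic behind Anil's reply can be checked with a short sketch. The node figures (4 disks, 32 cores) come from Rahul's message, and the 0.6-0.8 target is Anil's rule of thumb rather than a documented HBase requirement. The class and method names are hypothetical.

```java
/** Sketch of the disk/CPU-core ratio arithmetic from the thread. */
public class DiskCoreRatio {

    /** Disks per CPU core for a single node. */
    public static double ratio(int disks, int cores) {
        return (double) disks / cores;
    }

    /** Smallest whole number of disks that reaches the target ratio. */
    public static int disksNeeded(int cores, double targetRatio) {
        return (int) Math.ceil(cores * targetRatio);
    }

    public static void main(String[] args) {
        int disks = 4;   // from the thread: 4x1TB disks per node
        int cores = 32;  // from the thread: 32 cores per node

        // 4 / 32 = 0.125, which Anil rounds to 0.13.
        System.out.println("current ratio: " + ratio(disks, cores));

        // Disks per node implied by Anil's 0.6-0.8 rule of thumb.
        System.out.println("disks for 0.6: " + disksNeeded(cores, 0.6));
        System.out.println("disks for 0.8: " + disksNeeded(cores, 0.8));
    }
}
```

By that rule of thumb a 32-core node would want roughly 20 to 26 disks, compared with the 4 it has.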
