On Wed, Dec 21, 2011 at 8:56 AM, Daniel Iancu <[email protected]> wrote:
> Hi there
> I'm investigating a problem we have with a MR job and I discovered that the
> tasks that fail (scan lease expired while fetching next row) were processing
> one particular region.
> I've written a small app that scans that region and counts its rows and ran
> it on the same machine where the region is hosted. The result is very poor:
> scan speed is on average 7 rows/sec and sometimes, when scan caching is
> increased, it gets a lease expired exception. By contrast, scanning the other
> regions from the same table on the same machine with the same caching value
> gets ~3800 rows/sec. Any idea what can cause such disastrous scan performance
> on a particular region?
>
If you move the region to another host, do you see the same perf? (Perhaps
some hardware issue?) Otherwise, if you look at the data under that region,
what do you see? First do a listing of the hdfs content. Next try looking at
the actual key values with the hfile main tool. Poke down in here:
http://hbase.apache.org/book/regions.arch.html#store

> Some extra info
>
> hbase is 0.90.4
> lease timeout is 4 minutes
> table has 1 family, cell values are empty, row keys and qualifiers are small
> strings, biggest row has 146 columns
> row sizes are almost identical since table was created by a load tool and
> each row has almost the same nr of columns with same kind of values...
> all regions have 1 store file of ~655MB
> cluster has no activity except the test app
> GC activity looks normal
> regions might have many deleted KV (we were testing data cleanup with MR
> jobs)

Looksee first w/ hfile tool. If a major compaction 'fixes' it, then it
could be having to pass over lots of delete items.

St.Ack
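
As a rough back-of-the-envelope check on why raising scan caching could trigger the lease expiry on the slow region, the sketch below works through the numbers quoted above (7 rows/sec, ~3800 rows/sec, 4 minute lease). It assumes, as a simplification, that one batch takes caching / rows-per-sec seconds to fill and that the lease expires if a batch takes longer than the lease timeout. The function names are made up for illustration and are not part of HBase.

```python
# Back-of-the-envelope model (an assumption, not HBase code):
# time to fill one scanner batch = caching / observed rows per second.

def batch_seconds(caching, rows_per_sec):
    """Estimated seconds needed to fetch one batch of `caching` rows."""
    return caching / rows_per_sec


def max_caching_within_lease(rows_per_sec, lease_timeout_sec):
    """Largest caching value whose batch still fits inside the lease."""
    return int(rows_per_sec * lease_timeout_sec)


LEASE_TIMEOUT_SEC = 4 * 60   # "lease timeout is 4 minutes"
SLOW_ROWS_PER_SEC = 7        # the problem region
FAST_ROWS_PER_SEC = 3800     # the other regions

slow_limit = max_caching_within_lease(SLOW_ROWS_PER_SEC, LEASE_TIMEOUT_SEC)
fast_limit = max_caching_within_lease(FAST_ROWS_PER_SEC, LEASE_TIMEOUT_SEC)

print("slow region: caching above", slow_limit, "rows risks lease expiry")
print("fast region: caching above", fast_limit, "rows risks lease expiry")
```

Under these assumptions a caching value above roughly 1680 rows would already exceed the 4 minute lease on the slow region, while the healthy regions have plenty of headroom. That would be consistent with the exception only showing up when caching is increased, though it does not explain why the region is slow in the first place.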
