Hi there
I'm investigating a problem with an MR job and discovered that the failing tasks (scan lease expired while fetching the next row) were all processing one particular region. I wrote a small app that scans that region and counts its rows, and ran it on the same machine that hosts the region. The result is very poor: scan speed averages 7 rows/sec, and sometimes, when scan caching is increased, the scan fails with a lease expired exception. By contrast, scanning the other regions of the same table on the same machine with the same caching value gets ~3800 rows/sec. Any idea what could cause such disastrous scan performance on one particular region?

Some extra info

HBase is 0.90.4
lease timeout is 4 minutes
table has 1 family, cell values are empty, row keys and qualifiers are small strings, and the biggest row has 146 columns; row sizes are almost identical, since the table was created by a load tool and each row has almost the same number of columns with the same kind of values
all regions have 1 store file of ~655MB
cluster has no activity except the test app
GC activity looks normal
regions might have many deleted KVs (we were testing data cleanup with MR jobs)
major compaction is disabled and we haven't run it for some time

Could this problem be caused by the last 2 points above, i.e. many deleted KVs concentrated in that region that the StoreScanners have to skip?
Any other thoughts?
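
For reference, here is a rough back-of-the-envelope check of why a larger scan caching value could run into the lease timeout on the slow region. It is only a sketch under one assumption of mine: that the 4-minute lease has to cover the time the region server needs to collect one full batch of `caching` rows at a steady rate. The class and method names (`LeaseBudget`, `maxRowsPerLease`, `secondsPerBatch`) are made up for illustration and are not part of HBase. With the figures above, 7 rows/sec over 240 seconds allows at most 1680 rows per batch.

```java
public final class LeaseBudget {

    // Largest batch (rows per next() call) that can still be collected within
    // the lease period, ASSUMING a steady scan rate for the whole batch.
    public static long maxRowsPerLease(double rowsPerSecond, long leaseSeconds) {
        return (long) Math.floor(rowsPerSecond * leaseSeconds);
    }

    // Time in seconds needed to collect one batch of `caching` rows
    // at the given steady scan rate.
    public static double secondsPerBatch(int caching, double rowsPerSecond) {
        return caching / rowsPerSecond;
    }

    public static void main(String[] args) {
        long leaseSeconds = 4 * 60;   // lease timeout from the post: 4 minutes
        double slowRate = 7.0;        // rows/sec measured on the problem region
        double normalRate = 3800.0;   // rows/sec measured on the other regions

        System.out.println("max batch on slow region:   "
                + maxRowsPerLease(slowRate, leaseSeconds));
        System.out.println("max batch on normal region: "
                + maxRowsPerLease(normalRate, leaseSeconds));

        // Hypothetical caching value, chosen only to show a batch that
        // would take longer than the lease on the slow region.
        int caching = 2000;
        System.out.println("seconds for " + caching + " rows on slow region: "
                + secondsPerBatch(caching, slowRate));
    }
}
```

If that assumption holds, any caching value above that limit would be expected to fail on the slow region, while the same value would be harmless on the regions that scan at ~3800 rows/sec.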

Thanks
Daniel



