Re: hbase scan performance

2014-04-10 Thread Patrick Wendell
This job might still be faster. In MapReduce there are other overheads in addition to the fact that sequential reads from HBase are slow. But it's possible the bottleneck is the HBase scan performance. - Patrick On Wed, Apr 9, 2014 at 10:10 AM, Jerry Lam wrote: > Hi Dave,

Re: hbase scan performance

2014-04-09 Thread Jerry Lam
Hi Dave, This is the HBase solution to the poor scan performance issue: https://issues.apache.org/jira/browse/HBASE-8369 I encountered the same issue before. To the best of my knowledge, this is not a MapReduce issue; it is an HBase issue. If you are planning to swap out MapReduce and replace it with sp

hbase scan performance

2014-04-09 Thread David Quigley
Hi all, We are currently using HBase to store user data and periodically run a full scan to aggregate it. We use HBase because we need a single user's data to be contiguous, so as user data comes in, we need the ability to update a random-access store. The performance of a full hba