On Fri, Feb 17, 2012 at 1:21 PM, Bryan Keller <[email protected]> wrote:
> I have been experimenting with local reads. For me, enabling did not help
> improve read performance at all, I get the same performance either way. I can
> see in the data node logs it is passing back the local path, so it is enabled
> properly.

I was surprised when I read this until I saw this:

>
> Perhaps the benefits of local reads are dependent on the type of data and the
> workload? In my test I'm scanning through the entire table via a map reduce
> job. It's a wide table with maybe 20k columns per row on average. I have
> scanner caching set to 10.

It's definitely not going to help make sequential reads faster.

>
> My read performance is about 10% of the disk max read throughput, i.e. my
> disks can get 100 mb/sec tested with hdparm and scan performance is about 10
> mb/sec. Not too bad I suppose.

Maybe you're not pushing it enough?

J-D
