On top of that, I've always wondered whether pushing the pre-caching down would help: if you know you want to send back 1000 rows, then read a bigger chunk of data from HDFS.
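
As a rough illustration of that idea (a sketch only, not HBase code): given the number of rows the scanner wants to return, estimate how many bytes to fetch in a single read, rounded up to whole blocks. The function name, the average row size, and the 64 KB block size are all assumptions made up for the example.

```python
def read_ahead_bytes(rows_wanted, avg_row_bytes, block_bytes=64 * 1024):
    """Estimate the size of one read that covers rows_wanted rows.

    All parameters are hypothetical: avg_row_bytes would have to come
    from some statistic about the data, and block_bytes is an assumed
    block size.
    """
    if rows_wanted <= 0 or avg_row_bytes <= 0:
        return 0
    needed = rows_wanted * avg_row_bytes
    # Ceiling division so a partially used block is still fetched whole.
    blocks = -(-needed // block_bytes)
    return blocks * block_bytes


# 1000 rows at an assumed 200 bytes each need 200000 bytes,
# which rounds up to 4 blocks of 64 KB.
print(read_ahead_bytes(1000, 200))
```

With those assumed numbers, one read of 4 blocks would replace 4 separate block reads.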

J-D

On Sun, Mar 11, 2012 at 3:01 PM, Stack <[email protected]> wrote:
> On Sun, Mar 11, 2012 at 1:49 PM, Ted Yu <[email protected]> wrote:
>> I wonder if there was recent performance comparison for scan between using
>> pread vs. using seek()+read().
>>
>
> No.
>
> Nothing has changed in pread vs seek+read as far as I know so you'll
> probably just end up confirming seek+read is still faster than pread
> when scanning.
>
> To improve scan speed, we need read ahead I'd say.
>
> St.Ack
