On Mon, Jun 27, 2011 at 11:37 PM, Aditya Karanth A <[email protected]> wrote:
>> I have heard that bigger the size of the regionserver, more time it takes
>> for region splitting and slower the reads are. Is this true?
> (I have not been able to experiment with all these in our environments yet,
> but if anyone has been there and done that, would be good to know)
>
Well, splitting is fast in that it just writes out reference files; it does not actually rewrite data, so size shouldn't matter. Scan reads don't care about file size (bigger may actually be slightly faster). Random read performance is also unrelated to file/region size (we consult the in-memory index to figure out where to jump to start the read; this should be the same for big or small files).

St.Ack
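
To illustrate the random-read point, here is a minimal sketch of an in-memory block index lookup. It is not HBase's actual code: the function name `find_block`, the sample keys, and the offsets are all hypothetical, and it assumes the index holds the first key of each block in sorted order. The idea is only that the lookup is a search over a small in-memory list followed by a single jump into the file, however large the file is.

```python
from bisect import bisect_right

def find_block(first_keys, offsets, row_key):
    """Return the file offset of the block that may hold row_key, or None.

    first_keys: sorted list with the first key of each block (hypothetical index).
    offsets:    file offset of each block, parallel to first_keys.
    """
    # Find the last block whose first key is <= row_key.
    i = bisect_right(first_keys, row_key) - 1
    if i < 0:
        return None  # row_key sorts before the first block in the file
    return offsets[i]

# Made-up index for a file with three blocks.
first_keys = ["apple", "mango", "tomato"]
offsets = [0, 65536, 131072]

print(find_block(first_keys, offsets, "peach"))  # offset of the "mango" block
```

Once the offset is known, the reader seeks straight to that block, so the rest of the file is never touched.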
