Hi JM,

If the region boundaries will not change, does that mean that if my data access pattern is skewed (say a certain part (30%) of my data will almost never be used), then a proportion (30%) of my servers will always be idle?

Does a region server have to hold a contiguous rowkey range?

Jianshi

On Sat, Aug 16, 2014 at 2:46 AM, Jean-Marc Spaggiari <[email protected]> wrote:

> Hi Jianshi,
>
> Not sure I get your question.
>
> Can I rephrase it?
>
> So you have 10 regions, and each of those regions has 10 HFiles. Then you
> run a major compaction on the table. Correct?
>
> Then you will end up with:
>
> reg1:[files:1]
> reg2:[files:2]
> reg3:[files:3]
> ...
>
> Region boundaries will not change. But each region will now have a single
> underlying file.
>
> HTH,
>
> JM
>
>
> 2014-08-15 1:53 GMT-04:00 Jianshi Huang <[email protected]>:
>
> > Say I have 100 split files on 10 region servers, and I did a major
> > compact.
> >
> > Will these split files be distributed like this:
> > reg1: [splits 1,2,..,10]
> > reg2: [splits 11,12,...,20]
> > ...
> >
> > Or like this:
> > reg1: [splits: 1, 11, 21, ... , 91]
> > reg2: [splits: 2, 12, 22, ... , 92]
> > ...
> >
> > And what if I want to specify the locality and the stride of split files?
> > How can I do it in HBase?
> >
> >
> > --
> > Jianshi Huang
> >
> > LinkedIn: jianshi
> > Twitter: @jshuang
> > Github & Blog: http://huangjs.github.com/
> >
>

--
Jianshi Huang

LinkedIn: jianshi
Twitter: @jshuang
Github & Blog: http://huangjs.github.com/
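
P.S. To check that I understand the compaction scenario JM described (10 regions, 10 HFiles each), here is a toy Python model. It is only a sketch and does not use any HBase API: the dict layout, the key strings, and the names make_table and major_compact are all made up for illustration, and it assumes a single column family per region.

```python
# Toy model only: no HBase client calls, all names here are hypothetical.

def make_table(num_regions=10, files_per_region=10):
    """Build a list of regions, each with a rowkey range and a list of HFiles."""
    regions = []
    for r in range(num_regions):
        regions.append({
            "start": f"{r:02d}",       # made-up start rowkey of the region
            "end": f"{r + 1:02d}",     # made-up end rowkey (exclusive)
            "hfiles": [f"reg{r + 1}-hfile{i + 1}" for i in range(files_per_region)],
        })
    return regions


def major_compact(regions):
    """Merge the HFiles of each region into one file, keeping boundaries as-is."""
    compacted = []
    for region in regions:
        compacted.append({
            "start": region["start"],  # boundaries are copied unchanged
            "end": region["end"],
            "hfiles": [f"merged({len(region['hfiles'])} files)"],
        })
    return compacted


if __name__ == "__main__":
    before = make_table()
    after = major_compact(before)
    for region in after:
        print(region["start"], region["end"], region["hfiles"])
```

In this model the files are only merged within a region; nothing moves between regions and no rowkey range changes, which is what prompted my question about skewed access above.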
