Sure, I'll contribute a patch for this. Will create a JIRA for the same.

On Thu, Aug 6, 2015 at 8:02 AM, hongbin ma <[email protected]> wrote:
> @liangmeng please refer to Singh's hint.
>
> @Singh, it is a good idea to allow for more flexible cube size
> configuration. However, there are some issues to address:
>
> 1. Currently CubeCapacity (SMALL, MEDIUM, LARGE) is at cube level, not at
> segment level. We cannot automatically apply a tiny cut size to segments
> while applying a medium/large cut size to merged large segments. However,
> it is possible that we discard the SMALL, MEDIUM, LARGE concept and
> dynamically determine each segment's region size depending on its size.
>
> 2. We do not have enough capacity to do this now, and to be honest it's not
> a critical issue for us because most of our cubes are larger than 100G, so
> a big region size is preferable to us. So if you really need this
> capability, can you please think about contributing a patch? We'll review
> it and pack it into the next 0.7 release.
>
>
> On Wed, Aug 5, 2015 at 11:11 PM, Bijeet Singh <[email protected]>
> wrote:
>
> > While we are here, controlling the number of regions while creating a new
> > segment, with the cube size options - SMALL, MEDIUM, LARGE - sometimes
> > seems too restrictive. Even with SMALL size cubes, the number of reducers
> > is sometimes too low, which adversely impacts the cube build time.
> >
> > By modifying the cut size to an even smaller value, I was able to bring
> > down the average reduce time of this step from ~25 mins to ~2 mins. But I
> > understand that setting very low cut sizes will lead to creation of very
> > small regions, which isn't very desirable either. But the problem of too
> > many small regions for new segments can be handled while merging the
> > segments.
> >
> > So, does it make sense to make the setting of cube size option a bit more
> > flexible? Through either making it configurable or by providing more
> > options (other than small, medium, large) while creating the cube.
> >
> > Thanks,
> > Bijeet
> >
> >
> >
> > On Wed, Aug 5, 2015 at 8:21 PM, Bijeet Singh <[email protected]>
> > wrote:
> >
> > > The number of reducers in this step depends on the cube size selected
> > > while creating the cube. You can try with a small size cube.
> > >
> > > But if you are getting only one reducer, even with a small cube, you'd
> > > probably have to tweak the cut size in RangeKeyDistributionReducer to
> > > even smaller values. That will help you increase the number of reducers.
> > >
> > > 2015-08-05 18:13 GMT+05:30 liangmeng <[email protected]>:
> > >
> > >> There is only one reducer, and the reduce takes too much time. Is it
> > >> possible to increase the number of reducers?
> > >>
> > >>
> > >>
> > >> Liang Meng (梁猛)
> > >> China Mobile Guangdong Co., Network Management and Maintenance Center,
> > >> Network Management Support Office
> > >> Tel: 13802880779
> > >> Email: [email protected], [email protected]
> > >> Address: 3/F North, Guangdong GoTone Building, 11 Zhujiang West Road,
> > >> Zhujiang New Town, Guangzhou, Guangdong
> > >> Postal code: 510623
> > >>
> > >
> > >
> >
>
>
>
> --
> Regards,
>
> *Bin Mahone | 马洪宾*
> Apache Kylin: http://kylin.io
> Github: https://github.com/binmahone
>
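
For reference, here is a rough sketch of the idea discussed in the thread: deriving the cut size from the segment's estimated size instead of a fixed SMALL/MEDIUM/LARGE setting, and then cutting regions whenever the accumulated size reaches that cut. This is only an illustration under my own assumptions. The class RegionCutSketch and its methods are hypothetical names, the sizes are plain byte counts, and none of this is taken from Kylin's RangeKeyDistributionReducer.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of size-based region cutting; not Kylin's actual code. */
class RegionCutSketch {

    /** Regions needed for a segment: ceil(segmentBytes / cutBytes), never below 1. */
    static int regionCount(long segmentBytes, long cutBytes) {
        if (cutBytes <= 0) {
            throw new IllegalArgumentException("cutBytes must be positive");
        }
        long n = (segmentBytes + cutBytes - 1) / cutBytes;
        return (int) Math.max(1L, n);
    }

    /**
     * Derive a cut size from the segment size (assumed policy): aim for
     * targetRegions regions, but keep the cut within [minCut, maxCut] so that
     * regions are neither tiny nor huge.
     */
    static long cutSizeFor(long segmentBytes, long minCut, long maxCut, int targetRegions) {
        long ideal = segmentBytes / targetRegions;
        return Math.max(minCut, Math.min(maxCut, ideal));
    }

    /**
     * Walk the sorted key ranges and record the index of the range that closes
     * a region each time the accumulated bytes reach the cut size. The last
     * range never closes a region, so regions = splits + 1.
     */
    static List<Integer> splitPoints(long[] rangeBytes, long cutBytes) {
        List<Integer> splits = new ArrayList<>();
        long accumulated = 0;
        for (int i = 0; i < rangeBytes.length; i++) {
            accumulated += rangeBytes[i];
            if (accumulated >= cutBytes && i < rangeBytes.length - 1) {
                splits.add(i);
                accumulated = 0;
            }
        }
        return splits;
    }

    public static void main(String[] args) {
        long gb = 1024L * 1024L * 1024L;
        long segment = 3 * gb;                           // illustrative segment size
        long cut = cutSizeFor(segment, gb / 4, 10 * gb, 8);
        System.out.println("cut bytes = " + cut);
        System.out.println("regions   = " + regionCount(segment, cut));
    }
}
```

With something like this, a small new segment would get a small cut (more regions, hence more reducers), and a large merged segment would get a larger cut, which is the per-segment behaviour suggested in point 1 above. The minimum and maximum bounds and the target region count would need to be configurable.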
