One more item I haven't specified is the number of key groups sent to each subsequent MapReduce job. The current design of Kylin does very well here: it passes a minimal number of key groups on to the next stage.
But with the approach I was thinking about (emitting cuboid-based keys from a single-stage mapper, using the C_Id approach), the combiner becomes key: it groups values on the mapper side, which brings down the otherwise very large number of values that would have to be transferred to the reducer side.

-Ilamparithi M.
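
A minimal sketch of the combiner effect described above, as a plain Python simulation rather than actual Hadoop or Kylin code. It assumes the C_Id is a bitmask over the dimensions and that the measure is a simple sum; the names `cuboid_keys`, `map_split`, and `combine` and the sample rows are hypothetical and only for illustration.

```python
from collections import defaultdict


def cuboid_keys(row, dims):
    """Yield one (cuboid_id, dimension_values) key per cuboid.

    Assumption: the cuboid id is a bitmask where bit i is set when
    dims[i] is part of the cuboid.
    """
    for mask in range(1 << len(dims)):
        values = tuple(row[d] for i, d in enumerate(dims) if mask & (1 << i))
        yield (mask, values)


def map_split(rows, dims, measure):
    """Single-stage mapper: emit every cuboid key for every input row."""
    for row in rows:
        for key in cuboid_keys(row, dims):
            yield key, row[measure]


def combine(pairs):
    """Mapper-side combiner: sum the values that share a key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


rows = [
    {"country": "US", "year": 2015, "sales": 10},
    {"country": "US", "year": 2015, "sales": 5},
    {"country": "US", "year": 2014, "sales": 7},
    {"country": "IN", "year": 2015, "sales": 3},
]
dims = ["country", "year"]

raw = list(map_split(rows, dims, "sales"))  # what the mapper emits
combined = combine(raw)                     # what leaves the mapper after combining

print("records without combiner:", len(raw))
print("records with combiner:   ", len(combined))
```

Without the combiner, every row contributes one record per cuboid (2^N for N dimensions), so the shuffle volume grows quickly. With it, only one record per distinct key leaves each mapper, and the saving depends on how often dimension values repeat within a mapper's input split.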
