Hi Guys,

Currently the cube is built by a multi-stage map-reduce job. This may introduce 
unnecessary latency in some cases (e.g. incremental building). 


We could introduce another cube building algorithm, as follows:
1. When the mapper processes a raw record, it generates all the valid 
combination records and keeps them in memory.
2. When memory is almost full, the mapper writes all the buffered combination 
records to the reducer. 
3. After the mapper writes the records to the reducer, it clears the memory 
and continues processing.


Basically, the mapper logically splits the data block according to the memory limitation.
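The three steps above could be sketched like this (a minimal, language-agnostic illustration; the class and parameter names are hypothetical, and a real implementation would run inside a Hadoop Mapper and track actual heap usage rather than a simple entry count):

```python
from itertools import combinations


class InMemoryCubingMapper:
    """Sketch of the proposed mapper: aggregate combination records in
    memory, then flush them to the reducer when the buffer is almost full."""

    def __init__(self, dimensions, max_buffer_entries):
        self.dimensions = dimensions            # ordered dimension names
        self.max_buffer_entries = max_buffer_entries  # stand-in for the memory limit
        self.buffer = {}                        # combination key -> aggregated measure
        self.emitted = []                       # stands in for writes to the reducer

    def map(self, record, measure):
        # Step 1: generate every valid dimension combination for this record
        # and aggregate it in memory.
        for r in range(len(self.dimensions) + 1):
            for combo in combinations(self.dimensions, r):
                key = tuple((d, record[d]) for d in combo)
                self.buffer[key] = self.buffer.get(key, 0) + measure
        # Step 2: when "memory" is almost full, write everything to the reducer.
        if len(self.buffer) >= self.max_buffer_entries:
            self.flush()

    def flush(self):
        # Step 3: emit all buffered records, then clear memory for further processing.
        self.emitted.extend(self.buffer.items())
        self.buffer.clear()


mapper = InMemoryCubingMapper(["city", "product"], max_buffer_entries=100)
mapper.map({"city": "SH", "product": "A"}, 1)
mapper.map({"city": "SH", "product": "B"}, 2)
mapper.flush()  # final flush at end of the split
```

Records sharing a combination key are pre-aggregated inside the mapper before anything is sent to the reducer, which is where the latency saving would come from.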


Thanks
Jiang Xu
