Thank you Ted, Luke. I think the GLB problem hits when you have not built the cuboid that exactly matches the GROUP BY clause of the query. That's when one needs to think about how to satisfy the query from some other cuboid.
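To make that concrete, here is a rough sketch of one way to pick a fallback cuboid. It assumes each built cuboid is described by its set of dimensions plus an estimated row count, and it simply takes the smallest cuboid whose dimensions cover the GROUP BY columns. The function name `pick_cuboid` and the row-count map are made up for illustration; this is not taken from Kylin or from our code.

```python
from typing import Dict, FrozenSet, Iterable, Optional


def pick_cuboid(
    group_by: Iterable[str],
    cuboids: Dict[FrozenSet[str], int],
) -> Optional[FrozenSet[str]]:
    """Return the cheapest built cuboid that can answer the query.

    `cuboids` maps a dimension set to an estimated row count (hypothetical
    statistics). A cuboid can serve the query only if its dimensions are a
    superset of the GROUP BY columns; among those, prefer the fewest rows.
    """
    wanted = frozenset(group_by)
    # Keep only cuboids that contain every GROUP BY dimension.
    candidates = [dims for dims in cuboids if wanted <= dims]
    if not candidates:
        return None  # nothing built covers this query
    # Smallest row count wins; sorted dimension names break ties deterministically.
    return min(candidates, key=lambda dims: (cuboids[dims], sorted(dims)))


built = {
    frozenset({"a", "b", "c"}): 1000,
    frozenset({"a", "b"}): 100,
    frozenset({"a", "c"}): 80,
}
print(pick_cuboid(["a"], built))       # exact cuboid {a} is not built; falls back to a covering one
print(pick_cuboid(["b", "c"], built))  # only the base cuboid covers both b and c
```

When the exact cuboid is built, it is normally the smallest covering one, so the same function handles the easy case too.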
I work for HCL Technologies, an MNC services company. We eke out free time to play around with new ideas. The cube build is done through a single MR job; that's where the idea started for us. Our infra is on VMs, and all the VMs get their virtual disks from LVM managing 4 physical disks, i.e. the IO path does not scale much due to physical constraints. The build time was more or less like Kylin's. No surprises there.
