hongbin ma created KYLIN-1485:
---------------------------------

             Summary: Block certain cuboids from being computed if it contains 
nearly as many rows as its parent
                 Key: KYLIN-1485
                 URL: https://issues.apache.org/jira/browse/KYLIN-1485
             Project: Kylin
          Issue Type: New Feature
            Reporter: hongbin ma
            Assignee: hongbin ma


Inspired by the output of KYLIN-1483:

{noformat}
|---- Cuboid 111111111, est row: 5962, est MB: 0.12
    |---- Cuboid 101111111, est row: 5886, est MB: 0.11, shrink: 98.73%
        |---- Cuboid 100111111, est row: 5958, est MB: 0.11, shrink: 101.22%
            |---- Cuboid 100110111, est row: 5904, est MB: 0.11, shrink: 99.09%
                |---- Cuboid 100100111, est row: 5911, est MB: 0.11, shrink: 100.12%
                    |---- Cuboid 100000111, est row: 5628, est MB: 0.1, shrink: 95.21%
        |---- Cuboid 101110111, est row: 5970, est MB: 0.11, shrink: 101.43%
            |---- Cuboid 101100111, est row: 5941, est MB: 0.11, shrink: 99.51%
                |---- Cuboid 101000111, est row: 5914, est MB: 0.11, shrink: 99.55%
    |---- Cuboid 110111111, est row: 6036, est MB: 0.12, shrink: 101.24%
        |---- Cuboid 110110111, est row: 5944, est MB: 0.12, shrink: 98.48%
            |---- Cuboid 110100111, est row: 5994, est MB: 0.12, shrink: 100.84%
                |---- Cuboid 110000111, est row: 5921, est MB: 0.11, shrink: 98.78%
    |---- Cuboid 111110111, est row: 5970, est MB: 0.12, shrink: 100.13%
        |---- Cuboid 111100111, est row: 5896, est MB: 0.12, shrink: 98.76%
            |---- Cuboid 111000111, est row: 5925, est MB: 0.11, shrink: 100.49%
    |---- Cuboid 111111000, est row: 5952, est MB: 0.1, shrink: 99.83%
        |---- Cuboid 101111000, est row: 5819, est MB: 0.09, shrink: 97.77%
            |---- Cuboid 100111000, est row: 5744, est MB: 0.09, shrink: 98.71%
                |---- Cuboid 100110000, est row: 5622, est MB: 0.09, shrink: 97.88%
                    |---- Cuboid 100100000, est row: 4846, est MB: 0.07, shrink: 86.2%
            |---- Cuboid 101110000, est row: 5742, est MB: 0.09, shrink: 98.68%
                |---- Cuboid 101100000, est row: 5816, est MB: 0.09, shrink: 101.29%
                    |---- Cuboid 101000000, est row: 5720, est MB: 0.09, shrink: 98.35%
        |---- Cuboid 110111000, est row: 5937, est MB: 0.1, shrink: 99.75%
            |---- Cuboid 110110000, est row: 5957, est MB: 0.1, shrink: 100.34%
                |---- Cuboid 110100000, est row: 5866, est MB: 0.09, shrink: 98.47%
                    |---- Cuboid 110000000, est row: 5934, est MB: 0.09, shrink: 101.16%
        |---- Cuboid 111110000, est row: 5905, est MB: 0.1, shrink: 99.21%
            |---- Cuboid 111100000, est row: 5876, est MB: 0.09, shrink: 99.51%
                |---- Cuboid 111000000, est row: 5950, est MB: 0.09, shrink: 101.26%
{noformat}

We found that some cuboids aggregate almost nothing compared to their 
parent: their estimated row count is nearly the same as the parent's (a 
shrink close to, or even above, 100%). Based on these estimated stats, we'll 
try simply adding such cuboids to a blacklist so that they are not computed. 
Note that with such a segment-based blacklist rule, different segments may 
have different sets of cuboids computed.
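
To make the idea concrete, here is a minimal sketch of how such a blacklist 
could be derived from the estimated stats. It is an illustration only, not 
Kylin code: the function names, the dictionary layout and the 0.95 threshold 
are all assumptions, and the row counts are a few entries copied from the 
output above.

```python
# Hypothetical sketch: names and the 0.95 threshold are assumptions,
# not part of Kylin's API.

def shrink_ratio(child_rows, parent_rows):
    """Estimated child rows as a fraction of the estimated parent rows."""
    return child_rows / parent_rows


def find_blacklist(est_rows, parent_of, threshold=0.95):
    """Return the cuboids whose estimated row count is at least
    `threshold` times that of their parent."""
    blacklist = set()
    for cuboid, parent in parent_of.items():
        if shrink_ratio(est_rows[cuboid], est_rows[parent]) >= threshold:
            blacklist.add(cuboid)
    return blacklist


# Estimated rows for a few cuboids taken from the tree above.
est_rows = {
    "111111111": 5962,
    "101111111": 5886,
    "100110000": 5622,
    "100100000": 4846,
}
# Child cuboid -> parent cuboid, as shown by the tree indentation.
parent_of = {
    "101111111": "111111111",
    "100100000": "100110000",
}

blacklist = find_blacklist(est_rows, parent_of)
```

With these inputs, cuboid 101111111 (shrink 98.73%) would be blacklisted while 
100100000 (shrink 86.2%) would still be computed. The sketch does not cover 
how the children of a blacklisted cuboid get built; they would need to be 
computed from another ancestor.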



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
