Jihoon Son created TAJO-1995:
--------------------------------

             Summary: Improve range partitioning using histogram
                 Key: TAJO-1995
                 URL: https://issues.apache.org/jira/browse/TAJO-1995
             Project: Tajo
          Issue Type: New Feature
          Components: QueryMaster
            Reporter: Jihoon Son
            Assignee: Jihoon Son
             Fix For: 0.12.0


The currently implemented range repartition algorithm has two major problems:
* It assumes that the data distribution is uniform, so it is fragile for skewed 
data distributions.
* Given floating point values, it ignores the digits to the right of the 
decimal point, so it is difficult to guess the proper partition number.
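
To make the two problems concrete, here is a minimal sketch in Python. It is a simplified model, not Tajo's actual code: the function names, the key values, and the choice of four partitions are all assumptions made for illustration.

```python
from collections import Counter

def uniform_bounds(lo, hi, num_partitions):
    """Equal-width ranges over [lo, hi]; returns each range's inclusive upper bound."""
    width = (hi - lo) / num_partitions
    return [lo + width * (i + 1) for i in range(num_partitions)]

def assign(value, bounds):
    """Index of the first range whose upper bound is >= value."""
    for i, upper in enumerate(bounds):
        if value <= upper:
            return i
    return len(bounds) - 1

# Hypothetical skewed keys: 90 values below 1.0, 10 values spread up to 100.0.
keys = [i / 100 for i in range(90)] + [10.0 * (i + 1) for i in range(10)]

# Problem 1: equal-width ranges ignore where the keys actually are.
bounds = uniform_bounds(min(keys), max(keys), 4)
sizes = Counter(assign(k, bounds) for k in keys)
partition_sizes = [sizes[i] for i in range(4)]

# Problem 2: dropping the fractional part makes distinct keys collapse to one value.
truncated = {int(k) for k in keys if k < 1.0}
```

With these assumed keys, the equal-width split puts 92 of the 100 keys into the first range, and the 90 distinct keys below 1.0 all truncate to the single integer 0.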

One solution to these problems is to use a histogram. With a histogram, we can 
determine the data distribution and handle floating point values properly.
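
One way such a histogram could drive the partition ranges is sketched below in Python. This is only an illustration under stated assumptions, not the design proposed for Tajo: the fixed bucket width, the helper names, and the sample keys are all hypothetical. Adjacent buckets are merged until each range holds roughly the same number of keys.

```python
import bisect
import math
from collections import Counter

def build_histogram(keys, bucket_width):
    """Count keys per fixed-width bucket; bucket b covers [b*w, (b+1)*w)."""
    return Counter(math.floor(k / bucket_width) for k in keys)

def partition_bounds(hist, bucket_width, num_partitions):
    """Merge adjacent buckets into ranges of about total/num_partitions keys.

    Returns exclusive upper bounds. A single very heavy bucket cannot be
    split, so fewer than num_partitions ranges may be returned.
    """
    target = sum(hist.values()) / num_partitions
    bounds, running, next_cut = [], 0, 1
    last = max(hist)
    for b in sorted(hist):
        running += hist[b]
        if b != last and running >= target * next_cut:
            bounds.append((b + 1) * bucket_width)
            # Skip any cut points already passed by a heavy bucket.
            next_cut = int(running // target) + 1
    bounds.append((last + 1) * bucket_width)
    return bounds

def assign(value, bounds):
    """Index of the first range whose exclusive upper bound is > value."""
    return min(bisect.bisect_right(bounds, value), len(bounds) - 1)

# Hypothetical skewed keys: 90 values below 1.0, 10 values spread up to 100.0.
keys = [i / 100 for i in range(90)] + [10.0 * (i + 1) for i in range(10)]

hist = build_histogram(keys, 0.25)  # fractional bucket width keeps decimals
bounds = partition_bounds(hist, 0.25, 4)
sizes = Counter(assign(k, bounds) for k in keys)
partition_sizes = [sizes[i] for i in range(len(bounds))]
```

Because the buckets are narrower than 1.0, keys that differ only after the decimal point fall into different buckets, and the resulting ranges follow the observed distribution instead of assuming a uniform one. The bucket width is a tuning choice in this sketch; a real implementation would need to pick it from the data.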



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
