[ https://issues.apache.org/jira/browse/KUDU-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
LiFu He reassigned KUDU-1994:
-----------------------------

    Assignee: Thomas D'Silva  (was: LiFu He)

> Automatically Create New Range Partitions When Needed
> -----------------------------------------------------
>
>                 Key: KUDU-1994
>                 URL: https://issues.apache.org/jira/browse/KUDU-1994
>             Project: Kudu
>          Issue Type: Improvement
>    Affects Versions: 1.3.0
>            Reporter: Alan Jackoway
>            Assignee: Thomas D'Silva
>            Priority: Major
>              Labels: roadmap-candidate
>
> We have a few Kudu tables where we use a range-partitioned timestamp as part
> of the key. The intention is to preserve locality for data that is likely to
> be scanned together, such as events in a time series.
> Currently we create these tables with partitions that look like this:
> {noformat}
> RANGE (ts) (
>   PARTITION 0 <= VALUES < 1420088400000,
>   PARTITION 1420088400000 <= VALUES < 1427860800000,
>   PARTITION 1427860800000 <= VALUES < 1435723200000,
>   PARTITION 1435723200000 <= VALUES < 1443672000000,
>   PARTITION 1443672000000 <= VALUES < 1451624400000,
>   PARTITION 1451624400000 <= VALUES < 1459483200000,
>   PARTITION 1459483200000 <= VALUES < 1467345600000,
>   PARTITION 1467345600000 <= VALUES < 1475294400000,
>   PARTITION 1475294400000 <= VALUES < 1483246800000,
>   PARTITION 1483246800000 <= VALUES < 1491033600000,
>   PARTITION 1491033600000 <= VALUES < 1498896000000,
>   PARTITION 1498896000000 <= VALUES < 1506844800000
> )
> {noformat}
> The problem is that, as time goes on, we must either create empty partitions
> ahead of the data we are writing, or risk forgetting to create a partition
> and having writes of new data fail.
> Ideally, Kudu would have a way to specify the size of the partitions (in
> this example, 3 months converted to milliseconds) and would then
> automatically create a new partition whenever incoming data needs one.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
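
To make the requested behavior concrete, here is a minimal sketch of the arithmetic involved: given a partition width, work out which range a new timestamp falls into and which ranges would have to be added to cover it. It is an illustration only and does not talk to Kudu. The names `PARTITION_WIDTH_MS`, `partition_bounds`, and `missing_partitions` are hypothetical, and the fixed 90-day width is an assumption (the bounds listed in the issue are not exactly evenly spaced, so a real implementation might need calendar-aware boundaries instead).

```python
from typing import List, Tuple

# Hypothetical fixed partition width: 90 days in milliseconds (an assumption,
# standing in for "3 months converted to milliseconds").
PARTITION_WIDTH_MS = 90 * 24 * 60 * 60 * 1000


def partition_bounds(ts_ms: int,
                     width_ms: int = PARTITION_WIDTH_MS,
                     origin_ms: int = 0) -> Tuple[int, int]:
    """Return the [lower, upper) bounds of the fixed-width range containing ts_ms."""
    # Floor division keeps the result correct even when ts_ms < origin_ms.
    index = (ts_ms - origin_ms) // width_ms
    lower = origin_ms + index * width_ms
    return lower, lower + width_ms


def missing_partitions(ts_ms: int,
                       last_upper_ms: int,
                       width_ms: int = PARTITION_WIDTH_MS) -> List[Tuple[int, int]]:
    """List the [lower, upper) ranges to append after last_upper_ms so ts_ms is covered.

    Returns an empty list when ts_ms is already below the last upper bound.
    """
    ranges = []
    lower = last_upper_ms
    while lower <= ts_ms:
        ranges.append((lower, lower + width_ms))
        lower += width_ms
    return ranges
```

For example, with the last upper bound from the issue (1506844800000), `missing_partitions(1506844800000, 1506844800000)` yields a single new range starting at that bound, while a timestamp one millisecond earlier needs no new partition. A writer-side workaround could call something like this before each batch and issue the corresponding add-partition requests, but that placement is a design choice the issue leaves open.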