On Thu, May 12, 2016 at 11:39 AM, Sand Stone <sand.m.st...@gmail.com> wrote:

> I don't know how Kudu load balances the data across the tablet servers.
>

Individual tablets are replicated and balanced across all available tablet
servers; for more on that, see
http://getkudu.io/docs/schema_design.html#data-distribution.
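
As a rough illustration (a toy sketch, not Kudu's actual placement
algorithm), balancing amounts to spreading each tablet's replicas over
distinct tablet servers:

```python
# Hypothetical sketch, not Kudu code: spread each tablet's replicas
# across distinct tablet servers in round-robin fashion.
from itertools import cycle

def place_replicas(num_tablets, servers, replication=3):
    """Map tablet id -> list of distinct servers holding its replicas."""
    ring = cycle(range(len(servers)))
    placement = {}
    for t in range(num_tablets):
        start = next(ring)
        placement[t] = [servers[(start + i) % len(servers)]
                        for i in range(replication)]
    return placement

placement = place_replicas(4, ["ts1", "ts2", "ts3", "ts4", "ts5"])
```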



> For example, do I need to pre-calculate every day, a list of 5 minutes
> apart timestamps at table creation? [assume I have to create a new table
> every day].
>

If you wish to range partition on the time column, then yes, currently you
must specify the splits upfront during table creation (but this will change
with the non-covering range partitions work).


>
> My hope, with the additional 5-min column used as the range partition
> column, is that I could spread the data evenly across the tablet
> servers.
>

I don't think this is meaningfully different than range partitioning on the
full time column with splits every 5 minutes.
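
Concretely (a sketch, assuming a table covering a single day), the split
rows for 5-minute range partitioning on the time column could be
pre-computed like this:

```python
# Hypothetical sketch: pre-compute the interior split timestamps for one
# day of 5-minute range partitions (288 buckets need 287 interior splits).
from datetime import datetime, timedelta

def five_minute_splits(day_start):
    """Interior split points for one day at 5-minute granularity."""
    step = timedelta(minutes=5)
    return [day_start + i * step for i in range(1, (24 * 60) // 5)]

splits = five_minute_splits(datetime(2016, 5, 12))
```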


> Also, since 5-min interval data are always colocated together, the read
> query could be efficient too.
>

Data colocation is a function of the partitioning and indexing.  As I
mentioned before, if the timestamp is part of your primary key, then you
can guarantee that scans specifying a time range are efficient. Overall,
it sounds like you are attempting to get fast scans by creating many
fine-grained partitions, as you might with Parquet.  This won't be an
efficient strategy in Kudu, since each tablet server should host only on
the order of 10-20 tablets.  Instead, take advantage of the indexing
that primary keys provide.
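
To illustrate the point (an in-memory stand-in, not Kudu internals): with
rows sorted by a compound primary key, a time-range scan for one metric is
a cheap seek into the key range, no fine-grained partitioning required:

```python
# Toy stand-in for a tablet sorted by PK (metric, time): a range scan
# with an equality predicate on 'metric' seeks directly to the matching
# key range instead of filtering every row.
import bisect

rows = sorted([("cpu", t) for t in range(0, 100, 10)] +
              [("mem", t) for t in range(0, 100, 10)])

def scan(metric, t_lo, t_hi):
    """Return rows with the given metric and t_lo <= time < t_hi."""
    lo = bisect.bisect_left(rows, (metric, t_lo))
    hi = bisect.bisect_left(rows, (metric, t_hi))
    return rows[lo:hi]
```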

- Dan


> On Thu, May 12, 2016 at 11:13 AM, Dan Burkert <d...@cloudera.com> wrote:
>
>> Forgot to add the PK specification to the CREATE TABLE; it should have
>> read as follows:
>>
>> CREATE TABLE metrics (metric STRING, time TIMESTAMP, value DOUBLE)
>> PRIMARY KEY (metric, time);
>>
>> - Dan
>>
>>
>> On Thu, May 12, 2016 at 11:12 AM, Dan Burkert <d...@cloudera.com> wrote:
>>
>>>
>>> On Thu, May 12, 2016 at 11:05 AM, Sand Stone <sand.m.st...@gmail.com>
>>> wrote:
>>>
>>>> > Is the requirement to pre-aggregate by time window?
>>>> No, I am thinking of creating a column, say "minute". It's basically
>>>> the minute field of the timestamp column (even rounded to a 5-min
>>>> bucket depending on the needs), a computed column filled in on data
>>>> ingestion. My goal is for this field to help with data filtering at
>>>> read/query time, say selecting a certain projection at minutes 10-15,
>>>> to speed up the read queries.
>>>>
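
(For what it's worth, the computed column described above is a one-liner
at ingestion time; a hedged sketch, with the "minute" name taken from the
poster's description:)

```python
# Sketch of the computed "minute" column described above: at ingestion,
# round the timestamp's minute field down to its 5-minute bucket.
from datetime import datetime

def five_min_bucket(ts):
    """Return the minute field rounded down to a 5-minute boundary."""
    return ts.minute - ts.minute % 5
```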
>>>
>>> In many cases, Kudu can do this for you without having to add special
>>> columns.  The requirements are that the timestamp is part of the primary
>>> key, and that any columns that come before the timestamp in the primary
>>> key (if it's a compound PK) have equality predicates.  So for instance,
>>> if you create a table such as:
>>>
>>> CREATE TABLE metrics (metric STRING, time TIMESTAMP, value DOUBLE);
>>>
>>> then a query such as
>>>
>>> SELECT time, value FROM metrics WHERE metric = 'my-metric' AND time >
>>> '2016-05-01T00:00' AND time < '2016-05-01T00:05'
>>>
>>> will read only the data for that 5-minute time window from disk.
>>> Without the equality predicate on the 'metric' column, it would do a
>>> much bigger scan-and-filter operation.  If you
>>> want more background on how this is achieved, check out the partition
>>> pruning design doc:
>>> https://github.com/apache/incubator-kudu/blob/master/docs/design-docs/scan-optimization-partition-pruning.md
>>> .
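
(A hedged sketch of the pruning idea, using a deterministic toy hash
rather than Kudu's: an equality predicate on the hash-partitioned column
prunes the scan to a single bucket, while its absence forces a scan of
every bucket:)

```python
# Toy model of hash-partition pruning; bucket() is a stand-in hash,
# not Kudu's partitioning function.
NUM_BUCKETS = 4

def bucket(metric):
    """Deterministic toy hash of a metric name into a bucket."""
    return sum(map(ord, metric)) % NUM_BUCKETS

def buckets_to_scan(metric_predicate=None):
    if metric_predicate is None:
        return list(range(NUM_BUCKETS))   # no predicate: scan everything
    return [bucket(metric_predicate)]     # pruned to a single bucket
```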
>>>
>>> - Dan
>>>
>>>
>>>
>>>> Thanks for the info; I will follow them.
>>>>
>>>> On Thu, May 12, 2016 at 10:50 AM, Dan Burkert <d...@cloudera.com> wrote:
>>>>
>>>>> Hey Sand,
>>>>>
>>>>> Sorry for the delayed response.  I'm not quite following your use
>>>>> case.  Is the requirement to pre-aggregate by time window? I don't think
>>>>> Kudu can help you directly with that (nothing built in), but you could
>>>>> always create a separate table to store the pre-aggregated values.  As far
>>>>> as applying functions to do row splits, that is an interesting idea, but I
>>>>> think once Kudu has support for range bounds (the non-covering range
>>>>> partition design doc linked above), you can simply create the bounds where
>>>>> the function would have put them.  For example, if you want a
>>>>> partition for every five minutes, you can create the bounds
>>>>> accordingly.
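
(The bounds in that example could be generated mechanically; a sketch of
the idea, not the eventual API:)

```python
# Hypothetical sketch: generate [start, end) bound pairs, one per
# 5-minute partition, for a single hour.
from datetime import datetime, timedelta

def five_minute_bounds(hour_start):
    """[start, end) pairs covering one hour at 5-minute granularity."""
    step = timedelta(minutes=5)
    return [(hour_start + i * step, hour_start + (i + 1) * step)
            for i in range(12)]

bounds = five_minute_bounds(datetime(2016, 5, 12, 11, 0))
```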
>>>>>
>>>>> Earlier this week I gave a talk on timeseries in Kudu, I've included
>>>>> some slides that may be interesting to you.  Additionally, you may want to
>>>>> check out https://github.com/danburkert/kudu-ts, a very young (not
>>>>> yet feature-complete) metrics layer on top of Kudu; it may give you
>>>>> some ideas.
>>>>>
>>>>> - Dan
>>>>>
>>>>> On Sat, May 7, 2016 at 1:28 PM, Sand Stone <sand.m.st...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Thanks for sharing, Dan. The diagrams explained clearly how the
>>>>>> current system works.
>>>>>>
>>>>>> As for things in my mind: take the schema of <host,metric,time,...>;
>>>>>> say I am interested in data for the past 5 mins, 10 mins, etc., or
>>>>>> aggregates at 5-min intervals for the past 3 days, 7 days, ...  It
>>>>>> looks like I need to introduce a special 5-min bar column and use
>>>>>> that column to range partition, spreading data across the tablet
>>>>>> servers so that I can leverage parallel filtering.
>>>>>>
>>>>>> The cost of this extra column (INT8) is not ideal but not too bad
>>>>>> either (storage-wise, compression should do wonders).  So I am
>>>>>> thinking whether it would be better to accept "functions" as row
>>>>>> splits instead of only constants.  Of course, if the business
>>>>>> requires dropping down to a 1-min bar, the data has to be re-sharded
>>>>>> again, so a more cost-effective way of doing this on a production
>>>>>> cluster would be good.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Sat, May 7, 2016 at 8:50 AM, Dan Burkert <d...@cloudera.com> wrote:
>>>>>>
>>>>>>> Hi Sand,
>>>>>>>
>>>>>>> I've been working on some diagrams to help explain some of the more
>>>>>>> advanced partitioning types; they're attached.  Still pretty rough
>>>>>>> at this point, but the goal is to clean them up and move them into
>>>>>>> the Kudu documentation proper.  I'm interested to hear what kind of
>>>>>>> time series you are considering Kudu for.  I'm tasked with improving
>>>>>>> Kudu for time series; you can follow progress here
>>>>>>> <https://issues.apache.org/jira/browse/KUDU-1306>. If you have any
>>>>>>> additional ideas I'd love to hear them.  You may also be interested
>>>>>>> in a small project that J-D and I have been working on over the past
>>>>>>> week to build an OpenTSDB-style store on top of Kudu; you can find
>>>>>>> it here <https://github.com/danburkert/kudu-ts>.  It is still quite
>>>>>>> feature limited at this point.
>>>>>>>
>>>>>>> - Dan
>>>>>>>
>>>>>>> On Fri, May 6, 2016 at 4:51 PM, Sand Stone <sand.m.st...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Thanks. Will read.
>>>>>>>>
>>>>>>>> Given that I am researching time series data, row locality is
>>>>>>>> crucial :-)
>>>>>>>>
>>>>>>>> On Fri, May 6, 2016 at 3:57 PM, Jean-Daniel Cryans <
>>>>>>>> jdcry...@apache.org> wrote:
>>>>>>>>
>>>>>>>>> We do have non-covering range partitions coming in the next few
>>>>>>>>> months, here's the design (in review):
>>>>>>>>> http://gerrit.cloudera.org:8080/#/c/2772/9/docs/design-docs/non-covering-range-partitions.md
>>>>>>>>>
>>>>>>>>> The "Background & Motivation" section should give you a good idea
>>>>>>>>> of why I'm mentioning this.
>>>>>>>>>
>>>>>>>>> Meanwhile, if you don't need row locality, using hash partitioning
>>>>>>>>> could be good enough.
>>>>>>>>>
>>>>>>>>> J-D
>>>>>>>>>
>>>>>>>>> On Fri, May 6, 2016 at 3:53 PM, Sand Stone <sand.m.st...@gmail.com
>>>>>>>>> > wrote:
>>>>>>>>>
>>>>>>>>>> Makes sense.
>>>>>>>>>>
>>>>>>>>>> Yeah, it would be cool if users could specify/control the split
>>>>>>>>>> rows after the table is created. For now, I have to "think
>>>>>>>>>> ahead" to pre-create the range buckets.
>>>>>>>>>>
>>>>>>>>>> On Fri, May 6, 2016 at 3:49 PM, Jean-Daniel Cryans <
>>>>>>>>>> jdcry...@apache.org> wrote:
>>>>>>>>>>
>>>>>>>>>>> You will only get 1 tablet and no data distribution, which is
>>>>>>>>>>> bad.
>>>>>>>>>>>
>>>>>>>>>>> That's also how HBase works, but HBase splits regions as you
>>>>>>>>>>> insert data, so you eventually get some data distribution even
>>>>>>>>>>> if it doesn't start in an ideal situation. Tablet splitting
>>>>>>>>>>> will come later for Kudu.
>>>>>>>>>>>
>>>>>>>>>>> J-D
>>>>>>>>>>>
>>>>>>>>>>> On Fri, May 6, 2016 at 3:42 PM, Sand Stone <
>>>>>>>>>>> sand.m.st...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> One more question: how does range partitioning work if I
>>>>>>>>>>>> don't specify the split rows?
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, May 6, 2016 at 3:37 PM, Sand Stone <
>>>>>>>>>>>> sand.m.st...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks, Misty. The "advanced" impala example helped.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I was just reading the Java API (CreateTableOptions.java);
>>>>>>>>>>>>> it's unclear how the range partition column names are
>>>>>>>>>>>>> associated with the partial-row parameters in the addSplitRow
>>>>>>>>>>>>> API.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, May 6, 2016 at 3:08 PM, Misty Stanley-Jones <
>>>>>>>>>>>>> mstanleyjo...@cloudera.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Sand,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Please have a look at
>>>>>>>>>>>>>> http://getkudu.io/docs/kudu_impala_integration.html#partitioning_tables
>>>>>>>>>>>>>> and see if it is helpful to you.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Misty
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, May 6, 2016 at 2:00 PM, Sand Stone <
>>>>>>>>>>>>>> sand.m.st...@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi, I am new to Kudu. I wonder how split rows work. I know
>>>>>>>>>>>>>>> from some docs that they are currently specified at table
>>>>>>>>>>>>>>> creation. I am researching how to partition (hash+range)
>>>>>>>>>>>>>>> some time series test data.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Is there an example, or notes somewhere I could read up on?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks much.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
