Is that 4TB per tablet server, regardless of how many tablets it has? If I have 128GB of data per day, then each tablet server hits the recommended limit after about a month. To store 10 years of data, I would need 120 tablet servers to avoid going over the limit. Is that the best solution or is there another alternative?
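The arithmetic behind that estimate can be sketched roughly as follows. The 4TB limit and the 128GB/day rate are the figures from the question. The function names, the binary-unit convention (1TB = 1024GB), and the `replicas` parameter are assumptions made for illustration and are not part of any Kudu API.

```cpp
#include <cstdint>

// Back-of-the-envelope capacity estimate. All names here are hypothetical
// helpers for illustration; none of them come from the Kudu client API.

// Assumption: binary units, 1 TB = 1024 GB.
constexpr std::int64_t kGbPerTb = 1024;

// Whole days until one tablet server reaches its limit at a given ingest rate.
constexpr std::int64_t days_until_full(std::int64_t limit_tb,
                                       std::int64_t gb_per_day) {
  return (limit_tb * kGbPerTb) / gb_per_day;
}

// Tablet servers needed to hold `days` of data, rounded up.
// Assumption: every replica counts fully against the per-server limit.
constexpr std::int64_t servers_needed(std::int64_t days,
                                      std::int64_t gb_per_day,
                                      std::int64_t limit_tb,
                                      std::int64_t replicas) {
  const std::int64_t total_gb = days * gb_per_day * replicas;
  const std::int64_t limit_gb = limit_tb * kGbPerTb;
  return (total_gb + limit_gb - 1) / limit_gb;  // ceiling division
}

// 4 TB at 128 GB/day fills in 32 days, i.e. about a month.
static_assert(days_until_full(4, 128) == 32, "4 TB / 128 GB per day");
```

Under these assumptions, `servers_needed(3650, 128, 4, 1)` comes to 115 for ten years of unreplicated data, in line with the rough figure of 120. Passing a replication factor of 3 instead raises it to 343.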
How many cores are recommended per tablet server? If I typically only scan one day of data at a time, could a single core service multiple tablet servers?

On Fri, Feb 24, 2017 at 11:22 PM, Paul Brannan <[email protected]> wrote:

> The test doesn't exactly reproduce what I did in my sample program.
>
> I'm able to successfully drop the unbounded partition in both cases
> (calling set_range_partition_columns only vs calling
> set_range_partition_columns+add_hash_partitions). However, if I omit the
> call to DropRangePartition, then AddRangePartition succeeds in the first
> case and fails in the second case. I expect it to succeed in both cases or
> fail in both cases.
>
> I've attached a simple program which demonstrates.
>
>
> On Fri, Feb 24, 2017 at 7:09 PM, Dan Burkert <[email protected]>
> wrote:
>
>> Hi Paul,
>>
>> I can't reproduce the behavior you are describing, I always get a single
>> unbounded range partition when creating the table without specifying range
>> bounds or splits (regardless of hash partitioning). I searched and couldn't
>> find a unit test for this behavior, so I wrote one - you might compare your
>> code against my test. https://gerrit.cloudera.org/#/c/6153/
>>
>> Thanks,
>> Dan
>>
>> On Fri, Feb 24, 2017 at 2:41 PM, Paul Brannan <
>> [email protected]> wrote:
>>
>>> I can verify that dropping the unbounded range partition allows me to
>>> later add bounded partitions.
>>>
>>> If I only have range partitioning (by commenting out the call to
>>> add_hash_partitions), adding a bounded partition succeeds, regardless of
>>> whether I first drop the unbounded partition. This seems surprising; why
>>> the difference?
>>>
>>> On Fri, Feb 24, 2017 at 4:20 PM, Dan Burkert <[email protected]>
>>> wrote:
>>>
>>>> Hi Paul,
>>>>
>>>> I think the issue you are running into is that if you don't add a range
>>>> partition explicitly during table creation (by calling add_range_partition
>>>> or inserting a split with add_range_partition_split), Kudu will default to
>>>> creating 1 unbounded range partition. So your two options are to add the
>>>> range partition during table creation time, or if you only know that
>>>> partition you want at a later time, you can drop the existing partition
>>>> (alterer->DropRangePartition with two empty rows), then add the range
>>>> partition. Note that dropping the range partition will effectively
>>>> truncate the table. This can be done with the same alterer in a single
>>>> transaction. If you want to see a bunch of examples, you can check out
>>>> this unit test:
>>>> https://github.com/apache/kudu/blob/master/src/kudu/integration-tests/alter_table-test.cc#L1106.
>>>>
>>>> - Dan
>>>>
>>>> On Fri, Feb 24, 2017 at 10:53 AM, Paul Brannan <
>>>> [email protected]> wrote:
>>>>
>>>>> I'm trying to create a table with one-column range-partitioned and
>>>>> another column hash-partitioned. Documentation for add_hash_partitions
>>>>> and set_range_partition_columns suggest this should be possible ("Tables
>>>>> must be created with either range, hash, or range and hash partitioning").
>>>>>
>>>>> I have a schema with three INT64 columns ("time", "key", and
>>>>> "value").
>>>>> When I create the table, I set up the partitioning:
>>>>>
>>>>> (*table_creator)
>>>>>     .table_name("test_table")
>>>>>     .schema(&schema)
>>>>>     .add_hash_partitions({"key"}, 2)
>>>>>     .set_range_partition_columns({"time"})
>>>>>     .num_replicas(1)
>>>>>     .Create()
>>>>>
>>>>> I later try to add a partition:
>>>>>
>>>>> auto timesplit(KuduSchema & schema, std::int64_t t) {
>>>>>   auto split = schema.NewRow();
>>>>>   check_ok(split->SetInt64("time", t));
>>>>>   return split;
>>>>> }
>>>>>
>>>>> alterer->AddRangePartition(
>>>>>     timesplit(schema, date_start),
>>>>>     timesplit(schema, next_date_start));
>>>>>
>>>>> check_ok(alterer->Alter());
>>>>>
>>>>> But I get an error "Invalid argument: New range partition conflicts
>>>>> with existing range partition".
>>>>>
>>>>> How are hash and range partitioning intended to be mixed?
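
Dan's explanation of the conflict error in the thread above can be illustrated with a small self-contained model. This is not Kudu code: the `Range` struct, `overlaps`, and `try_add` are hypothetical names, and using `std::optional` for an unbounded end is an assumption made for the sketch. It only shows why a bounded range conflicts with an existing unbounded one, and why dropping the unbounded range first removes the conflict. It does not try to explain the difference observed with and without hash partitioning.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical model of a range partition [lower, upper).
// An empty optional means "unbounded" on that side.
struct Range {
  std::optional<std::int64_t> lower;  // inclusive
  std::optional<std::int64_t> upper;  // exclusive
};

// Two half-open ranges overlap when each one starts before the other ends.
inline bool overlaps(const Range& a, const Range& b) {
  const bool a_starts_before_b_ends =
      !a.lower || !b.upper || *a.lower < *b.upper;
  const bool b_starts_before_a_ends =
      !b.lower || !a.upper || *b.lower < *a.upper;
  return a_starts_before_b_ends && b_starts_before_a_ends;
}

// Adds `r` only if it conflicts with no existing partition.
inline bool try_add(std::vector<Range>& partitions, const Range& r) {
  for (const Range& p : partitions) {
    if (overlaps(p, r)) return false;  // would conflict with an existing range
  }
  partitions.push_back(r);
  return true;
}
```

Starting from a list holding one fully unbounded `Range`, `try_add` with the range [10, 20) is rejected. After clearing the list (the analogue in this model of DropRangePartition with two empty rows), the same call succeeds.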
