The test doesn't exactly reproduce what I did in my sample program.

I'm able to successfully drop the unbounded partition in both cases
(calling set_range_partition_columns only vs calling
set_range_partition_columns+add_hash_partitions).  However, if I omit the
call to DropRangePartition, then AddRangePartition succeeds in the first
case and fails in the second case.  I expect it to succeed in both cases or
fail in both cases.

I've attached a simple program that demonstrates this.
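
For reference, the rule Dan describes further down in this thread (a table created without explicit range bounds gets one unbounded range partition, and a new range partition must not overlap an existing one) can be modeled with a toy sketch. This is not Kudu code and makes no claim about how Kudu implements the check: `Range`, `RangeSet`, `bounded`, `unbounded`, `overlaps`, and `same` are hypothetical names, and ranges are assumed to be half-open on a single INT64 column.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Half-open range [lower, upper). A false has_* flag means that side is
// unbounded. All names here are hypothetical, not part of the Kudu client.
struct Range {
  bool has_lower;
  std::int64_t lower;
  bool has_upper;
  std::int64_t upper;
};

Range unbounded() { return Range{false, 0, false, 0}; }

Range bounded(std::int64_t lower, std::int64_t upper) {
  assert(lower < upper);  // an empty or inverted range is a caller error
  return Range{true, lower, true, upper};
}

// Two ranges are disjoint only if one ends at or before the start of the other.
bool overlaps(const Range& a, const Range& b) {
  const bool a_before_b = a.has_upper && b.has_lower && a.upper <= b.lower;
  const bool b_before_a = b.has_upper && a.has_lower && b.upper <= a.lower;
  return !(a_before_b || b_before_a);
}

// Exact match on bounds; values are compared only on bounded sides.
bool same(const Range& a, const Range& b) {
  return a.has_lower == b.has_lower && a.has_upper == b.has_upper &&
         (!a.has_lower || a.lower == b.lower) &&
         (!a.has_upper || a.upper == b.upper);
}

class RangeSet {
 public:
  // Starts with one unbounded range, mirroring the default described in the thread.
  RangeSet() : ranges_(1, unbounded()) {}

  // Returns false (and changes nothing) if the new range overlaps an existing one.
  bool add(const Range& r) {
    for (std::size_t i = 0; i < ranges_.size(); ++i) {
      if (overlaps(ranges_[i], r)) return false;
    }
    ranges_.push_back(r);
    return true;
  }

  // Removes the range with exactly these bounds; returns false if none matches.
  bool drop(const Range& r) {
    for (std::size_t i = 0; i < ranges_.size(); ++i) {
      if (same(ranges_[i], r)) {
        ranges_.erase(ranges_.begin() + i);
        return true;
      }
    }
    return false;
  }

  std::size_t size() const { return ranges_.size(); }

 private:
  std::vector<Range> ranges_;
};
```

In this model, adding `bounded(20170224, 20170225)` to a fresh `RangeSet` is rejected, and it is accepted once `unbounded()` has been dropped. By that rule the add should be rejected with or without hash partitioning, which is why the difference between my two cases surprises me.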


On Fri, Feb 24, 2017 at 7:09 PM, Dan Burkert <[email protected]> wrote:

> Hi Paul,
>
> I can't reproduce the behavior you are describing; I always get a single
> unbounded range partition when creating the table without specifying range
> bounds or splits (regardless of hash partitioning). I searched and couldn't
> find a unit test for this behavior, so I wrote one - you might compare your
> code against my test. https://gerrit.cloudera.org/#/c/6153/
>
> Thanks,
> Dan
>
> On Fri, Feb 24, 2017 at 2:41 PM, Paul Brannan <[email protected]
> > wrote:
>
>> I can verify that dropping the unbounded range partition allows me to
>> later add bounded partitions.
>>
>> If I only have range partitioning (by commenting out the call to
>> add_hash_partitions), adding a bounded partition succeeds, regardless of
>> whether I first drop the unbounded partition.  This seems surprising; why
>> the difference?
>>
>> On Fri, Feb 24, 2017 at 4:20 PM, Dan Burkert <[email protected]>
>> wrote:
>>
>>> Hi Paul,
>>>
>>> I think the issue you are running into is that if you don't add a range
>>> partition explicitly during table creation (by calling add_range_partition
>>> or inserting a split with add_range_partition_split), Kudu will default to
>>> creating 1 unbounded range partition.  So your two options are to add the
>>> range partition at table creation time or, if you only know which
>>> partition you want at a later time, you can drop the existing partition
>>> (alterer->DropRangePartition with two empty rows), then add the range
>>> partition.  Note that dropping the range partition will effectively
>>> truncate the table.  This can be done with the same alterer in a single
>>> transaction.  If you want to see a bunch of examples, you can check out
>>> this unit test:
>>> https://github.com/apache/kudu/blob/master/src/kudu/integration-tests/alter_table-test.cc#L1106.
>>>
>>> - Dan
>>>
>>> On Fri, Feb 24, 2017 at 10:53 AM, Paul Brannan <
>>> [email protected]> wrote:
>>>
>>>> I'm trying to create a table with one column range-partitioned and
>>>> another column hash-partitioned.  The documentation for add_hash_partitions
>>>> and set_range_partition_columns suggests this should be possible ("Tables
>>>> must be created with either range, hash, or range and hash partitioning").
>>>>
>>>> I have a schema with three INT64 columns ("time", "key", and "value").
>>>> When I create the table, I set up the partitioning:
>>>>
>>>> (*table_creator)
>>>>   .table_name("test_table")
>>>>   .schema(&schema)
>>>>   .add_hash_partitions({"key"}, 2)
>>>>   .set_range_partition_columns({"time"})
>>>>   .num_replicas(1)
>>>>   .Create()
>>>>
>>>> I later try to add a partition:
>>>>
>>>> auto timesplit(KuduSchema & schema, std::int64_t t) {
>>>>   auto split = schema.NewRow();
>>>>   check_ok(split->SetInt64("time", t));
>>>>   return split;
>>>> }
>>>>
>>>> alterer->AddRangePartition(
>>>>   timesplit(schema, date_start),
>>>>   timesplit(schema, next_date_start));
>>>>
>>>> check_ok(alterer->Alter());
>>>>
>>>> But I get an error "Invalid argument: New range partition conflicts
>>>> with existing range partition".
>>>>
>>>> How are hash and range partitioning intended to be mixed?
>>>>
>>>>
>>>
>>
>
#include <kudu/client/client.h>

#include <iostream>
#include <stdexcept>
#include <memory>

namespace
{

using namespace kudu;
using namespace kudu::client;

void check_ok(kudu::Status status)
{
  if (!status.ok())
  {
    throw std::runtime_error(status.ToString());
  }
}

template<typename T>
std::unique_ptr<T> own(T * raw_ptr)
{
  std::unique_ptr<T> p(raw_ptr);
  return p;
}

}

int main()
{
  sp::shared_ptr<KuduClient> client;
  check_ok(KuduClientBuilder()
    .add_master_server_addr("localhost")
    .Build(&client));

  KuduSchema schema;
  KuduSchemaBuilder b;
  b.AddColumn("date")->Type(KuduColumnSchema::INT32)->NotNull();
  b.AddColumn("key")->Type(KuduColumnSchema::INT32)->NotNull();
  b.AddColumn("value")->Type(KuduColumnSchema::INT32)->NotNull();
  b.SetPrimaryKey({"date", "key"});
  check_ok(b.Build(&schema));

  // Range partition columns only
  {
    check_ok(client->DeleteTable("test_table"));

    auto table_creator = own(client->NewTableCreator());
    check_ok((*table_creator)
        .table_name("test_table")
        .schema(&schema)
        .set_range_partition_columns({"date"})
        .num_replicas(1)
        .Create());

    auto alterer = own(client->NewTableAlterer("test_table"));
    auto lower = own(schema.NewRow()); check_ok(lower->SetInt32("date", 20170224));
    auto upper = own(schema.NewRow()); check_ok(upper->SetInt32("date", 20170225));
    alterer->AddRangePartition(lower.release(), upper.release());

    // *** This call succeeds
    check_ok(alterer->Alter());
  }

  // Both range and hash partition columns
  {
    check_ok(client->DeleteTable("test_table"));

    auto table_creator = own(client->NewTableCreator());
    check_ok((*table_creator)
        .table_name("test_table")
        .schema(&schema)
        .set_range_partition_columns({"date"})
        .add_hash_partitions({"key"}, 2)
        .num_replicas(1)
        .Create());

    auto alterer = own(client->NewTableAlterer("test_table"));

    auto lower = own(schema.NewRow()); check_ok(lower->SetInt32("date", 20170224));
    auto upper = own(schema.NewRow()); check_ok(upper->SetInt32("date", 20170225));
    alterer->AddRangePartition(lower.release(), upper.release());

    // *** But this call does not succeed?????
    check_ok(alterer->Alter());
  }
}
