Re: Why does partition keys have to be in the primary keys?

Alexey Serbin Wed, 06 May 2020 14:13:20 -0700

Hi,

The restriction on the partitioning key to be composed of primary key
columns significantly simplifies the design and implementation.

However, I'm not sure I understand why the partitioning rules come into
play here.  To me it looks like the main question is about the schema of
the table, i.e. what the primary key should be.  If different pipelines use
different values for the 'day' field, but one result record is expected,
does that imply the pipelines need to update already existing records?  If
so, then maybe use UPSERT instead of INSERT for those pipelines?
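
To illustrate the point about the key, here is a toy in-memory sketch in
Python.  It is not the Kudu client API: ToyTable, its upsert() method, and
the column names 'id', 'a' and 'b' are made up for this example, and the
upsert semantics (insert if the key is new, otherwise overwrite only the
supplied columns) are an assumption of the model.  It only shows how the
choice of primary key decides whether two writes land in one row or two.

```python
# Toy in-memory model of a table keyed by its primary key.
# This is NOT the Kudu client API; ToyTable and its methods are hypothetical.

class ToyTable:
    def __init__(self, key_columns):
        self.key_columns = tuple(key_columns)
        self.rows = {}  # primary-key tuple -> row dict

    def upsert(self, row):
        # Assumed semantics: insert when the key is new, otherwise
        # overwrite only the columns supplied in this write.
        key = tuple(row[c] for c in self.key_columns)
        self.rows.setdefault(key, {}).update(row)


def write_both_pipelines(table):
    # Two pipelines write different columns for the same 'id',
    # each with its own value of 'day'.
    table.upsert({"id": 1, "day": "2020-05-05", "a": "from pipeline A"})
    table.upsert({"id": 1, "day": "2020-05-06", "b": "from pipeline B"})


with_day = ToyTable(["id", "day"])   # 'day' is part of the primary key
without_day = ToyTable(["id"])       # 'day' is a regular column
write_both_pipelines(with_day)
write_both_pipelines(without_day)

print(len(with_day.rows))     # 2: different 'day' means different keys
print(len(without_day.rows))  # 1: both writes merge into the same row
```

In this model, once 'day' is out of the key the second write simply
overwrites the 'day' value of the first, and 'day' can no longer serve as
a range partition column, so the key has to be settled before the
partitioning.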

I would start by working out what the primary key of the table has to be
to satisfy the requirements.  Once that's clear, I'd think about the
partitioning rules for the table.

Thanks,

Alexey

On Wed, May 6, 2020 at 4:54 AM Ray Liu (rayliu) <[email protected]> wrote:

> We have two pipelines writing to the same table, and that table is
> range-partitioned by the “day” field.
>
> Each pipeline fills some of the fields in the table with the same key.
>
> But the “day” field in these two pipelines may be different.
>
> Because range partition keys must be part of the primary key, there will
> be two records in the result table.
>
> What we want is one complete record.
>
> So my question is: why do partition keys have to be in the primary key?
>
> Is there any workaround for this?
>
