Sounds good.

On Wed, Jul 6, 2016 at 1:43 PM, Jim Apple <[email protected]> wrote:

> +1
>
> On Wed, Jul 6, 2016 at 1:20 PM, Marcel Kornacker <[email protected]>
> wrote:
> > Could we then do the following: normalize all exprs <val>="" and
> > <val>=null to <val> is null, and then send through
> > HdfsPartitionPruner?
> >
> > On Wed, Jul 6, 2016 at 11:41 AM, Alex Behm <[email protected]> wrote:
> >> I think the main issue is that we need to preserve compatibility.
> >> z = "" should select the same partition(s) regardless of whether it is
> >> used in a single-clause or multi-clause DDL statement.
> >> My understanding is that today z = "" is equivalent to z IS NULL, so the
> >> same should be true for the new multi-clause DDL.
> >>
> >>
> >>
> >>
> >> On Wed, Jul 6, 2016 at 11:33 AM, Jim Apple <[email protected]> wrote:
> >>
> >>> In your opinion, in multi-clause DDL statements like
> >>>
> >>> alter table p partition (j<2 or j>0, k like "%", z = '') set uncached;
> >>>
> >>> Should "z = ''" be a synonym for "z IS NULL" like it is in the
> >>> single-clause DDL?
> >>>
> >>> On Wed, Jul 6, 2016 at 11:28 AM, Marcel Kornacker <[email protected]> wrote:
> >>> > On Wed, Jul 6, 2016 at 10:20 AM, Jim Apple <[email protected]> wrote:
> >>> >> Let me try to explain what is going on here.
> >>> >>
> >>> >> Currently, if a user wants to specify a null partition for a DDL
> >>> >> operation, they write something like
> >>> >>
> >>> >> compute incremental stats incremental_null_part_key partition(p = NULL);
> >>> >
> >>> > We need to keep this working for the time being.
> >>> >
> >>> >>
> >>> >> For an empty string, they could write:
> >>> >>
> >>> >> alter table t_part drop partition (j=2, s='')
> >>> >>
> >>> >> This is unfortunate, as nothing "equals" NULL, and empty strings are
> >>> >> mapped to the NULL partition value.
> >>> >>
> >>> >> Amos has written a patch that allows DDL operations to work on more
> >>> >> than one partition at a time. These look like:
> >>> >>
> >>> >> alter table p partition (j<2 or j>0, k like "%") set uncached;
> >>> >>
> >>> >> Here, the clauses separated by commas are ANDed together to make one
> >>> >> clause. The question is whether these clauses, which now are clauses
> >>> >> and not just strangely-interpreted-equality, should keep the old
> >>> >> behavior or break existing queries.
> >>> >
> >>> > For these clauses we should use 'IS [NOT] NULL'.
> >>> >
> >>> >>
> >>> >> On Wed, Jul 6, 2016 at 6:44 AM, Amos Bird <[email protected]> wrote:
> >>> >>> This problem came from
> >>> >>> https://issues.cloudera.org/browse/IMPALA-1654 , CR at
> >>> >>> https://gerrit.cloudera.org/#/c/1563/ . This patch will make
> >>> >>> general predicates possible in most partition DDL operations.
> >>> >>> However, for NULL partitions, the old KV way no longer works.
> >>> >>> Broken cases are <string val>="" and <val>=null. This is due to
> >>> >>> the usage of HdfsPartitionPruner which is used for Query time
> >>> >>> partition pruning. Should we keep the old way of treating NULL
> >>> >>> partition as special cases?
> >>> >>>
> >>> >>> Amos
> >>>
>
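
The normalization proposed in the thread (rewrite <val>="" and <val>=null to
<val> IS NULL before partition pruning) can be sketched roughly as follows.
This is only an illustration in Python, not Impala's frontend code: Predicate,
normalize_null_predicate, and normalize_partition_spec are hypothetical names,
and the flat (column, op, value) predicate form is an assumption made for the
example.

```python
from typing import List, NamedTuple, Union

# Hypothetical literal type: a string, an int, or None standing in for SQL NULL.
Literal = Union[str, int, None]


class Predicate(NamedTuple):
    """Hypothetical flat predicate: <column> <op> <value>."""
    column: str
    op: str                 # e.g. "=", "<", "LIKE", "IS NULL"
    value: Literal = None   # unused when op is "IS NULL"


def normalize_null_predicate(pred: Predicate) -> Predicate:
    """Rewrite <val> = '' and <val> = NULL to <val> IS NULL.

    Every other predicate is returned unchanged.
    """
    if pred.op == "=" and (pred.value is None or pred.value == ""):
        return Predicate(pred.column, "IS NULL")
    return pred


def normalize_partition_spec(preds: List[Predicate]) -> List[Predicate]:
    """Normalize each comma-separated clause before it reaches the pruner."""
    return [normalize_null_predicate(p) for p in preds]


if __name__ == "__main__":
    spec = [Predicate("j", "=", 2), Predicate("s", "=", "")]
    for p in normalize_partition_spec(spec):
        print(p)
```

With a pass like this in front of the pruner, z = '' and z IS NULL would
select the same partition(s) in both single-clause and multi-clause DDL,
which is the compatibility requirement discussed in the thread.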
