Sounds good.
On Wed, Jul 6, 2016 at 1:43 PM, Jim Apple <[email protected]> wrote:

> +1
>
> On Wed, Jul 6, 2016 at 1:20 PM, Marcel Kornacker <[email protected]> wrote:
> > Could we then do the following: normalize all exprs <val>="" and
> > <val>=null to <val> is null, and then send through
> > HdfsPartitionPruner?
> >
> > On Wed, Jul 6, 2016 at 11:41 AM, Alex Behm <[email protected]> wrote:
> >> I think the main issue is that we need to preserve compatibility.
> >> z = "" should select the same partition(s) regardless of whether it is
> >> used in a single-clause or multi-clause DDL statement.
> >> My understanding is that today z = "" is equivalent to z IS NULL, so the
> >> same should be true for the new multi-clause DDL.
> >>
> >> On Wed, Jul 6, 2016 at 11:33 AM, Jim Apple <[email protected]> wrote:
> >>
> >>> In your opinion, in multi-clause DDL statements like
> >>>
> >>> alter table p partition (j<2 or j>0, k like "%", z = '') set uncached;
> >>>
> >>> Should "z = ''" be a synonym for "z IS NULL" like it is in the
> >>> single-clause DDL?
> >>>
> >>> On Wed, Jul 6, 2016 at 11:28 AM, Marcel Kornacker <[email protected]> wrote:
> >>> > On Wed, Jul 6, 2016 at 10:20 AM, Jim Apple <[email protected]> wrote:
> >>> >> Let me try to explain what is going on here.
> >>> >>
> >>> >> Currently, if a user wants to specify a null partition for a DDL
> >>> >> operation, they write something like
> >>> >>
> >>> >> compute incremental stats incremental_null_part_key partition(p = NULL);
> >>> >
> >>> > We need to keep this working for the time being.
> >>> >
> >>> >>
> >>> >> For an empty string, they could write:
> >>> >>
> >>> >> alter table t_part drop partition (j=2, s='')
> >>> >>
> >>> >> This is unfortunate, as nothing "equals" NULL, and empty strings are
> >>> >> mapped to the NULL partition value.
> >>> >>
> >>> >> Amos has written a patch that allows DDL operations to work on more
> >>> >> than one partition at a time. These look like:
> >>> >>
> >>> >> alter table p partition (j<2 or j>0, k like "%") set uncached;
> >>> >>
> >>> >> Here, the clauses separated by commas are ANDed together to make one
> >>> >> clause. The question is whether these clauses, which now are clauses
> >>> >> and not just strangely-interpreted-equality, should keep the old
> >>> >> behavior or break existing queries.
> >>> >
> >>> > For these clauses we should use 'IS [NOT] NULL'.
> >>> >
> >>> >>
> >>> >> On Wed, Jul 6, 2016 at 6:44 AM, Amos Bird <[email protected]> wrote:
> >>> >>> This problem came from https://issues.cloudera.org/browse/IMPALA-1654 ,
> >>> >>> CR at https://gerrit.cloudera.org/#/c/1563/ . This patch will make
> >>> >>> general predicates possible in most partition DDL operations. However,
> >>> >>> for NULL partitions, the old KV way no longer works. Broken cases are
> >>> >>> <string val>="" and <val>=null. This is due to the usage of
> >>> >>> HdfsPartitionPruner which is used for Query time partition pruning.
> >>> >>> Should we keep the old way of treating NULL partition as special cases?
> >>> >>>
> >>> >>> Amos
> >>> >
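
For readers following the thread, the normalization proposed above (rewriting <val>="" and <val>=null to <val> IS NULL before the clauses reach HdfsPartitionPruner) can be illustrated with a minimal sketch. This is not Impala code: the `Eq`, `IsNull`, and `Other` classes and the `normalize` helpers are hypothetical stand-ins invented for illustration, and the real frontend expression classes are not shown in this thread.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass(frozen=True)
class Eq:
    """Hypothetical stand-in for a partition clause `column = literal`."""
    column: str
    literal: Optional[str]  # None models the SQL literal NULL


@dataclass(frozen=True)
class IsNull:
    """Hypothetical stand-in for `column IS NULL`."""
    column: str


@dataclass(frozen=True)
class Other:
    """Any clause the rewrite leaves untouched, kept as raw text."""
    text: str


Pred = Union[Eq, IsNull, Other]


def normalize(pred: Pred) -> Pred:
    """Rewrite `col = ''` and `col = NULL` to `col IS NULL`."""
    if isinstance(pred, Eq) and pred.literal in (None, ""):
        return IsNull(pred.column)
    return pred


def normalize_all(preds: List[Pred]) -> List[Pred]:
    """Normalize every comma-separated clause of a partition spec."""
    return [normalize(p) for p in preds]


# Clauses of: alter table p partition (j<2 or j>0, k like "%", z = '') ...
clauses = [Other("j<2 or j>0"), Other('k like "%"'), Eq("z", "")]
rewritten = normalize_all(clauses)
```

Under these assumptions, `z = ''` and `p = NULL` both end up as IS NULL clauses, so single-clause and multi-clause statements would select the same partitions, while every other clause passes through unchanged.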
