Re: question on NULL representation in DB?

Qifan Chen Fri, 30 Oct 2015 11:19:40 -0700

Pretty much and in certain situations, IN list will be mapped to RANGE SPEC.


*Between*

This can be verified by looking what will happen during normalization and
preCodeGen stage.  The two methods to check is Between::transformNode()
and Between::preCodeGen().

void Between::transformNode(NormWA & normWARef,
                            ExprValueId & locationOfPointerToMe,
                            ExprGroupId & introduceSemiJoinHere,
                            const ValueIdSet & externalInputs)
{
  DBGSETDBG( "TRANSFORM_DEBUG" )
  DBGIF(
    unp = "";
    unparse(unp);
    cerr << (Int32)getOperatorType() << " "
         << (Int32)getValueId() << " "
         << (void *)this << " "
         << unp << endl;
  )

  // ---------------------------------------------------------------------
  // Transform the operands of the Between
  // ---------------------------------------------------------------------
  ItemExpr::transformNode(normWARef, locationOfPointerToMe,
                          introduceSemiJoinHere, externalInputs);

  ItemExpr *tfm = transformIntoTwoComparisons();  <===================

ItemExpr * Between::preCodeGen(Generator * generator)
{
  if (nodeIsPreCodeGenned())
    return getReplacementExpr();

  // transform "A BETWEEN B AND C" to "A >= B AND A <= C"
  ItemExpr * newExpr =
    generator->getExpGenerator()->createExprTree(
      "@A1 >= @A2 AND @A1 <= @A3", 0, 3, child(0), child(1), child(2));

  newExpr->bindNode(generator->getBindWA());
  setReplacementExpr(newExpr->preCodeGen(generator));
  markAsPreCodeGenned();
  return getReplacementExpr();
}


*IN list*

ItemExpr *convertINvaluesToOR(ItemExpr *lhs, ItemExpr *rhs) called
from processINlist().

Sounds like both, if in AND and OR form, can be mapped to the FilterList
filter.


Thanks --Qifan


On Fri, Oct 30, 2015 at 12:44 PM, Eric Owhadi <[email protected]> wrote:

> I have seen between and in be normalized to regular combination of
> <col><op><val> ANDed orORed.
> Isn't that always the case? I was expecting this behavior to happen , and
> then not worry about between and in?
>
> -----Original Message-----
> From: Qifan Chen [mailto:[email protected]]
> Sent: Friday, October 30, 2015 12:41 PM
> To: dev <[email protected]>
> Subject: Re: question on NULL representation in DB?
>
> We also need to consider other types of SQL predicates to be pushdown down,
> such as BETWEEN and IN list.
>
> Thanks --Qifan
>
> On Fri, Oct 30, 2015 at 12:19 PM, Eric Owhadi <[email protected]>
> wrote:
>
> > No, I am working on a solution to keep current 0xFF trick while not
> > requiring expression re-evaluation and pushing up null columns...
> > Eric
> >
> > -----Original Message-----
> > From: Suresh Subbiah [mailto:[email protected]]
> > Sent: Friday, October 30, 2015 6:39 AM
> > To: [email protected]
> > Subject: Re: question on NULL representation in DB?
> >
> > Hi,
> >
> > Summarising personal correspondence with Anoop on this question "
> > there are
> > 2 ways in which traf puts and detect null value in hbase:
> > either a missing value or a column
> >
> > value with null indicator prefix.
> >
> >
> >
> > During insert, we use the first method of not inserting that column.
> >
> > But during update, we use the second method of putting in null
> > indicator as the value of that column.
> >
> >
> >
> > We create rowid with null values or part of key with null values by
> > putting in null indicator and zeroing
> >
> > out remainder of the field. That way 2 null values will be compared
> > equal and null will sort high."
> >
> >
> > Are we thinking that if we implemented an update which sets a column
> > value to be NULL as a HBase Delete of that cell then predicate
> > pushdown need not check for null values again? It will be some work to
> > split out an Update as Put of NonNull values and then a Delete of null
> > values. Will need an expression to be evaluated at Update time. I
> > suppose we have a choice, pay the cost of expression eval during select
> or
> > during update.
> >
> >
> > Thanks
> >
> > Suresh
> >
> > On Thu, Oct 29, 2015 at 1:34 PM, Selva Govindarajan <
> > [email protected]> wrote:
> >
> > > By default, the null values are not inserted into hbase. If the
> > > column is nullable, the first additional byte determines if the
> > > column value is null or not. When a value is inserted into nullable
> > > column the first byte is always 0x00. When the column value is
> > > updated to null, the existing column value will be replaced with
> > > 0xFF in the first byte because it is not possible to switch to
> > > delete cell value in the midst of data execution.
> > >
> > > Selva
> > >
> > > -----Original Message-----
> > > From: Eric Owhadi [mailto:[email protected]]
> > > Sent: Thursday, October 29, 2015 11:06 AM
> > > To: [email protected]
> > > Subject: question on NULL representation in DB?
> > >
> > > Reading the code, I have a hard time understanding the various ways
> > > NULLs are represented in the DB for non-aligned format.
> > >
> > > I see comments in the code suggesting that nullable columns have the
> > > first value byte representing if the value is null, but I also see
> > > special cases all over the place that take care of null as being
> > > totally absent cells.
> > >
> > > The former method (adding a first byte indicating a null) having
> > > consequences on predicate push down -> need to re-do predicate
> > > evaluation at trafodion layer to deal with null semantic.
> > >
> > >
> > >
> > > But I am not sure why we have this special situation of coding null
> > > with a byte, instead of always dealing with nulls as being “absent”
> > > cell? I am sure there is a reason, but I just could not figure it
> > > out…
> > >
> > > Someone can help?
> > >
> > > Eric
> > >
> >
>
>
>
> --
> Regards, --Qifan
>



-- 
Regards, --Qifan

Re: question on NULL representation in DB?

Reply via email to