We also need to consider other types of SQL predicates for pushdown, such as BETWEEN and IN lists.

Thanks --Qifan

On Fri, Oct 30, 2015 at 12:19 PM, Eric Owhadi <[email protected]> wrote:

> No, I am working on a solution to keep current 0xFF trick while not
> requiring expression re-evaluation and pushing up null columns...
> Eric
>
> -----Original Message-----
> From: Suresh Subbiah [mailto:[email protected]]
> Sent: Friday, October 30, 2015 6:39 AM
> To: [email protected]
> Subject: Re: question on NULL representation in DB?
>
> Hi,
>
> Summarising personal correspondence with Anoop on this question " there
> are 2 ways in which traf puts and detect null value in hbase: either a
> missing value or a column value with null indicator prefix.
>
> During insert, we use the first method of not inserting that column.
>
> But during update, we use the second method of putting in null indicator
> as the value of that column.
>
> We create rowid with null values or part of key with null values by
> putting in null indicator and zeroing out remainder of the field. That
> way 2 null values will be compared equal and null will sort high."
>
> Are we thinking that if we implemented an update which sets a column
> value to be NULL as a HBase Delete of that cell then predicate pushdown
> need not check for null values again? It will be some work to split out
> an Update as Put of NonNull values and then a Delete of null values. Will
> need an expression to be evaluated at Update time. I suppose we have a
> choice, pay the cost of expression eval during select or during update.
>
> Thanks
> Suresh
>
> On Thu, Oct 29, 2015 at 1:34 PM, Selva Govindarajan <
> [email protected]> wrote:
>
> > By default, the null values are not inserted into hbase. If the column
> > is nullable, the first additional byte determines if the column value
> > is null or not. When a value is inserted into nullable column the
> > first byte is always 0x00. When the column value is updated to null,
> > the existing column value will be replaced with 0xFF in the first byte
> > because it is not possible to switch to delete cell value in the midst
> > of data execution.
> >
> > Selva
> >
> > -----Original Message-----
> > From: Eric Owhadi [mailto:[email protected]]
> > Sent: Thursday, October 29, 2015 11:06 AM
> > To: [email protected]
> > Subject: question on NULL representation in DB?
> >
> > Reading the code, I have a hard time understanding the various ways
> > NULLs are represented in the DB for non-aligned format.
> >
> > I see comments in the code suggesting that nullable columns have the
> > first value byte representing if the value is null, but I also see
> > special cases all over the place that take care of null as being
> > totally absent cells.
> >
> > The former method (adding a first byte indicating a null) having
> > consequences on predicate push down -> need to re-do predicate
> > evaluation at trafodion layer to deal with null semantic.
> >
> > But I am not sure why we have this special situation of coding null
> > with a byte, instead of always dealing with nulls as being “absent”
> > cell? I am sure there is a reason, but I just could not figure it out…
> >
> > Someone can help?
> >
> > Eric
> >
>

--
Regards, --Qifan
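
To make the two NULL representations discussed in this thread concrete, here is a rough sketch in Python. It is only an illustration of what is described above, not Trafodion code: the function names, the one-byte indicator width and the fixed-width key field are all assumptions made for the example.

```python
from typing import Optional

# Indicator bytes as described in the thread; a one-byte prefix is assumed.
NOT_NULL = b"\x00"
NULL = b"\xff"


def encode_insert(value: Optional[bytes]) -> Optional[bytes]:
    """Insert path (hypothetical helper): a NULL column is simply not written."""
    if value is None:
        return None  # no cell is put for this column
    return NOT_NULL + value


def encode_update(value: Optional[bytes]) -> bytes:
    """Update path (hypothetical helper): NULL is written as the 0xFF indicator."""
    if value is None:
        return NULL
    return NOT_NULL + value


def is_null(cell: Optional[bytes]) -> bool:
    """A reader has to treat both an absent cell and a 0xFF prefix as NULL."""
    return cell is None or cell[:1] == NULL


def encode_key_field(value: Optional[bytes], width: int) -> bytes:
    """Key field (hypothetical helper): indicator byte, then a fixed-width payload.

    A NULL key field is the indicator followed by zeroes, so two NULLs
    compare equal and NULL sorts after every non-NULL value.
    """
    if value is None:
        return NULL + b"\x00" * width
    if len(value) != width:
        raise ValueError("key field must be exactly `width` bytes")
    return NOT_NULL + value
```

Any predicate evaluated on the raw cell bytes would need the same check as `is_null`, which is why the current encoding forces re-evaluation above the HBase layer unless the filter itself understands both forms.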
