I have seen between and in be normalized to regular combination of <col><op><val> ANDed orORed. Isn't that always the case? I was expecting this behavior to happen , and then not worry about between and in?
-----Original Message----- From: Qifan Chen [mailto:[email protected]] Sent: Friday, October 30, 2015 12:41 PM To: dev <[email protected]> Subject: Re: question on NULL representation in DB? We also need to consider other types of SQL predicates to be pushdown down, such as BETWEEN and IN list. Thanks --Qifan On Fri, Oct 30, 2015 at 12:19 PM, Eric Owhadi <[email protected]> wrote: > No, I am working on a solution to keep current 0xFF trick while not > requiring expression re-evaluation and pushing up null columns... > Eric > > -----Original Message----- > From: Suresh Subbiah [mailto:[email protected]] > Sent: Friday, October 30, 2015 6:39 AM > To: [email protected] > Subject: Re: question on NULL representation in DB? > > Hi, > > Summarising personal correspondence with Anoop on this question " > there are > 2 ways in which traf puts and detect null value in hbase: > either a missing value or a column > > value with null indicator prefix. > > > > During insert, we use the first method of not inserting that column. > > But during update, we use the second method of putting in null > indicator as the value of that column. > > > > We create rowid with null values or part of key with null values by > putting in null indicator and zeroing > > out remainder of the field. That way 2 null values will be compared > equal and null will sort high." > > > Are we thinking that if we implemented an update which sets a column > value to be NULL as a HBase Delete of that cell then predicate > pushdown need not check for null values again? It will be some work to > split out an Update as Put of NonNull values and then a Delete of null > values. Will need an expression to be evaluated at Update time. I > suppose we have a choice, pay the cost of expression eval during select or > during update. > > > Thanks > > Suresh > > On Thu, Oct 29, 2015 at 1:34 PM, Selva Govindarajan < > [email protected]> wrote: > > > By default, the null values are not inserted into hbase. If the > > column is nullable, the first additional byte determines if the > > column value is null or not. When a value is inserted into nullable > > column the first byte is always 0x00. When the column value is > > updated to null, the existing column value will be replaced with > > 0xFF in the first byte because it is not possible to switch to > > delete cell value in the midst of data execution. > > > > Selva > > > > -----Original Message----- > > From: Eric Owhadi [mailto:[email protected]] > > Sent: Thursday, October 29, 2015 11:06 AM > > To: [email protected] > > Subject: question on NULL representation in DB? > > > > Reading the code, I have a hard time understanding the various ways > > NULLs are represented in the DB for non-aligned format. > > > > I see comments in the code suggesting that nullable columns have the > > first value byte representing if the value is null, but I also see > > special cases all over the place that take care of null as being > > totally absent cells. > > > > The former method (adding a first byte indicating a null) having > > consequences on predicate push down -> need to re-do predicate > > evaluation at trafodion layer to deal with null semantic. > > > > > > > > But I am not sure why we have this special situation of coding null > > with a byte, instead of always dealing with nulls as being “absent” > > cell? I am sure there is a reason, but I just could not figure it > > out… > > > > Someone can help? > > > > Eric > > > -- Regards, --Qifan
