No, I am working on a solution to keep the current 0xFF trick while not requiring expression re-evaluation and pushing up null columns...

Eric
-----Original Message-----
From: Suresh Subbiah [mailto:[email protected]]
Sent: Friday, October 30, 2015 6:39 AM
To: [email protected]
Subject: Re: question on NULL representation in DB?

Hi,

Summarising personal correspondence with Anoop on this question

"there are 2 ways in which traf puts and detect null value in hbase: either a missing value or a column value with null indicator prefix. During insert, we use the first method of not inserting that column. But during update, we use the second method of putting in null indicator as the value of that column. We create rowid with null values or part of key with null values by putting in null indicator and zeroing out remainder of the field. That way 2 null values will be compared equal and null will sort high."

Are we thinking that if we implemented an update which sets a column value to be NULL as a HBase Delete of that cell then predicate pushdown need not check for null values again?

It will be some work to split out an Update as Put of NonNull values and then a Delete of null values. Will need an expression to be evaluated at Update time. I suppose we have a choice, pay the cost of expression eval during select or during update.

Thanks
Suresh

On Thu, Oct 29, 2015 at 1:34 PM, Selva Govindarajan <[email protected]> wrote:

> By default, the null values are not inserted into hbase. If the column
> is nullable, the first additional byte determines if the column value
> is null or not. When a value is inserted into nullable column the
> first byte is always 0x00. When the column value is updated to null,
> the existing column value will be replaced with 0xFF in the first byte
> because it is not possible to switch to delete cell value in the midst of
> data execution.
>
> Selva
>
> -----Original Message-----
> From: Eric Owhadi [mailto:[email protected]]
> Sent: Thursday, October 29, 2015 11:06 AM
> To: [email protected]
> Subject: question on NULL representation in DB?
>
> Reading the code, I have a hard time understanding the various ways
> NULLs are represented in the DB for non-aligned format.
>
> I see comments in the code suggesting that nullable columns have the
> first value byte representing if the value is null, but I also see
> special cases all over the place that take care of null as being totally
> absent cells.
>
> The former method (adding a first byte indicating a null) having
> consequences on predicate push down -> need to re-do predicate
> evaluation at trafodion layer to deal with null semantic.
>
> But I am not sure why we have this special situation of coding null
> with a byte, instead of always dealing with nulls as being “absent”
> cell? I am sure there is a reason, but I just could not figure it out…
>
> Someone can help?
>
> Eric
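
For readers following the thread, here is a minimal Python sketch of the two NULL representations described above: an absent cell on insert, and a 0xFF indicator byte on update, with 0x00 prefixing non-null values. It is an illustration only, not Trafodion code. All function names are hypothetical, a plain dict stands in for one HBase row, and writing only the single indicator byte on update-to-NULL is an assumption, since the thread does not say what follows the 0xFF byte in a non-key column.

```python
from typing import Dict, Optional

NOT_NULL = b"\x00"  # first byte of a non-null value in a nullable column
IS_NULL = b"\xff"   # first byte written when an update sets the column to NULL


def encode_nullable(value: Optional[bytes]) -> bytes:
    """Prefix a nullable column value with its one-byte null indicator."""
    if value is None:
        return IS_NULL  # assumption: only the indicator byte is stored
    return NOT_NULL + value


def decode_cell(cell: Optional[bytes]) -> Optional[bytes]:
    """Return the column value, treating both representations as NULL."""
    if cell is None:         # representation 1: the cell is absent
        return None
    if cell[:1] == IS_NULL:  # representation 2: null indicator prefix
        return None
    return cell[1:]          # strip the 0x00 indicator


def insert_row(row: Dict[str, bytes], values: Dict[str, Optional[bytes]]) -> None:
    """Insert path: NULL columns are simply not written."""
    for column, value in values.items():
        if value is not None:
            row[column] = encode_nullable(value)


def update_row(row: Dict[str, bytes], values: Dict[str, Optional[bytes]]) -> None:
    """Update path: NULL overwrites the cell with the indicator, no delete."""
    for column, value in values.items():
        row[column] = encode_nullable(value)


def encode_key_field(value: Optional[bytes], width: int) -> bytes:
    """Fixed-width key field: NULL is the indicator plus a zeroed remainder."""
    if value is None:
        return IS_NULL + b"\x00" * width
    if len(value) > width:
        raise ValueError("value longer than key field width")
    return NOT_NULL + value.ljust(width, b"\x00")
```

In this sketch, a filter pushed down to the storage layer sees only raw cell bytes, so it would have to handle both the missing cell and the 0xFF prefix to get NULL semantics right, which is the re-evaluation cost discussed in the thread. For key fields, two NULLs encode to identical bytes and compare higher than any non-null field because 0xFF sorts after 0x00 in the first byte.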
