mcvsubbu commented on issue #4230: NULL value support for all data types
URL: https://github.com/apache/incubator-pinot/issues/4230#issuecomment-497181413

> @mcvsubbu is there an issue if the presence vector is always built?

No issue, just that we change the behavior from what people are used to. I am guessing there is at least someone out there who relies on the fact that a null will be auto-populated by default null value. Many applications also set custom default null values in their schema.

> If I understand things correctly, I believe the current state of handling nullables is a bit unfortunate. The scenario you mentioned is not just an issue for building presence vector, but, as @mayankshriv pointed out, it can actually result in incorrect aggregations.
>
> Is the defaultNullValue of 0 just for metrics? Or is it based on datatype?

0 for metrics, and Long.MIN or Int.MIN or "null" for dimensions.

> In either case, should the actual behavior be corrected?

We should not allow default "defaultNullValues" but we elevate the presence vector to a first class citizen - something that should always be present (no pun intended); if users provide a "defaultValue", we can use that if the column is missing for a row (and don't filter it out at any stage?)
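To make the aggregation concern concrete, here is a minimal sketch of how substituting a default null value of 0 for a metric can skew an average, and how a per-row presence vector avoids that. It is only an illustration under stated assumptions: the names `ingest`, `avg_ignoring_presence`, and `avg_with_presence` are hypothetical, the presence vector is modeled as a plain list of booleans, and none of this reflects Pinot's actual storage format or API.

```python
# Hypothetical illustration only; not Pinot's storage format or API.
DEFAULT_NULL_METRIC = 0  # default null value for metrics, as stated in the comment


def ingest(raw_values, default=DEFAULT_NULL_METRIC):
    """Return (stored_values, presence) for one metric column.

    Missing inputs (None) are stored as `default`; presence[i] records
    whether row i carried a real value.
    """
    stored = [default if v is None else v for v in raw_values]
    presence = [v is not None for v in raw_values]
    return stored, presence


def avg_ignoring_presence(stored):
    """Average over every stored value, treating substituted defaults as real data."""
    return sum(stored) / len(stored)


def avg_with_presence(stored, presence):
    """Average over rows that actually had a value; None if no such row exists."""
    present = [v for v, p in zip(stored, presence) if p]
    return sum(present) / len(present) if present else None


stored, presence = ingest([10, None, 20, None])
print(avg_ignoring_presence(stored))        # 7.5
print(avg_with_presence(stored, presence))  # 15.0
```

In this sketch the stored default is still available to anyone who relies on auto-populated values, while the presence vector lets an aggregation skip rows that were originally null.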
