walterddr commented on issue #10381:
URL: https://github.com/apache/pinot/issues/10381#issuecomment-1753261690

   > I don't understand these properties. I would expect that in V2 there is no 
nullability query property. The engine should treat nullability as any other 
type property. The same we don't have properties to treat strings as binaries, 
we should not have properties to enable/disable nullability.
   
   Based on my proposal above (config) section, v2 doesn't need any nullability 
query property.
   
   > I think we have a clear way to define null column nullability: to set that 
in the schema. For compatibility reasons, we also have a way to define table 
level nullability in table config. We can treat the latter as the default 
column level nullability.
   
   Currently the "table-level nullability" is not "this table contains null", 
but an orthogonal property "for all the columns, generate null-vectors in the 
segment". this chart should show what I meant:
   
   
   | Control | During Query | During Ingestion |
   | -------- | ------- | ------- |
   | Table-level Control | via `SET enableNullHandling=true;` | via 
`tableConfig.tableIndexConfig.nullHandlingEnabled` |
   | Column-level Control | via schema `isNullableField` (discussed here) | via 
schema `isNullableField` (not discussed here) |
   
   my argument: 
   1. the `tableConfig.tableIndexConfig.nullHandlingEnabled` should not be used 
during query time (do not couple them)
   2. `isNullableField` in schema can be coupled in the future but i dont want 
to couple them here and now due to backward compatibility discussions
   
   
   > ## enableNullhandling
   > Why do we need that? Ideally the table config nullability will tell us the 
default nullability. Do we expect users changing nullability at runtime? Why? 
That is not a normal flag in other databases.
   > 
   > Are you proposing this property in order to keep backward compatibility so 
if it is not enabled, V1 ignores column level nullability? My proposal is to do 
not have this. There is no backward compatibility issue given that column level 
nullability is a new feature.
   > 
   > 1. If users do not define column level nullability in the schema, v1 will 
honor `tableConfig.getIndexingConfig().isNullHandlingEnabled `, which is 
backward compatible.
   
   IMO this is not a good idea, see my table above. we dont want to couple 
ingestion time config with query time config
   
   > 2. If users do define column level nullability, then they have changed the 
schema, so it is backward compatible to return different values.
   > 
   > ## enableNullHandlingIfColumnIsNullable
   > In which situation we want this disabled? I mean, in which cases we expect 
users to define a column as nullable but want to skip that? It sounds super 
confusing to understand.
   
   this is what I meant in the config section above (sorry if it is not clear) 
- `enableNullHandlingIfColumnIsNullable` is a hidden query option auto-attached 
to the leaf stage of v2 only
   
   **v1** behavior:
   1. if `enableNullhandling` is set, current v1 behavior
   2. if `enableNullhandling` is not set, current v1 behavior
   3. if `enableNullHandlingIfColumnIsNullable` is set, utilize column-level 
nullability  (user wont see it when directly using v1)
   
   **v2** behavior: 
   1. always utilize column-level nullability, and auto attach 
`enableNullHandlingIfColumnIsNullable` option to leaf stage of v2.
   
   i am not 100% sure `enableNullHandlingIfColumnIsNullable` is needed. if not 
(e.g. we can return different results from v1 when `enableNullhandling` is not 
set but column nullability is set), then the behavior is even simpler:
   
   **v1** behavior:
   1. if `enableNullhandling` is set, current v1 behavior
   2. if `enableNullhandling` is not set, utilize column-level nullability  <-- 
different result
   
   **v2** behavior: 
   1. always utilize column-level nullability
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to