mbutrovich commented on issue #2636: URL: https://github.com/apache/datafusion-comet/issues/2636#issuecomment-3437472425
This raises a larger question: do we want to support dictionary-encoded string columns in Comet?

My understanding of the situation:

- The `native_comet` scan tries to dictionary-encode string columns.
- The `native_datafusion` and `native_iceberg_compat` scans do not dictionary-encode string columns, but have the option of using Utf8View in the future for similar gains.
- Some Comet functions support dictionary-encoded strings.
- DataFusion _does not_ intend to keep dictionary-encoded strings as first-class citizens in functions, so when we rely on those functions we must unpack the strings first.

My long-term goal is to get us to Utf8View support and drop dictionary-encoded query processing, since that brings Comet into alignment with DataFusion and Arrow-rs, but I am happy to discuss.
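
To make "unpack strings first" concrete, here is a minimal, dependency-free sketch of what dictionary encoding and unpacking mean for a string column. `DictStrings`, `encode`, and `unpack` are hypothetical names invented for this illustration; this is not the arrow-rs `DictionaryArray` API and not Comet's actual code path.

```rust
use std::collections::HashMap;

/// Toy stand-in for a dictionary-encoded string column.
/// Hypothetical type for illustration only, not arrow-rs's `DictionaryArray`.
struct DictStrings {
    /// Distinct string values, each stored once.
    values: Vec<String>,
    /// One entry per row: an index into `values`, or `None` for a null row.
    keys: Vec<Option<usize>>,
}

impl DictStrings {
    /// Dictionary-encode a plain column, reusing one key per distinct string.
    fn encode(rows: &[Option<&str>]) -> Self {
        let mut index: HashMap<String, usize> = HashMap::new();
        let mut values: Vec<String> = Vec::new();
        let mut keys = Vec::with_capacity(rows.len());
        for &row in rows {
            keys.push(row.map(|s| {
                // Look up the string; on first sight, append it to `values`
                // and remember its position.
                *index.entry(s.to_string()).or_insert_with(|| {
                    values.push(s.to_string());
                    values.len() - 1
                })
            }));
        }
        DictStrings { values, keys }
    }

    /// Unpack into a plain string column: one owned string (or null) per row.
    /// This is the extra step a function that only accepts plain strings
    /// forces on a dictionary-encoded input.
    fn unpack(&self) -> Vec<Option<String>> {
        self.keys
            .iter()
            .map(|k| k.map(|i| self.values[i].clone()))
            .collect()
    }
}
```

The point of the sketch is the cost model: the encoded form stores each distinct string once, while `unpack` materializes a copy per row, which is the overhead paid whenever a downstream function cannot consume the dictionary form directly.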
