xinlifoobar commented on PR #10711: URL: https://github.com/apache/datafusion/pull/10711#issuecomment-2143614517
> Thanks again @xinlifoobar -- I went through this PR and I don't think we can merge it yet as is because it has a regression. > > The current code will extract the values as the underlying parquet type (e.g. as an Int32Array) and then the predicate pruning code will cast it as necessary > > However, when I pushed one more commit to make the types that are not yet handled explicit (rather than relying on a `_`) I see a bunch of types that aren't yet covered that are important (like `Interval` for example) > > Thus I suggest: > > 1. Break out your change to support extracting timestamps with/without timezones as its own PR (it is quite good) > 2. We then work on filling out tests for the other types (I'll file tickets) > 3. Then we can come back to this PR (or maybe we want to work on it in parallel) > > I am sorry I didn't see this before and I am sorry that we clearly don't have adequate test coverage LGTM. It was clear to me what it is missing. Thanks! This PR is mixed with too many items I think... I will move out some of the code, e.g., for timestamps, later in separated PR. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
