xinlifoobar commented on PR #10711:
URL: https://github.com/apache/datafusion/pull/10711#issuecomment-2143614517

   > Thanks again @xinlifoobar -- I went through this PR and I don't think we 
can merge it yet as is because it has a regression.
   > 
   > The current code will extract the values as the underlying parquet type 
(e.g. as an Int32Array) and then the predicate pruning code will cast it as 
necessary
   > 
   > However, when I pushed one more commit to make the types that are not yet 
handled explicit (rather than relying on a `_`) I see a bunch of types that 
aren't yet covered that are important (like `Interval` for example)
   > 
   > Thus I suggest:
   > 
   > 1. Break out your change to support extracting timestamps with/without 
timezones as its own PR (it is quite good)
   > 2. We then work on filling out tests for the other types (I'll file 
tickets)
   > 3. Then we can come back to this PR (or maybe we want to work on it in 
parallel)
   > 
   > I am sorry I didn't see this before and I am sorry that we clearly don't 
have adequate test coverage
   
   LGTM. It was clear to me what it is missing. Thanks!
   
   This PR is mixed with too many items I think... I will move out some of the 
code, e.g., for timestamps, later in separated PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to