Hi Yian,

How are you? I do not have a complete understanding of the code involved, but one difference between OR and IN is that IN applies to a single column, while an OR predicate can involve different columns, so it may provide a better filter?

Regards,
Asif

On Wed, Aug 13, 2025, 10:45 PM Yian Liou <yl...@berkeley.edu.invalid> wrote:

> Hi Everyone,
>
> I was exploring the details of the Parquet In Predicate in
> ParquetFilters.scala and had some lingering questions.
>
> What are the advantages of pushing ORs rather than an IN predicate from
> Parquet when the number of items is less than or equal to the
> InFilterThreshold?
>
> I also see that the canPartialPushDownConjuncts check occurs when the
> number of items in the In Filter is above the InFilterThreshold but not
> when the number of items is below the threshold. Is there a particular
> reason for doing so?
>
> Thanks in advance for any insights!
>
> Best Regards,
> Yian
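
To make the comparison in this thread concrete, here is a minimal Scala sketch of the behaviour described in the question. It is a toy model only: the `Pred`, `Eq`, `In`, `Or` types and the `rewriteIn` helper are hypothetical names invented for illustration, not the classes or methods in ParquetFilters.scala, and the threshold and flag are plain parameters standing in for InFilterThreshold and the partial push-down check.

```scala
// Toy predicate model. These types are illustrative assumptions,
// NOT Spark's or Parquet's own classes.
sealed trait Pred
final case class Eq(column: String, value: Int) extends Pred
final case class In(column: String, values: Seq[Int]) extends Pred
final case class Or(left: Pred, right: Pred) extends Pred

object InVsOr {
  // Evaluate a predicate against one row (column name -> value).
  def eval(p: Pred, row: Map[String, Int]): Boolean = p match {
    case Eq(c, v)  => row.get(c).contains(v)
    case In(c, vs) => row.get(c).exists(v => vs.contains(v))
    case Or(l, r)  => eval(l, row) || eval(r, row)
  }

  // Hypothetical rewrite following the behaviour described in the question:
  // at or below the threshold, expand IN into a chain of ORed equalities;
  // above it, keep the IN only when partial push-down is allowed.
  def rewriteIn(in: In, threshold: Int, canPartialPushDown: Boolean): Option[Pred] = {
    val distinct = in.values.distinct
    if (distinct.isEmpty) None
    else if (distinct.size <= threshold)
      Some(distinct.map(v => Eq(in.column, v): Pred).reduceLeft[Pred]((l, r) => Or(l, r)))
    else if (canPartialPushDown) Some(in)
    else None
  }
}
```

In this model the OR chain and the IN accept exactly the same rows when they target one column. An OR such as `Or(Eq("a", 1), Eq("b", 2))` spans two columns and has no single-`In` equivalent, which is the difference Asif points out.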