eejbyfeldt commented on issue #810: URL: https://github.com/apache/datafusion-comet/issues/810#issuecomment-2306561051
I had a quick look at this in a profiler, and to me it did not look like much time was spent in the `BloomFilterMightContain`-related code. The things that stood out as taking significant time were the copying of data due to unpacking of dictionaries during the scan operation, here: https://github.com/eejbyfeldt/datafusion-comet/blob/a99f7428398793507b31188c8919e4cf128d8d38/native/core/src/execution/operators/scan.rs#L353-L370 and the copying done in `FilterExec`, which sounds similar to what is discussed in https://github.com/apache/datafusion-comet/issues/808

So maybe this ticket should be replaced with one about removing the unpacking of dictionaries in the scan operator.
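
As a rough illustration of why unpacking a dictionary means copying data, here is a toy sketch in plain Rust. It is not Comet's or Arrow's actual code: `DictColumn`, `unpack`, `dict_bytes` and `unpacked_bytes` are hypothetical names made up for this example, and a real Arrow dictionary array uses a different memory layout.

```rust
/// Toy dictionary-encoded string column (hypothetical, for illustration only):
/// each row stores a small key that indexes into a table of distinct values.
struct DictColumn {
    keys: Vec<u32>,
    values: Vec<String>,
}

impl DictColumn {
    /// Unpack into a plain column: every row gets its own copy of the string.
    fn unpack(&self) -> Vec<String> {
        self.keys
            .iter()
            .map(|&k| self.values[k as usize].clone())
            .collect()
    }

    /// Bytes of string data stored once in the dictionary.
    fn dict_bytes(&self) -> usize {
        self.values.iter().map(|v| v.len()).sum()
    }
}

/// Bytes of string data held by an unpacked (plain) column.
fn unpacked_bytes(col: &[String]) -> usize {
    col.iter().map(|s| s.len()).sum()
}

fn main() {
    let col = DictColumn {
        keys: vec![0, 1, 0, 0, 1, 0],
        values: vec!["apple".to_string(), "banana".to_string()],
    };
    let plain = col.unpack();
    println!("dictionary string bytes: {}", col.dict_bytes());
    println!("unpacked string bytes:   {}", unpacked_bytes(&plain));
}
```

In this sketch the string bytes written during unpacking grow with the number of rows rather than with the number of distinct values, which is the kind of cost that would show up as copying in a profile.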