goldmedal commented on PR #12816: URL: https://github.com/apache/datafusion/pull/12816#issuecomment-2409843953
> Here is the performance of this PR. Some queries are slower, some are faster.
>
> I believe once we turn on string view everything will be faster.

Thanks @alamb

It's interesting 🤔 Does this benchmark include only the change made by this PR, or does it include others? It seems many queries are slowed down by this PR.

Before this PR, the casting flow was
```
Binary(parquet) -> Binary(arrow) -> BinaryView(arrow) -> StringView(arrow)
```
Now, it's
```
Binary(parquet) -> StringView(arrow)
```
Theoretically, we save two steps (including the most expensive ones). I have no idea why the queries would be slower. I might try to do some profiling for the slower cases 🤔