iemejia commented on PR #56479: URL: https://github.com/apache/spark/pull/56479#issuecomment-4694353019
@LuciferYang Would you mind taking a look at this one when you get a chance?

While working on the Parquet encoding benchmarks, I noticed the workflow was spending ~5-10 min generating TPC-DS data on every run, even when the benchmark does not use it. The cause is that `contains(inputs.class, '*')` is a substring check, so it is true for any class pattern that contains a wildcard, not just the literal `*`.

I also kept having to wait for full 20-30 min runs to complete, only to discover the runner had landed on the wrong CPU. For that, I added an optional `expected-cpu` input parameter that detects the runner CPU immediately after checkout and fails the job within seconds if it does not match, so the whole compilation + benchmark time is not wasted before finding out.

These two small fixes should save a lot of time for anyone using the benchmark workflow with specific class patterns and CPU-sensitive comparisons. Happy to adjust anything if needed.
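
For readers who want to see the two behaviours side by side, here is a rough Python model of them. It is only a sketch and not the workflow's actual code: `contains` approximates the GitHub Actions expression function for string arguments (a case-insensitive substring test), while `needs_tpcds_exact`, `cpu_matches`, the sample class pattern, and the sample CPU model strings are hypothetical names and values chosen for illustration. The exact-match rule and the substring rule for `expected-cpu` are assumptions about one possible fix, not a description of what the PR does.

```python
def contains(search: str, item: str) -> bool:
    """Approximate GitHub Actions `contains()` for strings:
    a case-insensitive substring test."""
    return item.lower() in search.lower()


def needs_tpcds_substring(class_pattern: str) -> bool:
    # Behaviour described in the comment: true for ANY pattern
    # that contains '*', not only the literal '*'.
    return contains(class_pattern, "*")


def needs_tpcds_exact(class_pattern: str) -> bool:
    # Assumed stricter rule (hypothetical): only the literal '*'
    # triggers TPC-DS data generation.
    return class_pattern == "*"


def cpu_matches(expected_cpu: str, detected_model: str) -> bool:
    # Hypothetical matching rule: an empty `expected-cpu` disables the
    # check; otherwise do a case-insensitive substring match against
    # the detected CPU model string.
    if not expected_cpu:
        return True
    return expected_cpu.lower() in detected_model.lower()


if __name__ == "__main__":
    pattern = "*ParquetEncoding*"  # hypothetical class pattern
    print(needs_tpcds_substring(pattern), needs_tpcds_exact(pattern))

    detected = "AMD EPYC 7763 64-Core Processor"  # sample model string
    for expected in ("", "EPYC", "Xeon"):
        print(repr(expected), cpu_matches(expected, detected))
```

Under these assumptions, a wildcard class pattern triggers data generation with the substring check but not with the exact comparison, and a mismatched `expected-cpu` can be rejected as soon as the CPU model string is known, before any compilation starts.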
