xinrong-databricks edited a comment on pull request #33964:
URL: https://github.com/apache/spark/pull/33964#issuecomment-921943161


   100 predicates are used.
   
   The long projection takes 188786 ms.
   `isin` takes 61167 ms.
   Broadcast DF (Join) takes 54841 ms.
   
   Broadcast DF (Join) is the fastest, but it is hard to apply given the 
current function structure.
   
   According to the Spark UI, the long projection (the original approach) is 
about 3 times slower because of its planning time. Considering execution time 
only (as below), the long projection `runWithChainOr` is faster. However, its 
planning time makes its total time the worst.
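
As a quick sanity check on the "about 3 times slower" figure, the ratios can be recomputed from the three total timings reported above. This is only a minimal sketch; the dictionary keys and variable names are illustrative and do not come from the benchmark code.

```python
# Total wall-clock times reported in this comment, in milliseconds
# (100 predicates). The labels are illustrative names, not identifiers
# from the benchmark itself.
timings_ms = {
    "long projection": 188786,
    "isin": 61167,
    "broadcast join": 54841,
}

baseline = timings_ms["long projection"]

# Ratio of the long projection's total time to each approach's total time.
slowdown = {name: baseline / ms for name, ms in timings_ms.items()}

for name, ratio in slowdown.items():
    print(f"long projection vs {name}: {ratio:.2f}x")
```

By these numbers the long projection's total time is roughly 3.1x that of `isin` and 3.4x that of the broadcast join.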
   
   
![image](https://user-images.githubusercontent.com/47337188/133825828-e376f0d8-3247-416b-a2c6-0b7a21ab7cb8.png)
   
   
   CC @HyukjinKwon @ueshin


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


