sririshindra commented on PR #7096:
URL: https://github.com/apache/iceberg/pull/7096#issuecomment-1478318029

   > > In that case couldn't you set something like 
`sql("SET spark.sql.autoBroadcastJoinThreshold=-1")` before the 
DeleteOrphanFilesSparkAction and change it back to default once it finishes.
   > 
   > If I knew this, I would definitely set it up like this. But in fact not 
all users know how to set it up, unless after OOM occurs, I think the cost may 
be even greater.
   
   I think this is a problem with any Spark job that involves a join: if the 
join estimates are wrong, you get an OOM. That doesn't mean we should 
disable broadcast joins for everybody. If your users are unable to disable 
broadcast joins via config, then maybe you can disable them on your own fork 
so that it doesn't affect everybody else who is using Iceberg.
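
   For reference, the per-job workaround suggested above could look roughly 
like the following in PySpark. This is only a sketch: the helper name 
`broadcast_join_disabled` is made up for illustration, and it assumes `spark` 
is a session whose `conf` object supports `get(key, default)`, `set` and 
`unset`, as PySpark's runtime config does.

```python
from contextlib import contextmanager

# Spark SQL setting that controls automatic broadcast joins; -1 disables them.
BROADCAST_KEY = "spark.sql.autoBroadcastJoinThreshold"


@contextmanager
def broadcast_join_disabled(spark):
    """Hypothetical helper: turn broadcast joins off for the duration of a block.

    Assumes `spark.conf` offers get(key, default), set(key, value) and unset(key).
    """
    # Remember an explicitly set value, if any (None means "not set by the user").
    previous = spark.conf.get(BROADCAST_KEY, None)
    spark.conf.set(BROADCAST_KEY, "-1")
    try:
        yield spark
    finally:
        # Restore the earlier state even if the job inside the block failed.
        if previous is None:
            spark.conf.unset(BROADCAST_KEY)
        else:
            spark.conf.set(BROADCAST_KEY, previous)
```

   The orphan file cleanup would then run inside the block, for example 
`with broadcast_join_disabled(spark): spark.sql("CALL my_catalog.system.remove_orphan_files(table => 'db.sample')")`, 
where `my_catalog` and `db.sample` are placeholders. The setting goes back to 
its previous state afterwards, so other jobs in the same session are not 
affected.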


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

