zhengruifeng commented on PR #40520: URL: https://github.com/apache/spark/pull/40520#issuecomment-1480490580
> Barrier mode is only used in a specific ML case, i.e. in the model training routine, where we will only use it in one pattern:
>
> dataset.mapInPandas(..., is_barrier=True).collect()
>
> To simplify the implementation, we can implement a barrierMapInPandasAndCollect instead, and define an execution plan stage like BarrierMapInPandasAndCollectExec

If that is the only use case, I think it will be safe to add a dedicated logical plan and physical plan for it.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
