You can set the "target_partitions" config to specify the degree of parallelism. It defaults to the number of cores.
https://arrow.apache.org/datafusion/user-guide/configs.html

TableProvider partitions will be executed in parallel using async futures, so there is not necessarily one thread per partition.

Hope that helps.

Andy.

On Thu, Feb 9, 2023 at 9:04 AM Olo Sawyerr <[email protected]> wrote:

> Hi there,
>
> Any idea on this please?
>
> Regards,
>
> Olo
>
> ------------------------------
> *From:* Olo Sawyerr
> *Sent:* Friday, February 3, 2023 4:27:52 PM
> *To:* [email protected] <[email protected]>
> *Subject:* [Datafusion] Parallelism
>
> Hi there,
>
> I have a couple of questions about parallel execution in DataFusion.
>
> 1. How does the parallelism relate to TableProvider partitions? Is it
>    one "thread" per partition?
> 2. Can the parallelism be controlled in any way, or does it just scale
>    automatically based on the number of processors on the machine?
>
> Regards,
> Olo
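
To make the two points in the reply concrete, here is a rough sketch in plain Rust (standard library only, not DataFusion code). It assumes a hypothetical helper `effective_target_partitions` that mimics "use the configured value, else the number of cores", and a hypothetical `run_partitions` that processes more partitions than it has workers. It swaps in a fixed pool of OS threads pulling from a shared queue in place of async futures, so treat it only as an illustration of why partitions and threads need not map one to one. In DataFusion itself the setting is, as far as I know, applied through `SessionConfig::with_target_partitions`.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

/// Hypothetical helper (not a DataFusion API): use the requested value if
/// one is given, otherwise fall back to the number of available cores.
fn effective_target_partitions(requested: Option<usize>) -> usize {
    requested.unwrap_or_else(|| {
        thread::available_parallelism().map(|n| n.get()).unwrap_or(1)
    })
}

/// Hypothetical illustration: sum each partition on a fixed pool of
/// `workers` threads. Returns one sum per partition, in partition order.
fn run_partitions(partitions: Vec<Vec<i64>>, workers: usize) -> Vec<i64> {
    let n = partitions.len();
    // Shared queue of (partition index, rows); workers pull until it is empty.
    let queue = Arc::new(Mutex::new(
        partitions.into_iter().enumerate().collect::<Vec<_>>(),
    ));
    let (tx, rx) = mpsc::channel();

    let handles: Vec<_> = (0..workers.max(1))
        .map(|_| {
            let queue = Arc::clone(&queue);
            let tx = tx.clone();
            thread::spawn(move || loop {
                // Hold the lock only long enough to take one partition.
                let next = queue.lock().unwrap().pop();
                match next {
                    Some((idx, rows)) => {
                        tx.send((idx, rows.iter().sum::<i64>())).unwrap()
                    }
                    None => break,
                }
            })
        })
        .collect();
    // Drop the original sender so the receiver ends once all workers finish.
    drop(tx);

    let mut results = vec![0; n];
    for (idx, sum) in rx {
        results[idx] = sum;
    }
    for handle in handles {
        handle.join().unwrap();
    }
    results
}
```

For example, calling `run_partitions` with eight partitions and `workers = 2` still produces eight results; the two workers simply take partitions in turn. Passing `effective_target_partitions(None)` as the worker count would size the pool to the machine's core count.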
