Batch is a good name for it.

On Thu, Jun 4, 2026 at 8:35 PM Jens Scheffler <[email protected]> wrote:
> Hi David et al,
>
> I was very convinced about Dynamic Task Sharding during the call because:
>
> * Dynamic Task Mapping - we all know
> * Dynamic Task Iteration - the new async kid in town? Taking all into
>   a single execution (with the risk of all-or-nothing failure...)
>
> As David was describing the way to put the iterations into
> (partitions/slices/chunks) I am still up for it.
>
> Batching would also be okay but feels like a better match for the thing
> that "Iteration" is for, looping in async over a list. But the term
> discussion was more that if you have 17 000 in the list you probably
> rather want to track 170 "batches/partitions" as supervised task
> processes, each running 100 list items. As the "batch" is 17 000
> items, naming the "split/partitioning" a "batch" sounds a bit
> unnatural. Because the previous "iteration" also was a bit of a batch.
>
> Or do I misinterpret?
>
> Dynamic Task Mapping:
>
> items = make_me_a_work_list()
>
> serious_work = PythonOperator.partial(
>     task_id="serious_work",
>     ...
> ).expand(op_args=items)
>
>
> Dynamic Task Iteration:
>
> async_work = PythonOperator.partial(
>     task_id="async_work",
>     ...
> ).iterate(op_args=items)
>
>
> Dynamic Task Iteration with "partitions/slices"?
>
> large_async_work_in_pieces = PythonOperator.partial(
>     task_id="large_async_work_in_pieces",
>     ...
> ).iterate(op_args=items, shard=170)
>
> large_async_work_in_pieces = PythonOperator.partial(
>     task_id="large_async_work_in_pieces",
>     ...
> ).iterate(op_args=items, slice=170)
>
> large_async_work_in_pieces = PythonOperator.partial(
>     task_id="large_async_work_in_pieces",
>     ...
> ).iterate(op_args=items, batch=100)
>
>
> (okay reading the code, "partition", "shard" or "slice" would describe
> how many pieces to cut the elephant into and "batch" would be convincing
> to tell how many tasks to put together sharing a loop... so
> thinking-out-loud "batch" would be also OK if we want to describe the
> "package side of the elephant slice".)
>
> @David ... if I misunderstood, can you share the PR link or the demo
> code so I can re-read what you presented?
>
> Jens
>
>
> On 04.06.26 19:54, Tzu-ping Chung via dev wrote:
> > I think dynamic task batch(ing) would be reasonable.
> >
> > Python’s itertools has batched() that kind of is the same concept.
> >
> > TP
> >
> >
> >> On 5 Jun 2026, at 00:56, Blain David <[email protected]> wrote:
> >>
> >> Hi all,
> >>
> >> We need a better name than partition for Dynamic Task Partitioning.
> >>
> >> The main issue is that partition already strongly suggests asset/data partitions in Airflow,
> >> so using the same word here creates avoidable confusion for users and contributors.
> >>
> >> We’d like a term that is clear, intuitive, and doesn’t overlap with existing Airflow concepts.
> >>
> >> Some alternatives raised so far during the devcall:
> >>
> >> * batch (e.g. Dynamic Task Batching)
> >> * chunk (e.g. Dynamic Task Chunking)
> >> * slice (a bit confusing, but I chose to still mention it anyway)
> >> * shard
> >> * segment
> >>
> >> My current lean is towards chunk and batch. They feel familiar, readable in both code and docs,
> >> and avoid the existing partition/data-partition association.
> >>
> >> I’d love feedback on:
> >>
> >> * which term feels most natural
> >> * which term is least ambiguous
> >> * or whether there’s a better option we haven’t considered?
> >>
> >> One note: map was mentioned as well, but that seems too close to existing task.map() terminology.
> >>
> >> Please share thoughts, especially if you have concerns about any of the options above
> >> or a stronger suggestion for the long-term name.
> >>
> >> Naming is indeed hard 🙂
> >>
> >> Kind regards,
> >> David
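
To make the two readings in this thread concrete, here is a minimal sketch in plain Python of the arithmetic only: "batch" as the number of items handled together (100), versus "shard/slice" as the number of pieces to cut the list into (170). The helper names `batch_items` and `pieces_to_batch_size` are hypothetical and exist only for this illustration; nothing here is Airflow API, and the `.iterate(...)` keyword arguments quoted above are proposals from the discussion, not confirmed parameters. On Python 3.12 and later, `itertools.batched()` (the function TP mentions) does the same grouping and yields tuples.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batch_items(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `size` items; the last may be shorter.

    Hypothetical helper for illustration, similar in spirit to
    itertools.batched() from Python 3.12+.
    """
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def pieces_to_batch_size(total: int, pieces: int) -> int:
    """Items per piece when cutting `total` items into `pieces` pieces.

    Uses ceiling division so that no item is left over.
    """
    if pieces < 1:
        raise ValueError("pieces must be at least 1")
    return -(-total // pieces)


# The numbers from the thread: 17 000 work items.
items = list(range(17_000))

# "batch=100" reading: say how many items share one loop.
batches = list(batch_items(items, 100))
print(len(batches))  # 170

# "shard=170" / "slice=170" reading: say how many pieces to cut into.
print(pieces_to_batch_size(len(items), 170))  # 100
```

Both readings describe the same split here; they differ only in which of the two numbers the user writes down, which is the distinction Jens draws between "shard"/"slice" and "batch".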
