Funny you should say "step", I have an AIP coming along on that front, so please don't name it step -- specificall about "Task Steps". I should have it out soon.
On Thu, 4 Jun 2026 at 23:03, Jarek Potiuk <[email protected]> wrote: > Hi David and everyone, > > I agree with Ash; the "Dynamic" part of the current naming is quite > misleading. I’ve been giving this some thought, and I believe we might be > looking for the wrong thing to name (batch) because we are looking at it > from the wrong angle. > > (And yes a lot of this comes after some chatting with Claude about my gut > feelings). > > To address concerns about communicating resilience and durability, we > should consider this from the perspective of the "durability unit." > Currently, Airflow has three such units: > > 1. Task: Regular tasks. > 2. Mapped Task: Individual elements in a task array. > 3. Deferred Task: Sub-task parts stored as state in the Triggerer DB. > > In all three cases, if the infrastructure fails, the entire unit is lost > and must restart from the beginning. David's new construct idea doesn't > change these execution-layer properties. Whether it’s a single task or a > mapped task, the "unit of durability" remains the same, even if the "unit > of work" within it changes. > > Therefore, we should use a different name for that "construct"—one that > clearly indicates these sequential actions are not tasks and do not carry > task-level durability guarantees. After exploring some options, I think > "step" is the most effective term. It has strong precedent in CI systems > (GitHub Actions, GitLab) and AWS Step Functions, where a job/task is the > durable unit and steps are subordinate, sequential, and ephemeral. Other > alternatives like "action," "leg," or "segment" are possible, but "step" > offers instant comprehension. > > Using "step" would also allow for a clean syntax that distinguishes it from > the @task decorator. For example: > > @step(retries=2) > def load(item): > warehouse.write(item) > > @task > def ingest_orders(data): > load.map(data) > > Or even do things asynchronously - not sequentially > > @step(retries=2) > async def load(item): > await warehouse.write(item) > > @task > async def ingest_orders(data): > load.map(data) # map normally is synchronous but we could make it work > with async steps > > This approach allows for invisible retries within a task and clearly > communicates that if a step fails, the task is the recovery point. It also > clarifies the "chunking" concept we were discussing (BTW. "chunk" is > probably what were looking for - but in this concept we don't have to name > it as a concept in Airflow): > > @step(retries=2) > async def load(item): > await warehouse.write(item) > > @task > async def process_chunk(chunk): > load.map(chunk) > > # Note! This is just an example of how we can add chunking - we could make > it better without a separate task, materialization and with some other > syntactic sugar > @task > def make_chunks(items, size=100): > return [items[i:i+size] for i in range(0, len(items), size) > > process_chunk.expand(chunk=make_chunks(get_items())) > > By introducing "step" as a distinct unit, we can more accurately > communicate that it lacks the independent durability of a task while > providing users with a familiar mental model. > > What do you think? > > Best regards, > Jarek >
