#26530: Batch operations on large querysets
-------------------------------------+-------------------------------------
Reporter: mjtamlyn | Owner: nobody
Type: New feature | Status: new
Component: Database layer (models, ORM) | Version: master
Severity: Normal | Resolution:
Keywords: | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by akaariai):
I'm OK with the batch(size=100) approach. I haven't had a need for this,
and I'm unfamiliar with how common such cases are. But if this is something
commonly needed, then I think having an ORM method that does this in a way
that is both efficient and correct when run on a table that is modified
concurrently is the way to go.
Thinking about this a bit more, batching on primary key ordering is the
only safe approach if we want to guarantee that no object appears in more
than one batch. Otherwise, for example:
{{{
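from datetime import datetime

# batch() is the API proposed in this ticket; it does not exist yet.
for batch in qs.order_by('mod_date', 'pk').batch(size=1000):
    for obj in batch:
        obj.foo = 'bar'
        # Updating the field we order by moves the row to the end.
        obj.mod_date = datetime.now()
        obj.save()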
}}}
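could end up in an infinite loop, since each save moves the object to the
end of the mod_date ordering, where a later batch can pick it up again.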
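
For comparison, here is a rough sketch of what batching on primary key
alone might look like. batch_by_pk is a hypothetical helper, not an existing
Django API and not a proposed implementation; it only relies on order_by(),
filter(pk__gt=...) and slicing. It assumes the primary key is orderable and
is not changed while iterating.

```python
def batch_by_pk(queryset, size=1000):
    """Yield lists of at most `size` objects in primary key order.

    Hypothetical helper, not part of Django. Every batch after the first
    is fetched with a `pk > last seen pk` filter, so saving other fields
    (such as mod_date) between batches cannot bring an object back.
    """
    # Replace any existing ordering; only the primary key is used here.
    queryset = queryset.order_by('pk')
    last_pk = None
    while True:
        page = queryset if last_pk is None else queryset.filter(pk__gt=last_pk)
        # Slicing adds a LIMIT, and list() runs one query per batch.
        batch = list(page[:size])
        if not batch:
            return
        yield batch
        # Remember where this batch ended so the next one starts after it.
        last_pk = batch[-1].pk
```

With this, the loop above could be written as
`for batch in batch_by_pk(qs, size=1000):` and updating mod_date inside it
would not cause an object to be revisited. Rows inserted concurrently with a
higher pk would still be picked up by later batches.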
--
Ticket URL: <https://code.djangoproject.com/ticket/26530#comment:5>