While I'm not sure I have any particular suggestion for this, the problem
is important enough for a few of the projects I'm maintaining. I have
implemented more than a few custom save() methods,
post_init/pre_save/post_save signals and complicated ways of setting
update_fields and acting on their values and I would really like if there
was a simple standard way of tracking dirty fields and saving only them in
a reliable way. I'll be grateful if someone finally makes it happen.

Ivan.

On Tue, Jan 29, 2019 at 4:43 PM Patryk Zawadzki <pat...@room-303.com> wrote:

> I feel conflicted here as I've spent numerous hours under different
> contracts fixing bugs caused by missing update_fields and leading to data
> loss or inconsistent database states ;)
>
> On a more serious note, please make this happen.
>
> With concurrent updates calling save() without update_fields is asking for
> trouble, one of our customers has save() methods specifically patched to
> raise exceptions under these circumstances because of how easy it is to
> mess things up and how hard it is to track them down.
>
> pon., 28 sty 2019 o 15:00 Daniel Tao <daniel....@gmail.com> napisaƂ(a):
>
>> Hi!
>>
>> This is my first post on this list. I recently left a comment on #27017
>> <https://code.djangoproject.com/ticket/27017#comment:3> requesting to
>> re-open the topic of only saving dirty fields in save(). Tim Graham
>> helpfully directed to #4102 <https://code.djangoproject.com/ticket/4102> and
>> advised that I make the proposal on the dev mailing list, so that's what
>> I'm doing :)
>>
>> I've gone through the history of #4102 and taken notes on the challenges
>> that arose when this was first attempted 12 years ago (!). The TL;DR is
>> that, while there were indeed quite a few complications, I don't see
>> anything that came up that should be considered a flat-out showstopper.
>> Rather, what happened was that at about the 69th comment
>> <https://code.djangoproject.com/ticket/4102#comment:69>, back in 2012,
>> the conversation shifted as Matt Long made this observation:
>>
>> what started as a simple opt-in feature request of adding a field
>>> white-list to the Model's save function morphed into a complicated dirty
>>> flag approach that obviously has many edge cases and performance
>>> implications given that this ticket has been open for 5 years now
>>
>>
>> From here it seems things progressed towards the current solution of
>> supporting the update_fields argument, and that's where things ended. I
>> would like to point out that Matt did *not* advocate for completely
>> abandoning all efforts to support dirty field tracking; to the contrary, in
>> the same comment he said this (emphasis mine):
>>
>> Clearly some people feel differently and favor the dirty flag approach
>>> for a more hands-off approach. As such, I propose adding support for *both
>>> methods*
>>
>>
>> With that in mind, I believe it's worth re-opening this discussion. For a
>> fairly lengthy justification, see my aforementioned comment on #27017
>> <https://code.djangoproject.com/ticket/27017#comment:3>. I'll copy the
>> effective TL;DR of the proposal here for convenience:
>>
>> In my opinion Django could make most code bases inherently more resilient
>>> against latent race conditions by implementing some form of dirty field
>>> tracking and effectively providing the functionality of update_fields
>>> automatically. I would like to propose a new setting, something like
>>> SAVE_UPDATE_DIRTY_FIELDS_ONLY, to change the ORM's default behavior so
>>> that calls to Model.save() only update the fields that have been set on
>>> the model instance. Naturally for backwards compatibility this setting
>>> would be False by default.
>>
>>
>> As for the concerns that were raised when this was first attempted, I
>> will now attempt to summarize what I found along with, in most cases, a bit
>> of editorializing from me.
>>
>> Performance
>>
>> The performance angle was first explored in a comment
>> <https://code.djangoproject.com/ticket/4102#comment:7> that said it
>> "doesn't look good" and provided some benchmarks showing a performance hit
>> from 0.17s to 2.64s for setting an attribute using the timeit
>> <https://docs.python.org/3/library/timeit.html> package. I didn't see
>> anyone point out that the timeit method defaults to executing code *a
>> million times*; so the throughput of the operation went from about 6
>> million to closer to 400 thousand times per second. (The percentage change
>> is indeed significant, but this doesn't *smell* like a potential
>> bottleneck.)
>>
>> It was noted in a couple
>> <https://code.djangoproject.com/ticket/4102#comment:14> places
>> <https://code.djangoproject.com/ticket/4102#comment:29> that it seems
>> potentially shortsighted to focus so much on the performance of getting and
>> setting attributes without taking into account the potential performance
>> *benefits* of executing smaller UPDATE statements that write fewer
>> columns. As far as I can tell, no one in the thread on #4102 actively
>> investigated the latter.
>>
>> Based on the unlikelihood of attribute setting representing a performance
>> bottleneck, and the lack of data for the performance impact of executing
>> smaller updates, I would consider the performance discussion largely a
>> distraction. Though I do think it's worth measuring the latter.
>>
>> Compatibility
>>
>> It was observed that the two approaches considered on the ticket
>> (overriding __setattr__ or defining custom property setters) would not
>> work with obj.__dict__.update
>> <https://code.djangoproject.com/ticket/4102#comment:12>, which is
>> apparently an optimization you can find prescribed on some blogs out int he
>> wild. This supports the premise that this behavior should be opt in, so
>> devs who are using incompatible techniques can stay away (unless/until
>> they've removed those optimizations from their code).
>>
>> I had already suggested putting this functionality behind a setting, so I
>> think we're good here.
>>
>> Correctness
>>
>> There were some bugs related to
>> <https://code.djangoproject.com/ticket/4102#comment:39> multi-table
>> inheritance <https://code.djangoproject.com/ticket/4102#comment:67>. It
>> appears these were both followed by patches with proposed fixes.
>>
>> Another bug was that fields updated in pre_save such as
>> DateTimeField(auto_now=True) were not being updated
>> <https://code.djangoproject.com/ticket/4102#comment:60>. There was a
>> patch to fix this as well.
>>
>> The point was raised that this approach poses a problem for mutable
>> field types <https://code.djangoproject.com/ticket/4102#comment:68> (e.g.,
>> ArrayField). Presumably one approach for addressing that would be the
>> one employed by django-save-the-change here
>> <https://github.com/karanlyons/django-save-the-change/blob/master/save_the_change/decorators.py#L121>
>> .
>>
>> One gets the sense from following the thread that, surely, there would be
>> more bugs discovered were someone to resurrect this work. However, again, I
>> don't see any showstoppers here. We're talking about an advanced framework
>> with a large set of features, and so naturally it's going to be difficult
>> to get this right while taking all supported use cases into account.
>>
>> Conclusion
>>
>> As I said up top, I think it's worth re-opening #27017, or creating a new
>> ticket for the behavior it was requesting (updating only dirty fields
>> automatically in save()) where future contributors might discuss
>> possible approaches and, with any luck, submit pull requests. I believe it
>> would be both possible and helpful, and I also *suspect* that the
>> process would go a bit more smoothly today, as the project has a huge suite
>> of tests to catch possible edge cases and provide greater confidence about
>> the correctness of any patches submitted.
>>
>> P.S. I'm a little worried that someone will now direct me to *another*
>> long thread I should read filled with even more reasons this idea caused
>> problems in the past. If so, no worries! I'll do it. I've already come this
>> far :)
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "Django developers (Contributions to Django itself)" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to django-developers+unsubscr...@googlegroups.com.
>> To post to this group, send email to django-developers@googlegroups.com.
>> Visit this group at https://groups.google.com/group/django-developers.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/django-developers/f13f860e-8dc9-4a5d-80c5-f1201e741a9d%40googlegroups.com
>> <https://groups.google.com/d/msgid/django-developers/f13f860e-8dc9-4a5d-80c5-f1201e741a9d%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
> --
> Patryk Zawadzki
> I solve problems.
>
> --
> You received this message because you are subscribed to the Google Groups
> "Django developers (Contributions to Django itself)" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to django-developers+unsubscr...@googlegroups.com.
> To post to this group, send email to django-developers@googlegroups.com.
> Visit this group at https://groups.google.com/group/django-developers.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/django-developers/CANw2pUGuMXs1RmqAHZjaoHuTpNKzVnWRi68bioZegJxkv_QL9Q%40mail.gmail.com
> <https://groups.google.com/d/msgid/django-developers/CANw2pUGuMXs1RmqAHZjaoHuTpNKzVnWRi68bioZegJxkv_QL9Q%40mail.gmail.com?utm_medium=email&utm_source=footer>
> .
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/CADPNjZ676ALr-S%3DPFkRj%3DC0uXZeMUK13Y%2B-7iW%3DOGJnWSwYxxg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
  • P... Daniel Tao
    • ... Patryk Zawadzki
      • ... 'Ivan Anishchuk' via Django developers (Contributions to Django itself)
    • ... Josh Smeaton
      • ... Adam Johnson
        • ... Tobias Wiese
    • ... Daniel Tao
      • ... Tim Graham
        • ... charettes
          • ... Tobias McNulty

Reply via email to