Re: ./manage.py loaddata calls .save() on all models... should it?

2012-08-29 Thread Hobson Lane
OOPS, turns out I had a post_save receiver creating the reciprocal rel. So 
this is not a django problem.

On Wednesday, August 29, 2012 1:40:52 PM UTC-7, Hobson Lane wrote:
>
> In django 1.4, my model save() gets called for ManyToManyField "through" 
> models for loaddata on that model:
>
> class Entity(models.Model):
> related_entities = models.ManyToManyField('self', 
> through='EntityRelationship', symmetrical=False, related_name='related_to+')
> 
> class EntityRelationship(models.Model):
> from_entity   = models.ForeignKey(Entity, 
> related_name='from_entities')
> to_entity   = models.ForeignKey(Entity, 
> related_name='to_entities')
>
> def save(self, *args, **kwargs):
> print >>sys.stderr, EntityRelationship.objects.all()
> super(EntityRelationship, self).save(*args, **kwargs) 
> print >>sys.stderr, EntityRelationship.objects.all()
> assert False
>
> class Meta:
> db_table = u'EntityRelationship'
> ordering = ["to_entity"]
> unique_together = (('from_entity','to_entity'))
>
> def __unicode__(self):
> return '%s -> %s'%(self.from_entity.pk,self.to_entity.pk)
>
> Interestingly, `save()` is not called for the forward relationship, the 
> one explicity specified in the json fixture. loaddata must do a "raw" 
> insert for this one. But is called as part of validation of the 
> relationship when inserting a record for the reciprocal relationship:
>
> >>> manage.py loaddata fixtures/EntityRelationship.json
> [ 2>]
> [ 2>,  1>]
>
> I wouldn't expect loaddata should insert a receiprocal relationship record 
> not specified in the fixture. And doing so without calling save() for both 
> inserts makes it very difficult to deal with fixtures that contain a random 
> network of connected entities. You eventually get an IntegrityError during 
> the loaddata because duplicate records (relationships) have already been 
> entered. And it prevents the user from loading a fixture of asymmetrical 
> (not necessarily a reciprocal relationship for every edge of the graph) 
> relationships. At least I'm having trouble doing so.
>
> Perhaps the '+' in the related name ('related_to+') is causing me grief, 
> or some other subtlety of self-referential relationships in Django that I 
> don't understand.
>
> On Tuesday, March 29, 2011 11:21:43 PM UTC-7, George Karpenkov wrote:
>>
>> If we'll look into core/management/commands/loaddata we'll see the 
>> line 
>> "obj.save(using=using)" which saves the data. 
>>
>> *however* consider the case when application has some custom database- 
>> altering logic in .save method. The common thing that comes to mind is 
>> timestamp, or something similar. What would happen is that instead of 
>> loading data the data from the fixture will be partly changed, which 
>> is really not what you want. 
>>
>> That's not the worst case though - I've just spent 40 minutes loading 
>> the database from a fixture which contained data in django-tagging, 
>> which inserts it's own "INSERT" SQL statements into save. So loaddata 
>> was consistently crashing with "column blah is not unique" while the 
>> data from the dump was perfectly fine. It made me quite sad. 
>>
>> so coming to think of it i can't really think of a use case where 
>> you'd want the custom logic in .save() to be executed. All the data is 
>> already there, and we *know* that it is valid data - so what else we 
>> might possibly want to do with it? (unless our application 
>> communicates over network with different services and it uses .save() 
>> to maintain integrity with them, but I haven't seen a single django 
>> website like that) 
>> So I think that some lower-level logic should be called instead. 
>>
>> Any comments on why loaddata was implemented this way in the first 
>> place?
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To view this discussion on the web visit 
https://groups.google.com/d/msg/django-developers/-/UM2eZoJRMP0J.
To post to this group, send email to django-developers@googlegroups.com.
To unsubscribe from this group, send email to 
django-developers+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en.



Re: ./manage.py loaddata calls .save() on all models... should it?

2012-08-29 Thread Hobson Lane
In django 1.4, my model save() gets called for ManyToManyField "through" 
models for loaddata on that model:

class Entity(models.Model):
related_entities = models.ManyToManyField('self', 
through='EntityRelationship', symmetrical=False, related_name='related_to+')

class EntityRelationship(models.Model):
from_entity   = models.ForeignKey(Entity, 
related_name='from_entities')
to_entity   = models.ForeignKey(Entity, 
related_name='to_entities')

def save(self, *args, **kwargs):
print >>sys.stderr, EntityRelationship.objects.all()
super(EntityRelationship, self).save(*args, **kwargs) 
print >>sys.stderr, EntityRelationship.objects.all()
assert False

class Meta:
db_table = u'EntityRelationship'
ordering = ["to_entity"]
unique_together = (('from_entity','to_entity'))

def __unicode__(self):
return '%s -> %s'%(self.from_entity.pk,self.to_entity.pk)
   
Interestingly, `save()` is not called for the forward relationship, the one 
explicity specified in the json fixture. loaddata must do a "raw" insert 
for this one. But is called as part of validation of the relationship when 
inserting a record for the reciprocal relationship:

>>> manage.py loaddata fixtures/EntityRelationship.json
[ 2>]
[ 2>,  1>]

I wouldn't expect loaddata should insert a receiprocal relationship record 
not specified in the fixture. And doing so without calling save() for both 
inserts makes it very difficult to deal with fixtures that contain a random 
network of connected entities. You eventually get an IntegrityError during 
the loaddata because duplicate records (relationships) have already been 
entered. And it prevents the user from loading a fixture of asymmetrical 
(not necessarily a reciprocal relationship for every edge of the graph) 
relationships. At least I'm having trouble doing so.

Perhaps the '+' in the related name ('related_to+') is causing me grief, or 
some other subtlety of self-referential relationships in Django that I 
don't understand.

On Tuesday, March 29, 2011 11:21:43 PM UTC-7, George Karpenkov wrote:
>
> If we'll look into core/management/commands/loaddata we'll see the 
> line 
> "obj.save(using=using)" which saves the data. 
>
> *however* consider the case when application has some custom database- 
> altering logic in .save method. The common thing that comes to mind is 
> timestamp, or something similar. What would happen is that instead of 
> loading data the data from the fixture will be partly changed, which 
> is really not what you want. 
>
> That's not the worst case though - I've just spent 40 minutes loading 
> the database from a fixture which contained data in django-tagging, 
> which inserts it's own "INSERT" SQL statements into save. So loaddata 
> was consistently crashing with "column blah is not unique" while the 
> data from the dump was perfectly fine. It made me quite sad. 
>
> so coming to think of it i can't really think of a use case where 
> you'd want the custom logic in .save() to be executed. All the data is 
> already there, and we *know* that it is valid data - so what else we 
> might possibly want to do with it? (unless our application 
> communicates over network with different services and it uses .save() 
> to maintain integrity with them, but I haven't seen a single django 
> website like that) 
> So I think that some lower-level logic should be called instead. 
>
> Any comments on why loaddata was implemented this way in the first 
> place?

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To view this discussion on the web visit 
https://groups.google.com/d/msg/django-developers/-/Z3iq0-j3B9sJ.
To post to this group, send email to django-developers@googlegroups.com.
To unsubscribe from this group, send email to 
django-developers+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en.



Re: ./manage.py loaddata calls .save() on all models... should it?

2011-03-31 Thread Jeremy Dunck
On Thu, Mar 31, 2011 at 2:02 PM, Patryk Zawadzki  wrote:
...
> Rather simple fix to the signal handler:
>
> def on_something_saved(sender, instance, created, **kwargs):
>    if not kwargs.get('raw', False):
>        do_stuff()

Concur, if the handler shouldn't run on loaddata, checking raw is the
right change.

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com.
To unsubscribe from this group, send email to 
django-developers+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en.



Re: ./manage.py loaddata calls .save() on all models... should it?

2011-03-31 Thread Patryk Zawadzki
On Wed, Mar 30, 2011 at 10:55 AM, George Karpenkov
 wrote:
> Oh thanks Russel!
>
> Turns out django-tagging was creating those objects via the post-save
> signal hook.
>
> http://code.djangoproject.com/ticket/8399 <- here is the ticket which
> proposes an option to disable the signal handling during the loaddata
> operation.
>
> Malcolm says that some people might want to use signals while running
> loaddata - ie to create related objects.
>
> I don't really understand why anyone would want that -- the only use
> case I've ever seen for the loaddata operation was "dump the entire
> database -> load entire database", though I guess use cases differ.
>
> Any updated comments on disabling the signal handling for the loaddata
> operation?

Rather simple fix to the signal handler:

def on_something_saved(sender, instance, created, **kwargs):
if not kwargs.get('raw', False):
do_stuff()

-- 
Patryk Zawadzki
I solve problems.

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com.
To unsubscribe from this group, send email to 
django-developers+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en.



Re: ./manage.py loaddata calls .save() on all models... should it?

2011-03-30 Thread Russell Keith-Magee
On Wed, Mar 30, 2011 at 4:55 PM, George Karpenkov
 wrote:
> Oh thanks Russel!
>
> Turns out django-tagging was creating those objects via the post-save
> signal hook.
>
> http://code.djangoproject.com/ticket/8399 <- here is the ticket which
> proposes an option to disable the signal handling during the loaddata
> operation.
>
> Malcolm says that some people might want to use signals while running
> loaddata - ie to create related objects.
>
> I don't really understand why anyone would want that -- the only use
> case I've ever seen for the loaddata operation was "dump the entire
> database -> load entire database", though I guess use cases differ.

There's the rub -- use cases do differ. Even in the "dump and load"
scenario, there are at least two ways to interpret the problem:

 * Dump the *entire* database, and load the *entire* database
 * Dump the "core" of the database, and reconstitute parts of the
database that can be automatically rebuilt.

Consider, for example, model permissions or contenttypes -- should
they be serialized and reloaded, or generated and reference (using
natural keys)?

> Any updated comments on disabling the signal handling for the loaddata
> operation?

Not beyond what is listed in that ticket. The original solution
proposed is the obvious solution and Malcolm has raised the obvious
objection. The ticket has been accepted; there is merit in the idea in
principle; it's just a matter of finding a workable solution that
doesn't suffer from the problem Malcolm highlights.

I can see how gsong's solution would work, but I'm not wild about it.
Firstly, it requires stack inspection, which wont necessarily catch
all the ways the signal may need to be silenced. Secondly, it requires
registration at the time the signal is defined, which means that if a
signal is provided as part of a third party app, the app author (and
not the end-user) makes the decision as to whether the signal should
be silenced. This may be possible in many cases, but there will be
cases where this isn't possible.

Yours,
Russ Magee %-)

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com.
To unsubscribe from this group, send email to 
django-developers+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en.



Re: ./manage.py loaddata calls .save() on all models... should it?

2011-03-30 Thread George Karpenkov
Oh thanks Russel!

Turns out django-tagging was creating those objects via the post-save
signal hook.

http://code.djangoproject.com/ticket/8399 <- here is the ticket which
proposes an option to disable the signal handling during the loaddata
operation.

Malcolm says that some people might want to use signals while running
loaddata - ie to create related objects.

I don't really understand why anyone would want that -- the only use
case I've ever seen for the loaddata operation was "dump the entire
database -> load entire database", though I guess use cases differ.

Any updated comments on disabling the signal handling for the loaddata
operation?

On Mar 30, 5:30 pm, Russell Keith-Magee 
wrote:
> On Wed, Mar 30, 2011 at 2:21 PM, George Karpenkov
>
>  wrote:
> > If we'll look into core/management/commands/loaddata we'll see the
> > line
> > "obj.save(using=using)" which saves the data.
>
> ... and if you dig a little deeper, you'll find that "obj" in that
> context is a "DeserializedObject", and calling save() on a
> deserialized object invokes a "raw save" on the underlying object's
> base save.
>
> That is, the save() method on the object *shouldn't* be invoked as a
> result of loading a fixture. That's what was reported in #4459, and
> fixed in r5658.
>
> If you can provide a test case that demonstrates otherwise, please
> open a ticket.
>
> Yours,
> Russ Magee %-)

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com.
To unsubscribe from this group, send email to 
django-developers+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en.



Re: ./manage.py loaddata calls .save() on all models... should it?

2011-03-30 Thread Russell Keith-Magee
On Wed, Mar 30, 2011 at 2:21 PM, George Karpenkov
 wrote:
> If we'll look into core/management/commands/loaddata we'll see the
> line
> "obj.save(using=using)" which saves the data.

... and if you dig a little deeper, you'll find that "obj" in that
context is a "DeserializedObject", and calling save() on a
deserialized object invokes a "raw save" on the underlying object's
base save.

That is, the save() method on the object *shouldn't* be invoked as a
result of loading a fixture. That's what was reported in #4459, and
fixed in r5658.

If you can provide a test case that demonstrates otherwise, please
open a ticket.

Yours,
Russ Magee %-)

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com.
To unsubscribe from this group, send email to 
django-developers+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en.



Re: ./manage.py loaddata calls .save() on all models... should it?

2011-03-30 Thread George Karpenkov
Oh okay, apparently there is a ticket from 4 years ago
http://code.djangoproject.com/ticket/4459

On Mar 30, 5:21 pm, George Karpenkov  wrote:
> If we'll look into core/management/commands/loaddata we'll see the
> line
> "obj.save(using=using)" which saves the data.
>
> *however* consider the case when application has some custom database-
> altering logic in .save method. The common thing that comes to mind is
> timestamp, or something similar. What would happen is that instead of
> loading data the data from the fixture will be partly changed, which
> is really not what you want.
>
> That's not the worst case though - I've just spent 40 minutes loading
> the database from a fixture which contained data in django-tagging,
> which inserts it's own "INSERT" SQL statements into save. So loaddata
> was consistently crashing with "column blah is not unique" while the
> data from the dump was perfectly fine. It made me quite sad.
>
> so coming to think of it i can't really think of a use case where
> you'd want the custom logic in .save() to be executed. All the data is
> already there, and we *know* that it is valid data - so what else we
> might possibly want to do with it? (unless our application
> communicates over network with different services and it uses .save()
> to maintain integrity with them, but I haven't seen a single django
> website like that)
> So I think that some lower-level logic should be called instead.
>
> Any comments on why loaddata was implemented this way in the first
> place?

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com.
To unsubscribe from this group, send email to 
django-developers+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en.



./manage.py loaddata calls .save() on all models... should it?

2011-03-30 Thread George Karpenkov
If we'll look into core/management/commands/loaddata we'll see the
line
"obj.save(using=using)" which saves the data.

*however* consider the case when application has some custom database-
altering logic in .save method. The common thing that comes to mind is
timestamp, or something similar. What would happen is that instead of
loading data the data from the fixture will be partly changed, which
is really not what you want.

That's not the worst case though - I've just spent 40 minutes loading
the database from a fixture which contained data in django-tagging,
which inserts it's own "INSERT" SQL statements into save. So loaddata
was consistently crashing with "column blah is not unique" while the
data from the dump was perfectly fine. It made me quite sad.

so coming to think of it i can't really think of a use case where
you'd want the custom logic in .save() to be executed. All the data is
already there, and we *know* that it is valid data - so what else we
might possibly want to do with it? (unless our application
communicates over network with different services and it uses .save()
to maintain integrity with them, but I haven't seen a single django
website like that)
So I think that some lower-level logic should be called instead.

Any comments on why loaddata was implemented this way in the first
place?

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com.
To unsubscribe from this group, send email to 
django-developers+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en.