On Mon, Mar 30, 2009 at 5:57 AM, Russ <taal...@gmail.com> wrote:
>
> My apologies for the length!
>
>
> Concisely, I intend to provide the Django user with some granular
> control of the data to be serialized without sacrificing backwards
> compatibility for old code, or for users who need the straightforward,
> current functionality.

Hi Russ,

First off - I would warn you (if you aren't already aware) that there
are several other people floating around with proposals based on the
serialization system. You would be well advised to read up on those
other proposals, and my comments on those proposals. You will probably
find that those comments help clarify my comments on your specific
proposal.

> Skipping merrily back to the real world, here is the issue, plain and
> simple.  Sometimes, you need to include more information than is just
> in the one model.  Sometimes, that's going to appear in the form of
> inherited models, sometimes via one-to-many or many-to-many
> relationships, and sometimes just little relevant bits of information
> that may not actually be in the model at all.  There are numerous open
> tickets about this issue [1][2][3][4].  Clearly, this needs some
> thought.

Those 4 tickets don't have anything to do with the problem you describe.

#4565 is a dependency discovery problem, and only applies to
serialization, not deserialization.

#7052 is the exact opposite of the problem you describe -
deserializing when you know that data already exists in the database,
and you don't want to recreate it. This is a particular problem for
the auth and contenttype autogenerated types.

#9422 is an edge case for serializing foreign keys.

#10295 is a feature request based around changing the fundamental
serialization format. There are a number of ways to tackle this
problem, ranging from the trivial (just include the extra attribute)
to the complex (provide a fully flexible rendering mechanism so end
users can determine what attributes are included during rendering).

The problem you describe is a combination of #9549. Your observations
about inherited models add complexity to the problem, but
fundamentally, you're talking about clobbering of existing data during
deserialization.

> I like the idea [5] of providing a Serializer class, defined similarly
> to the ModelAdmin class, to allow custom ways of serializing data,
> something like:
>
> # in serializers.py (or something)
> class ProductListing(serializers.Serializer):
>    fields = {
>        'Product': ['price', 'name', 'description'],
>        'Subscription': ['price', 'name', 'description', 'recurrence']
>    }
>
> serializers.register(ProductListing)

As I've noted elsewhere - why do we need to register anything?
Registration like this is only required when there is a default mode
of internal operation that you don't have control over. However,
serialization is controlled by a call to serializers.serialize(). You
have direct access to the arguments to pass in a set of serialization
instructions.
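
To illustrate, here is a minimal sketch in plain Python. It is not
Django's implementation: serialize_records() and its per-model "fields"
mapping are hypothetical names, standing in for instructions handed
straight to the call. The existing serialize() already accepts a flat
"fields" list in the same spirit.

```python
import json


def serialize_records(records, fields=None, indent=None):
    """Toy stand-in for a serialize() call (hypothetical, not Django's API).

    records: list of dicts with "pk", "model" and "fields" keys.
    fields:  optional mapping of model label -> field names to keep.
    """
    output = []
    for record in records:
        # No entry for a model (or no mapping at all) means "keep everything".
        wanted = None if fields is None else fields.get(record["model"])
        kept = {
            name: value
            for name, value in record["fields"].items()
            if wanted is None or name in wanted
        }
        output.append(
            {"pk": record["pk"], "model": record["model"], "fields": kept}
        )
    return json.dumps(output, indent=indent, sort_keys=True)


records = [
    {
        "pk": 1,
        "model": "app.product",
        "fields": {"price": 9.98, "name": "Product #1", "stock": 4},
    },
]

# The caller passes the instructions directly; nothing is registered anywhere.
listing = serialize_records(records, fields={"app.product": ["price", "name"]})
```

Leaving out the "fields" argument keeps every field, so the default
behaviour is unchanged.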

> $ python manage.py shell
> >>> from project.app import models
> >>> from django.core import serializers
> >>> print serializers.serialize('json', list(models.Product.objects.all()) +
> ...     list(models.Subscription.objects.all()),
> ...     serializer='ProductListing', indent=4)
> [{
>    "pk": 1,
>    "model": "app.product",
>    "fields": {
>        "price": 9.98,
>        "name": "Product #1",
>        "description": "Description for Product #1"
>    }
> },
> # other products...
> {
>    "pk": 13,
>    "model": "app.subscription",
>    "fields": {
>        "price": 14.98,
>        "name": "Subscription #1, Product #13",
>        "description": "My demonstrations aren't particularly creative.",
>        "recurrence": 12
>    }
> }]
>
> Not only is this backwards-compatible (no new serializers == no new
> behavior), but it also continues to allow deserialization: The
> deserialization process would see that app.subscription is a child of
> product, and fill in the data appropriately.

This isn't as simple as you make out. There is a very good reason that
the current serializers omit parent fields from the serialized fields
list. In order to create a Subscription with PK 13, you need to create
a Product with PK 13. Product.objects.all() contains a list of all
products... including a product with primary key #13. Unless you plan
on implementing a filter on Product.objects.all() that excludes all
instances that have subclasses that are also being serialized, you're
going to have headaches.
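
A rough sketch of the collision, using plain dicts rather than real
models. The data, dump() and pks_written_twice() are all hypothetical,
and the sketch assumes each child row shares its primary key with its
parent row:

```python
# Toy rows standing in for multi-table inheritance (hypothetical data):
# every Subscription shares its primary key with a parent Product row.
products = [
    {"pk": 1, "name": "Product #1"},
    {"pk": 13, "name": "Subscription #1, Product #13"},
]
subscriptions = [{"pk": 13, "recurrence": 12}]


def dump(products, subscriptions, skip_parents_of_children=False):
    """Emit products, then subscriptions with the parent fields inlined."""
    by_pk = {p["pk"]: p for p in products}
    child_pks = {s["pk"] for s in subscriptions}
    out = []
    for p in products:
        if skip_parents_of_children and p["pk"] in child_pks:
            continue  # the child record will carry this parent's data
        out.append(
            {"model": "app.product", "pk": p["pk"], "fields": {"name": p["name"]}}
        )
    for s in subscriptions:
        fields = {"name": by_pk[s["pk"]]["name"], "recurrence": s["recurrence"]}
        out.append({"model": "app.subscription", "pk": s["pk"], "fields": fields})
    return out


def pks_written_twice(records):
    """Return the parent PKs that more than one record would create."""
    seen, twice = set(), set()
    for record in records:
        if record["pk"] in seen:
            twice.add(record["pk"])
        seen.add(record["pk"])
    return sorted(twice)


naive = dump(products, subscriptions)
filtered = dump(products, subscriptions, skip_parents_of_children=True)
```

In the naive dump, PK 13 arrives twice: once as a plain Product and once
inside the Subscription. The filtered dump avoids that, but only because
the serializer now has to know about every subclass being serialized
alongside the parent.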

> This circumvents one of
> the largest drawbacks, IMO, of DjangoFullSerializers [6]; being able to
> deserialize this data is often just as important as serializing it
> (lookin' at you, fixtures).

Erm... what exactly was this epithet about? Django has an extensive
test suite that demonstrates the round-trip nature of fixtures. I'm
not for a second going to claim that there are no bugs in the existing
serializers, but I will claim a high degree of confidence that the
most common use cases work exactly as advertised.

As for the rest of this proposal: I'm all in favour of a class-based
declarative structure, and there is plenty of precedent for this sort
of approach (e.g., Feeds). However, this proposal doesn't really seem
to be exploiting the capabilities of having a class - it's really just
a bunch of keyword arguments that could just as easily be passed in to
the existing serialize() function. Once you strip away the new syntax,
all you really get for your effort is a set of fairly minor
modifications that allow you to tweak the existing base serialization
format (as well as introducing a bunch of new serialization problems
and ambiguities that you won't discover until you're knee deep in
fixtures).

Again, as I've noted elsewhere: The real challenge for serialization
is to allow for completely arbitrary serialization output formats -
for example, a list of (author name, book title) tuples. The goal here
isn't just to allow for minor tweaks around a base structure. The goal
is to allow for easy reconfiguration of serialized output to suit any
external data consumer - an AJAX library, for example. It's also worth
noting that there's no guarantee that an arbitrary serialization
format will be deserializable. While it's important that the default
Django serializers can maintain round trip serialization, that doesn't
necessarily have to be true for a customized serializer.
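
As a small sketch of what I mean, assuming hypothetical Author and Book
classes (plain dataclasses here, not Django models) and a made-up
author_title_pairs() renderer:

```python
import json
from dataclasses import dataclass


@dataclass
class Author:
    pk: int
    name: str
    email: str


@dataclass
class Book:
    pk: int
    title: str
    author: Author


def author_title_pairs(books):
    """Render books as JSON [author name, book title] pairs."""
    return json.dumps([[book.author.name, book.title] for book in books])


alice = Author(pk=1, name="Alice", email="alice@example.com")
books = [
    Book(pk=10, title="First Book", author=alice),
    Book(pk=11, title="Second Book", author=alice),
]

# Primary keys and the author's email are dropped, so the original rows
# cannot be rebuilt from this payload alone.
payload = author_title_pairs(books)
```

An AJAX consumer can use that output directly, but there is no way to
deserialize it back into the original objects. For a customized format
like this, that is an acceptable trade-off.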

Unless I'm missing something, the modifications contained in
DjangoFullSerializers pretty much already implement the interesting
tweaks you have described. However, the approach contained in
DjangoFullSerializers can't allow for a completely arbitrary
serialization output. This is the challenge that a strong GSoC
application will tackle.

Yours,
Russ Magee %-)
