Hey Piotr,

Here are a few comments...
You have 'fields' and 'exclude' options, but it feels like an 'include' option is missing. How would you represent serializing all the fields on a model instance (without replicating them), while additionally including one other field? I can see that you could do that by explicitly adding a Field declaration, but 'include' seems like an obvious complement to the other two options.

I'd second Russell's comment about aliases. Defining a label on the field would seem tidier.

Likewise the comment about 'preserve_field_order'. I've still got this in 'django-serializers' at the moment, but I think it's something that should become private. (At an implementation level it's still needed, in order to make sure you can exactly preserve the field ordering for json and yaml dumpdata, which is unsorted, being determined by Python's dict key ordering.)

Being able to nest serializers inside other serializers makes sense, but I don't understand why you need to be able to nest fields inside fields. Shouldn't serializers be used to represent complex output, and fields be used to represent flat output?

When defining custom fields, it'd be good if there was a way of overriding the serialization that's independent of how the field is retrieved from the model. For example, with model relation fields you'd like to be able to subclass between representing as a natural key, representing as a URL, representing as a string name, etc., without having to replicate all the logic that handles the differences between a relationship, multiple relationships, and reverse relationships.

The 'class_name' option for deserialization makes too many assumptions. The class being deserialized may not be present in the data. For example, if you're building an API, the class being deserialized might depend on the URL that the data is being sent to, e.g. "http://example.com/api/my-model/12".

In your dumpdata serializer, how do you distinguish the 'fields' field that represents the entire object being serialized from a 'fields' attribute of the object being serialized?

Also, the existing dumpdata serialization only serializes local fields on the model. If you're using multi-table inheritance, only the child's fields will be serialized, so you'll need some way of handling that.

Your PKFlatField implementation will need to be a bit more complex in order to handle e.g. many-to-many relationships. You'll also want to make sure you're accessing the pks from the model without causing another database lookup.

Is there a particular reason you've chosen to drop 'depth' from the API? Wouldn't it sometimes be useful to specify the depth you want to serialize to?

There are two approaches you can take to declaring the 'xml' format for dumpdata, given that it doesn't map nicely onto the json and yaml formats. One is to define a custom serializer (as you've done); the other is to keep the serializer the same and define a custom renderer (or encoder, or whatever you want to call the second stage). Of the two, I think the second is probably the simpler, cleaner approach.

When you come to writing a dumpdata serializer, you'll find there are quite a few corner cases that you'll need to deal with in order to maintain full byte-for-byte backwards compatibility, including how natural keys are serialized, how many-to-many relationships are encoded, how None is handled for different types, down to making sure you preserve the correct field ordering across each of json/yaml/xml. I *think* that getting all of those details right will end up being awkward to express using your current approach. The second approach would be to serialize to a dict-like format that can easily be encoded into json or yaml, but that can also include metadata specific to particular encodings such as xml (or perhaps, say, html).
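To make the 'include' suggestion concrete, here's a minimal sketch of how it might sit beside 'fields' and 'exclude'. All of the names here are assumptions for illustration, not your actual API:

```python
class Serializer:
    class Meta:
        fields = ()       # explicit whitelist; empty means "all fields"
        exclude = ()      # drop these from the instance's fields
        include = ()      # extra fields added on top of the instance's fields

    def field_names(self, obj):
        meta = self.Meta
        # Hypothetically use the instance's attributes when no whitelist given.
        names = list(meta.fields) or [
            name for name in vars(obj) if name not in meta.exclude
        ]
        return names + list(meta.include)

class ArticleSerializer(Serializer):
    class Meta(Serializer.Meta):
        exclude = ('secret',)
        include = ('absolute_url',)   # one extra field, everything else kept

class Article:
    def __init__(self):
        self.title = 'Hello'
        self.secret = 'hidden'

assert ArticleSerializer().field_names(Article()) == ['title', 'absolute_url']
```

The point being that "all the model's fields, plus one more" becomes a one-line declaration instead of replicating every field.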
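On the nesting point, this is roughly the distinction I have in mind, again with invented names: a nested serializer handles the complex sub-object, while fields stay flat:

```python
class CharField:
    def serialize(self, value):
        return str(value)

class AuthorSerializer:
    # A serializer nested inside another serializer handles structure.
    def serialize(self, author):
        return {'name': author.name}

class ArticleSerializer:
    fields = {'title': CharField(), 'author': AuthorSerializer()}

    def serialize(self, article):
        return {name: field.serialize(getattr(article, name))
                for name, field in self.fields.items()}

class Author:
    name = 'Piotr'

class Article:
    title = 'Serializers'
    author = Author()

assert ArticleSerializer().serialize(Article()) == {
    'title': 'Serializers', 'author': {'name': 'Piotr'}}
```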
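And here's a sketch of what I mean by separating "how the related object is retrieved" from "how it is represented" for relation fields. The class and method names are hypothetical; the single/many handling lives once in the base class, and subclasses only change the representation:

```python
class RelatedField:
    def __init__(self, attname, many=False):
        self.attname = attname
        self.many = many

    def serialize(self, instance):
        related = getattr(instance, self.attname)
        if self.many:
            return [self.to_representation(obj) for obj in related]
        return self.to_representation(related)

    def to_representation(self, obj):
        # Override point: default to the pk.
        return obj.pk

class NaturalKeyField(RelatedField):
    def to_representation(self, obj):
        return obj.natural_key()

class StringField(RelatedField):
    def to_representation(self, obj):
        return str(obj)

class Tag:
    def __init__(self, pk, slug):
        self.pk, self.slug = pk, slug
    def natural_key(self):
        return (self.slug,)
    def __str__(self):
        return self.slug

class Article:
    tags = [Tag(1, 'django'), Tag(2, 'gsoc')]

assert RelatedField('tags', many=True).serialize(Article()) == [1, 2]
assert NaturalKeyField('tags', many=True).serialize(Article()) == [('django',), ('gsoc',)]
assert StringField('tags', many=True).serialize(Article()) == ['django', 'gsoc']
```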
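On avoiding the extra lookup: the idea is to read the raw FK column (in Django terms, the field's attname, e.g. 'author_id') rather than following the relation, and to treat many-to-many as a list of pks (in Django you'd get those via values_list('pk', flat=True)). A sketch with stand-in objects, not Django's real internals:

```python
class PKField:
    """Serialize a relation as its primary key(s) without extra lookups."""
    def __init__(self, attname, many=False):
        self.attname = attname          # e.g. 'author_id' for a FK
        self.many = many

    def serialize(self, instance):
        value = getattr(instance, self.attname)
        if self.many:
            # In Django this would be qs.values_list('pk', flat=True);
            # here `value` is already an iterable of pks.
            return list(value)
        return value

class FakeArticle:
    author_id = 7                        # raw FK column, no join needed
    tag_ids = (1, 2, 3)                  # stand-in for an m2m pk queryset

article = FakeArticle()
assert PKField('author_id').serialize(article) == 7
assert PKField('tag_ids', many=True).serialize(article) == [1, 2, 3]
```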
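To sketch that second approach (all names illustrative): the serializer produces one intermediate dict, and separate renderers turn that same data into json or dumpdata-style xml, so the encoding-specific edge cases live in one renderer instead of forcing a different serializer per format:

```python
import json

def serialize(instance):
    # One intermediate structure shared by every output format.
    return {'model': 'app.article', 'pk': instance.pk,
            'fields': {'title': instance.title}}

def render_json(data):
    return json.dumps(data, sort_keys=True)

def render_dumpdata_xml(data):
    # A dumpdata-specific renderer can absorb the xml edge cases here,
    # while json/yaml keep consuming `data` untouched.
    fields = ''.join('<field name="%s">%s</field>' % kv
                     for kv in data['fields'].items())
    return '<object pk="%s" model="%s">%s</object>' % (
        data['pk'], data['model'], fields)

class Article:
    pk, title = 12, 'Hello'

data = serialize(Article())
assert json.loads(render_json(data))['fields']['title'] == 'Hello'
assert render_dumpdata_xml(data) == (
    '<object pk="12" model="app.article">'
    '<field name="title">Hello</field></object>')
```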
You'd have a generic xml renderer that handles encoding into fields and attributes in a fairly obvious way, and a dumpdata-specific renderer that handles the odd edge cases the dumpdata xml format requires. The dumpdata-specific renderer would use the same intermediate data that's used for json and yaml.

I hope all of that makes sense; let me know if I've not explained myself well anywhere.

Regards,
Tom

On Friday, 4 May 2012 21:08:14 UTC+1, Piotr Grabowski wrote:
> Hi,
>
> During this week I had a lot of work, so I didn't manage to present my revised proposal on Monday like I said. Sorry. I have it now: https://gist.github.com/2597306
>
> Next week I hope there will be some discussion about my proposal. I will also think about how it should be done under the hood. There should be some internal API. I should also resolve one Django ticket. I'm thinking of https://code.djangoproject.com/ticket/9279, which will provide good test cases for my future solution.
>
> Should I write my proposal on this group? On GitHub I have nice formatting, while in this group my Python code was badly formatted.
>
> --
> Piotr Grabowski

--
You received this message because you are subscribed to the Google Groups "Django developers" group. To view this discussion on the web visit https://groups.google.com/d/msg/django-developers/-/aafJHttP2QoJ.