Re: Customizable Serialization check-in
Hi Piotr,

Thank you so much for your efforts over the summer. I'd also like to apologise for my lack of communication; I certainly haven't been a model mentor over the course of the program.

Although we may not have achieved all the goals we set out to achieve at the start of the program, I don't think it's been a complete loss -- we've certainly thrashed out some interesting ideas, and between your work and Tom's, I'm sure we can salvage something that the community can make use of.

Now that the program has finished, if you have any feedback about what we could do differently next year, I'd love to hear it. Obviously, we'd like every SoC student to be a complete success, so if there's anything the Django team could do to improve the chances of success for next year's program, I'd like to be able to learn from the mistakes of this year.

Yours,
Russ Magee %-)

On Thu, Aug 23, 2012 at 8:14 AM, Piotr Grabowski wrote:
> Hi,
>
> Google Summer of Code has almost ended. [...]

--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To post to this group, send email to django-developers@googlegroups.com.
To unsubscribe from this group, send email to django-developers+unsubscr...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/django-developers?hl=en.
Re: Customizable Serialization check-in
Thanks Piotr,

It's been interesting and helpful watching your progress on this project.

I wouldn't worry too much about not quite meeting all the goals you'd hoped for - it's a deceptively difficult task. In particular it's really difficult trying to maintain full backwards compatibility with the existing fixture serialization implementation, whilst also totally redesigning the API to support the requirements of a more flexible serialization system.

Like you say, I think the overall direction of your project is right, and personally I've found it useful for my own work watching how you've tackled various parts of it.

All the best,

  Tom

On Thursday, 23 August 2012 01:14:26 UTC+1, Piotr Grabowski wrote:
> Hi,
>
> Google Summer of Code has almost ended. [...]

--
To view this discussion on the web visit https://groups.google.com/d/msg/django-developers/-/a2gBdTn5C6EJ.
Re: Customizable Serialization check-in
Hi,

Google Summer of Code has almost ended. I was working on customizable serialization. This project was a lot harder than I expected, and sadly, in my opinion, I failed to do it right. I want to apologize for that, and especially for my poor communication with this group and my mentor. I wanted to improve it after the midterm evaluation, but it only got worse.

I don't think my project is all wrong, but a lot of things are different from how I planned. Here is how it looks (I wrote more in the documentation):

There is a Serializer class that is made of two classes: NativeSerializer and FormatSerializer.
NativeSerializer serializes and deserializes Python objects to/from native Python datatypes.
FormatSerializer serializes and deserializes native Python datatypes to/from some format (xml, json, yaml).

I wanted NativeSerializer to be fully independent of FormatSerializer (and vice versa), but this isn't possible. Either NativeSerializer must return some additional data, or FormatSerializer must give NativeSerializer some context. For example, in xml all native Python datatypes must be converted to strings before serializing to xml. Some custom model fields can have a more sophisticated way of serializing to a string than unicode(), so `field.value_to_string` must be called, and `field` is only accessible in the NativeSerializer object. So either NativeSerializer will also return `field`, or FormatSerializer will inform NativeSerializer that it handles only text data.

Backward-compatible dumpdata is almost working. Only a few tests fail, but I am not sure why.

Nested serialization of fk and m2m related fields, which was the main functionality of this project, is working but not well tested. There are some issues, especially with xml. I must write a new xml format because the old one won't work with nested serialization.

I didn't do any performance tests. Running the full test suite takes 40 seconds longer with my serialization (about 1500s in total), if I remember correctly.

I will try to complete this project so that it will be at least bug free and usable. If someone is interested in using nested serialization, there is another great project: https://github.com/tomchristie/django-serializers

Code: https://github.com/grapo/django/tree/soc2012-serialization
Documentation: https://gist.github.com/3085250

--
Piotr Grabowski
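
The coupling described in this message (the format serializer telling the native serializer that it handles only text) can be illustrated with a small sketch. This is not the code from the soc2012-serialization branch: `ToyField`, `HexField`, `Color`, the `text_only` flag and the two format classes are hypothetical names invented for illustration, and only the method names `value_to_string` and `value_from_object` are borrowed from Django's field API.

```python
import json
from xml.etree import ElementTree as ET


class ToyField:
    """Hypothetical stand-in for a model field (not Django's Field class)."""

    def __init__(self, name):
        self.name = name

    def value_from_object(self, obj):
        return getattr(obj, self.name)

    def value_to_string(self, obj):
        return str(self.value_from_object(obj))


class HexField(ToyField):
    """Custom field whose text form differs from plain str()."""

    def value_to_string(self, obj):
        return hex(self.value_from_object(obj))


class NativeSerializer:
    """Phase one: object -> native Python datatypes."""

    def __init__(self, fields):
        self.fields = fields

    def serialize(self, obj, text_only=False):
        # text_only is the "context" handed over by the format serializer.
        if text_only:
            return {f.name: f.value_to_string(obj) for f in self.fields}
        return {f.name: f.value_from_object(obj) for f in self.fields}


class JsonFormatSerializer:
    text_only = False  # JSON can carry ints, strings, etc. directly

    def dump(self, native):
        return json.dumps(native, sort_keys=True)


class XmlFormatSerializer:
    text_only = True  # XML needs every value as a string

    def dump(self, native):
        root = ET.Element("object")
        for name, text in native.items():
            ET.SubElement(root, "field", name=name).text = text
        return ET.tostring(root, encoding="unicode")


def serialize(obj, native_serializer, format_serializer):
    """Phase two consumes whatever phase one produced for this format."""
    native = native_serializer.serialize(
        obj, text_only=format_serializer.text_only
    )
    return format_serializer.dump(native)


class Color:
    def __init__(self, label, rgb):
        self.label = label
        self.rgb = rgb


native = NativeSerializer([ToyField("label"), HexField("rgb")])
red = Color("red", 0xFF0000)
as_json = serialize(red, native, JsonFormatSerializer())
as_xml = serialize(red, native, XmlFormatSerializer())
```

In this sketch the same `NativeSerializer` serves both formats, but only because the format class exposes a flag that phase one reads, which is exactly the loss of independence the message describes.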
Re: Customizable Serialization check-in
Hi,

In the past 3 weeks, my project has changed a lot. First of all, I changed the output of the first phase of serialization. Previously it was native Python datatypes. At some point I added a dictionary with metadata to it. The metadata was used in the second phase of serialization. Now the first phase returns ObjectWithMetadata, which is a wrapper for native Python datatypes. It's a bit hackish, so I don't know if it is a good solution:

    class ObjectWithMetadata(object):
        def __init__(self, obj, metadata=None, fields=None):
            self._object = obj
            self.metadata = metadata or {}
            self.fields = fields or {}

        def get_object(self):
            return self._object

        def __getattribute__(self, attr):
            # Everything except the wrapper's own attributes is forwarded
            # to the wrapped object.
            if attr not in ['_object', 'metadata', 'fields', 'get_object']:
                return self._object.__getattribute__(attr)
            else:
                return object.__getattribute__(self, attr)

        # there are a few more methods like this (for acting like a
        # MutableMapping and Iterable) and all are similar
        def __getitem__(self, key):
            return self._object.__getitem__(key)
        ...

Thanks to this solution, ObjectWithMetadata acts like the object stored in _object in almost all cases (also in isinstance tests), and there is a place for storing additional data.

I didn't change deserialization, so its output is native Python datatypes without wrapping. I don't know if this is good, because there is no symmetry in this:

Django object -> python native datatype packed in ObjectWithMetadata -> json -> python native datatype -> Django object

I have all dumpdata formats working now (xml, json, yaml). All tests pass, but there is a problem with the order of fields in yaml. It will be fixed soon.

I made a new format, new_xml, which is similar to json and yaml. It's easier to parse.

Old:
rel="ManyToOneRel">1 rel="ManyToManyRel">
New:
1 1 2

There is also a problem with json and serialization to a stream: json uses an extension written in C (_json) for performance, and this leads to exceptions when ObjectWithMetadata is used, so before passing objects to json.dumps they should be unpacked from ObjectWithMetadata.

Probably there is no chance of achieving one of the most important requirements that I specified - using only one Serializer to serialize Django models to multiple formats:

    serializers.serialize('json', objects, serializer=MySerializer)
    serializers.serialize('xml', objects, serializer=MySerializer)

The trouble is with xml (like always ;). In xml every (model) field must be converted to a string before serializing in the xml serializer. In json and yaml, if a field has a protected type (string, int, datetime etc.) then nothing is done with it. The conversion is done in the first phase because only there is access to field.value_to_string - the field method that is used to convert a field value to a string. It can be overridden by the user, so simply doing smart_unicode in the second phase instead isn't enough.

Most important tasks in TODO:
* handling natural keys
* tests
  x correctness
  x performance (I suspect my solution will be worse than the one currently used in Django, but by how much?)
* documentation

https://github.com/grapo/django/tree/soc2012-serialization/django/core/serializers

--
Piotr Grabowski
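
The unpacking step mentioned in this message (stripping the wrapper before the data reaches the json encoder) might look roughly like the sketch below. The `ObjectWithMetadata` here is a deliberately simplified stand-in that does not forward attribute access, and `unwrap` is a hypothetical helper name, not a function from the project.

```python
import json


class ObjectWithMetadata:
    """Simplified stand-in for the wrapper: holds a value plus metadata."""

    def __init__(self, obj, metadata=None):
        self._object = obj
        self.metadata = metadata or {}

    def get_object(self):
        return self._object


def unwrap(value):
    """Recursively replace wrappers with the plain objects they hold."""
    if isinstance(value, ObjectWithMetadata):
        value = value.get_object()
    if isinstance(value, dict):
        return {key: unwrap(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [unwrap(item) for item in value]
    return value


wrapped = ObjectWithMetadata(
    {"pk": 1, "tags": ObjectWithMetadata([ObjectWithMetadata("a"), "b"])},
    metadata={"attributes": ["pk"]},
)
plain = unwrap(wrapped)                     # only dicts, lists and scalars remain
encoded = json.dumps(plain, sort_keys=True)
```

The metadata is lost at this point, which is acceptable for a format like JSON that has no use for it; a format that needs the metadata would have to read it before unwrapping.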
Re: Customizable Serialization check-in
On 11.07.2012 14:04, Russell Keith-Magee wrote:
> > There are still problems with the API and with how to do some things, but in my opinion it's going in the right direction.
>
> Generally, I agree. I still have some concerns however; mostly around the things that you're putting onto the Meta class. related_serializer, for example -- Why is this a single attribute in the meta, rather than a method? By using an attribute, you're saying that on any given serializer, *all* related objects will be serialised the same, and I don't see why that should be the case.

Not *all* related objects, but only those that aren't declared in the class definition. I think the related_serializer attribute is useful when you want to serialize all related objects in one way: to their primary key value, to their natural key value, or to the dumpdata format. If you want to make an exception for some fields, then you declare them in the class definition.

    class MySerializer(ModelSerializer):
        special_object = SpecialSerializer()

        class Meta:
            related_serializer = PkSerializer

In this case all related objects except special_object will be serialized to their pk value.

What more would you do with a related_serializer method? If you want to serialize some related objects with one serializer and some with another, the simplest way is to declare this in the class definition. I see only two cases where a method would be needed: if you want to pick the serializer by some pattern in the field name, or if you want to pick the serializer by the related object type (m2m, fk). Then you can override the get_object_field_serializer(self, obj, field_name) method to do it. By default this method returns related_serializer or field_serializer based on the field type. Maybe a good idea would be to split this method in two, one for related objects and one for non-related ones. Then overriding it would be very similar to setting an attribute in Meta, but I think attributes are more "declarative".

> The same argument goes for class_name (which I think I've mentioned before), field_serializer, and so on.

And there is a method for that :)

    def create_instance(self, serialized_obj):
        if self.opts.class_name is not None:
            if isinstance(self.opts.class_name, str):
                return _get_model(serialized_obj[self.opts.class_name])()
            else:
                return self.opts.class_name()
        raise base.DeserializationError(u"Can't resolve class for object creation")

Maybe it isn't the proper way to do this - there are two ways of doing the same operation - but I think this is the simplest solution for the end user.

> The only fields that I can see that *should* be declarative are 'fields' and 'exclude' -- and if you've been tracking django-dev recently, there's been a discussion about whether the idea of 'exclude' should be deprecated from Django APIs (due to potential security issues -- explicit inclusion is safer than implicit inclusion, because you can accidentally forget to exclude sensitive data from an output list)

I have read this discussion. I'm +1 to deprecate 'exclude' :) Personally I almost never use it.

> Some other API questions:
>
> Why is deserialized_value decoupled from set_object? It isn't obvious to me why this separation exists.

It's possible that I overcomplicated this. There are three methods: set_object, deserialize and deserialize_value. When you want to deserialize an object, you should:
* Ensure that this is a proper object, not a list of objects or a dict (dict in deserialization is another problem - I will present it below). The 'deserialize' method will handle this - it recursively deserializes lists and dicts.
* Do some processing on the object you get (e.g. change a string to an int). The 'deserialize_value' method will handle this.
* Set this object on the upper-level object. The 'set_object' method will handle this. There shouldn't be a reason to override it very often.

I think deserialize_value will be the method that users most often need to override. I would be willing to merge deserialize and deserialize_value, but set_object should be left as is.

Problem with deserializing dicts: in the current implementation there is no way during deserialization to tell whether a given dict is a serialized object or a dict of objects. So it might be better not to automatically serialize dicts, but to leave it to the user's decision?

> I see where you're going with metainfo on fields (and that's a reasonably elegant way of tackling the problem of XML needing additional info to serialize), but what is the purpose of metadata on Serializers?
>
> Yours,
> Russ Magee %-)

Because a Serializer should also be able to give additional info to the format serializer. For example, which fields should be treated as attributes (pk and model in dumpdata).

--
Piotr Grabowski
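
The trade-off debated above (a single declarative related_serializer default versus a method that chooses per field) can be sketched with toy classes. Everything below is hypothetical and works on plain dicts rather than Django models; only the names `related_serializer` and `get_object_field_serializer` come from the message, and their real signatures and behaviour in the branch may differ.

```python
class PkSerializer:
    """Serialize a related object to its primary key only."""

    def serialize(self, related):
        return related["pk"]


class NestedSerializer:
    """Serialize a related object in full."""

    def serialize(self, related):
        return dict(related)


class ToySerializer:
    # Plays the role of Meta.related_serializer: one default for all
    # related objects that are not declared explicitly.
    related_serializer = PkSerializer
    # Plays the role of fields declared in the class body.
    declared = {}

    def get_object_field_serializer(self, obj, field_name):
        if field_name in self.declared:
            return self.declared[field_name]
        return self.related_serializer()

    def serialize(self, obj):
        return {
            name: self.get_object_field_serializer(obj, name).serialize(value)
            for name, value in obj.items()
        }


class DeclaredSerializer(ToySerializer):
    # Declarative exception for a single field.
    declared = {"author": NestedSerializer()}


class PatternSerializer(ToySerializer):
    # Method override: choose the serializer from a field-name pattern.
    def get_object_field_serializer(self, obj, field_name):
        if field_name.endswith("_full"):
            return NestedSerializer()
        return super().get_object_field_serializer(obj, field_name)


article = {
    "author": {"pk": 7, "name": "Ann"},
    "editor_full": {"pk": 9, "name": "Bob"},
}
default_out = ToySerializer().serialize(article)
declared_out = DeclaredSerializer().serialize(article)
pattern_out = PatternSerializer().serialize(article)
```

The attribute covers the common case in one line, the declared field covers a named exception, and the method override is only needed when the choice depends on something computed, such as a name pattern.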
Re: Customizable Serialization check-in
On Wed, Jul 11, 2012 at 8:18 AM, Piotr Grabowski wrote:
> Hi,
>
> It is time for the midterm evaluation of my participation in GSoC, so I want to
> summarize in this check-in what I have done in the last month.
> https://gist.github.com/3085250 - here is something that can serve as
> "documentation". I wrote some examples of ModelSerializer usage and how it
> should work.
> https://github.com/grapo/django - the code that I wrote is in the
> soc2012-serialization branch.

It's good that you're starting to work on some documentation -- my feedback is that you need to think about the purpose of this documentation -- I can discover the API myself with Python's interactive shell; what that won't tell me is what output I will expect.

For example, you give an example of how to define a 'metadata' method, but you don't show the effect of adding that declaration on the output serialised object. In fact, there doesn't seem to be a single example of serialised *output* in the whole docs. Giving lots of code examples of input doesn't really help me unless I know how that input will shape the output. This is especially important when we're dealing with serializers.

> There are still problems with the API and with how to do some things, but in my
> opinion it's going in the right direction.

Generally, I agree. I still have some concerns however; mostly around the things that you're putting onto the Meta class. related_serializer, for example -- Why is this a single attribute in the meta, rather than a method? By using an attribute, you're saying that on any given serializer, *all* related objects will be serialised the same, and I don't see why that should be the case.

The same argument goes for class_name (which I think I've mentioned before), field_serializer, and so on. The only fields that I can see that *should* be declarative are 'fields' and 'exclude' -- and if you've been tracking django-dev recently, there's been a discussion about whether the idea of 'exclude' should be deprecated from Django APIs (due to potential security issues -- explicit inclusion is safer than implicit inclusion, because you can accidentally forget to exclude sensitive data from an output list)

Some other API questions:

Why is deserialized_value decoupled from set_object? It isn't obvious to me why this separation exists.

I see where you're going with metainfo on fields (and that's a reasonably elegant way of tackling the problem of XML needing additional info to serialize), but what is the purpose of metadata on Serializers?

> Serialization and deserialization of Python objects is almost done. There is
> a fairly stable API; I used some ideas (and a little code) from
> https://github.com/tomchristie/django-serializers
> Objects are serialized to metadicts, which are dicts with additional data.
> This additional data can be used by the format serializer to change the
> presentation of the data (e.g. attributes in xml).
>
> Serialization of Django models has started. I don't know which model fields
> should be serialized by default: for sure all fields declared in the model.
> What about the pk field and reverse related fields?

Your goal here should be to exactly replicate Django's existing serializers. That means serialising all local model fields, with the PK being handled as a special case; reverse related fields aren't included.

> The JSON dumpdata serializer is more or less written - I have not done field
> sorting yet.
>
> I am sure that I can finish all this work by the end of GSoC.
>
> Sadly, not all is going well. Especially my communication on this list and
> with my mentor should be improved. It's all my fault. I should have written
> check-ins more regularly and met the deadlines that I set. I am not very
> satisfied with the progress I have made. Much more could have been done in
> about a month and a half.

My sincere apologies for not responding as often as I should. I haven't been a very good mentor for this project. I'll try and improve for the second half of the GSoC. I can see you've been getting some feedback from Tom Christie; the good news is that I'm generally in agreement with the feedback he's been giving you, so he hasn't been leading you astray :-)

If you ever want to get my attention for a solid block of time to kick around an idea, you can always grab me on IRC. I lurk in #django-dev most of the time.

Yours,
Russ Magee %-)
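
As an illustration of the two points made in this reply (documentation should show serialized output, and the target is the shape produced by Django's existing serializers), here is a minimal sketch of a dumpdata-style record. `to_fixture_record` is a hypothetical helper and the user data is invented; the only thing taken from Django is the general record layout with "model", "pk" and "fields" keys.

```python
import json


def to_fixture_record(model_label, pk, values, fields):
    """Build one dumpdata-style record from an explicit list of field names.

    Only names listed in `fields` are emitted (explicit inclusion), and the
    primary key is kept outside the "fields" mapping as a special case.
    """
    return {
        "model": model_label,
        "pk": pk,
        "fields": {name: values[name] for name in fields},
    }


user_values = {
    "username": "piotr",
    "email": "piotr@example.com",
    "password": "not-for-export",
}
record = to_fixture_record("auth.user", 1, user_values, fields=["username", "email"])
print(json.dumps([record], indent=2))
# [
#   {
#     "model": "auth.user",
#     "pk": 1,
#     "fields": {
#       "username": "piotr",
#       "email": "piotr@example.com"
#     }
#   }
# ]
```

Because the field list is explicit, the password never reaches the output unless someone adds it on purpose, which is the safety argument made above against 'exclude'.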
Re: Customizable Serialization check-in
Hi,

It is time for the midterm evaluation of my participation in GSoC, so I want to summarize in this check-in what I have done in the last month.

https://gist.github.com/3085250 - here is something that can serve as "documentation". I wrote some examples of ModelSerializer usage and how it should work.
https://github.com/grapo/django - the code that I wrote is in the soc2012-serialization branch.

There are still problems with the API and with how to do some things, but in my opinion it's going in the right direction.

Serialization and deserialization of Python objects is almost done. There is a fairly stable API; I used some ideas (and a little code) from https://github.com/tomchristie/django-serializers
Objects are serialized to metadicts, which are dicts with additional data. This additional data can be used by the format serializer to change the presentation of the data (e.g. attributes in xml).

Serialization of Django models has started. I don't know which model fields should be serialized by default: for sure all fields declared in the model. What about the pk field and reverse related fields?

The JSON dumpdata serializer is more or less written - I have not done field sorting yet.

I am sure that I can finish all this work by the end of GSoC.

Sadly, not all is going well. Especially my communication on this list and with my mentor should be improved. It's all my fault. I should have written check-ins more regularly and met the deadlines that I set. I am not very satisfied with the progress I have made. Much more could have been done in about a month and a half.

Regards,
Piotr Grabowski
Re: Customizable Serialization check-in
On 26.06.2012 11:52, Tom Christie wrote:
> > This is the way I am doing deserialization - pass the instance to subfields
>
> Seems fine. It's worth keeping in mind that there's two ways around of doing this.
>
> 1. Create an empty instance first, then populate it with the field values in turn.
> 2. Populate a dictionary with the field values first, and then create an instance using those values.
>
> The current deserialization does something closer to the second. I don't know if there's any issues with doing things the other way around, but you'll want to consider which makes more sense.

The second approach assumes that every field returns some value. But what if we don't want to deserialize some field? In my deserialization, the instance is passed to the field, and the field will eventually fill it with some value.

    def deserialize_value(self, obj, instance, field_name):
        setattr(instance, field_name, obj)

If we don't want to deserialize a field, we simply do nothing in deserialize_value. If the second approach is used, we must return a value. One idea is to mark the field as not deserializable:

    class MyField(Field):
        deserializable = False

> > Where I returned (native, attributes) I will return (native, metainfo). It's only a matter of renaming, but metainfo will be more than attributes.
>
> Again, there's two main ways around I can think of for populating metadata such as xml attributes.
>
> 1. Return the metadata upfront to the renderer.
> 2. Include some way for the renderer to get whatever metadata it needs at the point it's needed.
>
> This is one point where what I'm doing in django-serializers differs from your work, in that rather than return extra metadata upfront, the serializers return a dictionary-like object (that e.g. can be directly serialized to json or yaml), that also includes a way of returning the fields for each key (so that e.g. the xml renderer can call field.attributes() when it's rendering each field.)
>
> Again, you might decide that (1) makes more sense, but it's worth considering.
>
> As ever, if there's any of this you'd like to talk over off-list, feel free to drop me a mail - t...@tomchristie.com
>
> Regards,
>
>   Tom

I rewrote this so it's more similar to django-serializers. But from the beginning - what did I do this week? :)

I agree that xml attributes were given too much weight in my solution, so I want to change that. Attributes in xml are one of (two) ways of presenting information. I still want to have fields for attributes, but done this way:

    class MyField(Field):
        attr1 = Field()
        attr2 = Field()

        def serialized_value(self, obj, field_name):
            return field_value

        def metainfo(self):
            return {'attributes': ['attr1', 'attr2']}

JSON will skip the attributes entirely:
some_field : field_value

XML will render it:
field_value

If metainfo doesn't return a dict with attributes, XML will render this:
val1 val2 field_value

I coded it like django-serializers's DictWithMeta, but I added one more piece of functionality to represent a Field that has subfields and one extra value. I'm still not convinced it is a good solution, so I rewrote it several times, but I always ended up with something like this :)

I will push the code tomorrow because I still want to do some tweaks.

--
Piotr Grabowski
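
The two deserialization strategies discussed in this message, together with the proposed `deserializable = False` marker, can be compared in a small sketch. The classes and both helper functions are hypothetical and operate on a plain Python class, not on Django models.

```python
class Field:
    deserializable = True  # marker proposed in the message above

    def deserialize_value(self, obj, instance, field_name):
        setattr(instance, field_name, obj)


class ReadOnlyField(Field):
    deserializable = False


class Person:
    def __init__(self, name="", age=0):
        self.name = name
        self.age = age


FIELDS = {"name": Field(), "age": ReadOnlyField()}


def deserialize_instance_first(data, cls, fields):
    """Approach 1: create an empty instance, then let each field fill it."""
    instance = cls()
    for name, field in fields.items():
        if field.deserializable and name in data:
            field.deserialize_value(data[name], instance, name)
    return instance


def deserialize_dict_first(data, cls, fields):
    """Approach 2: collect values in a dict, then build the instance."""
    values = {
        name: data[name]
        for name, field in fields.items()
        if field.deserializable and name in data
    }
    return cls(**values)


payload = {"name": "Ann", "age": 99}
first = deserialize_instance_first(payload, Person, FIELDS)
second = deserialize_dict_first(payload, Person, FIELDS)
```

With the marker in place, both approaches can skip a field without needing an empty method override, so the flag removes the main objection to the dict-first approach in this toy setting.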
Re: Customizable Serialization check-in
> the default deserialized_value doesn't return anything. It sets the field value.

Cool, that's exactly what I meant.

> but declaring a function only to say "do nothing" isn't a good solution for me.

Declaring a method to simply 'pass' seems fine to me if you want to override it to do nothing.

> This is the way I am doing deserialization - pass the instance to subfields

Seems fine. It's worth keeping in mind that there's two ways around of doing this.

1. Create an empty instance first, then populate it with the field values in turn.
2. Populate a dictionary with the field values first, and then create an instance using those values.

The current deserialization does something closer to the second. I don't know if there's any issues with doing things the other way around, but you'll want to consider which makes more sense.

> Where I returned (native, attributes) I will return (native, metainfo). It's only a matter of renaming, but metainfo will be more than attributes.

Again, there's two main ways around I can think of for populating metadata such as xml attributes.

1. Return the metadata upfront to the renderer.
2. Include some way for the renderer to get whatever metadata it needs at the point it's needed.

This is one point where what I'm doing in django-serializers differs from your work, in that rather than return extra metadata upfront, the serializers return a dictionary-like object (that e.g. can be directly serialized to json or yaml), that also includes a way of returning the fields for each key (so that e.g. the xml renderer can call field.attributes() when it's rendering each field.)

Again, you might decide that (1) makes more sense, but it's worth considering.

As ever, if there's any of this you'd like to talk over off-list, feel free to drop me a mail - t...@tomchristie.com

Regards,

  Tom

On Wednesday, 20 June 2012 16:28:51 UTC+1, Piotr Grabowski wrote:
> On 20.06.2012 13:50, Tom Christie wrote:
> > > deserialized_value function with empty content
> >
> > Are you asking about how to be able to differentiate between a field that
> > deserializes to `None`, and a field that doesn't deserialize a value at all?
>
> No :) I had this problem before and I managed to resolve it - the default
> deserialized_value doesn't return anything. It sets the field value.
>
>     def deserialized_value(self, obj, instance, field_name):
>         setattr(instance, field_name, obj)
>
> This is the way I am doing deserialization - pass the instance to subfields,
> retrieve it from them (it should be the same instance, but in specific cases,
> e.g. an immutable instance, I can imagine that another instance of the same
> class is returned) and return it.
>
> If I don't declare a deserialized_value function, then the function from the
> base class is used. That's expected behavior. So how do I say "This field
> shouldn't be deserialized"? For now I declare:
>
>     def deserialized_value(self, obj, instance, field_name):
>         pass
>
> Admittedly, I can do anything in this function except set a value on the
> instance, but declaring a function only to say "do nothing" isn't a good
> solution for me.
>
> > > I changed the python datatype format returned from the serializer.serialize
> > > method. Now it is a tuple (native, attributes)
> >
> > I'm not very keen on either this, or on the way that attributes are
> > represented as fields. To me this looks like taking the particular
> > requirements of serializing to xml, and baking them deep into the API,
> > rather than treating them as a special case, and dealing with them in a
> > more decoupled and extensible way.
> >
> > For example, I'd rather see an optional method `attributes` on the `Field`
> > class that returns a dictionary of attributes. You'd then make sure that
> > when you serialize into the native python datatypes prior to rendering, you
> > also have some way of passing through the original Field instances to the
> > renderer in order to provide any additional metadata that might be required
> > in rendering the basic structure.
> >
> > Wiring up things this way around lets you support other formats that have
> > extra information attached to the basic structure of the data. As an
> > example use-case - In addition to json, yaml and xml, a developer might
> > also want to be able to serialize to say, a tabular HTML output. In order
> > to do this they might need to be able to attach template_name or widget
> > information to a field, that'd only be used if rendering to HTML.
> >
> > It might be that it's a bit late in the day for API changes like that, but
> > hopefully it at least makes clear why I think that treating XML attributes
> > as anything other than a special case isn't quite the right thing to do.
> > - Just my personal opinion of course :)
> >
> > Regards,
> >
> >   Tom
>
> You are right that I shouldn't have treated attributes as so special. I have
> an idea how to fix this. Where I returned (native, attributes) I will return
> (native, metainfo). It's only a matter of renaming, but metainfo will be more t
Re: Customizable Serialization check-in
On 20.06.2012 13:50, Tom Christie wrote:

> > deserialized_value function with empty content
>
> Are you asking about how to be able to differentiate between a field that deserializes to `None`, and a field that doesn't deserialize a value at all?

No :) I had this problem before and I managed to resolve it: the default deserialized_value doesn't return anything. It sets the field value.

    def deserialized_value(self, obj, instance, field_name):
        setattr(instance, field_name, obj)

This is the way I am doing deserialization: pass the instance to the subfields, retrieve it from them (it should be the same instance, but in specific cases, e.g. an immutable instance, I can imagine that another instance of the same class is returned) and return it.

If I don't declare a deserialized_value function then the function from the base class is used. That's the expected behavior. So how do I say "This field shouldn't be deserialized"? Now I declare:

    def deserialized_value(self, obj, instance, field_name):
        pass

In fact, I can do anything in this function except set a value on the instance, but declaring a function only to say "do nothing" isn't a good solution for me.

> > I changed python datatype format returned from serializer.serialize method. Now it is tuple (native, attributes)
>
> I'm not very keen on either this, or on the way that attributes are represented as fields. To me this looks like taking the particular requirements of serializing to xml, and baking them deep into the API, rather than treating them as a special case, and dealing with them in a more decoupled and extensible way.
>
> For example, I'd rather see an optional method `attributes` on the `Field` class that returns a dictionary of attributes. You'd then make sure that when you serialize into the native python datatypes prior to rendering, you also have some way of passing through the original Field instances to the renderer in order to provide any additional metadata that might be required in rendering the basic structure.
> Wiring up things this way around lets you support other formats that have extra information attached to the basic structure of the data. As an example use-case - In addition to json, yaml and xml, a developer might also want to be able to serialize to say, a tabular HTML output. In order to do this they might need to be able attach template_name or widget information to a field, that'd only be used if rendering to HTML.
>
> It might be that it's a bit late in the day for API changes like that, but hopefully it at least makes clear why I think that treating XML attributes as anything other than a special case isn't quite the right thing to do. - Just my personal opinion of course :)
>
> Regards,
>
> Tom

You're right that I shouldn't have treated attributes as so special. I have an idea how to fix this. Where I returned (native, attributes) I will return (native, metainfo). It's only a matter of renaming, but metainfo will be more than attributes. In xml, metainfo can contain attributes for a field; in html it can be a template_name or a widget for rendering. If I don't use metainfo in my serializer class then it's still universal: it can be used for serialization to any format.

How to create metainfo? Having a method `metainfo` on the `Field` class that returns a dictionary seems to be a good idea, and it works for the html use case. But what to do with xml attributes again? :) They aren't only field meta information; they can also contain instance information that is valuable in deserialization (like the instance pk in the current Django solution), so they should be treated as fields and should have access to the instance in serialization and deserialization.

My last thought is that attributes should be treated as normal fields and live in the tuple's native object, and metainfo will carry the information for xml about which fields in native should be rendered as attributes.
After the first phase:

    native == {
        'field_1': value1,
        'field_2': value2,
        'field_3': value3,
    }
    metainfo == {
        'as_attributes': ['field_2', 'field_3'],
        'template_name': 'my_template',
    }

So if we use json in the second phase, field_2 and field_3 will be rendered the same way as field_1, because json doesn't read metainfo. Xml will render fields according to metainfo['as_attributes']. Html will render the native dict using my_template.

--
Piotr Grabowski
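The (native, metainfo) idea from this message can be sketched with two tiny renderers built on the standard library. This is only an illustration under stated assumptions: `render_json`, `render_xml` and the `object` tag name are hypothetical and invented for this example; only the `as_attributes` key comes from the message itself.

```python
import json
from xml.etree import ElementTree as ET


def render_json(native, metainfo):
    """JSON ignores metainfo, so every field is rendered the same way."""
    return json.dumps(native, sort_keys=True)


def render_xml(native, metainfo, tag="object"):
    """XML reads metainfo['as_attributes'] to choose attribute or child."""
    as_attributes = set(metainfo.get("as_attributes", ()))
    element = ET.Element(tag)
    for name in sorted(native):
        value = str(native[name])  # XML needs text, so stringify here
        if name in as_attributes:
            element.set(name, value)
        else:
            ET.SubElement(element, name).text = value
    return ET.tostring(element, encoding="unicode")


native = {"field_1": 1, "field_2": 2, "field_3": 3}
metainfo = {"as_attributes": ["field_2", "field_3"]}

json_output = render_json(native, metainfo)
xml_output = render_xml(native, metainfo)
print(json_output)
print(xml_output)
```

The native dict stays format-neutral here; only the XML renderer looks at metainfo, which matches the behaviour described for the second phase.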
Re: Customizable Serialization check-in
> if I put list in input I want list in output, not generator

I wouldn't worry about that. The input and output should be *comparable*, but it doesn't mean they should be *identical*. A couple of cases for example:

*) You should be able to pass both lists and generator expressions to a given serializer, but they'll end up with the same representation - there's no way to distinguish between the two cases and deserialize accordingly.

*) Assuming you're going to maintain backwards compatibility, model instances will be deserialized into django.core.serializers.base.DeserializedObject instances, rather than deserializing directly back into complete model instances. These match up with the original serialized instances, but they are not identical objects.

> deserialized_value function with empty content

Are you asking about how to be able to differentiate between a field that deserializes to `None`, and a field that doesn't deserialize a value at all?

I'd suggest that the deserialization hook for a field needs to take eg. the dictionary that the value should be deserialized into, then it can determine which key to deserialize the field into, or simply 'pass' if it doesn't deserialize a value.

> I changed python datatype format returned from serializer.serialize method. Now it is tuple (native, attributes)

I'm not very keen on either this, or on the way that attributes are represented as fields. To me this looks like taking the particular requirements of serializing to xml, and baking them deep into the API, rather than treating them as a special case, and dealing with them in a more decoupled and extensible way.

For example, I'd rather see an optional method `attributes` on the `Field` class that returns a dictionary of attributes.
You'd then make sure that when you serialize into the native python datatypes prior to rendering, you also have some way of passing through the original Field instances to the renderer in order to provide any additional metadata that might be required in rendering the basic structure.

Wiring up things this way around lets you support other formats that have extra information attached to the basic structure of the data. As an example use-case - In addition to json, yaml and xml, a developer might also want to be able to serialize to say, a tabular HTML output. In order to do this they might need to be able attach template_name or widget information to a field, that'd only be used if rendering to HTML.

It might be that it's a bit late in the day for API changes like that, but hopefully it at least makes clear why I think that treating XML attributes as anything other than a special case isn't quite the right thing to do. - Just my personal opinion of course :)

Regards,

  Tom

On Tuesday, 19 June 2012 21:48:37 UTC+1, Piotr Grabowski wrote:
>
> Hi!
>
> This week I wrote simple serialization and deserialization for json format
> so it's possible now to encode objects from and to json:
>
> import django.core.serializers as s
>
> class Foo(object):
>     def __init__(self):
>         self.bar = [Bar(), Bar(), Bar()]
>         self.x = "X"
>
> class Bar(object):
>     def __init__(self):
>         self.six = 6
>
> class MyField2(s.Field):
>     def deserialized_value(self, obj, instance, field_name):
>         pass
>
> class MyField(s.Field):
>     x = MyField2(label="my_attribute", attribute=True)
>
>     def serialized_value(self, obj, field_name):
>         return getattr(obj, field_name, "No field like this")
>
>     def deserialized_value(self, obj, instance, field_name):
>         pass
>
> class BarSerializer(s.ObjectSerializer):
>     class Meta:
>         class_name = Bar
>
> class FooSerializer(s.ObjectSerializer):
>     my_field=MyField(label="MYFIELD")
>     bar = BarSerializer()
>     class Meta:
>         class_name = Foo
>
> foos = [Foo(), Foo(), Foo()]
> ser = s.serialize('json', foos, serializer=FooSerializer, indent=4)
> new_foos = s.deserialize('json', ser, deserializer=FooSerializer)
>
> There are cases that I don't like:
>
> - deserialized_value function with empty content - what to do with fields that we don't want to deserialize. Should be better way to handle this,
> - I put list foos but return generator new_foos, also bar in Foo object is generator, not list like in input. Generators are better for performance but if I put list in input I want list in output, not generator. I don't know what to do with this.
>
> Next week I will handle rest of issues that I mentioned in my last week
> check-in and refactor json format (de)serialization - usage of streams and
> proper parameters handling (like indent, etc.)
>
> --
> Piotr Grabowski
Re: Customizable Serialization check-in
Hi!

This week I wrote simple serialization and deserialization for the json format, so it's now possible to encode objects to json and decode them back:

    import django.core.serializers as s

    class Foo(object):
        def __init__(self):
            self.bar = [Bar(), Bar(), Bar()]
            self.x = "X"

    class Bar(object):
        def __init__(self):
            self.six = 6

    class MyField2(s.Field):
        def deserialized_value(self, obj, instance, field_name):
            pass

    class MyField(s.Field):
        x = MyField2(label="my_attribute", attribute=True)

        def serialized_value(self, obj, field_name):
            return getattr(obj, field_name, "No field like this")

        def deserialized_value(self, obj, instance, field_name):
            pass

    class BarSerializer(s.ObjectSerializer):
        class Meta:
            class_name = Bar

    class FooSerializer(s.ObjectSerializer):
        my_field=MyField(label="MYFIELD")
        bar = BarSerializer()
        class Meta:
            class_name = Foo

    foos = [Foo(), Foo(), Foo()]
    ser = s.serialize('json', foos, serializer=FooSerializer, indent=4)
    new_foos = s.deserialize('json', ser, deserializer=FooSerializer)

There are cases that I don't like:

* deserialized_value function with empty content - what to do with fields that we don't want to deserialize? There should be a better way to handle this.
* I put in a list foos but get back a generator new_foos; also bar in the Foo object is a generator, not a list like in the input. Generators are better for performance, but if I put list in input I want list in output, not generator. I don't know what to do with this.

Next week I will handle the rest of the issues that I mentioned in last week's check-in and refactor the json format (de)serialization: usage of streams and proper parameter handling (like indent, etc.)

--
Piotr Grabowski
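Both open questions in this message (opting a field out of deserialization, and getting a list back when a list went in) can be explored with a small stand-alone sketch. It is only one possible direction under stated assumptions: the `deserialize` class attribute, `ReadOnlyField`, `apply_fields` and `restore_container` are hypothetical names invented here and are not part of the API in the soc2012-serialization branch.

```python
class Field(object):
    """Hypothetical stand-in for the Field class discussed in this thread."""

    # Assumed class-level switch; not part of the API in Piotr's branch.
    deserialize = True

    def deserialized_value(self, obj, instance, field_name):
        setattr(instance, field_name, obj)


class ReadOnlyField(Field):
    """Opts out of deserialization without overriding any method."""

    deserialize = False


def apply_fields(fields, data, instance):
    """Set each value from `data` on `instance`, honouring the opt-out flag."""
    for name, field in fields.items():
        if field.deserialize and name in data:
            field.deserialized_value(data[name], instance, name)
    return instance


def restore_container(original, items):
    """Return `items` as a list or tuple when the original input was one."""
    if isinstance(original, (list, tuple)):
        return type(original)(items)
    return items  # keep lazy iterables lazy


class Target(object):
    pass


fields = {"x": Field(), "secret": ReadOnlyField()}
target = apply_fields(fields, {"x": "X", "secret": "ignored"}, Target())
print(target.x, hasattr(target, "secret"))

lazy = (n * 2 for n in [1, 2, 3])
print(restore_container([1, 2, 3], lazy))
```

A flag keeps the "do nothing" intent declarative, while `restore_container` only materializes the generator when the caller originally passed a concrete sequence.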
Re: Customizable Serialization check-in
Hi!

This week I managed to write deserialization functions and tests.

*Issues with deserialization*

Working on deserialization gave me a lot of thoughts about the previous concepts. I rewrote the Field class so that a Field can no longer be nested. A Field can only have subfields if the subfields are attributes:

    class ContentField(Field):
        title = Field(attribute=True)  # valid
        content = Field()  # invalid -> raises an exception at class declaration time

        def serialized_value(...):
            ...

Of course, if ContentField is initialized as an attribute and has subfields, an exception is raised (when ContentField is initialized).

I changed the python datatype format returned from the serializer.serialize method. Previously it was a dict with the serialized fields (label or field name as key) and a special key __attributes__ with a dict of attributes. Now it is a tuple (native, attributes), where native is a dict with the serialized fields (or a generator of dicts).

serializer.deserialize always returns an object instance.

After the first phase of serialization, python_serialized_object will be serialized by a NativeFormat instance. Each format (json, xml, yaml, ...) has one NativeFormat that will translate python_serialized_object to serialized_string.

I want to be able to do this:

    object
    -> python_serial = object_serializer.serialize(object)
    -> string_serial = native_format.serialize(python_serial)
    -> python_deserial = native_format.deserialize(string_serial)
    -> object2 = object_serializer.deserialize(python_deserial)

where object2 has the same content as object.

Now I have:

    object
    -> python_serial = object_serializer.serialize(object)
    -> object2 = object_serializer.deserialize(python_serial)

*Tests*

I wrote some tests (NativeSerializersTests) for ObjectSerializer in django/tests/modeltests/serializers/tests.py, but I'm not sure this is a good place for them. I used a model (Article) defined in models.py, but I used it like a normal object. Relation fields aren't serialized in the proper way.

Until now I tested the most important functions of ObjectSerializer.
These are creating custom fields, attributes, and renaming fields (using labels).

Next I want to resolve issues with:

* Instance creation when deserializing. I have a create_instance method and Meta.class_name. I must make some public API from them.
* Ensuring that the Field serialize method always returns simple native python datatypes
* Writing a NativeFormat for (at least) json
* Finding better names for already defined classes, methods and files
* More tests and documentation

When I do this, serialization and deserialization will be more or less done for (non model) python objects.

--
Piotr Grabowski
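The four-step round trip described in this message can be tried out with a very small stand-alone sketch. It is an assumption-laden illustration, not the code from the branch: `Article` here is a plain hypothetical object (not the test model), and `ObjectSerializer` and `JsonFormat` are simplified stand-ins written for this example, with `JsonFormat` playing the role of a NativeFormat for json.

```python
import json


class Article(object):
    """Hypothetical plain object; not the Article model from the test suite."""

    def __init__(self, title="", body=""):
        self.title = title
        self.body = body


class ObjectSerializer(object):
    """Phase one: object <-> native python datatypes (here, a dict)."""

    def __init__(self, cls):
        self.cls = cls

    def serialize(self, obj):
        return dict(vars(obj))

    def deserialize(self, native):
        instance = self.cls()
        for name, value in native.items():
            setattr(instance, name, value)
        return instance


class JsonFormat(object):
    """Phase two: native python datatypes <-> string."""

    def serialize(self, native):
        return json.dumps(native, sort_keys=True)

    def deserialize(self, text):
        return json.loads(text)


object_serializer = ObjectSerializer(Article)
native_format = JsonFormat()

obj = Article(title="Hello", body="World")
python_serial = object_serializer.serialize(obj)
string_serial = native_format.serialize(python_serial)
python_deserial = native_format.deserialize(string_serial)
obj2 = object_serializer.deserialize(python_deserial)

print(string_serial)
print(vars(obj2) == vars(obj))
```

Keeping the two phases in separate classes means a second format only needs a new class with the same `serialize`/`deserialize` pair; the object phase is untouched.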
Re: Customizable Serialization check-in
Hi,

Sorry for being late with the weekly update. Due to some issues with Meta and my wrong understanding of metaclasses, which Russell pointed out, I spent time improving my knowledge of this. I also rewrote some parts of the code that I had written the week before.

This week I will do what I was supposed to do last week: initial tests and documentation. After this week serialization should work with simple objects.

--
Piotr Grabowski
Re: Customizable Serialization check-in
On 29.05.2012 02:28, Russell Keith-Magee wrote:

> Hi Piotr;
>
> Apologies for the delay in responding to your updated API.
>
> On Tue, May 22, 2012 at 6:59 AM, Piotr Grabowski wrote:
> > I do some changes to my previous API: (https://gist.github.com/2597306 <- change are included)
> >
> > * which fields of object are default serialized. It's depend on include_default_field but opposite to Tom Christie solution it's default value is True so all fields (eventually specified in Meta.model_fields) are present
>
> Field options:
> ~~
>
> * There's a complication here that doesn't make sense to me. Following your syntax, the following would appear to be legal:
>
>     class FieldA(Field):
>         def serialize(…):
>         def deserialize(…):
>
>     class FieldB(Field):
>         to = FieldA()
>         def serialize(…):
>         def deserialize(…):
>
>     class FieldC(Field):
>         to = FieldB(attribute=True)
>         def serialize(…):
>         def deserialize(…):
>
> i.e., if Field allows declaration style definitions, and Field can be *used* in declaration style definitions, then it's possible to define them in a nested fashion -- at which point, it isn't clear to me what is going to be output.
>
> It seems to me that "attribute" shouldn't be an option on a field declaration; it should either be something that's encompassed in a similar way to serialise/deserialize (i.e., either additional input/output from the serialise methods, or a parallel pair of methods), or the use of a Field as a declarative definition implies that it is of type attribute, and prevents the use of field types that themselves have attributes.

In the example that you present, I thought about raising an exception when FieldC is defined. Another option is to define the class as being an attribute:

    class FieldB(Field):
        to = FieldA()
        def serialize(…):
        def deserialize(…):
        class Meta:
            attribute=True

Then raise an exception when FieldB is defined, because of the 'to' field.
Still, one of my principles is to have one Serializer for all formats (or at least the possibility to serialize a Serializer in each format), and attribute is something really problematic.

About the value returned by Field.serialize (Serializer.serialize in general): now it is a dict with the key __attributes__; maybe it would be better to return a tuple (dict/field_value, attributes_dict), because of issues when there is no field_name and attributes are present.

> Field methods:
> ~~~
>
> * serialize_value(), deserialize_value(); this is bike shedding, but is there any reason to not use just "serialize() and deserialize()"?

I'm using serialize and deserialize in my code. Serializer.serialize(...) returns a native python datatype. It's a matter of naming, but in my opinion serialize is a method that should return the whole serialized Field/ObjectSerializer, not only part of the result (serialized_value returns only part of the data needed for Field serialization).

> ObjectSerializer methods:
>
> * Why does ObjectSerializer have options at all? How can it be "meta" operating on a generic object? Consider -- if you pass in an instance of an object, you'll need to use obj.field_name to access fields; if you pass in a dictionary, you'll need to use obj['field_name']. And if you're given a generic object what's the list of default fields to serialize? Like I said last time, ObjectSerializer should be completely definition based. Look at Django's Form base class - it has no "meta" concept -- it's fully declaration based. Then there's ModelForm, which has a meta class; but the output of the ModelForm could be completely manually generated using a base Form.

Ok, I think I finally get this idea. Before, I thought about class Meta more like options for the class it is in. ObjectSerializer is now more like ModelForm than like Form. I have an idea how to rewrite it and I will let you know when it is done.

> * I mentioned this last time -- why is class_name a meta option, rather than a method on the base class with a default implementation?
> Having it as an Meta attribute

I answered you last time; I should add this to the proposal. Probably I don't understand the issue.

    def get_class(self, data):
        if self._meta.class_name is not None:
            if isinstance(self._meta.class_name, str):
                # class_name names the key in data that holds the class
                return object_from_string(data[self._meta.class_name])
            else:
                return self._meta.class_name
        raise Exception('No class for deserialization provided')

If someone wants more sophisticated resolving of the class from data then he can override get_class. When I rewrite ObjectSerializer it will be different from this, but my idea is to have class_name as a shortcut for writing the get_class method.

> * I'm not wild about the way related_serializer seems to work, either. Again, like class_name, it seems like it should be a method, not an option. By making it an option, you're assuming that it will have a single obvious value, which definitely won't be true -- e.g., I have an object with relations to users, groups and permissions; I want to output users as a li
Re: Customizable Serialization check-in
Hi Piotr;

Apologies for the delay in responding to your updated API.

On Tue, May 22, 2012 at 6:59 AM, Piotr Grabowski wrote:
> I do some changes to my previous API: (https://gist.github.com/2597306 <- change are included)
>
> * which fields of object are default serialized. It's depend on include_default_field but opposite to Tom Christie solution it's default value is True so all fields (eventually specified in Meta.model_fields) are present

Field options:
~~

 * There's a complication here that doesn't make sense to me. Following your syntax, the following would appear to be legal:

    class FieldA(Field):
        def serialize(…):
        def deserialize(…):

    class FieldB(Field):
        to = FieldA()
        def serialize(…):
        def deserialize(…):

    class FieldC(Field):
        to = FieldB(attribute=True)
        def serialize(…):
        def deserialize(…):

i.e., if Field allows declaration style definitions, and Field can be *used* in declaration style definitions, then it's possible to define them in a nested fashion -- at which point, it isn't clear to me what is going to be output.

It seems to me that "attribute" shouldn't be an option on a field declaration; it should either be something that's encompassed in a similar way to serialise/deserialize (i.e., either additional input/output from the serialise methods, or a parallel pair of methods), or the use of a Field as a declarative definition implies that it is of type attribute, and prevents the use of field types that themselves have attributes.

Field methods:
~~~

 * serialize_value(), deserialize_value(); this is bike shedding, but is there any reason to not use just "serialize() and deserialize()"?

ObjectSerializer methods:

 * Why does ObjectSerializer have options at all? How can it be "meta" operating on a generic object? Consider -- if you pass in an instance of an object, you'll need to use obj.field_name to access fields; if you pass in a dictionary, you'll need to use obj['field_name'].
And if you're given a generic object what's the list of default fields to serialize? Like I said last time, ObjectSerializer should be completely definition based. Look at Django's Form base class - it has no "meta" concept -- it's fully declaration based. Then there's ModelForm, which has a meta class; but the output of the ModelForm could be completely manually generated using a base Form.

 * I mentioned this last time -- why is class_name a meta option, rather than a method on the base class with a default implementation? Having it as an Meta attribute

 * I'm not wild about the way related_serializer seems to work, either. Again, like class_name, it seems like it should be a method, not an option. By making it an option, you're assuming that it will have a single obvious value, which definitely won't be true -- e.g., I have an object with relations to users, groups and permissions; I want to output users as a list of nested objects, permissions as a list of natural keys, and groups as a list of primary keys.

 * I'm not sure I see why include_default_fields is needed. Isn't this implied by the values for "fields" and "exclude"? i.e., if fields or exclude is defined, you're not including everything by default; otherwise you are. Why the additional setting? What's the interaction of include_default_fields with fields and exclude?

 * I don't understand what follow_object is trying to do. Isn't the issue here whether you use a serializer that just outputs a primary key, or an object that outputs field values? And if it's the latter, the sub-serializer determines how deep things go?

ModelSerializer options:

 * I'm really not a fan of model_fields. This seems like a short cut that will make the implementation a whole lot more complex, and ultimately is much less explicit than just naming the fields that you want to serialize.

> I'm aware that there will be lot of small issues but I believe that ideas
> are good.
I'm still optimistic, but there's still some fundamental issues here -- in particular, the existence of Meta on ObjectSerializer, and the way that attributes on XML tags are being handled. I don't think we've hit any blockers, but we need to get these sorted out before you start producing too much code.

Yours,
Russ Magee %-)
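The point raised in this message about class_name being a method with a default implementation, rather than a Meta option, can be illustrated with a short sketch. Everything in it is an assumption made for illustration: the classes `User`, `Group`, `UserSerializer` and `TypedSerializer`, and the `type` key in the data, are hypothetical; only the method name `get_class` is borrowed from the thread.

```python
class ObjectSerializer(object):
    """Hypothetical base class; the method name get_class follows the thread."""

    def get_class(self, data):
        # Default implementation: subclasses are expected to override this.
        raise NotImplementedError("No class for deserialization provided")

    def deserialize(self, data):
        instance = self.get_class(data)()
        for name, value in data.items():
            setattr(instance, name, value)
        return instance


class User(object):
    pass


class Group(object):
    pass


class UserSerializer(ObjectSerializer):
    def get_class(self, data):
        return User  # a fixed class, the common case


class TypedSerializer(ObjectSerializer):
    """Chooses the class per item from an assumed 'type' key in the data."""

    registry = {"user": User, "group": Group}

    def get_class(self, data):
        return self.registry[data["type"]]


user = UserSerializer().deserialize({"name": "piotr"})
group = TypedSerializer().deserialize({"type": "group", "name": "core"})
print(type(user).__name__, type(group).__name__)
```

A method can inspect the incoming data, so the same hook covers both the fixed-class case and per-item dispatch, which a single option value cannot express.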
Re: Customizable Serialization check-in
On May 27, 7:37 pm, Piotr Grabowski wrote:
> Hi,
>
> This week I started coding my project. It' available on branch
> soc2012-serialization on https://github.com/grapo/django.
>
> I'm not very familiar with git so I'm not suer that I do it right:
> * I forked django repo from github
> * clone it to my computer
> * create new branch soc2012
> * work in this branch
> * push it to origin
>
> When I want to synchronize my branch with django trunk I will fetch
> master from upstream (django/django) and merge master to my branch.
> It's this flow good?

I think that is a good way to go. It might be that the branch will need some history rewriting when it is otherwise ready for commit, but until then keeping your history intact so that others can easily follow your work is good.

One piece of advice I have seen is that you should not merge upstream changes too often, as it will just mess up the history. You can easily enough create another branch where you test how your work interacts with the master branch. Only merge upstream into your soc2012 branch if the upstream changes are such that your work needs major changes because of them. Trivial merge conflicts do not require merging upstream back.

Another option is a rebase workflow for the branch, but in this case you should make it absolutely clear that others should not consider your github branch as anything else than a convenient way to publish your work as a patch series. The good thing about this way of working is that your changes will be on top of the commit log all the time, and thus it is very easy to see what you have done in your branch.

 - Anssi
Re: Customizable Serialization check-in
Hi,

This week I started coding my project. It's available on the branch soc2012-serialization at https://github.com/grapo/django.

I'm not very familiar with git so I'm not sure that I'm doing it right:

* I forked the django repo on github
* cloned it to my computer
* created a new branch soc2012
* work in this branch
* push it to origin

When I want to synchronize my branch with django trunk I will fetch master from upstream (django/django) and merge master into my branch. Is this flow good?

Until now I coded the base for Serializers and Fields. I haven't included any tests or documentation, so it can be hard to try it. I am pretty sure that writing appropriate docstrings will be a challenge for me :) I copied some metaclass code from django forms and models. You can instantiate ObjectSerializer and try to serialize some simple python objects with it. It will serialize all fields present in object.__dict__ and return a python native datatype.

The code is still in an early phase, so it's not polished and needs some refactoring, but if you have some tips for me I will be very grateful.

Next week I will fix some issues, code ModelSerializer and write documentation and tests for what I have done so far. I must also think about renaming some functions so the API will be more convenient.

--
Piotr Grabowski
Re: Customizable Serialization check-in
I have made some changes to my previous API (the changes are included in https://gist.github.com/2597306):

* Which fields of an object are serialized by default. This depends on include_default_fields; unlike in Tom Christie's solution, its default value is True, so all fields (optionally restricted by Meta.model_fields) are present.
* The follow_object attribute. In short: which object a Serializer's child Serializer should work on. Tom wrote about this in a previous mail, but I didn't fully understand the problem, so I gave him a bad answer. It is better described in the algorithm I present below.
* I got rid of the aliases and preserve_field_ordering options.
* I changed the class hierarchy:

    class Serializer(object)            # base class for serializing
    class Field(Serializer)             # class for serializing fields in objects
    class ObjectSerializer(Serializer)  # class for serializing objects
    class ModelSerializer(Serializer)   # class for serializing Django models

I have prepared a list of steps for the first phase of serialization. It's written in English-Python pseudocode :) I hope the indentation will be preserved. Serializer.serialize is a function that, for an object, returns a dict of Python native datatypes.

(Object|Model)Serializer.serialize(object, field_name (can be None), **options)
1. Get the object
    1.1. If the object is iterable then run this algorithm for all elements and return a list of the returned values
    1.2. If field_name is set then from the upper level we have an object Obj:
        1.2.1. If Meta.follow_object == True then work on Obj.field_name
        1.2.2. Else work on Obj
2. Find all fields Fs that should be serialized
    2.1. Get all fields declared in the Serializer
    2.2. Get all fields from Meta.fields
    2.3. If Meta.include_default_fields == True then get all fields whose type is valid in Meta.model_fields and which are not in Meta.exclude
3. Create dictionary A and for F in Fs:
    3.1. Find the serializer for F
        3.1.1. If F is declared in the Serializer then the serializer is explicitly declared
        3.1.2. Else get the serializer for F's type (m2m related etc.)
    3.2. Save in the dictionary: A[field_name] = serializer_value
        3.2.1. If the field has a label set then field_name = label
        3.2.2. If the field has attribute=True set then add it to the dictionary as A[__attributes__][field_name] = serializer_value
4. Return A

Field.serialize(object, field_name (can be None), **options)
1. Get the object
    1.1. If it is iterable then run this algorithm for all elements
    1.2. Work on the object Obj passed from the upper level
2. Find all fields Fs that should be serialized
    2.1. Get all fields from the declared fields
3. Create dictionary A and for F in Fs:
    3.1. Find the serializer for F
        3.1.1. F is in the declared fields, so the serializer is explicitly declared
    3.2. Save in the dictionary: A[field_name] = serializer_value
        3.2.1. If the field has a label set then field_name = label
        3.2.2. If the field has attribute=True set then add it to the dictionary as A[__attributes__][field_name] = serializer_value
4. Resolve the function serialized_value
    4.1. If Fs (and A) is empty:
        4.1.1. If the function field_name returns None then return serialized_value
        4.1.2. Else return {field_name() : serialized_value()}
    4.2. Else
        4.2.1. If the function field_name returns None then raise Exception
        4.2.2. Else A.update({field_name() : serialized_value()})
5. Return A

We now have a dict (or a list of dicts) from the first phase of serialization. Next, __attributes__ must be resolved (this depends on the format and strategy).

Deserialization (this is an early idea):

SomeSerializer.deserialize(D - python_native_datatype_objects (dict or list of dicts), instance=None, field_name=None, class_name=None, **options)
1. Get the object instance  # Resolving this may be more complicated than I wrote below (e.g. based on the fields of D - duck typing)
    1.1. If instance is not None then use it
    1.2. Else try to resolve class_name
        1.2.1. If class_name is a class object then instantiate it
        1.2.2. If class_name is a string then find the string value for this key in D and instantiate it
        1.2.3. If class_name is None then raise Exception
2. Find all fields in D and find the fields in the Serializer for deserializing them
    2.1. Resolve the label attribute for fields
3. Pass instance, data D and field_name to all field Serializers
4. Return instance

I'm aware that there will be a lot of small issues, but I believe the ideas are good.

--
Piotr Grabowski
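The (Object|Model)Serializer.serialize steps above can be sketched in plain Python. This is only a rough illustration, not the code from the project: the class attributes stand in for the Meta options, default fields are taken from vars(obj) instead of Django model metadata, the follow_object step runs before the iterable check so that a followed attribute may itself be a list, and fields with attribute=True are assumed to be stored only under __attributes__.

```python
class Field:
    """Hypothetical flat field: reads one attribute and returns it unchanged."""

    def __init__(self, label=None, attribute=False):
        self.label = label          # step 3.2.1: overrides the output key
        self.attribute = attribute  # step 3.2.2: value goes under __attributes__

    def serialize(self, obj, field_name):
        return getattr(obj, field_name)


class ObjectSerializer:
    """Hypothetical sketch of steps 1-4; class attributes stand in for Meta."""

    declared = {}                  # step 2.1: explicitly declared fields
    fields = ()                    # step 2.2: Meta.fields
    exclude = ()                   # Meta.exclude
    include_default_fields = True  # step 2.3
    follow_object = True           # step 1.2.1
    label = None
    attribute = False

    def serialize(self, obj, field_name=None):
        if field_name is not None and self.follow_object:
            obj = getattr(obj, field_name)            # step 1.2.1
        if isinstance(obj, (list, tuple)):            # step 1.1
            return [self.serialize(item) for item in obj]
        names = list(self.declared)                   # step 2.1
        names += [n for n in self.fields if n not in names]  # step 2.2
        if self.include_default_fields:               # step 2.3
            names += [n for n in vars(obj)
                      if n not in names and n not in self.exclude]
        result = {}
        for name in names:                            # step 3
            field = self.declared.get(name) or Field()  # steps 3.1.1 / 3.1.2
            value = field.serialize(obj, name)
            key = field.label or name                 # step 3.2.1
            if field.attribute:                       # step 3.2.2
                result.setdefault("__attributes__", {})[key] = value
            else:
                result[key] = value
        return result                                 # step 4


class Author:
    def __init__(self, name):
        self.name = name


class Book:
    def __init__(self, title, author):
        self.title = title
        self.author = author


class AuthorSerializer(ObjectSerializer):
    pass


class BookSerializer(ObjectSerializer):
    declared = {
        "title": Field(label="name", attribute=True),
        "author": AuthorSerializer(),
    }


data = BookSerializer().serialize(Book("Dune", Author("Frank Herbert")))
print(data)
```

Nesting falls out of the same call: the child serializer receives the parent object plus the field name and, because follow_object is True, works on the related object.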
Re: Customizable Serialization check-in
Hi,

This week I was focused on my exams. Now I have more time for the serialization project. Sadly, the API isn't finished yet. According to the GSoC calendar, coding starts on 21 May. Tomorrow I will send updates to the API proposal and present the idea of the algorithm (maybe "list of steps" would be a better name) used for serialization. On Wednesday 23 May I want to start coding, and on Saturday 27 May I will write the next check-in and present my initial code.

The first thing I want to code is the basis for the serializers.serializer method and the Serializer and Field classes. After the first two weeks I want to be able to serialize very simple objects to JSON.

As I wrote in my first proposal, I'm ready to spend 20 hours per week on this. In the first two weeks it will be less because of my university work.

--
Piotr Grabowski
Re: Customizable Serialization check-in
Hi,

This week I thought about the internal API for Serializer. I want developers to be able to use it eventually for better customization of their solutions.

Next week I must study for my exams, so I expect I will not do much on the serialization project. I will try to resolve some of the issues with my API that Tom Christie pointed out.

I know that I didn't do much, but at the end of the semester I have many tasks related to my studies. After the end of May I will have much more time.

--
Piotr Grabowski
Re: Customizable Serialization check-in
On 07.05.2012 20:13, Tom Christie wrote:
> Hey Piotr,
>
> Here's a few comments...
>
> You have 'fields' and 'exclude' option, but it feels like it's missing an 'include' option - How would you represent serializing all the fields on a model instance (without replicating them), and additionally including one other field? I see that you could do that by explicitly adding a Field declaration, but 'include' would seem like an obvious addition to the other two options.

By default, all model fields will be serialized, plus all fields added by Field declarations. If 'fields' is set, then only the fields listed in 'fields', plus the fields added by Field declarations, will be serialized. Too many fields :). If 'exclude' is set, then all model fields except those listed in 'exclude' will be serialized, plus the fields added by explicit declaration. I think it's like in a ModelForm declaration. Am I missing some case?

> I'd second Russell's comment about aliases. Defining a label on the field would seem more tidy. Likewise the comment about 'preserve_field_order'. I've still got this in for 'django-serializers' at the moment, but I think it's something that should become private. (At an implementation level it's still needed, in order to make sure you can exactly preserve the field ordering for json and yaml dumpdata, which is unsorted (determined by Python's dict key ordering).)

I answered Russell about that.

> Being able to nest serializers inside other serializers makes sense, but I don't understand why you need to be able to nest fields inside fields. Shouldn't serializers be used to represent complex outputs and fields be used to represent flat outputs?

At first I thought a Serializer should be tied to an object (one Serializer = one object). But then I figured out that a Serializer can work with the object passed from the upper-level Serializer (so the 'source' field isn't needed). Maybe nested serializers and flat fields is the better approach. I must consider this.
> The "class_name" option for deserialization is making too many assumptions. The class that's being deserialized may not be present in the data - for example if you're building an API, the class that's being deserialized might depend on the URL that the data is being sent to, e.g. "http://example.com/api/my-model/12"

I wrote about class_name in my answer to Russell. If the model class is in the URL then we can do something like this:

    serializers.deserialize("json", data_from_response, deserializer=UserSerializer(class_name=model_from_url(url)))

> In your dump data serializer, how do you distinguish that the 'fields' field is the entire object being serialized rather than the 'fields' attribute of the object being serialized?

fields = ModelFieldsSerializer(...) will be fed with the object to serialize and the name 'fields'. I'm only interested in its output. It must be a Python native datatype, and I do something like serialized_dict['fields'] = output_of_model_fields_serializer. ModelFieldsSerializer knows what to do with the object.

> Also, the existing dumpdata serialization only serializes local fields on the model - if you're using multi-table inheritance only the child's fields will be serialized, so you'll need some way of handling that.
>
> Your PKFlatField implementation will need to be a bit more complex in order to handle eg many to many relationships. Also, you'll want to make sure you're accessing the pk's from the model without causing another database lookup.

Thanks for pointing that out. I have to think about it.

> Is there a particular reason you've chosen to drop 'depth' from the API? Wouldn't it sometimes be useful to specify the depth you want to serialize to?

Sometimes, maybe, but in most cases no. And there are other ways to do that. In my opinion, going (globally) more than one level deep will almost never be needed. If there is a need to go deeper in only one (or a few, but not all) of the fields, 'depth' is unusable.
> There's two approaches you can take to declaring the 'xml' format for dumpdata, given that it doesn't map nicely to the json and yaml formats. One is to define a custom serializer (as you've done), the other is to keep the serializer the same and define a custom renderer (or encoder, or whatever you want to call the second stage). Of the two, I think that the second is probably a simpler cleaner approach.
>
> When you come to writing a dumpdata serializer, you'll find that there's quite a few corner cases that you'll need to deal with in order to maintain full byte-for-byte backwards compatibility, including how natural keys are serialized, how many to many relationships are encoded, how None is handled for different types, down to making sure you preserve the correct field ordering across each of json/yaml/xml. I *think* that getting the details of all of those will end up being awkward to express using your current approach.
>
> The second approach would be to a dict-like format, that can easily be encoded into json or yam
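The field-selection rules from the start of this reply ('fields', 'exclude', and explicitly declared fields) can be written down as a small function. This is a minimal sketch under my own naming: select_fields and its parameters are hypothetical, not part of the proposal or of Django.

```python
def select_fields(model_fields, declared=(), fields=None, exclude=()):
    """Return the ordered field names to serialize (hypothetical helper).

    model_fields: every field name found on the model
    declared:     names added through explicit Field declarations
    fields:       if given, only these model fields are kept
    exclude:      model fields to drop when 'fields' is not given
    """
    if fields is not None:
        chosen = list(fields)
    else:
        chosen = [name for name in model_fields if name not in exclude]
    # Declared fields are always added, whatever 'fields' or 'exclude' say.
    chosen += [name for name in declared if name not in chosen]
    return chosen


model_fields = ["id", "username", "email"]
print(select_fields(model_fields))
print(select_fields(model_fields, declared=["full_name"]))
print(select_fields(model_fields, fields=["username"], declared=["full_name"]))
print(select_fields(model_fields, exclude=["email"], declared=["full_name"]))
```

The second call is the case Tom asked about: all model fields plus one extra field, without a separate 'include' option.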
Re: Customizable Serialization check-in
On 06.05.2012 10:45, Russell Keith-Magee wrote:
> - I'm not sure I follow how class_name would be used in practice. The act of deserialization is to take a block of data, and process it to populate an object. In the simplest case, you could provide an empty instance (or factory) that is then populated by deserialization. In this case, no class name is required -- it's provided explicitly by the object you provide.

I have this functionality with class_name:

    serializers.deserialize("json", data, deserializer=UserSerializer(class_name=User))

> A more complex case is to use the data itself to determine the type of object to create. This seems to be the reason you have "class_name", but I'm not sure it's that simple. Consider a case where you're deserializing a list of objects; if the data has a "name" attribute, create a "Person" object, otherwise create a "Thing" object. The object required is well defined, but not neatly available in a field.

If we have a homogeneous list of objects there is no problem. We can use the same construction as above, or depend (via class_name) on some field in the object. But if the list is heterogeneous and we have no information about the type, then it's difficult. Is there a need for a feature like that? My first thought is to have a method in the Serializer class like:

    def get_class(self, data):
        # data is the object (dict?) produced by the first phase of deserialization;
        # the user can look for some field in it and return the class for object creation
        if 'name' in data:
            return Person
        return Thing

It can be more like an internal API, whose default could be:

    def get_class(self, data):
        if self._meta.class_name is not None:
            if isinstance(self._meta.class_name, str):
                # class_name is the key in data that holds the class path
                return object_from_string(data[self._meta.class_name])
            else:
                return self._meta.class_name
        raise Exception('No class for deserialization provided')

So if someone has simple needs, he can use the simple functionality like class_name=Profile, but if there is a need to find the class by duck typing, overriding get_class will be suitable.

> There's also no requirement that deserialization into an object is handled by a ModelSerializer. ModelSerializer should just be a convenient factory for populating a Serializer based on attributes of a model -- so anything you do with ModelSerializer should be possible to do manually with a Serializer. If class_name is tied to ModelSerializer, we lose this ability.

Yes, I made a mistake: where I wrote "ModelSerializer options" I should have written "Serializer options", because ModelSerializer is just a Serializer that understands the differences between kinds of fields in an object (m2m, fk, ...).

> - I'm not sure I see the purpose of "aliases" -- or, why this role can't be played by other parts of the system. In particular, I see Field() as a definition for how to fill out one 'piece' of a serialised object. Why doesn't Field() contain the logic for how to extract its value from the underlying object?

Previously I used it with an additional meaning -> if aliases[x] = aliases[y] then x = [value[x], value[y]], but now it's only a shortcut for writing:

1) fname = Field(label=first_name)
2) aliases = {'fname' : 'first_name'}

It's redundant, but I think it can be helpful.

> - Why is preserve_field_ordering needed? Can't ordering be handled by the explicit order of field definitions, or the ordering in the "fields" attribute?
I agree, ordering via the 'fields' attribute (like in Forms) will be better.

> * As a matter of style, serializer_field_value and deserialize_field_value seem excessively long as names. Is there something wrong with serialize and deserialize?

For now I want to reserve the names serialize and deserialize, because I think they would be more appropriate for the methods that return Python native datatypes after the first phase of serialization. If a user overrides them he can do anything he wants, as long as he returns native datatypes. But sure, (de)serializer_field_value seems to be too long. Any other proposals? Maybe get_value (because it must get the value from the object's field for serialization) and set_value (it sets the value of the object's field in deserialization)?

> * I don't think getattr() works quite how you think it does. In particular, I don't think:
>
>     getattr(instance, instance_field_name) = getattr(obj, field_name)
>
> will do what you think it does. I think you're looking for setattr() here.

Oops :) It should definitely be setattr there.

> * Can you elaborate some more on the XML attribute syntax in your proposal? One of your original statements (that I agree with) is that the "format" is independent of the syntax, and that a single set of formatting rules should be able to be used for XML or for JSON. The big difference between XML and JSON is that XML allows for values to be packed as attributes. I can see that you've got an 'attribute' argument on a Field, but it isn't clear to me how JSON would interpret this, or how XML would interpret: I
Re: Customizable Serialization check-in
Hey Piotr,

Here's a few comments...

You have 'fields' and 'exclude' option, but it feels like it's missing an 'include' option - How would you represent serializing all the fields on a model instance (without replicating them), and additionally including one other field? I see that you could do that by explicitly adding a Field declaration, but 'include' would seem like an obvious addition to the other two options.

I'd second Russell's comment about aliases. Defining a label on the field would seem more tidy. Likewise the comment about 'preserve_field_order'. I've still got this in for 'django-serializers' at the moment, but I think it's something that should become private. (At an implementation level it's still needed, in order to make sure you can exactly preserve the field ordering for json and yaml dumpdata, which is unsorted (determined by Python's dict key ordering).)

Being able to nest serializers inside other serializers makes sense, but I don't understand why you need to be able to nest fields inside fields. Shouldn't serializers be used to represent complex outputs and fields be used to represent flat outputs?

When defining custom fields it'd be good if there was a way of overriding the serialization that's independent of how the field is retrieved from the model. For example, with model relation fields, you'd like to be able to subclass between representing as a natural key, representing as a url, representing as a string name etc... without having to replicate all the logic that handles the differences between relationship, multiple relationships, and reverse relationships.

The "class_name" option for deserialization is making too many assumptions. The class that's being deserialized may not be present in the data - for example if you're building an API, the class that's being deserialized might depend on the URL that the data is being sent to, e.g. "http://example.com/api/my-model/12"

In your dump data serializer, how do you distinguish that the 'fields' field is the entire object being serialized rather than the 'fields' attribute of the object being serialized?

Also, the existing dumpdata serialization only serializes local fields on the model - if you're using multi-table inheritance only the child's fields will be serialized, so you'll need some way of handling that.

Your PKFlatField implementation will need to be a bit more complex in order to handle eg many to many relationships. Also, you'll want to make sure you're accessing the pk's from the model without causing another database lookup.

Is there a particular reason you've chosen to drop 'depth' from the API? Wouldn't it sometimes be useful to specify the depth you want to serialize to?

There's two approaches you can take to declaring the 'xml' format for dumpdata, given that it doesn't map nicely to the json and yaml formats. One is to define a custom serializer (as you've done), the other is to keep the serializer the same and define a custom renderer (or encoder, or whatever you want to call the second stage). Of the two, I think that the second is probably a simpler cleaner approach.

When you come to writing a dumpdata serializer, you'll find that there's quite a few corner cases that you'll need to deal with in order to maintain full byte-for-byte backwards compatibility, including how natural keys are serialized, how many to many relationships are encoded, how None is handled for different types, down to making sure you preserve the correct field ordering across each of json/yaml/xml. I *think* that getting the details of all of those will end up being awkward to express using your current approach.

The second approach would be to a dict-like format, that can easily be encoded into json or yaml, but that can also include metadata specific to particular encodings such as xml (or perhaps, say, html). You'd have a generic xml renderer, that handles encoding into fields and attributes in a fairly obvious way, and a dumpdata-specific renderer, that handles the odd edge cases that the dumpdata xml format requires. The dumpdata-specific renderer would use the same intermediate data that's used for json and yaml.

I hope all of that makes sense, let me know if I've not explained myself very well anywhere.

Regards,

Tom

On Friday, 4 May 2012 21:08:14 UTC+1, Piotr Grabowski wrote:
>
> Hi,
>
> I had a lot of work this week, so I didn't manage to present my
> revised proposal on Monday like I said. Sorry. I have it now:
> https://gist.github.com/2597306
>
> Next week I hope there will be some discussion about my proposal. I will
> also think about how it should be done under the hood. There should be some
> internal API. I should also resolve one Django ticket. I am thinking about
> https://code.djangoproject.com/ticket/9279. It would be good for
> test cases in my future solution.
>
> Should I write my proposal on this group? On GitHub I have nice
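Tom's second approach (one dict-like intermediate format, then a renderer per output format) might look roughly like the sketch below. It is an illustration only: render_json, render_xml and the sample data are hypothetical, the "__attributes__" metadata key is borrowed from Piotr's algorithm elsewhere in this thread, and none of the dumpdata edge cases are handled.

```python
import json
import xml.etree.ElementTree as ET

ATTRS = "__attributes__"  # metadata key; only the XML renderer treats it specially


def render_json(data):
    """JSON has no attributes, so this sketch folds them into ordinary keys."""
    def flatten(node):
        if isinstance(node, dict):
            merged = dict(node.get(ATTRS, {}))
            merged.update((k, flatten(v)) for k, v in node.items() if k != ATTRS)
            return merged
        if isinstance(node, list):
            return [flatten(item) for item in node]
        return node
    return json.dumps(flatten(data), sort_keys=True)


def render_xml(data, tag="object"):
    """Generic XML renderer: metadata becomes attributes, the rest child elements."""
    def build(name, node):
        element = ET.Element(name)
        if isinstance(node, dict):
            for key, value in node.get(ATTRS, {}).items():
                element.set(key, str(value))
            for key, value in node.items():
                if key != ATTRS:
                    element.append(build(key, value))
        elif isinstance(node, list):
            for item in node:
                element.append(build("item", item))
        else:
            element.text = str(node)
        return element
    return ET.tostring(build(tag, data), encoding="unicode")


# The same intermediate data feeds both renderers.
intermediate = {ATTRS: {"pk": 1}, "username": "piotr", "groups": ["admin", "staff"]}
json_output = render_json(intermediate)
xml_output = render_xml(intermediate)
print(json_output)
print(xml_output)
```

A dumpdata-specific XML renderer would be a second function over the same intermediate dict, which is the point of the design: the first stage does not need to know which format comes next.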
Re: Customizable Serialization check-in
On Sat, May 5, 2012 at 4:08 AM, Piotr Grabowski wrote:
> Hi,
>
> I had a lot of work this week, so I didn't manage to present my
> revised proposal on Monday like I said. Sorry. I have it now:
> https://gist.github.com/2597306

Hi Piotr,

At a high level, I think you're headed in the right direction. I like the way you've separated Field and Serializer, and I like the way that Serializer represents one "nesting level" of the final output (so if you want complex formats for a single object, such as with the way Django's JSON serializer has id, model and fields at the top level, you nest Serializers to suit).

Here's some specific feedback:

 * I can see that ModelSerializer will play an important part in your proposal. However, some of your API proposals seem a little unnecessary -- or are unclear why they're needed. Some areas that need clarification:

   - I'm not sure I follow how class_name would be used in practice. The act of deserialization is to take a block of data, and process it to populate an object. In the simplest case, you could provide an empty instance (or factory) that is then populated by deserialization. In this case, no class name is required -- it's provided explicitly by the object you provide.

     A more complex case is to use the data itself to determine the type of object to create. This seems to be the reason you have "class_name", but I'm not sure it's that simple. Consider a case where you're deserializing a list of objects; if the data has a "name" attribute, create a "Person" object, otherwise create a "Thing" object. The object required is well defined, but not neatly available in a field.

     There's also no requirement that deserialization into an object is handled by a ModelSerializer. ModelSerializer should just be a convenient factory for populating a Serializer based on attributes of a model -- so anything you do with ModelSerializer should be possible to do manually with a Serializer. If class_name is tied to ModelSerializer, we lose this ability.

   - I'm not sure I see the purpose of "aliases" -- or, why this role can't be played by other parts of the system. In particular, I see Field() as a definition for how to fill out one 'piece' of a serialised object. Why doesn't Field() contain the logic for how to extract its value from the underlying object?

   - Why is preserve_field_ordering needed? Can't ordering be handled by the explicit order of field definitions, or the ordering in the "fields" attribute?

 * As a matter of style, serializer_field_value and deserialize_field_value seem excessively long as names. Is there something wrong with serialize and deserialize?

 * I don't think getattr() works quite how you think it does. In particular, I don't think:

     getattr(instance, instance_field_name) = getattr(obj, field_name)

   will do what you think it does. I think you're looking for setattr() here.

 * Can you elaborate some more on the XML attribute syntax in your proposal? One of your original statements (that I agree with) is that the "format" is independent of the syntax, and that a single set of formatting rules should be able to be used for XML or for JSON. The big difference between XML and JSON is that XML allows for values to be packed as attributes. I can see that you've got an 'attribute' argument on a Field, but it isn't clear to me how JSON would interpret this, or how XML would interpret:

   - A Field that had multiple sub-Fields, all of which were attribute=True
   - A Field that had multiple sub-Fields, several of which were attribute=False
   - The difference between these two definitions by your formatting rules: subval main value

   In particular, why is the top level structure of the JSON serializer handled with nested Serializers, but the structure of the XML serializer is handled with nested Fields?

> Next week I hope there will be some discussion about my proposal. I will
> also think about how it should be done under the hood. There should be some
> internal API. I should also resolve one Django ticket. I am thinking about
> https://code.djangoproject.com/ticket/9279. It would be good for test cases
> in my future solution.

I would suggest that you don't spend *too* much time on this. It's certainly a good idea to get to know our committing procedures, and historically we've encouraged students to work on a small ticket as a way to do this. However, your project is unusual in that you've been accepted without a firm API proposal. Given that you won't really be able to work on the GSoC without an accepted proposal, I'd suggest that your API should take precedence in your pre-GSoC plans.

> Should I write my proposal on this group? On GitHub I have nice formatting,
> and in this group my Python code was badly formatted.

It's up to you; however, the problem with posting to a Gist (or similar) is that it's very hard to comment on specific parts of your proposal. I know code formatting is a pain in Google groups, but it is a much be
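Russell's getattr()/setattr() point from this message is easy to check. The snippet below is a small illustration with made-up classes (Source, Target and copy_field are not from the proposal): assigning to a getattr() call is rejected by the compiler, and setattr() is the call that copies a value onto another object.

```python
class Source:
    first_name = "Piotr"


class Target:
    pass


def copy_field(obj, field_name, instance, instance_field_name):
    """Copy one attribute from obj onto instance, possibly under another name."""
    setattr(instance, instance_field_name, getattr(obj, field_name))
    return instance


# Assigning to a function call is not valid Python; compile() rejects it
# before anything runs.
assignment_is_invalid = False
try:
    compile("getattr(instance, name) = getattr(obj, name)", "<example>", "exec")
except SyntaxError:
    assignment_is_invalid = True

target = copy_field(Source(), "first_name", Target(), "fname")
print(assignment_is_invalid, target.fname)
```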
Re: Customizable Serialization check-in
Hi,

I had a lot of work this week, so I didn't manage to present my revised proposal on Monday like I said. Sorry. I have it now: https://gist.github.com/2597306

Next week I hope there will be some discussion about my proposal. I will also think about how it should be done under the hood. There should be some internal API.

I should also resolve one Django ticket. I am thinking about https://code.djangoproject.com/ticket/9279. It would be good for test cases in my future solution.

Should I write my proposal on this group? On GitHub I have nice formatting, and in this group my Python code was badly formatted.

--
Piotr Grabowski
Re: Customizable Serialization check-in
On 27.04.2012 12:39, Tom Christie wrote:
> Hey Piotr,
>
> > I quickly skimmed the proposal and I noticed speed/performance wasn't mentioned. I believe performance is important in serialization and especially in deserialization.
>
> Right. Also worth considering is making sure the API can deal with streaming large querysets, rather than loading all the data into memory at once. (See also https://code.djangoproject.com/ticket/5423)
>
> - Tom.

Maybe it can be done with a chain of two black-box generators. The input of the first generator is a queryset (an iterable sequence) and a user-defined Serializer class that describes how to transform a single object; its output is objects of Python primitive types. The second is fed with these objects and outputs the serialized string. What about nested objects - more generators? Generators are good because we can also reuse Serializer objects == better performance.

But like Anssi said - optimize after the code is written, not before :)

--
Piotr Grabowski
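The chain of two generators described above can be sketched with the standard library alone. Everything here is an assumption made for illustration: to_native, to_json_chunks and user_to_dict are hypothetical names, and a plain iterator of tuples stands in for a lazily evaluated queryset.

```python
import json


def to_native(objects, transform):
    """First generator: lazily turn each object into Python native datatypes."""
    for obj in objects:
        yield transform(obj)


def to_json_chunks(natives):
    """Second generator: lazily emit the pieces of one JSON array."""
    yield "["
    for index, native in enumerate(natives):
        if index:
            yield ", "
        yield json.dumps(native, sort_keys=True)
    yield "]"


def user_to_dict(user):
    # Hypothetical stand-in for a user-defined Serializer class.
    return {"id": user[0], "username": user[1]}


rows = iter([(1, "piotr"), (2, "tom")])  # stands in for a lazily iterated queryset
chunks = to_json_chunks(to_native(rows, user_to_dict))
output = "".join(chunks)  # a real caller could write each chunk to a stream instead
print(output)
```

Only one object is held in native form at a time, which is the property Tom asks for with large querysets. Nested objects would need the transform step itself to stay lazy, which this sketch does not attempt.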
Re: Customizable Serialization check-in
Hey Piotr,

Thanks for the quick response.

> However, sharing ideas and discussing how the API should look and work would be very desirable.

That'd be great, yup. I've got a couple of comments and questions about bits of the API, but I'll wait until you've had a chance to post your proposal to the list before starting that discussion.

> I quickly skimmed the proposal and I noticed speed/performance wasn't mentioned. I believe performance is important in serialization and especially in deserialization.

Right. Also worth considering is making sure the API can deal with streaming large querysets, rather than loading all the data into memory at once. (See also https://code.djangoproject.com/ticket/5423)

- Tom.

On Friday, 27 April 2012 10:11:56 UTC+1, Piotr Grabowski wrote:
>
> On 27.04.2012 10:36, Anssi Kääriäinen wrote:
> > On Apr 27, 11:14 am, Piotr Grabowski wrote:
> >> Hi!
> >>
> >> I'm Piotr Grabowski, a student from the University of Wroclaw, Poland.
> >> In this Google Summer of Code I will deal with the problem of customizable
> >> serialization in Django.
> >>
> >> You can find my proposal here: https://gist.github.com/2319638
> > I quickly skimmed the proposal and I noticed speed/performance wasn't
> > mentioned. I believe performance is important in serialization and
> > especially in deserialization. It is not the number one priority item,
> > but it might be worth it to write a couple of benchmarks (preferably
> > to djangobench [1]) and check that there are no big regressions
> > introduced by your work. If somebody already has good real-life
> > testcases available, please share them...
> >
> > - Anssi
> >
> > [1] https://github.com/jacobian/djangobench/
> >
> I didn't think about performance a lot. There will be regressions.
> Right now serialization is very simple: iterate over the fields, transform
> each into a string (or something serializable), and serialize it with
> json|yaml|xml.
> In my approach it is: transform the (Model) object into a Serializer object,
> where each field from the original object is a FieldSerializer object; next
> (maybe recursively) get a native Python type object from each field; then
> serialize it with json|yaml|xml.
> I can do some optimizations in this process, but it's clear it will
> take longer to serialize (and deserialize) an object than it does now. The
> time taken by tests could become a problem if there are a lot of fixtures.
> I will try to write good, fast code, but I will be very glad if someone
> gives me tips about performance bottlenecks in it.
>
> --
> Piotr Grabowski
Re: Customizable Serialization check-in
On Apr 27, 12:11 pm, Piotr Grabowski wrote:
> I didn't think about performance a lot. There will be regressions.
> Right now serialization is very simple: iterate over the fields, transform
> each into a string (or something serializable), and serialize it with
> json|yaml|xml.
> In my approach it is: transform the (Model) object into a Serializer object,
> where each field from the original object is a FieldSerializer object; next
> (maybe recursively) get a native Python type object from each field; then
> serialize it with json|yaml|xml.
> I can do some optimizations in this process, but it's clear it will
> take longer to serialize (and deserialize) an object than it does now. The
> time taken by tests could become a problem if there are a lot of fixtures.
> I will try to write good, fast code, but I will be very glad if someone
> gives me tips about performance bottlenecks in it.

One possibility is to have a fast-path for simple cases. But, premature optimization is the root of all evil, so let's first see how fast the code is, and then check if anything needs to be done.

I still think it is a good idea to actually check how fast the new serialization code is, not just assume it is fast enough. So, please include some simple benchmarks in your project.

I hope users who have a need for fast serialization will participate in this discussion by telling their use cases.

- Anssi
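A "simple benchmark" of the kind Anssi asks for could start as small as the sketch below. It uses only the standard-library timeit module rather than djangobench, and both code paths are hypothetical stand-ins: serialize_direct mimics the current "fields straight to JSON" flow, and serialize_layered mimics the extra per-field object layer from Piotr's description. The timings depend on the machine, so no numbers are claimed here.

```python
import json
import timeit


def serialize_direct(rows):
    """Baseline: dump plain dicts straight to JSON."""
    return json.dumps(rows)


class FieldSerializer:
    """Hypothetical per-field object, mimicking the extra layer being discussed."""

    def __init__(self, name):
        self.name = name

    def serialize(self, row):
        return row[self.name]


def serialize_layered(rows, field_serializers):
    """Build native dicts through one FieldSerializer call per field, then dump."""
    native = [{f.name: f.serialize(row) for f in field_serializers} for row in rows]
    return json.dumps(native)


rows = [{"id": i, "username": "user%d" % i} for i in range(1000)]
field_serializers = [FieldSerializer("id"), FieldSerializer("username")]

# Both paths must produce the same output before their timings are comparable.
same_output = serialize_direct(rows) == serialize_layered(rows, field_serializers)

direct_time = timeit.timeit(lambda: serialize_direct(rows), number=20)
layered_time = timeit.timeit(lambda: serialize_layered(rows, field_serializers), number=20)
print("direct: %.4fs  layered: %.4fs" % (direct_time, layered_time))
```

Checking that both paths give identical output first keeps the comparison honest; a fast-path for simple cases would be measured the same way against the layered path.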
Re: Customizable Serialization check-in
On 27.04.2012 10:36, Anssi Kääriäinen wrote:
> On Apr 27, 11:14 am, Piotr Grabowski wrote:
> > Hi!
> >
> > I'm Piotr Grabowski, a student from the University of Wroclaw, Poland.
> > In this Google Summer of Code I will deal with the problem of customizable
> > serialization in Django.
> >
> > You can find my proposal here: https://gist.github.com/2319638
>
> I quickly skimmed the proposal and I noticed speed/performance wasn't mentioned. I believe performance is important in serialization and especially in deserialization. It is not the number one priority item, but it might be worth it to write a couple of benchmarks (preferably to djangobench [1]) and check that there are no big regressions introduced by your work. If somebody already has good real-life testcases available, please share them...
>
> - Anssi
>
> [1] https://github.com/jacobian/djangobench/

I didn't think about performance a lot. There will be regressions.

Right now serialization is very simple: iterate over the fields, transform each into a string (or something serializable), and serialize it with json|yaml|xml.

In my approach it is: transform the (Model) object into a Serializer object, where each field from the original object is a FieldSerializer object; next (maybe recursively) get a native Python type object from each field; then serialize it with json|yaml|xml.

I can do some optimizations in this process, but it's clear it will take longer to serialize (and deserialize) an object than it does now. The time taken by tests could become a problem if there are a lot of fixtures.

I will try to write good, fast code, but I will be very glad if someone gives me tips about performance bottlenecks in it.

--
Piotr Grabowski
Re: Customizable Serialization check-in
On Apr 27, 11:14 am, Piotr Grabowski wrote:
> Hi!
>
> I'm Piotr Grabowski, a student from the University of Wroclaw, Poland.
> In this Google Summer of Code I will deal with the problem of customizable
> serialization in Django.
>
> You can find my proposal here: https://gist.github.com/2319638

I quickly skimmed the proposal and I noticed speed/performance wasn't mentioned. I believe performance is important in serialization and especially in deserialization. It is not the number one priority item, but it might be worth it to write a couple of benchmarks (preferably to djangobench [1]) and check that there are no big regressions introduced by your work. If somebody already has good real-life testcases available, please share them...

- Anssi

[1] https://github.com/jacobian/djangobench/
[GSoC] Customizable Serialization check-in
Hi!

I'm Piotr Grabowski, a student from the University of Wroclaw, Poland. In this Google Summer of Code I will deal with the problem of customizable serialization in Django.

You can find my proposal here: https://gist.github.com/2319638

It's obviously not a finished idea; it needs to be simplified for sure. My mentor, Russell Keith-Magee, told me to look at Tom Christie's serialization API. I found it similar to my proposal. There is a lot in common: declarative fields and the same approach to various aspects of serialization. But his API is simpler and it feels better.

Since Tom already posted to the group about his project, I can refer to it:

On 27.04.2012 06:44, Tom Christie wrote:
> ...
> Given that Piotr's GSoC proposal has now been accepted, I'm wondering
> what the right way forward is? I'd like to continue to push forward
> with this, but I'm also aware that it might be a bit of an issue if
> there's already an ongoing GSoC project along the same lines?
>
> Having taken a good look through the GSoC proposal, it looks good, and
> there seems to be a fair bit of overlap, so hopefully he'll find what
> I've done useful, and I'm sure I'll have plenty of comments on his
> project as it progresses.
>
> I'd consider suggesting a collaborative approach, but the rules of the
> GSoC wouldn't allow that right?

Like I said above, your work will be very useful for me. I must read the GSoC regulations carefully, but for sure collaboration on writing code is impossible. I don't know whether I could use your existing code base, but I think that's also impossible. However, sharing ideas and discussing how the API should look and work would be very desirable.

My plan for the next few weeks is to meet the Django contribution requirements, solve a ticket to prove I know the process of doing it, and, most importantly, have a discussion about the serialization API. I hope the community will be interested in this feature. After the weekend I will post my proposal with updates from Tom's API.
--
Piotr Grabowski