Hi All,

First of all, thank you SO MUCH for your comments. It's really nice to hear 
great feedback from the community.
These last two weeks I have been working on improving the existing API 
implementation and terminology. Here is an overview of the 3 main tasks:

*1) get_fields should return a list instead of a tuple*
Previously, get_fields returned a tuple, mainly for reasons of memory usage 
and immutability. After discussing with Russell, we decided this endpoint 
should actually return a list, as it is a better-suited data structure.
The API now returns a list on each call. Unfortunately, this has led to a 
number of problems:

*Manipulating lists inappropriately can cause cache errors*
Lists are mutable, and this causes a problem. get_fields stores its results 
in a cache for efficiency; if a caller then modifies the returned list on 
the "outside", it directly modifies the cached result. One solution is to 
always return a copy of the cached list, but after profiling with cProfile 
I found that the copy is the second most expensive call across all 
operations.
As this is a public API, it brings us to a design decision: do we prioritise 
performance and document that the returned list must never be modified, or 
do we return a copy? There are pros and cons to each approach, and I will 
be discussing them with Russell tomorrow.
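
To make the trade-off concrete, here is a minimal sketch in plain Python (no Django). FieldCache and its two methods are hypothetical names used only for illustration; they are not the actual implementation behind get_fields.

```python
class FieldCache:
    """Hypothetical stand-in for the cache behind get_fields."""

    def __init__(self, names):
        self._cache = list(names)

    def get_fields_shared(self):
        # Fast: hands out the cached list itself, so callers can corrupt it.
        return self._cache

    def get_fields_copy(self):
        # Safe: builds a new shallow copy on every call, which costs time.
        return list(self._cache)


shared = FieldCache(["id", "name"])
leaked = shared.get_fields_shared()
leaked.append("oops")  # modifies the cache itself
shared_after = shared.get_fields_shared()

safe = FieldCache(["id", "name"])
copied = safe.get_fields_copy()
copied.append("oops")  # only the caller's copy changes
copy_after = safe.get_fields_copy()
```

In the first case the stray "oops" entry is seen by every later caller; in the second the cache stays intact, at the price of one list allocation per call.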


*2) Move tree cache out of the apps registry*
The main optimisation for the new API (described 
here https://code.djangoproject.com/wiki/new_meta_api#Mainoptimizationpoints) 
is currently storing information on the Apps registry. After discussing 
with Russell, we agreed this is not the right place to keep it. A 
solution is to store related-field information separately on each model 
(https://github.com/PirosB3/django/pull/5/files#diff-98e98c694c90e830f918eb5279134ab9R275).
 
This has been done, and all tests pass.
Unfortunately, there are performance implications with this approach: we 
are storing a new list on each model, regardless of whether it has related 
fields or not. I will be discussing performance with Russell tomorrow.
Finally, for a cleaner design, the cache is now stored in a namedtuple 
called RelationTree.
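
As a rough illustration of a per-model relation cache held in a namedtuple, here is a sketch. Only the name RelationTree comes from the work described above; its field names, the ModelOptions class, and the shared EMPTY_TREE are assumptions made for the example.

```python
from collections import namedtuple

# Field names are hypothetical; only the RelationTree name is from the post.
RelationTree = namedtuple("RelationTree", ["related_objects", "related_m2m"])

# Assumption: one shared, immutable empty tree for models with no relations,
# so that such models do not each allocate their own containers.
EMPTY_TREE = RelationTree(related_objects=(), related_m2m=())


class ModelOptions:
    """Hypothetical stand-in for a model's options object."""

    def __init__(self, name):
        self.name = name
        self.relation_tree = EMPTY_TREE


author = ModelOptions("Author")
book = ModelOptions("Book")

# Book points at Author, so Author gains one related object.
author.relation_tree = author.relation_tree._replace(
    related_objects=author.relation_tree.related_objects + ("book.author",)
)
```

Sharing a single empty instance is just one possible way to soften the per-model cost mentioned above; whether it helps in practice would need to be measured.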


*3) Revisiting field types and options*
Last week I opened 
this: https://groups.google.com/forum/#!topic/django-developers/s2Lp2lTEjAE 
in order to discuss with the community possible new field and option names. 
This week I have actually revisited the fields and options for get_field/s 
and I have come up with the following:

*Removed concrete fields*
Technically speaking, concrete fields are data fields without a column. The 
only concrete field in the codebase is ForeignObject. There are 2 classes 
that inherit from ForeignObject, GenericRelation and ForeignKey, but each 
of them falls into a different category (virtual and data respectively). 
Given that ForeignObject is an internal structure, and given that Django 
only looks for concrete fields on very few occasions, it makes sense to 
remove include_concrete from the get_fields options.


*Removed include_proxy option*
Only 2 parts of the codebase (deletion.py) need to fetch all related 
objects for a model, including any relations to proxies of that model. This 
is used to perform a "bubble up" delete of relations (an example can be 
seen at 
https://github.com/django/django/blob/master/tests/delete_regress/tests.py#L209).

Given that most developers will not need this option, and given that it can 
be easily computed from the outside, it makes sense to remove it from the 
get_fields options.


I think that by removing these two options, and by revisiting the fields 
terminology, we will come up with a simpler and clearer API.
Below is the new, revisited, API for get_fields:

        """
        Returns a list of fields associated to the model. By default will
        only search in data.
        This can be changed by enabling or disabling field types using
        the flags available.

        Fields can be any of the following:
        - data:             any field that has an entry on the database
        - m2m:              a ManyToManyField defined on the current model
        - related_objects:  a one-to-many relation from another model that
                            points to the current model
        - related_m2m:      a M2M relation from another model that points
                            to the current model
        - virtual:          fields that do not necessarily have an entry on
                            the database (like GenericForeignKey)

        Options can be any of the following:
        - include_parents:  include fields derived from inheritance
        - include_hidden:   include fields that have a related_name
                            that starts with a "+"
        """
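
To show how enabling and disabling field types might look to a caller, here is a toy sketch in plain Python. The keyword names mirror the field types listed in the docstring, but the exact signature, the defaults other than data, and the FIELDS table are assumptions for illustration only; the include_parents and include_hidden options are left out.

```python
# Toy table for an imaginary model: (field name, field type) pairs.
FIELDS = [
    ("id", "data"),
    ("title", "data"),
    ("tags", "m2m"),
    ("comment_set", "related_objects"),
    ("content_object", "virtual"),
]


def get_fields(data=True, m2m=False, related_objects=False,
               related_m2m=False, virtual=False):
    """Return the names of fields whose type flag is enabled (toy version)."""
    flags = {
        "data": data,
        "m2m": m2m,
        "related_objects": related_objects,
        "related_m2m": related_m2m,
        "virtual": virtual,
    }
    enabled = {kind for kind, on in flags.items() if on}
    # A new list is built on each call, preserving declaration order.
    return [name for name, kind in FIELDS if kind in enabled]


default_fields = get_fields()
with_m2m = get_fields(m2m=True)
only_virtual = get_fields(data=False, virtual=True)
```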

*Regarding virtual:*
I have also spent some time understanding what the properties of virtual 
fields are and what they should be. A virtual field does not have a direct 
entry on the database, but uses the data of 1 or more other fields to 
create special, abstract structures.
An example of this could be a "composite field" such as a Point2D:

class City(models.Model):
  x = models.FloatField()
  y = models.FloatField()
  position = Point2D('x', 'y')

position is a virtual field because it does not have any presence on the 
database and relies on the information of 'x' and 'y'.
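
The same idea can be sketched outside Django with a plain Python descriptor. This Point2D is a hypothetical illustration, not an existing Django field, and the City class below is an ordinary class rather than a model.

```python
class Point2D:
    """Hypothetical virtual field: stores nothing, delegates to two others."""

    def __init__(self, x_name, y_name):
        self.x_name = x_name
        self.y_name = y_name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        # Compose the value from the underlying attributes on every access.
        return (getattr(instance, self.x_name), getattr(instance, self.y_name))

    def __set__(self, instance, value):
        # Writing the point writes through to the underlying attributes.
        x, y = value
        setattr(instance, self.x_name, x)
        setattr(instance, self.y_name, y)


class City:
    position = Point2D("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


city = City(1.0, 2.0)
city.position = (3.0, 4.0)
```

After the assignment, city.x and city.y hold the new values, while the instance itself never stores anything under the name position.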

*Finally*
I apologise for posting late. EuroPython was a really good conference and 
it was great to meet some of the core developers, such as Tom, Honza and 
Baptiste. I feel I only worked about 50% of my time, and I will make it up 
over the coming weeks.
In the meantime, I hope to make up for it with:

 - Latest benchmarks! (coming soon)
 - A picture of me with Baptiste, at the sprints!

<https://lh3.googleusercontent.com/-85QCib0NdRA/U9401yTNqtI/AAAAAAAALeY/vKzvQ-FtQHY/s1600/Photo+27-07-14+15+00+42.jpg>

 

On Saturday, July 19, 2014 10:42:55 PM UTC+2, Andre Terra wrote:
>
> On Sat, Jul 19, 2014 at 2:38 AM, Raffaele Salmaso <[email protected]> wrote:
>
>> On Fri, Jul 18, 2014 at 7:32 PM, Daniel Pyrathon <[email protected]> wrote:
>>
>>> *- Started on my GMail Store*
>>> (...)
>>>
>> Pirosb3, just three words: really nice work!
>>  
>>
>
> I absolutely agree! The possibilities are endless.
>
> Congratulations on delivering such a major contribution to the framework! 
> I am sure your work will be a key benchmark to future GSOC applicants.
>
>
> Cheers,
> AT
>
>
