Re: GSoC Meta refactor: Bikeshedding time!!

Anssi Kääriäinen Mon, 18 Aug 2014 03:04:19 -0700

On Monday, August 18, 2014 7:45:17 AM UTC+3, Russell Keith-Magee wrote:
>
> I understand what you're driving at here, and I've had similar thoughts 
> over the course of the SoC. The catch is that this makes the API for 
> get_fields() fairly complicated.
>
> If every field fits into one specific type, then get_fields() just 
> requires a single boolean flag (do I include fields of type X) for each 
> field type. We can also easily add new field types by adding new booleans 
> to the API.
>
> However, if a field fits into multiple categories, then it's impossible 
> (or, at least, exceedingly complicated) to make a single call to 
> get_fields() that will specify all your field requirements. "Get me all 
> non-virtual data fields" requires "virtual=False, data=True, m2m=False", 
> but "Get all virtual data fields that represent m2ms" requires 
> "virtual=True, data=False, m2m=True". You can't pass in both sets of 
> arguments at the same time, so you either have to make multiple calls to 
> get_fields(), or you have to invent some sort of query syntax for 
> get_fields() that allows union queries. 
>
> Plus, at the end of the day, get_fields() is abstracted behind highly 
> cached and optimised properties for key lookups. These properties are 
> effectively a cached call to get_fields() with a specific set of arguments 
> - so even if get_fields() doesn't expose a "one category per field" 
> requirement, the API will require, at some level, names that have clear 
> (and preferably non-overlapping) membership.
>


If fields are in multiple categories, then users will want to do the full 
range of set operations on the categories. Encoding that into the API 
doesn't sound promising.
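
To make that concrete, here is a rough Python sketch. The category and 
field names are made up for illustration; this is not an existing or 
proposed Django API. Once categories overlap, the questions people ask 
are naturally unions, intersections and differences:

```python
# Toy model: each category maps to the set of field names in it.
# Names are hypothetical stand-ins, not real Django metadata.
categories = {
    "data": {"id", "title", "author"},
    "relation": {"author", "tags", "content_object"},
    "virtual": {"content_object"},
}

# "Relations that are not virtual" is a set difference.
non_virtual_relations = categories["relation"] - categories["virtual"]

# "Anything that is data or a relation" is a union.
data_or_relation = categories["data"] | categories["relation"]

# "Data fields that are also relations" is an intersection.
relating_data = categories["data"] & categories["relation"]
```

Expressing all three through boolean keyword arguments on a single 
call would need some kind of query syntax, which is the complication 
described above.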

>> I don't think users actually want to get fields based on the suggested 
>> categorization. I feel we get an easier to use and more flexible API if we 
>> have higher level categories and allow fields to match multiple categories. 
>> As a practical example if I want all relation fields, that is going to be 
>> hard using the suggested API. Getting all relation fields is a more 
>> realistic use case than getting related virtual objects.
>>
>
> Quite probably true. As a point of interest, the current (as in, 1.6) API 
> actually doesn't differentiate between category (a) "pure data" and 
> category (b) "relating data (i.e., FK)" fields - if you ask for "data 
> fields" you get pure data *and* foreign keys. So, at least as far as 
> Django's own usage is concerned, you're correct in saying that the taxonomy 
> I've described isn't fully required. 
>
> Daniel's survey of internal usage reveals that there are three use cases 
> for getting a list of fields in Django's internal API:
>
>  * Get all data and m2m fields (i.e., categories a, b, and d). This is 
> effectively "all fields on *this* model"
>
>  * Get all data, m2m, related objects, related m2m, and virtual fields 
> (i.e., categories a, b, d, f, g, h, i - excluding c and e because Django 
> doesn't currently have any fields of this type). This is "all fields on 
> this model, or related to this model"
>
>  * Get all m2m fields (i.e., category d)
>  
> So - at the very least, we need names to describe those three groups. My 
> intention with describing a richer taxonomy is to try and give names to 
> other groupings of interest. 
>
>> If we want every field to match one and only one category, 
>> then we need to redefine the categories to make sure ForeignKeys as virtual 
>> fields are possible, and that more esoteric custom join based fields fit 
>> into the categorization.
>>
>
> Agreed - that's why I threw this out there for discussion :-)
>
> Properties like "data", "virtual", "external", "related", "relating" - 
> these are high level concepts describing the way a field manifests. 
> However, that doesn't mean we need to expose these properties as part of 
> the formal API.
>
> Part of the underlying problem here -- let's say we roll out Django 1.7 
> with some version of this API, and in 1.8, foreign key fields change to 
> become virtual. That effectively becomes backwards incompatible for queries 
> that are sensitive to a "virtual" flag; but it doesn't change the 
> underlying need to identify that a field is a foreign key. We need to 
> capture the latter use case, but not necessarily the former.
>
 
Could we go with a minimal API for get_fields()? Instead of building the 
categorization into the get_fields() API, we could provide field flags for 
the categories. With field flags it is straightforward to filter the list 
returned by get_fields(). As an example, fetching those fields which are 
relations but which aren't virtual: [f for f in get_fields() if 
f.relational and not f.virtual]. If this path is taken, then I am not sure 
how minimal the get_fields() API should be. We likely need flags for at 
least whether the field is defined on the local model, a parent model, or 
some remote model.
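
A minimal, self-contained sketch of the flag-based filtering, assuming 
each field object carries boolean attributes named relational and 
virtual as in the comprehension above. The Field class, the stub 
get_fields() and the sample field names are hypothetical, not Django's 
real classes:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """Toy stand-in for a model field carrying category flags."""
    name: str
    relational: bool = False
    virtual: bool = False


def get_fields():
    """Stub for a minimal get_fields(): returns every field, unfiltered."""
    return [
        Field("id"),
        Field("title"),
        Field("author", relational=True),
        Field("content_object", relational=True, virtual=True),
    ]


# Callers combine flags however they like with plain Python.
concrete_relations = [
    f for f in get_fields() if f.relational and not f.virtual
]
non_relations = [f for f in get_fields() if not f.relational]
```

The point of the sketch is that any combination of flags is a one-line 
comprehension on the caller's side, so get_fields() itself does not 
need to know about the combinations.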

As for changing ForeignKey to a virtual field plus a concrete field 
representation - I just realized this will be backwards incompatible no 
matter what we do regarding categorization. A get_fields() call that 
includes all fields will return separate author (virtual) and author_id 
(concrete) fields after the split. I am not sure what we can do about this. 
It would be very unfortunate if we can't refactor the way ForeignKeys work 
due to the meta API. Any ideas how we can avoid the backwards compatibility 
trap?
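
To illustrate the problem, a toy sketch in Python. The Field tuple and 
the two functions are made-up stand-ins for the results of get_fields() 
before and after such a split; they are not real Django code:

```python
from collections import namedtuple

# Toy stand-in for a model field; not a real Django class.
Field = namedtuple("Field", ["name", "virtual", "concrete"])


def get_fields_before_split():
    # ForeignKey is a single field named "author".
    return [
        Field("id", virtual=False, concrete=True),
        Field("author", virtual=False, concrete=True),
    ]


def get_fields_after_split():
    # ForeignKey becomes a virtual "author" plus a concrete "author_id".
    return [
        Field("id", virtual=False, concrete=True),
        Field("author", virtual=True, concrete=False),
        Field("author_id", virtual=False, concrete=True),
    ]


before = [f.name for f in get_fields_before_split()]
after = [f.name for f in get_fields_after_split()]

# Names that an all-fields caller did not see before the split.
new_names = [name for name in after if name not in before]

# A caller that filters on a flag also sees a different result.
concrete_before = [f.name for f in get_fields_before_split() if f.concrete]
concrete_after = [f.name for f in get_fields_after_split() if f.concrete]
```

Both the unfiltered list and the flag-filtered list change, which is 
why the categorization alone doesn't get us out of this.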

 - Anssi
