I'd say ArrayField is a straight-up data field at the moment. It stores 0-1 lists of data. It's no different to CommaSeparatedIntegerField (seriously, why does that exist...)
*If* PG gets the relevant update that will allow `integer[] references` (i.e. ArrayField(ForeignKey)) then this would be different, and it would be more like an m2m field. There is an argument that it's 0-N anyway, but given the implementation both within Django and in the database, I don't think the distinction is useful at this point, at least from an ORM point of view. From a forms point of view it's quite different.

On 20 August 2014 09:19, Russell Keith-Magee <[email protected]> wrote:
>
> On Mon, Aug 18, 2014 at 6:03 PM, Anssi Kääriäinen <[email protected]>
> wrote:
>
>> On Monday, August 18, 2014 7:45:17 AM UTC+3, Russell Keith-Magee wrote:
>>>
>>> I understand what you're driving at here, and I've had similar thoughts
>>> over the course of the SoC. The catch is that this makes the API for
>>> get_fields() fairly complicated.
>>>
>>> If every field fits into one specific type, then get_fields() just
>>> requires a single boolean flag (do I include fields of type X) for each
>>> field type. We can also easily add new field types by adding new booleans
>>> to the API.
>>>
>>> However, if a field fits into multiple categories, then it's impossible
>>> (or, at least, exceedingly complicated) to make a single call to
>>> get_fields() that will specify all your field requirements. "Get me all
>>> non-virtual data fields" requires "virtual=False, data=True, m2m=False",
>>> but "Get all virtual data fields that represent m2ms" requires
>>> "virtual=True, data=False, m2m=True". You can't pass in both sets of
>>> arguments at the same time, so you either have to make multiple calls to
>>> get_fields(), or you have to invent some sort of query syntax for
>>> get_fields() that allows union queries.
>>>
>>> Plus, at the end of the day, get_fields() is abstracted behind highly
>>> cached and optimised properties for key lookups.
>>> These properties are effectively a cached call to get_fields() with a
>>> specific set of arguments - so even if get_fields() doesn't expose a
>>> "one category per field" requirement, the API will require, at some
>>> level, names that have clear (and preferably non-overlapping) membership.
>>>
>>
>> If fields are in multiple categories then users will want to do the full
>> range of set operation on the categories. Encoding that in to the API
>> doesn't sound promising.
>>
>>
>>>> I don't think users actually want to get fields based on the suggested
>>>> categorization. I feel we get an easier to use and more flexible API if we
>>>> have higher level categories and allow fields to match multiple categories.
>>>> As a practical example if I want all relation fields, that is going to be
>>>> hard using the suggested API. Getting all relation fields is a more
>>>> realistic use case than getting related virtual objects.
>>>>
>>>
>>> Quite probably true. As a point of interest, the current (as in, 1.6)
>>> API actually doesn't differentiate between category (a) "pure data" and
>>> category (b) "relating data (i.e., FK)" fields - if you ask for "data
>>> fields" you get pure data *and* foreign keys. So, at least as far as
>>> Django's own usage is concerned, you're correct in saying that taxonomy
>>> I've described isn't fully required.
>>>
>>> Daniel's survey of internal usage reveals that there are three use cases
>>> for getting a list of fields in Django's internal API:
>>>
>>> * Get all data and m2m fields (i.e., categories a, b, and d). This is
>>> effectively "all fields on *this* model"
>>>
>>> * Get all data, m2m, related objects, related m2m, and virtual fields
>>> (i.e., categories a, b, d, f, g, h, i - excluding c and e because Django
>>> doesn't currently have any fields of this type).
>>> This is "all fields on this model, or related to this model"
>>>
>>> * Get all m2m fields (i.e., category d)
>>>
>>> So - at the very least, we need names to describe those three groups. My
>>> intention with describing a richer taxonomy is to try and give names to
>>> other groupings of interest.
>>>
>>>> If we want to have all fields to match single and only single category,
>>>> then we need to redefine the categories to make sure ForeignKeys as virtual
>>>> fields are possible, and that more esoteric custom join based fields fit in
>>>> to the categorization.
>>>>
>>>
>>> Agreed - that's why I threw this out there for discussion :-)
>>>
>>> Properties like "data", "virtual", "external", "related", "relating" -
>>> these are high level concepts describing the way a field manifests.
>>> However, that doesn't mean we need to expose these properties as part of
>>> the formal API.
>>>
>>> Part of the underlying problem here -- lets say we roll out Django 1.7
>>> with some version of this API, and in 1.8, foreign key fields change to
>>> become virtual. That effectively becomes backwards incompatible for queries
>>> that are sensitive to a "virtual" flag; but it doesn't change the
>>> underlying need to identify that a field is a foreign key. We need to
>>> capture the latter use case, but not necessarily the former.
>>>
>>
>> Could we go with a minimal API for get_fields()? Instead of having
>> categorization on the get_fields() API, we could provide field flags for
>> the categories. With field flags it is straightforward to filter the return
>> list of get_fields(). As an example, fetching those fields which are
>> relations but which aren't virtual: [f for f in get_fields() if
>> f.relational and not f.virtual]. If this path is taken, then I am not sure
>> how minimal the get_fields() API should be. We likely need flags for at
>> least if the field is defined on local, parent or some remote model.
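
To make the field-flags idea above a bit more concrete, here is a rough sketch in plain Python. It is not Django code: the `Field` record, the `origin` attribute and its three values, and the sample field names are all assumptions made up for illustration. Only the `relational` and `virtual` flag names and the list comprehension come from the message quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """Hypothetical stand-in for a model field; not Django's Field class."""
    name: str
    relational: bool = False  # the field points at another model
    virtual: bool = False     # the field has no column of its own
    origin: str = "local"     # assumed values: "local", "parent", "remote"


def get_fields():
    """Minimal get_fields(): return everything, apply no categorisation."""
    return [
        Field("title"),
        Field("author", relational=True),
        Field("content_object", relational=True, virtual=True),
        Field("created", origin="parent"),
    ]


# Relations that are not virtual, using the comprehension from the thread.
concrete_relations = [f for f in get_fields() if f.relational and not f.virtual]
concrete_relation_names = [f.name for f in concrete_relations]

# The same pattern would cover the local/parent/remote question.
local_names = [f.name for f in get_fields() if f.origin == "local"]
```

The point of the sketch is that any combination of flags is one comprehension away, so get_fields() itself would not need to encode set operations on categories.
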
>>
>> As for changing ForeignKey to virtual field plus concrete field
>> representation - I just realized this will be backwards incompatible no
>> matter what we do regarding categorization. An all-fields including
>> get_fields() call will return separate author (virtual) and author_id
>> (concrete) fields after the split. I am not sure what we can do about this.
>> It would be very unfortunate if we can't refactor the way ForeignKeys work
>> due to the meta API. Any ideas how we can avoid the backwards compatibility
>> trap?
>>
>
> I think Daniel and I might have come up with a way to meet both these
> requirements - a minimalist API for get_fields, with at least some
> protection against the known incoming backwards compatibility issue.
>
> The summary so far: it appears that a complex taxonomy isn't especially
> helpful - firstly, because any complex taxonomy is going to have edge cases
> that are hard to categorize, but also because a complex taxonomy leads to a
> much more complex internal API that is going to be prone to backwards
> compatibility problems.
>
> So - instead of worrying about 'virtual' and other properties like that,
> lets look at why the _meta API is fundamentally used - to get a list of
> fields that need to be handled in data processing. This primarily means
> forms, but other forms of serialisation are also included. In these use
> cases, there are always going to be per-field differences (even a CharField
> and an IntegerField require *slightly* different handling), so we won't
> focus on internal representations, storage mechanisms, or anything like
> that. Instead, lets focus on cardinality - a field represents some sort of
> data that has a cardinality with the object on which it is stored. If
> something has cardinality 1, you can display a single field. If it's
> cardinality N, you need to display a list, or some sort of inline.
>
> This results in 3 categories that are mutually exclusive:
>
> a) "Data fields": Fields of cardinality 0-1:
>
> * A CharField stores 0 or 1 strings (0 is the case of a nullable field).
>
> * An IntegerField stores 0 or 1 integers.
>
> * A FileField stores 0 or 1 file paths.
>
> * An ImageField stores 0 or 1 file paths - although in being modified, it
> might modify some other fields.
>
> * A ForeignKey stores 0 or 1 references to another object.
>
> * A GenericForeignKey stores 0 or 1 references to another object.
>
> * A notional "DocumentField" on a NoSQL store references 0 or 1 external
> documents.
>
> b) "ManyToMany Fields": Fields that are locally defined that represent a
> cardinality 0-N relationship with another object:
>
> * Many to Many fields store 0-N references to a second model.
>
> c) "Related Objects": Fields that represent a cardinality 0-N relationship
> with this object, but aren't locally defined:
>
> * The 'related' side of a ForeignKey
>
> * The 'related' side of a ManyToMany
>
> * A GenericRelation representing the reverse side of a GenericForeignKey
>
> These three types are mutually exclusive - you either have cardinality 1
> *or* cardinality N, not both; and you're either locally defined on this
> object or you're not. I can't think of an example of "cardinality 1 data
> that isn't defined on this object", but it would fit into this taxonomy if
> it were needed; I also can't think of a field definition that would span
> models.
>
> In addition to this basic classification, a field can be marked as
> "hidden". The immediate use for this is to hide the related_name='+' case
> of a FK or M2M.
> Looking forward, it would be used to mask fields that
> exist, but aren't intended to be user visible - for example, in the
> potential future case where a ForeignKey is split in two, or a Composite
> Key, there would be a "hidden" integer field (or fields) storing the actual
> data, and a virtual (but non-hidden) field that is the public API for
> manipulating the relationship. This would also be backwards compatible,
> because the "visible" field list hasn't changed.
>
> Fields are also tracked according to their parentage; this is used by
> tools interacting with inheritance relationships to know which fields are
> actually on this model, and which are inherited from a base class.
>
> This yields the following formal API for _meta:
>
> * get_fields(data, many_to_many, related, include_hidden, include_parents)
>
> * @property data_fields (=> get_fields(data=True, many_to_many=False,
> related=False, include_hidden=False, include_parents=True)
>
> * @property many_to_many_fields (=> get_fields(data=False,
> many_to_many=True, related=False, include_hidden=False,
> include_parents=True)
>
> * @property related_objects (=> get_fields(data=False,
> many_to_many=False, related=True, include_hidden=False,
> include_parents=True)
>
> Does this sound any more sane as an API?
>
> My one lingering question is whether the "many_to_many" name/category is
> too explicit. I can conceive how an ArrayField could be considered a data
> field (it stores 0-1 arrays of data), or a "many_to_many" field (because it
> stores 0-N instances of some data). This all hinges on whether the
> definition for that field category is that it is a relationship with
> another *model*, or if it's just cardinality N data. It's trivial to call
> it a Data field and just leave it at that, but I'm wondering if there might
> be benefit in broadening the definition of "many_to_many".
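
For what it's worth, here is a rough sketch of how the proposed signature might behave. It is a toy model in plain Python, not Django code: the `Field` record, the `Options` class, the category constants, the default argument values of get_fields(), and the sample field names (including a hidden `author_id` column for the possible future ForeignKey split) are all assumptions for illustration. The caching mentioned earlier in the thread is left out.

```python
from dataclasses import dataclass

# Assumed labels for the three mutually exclusive categories.
DATA, MANY_TO_MANY, RELATED = "data", "many_to_many", "related"


@dataclass(frozen=True)
class Field:
    """Hypothetical field record; each field has exactly one category."""
    name: str
    category: str
    hidden: bool = False
    inherited: bool = False  # True when the field comes from a parent model


class Options:
    """Toy stand-in for a model's _meta, following the proposed signature."""

    def __init__(self, fields):
        self._fields = list(fields)

    def get_fields(self, data=True, many_to_many=True, related=True,
                   include_hidden=False, include_parents=True):
        # One boolean per category, so any union is a single call.
        wanted = set()
        if data:
            wanted.add(DATA)
        if many_to_many:
            wanted.add(MANY_TO_MANY)
        if related:
            wanted.add(RELATED)
        return [
            f for f in self._fields
            if f.category in wanted
            and (include_hidden or not f.hidden)
            and (include_parents or not f.inherited)
        ]

    @property
    def data_fields(self):
        return self.get_fields(data=True, many_to_many=False, related=False,
                               include_hidden=False, include_parents=True)

    @property
    def many_to_many_fields(self):
        return self.get_fields(data=False, many_to_many=True, related=False,
                               include_hidden=False, include_parents=True)

    @property
    def related_objects(self):
        return self.get_fields(data=False, many_to_many=False, related=True,
                               include_hidden=False, include_parents=True)


meta = Options([
    Field("id", DATA, inherited=True),
    Field("title", DATA),
    Field("author", DATA),                  # a ForeignKey is cardinality 0-1
    Field("author_id", DATA, hidden=True),  # hypothetical post-split column
    Field("tags", MANY_TO_MANY),
    Field("comment_set", RELATED),
])

data_names = [f.name for f in meta.data_fields]
m2m_names = [f.name for f in meta.many_to_many_fields]
related_names = [f.name for f in meta.related_objects]
local_with_hidden = [
    f.name for f in meta.get_fields(data=True, many_to_many=False,
                                    related=False, include_hidden=True,
                                    include_parents=False)
]
```

In this sketch an ArrayField would simply be constructed with the `DATA` category, which matches the "straight-up data field" reading at the top of this message.
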
>
> Russ %-)

--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/django-developers.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/CAMwjO1HLabZ7C%3D87Y3F50PWUYDncH1ip_VgtQN-cPOXthk8yHQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
