Re: A second stab at an implementation of composite fields

2013-04-17 Thread Michal Petrucha
On Wed, Apr 17, 2013 at 03:49:11AM -0700, Anssi Kääriäinen wrote:
> The basic idea is that there is a new ForeignObject class. A
> ForeignObject basically just takes a related model, and from_fields
> and to_fields which are the names of the fields used for the relation.
> Then, the ORM knows how to create JOINs, subqueries etc based on a
> ForeignObject. The ForeignObject itself is a virtual field, it doesn't
> have any concrete fields in the DB.
> 
> There is currently zero public API support for using the
> ForeignObject. The addition of ForeignObject is there to make the work
> on composite fields and multicolumn foreign keys in particular easier.

Thanks for the overview. For the sake of completeness, we had a
discussion on IRC earlier today, the outcome of which is that the
ForeignObject refactor will simplify the task immensely. We also came
upon a few concerns which I'll describe in a moment.

> I took a look at the code and I think the .Meta fields implementation
> is good. (Minor point is that there is no need for the field.virtual
> flag - it should be enough to check if it has a db column or not).

Agreed. Actually, the way I tried to go about the implementation, was
to replace field.column with field.columns which would be a tuple and
in case of virtual fields this would be a property which automatically
gathers the columns of all of its enclosed concrete fields, which
means a field.virtual would be necessary, but it looks like this
brings an inappropriate amount of complexity with little benefit.

> Do you have any plans for updatable primary keys? Updatable primary
> keys would be useful: the current api where obj.lastname = newval;
> obj.save() will result in save of new object instead of update of the
> lastname is plain ugly. OTOH This problem might be too complex to
> include in GSoC - cascading the updates is going to be hard. One
> possible solution is to add some way to create the foreign keys as ON
> UPDATE CASCADE and let the DB solve the problem.

Now I get to the issues that I still haven't figured out or would like
the opinion of others on. I'll separate them into sections for better
readability.

Updatable primary keys
--

The one you mention here is the way the ORM finds out whether to
perform an INSERT or UPDATE on model.save(). The current way this
works is just fine for models with an AutoField which is not visible
to the user. The problem is that as soon as you use an explicit
primary key and aren't extremely careful, you'll get unexpected
results.

This isn't directly related to composite primary keys, however, those
make it more apparent. I actually raised this issue two years ago
(IIRC) and I think I discussed it with Carl and possibly others; the
outcome was that we'd just put a note into the docs that would warn
people to be careful not to make any part of primary key user-editable
in the admin and generally be cautious when handling them directly or
otherwise bad things would happen.

I still think it would be good to change this behavior. The downsides
are that
 1) it would subtly change behavior for models with explicit primary
keys in a backwards-incompatible way
 2) it would require models to hold something like "the last known
value of the primary key in DB"
 3) it would cause weird behavior in situations like if you have two
separate instances corresponding to the same DB row, modify the PK
in one and save. I don't really know how much of an issue this
would be.
 4) as Anssi said, we would need a cascading mechanism for updates.
With certain backends, this could probably be done by the database
itself, with others, I believe we'd have to do it manually
(SQLite?)...

Again, this is not entirely related to this project, but it would
still be nice to get this done before composite fields are released,
should we agree we want this implemented.

GenericForeignKey
-

Two years ago we gave this some thought and decided to leave this for
a later stage. I think the later stage is here.

It's fairly easy to represent composite values as strings, something
like the following works quite well::

",".join(escape(value) for value in composite_value)

Of course, we can just do this and stick it into the database as the
object_id. The problem is with GenericRelation. This is something that
works on the database level and we can't just ask the database server
to compare a string built this way with a tuple of other values.

Ideally, we'd need to come up with an encoding that would be possible
to reproduce as a SQL expression, preferably with the same result on
all DB backends. (The expression itself can be backend-dependent, the
result should probably not.)

As far as I know, Simone Federici had something that was close to what
we need, but IIRC, it had some issues. Simone, could you please chime
in and provide some more detail on this? Have you succeeded? Was it a
dead end?

I'll admit right away that I'm 

Re: A second stab at an implementation of composite fields

2013-04-17 Thread Anssi Kääriäinen
On 12 huhti, 18:34, Michal Petrucha  wrote:
> On Fri, Apr 12, 2013 at 07:35:45AM -0700, Anssi Kääriäinen wrote:
> > On 12 huhti, 16:44, Michal Petrucha  wrote:
> > ForeignKeys have been changed a lot since 2012-11-04. The introduction
> > of ForeignObject (which is base for ForeignKey) means that there is
> > support for multicolumn joins in the ORM now, and that ForeignObject
> > itself is a non-concrete field (not in ._meta.virtual_fields though).
> > ForeignObject currently works only on ORM level, so there is no API
> > for creating multicolumn foreign keys.
>
> > In any case, the GSoC proposal should take the introduction of
> > ForeignObject in account. tests/foreign_object contain some examples
> > related to ForeignObject functionality, exploring how those examples
> > work is a good idea.
>
> This has been on my TODO list for a few days actually, I was going
> through the work done regarding #19385 [1] and noticed this. However,
> I still need to give it a more thorough examination to fully get a
> grasp on this.

The basic idea is that there is a new ForeignObject class. A
ForeignObject basically just takes a related model, and from_fields
and to_fields which are the names of the fields used for the relation.
Then, the ORM knows how to create JOINs, subqueries etc based on a
ForeignObject. The ForeignObject itself is a virtual field, it doesn't
have any concrete fields in the DB.

There is currently zero public API support for using the
ForeignObject. The addition of ForeignObject is there to make the work
on composite fields and multicolumn foreign keys in particular easier.

> > Another big issue to solve is how to store the fields in model._meta.
> > IMO a good idea is to add all fields into some attribute ('all_fields'
> > as name might be good...), then separate the fields for different use
> > cases based on features the fields has. For example concrete_fields is
> > [f for f in self.all_fields if f.column], form fields are [f for f in
> > self.all_fields if f.form_field] (or maybe this should be called
> > logical fields - a multicolumn foreign key is logical, but not
> > concrete field, and the fields used by the foreign key are concrete
> > but not logical fields). The different field lists can be
> > cached_properties so there will be no problems from performance
> > perspective.
>
> As a matter of fact, this is one of the first things I did back in
> 2011. (-: Almost exactly like that -- _meta.fields contains all fields
> (except for M2M), _meta.concrete_fields, _meta.virtual_fields and I
> think I also needed a _meta.local_concrete (for non-inherited fields;
> I couldn't find a better name for it that wouldn't be too verbose).

I took a look at the code and I think the .Meta fields implementation
is good. (Minor point is that there is no need for the field.virtual
flag - it should be enough to check if it has a db column or not).

Do you have any plans for updatable primary keys? Updatable primary
keys would be useful: the current api where obj.lastname = newval;
obj.save() will result in save of new object instead of update of the
lastname is plain ugly. OTOH This problem might be too complex to
include in GSoC - cascading the updates is going to be hard. One
possible solution is to add some way to create the foreign keys as ON
UPDATE CASCADE and let the DB solve the problem.

 - Anssi

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: A second stab at an implementation of composite fields

2013-04-12 Thread Michal Petrucha
On Fri, Apr 12, 2013 at 07:35:45AM -0700, Anssi Kääriäinen wrote:
> On 12 huhti, 16:44, Michal Petrucha  wrote:
> ForeignKeys have been changed a lot since 2012-11-04. The introduction
> of ForeignObject (which is base for ForeignKey) means that there is
> support for multicolumn joins in the ORM now, and that ForeignObject
> itself is a non-concrete field (not in ._meta.virtual_fields though).
> ForeignObject currently works only on ORM level, so there is no API
> for creating multicolumn foreign keys.
> 
> In any case, the GSoC proposal should take the introduction of
> ForeignObject in account. tests/foreign_object contain some examples
> related to ForeignObject functionality, exploring how those examples
> work is a good idea.

This has been on my TODO list for a few days actually, I was going
through the work done regarding #19385 [1] and noticed this. However,
I still need to give it a more thorough examination to fully get a
grasp on this.

> Another big issue to solve is how to store the fields in model._meta.
> IMO a good idea is to add all fields into some attribute ('all_fields'
> as name might be good...), then separate the fields for different use
> cases based on features the fields has. For example concrete_fields is
> [f for f in self.all_fields if f.column], form fields are [f for f in
> self.all_fields if f.form_field] (or maybe this should be called
> logical fields - a multicolumn foreign key is logical, but not
> concrete field, and the fields used by the foreign key are concrete
> but not logical fields). The different field lists can be
> cached_properties so there will be no problems from performance
> perspective.

As a matter of fact, this is one of the first things I did back in
2011. (-: Almost exactly like that -- _meta.fields contains all fields
(except for M2M), _meta.concrete_fields, _meta.virtual_fields and I
think I also needed a _meta.local_concrete (for non-inherited fields;
I couldn't find a better name for it that wouldn't be too verbose).


[1] https://code.djangoproject.com/ticket/19385


signature.asc
Description: Digital signature


Re: A second stab at an implementation of composite fields

2013-04-12 Thread Anssi Kääriäinen
On 12 huhti, 16:44, Michal Petrucha  wrote:
 > As far as relationship fields go, we tried to ignore them at first
and
> get back to them during the second half of GSoC. Two approaches were
> considered, one was to special-case CompositeField targets in
> ForeignKey and in this case, make a single ForeignKey field manage
> multiple database columns directly. This turned out to be really
> painful and messy, so we tried another path, which was to turn
> ForeignKey into a virtual field and create an auxiliary copy of the
> target field in the local model which is supposed to manage the
> database column and its raw value. This way, ForeignKey only takes
> care of the higher-level relationship stuff.
>
> A large part of the ForeignKey refactor has been done. However, I was
> doing it on top of the CompositeField patchset, which makes it a real
> PITA to keep it in sync with master. I had managed to keep it in sync
> for about a year but this has become increasingly tedious, especially
> with all the Py3k changes and recent ORM improvements and cleanups.
> Currently my code is rebased on top of a commit from 2012-11-04, which
> is already five months in the past.

ForeignKeys have been changed a lot since 2012-11-04. The introduction
of ForeignObject (which is base for ForeignKey) means that there is
support for multicolumn joins in the ORM now, and that ForeignObject
itself is a non-concrete field (not in ._meta.virtual_fields though).
ForeignObject currently works only on ORM level, so there is no API
for creating multicolumn foreign keys.

In any case, the GSoC proposal should take the introduction of
ForeignObject in account. tests/foreign_object contain some examples
related to ForeignObject functionality, exploring how those examples
work is a good idea.

Another big issue to solve is how to store the fields in model._meta.
IMO a good idea is to add all fields into some attribute ('all_fields'
as name might be good...), then separate the fields for different use
cases based on features the fields has. For example concrete_fields is
[f for f in self.all_fields if f.column], form fields are [f for f in
self.all_fields if f.form_field] (or maybe this should be called
logical fields - a multicolumn foreign key is logical, but not
concrete field, and the fields used by the foreign key are concrete
but not logical fields). The different field lists can be
cached_properties so there will be no problems from performance
perspective.

 - Anssi

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.